Intel® 64 And IA 32 Architectures Software Developer’s Manual Ation Changes Intel Developer's May 2019 (252
User Manual:
Open the PDF directly: View PDF .
Page Count: 1601
Download | |
Open PDF In Browser | View PDF |
Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes May 2019 Notice: The Intel® 64 and IA-32 architectures may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are documented in the specification updates. Document Number: 252046-062 Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting http://www.intel.com/design/literature.htm. Intel, the Intel logo, Intel Atom, Intel Core, Intel SpeedStep, MMX, Pentium, VTune, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. Copyright © 1997-2019, Intel Corporation. All Rights Reserved. 2 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes Contents Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Summary Tables of Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Documentation Changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 3 Revision History Revision History Revision Description -001 • Initial release November 2002 -002 • • Added 1-10 Documentation Changes. Removed old Documentation Changes items that already have been incorporated in the published Software Developer’s manual December 2002 • • Added 9 -17 Documentation Changes. Removed Documentation Change #6 - References to bits Gen and Len Deleted. Removed Documentation Change #4 - VIF Information Added to CLI Discussion -003 • 4 Date February 2003 -004 • • Removed Documentation changes 1-17. -005 • • Removed Documentation Changes 1-24. Added Documentation Changes 1-15. September 2003 -006 • Added Documentation Changes 16- 34. November 2003 -007 • • Updated Documentation changes 14, 16, 17, and 28. Added Documentation Changes 35-45. -008 • • Removed Documentation Changes 1-45. Added Documentation Changes 1-5. -009 • Added Documentation Changes 7-27. -010 • • Removed Documentation Changes 1-27. Added Documentation Changes 1. -011 • Added Documentation Changes 2-28. -012 • • Removed Documentation Changes 1-28. Added Documentation Changes 1-16. -013 • • Updated title. There are no Documentation Changes for this revision of the document. -014 • Added Documentation Changes 1-21. -015 • • Removed Documentation Changes 1-21. Added Documentation Changes 1-20. -016 • Added Documentation changes 21-23. March 27, 2006 -017 • • Removed Documentation Changes 1-23. Added Documentation Changes 1-36. September 2006 -018 • Added Documentation Changes 37-42. -019 • • Removed Documentation Changes 1-42. Added Documentation Changes 1-19. -020 • Added Documentation Changes 20-27. -021 • • Removed Documentation Changes 1-27. Added Documentation Changes 1-6 -022 • • Removed Documentation Changes 1-6 Added Documentation Changes 1-6 August 2008 -023 • • Removed Documentation Changes 1-6 Added Documentation Changes 1-21 March 2009 Added Documentation changes 1-24. June 2003 January 2004 March 2004 May 2004 August 2004 November 2004 March 2005 July 2005 September 2005 March 9, 2006 October 2006 March 2007 May 2007 November 2007 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes Revision History Revision Description Date -024 • • Removed Documentation Changes 1-21 Added Documentation Changes 1-16 June 2009 -025 • • Removed Documentation Changes 1-16 Added Documentation Changes 1-18 September 2009 -026 • • Removed Documentation Changes 1-18 Added Documentation Changes 1-15 December 2009 -027 • • Removed Documentation Changes 1-15 Added Documentation Changes 1-24 March 2010 -028 • • Removed Documentation Changes 1-24 Added Documentation Changes 1-29 June 2010 -029 • • Removed Documentation Changes 1-29 Added Documentation Changes 1-29 September 2010 -030 • • Removed Documentation Changes 1-29 Added Documentation Changes 1-29 January 2011 -031 • • Removed Documentation Changes 1-29 Added Documentation Changes 1-29 April 2011 -032 • • Removed Documentation Changes 1-29 Added Documentation Changes 1-14 May 2011 -033 • • Removed Documentation Changes 1-14 Added Documentation Changes 1-38 October 2011 -034 • • Removed Documentation Changes 1-38 Added Documentation Changes 1-16 December 2011 -035 • • Removed Documentation Changes 1-16 Added Documentation Changes 1-18 March 2012 -036 • • Removed Documentation Changes 1-18 Added Documentation Changes 1-17 May 2012 -037 • • Removed Documentation Changes 1-17 Added Documentation Changes 1-28 August 2012 -038 • • Removed Documentation Changes 1-28 Add Documentation Changes 1-22 January 2013 -039 • • Removed Documentation Changes 1-22 Add Documentation Changes 1-17 June 2013 -040 • • Removed Documentation Changes 1-17 Add Documentation Changes 1-24 September 2013 -041 • • Removed Documentation Changes 1-24 Add Documentation Changes 1-20 February 2014 -042 • • Removed Documentation Changes 1-20 Add Documentation Changes 1-8 February 2014 -043 • • Removed Documentation Changes 1-8 Add Documentation Changes 1-43 -044 • • Removed Documentation Changes 1-43 Add Documentation Changes 1-12 September 2014 -045 • • Removed Documentation Changes 1-12 Add Documentation Changes 1-22 January 2015 -046 • • Removed Documentation Changes 1-22 Add Documentation Changes 1-25 April 2015 -047 • • Removed Documentation Changes 1-25 Add Documentation Changes 1-19 June 2015 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes June 2014 5 Revision History Revision Description Date -048 • • Removed Documentation Changes 1-19 Add Documentation Changes 1-33 September 2015 -049 • • Removed Documentation Changes 1-33 Add Documentation Changes 1-33 December 2015 -050 • • Removed Documentation Changes 1-33 Add Documentation Changes 1-9 April 2016 -051 • • Removed Documentation Changes 1-9 Add Documentation Changes 1-20 June 2016 -052 • • Removed Documentation Changes 1-20 Add Documentation Changes 1-22 September 2016 -053 • • Removed Documentation Changes 1-22 Add Documentation Changes 1-26 December 2016 -054 • • Removed Documentation Changes 1-26 Add Documentation Changes 1-20 March 2017 -055 • • Removed Documentation Changes 1-20 Add Documentation Changes 1-28 July 2017 -056 • • Removed Documentation Changes 1-28 Add Documentation Changes 1-18 October 2017 -057 • • Removed Documentation Changes 1-18 Add Documentation Changes 1-29 December 2017 -058 • • Removed Documentation Changes 1-29 Add Documentation Changes 1-17 March 2018 -059 • • Removed Documentation Changes 1-17 Add Documentation Changes 1-24 May 2018 -060 • • Removed Documentation Changes 1-24 Add Documentation Changes 1-23 November 2018 -061 • • Removed Documentation Changes 1-23 Add Documentation Changes 1-21 January 2019 -062 • • Removed Documentation Changes 1-21 Add Documentation Changes 1-28 May 2019 § 6 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes Preface This document is an update to the specifications contained in the Affected Documents table below. This document is a compilation of device and documentation errata, specification clarifications and changes. It is intended for hardware system manufacturers and software developers of applications, operating systems, or tools. Affected Documents Document Title Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture ® Document Number/ Location 253665 Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-L 253666 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B: Instruction Set Reference, M-U 253667 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2C: Instruction Set Reference, V-Z 326018 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2D: Instruction Set Reference 334569 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 253668 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 253669 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3 326019 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3D: System Programming Guide, Part 4 332831 Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model Specific Registers 335592 Nomenclature Documentation Changes include typos, errors, or omissions from the current published specifications. These will be incorporated in any new release of the specification. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 7 Summary Tables of Changes The following table indicates documentation changes which apply to the Intel® 64 and IA-32 architectures. This table uses the following notations: Codes Used in Summary Tables Change bar to left of table row indicates this erratum is either new or modified from the previous version of the document. Documentation Changes(Sheet 1 of 2) No. DOCUMENTATION CHANGES 1 Updates to Chapter 1, Volume 1 2 Updates to Chapter 2, Volume 1 3 Updates to Chapter 3, Volume 1 4 Updates to Chapter 13, Volume 1 5 Updates to Chapter 1, Volume 2A 6 Updates to Chapter 2, Volume 2A 7 Updates to Chapter 3, Volume 2A 8 Updates to Chapter 4, Volume 2B 9 Updates to Chapter 5, Volume 2C 10 Updates to Chapter 1, Volume 3A 11 Updates to Chapter 10, Volume 3A 12 Updates to Chapter 15, Volume 3B 13 Updates to Chapter 16, Volume 3B 14 Updates to Chapter 18, Volume 3B 15 Updates to Chapter 19, Volume 3B 16 Updates to Chapter 24, Volume 3C 17 Updates to Chapter 25, Volume 3C 18 Updates to Chapter 26, Volume 3C 19 Updates to Chapter 27, Volume 3C 20 Updates to Chapter 29, Volume 3C 21 Updates to Chapter 30, Volume 3C 22 Updates to Chapter 34, Volume 3C 23 Updates to Chapter 35, Volume 3C 24 Updates to Chapter 36, Volume 3D 25 Updates to Appendix B, Volume 3D Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 8 Documentation Changes(Sheet 2 of 2) No. DOCUMENTATION CHANGES 26 Updates to Appendix C, Volume 3D 27 Updates to Chapter 1, Volume 4 28 Updates to Chapter 2, Volume 4 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 9 Documentation Changes Changes to the Intel® 64 and IA-32 Architectures Software Developer’s Manual volumes follow, and are listed by chapter. Only chapters with changes are included in this document. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 10 1. Updates to Chapter 1, Volume 1 Change bars and green text show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to Section 1.1 “Intel® 64 and IA-32 Processors Covered in this Manual”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 11 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture (order number 253665) is part of a set that describes the architecture and programming environment of Intel® 64 and IA-32 architecture processors. Other volumes in this set are: • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D: Instruction Set Reference (order numbers 253666, 253667, 326018 and 334569). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D: System Programming Guide (order numbers 253668, 253669, 326019 and 332831). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers (order number 335592). The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D, describe the instruction set of the processor and the opcode structure. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operatingsystem and BIOS designers. In addition, the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, addresses the programming environment for classes of software that host operating systems. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, describes the model-specific registers of Intel 64 and IA-32 processors. 1.1 INTEL® 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include: • • • • • • • • • • • • • • • • • • • Pentium® processors P6 family processors Pentium® 4 processors Pentium® M processors Intel® Xeon® processors Pentium® D processors Pentium® processor Extreme Editions 64-bit Intel® Xeon® processors Intel® Core™ Duo processor Intel® Core™ Solo processor Dual-Core Intel® Xeon® processor LV Intel® Core™2 Duo processor Intel® Core™2 Quad processor Q6000 series Intel® Xeon® processor 3000, 3200 series Intel® Xeon® processor 5000 series Intel® Xeon® processor 5100, 5300 series Intel® Core™2 Extreme processor X7000 and X6800 series Intel® Core™2 Extreme processor QX6000 series Intel® Xeon® processor 7100 series Vol. 1 1-1 ABOUT THIS MANUAL • • • • • • • • Intel® Pentium® Dual-Core processor • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Intel® Core™ i7 processor Intel® Xeon® processor 7200, 7300 series Intel® Xeon® processor 5200, 5400, 7400 series Intel® Core™2 Extreme processor QX9000 and X9000 series Intel® Core™2 Quad processor Q9000 series Intel® Core™2 Duo processor E8000, T9000 series Intel® Atom™ processor family Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are built from 45 nm and 32 nm processes Intel® Core™ i5 processor Intel® Xeon® processor E7-8800/4800/2800 product families Intel® Core™ i7-3930K processor 2nd generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series Intel® Xeon® processor E3-1200 product family Intel® Xeon® processor E5-2400/1400 product family Intel® Xeon® processor E5-4600/2600/1600 product family 3rd generation Intel® Core™ processors Intel® Xeon® processor E3-1200 v2 product family Intel® Xeon® processor E5-2400/1400 v2 product families Intel® Xeon® processor E5-4600/2600/1600 v2 product families Intel® Xeon® processor E7-8800/4800/2800 v2 product families 4th generation Intel® Core™ processors The Intel® Core™ M processor family Intel® Core™ i7-59xx Processor Extreme Edition Intel® Core™ i7-49xx Processor Extreme Edition Intel® Xeon® processor E3-1200 v3 product family Intel® Xeon® processor E5-2600/1600 v3 product families 5th generation Intel® Core™ processors Intel® Xeon® processor D-1500 product family Intel® Xeon® processor E5 v4 family Intel® Atom™ processor X7-Z8000 and X5-Z8000 series Intel® Atom™ processor Z3400 series Intel® Atom™ processor Z3500 series 6th generation Intel® Core™ processors Intel® Xeon® processor E3-1500m v5 product family 7th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series Intel® Xeon® Processor Scalable Family 8th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series Intel® Xeon® E processors 9th generation Intel® Core™ processors 1-2 Vol. 1 ABOUT THIS MANUAL P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium® Pro, Pentium® II, Pentium® III, and Pentium® III Xeon® processors. The Pentium® 4, Pentium® D, and Pentium® processor Extreme Editions are based on the Intel NetBurst® microarchitecture. Most early Intel® Xeon® processors are based on the Intel NetBurst® microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst® microarchitecture. The Intel® Core™ Duo, Intel® Core™ Solo and dual-core Intel® Xeon® processor LV are based on an improved Pentium® M processor microarchitecture. The Intel® Xeon® processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel® Pentium® dual-core, Intel® Core™2 Duo, Intel® Core™2 Quad, and Intel® Core™2 Extreme processors are based on Intel® Core™ microarchitecture. The Intel® Xeon® processor 5200, 5400, 7400 series, Intel® Core™2 Quad processor Q9000 series, and Intel® Core™2 Extreme processors QX9000, X9000 series, Intel® Core™2 processor E8000 series are based on Enhanced Intel® Core™ microarchitecture. The Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are based on the Intel® Atom™ microarchitecture and supports Intel 64 architecture. P6 family, Pentium® M, Intel® Core™ Solo, Intel® Core™ Duo processors, dual-core Intel® Xeon® processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel® AtomTM processor Z5xx series support IA-32 architecture. The Intel® Xeon® processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel® Core™2 Duo, Intel® Core™2 Extreme, Intel® Core™2 Quad processors, Pentium® D processors, Pentium® DualCore processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel® 64 architecture. The Intel® Core™ i7 processor and Intel® Xeon® processor 3400, 5500, 7500 series are based on 45 nm Nehalem microarchitecture. Westmere microarchitecture is a 32 nm version of the Nehalem microarchitecture. Intel® Xeon® processor 5600 series, Intel Xeon processor E7 and various Intel Core i7, i5, i3 processors are based on the Westmere microarchitecture. These processors support Intel 64 architecture. The Intel® Xeon® processor E5 family, Intel® Xeon® processor E3-1200 family, Intel® Xeon® processor E78800/4800/2800 product families, Intel® Core™ i7-3930K processor, and 2nd generation Intel® Core™ i7-2xxx, Intel® CoreTM i5-2xxx, Intel® Core™ i3-2xxx processor series are based on the Sandy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E7-8800/4800/2800 v2 product families, Intel® Xeon® processor E3-1200 v2 product family and 3rd generation Intel® Core™ processors are based on the Ivy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-4600/2600/1600 v2 product families, Intel® Xeon® processor E5-2400/1400 v2 product families and Intel® Core™ i7-49xx Processor Extreme Edition are based on the Ivy Bridge-E microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E3-1200 v3 product family and 4th Generation Intel® Core™ processors are based on the Haswell microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-2600/1600 v3 product families and the Intel® Core™ i7-59xx Processor Extreme Edition are based on the Haswell-E microarchitecture and support Intel 64 architecture. The Intel® Atom™ processor Z8000 series is based on the Airmont microarchitecture. The Intel® Atom™ processor Z3400 series and the Intel® Atom™ processor Z3500 series are based on the Silvermont microarchitecture. The Intel® Core™ M processor family, 5th generation Intel® Core™ processors, Intel® Xeon® processor D-1500 product family and the Intel® Xeon® processor E5 v4 family are based on the Broadwell microarchitecture and support Intel 64 architecture. The Intel® Xeon® Processor Scalable Family, Intel® Xeon® processor E3-1500m v5 product family and 6th generation Intel® Core™ processors are based on the Skylake microarchitecture and support Intel 64 architecture. The 7th generation Intel® Core™ processors are based on the Kaby Lake microarchitecture and support Intel 64 architecture. Vol. 1 1-3 ABOUT THIS MANUAL The Intel® Atom™ processor C series, the Intel® Atom™ processor X series, the Intel® Pentium® processor J series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont microarchitecture. The Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series is based on the Knights Landing microarchitecture and supports Intel 64 architecture. The Intel® Pentium® Silver processor series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont Plus microarchitecture. The 8th generation Intel® Core™ processors, 9th generation Intel® Core™ processors, and Intel® Xeon® E processors are based on the Coffee Lake microarchitecture and support Intel 64 architecture. The Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series is based on the Knights Mill microarchitecture and supports Intel 64 architecture. IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel® 64 architecture is the instruction set architecture and programming environment which is the superset of Intel’s 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture. 1.2 OVERVIEW OF VOLUME 1: BASIC ARCHITECTURE A description of this manual’s content follows: Chapter 1 — About This Manual. Gives an overview of all five volumes of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers. Chapter 2 — Intel® 64 and IA-32 Architectures. Introduces the Intel 64 and IA-32 architectures along with the families of Intel processors that are based on these architectures. It also gives an overview of the common features found in these processors and brief history of the Intel 64 and IA-32 architectures. Chapter 3 — Basic Execution Environment. Introduces the models of memory organization and describes the register set used by applications. Chapter 4 — Data Types. Describes the data types and addressing modes recognized by the processor; provides an overview of real numbers and floating-point formats and of floating-point exceptions. Chapter 5 — Instruction Set Summary. Lists all Intel 64 and IA-32 instructions, divided into technology groups. Chapter 6 — Procedure Calls, Interrupts, and Exceptions. Describes the procedure stack and mechanisms provided for making procedure calls and for servicing interrupts and exceptions. Chapter 7 — Programming with General-Purpose Instructions. Describes basic load and store, program control, arithmetic, and string instructions that operate on basic data types, general-purpose and segment registers; also describes system instructions that are executed in protected mode. Chapter 8 — Programming with the x87 FPU. Describes the x87 floating-point unit (FPU), including floatingpoint registers and data types; gives an overview of the floating-point instruction set and describes the processor's floating-point exception conditions. Chapter 9 — Programming with Intel® MMX™ Technology. Describes Intel MMX technology, including MMX registers and data types; also provides an overview of the MMX instruction set. Chapter 10 — Programming with Intel® Streaming SIMD Extensions (Intel® SSE). Describes SSE extensions, including XMM registers, the MXCSR register, and packed single-precision floating-point data types; provides an overview of the SSE instruction set and gives guidelines for writing code that accesses the SSE extensions. Chapter 11 — Programming with Intel® Streaming SIMD Extensions 2 (Intel® SSE2). Describes SSE2 extensions, including XMM registers and packed double-precision floating-point data types; provides an overview of the SSE2 instruction set and gives guidelines for writing code that accesses SSE2 extensions. This chapter also describes SIMD floating-point exceptions that can be generated with SSE and SSE2 instructions. It also provides general guidelines for incorporating support for SSE and SSE2 extensions into operating system and applications code. Chapter 12 — Programming with Intel® Streaming SIMD Extensions 3 (Intel® SSE3), Supplemental Streaming SIMD Extensions 3 (SSSE3), Intel® Streaming SIMD Extensions 4 (Intel® SSE4) and Intel® 1-4 Vol. 1 ABOUT THIS MANUAL AES New Instructions (Intel® AESNI). Provides an overview of the SSE3 instruction set, Supplemental SSE3, SSE4, AESNI instructions, and guidelines for writing code that accesses these extensions. Chapter 13 — Managing State Using the XSAVE Feature Set. Describes the XSAVE feature set instructions and explains how software can enable the XSAVE feature set and XSAVE-enabled features. Chapter 14 — Programming with AVX, FMA and AVX2. Provides an overview of the Intel® AVX instruction set, FMA and Intel AVX2 extensions and gives guidelines for writing code that accesses these extensions. Chapter 15 — Programming with Intel Transactional Synchronization Extensions. Describes the instruction extensions that support lock elision techniques to improve the performance of multi-threaded software with contended locks. Chapter 16 — Input/Output. Describes the processor’s I/O mechanism, including I/O port addressing, I/O instructions, and I/O protection mechanisms. Chapter 17 — Processor Identification and Feature Determination. Describes how to determine the CPU type and features available in the processor. Appendix A — EFLAGS Cross-Reference. Summarizes how the IA-32 instructions affect the flags in the EFLAGS register. Appendix B — EFLAGS Condition Codes. Summarizes how conditional jump, move, and ‘byte set on condition code’ instructions use condition code flags (OF, CF, ZF, SF, and PF) in the EFLAGS register. Appendix C — Floating-Point Exceptions Summary. Summarizes exceptions raised by the x87 FPU floatingpoint and SSE/SSE2/SSE3 floating-point instructions. Appendix D — Guidelines for Writing x87 FPU Exception Handlers. Describes how to design and write MSDOS* compatible exception handling facilities for FPU exceptions (includes software and hardware requirements and assembly-language code examples). This appendix also describes general techniques for writing robust FPU exception handlers. Appendix E — Guidelines for Writing SIMD Floating-Point Exception Handlers. Gives guidelines for writing exception handlers for exceptions generated by SSE/SSE2/SSE3 floating-point instructions. 1.3 NOTATIONAL CONVENTIONS This manual uses specific notation for data-structure formats, for symbolic representation of instructions, and for hexadecimal and binary numbers. This notation is described below. 1.3.1 Bit and Byte Order In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. Intel 64 and IA-32 processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. See Figure 1-1. Vol. 1 1-5 ABOUT THIS MANUAL Highest Address Data Structure 32 24 23 Byte 3 16 15 Byte 2 8 7 Byte 1 0 Byte 0 Bit offset 28 24 20 16 12 8 4 0 Lowest Address Byte Offset Figure 1-1. Bit and Byte Order 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits: • Do not depend on the states of any reserved bits when testing the values of registers that contain such bits. Mask out the reserved bits before testing. • • • Do not depend on the states of any reserved bits when storing to memory or to a register. Do not depend on the ability to retain information written into any reserved bits. When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register. NOTE Avoid any software dependence upon the state of reserved bits in Intel 64 and IA-32 registers. Depending upon the values of reserved register bits will make software dependent upon the unspecified manner in which the processor handles these bits. Programs that depend upon reserved values risk incompatibility with future processors. 1.3.2.1 Instruction Operands When instructions are represented symbolically, a subset of the IA-32 assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • • • A label is an identifier which is followed by a colon. A mnemonic is a reserved name for a class of instruction opcodes which have the same function. The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the opcode. When present, they take the form of either literals or identifiers for data items. Operand identifiers are either reserved names of registers or are assumed to be assigned to data items declared in another part of the program (which may not be shown in the example). When two operands are present in an arithmetic or logical instruction, the right operand is the source and the left operand is the destination. 1-6 Vol. 1 ABOUT THIS MANUAL For example: LOADREG: MOV EAX, SUBTOTAL In this example, LOADREG is a label, MOV is the mnemonic identifier of an opcode, EAX is the destination operand, and SUBTOTAL is the source operand. Some assembly languages put the source and destination in reverse order. 1.3.3 Hexadecimal and Binary Numbers Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, 0F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represented by a string of 1s and 0s, sometimes followed by the character B (for example, 1010B). The “B” designation is only used in situations where confusion as to the type of number might arise. 1.3.4 Segmented Addressing The processor uses byte addressing. This means memory is organized and accessed as a sequence of bytes. Whether one or more bytes are being accessed, a byte address is used to locate the byte or bytes memory. The range of memory that can be addressed is called an address space. The processor also supports segmented addressing. This is a form of addressing where a program may have many independent address spaces, called segments. For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment: Segment-register:Byte-address For example, the following segment address identifies the byte at address FF79H in the segment pointed by the DS register: DS:FF79H The following segment address identifies an instruction address in the code segment. The CS register points to the code segment and the EIP register contains the address of the instruction. CS:EIP 1.3.5 A New Syntax for CPUID, CR, and MSR Values Obtain feature flags, status, and system information by using the CPUID instruction, by checking control register bits, and by reading model-specific registers. We are moving toward a new syntax to represent this information. See Figure 1-2. Vol. 1 1-7 ABOUT THIS MANUAL CPUID Input and Output CPUID.01H:EDX.SSE[bit 25] = 1 Input value for EAX register Output register and feature flag or field name with bit position(s) Value (or range) of output Control Register Values CR4.OSFXSR[bit 9] = 1 Example CR name Feature flag or field name with bit position(s) Value (or range) of output Model-Specific Register Values IA32_MISC_ENABLE.ENABLEFOPCODE[bit 2] = 1 Example MSR name Feature flag or field name with bit position(s) Value (or range) of output Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation 1.3.6 SDM29002 Exceptions An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code reports additional information about the error. An example of the notation used to show an exception and error code is shown below: #PF(fault code) This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions that produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception: #GP(0) 1-8 Vol. 1 ABOUT THIS MANUAL 1.4 RELATED LITERATURE Literature related to Intel 64 and IA-32 processors is listed and viewable on-line at: https://software.intel.com/en-us/articles/intel-sdm See also: • The latest security information on Intel® products: https://www.intel.com/content/www/us/en/security-center/default.html • Software developer resources, guidance and insights for security advisories: https://software.intel.com/security-software-guidance/ • • • The data sheet for a particular Intel 64 or IA-32 processor • Intel® Fortran Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ • Intel® Software Development Tools: https://software.intel.com/en-us/intel-sdp-home • Intel® 64 and IA-32 Architectures Software Developer’s Manual (in one, four or ten volumes): https://software.intel.com/en-us/articles/intel-sdm • Intel® 64 and IA-32 Architectures Optimization Reference Manual: https://software.intel.com/en-us/articles/intel-sdm#optimization • Intel 64 Architecture x2APIC Specification: The specification update for a particular Intel 64 or IA-32 processor Intel® C++ Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ http://www.intel.com/content/www/us/en/architecture-and-technology/64-architecture-x2apic-specification.html • Intel® Trusted Execution Technology Measured Launched Environment Programming Guide: http://www.intel.com/content/www/us/en/software-developers/intel-txt-software-development-guide.html • Developing Multi-threaded Applications: A Platform Consistent Approach: https://software.intel.com/sites/default/files/article/147714/51534-developing-multithreaded-applications.pdf • Using Spin-Loops on Intel® Pentium® 4 Processor and Intel® Xeon® Processor: https://software.intel.com/sites/default/files/22/30/25602 • Performance Monitoring Unit Sharing Guide http://software.intel.com/file/30388 Literature related to selected features in future Intel processors are available at: • Intel® Architecture Instruction Set Extensions Programming Reference https://software.intel.com/en-us/isa-extensions • Intel® Software Guard Extensions (Intel® SGX) Programming Reference https://software.intel.com/en-us/isa-extensions/intel-sgx More relevant links are: • Intel® Developer Zone: • Developer centers: https://software.intel.com/en-us http://www.intel.com/content/www/us/en/hardware-developers/developer-centers.html • Processor support general link: http://www.intel.com/support/processors/ • Intel® Hyper-Threading Technology (Intel® HT Technology): http://www.intel.com/technology/platform-technology/hyper-threading/index.htm Vol. 1 1-9 ABOUT THIS MANUAL 1-10 Vol. 1 2. Updates to Chapter 2, Volume 1 Change bars and green text show changes to Chapter 2 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture. -----------------------------------------------------------------------------------------Changes to this chapter: Deletion of outdated statements. Typo correction on Intel 64 architecture physical address space in Section 2.2.10 “Intel® 64 Architecture”. Addition of Section 2.4 “Proposed Removal of Intel Instruction Set Architecture and Features from Upcoming Products” and Section 2.5 “Intel Instruction Set Architecture and Features Removed”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 12 CHAPTER 2 INTEL 64 AND IA-32 ARCHITECTURES ® 2.1 BRIEF HISTORY OF INTEL® 64 AND IA-32 ARCHITECTURE The following sections provide a summary of the major technical evolutions from IA-32 to Intel 64 architecture: starting from the Intel 8086 processor to the latest Intel® Core® 2 Duo, Core 2 Quad and Intel Xeon processor 5300 and 7300 series. Object code created for processors released as early as 1978 still executes on the latest processors in the Intel 64 and IA-32 architecture families. 2.1.1 16-bit Processors and Segmentation (1978) The IA-32 architecture family was preceded by 16-bit processors, the 8086 and 8088. The 8086 has 16-bit registers and a 16-bit external data bus, with 20-bit addressing giving a 1-MByte address space. The 8088 is similar to the 8086 except it has an 8-bit external data bus. The 8086/8088 introduced segmentation to the IA-32 architecture. With segmentation, a 16-bit segment register contains a pointer to a memory segment of up to 64 KBytes. Using four segment registers at a time, 8086/8088 processors are able to address up to 256 KBytes without switching between segments. The 20-bit addresses that can be formed using a segment register and an additional 16-bit pointer provide a total address range of 1 MByte. 2.1.2 The Intel® 286 Processor (1982) The Intel 286 processor introduced protected mode operation into the IA-32 architecture. Protected mode uses the segment register content as selectors or pointers into descriptor tables. Descriptors provide 24-bit base addresses with a physical memory size of up to 16 MBytes, support for virtual memory management on a segment swapping basis, and a number of protection mechanisms. These mechanisms include: • • • Segment limit checking Read-only and execute-only segment options Four privilege levels 2.1.3 The Intel386™ Processor (1985) The Intel386 processor was the first 32-bit processor in the IA-32 architecture family. It introduced 32-bit registers for use both to hold operands and for addressing. The lower half of each 32-bit Intel386 register retains the properties of the 16-bit registers of earlier generations, permitting backward compatibility. The processor also provides a virtual-8086 mode that allows for even greater efficiency when executing programs created for 8086/8088 processors. In addition, the Intel386 processor has support for: • • • • A 32-bit address bus that supports up to 4-GBytes of physical memory A segmented-memory model and a flat memory model Paging, with a fixed 4-KByte page size providing a method for virtual memory management Support for parallel stages 2.1.4 The Intel486™ Processor (1989) The Intel486™ processor added more parallel execution capability by expanding the Intel386 processor’s instruction decode and execution units into five pipelined stages. Each stage operates in parallel with the others on up to five instructions in different stages of execution. Vol. 1 2-1 INTEL® 64 AND IA-32 ARCHITECTURES In addition, the processor added: • An 8-KByte on-chip first-level cache that increased the percent of instructions that could execute at the scalar rate of one per clock • • An integrated x87 FPU Power saving and system management capabilities 2.1.5 The Intel® Pentium® Processor (1993) The introduction of the Intel Pentium processor added a second execution pipeline to achieve superscalar performance (two pipelines, known as u and v, together can execute two instructions per clock). The on-chip first-level cache doubled, with 8 KBytes devoted to code and another 8 KBytes devoted to data. The data cache uses the MESI protocol to support more efficient write-back cache in addition to the write-through cache previously used by the Intel486 processor. Branch prediction with an on-chip branch table was added to increase performance in looping constructs. In addition, the processor added: • • • • • Extensions to make the virtual-8086 mode more efficient and allow for 4-MByte as well as 4-KByte pages Internal data paths of 128 and 256 bits add speed to internal data transfers Burstable external data bus was increased to 64 bits An APIC to support systems with multiple processors A dual processor mode to support glueless two processor systems A subsequent stepping of the Pentium family introduced Intel MMX technology (the Pentium Processor with MMX technology). Intel MMX technology uses the single-instruction, multiple-data (SIMD) execution model to perform parallel computations on packed integer data contained in 64-bit registers. See Section 2.2.7, “SIMD Instructions.” 2.1.6 The P6 Family of Processors (1995-1999) The P6 family of processors was based on a superscalar microarchitecture that set new performance standards; see also Section 2.2.1, “P6 Family Microarchitecture.” One of the goals in the design of the P6 family microarchitecture was to exceed the performance of the Pentium processor significantly while using the same 0.6-micrometer, fourlayer, metal BICMOS manufacturing process. Members of this family include the following: • The Intel Pentium Pro processor is three-way superscalar. Using parallel processing techniques, the processor is able on average to decode, dispatch, and complete execution of (retire) three instructions per clock cycle. The Pentium Pro introduced the dynamic execution (micro-data flow analysis, out-of-order execution, superior branch prediction, and speculative execution) in a superscalar implementation. The processor was further enhanced by its caches. It has the same two on-chip 8-KByte 1st-Level caches as the Pentium processor and an additional 256-KByte Level 2 cache in the same package as the processor. • The Intel Pentium II processor added Intel MMX technology to the P6 family processors along with new packaging and several hardware enhancements. The processor core is packaged in the single edge contact cartridge (SECC). The Level l data and instruction caches were enlarged to 16 KBytes each, and Level 2 cache sizes of 256 KBytes, 512 KBytes, and 1 MByte are supported. A half-frequency backside bus connects the Level 2 cache to the processor. Multiple low-power states such as AutoHALT, Stop-Grant, Sleep, and Deep Sleep are supported to conserve power when idling. • The Pentium II Xeon processor combined the premium characteristics of previous generations of Intel processors. This includes: 4-way, 8-way (and up) scalability and a 2 MByte 2nd-Level cache running on a fullfrequency backside bus. • The Intel Celeron processor family focused on the value PC market segment. Its introduction offers an integrated 128 KBytes of Level 2 cache and a plastic pin grid array (P.P.G.A.) form factor to lower system design cost. • The Intel Pentium III processor introduced the Streaming SIMD Extensions (SSE) to the IA-32 architecture. SSE extensions expand the SIMD execution model introduced with the Intel MMX technology by providing a 2-2 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES new set of 128-bit registers and the ability to perform SIMD operations on packed single-precision floatingpoint values. See Section 2.2.7, “SIMD Instructions.” • The Pentium III Xeon processor extended the performance levels of the IA-32 processors with the enhancement of a full-speed, on-die, and Advanced Transfer Cache. 2.1.7 The Intel® Pentium® 4 Processor Family (2000-2006) The Intel Pentium 4 processor family is based on Intel NetBurst microarchitecture; see Section 2.2.2, “Intel NetBurst® Microarchitecture.” The Intel Pentium 4 processor introduced Streaming SIMD Extensions 2 (SSE2); see Section 2.2.7, “SIMD Instructions.” The Intel Pentium 4 processor 3.40 GHz, supporting Hyper-Threading Technology introduced Streaming SIMD Extensions 3 (SSE3); see Section 2.2.7, “SIMD Instructions.” Intel 64 architecture was introduced in the Intel Pentium 4 Processor Extreme Edition supporting Hyper-Threading Technology and in the Intel Pentium 4 Processor 6xx and 5xx sequences. Intel® Virtualization Technology (Intel® VT) was introduced in the Intel Pentium 4 processor 672 and 662. 2.1.8 The Intel® Xeon® Processor (2001- 2007) Intel Xeon processors (with exception for dual-core Intel Xeon processor LV, Intel Xeon processor 5100 series) are based on the Intel NetBurst microarchitecture; see Section 2.2.2, “Intel NetBurst® Microarchitecture.” As a family, this group of IA-32 processors (more recently Intel 64 processors) is designed for use in multi-processor server systems and high-performance workstations. The Intel Xeon processor MP introduced support for Intel® Hyper-Threading Technology; see Section 2.2.8, “Intel® Hyper-Threading Technology.” The 64-bit Intel Xeon processor 3.60 GHz (with an 800 MHz System Bus) was used to introduce Intel 64 architecture. The Dual-Core Intel Xeon processor includes dual core technology. The Intel Xeon processor 70xx series includes Intel Virtualization Technology. The Intel Xeon processor 5100 series introduces power-efficient, high performance Intel Core microarchitecture. This processor is based on Intel 64 architecture; it includes Intel Virtualization Technology and dual-core technology. The Intel Xeon processor 3000 series are also based on Intel Core microarchitecture. The Intel Xeon processor 5300 series introduces four processor cores in a physical package, they are also based on Intel Core microarchitecture. 2.1.9 The Intel® Pentium® M Processor (2003-2006) The Intel Pentium M processor family is a high performance, low power mobile processor family with microarchitectural enhancements over previous generations of IA-32 Intel mobile processors. This family is designed for extending battery life and seamless integration with platform innovations that enable new usage models (such as extended mobility, ultra thin form-factors, and integrated wireless networking). Its enhanced microarchitecture includes: • • Support for Intel Architecture with Dynamic Execution • • • • • • On-die, primary 32-KByte instruction cache and 32-KByte write-back data cache A high performance, low-power core manufactured using Intel’s advanced process technology with copper interconnect On-die, second-level cache (up to 2 MByte) with Advanced Transfer Cache Architecture Advanced Branch Prediction and Data Prefetch Logic Support for MMX technology, Streaming SIMD instructions, and the SSE2 instruction set A 400 or 533 MHz, Source-Synchronous Processor System Bus Advanced power management using Enhanced Intel SpeedStep® technology Vol. 1 2-3 INTEL® 64 AND IA-32 ARCHITECTURES 2.1.10 The Intel® Pentium® Processor Extreme Edition (2005) The Intel Pentium processor Extreme Edition introduced dual-core technology. This technology provides advanced hardware multi-threading support. The processor is based on Intel NetBurst microarchitecture and supports SSE, SSE2, SSE3, Hyper-Threading Technology, and Intel 64 architecture. See also: • • • • • • Section 2.2.2, “Intel NetBurst® Microarchitecture” Section 2.2.3, “Intel® Core™ Microarchitecture” Section 2.2.7, “SIMD Instructions” Section 2.2.8, “Intel® Hyper-Threading Technology” Section 2.2.9, “Multi-Core Technology” Section 2.2.10, “Intel® 64 Architecture” 2.1.11 The Intel® Core™ Duo and Intel® Core™ Solo Processors (2006-2007) The Intel Core Duo processor offers power-efficient, dual-core performance with a low-power design that extends battery life. This family and the single-core Intel Core Solo processor offer microarchitectural enhancements over Pentium M processor family. Its enhanced microarchitecture includes: • • • • • Intel® Smart Cache which allows for efficient data sharing between two processor cores Improved decoding and SIMD execution Intel® Dynamic Power Coordination and Enhanced Intel® Deeper Sleep to reduce power consumption Intel® Advanced Thermal Manager which features digital thermal sensor interfaces Support for power-optimized 667 MHz bus The dual-core Intel Xeon processor LV is based on the same microarchitecture as Intel Core Duo processor, and supports IA-32 architecture. 2.1.12 The Intel® Xeon® Processor 5100, 5300 Series and Intel® Core™2 Processor Family (2006) The Intel Xeon processor 3000, 3200, 5100, 5300, and 7300 series, Intel Pentium Dual-Core, Intel Core 2 Extreme, Intel Core 2 Quad processors, and Intel Core 2 Duo processor family support Intel 64 architecture; they are based on the high-performance, power-efficient Intel® Core microarchitecture built on 65 nm process technology. The Intel Core microarchitecture includes the following innovative features: • • • • • Intel® Wide Dynamic Execution to increase performance and execution throughput Intel® Intelligent Power Capability to reduce power consumption Intel® Advanced Smart Cache which allows for efficient data sharing between two processor cores Intel® Smart Memory Access to increase data bandwidth and hide latency of memory accesses Intel® Advanced Digital Media Boost which improves application performance using multiple generations of Streaming SIMD extensions The Intel Xeon processor 5300 series, Intel Core 2 Extreme processor QX6800 series, and Intel Core 2 Quad processors support Intel quad-core technology. 2.1.13 The Intel® Xeon® Processor 5200, 5400, 7400 Series and Intel® Core™2 Processor Family (2007) The Intel Xeon processor 5200, 5400, and 7400 series, Intel Core 2 Quad processor Q9000 Series, Intel Core 2 Duo processor E8000 series support Intel 64 architecture; they are based on the Enhanced Intel® Core microarchitec- 2-4 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES ture using 45 nm process technology. The Enhanced Intel Core microarchitecture provides the following improved features: • • • A radix-16 divider, faster OS primitives further increases the performance of Intel® Wide Dynamic Execution. Improves Intel® Advanced Smart Cache with Up to 50% larger level-two cache and up to 50% increase in wayset associativity. A 128-bit shuffler engine significantly improves the performance of Intel® Advanced Digital Media Boost and SSE4. Intel Xeon processor 5400 series and Intel Core 2 Quad processor Q9000 Series support Intel quad-core technology. Intel Xeon processor 7400 series offers up to six processor cores and an L3 cache up to 16 MBytes. 2.1.14 The Intel® Atom™ Processor Family (2008) The first generation of Intel® AtomTM processors are built on 45 nm process technology. They are based on a new microarchitecture, Intel® AtomTM microarchitecture, which is optimized for ultra low power devices. The Intel® AtomTM microarchitecture features two in-order execution pipelines that minimize power consumption, increase battery life, and enable ultra-small form factors. The initial Intel Atom Processor family and subsequent generations including Intel Atom processor D2000, N2000, E2000, Z2000, C1000 series provide the following features: • • • • Enhanced Intel® SpeedStep® Technology • • Support for Intel® Virtualization Technology Intel® Hyper-Threading Technology Deep Power Down Technology with Dynamic Cache Sizing Support for instruction set extensions up to and including Supplemental Streaming SIMD Extensions 3 (SSSE3). Support for Intel® 64 Architecture (excluding Intel Atom processor Z5xx Series) 2.1.15 The Intel® Atom™ Processor Family Based on Silvermont Microarchitecture (2013) Intel Atom Processor C2xxx, E3xxx, S1xxx series are based on the Silvermont microarchitecture. Processors based on the Silvermont microarchitecture supports instruction set extensions up to and including SSE4.2, AESNI, and PCLMULQDQ. 2.1.16 The Intel® Core™i7 Processor Family (2008) The Intel Core i7 processor 900 series support Intel 64 architecture; they are based on Intel® microarchitecture code name Nehalem using 45 nm process technology. The Intel Core i7 processor and Intel Xeon processor 5500 series include the following innovative features: • • • • • • • • Intel® Turbo Boost Technology converts thermal headroom into higher performance. Intel® HyperThreading Technology in conjunction with Quadcore to provide four cores and eight threads. Dedicated power control unit to reduce active and idle power consumption. Integrated memory controller on the processor supporting three channel of DDR3 memory. 8 MB inclusive Intel® Smart Cache. Intel® QuickPath interconnect (QPI) providing point-to-point link to chipset. Support for SSE4.2 and SSE4.1 instruction sets. Second generation Intel Virtualization Technology. 2.1.17 The Intel® Xeon® Processor 7500 Series (2010) The Intel Xeon processor 7500 and 6500 series are based on Intel microarchitecture code name Nehalem using 45 nm process technology. They support the same features described in Section 2.1.16, plus the following innovative features: Vol. 1 2-5 INTEL® 64 AND IA-32 ARCHITECTURES • • • Up to eight cores per physical processor package. • Advanced RAS supporting software recoverable machine check architecture. Up to 24 MB inclusive Intel® Smart Cache. Provides Intel® Scalable Memory Interconnect (Intel® SMI) channels with Intel® 7500 Scalable Memory Buffer to connect to system memory. 2.1.18 2010 Intel® Core™ Processor Family (2010) 2010 Intel Core processor family spans Intel Core i7, i5 and i3 processors. They are based on Intel® microarchitecture code name Westmere using 32 nm process technology. The innovative features can include: • • • • • Deliver smart performance using Intel Hyper-Threading Technology plus Intel Turbo Boost Technology. Enhanced Intel Smart Cache and integrated memory controller. Intelligent power gating. Repartitioned platform with on-die integration of 45 nm integrated graphics. Range of instruction set support up to AESNI, PCLMULQDQ, SSE4.2 and SSE4.1. 2.1.19 The Intel® Xeon® Processor 5600 Series (2010) The Intel Xeon processor 5600 series are based on Intel microarchitecture code name Westmere using 32 nm process technology. They support the same features described in Section 2.1.16, plus the following innovative features: • • • • Up to six cores per physical processor package. Up to 12 MB enhanced Intel® Smart Cache. Support for AESNI, PCLMULQDQ, SSE4.2 and SSE4.1 instruction sets. Flexible Intel Virtualization Technologies across processor and I/O. 2.1.20 The Second Generation Intel® Core™ Processor Family (2011) The Second Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Sandy Bridge microarchitecture. They are built from 32 nm process technology and have innovative features including: • • • • • Intel Turbo Boost Technology for Intel Core i5 and i7 processors Intel Hyper-Threading Technology. Enhanced Intel Smart Cache and integrated memory controller. Processor graphics and built-in visual features like Intel® Quick Sync Video, Intel® InsiderTM etc. Range of instruction set support up to AVX, AESNI, PCLMULQDQ, SSE4.2 and SSE4.1. Intel Xeon processor E3-1200 product family is also based on the Sandy Bridge microarchitecture. Intel Xeon processor E5-2400/1400 product families are based on the Sandy Bridge-EP microarchitecture. Intel Xeon processor E5-4600/2600/1600 product families are based on the Sandy Bridge-EP microarchitecture and provide support for multiple sockets. 2.1.21 The Third Generation Intel® Core™ Processor Family (2012) The Third Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Ivy Bridge microarchitecture. The Intel Xeon processor E7-8800/4800/2800 v2 product families and Intel Xeon processor E31200 v2 product family are also based on the Ivy Bridge microarchitecture. The Intel Xeon processor E5-2400/1400 v2 product families are based on the Ivy Bridge-EP microarchitecture. 2-6 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES The Intel Xeon processor E5-4600/2600/1600 v2 product families are based on the Ivy Bridge-EP microarchitecture and provide support for multiple sockets. 2.1.22 The Fourth Generation Intel® Core™ Processor Family (2013) The Fourth Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Haswell microarchitecture. Intel Xeon processor E3-1200 v3 product family is also based on the Haswell microarchitecture. 2.2 MORE ON SPECIFIC ADVANCES The following sections provide more information on major innovations. 2.2.1 P6 Family Microarchitecture The Pentium Pro processor introduced a new microarchitecture commonly referred to as P6 processor microarchitecture. The P6 processor microarchitecture was later enhanced with an on-die, Level 2 cache, called Advanced Transfer Cache. The microarchitecture is a three-way superscalar, pipelined architecture. Three-way superscalar means that by using parallel processing techniques, the processor is able on average to decode, dispatch, and complete execution of (retire) three instructions per clock cycle. To handle this level of instruction throughput, the P6 processor family uses a decoupled, 12-stage superpipeline that supports out-of-order instruction execution. Figure 2-1 shows a conceptual view of the P6 processor microarchitecture pipeline with the Advanced Transfer Cache enhancement. System Bus Frequently used Bus Unit 2nd Level Cache On-die, 8-way Less frequently used 1st Level Cache 4-way, low latency Front End Fetch/ Decode Execution Instruction Cache Microcode ROM Execution Out-of-Order Core Retirement Branch History Update BTSs/Branch Prediction OM16520 Figure 2-1. The P6 Processor Microarchitecture with Advanced Transfer Cache Enhancement To ensure a steady supply of instructions and data for the instruction execution pipeline, the P6 processor microarchitecture incorporates two cache levels. The Level 1 cache provides an 8-KByte instruction cache and an 8-KByte Vol. 1 2-7 INTEL® 64 AND IA-32 ARCHITECTURES data cache, both closely coupled to the pipeline. The Level 2 cache provides 256-KByte, 512-KByte, or 1-MByte static RAM that is coupled to the core processor through a full clock-speed 64-bit cache bus. The centerpiece of the P6 processor microarchitecture is an out-of-order execution mechanism called dynamic execution. Dynamic execution incorporates three data-processing concepts: • Deep branch prediction allows the processor to decode instructions beyond branches to keep the instruction pipeline full. The P6 processor family implements highly optimized branch prediction algorithms to predict the direction of the instruction. • Dynamic data flow analysis requires real-time analysis of the flow of data through the processor to determine dependencies and to detect opportunities for out-of-order instruction execution. The out-of-order execution core can monitor many instructions and execute these instructions in the order that best optimizes the use of the processor’s multiple execution units, while maintaining the data integrity. • Speculative execution refers to the processor’s ability to execute instructions that lie beyond a conditional branch that has not yet been resolved, and ultimately to commit the results in the order of the original instruction stream. To make speculative execution possible, the P6 processor microarchitecture decouples the dispatch and execution of instructions from the commitment of results. The processor’s out-of-order execution core uses data-flow analysis to execute all available instructions in the instruction pool and store the results in temporary registers. The retirement unit then linearly searches the instruction pool for completed instructions that no longer have data dependencies with other instructions or unresolved branch predictions. When completed instructions are found, the retirement unit commits the results of these instructions to memory and/or the IA-32 registers (the processor’s eight general-purpose registers and eight x87 FPU data registers) in the order they were originally issued and retires the instructions from the instruction pool. 2.2.2 Intel NetBurst® Microarchitecture The Intel NetBurst microarchitecture provides: • The Rapid Execution Engine — Arithmetic Logic Units (ALUs) run at twice the processor frequency — Basic integer operations can dispatch in 1/2 processor clock tick • Hyper-Pipelined Technology — Deep pipeline to enable industry-leading clock rates for desktop PCs and servers — Frequency headroom and scalability to continue leadership into the future • Advanced Dynamic Execution — Deep, out-of-order, speculative execution engine • • Up to 126 instructions in flight Up to 48 loads and 24 stores in pipeline1 — Enhanced branch prediction capability • • • • Reduces the misprediction penalty associated with deeper pipelines Advanced branch prediction algorithm 4K-entry branch target array New cache subsystem — First level caches • • • Advanced Execution Trace Cache stores decoded instructions Execution Trace Cache removes decoder latency from main execution loops Execution Trace Cache integrates path of program execution flow into a single line 1. Intel 64 and IA-32 processors based on the Intel NetBurst microarchitecture at 90 nm process can handle more than 24 stores in flight. 2-8 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES • Low latency data cache — Second level cache • • • Full-speed, unified 8-way Level 2 on-die Advance Transfer Cache Bandwidth and performance increases with processor frequency High-performance, quad-pumped bus interface to the Intel NetBurst microarchitecture system bus — Supports quad-pumped, scalable bus clock to achieve up to 4X effective speed — Capable of delivering up to 8.5 GBytes of bandwidth per second • • • Superscalar issue to enable parallelism Expanded hardware registers with renaming to avoid register name space limitations 64-byte cache line size (transfers data up to two lines per sector) Figure 2-2 is an overview of the Intel NetBurst microarchitecture. This microarchitecture pipeline is made up of three sections: (1) the front end pipeline, (2) the out-of-order execution core, and (3) the retirement unit. System Bus Frequently used paths Less frequently used paths Bus Unit 3rd Level Cache Optional 2nd Level Cache 8-Way 1st Level Cache 4-way Front End Fetch/Decode Trace Cache Microcode ROM BTBs/Branch Prediction Execution Out-Of-Order Core Retirement Branch History Update OM16521 Figure 2-2. The Intel NetBurst Microarchitecture 2.2.2.1 The Front End Pipeline The front end supplies instructions in program order to the out-of-order execution core. It performs a number of functions: • • • • • Prefetches instructions that are likely to be executed Fetches instructions that have not already been prefetched Decodes instructions into micro-operations Generates microcode for complex instructions and special-purpose code Delivers decoded instructions from the execution trace cache Vol. 1 2-9 INTEL® 64 AND IA-32 ARCHITECTURES • Predicts branches using highly advanced algorithm The pipeline is designed to address common problems in high-speed, pipelined microprocessors. Two of these problems contribute to major sources of delays: • • time to decode instructions fetched from the target wasted decode bandwidth due to branches or branch target in the middle of cache lines The operation of the pipeline’s trace cache addresses these issues. Instructions are constantly being fetched and decoded by the translation engine (part of the fetch/decode logic) and built into sequences of micro-ops called traces. At any time, multiple traces (representing prefetched branches) are being stored in the trace cache. The trace cache is searched for the instruction that follows the active branch. If the instruction also appears as the first instruction in a pre-fetched branch, the fetch and decode of instructions from the memory hierarchy ceases and the pre-fetched branch becomes the new source of instructions (see Figure 2-2). The trace cache and the translation engine have cooperating branch prediction hardware. Branch targets are predicted based on their linear addresses using branch target buffers (BTBs) and fetched as soon as possible. 2.2.2.2 Out-Of-Order Execution Core The out-of-order execution core’s ability to execute instructions out of order is a key factor in enabling parallelism. This feature enables the processor to reorder instructions so that if one micro-op is delayed, other micro-ops may proceed around it. The processor employs several buffers to smooth the flow of micro-ops. The core is designed to facilitate parallel execution. It can dispatch up to six micro-ops per cycle (this exceeds trace cache and retirement micro-op bandwidth). Most pipelines can start executing a new micro-op every cycle, so several instructions can be in flight at a time for each pipeline. A number of arithmetic logical unit (ALU) instructions can start at two per cycle; many floating-point instructions can start once every two cycles. 2.2.2.3 Retirement Unit The retirement unit receives the results of the executed micro-ops from the out-of-order execution core and processes the results so that the architectural state updates according to the original program order. When a micro-op completes and writes its result, it is retired. Up to three micro-ops may be retired per cycle. The Reorder Buffer (ROB) is the unit in the processor which buffers completed micro-ops, updates the architectural state in order, and manages the ordering of exceptions. The retirement section also keeps track of branches and sends updated branch target information to the BTB. The BTB then purges pre-fetched traces that are no longer needed. 2.2.3 Intel® Core™ Microarchitecture Intel Core microarchitecture introduces the following features that enable high performance and power-efficient performance for single-threaded as well as multi-threaded workloads: • Intel® Wide Dynamic Execution enable each processor core to fetch, dispatch, execute in high bandwidths to support retirement of up to four instructions per cycle. — Fourteen-stage efficient pipeline — Three arithmetic logical units — Four decoders to decode up to five instruction per cycle — Macro-fusion and micro-fusion to improve front-end throughput — Peak issue rate of dispatching up to six micro-ops per cycle — Peak retirement bandwidth of up to 4 micro-ops per cycle — Advanced branch prediction — Stack pointer tracker to improve efficiency of executing function/procedure entries and exits • Intel® Advanced Smart Cache delivers higher bandwidth from the second level cache to the core, and optimal performance and flexibility for single-threaded and multi-threaded applications. 2-10 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES — Large second level cache up to 4 MB and 16-way associativity — Optimized for multicore and single-threaded execution environments — 256 bit internal data path to improve bandwidth from L2 to first-level data cache • Intel® Smart Memory Access prefetches data from memory in response to data access patterns and reduces cache-miss exposure of out-of-order execution. — Hardware prefetchers to reduce effective latency of second-level cache misses — Hardware prefetchers to reduce effective latency of first-level data cache misses — Memory disambiguation to improve efficiency of speculative execution execution engine • Intel® Advanced Digital Media Boost improves most 128-bit SIMD instruction with single-cycle throughput and floating-point operations. — Single-cycle throughput of most 128-bit SIMD instructions — Up to eight floating-point operation per cycle — Three issue ports available to dispatching SIMD instructions for execution Intel Core 2 Extreme, Intel Core 2 Duo processors and Intel Xeon processor 5100 series implement two processor cores based on the Intel Core microarchitecture, the functionality of the subsystems in each core are depicted in Figure 2-3. Instruction Fetch and P reD ecode Instruction Q ueue M icrocode ROM D ecode S hared L2 C ache U p to 10.7 G B /s FS B R enam e/A lloc R etirem ent U nit (R e-O rder B uffer) S cheduler A LU B ranch M M X /S S E /FP M ove A LU FA dd M M X /S S E A LU FM ul M M X/S S E Load S tore L1D C ache and D T LB Figure 2-3. The Intel Core Microarchitecture Pipeline Functionality 2.2.3.1 The Front End The front end of Intel Core microarchitecture provides several enhancements to feed the Intel Wide Dynamic Execution engine: • Instruction fetch unit prefetches instructions into an instruction queue to maintain steady supply of instruction to the decode units. • Four-wide decode unit can decode 4 instructions per cycle or 5 instructions per cycle with Macrofusion. Vol. 1 2-11 INTEL® 64 AND IA-32 ARCHITECTURES • Macrofusion fuses common sequence of two instructions as one decoded instruction (micro-ops) to increase decoding throughput. • • • • Microfusion fuses common sequence of two micro-ops as one micro-ops to improve retirement throughput. • Instruction queue provides caching of short loops to improve efficiency. Stack pointer tracker improves efficiency of executing procedure/function entries and exits. Branch prediction unit employs dedicated hardware to handle different types of branches for improved branch prediction. Advanced branch prediction algorithm directs instruction fetch unit to fetch instructions likely in the architectural code path for decoding. 2.2.3.2 Execution Core The execution core of the Intel Core microarchitecture is superscalar and can process instructions out of order to increase the overall rate of instructions executed per cycle (IPC). The execution core employs the following feature to improve execution throughput and efficiency: • • • • • • • • Up to six micro-ops can be dispatched to execute per cycle Up to four instructions can be retired per cycle Three full arithmetic logical units SIMD instructions can be dispatched through three issue ports Most SIMD instructions have 1-cycle throughput (including 128-bit SIMD instructions) Up to eight floating-point operation per cycle Many long-latency computation operation are pipelined in hardware to increase overall throughput Reduced exposure to data access delays using Intel Smart Memory Access 2.2.4 Intel® Atom™ Microarchitecture Intel Atom microarchitecture maximizes power-efficient performance for single-threaded and multi-threaded workloads by providing: • Advanced Micro-Ops Execution — Single-micro-op instruction execution from decode to retirement, including instructions with register-only, load, and store semantics. — Sixteen-stage, in-order pipeline optimized for throughput and reduced power consumption. — Dual pipelines to enable decode, issue, execution and retirement of two instructions per cycle. — Advanced stack pointer to improve efficiency of executing function entry/returns. • Intel® Smart Cache — Second level cache is 512 KB and 8-way associativity. — Optimized for multi-threaded and single-threaded execution environments — 256 bit internal data path between L2 and L1 data cache improves high bandwidth. • Efficient Memory Access — Efficient hardware prefetchers to L1 and L2, speculatively loading data likely to be requested by processor to reduce cache miss impact. • Intel® Digital Media Boost — Two issue ports for dispatching SIMD instructions to execution units. — Single-cycle throughput for most 128-bit integer SIMD instructions — Up to six floating-point operations per cycle — Up to two 128-bit SIMD integer operations per cycle 2-12 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES — Safe Instruction Recognition (SIR) to allow long-latency floating-point operations to retire out of order with respect to integer instructions. 2.2.5 Intel® Microarchitecture Code Name Nehalem Intel microarchitecture code name Nehalem provides the foundation for many innovative features of Intel Core i7 processors. It builds on the success of 45 nm Intel Core microarchitecture and provides the following feature enhancements: • Enhanced processor core — Improved branch prediction and recovery from misprediction. — Enhanced loop streaming to improve front end performance and reduce power consumption. — Deeper buffering in out-of-order engine to extract parallelism. — Enhanced execution units to provide acceleration in CRC, string/text processing and data shuffling. • Smart Memory Access — Integrated memory controller provides low-latency access to system memory and scalable memory bandwidth — New cache hierarchy organization with shared, inclusive L3 to reduce snoop traffic — Two level TLBs and increased TLB size. — Fast unaligned memory access. • HyperThreading Technology — Provides two hardware threads (logical processors) per core. — Takes advantage of 4-wide execution engine, large L3, and massive memory bandwidth. • Dedicated Power management Innovations — Integrated microcontroller with optimized embedded firmware to manage power consumption. — Embedded real-time sensors for temperature, current, and power. — Integrated power gate to turn off/on per-core power consumption — Versatility to reduce power consumption of memory, link subsystems. 2.2.6 Intel® Microarchitecture Code Name Sandy Bridge Intel® microarchitecture code name Sandy Bridge builds on the successes of Intel® Core™ microarchitecture and Intel microarchitecture code name Nehalem. It offers the following innovative features: • Intel Advanced Vector Extensions (Intel AVX) — 256-bit floating-point instruction set extensions to the 128-bit Intel Streaming SIMD Extensions, providing up to 2X performance benefits relative to 128-bit code. — Non-destructive destination encoding offers more flexible coding techniques. — Supports flexible migration and co-existence between 256-bit AVX code, 128-bit AVX code and legacy 128bit SSE code. • Enhanced front-end and execution engine — New decoded Icache component that improves front-end bandwidth and reduces branch misprediction penalty. — Advanced branch prediction. — Additional macro-fusion support. — Larger dynamic execution window. — Multi-precision integer arithmetic enhancements (ADC/SBB, MUL/IMUL). Vol. 1 2-13 INTEL® 64 AND IA-32 ARCHITECTURES — LEA bandwidth improvement. — Reduction of general execution stalls (read ports, writeback conflicts, bypass latency, partial stalls). — Fast floating-point exception handling. — XSAVE/XRSTORE performance improvements and XSAVEOPT new instruction. • Cache hierarchy improvements for wider data path — Doubling of bandwidth enabled by two symmetric ports for memory operation. — Simultaneous handling of more in-flight loads and stores enabled by increased buffers. — Internal bandwidth of two loads and one store each cycle. — Improved prefetching. — High bandwidth low latency LLC architecture. — High bandwidth ring architecture of on-die interconnect. For additional information on Intel® Advanced Vector Extensions (AVX), see Section 5.13, “Intel® Advanced Vector Extensions (Intel® AVX)” and Chapter 14, “Programming with AVX, FMA and AVX2” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. 2.2.7 SIMD Instructions Beginning with the Pentium II and Pentium with Intel MMX technology processor families, six extensions have been introduced into the Intel 64 and IA-32 architectures to perform single-instruction multiple-data (SIMD) operations. These extensions include the MMX technology, SSE extensions, SSE2 extensions, SSE3 extensions, Supplemental Streaming SIMD Extensions 3, and SSE4. Each of these extensions provides a group of instructions that perform SIMD operations on packed integer and/or packed floating-point data elements. SIMD integer operations can use the 64-bit MMX or the 128-bit XMM registers. SIMD floating-point operations use 128-bit XMM registers. Figure 2-4 shows a summary of the various SIMD extensions (MMX technology, SSE, SSE2, SSE3, SSSE3, and SSE4), the data types they operate on, and how the data types are packed into MMX and XMM registers. The Intel MMX technology was introduced in the Pentium II and Pentium with MMX technology processor families. MMX instructions perform SIMD operations on packed byte, word, or doubleword integers located in MMX registers. These instructions are useful in applications that operate on integer arrays and streams of integer data that lend themselves to SIMD processing. SSE extensions were introduced in the Pentium III processor family. SSE instructions operate on packed singleprecision floating-point values contained in XMM registers and on packed integers contained in MMX registers. Several SSE instructions provide state management, cache control, and memory ordering operations. Other SSE instructions are targeted at applications that operate on arrays of single-precision floating-point data elements (3D geometry, 3-D rendering, and video encoding and decoding applications). SSE2 extensions were introduced in Pentium 4 and Intel Xeon processors. SSE2 instructions operate on packed double-precision floating-point values contained in XMM registers and on packed integers contained in MMX and XMM registers. SSE2 integer instructions extend IA-32 SIMD operations by adding new 128-bit SIMD integer operations and by expanding existing 64-bit SIMD integer operations to 128-bit XMM capability. SSE2 instructions also provide new cache control and memory ordering operations. SSE3 extensions were introduced with the Pentium 4 processor supporting Hyper-Threading Technology (built on 90 nm process technology). SSE3 offers 13 instructions that accelerate performance of Streaming SIMD Extensions technology, Streaming SIMD Extensions 2 technology, and x87-FP math capabilities. SSSE3 extensions were introduced with the Intel Xeon processor 5100 series and Intel Core 2 processor family. SSSE3 offer 32 instructions to accelerate processing of SIMD integer data. SSE4 extensions offer 54 instructions. 47 of them are referred to as SSE4.1 instructions. SSE4.1 are introduced with Intel Xeon processor 5400 series and Intel Core 2 Extreme processor QX9650. The other 7 SSE4 instructions are referred to as SSE4.2 instructions. 2-14 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES AESNI and PCLMULQDQ introduce 7 new instructions. Six of them are primitives for accelerating algorithms based on AES encryption/decryption standard, referred to as AESNI. The PCLMULQDQ instruction accelerates general-purpose block encryption, which can perform carry-less multiplication for two binary numbers up to 64-bit wide. Intel 64 architecture allows four generations of 128-bit SIMD extensions to access up to 16 XMM registers. IA-32 architecture provides 8 XMM registers. Intel® Advanced Vector Extensions offers comprehensive architectural enhancements over previous generations of Streaming SIMD Extensions. Intel AVX introduces the following architectural enhancements: • • Support for 256-bit wide vectors and SIMD register set. • Instruction syntax support for generalized three-operand syntax to improve instruction programming flexibility and efficient encoding of new instruction extensions. • Enhancement of legacy 128-bit SIMD instruction extensions to support three operand syntax and to simplify compiler vectorization of high-level language expressions. • Support flexible deployment of 256-bit AVX code, 128-bit AVX code, legacy 128-bit code and scalar code. 256-bit floating-point instruction set enhancement with up to 2X performance gain relative to 128-bit Streaming SIMD extensions. In addition to performance considerations, programmers should also be cognizant of the implications of VEXencoded AVX instructions with the expectations of system software components that manage the processor state components enabled by XCR0. For additional information see Section 2.3.10.1, “Vector Length Transition and Programming Considerations” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. See also: • • Section 5.4, “MMX™ Instructions,” and Chapter 9, “Programming with Intel® MMX™ Technology” • Section 5.6, “SSE2 Instructions,” and Chapter 11, “Programming with Intel® Streaming SIMD Extensions 2 (Intel® SSE2)” • Section 5.7, “SSE3 Instructions”, Section 5.8, “Supplemental Streaming SIMD Extensions 3 (SSSE3) Instructions”, Section 5.9, “SSE4 Instructions”, and Chapter 12, “Programming with Intel® SSE3, SSSE3, Intel® SSE4 and Intel® AESNI” Section 5.5, “SSE Instructions,” and Chapter 10, “Programming with Intel® Streaming SIMD Extensions (Intel® SSE)” Vol. 1 2-15 INTEL® 64 AND IA-32 ARCHITECTURES SIMD Extension Register Layout Data Type MMX Registers 8 Packed Byte Integers MMX Technology - SSSE3 4 Packed Word Integers 2 Packed Doubleword Integers Quadword SSE - AVX XMM Registers 4 Packed Single-Precision Floating-Point Values 2 Packed Double-Precision Floating-Point Values 16 Packed Byte Integers 8 Packed Word Integers 4 Packed Doubleword Integers 2 Quadword Integers Double Quadword AVX YMM Registers 8 Packed SP FP Values 4 Packed DP FP Values 2 128-bit Data Figure 2-4. SIMD Extensions, Register Layouts, and Data Types 2.2.8 Intel® Hyper-Threading Technology Intel Hyper-Threading Technology (Intel HT Technology) was developed to improve the performance of IA-32 processors when executing multi-threaded operating system and application code or single-threaded applications under multi-tasking environments. The technology enables a single physical processor to execute two or more separate code streams (threads) concurrently using shared execution resources. Intel HT Technology is one form of hardware multi-threading capability in IA-32 processor families. It differs from multi-processor capability using separate physically distinct packages with each physical processor package mated with a physical socket. Intel HT Technology provides hardware multi-threading capability with a single physical package by using shared execution resources in a processor core. Architecturally, an IA-32 processor that supports Intel HT Technology consists of two or more logical processors, each of which has its own IA-32 architectural state. Each logical processor consists of a full set of IA-32 data registers, segment registers, control registers, debug registers, and most of the MSRs. Each also has its own advanced programmable interrupt controller (APIC). Figure 2-5 shows a comparison of a processor that supports Intel HT Technology (implemented with two logical processors) and a traditional dual processor system. 2-16 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES IA-32 Processor Supporting Hyper-Threading Technology AS Traditional Multiple Processor (MP) System AS AS AS Processor Core Processor Core Processor Core IA-32 processor IA-32 processor IA-32 processor Two logical processors that share a single core Each processor is a separate physical package AS = IA-32 Architectural State OM16522 Figure 2-5. Comparison of an IA-32 Processor Supporting Hyper-Threading Technology and a Traditional Dual Processor System Unlike a traditional MP system configuration that uses two or more separate physical IA-32 processors, the logical processors in an IA-32 processor supporting Intel HT Technology share the core resources of the physical processor. This includes the execution engine and the system bus interface. After power up and initialization, each logical processor can be independently directed to execute a specified thread, interrupted, or halted. Intel HT Technology leverages the process and thread-level parallelism found in contemporary operating systems and high-performance applications by providing two or more logical processors on a single chip. This configuration allows two or more threads1 to be executed simultaneously on each a physical processor. Each logical processor executes instructions from an application thread using the resources in the processor core. The core executes these threads concurrently, using out-of-order instruction scheduling to maximize the use of execution units during each clock cycle. 2.2.8.1 Some Implementation Notes All Intel HT Technology configurations require: • • • A processor that supports Intel HT Technology A chipset and BIOS that utilize the technology Operating system optimizations See http://www.intel.com/products/ht/hyperthreading_more.htm for information. At the firmware (BIOS) level, the basic procedures to initialize the logical processors in a processor supporting Intel HT Technology are the same as those for a traditional DP or MP platform. The mechanisms that are described in the Multiprocessor Specification, Version 1.4 to power-up and initialize physical processors in an MP system also apply to logical processors in a processor that supports Intel HT Technology. An operating system designed to run on a traditional DP or MP platform may use CPUID to determine the presence of hardware multi-threading support feature and the number of logical processors they provide. Although existing operating system and application code should run correctly on a processor that supports Intel HT Technology, some code modifications are recommended to get the optimum benefit. These modifications are discussed in Chapter 7, “Multiple-Processor Management,” Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 1. In the remainder of this document, the term “thread” will be used as a general term for the terms “process” and “thread.” Vol. 1 2-17 INTEL® 64 AND IA-32 ARCHITECTURES 2.2.9 Multi-Core Technology Multi-core technology is another form of hardware multi-threading capability in IA-32 processor families. Multi-core technology enhances hardware multi-threading capability by providing two or more execution cores in a physical package. The Intel Pentium processor Extreme Edition is the first member in the IA-32 processor family to introduce multicore technology. The processor provides hardware multi-threading support with both two processor cores and Intel Hyper-Threading Technology. This means that the Intel Pentium processor Extreme Edition provides four logical processors in a physical package (two logical processors for each processor core). The Dual-Core Intel Xeon processor features multi-core, Intel Hyper-Threading Technology and supports multi-processor platforms. The Intel Pentium D processor also features multi-core technology. This processor provides hardware multithreading support with two processor cores but does not offer Intel Hyper-Threading Technology. This means that the Intel Pentium D processor provides two logical processors in a physical package, with each logical processor owning the complete execution resources of a processor core. The Intel Core 2 processor family, Intel Xeon processor 3000 series, Intel Xeon processor 5100 series, and Intel Core Duo processor offer power-efficient multi-core technology. The processor contains two cores that share a smart second level cache. The Level 2 cache enables efficient data sharing between two cores to reduce memory traffic to the system bus. Intel Core Duo Processor Intel Core 2 Duo Processor Intel Pentium dual-core Processor Pentium D Processor Architectual State Architectual State Execution Engine Execution Engine Local APIC Local APIC Architectual State Architectual State Execution Engine Execution Engine Local APIC Local APIC Bus Interface Bus Interface Second Level Cache Bus Interface System Bus System Bus Pentium Processor Extreme Edition Architectual State Architectual State Architectual State Execution Engine Local APIC Architectual State Execution Engine Local APIC Local APIC Bus Interface Local APIC Bus Interface System Bus OM19809 Figure 2-6. Intel 64 and IA-32 Processors that Support Dual-Core The Pentium® dual-core processor is based on the same technology as the Intel Core 2 Duo processor family. The Intel Xeon processor 7300, 5300 and 3200 series, Intel Core 2 Extreme Quad-Core processor, and Intel Core 2 Quad processors support Intel quad-core technology. The Quad-core Intel Xeon processors and the Quad-Core Intel Core 2 processor family are also in Figure 2-7. 2-18 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES Intel Core 2 Extreme Quad-core Processor Intel Core 2 Quad Processor Intel Xeon Processor 3200 Series Intel Xeon Processor 5300 Series Architectual State Architectual State Architectual State Architectual State Execution Engine Execution Engine Execution Engine Execution Engine Local APIC Local APIC Local APIC Local APIC Second Level Cache Second Level Cache Bus Interface Bus Interface System Bus OM19810 Figure 2-7. Intel 64 Processors that Support Quad-Core Intel Core i7 processors support Intel quad-core technology, Intel HyperThreading Technology, provides Intel QuickPath interconnect link to the chipset and have integrated memory controller supporting three channel to DDR3 memory. Intel Core i7 Processor Logical Proces sor Logical Proces sor Logical Proces sor Logical Proces sor Logical Proces sor Logical Proces sor Logical Proces sor Logical Proces sor L1 and L2 L1 and L2 L1 and L2 L1 and L2 Execution Engine Execution Engine Execution Engine Execution Engine Third Level Cache QuickPath Interconnect (QPI) Interface, Integrated Memory Controller QPI IMC DDR3 Chipset OM19810b Figure 2-8. Intel Core i7 Processor Vol. 1 2-19 INTEL® 64 AND IA-32 ARCHITECTURES 2.2.10 Intel® 64 Architecture Intel 64 architecture increases the linear address space for software to 64 bits and supports physical address space up to 52 bits. The technology also introduces a new operating mode referred to as IA-32e mode. IA-32e mode operates in one of two sub-modes: (1) compatibility mode enables a 64-bit operating system to run most legacy 32-bit software unmodified, (2) 64-bit mode enables a 64-bit operating system to run applications written to access 64-bit address space. In the 64-bit mode, applications may access: • • • • • • • 64-bit flat linear addressing 8 additional general-purpose registers (GPRs) 8 additional registers for streaming SIMD extensions (SSE, SSE2, SSE3 and SSSE3) 64-bit-wide GPRs and instruction pointers uniform byte-register addressing fast interrupt-prioritization mechanism a new instruction-pointer relative-addressing mode An Intel 64 architecture processor supports existing IA-32 software because it is able to run all non-64-bit legacy modes supported by IA-32 architecture. Most existing IA-32 applications also run in compatibility mode. 2.2.11 Intel® Virtualization Technology (Intel® VT) Intel® Virtualization Technology for Intel 64 and IA-32 architectures provide extensions that support virtualization. The extensions are referred to as Virtual Machine Extensions (VMX). An Intel 64 or IA-32 platform with VMX can function as multiple virtual systems (or virtual machines). Each virtual machine can run operating systems and applications in separate partitions. VMX also provides programming interface for a new layer of system software (called the Virtual Machine Monitor (VMM)) used to manage the operation of virtual machines. Information on VMX and on the programming of VMMs is in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. Intel Core i7 processor provides the following enhancements to Intel Virtualization Technology: • • • Virtual processor ID (VPID) to reduce the cost of VMM managing transitions. Extended page table (EPT) to reduce the number of transitions for VMM to manage memory virtualization. Reduced latency of VM transitions. 2.3 INTEL® 64 AND IA-32 PROCESSOR GENERATIONS In the mid-1960s, Intel cofounder and Chairman Emeritus Gordon Moore had this observation: “... the number of transistors that would be incorporated on a silicon die would double every 18 months for the next several years.” Over the past three and half decades, this prediction known as “Moore's Law” has continued to hold true. The computing power and the complexity (or roughly, the number of transistors per processor) of Intel architecture processors has grown in close relation to Moore's law. By taking advantage of new process technology and new microarchitecture designs, each new generation of IA-32 processors has demonstrated frequency-scaling headroom and new performance levels over the previous generation processors. 2-20 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES The key features of the Intel Pentium 4 processor, Intel Xeon processor, Intel Xeon processor MP, Pentium III processor, and Pentium III Xeon processor with advanced transfer cache are shown in Table 2-1. Older generation IA-32 processors, which do not employ on-die Level 2 cache, are shown in Table 2-2. Table 2-1. Key Features of Most Recent IA-32 Processors Intel Processor Date Introduced Micro-architecture Top-Bin Clock Fre-quency at Introduction Transistors Register Sizes1 System Max. Bus Band- Extern. width Addr. Space On-Die Caches2 Intel Pentium M Processor 7553 2004 Intel Pentium M Processor 2.00 GHz 140 M GP: 32 FPU: 80 MMX: 64 XMM: 128 3.2 GB/s 4 GB L1: 64 KB L2: 2 MB Intel Core Duo Processor T26003 2006 Improved Intel Pentium M Processor Microarchitecture; Dual Core; 2.16 GHz 152M GP: 32 FPU: 80 MMX: 64 XMM: 128 5.3 GB/s 4 GB L1: 64 KB L2: 2 MB (2MB Total) 1.86 GHz 800 MHz 47M GP: 32 FPU: 80 MMX: 64 XMM: 128 Up to 4.2 GB/s 4 GB L1: 56 KB4 L2: 512KB Intel Smart Cache, Advanced Thermal Manager Intel Atom Processor Z5xx series 2008 Intel Atom Microarchitecture; Intel Virtualization Technology. NOTES: 1. The register size and external data bus size are given in bits. 2. First level cache is denoted using the abbreviation L1, 2nd level cache is denoted as L2. The size of L1 includes the first-level data cache and the instruction cache where applicable, but does not include the trace cache. 3. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details. 4. In Intel Atom Processor, the size of L1 instruction cache is 32 KBytes, L1 data cache is 24 KBytes. Table 2-2. Key Features of Most Recent Intel 64 Processors Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Transistors Register Sizes System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches 64-bit Intel Xeon Processor with 800 MHz System Bus 2004 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture 3.60 GHz 125 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 6.4 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 1 MB L2 64-bit Intel Xeon Processor MP with 8MB L3 2005 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture 3.33 GHz 675M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 5.3 GB/s 1 1024 GB (1 TB) 12K µop Execution Trace Cache; 16 KB L1; 1 MB L2, 8 MB L3 Vol. 1 2-21 INTEL® 64 AND IA-32 ARCHITECTURES Table 2-2. Key Features of Most Recent Intel 64 Processors (Contd.) Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Transistors Register Sizes System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches Intel Pentium 4 Processor Extreme Edition Supporting HyperThreading Technology 2005 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture 3.73 GHz 164 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 2 MB L2 Intel Pentium Processor Extreme Edition 840 2005 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture; 3.20 GHz 230 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 6.4 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 1MB L2 (2MB Total) 3.00 GHz 321M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 6.4 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 2MB L2 (4MB Total) 3.80 GHz 164 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 6.4 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 2MB L2 3.46 GHz 376M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 64 GB 12K µop Execution Trace Cache; 16 KB L1; 2MB L2 Dual-core 2 Dual-Core Intel Xeon Processor 7041 2005 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture; Dual-core 3 Intel Pentium 2005 4 Processor 672 Intel NetBurst Microarchitecture; Intel HyperThreading Technology; Intel 64 Architecture; Intel Virtualization Technology. Intel Pentium Processor Extreme Edition 955 2006 Intel NetBurst Microarchitecture; Intel 64 Architecture; Dual Core; Intel Virtualization Technology. Intel Core 2 Extreme Processor X6800 2006 Intel Core Microarchitecture; Dual Core; Intel 64 Architecture; Intel Virtualization Technology. 2-22 Vol. 1 (4MB Total) 2.93 GHz 291M GP: 32,64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 64 GB L1: 64 KB L2: 4MB (4MB Total) INTEL® 64 AND IA-32 ARCHITECTURES Table 2-2. Key Features of Most Recent Intel 64 Processors (Contd.) Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Transistors Register Sizes System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches Intel Xeon Processor 5160 2006 Intel Core Microarchitecture; Dual Core; 3.00 GHz 291M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 10.6 GB/s 64 GB L1: 64 KB L2: 4MB (4MB Total) 3.40 GHz 1.3 B GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 12.8 GB/s 64 GB L1: 64 KB L2: 1MB (2MB Total) Intel 64 Architecture; Intel Virtualization Technology. Intel Xeon Processor 7140 2006 Intel NetBurst Microarchitecture; Dual Core; Intel 64 Architecture; L3: 16 MB (16MB Total) Intel Virtualization Technology. Intel Core 2 Extreme Processor QX6700 2006 Intel Core Microarchitecture; Quad Core; 2.66 GHz 582M GP: 32,64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 64 GB L1: 64 KB L2: 4MB (4MB Total) 2.66 GHz 582 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 10.6 GB/s 256 GB L1: 64 KB L2: 4MB (8 MB Total) 3.00 GHz 291 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 10.6 GB/s 64 GB L1: 64 KB L2: 4MB (4MB Total) 2.93 GHz 582 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 1024 GB L1: 64 KB L2: 4MB (8MB Total) Intel 64 Architecture; Intel Virtualization Technology. Quad-core Intel Xeon Processor 5355 2006 Intel Core Microarchitecture; Quad Core; Intel 64 Architecture; Intel Virtualization Technology. Intel Core 2 Duo Processor E6850 2007 Intel Core Microarchitecture; Dual Core; Intel 64 Architecture; Intel Virtualization Technology; Intel Trusted Execution Technology Intel Xeon Processor 7350 2007 Intel Core Microarchitecture; Quad Core; Intel 64 Architecture; Intel Virtualization Technology. Vol. 1 2-23 INTEL® 64 AND IA-32 ARCHITECTURES Table 2-2. Key Features of Most Recent Intel 64 Processors (Contd.) Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Transistors Register Sizes System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches Intel Xeon Processor 5472 2007 Enhanced Intel Core Microarchitecture; Quad Core; 3.00 GHz 820 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 12.8 GB/s 256 GB L1: 64 KB L2: 6MB (12MB Total) 2.0 - 1.60 GHz 47 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 Up to 4.2 GB/s Up to 64GB L1: 56 KB4 L2: 512KB 2.67 GHz 1.9 B GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 8.5 GB/s 1024 GB L1: 64 KB L2: 3MB (9MB Total) Intel 64 Architecture; Intel Virtualization Technology. Intel Atom Processor 2008 Intel Atom Microarchitecture; Intel 64 Architecture; Intel Virtualization Technology. Intel Xeon Processor 7460 2008 Enhanced Intel Core Microarchitecture; Six Cores; L3: 16MB Intel 64 Architecture; Intel Virtualization Technology. Intel Atom 2008 Processor 330 Intel Atom Microarchitecture; Intel 64 Architecture; 1.60 GHz 94 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 Up to 4.2 GB/s Up to 64GB L1: 56 KB5 L2: 512KB (1MB Total) 3.20 GHz 731 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 QPI: 6.4 GT/s; Memory: 25 GB/s 64 GB L1: 64 KB L2: 256KB Dual core; Intel Virtualization Technology. Intel Core i7965 Processor Extreme Edition 2008 Intel microarchitecture code name Nehalem; Quadcore; HyperThreading Technology; Intel QPI; Intel 64 Architecture; Intel Virtualization Technology. 2-24 Vol. 1 L3: 8MB INTEL® 64 AND IA-32 ARCHITECTURES Table 2-2. Key Features of Most Recent Intel 64 Processors (Contd.) Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Transistors Register Sizes Intel Core i7620M Processor 2010 Intel Turbo Boost Technology, Intel microarchitecture code name Westmere; Dualcore; HyperThreading Technology; Intel 64 Architecture; 2.66 GHz 383 M GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches 64 GB L1: 64 KB L2: 256KB L3: 4MB Intel Virtualization Technology., Integrated graphics Intel XeonProcessor 5680 2010 Intel Turbo Boost Technology, Intel microarchitecture code name Westmere; Six core; HyperThreading Technology; Intel 64 Architecture; 3.33 GHz 1.1B GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 QPI: 6.4 GT/s; 32 GB/s 1 TB GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 QPI: 6.4 GT/s; Memory: 76 GB/s 16 TB GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 DMI: 5 GT/s; Memory: 21 GB/s 64 GB L1: 64 KB L2: 256KB L3: 12MB Intel Virtualization Technology. Intel XeonProcessor 7560 2010 Intel Turbo Boost Technology, Intel microarchitecture code name Nehalem; Eight core; HyperThreading Technology; Intel 64 Architecture; 2.26 GHz 2.3B L1: 64 KB L2: 256KB L3: 24MB Intel Virtualization Technology. Intel Core i72600K Processor 2011 Intel Turbo Boost Technology, Intel microarchitecture code name Sandy Bridge; Four core; HyperThreading Technology; Intel 64 Architecture; 3.40 GHz 995M L1: 64 KB L2: 256KB L3: 8MB YMM: 256 Intel Virtualization Technology., Processor graphics, Quicksync Video Vol. 1 2-25 INTEL® 64 AND IA-32 ARCHITECTURES Table 2-2. Key Features of Most Recent Intel 64 Processors (Contd.) Intel Processor Date Introduced Micro-architecture Highest Processor Base Frequency at Introduction Intel XeonProcessor E31280 2011 Intel Turbo Boost Technology, Intel microarchitecture code name Sandy Bridge; Four core; HyperThreading Technology; Intel 64 Architecture; 3.50 GHz Transistors Register Sizes System Bus/QPI Link Speed Max. Extern. Addr. Space On-Die Caches GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 DMI: 5 GT/s; Memory: 21 GB/s 1 TB L1: 64 KB L2: 256KB QPI: 6.4 GT/s; Memory: 102 GB/s 16 TB L3: 8MB YMM: 256 Intel Virtualization Technology. Intel XeonProcessor E78870 2011 Intel Turbo Boost Technology, Intel microarchitecture code name Westmere; Ten core; HyperThreading Technology; Intel 64 Architecture; 2.40 GHz 2.2B GP: 32, 64 FPU: 80 MMX: 64 XMM: 128 L1: 64 KB L2: 256KB L3: 30MB Intel Virtualization Technology. NOTES: 1. The 64-bit Intel Xeon Processor MP with an 8-MByte L3 supports a multi-processor platform with a dual system bus; this creates a platform bandwidth with 10.6 GBytes. 2. In Intel Pentium Processor Extreme Edition 840, the size of on-die cache is listed for each core. The total size of L2 in the physical package in 2 MBytes. 3. In Dual-Core Intel Xeon Processor 7041, the size of on-die cache is listed for each core. The total size of L2 in the physical package in 4 MBytes. 4. In Intel Atom Processor, the size of L1 instruction cache is 32 KBytes, L1 data cache is 24 KBytes. 5. In Intel Atom Processor, the size of L1 instruction cache is 32 KBytes, L1 data cache is 24 KBytes. 2-26 Vol. 1 INTEL® 64 AND IA-32 ARCHITECTURES Table 2-3. Key Features of Previous Generations of IA-32 Processors Intel Processor Date Introduced Max. Clock Frequency/ Technology at Introduction Transistors Register Sizes1 Ext. Data Bus Size2 Max. Extern. Addr. Space Caches 8086 1978 8 MHz 29 K 16 GP 16 1 MB None Intel 286 1982 12.5 MHz 134 K 16 GP 16 16 MB Note 3 Intel386 DX Processor 1985 20 MHz 275 K 32 GP 32 4 GB Note 3 Intel486 DX Processor 1989 25 MHz 1.2 M 32 GP 80 FPU 32 4 GB L1: 8 KB Pentium Processor 1993 60 MHz 3.1 M 32 GP 80 FPU 64 4 GB L1:16 KB Pentium Pro Processor 1995 200 MHz 5.5 M 32 GP 80 FPU 64 64 GB L1: 16 KB L2: 256 KB or 512 KB Pentium II Processor 1997 266 MHz 7M 32 GP 80 FPU 64 MMX 64 64 GB L1: 32 KB L2: 256 KB or 512 KB Pentium III Processor 1999 500 MHz 8.2 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB L1: 32 KB L2: 512 KB Pentium III and Pentium III Xeon Processors 1999 700 MHz 28 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB L1: 32 KB L2: 256 KB Pentium 4 Processor 2000 1.50 GHz, Intel NetBurst Microarchitecture 42 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB 12K µop Execution Trace Cache; L1: 8KB L2: 256 KB Intel Xeon Processor 2001 1.70 GHz, Intel NetBurst Microarchitecture 42 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB 12K µop Execution Trace Cache; L1: 8KB L2: 512KB Intel Xeon Processor 2002 2.20 GHz, Intel NetBurst Microarchitecture, HyperThreading Technology 55 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB 12K µop Execution Trace Cache; L1: 8KB L2: 512KB Pentium M Processor 2003 1.60 GHz, Intel NetBurst Microarchitecture 77 M 32 GP 80 FPU 64 MMX 128 XMM 64 4 GB L1: 64KB L2: 1 MB Vol. 1 2-27 INTEL® 64 AND IA-32 ARCHITECTURES Table 2-3. Key Features of Previous Generations of IA-32 Processors (Contd.) Intel Pentium 4 Processor Supporting HyperThreading Technology at 90 nm process 2004 3.40 GHz, Intel NetBurst Microarchitecture, HyperThreading Technology 125 M 32 GP 80 FPU 64 MMX 128 XMM 64 64 GB 12K µop Execution Trace Cache; L1: 16KB L2: 1 MB NOTE: 1. The register size and external data bus size are given in bits. Note also that each 32-bit general-purpose (GP) registers can be addressed as an 8- or a 16-bit data registers in all of the processors. 2. Internal data paths are 2 to 4 times wider than the external data bus for each processor. 2.4 PROPOSED REMOVAL OF INTEL INSTRUCTION SET ARCHITECTURE AND FEATURES FROM UPCOMING PRODUCTS This section lists Intel Instruction Set Architecture (ISA) and features that Intel plans to remove from select products starting from a specific year. Table 2-4. Proposed Intel ISA and Features Removal List Intel ISA/Feature Year of Removal Intel® Memory Protection Extensions (Intel® MPX) 2019 onwards TEST_CTRL MSR, bit 31 (MSR address 33H) 2019 onwards 2.5 INTEL INSTRUCTION SET ARCHITECTURE AND FEATURES REMOVED This section lists Intel ISA and features that Intel has already removed for select upcoming products. All sections relevant to the removed features will be identified as such and may be moved to an archived section in future Intel® 64 and IA-32 Architectures Software Developer's Manual releases. Table 2-5. Intel ISA and Features Removal List Intel ISA/Feature Year of Removal None removed NA 2-28 Vol. 1 3. Updates to Chapter 3, Volume 1 Change bars and green text show changes to Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture. -----------------------------------------------------------------------------------------Changes to this chapter: Typo correction on Intel 64 architecture physical address space in Section 3.2.1 “64-Bit Mode Execution Environment”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 13 CHAPTER 3 BASIC EXECUTION ENVIRONMENT This chapter describes the basic execution environment of an Intel 64 or IA-32 processor as seen by assemblylanguage programmers. It describes how the processor executes instructions and how it stores and manipulates data. The execution environment described here includes memory (the address space), general-purpose data registers, segment registers, the flag register, and the instruction pointer register. 3.1 MODES OF OPERATION The IA-32 architecture supports three basic operating modes: protected mode, real-address mode, and system management mode. The operating mode determines which instructions and architectural features are accessible: • Protected mode — This mode is the native state of the processor. Among the capabilities of protected mode is the ability to directly execute “real-address mode” 8086 software in a protected, multi-tasking environment. This feature is called virtual-8086 mode, although it is not actually a processor mode. Virtual-8086 mode is actually a protected mode attribute that can be enabled for any task. • Real-address mode — This mode implements the programming environment of the Intel 8086 processor with extensions (such as the ability to switch to protected or system management mode). The processor is placed in real-address mode following power-up or a reset. • System management mode (SMM) — This mode provides an operating system or executive with a transparent mechanism for implementing platform-specific functions such as power management and system security. The processor enters SMM when the external SMM interrupt pin (SMI#) is activated or an SMI is received from the advanced programmable interrupt controller (APIC). In SMM, the processor switches to a separate address space while saving the basic context of the currently running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the system management interrupt. SMM was introduced with the Intel386™ SL and Intel486™ SL processors and became a standard IA-32 feature with the Pentium processor family. 3.1.1 Intel® 64 Architecture Intel 64 architecture adds IA-32e mode. IA-32e mode has two sub-modes. These are: • Compatibility mode (sub-mode of IA-32e mode) — Compatibility mode permits most legacy 16-bit and 32-bit applications to run without re-compilation under a 64-bit operating system. For brevity, the compatibility sub-mode is referred to as compatibility mode in IA-32 architecture. The execution environment of compatibility mode is the same as described in Section 3.2. Compatibility mode also supports all of the privilege levels that are supported in 64-bit and protected modes. Legacy applications that run in Virtual 8086 mode or use hardware task management will not work in this mode. Compatibility mode is enabled by the operating system (OS) on a code segment basis. This means that a single 64-bit OS can support 64-bit applications running in 64-bit mode and support legacy 32-bit applications (not recompiled for 64-bits) running in compatibility mode. Compatibility mode is similar to 32-bit protected mode. Applications access only the first 4 GByte of linearaddress space. Compatibility mode uses 16-bit and 32-bit address and operand sizes. Like protected mode, this mode allows applications to access physical memory greater than 4 GByte using PAE (Physical Address Extensions). • 64-bit mode (sub-mode of IA-32e mode) — This mode enables a 64-bit operating system to run applications written to access 64-bit linear address space. For brevity, the 64-bit sub-mode is referred to as 64-bit mode in IA-32 architecture. Vol. 1 3-1 BASIC EXECUTION ENVIRONMENT 64-bit mode extends the number of general purpose registers and SIMD extension registers from 8 to 16. General purpose registers are widened to 64 bits. The mode also introduces a new opcode prefix (REX) to access the register extensions. See Section 3.2.1 for a detailed description. 64-bit mode is enabled by the operating system on a code-segment basis. Its default address size is 64 bits and its default operand size is 32 bits. The default operand size can be overridden on an instruction-by-instruction basis using a REX opcode prefix in conjunction with an operand size override prefix. REX prefixes allow a 64-bit operand to be specified when operating in 64-bit mode. By using this mechanism, many existing instructions have been promoted to allow the use of 64-bit registers and 64-bit addresses. 3.2 OVERVIEW OF THE BASIC EXECUTION ENVIRONMENT Any program or task running on an IA-32 processor is given a set of resources for executing instructions and for storing code, data, and state information. These resources (described briefly in the following paragraphs and shown in Figure 3-1) make up the basic execution environment for an IA-32 processor. An Intel 64 processor supports the basic execution environment of an IA-32 processor, and a similar environment under IA-32e mode that can execute 64-bit programs (64-bit sub-mode) and 32-bit programs (compatibility submode). The basic execution environment is used jointly by the application programs and the operating system or executive running on the processor. • Address space — Any task or program running on an IA-32 processor can address a linear address space of up to 4 GBytes (232 bytes) and a physical address space of up to 64 GBytes (236 bytes). See Section 3.3.6, “Extended Physical Addressing in Protected Mode,” for more information about addressing an address space greater than 4 GBytes. • Basic program execution registers — The eight general-purpose registers, the six segment registers, the EFLAGS register, and the EIP (instruction pointer) register comprise a basic execution environment in which to execute a set of general-purpose instructions. These instructions perform basic integer arithmetic on byte, word, and doubleword integers, handle program flow control, operate on bit and byte strings, and address memory. See Section 3.4, “Basic Program Execution Registers,” for more information about these registers. • x87 FPU registers — The eight x87 FPU data registers, the x87 FPU control register, the status register, the x87 FPU instruction pointer register, the x87 FPU operand (data) pointer register, the x87 FPU tag register, and the x87 FPU opcode register provide an execution environment for operating on single-precision, doubleprecision, and double extended-precision floating-point values, word integers, doubleword integers, quadword integers, and binary coded decimal (BCD) values. See Section 8.1, “x87 FPU Execution Environment,” for more information about these registers. • MMX registers — The eight MMX registers support execution of single-instruction, multiple-data (SIMD) operations on 64-bit packed byte, word, and doubleword integers. See Section 9.2, “The MMX Technology Programming Environment,” for more information about these registers. • XMM registers — The eight XMM data registers and the MXCSR register support execution of SIMD operations on 128-bit packed single-precision and double-precision floating-point values and on 128-bit packed byte, word, doubleword, and quadword integers. See Section 10.2, “SSE Programming Environment,” for more information about these registers. • YMM registers — The YMM data registers support execution of 256-bit SIMD operations on 256-bit packed single-precision and double-precision floating-point values and on 256-bit packed byte, word, doubleword, and quadword integers. • Bounds registers — Each of the BND0-BND3 register stores the lower and upper bounds (64 bits each) associated with the pointer to a memory buffer. They support execution of the Intel MPX instructions. • BNDCFGU and BNDSTATUS— BNDCFGU configures user mode MPX operations on bounds checking. BNDSTATUS provides additional information on the #BR caused by an MPX operation. 3-2 Vol. 1 BASIC EXECUTION ENVIRONMENT Basic Program Execution Registers Address Space* 2^32 -1 Eight 32-bit Registers General-Purpose Registers Six 16-bit Registers Segment Registers 32-bits EFLAGS Register 32-bits EIP (Instruction Pointer Register) FPU Registers Eight 80-bit Registers Floating-Point Data Registers 16 bits Control Register 16 bits Status Register 16 bits Tag Register 0 *The address space can be flat or segmented. Using the physical address extension mechanism, a physical address space of 2^36 - 1 can be addressed. Opcode Register (11-bits) 48 bits FPU Instruction Pointer Register 48 bits FPU Data (Operand) Pointer Register Bounds Registers MMX Registers Eight 64-bit Registers MMX Registers Four 128-bit Registers BNDCFGU BNDSTATUS XMM Registers Eight 128-bit Registers XMM Registers 32-bits MXCSR Register YMM Registers Eight 256-bit Registers YMM Registers Figure 3-1. IA-32 Basic Execution Environment for Non-64-bit Modes Vol. 1 3-3 BASIC EXECUTION ENVIRONMENT • Stack — To support procedure or subroutine calls and the passing of parameters between procedures or subroutines, a stack and stack management resources are included in the execution environment. The stack (not shown in Figure 3-1) is located in memory. See Section 6.2, “Stacks,” for more information about stack structure. In addition to the resources provided in the basic execution environment, the IA-32 architecture provides the following resources as part of its system-level architecture. They provide extensive support for operating-system and system-development software. Except for the I/O ports, the system resources are described in detail in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A & 3B. • I/O ports — The IA-32 architecture supports a transfers of data to and from input/output (I/O) ports. See Chapter 18, “Input/Output,” in this volume. • Control registers — The five control registers (CR0 through CR4) determine the operating mode of the processor and the characteristics of the currently executing task. See Chapter 2, “System Architecture Overview,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. • Memory management registers — The GDTR, IDTR, task register, and LDTR specify the locations of data structures used in protected mode memory management. See Chapter 2, “System Architecture Overview,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. • Debug registers — The debug registers (DR0 through DR7) control and allow monitoring of the processor’s debugging operations. See in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. • Memory type range registers (MTRRs) — The MTRRs are used to assign memory types to regions of memory. See the sections on MTRRs in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A & 3B. • Machine specific registers (MSRs) — The processor provides a variety of machine specific registers that are used to control and report on processor performance. Virtually all MSRs handle system related functions and are not accessible to an application program. One exception to this rule is the time-stamp counter. The MSRs are described in Chapter 2, “Model-Specific Registers (MSRs)” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. • Machine check registers — The machine check registers consist of a set of control, status, and errorreporting MSRs that are used to detect and report on hardware (machine) errors. See Chapter 15, “MachineCheck Architecture,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. • Performance monitoring counters — The performance monitoring counters allow processor performance events to be monitored. See Chapter 18, “Performance Monitoring,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. The remainder of this chapter describes the organization of memory and the address space, the basic program execution registers, and addressing modes. Refer to the following chapters in this volume for descriptions of the other program execution resources shown in Figure 3-1: • • • x87 FPU registers — See Chapter 8, “Programming with the x87 FPU.” • • YMM registers — See Chapter 14, “Programming with AVX, FMA and AVX2”. • Stack implementation and procedure calls — See Chapter 6, “Procedure Calls, Interrupts, and Exceptions.” MMX Registers — See Chapter 9, “Programming with Intel® MMX™ Technology.” XMM registers — See Chapter 10, “Programming with Intel® Streaming SIMD Extensions (Intel® SSE),” Chapter 11, “Programming with Intel® Streaming SIMD Extensions 2 (Intel® SSE2),” and Chapter 12, “Programming with Intel® SSE3, SSSE3, Intel® SSE4 and Intel® AESNI.” BND registers, BNDCFGU, BNDSTATUS — See Chapter 13, “Managing State Using the XSAVE Feature Set,” and Chapter 17, “Intel® MPX”. 3-4 Vol. 1 BASIC EXECUTION ENVIRONMENT 3.2.1 64-Bit Mode Execution Environment The execution environment for 64-bit mode is similar to that described in Section 3.2. The following paragraphs describe the differences that apply. • Address space — A task or program running in 64-bit mode on an IA-32 processor can address linear address space of up to 264 bytes (subject to the canonical addressing requirement described in Section 3.3.7.1) and physical address space of up to 252 bytes. Software can query CPUID for the physical address size supported by a processor. • Basic program execution registers — The number of general-purpose registers (GPRs) available is 16. GPRs are 64-bits wide and they support operations on byte, word, doubleword and quadword integers. Accessing byte registers is done uniformly to the lowest 8 bits. The instruction pointer register becomes 64 bits. The EFLAGS register is extended to 64 bits wide, and is referred to as the RFLAGS register. The upper 32 bits of RFLAGS is reserved. The lower 32 bits of RFLAGS is the same as EFLAGS. See Figure 3-2. • XMM registers — There are 16 XMM data registers for SIMD operations. See Section 10.2, “SSE Programming Environment,” for more information about these registers. • YMM registers — There are 16 YMM data registers for SIMD operations. See Chapter 14, “Programming with AVX, FMA and AVX2” for more information about these registers. • BND registers, BNDCFGU, BNDSTATUS — See Chapter 13, “Managing State Using the XSAVE Feature Set,” and Chapter 17, “Intel® MPX”. • Stack — The stack pointer size is 64 bits. Stack size is not controlled by a bit in the SS descriptor (as it is in non-64-bit modes) nor can the pointer size be overridden by an instruction prefix. • Control registers — Control registers expand to 64 bits. A new control register (the task priority register: CR8 or TPR) has been added. See Chapter 2, “Intel® 64 and IA-32 Architectures,” in this volume. • Debug registers — Debug registers expand to 64 bits. See Chapter 17, “Debug, Branch Profile, TSC, and Quality of Service,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Vol. 1 3-5 BASIC EXECUTION ENVIRONMENT • Descriptor table registers — The global descriptor table register (GDTR) and interrupt descriptor table register (IDTR) expand to 10 bytes so that they can hold a full 64-bit base address. The local descriptor table register (LDTR) and the task register (TR) also expand to hold a full 64-bit base address. Basic Program Execution Registers Sixteen 64-bit Registers Six 16-bit Registers Address Space 2^64 -1 General-Purpose Registers Segment Registers 64-bits RFLAGS Register 64-bits RIP (Instruction Pointer Register) FPU Registers Eight 80-bit Registers Floating-Point Data Registers 16 bits Control Register 16 bits Status Register 16 bits Tag Register 0 Opcode Register (11-bits) 64 bits FPU Instruction Pointer Register 64 bits FPU Data (Operand) Pointer Register Bounds Registers MMX Registers Eight 64-bit Registers MMX Registers Four 128-bit Registers BNDCFGU BNDSTATUS XMM Registers Sixteen 128-bit Registers XMM Registers 32-bits MXCSR Register YMM Registers Sixteen 256-bit Registers YMM Registers Figure 3-2. 64-Bit Mode Execution Environment 3.3 MEMORY ORGANIZATION The memory that the processor addresses on its bus is called physical memory. Physical memory is organized as a sequence of 8-bit bytes. Each byte is assigned a unique address, called a physical address. The physical address space ranges from zero to a maximum of 236 − 1 (64 GBytes) if the processor does not support Intel 64 architecture. Intel 64 architecture introduces a changes in physical and linear address space; these are described in Section 3.3.3, Section 3.3.4, and Section 3.3.7. 3-6 Vol. 1 BASIC EXECUTION ENVIRONMENT Virtually any operating system or executive designed to work with an IA-32 or Intel 64 processor will use the processor’s memory management facilities to access memory. These facilities provide features such as segmentation and paging, which allow memory to be managed efficiently and reliably. Memory management is described in detail in Chapter 3, “Protected-Mode Memory Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. The following paragraphs describe the basic methods of addressing memory when memory management is used. 3.3.1 IA-32 Memory Models When employing the processor’s memory management facilities, programs do not directly address physical memory. Instead, they access memory using one of three memory models: flat, segmented, or real address mode: • Flat memory model — Memory appears to a program as a single, continuous address space (Figure 3-3). This space is called a linear address space. Code, data, and stacks are all contained in this address space. Linear address space is byte addressable, with addresses running contiguously from 0 to 232 - 1 (if not in 64-bit mode). An address for any byte in linear address space is called a linear address. • Segmented memory model — Memory appears to a program as a group of independent address spaces called segments. Code, data, and stacks are typically contained in separate segments. To address a byte in a segment, a program issues a logical address. This consists of a segment selector and an offset (logical addresses are often referred to as far pointers). The segment selector identifies the segment to be accessed and the offset identifies a byte in the address space of the segment. Programs running on an IA-32 processor can address up to 16,383 segments of different sizes and types, and each segment can be as large as 232 bytes. Internally, all the segments that are defined for a system are mapped into the processor’s linear address space. To access a memory location, the processor thus translates each logical address into a linear address. This translation is transparent to the application program. The primary reason for using segmented memory is to increase the reliability of programs and systems. For example, placing a program’s stack in a separate segment prevents the stack from growing into the code or data space and overwriting instructions or data, respectively. • Real-address mode memory model — This is the memory model for the Intel 8086 processor. It is supported to provide compatibility with existing programs written to run on the Intel 8086 processor. The realaddress mode uses a specific implementation of segmented memory in which the linear address space for the program and the operating system/executive consists of an array of segments of up to 64 KBytes in size each. The maximum size of the linear address space in real-address mode is 220 bytes. See also: Chapter 20, “8086 Emulation,” Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. Vol. 1 3-7 BASIC EXECUTION ENVIRONMENT Flat Model Linear Address Linear Address Space* Segmented Model Segments Linear Address Space* Offset (effective address) Logical Address Segment Selector Real-Address Mode Model Offset Logical Address Segment Selector Linear Address Space Divided Into Equal Sized Segments * The linear address space can be paged when using the flat or segmented model. Figure 3-3. Three Memory Management Models 3.3.2 Paging and Virtual Memory With the flat or the segmented memory model, linear address space is mapped into the processor’s physical address space either directly or through paging. When using direct mapping (paging disabled), each linear address has a one-to-one correspondence with a physical address. Linear addresses are sent out on the processor’s address lines without translation. When using the IA-32 architecture’s paging mechanism (paging enabled), linear address space is divided into pages which are mapped to virtual memory. The pages of virtual memory are then mapped as needed into physical memory. When an operating system or executive uses paging, the paging mechanism is transparent to an application program. All that the application sees is linear address space. In addition, IA-32 architecture’s paging mechanism includes extensions that support: • • Physical Address Extensions (PAE) to address physical address space greater than 4 GBytes. Page Size Extensions (PSE) to map linear address to physical address in 4-MBytes pages. See also: Chapter 3, “Protected-Mode Memory Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 3.3.3 Memory Organization in 64-Bit Mode Intel 64 architecture supports physical address space greater than 64 GBytes; the actual physical address size of IA-32 processors is implementation specific. In 64-bit mode, there is architectural support for 64-bit linear address space. However, processors supporting Intel 64 architecture may implement less than 64-bits (see Section 3.3.7.1). The linear address space is mapped into the processor physical address space through the PAE paging mechanism. 3-8 Vol. 1 BASIC EXECUTION ENVIRONMENT 3.3.4 Modes of Operation vs. Memory Model When writing code for an IA-32 or Intel 64 processor, a programmer needs to know the operating mode the processor is going to be in when executing the code and the memory model being used. The relationship between operating modes and memory models is as follows: • Protected mode — When in protected mode, the processor can use any of the memory models described in this section. (The real-addressing mode memory model is ordinarily used only when the processor is in the virtual-8086 mode.) The memory model used depends on the design of the operating system or executive. When multitasking is implemented, individual tasks can use different memory models. • Real-address mode — When in real-address mode, the processor only supports the real-address mode memory model. • System management mode — When in SMM, the processor switches to a separate address space, called the system management RAM (SMRAM). The memory model used to address bytes in this address space is similar to the real-address mode model. See Chapter 34, “System Management Mode,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C, for more information on the memory model used in SMM. • Compatibility mode — Software that needs to run in compatibility mode should observe the same memory model as those targeted to run in 32-bit protected mode. The effect of segmentation is the same as it is in 32bit protected mode semantics. • 64-bit mode — Segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. Specifically, the processor treats the segment base of CS, DS, ES, and SS as zero in 64-bit mode (this makes a linear address equal an effective address). Segmented and real address modes are not available in 64bit mode. 3.3.5 32-Bit and 16-Bit Address and Operand Sizes IA-32 processors in protected mode can be configured for 32-bit or 16-bit address and operand sizes. With 32-bit address and operand sizes, the maximum linear address or segment offset is FFFFFFFFH (232-1); operand sizes are typically 8 bits or 32 bits. With 16-bit address and operand sizes, the maximum linear address or segment offset is FFFFH (216-1); operand sizes are typically 8 bits or 16 bits. When using 32-bit addressing, a logical address (or far pointer) consists of a 16-bit segment selector and a 32-bit offset; when using 16-bit addressing, an address consists of a 16-bit segment selector and a 16-bit offset. Instruction prefixes allow temporary overrides of the default address and/or operand sizes from within a program. When operating in protected mode, the segment descriptor for the currently executing code segment defines the default address and operand size. A segment descriptor is a system data structure not normally visible to application code. Assembler directives allow the default addressing and operand size to be chosen for a program. The assembler and other tools then set up the segment descriptor for the code segment appropriately. When operating in real-address mode, the default addressing and operand size is 16 bits. An address-size override can be used in real-address mode to enable 32-bit addressing. However, the maximum allowable 32-bit linear address is still 000FFFFFH (220-1). 3.3.6 Extended Physical Addressing in Protected Mode Beginning with P6 family processors, the IA-32 architecture supports addressing of up to 64 GBytes (236 bytes) of physical memory. A program or task could not address locations in this address space directly. Instead, it addresses individual linear address spaces of up to 4 GBytes that mapped to 64-GByte physical address space through a virtual memory management mechanism. Using this mechanism, an operating system can enable a program to switch 4-GByte linear address spaces within 64-GByte physical address space. The use of extended physical addressing requires the processor to operate in protected mode and the operating system to provide a virtual memory management system. See “36-Bit Physical Addressing Using the PAE Paging Mechanism” in Chapter 3, “Protected-Mode Memory Management,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Vol. 1 3-9 BASIC EXECUTION ENVIRONMENT 3.3.7 Address Calculations in 64-Bit Mode In most cases, 64-bit mode uses flat address space for code, data, and stacks. In 64-bit mode (if there is no address-size override), the size of effective address calculations is 64 bits. An effective-address calculation uses a 64-bit base and index registers and sign-extend displacements to 64 bits. In the flat address space of 64-bit mode, linear addresses are equal to effective addresses because the base address is zero. In the event that FS or GS segments are used with a non-zero base, this rule does not hold. In 64bit mode, the effective address components are added and the effective address is truncated (See for example the instruction LEA) before adding the full 64-bit segment base. The base is never truncated, regardless of addressing mode in 64-bit mode. The instruction pointer is extended to 64 bits to support 64-bit code offsets. The 64-bit instruction pointer is called the RIP. Table 3-1 shows the relationship between RIP, EIP, and IP. Table 3-1. Instruction Pointer Sizes Bits 63:32 16-bit instruction pointer Not Modified 32-bit instruction pointer Zero Extension 64-bit instruction pointer RIP Bits 31:16 Bits 15:0 IP EIP Generally, displacements and immediates in 64-bit mode are not extended to 64 bits. They are still limited to 32 bits and sign-extended during effective-address calculations. In 64-bit mode, however, support is provided for 64bit displacement and immediate forms of the MOV instruction. All 16-bit and 32-bit address calculations are zero-extended in IA-32e mode to form 64-bit addresses. Address calculations are first truncated to the effective address size of the current mode (64-bit mode or compatibility mode), as overridden by any address-size prefix. The result is then zero-extended to the full 64-bit address width. Because of this, 16-bit and 32-bit applications running in compatibility mode can access only the low 4 GBytes of the 64-bit mode effective addresses. Likewise, a 32-bit address generated in 64-bit mode can access only the low 4 GBytes of the 64-bit mode effective addresses. 3.3.7.1 Canonical Addressing In 64-bit mode, an address is considered to be in canonical form if address bits 63 through to the most-significant implemented bit by the microarchitecture are set to either all ones or all zeros. Intel 64 architecture defines a 64-bit linear address. Implementations can support less. The first implementation of IA-32 processors with Intel 64 architecture supports a 48-bit linear address. This means a canonical address must have bits 63 through 48 set to zeros or ones (depending on whether bit 47 is a zero or one). Although implementations may not use all 64 bits of the linear address, they should check bits 63 through the most-significant implemented bit to see if the address is in canonical form. If a linear-memory reference is not in canonical form, the implementation should generate an exception. In most cases, a general-protection exception (#GP) is generated. However, in the case of explicit or implied stack references, a stack fault (#SS) is generated. Instructions that have implied stack references, by default, use the SS segment register. These include PUSH/POPrelated instructions and instructions using RSP/RBP as base registers. In these cases, the canonical fault is #SS. If an instruction uses base registers RSP/RBP and uses a segment override prefix to specify a non-SS segment, a canonical fault generates a #GP (instead of an #SS). In 64-bit mode, only FS and GS segment-overrides are applicable in this situation. Other segment override prefixes (CS, DS, ES and SS) are ignored. Note that this also means that an SS segment-override applied to a “non-stack” register reference is ignored. Such a sequence still produces a #GP for a canonical fault (and not an #SS). 3.4 BASIC PROGRAM EXECUTION REGISTERS IA-32 architecture provides 16 basic program execution registers for use in general system and application programing (see Figure 3-4). These registers can be grouped as follows: 3-10 Vol. 1 BASIC EXECUTION ENVIRONMENT • • • General-purpose registers. These eight registers are available for storing operands and pointers. • EIP (instruction pointer) register. The EIP register contains a 32-bit pointer to the next instruction to be executed. Segment registers. These registers hold up to six segment selectors. EFLAGS (program status and control) register. The EFLAGS register report on the status of the program being executed and allows limited (application-program level) control of the processor. 3.4.1 General-Purpose Registers The 32-bit general-purpose registers EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP are provided for holding the following items: • • • Operands for logical and arithmetic operations Operands for address calculations Memory pointers Although all of these registers are available for general storage of operands, results, and pointers, caution should be used when referencing the ESP register. The ESP register holds the stack pointer and as a general rule should not be used for another purpose. Many instructions assign specific registers to hold operands. For example, string instructions use the contents of the ECX, ESI, and EDI registers as operands. When using a segmented memory model, some instructions assume that pointers in certain registers are relative to specific segments. For instance, some instructions assume that a pointer in the EBX register points to a memory location in the DS segment. 31 General-Purpose Registers 0 EAX EBX ECX EDX ESI EDI EBP ESP Segment Registers 0 15 CS DS SS ES FS GS Program Status and Control Register 0 31 EFLAGS 31 Instruction Pointer 0 EIP Figure 3-4. General System and Application Programming Registers Vol. 1 3-11 BASIC EXECUTION ENVIRONMENT The special uses of general-purpose registers by instructions are described in Chapter 5, “Instruction Set Summary,” in this volume. See also: Chapter 3, Chapter 4 and Chapter 5 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B & 2C. The following is a summary of special uses: • • • • • • • • EAX — Accumulator for operands and results data EBX — Pointer to data in the DS segment ECX — Counter for string and loop operations EDX — I/O pointer ESI — Pointer to data in the segment pointed to by the DS register; source pointer for string operations EDI — Pointer to data (or destination) in the segment pointed to by the ES register; destination pointer for string operations ESP — Stack pointer (in the SS segment) EBP — Pointer to data on the stack (in the SS segment) As shown in Figure 3-5, the lower 16 bits of the general-purpose registers map directly to the register set found in the 8086 and Intel 286 processors and can be referenced with the names AX, BX, CX, DX, BP, SI, DI, and SP. Each of the lower two bytes of the EAX, EBX, ECX, and EDX registers can be referenced by the names AH, BH, CH, and DH (high bytes) and AL, BL, CL, and DL (low bytes). 31 General-Purpose Registers 8 7 16 15 0 16-bit 32-bit AH AL AX EAX BH BL BX EBX CH CL CX ECX DH DL DX BP EDX EBP SI ESI DI EDI SP ESP Figure 3-5. Alternate General-Purpose Register Names 3.4.1.1 General-Purpose Registers in 64-Bit Mode In 64-bit mode, there are 16 general purpose registers and the default operand size is 32 bits. However, generalpurpose registers are able to work with either 32-bit or 64-bit operands. If a 32-bit operand size is specified: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, R8D - R15D are available. If a 64-bit operand size is specified: RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15 are available. R8D-R15D/R8-R15 represent eight new general-purpose registers. All of these registers can be accessed at the byte, word, dword, and qword level. REX prefixes are used to generate 64-bit operand sizes or to reference registers R8-R15. Registers only available in 64-bit mode (R8-R15 and XMM8-XMM15) are preserved across transitions from 64-bit mode into compatibility mode then back into 64-bit mode. However, values of R8-R15 and XMM8-XMM15 are undefined after transitions from 64-bit mode through compatibility mode to legacy or real mode and then back through compatibility mode to 64-bit mode. 3-12 Vol. 1 BASIC EXECUTION ENVIRONMENT Table 3-2. Addressable General Purpose Registers Register Type Without REX With REX Byte Registers AL, BL, CL, DL, AH, BH, CH, DH AL, BL, CL, DL, DIL, SIL, BPL, SPL, R8L - R15L Word Registers AX, BX, CX, DX, DI, SI, BP, SP AX, BX, CX, DX, DI, SI, BP, SP, R8W - R15W Doubleword Registers EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, R8D - R15D Quadword Registers N.A. RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8 - R15 In 64-bit mode, there are limitations on accessing byte registers. An instruction cannot reference legacy highbytes (for example: AH, BH, CH, DH) and one of the new byte registers at the same time (for example: the low byte of the RAX register). However, instructions may reference legacy low-bytes (for example: AL, BL, CL or DL) and new byte registers at the same time (for example: the low byte of the R8 register, or RBP). The architecture enforces this limitation by changing high-byte references (AH, BH, CH, DH) to low byte references (BPL, SPL, DIL, SIL: the low 8 bits for RBP, RSP, RDI and RSI) for instructions using a REX prefix. When in 64-bit mode, operand size determines the number of valid bits in the destination general-purpose register: • • 64-bit operands generate a 64-bit result in the destination general-purpose register. • 8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits. 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register. Because the upper 32 bits of 64-bit general-purpose registers are undefined in 32-bit modes, the upper 32 bits of any general-purpose register are not preserved when switching from 64-bit mode to a 32-bit mode (to protected mode or compatibility mode). Software must not depend on these bits to maintain a value after a 64-bit to 32-bit mode switch. 3.4.2 Segment Registers The segment registers (CS, DS, SS, ES, FS, and GS) hold 16-bit segment selectors. A segment selector is a special pointer that identifies a segment in memory. To access a particular segment in memory, the segment selector for that segment must be present in the appropriate segment register. When writing application code, programmers generally create segment selectors with assembler directives and symbols. The assembler and other tools then create the actual segment selector values associated with these directives and symbols. If writing system code, programmers may need to create segment selectors directly. See Chapter 3, “Protected-Mode Memory Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. How segment registers are used depends on the type of memory management model that the operating system or executive is using. When using the flat (unsegmented) memory model, segment registers are loaded with segment selectors that point to overlapping segments, each of which begins at address 0 of the linear address space (see Figure 3-6). These overlapping segments then comprise the linear address space for the program. Typically, two overlapping segments are defined: one for code and another for data and stacks. The CS segment register points to the code segment and all the other segment registers point to the data and stack segment. When using the segmented memory model, each segment register is ordinarily loaded with a different segment selector so that each segment register points to a different segment within the linear address space (see Figure 3-7). At any time, a program can thus access up to six segments in the linear address space. To access a segment not pointed to by one of the segment registers, a program must first load the segment selector for the segment to be accessed into a segment register. Vol. 1 3-13 BASIC EXECUTION ENVIRONMENT Linear Address Space for Program Segment Registers CS DS SS ES FS GS The segment selector in each segment register points to an overlapping segment in the linear address space. Overlapping Segments of up to 4 GBytes Beginning at Address 0 Figure 3-6. Use of Segment Registers for Flat Memory Model Code Segment Segment Registers CS DS SS ES FS GS Data Segment Stack Segment All segments are mapped to the same linear-address space Data Segment Data Segment Data Segment Figure 3-7. Use of Segment Registers in Segmented Memory Model Each of the segment registers is associated with one of three types of storage: code, data, or stack. For example, the CS register contains the segment selector for the code segment, where the instructions being executed are stored. The processor fetches instructions from the code segment, using a logical address that consists of the segment selector in the CS register and the contents of the EIP register. The EIP register contains the offset within the code segment of the next instruction to be executed. The CS register cannot be loaded explicitly by an application program. Instead, it is loaded implicitly by instructions or internal processor operations that change program control (such as procedure calls, interrupt handling, or task switching). The DS, ES, FS, and GS registers point to four data segments. The availability of four data segments permits efficient and secure access to different types of data structures. For example, four separate data segments might be created: one for the data structures of the current module, another for the data exported from a higher-level module, a third for a dynamically created data structure, and a fourth for data shared with another program. To access additional data segments, the application program must load segment selectors for these segments into the DS, ES, FS, and GS registers, as needed. The SS register contains the segment selector for the stack segment, where the procedure stack is stored for the program, task, or handler currently being executed. All stack operations use the SS register to find the stack 3-14 Vol. 1 BASIC EXECUTION ENVIRONMENT segment. Unlike the CS register, the SS register can be loaded explicitly, which permits application programs to set up multiple stacks and switch among them. See Section 3.3, “Memory Organization,” for an overview of how the segment registers are used in real-address mode. The four segment registers CS, DS, SS, and ES are the same as the segment registers found in the Intel 8086 and Intel 286 processors and the FS and GS registers were introduced into the IA-32 Architecture with the Intel386™ family of processors. 3.4.2.1 Segment Registers in 64-Bit Mode In 64-bit mode: CS, DS, ES, SS are treated as if each segment base is 0, regardless of the value of the associated segment descriptor base. This creates a flat address space for code, data, and stack. FS and GS are exceptions. Both segment registers may be used as additional base registers in linear address calculations (in the addressing of local data and certain operating system data structures). Even though segmentation is generally disabled, segment register loads may cause the processor to perform segment access assists. During these activities, enabled processors will still perform most of the legacy checks on loaded values (even if the checks are not applicable in 64-bit mode). Such checks are needed because a segment register loaded in 64-bit mode may be used by an application running in compatibility mode. Limit checks for CS, DS, ES, SS, FS, and GS are disabled in 64-bit mode. 3.4.3 EFLAGS Register The 32-bit EFLAGS register contains a group of status flags, a control flag, and a group of system flags. Figure 3-8 defines the flags within this register. Following initialization of the processor (either by asserting the RESET pin or the INIT pin), the state of the EFLAGS register is 00000002H. Bits 1, 3, 5, 15, and 22 through 31 of this register are reserved. Software should not use or depend on the states of any of these bits. Some of the flags in the EFLAGS register can be modified directly, using special-purpose instructions (described in the following sections). There are no instructions that allow the whole register to be examined or modified directly. The following instructions can be used to move groups of flags to and from the procedure stack or the EAX register: LAHF, SAHF, PUSHF, PUSHFD, POPF, and POPFD. After the contents of the EFLAGS register have been transferred to the procedure stack or EAX register, the flags can be examined and modified using the processor’s bit manipulation instructions (BT, BTS, BTR, and BTC). When suspending a task (using the processor’s multitasking facilities), the processor automatically saves the state of the EFLAGS register in the task state segment (TSS) for the task being suspended. When binding itself to a new task, the processor loads the EFLAGS register with data from the new task’s TSS. When a call is made to an interrupt or exception handler procedure, the processor automatically saves the state of the EFLAGS registers on the procedure stack. When an interrupt or exception is handled with a task switch, the state of the EFLAGS register is saved in the TSS for the task being suspended. Vol. 1 3-15 BASIC EXECUTION ENVIRONMENT 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 V V I I I A V R 0 N 0 0 0 0 0 0 0 0 0 0 T C M F D P F X X X X X X X X S C X X S S S S S I O P L O D I T S Z P C A F F F F F F 0 F 0 F 1 F ID Flag (ID) Virtual Interrupt Pending (VIP) Virtual Interrupt Flag (VIF) Alignment Check / Access Control (AC) Virtual-8086 Mode (VM) Resume Flag (RF) Nested Task (NT) I/O Privilege Level (IOPL) Overflow Flag (OF) Direction Flag (DF) Interrupt Enable Flag (IF) Trap Flag (TF) Sign Flag (SF) Zero Flag (ZF) Auxiliary Carry Flag (AF) Parity Flag (PF) Carry Flag (CF) S Indicates a Status Flag C Indicates a Control Flag X Indicates a System Flag Reserved bit positions. DO NOT USE. Always set to values previously read. Figure 3-8. EFLAGS Register As the IA-32 Architecture has evolved, flags have been added to the EFLAGS register, but the function and placement of existing flags have remained the same from one family of the IA-32 processors to the next. As a result, code that accesses or modifies these flags for one family of IA-32 processors works as expected when run on later families of processors. 3.4.3.1 Status Flags The status flags (bits 0, 2, 4, 6, 7, and 11) of the EFLAGS register indicate the results of arithmetic instructions, such as the ADD, SUB, MUL, and DIV instructions. The status flag functions are: CF (bit 0) Carry flag — Set if an arithmetic operation generates a carry or a borrow out of the mostsignificant bit of the result; cleared otherwise. This flag indicates an overflow condition for unsigned-integer arithmetic. It is also used in multiple-precision arithmetic. PF (bit 2) Parity flag — Set if the least-significant byte of the result contains an even number of 1 bits; cleared otherwise. AF (bit 4) Auxiliary Carry flag — Set if an arithmetic operation generates a carry or a borrow out of bit 3 of the result; cleared otherwise. This flag is used in binary-coded decimal (BCD) arithmetic. ZF (bit 6) Zero flag — Set if the result is zero; cleared otherwise. SF (bit 7) Sign flag — Set equal to the most-significant bit of the result, which is the sign bit of a signed integer. (0 indicates a positive value and 1 indicates a negative value.) OF (bit 11) Overflow flag — Set if the integer result is too large a positive number or too small a negative number (excluding the sign-bit) to fit in the destination operand; cleared otherwise. This flag indicates an overflow condition for signed-integer (two’s complement) arithmetic. Of these status flags, only the CF flag can be modified directly, using the STC, CLC, and CMC instructions. Also the bit instructions (BT, BTS, BTR, and BTC) copy a specified bit into the CF flag. 3-16 Vol. 1 BASIC EXECUTION ENVIRONMENT The status flags allow a single arithmetic operation to produce results for three different data types: unsigned integers, signed integers, and BCD integers. If the result of an arithmetic operation is treated as an unsigned integer, the CF flag indicates an out-of-range condition (carry or a borrow); if treated as a signed integer (two’s complement number), the OF flag indicates a carry or borrow; and if treated as a BCD digit, the AF flag indicates a carry or borrow. The SF flag indicates the sign of a signed integer. The ZF flag indicates either a signed- or an unsignedinteger zero. When performing multiple-precision arithmetic on integers, the CF flag is used in conjunction with the add with carry (ADC) and subtract with borrow (SBB) instructions to propagate a carry or borrow from one computation to the next. The condition instructions Jcc (jump on condition code cc), SETcc (byte set on condition code cc), LOOPcc, and CMOVcc (conditional move) use one or more of the status flags as condition codes and test them for branch, setbyte, or end-loop conditions. 3.4.3.2 DF Flag The direction flag (DF, located in bit 10 of the EFLAGS register) controls string instructions (MOVS, CMPS, SCAS, LODS, and STOS). Setting the DF flag causes the string instructions to auto-decrement (to process strings from high addresses to low addresses). Clearing the DF flag causes the string instructions to auto-increment (process strings from low addresses to high addresses). The STD and CLD instructions set and clear the DF flag, respectively. 3.4.3.3 System Flags and IOPL Field The system flags and IOPL field in the EFLAGS register control operating-system or executive operations. They should not be modified by application programs. The functions of the system flags are as follows: TF (bit 8) Trap flag — Set to enable single-step mode for debugging; clear to disable single-step mode. IF (bit 9) Interrupt enable flag — Controls the response of the processor to maskable interrupt requests. Set to respond to maskable interrupts; cleared to inhibit maskable interrupts. IOPL (bits 12 and 13) I/O privilege level field — Indicates the I/O privilege level of the currently running program or task. The current privilege level (CPL) of the currently running program or task must be less than or equal to the I/O privilege level to access the I/O address space. The POPF and IRET instructions can modify this field only when operating at a CPL of 0. NT (bit 14) Nested task flag — Controls the chaining of interrupted and called tasks. Set when the current task is linked to the previously executed task; cleared when the current task is not linked to another task. RF (bit 16) Resume flag — Controls the processor’s response to debug exceptions. VM (bit 17) Virtual-8086 mode flag — Set to enable virtual-8086 mode; clear to return to protected mode without virtual-8086 mode semantics. AC (bit 18) Alignment check (or access control) flag — If the AM bit is set in the CR0 register, alignment checking of user-mode data accesses is enabled if and only if this flag is 1. If the SMAP bit is set in the CR4 register, explicit supervisor-mode data accesses to user-mode pages are allowed if and only if this bit is 1. See Section 4.6, “Access Rights,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. VIF (bit 19) Virtual interrupt flag — Virtual image of the IF flag. Used in conjunction with the VIP flag. (To use this flag and the VIP flag the virtual mode extensions are enabled by setting the VME flag in control register CR4.) VIP (bit 20) Virtual interrupt pending flag — Set to indicate that an interrupt is pending; clear when no interrupt is pending. (Software sets and clears this flag; the processor only reads it.) Used in conjunction with the VIF flag. ID (bit 21) Identification flag — The ability of a program to set or clear this flag indicates support for the CPUID instruction. Vol. 1 3-17 BASIC EXECUTION ENVIRONMENT For a detailed description of these flags: see Chapter 3, “Protected-Mode Memory Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 3.4.3.4 RFLAGS Register in 64-Bit Mode In 64-bit mode, EFLAGS is extended to 64 bits and called RFLAGS. The upper 32 bits of RFLAGS register is reserved. The lower 32 bits of RFLAGS is the same as EFLAGS. 3.5 INSTRUCTION POINTER The instruction pointer (EIP) register contains the offset in the current code segment for the next instruction to be executed. It is advanced from one instruction boundary to the next in straight-line code or it is moved ahead or backwards by a number of instructions when executing JMP, Jcc, CALL, RET, and IRET instructions. The EIP register cannot be accessed directly by software; it is controlled implicitly by control-transfer instructions (such as JMP, Jcc, CALL, and RET), interrupts, and exceptions. The only way to read the EIP register is to execute a CALL instruction and then read the value of the return instruction pointer from the procedure stack. The EIP register can be loaded indirectly by modifying the value of a return instruction pointer on the procedure stack and executing a return instruction (RET or IRET). See Section 6.2.4.2, “Return Instruction Pointer.” All IA-32 processors prefetch instructions. Because of instruction prefetching, an instruction address read from the bus during an instruction load does not match the value in the EIP register. Even though different processor generations use different prefetching mechanisms, the function of the EIP register to direct program flow remains fully compatible with all software written to run on IA-32 processors. 3.5.1 Instruction Pointer in 64-Bit Mode In 64-bit mode, the RIP register becomes the instruction pointer. This register holds the 64-bit offset of the next instruction to be executed. 64-bit mode also supports a technique called RIP-relative addressing. Using this technique, the effective address is determined by adding a displacement to the RIP of the next instruction. 3.6 OPERAND-SIZE AND ADDRESS-SIZE ATTRIBUTES When the processor is executing in protected mode, every code segment has a default operand-size attribute and address-size attribute. These attributes are selected with the D (default size) flag in the segment descriptor for the code segment (see Chapter 3, “Protected-Mode Memory Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). When the D flag is set, the 32-bit operand-size and address-size attributes are selected; when the flag is clear, the 16-bit size attributes are selected. When the processor is executing in real-address mode, virtual-8086 mode, or SMM, the default operand-size and address-size attributes are always 16 bits. The operand-size attribute selects the size of operands. When the 16-bit operand-size attribute is in force, operands can generally be either 8 bits or 16 bits, and when the 32-bit operand-size attribute is in force, operands can generally be 8 bits or 32 bits. The address-size attribute selects the sizes of addresses used to address memory: 16 bits or 32 bits. When the 16bit address-size attribute is in force, segment offsets and displacements are 16 bits. This restriction limits the size of a segment to 64 KBytes. When the 32-bit address-size attribute is in force, segment offsets and displacements are 32 bits, allowing up to 4 GBytes to be addressed. The default operand-size attribute and/or address-size attribute can be overridden for a particular instruction by adding an operand-size and/or address-size prefix to an instruction. See Chapter 2, “Instruction Format,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. The effect of this prefix applies only to the targeted instruction. Table 3-4 shows effective operand size and address size (when executing in protected mode or compatibility mode) depending on the settings of the D flag and the operand-size and address-size prefixes. 3-18 Vol. 1 BASIC EXECUTION ENVIRONMENT Table 3-3. Effective Operand- and Address-Size Attributes D Flag in Code Segment Descriptor 0 0 0 0 1 1 1 1 Operand-Size Prefix 66H N N Y Y N N Y Y Address-Size Prefix 67H N Y N Y N Y N Y Effective Operand Size 16 16 32 32 32 32 16 16 Effective Address Size 16 32 16 32 32 16 32 16 NOTES: Y: Yes - this instruction prefix is present. N: No - this instruction prefix is not present. 3.6.1 Operand Size and Address Size in 64-Bit Mode In 64-bit mode, the default address size is 64 bits and the default operand size is 32 bits. Defaults can be overridden using prefixes. Address-size and operand-size prefixes allow mixing of 32/64-bit data and 32/64-bit addresses on an instruction-by-instruction basis. Table 3-4 shows valid combinations of the 66H instruction prefix and the REX.W prefix that may be used to specify operand-size overrides in 64-bit mode. Note that 16-bit addresses are not supported in 64-bit mode. REX prefixes consist of 4-bit fields that form 16 different values. The W-bit field in the REX prefixes is referred to as REX.W. If the REX.W field is properly set, the prefix specifies an operand size override to 64 bits. Note that software can still use the operand-size 66H prefix to toggle to a 16-bit operand size. However, setting REX.W takes precedence over the operand-size prefix (66H) when both are used. In the case of SSE/SSE2/SSE3/SSSE3 SIMD instructions: the 66H, F2H, and F3H prefixes are mandatory for opcode extensions. In such a case, there is no interaction between a valid REX.W prefix and a 66H opcode extension prefix. See Chapter 2, “Instruction Format,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. Table 3-4. Effective Operand- and Address-Size Attributes in 64-Bit Mode L Flag in Code Segment Descriptor 1 1 1 1 1 1 1 1 REX.W Prefix 0 0 0 0 1 1 1 1 Operand-Size Prefix 66H N N Y Y N N Y Y Address-Size Prefix 67H N Y N Y N Y N Y Effective Operand Size 32 32 16 16 64 64 64 64 Effective Address Size 64 32 64 32 64 32 64 32 NOTES: Y: Yes - this instruction prefix is present. N: No - this instruction prefix is not present. 3.7 OPERAND ADDRESSING IA-32 machine-instructions act on zero or more operands. Some operands are specified explicitly and others are implicit. The data for a source operand can be located in: • • • • the instruction itself (an immediate operand) a register a memory location an I/O port Vol. 1 3-19 BASIC EXECUTION ENVIRONMENT When an instruction returns data to a destination operand, it can be returned to: • • • a register a memory location an I/O port 3.7.1 Immediate Operands Some instructions use data encoded in the instruction itself as a source operand. These operands are called immediate operands (or simply immediates). For example, the following ADD instruction adds an immediate value of 14 to the contents of the EAX register: ADD EAX, 14 All arithmetic instructions (except the DIV and IDIV instructions) allow the source operand to be an immediate value. The maximum value allowed for an immediate operand varies among instructions, but can never be greater than the maximum value of an unsigned doubleword integer (232). 3.7.2 Register Operands Source and destination operands can be any of the following registers, depending on the instruction being executed: • • • • • • 32-bit general-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP) • • • MMX registers (MM0 through MM7) • • debug registers (DR0, DR1, DR2, DR3, DR6, and DR7) 16-bit general-purpose registers (AX, BX, CX, DX, SI, DI, SP, or BP) 8-bit general-purpose registers (AH, BH, CH, DH, AL, BL, CL, or DL) segment registers (CS, DS, SS, ES, FS, and GS) EFLAGS register x87 FPU registers (ST0 through ST7, status word, control word, tag word, data operand pointer, and instruction pointer) XMM registers (XMM0 through XMM7) and the MXCSR register control registers (CR0, CR2, CR3, and CR4) and system table pointer registers (GDTR, LDTR, IDTR, and task register) MSR registers Some instructions (such as the DIV and MUL instructions) use quadword operands contained in a pair of 32-bit registers. Register pairs are represented with a colon separating them. For example, in the register pair EDX:EAX, EDX contains the high order bits and EAX contains the low order bits of a quadword operand. Several instructions (such as the PUSHFD and POPFD instructions) are provided to load and store the contents of the EFLAGS register or to set or clear individual flags in this register. Other instructions (such as the Jcc instructions) use the state of the status flags in the EFLAGS register as condition codes for branching or other decision making operations. The processor contains a selection of system registers that are used to control memory management, interrupt and exception handling, task management, processor management, and debugging activities. Some of these system registers are accessible by an application program, the operating system, or the executive through a set of system instructions. When accessing a system register with a system instruction, the register is generally an implied operand of the instruction. 3-20 Vol. 1 BASIC EXECUTION ENVIRONMENT 3.7.2.1 Register Operands in 64-Bit Mode Register operands in 64-bit mode can be any of the following: • • • • 64-bit general-purpose registers (RAX, RBX, RCX, RDX, RSI, RDI, RSP, RBP, or R8-R15) • • • Segment registers (CS, DS, SS, ES, FS, and GS) • • • MMX registers (MM0 through MM7) • • • Debug registers (DR0, DR1, DR2, DR3, DR6, and DR7) 32-bit general-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP, or R8D-R15D) 16-bit general-purpose registers (AX, BX, CX, DX, SI, DI, SP, BP, or R8W-R15W) 8-bit general-purpose registers: AL, BL, CL, DL, SIL, DIL, SPL, BPL, and R8L-R15L are available using REX prefixes; AL, BL, CL, DL, AH, BH, CH, DH are available without using REX prefixes. RFLAGS register x87 FPU registers (ST0 through ST7, status word, control word, tag word, data operand pointer, and instruction pointer) XMM registers (XMM0 through XMM15) and the MXCSR register Control registers (CR0, CR2, CR3, CR4, and CR8) and system table pointer registers (GDTR, LDTR, IDTR, and task register) MSR registers RDX:RAX register pair representing a 128-bit operand 3.7.3 Memory Operands Source and destination operands in memory are referenced by means of a segment selector and an offset (see Figure 3-9). Segment selectors specify the segment containing the operand. Offsets specify the linear or effective address of the operand. Offsets can be 32 bits (represented by the notation m16:32) or 16 bits (represented by the notation m16:16). 15 Segment Selector 0 31 0 Offset (or Linear Address) Figure 3-9. Memory Operand Address 3.7.3.1 Memory Operands in 64-Bit Mode In 64-bit mode, a memory operand can be referenced by a segment selector and an offset. The offset can be 16 bits, 32 bits or 64 bits (see Figure 3-10). 15 Segment Selector 0 63 0 Offset (or Linear Address) Figure 3-10. Memory Operand Address in 64-Bit Mode 3.7.4 Specifying a Segment Selector The segment selector can be specified either implicitly or explicitly. The most common method of specifying a segment selector is to load it in a segment register and then allow the processor to select the register implicitly, depending on the type of operation being performed. The processor automatically chooses a segment according to the rules given in Table 3-5. Vol. 1 3-21 BASIC EXECUTION ENVIRONMENT When storing data in memory or loading data from memory, the DS segment default can be overridden to allow other segments to be accessed. Within an assembler, the segment override is generally handled with a colon “:” operator. For example, the following MOV instruction moves a value from register EAX into the segment pointed to by the ES register. The offset into the segment is contained in the EBX register: MOV ES:[EBX], EAX Table 3-5. Default Segment Selection Rules Reference Type Register Used Segment Used Default Selection Rule Instructions CS Code Segment All instruction fetches. Stack SS Stack Segment All stack pushes and pops. Any memory reference which uses the ESP or EBP register as a base register. Local Data DS Data Segment All data references, except when relative to stack or string destination. Data Segment pointed to with the ES register Destination of string instructions. Destination Strings ES At the machine level, a segment override is specified with a segment-override prefix, which is a byte placed at the beginning of an instruction. The following default segment selections cannot be overridden: • • • Instruction fetches must be made from the code segment. Destination strings in string instructions must be stored in the data segment pointed to by the ES register. Push and pop operations must always reference the SS segment. Some instructions require a segment selector to be specified explicitly. In these cases, the 16-bit segment selector can be located in a memory location or in a 16-bit register. For example, the following MOV instruction moves a segment selector located in register BX into segment register DS: MOV DS, BX Segment selectors can also be specified explicitly as part of a 48-bit far pointer in memory. Here, the first doubleword in memory contains the offset and the next word contains the segment selector. 3.7.4.1 Segmentation in 64-Bit Mode In IA-32e mode, the effects of segmentation depend on whether the processor is running in compatibility mode or 64-bit mode. In compatibility mode, segmentation functions just as it does in legacy IA-32 mode, using the 16-bit or 32-bit protected mode semantics described above. In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to the effective address. The exceptions are the FS and GS segments, whose segment registers (which hold the segment base) can be used as additional base registers in some linear address calculations. 3.7.5 Specifying an Offset The offset part of a memory address can be specified directly as a static value (called a displacement) or through an address computation made up of one or more of the following components: • • • • Displacement — An 8-, 16-, or 32-bit value. Base — The value in a general-purpose register. Index — The value in a general-purpose register. Scale factor — A value of 2, 4, or 8 that is multiplied by the index value. 3-22 Vol. 1 BASIC EXECUTION ENVIRONMENT The offset which results from adding these components is called an effective address. Each of these components can have either a positive or negative (2s complement) value, with the exception of the scaling factor. Figure 3-11 shows all the possible ways that these components can be combined to create an effective address in the selected segment. Base Index EAX EBX ECX EDX ESP EBP ESI EDI EAX EBX ECX EDX EBP ESI EDI + Scale Displacement 1 None 2 * 4 8 + 8-bit 16-bit 32-bit Offset = Base + (Index * Scale) + Displacement Figure 3-11. Offset (or Effective Address) Computation The uses of general-purpose registers as base or index components are restricted in the following manner: • • The ESP register cannot be used as an index register. When the ESP or EBP register is used as the base, the SS segment is the default segment. In all other cases, the DS segment is the default segment. The base, index, and displacement components can be used in any combination, and any of these components can be NULL. A scale factor may be used only when an index also is used. Each possible combination is useful for data structures commonly used by programmers in high-level languages and assembly language. The following addressing modes suggest uses for common combinations of address components. • Displacement ⎯ A displacement alone represents a direct (uncomputed) offset to the operand. Because the displacement is encoded in the instruction, this form of an address is sometimes called an absolute or static address. It is commonly used to access a statically allocated scalar operand. • Base ⎯ A base alone represents an indirect offset to the operand. Since the value in the base register can change, it can be used for dynamic storage of variables and data structures. • Base + Displacement ⎯ A base register and a displacement can be used together for two distinct purposes: — As an index into an array when the element size is not 2, 4, or 8 bytes—The displacement component encodes the static offset to the beginning of the array. The base register holds the results of a calculation to determine the offset to a specific element within the array. — To access a field of a record: the base register holds the address of the beginning of the record, while the displacement is a static offset to the field. An important special case of this combination is access to parameters in a procedure activation record. A procedure activation record is the stack frame created when a procedure is entered. Here, the EBP register is the best choice for the base register, because it automatically selects the stack segment. This is a compact encoding for this common function. • (Index ∗ Scale) + Displacement ⎯ This address mode offers an efficient way to index into a static array when the element size is 2, 4, or 8 bytes. The displacement locates the beginning of the array, the index register holds the subscript of the desired array element, and the processor automatically converts the subscript into an index by applying the scaling factor. • Base + Index + Displacement ⎯ Using two registers together supports either a two-dimensional array (the displacement holds the address of the beginning of the array) or one of several instances of an array of records (the displacement is an offset to a field within the record). • Base + (Index ∗ Scale) + Displacement ⎯ Using all the addressing components together allows efficient indexing of a two-dimensional array when the elements of the array are 2, 4, or 8 bytes in size. Vol. 1 3-23 BASIC EXECUTION ENVIRONMENT 3.7.5.1 Specifying an Offset in 64-Bit Mode The offset part of a memory address in 64-bit mode can be specified directly as a static value or through an address computation made up of one or more of the following components: • • • • Displacement — An 8-bit, 16-bit, or 32-bit value. Base — The value in a 64-bit general-purpose register. Index — The value in a 64-bit general-purpose register. Scale factor — A value of 2, 4, or 8 that is multiplied by the index value. The base and index value can be specified in one of sixteen available general-purpose registers in most cases. See Chapter 2, “Instruction Format,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. The following unique combination of address components is also available. • RIP + Displacement ⎯ In 64-bit mode, RIP-relative addressing uses a signed 32-bit displacement to calculate the effective address of the next instruction by sign-extend the 32-bit value and add to the 64-bit value in RIP. 3.7.6 Assembler and Compiler Addressing Modes At the machine-code level, the selected combination of displacement, base register, index register, and scale factor is encoded in an instruction. All assemblers permit a programmer to use any of the allowable combinations of these addressing components to address operands. High-level language compilers will select an appropriate combination of these components based on the language construct a programmer defines. 3.7.7 I/O Port Addressing The processor supports an I/O address space that contains up to 65,536 8-bit I/O ports. Ports that are 16-bit and 32-bit may also be defined in the I/O address space. An I/O port can be addressed with either an immediate operand or a value in the DX register. See Chapter 18, “Input/Output,” for more information about I/O port addressing. 3-24 Vol. 1 4. Updates to Chapter 13, Volume 1 Change bars and green text show changes to Chapter 13 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture. -----------------------------------------------------------------------------------------Changes to this chapter: Removed alignment check exception footnote from Section 13.11 “Operation of XSAVES” and Section 13.12 “Operation of XRSTORS”. The alignment check exception does not apply to supervisor mode; XSAVES and XRSTORS are supervisor mode instructions. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 14 CHAPTER 13 MANAGING STATE USING THE XSAVE FEATURE SET The XSAVE feature set extends the functionality of the FXSAVE and FXRSTOR instructions (see Section 10.5, “FXSAVE and FXRSTOR Instructions”) by supporting the saving and restoring of processor state in addition to the x87 execution environment (x87 state) and the registers used by the streaming SIMD extensions (SSE state). The XSAVE feature set comprises eight instructions. XGETBV and XSETBV allow software to read and write the extended control register XCR0, which controls the operation of the XSAVE feature set. XSAVE, XSAVEOPT, XSAVEC, and XSAVES are four instructions that save processor state to memory; XRSTOR and XRSTORS are corresponding instructions that load processor state from memory. XGETBV, XSAVE, XSAVEOPT, XSAVEC, and XRSTOR can be executed at any privilege level; XSETBV, XSAVES, and XRSTORS can be executed only if CPL = 0. In addition to XCR0, the XSAVES and XRSTORS instructions are controlled also by the IA32_XSS MSR (index DA0H). The XSAVE feature set organizes the state that manages into state components. Operation of the instructions is based on state-component bitmaps that have the same format as XCR0 and as the IA32_XSS MSR: each bit corresponds to a state component. Section 13.1 discusses these state components and bitmaps in more detail. Section 13.2 describes how the processor enumerates support for the XSAVE feature set and for XSAVE-enabled features (those features that require use of the XSAVE feature set for their enabling). Section 13.3 explains how software can enable the XSAVE feature set and XSAVE-enabled features. The XSAVE feature set allows saving and loading processor state from a region of memory called an XSAVE area. Section 13.4 presents details of the XSAVE area and its organization. Each XSAVE-managed state component is associated with a section of the XSAVE area. Section 13.5 describes in detail each of the XSAVE-managed state components. Section 13.7 through Section 13.12 describe the operation of XSAVE, XRSTOR, XSAVEOPT, XSAVEC, XSAVES, and XRSTORS, respectively. 13.1 XSAVE-SUPPORTED FEATURES AND STATE-COMPONENT BITMAPS The XSAVE feature set supports the saving and restoring of state components, each of which is a discrete set of processor registers (or parts of registers). In general, each such state component corresponds to a particular CPU feature. Such a feature is XSAVE-supported. Some XSAVE-supported features use registers in multiple XSAVEmanaged state components. The XSAVE feature set organizes the state components of the XSAVE-supported features using state-component bitmaps. A state-component bitmap comprises 64 bits; each bit in such a bitmap corresponds to a single state component. The following bits are defined in state-component bitmaps: • Bit 0 corresponds to the state component used for the x87 FPU execution environment (x87 state). See Section 13.5.1. • Bit 1 corresponds to the state component used for registers used by the streaming SIMD extensions (SSE state). See Section 13.5.2. • Bit 2 corresponds to the state component used for the additional register state used by the Intel® Advanced Vector Extensions (AVX state). See Section 13.5.3. • Bits 4:3 correspond to the two state components used for the additional register state used by Intel® Memory Protection Extensions (MPX state): — State component 3 is used for the 4 128-bit bounds registers BND0–BND3 (BNDREGS state). — State component 4 is used for the 64-bit user-mode MPX configuration register BNDCFGU and the 64-bit MPX status register BNDSTATUS (BNDCSR state). • Bits 7:5 correspond to the three state components used for the additional register state used by Intel® Advanced Vector Extensions 512 (AVX-512 state): — State component 5 is used for the 8 64-bit opmask registers k0–k7 (opmask state). Vol. 1 13-1 MANAGING STATE USING THE XSAVE FEATURE SET — State component 6 is used for the upper 256 bits of the registers ZMM0–ZMM15. These 16 256-bit values are denoted ZMM0_H–ZMM15_H (ZMM_Hi256 state). — State component 7 is used for the 16 512-bit registers ZMM16–ZMM31 (Hi16_ZMM state). • • Bit 8 corresponds to the state component used for the Intel Processor Trace MSRs (PT state). • Bit 13 corresponds to the state component used for an MSR used to control hardware duty cycling (HDC state). See Section 13.5.8. Bit 9 corresponds to the state component used for the protection-key feature’s register PKRU (PKRU state). See Section 13.5.7. Bits in the ranges 62:14 and 12:10 are not currently defined in state-component bitmaps and are reserved for future expansion. As individual state component is defined within bits 62:11, additional sub-sections are updated within Section 13.5 over time. Bit 63 is used for special functionality in some bitmaps and does not correspond to any state component. The state component corresponding to bit i of state-component bitmaps is called state component i. Thus, x87 state is state component 0; SSE state is state component 1; AVX state is state component 2; MPX state comprises state components 3–4; AVX-512 state comprises state components 5–7; PT state is state component 8; PKRU state is state component 9; and HDC state is state component 13. The XSAVE feature set uses state-component bitmaps in multiple ways. Most of the instructions use an implicit operand (in EDX:EAX), called the instruction mask, which is the state-component bitmap that specifies the state components on which the instruction operates. Some state components are user state components, and they can be managed by the entire XSAVE feature set. Other state components are supervisor state components, and they can be managed only by XSAVES and XRSTORS. The state components corresponding to bit 9 and to bits in the range 7:0 are user state components, PT state (corresponding to bit 8) and HDC state (corresponding to bit 13) are supervisor state components. Extended control register XCR0 contains a state-component bitmap that specifies the user state components that software has enabled the XSAVE feature set to manage. If the bit corresponding to a state component is clear in XCR0, instructions in the XSAVE feature set will not operate on that state component, regardless of the value of the instruction mask. The IA32_XSS MSR (index DA0H) contains a state-component bitmap that specifies the supervisor state components that software has enabled XSAVES and XRSTORS to manage (XSAVE, XSAVEC, XSAVEOPT, and XRSTOR cannot manage supervisor state components). If the bit corresponding to a state component is clear in the IA32_XSS MSR, XSAVES and XRSTORS will not operate on that state component, regardless of the value of the instruction mask. Some XSAVE-supported features can be used only if XCR0 has been configured so that the features’ state components can be managed by the XSAVE feature set. (This applies only to features with user state components.) Such state components and features are XSAVE-enabled. In general, the processor will not modify (or allow modification of) the registers of a state component of an XSAVE-enabled feature if the bit corresponding to that state component is clear in XCR0. (If software clears such a bit in XCR0, the processor preserves the corresponding state component.) If an XSAVE-enabled feature has not been fully enabled in XCR0, execution of any instruction defined for that feature causes an invalid-opcode exception (#UD). As will be explained in Section 13.3, the XSAVE feature set is enabled only if CR4.OSXSAVE[bit 18] = 1. If CR4.OSXSAVE = 0, the processor treats XSAVE-enabled state features and their state components as if all bits in XCR0 were clear; the state components cannot be modified and the features’ instructions cannot be executed. The state components for x87 state, for SSE state, for PT state, for PKRU state, and for HDC state are XSAVEmanaged but the corresponding features are not XSAVE-enabled. Processors allow modification of this state, as well as execution of x87 FPU instructions and SSE instructions and use of Intel Processor Trace, protection keys, and hardware duty cycling regardless of the value of CR4.OSXSAVE and XCR0. 13-2 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET 13.2 ENUMERATION OF CPU SUPPORT FOR XSAVE INSTRUCTIONS AND XSAVESUPPORTED FEATURES A processor enumerates support for the XSAVE feature set and for features supported by that feature set using the CPUID instruction. The following items provide specific details: • CPUID.1:ECX.XSAVE[bit 26] enumerates general support for the XSAVE feature set: — If this bit is 0, the processor does not support any of the following instructions: XGETBV, XRSTOR, XRSTORS, XSAVE, XSAVEC, XSAVEOPT, XSAVES, and XSETBV; the processor provides no further enumeration through CPUID function 0DH (see below). — If this bit is 1, the processor supports the following instructions: XGETBV, XRSTOR, XSAVE, and XSETBV.1 Further enumeration is provided through CPUID function 0DH. CR4.OSXSAVE can be set to 1 if and only if CPUID.1:ECX.XSAVE[bit 26] is enumerated as 1. • CPUID function 0DH enumerates details of CPU support through a set of sub-functions. Software selects a specific sub-function by the value placed in the ECX register. The following items provide specific details: — CPUID function 0DH, sub-function 0. • EDX:EAX is a bitmap of all the user state components that can be managed using the XSAVE feature set. A bit can be set in XCR0 if and only if the corresponding bit is set in this bitmap. Every processor that supports the XSAVE feature set will set EAX[0] (x87 state) and EAX[1] (SSE state). If EAX[i] = 1 (for 1 < i < 32) or EDX[i–32] = 1 (for 32 ≤ i < 63), sub-function i enumerates details for state component i (see below). • ECX enumerates the size (in bytes) required by the XSAVE instruction for an XSAVE area containing all the user state components supported by this processor. • EBX enumerates the size (in bytes) required by the XSAVE instruction for an XSAVE area containing all the user state components corresponding to bits currently set in XCR0. — CPUID function 0DH, sub-function 1. • EAX[0] enumerates support for the XSAVEOPT instruction. The instruction is supported if and only if this bit is 1. If EAX[0] = 0, execution of XSAVEOPT causes an invalid-opcode exception (#UD). • EAX[1] enumerates support for compaction extensions to the XSAVE feature set. The following are supported if this bit is 1: — — — The compacted format of the extended region of XSAVE areas (see Section 13.4.3). The XSAVEC instruction. If EAX[1] = 0, execution of XSAVEC causes a #UD. Execution of the compacted form of XRSTOR (see Section 13.8). • EAX[2] enumerates support for execution of XGETBV with ECX = 1. This allows software to determine the state of the init optimization. See Section 13.6. • EAX[3] enumerates support for XSAVES, XRSTORS, and the IA32_XSS MSR. If EAX[3] = 0, execution of XSAVES or XRSTORS causes a #UD; an attempt to access the IA32_XSS MSR using RDMSR or WRMSR causes a general-protection exception (#GP). Every processor that supports a supervisor state component sets EAX[3]. Every processor that sets EAX[3] (XSAVES, XRSTORS, IA32_XSS) will also set EAX[1] (the compaction extensions). • • EAX[31:4] are reserved. • EBX enumerates the size (in bytes) required by the XSAVES instruction for an XSAVE area containing all the state components corresponding to bits currently set in XCR0 | IA32_XSS. EDX:ECX is a bitmap of all the supervisor state components that can be managed by XSAVES and XRSTORS. A bit can be set in the IA32_XSS MSR if and only if the corresponding bit is set in this bitmap. 1. If CPUID.1:ECX.XSAVE[bit 26] = 1, XGETBV and XSETBV may be executed with ECX = 0 (to read and write XCR0). Any support for execution of these instructions with other values of ECX is enumerated separately. Vol. 1 13-3 MANAGING STATE USING THE XSAVE FEATURE SET NOTE In summary, the XSAVE feature set supports state component i (0 ≤ i < 63) if one of the following is true: (1) i < 32 and CPUID.(EAX=0DH,ECX=0):EAX[i] = 1; (2) i ≥ 32 and CPUID.(EAX=0DH,ECX=0):EAX[i–32] = 1; (3) i < 32 and CPUID.(EAX=0DH,ECX=1):ECX[i] = 1; or (4) i ≥ 32 and CPUID.(EAX=0DH,ECX=1):EDX[i–32] = 1. The XSAVE feature set supports user state component i if (1) or (2) holds; if (3) or (4) holds, state component i is a supervisor state component and support is limited to XSAVES and XRSTORS. — CPUID function 0DH, sub-function i (i > 1). This sub-function enumerates details for state component i. If the XSAVE feature set supports state component i (see note above), the following items provide specific details: • • EAX enumerates the size (in bytes) required for state component i. • • If state component i is a supervisor state component, EBX returns 0. • The value returned by ECX[1] indicates the alignment of state component i when the compacted format of the extended region of an XSAVE area is used (see Section 13.4.3). If ECX[1] returns 0, state component i is located immediately following the preceding state component; if ECX[1] returns 1, state component i is located on the next 64-byte boundary following the preceding state component. • ECX[31:2] and EDX return 0. If state component i is a user state component, EBX enumerates the offset (in bytes, from the base of the XSAVE area) of the section used for state component i. (This offset applies only when the standard format for the extended region of the XSAVE area is being used; see Section 13.4.3.) If state component i is a user state component, ECX[0] return 0; if state component i is a supervisor state component, ECX[0] returns 1. If the XSAVE feature set does not support state component i, sub-function i returns 0 in EAX, EBX, ECX, and EDX. 13.3 ENABLING THE XSAVE FEATURE SET AND XSAVE-ENABLED FEATURES Software enables the XSAVE feature set by setting CR4.OSXSAVE[bit 18] to 1 (e.g., with the MOV to CR4 instruction). If this bit is 0, execution of any of XGETBV, XRSTOR, XRSTORS, XSAVE, XSAVEC, XSAVEOPT, XSAVES, and XSETBV causes an invalid-opcode exception (#UD). When CR4.OSXSAVE = 1 and CPL = 0, executing the XSETBV instruction with ECX = 0 writes the 64-bit value in EDX:EAX to XCR0 (EAX is written to XCR0[31:0] and EDX to XCR0[63:32]). (Execution of the XSETBV instruction causes a general-protection fault — #GP — if CPL > 0.) The following items provide details regarding individual bits in XCR0: • XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is always 1. It has that value coming out of RESET. Executing the XSETBV instruction causes a general-protection fault (#GP) if ECX = 0 and EAX[0] is 0. • XCR0[1] is associated with SSE state (see Section 13.5.2). Software can use the XSAVE feature set to manage SSE state only if XCR0[1] = 1. The value of XCR0[1] in no way determines whether software can execute SSE instructions (these instructions can be executed even if XCR0[1] = 0). XCR0[1] is 0 coming out of RESET. As noted in Section 13.2, every processor that supports the XSAVE feature set allows software to set XCR0[1]. • XCR0[2] is associated with AVX state (see Section 13.5.3). Software can use the XSAVE feature set to manage AVX state only if XCR0[2] = 1. In addition, software can execute AVX instructions only if CR4.OSXSAVE = XCR0[2] = 1. Otherwise, any execution of an AVX instruction causes an invalid-opcode exception (#UD). XCR0[2] is 0 coming out of RESET. As noted in Section 13.2, a processor allows software to set XCR0[2] if and only if CPUID.(EAX=0DH,ECX=0):EAX[2] = 1. In addition, executing the XSETBV instruction causes a generalprotection fault (#GP) if ECX = 0 and EAX[2:1] has the value 10b; that is, software cannot enable the XSAVE feature set for AVX state but not for SSE state. As noted in Section 13.1, the processor will preserve AVX state unmodified if software clears XCR0[2]. However, clearing XCR0[2] while AVX state is not in its initial configuration may cause SSE instructions to incur a power and performance penalty. See Section 13.5.3, “Enable the Use Of XSAVE Feature Set And XSAVE State 13-4 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET Components” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for how system software can avoid this penalty. • XCR0[4:3] are associated with MPX state (see Section 13.5.4). Software can use the XSAVE feature set to manage MPX state only if XCR0[4:3] = 11b. In addition, software can execute MPX instructions only if CR4.OSXSAVE = 1 and XCR0[4:3] = 11b. Otherwise, any execution of an MPX instruction causes an invalidopcode exception (#UD).1 XCR0[4:3] have value 00b coming out of RESET. As noted in Section 13.2, a processor allows software to set XCR0[4:3] to 11b if and only if CPUID.(EAX=0DH,ECX=0):EAX[4:3] = 11b. In addition, executing the XSETBV instruction causes a general-protection fault (#GP) if ECX = 0, EAX[4:3] is neither 00b nor 11b; that is, software can enable the XSAVE feature set for MPX state only if it does so for both state components. As noted in Section 13.1, the processor will preserve MPX state unmodified if software clears XCR0[4:3]. • XCR0[7:5] are associated with AVX-512 state (see Section 13.5.5). Software can use the XSAVE feature set to manage AVX-512 state only if XCR0[7:5] = 111b. In addition, software can execute AVX-512 instructions only if CR4.OSXSAVE = 1 and XCR0[7:5] = 111b. Otherwise, any execution of an AVX-512 instruction causes an invalid-opcode exception (#UD). XCR0[7:5] have value 000b coming out of RESET. As noted in Section 13.2, a processor allows software to set XCR0[7:5] to 111b if and only if CPUID.(EAX=0DH,ECX=0):EAX[7:5] = 111b. In addition, executing the XSETBV instruction causes a general-protection fault (#GP) if ECX = 0, EAX[7:5] is not 000b, and any bit is clear in EAX[2:1] or EAX[7:5]; that is, software can enable the XSAVE feature set for AVX-512 state only if it does so for all three state components, and only if it also does so for AVX state and SSE state. This implies that the value of XCR0[7:5] is always either 000b or 111b. As noted in Section 13.1, the processor will preserve AVX-512 state unmodified if software clears XCR0[7:5]. However, clearing XCR0[7:5] while AVX-512 state is not in its initial configuration may cause SSE and AVX instructions to incur a power and performance penalty. See Section 13.5.3, “Enable the Use Of XSAVE Feature Set And XSAVE State Components” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for how system software can avoid this penalty. • XCR0[9] is associated with PKRU state (see Section 13.5.7). Software can use the XSAVE feature set to manage PKRU state only if XCR0[9] = 1. The value of XCR0[9] in no way determines whether software can use protection keys or execute other instructions that access PKRU state (these instructions can be executed even if XCR0[9] = 0). XCR0[9] is 0 coming out of RESET. As noted in Section 13.2, a processor allows software to set XCR0[9] if and only if CPUID.(EAX=0DH,ECX=0):EAX[9] = 1. • XCR0[63:10] and XCR0[8] are reserved.2 Executing the XSETBV instruction causes a general-protection fault (#GP) if ECX = 0 and any corresponding bit in EDX:EAX is not 0. These bits in XCR0 are all 0 coming out of RESET. Software operating with CPL > 0 may need to determine whether the XSAVE feature set and certain XSAVEenabled features have been enabled. If CPL > 0, execution of the MOV from CR4 instruction causes a generalprotection fault (#GP). The following alternative mechanisms allow software to discover the enabling of the XSAVE feature set regardless of CPL: • The value of CR4.OSXSAVE is returned in CPUID.1:ECX.OSXSAVE[bit 27]. If software determines that CPUID.1:ECX.OSXSAVE = 1, the processor supports the XSAVE feature set and the feature set has been enabled in CR4. • Executing the XGETBV instruction with ECX = 0 returns the value of XCR0 in EDX:EAX. XGETBV can be executed if CR4.OSXSAVE = 1 (if CPUID.1:ECX.OSXSAVE = 1), regardless of CPL. Thus, software can use the following algorithm to determine the support and enabling for the XSAVE feature set: 1. Use CPUID to discover the value of CPUID.1:ECX.OSXSAVE. — If the bit is 0, either the XSAVE feature set is not supported by the processor or has not been enabled by software. Either way, the XSAVE feature set is not available, nor are XSAVE-enabled features such as AVX. 1. If XCR0[3] = 0, executions of CALL, RET, JMP, and Jcc do not initialize the bounds registers. 2. Bit 8 and bit 13 correspond to supervisor state components. Since bits can be set in XCR0 only for user state components, those bits of XCR0 must be 0. Vol. 1 13-5 MANAGING STATE USING THE XSAVE FEATURE SET — If the bit is 1, the processor supports the XSAVE feature set — including the XGETBV instruction — and it has been enabled by software. The XSAVE feature set can be used to manage x87 state (because XCR0[0] is always 1). Software requiring more detailed information can go on to the next step. 2. Execute XGETBV with ECX = 0 to discover the value of XCR0. If XCR0[1] = 1, the XSAVE feature set can be used to manage SSE state. If XCR0[2] = 1, the XSAVE feature set can be used to manage AVX state and software can execute AVX instructions. If XCR0[4:3] is 11b, the XSAVE feature set can be used to manage MPX state and software can execute MPX instructions. If XCR0[7:5] is 111b, the XSAVE feature set can be used to manage AVX-512 state and software can execute AVX-512 instructions. If XCR0[9] = 1, the XSAVE feature set can be used to manage PKRU state. The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET. If CR4.OSXSAVE = 1, CPUID.(EAX=0DH,ECX=1):EAX[3] = 1, and CPL = 0, executing the WRMSR instruction with ECX = DA0H writes the 64-bit value in EDX:EAX to the IA32_XSS MSR (EAX is written to IA32_XSS[31:0] and EDX to IA32_XSS[63:32]). The following items provide details regarding individual bits in the IA32_XSS MSR: • IA32_XSS[8] is associated with PT state (see Section 13.5.6). Software can use XSAVES and XRSTORS to manage PT state only if IA32_XSS[8] = 1. The value of IA32_XSS[8] does not determine whether software can use Intel Processor Trace (the feature can be used even if IA32_XSS[8] = 0). • IA32_XSS[13] is associated with HDC state (see Section 13.5.8). Software can use XSAVES and XRSTORS to manage HDC state only if IA32_XSS[13] = 1. The value of IA32_XSS[13] does not determine whether software can use hardware duty cycling (the feature can be used even if IA32_XSS[13] = 0). • IA32_XSS[63:14], IA32_XSS[12:9] and IA32_XSS[7:0] are reserved.1 Executing the WRMSR instruction causes a general-protection fault (#GP) if ECX = DA0H and any corresponding bit in EDX:EAX is not 0. These bits in XCR0 are all 0 coming out of RESET. The IA32_XSS MSR is 0 coming out of RESET. There is no mechanism by which software operating with CPL > 0 can discover the value of the IA32_XSS MSR. 13.4 XSAVE AREA The XSAVE feature set includes instructions that save and restore the XSAVE-managed state components to and from memory: XSAVE, XSAVEOPT, XSAVEC, and XSAVES (for saving); and XRSTOR and XRSTORS (for restoring). The processor organizes the state components in a region of memory called an XSAVE area. Each of the save and restore instructions takes a memory operand that specifies the 64-byte aligned base address of the XSAVE area on which it operates. Every XSAVE area has the following format: • The legacy region. The legacy region of an XSAVE area comprises the 512 bytes starting at the area’s base address. It is used to manage the state components for x87 state and SSE state. The legacy region is described in more detail in Section 13.4.1. • The XSAVE header. The XSAVE header of an XSAVE area comprises the 64 bytes starting at an offset of 512 bytes from the area’s base address. The XSAVE header is described in more detail in Section 13.4.2. • The extended region. The extended region of an XSAVE area starts at an offset of 576 bytes from the area’s base address. It is used to manage the state components other than those for x87 state and SSE state. The extended region is described in more detail in Section 13.4.3. The size of the extended region is determined by which state components the processor supports and which bits have been set in XCR0 and IA32_XSS (see Section 13.3). 13.4.1 Legacy Region of an XSAVE Area The legacy region of an XSAVE area comprises the 512 bytes starting at the area’s base address. It has the same format as the FXSAVE area (see Section 10.5.1). The XSAVE feature set uses the legacy area for x87 state (state 1. Bit 9 and bits 7:0 correspond to user state components. Since bits can be set in the IA32_XSS MSR only for supervisor state components, those bits of the MSR must be 0. 13-6 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET component 0) and SSE state (state component 1). Table 13-1 illustrates the format of the first 416 bytes of the legacy region of an XSAVE area. Table 13-1. Format of the Legacy Region of an XSAVE Area 15 14 13 FIP[63:48] or reserved 12 FCS or FIP[47:32] MXCSR_MASK 11 10 9 8 7 6 FIP[31:0] FOP MXCSR FDP[63:48] or reserved 5 4 Rsvd. FTW FDS or FDP[47:32] 3 2 FSW 1 0 0 FCW 16 FDP[31:0] Reserved ST0/MM0 32 Reserved ST1/MM1 48 Reserved ST2/MM2 64 Reserved ST3/MM3 80 Reserved ST4/MM4 96 Reserved ST5/MM5 112 Reserved ST6/MM6 128 Reserved ST7/MM7 144 XMM0 160 XMM1 176 XMM2 192 XMM3 208 XMM4 224 XMM5 240 XMM6 256 XMM7 272 XMM8 288 XMM9 304 XMM10 320 XMM11 336 XMM12 352 XMM13 368 XMM14 384 XMM15 400 The x87 state component comprises bytes 23:0 and bytes 159:32. The SSE state component comprises bytes 31:24 and bytes 415:160. The XSAVE feature set does not use bytes 511:416; bytes 463:416 are reserved. Section 13.7 through Section 13.9 provide details of how instructions in the XSAVE feature set use the legacy region of an XSAVE area. 13.4.2 XSAVE Header The XSAVE header of an XSAVE area comprises the 64 bytes starting at offset 512 from the area’s base address: • Bytes 7:0 of the XSAVE header is a state-component bitmap (see Section 13.1) called XSTATE_BV. It identifies the state components in the XSAVE area. Vol. 1 13-7 MANAGING STATE USING THE XSAVE FEATURE SET • Bytes 15:8 of the XSAVE header is a state-component bitmap called XCOMP_BV. It is used as follows: — XCOMP_BV[63] indicates the format of the extended region of the XSAVE area (see Section 13.4.3). If it is clear, the standard format is used. If it is set, the compacted format is used; XCOMP_BV[62:0] provide format specifics as specified in Section 13.4.3. — XCOMP_BV[63] determines which form of the XRSTOR instruction is used. If the bit is set, the compacted form is used; otherwise, the standard form is used. See Section 13.8. — All bits in XCOMP_BV should be 0 if the processor does not support the compaction extensions to the XSAVE feature set. • Bytes 63:16 of the XSAVE header are reserved. Section 13.7 through Section 13.9 provide details of how instructions in the XSAVE feature set use the XSAVE header of an XSAVE area. 13.4.3 Extended Region of an XSAVE Area The extended region of an XSAVE area starts at byte offset 576 from the area’s base address. The size of the extended region is determined by which state components the processor supports and which bits have been set in XCR0 | IA32_XSS (see Section 13.3). The XSAVE feature set uses the extended area for each state component i, where i ≥ 2. The following state components are currently supported in the extended area: state component 2 contains AVX state; state components 5–7 contain AVX-512 state; and state component 9 contains PKRU state. The extended region of the an XSAVE area may have one of two formats. The standard format is supported by all processors that support the XSAVE feature set; the compacted format is supported by those processors that support the compaction extensions to the XSAVE feature set (see Section 13.2). Bit 63 of the XCOMP_BV field in the XSAVE header (see Section 13.4.2) indicates which format is used. The following items describe the two possible formats of the extended region: • Standard format. Each state component i (i ≥ 2) is located at the byte offset from the base address of the XSAVE area enumerated in CPUID.(EAX=0DH,ECX=i):EBX. (CPUID.(EAX=0DH,ECX=i):EAX enumerates the number of bytes required for state component i. • Compacted format. Each state component i (i ≥ 2) is located at a byte offset from the base address of the XSAVE area based on the XCOMP_BV field in the XSAVE header: — If XCOMP_BV[i] = 0, state component i is not in the XSAVE area. — If XCOMP_BV[i] = 1, state component i is located at a byte offset locationI from the base address of the XSAVE area, where locationI is determined by the following items: • If XCOMP_BV[j] = 0 for every j, 2 ≤ j < i, locationI is 576. (This item applies if i is the first bit set in bits 62:2 of the XCOMP_BV; it implies that state component i is located at the beginning of the extended region.) • Otherwise, let j, 2 ≤ j < i, be the greatest value such that XCOMP_BV[j] = 1. Then locationI is determined by the following values: locationJ; sizeJ, as enumerated in CPUID.(EAX=0DH,ECX=j):EAX; and the value of alignI, as enumerated in CPUID.(EAX=0DH,ECX=i):ECX[1]: 13.5 — If alignI = 0, locationI = locationJ + sizeJ. (This item implies that state component i is located immediately following the preceding state component whose bit is set in XCOMP_BV.) — If alignI = 1, locationI = ceiling(locationJ + sizeJ, 64). (This item implies that state component i is located on the next 64-byte boundary following the preceding state component whose bit is set in XCOMP_BV.) XSAVE-MANAGED STATE The section provides details regarding how the XSAVE feature set interacts with the various XSAVE-managed state components. 13-8 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET Unless otherwise state, the state pertaining to a particular state component is saved beginning at byte 0 of the section of the XSAVE are corresponding to that state component. 13.5.1 x87 State Instructions in the XSAVE feature set can manage the same state of the x87 FPU execution environment (x87 state) that can be managed using the FXSAVE and FXRSTOR instructions. They organize all x87 state as a user state component in the legacy region of the XSAVE area (see Section 13.4.1). This region is illustrated in Table 13-1; the x87 state is listed below, along with details of its interactions with the XSAVE feature set: • Bytes 1:0, 3:2, 7:6. These are used for the x87 FPU Control Word (FCW), the x87 FPU Status Word (FSW), and the x87 FPU Opcode (FOP), respectively. • Byte 4 is used for an abridged version of the x87 FPU Tag Word (FTW). The following items describe its usage: — For each j, 0 ≤ j ≤ 7, XSAVE, XSAVEOPT, XSAVEC, and XSAVES save a 0 into bit j of byte 4 if x87 FPU data register STj has a empty tag; otherwise, XSAVE, XSAVEOPT, XSAVEC, and XSAVES save a 1 into bit j of byte 4. — For each j, 0 ≤ j ≤ 7, XRSTOR and XRSTORS establish the tag value for x87 FPU data register STj as follows. If bit j of byte 4 is 0, the tag for STj in the tag register for that data register is marked empty (11B); otherwise, the x87 FPU sets the tag for STj based on the value being loaded into that register (see below). • Bytes 15:8 are used as follows: — If the instruction has no REX prefix, or if REX.W = 0: • • • Bytes 11:8 are used for bits 31:0 of the x87 FPU Instruction Pointer Offset (FIP). If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0, bytes 13:12 are used for x87 FPU Instruction Pointer Selector (FCS). Otherwise, XSAVE, XSAVEOPT, XSAVEC, and XSAVES save these bytes as 0000H, and XRSTOR and XRSTORS ignore them. Bytes 15:14 are not used. — If the instruction has a REX prefix with REX.W = 1, bytes 15:8 are used for the full 64 bits of FIP. • Bytes 23:16 are used as follows: — If the instruction has no REX prefix, or if REX.W = 0: • • • Bytes 19:16 are used for bits 31:0 of the x87 FPU Data Pointer Offset (FDP). If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0, bytes 21:20 are used for x87 FPU Data Pointer Selector (FDS). Otherwise, XSAVE, XSAVEOPT, XSAVEC, and XSAVES save these bytes as 0000H; and XRSTOR and XRSTORS ignore them. Bytes 23:22 are not used. — If the instruction has a REX prefix with REX.W = 1, bytes 23:16 are used for the full 64 bits of FDP. • • Bytes 31:24 are used for SSE state (see Section 13.5.2). Bytes 159:32 are used for the registers ST0–ST7 (MM0–MM7). Each of the 8 register is allocated a 128-bit region, with the low 80 bits used for the register and the upper 48 bits unused. x87 state is XSAVE-managed but the x87 FPU feature is not XSAVE-enabled. The XSAVE feature set can operate on x87 state only if the feature set is enabled (CR4.OSXSAVE = 1).1 Software can otherwise use x87 state even if the XSAVE feature set is not enabled. 13.5.2 SSE State Instructions in the XSAVE feature set can manage the registers used by the streaming SIMD extensions (SSE state) just as the FXSAVE and FXRSTOR instructions do. They organize all SSE state as a user state component in the legacy region of the XSAVE area (see Section 13.4.1). This region is illustrated in Table 13-1; the SSE state is listed below, along with details of its interactions with the XSAVE feature set: 1. The processor ensures that XCR0[0] is always 1. Vol. 1 13-9 MANAGING STATE USING THE XSAVE FEATURE SET • • Bytes 23:0 are used for x87 state (see Section 13.5.1). • • • • Bytes 31:28 are used for the MXCSR_MASK value. XRSTOR and XRSTORS ignore this field. Bytes 27:24 are used for the MXCSR register. XRSTOR and XRSTORS generate general-protection faults (#GP) in response to attempts to set any of the reserved bits of the MXCSR register.1 Bytes 159:32 are used for x87 state. Bytes 287:160 are used for the registers XMM0–XMM7. Bytes 415:288 are used for the registers XMM8–XMM15. These fields are used only in 64-bit mode. Executions of XSAVE, XSAVEOPT, XSAVEC, and XSAVES outside 64-bit mode do not modify these bytes; executions of XRSTOR and XRSTORS outside 64-bit mode do not update XMM8–XMM15. See Section 13.13. SSE state is XSAVE-managed but the SSE feature is not XSAVE-enabled. The XSAVE feature set can operate on SSE state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage SSE state (XCR0[1] = 1). Software can otherwise use SSE state even if the XSAVE feature set is not enabled or has not been configured to manage SSE state. 13.5.3 AVX State The register state used by the Intel® Advanced Vector Extensions (AVX) comprises the MXCSR register and 16 256bit vector registers called YMM0–YMM15. The low 128 bits of each register YMMi is identical to the SSE register XMMi. Thus, the new state register state added by AVX comprises the upper 128 bits of the registers YMM0– YMM15. These 16 128-bit values are denoted YMM0_H–YMM15_H and are collectively called AVX state. As noted in Section 13.1, the XSAVE feature set manages AVX state as user state component 2. Thus, AVX state is located in the extended region of the XSAVE area (see Section 13.4.3). As noted in Section 13.2, CPUID.(EAX=0DH,ECX=2):EBX enumerates the offset (in bytes, from the base of the XSAVE area) of the section of the extended region of the XSAVE area used for AVX state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=2):EAX enumerates the size (in bytes) required for AVX state. The XSAVE feature set partitions YMM0_H–YMM15_H in a manner similar to that used for the XMM registers (see Section 13.5.2). Bytes 127:0 of the AVX-state section are used for YMM0_H–YMM7_H. Bytes 255:128 are used for YMM8_H–YMM15_H, but they are used only in 64-bit mode. Executions of XSAVE, XSAVEOPT, XSAVEC, and XSAVES outside 64-bit mode do not modify bytes 255:128; executions of XRSTOR and XRSTORS outside 64-bit mode do not update YMM8_H–YMM15_H. See Section 13.13. In general, bytes 16i+15:16i are used for YMMi_H (for 0 ≤ i ≤ 15). AVX state is XSAVE-managed and the AVX feature is XSAVE-enabled. The XSAVE feature set can operate on AVX state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage AVX state (XCR0[2] = 1). AVX instructions cannot be used unless the XSAVE feature set is enabled and has been configured to manage AVX state. 13.5.4 MPX State The register state used by the Intel® Memory Protection Extensions (MPX) comprises the 4 128-bit bounds registers BND0–BND3 (BNDREGS state); and the 64-bit user-mode configuration register BNDCFGU and the 64-bit MPX status register BNDSTATUS (collectively, BNDCSR state). Together, these two user state components compose MPX state. As noted in Section 13.1, the XSAVE feature set manages MPX state as state components 3–4. Thus, MPX state is located in the extended region of the XSAVE area (see Section 13.4.3). The following items detail how these state components are organized in this region: • BNDREGS state. As noted in Section 13.2, CPUID.(EAX=0DH,ECX=3):EBX enumerates the offset (in bytes, from the base of the XSAVE area) of the section of the extended region of the XSAVE area used for BNDREGS state (when the 1. While MXCSR and MXCSR_MASK are part of SSE state, their treatment by the XSAVE feature set is not the same as that of the XMM registers. See Section 13.7 through Section 13.11 for details. 13-10 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET standard format of the extended region is used). CPUID.(EAX=0DH,ECX=3):EAX enumerates the size (in bytes) required for BNDREGS state. The BNDREGS section is used for the 4 128-bit bound registers BND0– BND3, with bytes 16i+15:16i being used for BNDi. • BNDCSR state. As noted in Section 13.2, CPUID.(EAX=0DH,ECX=4):EBX enumerates the offset of the section of the extended region of the XSAVE area used for BNDCSR state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=4):EAX enumerates the size (in bytes) required for BNDCSR state. In the BNDSCR section, bytes 7:0 are used for BNDCFGU and bytes 15:8 are used for BNDSTATUS. Both components of MPX state are XSAVE-managed and the MPX feature is XSAVE-enabled. The XSAVE feature set can operate on MPX state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage MPX state (XCR0[4:3] = 11b). MPX instructions cannot be used unless the XSAVE feature set is enabled and has been configured to manage MPX state. 13.5.5 AVX-512 State The register state used by the Intel® Advanced Vector Extensions 512 (AVX-512) comprises the MXCSR register, the 8 64-bit opmask registers k0–k7, and 32 512-bit vector registers called ZMM0–ZMM31. For each i, 0 <= i <= 15, the low 256 bits of register ZMMi is identical to the AVX register YMMi. Thus, the new state register state added by AVX comprises the following user state components: • • The opmask registers, collectively called opmask state. • The 16 512-bit registers ZMM16–ZMM31, collectively called Hi16_ZMM state. The upper 256 bits of the registers ZMM0–ZMM15. These 16 256-bit values are denoted ZMM0_H–ZMM15_H and are collectively called ZMM_Hi256 state. Together, these three state components compose AVX-512 state. As noted in Section 13.1, the XSAVE feature set manages AVX-512 state as state components 5–7. Thus, AVX-512 state is located in the extended region of the XSAVE area (see Section 13.4.3). The following items detail how these state components are organized in this region: • Opmask state. As noted in Section 13.2, CPUID.(EAX=0DH,ECX=5):EBX enumerates the offset (in bytes, from the base of the XSAVE area) of the section of the extended region of the XSAVE area used for opmask state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=5):EAX enumerates the size (in bytes) required for opmask state. The opmask section is used for the 8 64-bit opmask registers k0–k7, with bytes 8i+7:8i being used for ki. • ZMM_Hi256 state. As noted in Section 13.2, CPUID.(EAX=0DH,ECX=6):EBX enumerates the offset of the section of the extended region of the XSAVE area used for ZMM_Hi256 state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=6):EAX enumerates the size (in bytes) required for ZMM_Hi256 state. • The XSAVE feature set partitions ZMM0_H–ZMM15_H in a manner similar to that used for the XMM registers (see Section 13.5.2). Bytes 255:0 of the ZMM_Hi256-state section are used for ZMM0_H–ZMM7_H. Bytes 511:256 are used for ZMM8_H–ZMM15_H, but they are used only in 64-bit mode. Executions of XSAVE, XSAVEOPT, XSAVEC, and XSAVES outside 64-bit mode do not modify bytes 511:256; executions of XRSTOR and XRSTORS outside 64-bit mode do not update ZMM8_H–ZMM15_H. See Section 13.13. In general, bytes 32i+31:32i are used for ZMMi_H (for 0 ≤ i ≤ 15). Hi16_ZMM state. As noted in Section 13.2, CPUID.(EAX=0DH,ECX=7):EBX enumerates the offset of the section of the extended region of the XSAVE area used for Hi16_ZMM state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=7):EAX enumerates the size (in bytes) required for Hi16_ZMM state. The XSAVE feature set accesses Hi16_ZMM state only in 64-bit mode. Executions of XSAVE, XSAVEOPT, XSAVEC, and XSAVES outside 64-bit mode do not modify the Hi16_ZMM section; executions of XRSTOR and XRSTORS outside 64-bit mode do not update ZMM16–ZMM31. See Section 13.13. In general, bytes 64(i-16)+63:64(i-16) are used for ZMMi (for 16 ≤ i ≤ 31). All three components of AVX-512 state are XSAVE-managed and the AVX-512 feature is XSAVE-enabled. The XSAVE feature set can operate on AVX-512 state only if the feature set is enabled (CR4.OSXSAVE = 1) and has Vol. 1 13-11 MANAGING STATE USING THE XSAVE FEATURE SET been configured to manage AVX-512 state (XCR0[7:5] = 111b). AVX-512 instructions cannot be used unless the XSAVE feature set is enabled and has been configured to manage AVX-512 state. 13.5.6 PT State The register state used by Intel Processor Trace (PT state) comprises the following 9 MSRs: IA32_RTIT_CTL, IA32_RTIT_OUTPUT_BASE, IA32_RTIT_OUTPUT_MASK_PTRS, IA32_RTIT_STATUS, IA32_RTIT_CR3_MATCH, IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B, IA32_RTIT_ADDR1_A, and IA32_RTIT_ADDR1_B.1 As noted in Section 13.1, the XSAVE feature set manages PT state as supervisor state component 8. Thus, PT state is located in the extended region of the XSAVE area (see Section 13.4.3). As noted in Section 13.2, CPUID.(EAX=0DH,ECX=8):EAX enumerates the size (in bytes) required for PT state. The MSRs are each allocated 8 bytes in the state component in the order given above. Thus, IA32_RTIT_CTL is at byte offset 0, IA32_RTIT_OUTPUT_BASE at byte offset 8, etc. Any locations in the state component at or beyond byte offset 72 are reserved. PT state is XSAVE-managed but Intel Processor Trace is not XSAVE-enabled. The XSAVE feature set can operate on PT state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage PT state (IA32_XSS[8] = 1). Software can otherwise use Intel Processor Trace and access its MSRs (using RDMSR and WRMSR) even if the XSAVE feature set is not enabled or has not been configured to manage PT state. The following items describe special treatment of PT state by the XSAVES and XRSTORS instructions: • If XSAVES saves PT state, the instruction clears IA32_RTIT_CTL.TraceEn (bit 0) after saving the value of the IA32_RTIT_CTL MSR and before saving any other PT state. If XSAVES causes a fault or a VM exit, it restores IA32_RTIT_CTL.TraceEn to its original value. • • If XSAVES saves PT state, the instruction saves zeroes in the reserved portions of the state component. • If XRSTORS causes an exception or a VM exit, it does so before any modification to IA32_RTIT_CTL.TraceEn (even if it has loaded other PT state). If XRSTORS would restore (or initialize) PT state and IA32_RTIT_CTL.TraceEn = 1, the instruction causes a general-protection exception (#GP) before modifying PT state. 13.5.7 PKRU State The register state used by the protection-key feature (PKRU state) is the 32-bit PKRU register. As noted in Section 13.1, the XSAVE feature set manages PKRU state as user state component 9. Thus, PKRU state is located in the extended region of the XSAVE area (see Section 13.4.3). As noted in Section 13.2, CPUID.(EAX=0DH,ECX=9):EBX enumerates the offset (in bytes, from the base of the XSAVE area) of the section of the extended region of the XSAVE area used for PKRU state (when the standard format of the extended region is used). CPUID.(EAX=0DH,ECX=9):EAX enumerates the size (in bytes) required for PKRU state. The XSAVE feature set uses bytes 3:0 of the PK-state section for the PKRU register. PKRU state is XSAVE-managed but the protection-key feature is not XSAVE-enabled. The XSAVE feature set can operate on PKRU state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage PKRU state (XCR0[9] = 1). Software can otherwise use protection keys and access PKRU state even if the XSAVE feature set is not enabled or has not been configured to manage PKRU state. The value of the PKRU register determines the access rights for user-mode linear addresses. (See Section 4.6, “Access Rights,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.) The access rights that pertain to an execution of the XRSTOR and XRSTORS instructions are determined by the value of the register before the execution and not by any value that the execution might load into the PKRU register. 1. These MSRs might not be supported by every processor that supports Intel Processor Trace. Software can use the CPUID instruction to discover which are supported; see Section 35.3.1, “Detection of Intel Processor Trace and Capability Enumeration,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. 13-12 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET 13.5.8 HDC State The register state used by hardware duty cycling (HDC state) comprises the IA32_PM_CTL1 MSR. As noted in Section 13.1, the XSAVE feature set manages HDC state as supervisor state component 13. Thus, HDC state is located in the extended region of the XSAVE area (see Section 13.4.3). As noted in Section 13.2, CPUID.(EAX=0DH,ECX=13):EAX enumerates the size (in bytes) required for PT state. The IA32_PM_CTL1 MSR is allocated 8 bytes at byte offset 0 in the state component. HDC state is XSAVE-managed but hardware duty cycling is not XSAVE-enabled. The XSAVE feature set can operate on HDC state only if the feature set is enabled (CR4.OSXSAVE = 1) and has been configured to manage HDC state (IA32_XSS[13] = 1). Software can otherwise use hardware duty cycle and access the IA32_PM_CTL1 MSR (using RDMSR and WRMSR) even if the XSAVE feature set is not enabled or has not been configured to manage HDC state. 13.6 PROCESSOR TRACKING OF XSAVE-MANAGED STATE The XSAVEOPT, XSAVEC, and XSAVES instructions use two optimizations to reduce the amount of data that they write to memory. They avoid writing data for any state component known to be in its initial configuration (the init optimization). In addition, if either XSAVEOPT or XSAVES is using the same XSAVE area as that used by the most recent execution of XRSTOR or XRSTORS, it may avoid writing data for any state component whose configuration is known not to have been modified since then (the modified optimization). (XSAVE does not use these optimizations, and XSAVEC does not use the modified optimization.) The operation of XSAVEOPT, XSAVEC, and XSAVES are described in more detail in Section 13.9 through Section 13.11. A processor can support the init and modified optimizations with special hardware that tracks the state components that might benefit from those optimizations. Other implementations might not include such hardware; such a processor would always consider each such state component as not in its initial configuration and as modified since the last execution of XRSTOR or XRSTORS. The following notation describes the state of the init and modified optimizations: • XINUSE denotes the state-component bitmap corresponding to the init optimization. If XINUSE[i] = 0, state component i is known to be in its initial configuration; otherwise XINUSE[i] = 1. It is possible for XINUSE[i] to be 1 even when state component i is in its initial configuration. On a processor that does not support the init optimization, XINUSE[i] is always 1 for every value of i. Executing XGETBV with ECX = 1 returns in EDX:EAX the logical-AND of XCR0 and the current value of the XINUSE state-component bitmap. Such an execution of XGETBV always sets EAX[1] to 1 if XCR0[1] = 1 and MXCSR does not have its RESET value of 1F80H. Section 13.2 explains how software can determine whether a processor supports this use of XGETBV. • XMODIFIED denotes the state-component bitmap corresponding to the modified optimization. If XMODIFIED[i] = 0, state component i is known not to have been modified since the most recent execution of XRSTOR or XRSTORS; otherwise XMODIFIED[i] = 1. It is possible for XMODIFIED[i] to be 1 even when state component i has not been modified since the most recent execution of XRSTOR or XRSTORS. On a processor that does not support the modified optimization, XMODIFIED[i] is always 1 for every value of i. A processor that implements the modified optimization saves information about the most recent execution of XRSTOR or XRSTORS in a quantity called XRSTOR_INFO, a 4-tuple containing the following: (1) the CPL; (2) whether the logical processor was in VMX non-root operation; (3) the linear address of the XSAVE area; and (4) the XCOMP_BV field in the XSAVE area. An execution of XSAVEOPT or XSAVES uses the modified optimization only if that execution corresponds to XRSTOR_INFO on these four parameters. This mechanism implies that, depending on details of the operating system, the processor might determine that an execution of XSAVEOPT by one user application corresponds to an earlier execution of XRSTOR by a different application. For this reason, Intel recommends the application software not use the XSAVEOPT instruction. The following items specify the initial configuration each state component (for the purposes of defining the XINUSE bitmap): • x87 state. x87 state is in its initial configuration if the following all hold: FCW is 037FH; FSW is 0000H; FTW is FFFFH; FCS and FDS are each 0000H; FIP and FDP are each 00000000_00000000H; each of ST0–ST7 is 0000_00000000_00000000H. Vol. 1 13-13 MANAGING STATE USING THE XSAVE FEATURE SET • SSE state. In 64-bit mode, SSE state is in its initial configuration if each of XMM0–XMM15 is 0. Outside 64-bit mode, SSE state is in its initial configuration if each of XMM0–XMM7 is 0. XINUSE[1] pertains only to the state of the XMM registers and not to MXCSR. An execution of XRSTOR or XRSTORS outside 64-bit mode does not update XMM8–XMM15. (See Section 13.13.) • AVX state. In 64-bit mode, AVX state is in its initial configuration if each of YMM0_H–YMM15_H is 0. Outside 64-bit mode, AVX state is in its initial configuration if each of YMM0_H–YMM7_H is 0. An execution of XRSTOR or XRSTORS outside 64-bit mode does not update YMM8_H–YMM15_H. (See Section 13.13.) • • • • BNDREGS state. BNDREGS state is in its initial configuration if the value of each of BND0–BND3 is 0. • Hi16_ZMM state. In 64-bit mode, Hi16_ZMM state is in its initial configuration if each of ZMM16–ZMM31 is 0. Outside 64-bit mode, Hi16_ZMM state is always in its initial configuration. An execution of XRSTOR or XRSTORS outside 64-bit mode does not update ZMM31–ZMM31. (See Section 13.13.) • • • PT state. PT state is in its initial configuration if each of the 9 MSRs is 0. BNDCSR state. BNDCSR state is in its initial configuration if BNDCFGU and BNDCSR each has value 0. Opmask state. Opmask state is in its initial configuration if each of the opmask registers k0–k7 is 0. ZMM_Hi256 state. In 64-bit mode, ZMM_Hi256 state is in its initial configuration if each of ZMM0_H– ZMM15_H is 0. Outside 64-bit mode, ZMM_Hi256 state is in its initial configuration if each of ZMM0_H–ZMM7_H is 0. An execution of XRSTOR or XRSTORS outside 64-bit mode does not update ZMM8_H–ZMM15_H. (See Section 13.13.) PKRU state. PKRU state is in its initial configuration if the value of the PKRU is 0. HDC state. HDC state is in its initial configuration if the value of the IA32_PM_CTL1 MSR is 1. 13.7 OPERATION OF XSAVE The XSAVE instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. The logical-AND of XCR0 and the instruction mask is the requested-feature bitmap (RFBM) of the user state components to be saved. The following conditions cause execution of the XSAVE instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs.1 If none of these conditions cause a fault, execution of XSAVE reads the XSTATE_BV field of the XSAVE header (see Section 13.4.2) and writes it back to memory, setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows: • • If RFBM[i] = 0, XSTATE_BV[i] is not changed. If RFBM[i] = 1, XSTATE_BV[i] is set to the value of XINUSE[i]. Section 13.6 defines XINUSE to describe the processor init optimization and specifies the initial configuration of each state component. The nature of that optimization implies the following: — If state component i is in its initial configuration, XINUSE[i] may be either 0 or 1, and XSTATE_BV[i] may be written with either 0 or 1. XINUSE[1] pertains only to the state of the XMM registers and not to MXCSR. Thus, XSTATE_BV[1] may be written with 0 even if MXCSR does not have its RESET value of 1F80H. — If state component i is not in its initial configuration, XINUSE[i] = 1 and XSTATE_BV[i] is written with 1. (As explained in Section 13.6, the initial configurations of some state components may depend on whether the processor is in 64-bit mode.) The XSAVE instruction does not write any part of the XSAVE header other than the XSTATE_BV field; in particular, it does not write to the XCOMP_BV field. 1. If CR0.AM = 1, CPL = 3, and EFLAGS.AC =1, an alignment-check exception (#AC) may occur instead of #GP. 13-14 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET Execution of XSAVE saves into the XSAVE area those state components corresponding to bits that are set in RFBM. State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the XSAVE instruction always uses the standard format for the extended region (see Section 13.4.3). The MXCSR register and MXCSR_MASK are part of SSE state (see Section 13.5.2) and are thus associated with RFBM[1]. However, the XSAVE instruction also saves these values when RFBM[2] = 1 (even if RFBM[1] = 0). See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses. 13.8 OPERATION OF XRSTOR The XRSTOR instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. The logical-AND of XCR0 and the instruction mask is the requested-feature bitmap (RFBM) of the user state components to be restored. The following conditions cause execution of the XRSTOR instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs.1 After checking for these faults, the XRSTOR instruction reads the XCOMP_BV field in the XSAVE area’s XSAVE header (see Section 13.4.2). If XCOMP_BV[63] = 0, the standard form of XRSTOR is executed (see Section 13.8.1); otherwise, the compacted form of XRSTOR is executed (see Section 13.8.2).2 See Section 13.2 for details of how to determine whether the compacted form of XRSTOR is supported. 13.8.1 Standard Form of XRSTOR The standard from of XRSTOR performs additional fault checking. Either of the following conditions causes a general-protection exception (#GP): • • The XSTATE_BV field of the XSAVE header sets a bit that is not set in XCR0. Bytes 23:8 of the XSAVE header are not all 0 (this implies that all bits in XCOMP_BV are 0).3 If none of these conditions cause a fault, the processor updates each state component i for which RFBM[i] = 1. XRSTOR updates state component i based on the value of bit i in the XSTATE_BV field of the XSAVE header: • If XSTATE_BV[i] = 0, the state component is set to its initial configuration. Section 13.6 specifies the initial configuration of each state component. The initial configuration of state component 1 pertains only to the XMM registers and not to MXCSR. See below for the treatment of MXCSR • If XSTATE_BV[i] = 1, the state component is loaded with data from the XSAVE area. See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses. 1. If CR0.AM = 1, CPL = 3, and EFLAGS.AC =1, an alignment-check exception (#AC) may occur instead of #GP. 2. If the processor does not support the compacted form of XRSTOR, it may execute the standard form of XRSTOR without first reading the XCOMP_BV field. A processor supports the compacted form of XRSTOR only if it enumerates CPUID.(EAX=0DH,ECX=1):EAX[1] as 1. 3. Bytes 63:24 of the XSAVE header are also reserved. Software should ensure that bytes 63:16 of the XSAVE header are all 0 in any XSAVE area. (Bytes 15:8 should also be 0 if the XSAVE area is to be used on a processor that does not support the compaction extensions to the XSAVE feature set.) Vol. 1 13-15 MANAGING STATE USING THE XSAVE FEATURE SET State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the standard form of XRSTOR uses the standard format for the extended region (see Section 13.4.3). The MXCSR register is part of state component 1, SSE state (see Section 13.5.2). However, the standard form of XRSTOR loads the MXCSR register from memory whenever the RFBM[1] (SSE) or RFBM[2] (AVX) is set, regardless of the values of XSTATE_BV[1] and XSTATE_BV[2]. The standard form of XRSTOR causes a general-protection exception (#GP) if it would load MXCSR with an illegal value. 13.8.2 Compacted Form of XRSTOR The compacted from of XRSTOR performs additional fault checking. Any of the following conditions causes a #GP: • • • The XCOMP_BV field of the XSAVE header sets a bit in the range 62:0 that is not set in XCR0. The XSTATE_BV field of the XSAVE header sets a bit (including bit 63) that is not set in XCOMP_BV. Bytes 63:16 of the XSAVE header are not all 0. If none of these conditions cause a fault, the processor updates each state component i for which RFBM[i] = 1. XRSTOR updates state component i based on the value of bit i in the XSTATE_BV field of the XSAVE header: • If XSTATE_BV[i] = 0, the state component is set to its initial configuration. Section 13.6 specifies the initial configuration of each state component. If XSTATE_BV[1] = 0, the compacted form XRSTOR initializes MXCSR to 1F80H. (This differs from the standard from of XRSTOR, which loads MXCSR from the XSAVE area whenever either RFBM[1] or RFBM[2] is set.) State component i is set to its initial configuration as indicated above if RFBM[i] = 1 and XSTATE_BV[i] = 0 — even if XCOMP_BV[i] = 0. This is true for all values of i, including 0 (x87 state) and 1 (SSE state). • If XSTATE_BV[i] = 1, the state component is loaded with data from the XSAVE area.1 See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses. State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the compacted form of the XRSTOR instruction uses the compacted format for the extended region (see Section 13.4.3). The MXCSR register is part of SSE state (see Section 13.5.2) and is thus loaded from memory if RFBM[1] = XSTATE_BV[i] = 1. The compacted form of XRSTOR does not consider RFBM[2] (AVX) when determining whether to update MXCSR. (This is a difference from the standard form of XRSTOR.) The compacted form of XRSTOR causes a general-protection exception (#GP) if it would load MXCSR with an illegal value. 13.8.3 XRSTOR and the Init and Modified Optimizations Execution of the XRSTOR instruction causes the processor to update its tracking for the init and modified optimizations (see Section 13.6). The following items provide details: • The processor updates its tracking for the init optimization as follows: — If RFBM[i] = 0, XINUSE[i] is not changed. — If RFBM[i] = 1 and XSTATE_BV[i] = 0, state component i may be tracked as init; XINUSE[i] may be set to 0 or 1. (As noted in Section 13.6, a processor need not implement the init optimization for state component i; a processor that does not do so implicitly maintains XINUSE[i] = 1 at all times.) — If RFBM[i] = 1 and XSTATE_BV[i] = 1, state component i is tracked as not init; XINUSE[i] is set to 1. • The processor updates its tracking for the modified optimization and records information about the XRSTOR execution for future interaction with the XSAVEOPT and XSAVES instructions (see Section 13.9 and Section 13.11) as follows: — If RFBM[i] = 0, state component i is tracked as modified; XMODIFIED[i] is set to 1. 1. Earlier fault checking ensured that, if the instruction has reached this point in execution and XSTATE_BV[i] is 1, then XCOMP_BV[i] is also 1. 13-16 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET — If RFBM[i] = 1, state component i may be tracked as unmodified; XMODIFIED[i] may be set to 0 or 1. (As noted in Section 13.6, a processor need not implement the modified optimization for state component i; a processor that does not do so implicitly maintains XMODIFIED[i] = 1 at all times.) — XRSTOR_INFO is set to the 4-tuple w,x,y,z, where w is the CPL (0); x is 1 if the logical processor is in VMX non-root operation and 0 otherwise; y is the linear address of the XSAVE area; and z is XCOMP_BV. In particular, the standard form of XRSTOR always sets z to all zeroes, while the compacted form of XRSTORS never does so (because it sets at least bit 63 to 1). 13.9 OPERATION OF XSAVEOPT The operation of XSAVEOPT is similar to that of XSAVE. Unlike XSAVE, XSAVEOPT uses the init optimization (by which it may omit saving state components that are in their initial configuration) and the modified optimization (by which it may omit saving state components that have not been modified since the last execution of XRSTOR); see Section 13.6. See Section 13.2 for details of how to determine whether XSAVEOPT is supported. The XSAVEOPT instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. The logical (bitwise) AND of XCR0 and the instruction mask is the requested-feature bitmap (RFBM) of the user state components to be saved. The following conditions cause execution of the XSAVEOPT instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs.1 If none of these conditions cause a fault, execution of XSAVEOPT reads the XSTATE_BV field of the XSAVE header (see Section 13.4.2) and writes it back to memory, setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows: • • If RFBM[i] = 0, XSTATE_BV[i] is not changed. If RFBM[i] = 1, XSTATE_BV[i] is set to the value of XINUSE[i]. Section 13.6 defines XINUSE to describe the processor init optimization and specifies the initial configuration of each state component. The nature of that optimization implies the following: — If the state component is in its initial configuration, XINUSE[i] may be either 0 or 1, and XSTATE_BV[i] may be written with either 0 or 1. XINUSE[1] pertains only to the state of the XMM registers and not to MXCSR. Thus, XSTATE_BV[1] may be written with 0 even if MXCSR does not have its RESET value of 1F80H. — If the state component is not in its initial configuration, XSTATE_BV[i] is written with 1. (As explained in Section 13.6, the initial configurations of some state components may depend on whether the processor is in 64-bit mode.) The XSAVEOPT instruction does not write any part of the XSAVE header other than the XSTATE_BV field; in particular, it does not write to the XCOMP_BV field. Execution of XSAVEOPT saves into the XSAVE area those state components corresponding to bits that are set in RFBM (subject to the optimizations described below). State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the XSAVEOPT instruction always uses the standard format for the extended region (see Section 13.4.3). See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses. Execution of XSAVEOPT performs two optimizations that reduce the amount of data written to memory: 1. If CR0.AM = 1, CPL = 3, and EFLAGS.AC =1, an alignment-check exception (#AC) may occur instead of #GP. Vol. 1 13-17 MANAGING STATE USING THE XSAVE FEATURE SET • Init optimization. If XINUSE[i] = 0, state component i is not saved to the XSAVE area (even if RFBM[i] = 1). (See below for exceptions made for MXCSR.) • Modified optimization. Each execution of XRSTOR and XRSTORS establishes XRSTOR_INFO as a 4-tuple w,x,y,z (see Section 13.8.3 and Section 13.12). Execution of XSAVEOPT uses the modified optimization only if the following all hold for the current value of XRSTOR_INFO: — w = CPL; — x = 1 if and only if the logical processor is in VMX non-root operation; — y is the linear address of the XSAVE area being used by XSAVEOPT; and — z is 00000000_00000000H. (This last item implies that XSAVEOPT does not use the modified optimization if the last execution of XRSTOR used the compacted form, or if an execution of XRSTORS followed the last execution of XRSTOR.) If XSAVEOPT uses the modified optimization and XMODIFIED[i] = 0 (see Section 13.6), state component i is not saved to the XSAVE area. (In practice, the benefit of the modified optimization for state component i depends on how the processor is tracking state component i; see Section 13.6. Limitations on the tracking ability may result in state component i being saved even though is in the same configuration that was loaded by the previous execution of XRSTOR.) Depending on details of the operating system, an execution of XSAVEOPT by a user application might use the modified optimization when the most recent execution of XRSTOR was by a different application. Because of this, Intel recommends the application software not use the XSAVEOPT instruction. The MXCSR register and MXCSR_MASK are part of SSE state (see Section 13.5.2) and are thus associated with bit 1 of RFBM. However, the XSAVEOPT instruction also saves these values when RFBM[2] = 1 (even if RFBM[1] = 0). The init and modified optimizations do not apply to the MXCSR register and MXCSR_MASK. 13.10 OPERATION OF XSAVEC The operation of XSAVEC is similar to that of XSAVE. Two main differences are (1) XSAVEC uses the compacted format for the extended region of the XSAVE area; and (2) XSAVEC uses the init optimization (see Section 13.6). Unlike XSAVEOPT, XSAVEC does not use the modified optimization. See Section 13.2 for details of how to determine whether XSAVEC is supported. The XSAVEC instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. The logical (bitwise) AND of XCR0 and the instruction mask is the requested-feature bitmap (RFBM) of the user state components to be saved. The following conditions cause execution of the XSAVEC instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs.1 If none of these conditions cause a fault, execution of XSAVEC writes the XSTATE_BV field of the XSAVE header (see Section 13.4.2), setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows:2 • • If RFBM[i] = 0, XSTATE_BV[i] is written as 0. If RFBM[i] = 1, XSTATE_BV[i] is set to the value of XINUSE[i] (see below for an exception made for XSTATE_BV[1]). Section 13.6 defines XINUSE to describe the processor init optimization and specifies the initial configuration of each state component. The nature of that optimization implies the following: — If state component i is in its initial configuration, XSTATE_BV[i] may be written with either 0 or 1. 1. If CR0.AM = 1, CPL = 3, and EFLAGS.AC =1, an alignment-check exception (#AC) may occur instead of #GP. 2. Unlike the XSAVE and XSAVEOPT instructions, the XSAVEC instruction does not read the XSTATE_BV field of the XSAVE header. 13-18 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET — If state component i is not in its initial configuration, XSTATE_BV[i] is written with 1. XINUSE[1] pertains only to the state of the XMM registers and not to MXCSR. However, if RFBM[1] = 1 and MXCSR does not have the value 1F80H, XSAVEC writes XSTATE_BV[1] as 1 even if XINUSE[1] = 0. (As explained in Section 13.6, the initial configurations of some state components may depend on whether the processor is in 64-bit mode.) The XSAVEC instructions sets bit 63 of the XCOMP_BV field of the XSAVE header while writing RFBM[62:0] to XCOMP_BV[62:0]. The XSAVEC instruction does not write any part of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields. Execution of XSAVEC saves into the XSAVE area those state components corresponding to bits that are set in RFBM (subject to the init optimization described below). State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the XSAVEC instruction always uses the compacted format for the extended region (see Section 13.4.3). See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses. Execution of XSAVEC performs the init optimization to reduce the amount of data written to memory. If XINUSE[i] = 0, state component i is not saved to the XSAVE area (even if RFBM[i] = 1). However, if RFBM[1] = 1 and MXCSR does not have the value 1F80H, XSAVEC saves all of state component 1 (SSE — including the XMM registers) even if XINUSE[1] = 0. Unlike the XSAVE instruction, RFBM[2] does not determine whether XSAVEC saves MXCSR and MXCSR_MASK. 13.11 OPERATION OF XSAVES The operation of XSAVES is similar to that of XSAVEC. The main differences are (1) XSAVES can be executed only if CPL = 0; (2) XSAVES can operate on the state components whose bits are set in XCR0 | IA32_XSS and can thus operate on supervisor state components; and (3) XSAVES uses the modified optimization (see Section 13.6). See Section 13.2 for details of how to determine whether XSAVES is supported. The XSAVES instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. EDX:EAX & (XCR0 | IA32_XSS) (the logical AND the instruction mask with the logical OR of XCR0 and IA32_XSS) is the requested-feature bitmap (RFBM) of the state components to be saved. The following conditions cause execution of the XSAVES instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If CPL > 0 or if the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs. If none of these conditions cause a fault, execution of XSAVES writes the XSTATE_BV field of the XSAVE header (see Section 13.4.2), setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows: • • If RFBM[i] = 0, XSTATE_BV[i] is written as 0. If RFBM[i] = 1, XSTATE_BV[i] is set to the value of XINUSE[i] (see below for an exception made for XSTATE_BV[1]). Section 13.6 defines XINUSE to describe the processor init optimization and specifies the initial configuration of each state component. The nature of that optimization implies the following: — If state component i is in its initial configuration, XSTATE_BV[i] may be written with either 0 or 1. — If state component i is not in its initial configuration, XSTATE_BV[i] is written with 1. XINUSE[1] pertains only to the state of the XMM registers and not to MXCSR. However, if RFBM[1] = 1 and MXCSR does not have the value 1F80H, XSAVES writes XSTATE_BV[1] as 1 even if XINUSE[1] = 0. (As explained in Section 13.6, the initial configurations of some state components may depend on whether the processor is in 64-bit mode.) Vol. 1 13-19 MANAGING STATE USING THE XSAVE FEATURE SET The XSAVES instructions sets bit 63 of the XCOMP_BV field of the XSAVE header while writing RFBM[62:0] to XCOMP_BV[62:0]. The XSAVES instruction does not write any part of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields. Execution of XSAVES saves into the XSAVE area those state components corresponding to bits that are set in RFBM (subject to the optimizations described below). State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; the XSAVES instruction always uses the compacted format for the extended region (see Section 13.4.3). See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes; in particular, see Section 13.5.6 for some special treatment of PT state by XSAVES. See Section 13.13 for details regarding faults caused by memory accesses. Execution of XSAVES performs the init optimization to reduce the amount of data written to memory. If XINUSE[i] = 0, state component i is not saved to the XSAVE area (even if RFBM[i] = 1). However, if RFBM[1] = 1 and MXCSR does not have the value 1F80H, XSAVES saves all of state component 1 (SSE — including the XMM registers) even if XINUSE[1] = 0. Like XSAVEOPT, XSAVES may perform the modified optimization. Each execution of XRSTOR and XRSTORS establishes XRSTOR_INFO as a 4-tuple w,x,y,z (see Section 13.8.3 and Section 13.12). Execution of XSAVES uses the modified optimization only if the following all hold: • • • • w = CPL; x = 1 if and only if the logical processor is in VMX non-root operation; y is the linear address of the XSAVE area being used by XSAVEOPT; and z[63] is 1 and z[62:0] = RFBM[62:0]. (This last item implies that XSAVES does not use the modified optimization if the last execution of XRSTOR used the standard form and followed the last execution of XRSTORS.) If XSAVES uses the modified optimization and XMODIFIED[i] = 0 (see Section 13.6), state component i is not saved to the XSAVE area. 13.12 OPERATION OF XRSTORS The operation of XRSTORS is similar to that of XRSTOR. Three main differences are (1) XRSTORS can be executed only if CPL = 0; (2) XRSTORS can operate on the state components whose bits are set in XCR0 | IA32_XSS and can thus operate on supervisor state components; and (3) XRSTORS has only a compacted form (no standard form; see Section 13.8). See Section 13.2 for details of how to determine whether XRSTORS is supported. The XRSTORS instruction takes a single memory operand, which is an XSAVE area. In addition, the register pair EDX:EAX is an implicit operand used as a state-component bitmap (see Section 13.1) called the instruction mask. EDX:EAX & (XCR0 | IA32_XSS) (the logical AND the instruction mask with the logical OR of XCR0 and IA32_XSS) is the requested-feature bitmap (RFBM) of the state components to be restored. The following conditions cause execution of the XRSTOR instruction to generate a fault: • • • If the XSAVE feature set is not enabled (CR4.OSXSAVE = 0), an invalid-opcode exception (#UD) occurs. If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. If CPL > 0 or if the address of the XSAVE area is not 64-byte aligned, a general-protection exception (#GP) occurs. After checking for these faults, the XRSTORS instruction reads the first 64 bytes of the XSAVE header, including the XSTATE_BV and XCOMP_BV fields (see Section 13.4.2). A #GP occurs if any of the following conditions hold for the values read: • • • • XCOMP_BV[63] = 0. XCOMP_BV sets a bit in the range 62:0 that is not set in XCR0 | IA32_XSS. XSTATE_BV sets a bit (including bit 63) that is not set in XCOMP_BV. Bytes 63:16 of the XSAVE header are not all 0. If none of these conditions cause a fault, the processor updates each state component i for which RFBM[i] = 1. XRSTORS updates state component i based on the value of bit i in the XSTATE_BV field of the XSAVE header: 13-20 Vol. 1 MANAGING STATE USING THE XSAVE FEATURE SET • If XSTATE_BV[i] = 0, the state component is set to its initial configuration. Section 13.6 specifies the initial configuration of each state component. If XSTATE_BV[1] = 0, XRSTORS initializes MXCSR to 1F80H. State component i is set to its initial configuration as indicated above if RFBM[i] = 1 and XSTATE_BV[i] = 0 — even if XCOMP_BV[i] = 0. This is true for all values of i, including 0 (x87 state) and 1 (SSE state). • If XSTATE_BV[i] = 1, the state component is loaded with data from the XSAVE area.1 See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes; in particular, see Section 13.5.6 for some special treatment of PT state by XRSTORS. See Section 13.13 for details regarding faults caused by memory accesses. If XRSTORS is restoring a supervisor state component, the instruction causes a general-protection exception (#GP) if it would load any element of that component with an unsupported value (e.g., by setting a reserved bit in an MSR) or if a bit is set in any reserved portion of the state component in the XSAVE area. State components 0 and 1 are located in the legacy region of the XSAVE area (see Section 13.4.1). Each state component i, 2 ≤ i ≤ 62, is located in the extended region; XRSTORS uses the compacted format for the extended region (see Section 13.4.3). The MXCSR register is part of SSE state (see Section 13.5.2) and is thus loaded from memory if RFBM[1] = XSTATE_BV[i] = 1. XRSTORS causes a general-protection exception (#GP) if it would load MXCSR with an illegal value. If an execution of XRSTORS causes an exception or a VM exit during or after restoring a supervisor state component, each element of that state component may have the value it held before the XRSTORS execution, the value loaded from the XSAVE area, or the element’s initial value (as defined in Section 13.6). See Section 13.5.6 for some special treatment of PT state for the case in which XRSTORS causes an exception or a VM exit. Like XRSTOR, execution of XRSTORS causes the processor to update is tracking for the init and modified optimizations (see Section 13.6 and Section 13.8.3). The following items provide details: • The processor updates its tracking for the init optimization as follows: — If RFBM[i] = 0, XINUSE[i] is not changed. — If RFBM[i] = 1 and XSTATE_BV[i] = 0, state component i may be tracked as init; XINUSE[i] may be set to 0 or 1. — If RFBM[i] = 1 and XSTATE_BV[i] = 1, state component i is tracked as not init; XINUSE[i] is set to 1. • The processor updates its tracking for the modified optimization and records information about the XRSTORS execution for future interaction with the XSAVEOPT and XSAVES instructions as follows: — If RFBM[i] = 0, state component i is tracked as modified; XMODIFIED[i] is set to 1. — If RFBM[i] = 1, state component i may be tracked as unmodified; XMODIFIED[i] may be set to 0 or 1. — XRSTOR_INFO is set to the 4-tuple w,x,y,z, where w is the CPL; x is 1 if the logical processor is in VMX non-root operation and 0 otherwise; y is the linear address of the XSAVE area; and z is XCOMP_BV (this implies that z[63] = 1). 13.13 MEMORY ACCESSES BY THE XSAVE FEATURE SET Each instruction in the XSAVE feature set operates on a set of XSAVE-managed state components. The specific set of components on which an instruction operates is determined by the values of XCR0, the IA32_XSS MSR, EDX:EAX, and (for XRSTOR and XRSTORS) the XSAVE header. Section 13.4 provides the details necessary to determine the location of each state component for any execution of an instruction in the XSAVE feature set. An execution of an instruction in the XSAVE feature set may access any byte of any state component on which that execution operates. Section 13.5 provides details of the different XSAVE-managed state components. Some portions of some of these components are accessible only in 64-bit mode. Executions of XRSTOR and XRSTORS outside 64-bit mode will not 1. Earlier fault checking ensured that, if the instruction has reached this point in execution and XSTATE_BV[i] is 1, then XCOMP_BV[i] is also 1. Vol. 1 13-21 MANAGING STATE USING THE XSAVE FEATURE SET update those portions; executions of XSAVE, XSAVEC, XSAVEOPT, and XSAVES will not modify the corresponding locations in memory. Despite this fact, any execution of these instructions outside 64-bit mode may access any byte in any state component on which that execution operates — even those at addresses corresponding to registers that are accessible only in 64-bit mode. As result, such an execution may incur a fault due to an attempt to access such an address. For example, an execution of XSAVE outside 64-bit mode may incur a page fault if paging does not map as read/write the section of the XSAVE area containing state component 7 (Hi16_ZMM state) — despite the fact that state component 7 can be accessed only in 64-bit mode. 13-22 Vol. 1 5. Updates to Chapter 1, Volume 2A Change bars and green text show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-L. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to Section 1.1 “Intel® 64 and IA-32 Processors Covered in this Manual”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 15 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D: Instruction Set Reference (order numbers 253666, 253667, 326018 and 334569) are part of a set that describes the architecture and programming environment of all Intel 64 and IA-32 architecture processors. Other volumes in this set are: • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture (Order Number 253665). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D: System Programming Guide (order numbers 253668, 253669, 326019 and 332831). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers (order number 335592). The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D, describe the instruction set of the processor and the opcode structure. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operatingsystem and BIOS designers. In addition, the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, addresses the programming environment for classes of software that host operating systems. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, describes the model-specific registers of Intel 64 and IA-32 processors. 1.1 INTEL® 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include: • • • • • • • • • • • • • • • • • • • Pentium® processors P6 family processors Pentium® 4 processors Pentium® M processors Intel® Xeon® processors Pentium® D processors Pentium® processor Extreme Editions 64-bit Intel® Xeon® processors Intel® Core™ Duo processor Intel® Core™ Solo processor Dual-Core Intel® Xeon® processor LV Intel® Core™2 Duo processor Intel® Core™2 Quad processor Q6000 series Intel® Xeon® processor 3000, 3200 series Intel® Xeon® processor 5000 series Intel® Xeon® processor 5100, 5300 series Intel® Core™2 Extreme processor X7000 and X6800 series Intel® Core™2 Extreme processor QX6000 series Intel® Xeon® processor 7100 series Vol. 2A 1-1 ABOUT THIS MANUAL • • • • • • • • Intel® Pentium® Dual-Core processor • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Intel® Core™ i7 processor Intel® Xeon® processor 7200, 7300 series Intel® Xeon® processor 5200, 5400, 7400 series Intel® Core™2 Extreme processor QX9000 and X9000 series Intel® Core™2 Quad processor Q9000 series Intel® Core™2 Duo processor E8000, T9000 series Intel® Atom™ processor family Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are built from 45 nm and 32 nm processes Intel® Core™ i5 processor Intel® Xeon® processor E7-8800/4800/2800 product families Intel® Core™ i7-3930K processor 2nd generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series Intel® Xeon® processor E3-1200 product family Intel® Xeon® processor E5-2400/1400 product family Intel® Xeon® processor E5-4600/2600/1600 product family 3rd generation Intel® Core™ processors Intel® Xeon® processor E3-1200 v2 product family Intel® Xeon® processor E5-2400/1400 v2 product families Intel® Xeon® processor E5-4600/2600/1600 v2 product families Intel® Xeon® processor E7-8800/4800/2800 v2 product families 4th generation Intel® Core™ processors The Intel® Core™ M processor family Intel® Core™ i7-59xx Processor Extreme Edition Intel® Core™ i7-49xx Processor Extreme Edition Intel® Xeon® processor E3-1200 v3 product family Intel® Xeon® processor E5-2600/1600 v3 product families 5th generation Intel® Core™ processors Intel® Xeon® processor D-1500 product family Intel® Xeon® processor E5 v4 family Intel® Atom™ processor X7-Z8000 and X5-Z8000 series Intel® Atom™ processor Z3400 series Intel® Atom™ processor Z3500 series 6th generation Intel® Core™ processors Intel® Xeon® processor E3-1500m v5 product family 7th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series Intel® Xeon® Processor Scalable Family 8th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series Intel® Xeon® E processors 9th generation Intel® Core™ processors 1-2 Vol. 2A ABOUT THIS MANUAL P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium® Pro, Pentium® II, Pentium® III, and Pentium® III Xeon® processors. The Pentium® 4, Pentium® D, and Pentium® processor Extreme Editions are based on the Intel NetBurst® microarchitecture. Most early Intel® Xeon® processors are based on the Intel NetBurst® microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst® microarchitecture. The Intel® Core™ Duo, Intel® Core™ Solo and dual-core Intel® Xeon® processor LV are based on an improved Pentium® M processor microarchitecture. The Intel® Xeon® processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel® Pentium® dual-core, Intel® Core™2 Duo, Intel® Core™2 Quad, and Intel® Core™2 Extreme processors are based on Intel® Core™ microarchitecture. The Intel® Xeon® processor 5200, 5400, 7400 series, Intel® Core™2 Quad processor Q9000 series, and Intel® Core™2 Extreme processors QX9000, X9000 series, Intel® Core™2 processor E8000 series are based on Enhanced Intel® Core™ microarchitecture. The Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are based on the Intel® Atom™ microarchitecture and supports Intel 64 architecture. P6 family, Pentium® M, Intel® Core™ Solo, Intel® Core™ Duo processors, dual-core Intel® Xeon® processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel® AtomTM processor Z5xx series support IA-32 architecture. The Intel® Xeon® processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel® Core™2 Duo, Intel® Core™2 Extreme, Intel® Core™2 Quad processors, Pentium® D processors, Pentium® DualCore processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel® 64 architecture. The Intel® Core™ i7 processor and Intel® Xeon® processor 3400, 5500, 7500 series are based on 45 nm Nehalem microarchitecture. Westmere microarchitecture is a 32 nm version of the Nehalem microarchitecture. Intel® Xeon® processor 5600 series, Intel Xeon processor E7 and various Intel Core i7, i5, i3 processors are based on the Westmere microarchitecture. These processors support Intel 64 architecture. The Intel® Xeon® processor E5 family, Intel® Xeon® processor E3-1200 family, Intel® Xeon® processor E78800/4800/2800 product families, Intel® Core™ i7-3930K processor, and 2nd generation Intel® Core™ i7-2xxx, Intel® CoreTM i5-2xxx, Intel® Core™ i3-2xxx processor series are based on the Sandy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E7-8800/4800/2800 v2 product families, Intel® Xeon® processor E3-1200 v2 product family and 3rd generation Intel® Core™ processors are based on the Ivy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-4600/2600/1600 v2 product families, Intel® Xeon® processor E5-2400/1400 v2 product families and Intel® Core™ i7-49xx Processor Extreme Edition are based on the Ivy Bridge-E microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E3-1200 v3 product family and 4th Generation Intel® Core™ processors are based on the Haswell microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-2600/1600 v3 product families and the Intel® Core™ i7-59xx Processor Extreme Edition are based on the Haswell-E microarchitecture and support Intel 64 architecture. The Intel® Atom™ processor Z8000 series is based on the Airmont microarchitecture. The Intel® Atom™ processor Z3400 series and the Intel® Atom™ processor Z3500 series are based on the Silvermont microarchitecture. The Intel® Core™ M processor family, 5th generation Intel® Core™ processors, Intel® Xeon® processor D-1500 product family and the Intel® Xeon® processor E5 v4 family are based on the Broadwell microarchitecture and support Intel 64 architecture. The Intel® Xeon® Processor Scalable Family, Intel® Xeon® processor E3-1500m v5 product family and 6th generation Intel® Core™ processors are based on the Skylake microarchitecture and support Intel 64 architecture. The 7th generation Intel® Core™ processors are based on the Kaby Lake microarchitecture and support Intel 64 architecture. Vol. 2A 1-3 ABOUT THIS MANUAL The Intel® Atom™ processor C series, the Intel® Atom™ processor X series, the Intel® Pentium® processor J series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont microarchitecture. The Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series is based on the Knights Landing microarchitecture and supports Intel 64 architecture. The Intel® Pentium® Silver processor series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont Plus microarchitecture. The 8th generation Intel® Core™ processors, 9th generation Intel® Core™ processors, and Intel® Xeon® E processors are based on the Coffee Lake microarchitecture and support Intel 64 architecture. The Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series is based on the Knights Mill microarchitecture and supports Intel 64 architecture. IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel® 64 architecture is the instruction set architecture and programming environment which is the superset of Intel’s 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture. 1.2 OVERVIEW OF VOLUME 2A, 2B, 2C AND 2D: INSTRUCTION SET REFERENCE A description of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D content follows: Chapter 1 — About This Manual. Gives an overview of all seven volumes of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel® manuals and documentation of interest to programmers and hardware designers. Chapter 2 — Instruction Format. Describes the machine-level instruction format used for all IA-32 instructions and gives the allowable encodings of prefixes, the operand-identifier byte (ModR/M byte), the addressing-mode specifier byte (SIB byte), and the displacement and immediate bytes. Chapter 3 — Instruction Set Reference, A-L. Describes Intel 64 and IA-32 instructions in detail, including an algorithmic description of operations, the effect on flags, the effect of operand- and address-size attributes, and the exceptions that may be generated. The instructions are arranged in alphabetical order. General-purpose, x87 FPU, Intel MMX™ technology, SSE/SSE2/SSE3/SSSE3/SSE4 extensions, and system instructions are included. Chapter 4 — Instruction Set Reference, M-U. Continues the description of Intel 64 and IA-32 instructions started in Chapter 3. It starts Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. Chapter 5 — Instruction Set Reference, V-Z. Continues the description of Intel 64 and IA-32 instructions started in chapters 3 and 4. It provides the balance of the alphabetized list of instructions and starts Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2C. Chapter 6 — Safer Mode Extensions Reference. Describes the safer mode extensions (SMX). SMX is intended for a system executive to support launching a measured environment in a platform where the identity of the software controlling the platform hardware can be measured for the purpose of making trust decisions. This chapter starts Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2D. Chapter 7— Instruction Set Reference Unique to Intel® Xeon Phi™ Processors. Describes the instruction set that is unique to Intel® Xeon Phi™ processors based on the Knights Landing and Knights Mill microarchitectures. The set is not supported in any other Intel processors. Appendix A — Opcode Map. Gives an opcode map for the IA-32 instruction set. Appendix B — Instruction Formats and Encodings. Gives the binary encoding of each form of each IA-32 instruction. Appendix C — Intel® C/C++ Compiler Intrinsics and Functional Equivalents. Lists the Intel® C/C++ compiler intrinsics and their assembly code equivalents for each of the IA-32 MMX and SSE/SSE2/SSE3 instructions. 1-4 Vol. 2A ABOUT THIS MANUAL 1.3 NOTATIONAL CONVENTIONS This manual uses specific notation for data-structure formats, for symbolic representation of instructions, and for hexadecimal and binary numbers. A review of this notation makes the manual easier to read. 1.3.1 Bit and Byte Order In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. IA-32 processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. Highest Address 31 24 23 Byte 3 Data Structure 8 7 16 15 Byte 2 Byte 1 0 Byte 0 Bit offset 28 24 20 16 12 8 4 0 Lowest Address Byte Offset Figure 1-1. Bit and Byte Order 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits: • Do not depend on the states of any reserved bits when testing the values of registers which contain such bits. Mask out the reserved bits before testing. • • • Do not depend on the states of any reserved bits when storing to memory or to a register. Do not depend on the ability to retain information written into any reserved bits. When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register. NOTE Avoid any software dependence upon the state of reserved bits in IA-32 registers. Depending upon the values of reserved register bits will make software dependent upon the unspecified manner in which the processor handles these bits. Programs that depend upon reserved values risk incompatibility with future processors. 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of the IA-32 assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 Vol. 2A 1-5 ABOUT THIS MANUAL where: • • • A label is an identifier which is followed by a colon. A mnemonic is a reserved name for a class of instruction opcodes which have the same function. The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the opcode. When present, they take the form of either literals or identifiers for data items. Operand identifiers are either reserved names of registers or are assumed to be assigned to data items declared in another part of the program (which may not be shown in the example). When two operands are present in an arithmetic or logical instruction, the right operand is the source and the left operand is the destination. For example: LOADREG: MOV EAX, SUBTOTAL In this example, LOADREG is a label, MOV is the mnemonic identifier of an opcode, EAX is the destination operand, and SUBTOTAL is the source operand. Some assembly languages put the source and destination in reverse order. 1.3.4 Hexadecimal and Binary Numbers Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represented by a string of 1s and 0s, sometimes followed by the character B (for example, 1010B). The “B” designation is only used in situations where confusion as to the type of number might arise. 1.3.5 Segmented Addressing The processor uses byte addressing. This means memory is organized and accessed as a sequence of bytes. Whether one or more bytes are being accessed, a byte address is used to locate the byte or bytes in memory. The range of memory that can be addressed is called an address space. The processor also supports segmented addressing. This is a form of addressing where a program may have many independent address spaces, called segments. For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment: Segment-register:Byte-address For example, the following segment address identifies the byte at address FF79H in the segment pointed by the DS register: DS:FF79H The following segment address identifies an instruction address in the code segment. The CS register points to the code segment and the EIP register contains the address of the instruction. CS:EIP 1.3.6 Exceptions An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code reports additional information about the error. An example of the notation used to show an exception and error code is shown below: #PF(fault code) 1-6 Vol. 2A ABOUT THIS MANUAL This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception: #GP(0) 1.3.7 A New Syntax for CPUID, CR, and MSR Values Obtain feature flags, status, and system information by using the CPUID instruction, by checking control register bits, and by reading model-specific registers. We are moving toward a new syntax to represent this information. See Figure 1-2. CPUID Input and Output CPUID.01H:EDX.SSE[bit 25] = 1 Input value for EAX register Output register and feature flag or field name with bit position(s) Value (or range) of output Control Register Values CR4.OSFXSR[bit 9] = 1 Example CR name Feature flag or field name with bit position(s) Value (or range) of output Model-Specific Register Values IA32_MISC_ENABLE.ENABLEFOPCODE[bit 2] = 1 Example MSR name Feature flag or field name with bit position(s) Value (or range) of output Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation 1.4 SDM29002 RELATED LITERATURE Literature related to Intel 64 and IA-32 processors is listed and viewable on-line at: https://software.intel.com/en-us/articles/intel-sdm Vol. 2A 1-7 ABOUT THIS MANUAL See also: • The latest security information on Intel® products: https://www.intel.com/content/www/us/en/security-center/default.html • Software developer resources, guidance and insights for security advisories: https://software.intel.com/security-software-guidance/ • • • The data sheet for a particular Intel 64 or IA-32 processor • Intel® Fortran Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ • Intel® Software Development Tools: https://software.intel.com/en-us/intel-sdp-home • Intel® 64 and IA-32 Architectures Software Developer’s Manual (in one, four or ten volumes): https://software.intel.com/en-us/articles/intel-sdm • Intel® 64 and IA-32 Architectures Optimization Reference Manual: https://software.intel.com/en-us/articles/intel-sdm#optimization • Intel 64 Architecture x2APIC Specification: The specification update for a particular Intel 64 or IA-32 processor Intel® C++ Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ http://www.intel.com/content/www/us/en/architecture-and-technology/64-architecture-x2apic-specification.html • Intel® Trusted Execution Technology Measured Launched Environment Programming Guide: http://www.intel.com/content/www/us/en/software-developers/intel-txt-software-development-guide.html • Developing Multi-threaded Applications: A Platform Consistent Approach: https://software.intel.com/sites/default/files/article/147714/51534-developing-multithreaded-applications.pdf • Using Spin-Loops on Intel® Pentium® 4 Processor and Intel® Xeon® Processor: https://software.intel.com/sites/default/files/22/30/25602 • Performance Monitoring Unit Sharing Guide http://software.intel.com/file/30388 Literature related to selected features in future Intel processors are available at: • Intel® Architecture Instruction Set Extensions Programming Reference https://software.intel.com/en-us/isa-extensions • Intel® Software Guard Extensions (Intel® SGX) Programming Reference https://software.intel.com/en-us/isa-extensions/intel-sgx More relevant links are: • Intel® Developer Zone: https://software.intel.com/en-us • Developer centers: http://www.intel.com/content/www/us/en/hardware-developers/developer-centers.html • Processor support general link: http://www.intel.com/support/processors/ • Intel® Hyper-Threading Technology (Intel® HT Technology): http://www.intel.com/technology/platform-technology/hyper-threading/index.htm 1-8 Vol. 2A 6. Updates to Chapter 2, Volume 2A Change bars and green text show changes to Chapter 2 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-L. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to Table 2-15 “Instructions in each Exception Class” and Table 2-43 “EVEX Instructions in each Exception Class” to organize and add missing instructions as necessary. Typo corrections to Table 2-36 “EVEX Embedded Broadcast/Rounding/SAE and Vector Length on Vector Instructions”, Table 2-37 “OS XSAVE Enabling Requirements of Instruction Categories”, and Table 2-41 “#UD Conditions Dependent on EVEX.b Context”. Addition of VCVTPS2PH instruction to Type E11 in Table 2-42 “EVEX-Encoded Instruction Exception Class Summary”; removal of erroneous type from the same table. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 16 CHAPTER 2 INSTRUCTION FORMAT This chapter describes the instruction format for all Intel 64 and IA-32 processors. The instruction format for protected mode, real-address mode and virtual-8086 mode is described in Section 2.1. Increments provided for IA32e mode and its sub-modes are described in Section 2.2. 2.1 INSTRUCTION FORMAT FOR PROTECTED MODE, REAL-ADDRESS MODE, AND VIRTUAL-8086 MODE The Intel 64 and IA-32 architectures instruction encodings are subsets of the format shown in Figure 2-1. Instructions consist of optional instruction prefixes (in any order), primary opcode bytes (up to three bytes), an addressing-form specifier (if required) consisting of the ModR/M byte and sometimes the SIB (Scale-Index-Base) byte, a displacement (if required), and an immediate data field (if required). Instruction Prefixes Opcode ModR/M SIB Prefixes of 1-, 2-, or 3-byte 1 byte 1 byte each opcode (if required) (optional)1, 2 7 6 5 Mod 3 2 Reg/ Opcode 0 R/M 1 byte (if required) 7 Scale Displacement Immediate Address displacement of 1, 2, or 4 bytes or none3 Immediate data of 1, 2, or 4 bytes or none3 3 2 6 5 Index 0 Base 1. The REX prefix is optional, but if used must be immediately before the opcode; see Section 2.2.1, “REX Prefixes” for additional information. 2. For VEX encoding information, see Section 2.3, “Intel® Advanced Vector Extensions (Intel® AVX)”. 3. Some rare instructions can take an 8B immediate or 8B displacement. Figure 2-1. Intel 64 and IA-32 Architectures Instruction Format 2.1.1 Instruction Prefixes Instruction prefixes are divided into four groups, each with a set of allowable prefix codes. For each instruction, it is only useful to include up to one prefix code from each of the four groups (Groups 1, 2, 3, 4). Groups 1 through 4 may be placed in any order relative to each other. • Group 1 — Lock and repeat prefixes: • • • LOCK prefix is encoded using F0H. REPNE/REPNZ prefix is encoded using F2H. Repeat-Not-Zero prefix applies only to string and input/output instructions. (F2H is also used as a mandatory prefix for some instructions.) REP or REPE/REPZ is encoded using F3H. The repeat prefix applies only to string and input/output instructions. F3H is also used as a mandatory prefix for POPCNT, LZCNT and ADOX instructions. Vol. 2A 2-1 INSTRUCTION FORMAT — BND prefix is encoded using F2H if the following conditions are true: • • • • CPUID.(EAX=07H, ECX=0):EBX.MPX[bit 14] is set. BNDCFGU.EN and/or IA32_BNDCFGS.EN is set. When the F2 prefix precedes a near CALL, a near RET, a near JMP, a short Jcc, or a near Jcc instruction (see Chapter 17, “Intel® MPX,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Group 2 — Segment override prefixes: • • • • • • 2EH—CS segment override (use with any branch instruction is reserved). 36H—SS segment override prefix (use with any branch instruction is reserved). 3EH—DS segment override prefix (use with any branch instruction is reserved). 26H—ES segment override prefix (use with any branch instruction is reserved). 64H—FS segment override prefix (use with any branch instruction is reserved). 65H—GS segment override prefix (use with any branch instruction is reserved). — Branch hints1: • • • 3EH—Branch taken (used only with Jcc instructions). Group 3 • • 2EH—Branch not taken (used only with Jcc instructions). Operand-size override prefix is encoded using 66H (66H is also used as a mandatory prefix for some instructions). Group 4 • 67H—Address-size override prefix. The LOCK prefix (F0H) forces an operation that ensures exclusive use of shared memory in a multiprocessor environment. See “LOCK—Assert LOCK# Signal Prefix” in Chapter 3, “Instruction Set Reference, A-L,” for a description of this prefix. Repeat prefixes (F2H, F3H) cause an instruction to be repeated for each element of a string. Use these prefixes only with string and I/O instructions (MOVS, CMPS, SCAS, LODS, STOS, INS, and OUTS). Use of repeat prefixes and/or undefined opcodes with other Intel 64 or IA-32 instructions is reserved; such use may cause unpredictable behavior. Some instructions may use F2H,F3H as a mandatory prefix to express distinct functionality. Branch hint prefixes (2EH, 3EH) allow a program to give a hint to the processor about the most likely code path for a branch. Use these prefixes only with conditional branch instructions (Jcc). Other use of branch hint prefixes and/or other undefined opcodes with Intel 64 or IA-32 instructions is reserved; such use may cause unpredictable behavior. The operand-size override prefix allows a program to switch between 16- and 32-bit operand sizes. Either size can be the default; use of the prefix selects the non-default size. Some SSE2/SSE3/SSSE3/SSE4 instructions and instructions using a three-byte sequence of primary opcode bytes may use 66H as a mandatory prefix to express distinct functionality. Other use of the 66H prefix is reserved; such use may cause unpredictable behavior. The address-size override prefix (67H) allows programs to switch between 16- and 32-bit addressing. Either size can be the default; the prefix selects the non-default size. Using this prefix and/or other undefined opcodes when operands for the instruction do not reside in memory is reserved; such use may cause unpredictable behavior. 1. Some earlier microarchitectures used these as branch hints, but recent generations have not and they are reserved for future hint usage. 2-2 Vol. 2A INSTRUCTION FORMAT 2.1.2 Opcodes A primary opcode can be 1, 2, or 3 bytes in length. An additional 3-bit opcode field is sometimes encoded in the ModR/M byte. Smaller fields can be defined within the primary opcode. Such fields define the direction of operation, size of displacements, register encoding, condition codes, or sign extension. Encoding fields used by an opcode vary depending on the class of operation. Two-byte opcode formats for general-purpose and SIMD instructions consist of one of the following: • • An escape opcode byte 0FH as the primary opcode and a second opcode byte. A mandatory prefix (66H, F2H, or F3H), an escape opcode byte, and a second opcode byte (same as previous bullet). For example, CVTDQ2PD consists of the following sequence: F3 0F E6. The first byte is a mandatory prefix (it is not considered as a repeat prefix). Three-byte opcode formats for general-purpose and SIMD instructions consist of one of the following: • • An escape opcode byte 0FH as the primary opcode, plus two additional opcode bytes. A mandatory prefix (66H, F2H, or F3H), an escape opcode byte, plus two additional opcode bytes (same as previous bullet). For example, PHADDW for XMM registers consists of the following sequence: 66 0F 38 01. The first byte is the mandatory prefix. Valid opcode expressions are defined in Appendix A and Appendix B. 2.1.3 ModR/M and SIB Bytes Many instructions that refer to an operand in memory have an addressing-form specifier byte (called the ModR/M byte) following the primary opcode. The ModR/M byte contains three fields of information: • • The mod field combines with the r/m field to form 32 possible values: eight registers and 24 addressing modes. • The r/m field can specify a register as an operand or it can be combined with the mod field to encode an addressing mode. Sometimes, certain combinations of the mod field and the r/m field are used to express opcode information for some instructions. The reg/opcode field specifies either a register number or three more bits of opcode information. The purpose of the reg/opcode field is specified in the primary opcode. Certain encodings of the ModR/M byte require a second addressing byte (the SIB byte). The base-plus-index and scale-plus-index forms of 32-bit addressing require the SIB byte. The SIB byte includes the following fields: • • • The scale field specifies the scale factor. The index field specifies the register number of the index register. The base field specifies the register number of the base register. See Section 2.1.5 for the encodings of the ModR/M and SIB bytes. 2.1.4 Displacement and Immediate Bytes Some addressing forms include a displacement immediately following the ModR/M byte (or the SIB byte if one is present). If a displacement is required, it can be 1, 2, or 4 bytes. If an instruction specifies an immediate operand, the operand always follows any displacement bytes. An immediate operand can be 1, 2 or 4 bytes. Vol. 2A 2-3 INSTRUCTION FORMAT 2.1.5 Addressing-Mode Encoding of ModR/M and SIB Bytes The values and corresponding addressing forms of the ModR/M and SIB bytes are shown in Table 2-1 through Table 2-3: 16-bit addressing forms specified by the ModR/M byte are in Table 2-1 and 32-bit addressing forms are in Table 2-2. Table 2-3 shows 32-bit addressing forms specified by the SIB byte. In cases where the reg/opcode field in the ModR/M byte represents an extended opcode, valid encodings are shown in Appendix B. In Table 2-1 and Table 2-2, the Effective Address column lists 32 effective addresses that can be assigned to the first operand of an instruction by using the Mod and R/M fields of the ModR/M byte. The first 24 options provide ways of specifying a memory location; the last eight (Mod = 11B) provide ways of specifying general-purpose, MMX technology and XMM registers. The Mod and R/M columns in Table 2-1 and Table 2-2 give the binary encodings of the Mod and R/M fields required to obtain the effective address listed in the first column. For example: see the row indicated by Mod = 11B, R/M = 000B. The row identifies the general-purpose registers EAX, AX or AL; MMX technology register MM0; or XMM register XMM0. The register used is determined by the opcode byte and the operand-size attribute. Now look at the seventh row in either table (labeled “REG =”). This row specifies the use of the 3-bit Reg/Opcode field when the field is used to give the location of a second operand. The second operand must be a generalpurpose, MMX technology, or XMM register. Rows one through five list the registers that may correspond to the value in the table. Again, the register used is determined by the opcode byte along with the operand-size attribute. If the instruction does not require a second operand, then the Reg/Opcode field may be used as an opcode extension. This use is represented by the sixth row in the tables (labeled “/digit (Opcode)”). Note that values in row six are represented in decimal form. The body of Table 2-1 and Table 2-2 (under the label “Value of ModR/M Byte (in Hexadecimal)”) contains a 32 by 8 array that presents all of 256 values of the ModR/M byte (in hexadecimal). Bits 3, 4 and 5 are specified by the column of the table in which a byte resides. The row specifies bits 0, 1 and 2; and bits 6 and 7. The figure below demonstrates interpretation of one table value. Mod 11 RM 000 /digit (Opcode); REG = 001 C8H 11001000 Figure 2-2. Table Interpretation of ModR/M Byte (C8H) 2-4 Vol. 2A INSTRUCTION FORMAT Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte r8(/r) r16(/r) r32(/r) mm(/r) xmm(/r) (In decimal) /digit (Opcode) (In binary) REG = Effective Address AL AX EAX MM0 XMM0 0 000 Mod CL CX ECX MM1 XMM1 1 001 R/M DL DX EDX MM2 XMM2 2 010 BL BX EBX MM3 XMM3 3 011 AH SP ESP MM4 XMM4 4 100 CH BP1 EBP MM5 XMM5 5 101 DH SI ESI MM6 XMM6 6 110 BH DI EDI MM7 XMM7 7 111 Value of ModR/M Byte (in Hexadecimal) [BX+SI] [BX+DI] [BP+SI] [BP+DI] [SI] [DI] disp162 [BX] 00 000 001 010 011 100 101 110 111 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F [BX+SI]+disp83 [BX+DI]+disp8 [BP+SI]+disp8 [BP+DI]+disp8 [SI]+disp8 [DI]+disp8 [BP]+disp8 [BX]+disp8 01 000 001 010 011 100 101 110 111 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F [BX+SI]+disp16 [BX+DI]+disp16 [BP+SI]+disp16 [BP+DI]+disp16 [SI]+disp16 [DI]+disp16 [BP]+disp16 [BX]+disp16 10 000 001 010 011 100 101 110 111 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF EAX/AX/AL/MM0/XMM0 ECX/CX/CL/MM1/XMM1 EDX/DX/DL/MM2/XMM2 EBX/BX/BL/MM3/XMM3 ESP/SP/AHMM4/XMM4 EBP/BP/CH/MM5/XMM5 ESI/SI/DH/MM6/XMM6 EDI/DI/BH/MM7/XMM7 11 000 001 010 011 100 101 110 111 C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF NOTES: 1. The default segment register is SS for the effective addresses containing a BP index, DS for other effective addresses. 2. The disp16 nomenclature denotes a 16-bit displacement that follows the ModR/M byte and that is added to the index. 3. The disp8 nomenclature denotes an 8-bit displacement that follows the ModR/M byte and that is sign-extended and added to the index. Vol. 2A 2-5 INSTRUCTION FORMAT Table 2-2. 32-Bit Addressing Forms with the ModR/M Byte r8(/r) r16(/r) r32(/r) mm(/r) xmm(/r) (In decimal) /digit (Opcode) (In binary) REG = Effective Address AL AX EAX MM0 XMM0 0 000 Mod CL CX ECX MM1 XMM1 1 001 R/M DL DX EDX MM2 XMM2 2 010 BL BX EBX MM3 XMM3 3 011 AH SP ESP MM4 XMM4 4 100 CH BP EBP MM5 XMM5 5 101 DH SI ESI MM6 XMM6 6 110 BH DI EDI MM7 XMM7 7 111 Value of ModR/M Byte (in Hexadecimal) [EAX] [ECX] [EDX] [EBX] [--][--]1 disp322 [ESI] [EDI] 00 000 001 010 011 100 101 110 111 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F [EAX]+disp83 [ECX]+disp8 [EDX]+disp8 [EBX]+disp8 [--][--]+disp8 [EBP]+disp8 [ESI]+disp8 [EDI]+disp8 01 000 001 010 011 100 101 110 111 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F [EAX]+disp32 [ECX]+disp32 [EDX]+disp32 [EBX]+disp32 [--][--]+disp32 [EBP]+disp32 [ESI]+disp32 [EDI]+disp32 10 000 001 010 011 100 101 110 111 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF EAX/AX/AL/MM0/XMM0 ECX/CX/CL/MM/XMM1 EDX/DX/DL/MM2/XMM2 EBX/BX/BL/MM3/XMM3 ESP/SP/AH/MM4/XMM4 EBP/BP/CH/MM5/XMM5 ESI/SI/DH/MM6/XMM6 EDI/DI/BH/MM7/XMM7 11 000 001 010 011 100 101 110 111 C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF NOTES: 1. The [--][--] nomenclature means a SIB follows the ModR/M byte. 2. The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is added to the index. 3. The disp8 nomenclature denotes an 8-bit displacement that follows the ModR/M byte (or the SIB byte if one is present) and that is sign-extended and added to the index. Table 2-3 is organized to give 256 possible values of the SIB byte (in hexadecimal). General purpose registers used as a base are indicated across the top of the table, along with corresponding values for the SIB byte’s base field. Table rows in the body of the table indicate the register used as the index (SIB byte bits 3, 4 and 5) and the scaling factor (determined by SIB byte bits 6 and 7). 2-6 Vol. 2A INSTRUCTION FORMAT Table 2-3. 32-Bit Addressing Forms with the SIB Byte r32 (In decimal) Base = (In binary) Base = EAX 0 000 Scaled Index SS ECX 1 001 Index EDX 2 010 EBX 3 011 ESP 4 100 [*] 5 101 ESI 6 110 EDI 7 111 Value of SIB Byte (in Hexadecimal) [EAX] [ECX] [EDX] [EBX] none [EBP] [ESI] [EDI] 00 000 001 010 011 100 101 110 111 00 08 10 18 20 28 30 38 01 09 11 19 21 29 31 39 02 0A 12 1A 22 2A 32 3A 03 0B 13 1B 23 2B 33 3B 04 0C 14 1C 24 2C 34 3C 05 0D 15 1D 25 2D 35 3D 06 0E 16 1E 26 2E 36 3E 07 0F 17 1F 27 2F 37 3F [EAX*2] [ECX*2] [EDX*2] [EBX*2] none [EBP*2] [ESI*2] [EDI*2] 01 000 001 010 011 100 101 110 111 40 48 50 58 60 68 70 78 41 49 51 59 61 69 71 79 42 4A 52 5A 62 6A 72 7A 43 4B 53 5B 63 6B 73 7B 44 4C 54 5C 64 6C 74 7C 45 4D 55 5D 65 6D 75 7D 46 4E 56 5E 66 6E 76 7E 47 4F 57 5F 67 6F 77 7F [EAX*4] [ECX*4] [EDX*4] [EBX*4] none [EBP*4] [ESI*4] [EDI*4] 10 000 001 010 011 100 101 110 111 80 88 90 98 A0 A8 B0 B8 81 89 91 99 A1 A9 B1 B9 82 8A 92 9A A2 AA B2 BA 83 8B 93 9B A3 AB B3 BB 84 8C 94 9C A4 AC B4 BC 85 8D 95 9D A5 AD B5 BD 86 8E 96 9E A6 AE B6 BE 87 8F 97 9F A7 AF B7 BF [EAX*8] [ECX*8] [EDX*8] [EBX*8] none [EBP*8] [ESI*8] [EDI*8] 11 000 001 010 011 100 101 110 111 C0 C8 D0 D8 E0 E8 F0 F8 C1 C9 D1 D9 E1 E9 F1 F9 C2 CA D2 DA E2 EA F2 FA C3 CB D3 DB E3 EB F3 FB C4 CC D4 DC E4 EC F4 FC C5 CD D5 DD E5 ED F5 FD C6 CE D6 DE E6 EE F6 FE C7 CF D7 DF E7 EF F7 FF NOTES: 1. The [*] nomenclature means a disp32 with no base if the MOD is 00B. Otherwise, [*] means disp8 or disp32 + [EBP]. This provides the following address modes: MOD bits Effective Address 00 [scaled index] + disp32 01 [scaled index] + disp8 + [EBP] 10 [scaled index] + disp32 + [EBP] 2.2 IA-32E MODE IA-32e mode has two sub-modes. These are: • Compatibility Mode. Enables a 64-bit operating system to run most legacy protected mode software unmodified. • 64-Bit Mode. Enables a 64-bit operating system to run applications written to access 64-bit address space. Vol. 2A 2-7 INSTRUCTION FORMAT 2.2.1 REX Prefixes REX prefixes are instruction-prefix bytes used in 64-bit mode. They do the following: • • • Specify GPRs and SSE registers. Specify 64-bit operand size. Specify extended control registers. Not all instructions require a REX prefix in 64-bit mode. A prefix is necessary only if an instruction references one of the extended registers or uses a 64-bit operand. If a REX prefix is used when it has no meaning, it is ignored. Only one REX prefix is allowed per instruction. If used, the REX prefix byte must immediately precede the opcode byte or the escape opcode byte (0FH). When a REX prefix is used in conjunction with an instruction containing a mandatory prefix, the mandatory prefix must come before the REX so the REX prefix can be immediately preceding the opcode or the escape byte. For example, CVTDQ2PD with a REX prefix should have REX placed between F3 and 0F E6. Other placements are ignored. The instruction-size limit of 15 bytes still applies to instructions with a REX prefix. See Figure 2-3. Legacy Prefixes REX Prefix Grp 1, Grp 2, Grp 3, Grp 4 (optional) (optional) Opcode ModR/M 1-, 2-, or 3-byte opcode 1 byte (if required) SIB 1 byte (if required) Displacement Address displacement of 1, 2, or 4 bytes Immediate Immediate data of 1, 2, or 4 bytes or none Figure 2-3. Prefix Ordering in 64-bit Mode 2.2.1.1 Encoding Intel 64 and IA-32 instruction formats specify up to three registers by using 3-bit fields in the encoding, depending on the format: • • ModR/M: the reg and r/m fields of the ModR/M byte. • Instructions without ModR/M: the reg field of the opcode. ModR/M with SIB: the reg field of the ModR/M byte, the base and index fields of the SIB (scale, index, base) byte. In 64-bit mode, these formats do not change. Bits needed to define fields in the 64-bit context are provided by the addition of REX prefixes. 2.2.1.2 More on REX Prefix Fields REX prefixes are a set of 16 opcodes that span one row of the opcode map and occupy entries 40H to 4FH. These opcodes represent valid instructions (INC or DEC) in IA-32 operating modes and in compatibility mode. In 64-bit mode, the same opcodes represent the instruction prefix REX and are not treated as individual instructions. The single-byte-opcode forms of the INC/DEC instructions are not available in 64-bit mode. INC/DEC functionality is still available using ModR/M forms of the same instructions (opcodes FF/0 and FF/1). See Table 2-4 for a summary of the REX prefix format. Figure 2-4 though Figure 2-7 show examples of REX prefix fields in use. Some combinations of REX prefix fields are invalid. In such cases, the prefix is ignored. Some additional information follows: • Setting REX.W can be used to determine the operand size but does not solely determine operand width. Like the 66H size prefix, 64-bit operand size override has no effect on byte-specific operations. • • For non-byte operations: if a 66H prefix is used with prefix (REX.W = 1), 66H is ignored. If a 66H override is used with REX and REX.W = 0, the operand size is 16 bits. 2-8 Vol. 2A INSTRUCTION FORMAT • REX.R modifies the ModR/M reg field when that field encodes a GPR, SSE, control or debug register. REX.R is ignored when ModR/M specifies other registers or defines an extended opcode. • • REX.X bit modifies the SIB index field. REX.B either modifies the base in the ModR/M r/m field or SIB base field; or it modifies the opcode reg field used for accessing GPRs. Table 2-4. REX Prefix Fields [BITS: 0100WRXB] Field Name Bit Position Definition - 7:4 0100 W 3 0 = Operand size determined by CS.D 1 = 64 Bit Operand Size R 2 Extension of the ModR/M reg field X 1 Extension of the SIB index field B 0 Extension of the ModR/M r/m field, SIB base field, or Opcode reg field ModRM Byte REX PREFIX Opcode mod ≠11 0100WR0B reg rrr r/m bbb Rrrr Bbbb OM17Xfig1-3 Figure 2-4. Memory Addressing Without an SIB Byte; REX.X Not Used ModRM Byte REX PREFIX 0100WR0B Opcode mod 11 reg rrr Rrrr r/m bbb Bbbb OM17Xfig1-4 Figure 2-5. Register-Register Addressing (No Memory Operand); REX.X Not Used Vol. 2A 2-9 INSTRUCTION FORMAT ModRM Byte REX PREFIX Opcode mod ≠ 11 0100WRXB reg rrr SIB Byte r/m 100 index xxx scale ss Rrrr Xxxx base bbb Bbbb OM17Xfig1-5 Figure 2-6. Memory Addressing With a SIB Byte REX PREFIX 0100W00B Opcode reg bbb Bbbb OM17Xfig1-6 Figure 2-7. Register Operand Coded in Opcode Byte; REX.X & REX.R Not Used In the IA-32 architecture, byte registers (AH, AL, BH, BL, CH, CL, DH, and DL) are encoded in the ModR/M byte’s reg field, the r/m field or the opcode reg field as registers 0 through 7. REX prefixes provide an additional addressing capability for byte-registers that makes the least-significant byte of GPRs available for byte operations. Certain combinations of the fields of the ModR/M byte and the SIB byte have special meaning for register encodings. For some combinations, fields expanded by the REX prefix are not decoded. Table 2-5 describes how each case behaves. 2-10 Vol. 2A INSTRUCTION FORMAT Table 2-5. Special Cases of REX Encodings ModR/M or SIB Sub-field Encodings ModR/M Byte mod ≠ 11 Compatibility Mode Operation Compatibility Mode Implications SIB byte present. SIB byte required for REX prefix adds a fourth bit (b) which is not decoded ESP-based addressing. (don't care). r/m = b*100(ESP) ModR/M Byte mod = 0 r/m = b*101(EBP) Additional Implications SIB byte also required for R12-based addressing. Base register not used. EBP without a REX prefix adds a fourth bit (b) which is not decoded displacement must be (don't care). done using Using RBP or R13 without displacement must be done mod = 01 with using mod = 01 with a displacement of 0. displacement of 0. SIB Byte index = 0100(ESP) Index register not used. ESP cannot be used as REX prefix adds a fourth bit (b) which is decoded. an index register. There are no additional implications. The expanded index field allows distinguishing RSP from R12, therefore R12 can be used as an index. SIB Byte base = 0101(EBP) Base register is unused if mod = 0. Base register depends REX prefix adds a fourth bit (b) which is not decoded. on mod encoding. This requires explicit displacement to be used with EBP/RBP or R13. NOTES: * Don’t care about value of REX.B 2.2.1.3 Displacement Addressing in 64-bit mode uses existing 32-bit ModR/M and SIB encodings. The ModR/M and SIB displacement sizes do not change. They remain 8 bits or 32 bits and are sign-extended to 64 bits. 2.2.1.4 Direct Memory-Offset MOVs In 64-bit mode, direct memory-offset forms of the MOV instruction are extended to specify a 64-bit immediate absolute address. This address is called a moffset. No prefix is needed to specify this 64-bit memory offset. For these MOV instructions, the size of the memory offset follows the address-size default (64 bits in 64-bit mode). See Table 2-6. Table 2-6. Direct Memory Offset Form of MOV Opcode Instruction A0 MOV AL, moffset A1 MOV EAX, moffset A2 MOV moffset, AL A3 MOV moffset, EAX 2.2.1.5 Immediates In 64-bit mode, the typical size of immediate operands remains 32 bits. When the operand size is 64 bits, the processor sign-extends all immediates to 64 bits prior to their use. Support for 64-bit immediate operands is accomplished by expanding the semantics of the existing move (MOV reg, imm16/32) instructions. These instructions (opcodes B8H – BFH) move 16-bits or 32-bits of immediate data (depending on the effective operand size) into a GPR. When the effective operand size is 64 bits, these instructions can be used to load an immediate into a GPR. A REX prefix is needed to override the 32-bit default operand size to a 64-bit operand size. For example: 48 B8 8877665544332211 MOV RAX,1122334455667788H Vol. 2A 2-11 INSTRUCTION FORMAT 2.2.1.6 RIP-Relative Addressing A new addressing form, RIP-relative (relative instruction-pointer) addressing, is implemented in 64-bit mode. An effective address is formed by adding displacement to the 64-bit RIP of the next instruction. In IA-32 architecture and compatibility mode, addressing relative to the instruction pointer is available only with control-transfer instructions. In 64-bit mode, instructions that use ModR/M addressing can use RIP-relative addressing. Without RIP-relative addressing, all ModR/M modes address memory relative to zero. RIP-relative addressing allows specific ModR/M modes to address memory relative to the 64-bit RIP using a signed 32-bit displacement. This provides an offset range of ±2GB from the RIP. Table 2-7 shows the ModR/M and SIB encodings for RIP-relative addressing. Redundant forms of 32-bit displacement-addressing exist in the current ModR/M and SIB encodings. There is one ModR/M encoding and there are several SIB encodings. RIP-relative addressing is encoded using a redundant form. In 64-bit mode, the ModR/M Disp32 (32-bit displacement) encoding is re-defined to be RIP+Disp32 rather than displacement-only. See Table 2-7. Table 2-7. RIP-Relative Addressing ModR/M and SIB Sub-field Encodings Compatibility Mode Operation 64-bit Mode Operation Additional Implications in 64-bit mode ModR/M Byte Disp32 RIP + Disp32 Must use SIB form with normal (zero-based) displacement addressing if mod = 00, Disp32 Same as legacy None mod = 00 r/m = 101 (none) SIB Byte base = 101 (none) index = 100 (none) scale = 0, 1, 2, 4 The ModR/M encoding for RIP-relative addressing does not depend on using a prefix. Specifically, the r/m bit field encoding of 101B (used to select RIP-relative addressing) is not affected by the REX prefix. For example, selecting R13 (REX.B = 1, r/m = 101B) with mod = 00B still results in RIP-relative addressing. The 4-bit r/m field of REX.B combined with ModR/M is not fully decoded. In order to address R13 with no displacement, software must encode R13 + 0 using a 1-byte displacement of zero. RIP-relative addressing is enabled by 64-bit mode, not by a 64-bit address-size. The use of the address-size prefix does not disable RIP-relative addressing. The effect of the address-size prefix is to truncate and zero-extend the computed effective address to 32 bits. 2.2.1.7 Default 64-Bit Operand Size In 64-bit mode, two groups of instructions have a default operand size of 64 bits (do not need a REX prefix for this operand size). These are: • • Near branches. All instructions, except far branches, that implicitly reference the RSP. 2.2.2 Additional Encodings for Control and Debug Registers In 64-bit mode, more encodings for control and debug registers are available. The REX.R bit is used to modify the ModR/M reg field when that field encodes a control or debug register (see Table 2-4). These encodings enable the processor to address CR8-CR15 and DR8- DR15. An additional control register (CR8) is defined in 64-bit mode. CR8 becomes the Task Priority Register (TPR). In the first implementation of IA-32e mode, CR9-CR15 and DR8-DR15 are not implemented. Any attempt to access unimplemented registers results in an invalid-opcode exception (#UD). 2-12 Vol. 2A INSTRUCTION FORMAT 2.3 INTEL® ADVANCED VECTOR EXTENSIONS (INTEL® AVX) Intel AVX instructions are encoded using an encoding scheme that combines prefix bytes, opcode extension field, operand encoding fields, and vector length encoding capability into a new prefix, referred to as VEX. In the VEX encoding scheme, the VEX prefix may be two or three bytes long, depending on the instruction semantics. Despite the two-byte or three-byte length of the VEX prefix, the VEX encoding format provides a more compact representation/packing of the components of encoding an instruction in Intel 64 architecture. The VEX encoding scheme also allows more headroom for future growth of Intel 64 architecture. 2.3.1 Instruction Format Instruction encoding using VEX prefix provides several advantages: • Instruction syntax support for three operands and up-to four operands when necessary. For example, the third source register used by VBLENDVPD is encoded using bits 7:4 of the immediate byte. • • • Encoding support for vector length of 128 bits (using XMM registers) and 256 bits (using YMM registers). Encoding support for instruction syntax of non-destructive source operands. Elimination of escape opcode byte (0FH), SIMD prefix byte (66H, F2H, F3H) via a compact bit field representation within the VEX prefix. • Elimination of the need to use REX prefix to encode the extended half of general-purpose register sets (R8R15) for direct register access, memory addressing, or accessing XMM8-XMM15 (including YMM8-YMM15). • Flexible and more compact bit fields are provided in the VEX prefix to retain the full functionality provided by REX prefix. REX.W, REX.X, REX.B functionalities are provided in the three-byte VEX prefix only because only a subset of SIMD instructions need them. • Extensibility for future instruction extensions without significant instruction length increase. Figure 2-8 shows the Intel 64 instruction encoding format with VEX prefix support. Legacy instruction without a VEX prefix is fully supported and unchanged. The use of VEX prefix in an Intel 64 instruction is optional, but a VEX prefix is required for Intel 64 instructions that operate on YMM registers or support three and four operand syntax. VEX prefix is not a constant-valued, “single-purpose” byte like 0FH, 66H, F2H, F3H in legacy SSE instructions. VEX prefix provides substantially richer capability than the REX prefix. # Bytes [Prefixes] 2,3 [VEX] 1 OPCODE 1 0,1 ModR/M [SIB] 0,1,2,4 [DISP] 0,1 [IMM] Figure 2-8. Instruction Encoding Format with VEX Prefix 2.3.2 VEX and the LOCK prefix Any VEX-encoded instruction with a LOCK prefix preceding VEX will #UD. 2.3.3 VEX and the 66H, F2H, and F3H prefixes Any VEX-encoded instruction with a 66H, F2H, or F3H prefix preceding VEX will #UD. 2.3.4 VEX and the REX prefix Any VEX-encoded instruction with a REX prefix proceeding VEX will #UD. Vol. 2A 2-13 INSTRUCTION FORMAT 2.3.5 The VEX Prefix The VEX prefix is encoded in either the two-byte form (the first byte must be C5H) or in the three-byte form (the first byte must be C4H). The two-byte VEX is used mainly for 128-bit, scalar, and the most common 256-bit AVX instructions; while the three-byte VEX provides a compact replacement of REX and 3-byte opcode instructions (including AVX and FMA instructions). Beyond the first byte of the VEX prefix, it consists of a number of bit fields providing specific capability, they are shown in Figure 2-9. The bit fields of the VEX prefix can be summarized by its functional purposes: • Non-destructive source register encoding (applicable to three and four operand syntax): This is the first source operand in the instruction syntax. It is represented by the notation, VEX.vvvv. This field is encoded using 1’s complement form (inverted form), i.e. XMM0/YMM0/R0 is encoded as 1111B, XMM15/YMM15/R15 is encoded as 0000B. • Vector length encoding: This 1-bit field represented by the notation VEX.L. L= 0 means vector length is 128 bits wide, L=1 means 256 bit vector. The value of this field is written as VEX.128 or VEX.256 in this document to distinguish encoded values of other VEX bit fields. • REX prefix functionality: Full REX prefix functionality is provided in the three-byte form of VEX prefix. However the VEX bit fields providing REX functionality are encoded using 1’s complement form, i.e. XMM0/YMM0/R0 is encoded as 1111B, XMM15/YMM15/R15 is encoded as 0000B. — Two-byte form of the VEX prefix only provides the equivalent functionality of REX.R, using 1’s complement encoding. This is represented as VEX.R. — Three-byte form of the VEX prefix provides REX.R, REX.X, REX.B functionality using 1’s complement encoding and three dedicated bit fields represented as VEX.R, VEX.X, VEX.B. — Three-byte form of the VEX prefix provides the functionality of REX.W only to specific instructions that need to override default 32-bit operand size for a general purpose register to 64-bit size in 64-bit mode. For those applicable instructions, VEX.W field provides the same functionality as REX.W. VEX.W field can provide completely different functionality for other instructions. Consequently, the use of REX prefix with VEX encoded instructions is not allowed. However, the intent of the REX prefix for expanding register set is reserved for future instruction set extensions using VEX prefix encoding format. • Compaction of SIMD prefix: Legacy SSE instructions effectively use SIMD prefixes (66H, F2H, F3H) as an opcode extension field. VEX prefix encoding allows the functional capability of such legacy SSE instructions (operating on XMM registers, bits 255:128 of corresponding YMM unmodified) to be encoded using the VEX.pp field without the presence of any SIMD prefix. The VEX-encoded 128-bit instruction will zero-out bits 255:128 of the destination register. VEX-encoded instruction may have 128 bit vector length or 256 bits length. • Compaction of two-byte and three-byte opcode: More recently introduced legacy SSE instructions employ two and three-byte opcode. The one or two leading bytes are: 0FH, and 0FH 3AH/0FH 38H. The one-byte escape (0FH) and two-byte escape (0FH 3AH, 0FH 38H) can also be interpreted as an opcode extension field. The VEX.mmmmm field provides compaction to allow many legacy instruction to be encoded without the constant byte sequence, 0FH, 0FH 3AH, 0FH 38H. These VEX-encoded instruction may have 128 bit vector length or 256 bits length. The VEX prefix is required to be the last prefix and immediately precedes the opcode bytes. It must follow any other prefixes. If VEX prefix is present a REX prefix is not supported. The 3-byte VEX leaves room for future expansion with 3 reserved bits. REX and the 66h/F2h/F3h prefixes are reclaimed for future use. VEX prefix has a two-byte form and a three byte form. If an instruction syntax can be encoded using the two-byte form, it can also be encoded using the three byte form of VEX. The latter increases the length of the instruction by one byte. This may be helpful in some situations for code alignment. The VEX prefix supports 256-bit versions of floating-point SSE, SSE2, SSE3, and SSE4 instructions. Note, certain new instruction functionality can only be encoded with the VEX prefix. The VEX prefix will #UD on any instruction containing MMX register sources or destinations. 2-14 Vol. 2A INSTRUCTION FORMAT 3-byte VEX 0 7 6 5 4 11000100 7 2-byte VEX Byte 2 Byte 1 Byte 0 (Bit Position) 7 0 11000101 0 RXB m-mmmm 7 3 R 6 vvvv 1 7 W 6 3 vvvv 1 2 1 0 L pp 2 1 0 L pp R: REX.R in 1’s complement (inverted) form 1: Same as REX.R=0 (must be 1 in 32-bit mode) 0: Same as REX.R=1 (64-bit mode only) X: REX.X in 1’s complement (inverted) form 1: Same as REX.X=0 (must be 1 in 32-bit mode) 0: Same as REX.X=1 (64-bit mode only) B: REX.B in 1’s complement (inverted) form 1: Same as REX.B=0 (Ignored in 32-bit mode). 0: Same as REX.B=1 (64-bit mode only) W: opcode specific (use like REX.W, or used for opcode extension, or ignored, depending on the opcode byte) m-mmmm: 00000: Reserved for future use (will #UD) 00001: implied 0F leading opcode byte 00010: implied 0F 38 leading opcode bytes 00011: implied 0F 3A leading opcode bytes 00100-11111: Reserved for future use (will #UD) vvvv: a register specifier (in 1’s complement form) or 1111 if unused. L: Vector Length 0: scalar or 128-bit vector 1: 256-bit vector pp: opcode extension providing equivalent functionality of a SIMD prefix 00: None 01: 66 10: F3 11: F2 Figure 2-9. VEX bit fields The following subsections describe the various fields in two or three-byte VEX prefix. 2.3.5.1 VEX Byte 0, bits[7:0] VEX Byte 0, bits [7:0] must contain the value 11000101b (C5h) or 11000100b (C4h). The 3-byte VEX uses the C4h first byte, while the 2-byte VEX uses the C5h first byte. 2.3.5.2 VEX Byte 1, bit [7] - ‘R’ VEX Byte 1, bit [7] contains a bit analogous to a bit inverted REX.R. In protected and compatibility modes the bit must be set to ‘1’ otherwise the instruction is LES or LDS. Vol. 2A 2-15 INSTRUCTION FORMAT This bit is present in both 2- and 3-byte VEX prefixes. The usage of WRXB bits for legacy instructions is explained in detail section 2.2.1.2 of Intel 64 and IA-32 Architectures Software developer’s manual, Volume 2A. This bit is stored in bit inverted format. 2.3.5.3 3-byte VEX byte 1, bit[6] - ‘X’ Bit[6] of the 3-byte VEX byte 1 encodes a bit analogous to a bit inverted REX.X. It is an extension of the SIB Index field in 64-bit modes. In 32-bit modes, this bit must be set to ‘1’ otherwise the instruction is LES or LDS. This bit is available only in the 3-byte VEX prefix. This bit is stored in bit inverted format. 2.3.5.4 3-byte VEX byte 1, bit[5] - ‘B’ Bit[5] of the 3-byte VEX byte 1 encodes a bit analogous to a bit inverted REX.B. In 64-bit modes, it is an extension of the ModR/M r/m field, or the SIB base field. In 32-bit modes, this bit is ignored. This bit is available only in the 3-byte VEX prefix. This bit is stored in bit inverted format. 2.3.5.5 3-byte VEX byte 2, bit[7] - ‘W’ Bit[7] of the 3-byte VEX byte 2 is represented by the notation VEX.W. It can provide following functions, depending on the specific opcode. • For AVX instructions that have equivalent legacy SSE instructions (typically these SSE instructions have a general-purpose register operand with its operand size attribute promotable by REX.W), if REX.W promotes the operand size attribute of the general-purpose register operand in legacy SSE instruction, VEX.W has same meaning in the corresponding AVX equivalent form. In 32-bit modes for these instructions, VEX.W is silently ignored. • For AVX instructions that have equivalent legacy SSE instructions (typically these SSE instructions have operands with their operand size attribute fixed and not promotable by REX.W), if REX.W is don’t care in legacy SSE instruction, VEX.W is ignored in the corresponding AVX equivalent form irrespective of mode. • For new AVX instructions where VEX.W has no defined function (typically these meant the combination of the opcode byte and VEX.mmmmm did not have any equivalent SSE functions), VEX.W is reserved as zero and setting to other than zero will cause instruction to #UD. 2.3.5.6 2-byte VEX Byte 1, bits[6:3] and 3-byte VEX Byte 2, bits [6:3]- ‘vvvv’ the Source or Dest Register Specifier In 32-bit mode the VEX first byte C4 and C5 alias onto the LES and LDS instructions. To maintain compatibility with existing programs the VEX 2nd byte, bits [7:6] must be 11b. To achieve this, the VEX payload bits are selected to place only inverted, 64-bit valid fields (extended register selectors) in these upper bits. The 2-byte VEX Byte 1, bits [6:3] and the 3-byte VEX, Byte 2, bits [6:3] encode a field (shorthand VEX.vvvv) that for instructions with 2 or more source registers and an XMM or YMM or memory destination encodes the first source register specifier stored in inverted (1’s complement) form. VEX.vvvv is not used by the instructions with one source (except certain shifts, see below) or on instructions with no XMM or YMM or memory destination. If an instruction does not use VEX.vvvv then it should be set to 1111b otherwise instruction will #UD. In 64-bit mode all 4 bits may be used. See Table 2-8 for the encoding of the XMM or YMM registers. In 32-bit and 16-bit modes bit 6 must be 1 (if bit 6 is not 1, the 2-byte VEX version will generate LDS instruction and the 3-byte VEX version will ignore this bit). 2-16 Vol. 2A INSTRUCTION FORMAT Table 2-8. VEX.vvvv to register name mapping VEX.vvvv Dest Register Valid in Legacy/Compatibility 32-bit modes? 1111B XMM0/YMM0 Valid 1110B XMM1/YMM1 Valid 1101B XMM2/YMM2 Valid 1100B XMM3/YMM3 Valid 1011B XMM4/YMM4 Valid 1010B XMM5/YMM5 Valid 1001B XMM6/YMM6 Valid 1000B XMM7/YMM7 Valid 0111B XMM8/YMM8 Invalid 0110B XMM9/YMM9 Invalid 0101B XMM10/YMM10 Invalid 0100B XMM11/YMM11 Invalid 0011B XMM12/YMM12 Invalid 0010B XMM13/YMM13 Invalid 0001B XMM14/YMM14 Invalid 0000B XMM15/YMM15 Invalid The VEX.vvvv field is encoded in bit inverted format for accessing a register operand. 2.3.6 Instruction Operand Encoding and VEX.vvvv, ModR/M VEX-encoded instructions support three-operand and four-operand instruction syntax. Some VEX-encoded instructions have syntax with less than three operands, e.g. VEX-encoded pack shift instructions support one source operand and one destination operand). The roles of VEX.vvvv, reg field of ModR/M byte (ModR/M.reg), r/m field of ModR/M byte (ModR/M.r/m) with respect to encoding destination and source operands vary with different type of instruction syntax. The role of VEX.vvvv can be summarized to three situations: • VEX.vvvv encodes the first source register operand, specified in inverted (1’s complement) form and is valid for instructions with 2 or more source operands. • VEX.vvvv encodes the destination register operand, specified in 1’s complement form for certain vector shifts. The instructions where VEX.vvvv is used as a destination are listed in Table 2-9. The notation in the “Opcode” column in Table 2-9 is described in detail in section 3.1.1. • VEX.vvvv does not encode any operand, the field is reserved and should contain 1111b. Table 2-9. Instructions with a VEX.vvvv destination Opcode Instruction mnemonic VEX.128.66.0F 73 /7 ib VPSLLDQ xmm1, xmm2, imm8 VEX.128.66.0F 73 /3 ib VPSRLDQ xmm1, xmm2, imm8 VEX.128.66.0F 71 /2 ib VPSRLW xmm1, xmm2, imm8 VEX.128.66.0F 72 /2 ib VPSRLD xmm1, xmm2, imm8 VEX.128.66.0F 73 /2 ib VPSRLQ xmm1, xmm2, imm8 VEX.128.66.0F 71 /4 ib VPSRAW xmm1, xmm2, imm8 VEX.128.66.0F 72 /4 ib VPSRAD xmm1, xmm2, imm8 VEX.128.66.0F 71 /6 ib VPSLLW xmm1, xmm2, imm8 VEX.128.66.0F 72 /6 ib VPSLLD xmm1, xmm2, imm8 VEX.128.66.0F 73 /6 ib VPSLLQ xmm1, xmm2, imm8 Vol. 2A 2-17 INSTRUCTION FORMAT The role of ModR/M.r/m field can be summarized to two situations: • • ModR/M.r/m encodes the instruction operand that references a memory address. For some instructions that do not support memory addressing semantics, ModR/M.r/m encodes either the destination register operand or a source register operand. The role of ModR/M.reg field can be summarized to two situations: • • ModR/M.reg encodes either the destination register operand or a source register operand. For some instructions, ModR/M.reg is treated as an opcode extension and not used to encode any instruction operand. For instruction syntax that support four operands, VEX.vvvv, ModR/M.r/m, ModR/M.reg encodes three of the four operands. The role of bits 7:4 of the immediate byte serves the following situation: • Imm8[7:4] encodes the third source register operand. 2.3.6.1 3-byte VEX byte 1, bits[4:0] - “m-mmmm” Bits[4:0] of the 3-byte VEX byte 1 encode an implied leading opcode byte (0F, 0F 38, or 0F 3A). Several bits are reserved for future use and will #UD unless 0. Table 2-10. VEX.m-mmmm interpretation VEX.m-mmmm Implied Leading Opcode Bytes 00000B Reserved 00001B 0F 00010B 0F 38 00011B 0F 3A 00100-11111B Reserved (2-byte VEX) 0F VEX.m-mmmm is only available on the 3-byte VEX. The 2-byte VEX implies a leading 0Fh opcode byte. 2.3.6.2 2-byte VEX byte 1, bit[2], and 3-byte VEX byte 2, bit [2]- “L” The vector length field, VEX.L, is encoded in bit[2] of either the second byte of 2-byte VEX, or the third byte of 3byte VEX. If “VEX.L = 1”, it indicates 256-bit vector operation. “VEX.L = 0” indicates scalar and 128-bit vector operations. The instruction VZEROUPPER is a special case that is encoded with VEX.L = 0, although its operation zero’s bits 255:128 of all YMM registers accessible in the current operating mode. See the following table. Table 2-11. VEX.L interpretation 2.3.6.3 VEX.L Vector Length 0 128-bit (or 32/64-bit scalar) 1 256-bit 2-byte VEX byte 1, bits[1:0], and 3-byte VEX byte 2, bits [1:0]- “pp” Up to one implied prefix is encoded by bits[1:0] of either the 2-byte VEX byte 1 or the 3-byte VEX byte 2. The prefix behaves as if it was encoded prior to VEX, but after all other encoded prefixes. See the following table. 2-18 Vol. 2A INSTRUCTION FORMAT Table 2-12. VEX.pp interpretation 2.3.7 pp Implies this prefix after other prefixes but before VEX 00B None 01B 66 10B F3 11B F2 The Opcode Byte One (and only one) opcode byte follows the 2 or 3 byte VEX. Legal opcodes are specified in Appendix B, in color. Any instruction that uses illegal opcode will #UD. 2.3.8 The MODRM, SIB, and Displacement Bytes The encodings are unchanged but the interpretation of reg_field or rm_field differs (see above). 2.3.9 The Third Source Operand (Immediate Byte) VEX-encoded instructions can support instruction with a four operand syntax. VBLENDVPD, VBLENDVPS, and PBLENDVB use imm8[7:4] to encode one of the source registers. 2.3.10 AVX Instructions and the Upper 128-bits of YMM registers If an instruction with a destination XMM register is encoded with a VEX prefix, the processor zeroes the upper bits (above bit 128) of the equivalent YMM register. Legacy SSE instructions without VEX preserve the upper bits. 2.3.10.1 Vector Length Transition and Programming Considerations An instruction encoded with a VEX.128 prefix that loads a YMM register operand operates as follows: • • Data is loaded into bits 127:0 of the register Bits above bit 127 in the register are cleared. Thus, such an instruction clears bits 255:128 of a destination YMM register on processors with a maximum vectorregister width of 256 bits. In the event that future processors extend the vector registers to greater widths, an instruction encoded with a VEX.128 or VEX.256 prefix will also clear any bits beyond bit 255. (This is in contrast with legacy SSE instructions, which have no VEX prefix; these modify only bits 127:0 of any destination register operand.) Programmers should bear in mind that instructions encoded with VEX.128 and VEX.256 prefixes will clear any future extensions to the vector registers. A calling function that uses such extensions should save their state before calling legacy functions. This is not possible for involuntary calls (e.g., into an interrupt-service routine). It is recommended that software handling involuntary calls accommodate this by not executing instructions encoded with VEX.128 and VEX.256 prefixes. In the event that it is not possible or desirable to restrict these instructions, then software must take special care to avoid actions that would, on future processors, zero the upper bits of vector registers. Processors that support further vector-register extensions (defining bits beyond bit 255) will also extend the XSAVE and XRSTOR instructions to save and restore these extensions. To ensure forward compatibility, software that handles involuntary calls and that uses instructions encoded with VEX.128 and VEX.256 prefixes should first save and then restore the vector registers (with any extensions) using the XSAVE and XRSTOR instructions with save/restore masks that set bits that correspond to all vector-register extensions. Ideally, software should rely on a mechanism that is cognizant of which bits to set. (E.g., an OS mechanism that sets the save/restore mask bits for all vector-register extensions that are enabled in XCR0.) Saving and restoring state with instructions other than XSAVE and XRSTOR will, on future processors with wider vector registers, corrupt the extended state of the vector registers - even if doing so functions correctly on processors supporting 256-bit vector registers. (The same is true Vol. 2A 2-19 INSTRUCTION FORMAT if XSAVE and XRSTOR are used with a save/restore mask that does not set bits corresponding to all supported extensions to the vector registers.) 2.3.11 AVX Instruction Length The AVX instructions described in this document (including VEX and ignoring other prefixes) do not exceed 11 bytes in length, but may increase in the future. The maximum length of an Intel 64 and IA-32 instruction remains 15 bytes. 2.3.12 Vector SIB (VSIB) Memory Addressing ® In Intel Advanced Vector Extensions 2 (Intel® AVX2), an SIB byte that follows the ModR/M byte can support VSIB memory addressing to an array of linear addresses. VSIB addressing is only supported in a subset of Intel AVX2 instructions. VSIB memory addressing requires 32-bit or 64-bit effective address. In 32-bit mode, VSIB addressing is not supported when address size attribute is overridden to 16 bits. In 16-bit protected mode, VSIB memory addressing is permitted if address size attribute is overridden to 32 bits. Additionally, VSIB memory addressing is supported only with VEX prefix. In VSIB memory addressing, the SIB byte consists of: • • The scale field (bit 7:6) specifies the scale factor. • The base field (bits 2:0) specifies the register number of the base register. The index field (bits 5:3) specifies the register number of the vector index register, each element in the vector register specifies an index. Table 2-3 shows the 32-bit VSIB addressing form. It is organized to give 256 possible values of the SIB byte (in hexadecimal). General purpose registers used as a base are indicated across the top of the table, along with corresponding values for the SIB byte’s base field. The register names also include R8L-R15L applicable only in 64-bit mode (when address size override prefix is used, but the value of VEX.B is not shown in Table 2-3). In 32-bit mode, R8L-R15L does not apply. Table rows in the body of the table indicate the vector index register used as the index field and each supported scaling factor shown separately. Vector registers used in the index field can be XMM or YMM registers. The leftmost column includes vector registers VR8-VR15 (i.e. XMM8/YMM8-XMM15/YMM15), which are only available in 64-bit mode and does not apply if encoding in 32-bit mode. Table 2-13. 32-Bit VSIB Addressing Forms of the SIB Byte r32 (In decimal) Base = (In binary) Base = Scaled Index SS EAX/ R8L 0 000 ECX/ R9L 1 001 EDX/ R10L 2 010 Index EBX/ R11L 3 011 ESP/ R12L 4 100 EBP/ R13L1 5 101 ESI/ R14L 6 110 EDI/ R15L 7 111 Value of SIB Byte (in Hexadecimal) VR0/VR8 VR1/VR9 VR2/VR10 VR3/VR11 VR4/VR12 VR5/VR13 VR6/VR14 VR7/VR15 *1 00 000 001 010 011 100 101 110 111 00 08 10 18 20 28 30 38 01 09 11 19 21 29 31 39 02 0A 12 1A 22 2A 32 3A 03 0B 13 1B 23 2B 33 3B 04 0C 14 1C 24 2C 34 3C 05 0D 15 1D 25 2D 35 3D 06 0E 16 1E 26 2E 36 3E 07 0F 17 1F 27 2F 37 3F VR0/VR8 VR1/VR9 VR2/VR10 VR3/VR11 VR4/VR12 VR5/VR13 VR6/VR14 VR7/VR15 *2 01 000 001 010 011 100 101 110 111 40 48 50 58 60 68 70 78 41 49 51 59 61 69 71 79 42 4A 52 5A 62 6A 72 7A 43 4B 53 5B 63 6B 73 7B 44 4C 54 5C 64 6C 74 7C 45 4D 55 5D 65 6D 75 7D 46 4E 56 5E 66 6E 76 7E 47 4F 57 5F 67 6F 77 7F 2-20 Vol. 2A INSTRUCTION FORMAT Table 2-13. 32-Bit VSIB Addressing Forms of the SIB Byte (Contd.) VR0/VR8 VR1/VR9 VR2/VR10 VR3/VR11 VR4/VR12 VR5/VR13 VR6/VR14 VR7/VR15 *4 10 000 001 010 011 100 101 110 111 80 88 90 98 A0 A8 B0 B8 81 89 91 89 A1 A9 B1 B9 82 8A 92 9A A2 AA B2 BA 83 8B 93 9B A3 AB B3 BB 84 8C 94 9C A4 AC B4 BC 85 8D 95 9D A5 AD B5 BD 86 8E 96 9E A6 AE B6 BE 87 8F 97 9F A7 AF B7 BF VR0/VR8 VR1/VR9 VR2/VR10 VR3/VR11 VR4/VR12 VR5/VR13 VR6/VR14 VR7/VR15 *8 11 000 001 010 011 100 101 110 111 C0 C8 D0 D8 E0 E8 F0 F8 C1 C9 D1 D9 E1 E9 F1 F9 C2 CA D2 DA E2 EA F2 FA C3 CB D3 DB E3 EB F3 FB C4 CC D4 DC E4 EC F4 FC C5 CD D5 DD E5 ED F5 FD C6 CE D6 DE E6 EE F6 FE C7 CF D7 DF E7 EF F7 FF NOTES: 1. If ModR/M.mod = 00b, the base address is zero, then effective address is computed as [scaled vector index] + disp32. Otherwise the base address is computed as [EBP/R13]+ disp, the displacement is either 8 bit or 32 bit depending on the value of ModR/M.mod: MOD Effective Address 00b [Scaled Vector Register] + Disp32 01b [Scaled Vector Register] + Disp8 + [EBP/R13] 10b [Scaled Vector Register] + Disp32 + [EBP/R13] 2.3.12.1 64-bit Mode VSIB Memory Addressing In 64-bit mode VSIB memory addressing uses the VEX.B field and the base field of the SIB byte to encode one of the 16 general-purpose register as the base register. The VEX.X field and the index field of the SIB byte encode one of the 16 vector registers as the vector index register. In 64-bit mode the top row of Table 2-13 base register should be interpreted as the full 64-bit of each register. 2.4 AVX AND SSE INSTRUCTION EXCEPTION SPECIFICATION To look up the exceptions of legacy 128-bit SIMD instruction, 128-bit VEX-encoded instructions, and 256-bit VEXencoded instruction, Table 2-14 summarizes the exception behavior into separate classes, with detailed exception conditions defined in sub-sections 2.4.1 through 2.5.1. For example, ADDPS contains the entry: “See Exceptions Type 2” In this entry, “Type2” can be looked up in Table 2-14. The instruction’s corresponding CPUID feature flag can be identified in the fourth column of the Instruction summary table. Note: #UD on CPUID feature flags=0 is not guaranteed in a virtualized environment if the hardware supports the feature flag. NOTE Instructions that operate only with MMX, X87, or general-purpose registers are not covered by the exception classes defined in this section. For instructions that operate on MMX registers, see Section 22.25.3, “Exception Conditions of Legacy SIMD Instructions Operating on MMX Registers” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. Vol. 2A 2-21 INSTRUCTION FORMAT Table 2-14. Exception class description Exception Class Instruction set Mem arg Floating-Point Exceptions (#XM) Type 1 AVX, Legacy SSE 16/32 byte explicitly aligned None Type 2 AVX, Legacy SSE 16/32 byte not explicitly aligned Yes Type 3 AVX, Legacy SSE < 16 byte Yes Type 4 AVX, Legacy SSE 16/32 byte not explicitly aligned No Type 5 AVX, Legacy SSE < 16 byte No Type 6 AVX (no Legacy SSE) Varies (At present, none do) Type 7 AVX, Legacy SSE None None Type 8 AVX None None F16C 8 or 16 byte, Not explicitly aligned, no AC# Yes AVX2 Not explicitly aligned, no AC# No Type 11 Type 12 See Table 2-15 for lists of instructions in each exception class. 2-22 Vol. 2A INSTRUCTION FORMAT Table 2-15. Instructions in each Exception Class Exception Class Instruction Type 1 (V)MOVAPD, (V)MOVAPS, (V)MOVDQA, (V)MOVNTDQ, (V)MOVNTDQA, (V)MOVNTPD, (V)MOVNTPS Type 2 (V)ADDPD, (V)ADDPS, (V)ADDSUBPD, (V)ADDSUBPS, (V)CMPPD, (V)CMPPS, (V)CVTDQ2PS, (V)CVTPD2DQ, (V)CVTPD2PS, (V)CVTPS2DQ, (V)CVTTPD2DQ, (V)CVTTPS2DQ, (V)DIVPD, (V)DIVPS, (V)DPPD*, (V)DPPS*, VFMADD132PD, VFMADD213PD, VFMADD231PD, VFMADD132PS, VFMADD213PS, VFMADD231PS, VFMADDSUB132PD, VFMADDSUB213PD, VFMADDSUB231PD, VFMADDSUB132PS, VFMADDSUB213PS, VFMADDSUB231PS, VFMSUBADD132PD, VFMSUBADD213PD, VFMSUBADD231PD, VFMSUBADD132PS, VFMSUBADD213PS, VFMSUBADD231PS, VFMSUB132PD, VFMSUB213PD, VFMSUB231PD, VFMSUB132PS, VFMSUB213PS, VFMSUB231PS, VFNMADD132PD, VFNMADD213PD, VFNMADD231PD, VFNMADD132PS, VFNMADD213PS, VFNMADD231PS, VFNMSUB132PD, VFNMSUB213PD, VFNMSUB231PD, VFNMSUB132PS, VFNMSUB213PS, VFNMSUB231PS, (V)HADDPD, (V)HADDPS, (V)HSUBPD, (V)HSUBPS, (V)MAXPD, (V)MAXPS, (V)MINPD, (V)MINPS, (V)MULPD, (V)MULPS, (V)ROUNDPD, (V)ROUNDPS, (V)SQRTPD, (V)SQRTPS, (V)SUBPD, (V)SUBPS Type 3 (V)ADDSD, (V)ADDSS, (V)CMPSD, (V)CMPSS, (V)COMISD, (V)COMISS, (V)CVTPS2PD, (V)CVTSD2SI, (V)CVTSD2SS, (V)CVTSI2SD, (V)CVTSI2SS, (V)CVTSS2SD, (V)CVTSS2SI, (V)CVTTSD2SI, (V)CVTTSS2SI, (V)DIVSD, (V)DIVSS, VFMADD132SD, VFMADD213SD, VFMADD231SD, VFMADD132SS, VFMADD213SS, VFMADD231SS, VFMSUB132SD, VFMSUB213SD, VFMSUB231SD, VFMSUB132SS, VFMSUB213SS, VFMSUB231SS, VFNMADD132SD, VFNMADD213SD, VFNMADD231SD, VFNMADD132SS, VFNMADD213SS, VFNMADD231SS, VFNMSUB132SD, VFNMSUB213SD, VFNMSUB231SD, VFNMSUB132SS, VFNMSUB213SS, VFNMSUB231SS, (V)MAXSD, (V)MAXSS, (V)MINSD, (V)MINSS, (V)MULSD, (V)MULSS, (V)ROUNDSD, (V)ROUNDSS, (V)SQRTSD, (V)SQRTSS, (V)SUBSD, (V)SUBSS, (V)UCOMISD, (V)UCOMISS Type 4 (V)AESDEC, (V)AESDECLAST, (V)AESENC, (V)AESENCLAST, (V)AESIMC, (V)AESKEYGENASSIST, (V)ANDPD, (V)ANDPS, (V)ANDNPD, (V)ANDNPS, (V)BLENDPD, (V)BLENDPS, VBLENDVPD, VBLENDVPS, (V)LDDQU***, (V)MASKMOVDQU, (V)PTEST, VTESTPS, VTESTPD, (V)MOVDQU*, (V)MOVSHDUP, (V)MOVSLDUP, (V)MOVUPD*, (V)MOVUPS*, (V)MPSADBW, (V)ORPD, (V)ORPS, (V)PABSB, (V)PABSW, (V)PABSD, (V)PACKSSWB, (V)PACKSSDW, (V)PACKUSWB, (V)PACKUSDW, (V)PADDB, (V)PADDW, (V)PADDD, (V)PADDQ, (V)PADDSB, (V)PADDSW, (V)PADDUSB, (V)PADDUSW, (V)PALIGNR, (V)PAND, (V)PANDN, (V)PAVGB, (V)PAVGW, (V)PBLENDVB, (V)PBLENDW, (V)PCMP(E/I)STRI/M***, (V)PCMPEQB, (V)PCMPEQW, (V)PCMPEQD, (V)PCMPEQQ, (V)PCMPGTB, (V)PCMPGTW, (V)PCMPGTD, (V)PCMPGTQ, (V)PCLMULQDQ, (V)PHADDW, (V)PHADDD, (V)PHADDSW, (V)PHMINPOSUW, (V)PHSUBD, (V)PHSUBW, (V)PHSUBSW, (V)PMADDWD, (V)PMADDUBSW, (V)PMAXSB, (V)PMAXSW, (V)PMAXSD, (V)PMAXUB, (V)PMAXUW, (V)PMAXUD, (V)PMINSB, (V)PMINSW, (V)PMINSD, (V)PMINUB, (V)PMINUW, (V)PMINUD, (V)PMULHUW, (V)PMULHRSW, (V)PMULHW, (V)PMULLW, (V)PMULLD, (V)PMULUDQ, (V)PMULDQ, (V)POR, (V)PSADBW, (V)PSHUFB, (V)PSHUFD, (V)PSHUFHW, (V)PSHUFLW, (V)PSIGNB, (V)PSIGNW, (V)PSIGND, (V)PSLLW, (V)PSLLD, (V)PSLLQ, (V)PSRAW, (V)PSRAD, (V)PSRLW, (V)PSRLD, (V)PSRLQ, (V)PSUBB, (V)PSUBW, (V)PSUBD, (V)PSUBQ, (V)PSUBSB, (V)PSUBSW, (V)PSUBUSB, (V)PSUBUSW, (V)PUNPCKHBW, (V)PUNPCKHWD, (V)PUNPCKHDQ, (V)PUNPCKHQDQ, (V)PUNPCKLBW, (V)PUNPCKLWD, (V)PUNPCKLDQ, (V)PUNPCKLQDQ, (V)PXOR, (V)RCPPS, (V)RSQRTPS, (V)SHUFPD, (V)SHUFPS, (V)UNPCKHPD, (V)UNPCKHPS, (V)UNPCKLPD, (V)UNPCKLPS, (V)XORPD, (V)XORPS, VPBLENDD, VPERMD, VPERMPS, VPERMPD, VPERMQ, VPSLLVD, VPSLLVQ, VPSRAVD, VPSRLVD, VPSRLVQ, VPERMILPD, VPERMILPS, VPERM2F128 Type 5 (V)CVTDQ2PD, (V)EXTRACTPS, (V)INSERTPS, (V)MOVD, (V)MOVQ, (V)MOVDDUP, (V)MOVLPD, (V)MOVLPS, (V)MOVHPD, (V)MOVHPS, (V)MOVSD, (V)MOVSS, (V)PEXTRB, (V)PEXTRD, (V)PEXTRW, (V)PEXTRQ, (V)PINSRB, (V)PINSRD, (V)PINSRW, (V)PINSRQ, PMOVSXBW, (V)RCPSS, (V)RSQRTSS, (V)PMOVSX/ZX, VLDMXCSR*, VSTMXCSR Type 6 VEXTRACTF128/VEXTRACTFxxxx, VBROADCASTSS, VBROADCASTSD, VBROADCASTF128, VINSERTF128, VMASKMOVPS**, VMASKMOVPD**, VPMASKMOVD, VPMASKMOVQ, VBROADCASTI128, VPBROADCASTB, VPBROADCASTD, VPBROADCASTW, VPBROADCASTQ, VEXTRACTI128, VINSERTI128, VPERM2I128 Type 7 (V)MOVLHPS, (V)MOVHLPS, (V)MOVMSKPD, (V)MOVMSKPS, (V)PMOVMSKB, (V)PSLLDQ, (V)PSRLDQ, (V)PSLLW, (V)PSLLD, (V)PSLLQ, (V)PSRAW, (V)PSRAD, (V)PSRLW, (V)PSRLD, (V)PSRLQ Type 8 VZEROALL, VZEROUPPER Type 11 VCVTPH2PS, VCVTPS2PH Type 12 VGATHERDPS, VGATHERDPD, VGATHERQPS, VGATHERQPD, VPGATHERDD, VPGATHERDQ, VPGATHERQD, VPGATHERQQ (*) - Additional exception restrictions are present - see the Instruction description for details Vol. 2A 2-23 INSTRUCTION FORMAT (**) - Instruction behavior on alignment check reporting with mask bits of less than all 1s are the same as with mask bits of all 1s, i.e. no alignment checks are performed. (***) - PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM and LDDQU instructions do not cause #GP if the memory operand is not aligned to 16-Byte boundary. Table 2-15 classifies exception behaviors for AVX instructions. Within each class of exception conditions that are listed in Table 2-18 through Table 2-27, certain subsets of AVX instructions may be subject to #UD exception depending on the encoded value of the VEX.L field. Table 2-17 provides supplemental information of AVX instructions that may be subject to #UD exception if encoded with incorrect values in the VEX.W or VEX.L field. Table 2-16. #UD Exception and VEX.W=1 Encoding Exception Class #UD If VEX.W = 1 in all modes Type 1 Type 2 Type 3 Type 4 VBLENDVPD, VBLENDVPS, VPBLENDVB, VTESTPD, VTESTPS, VPBLENDD, VPERMD, VPERMPS, VPERM2I128, VPSRAVD, VPERMILPD, VPERMILPS, VPERM2F128 Type 5 Type 6 VEXTRACTF128, VBROADCASTSS, VBROADCASTSD, VBROADCASTF128, VINSERTF128, VMASKMOVPS, VMASKMOVPD, VBROADCASTI128, VPBROADCASTB/W/D, VEXTRACTI128, VINSERTI128 Type 7 Type 8 Type 11 Type 12 2-24 Vol. 2A VCVTPH2PS, VCVTPS2PH #UD If VEX.W = 1 in non-64-bit modes INSTRUCTION FORMAT Table 2-17. #UD Exception and VEX.L Field Encoding Exception Class #UD If VEX.L = 0 Type 1 #UD If (VEX.L = 1 && AVX2 not present && AVX present) #UD If (VEX.L = 1 && AVX2 present) VMOVNTDQA VDPPD VDPPD VPCMP(E/I)STRI/M, PHMINPOSUW Type 4 VMASKMOVDQU, VMPSADBW, VPABSB/W/D, VPACKSSWB/DW, VPACKUSWB/DW, VPADDB/W/D, VPADDQ, VPADDSB/W, VPADDUSB/W, VPALIGNR, VPAND, VPANDN, VPAVGB/W, VPBLENDVB, VPBLENDW, VPCMP(E/I)STRI/M, VPCMPEQB/W/D/Q, VPCMPGTB/W/D/Q, VPHADDW/D, VPHADDSW, VPHMINPOSUW, VPHSUBD/W, VPHSUBSW, VPMADDWD, VPMADDUBSW, VPMAXSB/W/D, VPMAXUB/W/D, VPMINSB/W/D, VPMINUB/W/D, VPMULHUW, VPMULHRSW, VPMULHW/LW, VPMULLD, VPMULUDQ, VPMULDQ, VPOR, VPSADBW, VPSHUFB/D, VPSHUFHW/LW, VPSIGNB/W/D, VPSLLW/D/Q, VPSRAW/D, VPSRLW/D/Q, VPSUBB/W/D/Q, VPSUBSB/W, VPUNPCKHBW/WD/DQ, VPUNPCKHQDQ, VPUNPCKLBW/WD/DQ, VPUNPCKLQDQ, VPXOR Same as column 3 Type 5 VEXTRACTPS, VINSERTPS, VMOVD, VMOVQ, VMOVLPD, VMOVLPS, VMOVHPD, VMOVHPS, VPEXTRB, VPEXTRD, VPEXTRW, VPEXTRQ, VPINSRB, VPINSRD, VPINSRW, VPINSRQ, VPMOVSX/ZX, VLDMXCSR, VSTMXCSR VMOVLHPS, VMOVHLPS, VPMOVMSKB, VPSLLDQ, VPSRLDQ, VPSLLW, VPSLLD, VPSLLQ, VPSRAW, VPSRAD, VPSRLW, VPSRLD, VPSRLQ VMOVLHPS, VMOVHLPS Type 2 Type 3 Type 6 Type 7 VEXTRACTF128, VPERM2F128, VBROADCASTSD, VBROADCASTF128, VINSERTF128, Type 8 Type 11 Type 12 Vol. 2A 2-25 INSTRUCTION FORMAT 2.4.1 Exceptions Type 1 (Aligned memory reference) Invalid Opcode, #UD Device Not Available, #NM X VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. X X X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X X X For an illegal address in the SS segment. X If a memory address referencing the SS segment is in a non-canonical form. X X VEX.256: Memory operand is not 32-byte aligned. VEX.128: Memory operand is not 16-byte aligned. X X Legacy SSE: Memory operand is not 16-byte aligned. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X X Page Fault #PF(fault-code) 2-26 Vol. 2A Cause of Exception X Stack, SS(0) General Protection, #GP(0) 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-18. Type 1 Class Exception Conditions X X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X For a page fault. INSTRUCTION FORMAT 2.4.2 Exceptions Type 2 (>=16 Byte Memory Reference, Unaligned) Invalid Opcode, #UD Device Not Available, #NM X X X X X X X Cause of Exception VEX prefix. X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) For an illegal address in the SS segment. X X X X X X General Protection, #GP(0) Page Fault #PF(fault-code) X If a memory address referencing the SS segment is in a non-canonical form. Legacy SSE: Memory operand is not 16-byte aligned. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X SIMD Floatingpoint Exception, #XM 64-bit X Protected and Compatibility Virtual 8086 Exception Real Table 2-19. Type 2 Class Exception Conditions X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X X For a page fault. X X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1. Vol. 2A 2-27 INSTRUCTION FORMAT 2.4.3 Exceptions Type 3 (<16 Byte memory argument) X X X 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-20. Type 3 Class Exception Conditions VEX prefix. X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. Invalid Opcode, #UD Device Not Available, #NM X X X X Cause of Exception X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) For an illegal address in the SS segment. X For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(fault-code) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 Bytes or less is made while the current privilege level is 3. X X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1. SIMD Floating-point Exception, #XM 2-28 Vol. 2A X INSTRUCTION FORMAT 2.4.4 Exceptions Type 4 (>=16 Byte mem arg no alignment, no floating-point exceptions) Invalid Opcode, #UD Device Not Available, #NM X Cause of Exception VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. X X X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) X X X For an illegal address in the SS segment. X If a memory address referencing the SS segment is in a non-canonical form. X Legacy SSE: Memory operand is not 16-byte aligned.1 For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X Page Fault #PF(fault-code) 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-21. Type 4 Class Exception Conditions If any part of the operand lies outside the effective address space from 0 to FFFFH. X X If the memory address is in a non-canonical form. X X For a page fault. NOTES: 1. PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM and LDDQU instructions do not cause #GP if the memory operand is not aligned to 16-Byte boundary. Vol. 2A 2-29 INSTRUCTION FORMAT 2.4.5 Exceptions Type 5 (<16 Byte mem arg and no FP exceptions) Invalid Opcode, #UD Device Not Available, #NM X 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-22. Type 5 Class Exception Conditions Cause of Exception VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. X X X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) For an illegal address in the SS segment. X X General Protection, #GP(0) X X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(fault-code) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. 2-30 Vol. 2A INSTRUCTION FORMAT 2.4.6 Exceptions Type 6 (VEX-Encoded Instructions Without Legacy SSE Analogues) Note: At present, the AVX instructions in this category do not generate floating-point exceptions. X Device Not Available, #NM Stack, SS(0) General Protection, #GP(0) Page Fault #PF(fault-code) Alignment Check #AC(0) Cause of Exception VEX prefix. X Invalid Opcode, #UD 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-23. Type 6 Class Exception Conditions X If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X If any corresponding CPUID feature flag is ‘0’. X X If CR0.TS[bit 3]=1. X For an illegal address in the SS segment. X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X If the memory address is in a non-canonical form. X X For a page fault. X X For 4 or 8 byte memory references if alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. Vol. 2A 2-31 INSTRUCTION FORMAT 2.4.7 Exceptions Type 7 (No FP exceptions, no memory arg) Invalid Opcode, #UD X Cause of Exception VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X X X Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. X X X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X If CR0.TS[bit 3]=1. Device Not Available, #NM 2.4.8 64-bit X Protected and Compatibility Virtual-8086 Exception Real Table 2-24. Type 7 Class Exception Conditions Exceptions Type 8 (AVX and no memory argument) X X X 2-32 Vol. 2A X Cause of Exception Always in Real or Virtual-8086 mode. X Device Not Available, #NM 64-bit Virtual-8086 Invalid Opcode, #UD Protected and Compatibility Exception Real Table 2-25. Type 8 Class Exception Conditions X If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. If CPUID.01H.ECX.AVX[bit 28]=0. If VEX.vvvv ≠ 1111B. X X If proceeded by a LOCK prefix (F0H). X X If CR0.TS[bit 3]=1. INSTRUCTION FORMAT 2.4.9 Exceptions Type 11 (VEX-only, mem arg no AC, floating-point exceptions) X X Device Not Available, #NM 64-bit Virtual-8086 Invalid Opcode, #UD Protected and Compatibility Exception Real Table 2-26. Type 11 Class Exception Conditions VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X If preceded by a LOCK prefix (F0H). X X X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. Stack, SS(0) X General Protection, #GP(0) X For an illegal address in the SS segment. X X Page Fault #PF (fault-code) X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X SIMD Floating-Point Exception, #XM Cause of Exception X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X X For a page fault. X X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1. Vol. 2A 2-33 INSTRUCTION FORMAT 2.4.10 Exceptions Type 12 (VEX-only, VSIB mem arg, no AC, no floating-point exceptions) Invalid Opcode, #UD X X Device Not Available, #NM 64-bit Protected and Compatibility Virtual-8086 Exception Real Table 2-27. Type 12 Class Exception Conditions X X VEX prefix. X X VEX prefix: If XCR0[2:1] ≠ ‘11b’. If CR4.OSXSAVE[bit 18]=0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X NA If address size attribute is 16 bit. X X X X If ModR/M.mod = ‘11b’. X X X X If ModR/M.rm ≠ ‘100b’. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If any vector register is used more than once between the destination register, mask register and the index register in VSIB addressing. X X X X If CR0.TS[bit 3]=1. Stack, SS(0) X For an illegal address in the SS segment. X General Protection, #GP(0) X X 2.5 X X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X Page Fault #PF (faultcode) Cause of Exception If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X For a page fault. VEX ENCODING SUPPORT FOR GPR INSTRUCTIONS VEX prefix may be used to encode instructions that operate on neither YMM nor XMM registers. VEX-encoded general-purpose-register instructions have the following properties: • • Instruction syntax support for three encodable operands. • Elimination of escape opcode byte (0FH), two-byte escape via a compact bit field representation within the VEX prefix. • Elimination of the need to use REX prefix to encode the extended half of general-purpose register sets (R8-R15) for direct register access or memory addressing. • Flexible and more compact bit fields are provided in the VEX prefix to retain the full functionality provided by REX prefix. REX.W, REX.X, REX.B functionalities are provided in the three-byte VEX prefix only. • VEX-encoded GPR instructions are encoded with VEX.L=0. Encoding support for instruction syntax of non-destructive source operand, destination operand encoded via VEX.vvvv, and destructive three-operand syntax. 2-34 Vol. 2A INSTRUCTION FORMAT Any VEX-encoded GPR instruction with a 66H, F2H, or F3H prefix preceding VEX will #UD. Any VEX-encoded GPR instruction with a REX prefix proceeding VEX will #UD. VEX-encoded GPR instructions are not supported in real and virtual 8086 modes. 2.5.1 Exceptions Type 13 (VEX-Encoded GPR Instructions) The exception conditions applicable to VEX-encoded GPR instruction differs from those of legacy GPR instructions. Table 2-28 lists VEX-encoded GPR instructions. The exception conditions for VEX-encoded GRP instructions are found in Table 2-29 for those instructions which have a default operand size of 32 bits and 16-bit operand size is not encodable. Table 2-28. VEX-Encoded GPR Instructions Exception Class Instruction Type 13 ANDN, BEXTR, BLSI, BLSMSK, BLSR, BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX (*) - Additional exception restrictions are present - see the Instruction description for details. Stack, SS(0) Protected and Compatibility 64-bit Invalid Opcode, #UD Virtual-8086 Exception Real Table 2-29. Type 13 Class Exception Conditions X X Cause of Exception X X X X X X X X If VEX.L = 1. X X X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X If a VEX prefix is present. X For an illegal address in the SS segment. X General Protection, #GP(0) X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector. X X If BMI1/BMI2 CPUID feature flag is ‘0’. X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. Page Fault #PF(faultcode) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. 2.6 INTEL® AVX-512 ENCODING The majority of the Intel AVX-512 family of instructions (operating on 512/256/128-bit vector register operands) are encoded using a new prefix (called EVEX). Opmask instructions (operating on opmask register operands) are encoded using the VEX prefix. The EVEX prefix has some parts resembling the instruction encoding scheme using the VEX prefix, and many other capabilities not available with the VEX prefix. Vol. 2A 2-35 INSTRUCTION FORMAT The significant feature differences between EVEX and VEX are summarized below. • EVEX is a 4-Byte prefix (the first byte must be 62H); VEX is either a 2-Byte (C5H is the first byte) or 3-Byte (C4H is the first byte) prefix. • • EVEX prefix can encode 32 vector registers (XMM/YMM/ZMM) in 64-bit mode. • EVEX memory addressing with disp8 form uses a compressed disp8 encoding scheme to improve the encoding density of the instruction byte stream. • EVEX prefix can encode functionality that are specific to instruction classes (e.g., packed instruction with “load+op” semantic can support embedded broadcast functionality, floating-point instruction with rounding semantic can support static rounding functionality, floating-point instruction with non-rounding arithmetic semantic can support “suppress all exceptions” functionality). EVEX prefix can encode an opmask register for conditional processing or selection control in EVEX-encoded vector instructions. Opmask instructions, whose source/destination operands are opmask registers and treat the content of an opmask register as a single value, are encoded using the VEX prefix. 2.6.1 Instruction Format and EVEX The placement of the EVEX prefix in an IA instruction is represented in Figure 2-10. Note that the values contained within brackets are optional. # of bytes: [Prefixes] 1 4 EVEX 1 Opcode 1 ModR/M [SIB] 2, 4 [Disp16,32] 1 [Immediate] 1 [Disp8*N] Figure 2-10. AVX-512 Instruction Format and the EVEX Prefix The EVEX prefix is a 4-byte prefix, with the first two bytes derived from unused encoding form of the 32-bit-modeonly BOUND instruction. The layout of the EVEX prefix is shown in Figure 2-11. The first byte must be 62H, followed by three payload bytes, denoted as P0, P1, and P2 individually or collectively as P[23:0] (see Figure 2-11). EVEX 62H P0 P1 7 P0 R 7 P1 W 7 P2 z 6 X 6 v 6 L’ P2 5 B 5 v 5 L 4 R’ 4 v 4 b 3 0 3 v 3 V’ 2 0 2 1 2 a 1 0 m 1 m 0 p 1 p P[15:8] 0 a Figure 2-11. Bit Field Layout of the EVEX Prefix 2-36 Vol. 2A P[7:0] a P[23:16] INSTRUCTION FORMAT Table 2-30. EVEX Prefix Bit Field Functional Grouping Notation Bit field Group Position Comment -- Reserved P[3 : 2] Must be 0. -- Fixed Value P[10] Must be 1. EVEX.mm Compressed legacy escape P[1: 0] Identical to low two bits of VEX.mmmmm. EVEX.pp Compressed legacy prefix P[9 : 8] Identical to VEX.pp. EVEX.RXB Next-8 register specifier modifier P[7 : 5] Combine with ModR/M.reg, ModR/M.rm (base, index/vidx). EVEX.R’ High-16 register specifier modifier P[4] Combine with EVEX.R and ModR/M.reg. EVEX.X High-16 register specifier modifier P[6] Combine with EVEX.B and ModR/M.rm, when SIB/VSIB absent. EVEX.vvvv VVVV register specifier P[14 : 11] Same as VEX.vvvv. EVEX.V’ High-16 VVVV/VIDX register specifier P[19] Combine with EVEX.vvvv or when VSIB present. EVEX.aaa Embedded opmask register specifier P[18 : 16] EVEX.W Osize promotion/Opcode extension P[15] EVEX.z Zeroing/Merging P[23] EVEX.b Broadcast/RC/SAE Context P[20] EVEX.L’L Vector length/RC P[22 : 21] The bit fields in P[23:0] are divided into the following functional groups (Table 2-30 provides a tabular summary): • • • Reserved bits: P[3:2] must be 0, otherwise #UD. • Operand specifier modifier bits for vector register, general purpose register, memory addressing: P[7:5] allows access to the next set of 8 registers beyond the low 8 registers when combined with ModR/M register specifiers. • Operand specifier modifier bit for vector register: P[4] (or EVEX.R’) allows access to the high 16 vector register set when combined with P[7] and ModR/M.reg specifier; P[6] can also provide access to a high 16 vector register when SIB or VSIB addressing are not needed. • Non-destructive source /vector index operand specifier: P[19] and P[14:11] encode the second source vector register operand in a non-destructive source syntax, vector index register operand can access an upper 16 vector register using P[19]. • Op-mask register specifiers: P[18:16] encodes op-mask register set k0-k7 in instructions operating on vector registers. • EVEX.W: P[15] is similar to VEX.W which serves either as opcode extension bit or operand size promotion to 64-bit in 64-bit mode. • Vector destination merging/zeroing: P[23] encodes the destination result behavior which either zeroes the masked elements or leave masked element unchanged. • Broadcast/Static-rounding/SAE context bit: P[20] encodes multiple functionality, which differs across different classes of instructions and can affect the meaning of the remaining field (EVEX.L’L). The functionality for the following instruction classes are: Fixed-value bit: P[10] must be 1, otherwise #UD. Compressed legacy prefix/escape bytes: P[1:0] is identical to the lowest 2 bits of VEX.mmmmm; P[9:8] is identical to VEX.pp. — Broadcasting a single element across the destination vector register: this applies to the instruction class with Load+Op semantic where one of the source operand is from memory. — Redirect L’L field (P[22:21]) as static rounding control for floating-point instructions with rounding semantic. Static rounding control overrides MXCSR.RC field and implies “Suppress all exceptions” (SAE). — Enable SAE for floating -point instructions with arithmetic semantic that is not rounding. — For instruction classes outside of the afore-mentioned three classes, setting EVEX.b will cause #UD. Vol. 2A 2-37 INSTRUCTION FORMAT • Vector length/rounding control specifier: P[22:21] can serve one of three options. — Vector length information for packed vector instructions. — Ignored for instructions operating on vector register content as a single data element. — Rounding control for floating-point instructions that have a rounding semantic and whose source and destination operands are all vector registers. 2.6.2 Register Specifier Encoding and EVEX EVEX-encoded instruction can access 8 opmask registers, 16 general-purpose registers and 32 vector registers in 64-bit mode (8 general-purpose registers and 8 vector registers in non-64-bit modes). EVEX-encoding can support instruction syntax that access up to 4 instruction operands. Normal memory addressing modes and VSIB memory addressing are supported with EVEX prefix encoding. The mapping of register operands used by various instruction syntax and memory addressing in 64-bit mode are shown in Table 2-31. Opmask register encoding is described in Section 2.6.3. Table 2-31. 32-Register Support in 64-bit Mode Using EVEX with Embedded REX Bits 41 3 [2:0] Reg. Type Common Usages REG EVEX.R’ REX.R modrm.reg GPR, Vector Destination or Source VVVV EVEX.V’ GPR, Vector 2ndSource or Destination RM EVEX.X EVEX.B modrm.r/m GPR, Vector 1st Source or Destination BASE 0 EVEX.B modrm.r/m GPR memory addressing INDEX 0 EVEX.X sib.index GPR memory addressing EVEX.V’ EVEX.X sib.index Vector VSIB memory addressing VIDX EVEX.vvvv NOTES: 1. Not applicable for accessing general purpose registers. The mapping of register operands used by various instruction syntax and memory addressing in 32-bit modes are shown in Table 2-32. Table 2-32. EVEX Encoding Register Specifiers in 32-bit Mode [2:0] Reg. Type Common Usages REG modrm.reg GPR, Vector Destination or Source VVVV EVEX.vvv GPR, Vector 2nd Source or Destination RM modrm.r/m GPR, Vector 1st Source or Destination BASE modrm.r/m GPR Memory Addressing INDEX sib.index GPR Memory Addressing VIDX sib.index Vector VSIB Memory Addressing 2.6.3 Opmask Register Encoding There are eight opmask registers, k0-k7. Opmask register encoding falls into two categories: • Opmask registers that are the source or destination operands of an instruction treating the content of opmask register as a scalar value, are encoded using the VEX prefix scheme. It can support up to three operands using standard modR/M byte’s reg field and rm field and VEX.vvvv. Such a scalar opmask instruction does not support conditional update of the destination operand. • An opmask register providing conditional processing and/or conditional update of the destination register of a vector instruction is encoded using EVEX.aaa field (see Section 2.6.4). 2-38 Vol. 2A INSTRUCTION FORMAT • An opmask register serving as the destination or source operand of a vector instruction is encoded using standard modR/M byte’s reg field and rm fields. Table 2-33. Opmask Register Specifier Encoding [2:0] Register Access Common Usages REG modrm.reg k0-k7 Source VVVV VEX.vvvv k0-k7 2nd Source RM modrm.r/m k0-7 1st Source {k1} EVEX.aaa 1 k0 -k7 Opmask NOTES: 1. Instructions that overwrite the conditional mask in opmask do not permit using k0 as the embedded mask. 2.6.4 Masking Support in EVEX EVEX can encode an opmask register to conditionally control per-element computational operation and updating of result of an instruction to the destination operand. The predicate operand is known as the opmask register. The EVEX.aaa field, P[18:16] of the EVEX prefix, is used to encode one out of a set of eight 64-bit architectural registers. Note that from this set of 8 architectural registers, only k1 through k7 can be addressed as predicate operands. k0 can be used as a regular source or destination but cannot be encoded as a predicate operand. AVX-512 instructions support two types of masking with EVEX.z bit (P[23]) controlling the type of masking: • Merging-masking, which is the default type of masking for EVEX-encoded vector instructions, preserves the old value of each element of the destination where the corresponding mask bit has a 0. It corresponds to the case of EVEX.z = 0. • Zeroing-masking, is enabled by having the EVEX.z bit set to 1. In this case, an element of the destination is set to 0 when the corresponding mask bit has a 0 value. AVX-512 Foundation instructions can be divided into the following groups: • Instructions which support “zeroing-masking”. — Also allow merging-masking. • Instructions which require aaa = 000. — Do not allow any form of masking. • Instructions which allow merging-masking but do not allow zeroing-masking. — Require EVEX.z to be set to 0. — This group is mostly composed of instructions that write to memory. • Instructions which require aaa <> 000 do not allow EVEX.z to be set to 1. — Allow merging-masking and do not allow zeroing-masking, e.g., gather instructions. 2.6.5 Compressed Displacement (disp8*N) Support in EVEX For memory addressing using disp8 form, EVEX-encoded instructions always use a compressed displacement scheme by multiplying disp8 in conjunction with a scaling factor N that is determined based on the vector length, the value of EVEX.b bit (embedded broadcast) and the input element size of the instruction. In general, the factor N corresponds to the number of bytes characterizing the internal memory operation of the input operand (e.g., 64 when the accessing a full 512-bit memory vector). The scale factor N is listed in Table 2-34 and Table 2-35 below, where EVEX encoded instructions are classified using the tupletype attribute. The scale factor N of each tupletype is listed based on the vector length (VL) and other factors affecting it. Table 2-34 covers EVEX-encoded instructions which has a load semantic in conjunction with additional computational or data element movement operation, operating either on the full vector or half vector (due to conversion of Vol. 2A 2-39 INSTRUCTION FORMAT numerical precision from a wider format to narrower format). EVEX.b is supported for such instructions for data element sizes which are either dword or qword (see Section 2.6.11). EVEX-encoded instruction that are pure load/store, and “Load+op” instruction semantic that operate on data element size less then dword do not support broadcasting using EVEX.b. These are listed in Table 2-35. Table 2-35 also includes many broadcast instructions which perform broadcast using a subset of data elements without using EVEX.b. These instructions and a few data element size conversion instruction are covered in Table 2-35. Instruction classified in Table 2-35 do not use EVEX.b and EVEX.b must be 0, otherwise #UD will occur. The tupletype will be referenced in the instruction operand encoding table in the reference page of each instruction, providing the cross reference for the scaling factor N to encoding memory addressing operand. Note that the disp8*N rules still apply when using 16b addressing. Table 2-34. Compressed Displacement (DISP8*N) Affected by Embedded Broadcast TupleType EVEX.b InputSize EVEX.W 0 32bit 0 none 16 32 64 1 32bit 0 {1tox} 4 4 4 0 64bit 1 none 16 32 64 1 64bit 1 {1tox} 8 8 8 0 32bit 0 none 8 16 32 1 32bit 0 {1tox} 4 4 4 Full Half Broadcast N (VL=128) N (VL=256) N (VL= 512) Comment Load+Op (Full Vector Dword/Qword) Load+Op (Half Vector) Table 2-35. EVEX DISP8*N for Instructions Not Affected by Embedded Broadcast TupleType InputSize Full Mem N/A N/A 16 32 64 8bit N/A 1 1 1 16bit N/A 2 2 2 32bit 0 4 4 4 64bit 1 8 8 8 32bit N/A 4 4 4 64bit N/A 8 8 8 32bit 0 8 8 8 64bit 1 NA 16 16 32bit 0 NA 16 16 64bit 1 NA NA 32 Tuple8 32bit 0 NA NA 32 Broadcast (8 elements) Half Mem N/A N/A 8 16 32 SubQword Conversion Quarter Mem N/A N/A 4 8 16 SubDword Conversion Eighth Mem N/A N/A 2 4 8 SubWord Conversion Mem128 N/A N/A 16 16 16 Shift count from memory MOVDDUP N/A N/A 8 32 64 VMOVDDUP Tuple1 Scalar Tuple1 Fixed Tuple2 Tuple4 2-40 Vol. 2A EVEX.W N (VL= 128) N (VL= 256) N (VL= 512) Comment Load/store or subDword full vector 1Tuple 1 Tuple, memsize not affected by EVEX.W Broadcast (2 elements) Broadcast (4 elements) INSTRUCTION FORMAT 2.6.6 EVEX Encoding of Broadcast/Rounding/SAE Support EVEX.b can provide three types of encoding context, depending on the instruction classes: • Embedded broadcasting of one data element from a source memory operand to the destination for vector instructions with “load+op” semantic. • • Static rounding control overriding MXCSR.RC for floating-point instructions with rounding semantic. “Suppress All exceptions” (SAE) overriding MXCSR mask control for floating-point arithmetic instructions that do not have rounding semantic. 2.6.7 Embedded Broadcast Support in EVEX EVEX encodes an embedded broadcast functionality that is supported on many vector instructions with 32-bit (double word or single-precision floating-point) and 64-bit data elements, and when the source operand is from memory. EVEX.b (P[20]) bit is used to enable broadcast on load-op instructions. When enabled, only one element is loaded from memory and broadcasted to all other elements instead of loading the full memory size. The following instruction classes do not support embedded broadcasting: • • • Instructions with only one scalar result is written to the vector destination. Instructions with explicit broadcast functionality provided by its opcode. Instruction semantic is a pure load or a pure store operation. 2.6.8 Static Rounding Support in EVEX Static rounding control embedded in the EVEX encoding system applies only to register-to-register flavor of floating-point instructions with rounding semantic at two distinct vector lengths: (i) scalar, (ii) 512-bit. In both cases, the field EVEX.L’L expresses rounding mode control overriding MXCSR.RC if EVEX.b is set. When EVEX.b is set, “suppress all exceptions” is implied. The processor behaves as if all MXCSR masking controls are set. 2.6.9 SAE Support in EVEX The EVEX encoding system allows arithmetic floating-point instructions without rounding semantic to be encoded with the SAE attribute. This capability applies to scalar and 512-bit vector lengths, register-to-register only, by setting EVEX.b. When EVEX.b is set, “suppress all exceptions” is implied. The processor behaves as if all MXCSR masking controls are set. 2.6.10 Vector Length Orthogonality The architecture of EVEX encoding scheme can support SIMD instructions operating at multiple vector lengths. Many AVX-512 Foundation instructions operate at 512-bit vector length. The vector length of EVEX encoded vector instructions are generally determined using the L’L field in EVEX prefix, except for 512-bit floating-point, reg-reg instructions with rounding semantic. The table below shows the vector length corresponding to various values of the L’L bits. When EVEX is used to encode scalar instructions, L’L is generally ignored. When EVEX.b bit is set for a register-register instructions with floating-point rounding semantic, the same two bits P2[6:5] specifies rounding mode for the instruction, with implied SAE behavior. The mapping of different instruction classes relative to the embedded broadcast/rounding/SAE control and the EVEX.L’L fields are summarized in Table 2-36. Vol. 2A 2-41 INSTRUCTION FORMAT Table 2-36. EVEX Embedded Broadcast/Rounding/SAE and Vector Length on Vector Instructions Position P2[4] P2[6:5] P2[6:5] Broadcast/Rounding/SAE Context EVEX.b EVEX.L’L EVEX.RC Reg-reg, FP Instructions w/ rounding semantic Enable static rounding control (SAE implied) Vector length Implied (512 bit or scalar) 00b: SAE + RNE 01b: SAE + RD 10b: SAE + RU 11b: SAE + RZ FP Instructions w/o rounding semantic, can cause #XM SAE control NA Load+op Instructions w/ memory source Broadcast Control Other Instructions ( Explicit Load/Store/Broadcast/Gather/Scatter) Must be 0 (otherwise #UD) 00b: 128-bit 01b: 256-bit 10b: 512-bit 11b: Reserved (#UD) 2.6.11 NA NA #UD Equations for EVEX Instructions encoded using EVEX can face three types of UD conditions: state dependent, opcode independent and opcode dependent. 2.6.11.1 State Dependent #UD In general, attempts of execute an instruction, which required OS support for incremental extended state component, will #UD if required state components were not enabled by OS. Table 2-37 lists instruction categories with respect to required processor state components. Attempts to execute a given category of instructions while enabled states were less than the required bit vector in XCR0 shown in Table 2-37 will cause #UD. Table 2-37. OS XSAVE Enabling Requirements of Instruction Categories Vector Register State Access Required XCR0 Bit Vector [7:0] Legacy SIMD prefix encoded Instructions (e.g SSE) XMM xxxxxx11b VEX-encoded instructions operating on YMM YMM xxxxx111b EVEX-encoded 128-bit instructions ZMM 111xx111b EVEX-encoded 256-bit instructions ZMM 111xx111b EVEX-encoded 512-bit instructions ZMM 111xx111b VEX-encoded instructions operating on opmask k-reg 111xxx11b Instruction Categories 2.6.11.2 Opcode Independent #UD A number of bit fields in EVEX encoded instruction must obey mode-specific but opcode-independent patterns listed in Table 2-38. Table 2-38. Opcode Independent, State Dependent EVEX Bit Fields Position Notation 64-bit #UD Non-64-bit #UD P[3 : 2] -- if > 0 if > 0 P[10] -- if 0 if 0 P[1: 0] EVEX.mm if 00b if 00b P[7 : 6] EVEX.RX None (valid) None (BOUND if EVEX.RX != 11b) 2-42 Vol. 2A INSTRUCTION FORMAT 2.6.11.3 Opcode Dependent #UD This section describes legal values for the rest of the EVEX bit fields. Table 2-39 lists the #UD conditions of EVEX prefix bit fields which encodes or modifies register operands. Table 2-39. #UD Conditions of Operand-Encoding EVEX Prefix Bit Fields Notation EVEX.R EVEX.X EVEX.B EVEX.R’ EVEX.vvvv EVEX.V’ Position P[7] P[6] P[5] P[4] P[14 : 11] P[19] Operand Encoding 64-bit #UD Non-64-bit #UD ModRM.reg encodes k-reg if EVEX.R = 0 None (BOUND if EVEX.RX != 11b) ModRM.reg is opcode extension None (ignored) ModRM.reg encodes all other registers None (valid) ModRM.r/m encodes ZMM/YMM/XMM None (valid) ModRM.r/m encodes k-reg or GPR None (ignored) ModRM.r/m without SIB/VSIB None (ignored) ModRM.r/m with SIB/VSIB None (valid) ModRM.r/m encodes k-reg None (ignored) ModRM.r/m encodes other registers None (valid) ModRM.r/m base present None (valid) ModRM.r/m base not present None (ignored) ModRM.reg encodes k-reg or GPR if 0 ModRM.reg is opcode extension None (ignored) ModRM.reg encodes ZMM/YMM/XMM None (valid) vvvv encodes ZMM/YMM/XMM None (valid) None (valid) P[14] ignored Otherwise if != 1111b if != 1111b Encodes ZMM/YMM/XMM None (valid) None (ignored) Otherwise if 0 None (ignored) None (ignored) None (ignored) Table 2-40 lists the #UD conditions of instruction encoding of opmask register using EVEX.aaa and EVEX.z Table 2-40. #UD Conditions of Opmask Related Encoding Field Notation EVEX.aaa EVEX.z VEX.vvvv Position P[18 : 16] P[23] Varies Operand Encoding 64-bit #UD processing1. Non-64-bit #UD if aaa != 000b if aaa != 000b Opmask used as conditional processing mask and updated at completion2. if aaa = 000b if aaa = 000b; Opmask used as conditional processing. None (valid3) None (valid1) Vector instruction using opmask as source or destination4. if EVEX.z != 0 if EVEX.z != 0 Instructions do not use opmask for conditional Store instructions or gather/scatter instructions. if EVEX.z != 0 if EVEX.z != 0 Instruction supporting conditional processing mask with EVEX.aaa = 000b. if EVEX.z != 0 if EVEX.z != 0 K-regs are instruction operands not mask control. if vvvv = 0xxxb None NOTES: 1. E.g., VPBROADCASTMxxx, VPMOVM2x, VPMOVx2M. 2. E.g., Gather/Scatter family. 3. aaa can take any value. A value of 000 indicates that there is no masking on the instruction; in this case, all elements will be processed as if there was a mask of ‘all ones’ regardless of the actual value in K0. Vol. 2A 2-43 INSTRUCTION FORMAT 4. E.g., VFPCLASSPD/PS, VCMPB/D/Q/W family, VPMOVM2x, VPMOVx2M. Table 2-41 lists the #UD conditions of EVEX bit fields that depends on the context of EVEX.b. Table 2-41. #UD Conditions Dependent on EVEX.b Context Notation EVEX.L’Lb Position Operand Encoding P[22 : 20] 64-bit #UD Non-64-bit #UD Reg-reg, FP instructions with rounding semantic. None (valid1) None (valid1) Other reg-reg, FP instructions that can cause #XM. None (valid2) None (valid2) Other reg-mem instructions in Table 2-34. None (valid3) None (valid3) Other instruction classes4 in Table 2-35. If EVEX.b > 0 If EVEX.b > 0 NOTES: 1. L’L specifies rounding control, see Table 2-36, supports {er} syntax. 2. L’L specifies vector length, see Table 2-36, supports {sae} syntax. 3. L’L specifies vector length, see Table 2-36, supports embedded broadcast syntax 4. L’L specifies either vector length or ignored. 2.6.12 Device Not Available EVEX-encoded instructions follow the same rules when it comes to generating #NM (Device Not Available) exception. In particular, it is generated when CR0.TS[bit 3]= 1. 2.6.13 Scalar Instructions EVEX-encoded scalar SIMD instructions can access up to 32 registers in 64-bit mode. Scalar instructions support masking (using the least significant bit of the opmask register), but broadcasting is not supported. 2.7 EXCEPTION CLASSIFICATIONS OF EVEX-ENCODED INSTRUCTIONS The exception behavior of EVEX-encoded instructions can be classified into the classes shown in the rest of this section. The classification of EVEX-encoded instructions follow a similar framework as those of AVX and AVX2 instructions using the VEX prefix. Exception types for EVEX-encoded instructions are named in the style of “E##” or with a suffix “E##XX”. The “##” designation generally follows that of AVX/AVX2 instructions. The majority of EVEX encoded instruction with “Load+op” semantic supports memory fault suppression, which is represented by E##. The instructions with “Load+op” semantic but do not support fault suppression are named “E##NF”. A summary table of exception classes by class names are shown below. Table 2-42. EVEX-Encoded Instruction Exception Class Summary Exception Class Instruction set Mem arg (#XM) Type E1 Vector Moves/Load/Stores Explicitly aligned, w/ fault suppression None Type E1NF Vector Non-temporal Stores Explicitly aligned, no fault suppression None Type E2 FP Vector Load+op Support fault suppression Yes Type E2NF FP Vector Load+op No fault suppression Yes Type E3 FP Scalar/Partial Vector, Load+Op Support fault suppression Yes Type E3NF FP Scalar/Partial Vector, Load+Op No fault suppression Yes Type E4 Integer Vector Load+op Support fault suppression No Type E4NF Integer Vector Load+op No fault suppression No Type E5 Legacy-like Promotion Varies, Support fault suppression No 2-44 Vol. 2A INSTRUCTION FORMAT Table 2-42. EVEX-Encoded Instruction Exception Class Summary Exception Class Instruction set Mem arg (#XM) Type E5NF Legacy-like Promotion Varies, No fault suppression No Type E6 Post AVX Promotion Varies, w/ fault suppression No Type E6NF Post AVX Promotion Varies, no fault suppression No Type E7NM Register-to-register op None None Type E9NF Miscellaneous 128-bit Vector-length Specific, no fault suppression None Type E10 Non-XF Scalar Vector Length ignored, w/ fault suppression None Type E10NF Non-XF Scalar Vector Length ignored, no fault suppression None Type E11 VCVTPH2PS, VCVTPS2PH Half Vector Length, w/ fault suppression Yes Type E12 Gather and Scatter Family VSIB addressing, w/ fault suppression None Type E12NP Gather and Scatter Prefetch Family VSIB addressing, w/o page fault None Table 2-43 lists EVEX-encoded instruction mnemonic by exception classes. Table 2-43. EVEX Instructions in each Exception Class Exception Class Type E1 Type E1NF Instruction VMOVAPD, VMOVAPS, VMOVDQA32, VMOVDQA64 VMOVNTDQ, VMOVNTDQA, VMOVNTPD, VMOVNTPS Type E2 VADDPD, VADDPS, VCMPPD, VCMPPS, VCVTDQ2PS, VCVTPD2DQ, VCVTPD2PS, VCVTPD2QQ, VCVTPD2UQQ, VCVTPD2UDQ, VCVTPS2DQ, VCVTPS2UDQS, VCVTQQ2PD, VCVTQQ2PS, VCVTTPD2DQ, VCVTTPD2QQ, VCVTTPD2UDQ, VCVTTPD2UQQ, VCVTTPS2DQ, VCVTTPS2UDQ, VCVTUDQ2PS, VCVTUQQ2PD, VCVTUQQ2PS, VDIVPD, VDIVPS, VEXP2PD, VEXP2PS, VFIXUPIMMPD, VFIXUPIMMPS, VFMADDxxxPD, VFMADDxxxPS, VFMADDSUBxxxPD, VFMADDSUBxxxPS, VFMSUBADDxxxPD, VFMSUBADDxxxPS, VFMSUBxxxPD, VFMSUBxxxPS, VFNMADDxxxPD, VFNMADDxxxPS, VFNMSUBxxxPD, VFNMSUBxxxPS, VGETEXPPD, VGETEXPPS, VGETMANTPD, VGETMANTPS, VMAXPD, VMAXPS, VMINPD, VMINPS, VMULPD, VMULPS, VRANGEPD, VRANGEPS, VREDUCEPD, VREDUCEPS, VRNDSCALEPD, VRNDSCALEPS, VRCP28PD, VRCP28PS, VRSQRT28PD, VRSQRT28PS, VSCALEFPD, VSCALEFPS, VSQRTPD, VSQRTPS, VSUBPD, VSUBPS Type E3 VADDSD, VADDSS, VCMPSD, VCMPSS, VCVTPS2QQ, VCVTPS2UQQ, VCVTPS2PD, VCVTSD2SS, VCVTSS2SD, VCVTTPS2QQ, VCVTTPS2UQQ, VDIVSD, VDIVSS, VFMADDxxxSD, VFMADDxxxSS, VFMSUBxxxSD, VFMSUBxxxSS, VFNMADDxxxSD, VFNMADDxxxSS, VFNMSUBxxxSD, VFNMSUBxxxSS, VFIXUPIMMSD, VFIXUPIMMSS, VGETEXPSD, VGETEXPSS, VGETMANTSD, VGETMANTSS, VMAXSD, VMAXSS, VMINSD, VMINSS, VMULSD, VMULSS, VRANGESD, VRANGESS, VREDUCESD, VREDUCESS, VRNDSCALESD, VRNDSCALESS, VSCALEFSD, VSCALEFSS, VRCP28SD, VRCP28SS, VRSQRT28SD, VRSQRT28SS, VSQRTSD, VSQRTSS, VSUBSD, VSUBSS Type E3NF Type E4 VCOMISD, VCOMISS, VCVTSD2SI, VCVTSD2USI, VCVTSI2SD, VCVTSI2SS, VCVTSS2SI, VCVTSS2USI, VCVTTSD2SI, VCVTTSD2USI, VCVTTSS2SI, VCVTTSS2USI, VCVTUSI2SD, VCVTUSI2SS, VUCOMISD, VUCOMISS VANDPD, VANDPS, VANDNPD, VANDNPS, VBLENDMPD, VBLENDMPS, VFPCLASSPD, VFPCLASSPS, VORPD, VORPS, VPABSD, VPABSQ, VPADDD, VPADDQ, VPANDD, VPANDQ, VPANDND, VPANDNQ, VPBLENDMB, VPBLENDMD, VPBLENDMQ, VPBLENDMW, VPCMPD, VPCMPEQD, VPCMPEQQ, VPCMPGTD, VPCMPGTQ, VPCMPQ, VPCMPUD, VPCMPUQ, VPLZCNTD, VPLZCNTQ, VPMADD52LUQ, VPMADD52HUQ, VPMAXSD, VPMAXSQ, VPMAXUD, VPMAXUQ, VPMINSD, VPMINSQ, VPMINUD, VPMINUQ, VPMULLD, VPMULLQ, VPMULUDQ, VPMULDQ, VPORD, VPORQ, VPROLD, VPROLQ, VPROLVD, VPROLVQ, VPRORD, VPRORQ, VPRORVD, VPRORVQ, (VPSLLD, VPSLLQ, VPSRAD, VPSRAQ, VPSRAVW, VPSRAVD, VPSRAVW, VPSRAVQ, VPSRLD, VPSRLQ)1, VPSUBD, VPSUBQ, VPSUBUSB, VPSUBUSW, VPTERNLOGD, VPTERNLOGQ, VPTESTMD, VPTESTMQ, VPTESTNMD, VPTESTNMQ, VPXORD, VPXORQ, VPSLLVD, VPSLLVQ, VRCP14PD, VRCP14PS, VRSQRT14PD, VRSQRT14PS, VXORPD, VXORPS Vol. 2A 2-45 INSTRUCTION FORMAT Table 2-43. EVEX Instructions in each Exception Class (Contd.) Exception Class Instruction E4.nb2 VCOMPRESSPD, VCOMPRESSPS, VEXPANDPD, VEXPANDPS, VMOVDQU8, VMOVDQU16, VMOVDQU32, VMOVDQU64, VMOVUPD, VMOVUPS, VPABSB, VPABSW, VPADDB, VPADDW, VPADDSB, VPADDSW, VPADDUSB, VPADDUSW, VPAVGB, VPAVGW, VPCMPB, VPCMPEQB, VPCMPEQW, VPCMPGTB, VPCMPGTW, VPCMPW, VPCMPUB, VPCMPUW, VPCOMPRESSD, VPCOMPRESSQ, VPEXPANDD, VPEXPANDQ, VPMAXSB, VPMAXSW, VPMAXUB, VPMAXUW, VPMINSB, VPMINSW, VPMINUB, VPMINUW, VPMULHRSW, VPMULHUW, VPMULHW, VPMULLW, VPSLLVW, VPSLLW, VPSRAW, VPSRLVW, VPSRLW, VPSUBB, VPSUBW, VPSUBSB, VPSUBSW, VPTESTMB, VPTESTMW, VPTESTNMB, VPTESTNMW Type E4NF VALIGND, VALIGNQ, VPACKSSDW, VPACKUSDW, VPCONFLICTD, VPCONFLICTQ, VPERMD, VPERMI2D, VPERMI2PS, VPERMI2PD, VPERMI2Q, VPERMPD, VPERMPS, VPERMQ, VPERMT2D, VPERMT2PS, VPERMT2Q, VPERMT2PD, VPERMILPD, VPERMILPS, VPMULTISHIFTQB, VPSHUFD, VPUNPCKHDQ, VPUNPCKHQDQ, VPUNPCKLDQ, VPUNPCKLQDQ, VSHUFF32X4, VSHUFF64X2, VSHUFI32X4, VSHUFI64X2, VSHUFPD, VSHUFPS, VUNPCKHPD, VUNPCKHPS, VUNPCKLPD, VUNPCKLPS E4NF.nb2 VDBPSADBW, VPACKSSWB, VPACKUSWB, VPALIGNR, VPMADDWD, VPMADDUBSW, VMOVSHDUP, VMOVSLDUP, VPSADBW, VPSHUFB, VPSHUFHW, VPSHUFLW, VPSLLDQ, VPSRLDQ, VPSLLW, VPSRAW, VPSRLW, (VPSLLD, VPSLLQ, VPSRAD, VPSRAQ, VPSRLD, VPSRLQ)3, VPUNPCKHBW, VPUNPCKHWD, VPUNPCKLBW, VPUNPCKLWD, VPERMW, VPERMI2W, VPERMT2W Type E5 PMOVSXBW, PMOVSXBW, PMOVSXBD, PMOVSXBQ, PMOVSXWD, PMOVSXWQ, PMOVSXDQ, PMOVZXBW, PMOVZXBD, PMOVZXBQ, PMOVZXWD, PMOVZXWQ, PMOVZXDQ, VCVTDQ2PD, VCVTUDQ2PD, VPMOVSXxx, VPMOVZXxx Type E5NF Type E6 Type E6NF Type E7NM.1284 VMOVDDUP VBROADCASTF32X2, VBROADCASTF32X4, VBROADCASTF64X2, VBROADCASTF32X8, VBROADCASTF64X4, VBROADCASTI32X2, VBROADCASTI32X4, VBROADCASTI64X2, VBROADCASTI32X8, VBROADCASTI64X4, VBROADCASTSD, VBROADCASTSS, VFPCLASSSD, VFPCLASSSS, VPBROADCASTB, VPBROADCASTD, VPBROADCASTW, VPBROADCASTQ, VPMOVQB, VPMOVSQB, VPMOVUSQB, VPMOVQW, VPMOVSQW, VPMOVUSQW, VPMOVQD, VPMOVSQD, VPMOVUSQD, VPMOVDB, VPMOVSDB, VPMOVUSDB, VPMOVDW, VPMOVSDW, VPMOVUSDW, VPMOVWB, VPMOVSWB, VPMOVUSWB VEXTRACTF32X4, VEXTRACTF32X8, VEXTRACTF64X2, VEXTRACTF64X4, VEXTRACTI32X4, VEXTRACTI32X8, VEXTRACTI64X2, VEXTRACTI64X4, VINSERTF32X4, VINSERTF32X8, VINSERTF64X2, VINSERTF64X4, VINSERTI32X4, VINSERTI32X8, VINSERTI64X2, VINSERTI64X4, VPBROADCASTMB2Q, VPBROADCASTMW2D VMOVHLPS, VMOVLHPS Type E7NM. (VPBROADCASTD, VPBROADCASTQ, VPBROADCASTB, VPBROADCASTW)5, VPMOVB2M, VPMOVD2M, VPMOVM2B, VPMOVM2D, VPMOVM2Q, VPMOVM2W, VPMOVQ2M, VPMOVW2M Type E9NF VEXTRACTPS, VINSERTPS, VMOVHPD, VMOVHPS, VMOVLPD, VMOVLPS, VMOVD, VMOVQ, VPEXTRB, VPEXTRD, VPEXTRW, VPEXTRQ, VPINSRB, VPINSRD, VPINSRW, VPINSRQ Type E10 VMOVSD, VMOVSS, VRCP14SD, VRCP14SS, VRSQRT14SD, VRSQRT14SS Type E10NF (VCVTSI2SD, VCVTUSI2SD)6 Type E11 VCVTPH2PS, VCVTPS2PH Type E12 VGATHERDPS, VGATHERDPD, VGATHERQPS, VGATHERQPD, VPGATHERDD, VPGATHERDQ, VPGATHERQD, VPGATHERQQ, VPSCATTERDD, VPSCATTERDQ, VPSCATTERQD, VPSCATTERQQ, VSCATTERDPD, VSCATTERDPS, VSCATTERQPD, VSCATTERQPS Type E12NP VGATHERPF0DPD, VGATHERPF0DPS, VGATHERPF0QPD, VGATHERPF0QPS, VGATHERPF1DPD, VGATHERPF1DPS, VGATHERPF1QPD, VGATHERPF1QPS, VSCATTERPF0DPD, VSCATTERPF0DPS, VSCATTERPF0QPD, VSCATTERPF0QPS, VSCATTERPF1DPD, VSCATTERPF1DPS, VSCATTERPF1QPD, VSCATTERPF1QPS NOTES: 1. Operand encoding Full tupletype with immediate. 2. Embedded broadcast is not supported with the “.nb” suffix. 3. Operand encoding Mem128 tupletype. 2-46 Vol. 2A INSTRUCTION FORMAT 4. #UD raised if EVEX.L’L !=00b (VL=128). 5. The source operand is a general purpose register. 6. W0 encoding only. 2.7.1 Exceptions Type E1 and E1NF of EVEX-Encoded Instructions EVEX-encoded instructions with memory alignment restrictions, and supporting memory fault suppression follow exception class E1. X Invalid Opcode, #UD X Device Not Available, #NM X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-44. Type E1 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) X General Protection, #GP(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. X EVEX.512: Memory operand is not 64-byte aligned. EVEX.256: Memory operand is not 32-byte aligned. EVEX.128: Memory operand is not 16-byte aligned. If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X X Page Fault #PF(fault-code) Cause of Exception If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X X If fault suppression not set, and the memory address is in a non-canonical form. X X If fault suppression not set, and a page fault. Vol. 2A 2-47 INSTRUCTION FORMAT EVEX-encoded instructions with memory alignment restrictions, but do not support memory fault suppression follow exception class E1NF. X Invalid Opcode, #UD X Device Not Available, #NM X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-45. Type E1NF Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) X General Protection, #GP(0) For an illegal address in the SS segment. X If a memory address referencing the SS segment is in a non-canonical form. X EVEX.512: Memory operand is not 64-byte aligned. EVEX.256: Memory operand is not 32-byte aligned. EVEX.128: Memory operand is not 16-byte aligned. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X X Page Fault #PF(fault-code) 2-48 Vol. 2A Cause of Exception X X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X For a page fault. INSTRUCTION FORMAT 2.7.2 Exceptions Type E2 of EVEX-Encoded Instructions EVEX-encoded vector instructions with arithmetic semantic follow exception class E2. X X X Invalid Opcode, #UD X Device Not Available, #NM X 64-bit X Protected and Compatibility Virtual 8086 Exception Real Table 2-46. Type E2 Class Exception Conditions Cause of Exception If EVEX prefix present. X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(fault-code) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If EVEX.B=1, alignment checking is enabled, and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. X X X If an unmasked SIMD floating-point exception, {sae} or {er} not set, and CR4.OSXMMEXCPT[bit 10] = 1. SIMD Floatingpoint Exception, #XM X Vol. 2A 2-49 INSTRUCTION FORMAT 2.7.3 Exceptions Type E3 and E3NF of EVEX-Encoded Instructions EVEX-encoded scalar instructions with arithmetic semantic that support memory fault suppression follow exception class E3. X X X Invalid Opcode, #UD X Device Not Available, #NM X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-47. Type E3 Class Exception Conditions Cause of Exception If EVEX prefix present. X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. X X X If an unmasked SIMD floating-point exception, {sae} or {er} not set, and CR4.OSXMMEXCPT[bit 10] = 1. SIMD Floating-point Exception, #XM 2-50 Vol. 2A X INSTRUCTION FORMAT EVEX-encoded scalar instructions with arithmetic semantic that do not support memory fault suppression follow exception class E3NF. X X X Invalid Opcode, #UD Device Not Available, #NM 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-48. Type E3NF Class Exception Conditions Cause of Exception EVEX prefix. X X If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. X X If preceded by a LOCK prefix (F0H). X X X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) For an illegal address in the SS segment. X For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. X X X If an unmasked SIMD floating-point exception, {sae} or {er} not set, and CR4.OSXMMEXCPT[bit 10] = 1. SIMD Floating-point Exception, #XM X Vol. 2A 2-51 INSTRUCTION FORMAT 2.7.4 Exceptions Type E4 and E4NF of EVEX-Encoded Instructions EVEX-encoded vector instructions that cause no SIMD FP exception and support memory fault suppression follow exception class E4. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-49. Type E4 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0 and in E4.nb subclass (see E4.nb entries in Table 2-43). • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X Cause of Exception X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If EVEX.B=1, alignment checking is enabled, and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. 2-52 Vol. 2A INSTRUCTION FORMAT EVEX-encoded vector instructions that do not cause SIMD FP exception nor support memory fault suppression follow exception class E4NF. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-50. Type E4NF Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0 and in E4NF.nb subclass (see E4NF.nb entries in Table 2-43). • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) For an illegal address in the SS segment. X X X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) Page Fault #PF(faultcode) Cause of Exception X X For a page fault. Vol. 2A 2-53 INSTRUCTION FORMAT 2.7.5 Exceptions Type E5 and E5NF EVEX-encoded scalar/partial-vector instructions that cause no SIMD FP exception and support memory fault suppression follow exception class E5. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-51. Type E5 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X Cause of Exception X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. EVEX-encoded scalar/partial vector instructions that do not cause SIMD FP exception nor support memory fault suppression follow exception class E5NF. 2-54 Vol. 2A INSTRUCTION FORMAT X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-52. Type E5NF Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If an illegal address in the SS segment. X X General Protection, #GP(0) Cause of Exception If an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. Vol. 2A 2-55 INSTRUCTION FORMAT 2.7.6 Exceptions Type E6 and E6NF X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-53. Type E6 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X If any corresponding CPUID feature flag is ‘0’. X X If CR0.TS[bit 3]=1. Invalid Opcode, #UD Device Not Available, #NM X Stack, SS(0) General Protection, #GP(0) Cause of Exception If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X If fault suppression not set, and the memory address is in a non-canonical form. Page Fault #PF(faultcode) X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X For 4 or 8 byte memory references if alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. 2-56 Vol. 2A INSTRUCTION FORMAT EVEX-encoded instructions that do not cause SIMD FP exception nor support memory fault suppression follow exception class E6NF. Invalid Opcode, #UD Device Not Available, #NM Stack, SS(0) General Protection, #GP(0) X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-54. Type E6NF Class Exception Conditions Cause of Exception If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X If any corresponding CPUID feature flag is ‘0’. X X If CR0.TS[bit 3]=1. X For an illegal address in the SS segment. X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X If the memory address is in a non-canonical form. Page Fault #PF(faultcode) X X For a page fault. Alignment Check #AC(0) X X For 4 or 8 byte memory references if alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. Vol. 2A 2-57 INSTRUCTION FORMAT 2.7.7 Exceptions Type E7NM EVEX-encoded instructions that cause no SIMD FP exception and do not reference memory follow exception class E7NM. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-55. Type E7NM Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • Instruction specific EVEX.L’L restriction not met. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X If any corresponding CPUID feature flag is ‘0’. X X If CR0.TS[bit 3]=1. Invalid Opcode, #UD X X Device Not Available, #NM 2-58 Vol. 2A X X Cause of Exception INSTRUCTION FORMAT 2.7.8 Exceptions Type E9 and E9NF EVEX-encoded vector or partial-vector instructions that do not cause no SIMD FP exception and support memory fault suppression follow exception class E9. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-56. Type E9 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 00b (VL=128). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X Cause of Exception X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. Vol. 2A 2-59 INSTRUCTION FORMAT EVEX-encoded vector or partial-vector instructions that must be encoded with VEX.L’L = 0, do not cause SIMD FP exception nor support memory fault suppression follow exception class E9NF. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-57. Type E9NF Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 00b (VL=128). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If an illegal address in the SS segment. X X General Protection, #GP(0) Cause of Exception If an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X X If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. 2-60 Vol. 2A INSTRUCTION FORMAT 2.7.9 Exceptions Type E10 and E10NF EVEX-encoded scalar instructions that ignore EVEX.L’L vector length encoding and do not cause no SIMD FP exception, support memory fault suppression follow exception class E10. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-58. Type E10 Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X Cause of Exception X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. Vol. 2A 2-61 INSTRUCTION FORMAT EVEX-encoded scalar instructions that must be encoded with VEX.L’L = 0, do not cause SIMD FP exception nor support memory fault suppression follow exception class E10NF. X 64-bit X Protected and Compatibility Virtual 80x86 Exception Real Table 2-59. Type E10NF Class Exception Conditions If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. Invalid Opcode, #UD X Device Not Available, #NM X Cause of Exception X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. X Stack, SS(0) If fault suppression not set, and an illegal address in the SS segment. X If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X General Protection, #GP(0) X X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X Page Fault #PF(faultcode) X X X If fault suppression not set, and a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. 2-62 Vol. 2A INSTRUCTION FORMAT 2.7.10 Exception Type E11 (EVEX-only, mem arg no AC, floating-point exceptions) EVEX-encoded instructions that can cause SIMD FP exception, memory operand support fault suppression but do not cause #AC follow exception class E11. Invalid Opcode, #UD X X Device Not Available, #NM 64-bit Protected and Compatibility Virtual 80x86 Exception Real Table 2-60. Type E11 Class Exception Conditions X X If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a EVEX prefix. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If CR0.TS[bit 3]=1. Stack, SS(0) X If fault suppression not set, and an illegal address in the SS segment. X General Protection, #GP(0) X X Page Fault #PF (faultcode) X If fault suppression not set, and a memory address referencing the SS segment is in a non-canonical form. If fault suppression not set, and an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X SIMD Floating-Point Exception, #XM Cause of Exception X If fault suppression not set, and the memory address is in a non-canonical form. If fault suppression not set, and any part of the operand lies outside the effective address space from 0 to FFFFH. X X X If fault suppression not set, and a page fault. X X X If an unmasked SIMD floating-point exception, {sae} not set, and CR4.OSXMMEXCPT[bit 10] = 1. Vol. 2A 2-63 INSTRUCTION FORMAT 2.7.11 Exception Type E12 and E12NP (VSIB mem arg, no AC, no floating-point exceptions) Invalid Opcode, #UD X X Device Not Available, #NM 64-bit Protected and Compatibility Virtual 80x86 Exception Real Table 2-61. Type E12 Class Exception Conditions X X If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). • If vvvv != 1111b. X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X NA If address size attribute is 16 bit. X X X X If ModR/M.mod = ‘11b’. X X X X If ModR/M.rm != ‘100b’. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If k0 is used (gather or scatter operation). X X X X If index = destination register (gather operation). X X X X If CR0.TS[bit 3]=1. Stack, SS(0) X For an illegal address in the SS segment. X General Protection, #GP(0) X X 2-64 Vol. 2A X X If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. X Page Fault #PF (faultcode) Cause of Exception If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. X X For a page fault. INSTRUCTION FORMAT EVEX-encoded prefetch instructions that do not cause #PF follow exception class E12NP. Invalid Opcode, #UD X X Device Not Available, #NM 64-bit Protected and Compatibility Virtual 80x86 Exception Real Table 2-62. Type E12NP Class Exception Conditions X X Cause of Exception If EVEX prefix present. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. • Opmask encoding #UD condition of Table 2-40. • If EVEX.b != 0. • If EVEX.L’L != 10b (VL=512). X X If preceded by a LOCK prefix (F0H). X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X X NA If address size attribute is 16 bit. X X X X If ModR/M.mod = ‘11b’. X X X X If ModR/M.rm != ‘100b’. X X X X If any corresponding CPUID feature flag is ‘0’. X X X X If k0 is used (gather or scatter operation). X X X X If CR0.TS[bit 3]=1. Vol. 2A 2-65 INSTRUCTION FORMAT 2.8 EXCEPTION CLASSIFICATIONS OF OPMASK INSTRUCTIONS The exception behavior of VEX-encoded opmask instructions are listed below. Exception conditions of Opmask instructions that do not address memory are listed as Type K20. Device Not Available, #NM 2-66 Vol. 2A X X X X 64-bit X Protected and Compatibility Invalid Opcode, #UD Virtual 80x86 Exception Real Table 2-63. TYPE K20 Exception Definition (VEX-Encoded OpMask Instructions w/o Memory Arg) X X Cause of Exception If relevant CPUID feature flag is ‘0’. If a VEX prefix is present. X X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. X X If any REX, F2, F3, or 66 prefixes precede a VEX prefix. X X If ModRM:[7:6] != 11b. X X If CR0.TS[bit 3]=1. INSTRUCTION FORMAT Exception conditions of Opmask instructions that address memory are listed as Type K21. X X X 64-bit X Protected and Compatibility Invalid Opcode, #UD Virtual 80x86 Exception Real Table 2-64. TYPE K21 Exception Definition (VEX-Encoded OpMask Instructions Addressing Memory) X X X X X X X X X Stack, SS(0) X X X X If CR0.TS[bit 3]=1. If any REX, F2, F3, or 66 prefixes precede a VEX prefix. If a memory address referencing the SS segment is in a non-canonical form. For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector. X X If CR4.OSXSAVE[bit 18]=0. If any one of following conditions applies: • State requirement, Table 2-37 not met. • Opcode independent #UD condition in Table 2-38. • Operand encoding #UD conditions in Table 2-39. For an illegal address in the SS segment. X General Protection, #GP(0) If relevant CPUID feature flag is ‘0’. If a VEX prefix is present. X Device Not Available, #NM Cause of Exception X If the memory address is in a non-canonical form. If any part of the operand lies outside the effective address space from 0 to FFFFH. Page Fault #PF(faultcode) X X X For a page fault. Alignment Check #AC(0) X X X If alignment checking is enabled and an unaligned memory reference of 8 bytes or less is made while the current privilege level is 3. Vol. 2A 2-67 INSTRUCTION FORMAT 2-68 Vol. 2A 7. Updates to Chapter 3, Volume 2A Change bars and green text show changes to Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-L. -----------------------------------------------------------------------------------------Changes to this chapter: Typo corrections/additions/updates to the following instructions: ARPL, CALL, CLFLUSHOPT, CMPXCHG8B/CMPXCHG16B, CPUID, and JMP. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 17 INSTRUCTION SET REFERENCE, A-L ARPL—Adjust RPL Field of Segment Selector Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 63 /r ARPL r/m16, r16 MR N. E. Valid Adjust RPL of r/m16 to not less than RPL of r16. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 MR ModRM:r/m (w) ModRM:reg (r) NA NA Description Compares the RPL fields of two segment selectors. The first operand (the destination operand) contains one segment selector and the second operand (source operand) contains the other. (The RPL field is located in bits 0 and 1 of each operand.) If the RPL field of the destination operand is less than the RPL field of the source operand, the ZF flag is set and the RPL field of the destination operand is increased to match that of the source operand. Otherwise, the ZF flag is cleared and no change is made to the destination operand. (The destination operand can be a word register or a memory location; the source operand must be a word register.) The ARPL instruction is provided for use by operating-system procedures (however, it can also be used by applications). It is generally used to adjust the RPL of a segment selector that has been passed to the operating system by an application program to match the privilege level of the application program. Here the segment selector passed to the operating system is placed in the destination operand and segment selector for the application program’s code segment is placed in the source operand. (The RPL field in the source operand represents the privilege level of the application program.) Execution of the ARPL instruction then ensures that the RPL of the segment selector received by the operating system is no lower (does not have a higher privilege) than the privilege level of the application program (the segment selector for the application program’s code segment can be read from the stack following a procedure call). This instruction executes as described in compatibility mode and legacy mode. It is not encodable in 64-bit mode. See “Checking Caller Access Privileges” in Chapter 3, “Protected-Mode Memory Management,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for more information about the use of this instruction. Operation IF 64-BIT MODE THEN See MOVSXD; ELSE IF DEST[RPL] < SRC[RPL] THEN ZF ← 1; DEST[RPL] ← SRC[RPL]; ELSE ZF ← 0; FI; FI; Flags Affected The ZF flag is set to 1 if the RPL field of the destination operand is less than that of the source operand; otherwise, it is set to 0. 3-76 Vol. 2A ARPL—Adjust RPL Field of Segment Selector INSTRUCTION SET REFERENCE, A-L Protected Mode Exceptions #GP(0) If the destination is located in a non-writable segment. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. #UD If the LOCK prefix is used. Real-Address Mode Exceptions #UD The ARPL instruction is not recognized in real-address mode. If the LOCK prefix is used. Virtual-8086 Mode Exceptions #UD The ARPL instruction is not recognized in virtual-8086 mode. If the LOCK prefix is used. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions Not applicable. ARPL—Adjust RPL Field of Segment Selector Vol. 2A 3-77 INSTRUCTION SET REFERENCE, A-L CALL—Call Procedure Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode E8 cw CALL rel16 D N.S. Valid Call near, relative, displacement relative to next instruction. E8 cd CALL rel32 D Valid Valid Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode. FF /2 CALL r/m16 M N.E. Valid Call near, absolute indirect, address given in r/m16. FF /2 CALL r/m32 M N.E. Valid Call near, absolute indirect, address given in r/m32. FF /2 CALL r/m64 M Valid N.E. Call near, absolute indirect, address given in r/m64. 9A cd CALL ptr16:16 D Invalid Valid Call far, absolute, address given in operand. 9A cp CALL ptr16:32 D Invalid Valid Call far, absolute, address given in operand. FF /3 CALL m16:16 M Valid Valid Call far, absolute indirect address given in m16:16. In 32-bit mode: if selector points to a gate, then RIP = 32-bit zero extended displacement taken from gate; else RIP = zero extended 16-bit offset from far pointer referenced in the instruction. FF /3 CALL m16:32 M Valid Valid In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = zero extended 32-bit offset from far pointer referenced in the instruction. REX.W FF /3 CALL m16:64 M Valid N.E. In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = 64-bit offset from far pointer referenced in the instruction. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 D Offset NA NA NA M ModRM:r/m (r) NA NA NA Description Saves procedure linking information on the stack and branches to the called procedure specified using the target operand. The target operand specifies the address of the first instruction in the called procedure. The operand can be an immediate value, a general-purpose register, or a memory location. This instruction can be used to execute four types of calls: • Near Call — A call to a procedure in the current code segment (the segment currently pointed to by the CS register), sometimes referred to as an intra-segment call. • Far Call — A call to a procedure located in a different segment than the current code segment, sometimes referred to as an inter-segment call. • Inter-privilege-level far call — A far call to a procedure in a segment at a different privilege level than that of the currently executing program or procedure. • Task switch — A call to a procedure located in a different task. The latter two call types (inter-privilege-level call and task switch) can only be executed in protected mode. See “Calling Procedures Using Call and RET” in Chapter 6 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for additional information on near, far, and inter-privilege-level calls. See Chapter 7, “Task Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for information on performing task switches with the CALL instruction. 3-122 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L Near Call. When executing a near call, the processor pushes the value of the EIP register (which contains the offset of the instruction following the CALL instruction) on the stack (for use later as a return-instruction pointer). The processor then branches to the address in the current code segment specified by the target operand. The target operand specifies either an absolute offset in the code segment (an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register; this value points to the instruction following the CALL instruction). The CS register is not changed on near calls. For a near call absolute, an absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16, r/m32, or r/m64). The operand-size attribute determines the size of the target operand (16, 32 or 64 bits). When in 64-bit mode, the operand size for near call (and all near branches) is forced to 64-bits. Absolute offsets are loaded directly into the EIP(RIP) register. If the operand size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits. When accessing an absolute offset indirectly using the stack pointer [ESP] as the base register, the base value used is the value of the ESP before the instruction executes. A relative offset (rel16 or rel32) is generally specified as a label in assembly code. But at the machine code level, it is encoded as a signed, 16- or 32-bit immediate value. This value is added to the value in the EIP(RIP) register. In 64-bit mode the relative offset is always a 32-bit immediate value which is sign extended to 64-bits before it is added to the value in the RIP register for the target calculation. As with absolute offsets, the operand-size attribute determines the size of the target operand (16, 32, or 64 bits). In 64-bit mode the target operand will always be 64bits because the operand size is forced to 64-bits for near branches. Far Calls in Real-Address or Virtual-8086 Mode. When executing a far call in real- address or virtual-8086 mode, the processor pushes the current value of both the CS and EIP registers on the stack for use as a return-instruction pointer. The processor then performs a “far branch” to the code segment and offset specified with the target operand for the called procedure. The target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). With the pointer method, the segment and offset of the called procedure is encoded in the instruction using a 4-byte (16-bit operand size) or 6byte (32-bit operand size) far address immediate. With the indirect method, the target operand specifies a memory location that contains a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address. The operand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The far address is loaded directly into the CS and EIP registers. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared. Far Calls in Protected Mode. When the processor is operating in protected mode, the CALL instruction can be used to perform the following types of far calls: • • • Far call to the same privilege level Far call to a different privilege level (inter-privilege level call) Task switch (far call to another task) In protected mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate, task gate, or TSS) and access rights determine the type of call operation to be performed. If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in protected mode is very similar to one carried out in real-address or virtual-8086 mode. The target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). The operand- size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its descriptor are loaded into CS register; the offset from the instruction is loaded into the EIP register. A call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. Using this mechanism provides an extra level of indirection and is the preferred method of making calls between 16-bit and 32-bit code segments. When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a call gate. The segment selector specified by the target operand identifies the call gate. The target operand can specify the call gate segment selector either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the call gate descriptor. (The offset from the target operand is ignored when a call gate is used.) CALL—Call Procedure Vol. 2A 3-123 INSTRUCTION SET REFERENCE, A-L On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch. (Note that when using a call gate to perform a far call to a segment at the same privilege level, no stack switch occurs.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedure’s stack, an optional set of parameters from the calling procedures stack, and the segment selector and instruction pointer for the calling procedure’s code segment. (A value in the call gate descriptor determines how many parameters to copy to the new stack.) Finally, the processor branches to the address of the procedure being called within the new code segment. Executing a task switch with the CALL instruction is similar to executing a call through a call gate. The target operand specifies the segment selector of the task gate for the new task activated by the switch (the offset in the target operand is ignored). The task gate in turn points to the TSS for the new task, which contains the segment selectors for the task’s code and stack segments. Note that the TSS also contains the EIP value for the next instruction that was to be executed before the calling task was suspended. This instruction pointer value is loaded into the EIP register to re-start the calling task. The CALL instruction can also specify the segment selector of the TSS directly, which eliminates the indirection of the task gate. See Chapter 7, “Task Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for information on the mechanics of a task switch. When you execute at task switch with a CALL instruction, the nested task flag (NT) is set in the EFLAGS register and the new TSS’s previous task link field is loaded with the old task’s TSS selector. Code is expected to suspend this nested task by executing an IRET instruction which, because the NT flag is set, automatically uses the previous task link to return to the calling task. (See “Task Linking” in Chapter 7 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for information on nested tasks.) Switching tasks with the CALL instruction differs in this regard from JMP instruction. JMP does not set the NT flag and therefore does not expect an IRET instruction to suspend the task. Mixing 16-Bit and 32-Bit Calls. When making far calls between 16-bit and 32-bit code segments, use a call gate. If the far call is from a 32-bit code segment to a 16-bit code segment, the call should be made from the first 64 KBytes of the 32-bit code segment. This is because the operand-size attribute of the instruction is set to 16, so only a 16-bit return address offset can be saved. Also, the call should be made using a 16-bit call gate so that 16-bit values can be pushed on the stack. See Chapter 21, “Mixing 16-Bit and 32-Bit Code,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, for more information. Far Calls in Compatibility Mode. When the processor is operating in compatibility mode, the CALL instruction can be used to perform the following types of far calls: • • • Far call to the same privilege level, remaining in compatibility mode Far call to the same privilege level, transitioning to 64-bit mode Far call to a different privilege level (inter-privilege level call), transitioning to 64-bit mode Note that a CALL instruction can not be used to cause a task switch in compatibility mode since task switches are not supported in IA-32e mode. In compatibility mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine the type of call operation to be performed. If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in compatibility mode is very similar to one carried out in protected mode. The target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). The operand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its descriptor are loaded into CS register and the offset from the instruction is loaded into the EIP register. The difference is that 64-bit mode may be entered. This specified by the L bit in the new code segment descriptor. Note that a 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. However, using this mechanism requires that the target code segment descriptor have the L bit set, causing an entry to 64-bit mode. When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target 3-124 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L operand can specify the call gate segment selector either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the 16-byte call gate descriptor. (The offset from the target operand is ignored when a call gate is used.) On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch. (Note that when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0, the limit is ignored, and the default stack size is 64-bits. The full value of RSP is used for the offset, of which the upper 32-bits are undefined.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedure’s stack and the segment selector and instruction pointer for the calling procedure’s code segment. (Parameter copy is not supported in IA-32e mode.) Finally, the processor branches to the address of the procedure being called within the new code segment. Near/(Far) Calls in 64-bit Mode. When the processor is operating in 64-bit mode, the CALL instruction can be used to perform the following types of far calls: • • • Far call to the same privilege level, transitioning to compatibility mode Far call to the same privilege level, remaining in 64-bit mode Far call to a different privilege level (inter-privilege level call), remaining in 64-bit mode Note that in this mode the CALL instruction can not be used to cause a task switch in 64-bit mode since task switches are not supported in IA-32e mode. In 64-bit mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine the type of call operation to be performed. If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in 64-bit mode is very similar to one carried out in compatibility mode. The target operand specifies an absolute far address indirectly with a memory location (m16:16, m16:32 or m16:64). The form of CALL with a direct specification of absolute far address is not defined in 64-bit mode. The operand-size attribute determines the size of the offset (16, 32, or 64 bits) in the far address. The new code segment selector and its descriptor are loaded into the CS register; the offset from the instruction is loaded into the EIP register. The new code segment may specify entry either into compatibility or 64-bit mode, based on the L bit value. A 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. However, using this mechanism requires that the target code segment descriptor have the L bit set. When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target operand can only specify the call gate segment selector indirectly with a memory location (m16:16, m16:32 or m16:64). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the 16-byte call gate descriptor. (The offset from the target operand is ignored when a call gate is used.) On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch. Note that when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0, the limit is ignored, and the default stack size is 64-bits. (The full value of RSP is used for the offset.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedure’s stack and the segment selector and instruction pointer for the calling procedure’s code segment. (Parameter copy is not supported in IA-32e mode.) Finally, the processor branches to the address of the procedure being called within the new code segment. CALL—Call Procedure Vol. 2A 3-125 INSTRUCTION SET REFERENCE, A-L Instruction ordering. Instructions following a far call may be fetched from memory before earlier instructions complete execution, but they will not execute (even speculatively) until all instructions prior to the far call have completed execution (the later instructions may execute before data stored by the earlier instructions have become globally visible). Certain situations may lead to the next sequential instruction after a near indirect CALL being speculatively executed. If software needs to prevent this (e.g., in order to prevent a speculative execution side channel), then an INT3 or LFENCE instruction opcode can be placed after the near indirect CALL in order to block speculative execution. Operation IF near call THEN IF near relative call THEN IF OperandSize = 64 THEN tempDEST ← SignExtend(DEST); (* DEST is rel32 *) tempRIP ← RIP + tempDEST; IF stack not large enough for a 8-byte return address THEN #SS(0); FI; Push(RIP); RIP ← tempRIP; FI; IF OperandSize = 32 THEN tempEIP ← EIP + DEST; (* DEST is rel32 *) IF tempEIP is not within code segment limit THEN #GP(0); FI; IF stack not large enough for a 4-byte return address THEN #SS(0); FI; Push(EIP); EIP ← tempEIP; FI; IF OperandSize = 16 THEN tempEIP ← (EIP + DEST) AND 0000FFFFH; (* DEST is rel16 *) IF tempEIP is not within code segment limit THEN #GP(0); FI; IF stack not large enough for a 2-byte return address THEN #SS(0); FI; Push(IP); EIP ← tempEIP; FI; ELSE (* Near absolute call *) IF OperandSize = 64 THEN tempRIP ← DEST; (* DEST is r/m64 *) IF stack not large enough for a 8-byte return address THEN #SS(0); FI; Push(RIP); RIP ← tempRIP; FI; IF OperandSize = 32 THEN tempEIP ← DEST; (* DEST is r/m32 *) IF tempEIP is not within code segment limit THEN #GP(0); FI; IF stack not large enough for a 4-byte return address 3-126 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L THEN #SS(0); FI; Push(EIP); EIP ← tempEIP; FI; IF OperandSize = 16 THEN tempEIP ← DEST AND 0000FFFFH; (* DEST is r/m16 *) IF tempEIP is not within code segment limit THEN #GP(0); FI; IF stack not large enough for a 2-byte return address THEN #SS(0); FI; Push(IP); EIP ← tempEIP; FI; FI;rel/abs FI; near IF far call and (PE = 0 or (PE = 1 and VM = 1)) (* Real-address or virtual-8086 mode *) THEN IF OperandSize = 32 THEN IF stack not large enough for a 6-byte return address THEN #SS(0); FI; IF DEST[31:16] is not zero THEN #GP(0); FI; Push(CS); (* Padded with 16 high-order bits *) Push(EIP); CS ← DEST[47:32]; (* DEST is ptr16:32 or [m16:32] *) EIP ← DEST[31:0]; (* DEST is ptr16:32 or [m16:32] *) ELSE (* OperandSize = 16 *) IF stack not large enough for a 4-byte return address THEN #SS(0); FI; Push(CS); Push(IP); CS ← DEST[31:16]; (* DEST is ptr16:16 or [m16:16] *) EIP ← DEST[15:0]; (* DEST is ptr16:16 or [m16:16]; clear upper 16 bits *) FI; FI; IF far call and (PE = 1 and VM = 0) (* Protected mode or IA-32e Mode, not virtual-8086 mode*) THEN IF segment selector in target operand NULL THEN #GP(0); FI; IF segment selector index not within descriptor table limits THEN #GP(new code segment selector); FI; Read type and access rights of selected segment descriptor; IF IA32_EFER.LMA = 0 THEN IF segment type is not a conforming or nonconforming code segment, call gate, task gate, or TSS THEN #GP(segment selector); FI; ELSE IF segment type is not a conforming or nonconforming code segment or 64-bit call gate, THEN #GP(segment selector); FI; FI; CALL—Call Procedure Vol. 2A 3-127 INSTRUCTION SET REFERENCE, A-L Depending on type and access rights: GO TO CONFORMING-CODE-SEGMENT; GO TO NONCONFORMING-CODE-SEGMENT; GO TO CALL-GATE; GO TO TASK-GATE; GO TO TASK-STATE-SEGMENT; FI; CONFORMING-CODE-SEGMENT: IF L bit = 1 and D bit = 1 and IA32_EFER.LMA = 1 THEN GP(new code segment selector); FI; IF DPL > CPL THEN #GP(new code segment selector); FI; IF segment not present THEN #NP(new code segment selector); FI; IF stack not large enough for return address THEN #SS(0); FI; tempEIP ← DEST(Offset); IF target mode = Compatibility mode THEN tempEIP ← tempEIP AND 00000000_FFFFFFFFH; FI; IF OperandSize = 16 THEN tempEIP ← tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *) IF (EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code segment limit) THEN #GP(0); FI; IF tempEIP is non-canonical THEN #GP(0); FI; IF OperandSize = 32 THEN Push(CS); (* Padded with 16 high-order bits *) Push(EIP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; EIP ← tempEIP; ELSE IF OperandSize = 16 THEN Push(CS); Push(IP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; EIP ← tempEIP; ELSE (* OperandSize = 64 *) Push(CS); (* Padded with 48 high-order bits *) Push(RIP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; RIP ← tempEIP; FI; FI; 3-128 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L END; NONCONFORMING-CODE-SEGMENT: IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1 THEN GP(new code segment selector); FI; IF (RPL > CPL) or (DPL ≠ CPL) THEN #GP(new code segment selector); FI; IF segment not present THEN #NP(new code segment selector); FI; IF stack not large enough for return address THEN #SS(0); FI; tempEIP ← DEST(Offset); IF target mode = Compatibility mode THEN tempEIP ← tempEIP AND 00000000_FFFFFFFFH; FI; IF OperandSize = 16 THEN tempEIP ← tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *) IF (EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code segment limit) THEN #GP(0); FI; IF tempEIP is non-canonical THEN #GP(0); FI; IF OperandSize = 32 THEN Push(CS); (* Padded with 16 high-order bits *) Push(EIP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; EIP ← tempEIP; ELSE IF OperandSize = 16 THEN Push(CS); Push(IP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; EIP ← tempEIP; ELSE (* OperandSize = 64 *) Push(CS); (* Padded with 48 high-order bits *) Push(RIP); CS ← DEST(CodeSegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; RIP ← tempEIP; FI; FI; END; CALL-GATE: IF call gate (DPL < CPL) or (RPL > DPL) THEN #GP(call-gate selector); FI; IF call gate not present THEN #NP(call-gate selector); FI; CALL—Call Procedure Vol. 2A 3-129 INSTRUCTION SET REFERENCE, A-L IF call-gate code-segment selector is NULL THEN #GP(0); FI; IF call-gate code-segment selector index is outside descriptor table limits THEN #GP(call-gate code-segment selector); FI; Read call-gate code-segment descriptor; IF call-gate code-segment descriptor does not indicate a code segment or call-gate code-segment descriptor DPL > CPL THEN #GP(call-gate code-segment selector); FI; IF IA32_EFER.LMA = 1 AND (call-gate code-segment descriptor is not a 64-bit code segment or call-gate code-segment descriptor has both L-bit and D-bit set) THEN #GP(call-gate code-segment selector); FI; IF call-gate code segment not present THEN #NP(call-gate code-segment selector); FI; IF call-gate code segment is non-conforming and DPL < CPL THEN go to MORE-PRIVILEGE; ELSE go to SAME-PRIVILEGE; FI; END; MORE-PRIVILEGE: IF current TSS is 32-bit THEN TSSstackAddress ← (new code-segment DPL ∗ 8) + 4; IF (TSSstackAddress + 5) > current TSS limit THEN #TS(current TSS selector); FI; NewSS ← 2 bytes loaded from (TSS base + TSSstackAddress + 4); NewESP ← 4 bytes loaded from (TSS base + TSSstackAddress); ELSE IF current TSS is 16-bit THEN TSSstackAddress ← (new code-segment DPL ∗ 4) + 2 IF (TSSstackAddress + 3) > current TSS limit THEN #TS(current TSS selector); FI; NewSS ← 2 bytes loaded from (TSS base + TSSstackAddress + 2); NewESP ← 2 bytes loaded from (TSS base + TSSstackAddress); ELSE (* current TSS is 64-bit *) TSSstackAddress ← (new code-segment DPL ∗ 8) + 4; IF (TSSstackAddress + 7) > current TSS limit THEN #TS(current TSS selector); FI; NewSS ← new code-segment DPL; (* NULL selector with RPL = new CPL *) NewRSP ← 8 bytes loaded from (current TSS base + TSSstackAddress); FI; FI; IF IA32_EFER.LMA = 0 and NewSS is NULL THEN #TS(NewSS); FI; Read new stack-segment descriptor; IF IA32_EFER.LMA = 0 and (NewSS RPL ≠ new code-segment DPL or new stack-segment DPL ≠ new code-segment DPL or new stack segment is not a writable data segment) THEN #TS(NewSS); FI IF IA32_EFER.LMA = 0 and new stack segment not present THEN #SS(NewSS); FI; IF CallGateSize = 32 THEN 3-130 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L IF new stack does not have room for parameters plus 16 bytes THEN #SS(NewSS); FI; IF CallGate(InstructionPointer) not within new code-segment limit THEN #GP(0); FI; SS ← newSS; (* Segment descriptor information also loaded *) ESP ← newESP; CS:EIP ← CallGate(CS:InstructionPointer); (* Segment descriptor information also loaded *) Push(oldSS:oldESP); (* From calling procedure *) temp ← parameter count from call gate, masked to 5 bits; Push(parameters from calling procedure’s stack, temp) Push(oldCS:oldEIP); (* Return address to calling procedure *) ELSE IF CallGateSize = 16 THEN IF new stack does not have room for parameters plus 8 bytes THEN #SS(NewSS); FI; IF (CallGate(InstructionPointer) AND FFFFH) not in new code-segment limit THEN #GP(0); FI; SS ← newSS; (* Segment descriptor information also loaded *) ESP ← newESP; CS:IP ← CallGate(CS:InstructionPointer); (* Segment descriptor information also loaded *) Push(oldSS:oldESP); (* From calling procedure *) temp ← parameter count from call gate, masked to 5 bits; Push(parameters from calling procedure’s stack, temp) Push(oldCS:oldEIP); (* Return address to calling procedure *) ELSE (* CallGateSize = 64 *) IF pushing 32 bytes on the stack would use a non-canonical address THEN #SS(NewSS); FI; IF (CallGate(InstructionPointer) is non-canonical) THEN #GP(0); FI; SS ← NewSS; (* NewSS is NULL) RSP ← NewESP; CS:IP ← CallGate(CS:InstructionPointer); (* Segment descriptor information also loaded *) Push(oldSS:oldESP); (* From calling procedure *) Push(oldCS:oldEIP); (* Return address to calling procedure *) FI; FI; CPL ← CodeSegment(DPL) CS(RPL) ← CPL END; SAME-PRIVILEGE: IF CallGateSize = 32 THEN IF stack does not have room for 8 bytes THEN #SS(0); FI; IF CallGate(InstructionPointer) not within code segment limit THEN #GP(0); FI; CS:EIP ← CallGate(CS:EIP) (* Segment descriptor information also loaded *) Push(oldCS:oldEIP); (* Return address to calling procedure *) ELSE CALL—Call Procedure Vol. 2A 3-131 INSTRUCTION SET REFERENCE, A-L If CallGateSize = 16 THEN IF stack does not have room for 4 bytes THEN #SS(0); FI; IF CallGate(InstructionPointer) not within code segment limit THEN #GP(0); FI; CS:IP ← CallGate(CS:instruction pointer); (* Segment descriptor information also loaded *) Push(oldCS:oldIP); (* Return address to calling procedure *) ELSE (* CallGateSize = 64) IF pushing 16 bytes on the stack touches non-canonical addresses THEN #SS(0); FI; IF RIP non-canonical THEN #GP(0); FI; CS:IP ← CallGate(CS:instruction pointer); (* Segment descriptor information also loaded *) Push(oldCS:oldIP); (* Return address to calling procedure *) FI; FI; CS(RPL) ← CPL END; TASK-GATE: IF task gate DPL < CPL or RPL THEN #GP(task gate selector); FI; IF task gate not present THEN #NP(task gate selector); FI; Read the TSS segment selector in the task-gate descriptor; IF TSS segment selector local/global bit is set to local or index not within GDT limits THEN #GP(TSS selector); FI; Access TSS descriptor in GDT; IF descriptor is not a TSS segment THEN #GP(TSS selector); FI; IF TSS descriptor specifies that the TSS is busy THEN #GP(TSS selector); FI; IF TSS not present THEN #NP(TSS selector); FI; SWITCH-TASKS (with nesting) to TSS; IF EIP not within code segment limit THEN #GP(0); FI; END; TASK-STATE-SEGMENT: IF TSS DPL < CPL or RPL or TSS descriptor indicates TSS not available THEN #GP(TSS selector); FI; IF TSS is not present THEN #NP(TSS selector); FI; SWITCH-TASKS (with nesting) to TSS; IF EIP not within code segment limit THEN #GP(0); FI; END; 3-132 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L Flags Affected All flags are affected if a task switch occurs; no flags are affected if a task switch does not occur. Protected Mode Exceptions #GP(0) If the target offset in destination operand is beyond the new code segment limit. If the segment selector in the destination operand is NULL. If the code segment selector in the gate is NULL. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector. #GP(selector) If a code segment or gate or TSS selector index is outside descriptor table limits. If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, call gate, task gate, or task state segment. If the DPL for a nonconforming-code segment is not equal to the CPL or the RPL for the segment’s segment selector is greater than the CPL. If the DPL for a conforming-code segment is greater than the CPL. If the DPL from a call-gate, task-gate, or TSS segment descriptor is less than the CPL or than the RPL of the call-gate, task-gate, or TSS’s segment selector. If the segment descriptor for a segment selector from a call gate does not indicate it is a code segment. If the segment selector from a call gate is beyond the descriptor table limits. If the DPL for a code-segment obtained from a call gate is greater than the CPL. If the segment selector for a TSS has its local/global bit set for local. If a TSS segment descriptor specifies that the TSS is busy or not available. #SS(0) If pushing the return address, parameters, or stack segment pointer onto the stack exceeds the bounds of the stack segment, when no stack switch occurs. If a memory operand effective address is outside the SS segment limit. #SS(selector) If pushing the return address, parameters, or stack segment pointer onto the stack exceeds the bounds of the stack segment, when a stack switch occurs. If the SS register is being loaded as part of a stack switch and the segment pointed to is marked not present. If stack segment does not have room for the return address, parameters, or stack segment pointer, when stack switch occurs. #NP(selector) If a code segment, data segment, call gate, task gate, or TSS is not present. #TS(selector) If the new stack segment selector and ESP are beyond the end of the TSS. If the new stack segment selector is NULL. If the RPL of the new stack segment selector in the TSS is not equal to the DPL of the code segment being accessed. If DPL of the stack segment descriptor for the new stack segment is not equal to the DPL of the code segment descriptor. If the new stack segment is not a writable data segment. If segment-selector index for stack segment is outside descriptor table limits. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. #UD If the LOCK prefix is used. CALL—Call Procedure Vol. 2A 3-133 INSTRUCTION SET REFERENCE, A-L Real-Address Mode Exceptions #GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the target offset is beyond the code segment limit. #UD If the LOCK prefix is used. Virtual-8086 Mode Exceptions #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the target offset is beyond the code segment limit. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made. #UD If the LOCK prefix is used. Compatibility Mode Exceptions Same exceptions as in protected mode. #GP(selector) If a memory address accessed by the selector is in non-canonical space. #GP(0) If the target offset in the destination operand is non-canonical. 64-Bit Mode Exceptions #GP(0) If a memory address is non-canonical. If target offset in destination operand is non-canonical. If the segment selector in the destination operand is NULL. If the code segment selector in the 64-bit gate is NULL. #GP(selector) If code segment or 64-bit call gate is outside descriptor table limits. If code segment or 64-bit call gate overlaps non-canonical space. If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, or 64-bit call gate. If the segment descriptor pointed to by the segment selector in the destination operand is a code segment and has both the D-bit and the L- bit set. If the DPL for a nonconforming-code segment is not equal to the CPL, or the RPL for the segment’s segment selector is greater than the CPL. If the DPL for a conforming-code segment is greater than the CPL. If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate. If the upper type field of a 64-bit call gate is not 0x0. If the segment selector from a 64-bit call gate is beyond the descriptor table limits. If the DPL for a code-segment obtained from a 64-bit call gate is greater than the CPL. If the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the Lbit set and the D-bit clear. If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it is a code segment. #SS(0) If pushing the return offset or CS selector onto the stack exceeds the bounds of the stack segment when no stack switch occurs. If a memory operand effective address is outside the SS segment limit. If the stack address is in a non-canonical form. #SS(selector) If pushing the old values of SS selector, stack pointer, EFLAGS, CS selector, offset, or error code onto the stack violates the canonical boundary when a stack switch occurs. #NP(selector) If a code segment or 64-bit call gate is not present. #TS(selector) If the load of the new RSP exceeds the limit of the TSS. #UD (64-bit mode only) If a far call is direct to an absolute address in memory. If the LOCK prefix is used. 3-134 Vol. 2A CALL—Call Procedure INSTRUCTION SET REFERENCE, A-L #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. CALL—Call Procedure Vol. 2A 3-135 INSTRUCTION SET REFERENCE, A-L CLFLUSHOPT—Flush Cache Line Optimized Opcode / Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode NFx 66 0F AE /7 M Valid Valid Flushes cache line containing m8. CLFLUSHOPT m8 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Invalidates from every level of the cache hierarchy in the cache coherence domain the cache line that contains the linear address specified with the memory operand. If that cache line contains modified data at any level of the cache hierarchy, that data is written back to memory. The source operand is a byte memory location. The availability of CLFLUSHOPT is indicated by the presence of the CPUID feature flag CLFLUSHOPT (CPUID.(EAX=7,ECX=0):EBX[bit 23]). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1). The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It should be noted that processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCHh instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references the cache line). Executions of the CLFLUSHOPT instruction are ordered with respect to fence instructions and to locked readmodify-write instructions; they are also ordered with respect to the following accesses to the cache line being invalidated: older writes and older executions of CLFLUSH. They are not ordered with respect to writes, executions of CLFLUSH that access other cache lines, or executions of CLFLUSHOPT regardless of cache line; to enforce CLFLUSHOPT ordering with any write, CLFLUSH, or CLFLUSHOPT operation, software can insert an SFENCE instruction between CLFLUSHOPT and that operation. The CLFLUSHOPT instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSHOPT instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSHOPT instruction sets the A bit but not the D bit in the page tables. In some implementations, the CLFLUSHOPT instruction may always cause transactional abort with Transactional Synchronization Extensions (TSX). The CLFLUSHOPT instruction is not expected to be commonly used inside typical transactional regions. However, programmers must not rely on CLFLUSHOPT instruction to force a transactional abort, since whether they cause transactional abort is implementation dependent. CLFLUSHOPT operation is the same in non-64-bit modes and 64-bit mode. Operation Flush_Cache_Line_Optimized(SRC); Intel C/C++ Compiler Intrinsic Equivalents CLFLUSHOPT:void _mm_clflushopt(void const *p) 3-144 Vol. 2A CLFLUSHOPT—Flush Cache Line Optimized INSTRUCTION SET REFERENCE, A-L Protected Mode Exceptions #GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. #SS(0) For an illegal address in the SS segment. #PF(fault-code) For a page fault. #UD If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0. If the LOCK prefix is used. If an instruction prefix F2H or F3H is used. Real-Address Mode Exceptions #GP #UD If any part of the operand lies outside the effective address space from 0 to FFFFH. If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0. If the LOCK prefix is used. If an instruction prefix F2H or F3H is used. Virtual-8086 Mode Exceptions Same exceptions as in real address mode. #PF(fault-code) For a page fault. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #GP(0) If the memory address is in a non-canonical form. #PF(fault-code) For a page fault. #UD If CPUID.(EAX=7,ECX=0):EBX.CLFLUSHOPT[bit 23] = 0. If the LOCK prefix is used. If an instruction prefix F2H or F3H is used. CLFLUSHOPT—Flush Cache Line Optimized Vol. 2A 3-145 INSTRUCTION SET REFERENCE, A-L CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes Opcode/ Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F C7 /1 M Valid Valid* Compare EDX:EAX with m64. If equal, set ZF and load ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX. M Valid N.E. Compare RDX:RAX with m128. If equal, set ZF and load RCX:RBX into m128. Else, clear ZF and load m128 into RDX:RAX. CMPXCHG8B m64 REX.W + 0F C7 /1 CMPXCHG16B m128 NOTES: *See IA-32 Architecture Compatibility section below. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r, w) NA NA NA Description Compares the 64-bit value in EDX:EAX (or 128-bit value in RDX:RAX if operand size is 128 bits) with the operand (destination operand). If the values are equal, the 64-bit value in ECX:EBX (or 128-bit value in RCX:RBX) is stored in the destination operand. Otherwise, the value in the destination operand is loaded into EDX:EAX (or RDX:RAX). The destination operand is an 8-byte memory location (or 16-byte memory location if operand size is 128 bits). For the EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order 32 bits and EAX and EBX contain the low-order 32 bits of a 64-bit value. For the RDX:RAX and RCX:RBX register pairs, RDX and RCX contain the highorder 64 bits and RAX and RBX contain the low-order 64bits of a 128-bit value. This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processor’s bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.) In 64-bit mode, default operation size is 64 bits. Use of the REX.W prefix promotes operation to 128 bits. Note that CMPXCHG16B requires that the destination (memory) operand be 16-byte aligned. See the summary chart at the beginning of this section for encoding data and limits. For information on the CPUID flag that indicates CMPXCHG16B, see page 3-213. IA-32 Architecture Compatibility This instruction encoding is not supported on Intel processors earlier than the Pentium processors. 3-186 Vol. 2A CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes INSTRUCTION SET REFERENCE, A-L Operation IF (64-Bit Mode and OperandSize = 64) THEN TEMP128 ← DEST IF (RDX:RAX = TEMP128) THEN ZF ← 1; DEST ← RCX:RBX; ELSE ZF ← 0; RDX:RAX ← TEMP128; DEST ← TEMP128; FI; FI ELSE TEMP64 ← DEST; IF (EDX:EAX = TEMP64) THEN ZF ← 1; DEST ← ECX:EBX; ELSE ZF ← 0; EDX:EAX ← TEMP64; DEST ← TEMP64; FI; FI; FI; Flags Affected The ZF flag is set if the destination operand and EDX:EAX are equal; otherwise it is cleared. The CF, PF, AF, SF, and OF flags are unaffected. Protected Mode Exceptions #UD If the destination is not a memory operand. #GP(0) If the destination is located in a non-writable segment. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains a NULL segment selector. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. Real-Address Mode Exceptions #UD If the destination operand is not a memory location. #GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. #SS If a memory operand effective address is outside the SS segment limit. CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes Vol. 2A 3-187 INSTRUCTION SET REFERENCE, A-L Virtual-8086 Mode Exceptions #UD If the destination operand is not a memory location. #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #SS(0) #GP(0) If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If memory operand for CMPXCHG16B is not aligned on a 16-byte boundary. If CPUID.01H:ECX.CMPXCHG16B[bit 13] = 0. #UD If the destination operand is not a memory location. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. 3-188 Vol. 2A CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes INSTRUCTION SET REFERENCE, A-L CPUID—CPU Identification Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F A2 CPUID ZO Valid Valid Returns processor identification and feature information to the EAX, EBX, ECX, and EDX registers, as determined by input entered in EAX (in some cases, ECX as well). Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA Description The ID flag (bit 21) in the EFLAGS register indicates support for the CPUID instruction. If a software procedure can set and clear this flag, the processor executing the procedure supports the CPUID instruction. This instruction operates the same in non-64-bit modes and 64-bit mode. CPUID returns processor identification and feature information in the EAX, EBX, ECX, and EDX registers.1 The instruction’s output is dependent on the contents of the EAX register upon execution (in some cases, ECX as well). For example, the following pseudocode loads EAX with 00H and causes CPUID to return a Maximum Return Value and the Vendor Identification String in the appropriate registers: MOV EAX, 00H CPUID Table 3-8 shows information returned, depending on the initial value loaded into the EAX register. Two types of information are returned: basic and extended function information. If a value entered for CPUID.EAX is higher than the maximum input value for basic or extended function for that processor then the data for the highest basic information leaf is returned. For example, using some Intel processors, the following is true: CPUID.EAX = 05H (* Returns MONITOR/MWAIT leaf. *) CPUID.EAX = 0AH (* Returns Architectural Performance Monitoring leaf. *) CPUID.EAX = 0BH (* Returns Extended Topology Enumeration leaf. *)2 CPUID.EAX =1FH (* Returns V2 Extended Topology Enumeration leaf. *)2 CPUID.EAX = 80000008H (* Returns linear/physical address size data. *) CPUID.EAX = 8000000AH (* INVALID: Returns same information as CPUID.EAX = 0BH. *) If a value entered for CPUID.EAX is less than or equal to the maximum input value and the leaf is not supported on that processor then 0 is returned in all the registers. When CPUID returns the highest basic leaf information as a result of an invalid input EAX value, any dependence on input ECX value in the basic leaf is honored. CPUID can be executed at any privilege level to serialize instruction execution. Serializing instruction execution guarantees that any modifications to flags, registers, and memory for previous instructions are completed before the next instruction is fetched and executed. See also: “Serializing Instructions” in Chapter 8, “Multiple-Processor Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. “Caching Translation Information” in Chapter 4, “Paging,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 1. On Intel 64 processors, CPUID clears the high 32 bits of the RAX/RBX/RCX/RDX registers in all modes. 2. CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of CPUID leaf 1FH before using leaf 0BH. CPUID—CPU Identification Vol. 2A 3-193 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction Initial EAX Value Information Provided about the Processor Basic CPUID Information 0H 01H EAX Maximum Input Value for Basic CPUID Information. EBX “Genu” ECX “ntel” EDX “ineI” EAX Version Information: Type, Family, Model, and Stepping ID (see Figure 3-6). EBX Bits 07 - 00: Brand Index. Bits 15 - 08: CLFLUSH line size (Value ∗ 8 = cache line size in bytes; used also by CLFLUSHOPT). Bits 23 - 16: Maximum number of addressable IDs for logical processors in this physical package*. Bits 31 - 24: Initial APIC ID**. ECX Feature Information (see Figure 3-7 and Table 3-10). EDX Feature Information (see Figure 3-8 and Table 3-11). NOTES: * The nearest power-of-2 integer that is not smaller than EBX[23:16] is the number of unique initial APIC IDs reserved for addressing different logical processors in a physical package. This field is only valid if CPUID.1.EDX.HTT[bit 28]= 1. ** The 8-bit initial APIC ID in EBX[31:24] is replaced by the 32-bit x2APIC ID, available in Leaf 0BH and Leaf 1FH. 02H 03H EAX Cache and TLB Information (see Table 3-12). EBX Cache and TLB Information. ECX Cache and TLB Information. EDX Cache and TLB Information. EAX Reserved. EBX Reserved. ECX Bits 00 - 31 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the value in this register is reserved.) EDX Bits 32 - 63 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the value in this register is reserved.) NOTES: Processor serial number (PSN) is not supported in the Pentium 4 processor or later. On all models, use the PSN flag (returned using CPUID) to check for PSN support before accessing the feature. CPUID leaves above 2 and below 80000000H are visible only when IA32_MISC_ENABLE[bit 22] has its default value of 0. Deterministic Cache Parameters Leaf 04H NOTES: Leaf 04H output depends on the initial value in ECX.* See also: “INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level” on page 221. EAX 3-194 Vol. 2A Bits 04 - 00: Cache Type Field. 0 = Null - No more caches. 1 = Data Cache. 2 = Instruction Cache. 3 = Unified Cache. 4-31 = Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor Bits 07 - 05: Cache Level (starts at 1). Bit 08: Self Initializing cache level (does not need SW initialization). Bit 09: Fully Associative cache. Bits 13 - 10: Reserved. Bits 25 - 14: Maximum number of addressable IDs for logical processors sharing this cache**, ***. Bits 31 - 26: Maximum number of addressable IDs for processor cores in the physical package**, ****, *****. EBX Bits 11 - 00: L = System Coherency Line Size**. Bits 21 - 12: P = Physical Line partitions**. Bits 31 - 22: W = Ways of associativity**. ECX Bits 31-00: S = Number of Sets**. EDX Bit 00: Write-Back Invalidate/Invalidate. 0 = WBINVD/INVD from threads sharing this cache acts upon lower level caches for threads sharing this cache. 1 = WBINVD/INVD is not guaranteed to act upon lower level caches of non-originating threads sharing this cache. Bit 01: Cache Inclusiveness. 0 = Cache is not inclusive of lower cache levels. 1 = Cache is inclusive of lower cache levels. Bit 02: Complex Cache Indexing. 0 = Direct mapped cache. 1 = A complex function is used to index the cache, potentially using all address bits. Bits 31 - 03: Reserved = 0. NOTES: * If ECX contains an invalid sub leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n+1 is invalid if subleaf n returns EAX[4:0] as 0. ** Add one to the return value to get the result. ***The nearest power-of-2 integer that is not smaller than (1 + EAX[25:14]) is the number of unique initial APIC IDs reserved for addressing different logical processors sharing this cache. **** The nearest power-of-2 integer that is not smaller than (1 + EAX[31:26]) is the number of unique Core_IDs reserved for addressing different processor cores in a physical package. Core ID is a subset of bits of the initial APIC ID. ***** The returned value is constant for valid initial values in ECX. Valid ECX values start from 0. MONITOR/MWAIT Leaf 05H EAX Bits 15 - 00: Smallest monitor-line size in bytes (default is processor's monitor granularity). Bits 31 - 16: Reserved = 0. EBX Bits 15 - 00: Largest monitor-line size in bytes (default is processor's monitor granularity). Bits 31 - 16: Reserved = 0. ECX Bit 00: Enumeration of Monitor-Mwait extensions (beyond EAX and EBX registers) supported. Bit 01: Supports treating interrupts as break-event for MWAIT, even when interrupts disabled. Bits 31 - 02: Reserved. CPUID—CPU Identification Vol. 2A 3-195 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX Bits 03 - 00: Number of C0* sub C-states supported using MWAIT. Bits 07 - 04: Number of C1* sub C-states supported using MWAIT. Bits 11 - 08: Number of C2* sub C-states supported using MWAIT. Bits 15 - 12: Number of C3* sub C-states supported using MWAIT. Bits 19 - 16: Number of C4* sub C-states supported using MWAIT. Bits 23 - 20: Number of C5* sub C-states supported using MWAIT. Bits 27 - 24: Number of C6* sub C-states supported using MWAIT. Bits 31 - 28: Number of C7* sub C-states supported using MWAIT. NOTE: * The definition of C0 through C7 states for MWAIT extension are processor-specific C-states, not ACPI Cstates. Thermal and Power Management Leaf 06H 3-196 Vol. 2A EAX Bit 00: Digital temperature sensor is supported if set. Bit 01: Intel Turbo Boost Technology available (see description of IA32_MISC_ENABLE[38]). Bit 02: ARAT. APIC-Timer-always-running feature is supported if set. Bit 03: Reserved. Bit 04: PLN. Power limit notification controls are supported if set. Bit 05: ECMD. Clock modulation duty cycle extension is supported if set. Bit 06: PTM. Package thermal management is supported if set. Bit 07: HWP. HWP base registers (IA32_PM_ENABLE[bit 0], IA32_HWP_CAPABILITIES, IA32_HWP_REQUEST, IA32_HWP_STATUS) are supported if set. Bit 08: HWP_Notification. IA32_HWP_INTERRUPT MSR is supported if set. Bit 09: HWP_Activity_Window. IA32_HWP_REQUEST[bits 41:32] is supported if set. Bit 10: HWP_Energy_Performance_Preference. IA32_HWP_REQUEST[bits 31:24] is supported if set. Bit 11: HWP_Package_Level_Request. IA32_HWP_REQUEST_PKG MSR is supported if set. Bit 12: Reserved. Bit 13: HDC. HDC base registers IA32_PKG_HDC_CTL, IA32_PM_CTL1, IA32_THREAD_STALL MSRs are supported if set. Bit 14: Intel® Turbo Boost Max Technology 3.0 available. Bit 15: HWP Capabilities. Highest Performance change is supported if set. Bit 16: HWP PECI override is supported if set. Bit 17: Flexible HWP is supported if set. Bit 18: Fast access mode for the IA32_HWP_REQUEST MSR is supported if set. Bit 19: Reserved. Bit 20: Ignoring Idle Logical Processor HWP request is supported if set. Bits 31 - 21: Reserved. EBX Bits 03 - 00: Number of Interrupt Thresholds in Digital Thermal Sensor. Bits 31 - 04: Reserved. ECX Bit 00: Hardware Coordination Feedback Capability (Presence of IA32_MPERF and IA32_APERF). The capability to provide a measure of delivered processor performance (since last reset of the counters), as a percentage of the expected processor performance when running at the TSC frequency. Bits 02 - 01: Reserved = 0. Bit 03: The processor supports performance-energy bias preference if CPUID.06H:ECX.SETBH[bit 3] is set and it also implies the presence of a new architectural MSR called IA32_ENERGY_PERF_BIAS (1B0H). Bits 31 - 04: Reserved = 0. EDX Reserved = 0. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor Structured Extended Feature Flags Enumeration Leaf (Output depends on ECX input value) 07H Sub-leaf 0 (Input ECX = 0). * EAX Bits 31 - 00: Reports the maximum input value for supported leaf 7 sub-leaves. EBX Bit 00: FSGSBASE. Supports RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE if 1. Bit 01: IA32_TSC_ADJUST MSR is supported if 1. Bit 02: SGX. Supports Intel® Software Guard Extensions (Intel® SGX Extensions) if 1. Bit 03: BMI1. Bit 04: HLE. Bit 05: AVX2. Bit 06: FDP_EXCPTN_ONLY. x87 FPU Data Pointer updated only on x87 exceptions if 1. Bit 07: SMEP. Supports Supervisor-Mode Execution Prevention if 1. Bit 08: BMI2. Bit 09: Supports Enhanced REP MOVSB/STOSB if 1. Bit 10: INVPCID. If 1, supports INVPCID instruction for system software that manages process-context identifiers. Bit 11: RTM. Bit 12: RDT-M. Supports Intel® Resource Director Technology (Intel® RDT) Monitoring capability if 1. Bit 13: Deprecates FPU CS and FPU DS values if 1. Bit 14: MPX. Supports Intel® Memory Protection Extensions if 1. Bit 15: RDT-A. Supports Intel® Resource Director Technology (Intel® RDT) Allocation capability if 1. Bit 16: AVX512F. Bit 17: AVX512DQ. Bit 18: RDSEED. Bit 19: ADX. Bit 20: SMAP. Supports Supervisor-Mode Access Prevention (and the CLAC/STAC instructions) if 1. Bit 21: AVX512_IFMA. Bit 22: Reserved. Bit 23: CLFLUSHOPT. Bit 24: CLWB. Bit 25: Intel Processor Trace. Bit 26: AVX512PF. (Intel® Xeon Phi™ only.) Bit 27: AVX512ER. (Intel® Xeon Phi™ only.) Bit 28: AVX512CD. Bit 29: SHA. supports Intel® Secure Hash Algorithm Extensions (Intel® SHA Extensions) if 1. Bit 30: AVX512BW. Bit 31: AVX512VL. CPUID—CPU Identification Vol. 2A 3-197 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor ECX Bit 00: PREFETCHWT1. (Intel® Xeon Phi™ only.) Bit 01: AVX512_VBMI. Bit 02: UMIP. Supports user-mode instruction prevention if 1. Bit 03: PKU. Supports protection keys for user-mode pages if 1. Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable protection keys (and the RDPKRU/WRPKRU instructions). Bit 05: WAITPKG Bit 07 - 06: Reserved Bit 08: GFNI Bits 13 - 09: Reserved. Bit 14: AVX512_VPOPCNTDQ. (Intel® Xeon Phi™ only.) Bits 16 - 15: Reserved. Bits 21 - 17: The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode. Bit 22: RDPID and IA32_TSC_AUX are available if 1. Bits 24 - 23: Reserved. Bit 25: CLDEMOTE. Supports cache line demote if 1. Bit 26: Reserved Bit 27: MOVDIRI. Supports MOVDIRI if 1. Bit 28: MOVDIR64B. Supports MOVDIR64B if 1. Bit 29: Reserved Bit 30: SGX_LC. Supports SGX Launch Configuration if 1. Bit 31: Reserved. EDX Bit 01: Reserved. Bit 02: AVX512_4VNNIW. (Intel® Xeon Phi™ only.) Bit 03: AVX512_4FMAPS. (Intel® Xeon Phi™ only.) Bits 25-04: Reserved. Bit 26: Enumerates support for indirect branch restricted speculation (IBRS) and the indirect branch predictor barrier (IBPB). Processors that set this bit support the IA32_SPEC_CTRL MSR and the IA32_PRED_CMD MSR. They allow software to set IA32_SPEC_CTRL[0] (IBRS) and IA32_PRED_CMD[0] (IBPB). Bit 27: Enumerates support for single thread indirect branch predictors (STIBP). Processors that set this bit support the IA32_SPEC_CTRL MSR. They allow software to set IA32_SPEC_CTRL[1] (STIBP). Bit 28: Enumerates support for L1D_FLUSH. Processors that set this bit support the IA32_FLUSH_CMD MSR. They allow software to set IA32_FLUSH_CMD[0] (L1D_FLUSH). Bit 29: Enumerates support for the IA32_ARCH_CAPABILITIES MSR. Bit 30: Enumerates support for the IA32_CORE_CAPABILITIES MSR. Bit 31: Enumerates support for Speculative Store Bypass Disable (SSBD). Processors that set this bit support the IA32_SPEC_CTRL MSR. They allow software to set IA32_SPEC_CTRL[2] (SSBD). NOTE: * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n exceeds the value that sub-leaf 0 returns in EAX. Direct Cache Access Information Leaf 09H 3-198 Vol. 2A EAX Value of bits [31:0] of IA32_PLATFORM_DCA_CAP MSR (address 1F8H). EBX Reserved. ECX Reserved. EDX Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor Architectural Performance Monitoring Leaf 0AH EAX Bits 07 - 00: Version ID of architectural performance monitoring. Bits 15 - 08: Number of general-purpose performance monitoring counter per logical processor. Bits 23 - 16: Bit width of general-purpose, performance monitoring counter. Bits 31 - 24: Length of EBX bit vector to enumerate architectural performance monitoring events. EBX Bit 00: Core cycle event not available if 1. Bit 01: Instruction retired event not available if 1. Bit 02: Reference cycles event not available if 1. Bit 03: Last-level cache reference event not available if 1. Bit 04: Last-level cache misses event not available if 1. Bit 05: Branch instruction retired event not available if 1. Bit 06: Branch mispredict retired event not available if 1. Bits 31 - 07: Reserved = 0. ECX Reserved = 0. EDX Bits 04 - 00: Number of fixed-function performance counters (if Version ID > 1). Bits 12 - 05: Bit width of fixed-function performance counters (if Version ID > 1). Bits 14 - 13: Reserved = 0. Bit 15: AnyThread deprecation. Bits 31 - 16: Reserved = 0. Extended Topology Enumeration Leaf 0BH NOTES: CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of Leaf 1FH before using leaf 0BH. Most of Leaf 0BH output depends on the initial value in ECX. The EDX output of leaf 0BH is always valid and does not vary with input value in ECX. Output value in ECX[7:0] always equals input value in ECX[7:0]. Sub-leaf index 0 enumerates SMT level. Each subsequent higher sub-leaf index enumerates a higherlevel topological entity in hierarchical order. For sub-leaves that return an invalid level-type of 0 in ECX[15:8]; EAX and EBX will return 0. If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8], other input values with ECX > n also return 0 in ECX[15:8]. EAX Bits 04 - 00: Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type*. All logical processors with the same next level ID share current level. Bits 31 - 05: Reserved. EBX Bits 15 - 00: Number of logical processors at this level type. The number reflects configuration as shipped by Intel**. Bits 31- 16: Reserved. ECX Bits 07 - 00: Level number. Same value in ECX input. Bits 15 - 08: Level type***. Bits 31 - 16: Reserved. EDX Bits 31- 00: x2APIC ID the current logical processor. NOTES: * Software should use this field (EAX[4:0]) to enumerate processor topology of the system. CPUID—CPU Identification Vol. 2A 3-199 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor ** Software must not use EBX[15:0] to enumerate processor topology of the system. This value in this field (EBX[15:0]) is only intended for display/diagnostic purposes. The actual number of logical processors available to BIOS/OS/Applications may be different from the value of EBX[15:0], depending on software and platform hardware configurations. *** The value of the “level type” field is not related to level numbers in any way, higher “level type” values do not mean higher levels. Level type field has the following encoding: 0: Invalid. 1: SMT. 2: Core. 3-255: Reserved. Processor Extended State Enumeration Main Leaf (EAX = 0DH, ECX = 0) 0DH NOTES: Leaf 0DH main leaf (ECX = 0). EAX Bits 31 - 00: Reports the supported bits of the lower 32 bits of XCR0. XCR0[n] can be set to 1 only if EAX[n] is 1. Bit 00: x87 state. Bit 01: SSE state. Bit 02: AVX state. Bits 04 - 03: MPX state. Bits 07 - 05: AVX-512 state. Bit 08: Used for IA32_XSS. Bit 09: PKRU state. Bits 12 - 10: Reserved. Bit 13: Used for IA32_XSS. Bits 31 - 14: Reserved. EBX Bits 31 - 00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) required by enabled features in XCR0. May be different than ECX if some features at the end of the XSAVE save area are not enabled. ECX Bit 31 - 00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) of the XSAVE/XRSTOR save area required by all supported features in the processor, i.e., all the valid bit fields in XCR0. EDX Bit 31 - 00: Reports the supported bits of the upper 32 bits of XCR0. XCR0[n+32] can be set to 1 only if EDX[n] is 1. Bits 31 - 00: Reserved. Processor Extended State Enumeration Sub-leaf (EAX = 0DH, ECX = 1) 0DH 3-200 Vol. 2A EAX Bit 00: XSAVEOPT is available. Bit 01: Supports XSAVEC and the compacted form of XRSTOR if set. Bit 02: Supports XGETBV with ECX = 1 if set. Bit 03: Supports XSAVES/XRSTORS and IA32_XSS if set. Bits 31 - 04: Reserved. EBX Bits 31 - 00: The size in bytes of the XSAVE area containing all states enabled by XCRO | IA32_XSS. ECX Bits 31 - 00: Reports the supported bits of the lower 32 bits of the IA32_XSS MSR. IA32_XSS[n] can be set to 1 only if ECX[n] is 1. Bits 07 - 00: Used for XCR0. Bit 08: PT state. Bit 09: Used for XCR0. Bits 12 - 10: Reserved. Bit 13: HWP state. Bits 31 - 14: Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX Bits 31 - 00: Reports the supported bits of the upper 32 bits of the IA32_XSS MSR. IA32_XSS[n+32] can be set to 1 only if EDX[n] is 1. Bits 31 - 00: Reserved. Processor Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1) 0DH NOTES: Leaf 0DH output depends on the initial value in ECX. Each sub-leaf index (starting at position 2) is supported if it corresponds to a supported bit in either the XCR0 register or the IA32_XSS MSR. * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf n (0 ≤ n ≤ 31) is invalid if sub-leaf 0 returns 0 in EAX[n] and sub-leaf 1 returns 0 in ECX[n]. Sub-leaf n (32 ≤ n ≤ 63) is invalid if sub-leaf 0 returns 0 in EDX[n-32] and sub-leaf 1 returns 0 in EDX[n-32]. EAX Bits 31 - 0: The size in bytes (from the offset specified in EBX) of the save area for an extended state feature associated with a valid sub-leaf index, n. EBX Bits 31 - 0: The offset in bytes of this extended state component’s save area from the beginning of the XSAVE/XRSTOR area. This field reports 0 if the sub-leaf index, n, does not map to a valid bit in the XCR0 register*. ECX Bit 00 is set if the bit n (corresponding to the sub-leaf index) is supported in the IA32_XSS MSR; it is clear if bit n is instead supported in XCR0. Bit 01 is set if, when the compacted format of an XSAVE area is used, this extended state component located on the next 64-byte boundary following the preceding state component (otherwise, it is located immediately following the preceding state component). Bits 31 - 02 are reserved. This field reports 0 if the sub-leaf index, n, is invalid*. EDX This field reports 0 if the sub-leaf index, n, is invalid*; otherwise it is reserved. Intel Resource Director Technology (Intel RDT) Monitoring Enumeration Sub-leaf (EAX = 0FH, ECX = 0) 0FH NOTES: Leaf 0FH output depends on the initial value in ECX. Sub-leaf index 0 reports valid resource type starting at bit position 1 of EDX. EAX Reserved. EBX Bits 31 - 00: Maximum range (zero-based) of RMID within this physical processor of all types. ECX Reserved. EDX Bit 00: Reserved. Bit 01: Supports L3 Cache Intel RDT Monitoring if 1. Bits 31 - 02: Reserved. L3 Cache Intel RDT Monitoring Capability Enumeration Sub-leaf (EAX = 0FH, ECX = 1) 0FH NOTES: Leaf 0FH output depends on the initial value in ECX. EAX Reserved. EBX Bits 31 - 00: Conversion factor from reported IA32_QM_CTR value to occupancy metric (bytes) and Memory Bandwidth Monitoring (MBM) metrics. ECX Maximum range (zero-based) of RMID of this resource type. EDX Bit 00: Supports L3 occupancy monitoring if 1. Bit 01: Supports L3 Total Bandwidth monitoring if 1. Bit 02: Supports L3 Local Bandwidth monitoring if 1. Bits 31 - 03: Reserved. CPUID—CPU Identification Vol. 2A 3-201 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor Intel Resource Director Technology (Intel RDT) Allocation Enumeration Sub-leaf (EAX = 10H, ECX = 0) 10H NOTES: Leaf 10H output depends on the initial value in ECX. Sub-leaf index 0 reports valid resource identification (ResID) starting at bit position 1 of EBX. EAX Reserved. EBX Bit 00: Reserved. Bit 01: Supports L3 Cache Allocation Technology if 1. Bit 02: Supports L2 Cache Allocation Technology if 1. Bit 03: Supports Memory Bandwidth Allocation if 1. Bits 31 - 04: Reserved. ECX Reserved. EDX Reserved. L3 Cache Allocation Technology Enumeration Sub-leaf (EAX = 10H, ECX = ResID =1) 10H NOTES: Leaf 10H output depends on the initial value in ECX. EAX Bits 04 - 00: Length of the capacity bit mask for the corresponding ResID using minus-one notation. Bits 31 - 05: Reserved. EBX Bits 31 - 00: Bit-granular map of isolation/contention of allocation units. ECX Bits 01- 00: Reserved. Bit 02: Code and Data Prioritization Technology supported if 1. Bits 31 - 03: Reserved. EDX Bits 15 - 00: Highest COS number supported for this ResID. Bits 31 - 16: Reserved. L2 Cache Allocation Technology Enumeration Sub-leaf (EAX = 10H, ECX = ResID =2) 10H NOTES: Leaf 10H output depends on the initial value in ECX. EAX Bits 04 - 00: Length of the capacity bit mask for the corresponding ResID using minus-one notation. Bits 31 - 05: Reserved. EBX Bits 31 - 00: Bit-granular map of isolation/contention of allocation units. ECX Bits 31 - 00: Reserved. EDX Bits 15 - 00: Highest COS number supported for this ResID. Bits 31 - 16: Reserved. Memory Bandwidth Allocation Enumeration Sub-leaf (EAX = 10H, ECX = ResID =3) 10H 3-202 Vol. 2A NOTES: Leaf 10H output depends on the initial value in ECX. EAX Bits 11 - 00: Reports the maximum MBA throttling value supported for the corresponding ResID using minus-one notation. Bits 31 - 12: Reserved. EBX Bits 31 - 00: Reserved. ECX Bits 01 - 00: Reserved. Bit 02: Reports whether the response of the delay values is linear. Bits 31 - 03: Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX Bits 15 - 00: Highest COS number supported for this ResID. Bits 31 - 16: Reserved. Intel SGX Capability Enumeration Leaf, sub-leaf 0 (EAX = 12H, ECX = 0) NOTES: Leaf 12H sub-leaf 0 (ECX = 0) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1. 12H EAX Bit 00: SGX1. If 1, Indicates Intel SGX supports the collection of SGX1 leaf functions. Bit 01: SGX2. If 1, Indicates Intel SGX supports the collection of SGX2 leaf functions. Bits 04 - 02: Reserved. Bit 05: If 1, indicates Intel SGX supports ENCLV instruction leaves EINCVIRTCHILD, EDECVIRTCHILD, and ESETCONTEXT. Bit 06: If 1, indicates Intel SGX supports ENCLS instruction leaves ETRACKC, ERDINFO, ELDBC, and ELDUC. Bits 31 - 07: Reserved. EBX Bits 31 - 00: MISCSELECT. Bit vector of supported extended SGX features. ECX Bits 31 - 00: Reserved. EDX Bits 07 - 00: MaxEnclaveSize_Not64. The maximum supported enclave size in non-64-bit mode is 2^(EDX[7:0]). Bits 15 - 08: MaxEnclaveSize_64. The maximum supported enclave size in 64-bit mode is 2^(EDX[15:8]). Bits 31 - 16: Reserved. Intel SGX Attributes Enumeration Leaf, sub-leaf 1 (EAX = 12H, ECX = 1) NOTES: Leaf 12H sub-leaf 1 (ECX = 1) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1. 12H EAX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[31:0] that software can set with ECREATE. EBX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[63:32] that software can set with ECREATE. ECX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[95:64] that software can set with ECREATE. EDX Bit 31 - 00: Reports the valid bits of SECS.ATTRIBUTES[127:96] that software can set with ECREATE. Intel SGX EPC Enumeration Leaf, sub-leaves (EAX = 12H, ECX = 2 or higher) NOTES: Leaf 12H sub-leaf 2 or higher (ECX >= 2) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1. For sub-leaves (ECX = 2 or higher), definition of EDX,ECX,EBX,EAX[31:4] depends on the sub-leaf type listed below. 12H EAX Bit 03 - 00: Sub-leaf Type 0000b: Indicates this sub-leaf is invalid. 0001b: This sub-leaf enumerates an EPC section. EBX:EAX and EDX:ECX provide information on the Enclave Page Cache (EPC) section. All other type encodings are reserved. Type 0000b. This sub-leaf is invalid. EDX:ECX:EBX:EAX return 0. CPUID—CPU Identification Vol. 2A 3-203 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor Type 0001b. This sub-leaf enumerates an EPC sections with EDX:ECX, EBX:EAX defined as follows. EAX[11:04]: Reserved (enumerate 0). EAX[31:12]: Bits 31:12 of the physical address of the base of the EPC section. EBX[19:00]: Bits 51:32 of the physical address of the base of the EPC section. EBX[31:20]: Reserved. ECX[03:00]: EPC section property encoding defined as follows: If EAX[3:0] 0000b, then all bits of the EDX:ECX pair are enumerated as 0. If EAX[3:0] 0001b, then this section has confidentiality and integrity protection. All other encodings are reserved. ECX[11:04]: Reserved (enumerate 0). ECX[31:12]: Bits 31:12 of the size of the corresponding EPC section within the Processor Reserved Memory. EDX[19:00]: Bits 51:32 of the size of the corresponding EPC section within the Processor Reserved Memory. EDX[31:20]: Reserved. Intel Processor Trace Enumeration Main Leaf (EAX = 14H, ECX = 0) NOTES: Leaf 14H main leaf (ECX = 0). 14H EAX Bits 31 - 00: Reports the maximum sub-leaf supported in leaf 14H. EBX Bit 00: If 1, indicates that IA32_RTIT_CTL.CR3Filter can be set to 1, and that IA32_RTIT_CR3_MATCH MSR can be accessed. Bit 01: If 1, indicates support of Configurable PSB and Cycle-Accurate Mode. Bit 02: If 1, indicates support of IP Filtering, TraceStop filtering, and preservation of Intel PT MSRs across warm reset. Bit 03: If 1, indicates support of MTC timing packet and suppression of COFI-based packets. Bit 04: If 1, indicates support of PTWRITE. Writes can set IA32_RTIT_CTL[12] (PTWEn) and IA32_RTIT_CTL[5] (FUPonPTW), and PTWRITE can generate packets. Bit 05: If 1, indicates support of Power Event Trace. Writes can set IA32_RTIT_CTL[4] (PwrEvtEn), enabling Power Event Trace packet generation. Bit 31 - 06: Reserved. ECX Bit 00: If 1, Tracing can be enabled with IA32_RTIT_CTL.ToPA = 1, hence utilizing the ToPA output scheme; IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS MSRs can be accessed. Bit 01: If 1, ToPA tables can hold any number of output entries, up to the maximum allowed by the MaskOrTableOffset field of IA32_RTIT_OUTPUT_MASK_PTRS. Bit 02: If 1, indicates support of Single-Range Output scheme. Bit 03: If 1, indicates support of output to Trace Transport subsystem. Bit 30 - 04: Reserved. Bit 31: If 1, generated packets which contain IP payloads have LIP values, which include the CS base component. EDX Bits 31 - 00: Reserved. Intel Processor Trace Enumeration Sub-leaf (EAX = 14H, ECX = 1) 14H 3-204 Vol. 2A EAX Bits 02 - 00: Number of configurable Address Ranges for filtering. Bits 15 - 03: Reserved. Bits 31 - 16: Bitmap of supported MTC period encodings. EBX Bits 15 - 00: Bitmap of supported Cycle Threshold value encodings. Bit 31 - 16: Bitmap of supported Configurable PSB frequency encodings. ECX Bits 31 - 00: Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX Bits 31 - 00: Reserved. Time Stamp Counter and Nominal Core Crystal Clock Information Leaf 15H NOTES: If EBX[31:0] is 0, the TSC/”core crystal clock” ratio is not enumerated. EBX[31:0]/EAX[31:0] indicates the ratio of the TSC frequency and the core crystal clock frequency. If ECX is 0, the nominal core crystal clock frequency is not enumerated. “TSC frequency” = “core crystal clock frequency” * EBX/EAX. The core crystal clock may differ from the reference clock, bus clock, or core clock frequencies. EAX Bits 31 - 00: An unsigned integer which is the denominator of the TSC/”core crystal clock” ratio. EBX Bits 31 - 00: An unsigned integer which is the numerator of the TSC/”core crystal clock” ratio. ECX Bits 31 - 00: An unsigned integer which is the nominal frequency of the core crystal clock in Hz. EDX Bits 31 - 00: Reserved = 0. Processor Frequency Information Leaf 16H EAX Bits 15 - 00: Processor Base Frequency (in MHz). Bits 31 - 16: Reserved =0. EBX Bits 15 - 00: Maximum Frequency (in MHz). Bits 31 - 16: Reserved = 0. ECX Bits 15 - 00: Bus (Reference) Frequency (in MHz). Bits 31 - 16: Reserved = 0. EDX Reserved. NOTES: * Data is returned from this interface in accordance with the processor's specification and does not reflect actual values. Suitable use of this data includes the display of processor information in like manner to the processor brand string and for determining the appropriate range to use when displaying processor information e.g. frequency history graphs. The returned information should not be used for any other purpose as the returned information does not accurately correlate to information / counters returned by other processor interfaces. While a processor may support the Processor Frequency Information leaf, fields that return a value of zero are not supported. System-On-Chip Vendor Attribute Enumeration Main Leaf (EAX = 17H, ECX = 0) 17H NOTES: Leaf 17H main leaf (ECX = 0). Leaf 17H output depends on the initial value in ECX. Leaf 17H sub-leaves 1 through 3 reports SOC Vendor Brand String. Leaf 17H is valid if MaxSOCID_Index >= 3. Leaf 17H sub-leaves 4 and above are reserved. EAX Bits 31 - 00: MaxSOCID_Index. Reports the maximum input value of supported sub-leaf in leaf 17H. EBX Bits 15 - 00: SOC Vendor ID. Bit 16: IsVendorScheme. If 1, the SOC Vendor ID field is assigned via an industry standard enumeration scheme. Otherwise, the SOC Vendor ID field is assigned by Intel. Bits 31 - 17: Reserved = 0. ECX Bits 31 - 00: Project ID. A unique number an SOC vendor assigns to its SOC projects. EDX Bits 31 - 00: Stepping ID. A unique number within an SOC project that an SOC vendor assigns. CPUID—CPU Identification Vol. 2A 3-205 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor System-On-Chip Vendor Attribute Enumeration Sub-leaf (EAX = 17H, ECX = 1..3) 17H EAX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string. EBX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string. ECX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string. EDX Bit 31 - 00: SOC Vendor Brand String. UTF-8 encoded string. NOTES: Leaf 17H output depends on the initial value in ECX. SOC Vendor Brand String is a UTF-8 encoded string padded with trailing bytes of 00H. The complete SOC Vendor Brand String is constructed by concatenating in ascending order of EAX:EBX:ECX:EDX and from the sub-leaf 1 fragment towards sub-leaf 3. System-On-Chip Vendor Attribute Enumeration Sub-leaves (EAX = 17H, ECX > MaxSOCID_Index) 17H NOTES: Leaf 17H output depends on the initial value in ECX. EAX Bits 31 - 00: Reserved = 0. EBX Bits 31 - 00: Reserved = 0. ECX Bits 31 - 00: Reserved = 0. EDX Bits 31 - 00: Reserved = 0. Deterministic Address Translation Parameters Main Leaf (EAX = 18H, ECX = 0) 18H 3-206 Vol. 2A NOTES: Each sub-leaf enumerates a different address translation structure. If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n exceeds the value that sub-leaf 0 returns in EAX. A sub-leaf index is also invalid if EDX[4:0] returns 0. Valid sub-leaves do not need to be contiguous or in any particular order. A valid sub-leaf may be in a higher input ECX value than an invalid sub-leaf or than a valid sub-leaf of a higher or lower-level structure. * Some unified TLBs will allow a single TLB entry to satisfy data read/write and instruction fetches. Others will require separate entries (e.g., one loaded on data read/write and another loaded on an instruction fetch) . Please see the Intel® 64 and IA-32 Architectures Optimization Reference Manual for details of a particular product. ** Add one to the return value to get the result. EAX Bits 31 - 00: Reports the maximum input value of supported sub-leaf in leaf 18H. EBX Bit 00: 4K page size entries supported by this structure. Bit 01: 2MB page size entries supported by this structure. Bit 02: 4MB page size entries supported by this structure. Bit 03: 1 GB page size entries supported by this structure. Bits 07 - 04: Reserved. Bits 10 - 08: Partitioning (0: Soft partitioning between the logical processors sharing this structure). Bits 15 - 11: Reserved. Bits 31 - 16: W = Ways of associativity. ECX Bits 31 - 00: S = Number of Sets. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX Bits 04 - 00: Translation cache type field. 00000b: Null (indicates this sub-leaf is not valid). 00001b: Data TLB. 00010b: Instruction TLB. 00011b: Unified TLB*. All other encodings are reserved. Bits 07 - 05: Translation cache level (starts at 1). Bit 08: Fully associative structure. Bits 13 - 09: Reserved. Bits 25- 14: Maximum number of addressable IDs for logical processors sharing this translation cache** Bits 31 - 26: Reserved. Deterministic Address Translation Parameters Sub-leaf (EAX = 18H, ECX ≥ 1) 18H NOTES: Each sub-leaf enumerates a different address translation structure. If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n exceeds the value that sub-leaf 0 returns in EAX. A sub-leaf index is also invalid if EDX[4:0] returns 0. Valid sub-leaves do not need to be contiguous or in any particular order. A valid sub-leaf may be in a higher input ECX value than an invalid sub-leaf or than a valid sub-leaf of a higher or lower-level structure. * Some unified TLBs will allow a single TLB entry to satisfy data read/write and instruction fetches. Others will require separate entries (e.g., one loaded on data read/write and another loaded on an instruction fetch) . Please see the Intel® 64 and IA-32 Architectures Optimization Reference Manual for details of a particular product. ** Add one to the return value to get the result. EAX Bits 31 - 00: Reserved. EBX Bit 00: 4K page size entries supported by this structure. Bit 01: 2MB page size entries supported by this structure. Bit 02: 4MB page size entries supported by this structure. Bit 03: 1 GB page size entries supported by this structure. Bits 07 - 04: Reserved. Bits 10 - 08: Partitioning (0: Soft partitioning between the logical processors sharing this structure). Bits 15 - 11: Reserved. Bits 31 - 16: W = Ways of associativity. ECX Bits 31 - 00: S = Number of Sets. EDX Bits 04 - 00: Translation cache type field. 0000b: Null (indicates this sub-leaf is not valid). 0001b: Data TLB. 0010b: Instruction TLB. 0011b: Unified TLB*. All other encodings are reserved. Bits 07 - 05: Translation cache level (starts at 1). Bit 08: Fully associative structure. Bits 13 - 09: Reserved. Bits 25- 14: Maximum number of addressable IDs for logical processors sharing this translation cache** Bits 31 - 26: Reserved. CPUID—CPU Identification Vol. 2A 3-207 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor V2 Extended Topology Enumeration Leaf 1FH NOTES: CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of Leaf 1FH and using this if available. Most of Leaf 1FH output depends on the initial value in ECX. The EDX output of leaf 1FH is always valid and does not vary with input value in ECX. Output value in ECX[7:0] always equals input value in ECX[7:0]. Sub-leaf index 0 enumerates SMT level. Each subsequent higher sub-leaf index enumerates a higherlevel topological entity in hierarchical order. For sub-leaves that return an invalid level-type of 0 in ECX[15:8]; EAX and EBX will return 0. If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8], other input values with ECX > n also return 0 in ECX[15:8]. EAX Bits 04 - 00: Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type*. All logical processors with the same next level ID share current level. Bits 31 - 05: Reserved. EBX Bits 15 - 00: Number of logical processors at this level type. The number reflects configuration as shipped by Intel**. Bits 31- 16: Reserved. ECX Bits 07 - 00: Level number. Same value in ECX input. Bits 15 - 08: Level type***. Bits 31 - 16: Reserved. EDX Bits 31- 00: x2APIC ID the current logical processor. NOTES: * Software should use this field (EAX[4:0]) to enumerate processor topology of the system. ** Software must not use EBX[15:0] to enumerate processor topology of the system. This value in this field (EBX[15:0]) is only intended for display/diagnostic purposes. The actual number of logical processors available to BIOS/OS/Applications may be different from the value of EBX[15:0], depending on software and platform hardware configurations. *** The value of the “level type” field is not related to level numbers in any way, higher “level type” values do not mean higher levels. Level type field has the following encoding: 0: Invalid. 1: SMT. 2: Core. 3: Module. 4: Tile. 5: Die. 6-255: Reserved. Unimplemented CPUID Leaf Functions 40000000H 4FFFFFFFH Invalid. No existing or future CPU will return processor identification or feature information if the initial EAX value is in the range 40000000H to 4FFFFFFFH. Extended Function CPUID Information 80000000H EAX 3-208 Vol. 2A Maximum Input Value for Extended Function CPUID Information. EBX Reserved. ECX Reserved. CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor EDX 80000001H EAX Reserved. Extended Processor Signature and Feature Bits. EBX Reserved. ECX Bit 00: LAHF/SAHF available in 64-bit mode.* Bits 04 - 01: Reserved. Bit 05: LZCNT. Bits 07 - 06: Reserved. Bit 08: PREFETCHW. Bits 31 - 09: Reserved. EDX Bits 10 - 00: Reserved. Bit 11: SYSCALL/SYSRET.** Bits 19 - 12: Reserved = 0. Bit 20: Execute Disable Bit available. Bits 25 - 21: Reserved = 0. Bit 26: 1-GByte pages are available if 1. Bit 27: RDTSCP and IA32_TSC_AUX are available if 1. Bit 28: Reserved = 0. Bit 29: Intel® 64 Architecture available if 1. Bits 31 - 30: Reserved = 0. NOTES: * LAHF and SAHF are always available in other modes, regardless of the enumeration of this feature flag. ** Intel processors support SYSCALL and SYSRET only in 64-bit mode. This feature flag is always enumerated as 0 outside 64-bit mode. 80000002H EAX EBX ECX EDX Processor Brand String. Processor Brand String Continued. Processor Brand String Continued. Processor Brand String Continued. 80000003H EAX EBX ECX EDX Processor Brand String Continued. Processor Brand String Continued. Processor Brand String Continued. Processor Brand String Continued. 80000004H EAX EBX ECX EDX Processor Brand String Continued. Processor Brand String Continued. Processor Brand String Continued. Processor Brand String Continued. 80000005H EAX EBX ECX EDX Reserved = 0. Reserved = 0. Reserved = 0. Reserved = 0. 80000006H EAX EBX Reserved = 0. Reserved = 0. ECX EDX CPUID—CPU Identification Bits 07 - 00: Cache Line size in bytes. Bits 11 - 08: Reserved. Bits 15 - 12: L2 Associativity field *. Bits 31 - 16: Cache size in 1K units. Reserved = 0. Vol. 2A 3-209 INSTRUCTION SET REFERENCE, A-L Table 3-8. Information Returned by CPUID Instruction (Contd.) Initial EAX Value Information Provided about the Processor NOTES: * L2 associativity field encodings: 00H - Disabled 01H - 1 way (direct mapped) 02H - 2 ways 03H - Reserved 04H - 4 ways 05H - Reserved 06H - 8 ways 07H - See CPUID leaf 04H, sub-leaf 2** 08H - 16 ways 09H - Reserved 0AH - 32 ways 0BH - 48 ways 0CH - 64 ways 0DH - 96 ways 0EH - 128 ways 0FH - Fully associative ** CPUID leaf 04H provides details of deterministic cache parameters, including the L2 cache in sub-leaf 2 80000007H EAX EBX ECX EDX Reserved = 0. Reserved = 0. Reserved = 0. Bits 07 - 00: Reserved = 0. Bit 08: Invariant TSC available if 1. Bits 31 - 09: Reserved = 0. 80000008H EAX Linear/Physical Address size. Bits 07 - 00: #Physical Address Bits*. Bits 15 - 08: #Linear Address Bits. Bits 31 - 16: Reserved = 0. EBX ECX EDX Reserved = 0. Reserved = 0. Reserved = 0. NOTES: * If CPUID.80000008H:EAX[7:0] is supported, the maximum physical address number supported should come from this field. INPUT EAX = 0: Returns CPUID’s Highest Value for Basic Processor Information and the Vendor Identification String When CPUID executes with EAX set to 0, the processor returns the highest value the CPUID recognizes for returning basic processor information. The value is returned in the EAX register and is processor specific. A vendor identification string is also returned in EBX, EDX, and ECX. For Intel processors, the string is “GenuineIntel” and is expressed: EBX ← 756e6547h (* “Genu”, with G in the low eight bits of BL *) EDX ← 49656e69h (* “ineI”, with i in the low eight bits of DL *) ECX ← 6c65746eh (* “ntel”, with n in the low eight bits of CL *) INPUT EAX = 80000000H: Returns CPUID’s Highest Value for Extended Processor Information When CPUID executes with EAX set to 80000000H, the processor returns the highest value the processor recognizes for returning extended processor information. The value is returned in the EAX register and is processor specific. IA32_BIOS_SIGN_ID Returns Microcode Update Signature For processors that support the microcode update facility, the IA32_BIOS_SIGN_ID MSR is loaded with the update signature whenever CPUID executes. The signature is returned in the upper DWORD. For details, see Chapter 9 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 3-210 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L INPUT EAX = 01H: Returns Model, Family, Stepping Information When CPUID executes with EAX set to 01H, version information is returned in EAX (see Figure 3-6). For example: model, family, and processor type for the Intel Xeon processor 5100 series is as follows: • • • Model — 1111B Family — 0101B Processor Type — 00B See Table 3-9 for available processor type values. Stepping IDs are provided as needed. 31 28 27 20 19 Extended Family ID EAX 16 15 14 13 12 11 Extended Model ID 8 7 Family ID 4 Model 3 0 Stepping ID Extended Family ID (0) Extended Model ID (0) Processor Type Family (0FH for the Pentium 4 Processor Family) Model Reserved OM16525 Figure 3-6. Version Information Returned by CPUID in EAX Table 3-9. Processor Type Field Type Encoding Original OEM Processor 00B ® Intel OverDrive Processor 01B Dual processor (not applicable to Intel486 processors) 10B Intel reserved 11B NOTE See Chapter 19 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for information on identifying earlier IA-32 processors. The Extended Family ID needs to be examined only when the Family ID is 0FH. Integrate the fields into a display using the following rule: IF Family_ID ≠ 0FH THEN DisplayFamily = Family_ID; ELSE DisplayFamily = Extended_Family_ID + Family_ID; (* Right justify and zero-extend 4-bit field. *) FI; (* Show DisplayFamily as HEX field. *) CPUID—CPU Identification Vol. 2A 3-211 INSTRUCTION SET REFERENCE, A-L The Extended Model ID needs to be examined only when the Family ID is 06H or 0FH. Integrate the field into a display using the following rule: IF (Family_ID = 06H or Family_ID = 0FH) THEN DisplayModel = (Extended_Model_ID « 4) + Model_ID; (* Right justify and zero-extend 4-bit field; display Model_ID as HEX field.*) ELSE DisplayModel = Model_ID; FI; (* Show DisplayModel as HEX field. *) INPUT EAX = 01H: Returns Additional Information in EBX When CPUID executes with EAX set to 01H, additional information is returned to the EBX register: • Brand index (low byte of EBX) — this number provides an entry into a brand string table that contains brand strings for IA-32 processors. More information about this field is provided later in this section. • CLFLUSH instruction cache line size (second byte of EBX) — this number indicates the size of the cache line flushed by the CLFLUSH and CLFLUSHOPT instructions in 8-byte increments. This field was introduced in the Pentium 4 processor. • Local APIC ID (high byte of EBX) — this number is the 8-bit ID that is assigned to the local APIC on the processor during power up. This field was introduced in the Pentium 4 processor. INPUT EAX = 01H: Returns Feature Information in ECX and EDX When CPUID executes with EAX set to 01H, feature information is returned in ECX and EDX. • • Figure 3-7 and Table 3-10 show encodings for ECX. Figure 3-8 and Table 3-11 show encodings for EDX. For all feature flags, a 1 indicates that the feature is supported. Use Intel to properly interpret feature flags. NOTE Software must confirm that a processor feature is present using feature flags returned by CPUID prior to using the feature. Software should not depend on future offerings retaining all features. 3-212 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 ECX 3 2 1 0 0 RDRAND F16C AVX OSXSAVE XSAVE AES TSC-Deadline POPCNT MOVBE x2APIC SSE4_2 — SSE4.2 SSE4_1 — SSE4.1 DCA — Direct Cache Access PCID — Process-context Identifiers PDCM — Perf/Debug Capability MSR xTPR Update Control CMPXCHG16B FMA — Fused Multiply Add SDBG CNXT-ID — L1 Context ID SSSE3 — SSSE3 Extensions TM2 — Thermal Monitor 2 EIST — Enhanced Intel SpeedStep® Technology SMX — Safer Mode Extensions VMX — Virtual Machine Extensions DS-CPL — CPL Qualified Debug Store MONITOR — MONITOR/MWAIT DTES64 — 64-bit DS Area PCLMULQDQ — Carryless Multiplication SSE3 — SSE3 Extensions OM16524b Reserved Figure 3-7. Feature Information Returned in the ECX Register Table 3-10. Feature Information Returned in the ECX Register Bit # Mnemonic Description 0 SSE3 Streaming SIMD Extensions 3 (SSE3). A value of 1 indicates the processor supports this technology. 1 PCLMULQDQ PCLMULQDQ. A value of 1 indicates the processor supports the PCLMULQDQ instruction. 2 DTES64 64-bit DS Area. A value of 1 indicates the processor supports DS area using 64-bit layout. 3 MONITOR MONITOR/MWAIT. A value of 1 indicates the processor supports this feature. 4 DS-CPL CPL Qualified Debug Store. A value of 1 indicates the processor supports the extensions to the Debug Store feature to allow for branch message storage qualified by CPL. 5 VMX Virtual Machine Extensions. A value of 1 indicates that the processor supports this technology. 6 SMX Safer Mode Extensions. A value of 1 indicates that the processor supports this technology. See Chapter 6, “Safer Mode Extensions Reference”. 7 EIST Enhanced Intel SpeedStep® technology. A value of 1 indicates that the processor supports this technology. 8 TM2 Thermal Monitor 2. A value of 1 indicates whether the processor supports this technology. 9 SSSE3 A value of 1 indicates the presence of the Supplemental Streaming SIMD Extensions 3 (SSSE3). A value of 0 indicates the instruction extensions are not present in the processor. CPUID—CPU Identification Vol. 2A 3-213 INSTRUCTION SET REFERENCE, A-L Table 3-10. Feature Information Returned in the ECX Register (Contd.) Bit # Mnemonic Description 10 CNXT-ID L1 Context ID. A value of 1 indicates the L1 data cache mode can be set to either adaptive mode or shared mode. A value of 0 indicates this feature is not supported. See definition of the IA32_MISC_ENABLE MSR Bit 24 (L1 Data Cache Context Mode) for details. 11 SDBG A value of 1 indicates the processor supports IA32_DEBUG_INTERFACE MSR for silicon debug. 12 FMA A value of 1 indicates the processor supports FMA extensions using YMM state. 13 CMPXCHG16B CMPXCHG16B Available. A value of 1 indicates that the feature is available. See the “CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes” section in this chapter for a description. 14 xTPR Update Control xTPR Update Control. A value of 1 indicates that the processor supports changing IA32_MISC_ENABLE[bit 23]. 15 PDCM Perfmon and Debug Capability: A value of 1 indicates the processor supports the performance and debug feature indication MSR IA32_PERF_CAPABILITIES. 16 Reserved Reserved 17 PCID Process-context identifiers. A value of 1 indicates that the processor supports PCIDs and that software may set CR4.PCIDE to 1. 18 DCA A value of 1 indicates the processor supports the ability to prefetch data from a memory mapped device. 19 SSE4.1 A value of 1 indicates that the processor supports SSE4.1. 20 SSE4.2 A value of 1 indicates that the processor supports SSE4.2. 21 x2APIC A value of 1 indicates that the processor supports x2APIC feature. 22 MOVBE A value of 1 indicates that the processor supports MOVBE instruction. 23 POPCNT A value of 1 indicates that the processor supports the POPCNT instruction. 24 TSC-Deadline A value of 1 indicates that the processor’s local APIC timer supports one-shot operation using a TSC deadline value. 25 AESNI A value of 1 indicates that the processor supports the AESNI instruction extensions. 26 XSAVE A value of 1 indicates that the processor supports the XSAVE/XRSTOR processor extended states feature, the XSETBV/XGETBV instructions, and XCR0. 27 OSXSAVE A value of 1 indicates that the OS has set CR4.OSXSAVE[bit 18] to enable XSETBV/XGETBV instructions to access XCR0 and to support processor extended state management using XSAVE/XRSTOR. 28 AVX A value of 1 indicates the processor supports the AVX instruction extensions. 29 F16C A value of 1 indicates that processor supports 16-bit floating-point conversion instructions. 30 RDRAND A value of 1 indicates that processor supports RDRAND instruction. 31 Not Used Always returns 0. 3-214 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 EDX PBE–Pend. Brk. EN. TM–Therm. Monitor HTT–Multi-threading SS–Self Snoop SSE2–SSE2 Extensions SSE–SSE Extensions FXSR–FXSAVE/FXRSTOR MMX–MMX Technology ACPI–Thermal Monitor and Clock Ctrl DS–Debug Store CLFSH–CLFLUSH instruction PSN–Processor Serial Number PSE-36 – Page Size Extension PAT–Page Attribute Table CMOV–Conditional Move/Compare Instruction MCA–Machine Check Architecture PGE–PTE Global Bit MTRR–Memory Type Range Registers SEP–SYSENTER and SYSEXIT APIC–APIC on Chip CX8–CMPXCHG8B Inst. MCE–Machine Check Exception PAE–Physical Address Extensions MSR–RDMSR and WRMSR Support TSC–Time Stamp Counter PSE–Page Size Extensions DE–Debugging Extensions VME–Virtual-8086 Mode Enhancement FPU–x87 FPU on Chip Reserved OM16523 Figure 3-8. Feature Information Returned in the EDX Register CPUID—CPU Identification Vol. 2A 3-215 INSTRUCTION SET REFERENCE, A-L Table 3-11. More on Feature Information Returned in the EDX Register Bit # Mnemonic Description 0 FPU Floating Point Unit On-Chip. The processor contains an x87 FPU. 1 VME Virtual 8086 Mode Enhancements. Virtual 8086 mode enhancements, including CR4.VME for controlling the feature, CR4.PVI for protected mode virtual interrupts, software interrupt indirection, expansion of the TSS with the software indirection bitmap, and EFLAGS.VIF and EFLAGS.VIP flags. 2 DE Debugging Extensions. Support for I/O breakpoints, including CR4.DE for controlling the feature, and optional trapping of accesses to DR4 and DR5. 3 PSE Page Size Extension. Large pages of size 4 MByte are supported, including CR4.PSE for controlling the feature, the defined dirty bit in PDE (Page Directory Entries), optional reserved bit trapping in CR3, PDEs, and PTEs. 4 TSC Time Stamp Counter. The RDTSC instruction is supported, including CR4.TSD for controlling privilege. 5 MSR Model Specific Registers RDMSR and WRMSR Instructions. The RDMSR and WRMSR instructions are supported. Some of the MSRs are implementation dependent. 6 PAE Physical Address Extension. Physical addresses greater than 32 bits are supported: extended page table entry formats, an extra level in the page translation tables is defined, 2-MByte pages are supported instead of 4 Mbyte pages if PAE bit is 1. 7 MCE Machine Check Exception. Exception 18 is defined for Machine Checks, including CR4.MCE for controlling the feature. This feature does not define the model-specific implementations of machine-check error logging, reporting, and processor shutdowns. Machine Check exception handlers may have to depend on processor version to do model specific processing of the exception, or test for the presence of the Machine Check feature. 8 CX8 CMPXCHG8B Instruction. The compare-and-exchange 8 bytes (64 bits) instruction is supported (implicitly locked and atomic). 9 APIC APIC On-Chip. The processor contains an Advanced Programmable Interrupt Controller (APIC), responding to memory mapped commands in the physical address range FFFE0000H to FFFE0FFFH (by default - some processors permit the APIC to be relocated). 10 Reserved Reserved 11 SEP SYSENTER and SYSEXIT Instructions. The SYSENTER and SYSEXIT and associated MSRs are supported. 12 MTRR Memory Type Range Registers. MTRRs are supported. The MTRRcap MSR contains feature bits that describe what memory types are supported, how many variable MTRRs are supported, and whether fixed MTRRs are supported. 13 PGE Page Global Bit. The global bit is supported in paging-structure entries that map a page, indicating TLB entries that are common to different processes and need not be flushed. The CR4.PGE bit controls this feature. 14 MCA Machine Check Architecture. A value of 1 indicates the Machine Check Architecture of reporting machine errors is supported. The MCG_CAP MSR contains feature bits describing how many banks of error reporting MSRs are supported. 15 CMOV Conditional Move Instructions. The conditional move instruction CMOV is supported. In addition, if x87 FPU is present as indicated by the CPUID.FPU feature bit, then the FCOMI and FCMOV instructions are supported 16 PAT Page Attribute Table. Page Attribute Table is supported. This feature augments the Memory Type Range Registers (MTRRs), allowing an operating system to specify attributes of memory accessed through a linear address on a 4KB granularity. 17 PSE-36 36-Bit Page Size Extension. 4-MByte pages addressing physical memory beyond 4 GBytes are supported with 32-bit paging. This feature indicates that upper bits of the physical address of a 4-MByte page are encoded in bits 20:13 of the page-directory entry. Such physical addresses are limited by MAXPHYADDR and may be up to 40 bits in size. 18 PSN Processor Serial Number. The processor supports the 96-bit processor identification number feature and the feature is enabled. 19 CLFSH CLFLUSH Instruction. CLFLUSH Instruction is supported. 20 Reserved Reserved 3-216 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-11. More on Feature Information Returned in the EDX Register (Contd.) Bit # Mnemonic Description 21 DS Debug Store. The processor supports the ability to write debug information into a memory resident buffer. This feature is used by the branch trace store (BTS) and processor event-based sampling (PEBS) facilities (see Chapter 23, “Introduction to Virtual-Machine Extensions,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C). 22 ACPI Thermal Monitor and Software Controlled Clock Facilities. The processor implements internal MSRs that allow processor temperature to be monitored and processor performance to be modulated in predefined duty cycles under software control. 23 MMX Intel MMX Technology. The processor supports the Intel MMX technology. 24 FXSR FXSAVE and FXRSTOR Instructions. The FXSAVE and FXRSTOR instructions are supported for fast save and restore of the floating point context. Presence of this bit also indicates that CR4.OSFXSR is available for an operating system to indicate that it supports the FXSAVE and FXRSTOR instructions. 25 SSE SSE. The processor supports the SSE extensions. 26 SSE2 SSE2. The processor supports the SSE2 extensions. 27 SS Self Snoop. The processor supports the management of conflicting memory types by performing a snoop of its own cache structure for transactions issued to the bus. 28 HTT Max APIC IDs reserved field is Valid. A value of 0 for HTT indicates there is only a single logical processor in the package and software should assume only a single APIC ID is reserved. A value of 1 for HTT indicates the value in CPUID.1.EBX[23:16] (the Maximum number of addressable IDs for logical processors in this package) is valid for the package. 29 TM Thermal Monitor. The processor implements the thermal monitor automatic thermal control circuitry (TCC). 30 Reserved Reserved 31 PBE Pending Break Enable. The processor supports the use of the FERR#/PBE# pin when the processor is in the stop-clock state (STPCLK# is asserted) to signal the processor that an interrupt is pending and that the processor should return to normal operation to handle the interrupt. Bit 10 (PBE enable) in the IA32_MISC_ENABLE MSR enables this capability. INPUT EAX = 02H: TLB/Cache/Prefetch Information Returned in EAX, EBX, ECX, EDX When CPUID executes with EAX set to 02H, the processor returns information about the processor’s internal TLBs, cache and prefetch hardware in the EAX, EBX, ECX, and EDX registers. The information is reported in encoded form and fall into the following categories: • The least-significant byte in register EAX (register AL) will always return 01H. Software should ignore this value and not interpret it as an informational descriptor. • The most significant bit (bit 31) of each register indicates whether the register contains valid information (set to 0) or is reserved (set to 1). • If a register contains valid information, the information is contained in 1 byte descriptors. There are four types of encoding values for the byte descriptor, the encoding type is noted in the second column of Table 3-12. Table 3-12 lists the encoding of these descriptors. Note that the order of descriptors in the EAX, EBX, ECX, and EDX registers is not defined; that is, specific bytes are not designated to contain descriptors for specific cache, prefetch, or TLB types. The descriptors may appear in any order. Note also a processor may report a general descriptor type (FFH) and not report any byte descriptor of “cache type” via CPUID leaf 2. CPUID—CPU Identification Vol. 2A 3-217 INSTRUCTION SET REFERENCE, A-L Table 3-12. Encoding of CPUID Leaf 2 Descriptors Value Type 00H General Description Null descriptor, this byte contains no information 01H TLB Instruction TLB: 4 KByte pages, 4-way set associative, 32 entries 02H TLB Instruction TLB: 4 MByte pages, fully associative, 2 entries 03H TLB Data TLB: 4 KByte pages, 4-way set associative, 64 entries 04H TLB Data TLB: 4 MByte pages, 4-way set associative, 8 entries 05H TLB 06H Cache 1st-level instruction cache: 8 KBytes, 4-way set associative, 32 byte line size Data TLB1: 4 MByte pages, 4-way set associative, 32 entries 08H Cache 1st-level instruction cache: 16 KBytes, 4-way set associative, 32 byte line size 09H Cache 1st-level instruction cache: 32KBytes, 4-way set associative, 64 byte line size 0AH Cache 1st-level data cache: 8 KBytes, 2-way set associative, 32 byte line size 0BH TLB 0CH Cache 1st-level data cache: 16 KBytes, 4-way set associative, 32 byte line size 0DH Cache 1st-level data cache: 16 KBytes, 4-way set associative, 64 byte line size Instruction TLB: 4 MByte pages, 4-way set associative, 4 entries 0EH Cache 1st-level data cache: 24 KBytes, 6-way set associative, 64 byte line size 1DH Cache 2nd-level cache: 128 KBytes, 2-way set associative, 64 byte line size 21H Cache 2nd-level cache: 256 KBytes, 8-way set associative, 64 byte line size 22H Cache 3rd-level cache: 512 KBytes, 4-way set associative, 64 byte line size, 2 lines per sector 23H Cache 3rd-level cache: 1 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector 24H Cache 2nd-level cache: 1 MBytes, 16-way set associative, 64 byte line size 25H Cache 3rd-level cache: 2 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector 29H Cache 3rd-level cache: 4 MBytes, 8-way set associative, 64 byte line size, 2 lines per sector 2CH Cache 1st-level data cache: 32 KBytes, 8-way set associative, 64 byte line size 30H Cache 1st-level instruction cache: 32 KBytes, 8-way set associative, 64 byte line size 40H Cache No 2nd-level cache or, if processor contains a valid 2nd-level cache, no 3rd-level cache 41H Cache 2nd-level cache: 128 KBytes, 4-way set associative, 32 byte line size 42H Cache 2nd-level cache: 256 KBytes, 4-way set associative, 32 byte line size 43H Cache 2nd-level cache: 512 KBytes, 4-way set associative, 32 byte line size 44H Cache 2nd-level cache: 1 MByte, 4-way set associative, 32 byte line size 45H Cache 2nd-level cache: 2 MByte, 4-way set associative, 32 byte line size 46H Cache 3rd-level cache: 4 MByte, 4-way set associative, 64 byte line size 47H Cache 3rd-level cache: 8 MByte, 8-way set associative, 64 byte line size 48H Cache 2nd-level cache: 3MByte, 12-way set associative, 64 byte line size 49H Cache 3rd-level cache: 4MB, 16-way set associative, 64-byte line size (Intel Xeon processor MP, Family 0FH, Model 06H); 2nd-level cache: 4 MByte, 16-way set associative, 64 byte line size 4AH Cache 3rd-level cache: 6MByte, 12-way set associative, 64 byte line size 4BH Cache 3rd-level cache: 8MByte, 16-way set associative, 64 byte line size 4CH Cache 3rd-level cache: 12MByte, 12-way set associative, 64 byte line size 4DH Cache 3rd-level cache: 16MByte, 16-way set associative, 64 byte line size 4EH Cache 2nd-level cache: 6MByte, 24-way set associative, 64 byte line size 4FH TLB 3-218 Vol. 2A Instruction TLB: 4 KByte pages, 32 entries CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-12. Encoding of CPUID Leaf 2 Descriptors (Contd.) Value Type Description 50H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 64 entries 51H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 128 entries 52H TLB Instruction TLB: 4 KByte and 2-MByte or 4-MByte pages, 256 entries 55H TLB Instruction TLB: 2-MByte or 4-MByte pages, fully associative, 7 entries 56H TLB Data TLB0: 4 MByte pages, 4-way set associative, 16 entries 57H TLB Data TLB0: 4 KByte pages, 4-way associative, 16 entries 59H TLB Data TLB0: 4 KByte pages, fully associative, 16 entries 5AH TLB Data TLB0: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries 5BH TLB Data TLB: 4 KByte and 4 MByte pages, 64 entries 5CH TLB Data TLB: 4 KByte and 4 MByte pages,128 entries 5DH TLB Data TLB: 4 KByte and 4 MByte pages,256 entries 60H Cache 61H TLB Instruction TLB: 4 KByte pages, fully associative, 48 entries 63H TLB Data TLB: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries and a separate array with 1 GByte pages, 4-way set associative, 4 entries 64H TLB Data TLB: 4 KByte pages, 4-way set associative, 512 entries 66H Cache 1st-level data cache: 8 KByte, 4-way set associative, 64 byte line size 67H Cache 1st-level data cache: 16 KByte, 4-way set associative, 64 byte line size 68H Cache 1st-level data cache: 32 KByte, 4-way set associative, 64 byte line size 6AH Cache uTLB: 4 KByte pages, 8-way set associative, 64 entries 6BH Cache DTLB: 4 KByte pages, 8-way set associative, 256 entries 1st-level data cache: 16 KByte, 8-way set associative, 64 byte line size 6CH Cache DTLB: 2M/4M pages, 8-way set associative, 128 entries 6DH Cache DTLB: 1 GByte pages, fully associative, 16 entries 70H Cache Trace cache: 12 K-μop, 8-way set associative 71H Cache Trace cache: 16 K-μop, 8-way set associative 72H Cache Trace cache: 32 K-μop, 8-way set associative 76H TLB 78H Cache 2nd-level cache: 1 MByte, 4-way set associative, 64byte line size 79H Cache 2nd-level cache: 128 KByte, 8-way set associative, 64 byte line size, 2 lines per sector Instruction TLB: 2M/4M pages, fully associative, 8 entries 7AH Cache 2nd-level cache: 256 KByte, 8-way set associative, 64 byte line size, 2 lines per sector 7BH Cache 2nd-level cache: 512 KByte, 8-way set associative, 64 byte line size, 2 lines per sector 7CH Cache 2nd-level cache: 1 MByte, 8-way set associative, 64 byte line size, 2 lines per sector 7DH Cache 2nd-level cache: 2 MByte, 8-way set associative, 64byte line size 7FH Cache 2nd-level cache: 512 KByte, 2-way set associative, 64-byte line size 80H Cache 2nd-level cache: 512 KByte, 8-way set associative, 64-byte line size 82H Cache 2nd-level cache: 256 KByte, 8-way set associative, 32 byte line size 83H Cache 2nd-level cache: 512 KByte, 8-way set associative, 32 byte line size 84H Cache 2nd-level cache: 1 MByte, 8-way set associative, 32 byte line size 85H Cache 2nd-level cache: 2 MByte, 8-way set associative, 32 byte line size 86H Cache 2nd-level cache: 512 KByte, 4-way set associative, 64 byte line size 87H Cache 2nd-level cache: 1 MByte, 8-way set associative, 64 byte line size CPUID—CPU Identification Vol. 2A 3-219 INSTRUCTION SET REFERENCE, A-L Table 3-12. Encoding of CPUID Leaf 2 Descriptors (Contd.) Value Type Description A0H DTLB B0H TLB Instruction TLB: 4 KByte pages, 4-way set associative, 128 entries B1H TLB Instruction TLB: 2M pages, 4-way, 8 entries or 4M pages, 4-way, 4 entries B2H TLB Instruction TLB: 4KByte pages, 4-way set associative, 64 entries B3H TLB Data TLB: 4 KByte pages, 4-way set associative, 128 entries B4H TLB Data TLB1: 4 KByte pages, 4-way associative, 256 entries B5H TLB Instruction TLB: 4KByte pages, 8-way set associative, 64 entries B6H TLB Instruction TLB: 4KByte pages, 8-way set associative, 128 entries BAH TLB Data TLB1: 4 KByte pages, 4-way associative, 64 entries DTLB: 4k pages, fully associative, 32 entries C0H TLB Data TLB: 4 KByte and 4 MByte pages, 4-way associative, 8 entries C1H STLB Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries C2H DTLB DTLB: 4 KByte/2 MByte pages, 4-way associative, 16 entries C3H STLB Shared 2nd-Level TLB: 4 KByte /2 MByte pages, 6-way associative, 1536 entries. Also 1GBbyte pages, 4-way, 16 entries. C4H DTLB DTLB: 2M/4M Byte pages, 4-way associative, 32 entries CAH STLB Shared 2nd-Level TLB: 4 KByte pages, 4-way associative, 512 entries D0H Cache 3rd-level cache: 512 KByte, 4-way set associative, 64 byte line size D1H Cache 3rd-level cache: 1 MByte, 4-way set associative, 64 byte line size D2H Cache 3rd-level cache: 2 MByte, 4-way set associative, 64 byte line size D6H Cache 3rd-level cache: 1 MByte, 8-way set associative, 64 byte line size D7H Cache 3rd-level cache: 2 MByte, 8-way set associative, 64 byte line size D8H Cache 3rd-level cache: 4 MByte, 8-way set associative, 64 byte line size DCH Cache 3rd-level cache: 1.5 MByte, 12-way set associative, 64 byte line size DDH Cache 3rd-level cache: 3 MByte, 12-way set associative, 64 byte line size DEH Cache 3rd-level cache: 6 MByte, 12-way set associative, 64 byte line size E2H Cache 3rd-level cache: 2 MByte, 16-way set associative, 64 byte line size E3H Cache 3rd-level cache: 4 MByte, 16-way set associative, 64 byte line size E4H Cache 3rd-level cache: 8 MByte, 16-way set associative, 64 byte line size EAH Cache 3rd-level cache: 12MByte, 24-way set associative, 64 byte line size EBH Cache 3rd-level cache: 18MByte, 24-way set associative, 64 byte line size ECH Cache 3rd-level cache: 24MByte, 24-way set associative, 64 byte line size F0H Prefetch 64-Byte prefetching F1H Prefetch 128-Byte prefetching FEH General CPUID leaf 2 does not report TLB descriptor information; use CPUID leaf 18H to query TLB and other address translation parameters. FFH General CPUID leaf 2 does not report cache descriptor information, use CPUID leaf 4 to query cache parameters 3-220 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Example 3-1. Example of Cache and TLB Interpretation The first member of the family of Pentium 4 processors returns the following information about caches and TLBs when the CPUID executes with an input value of 2: EAX EBX ECX EDX 66 5B 50 01H 0H 0H 00 7A 70 00H Which means: • • The least-significant byte (byte 0) of register EAX is set to 01H. This value should be ignored. • Bytes 1, 2, and 3 of register EAX indicate that the processor has: The most-significant bit of all four registers (EAX, EBX, ECX, and EDX) is set to 0, indicating that each register contains valid 1-byte descriptors. — 50H - a 64-entry instruction TLB, for mapping 4-KByte and 2-MByte or 4-MByte pages. — 5BH - a 64-entry data TLB, for mapping 4-KByte and 4-MByte pages. — 66H - an 8-KByte 1st level data cache, 4-way set associative, with a 64-Byte cache line size. • • The descriptors in registers EBX and ECX are valid, but contain NULL descriptors. Bytes 0, 1, 2, and 3 of register EDX indicate that the processor has: — 00H - NULL descriptor. — 70H - Trace cache: 12 K-μop, 8-way set associative. — 7AH - a 256-KByte 2nd level cache, 8-way set associative, with a sectored, 64-byte cache line size. — 00H - NULL descriptor. INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level When CPUID executes with EAX set to 04H and ECX contains an index value, the processor returns encoded data that describe a set of deterministic cache parameters (for the cache level associated with the input in ECX). Valid index values start from 0. Software can enumerate the deterministic cache parameters for each level of the cache hierarchy starting with an index value of 0, until the parameters report the value associated with the cache type field is 0. The architecturally defined fields reported by deterministic cache parameters are documented in Table 3-8. This Cache Size in Bytes = (Ways + 1) * (Partitions + 1) * (Line_Size + 1) * (Sets + 1) = (EBX[31:22] + 1) * (EBX[21:12] + 1) * (EBX[11:0] + 1) * (ECX + 1) The CPUID leaf 04H also reports data that can be used to derive the topology of processor cores in a physical package. This information is constant for all valid index values. Software can query the raw data reported by executing CPUID with EAX=04H and ECX=0 and use it as part of the topology enumeration algorithm described in Chapter 8, “Multiple-Processor Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. INPUT EAX = 05H: Returns MONITOR and MWAIT Features When CPUID executes with EAX set to 05H, the processor returns information about features available to MONITOR/MWAIT instructions. The MONITOR instruction is used for address-range monitoring in conjunction with MWAIT instruction. The MWAIT instruction optionally provides additional extensions for advanced power management. See Table 3-8. INPUT EAX = 06H: Returns Thermal and Power Management Features When CPUID executes with EAX set to 06H, the processor returns information about thermal and power management features. See Table 3-8. CPUID—CPU Identification Vol. 2A 3-221 INSTRUCTION SET REFERENCE, A-L INPUT EAX = 07H: Returns Structured Extended Feature Enumeration Information When CPUID executes with EAX set to 07H and ECX = 0, the processor returns information about the maximum input value for sub-leaves that contain extended feature flags. See Table 3-8. When CPUID executes with EAX set to 07H and the input value of ECX is invalid (see leaf 07H entry in Table 3-8), the processor returns 0 in EAX/EBX/ECX/EDX. In subleaf 0, EAX returns the maximum input value of the highest leaf 7 sub-leaf, and EBX, ECX & EDX contain information of extended feature flags. INPUT EAX = 09H: Returns Direct Cache Access Information When CPUID executes with EAX set to 09H, the processor returns information about Direct Cache Access capabilities. See Table 3-8. INPUT EAX = 0AH: Returns Architectural Performance Monitoring Features When CPUID executes with EAX set to 0AH, the processor returns information about support for architectural performance monitoring capabilities. Architectural performance monitoring is supported if the version ID (see Table 3-8) is greater than Pn 0. See Table 3-8. For each version of architectural performance monitoring capability, software must enumerate this leaf to discover the programming facilities and the architectural performance events available in the processor. The details are described in Chapter 23, “Introduction to Virtual-Machine Extensions,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. INPUT EAX = 0BH: Returns Extended Topology Information CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of Leaf 1FH before using leaf 0BH. When CPUID executes with EAX set to 0BH, the processor returns information about extended topology enumeration data. Software must detect the presence of CPUID leaf 0BH by verifying (a) the highest leaf index supported by CPUID is >= 0BH, and (b) CPUID.0BH:EBX[15:0] reports a non-zero value. See Table 3-8. INPUT EAX = 0DH: Returns Processor Extended States Enumeration Information When CPUID executes with EAX set to 0DH and ECX = 0, the processor returns information about the bit-vector representation of all processor state extensions that are supported in the processor and storage size requirements of the XSAVE/XRSTOR area. See Table 3-8. When CPUID executes with EAX set to 0DH and ECX = n (n > 1, and is a valid sub-leaf index), the processor returns information about the size and offset of each processor extended state save area within the XSAVE/XRSTOR area. See Table 3-8. Software can use the forward-extendable technique depicted below to query the valid sub-leaves and obtain size and offset information for each processor extended state save area: For i = 2 to 62 // sub-leaf 1 is reserved IF (CPUID.(EAX=0DH, ECX=0):VECTOR[i] = 1 ) // VECTOR is the 64-bit value of EDX:EAX Execute CPUID.(EAX=0DH, ECX = i) to examine size and offset for sub-leaf i; FI; INPUT EAX = 0FH: Returns Intel Resource Director Technology (Intel RDT) Monitoring Enumeration Information When CPUID executes with EAX set to 0FH and ECX = 0, the processor returns information about the bit-vector representation of QoS monitoring resource types that are supported in the processor and maximum range of RMID values the processor can use to monitor of any supported resource types. Each bit, starting from bit 1, corresponds to a specific resource type if the bit is set. The bit position corresponds to the sub-leaf index (or ResID) that software must use to query QoS monitoring capability available for that type. See Table 3-8. When CPUID executes with EAX set to 0FH and ECX = n (n >= 1, and is a valid ResID), the processor returns information software can use to program IA32_PQR_ASSOC, IA32_QM_EVTSEL MSRs before reading QoS data from the IA32_QM_CTR MSR. 3-222 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L INPUT EAX = 10H: Returns Intel Resource Director Technology (Intel RDT) Allocation Enumeration Information When CPUID executes with EAX set to 10H and ECX = 0, the processor returns information about the bit-vector representation of QoS Enforcement resource types that are supported in the processor. Each bit, starting from bit 1, corresponds to a specific resource type if the bit is set. The bit position corresponds to the sub-leaf index (or ResID) that software must use to query QoS enforcement capability available for that type. See Table 3-8. When CPUID executes with EAX set to 10H and ECX = n (n >= 1, and is a valid ResID), the processor returns information about available classes of service and range of QoS mask MSRs that software can use to configure each class of services using capability bit masks in the QoS Mask registers, IA32_resourceType_Mask_n. INPUT EAX = 12H: Returns Intel SGX Enumeration Information When CPUID executes with EAX set to 12H and ECX = 0H, the processor returns information about Intel SGX capabilities. See Table 3-8. When CPUID executes with EAX set to 12H and ECX = 1H, the processor returns information about Intel SGX attributes. See Table 3-8. When CPUID executes with EAX set to 12H and ECX = n (n > 1), the processor returns information about Intel SGX Enclave Page Cache. See Table 3-8. INPUT EAX = 14H: Returns Intel Processor Trace Enumeration Information When CPUID executes with EAX set to 14H and ECX = 0H, the processor returns information about Intel Processor Trace extensions. See Table 3-8. When CPUID executes with EAX set to 14H and ECX = n (n > 0 and less than the number of non-zero bits in CPUID.(EAX=14H, ECX= 0H).EAX), the processor returns information about packet generation in Intel Processor Trace. See Table 3-8. INPUT EAX = 15H: Returns Time Stamp Counter and Nominal Core Crystal Clock Information When CPUID executes with EAX set to 15H and ECX = 0H, the processor returns information about Time Stamp Counter and Core Crystal Clock. See Table 3-8. INPUT EAX = 16H: Returns Processor Frequency Information When CPUID executes with EAX set to 16H, the processor returns information about Processor Frequency Information. See Table 3-8. INPUT EAX = 17H: Returns System-On-Chip Information When CPUID executes with EAX set to 17H, the processor returns information about the System-On-Chip Vendor Attribute Enumeration. See Table 3-8. INPUT EAX = 18H: Returns Deterministic Address Translation Parameters Information When CPUID executes with EAX set to 18H, the processor returns information about the Deterministic Address Translation Parameters. See Table 3-8. INPUT EAX = 1FH: Returns V2 Extended Topology Information When CPUID executes with EAX set to 1FH, the processor returns information about extended topology enumeration data. Software must detect the presence of CPUID leaf 1FH by verifying (a) the highest leaf index supported by CPUID is >= 1FH, and (b) CPUID.1FH:EBX[15:0] reports a non-zero value. See Table 3-8. METHODS FOR RETURNING BRANDING INFORMATION Use the following techniques to access branding information: 1. Processor brand string method. 2. Processor brand index; this method uses a software supplied brand string table. CPUID—CPU Identification Vol. 2A 3-223 INSTRUCTION SET REFERENCE, A-L These two methods are discussed in the following sections. For methods that are available in early processors, see Section: “Identification of Earlier IA-32 Processors” in Chapter 19 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. The Processor Brand String Method Figure 3-9 describes the algorithm used for detection of the brand string. Processor brand identification software should execute this algorithm on all Intel 64 and IA-32 processors. This method (introduced with Pentium 4 processors) returns an ASCII brand identification string and the Processor Base frequency of the processor to the EAX, EBX, ECX, and EDX registers. Input: EAX= 0x80000000 CPUID IF (EAX & 0x80000000) CPUID Function Supported False Processor Brand String Not Supported True Processor Brand String Supported True ≥ Extended EAX Return Value = Max. Extended CPUID Function Index IF (EAX Return Value ≥ 0x80000004) OM15194 Figure 3-9. Determination of Support for the Processor Brand String How Brand Strings Work To use the brand string method, execute CPUID with EAX input of 8000002H through 80000004H. For each input value, CPUID returns 16 ASCII characters using EAX, EBX, ECX, and EDX. The returned string will be NULL-terminated. 3-224 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Table 3-13 shows the brand string that is returned by the first processor in the Pentium 4 processor family. Table 3-13. Processor Brand String Returned with Pentium 4 Processor EAX Input Value 80000002H 80000003H 80000004H Return Values ASCII Equivalent EAX = 20202020H “ ” EBX = 20202020H “ ” ECX = 20202020H “ ” EDX = 6E492020H “nI ” EAX = 286C6574H “(let” EBX = 50202952H “P )R” ECX = 69746E65H “itne” EDX = 52286D75H “R(mu” EAX = 20342029H “ 4 )” EBX = 20555043H “ UPC” ECX = 30303531H “0051” EDX = 007A484DH “\0zHM” Extracting the Processor Frequency from Brand Strings Figure 3-10 provides an algorithm which software can use to extract the Processor Base frequency from the processor brand string. Scan "Brand String" in Reverse Byte Order "zHM", or "zHG", or "zHT" Match Substring IF Substring Matched Determine "Freq" and "Multiplier" True False If "zHM" If "zHG" Determine "Multiplier" Determine "Freq" Processor Base Frequency = "Freq" x "Multiplier" If "zHT" Scan Digits Until Blank In Reverse Order Report Error Multiplier = 1 x 106 Multiplier = 1 x 109 Multiplier = 1 x 1012 Reverse Digits To Decimal Value "Freq" = X.YZ if Digits = "ZY.X" OM15195 Figure 3-10. Algorithm for Extracting Processor Frequency CPUID—CPU Identification Vol. 2A 3-225 INSTRUCTION SET REFERENCE, A-L The Processor Brand Index Method The brand index method (introduced with Pentium® III Xeon® processors) provides an entry point into a brand identification table that is maintained in memory by system software and is accessible from system- and user-level code. In this table, each brand index is associate with an ASCII brand identification string that identifies the official Intel family and model number of a processor. When CPUID executes with EAX set to 1, the processor returns a brand index to the low byte in EBX. Software can then use this index to locate the brand identification string for the processor in the brand identification table. The first entry (brand index 0) in this table is reserved, allowing for backward compatibility with processors that do not support the brand identification feature. Starting with processor signature family ID = 0FH, model = 03H, brand index method is no longer supported. Use brand string method instead. Table 3-14 shows brand indices that have identification strings associated with them. Table 3-14. Mapping of Brand Indices; and Intel 64 and IA-32 Processor Brand Strings Brand Index Brand String 00H This processor does not support the brand identification feature 01H Intel(R) Celeron(R) processor1 02H Intel(R) Pentium(R) III processor1 03H Intel(R) Pentium(R) III Xeon(R) processor; If processor signature = 000006B1h, then Intel(R) Celeron(R) processor 04H Intel(R) Pentium(R) III processor 06H Mobile Intel(R) Pentium(R) III processor-M 07H Mobile Intel(R) Celeron(R) processor1 08H Intel(R) Pentium(R) 4 processor 09H Intel(R) Pentium(R) 4 processor 0AH Intel(R) Celeron(R) processor1 0BH Intel(R) Xeon(R) processor; If processor signature = 00000F13h, then Intel(R) Xeon(R) processor MP 0CH Intel(R) Xeon(R) processor MP 0EH Mobile Intel(R) Pentium(R) 4 processor-M; If processor signature = 00000F13h, then Intel(R) Xeon(R) processor 0FH Mobile Intel(R) Celeron(R) processor1 11H Mobile Genuine Intel(R) processor 12H Intel(R) Celeron(R) M processor 13H Mobile Intel(R) Celeron(R) processor1 14H Intel(R) Celeron(R) processor 15H Mobile Genuine Intel(R) processor 16H Intel(R) Pentium(R) M processor 17H Mobile Intel(R) Celeron(R) processor1 18H – 0FFH RESERVED NOTES: 1. Indicates versions of these processors that were introduced after the Pentium III IA-32 Architecture Compatibility CPUID is not supported in early models of the Intel486 processor or in any IA-32 processor earlier than the Intel486 processor. 3-226 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L Operation IA32_BIOS_SIGN_ID MSR ← Update with installed microcode revision number; CASE (EAX) OF EAX = 0: EAX ← Highest basic function input value understood by CPUID; EBX ← Vendor identification string; EDX ← Vendor identification string; ECX ← Vendor identification string; BREAK; EAX = 1H: EAX[3:0] ← Stepping ID; EAX[7:4] ← Model; EAX[11:8] ← Family; EAX[13:12] ← Processor type; EAX[15:14] ← Reserved; EAX[19:16] ← Extended Model; EAX[27:20] ← Extended Family; EAX[31:28] ← Reserved; EBX[7:0] ← Brand Index; (* Reserved if the value is zero. *) EBX[15:8] ← CLFLUSH Line Size; EBX[16:23] ← Reserved; (* Number of threads enabled = 2 if MT enable fuse set. *) EBX[24:31] ← Initial APIC ID; ECX ← Feature flags; (* See Figure 3-7. *) EDX ← Feature flags; (* See Figure 3-8. *) BREAK; EAX = 2H: EAX ← Cache and TLB information; EBX ← Cache and TLB information; ECX ← Cache and TLB information; EDX ← Cache and TLB information; BREAK; EAX = 3H: EAX ← Reserved; EBX ← Reserved; ECX ← ProcessorSerialNumber[31:0]; (* Pentium III processors only, otherwise reserved. *) EDX ← ProcessorSerialNumber[63:32]; (* Pentium III processors only, otherwise reserved. * BREAK EAX = 4H: EAX ← Deterministic Cache Parameters Leaf; (* See Table 3-8. *) EBX ← Deterministic Cache Parameters Leaf; ECX ← Deterministic Cache Parameters Leaf; EDX ← Deterministic Cache Parameters Leaf; BREAK; EAX = 5H: EAX ← MONITOR/MWAIT Leaf; (* See Table 3-8. *) EBX ← MONITOR/MWAIT Leaf; ECX ← MONITOR/MWAIT Leaf; EDX ← MONITOR/MWAIT Leaf; BREAK; CPUID—CPU Identification Vol. 2A 3-227 INSTRUCTION SET REFERENCE, A-L EAX = 6H: EAX ← Thermal and Power Management Leaf; (* See Table 3-8. *) EBX ← Thermal and Power Management Leaf; ECX ← Thermal and Power Management Leaf; EDX ← Thermal and Power Management Leaf; BREAK; EAX = 7H: EAX ← Structured Extended Feature Flags Enumeration Leaf; (* See Table 3-8. *) EBX ← Structured Extended Feature Flags Enumeration Leaf; ECX ← Structured Extended Feature Flags Enumeration Leaf; EDX ← Structured Extended Feature Flags Enumeration Leaf; BREAK; EAX = 8H: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Reserved = 0; EDX ← Reserved = 0; BREAK; EAX = 9H: EAX ← Direct Cache Access Information Leaf; (* See Table 3-8. *) EBX ← Direct Cache Access Information Leaf; ECX ← Direct Cache Access Information Leaf; EDX ← Direct Cache Access Information Leaf; BREAK; EAX = AH: EAX ← Architectural Performance Monitoring Leaf; (* See Table 3-8. *) EBX ← Architectural Performance Monitoring Leaf; ECX ← Architectural Performance Monitoring Leaf; EDX ← Architectural Performance Monitoring Leaf; BREAK EAX = BH: EAX ← Extended Topology Enumeration Leaf; (* See Table 3-8. *) EBX ← Extended Topology Enumeration Leaf; ECX ← Extended Topology Enumeration Leaf; EDX ← Extended Topology Enumeration Leaf; BREAK; EAX = CH: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Reserved = 0; EDX ← Reserved = 0; BREAK; EAX = DH: EAX ← Processor Extended State Enumeration Leaf; (* See Table 3-8. *) EBX ← Processor Extended State Enumeration Leaf; ECX ← Processor Extended State Enumeration Leaf; EDX ← Processor Extended State Enumeration Leaf; BREAK; EAX = EH: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Reserved = 0; EDX ← Reserved = 0; BREAK; 3-228 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L EAX = FH: EAX ← Intel Resource Director Technology Monitoring Enumeration Leaf; (* See Table 3-8. *) EBX ← Intel Resource Director Technology Monitoring Enumeration Leaf; ECX ← Intel Resource Director Technology Monitoring Enumeration Leaf; EDX ← Intel Resource Director Technology Monitoring Enumeration Leaf; BREAK; EAX = 10H: EAX ← Intel Resource Director Technology Allocation Enumeration Leaf; (* See Table 3-8. *) EBX ← Intel Resource Director Technology Allocation Enumeration Leaf; ECX ← Intel Resource Director Technology Allocation Enumeration Leaf; EDX ← Intel Resource Director Technology Allocation Enumeration Leaf; BREAK; EAX = 12H: EAX ← Intel SGX Enumeration Leaf; (* See Table 3-8. *) EBX ← Intel SGX Enumeration Leaf; ECX ← Intel SGX Enumeration Leaf; EDX ← Intel SGX Enumeration Leaf; BREAK; EAX = 14H: EAX ← Intel Processor Trace Enumeration Leaf; (* See Table 3-8. *) EBX ← Intel Processor Trace Enumeration Leaf; ECX ← Intel Processor Trace Enumeration Leaf; EDX ← Intel Processor Trace Enumeration Leaf; BREAK; EAX = 15H: EAX ← Time Stamp Counter and Nominal Core Crystal Clock Information Leaf; (* See Table 3-8. *) EBX ← Time Stamp Counter and Nominal Core Crystal Clock Information Leaf; ECX ← Time Stamp Counter and Nominal Core Crystal Clock Information Leaf; EDX ← Time Stamp Counter and Nominal Core Crystal Clock Information Leaf; BREAK; EAX = 16H: EAX ← Processor Frequency Information Enumeration Leaf; (* See Table 3-8. *) EBX ← Processor Frequency Information Enumeration Leaf; ECX ← Processor Frequency Information Enumeration Leaf; EDX ← Processor Frequency Information Enumeration Leaf; BREAK; EAX = 17H: EAX ← System-On-Chip Vendor Attribute Enumeration Leaf; (* See Table 3-8. *) EBX ← System-On-Chip Vendor Attribute Enumeration Leaf; ECX ← System-On-Chip Vendor Attribute Enumeration Leaf; EDX ← System-On-Chip Vendor Attribute Enumeration Leaf; BREAK; EAX = 18H: EAX ← Deterministic Address Translation Parameters Enumeration Leaf; (* See Table 3-8. *) EBX ← Deterministic Address Translation Parameters Enumeration Leaf; ECX ←Deterministic Address Translation Parameters Enumeration Leaf; EDX ← Deterministic Address Translation Parameters Enumeration Leaf; BREAK; EAX = 1FH: EAX ← V2 Extended Topology Enumeration Leaf; (* See Table 3-8. *) EBX ← V2 Extended Topology Enumeration Leaf; ECX ← V2 Extended Topology Enumeration Leaf; EDX ← V2 Extended Topology Enumeration Leaf; BREAK; CPUID—CPU Identification Vol. 2A 3-229 INSTRUCTION SET REFERENCE, A-L EAX = 80000000H: EAX ← Highest extended function input value understood by CPUID; EBX ← Reserved; ECX ← Reserved; EDX ← Reserved; BREAK; EAX = 80000001H: EAX ← Reserved; EBX ← Reserved; ECX ← Extended Feature Bits (* See Table 3-8.*); EDX ← Extended Feature Bits (* See Table 3-8. *); BREAK; EAX = 80000002H: EAX ← Processor Brand String; EBX ← Processor Brand String, continued; ECX ← Processor Brand String, continued; EDX ← Processor Brand String, continued; BREAK; EAX = 80000003H: EAX ← Processor Brand String, continued; EBX ← Processor Brand String, continued; ECX ← Processor Brand String, continued; EDX ← Processor Brand String, continued; BREAK; EAX = 80000004H: EAX ← Processor Brand String, continued; EBX ← Processor Brand String, continued; ECX ← Processor Brand String, continued; EDX ← Processor Brand String, continued; BREAK; EAX = 80000005H: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Reserved = 0; EDX ← Reserved = 0; BREAK; EAX = 80000006H: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Cache information; EDX ← Reserved = 0; BREAK; EAX = 80000007H: EAX ← Reserved = 0; EBX ← Reserved = 0; ECX ← Reserved = 0; EDX ← Reserved = Misc Feature Flags; BREAK; EAX = 80000008H: EAX ← Reserved = Physical Address Size Information; EBX ← Reserved = Virtual Address Size Information; ECX ← Reserved = 0; EDX ← Reserved = 0; BREAK; 3-230 Vol. 2A CPUID—CPU Identification INSTRUCTION SET REFERENCE, A-L EAX >= 40000000H and EAX <= 4FFFFFFFH: DEFAULT: (* EAX = Value outside of recognized range for CPUID. *) (* If the highest basic information leaf data depend on ECX input value, ECX is honored.*) EAX ← Reserved; (* Information returned for highest basic information leaf. *) EBX ← Reserved; (* Information returned for highest basic information leaf. *) ECX ← Reserved; (* Information returned for highest basic information leaf. *) EDX ← Reserved; (* Information returned for highest basic information leaf. *) BREAK; ESAC; Flags Affected None. Exceptions (All Operating Modes) #UD If the LOCK prefix is used. In earlier IA-32 processors that do not support the CPUID instruction, execution of the instruction results in an invalid opcode (#UD) exception being generated. CPUID—CPU Identification Vol. 2A 3-231 INSTRUCTION SET REFERENCE, A-L JMP—Jump Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode EB cb JMP rel8 D Valid Valid Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits E9 cw JMP rel16 D N.S. Valid Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode. E9 cd JMP rel32 D Valid Valid Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits FF /4 JMP r/m16 M N.S. Valid Jump near, absolute indirect, address = zeroextended r/m16. Not supported in 64-bit mode. FF /4 JMP r/m32 M N.S. Valid Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode. FF /4 JMP r/m64 M Valid N.E. Jump near, absolute indirect, RIP = 64-Bit offset from register or memory EA cd JMP ptr16:16 D Inv. Valid Jump far, absolute, address given in operand EA cp JMP ptr16:32 D Inv. Valid Jump far, absolute, address given in operand FF /5 JMP m16:16 D Valid Valid Jump far, absolute indirect, address given in m16:16 FF /5 JMP m16:32 D Valid Valid Jump far, absolute indirect, address given in m16:32. REX.W FF /5 JMP m16:64 D Valid N.E. Jump far, absolute indirect, address given in m16:64. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 D Offset NA NA NA M ModRM:r/m (r) NA NA NA Description Transfers program control to a different point in the instruction stream without recording return information. The destination (target) operand specifies the address of the instruction being jumped to. This operand can be an immediate value, a general-purpose register, or a memory location. This instruction can be used to execute four different types of jumps: • Near jump—A jump to an instruction within the current code segment (the segment currently pointed to by the CS register), sometimes referred to as an intrasegment jump. • • Short jump—A near jump where the jump range is limited to –128 to +127 from the current EIP value. • Task switch—A jump to an instruction located in a different task. Far jump—A jump to an instruction located in a different segment than the current code segment but at the same privilege level, sometimes referred to as an intersegment jump. A task switch can only be executed in protected mode (see Chapter 7, in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for information on performing task switches with the JMP instruction). Near and Short Jumps. When executing a near jump, the processor jumps to the address (within the current code segment) that is specified with the target operand. The target operand specifies either an absolute offset (that is an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current 3-500 Vol. 2A JMP—Jump INSTRUCTION SET REFERENCE, A-L value of the instruction pointer in the EIP register). A near jump to a relative offset of 8-bits (rel8) is referred to as a short jump. The CS register is not changed on near and short jumps. An absolute offset is specified indirectly in a general-purpose register or a memory location (r/m16 or r/m32). The operand-size attribute determines the size of the target operand (16 or 32 bits). Absolute offsets are loaded directly into the EIP register. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits. A relative offset (rel8, rel16, or rel32) is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed 8-, 16-, or 32-bit immediate value. This value is added to the value in the EIP register. (Here, the EIP register contains the address of the instruction following the JMP instruction). When using relative offsets, the opcode (for short vs. near jumps) and the operand-size attribute (for near relative jumps) determines the size of the target operand (8, 16, or 32 bits). Far Jumps in Real-Address or Virtual-8086 Mode. When executing a far jump in real-address or virtual-8086 mode, the processor jumps to the code segment and offset specified with the target operand. Here the target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). With the pointer method, the segment and address of the called procedure is encoded in the instruction, using a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address immediate. With the indirect method, the target operand specifies a memory location that contains a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address. The far address is loaded directly into the CS and EIP registers. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared. Far Jumps in Protected Mode. When the processor is operating in protected mode, the JMP instruction can be used to perform the following three types of far jumps: • • • A far jump to a conforming or non-conforming code segment. A far jump through a call gate. A task switch. (The JMP instruction cannot be used to perform inter-privilege-level far jumps.) In protected mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate, task gate, or TSS) and access rights determine the type of jump to be performed. If the selected descriptor is for a code segment, a far jump to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far jump to the same privilege level in protected mode is very similar to one carried out in real-address or virtual-8086 mode. The target operand specifies an absolute far address either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). The operand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its descriptor are loaded into CS register, and the offset from the instruction is loaded into the EIP register. Note that a call gate (described in the next paragraph) can also be used to perform far call to a code segment at the same privilege level. Using this mechanism provides an extra level of indirection and is the preferred method of making jumps between 16-bit and 32-bit code segments. When executing a far jump through a call gate, the segment selector specified by the target operand identifies the call gate. (The offset part of the target operand is ignored.) The processor then jumps to the code segment specified in the call gate descriptor and begins executing the instruction at the offset specified in the call gate. No stack switch occurs. Here again, the target operand can specify the far address of the call gate either directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16 or m16:32). Executing a task switch with the JMP instruction is somewhat similar to executing a jump through a call gate. Here the target operand specifies the segment selector of the task gate for the task being switched to (and the offset part of the target operand is ignored). The task gate in turn points to the TSS for the task, which contains the segment selectors for the task’s code and stack segments. The TSS also contains the EIP value for the next instruction that was to be executed before the task was suspended. This instruction pointer value is loaded into the EIP register so that the task begins executing again at this next instruction. The JMP instruction can also specify the segment selector of the TSS directly, which eliminates the indirection of the task gate. See Chapter 7 in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for detailed information on the mechanics of a task switch. JMP—Jump Vol. 2A 3-501 INSTRUCTION SET REFERENCE, A-L Note that when you execute at task switch with a JMP instruction, the nested task flag (NT) is not set in the EFLAGS register and the new TSS’s previous task link field is not loaded with the old task’s TSS selector. A return to the previous task can thus not be carried out by executing the IRET instruction. Switching tasks with the JMP instruction differs in this regard from the CALL instruction which does set the NT flag and save the previous task link information, allowing a return to the calling task with an IRET instruction. In 64-Bit Mode. The instruction’s operation size is fixed at 64 bits. If a selector points to a gate, then RIP equals the 64-bit displacement taken from gate; else RIP equals the zero-extended offset from the far pointer referenced in the instruction. See the summary chart at the beginning of this section for encoding data and limits. Instruction ordering. Instructions following a far jump may be fetched from memory before earlier instructions complete execution, but they will not execute (even speculatively) until all instructions prior to the far jump have completed execution (the later instructions may execute before data stored by the earlier instructions have become globally visible). Certain situations may lead to the next sequential instruction after a near indirect JMP being speculatively executed. If software needs to prevent this (e.g., in order to prevent a speculative execution side channel), then an INT3 or LFENCE instruction opcode can be placed after the near indirect JMP in order to block speculative execution. Operation IF near jump IF 64-bit Mode THEN IF near relative jump THEN tempRIP ← RIP + DEST; (* RIP is instruction following JMP instruction*) ELSE (* Near absolute jump *) tempRIP ← DEST; FI; ELSE IF near relative jump THEN tempEIP ← EIP + DEST; (* EIP is instruction following JMP instruction*) ELSE (* Near absolute jump *) tempEIP ← DEST; FI; FI; IF (IA32_EFER.LMA = 0 or target mode = Compatibility mode) and tempEIP outside code segment limit THEN #GP(0); FI IF 64-bit mode and tempRIP is not canonical THEN #GP(0); FI; IF OperandSize = 32 THEN EIP ← tempEIP; ELSE IF OperandSize = 16 THEN (* OperandSize = 16 *) EIP ← tempEIP AND 0000FFFFH; ELSE (* OperandSize = 64) RIP ← tempRIP; FI; FI; FI; 3-502 Vol. 2A JMP—Jump INSTRUCTION SET REFERENCE, A-L IF far jump and (PE = 0 or (PE = 1 AND VM = 1)) (* Real-address or virtual-8086 mode *) THEN tempEIP ← DEST(Offset); (* DEST is ptr16:32 or [m16:32] *) IF tempEIP is beyond code segment limit THEN #GP(0); FI; CS ← DEST(segment selector); (* DEST is ptr16:32 or [m16:32] *) IF OperandSize = 32 THEN EIP ← tempEIP; (* DEST is ptr16:32 or [m16:32] *) ELSE (* OperandSize = 16 *) EIP ← tempEIP AND 0000FFFFH; (* Clear upper 16 bits *) FI; FI; IF far jump and (PE = 1 and VM = 0) (* IA-32e mode or protected mode, not virtual-8086 mode *) THEN IF effective address in the CS, DS, ES, FS, GS, or SS segment is illegal or segment selector in target operand NULL THEN #GP(0); FI; IF segment selector index not within descriptor table limits THEN #GP(new selector); FI; Read type and access rights of segment descriptor; IF (EFER.LMA = 0) THEN IF segment type is not a conforming or nonconforming code segment, call gate, task gate, or TSS THEN #GP(segment selector); FI; ELSE IF segment type is not a conforming or nonconforming code segment call gate THEN #GP(segment selector); FI; FI; Depending on type and access rights: GO TO CONFORMING-CODE-SEGMENT; GO TO NONCONFORMING-CODE-SEGMENT; GO TO CALL-GATE; GO TO TASK-GATE; GO TO TASK-STATE-SEGMENT; ELSE #GP(segment selector); FI; CONFORMING-CODE-SEGMENT: IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1 THEN GP(new code segment selector); FI; IF DPL > CPL THEN #GP(segment selector); FI; IF segment not present THEN #NP(segment selector); FI; tempEIP ← DEST(Offset); IF OperandSize = 16 THEN tempEIP ← tempEIP AND 0000FFFFH; FI; IF (IA32_EFER.LMA = 0 or target mode = Compatibility mode) and tempEIP outside code segment limit JMP—Jump Vol. 2A 3-503 INSTRUCTION SET REFERENCE, A-L THEN #GP(0); FI IF tempEIP is non-canonical THEN #GP(0); FI; CS ← DEST[segment selector]; (* Segment descriptor information also loaded *) CS(RPL) ← CPL EIP ← tempEIP; END; NONCONFORMING-CODE-SEGMENT: IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1 THEN GP(new code segment selector); FI; IF (RPL > CPL) OR (DPL ≠ CPL) THEN #GP(code segment selector); FI; IF segment not present THEN #NP(segment selector); FI; tempEIP ← DEST(Offset); IF OperandSize = 16 THEN tempEIP ← tempEIP AND 0000FFFFH; FI; IF (IA32_EFER.LMA = 0 OR target mode = Compatibility mode) and tempEIP outside code segment limit THEN #GP(0); FI IF tempEIP is non-canonical THEN #GP(0); FI; CS ← DEST[segment selector]; (* Segment descriptor information also loaded *) CS(RPL) ← CPL; EIP ← tempEIP; END; CALL-GATE: IF call gate DPL < CPL or call gate DPL < call gate segment-selector RPL THEN #GP(call gate selector); FI; IF call gate not present THEN #NP(call gate selector); FI; IF call gate code-segment selector is NULL THEN #GP(0); FI; IF call gate code-segment selector index outside descriptor table limits THEN #GP(code segment selector); FI; Read code segment descriptor; IF code-segment segment descriptor does not indicate a code segment or code-segment segment descriptor is conforming and DPL > CPL or code-segment segment descriptor is non-conforming and DPL ≠ CPL THEN #GP(code segment selector); FI; IF IA32_EFER.LMA = 1 and (code-segment descriptor is not a 64-bit code segment or code-segment segment descriptor has both L-Bit and D-bit set) THEN #GP(code segment selector); FI; IF code segment is not present THEN #NP(code-segment selector); FI; tempEIP ← DEST(Offset); IF GateSize = 16 THEN tempEIP ← tempEIP AND 0000FFFFH; FI; IF (IA32_EFER.LMA = 0 OR target mode = Compatibility mode) AND tempEIP outside code segment limit THEN #GP(0); FI CS ← DEST[SegmentSelector); (* Segment descriptor information also loaded *) CS(RPL) ← CPL; 3-504 Vol. 2A JMP—Jump INSTRUCTION SET REFERENCE, A-L EIP ← tempEIP; END; TASK-GATE: IF task gate DPL < CPL or task gate DPL < task gate segment-selector RPL THEN #GP(task gate selector); FI; IF task gate not present THEN #NP(gate selector); FI; Read the TSS segment selector in the task-gate descriptor; IF TSS segment selector local/global bit is set to local or index not within GDT limits or descriptor is not a TSS segment or TSS descriptor specifies that the TSS is busy THEN #GP(TSS selector); FI; IF TSS not present THEN #NP(TSS selector); FI; SWITCH-TASKS to TSS; IF EIP not within code segment limit THEN #GP(0); FI; END; TASK-STATE-SEGMENT: IF TSS DPL < CPL or TSS DPL < TSS segment-selector RPL or TSS descriptor indicates TSS not available THEN #GP(TSS selector); FI; IF TSS is not present THEN #NP(TSS selector); FI; SWITCH-TASKS to TSS; IF EIP not within code segment limit THEN #GP(0); FI; END; Flags Affected All flags are affected if a task switch occurs; no flags are affected if a task switch does not occur. Protected Mode Exceptions #GP(0) If offset in target operand, call gate, or TSS is beyond the code segment limits. If the segment selector in the destination operand, call gate, task gate, or TSS is NULL. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector. #GP(selector) If the segment selector index is outside descriptor table limits. If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, call gate, task gate, or task state segment. If the DPL for a nonconforming-code segment is not equal to the CPL (When not using a call gate.) If the RPL for the segment’s segment selector is greater than the CPL. If the DPL for a conforming-code segment is greater than the CPL. If the DPL from a call-gate, task-gate, or TSS segment descriptor is less than the CPL or than the RPL of the call-gate, task-gate, or TSS’s segment selector. If the segment descriptor for selector in a call gate does not indicate it is a code segment. JMP—Jump Vol. 2A 3-505 INSTRUCTION SET REFERENCE, A-L If the segment descriptor for the segment selector in a task gate does not indicate an available TSS. If the segment selector for a TSS has its local/global bit set for local. If a TSS segment descriptor specifies that the TSS is busy or not available. #SS(0) If a memory operand effective address is outside the SS segment limit. #NP (selector) If the code segment being accessed is not present. If call gate, task gate, or TSS not present. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. (Only occurs when fetching target from memory.) #UD If the LOCK prefix is used. Real-Address Mode Exceptions #GP If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. #SS If a memory operand effective address is outside the SS segment limit. #UD If the LOCK prefix is used. Virtual-8086 Mode Exceptions #GP(0) If the target operand is beyond the code segment limits. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made. (Only occurs when fetching target from memory.) #UD If the LOCK prefix is used. Compatibility Mode Exceptions Same as 64-bit mode exceptions. 64-Bit Mode Exceptions #GP(0) If a memory address is non-canonical. If target offset in destination operand is non-canonical. If target offset in destination operand is beyond the new code segment limit. If the segment selector in the destination operand is NULL. If the code segment selector in the 64-bit gate is NULL. #GP(selector) If the code segment or 64-bit call gate is outside descriptor table limits. If the code segment or 64-bit call gate overlaps non-canonical space. If the segment descriptor from a 64-bit call gate is in non-canonical space. If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, 64-bit call gate. If the segment descriptor pointed to by the segment selector in the destination operand is a code segment, and has both the D-bit and the L-bit set. If the DPL for a nonconforming-code segment is not equal to the CPL, or the RPL for the segment’s segment selector is greater than the CPL. If the DPL for a conforming-code segment is greater than the CPL. If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate. If the upper type field of a 64-bit call gate is not 0x0. If the segment selector from a 64-bit call gate is beyond the descriptor table limits. 3-506 Vol. 2A JMP—Jump INSTRUCTION SET REFERENCE, A-L If the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the L-bit set and the D-bit clear. If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it is a code segment. If the code segment is non-conforming and CPL ≠ DPL. If the code segment is confirming and CPL < DPL. #NP(selector) If a code segment or 64-bit call gate is not present. #UD (64-bit mode only) If a far jump is direct to an absolute address in memory. If the LOCK prefix is used. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. JMP—Jump Vol. 2A 3-507 8. Updates to Chapter 4, Volume 2B Change bars and green text show changes to Chapter 4 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, M-U. -----------------------------------------------------------------------------------------Changes to this chapter: Typo corrections/additions/updates to the following instructions: UMWAIT and RET. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 18 INSTRUCTION SET REFERENCE, M-U RET—Return from Procedure Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode C3 RET ZO Valid Valid Near return to calling procedure. CB RET ZO Valid Valid Far return to calling procedure. C2 iw RET imm16 I Valid Valid Near return to calling procedure and pop imm16 bytes from stack. CA iw RET imm16 I Valid Valid Far return to calling procedure and pop imm16 bytes from stack. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA I imm16 NA NA NA Description Transfers program control to a return address located on the top of the stack. The address is usually placed on the stack by a CALL instruction, and the return is made to the instruction that follows the CALL instruction. The optional source operand specifies the number of stack bytes to be released after the return address is popped; the default is none. This operand can be used to release parameters from the stack that were passed to the called procedure and are no longer needed. It must be used when the CALL instruction used to switch to a new procedure uses a call gate with a non-zero word count to access the new procedure. Here, the source operand for the RET instruction must specify the same number of bytes as is specified in the word count field of the call gate. The RET instruction can be used to execute three different types of returns: • Near return — A return to a calling procedure within the current code segment (the segment currently pointed to by the CS register), sometimes referred to as an intrasegment return. • Far return — A return to a calling procedure located in a different segment than the current code segment, sometimes referred to as an intersegment return. • Inter-privilege-level far return — A far return to a different privilege level than that of the currently executing program or procedure. The inter-privilege-level return type can only be executed in protected mode. See the section titled “Calling Procedures Using Call and RET” in Chapter 6 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for detailed information on near, far, and inter-privilege-level returns. When executing a near return, the processor pops the return instruction pointer (offset) from the top of the stack into the EIP register and begins program execution at the new instruction pointer. The CS register is unchanged. When executing a far return, the processor pops the return instruction pointer from the top of the stack into the EIP register, then pops the segment selector from the top of the stack into the CS register. The processor then begins program execution in the new code segment at the new instruction pointer. The mechanics of an inter-privilege-level far return are similar to an intersegment return, except that the processor examines the privilege levels and access rights of the code and stack segments being returned to determine if the control transfer is allowed to be made. The DS, ES, FS, and GS segment registers are cleared by the RET instruction during an inter-privilege-level return if they refer to segments that are not allowed to be accessed at the new privilege level. Since a stack switch also occurs on an inter-privilege level return, the ESP and SS registers are loaded from the stack. If parameters are passed to the called procedure during an inter-privilege level call, the optional source operand must be used with the RET instruction to release the parameters on the return. Here, the parameters are released both from the called procedure’s stack and the calling procedure’s stack (that is, the stack being returned to). In 64-bit mode, the default operation size of this instruction is the stack-address size, i.e. 64 bits. This applies to near returns, not far returns; the default operation size of far returns is 32 bits. RET—Return from Procedure Vol. 2B 4-551 INSTRUCTION SET REFERENCE, M-U Instruction ordering. Instructions following a far return may be fetched from memory before earlier instructions complete execution, but they will not execute (even speculatively) until all instructions prior to the far return have completed execution (the later instructions may execute before data stored by the earlier instructions have become globally visible). Unlike near indirect CALL and near indirect JMP, the processor will not speculatively execute the next sequential instruction after a near RET unless that instruction is also the target of a jump or is a target in a branch predictor. Operation (* Near return *) IF instruction = near return THEN; IF OperandSize = 32 THEN IF top 4 bytes of stack not within stack limits THEN #SS(0); FI; EIP ← Pop(); ELSE IF OperandSize = 64 THEN IF top 8 bytes of stack not within stack limits THEN #SS(0); FI; RIP ← Pop(); ELSE (* OperandSize = 16 *) IF top 2 bytes of stack not within stack limits THEN #SS(0); FI; tempEIP ← Pop(); tempEIP ← tempEIP AND 0000FFFFH; IF tempEIP not within code segment limits THEN #GP(0); FI; EIP ← tempEIP; FI; FI; IF instruction has immediate operand THEN (* Release parameters from stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE IF StackAddressSize = 64 THEN RSP ← RSP + SRC; ELSE (* StackAddressSize = 16 *) SP ← SP + SRC; FI; FI; FI; FI; (* Real-address mode or virtual-8086 mode *) IF ((PE = 0) or (PE = 1 AND VM = 1)) and instruction = far return THEN IF OperandSize = 32 THEN 4-552 Vol. 2B RET—Return from Procedure INSTRUCTION SET REFERENCE, M-U IF top 8 bytes of stack not within stack limits THEN #SS(0); FI; EIP ← Pop(); CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *) ELSE (* OperandSize = 16 *) IF top 4 bytes of stack not within stack limits THEN #SS(0); FI; tempEIP ← Pop(); tempEIP ← tempEIP AND 0000FFFFH; IF tempEIP not within code segment limits THEN #GP(0); FI; EIP ← tempEIP; CS ← Pop(); (* 16-bit pop *) FI; IF instruction has immediate operand THEN (* Release parameters from stack *) SP ← SP + (SRC AND FFFFH); FI; FI; (* Protected mode, not virtual-8086 mode *) IF (PE = 1 and VM = 0 and IA32_EFER.LMA = 0) and instruction = far return THEN IF OperandSize = 32 THEN IF second doubleword on stack is not within stack limits THEN #SS(0); FI; ELSE (* OperandSize = 16 *) IF second word on stack is not within stack limits THEN #SS(0); FI; FI; IF return code segment selector is NULL THEN #GP(0); FI; IF return code segment selector addresses descriptor beyond descriptor table limit THEN #GP(selector); FI; Obtain descriptor to which return code segment selector points from descriptor table; IF return code segment descriptor is not a code segment THEN #GP(selector); FI; IF return code segment selector RPL < CPL THEN #GP(selector); FI; IF return code segment descriptor is conforming and return code segment DPL > return code segment selector RPL THEN #GP(selector); FI; IF return code segment descriptor is non-conforming and return code segment DPL ≠ return code segment selector RPL THEN #GP(selector); FI; IF return code segment descriptor is not present THEN #NP(selector); FI: IF return code segment selector RPL > CPL THEN GOTO RETURN-TO-OUTER-PRIVILEGE-LEVEL; ELSE GOTO RETURN-TO-SAME-PRIVILEGE-LEVEL; FI; FI; RET—Return from Procedure Vol. 2B 4-553 INSTRUCTION SET REFERENCE, M-U RETURN-TO-SAME-PRIVILEGE-LEVEL: IF the return instruction pointer is not within the return code segment limit THEN #GP(0); FI; IF OperandSize = 32 THEN EIP ← Pop(); CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *) ELSE (* OperandSize = 16 *) EIP ← Pop(); EIP ← EIP AND 0000FFFFH; CS ← Pop(); (* 16-bit pop *) FI; IF instruction has immediate operand THEN (* Release parameters from stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE (* StackAddressSize = 16 *) SP ← SP + SRC; FI; FI; RETURN-TO-OUTER-PRIVILEGE-LEVEL: IF top (16 + SRC) bytes of stack are not within stack limits (OperandSize = 32) or top (8 + SRC) bytes of stack are not within stack limits (OperandSize = 16) THEN #SS(0); FI; Read return segment selector; IF stack segment selector is NULL THEN #GP(0); FI; IF return stack segment selector index is not within its descriptor table limits THEN #GP(selector); FI; Read segment descriptor pointed to by return segment selector; IF stack segment selector RPL ≠ RPL of the return code segment selector or stack segment is not a writable data segment or stack segment descriptor DPL ≠ RPL of the return code segment selector THEN #GP(selector); FI; IF stack segment not present THEN #SS(StackSegmentSelector); FI; IF the return instruction pointer is not within the return code segment limit THEN #GP(0); FI; CPL ← ReturnCodeSegmentSelector(RPL); IF OperandSize = 32 THEN EIP ← Pop(); CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded; segment descriptor loaded *) CS(RPL) ← CPL; IF instruction has immediate operand THEN (* Release parameters from called procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE (* StackAddressSize = 16 *) SP ← SP + SRC; FI; 4-554 Vol. 2B RET—Return from Procedure INSTRUCTION SET REFERENCE, M-U FI; FI; tempESP ← Pop(); tempSS ← Pop(); (* 32-bit pop, high-order 16 bits discarded; seg. descriptor loaded *) ESP ← tempESP; SS ← tempSS; ELSE (* OperandSize = 16 *) EIP ← Pop(); EIP ← EIP AND 0000FFFFH; CS ← Pop(); (* 16-bit pop; segment descriptor loaded *) CS(RPL) ← CPL; IF instruction has immediate operand THEN (* Release parameters from called procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE (* StackAddressSize = 16 *) SP ← SP + SRC; FI; FI; tempESP ← Pop(); tempSS ← Pop(); (* 16-bit pop; segment descriptor loaded *) ESP ← tempESP; SS ← tempSS; FOR each SegReg in (ES, FS, GS, and DS) DO tempDesc ← descriptor cache for SegReg (* hidden part of segment register *) IF (SegmentSelector == NULL) OR (tempDesc(DPL) < CPL AND tempDesc(Type) is (data or non-conforming code))) THEN (* Segment register invalid *) SegmentSelector ← 0; (*Segment selector becomes null*) FI; OD; IF instruction has immediate operand THEN (* Release parameters from calling procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE (* StackAddressSize = 16 *) SP ← SP + SRC; FI; FI; (* IA-32e Mode *) IF (PE = 1 and VM = 0 and IA32_EFER.LMA = 1) and instruction = far return THEN IF OperandSize = 32 THEN IF second doubleword on stack is not within stack limits THEN #SS(0); FI; IF first or second doubleword on stack is not in canonical space THEN #SS(0); FI; ELSE RET—Return from Procedure Vol. 2B 4-555 INSTRUCTION SET REFERENCE, M-U IF OperandSize = 16 THEN IF second word on stack is not within stack limits THEN #SS(0); FI; IF first or second word on stack is not in canonical space THEN #SS(0); FI; ELSE (* OperandSize = 64 *) IF first or second quadword on stack is not in canonical space THEN #SS(0); FI; FI FI; IF return code segment selector is NULL THEN GP(0); FI; IF return code segment selector addresses descriptor beyond descriptor table limit THEN GP(selector); FI; IF return code segment selector addresses descriptor in non-canonical space THEN GP(selector); FI; Obtain descriptor to which return code segment selector points from descriptor table; IF return code segment descriptor is not a code segment THEN #GP(selector); FI; IF return code segment descriptor has L-bit = 1 and D-bit = 1 THEN #GP(selector); FI; IF return code segment selector RPL < CPL THEN #GP(selector); FI; IF return code segment descriptor is conforming and return code segment DPL > return code segment selector RPL THEN #GP(selector); FI; IF return code segment descriptor is non-conforming and return code segment DPL ≠ return code segment selector RPL THEN #GP(selector); FI; IF return code segment descriptor is not present THEN #NP(selector); FI: IF return code segment selector RPL > CPL THEN GOTO IA-32E-MODE-RETURN-TO-OUTER-PRIVILEGE-LEVEL; ELSE GOTO IA-32E-MODE-RETURN-TO-SAME-PRIVILEGE-LEVEL; FI; FI; IA-32E-MODE-RETURN-TO-SAME-PRIVILEGE-LEVEL: IF the return instruction pointer is not within the return code segment limit THEN #GP(0); FI; IF the return instruction pointer is not within canonical address space THEN #GP(0); FI; IF OperandSize = 32 THEN EIP ← Pop(); CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *) ELSE IF OperandSize = 16 THEN EIP ← Pop(); EIP ← EIP AND 0000FFFFH; CS ← Pop(); (* 16-bit pop *) ELSE (* OperandSize = 64 *) 4-556 Vol. 2B RET—Return from Procedure INSTRUCTION SET REFERENCE, M-U RIP ← Pop(); CS ← Pop(); (* 64-bit pop, high-order 48 bits discarded *) FI; FI; IF instruction has immediate operand THEN (* Release parameters from stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE IF StackAddressSize = 16 THEN SP ← SP + SRC; ELSE (* StackAddressSize = 64 *) RSP ← RSP + SRC; FI; FI; FI; IA-32E-MODE-RETURN-TO-OUTER-PRIVILEGE-LEVEL: IF top (16 + SRC) bytes of stack are not within stack limits (OperandSize = 32) or top (8 + SRC) bytes of stack are not within stack limits (OperandSize = 16) THEN #SS(0); FI; IF top (16 + SRC) bytes of stack are not in canonical address space (OperandSize = 32) or top (8 + SRC) bytes of stack are not in canonical address space (OperandSize = 16) or top (32 + SRC) bytes of stack are not in canonical address space (OperandSize = 64) THEN #SS(0); FI; Read return stack segment selector; IF stack segment selector is NULL THEN IF new CS descriptor L-bit = 0 THEN #GP(selector); IF stack segment selector RPL = 3 THEN #GP(selector); FI; IF return stack segment descriptor is not within descriptor table limits THEN #GP(selector); FI; IF return stack segment descriptor is in non-canonical address space THEN #GP(selector); FI; Read segment descriptor pointed to by return segment selector; IF stack segment selector RPL ≠ RPL of the return code segment selector or stack segment is not a writable data segment or stack segment descriptor DPL ≠ RPL of the return code segment selector THEN #GP(selector); FI; IF stack segment not present THEN #SS(StackSegmentSelector); FI; IF the return instruction pointer is not within the return code segment limit THEN #GP(0); FI: IF the return instruction pointer is not within canonical address space THEN #GP(0); FI; CPL ← ReturnCodeSegmentSelector(RPL); IF OperandSize = 32 THEN EIP ← Pop(); RET—Return from Procedure Vol. 2B 4-557 INSTRUCTION SET REFERENCE, M-U CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded, segment descriptor loaded *) CS(RPL) ← CPL; IF instruction has immediate operand THEN (* Release parameters from called procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE IF StackAddressSize = 16 THEN SP ← SP + SRC; ELSE (* StackAddressSize = 64 *) RSP ← RSP + SRC; FI; FI; FI; tempESP ← Pop(); tempSS ← Pop(); (* 32-bit pop, high-order 16 bits discarded, segment descriptor loaded *) ESP ← tempESP; SS ← tempSS; ELSE IF OperandSize = 16 THEN EIP ← Pop(); EIP ← EIP AND 0000FFFFH; CS ← Pop(); (* 16-bit pop; segment descriptor loaded *) CS(RPL) ← CPL; IF instruction has immediate operand THEN (* Release parameters from called procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE IF StackAddressSize = 16 THEN SP ← SP + SRC; ELSE (* StackAddressSize = 64 *) RSP ← RSP + SRC; FI; FI; FI; tempESP ← Pop(); tempSS ← Pop(); (* 16-bit pop; segment descriptor loaded *) ESP ← tempESP; SS ← tempSS; ELSE (* OperandSize = 64 *) RIP ← Pop(); CS ← Pop(); (* 64-bit pop; high-order 48 bits discarded; seg. descriptor loaded *) CS(RPL) ← CPL; IF instruction has immediate operand THEN (* Release parameters from called procedure’s stack *) RSP ← RSP + SRC; FI; tempESP ← Pop(); 4-558 Vol. 2B RET—Return from Procedure INSTRUCTION SET REFERENCE, M-U FI; tempSS ← Pop(); (* 64-bit pop; high-order 48 bits discarded; seg. desc. loaded *) ESP ← tempESP; SS ← tempSS; FI; FOR each of segment register (ES, FS, GS, and DS) DO IF segment register points to data or non-conforming code segment and CPL > segment descriptor DPL; (* DPL in hidden part of segment register *) THEN SegmentSelector ← 0; (* SegmentSelector invalid *) FI; OD; IF instruction has immediate operand THEN (* Release parameters from calling procedure’s stack *) IF StackAddressSize = 32 THEN ESP ← ESP + SRC; ELSE IF StackAddressSize = 16 THEN SP ← SP + SRC; ELSE (* StackAddressSize = 64 *) RSP ← RSP + SRC; FI; FI; FI; Flags Affected None. Protected Mode Exceptions #GP(0) If the return code or stack segment selector is NULL. If the return instruction pointer is not within the return code segment limit #GP(selector) If the RPL of the return code segment selector is less then the CPL. If the return code or stack segment selector index is not within its descriptor table limits. If the return code segment descriptor does not indicate a code segment. If the return code segment is non-conforming and the segment selector’s DPL is not equal to the RPL of the code segment’s segment selector If the return code segment is conforming and the segment selector’s DPL greater than the RPL of the code segment’s segment selector If the stack segment is not a writable data segment. If the stack segment selector RPL is not equal to the RPL of the return code segment selector. If the stack segment descriptor DPL is not equal to the RPL of the return code segment selector. #SS(0) If the top bytes of stack are not within stack limits. If the return stack segment is not present. #NP(selector) If the return code segment is not present. #PF(fault-code) If a page fault occurs. #AC(0) If an unaligned memory access occurs when the CPL is 3 and alignment checking is enabled. RET—Return from Procedure Vol. 2B 4-559 INSTRUCTION SET REFERENCE, M-U Real-Address Mode Exceptions #GP If the return instruction pointer is not within the return code segment limit #SS If the top bytes of stack are not within stack limits. Virtual-8086 Mode Exceptions #GP(0) If the return instruction pointer is not within the return code segment limit #SS(0) If the top bytes of stack are not within stack limits. #PF(fault-code) If a page fault occurs. #AC(0) If an unaligned memory access occurs when alignment checking is enabled. Compatibility Mode Exceptions Same as 64-bit mode exceptions. 64-Bit Mode Exceptions #GP(0) If the return instruction pointer is non-canonical. If the return instruction pointer is not within the return code segment limit. If the stack segment selector is NULL going back to compatibility mode. If the stack segment selector is NULL going back to CPL3 64-bit mode. If a NULL stack segment selector RPL is not equal to CPL going back to non-CPL3 64-bit mode. If the return code segment selector is NULL. #GP(selector) If the proposed segment descriptor for a code segment does not indicate it is a code segment. If the proposed new code segment descriptor has both the D-bit and L-bit set. If the DPL for a nonconforming-code segment is not equal to the RPL of the code segment selector. If CPL is greater than the RPL of the code segment selector. If the DPL of a conforming-code segment is greater than the return code segment selector RPL. If a segment selector index is outside its descriptor table limits. If a segment descriptor memory address is non-canonical. If the stack segment is not a writable data segment. If the stack segment descriptor DPL is not equal to the RPL of the return code segment selector. If the stack segment selector RPL is not equal to the RPL of the return code segment selector. #SS(0) If an attempt to pop a value off the stack violates the SS limit. If an attempt to pop a value off the stack causes a non-canonical address to be referenced. #NP(selector) If the return code or stack segment is not present. #PF(fault-code) If a page fault occurs. #AC(0) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. 4-560 Vol. 2B RET—Return from Procedure INSTRUCTION SET REFERENCE, M-U UMWAIT—User Level Monitor Wait Opcode / Instruction Op/ En 64/32 bit Mode Support CPUID Feature Flag Description F2 0F AE /6 A V/V WAITPKG A hint that allows the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. UMWAIT r32,, Instruction Operand Encoding1 Op/En Tuple Operand 1 Operand 2 Operand 3 Operand 4 A NA ModRM:r/m (r) NA NA NA Description UMWAIT instructs the processor to enter an implementation-dependent optimized state while monitoring a range of addresses. The optimized state may be either a light-weight power/performance optimized state or an improved power/performance optimized state. The selection between the two states is governed by the explicit input register bit[0] source operand. UMWAIT is available when CPUID.7.0:ECX.WAITPKG[bit 5] is enumerated as 1. UMWAIT may be executed at any privilege level. This instruction’s operation is the same in non-64-bit modes and in 64-bit mode. The input register contains information such as the preferred optimized state the processor should enter as described in the following table. Bits other than bit 0 are reserved and will result in #GP if nonzero. Table 4-20. UMWAIT Input Register Bit Definitions Bit Value State Name Wakeup Time Power Savings Other Benefits bit[0] = 0 C0.2 Slower Larger Improves performance of the other SMT thread(s) on the same core. bit[0] = 1 C0.1 Faster Smaller NA bits[31:1] NA NA NA Reserved The instruction wakes up when the time-stamp counter reaches or exceeds the implicit EDX:EAX 64-bit input value (if the monitoring hardware did not trigger beforehand). Prior to executing the UMWAIT instruction, an operating system may specify the maximum delay it allows the processor to suspend its operation. It can do so by writing TSC-quanta value to the following 32bit MSR (IA32_UMWAIT_CONTROL at MSR index E1H): • IA32_UMWAIT_CONTROL[31:2] — Determines the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. A zero value indicates no maximum time. The maximum time value is a 32-bit value where the upper 30 bits come from this field and the lower two bits are zero. • • IA32_UMWAIT_CONTROL[1] — Reserved. IA32_UMWAIT_CONTROL[0] — C0.2 is not allowed by the OS. Value of “1” means all C0.2 requests revert to C0.1. If the processor that executed a UMWAIT instruction wakes due to the expiration of the operating system timelimit, the instructions sets RFLAGS.CF; otherwise, that flag is cleared. The UMWAIT instruction causes a transactional abort when used inside a transactional region. The UMWAIT instruction operates with the UMONITOR instruction. The two instructions allow the definition of an address at which to wait (UMONITOR) and an implementation-dependent optimized operation to perform while waiting (UMWAIT). The execution of UMWAIT is a hint to the processor that it can enter an implementation-dependent-optimized state while waiting for an event or a store operation to the address range armed by UMONITOR. The UMWAIT instruction will not wait (will not enter an implementation-dependent optimized state) if any of the 1. The Mod field of the ModR/M byte must have value 11B. 4-690 Vol. 2B UMWAIT—User Level Monitor Wait INSTRUCTION SET REFERENCE, M-U following instructions were executed before UMWAIT and after the most recent execution of UMONITOR: IRET, MONITOR, SYSEXIT, SYSRET, and far RET (the last if it is changing CPL). The following additional events cause the processor to exit the implementation-dependent optimized state: a store to the address range armed by the UMONITOR instruction, an NMI or SMI, a debug exception, a machine check exception, the BINIT# signal, the INIT# signal, and the RESET# signal. Other implementation-dependent events may also cause the processor to exit the implementation-dependent optimized state. In addition, an external interrupt causes the processor to exit the implementation-dependent optimized state regardless of whether maskable-interrupts are inhibited (EFLAGS.IF =0). Following exit from the implementation-dependent-optimized state, control passes to the instruction after the UMWAIT instruction. A pending interrupt that is not masked (including an NMI or an SMI) may be delivered before execution of that instruction. Unlike the HLT instruction, the UMWAIT instruction does not restart at the UMWAIT instruction following the handling of an SMI. If the preceding UMONITOR instruction did not successfully arm an address range or if UMONITOR was not executed prior to executing UMWAIT and following the most recent execution of the legacy MONITOR instruction (UMWAIT does not interoperate with MONITOR), then the processor will not enter an optimized state. Execution will continue to the instruction following UMWAIT. A store to the address range armed by the UMONITOR instruction will cause the processor to exit UMWAIT if either the store was originated by other processor agents or the store was originated by a non-processor agent. Operation os_deadline TSC+(IA32_MWAIT_CONTROL[31:2]<<2) instr_deadline UINT64(EDX:EAX) IF os_deadline < instr_deadline: deadline os_deadline using_os_deadline 1 ELSE: deadline instr_deadline using_os_deadline 0 WHILE monitor hardware armed AND TSC < deadline: implementation_dependent_optimized_state(Source register, deadline, IA32_UMWAIT_CONTROL[0] ) IF using_os_deadline AND TSC > deadline: RFLAGS.CF 1 ELSE: RFLAGS.CF 0 RFLAGS.AF,PF,SF,ZF,OF 0 Intel C/C++ Compiler Intrinsic Equivalent UMWAIT uint8_t _umwait(uint32_t control, uint64_t counter); Numeric Exceptions None Exceptions (All Operating Modes) #GP(0) If src[31:1] != 0. If CR4.TSD = 1 and CPL != 0. #UD If CPUID.7.0:ECX.WAITPKG[bit 5]=0. UMWAIT—User Level Monitor Wait Vol. 2B 4-691 9. Updates to Chapter 5, Volume 2C Change bars and green text show changes to Chapter 5 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, V-Z. -----------------------------------------------------------------------------------------Changes to this chapter: Typo corrections/additions/updates to the following instructions: XRSTOR, XRSTORS, XSAVE, XSAVEC, XSAVEOPT, and XSAVES. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 19 INSTRUCTION SET REFERENCE, V-Z XRSTOR—Restore Processor Extended States Opcode / Instruction Op/ En 64/32 bit CPUID Mode Feature Support Flag Description NP 0F AE /5 M V/V XSAVE Restore state components specified by EDX:EAX from mem. M V/N.E. XSAVE Restore state components specified by EDX:EAX from mem. XRSTOR mem NP REX.W + 0F AE /5 XRSTOR64 mem Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r) NA NA NA Description Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.8, “Operation of XRSTOR,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XRSTOR instruction. The following items provide a highlevel outline: • Execution of XRSTOR may take one of two forms: standard and compacted. Bit 63 of the XCOMP_BV field in the XSAVE header determines which form is used: value 0 specifies the standard form, while value 1 specifies the compacted form. • • If RFBM[i] = 0, XRSTOR does not update state component i.1 If RFBM[i] = 1 and bit i is clear in the XSTATE_BV field in the XSAVE header, XRSTOR initializes state component i. • • If RFBM[i] = 1 and XSTATE_BV[i] = 1, XRSTOR loads state component i from the XSAVE area. • XRSTOR loads the internal value XRSTOR_INFO, which may be used to optimize a subsequent execution of XSAVEOPT or XSAVES. • Immediately following an execution of XRSTOR, the processor tracks as in-use (not in initial configuration) any state component i for which RFBM[i] = 1 and XSTATE_BV[i] = 1; it tracks as modified any state component i for which RFBM[i] = 0. The standard form of XRSTOR treats MXCSR (which is part of state component 1 — SSE) differently from the XMM registers. If either form attempts to load MXCSR with an illegal value, a general-protection exception (#GP) occurs. Use of a source operand not aligned to 64-byte boundary (for 64-bit and 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. See Section 13.6, “Processor Tracking of XSAVE-Managed State,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for discussion of the bitmaps XINUSE and XMODIFIED and of the quantity XRSTOR_INFO. 1. There is an exception if RFBM[1] = 0 and RFBM[2] = 1. In this case, the standard form of XRSTOR will load MXCSR from memory, even though MXCSR is part of state component 1 — SSE. The compacted form of XRSTOR does not make this exception. 5-586 Vol. 2C XRSTOR—Restore Processor Extended States INSTRUCTION SET REFERENCE, V-Z Operation RFBM ← XCR0 AND EDX:EAX; /* bitwise logical AND */ COMPMASK ← XCOMP_BV field from XSAVE header; RSTORMASK ← XSTATE_BV field from XSAVE header; IF COMPMASK[63] = 0 THEN /* Standard form of XRSTOR */ TO_BE_RESTORED ← RFBM AND RSTORMASK; TO_BE_INITIALIZED ← RFBM AND NOT RSTORMASK; IF TO_BE_RESTORED[0] = 1 THEN load x87 state from legacy region of XSAVE area; XINUSE[0] ← 1; ELSIF TO_BE_INITIALIZED[0] = 1 THEN initialize x87 state; XINUSE[0] ← 0; FI; IF RFBM[1] = 1 OR RFBM[2] = 1 THEN load MXCSR from legacy region of XSAVE area; FI; IF TO_BE_RESTORED[1] = 1 THEN load XMM registers from legacy region of XSAVE area; // this step does not load MXCSR XINUSE[1] ← 1; ELSIF TO_BE_INITIALIZED[1] = 1 THEN set all XMM registers to 0; // this step does not initialize MXCSR XINUSE[1] ← 0; FI; FOR i ← 2 TO 62 IF TO_BE_RESTORED[i] = 1 THEN load XSAVE state component i at offset n from base of XSAVE area; // n enumerated by CPUID(EAX=0DH,ECX=i):EBX) XINUSE[i] ← 1; ELSIF TO_BE_INITIALIZED[i] = 1 THEN initialize XSAVE state component i; XINUSE[i] ← 0; FI; ENDFOR; ELSE /* Compacted form of XRSTOR */ IF CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0 THEN /* compacted form not supported */ #GP(0); XRSTOR—Restore Processor Extended States Vol. 2C 5-587 INSTRUCTION SET REFERENCE, V-Z FI; FORMAT = COMPMASK AND 7FFFFFFF_FFFFFFFFH; RESTORE_FEATURES = FORMAT AND RFBM; TO_BE_RESTORED ← RESTORE_FEATURES AND RSTORMASK; FORCE_INIT ← RFBM AND NOT FORMAT; TO_BE_INITIALIZED = (RFBM AND NOT RSTORMASK) OR FORCE_INIT; IF TO_BE_RESTORED[0] = 1 THEN load x87 state from legacy region of XSAVE area; XINUSE[0] ← 1; ELSIF TO_BE_INITIALIZED[0] = 1 THEN initialize x87 state; XINUSE[0] ← 0; FI; IF TO_BE_RESTORED[1] = 1 THEN load SSE state from legacy region of XSAVE area; // this step loads the XMM registers and MXCSR XINUSE[1] ← 1; ELSIF TO_BE_INITIALIZED[1] = 1 THEN set all XMM registers to 0; MXCSR ← 1F80H; XINUSE[1] ← 0; FI; NEXT_FEATURE_OFFSET = 576; // Legacy area and XSAVE header consume 576 bytes FOR i ← 2 TO 62 IF FORMAT[i] = 1 THEN IF TO_BE_RESTORED[i] = 1 THEN load XSAVE state component i at offset NEXT_FEATURE_OFFSET from base of XSAVE area; XINUSE[i] ← 1; FI; NEXT_FEATURE_OFFSET = NEXT_FEATURE_OFFSET + n (n enumerated by CPUID(EAX=0DH,ECX=i):EAX); FI; IF TO_BE_INITIALIZED[i] = 1 THEN initialize XSAVE state component i; XINUSE[i] ← 0; FI; ENDFOR; FI; XMODIFIED_BV ← NOT RFBM; IF in VMX non-root operation THEN VMXNR ← 1; ELSE VMXNR ← 0; FI; 5-588 Vol. 2C XRSTOR—Restore Processor Extended States INSTRUCTION SET REFERENCE, V-Z LAXA ← linear address of XSAVE area; XRSTOR_INFO ← CPL,VMXNR,LAXA,COMPMASK; Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XRSTOR: void _xrstor( void * , unsigned __int64); XRSTOR: void _xrstor64( void * , unsigned __int64); Protected Mode Exceptions #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If bit 63 of the XCOMP_BV field of the XSAVE header is 1 and CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If the standard form is executed and a bit in XCR0 is 0 and the corresponding bit in the XSTATE_BV field of the XSAVE header is 1. If the standard form is executed and bytes 23:8 of the XSAVE header are not all zero. If the compacted form is executed and a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. If the compacted form is executed and a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If the compacted form is executed and bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. If bit 63 of the XCOMP_BV field of the XSAVE header is 1 and CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If the standard form is executed and a bit in XCR0 is 0 and the corresponding bit in the XSTATE_BV field of the XSAVE header is 1. If the standard form is executed and bytes 23:8 of the XSAVE header are not all zero. If the compacted form is executed and a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. XRSTOR—Restore Processor Extended States Vol. 2C 5-589 INSTRUCTION SET REFERENCE, V-Z If the compacted form is executed and a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If the compacted form is executed and bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #GP(0) If a memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If bit 63 of the XCOMP_BV field of the XSAVE header is 1 and CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If the standard form is executed and a bit in XCR0 is 0 and the corresponding bit in the XSTATE_BV field of the XSAVE header is 1. If the standard form is executed and bytes 23:8 of the XSAVE header are not all zero. If the compacted form is executed and a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. If the compacted form is executed and a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If the compacted form is executed and bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC 5-590 Vol. 2C If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). XRSTOR—Restore Processor Extended States INSTRUCTION SET REFERENCE, V-Z XRSTORS—Restore Processor Extended States Supervisor Opcode / Instruction Op/ En 64/32 bit CPUID Mode Feature Support Flag Description NP 0F C7 /3 M V/V XSS Restore state components specified by EDX:EAX from mem. M V/N.E. XSS Restore state components specified by EDX:EAX from mem. XRSTORS mem NP REX.W + 0F C7 /3 XRSTORS64 mem Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r) NA NA NA Description Performs a full or partial restore of processor state components from the XSAVE area located at the memory address specified by the source operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components restored correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XRSTORS may be executed only if CPL = 0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.12, “Operation of XRSTORS,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XRSTOR instruction. The following items provide a high-level outline: • Execution of XRSTORS is similar to that of the compacted form of XRSTOR; XRSTORS cannot restore from an XSAVE area in which the extended region is in the standard format (see Section 13.4.3, “Extended Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). • XRSTORS differs from XRSTOR in that it can restore state components corresponding to bits set in the IA32_XSS MSR. • • If RFBM[i] = 0, XRSTORS does not update state component i. If RFBM[i] = 1 and bit i is clear in the XSTATE_BV field in the XSAVE header, XRSTORS initializes state component i. • • • If RFBM[i] = 1 and XSTATE_BV[i] = 1, XRSTORS loads state component i from the XSAVE area. • Immediately following an execution of XRSTORS, the processor tracks as in-use (not in initial configuration) any state component i for which RFBM[i] = 1 and XSTATE_BV[i] = 1; it tracks as modified any state component i for which RFBM[i] = 0. If XRSTORS attempts to load MXCSR with an illegal value, a general-protection exception (#GP) occurs. XRSTORS loads the internal value XRSTOR_INFO, which may be used to optimize a subsequent execution of XSAVEOPT or XSAVES. Use of a source operand not aligned to 64-byte boundary (for 64-bit and 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. See Section 13.6, “Processor Tracking of XSAVE-Managed State,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for discussion of the bitmaps XINUSE and XMODIFIED and of the quantity XRSTOR_INFO. XRSTORS—Restore Processor Extended States Supervisor Vol. 2C 5-591 INSTRUCTION SET REFERENCE, V-Z Operation RFBM ← (XCR0 OR IA32_XSS) AND EDX:EAX; /* bitwise logical OR and AND */ COMPMASK ← XCOMP_BV field from XSAVE header; RSTORMASK ← XSTATE_BV field from XSAVE header; FORMAT = COMPMASK AND 7FFFFFFF_FFFFFFFFH; RESTORE_FEATURES = FORMAT AND RFBM; TO_BE_RESTORED ← RESTORE_FEATURES AND RSTORMASK; FORCE_INIT ← RFBM AND NOT FORMAT; TO_BE_INITIALIZED = (RFBM AND NOT RSTORMASK) OR FORCE_INIT; IF TO_BE_RESTORED[0] = 1 THEN load x87 state from legacy region of XSAVE area; XINUSE[0] ← 1; ELSIF TO_BE_INITIALIZED[0] = 1 THEN initialize x87 state; XINUSE[0] ← 0; FI; IF TO_BE_RESTORED[1] = 1 THEN load SSE state from legacy region of XSAVE area; // this step loads the XMM registers and MXCSR XINUSE[1] ← 1; ELSIF TO_BE_INITIALIZED[1] = 1 THEN set all XMM registers to 0; MXCSR ← 1F80H; XINUSE[1] ← 0; FI; NEXT_FEATURE_OFFSET = 576; // Legacy area and XSAVE header consume 576 bytes FOR i ← 2 TO 62 IF FORMAT[i] = 1 THEN IF TO_BE_RESTORED[i] = 1 THEN load XSAVE state component i at offset NEXT_FEATURE_OFFSET from base of XSAVE area; XINUSE[i] ← 1; FI; NEXT_FEATURE_OFFSET = NEXT_FEATURE_OFFSET + n (n enumerated by CPUID(EAX=0DH,ECX=i):EAX); FI; IF TO_BE_INITIALIZED[i] = 1 THEN initialize XSAVE state component i; XINUSE[i] ← 0; FI; ENDFOR; XMODIFIED_BV ← NOT RFBM; IF in VMX non-root operation 5-592 Vol. 2C XRSTORS—Restore Processor Extended States Supervisor INSTRUCTION SET REFERENCE, V-Z THEN VMXNR ← 1; ELSE VMXNR ← 0; FI; LAXA ← linear address of XSAVE area; XRSTOR_INFO ← CPL,VMXNR,LAXA,COMPMASK; Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XRSTORS: void _xrstors( void * , unsigned __int64); XRSTORS64: void _xrstors64( void * , unsigned __int64); Protected Mode Exceptions #GP(0) If CPL > 0. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If bit 63 of the XCOMP_BV field of the XSAVE header is 0. If a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. If a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. If bit 63 of the XCOMP_BV field of the XSAVE header is 0. If a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. If a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode. XRSTORS—Restore Processor Extended States Supervisor Vol. 2C 5-593 INSTRUCTION SET REFERENCE, V-Z Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #GP(0) If CPL > 0. If a memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If bit 63 of the XCOMP_BV field of the XSAVE header is 0. If a bit in XCR0 is 0 and the corresponding bit in the XCOMP_BV field of the XSAVE header is 1. If a bit in the XCOMP_BV field in the XSAVE header is 0 and the corresponding bit in the XSTATE_BV field is 1. If bytes 63:16 of the XSAVE header are not all zero. If attempting to write any reserved bits of the MXCSR register with 1. #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. 5-594 Vol. 2C XRSTORS—Restore Processor Extended States Supervisor INSTRUCTION SET REFERENCE, V-Z XSAVE—Save Processor Extended States Opcode / Instruction Op/ En 64/32 bit CPUID Mode Feature Support Flag Description NP 0F AE /4 M V/V XSAVE Save state components specified by EDX:EAX to mem. M V/N.E. XSAVE Save state components specified by EDX:EAX to mem. XSAVE mem NP REX.W + 0F AE /4 XSAVE64 mem Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.7, “Operation of XSAVE,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XSAVE instruction. The following items provide a high-level outline: • • XSAVE saves state component i if and only if RFBM[i] = 1.1 • XSAVE reads the XSTATE_BV field of the XSAVE header (see Section 13.4.2, “XSAVE Header” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1) and writes a modified value back to memory as follows. If RFBM[i] = 1, XSAVE writes XSTATE_BV[i] with the value of XINUSE[i]. (XINUSE is a bitmap by which the processor tracks the status of various state components. See Section 13.6, “Processor Tracking of XSAVEManaged State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) If RFBM[i] = 0, XSAVE writes XSTATE_BV[i] with the value that it read from memory (it does not modify the bit). XSAVE does not write to any part of the XSAVE header other than the XSTATE_BV field. • XSAVE always uses the standard format of the extended region of the XSAVE area (see Section 13.4.3, “Extended Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). XSAVE does not modify bytes 511:464 of the legacy region of the XSAVE area (see Section 13.4.1, “Legacy Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Use of a destination operand not aligned to 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. 1. An exception is made for MXCSR and MXCSR_MASK, which belong to state component 1 — SSE. XSAVE saves these values to memory if either RFBM[1] or RFBM[2] is 1. XSAVE—Save Processor Extended States Vol. 2C 5-595 INSTRUCTION SET REFERENCE, V-Z Operation RFBM ← XCR0 AND EDX:EAX; /* bitwise logical AND */ OLD_BV ← XSTATE_BV field from XSAVE header; IF RFBM[0] = 1 THEN store x87 state into legacy region of XSAVE area; FI; IF RFBM[1] = 1 THEN store XMM registers into legacy region of XSAVE area; // this step does not save MXCSR or MXCSR_MASK FI; IF RFBM[1] = 1 OR RFBM[2] = 1 THEN store MXCSR and MXCSR_MASK into legacy region of XSAVE area; FI; FOR i ← 2 TO 62 IF RFBM[i] = 1 THEN save XSAVE state component i at offset n from base of XSAVE area (n enumerated by CPUID(EAX=0DH,ECX=i):EBX); FI; ENDFOR; XSTATE_BV field in XSAVE header ← (OLD_BV AND NOT RFBM) OR (XINUSE AND RFBM); Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XSAVE: void _xsave( void * , unsigned __int64); XSAVE: void _xsave64( void * , unsigned __int64); Protected Mode Exceptions #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC 5-596 Vol. 2C If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). XSAVE—Save Processor Extended States INSTRUCTION SET REFERENCE, V-Z Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. #NM #UD If CR0.TS[bit 3] = 1. If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #GP(0) If the memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). XSAVE—Save Processor Extended States Vol. 2C 5-597 INSTRUCTION SET REFERENCE, V-Z XSAVEC—Save Processor Extended States with Compaction Opcode / Instruction Op/ En 64/32 bit CPUID Mode Feature Support Flag Description NP 0F C7 /4 M V/V XSAVEC Save state components specified by EDX:EAX to mem with compaction. M V/N.E. XSAVEC Save state components specified by EDX:EAX to mem with compaction. XSAVEC mem NP REX.W + 0F C7 /4 XSAVEC64 mem Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.10, “Operation of XSAVEC,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XSAVEC instruction. The following items provide a highlevel outline: • Execution of XSAVEC is similar to that of XSAVE. XSAVEC differs from XSAVE in that it uses compaction and that it may use the init optimization. • XSAVEC saves state component i if and only if RFBM[i] = 1 and XINUSE[i] = 1.1 (XINUSE is a bitmap by which the processor tracks the status of various state components. See Section 13.6, “Processor Tracking of XSAVEManaged State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) • XSAVEC does not modify bytes 511:464 of the legacy region of the XSAVE area (see Section 13.4.1, “Legacy Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). • XSAVEC writes the logical AND of RFBM and XINUSE to the XSTATE_BV field of the XSAVE header.2,3 (See Section 13.4.2, “XSAVE Header” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) XSAVEC sets bit 63 of the XCOMP_BV field and sets bits 62:0 of that field to RFBM[62:0]. XSAVEC does not write to any parts of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields. • XSAVEC always uses the compacted format of the extended region of the XSAVE area (see Section 13.4.3, “Extended Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Use of a destination operand not aligned to 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. 1. There is an exception for state component 1 (SSE). MXCSR is part of SSE state, but XINUSE[1] may be 0 even if MXCSR does not have its initial value of 1F80H. In this case, XSAVEC saves SSE state as long as RFBM[1] = 1. 2. Unlike XSAVE and XSAVEOPT, XSAVEC clears bits in the XSTATE_BV field that correspond to bits that are clear in RFBM. 3. There is an exception for state component 1 (SSE). MXCSR is part of SSE state, but XINUSE[1] may be 0 even if MXCSR does not have its initial value of 1F80H. In this case, XSAVEC sets XSTATE_BV[1] to 1 as long as RFBM[1] = 1. 5-598 Vol. 2C XSAVEC—Save Processor Extended States with Compaction INSTRUCTION SET REFERENCE, V-Z Operation RFBM ← XCR0 AND EDX:EAX; TO_BE_SAVED ← RFBM AND XINUSE; If MXCSR ≠ 1F80H AND RFBM[1] TO_BE_SAVED[1] = 1; FI; /* bitwise logical AND */ /* bitwise logical AND */ IF TO_BE_SAVED[0] = 1 THEN store x87 state into legacy region of XSAVE area; FI; IF TO_BE_SAVED[1] = 1 THEN store SSE state into legacy region of XSAVE area; // this step saves the XMM registers, MXCSR, and MXCSR_MASK FI; NEXT_FEATURE_OFFSET = 576; // Legacy area and XSAVE header consume 576 bytes FOR i ← 2 TO 62 IF RFBM[i] = 1 THEN IF TO_BE_SAVED[i] THEN save XSAVE state component i at offset NEXT_FEATURE_OFFSET from base of XSAVE area; FI; NEXT_FEATURE_OFFSET = NEXT_FEATURE_OFFSET + n (n enumerated by CPUID(EAX=0DH,ECX=i):EAX); FI; ENDFOR; XSTATE_BV field in XSAVE header ← TO_BE_SAVED; XCOMP_BV field in XSAVE header ← RFBM OR 80000000_00000000H; Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XSAVEC: void _xsavec( void * , unsigned __int64); XSAVEC64: void _xsavec64( void * , unsigned __int64); Protected Mode Exceptions #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. XSAVEC—Save Processor Extended States with Compaction Vol. 2C 5-599 INSTRUCTION SET REFERENCE, V-Z #AC If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #GP(0) If the memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEC[bit 1] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC 5-600 Vol. 2C If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). XSAVEC—Save Processor Extended States with Compaction INSTRUCTION SET REFERENCE, V-Z XSAVEOPT—Save Processor Extended States Optimized Opcode/ Instruction Op/ En 64/32 bit Mode Support CPUID Feature Flag NP 0F AE /6 M V/V XSAVEOPT Save state components specified by EDX:EAX to mem, optimizing if possible. M V/V XSAVEOPT Save state components specified by EDX:EAX to mem, optimizing if possible. XSAVEOPT mem NP REX.W + 0F AE /6 XSAVEOPT64 mem Description Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), which is the logical-AND of EDX:EAX and XCR0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.9, “Operation of XSAVEOPT,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XSAVEOPT instruction. The following items provide a high-level outline: • Execution of XSAVEOPT is similar to that of XSAVE. XSAVEOPT differs from XSAVE in that it may use the init and modified optimizations. The performance of XSAVEOPT will be equal to or better than that of XSAVE. • XSAVEOPT saves state component i only if RFBM[i] = 1 and XINUSE[i] = 1.1 (XINUSE is a bitmap by which the processor tracks the status of various state components. See Section 13.6, “Processor Tracking of XSAVEManaged State”of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) Even if both bits are 1, XSAVEOPT may optimize and not save state component i if (1) state component i has not been modified since the last execution of XRSTOR or XRSTORS; and (2) this execution of XSAVES corresponds to that last execution of XRSTOR or XRSTORS as determined by the internal value XRSTOR_INFO (see the Operation section below). • XSAVEOPT does not modify bytes 511:464 of the legacy region of the XSAVE area (see Section 13.4.1, “Legacy Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). • XSAVEOPT reads the XSTATE_BV field of the XSAVE header (see Section 13.4.2, “XSAVE Header” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1) and writes a modified value back to memory as follows. If RFBM[i] = 1, XSAVEOPT writes XSTATE_BV[i] with the value of XINUSE[i]. If RFBM[i] = 0, XSAVEOPT writes XSTATE_BV[i] with the value that it read from memory (it does not modify the bit). XSAVEOPT does not write to any part of the XSAVE header other than the XSTATE_BV field. • XSAVEOPT always uses the standard format of the extended region of the XSAVE area (see Section 13.4.3, “Extended Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Use of a destination operand not aligned to 64-byte boundary (in either 64-bit or 32-bit modes) will result in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. 1. There is an exception made for MXCSR and MXCSR_MASK, which belong to state component 1 — SSE. XSAVEOPT always saves these to memory if RFBM[1] = 1 or RFBM[2] = 1, regardless of the value of XINUSE. XSAVEOPT—Save Processor Extended States Optimized Vol. 2C 5-601 INSTRUCTION SET REFERENCE, V-Z See Section 13.6, “Processor Tracking of XSAVE-Managed State,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for discussion of the bitmap XMODIFIED and of the quantity XRSTOR_INFO. Operation RFBM ← XCR0 AND EDX:EAX; /* bitwise logical AND */ OLD_BV ← XSTATE_BV field from XSAVE header; TO_BE_SAVED ← RFBM AND XINUSE; IF in VMX non-root operation THEN VMXNR ← 1; ELSE VMXNR ← 0; FI; LAXA ← linear address of XSAVE area; IF XRSTOR_INFO = CPL,VMXNR,LAXA,00000000_00000000H THEN TO_BE_SAVED ← TO_BE_SAVED AND XMODIFIED; FI; IF TO_BE_SAVED[0] = 1 THEN store x87 state into legacy region of XSAVE area; FI; IF TO_BE_SAVED[1] THEN store XMM registers into legacy region of XSAVE area; // this step does not save MXCSR or MXCSR_MASK FI; IF RFBM[1] = 1 or RFBM[2] = 1 THEN store MXCSR and MXCSR_MASK into legacy region of XSAVE area; FI; FOR i ← 2 TO 62 IF TO_BE_SAVED[i] = 1 THEN save XSAVE state component i at offset n from base of XSAVE area (n enumerated by CPUID(EAX=0DH,ECX=i):EBX); FI; ENDFOR; XSTATE_BV field in XSAVE header ← (OLD_BV AND NOT RFBM) OR (XINUSE AND RFBM); Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XSAVEOPT: void _xsaveopt( void * , unsigned __int64); XSAVEOPT: void _xsaveopt64( void * , unsigned __int64); Protected Mode Exceptions #GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. 5-602 Vol. 2C XSAVEOPT—Save Processor Extended States Optimized INSTRUCTION SET REFERENCE, V-Z #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEOPT[bit 0] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEOPT[bit 0] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #SS(0) #GP(0) If a memory address referencing the SS segment is in a non-canonical form. If the memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSAVEOPT[bit 0] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. #AC If this exception is disabled a general protection exception (#GP) is signaled if the memory operand is not aligned on a 64-byte boundary, as described above. If the alignment check exception (#AC) is enabled (and the CPL is 3), signaling of #AC is not guaranteed and may vary with implementation, as follows. In all implementations where #AC is not signaled, a general protection exception is signaled in its place. In addition, the width of the alignment check may also vary with implementation. For instance, for a given implementation, an alignment check exception might be signaled for a 2-byte misalignment, whereas a general protection exception might be signaled for all other misalignments (4-, 8-, or 16-byte misalignments). XSAVEOPT—Save Processor Extended States Optimized Vol. 2C 5-603 INSTRUCTION SET REFERENCE, V-Z XSAVES—Save Processor Extended States Supervisor Opcode / Instruction Op/ En 64/32 bit Mode Support CPUID Feature Flag Description NP 0F C7 /5 M V/V XSS Save state components specified by EDX:EAX to mem with compaction, optimizing if possible. M V/N.E. XSS Save state components specified by EDX:EAX to mem with compaction, optimizing if possible. XSAVES mem NP REX.W + 0F C7 /5 XSAVES64 mem Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Performs a full or partial save of processor state components to the XSAVE area located at the memory address specified by the destination operand. The implicit EDX:EAX register pair specifies a 64-bit instruction mask. The specific state components saved correspond to the bits set in the requested-feature bitmap (RFBM), the logicalAND of EDX:EAX and the logical-OR of XCR0 with the IA32_XSS MSR. XSAVES may be executed only if CPL = 0. The format of the XSAVE area is detailed in Section 13.4, “XSAVE Area,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Like FXRSTOR and FXSAVE, the memory format used for x87 state depends on a REX.W prefix; see Section 13.5.1, “x87 State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Section 13.11, “Operation of XSAVES,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 provides a detailed description of the operation of the XSAVES instruction. The following items provide a highlevel outline: • Execution of XSAVES is similar to that of XSAVEC. XSAVES differs from XSAVEC in that it can save state components corresponding to bits set in the IA32_XSS MSR and that it may use the modified optimization. • XSAVES saves state component i only if RFBM[i] = 1 and XINUSE[i] = 1.1 (XINUSE is a bitmap by which the processor tracks the status of various state components. See Section 13.6, “Processor Tracking of XSAVEManaged State” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) Even if both bits are 1, XSAVES may optimize and not save state component i if (1) state component i has not been modified since the last execution of XRSTOR or XRSTORS; and (2) this execution of XSAVES correspond to that last execution of XRSTOR or XRSTORS as determined by XRSTOR_INFO (see the Operation section below). • XSAVES does not modify bytes 511:464 of the legacy region of the XSAVE area (see Section 13.4.1, “Legacy Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). • XSAVES writes the logical AND of RFBM and XINUSE to the XSTATE_BV field of the XSAVE header.2 (See Section 13.4.2, “XSAVE Header” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.) XSAVES sets bit 63 of the XCOMP_BV field and sets bits 62:0 of that field to RFBM[62:0]. XSAVES does not write to any parts of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields. • XSAVES always uses the compacted format of the extended region of the XSAVE area (see Section 13.4.3, “Extended Region of an XSAVE Area” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Use of a destination operand not aligned to 64-byte boundary (in either 64-bit or 32-bit modes) results in a general-protection (#GP) exception. In 64-bit mode, the upper 32 bits of RDX and RAX are ignored. 1. There is an exception for state component 1 (SSE). MXCSR is part of SSE state, but XINUSE[1] may be 0 even if MXCSR does not have its initial value of 1F80H. In this case, the init optimization does not apply and XSAVEC will save SSE state as long as RFBM[1] = 1 and the modified optimization is not being applied. 2. There is an exception for state component 1 (SSE). MXCSR is part of SSE state, but XINUSE[1] may be 0 even if MXCSR does not have its initial value of 1F80H. In this case, XSAVES sets XSTATE_BV[1] to 1 as long as RFBM[1] = 1. 5-604 Vol. 2C XSAVES—Save Processor Extended States Supervisor INSTRUCTION SET REFERENCE, V-Z See Section 13.6, “Processor Tracking of XSAVE-Managed State,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for discussion of the bitmap XMODIFIED and of the quantity XRSTOR_INFO. Operation RFBM ← (XCR0 OR IA32_XSS) AND EDX:EAX; /* bitwise logical OR and AND */ IF in VMX non-root operation THEN VMXNR ← 1; ELSE VMXNR ← 0; FI; LAXA ← linear address of XSAVE area; COMPMASK ← RFBM OR 80000000_00000000H; TO_BE_SAVED ← RFBM AND XINUSE; IF XRSTOR_INFO = CPL,VMXNR,LAXA,COMPMASK THEN TO_BE_SAVED ← TO_BE_SAVED AND XMODIFIED; FI; If MXCSR ≠ 1F80H AND RFBM[1] TO_BE_SAVED[1] = 1; FI; IF TO_BE_SAVED[0] = 1 THEN store x87 state into legacy region of XSAVE area; FI; IF TO_BE_SAVED[1] = 1 THEN store SSE state into legacy region of XSAVE area; // this step saves the XMM registers, MXCSR, and MXCSR_MASK FI; NEXT_FEATURE_OFFSET = 576; // Legacy area and XSAVE header consume 576 bytes FOR i ← 2 TO 62 IF RFBM[i] = 1 THEN IF TO_BE_SAVED[i] THEN save XSAVE state component i at offset NEXT_FEATURE_OFFSET from base of XSAVE area; IF i = 8 // state component 8 is for PT state THEN IA32_RTIT_CTL.TraceEn[bit 0] ← 0; FI; FI; NEXT_FEATURE_OFFSET = NEXT_FEATURE_OFFSET + n (n enumerated by CPUID(EAX=0DH,ECX=i):EAX); FI; ENDFOR; XSTATE_BV field in XSAVE header ← TO_BE_SAVED; XCOMP_BV field in XSAVE header ← COMPMASK; Flags Affected None. Intel C/C++ Compiler Intrinsic Equivalent XSAVES: void _xsaves( void * , unsigned __int64); XSAVES64: void _xsaves64( void * , unsigned __int64); XSAVES—Save Processor Extended States Supervisor Vol. 2C 5-605 INSTRUCTION SET REFERENCE, V-Z Protected Mode Exceptions #GP(0) If CPL > 0. If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory operand effective address is outside the SS segment limit. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Real-Address Mode Exceptions #GP If a memory operand is not aligned on a 64-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH. #NM #UD If CR0.TS[bit 3] = 1. If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. Virtual-8086 Mode Exceptions Same exceptions as in protected mode. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions #GP(0) If CPL > 0. If the memory address is in a non-canonical form. If a memory operand is not aligned on a 64-byte boundary, regardless of segment. #SS(0) If a memory address referencing the SS segment is in a non-canonical form. #PF(fault-code) If a page fault occurs. #NM If CR0.TS[bit 3] = 1. #UD If CPUID.01H:ECX.XSAVE[bit 26] = 0 or CPUID.(EAX=0DH,ECX=1):EAX.XSS[bit 3] = 0. If CR4.OSXSAVE[bit 18] = 0. If the LOCK prefix is used. 5-606 Vol. 2C XSAVES—Save Processor Extended States Supervisor 10.Updates to Chapter 1, Volume 3A Change bars and green text show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to Section 1.1 “Intel® 64 and IA-32 Processors Covered in this Manual”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 21 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668), the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number 253669), the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3 (order number 326019), and the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3D:System Programming Guide, Part 4 (order number 332831) are part of a set that describes the architecture and programming environment of Intel 64 and IA32 Architecture processors. The other volumes in this set are: • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture (order number 253665). • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D: Instruction Set Reference (order numbers 253666, 253667, 326018 and 334569). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers (order number 335592). The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D, describe the instruction set of the processor and the opcode structure. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operatingsystem and BIOS designers. In addition, Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, and Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C address the programming environment for classes of software that host operating systems. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, describes the model-specific registers of Intel 64 and IA-32 processors. 1.1 INTEL® 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include: • • • • • • • • • • • • • • • • Pentium® processors P6 family processors Pentium® 4 processors Pentium® M processors Intel® Xeon® processors Pentium® D processors Pentium® processor Extreme Editions 64-bit Intel® Xeon® processors Intel® Core™ Duo processor Intel® Core™ Solo processor Dual-Core Intel® Xeon® processor LV Intel® Core™2 Duo processor Intel® Core™2 Quad processor Q6000 series Intel® Xeon® processor 3000, 3200 series Intel® Xeon® processor 5000 series Intel® Xeon® processor 5100, 5300 series Vol. 3A 1-1 ABOUT THIS MANUAL • • • • • • • • • • • Intel® Core™2 Extreme processor X7000 and X6800 series • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Intel® Core™ i7 processor Intel® Core™2 Extreme QX6000 series Intel® Xeon® processor 7100 series Intel® Pentium® Dual-Core processor Intel® Xeon® processor 7200, 7300 series Intel® Core™2 Extreme QX9000 series Intel® Xeon® processor 5200, 5400, 7400 series Intel® Core™2 Extreme processor QX9000 and X9000 series Intel® Core™2 Quad processor Q9000 series Intel® Core™2 Duo processor E8000, T9000 series Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are built from 45 nm and 32 nm processes. Intel® Core™ i5 processor Intel® Xeon® processor E7-8800/4800/2800 product families Intel® Core™ i7-3930K processor 2nd generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series Intel® Xeon® processor E3-1200 product family Intel® Xeon® processor E5-2400/1400 product family Intel® Xeon® processor E5-4600/2600/1600 product family 3rd generation Intel® Core™ processors Intel® Xeon® processor E3-1200 v2 product family Intel® Xeon® processor E5-2400/1400 v2 product families Intel® Xeon® processor E5-4600/2600/1600 v2 product families Intel® Xeon® processor E7-8800/4800/2800 v2 product families 4th generation Intel® Core™ processors The Intel® Core™ M processor family Intel® Core™ i7-59xx Processor Extreme Edition Intel® Core™ i7-49xx Processor Extreme Edition Intel® Xeon® processor E3-1200 v3 product family Intel® Xeon® processor E5-2600/1600 v3 product families 5th generation Intel® Core™ processors Intel® Xeon® processor D-1500 product family Intel® Xeon® processor E5 v4 family Intel® Atom™ processor X7-Z8000 and X5-Z8000 series Intel® Atom™ processor Z3400 series Intel® Atom™ processor Z3500 series 6th generation Intel® Core™ processors Intel® Xeon® processor E3-1500m v5 product family 7th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series Intel® Xeon® Processor Scalable Family 8th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series 1-2 Vol. 3A ABOUT THIS MANUAL • • Intel® Xeon® E processors 9th generation Intel® Core™ processors P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium® Pro, Pentium® II, Pentium® III, and Pentium® III Xeon® processors. The Pentium® 4, Pentium® D, and Pentium® processor Extreme Editions are based on the Intel NetBurst® microarchitecture. Most early Intel® Xeon® processors are based on the Intel NetBurst® microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst® microarchitecture. The Intel® Core™ Duo, Intel® Core™ Solo and dual-core Intel® Xeon® processor LV are based on an improved Pentium® M processor microarchitecture. The Intel® Xeon® processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel® Pentium® dual-core, Intel® Core™2 Duo, Intel® Core™2 Quad, and Intel® Core™2 Extreme processors are based on Intel® Core™ microarchitecture. The Intel® Xeon® processor 5200, 5400, 7400 series, Intel® Core™2 Quad processor Q9000 series, and Intel® Core™2 Extreme processors QX9000, X9000 series, Intel® Core™2 processor E8000 series are based on Enhanced Intel® Core™ microarchitecture. The Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are based on the Intel® Atom™ microarchitecture and supports Intel 64 architecture. P6 family, Pentium® M, Intel® Core™ Solo, Intel® Core™ Duo processors, dual-core Intel® Xeon® processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel® AtomTM processor Z5xx series support IA-32 architecture. The Intel® Xeon® processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel® Core™2 Duo, Intel® Core™2 Extreme, Intel® Core™2 Quad processors, Pentium® D processors, Pentium® DualCore processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel® 64 architecture. The Intel® Core™ i7 processor and Intel® Xeon® processor 3400, 5500, 7500 series are based on 45 nm Nehalem microarchitecture. Westmere microarchitecture is a 32 nm version of the Nehalem microarchitecture. Intel® Xeon® processor 5600 series, Intel Xeon processor E7 and various Intel Core i7, i5, i3 processors are based on the Westmere microarchitecture. These processors support Intel 64 architecture. The Intel® Xeon® processor E5 family, Intel® Xeon® processor E3-1200 family, Intel® Xeon® processor E78800/4800/2800 product families, Intel® Core™ i7-3930K processor, and 2nd generation Intel® Core™ i7-2xxx, Intel® CoreTM i5-2xxx, Intel® Core™ i3-2xxx processor series are based on the Sandy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E7-8800/4800/2800 v2 product families, Intel® Xeon® processor E3-1200 v2 product family and 3rd generation Intel® Core™ processors are based on the Ivy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-4600/2600/1600 v2 product families, Intel® Xeon® processor E5-2400/1400 v2 product families and Intel® Core™ i7-49xx Processor Extreme Edition are based on the Ivy Bridge-E microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E3-1200 v3 product family and 4th Generation Intel® Core™ processors are based on the Haswell microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-2600/1600 v3 product families and the Intel® Core™ i7-59xx Processor Extreme Edition are based on the Haswell-E microarchitecture and support Intel 64 architecture. The Intel® Atom™ processor Z8000 series is based on the Airmont microarchitecture. The Intel® Atom™ processor Z3400 series and the Intel® Atom™ processor Z3500 series are based on the Silvermont microarchitecture. The Intel® Core™ M processor family, 5th generation Intel® Core™ processors, Intel® Xeon® processor D-1500 product family and the Intel® Xeon® processor E5 v4 family are based on the Broadwell microarchitecture and support Intel 64 architecture. The Intel® Xeon® Processor Scalable Family, Intel® Xeon® processor E3-1500m v5 product family and 6th generation Intel® Core™ processors are based on the Skylake microarchitecture and support Intel 64 architecture. Vol. 3A 1-3 ABOUT THIS MANUAL The 7th generation Intel® Core™ processors are based on the Kaby Lake microarchitecture and support Intel 64 architecture. The Intel® Atom™ processor C series, the Intel® Atom™ processor X series, the Intel® Pentium® processor J series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont microarchitecture. The Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series is based on the Knights Landing microarchitecture and supports Intel 64 architecture. The Intel® Pentium® Silver processor series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont Plus microarchitecture. The 8th generation Intel® Core™ processors, 9th generation Intel® Core™ processors, and Intel® Xeon® E processors are based on the Coffee Lake microarchitecture and support Intel 64 architecture. The Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series is based on the Knights Mill microarchitecture and supports Intel 64 architecture. IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel® 64 architecture is the instruction set architecture and programming environment which is the superset of Intel’s 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture. 1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE A description of this manual’s content follows1: Chapter 1 — About This Manual. Gives an overview of all eight volumes of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers. Chapter 2 — System Architecture Overview. Describes the modes of operation used by Intel 64 and IA-32 processors and the mechanisms provided by the architectures to support operating systems and executives, including the system-oriented registers and data structures and the system-oriented instructions. The steps necessary for switching between real-address and protected modes are also identified. Chapter 3 — Protected-Mode Memory Management. Describes the data structures, registers, and instructions that support segmentation and paging. The chapter explains how they can be used to implement a “flat” (unsegmented) memory model or a segmented memory model. Chapter 4 — Paging. Describes the paging modes supported by Intel 64 and IA-32 processors. Chapter 5 — Protection. Describes the support for page and segment protection provided in the Intel 64 and IA32 architectures. This chapter also explains the implementation of privilege rules, stack switching, pointer validation, user and supervisor modes. Chapter 6 — Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel 64 and IA-32 architectures, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference information for each exception is given in this chapter. Includes programming the LINT0 and LINT1 inputs and gives an example of how to program the LINT0 and LINT1 pins for specific interrupt vectors. Chapter 7 — Task Management. Describes mechanisms the Intel 64 and IA-32 architectures provide to support multitasking and inter-task protection. Chapter 8 — Multiple-Processor Management. Describes the instructions and flags that support multiple processors with shared memory, memory ordering, and Intel® Hyper-Threading Technology. Includes MP initialization for P6 family processors and gives an example of how to use the MP protocol to boot P6 family processors in an MP system. 1. Model-Specific Registers have been moved out of this volume and into a separate volume: Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. 1-4 Vol. 3A ABOUT THIS MANUAL Chapter 9 — Processor Management and Initialization. Defines the state of an Intel 64 or IA-32 processor after reset initialization. This chapter also explains how to set up an Intel 64 or IA-32 processor for real-address mode operation and protected- mode operation, and how to switch between modes. Chapter 10 — Advanced Programmable Interrupt Controller (APIC). Describes the programming interface to the local APIC and gives an overview of the interface between the local APIC and the I/O APIC. Includes APIC bus message formats and describes the message formats for messages transmitted on the APIC bus for P6 family and Pentium processors. Chapter 11 — Memory Cache Control. Describes the general concept of caching and the caching mechanisms supported by the Intel 64 or IA-32 architectures. This chapter also describes the memory type range registers (MTRRs) and how they can be used to map memory types of physical memory. Information on using the new cache control and memory streaming instructions introduced with the Pentium III, Pentium 4, and Intel Xeon processors is also given. Chapter 12 — Intel® MMX™ Technology System Programming. Describes those aspects of the Intel® MMX™ technology that must be handled and considered at the system programming level, including: task switching, exception handling, and compatibility with existing system environments. Chapter 13 — System Programming For Instruction Set Extensions And Processor Extended States. Describes the operating system requirements to support SSE/SSE2/SSE3/SSSE3/SSE4 extensions, including task switching, exception handling, and compatibility with existing system environments. The latter part of this chapter describes the extensible framework of operating system requirements to support processor extended states. Processor extended state may be required by instruction set extensions beyond those of SSE/SSE2/SSE3/SSSE3/SSE4 extensions. Chapter 14 — Power and Thermal Management. Describes facilities of Intel 64 and IA-32 architecture used for power management and thermal monitoring. Chapter 15 — Machine-Check Architecture. Describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. Additionally, a signaling mechanism for software to respond to hardware corrected machine check error is covered. Chapter 16 — Interpreting Machine-Check Error Codes. Gives an example of how to interpret the error codes for a machine-check error that occurred on a P6 family processor. Chapter 17 — Debug, Branch Profile, TSC, and Resource Monitoring Features. Describes the debugging registers and other debug mechanism provided in Intel 64 or IA-32 processors. This chapter also describes the time-stamp counter. Chapter 18 — Performance Monitoring. Describes the Intel 64 and IA-32 architectures’ facilities for monitoring performance. Chapter 19 — Performance-Monitoring Events. Lists architectural performance events. Non-architectural performance events (i.e. model-specific events) are listed for each generation of microarchitecture. Chapter 20 — 8086 Emulation. Describes the real-address and virtual-8086 modes of the IA-32 architecture. Chapter 21 — Mixing 16-Bit and 32-Bit Code. Describes how to mix 16-bit and 32-bit code modules within the same program or task. Chapter 22 — IA-32 Architecture Compatibility. Describes architectural compatibility among IA-32 processors. Chapter 23 — Introduction to Virtual Machine Extensions. Describes the basic elements of virtual machine architecture and the virtual machine extensions for Intel 64 and IA-32 Architectures. Chapter 24 — Virtual Machine Control Structures. Describes components that manage VMX operation. These include the working-VMCS pointer and the controlling-VMCS pointer. Chapter 25 — VMX Non-Root Operation. Describes the operation of a VMX non-root operation. Processor operation in VMX non-root mode can be restricted programmatically such that certain operations, events or conditions can cause the processor to transfer control from the guest (running in VMX non-root mode) to the monitor software (running in VMX root mode). Chapter 26 — VM Entries. Describes VM entries. VM entry transitions the processor from the VMM running in VMX root-mode to a VM running in VMX non-root mode. VM-Entry is performed by the execution of VMLAUNCH or VMRESUME instructions. Vol. 3A 1-5 ABOUT THIS MANUAL Chapter 27 — VM Exits. Describes VM exits. Certain events, operations or situations while the processor is in VMX non-root operation may cause VM-exit transitions. In addition, VM exits can also occur on failed VM entries. Chapter 28 — VMX Support for Address Translation. Describes virtual-machine extensions that support address translation and the virtualization of physical memory. Chapter 29 — APIC Virtualization and Virtual Interrupts. Describes the VMCS including controls that enable the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC). Chapter 30 — VMX Instruction Reference. Describes the virtual-machine extensions (VMX). VMX is intended for a system executive to support virtualization of processor hardware and a system software layer acting as a host to multiple guest software environments. Chapter 31 — Virtual-Machine Monitor Programming Considerations. Describes programming considerations for VMMs. VMMs manage virtual machines (VMs). Chapter 32 — Virtualization of System Resources. Describes the virtualization of the system resources. These include: debugging facilities, address translation, physical memory, and microcode update facilities. Chapter 33 — Handling Boundary Conditions in a Virtual Machine Monitor. Describes what a VMM must consider when handling exceptions, interrupts, error conditions, and transitions between activity states. Chapter 34 — System Management Mode. Describes Intel 64 and IA-32 architectures’ system management mode (SMM) facilities. Chapter 35 — Intel® Processor Trace. Describes details of Intel® Processor Trace. Chapter 36 — Introduction to Intel® Software Guard Extensions. Provides an overview of the Intel® Software Guard Extensions (Intel® SGX) set of instructions. Chapter 37 — Enclave Access Control and Data Structures. Describes Enclave Access Control procedures and defines various Intel SGX data structures. Chapter 38 — Enclave Operation. Describes enclave creation and initialization, adding pages and measuring an enclave, and enclave entry and exit. Chapter 39 — Enclave Exiting Events. Describes enclave-exiting events (EEE) and asynchronous enclave exit (AEX). Chapter 40 — SGX Instruction References. Describes the supervisor and user level instructions provided by Intel SGX. Chapter 41 — Intel® SGX Interactions with IA32 and Intel® 64 Architecture. Describes the Intel SGX collection of enclave instructions for creating protected execution environments on processors supporting IA32 and Intel 64 architectures. Chapter 42 — Enclave Code Debug and Profiling. Describes enclave code debug processes and options. Appendix A — VMX Capability Reporting Facility. Describes the VMX capability MSRs. Support for specific VMX features is determined by reading capability MSRs. Appendix B — Field Encoding in VMCS. Enumerates all fields in the VMCS and their encodings. Fields are grouped by width (16-bit, 32-bit, etc.) and type (guest-state, host-state, etc.). Appendix C — VM Basic Exit Reasons. Describes the 32-bit fields that encode reasons for a VM exit. Examples of exit reasons include, but are not limited to: software interrupts, processor exceptions, software traps, NMIs, external interrupts, and triple faults. 1.3 NOTATIONAL CONVENTIONS This manual uses specific notation for data-structure formats, for symbolic representation of instructions, and for hexadecimal and binary numbers. A review of this notation makes the manual easier to read. 1-6 Vol. 3A ABOUT THIS MANUAL 1.3.1 Bit and Byte Order In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. Intel 64 and IA-32 processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits: • Do not depend on the states of any reserved bits when testing the values of registers which contain such bits. Mask out the reserved bits before testing. • • • Do not depend on the states of any reserved bits when storing to memory or to a register. Do not depend on the ability to retain information written into any reserved bits. When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register. NOTE Avoid any software dependence upon the state of reserved bits in Intel 64 and IA-32 registers. Depending upon the values of reserved register bits will make software dependent upon the unspecified manner in which the processor handles these bits. Programs that depend upon reserved values risk incompatibility with future processors. Highest 31 Address Data Structure 8 7 24 23 16 15 Byte 3 Byte 2 Byte 1 Bit offset 0 Byte 0 28 24 20 16 12 8 4 0 Lowest Address Byte Offset Figure 1-1. Bit and Byte Order Vol. 3A 1-7 ABOUT THIS MANUAL 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • • • A label is an identifier which is followed by a colon. A mnemonic is a reserved name for a class of instruction opcodes which have the same function. The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the opcode. When present, they take the form of either literals or identifiers for data items. Operand identifiers are either reserved names of registers or are assumed to be assigned to data items declared in another part of the program (which may not be shown in the example). When two operands are present in an arithmetic or logical instruction, the right operand is the source and the left operand is the destination. For example: LOADREG: MOV EAX, SUBTOTAL In this example LOADREG is a label, MOV is the mnemonic identifier of an opcode, EAX is the destination operand, and SUBTOTAL is the source operand. Some assembly languages put the source and destination in reverse order. 1.3.4 Hexadecimal and Binary Numbers Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represented by a string of 1s and 0s, sometimes followed by the character B (for example, 1010B). The “B” designation is only used in situations where confusion as to the type of number might arise. 1.3.5 Segmented Addressing The processor uses byte addressing. This means memory is organized and accessed as a sequence of bytes. Whether one or more bytes are being accessed, a byte address is used to locate the byte or bytes memory. The range of memory that can be addressed is called an address space. The processor also supports segmented addressing. This is a form of addressing where a program may have many independent address spaces, called segments. For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment: Segment-register:Byte-address For example, the following segment address identifies the byte at address FF79H in the segment pointed by the DS register: DS:FF79H The following segment address identifies an instruction address in the code segment. The CS register points to the code segment and the EIP register contains the address of the instruction. CS:EIP 1-8 Vol. 3A ABOUT THIS MANUAL 1.3.6 Syntax for CPUID, CR, and MSR Values Obtain feature flags, status, and system information by using the CPUID instruction, by checking control register bits, and by reading model-specific registers. We are moving toward a single syntax to represent this type of information. See Figure 1-2. CPUID Input and Output CPUID.01H:EDX.SSE[bit 25] = 1 Input value for EAX register Output register and feature flag or field name with bit position(s) Value (or range) of output Control Register Values CR4.OSFXSR[bit 9] = 1 Example CR name Feature flag or field name with bit position(s) Value (or range) of output Model-Specific Register Values IA32_MISC_ENABLE.ENABLEFOPCODE[bit 2] = 1 Example MSR name Feature flag or field name with bit position(s) Value (or range) of output S Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation 1.3.7 29002 Exceptions An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code reports additional information about the error. An example of the notation used to show an exception and error code is shown below: #PF(fault code) This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception: #GP(0) Vol. 3A 1-9 ABOUT THIS MANUAL 1.4 RELATED LITERATURE Literature related to Intel 64 and IA-32 processors is listed and viewable on-line at: https://software.intel.com/en-us/articles/intel-sdm See also: • The latest security information on Intel® products: https://www.intel.com/content/www/us/en/security-center/default.html • Software developer resources, guidance and insights for security advisories: https://software.intel.com/security-software-guidance/ • • • The data sheet for a particular Intel 64 or IA-32 processor • Intel® Fortran Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ • Intel® Software Development Tools: https://software.intel.com/en-us/intel-sdp-home • Intel® 64 and IA-32 Architectures Software Developer’s Manual (in one, four or ten volumes): https://software.intel.com/en-us/articles/intel-sdm • Intel® 64 and IA-32 Architectures Optimization Reference Manual: https://software.intel.com/en-us/articles/intel-sdm#optimization • Intel 64 Architecture x2APIC Specification: The specification update for a particular Intel 64 or IA-32 processor Intel® C++ Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ http://www.intel.com/content/www/us/en/architecture-and-technology/64-architecture-x2apic-specification.html • Intel® Trusted Execution Technology Measured Launched Environment Programming Guide: http://www.intel.com/content/www/us/en/software-developers/intel-txt-software-development-guide.html • Developing Multi-threaded Applications: A Platform Consistent Approach: https://software.intel.com/sites/default/files/article/147714/51534-developing-multithreaded-applications.pdf • Using Spin-Loops on Intel® Pentium® 4 Processor and Intel® Xeon® Processor: https://software.intel.com/sites/default/files/22/30/25602 • Performance Monitoring Unit Sharing Guide http://software.intel.com/file/30388 Literature related to selected features in future Intel processors are available at: • Intel® Architecture Instruction Set Extensions Programming Reference https://software.intel.com/en-us/isa-extensions • Intel® Software Guard Extensions (Intel® SGX) Programming Reference https://software.intel.com/en-us/isa-extensions/intel-sgx More relevant links are: • Intel® Developer Zone: • Developer centers: https://software.intel.com/en-us http://www.intel.com/content/www/us/en/hardware-developers/developer-centers.html • Processor support general link: http://www.intel.com/support/processors/ • Intel® Hyper-Threading Technology (Intel® HT Technology): http://www.intel.com/technology/platform-technology/hyper-threading/index.htm 1-10 Vol. 3A 11.Updates to Chapter 10, Volume 3A Change bars and green text show changes to Chapter 10 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------Changes to this chapter: Minor update to Section 10.5.4.1 “TSC-Deadline Mode” to remove inaccurate statement. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 22 CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 22.27, “Advanced Programmable Interrupt Controller (APIC)”) and is included in the P6 family, Pentium 4, Intel Xeon processors, and other more recent Intel 64 and IA-32 processor families (see Section 10.4.2, “Presence of the Local APIC”). The local APIC performs two primary functions for the processor: • It receives interrupts from the processor’s interrupt pins, from internal sources and from an external I/O APIC (or other external interrupt controller). It sends these to the processor core for handling. • In multiple processor (MP) systems, it sends and receives interprocessor interrupt (IPI) messages to and from other logical processors on the system bus. IPI messages can be used to distribute interrupts among the processors in the system or to execute system wide functions (such as, booting up processors or distributing work among a group of processors). The external I/O APIC is part of Intel’s system chip set. Its primary function is to receive external interrupt events from the system and its associated I/O devices and relay them to the local APIC as interrupt messages. In MP systems, the I/O APIC also provides a mechanism for distributing external interrupts to the local APICs of selected processors or groups of processors on the system bus. This chapter provides a description of the local APIC and its programming interface. It also provides an overview of the interface between the local APIC and the I/O APIC. Contact Intel for detailed information about the I/O APIC. When a local APIC has sent an interrupt to its processor core for handling, the processor uses the interrupt and exception handling mechanism described in Chapter 6, “Interrupt and Exception Handling.” See Section 6.1, “Interrupt and Exception Overview,” for an introduction to interrupt and exception handling. 10.1 LOCAL AND I/O APIC OVERVIEW Each local APIC consists of a set of APIC registers (see Table 10-1) and associated hardware that control the delivery of interrupts to the processor core and the generation of IPI messages. The APIC registers are memory mapped and can be read and written to using the MOV instruction. Local APICs can receive interrupts from the following sources: • Locally connected I/O devices — These interrupts originate as an edge or level asserted by an I/O device that is connected directly to the processor’s local interrupt pins (LINT0 and LINT1). The I/O devices may also be connected to an 8259-type interrupt controller that is in turn connected to the processor through one of the local interrupt pins. • Externally connected I/O devices — These interrupts originate as an edge or level asserted by an I/O device that is connected to the interrupt input pins of an I/O APIC. Interrupts are sent as I/O interrupt messages from the I/O APIC to one or more of the processors in the system. • Inter-processor interrupts (IPIs) — An Intel 64 or IA-32 processor can use the IPI mechanism to interrupt another processor or group of processors on the system bus. IPIs are used for software self-interrupts, interrupt forwarding, or preemptive scheduling. • APIC timer generated interrupts — The local APIC timer can be programmed to send a local interrupt to its associated processor when a programmed count is reached (see Section 10.5.4, “APIC Timer”). • Performance monitoring counter interrupts — P6 family, Pentium 4, and Intel Xeon processors provide the ability to send an interrupt to its associated processor when a performance-monitoring counter overflows (see Section 18.6.3.5.8, “Generating an Interrupt on Overflow”). • Thermal Sensor interrupts — Pentium 4 and Intel Xeon processors provide the ability to send an interrupt to themselves when the internal thermal sensor has been tripped (see Section 14.7.2, “Thermal Monitor”). Vol. 3A 10-1 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • APIC internal error interrupts — When an error condition is recognized within the local APIC (such as an attempt to access an unimplemented register), the APIC can be programmed to send an interrupt to its associated processor (see Section 10.5.3, “Error Handling”). Of these interrupt sources: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and the internal APIC error detector are referred to as local interrupt sources. Upon receiving a signal from a local interrupt source, the local APIC delivers the interrupt to the processor core using an interrupt delivery protocol that has been set up through a group of APIC registers called the local vector table or LVT (see Section 10.5.1, “Local Vector Table”). A separate entry is provided in the local vector table for each local interrupt source, which allows a specific interrupt delivery protocol to be set up for each source. For example, if the LINT1 pin is going to be used as an NMI pin, the LINT1 entry in the local vector table can be set up to deliver an interrupt with vector number 2 (NMI interrupt) to the processor core. The local APIC handles interrupts from the other two interrupt sources (externally connected I/O devices and IPIs) through its IPI message handling facilities. A processor can generate IPIs by programming the interrupt command register (ICR) in its local APIC (see Section 10.6.1, “Interrupt Command Register (ICR)”). The act of writing to the ICR causes an IPI message to be generated and issued on the system bus (for Pentium 4 and Intel Xeon processors) or on the APIC bus (for Pentium and P6 family processors). See Section 10.2, “System Bus Vs. APIC Bus.” IPIs can be sent to other processors in the system or to the originating processor (self-interrupts). When the target processor receives an IPI message, its local APIC handles the message automatically (using information included in the message such as vector number and trigger mode). See Section 10.6, “Issuing Interprocessor Interrupts,” for a detailed explanation of the local APIC’s IPI message delivery and acceptance mechanism. The local APIC can also receive interrupts from externally connected devices through the I/O APIC (see Figure 10-1). The I/O APIC is responsible for receiving interrupts generated by system hardware and I/O devices and forwarding them to the local APIC as interrupt messages. Pentium 4 and Intel Xeon Processors Pentium and P6 Family Processors Processor Core Processor Core Local APIC Local APIC Interrupt Messages Interrupt Messages Local Interrupts Interrupt Messages 3-Wire APIC Bus System Bus Bridge I/O APIC PCI I/O APIC Local Interrupts External Interrupts System Chip Set External Interrupts System Chip Set Figure 10-1. Relationship of Local APIC and I/O APIC In Single-Processor Systems Individual pins on the I/O APIC can be programmed to generate a specific interrupt vector when asserted. The I/O APIC also has a “virtual wire mode” that allows it to communicate with a standard 8259A-style external interrupt controller. Note that the local APIC can be disabled (see Section 10.4.3, “Enabling or Disabling the Local APIC”). This allows an associated processor core to receive interrupts directly from an 8259A interrupt controller. 10-2 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Both the local APIC and the I/O APIC are designed to operate in MP systems (see Figures 10-2 and 10-3). Each local APIC handles interrupts from the I/O APIC, IPIs from processors on the system bus, and self-generated interrupts. Interrupts can also be delivered to the individual processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Processor #1 Processor #2 Processor #3 Processor #3 CPU CPU CPU CPU Local APIC Local APIC Local APIC Local APIC Interrupt Messages IPIs Interrupt Messages IPIs Interrupt Messages IPIs Interrupt Messages IPIs Processor System Bus Interrupt Messages Bridge PCI External Interrupts I/O APIC System Chip Set Figure 10-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor Systems Processor #1 Processor #2 Processor #3 Processor #4 CPU CPU CPU CPU Local APIC Local APIC Local APIC Local APIC Interrupt Messages IPIs Interrupt Messages IPIs Interrupt Messages Interrupt Messages External Interrupts IPIs Interrupt Messages IPIs 3-wire APIC Bus I/O APIC System Chip Set Figure 10-3. Local APICs and I/O APIC When P6 Family Processors Are Used in Multiple-Processor Systems The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fixed interrupt to another processor for servicing. Special-purpose IPIs (including NMI, INIT, SMI and SIPI IPIs) allow one or more processors on the system bus to perform system-wide boot-up and control functions. The following sections focus on the local APIC and its implementation in the Pentium 4, Intel Xeon, and P6 family processors. In these sections, the terms “local APIC” and “I/O APIC” refer to local and I/O APICs used with the P6 family processors and to local and I/O xAPICs used with the Pentium 4 and Intel Xeon processors (see Section 10.3, “The Intel® 82489DX External APIC, the APIC, the xAPIC, and the X2APIC”). Vol. 3A 10-3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.2 SYSTEM BUS VS. APIC BUS For the P6 family and Pentium processors, the I/O APIC and local APICs communicate through the 3-wire interAPIC bus (see Figure 10-3). Local APICs also use the APIC bus to send and receive IPIs. The APIC bus and its messages are invisible to software and are not classed as architectural. Beginning with the Pentium 4 and Intel Xeon processors, the I/O APIC and local APICs (using the xAPIC architecture) communicate through the system bus (see Figure 10-2). The I/O APIC sends interrupt requests to the processors on the system bus through bridge hardware that is part of the Intel chip set. The bridge hardware generates the interrupt messages that go to the local APICs. IPIs between local APICs are transmitted directly on the system bus. 10.3 THE INTEL® 82489DX EXTERNAL APIC, THE APIC, THE XAPIC, AND THE X2APIC The local APIC in the P6 family and Pentium processors is an architectural subset of the Intel® 82489DX external APIC. See Section 22.27.1, “Software Visible Differences Between the Local APIC and the 82489DX.” The APIC architecture used in the Pentium 4 and Intel Xeon processors (called the xAPIC architecture) is an extension of the APIC architecture found in the P6 family processors. The primary difference between the APIC and xAPIC architectures is that with the xAPIC architecture, the local APICs and the I/O APIC communicate through the system bus. With the APIC architecture, they communication through the APIC bus (see Section 10.2, “System Bus Vs. APIC Bus”). Also, some APIC architectural features have been extended and/or modified in the xAPIC architecture. These extensions and modifications are described in Section 10.4 through Section 10.10. The basic operating mode of the xAPIC is xAPIC mode. The x2APIC architecture is an extension of the xAPIC architecture, primarily to increase processor addressability. The x2APIC architecture provides backward compatibility to the xAPIC architecture and forward extendability for future Intel platform innovations. These extensions and modifications are supported by a new mode of execution (x2APIC mode) are detailed in Section 10.12. 10.4 LOCAL APIC The following sections describe the architecture of the local APIC and how to detect it, identify it, and determine its status. Descriptions of how to program the local APIC are given in Section 10.5.1, “Local Vector Table,” and Section 10.6.1, “Interrupt Command Register (ICR).” 10.4.1 The Local APIC Block Diagram Figure 10-4 gives a functional block diagram for the local APIC. Software interacts with the local APIC by reading and writing its registers. APIC registers are memory-mapped to a 4-KByte region of the processor’s physical address space with an initial starting address of FEE00000H. For correct APIC operation, this address space must be mapped to an area of memory that has been designated as strong uncacheable (UC). See Section 11.3, “Methods of Caching Available.” In MP system configurations, the APIC registers for Intel 64 or IA-32 processors on the system bus are initially mapped to the same 4-KByte region of the physical address space. Software has the option of changing initial mapping to a different 4-KByte region for all the local APICs or of mapping the APIC registers for each local APIC to its own 4-KByte region. Section 10.4.5, “Relocating the Local APIC Registers,” describes how to relocate the base address for APIC registers. On processors supporting x2APIC architecture (indicated by CPUID.01H:ECX[21] = 1), the local APIC supports operation both in xAPIC mode and (if enabled by software) in x2APIC mode. x2APIC mode provides extended processor addressability (see Section 10.12). 10-4 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) NOTE For P6 family, Pentium 4, and Intel Xeon processors, the APIC handles all memory accesses to addresses within the 4-KByte APIC register space internally and no external bus cycles are produced. For the Pentium processors with an on-chip APIC, bus cycles are produced for accesses to the APIC register space. Thus, for software intended to run on Pentium processors, system software should explicitly not map the APIC register space to regular system memory. Doing so can result in an invalid opcode exception (#UD) being generated or unpredictable execution. DATA/ADDR Version Register EOI Register Timer Task Priority Register Current Count Register Initial Count Register Processor Priority Register Divide Configuration Register INTA INTR Prioritizer EXTINT Local Vector Table From CPU Core To CPU Core Timer LINT0/1 Perf. Mon. (Internal Interrupt) Thermal Sensor (Internal Interrupt) Local Interrupts 0,1 In-Service Register (ISR) Interrupt Request Register (IRR) Performance Monitoring Counters1 Trigger Mode Register (TMR) Thermal Sensor2 Error Vec[3:0] & TMR Bit Arb. ID Register4 Error Status Register Local Interrupts Register Select Vector Decode Acceptance Logic Dest. Mode & Vector Protocol Translation Logic APIC ID Register Logical Destination Register Destination Format Register INIT NMI SMI To CPU Core Interrupt Command Register (ICR) Spurious Vector Register Processor System Bus3 1. Introduced in P6 family processors. 2. Introduced in the Pentium 4 and Intel Xeon processors. 3. Three-wire APIC bus in P6 family and Pentium processors. 4. Not implemented in Pentium 4 and Intel Xeon processors. Figure 10-4. Local APIC Structure Table 10-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit aligned 32-bit loads or stores. Some processors may support loads and stores of less than 32 bits to some of the APIC registers. This is model specific behavior and is not guaranteed to work on all processors. Any Vol. 3A 10-5 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) FP/MMX/SSE access to an APIC register, or any access that touches bytes 4 through 15 of an APIC register may cause undefined behavior and must not be executed. This undefined behavior could include hangs, incorrect results or unexpected exceptions, including machine checks, and may vary between implementations. Wider registers (64-bit or 256-bit) must be accessed using multiple 32-bit loads or stores, with all accesses being 128-bit aligned. The local APIC registers listed in Table 10-1 are not MSRs. The only MSR associated with the programming of the local APIC is the IA32_APIC_BASE MSR (see Section 10.4.3, “Enabling or Disabling the Local APIC”). NOTE In processors based on Intel microarchitecture code name Nehalem1 the Local APIC ID Register is no longer Read/Write; it is Read Only. Table 10-1 Local APIC Register Address Map Address Register Name Software Read/Write FEE0 0000H Reserved FEE0 0010H Reserved FEE0 0020H Local APIC ID Register Read/Write. FEE0 0030H Local APIC Version Register Read Only. FEE0 0040H Reserved FEE0 0050H Reserved FEE0 0060H Reserved FEE0 0070H Reserved FEE0 0080H Task Priority Register (TPR) Read/Write. FEE0 0090H Arbitration Priority Register1 (APR) Read Only. FEE0 00A0H Processor Priority Register (PPR) Read Only. FEE0 00B0H EOI Register Write Only. FEE0 00C0H Remote Read Register1 (RRD) Read Only FEE0 00D0H Logical Destination Register Read/Write. FEE0 00E0H Destination Format Register Read/Write (see Section 10.6.2.2). FEE0 00F0H Spurious Interrupt Vector Register Read/Write (see Section 10.9. FEE0 0100H In-Service Register (ISR); bits 31:0 Read Only. FEE0 0110H In-Service Register (ISR); bits 63:32 Read Only. FEE0 0120H In-Service Register (ISR); bits 95:64 Read Only. FEE0 0130H In-Service Register (ISR); bits 127:96 Read Only. FEE0 0140H In-Service Register (ISR); bits 159:128 Read Only. FEE0 0150H In-Service Register (ISR); bits 191:160 Read Only. FEE0 0160H In-Service Register (ISR); bits 223:192 Read Only. FEE0 0170H In-Service Register (ISR); bits 255:224 Read Only. FEE0 0180H Trigger Mode Register (TMR); bits 31:0 Read Only. FEE0 0190H Trigger Mode Register (TMR); bits 63:32 Read Only. FEE0 01A0H Trigger Mode Register (TMR); bits 95:64 Read Only. 1. See Table 2-1, “CPUID Signature Values of DisplayFamily_DisplayModel,” on page 1, and Section 2.8, “MSRs In the Intel® Microarchitecture Code Name Nehalem” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4 to determine which processors are based on Nehalem microarchitecture. 10-6 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-1 Local APIC Register Address Map (Contd.) Address Register Name Software Read/Write FEE0 01B0H Trigger Mode Register (TMR); bits 127:96 Read Only. FEE0 01C0H Trigger Mode Register (TMR); bits 159:128 Read Only. FEE0 01D0H Trigger Mode Register (TMR); bits 191:160 Read Only. FEE0 01E0H Trigger Mode Register (TMR); bits 223:192 Read Only. FEE0 01F0H Trigger Mode Register (TMR); bits 255:224 Read Only. FEE0 0200H Interrupt Request Register (IRR); bits 31:0 Read Only. FEE0 0210H Interrupt Request Register (IRR); bits 63:32 Read Only. FEE0 0220H Interrupt Request Register (IRR); bits 95:64 Read Only. FEE0 0230H Interrupt Request Register (IRR); bits 127:96 Read Only. FEE0 0240H Interrupt Request Register (IRR); bits 159:128 Read Only. FEE0 0250H Interrupt Request Register (IRR); bits 191:160 Read Only. FEE0 0260H Interrupt Request Register (IRR); bits 223:192 Read Only. FEE0 0270H Interrupt Request Register (IRR); bits 255:224 Read Only. FEE0 0280H Error Status Register Read Only. FEE0 0290H through FEE0 02E0H Reserved FEE0 02F0H LVT Corrected Machine Check Interrupt (CMCI) Register Read/Write. FEE0 0300H Interrupt Command Register (ICR); bits 0-31 Read/Write. FEE0 0310H Interrupt Command Register (ICR); bits 32-63 Read/Write. FEE0 0320H LVT Timer Register Read/Write. FEE0 0330H LVT Thermal Sensor Register2 Read/Write. Register3 FEE0 0340H LVT Performance Monitoring Counters Read/Write. FEE0 0350H LVT LINT0 Register Read/Write. FEE0 0360H LVT LINT1 Register Read/Write. FEE0 0370H LVT Error Register Read/Write. FEE0 0380H Initial Count Register (for Timer) Read/Write. FEE0 0390H Current Count Register (for Timer) Read Only. FEE0 03A0H through FEE0 03D0H Reserved FEE0 03E0H Divide Configuration Register (for Timer) FEE0 03F0H Reserved Read/Write. NOTES: 1. Not supported in the Pentium 4 and Intel Xeon processors. The Illegal Register Access bit (7) of the ESR will not be set when writing to these registers. 2. Introduced in the Pentium 4 and Intel Xeon processors. This APIC register and its associated function are implementation dependent and may not be present in future IA-32 or Intel 64 processors. 3. Introduced in the Pentium Pro processor. This APIC register and its associated function are implementation dependent and may not be present in future IA-32 or Intel 64 processors. Vol. 3A 10-7 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in the EAX register, bit 9 of the CPUID feature flags returned in the EDX register indicates the presence (set) or absence (clear) of a local APIC. 10.4.3 Enabling or Disabling the Local APIC The local APIC can be enabled or disabled in either of two ways: 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 10-5): — When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an IA-32 processor without an on-chip APIC. The CPUID feature flag for the APIC (see Section 10.4.2, “Presence of the Local APIC”) is also set to 0. — When IA32_APIC_BASE[11] is set to 0, processor APICs based on the 3-wire APIC bus cannot be generally re-enabled until a system hardware reset. The 3-wire bus loses track of arbitration that would be necessary for complete re-enabling. Certain APIC functionality can be enabled (for example: performance and thermal monitoring interrupt generation). — For processors that use Front Side Bus (FSB) delivery of interrupts, software may disable or enable the APIC by setting and resetting IA32_APIC_BASE[11]. A hardware reset is not required to re-start APIC functionality, if software guarantees no interrupt will be sent to the APIC as IA32_APIC_BASE[11] is cleared. — When IA32_APIC_BASE[11] is set to 0, prior initialization to the APIC may be lost and the APIC may return to the state described in Section 10.4.7.1, “Local APIC State After Power-Up or Reset.” 2. Using the APIC software enable/disable flag in the spurious-interrupt vector register (see Figure 10-23): — If IA32_APIC_BASE[11] is 1, software can temporarily disable a local APIC at any time by clearing the APIC software enable/disable flag in the spurious-interrupt vector register (see Figure 10-23). The state of the local APIC when in this software-disabled state is described in Section 10.4.7.2, “Local APIC State After It Has Been Software Disabled.” — When the local APIC is in the software-disabled state, it can be re-enabled at any time by setting the APIC software enable/disable flag to 1. For the Pentium processor, the APICEN pin (which is shared with the PICD1 pin) is used during power-up or reset to disable the local APIC. Note that each entry in the LVT has a mask bit that can be used to inhibit interrupts from being delivered to the processor from selected local interrupt sources (the LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and/or the internal APIC error detector). 10.4.4 Local APIC Status and Location The status and location of the local APIC are contained in the IA32_APIC_BASE MSR (see Figure 10-5). MSR bit functions are described below: • BSP flag, bit 8 ⎯ Indicates if the processor is the bootstrap processor (BSP). See Section 8.4, “MultipleProcessor (MP) Initialization.” Following a power-up or reset, this flag is set to 1 for the processor selected as the BSP and set to 0 for the remaining processors (APs). • APIC Global Enable flag, bit 11 ⎯ Enables or disables the local APIC (see Section 10.4.3, “Enabling or Disabling the Local APIC”). This flag is available in the Pentium 4, Intel Xeon, and P6 family processors. It is not guaranteed to be available or available at the same location in future Intel 64 or IA-32 processors. • APIC Base field, bits 12 through 35 ⎯ Specifies the base address of the APIC registers. This 24-bit value is extended by 12 bits at the low end to form the base address. This automatically aligns the address on a 4-KByte boundary. Following a power-up or reset, the field is set to FEE0 0000H. • Bits 0 through 7, bits 9 and 10, and bits MAXPHYADDR2 through 63 in the IA32_APIC_BASE MSR are reserved. 10-8 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 63 MAXPHYADDR Reserved 12 11 10 9 8 7 0 APIC Base APIC Base—Base physical address APIC global enable/disable BSP—Processor is BSP Reserved Figure 10-5. IA32_APIC_BASE MSR (APIC_BASE_MSR in P6 Family) 10.4.5 Relocating the Local APIC Registers The Pentium 4, Intel Xeon, and P6 family processors permit the starting address of the APIC registers to be relocated from FEE00000H to another physical address by modifying the value in the base address field of the IA32_APIC_BASE MSR. This extension of the APIC architecture is provided to help resolve conflicts with memory maps of existing systems and to allow individual processors in an MP system to map their APIC registers to different locations in physical memory. 10.4.6 Local APIC ID At power up, system hardware assigns a unique APIC ID to each local APIC on the system bus (for Pentium 4 and Intel Xeon processors) or on the APIC bus (for P6 family and Pentium processors). The hardware assigned APIC ID is based on system topology and includes encoding for socket position and cluster information (see Figure 8-2 and Section 8.9.1, “Hierarchical Mapping of Shared Resources”). In MP systems, the local APIC ID is also used as a processor ID by the BIOS and the operating system. Some processors permit software to modify the APIC ID. However, the ability of software to modify the APIC ID is processor model specific. Because of this, operating system software should avoid writing to the local APIC ID register. The value returned by bits 31-24 of the EBX register (when the CPUID instruction is executed with a source operand value of 1 in the EAX register) is always the Initial APIC ID (determined by the platform initialization). This is true even if software has changed the value in the Local APIC ID register. The processor receives the hardware assigned APIC ID (or Initial APIC ID) by sampling pins A11# and A12# and pins BR0# through BR3# (for the Pentium 4, Intel Xeon, and P6 family processors) and pins BE0# through BE3# (for the Pentium processor). The APIC ID latched from these pins is stored in the APIC ID field of the local APIC ID register (see Figure 10-6), and is used as the Initial APIC ID for the processor. 2. The MAXPHYADDR is 36 bits for processors that do not support CPUID leaf 80000008H, or indicated by CPUID.80000008H:EAX[bits 7:0] for processors that support CPUID leaf 80000008H. Vol. 3A 10-9 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) P6 family and Pentium processors xAPIC Mode Address: 0FEE0 0020H Value after reset: 0000 0000H 31 27 24 0 APIC ID Reserved Pentium 4 processors, Xeon processors, and later processors 31 24 0 APIC ID x2APIC Mode Reserved 31 MSR Address: 802H 0 x2APIC ID Figure 10-6. Local APIC ID Register For the P6 family and Pentium processors, the local APIC ID field in the local APIC ID register is 4 bits. Encodings 0H through EH can be used to uniquely identify 15 different processors connected to the APIC bus. For the Pentium 4 and Intel Xeon processors, the xAPIC specification extends the local APIC ID field to 8 bits. These can be used to identify up to 255 processors in the system. 10.4.7 Local APIC State The following sections describe the state of the local APIC and its registers following a power-up or reset, after the local APIC has been software disabled, following an INIT reset, and following an INIT-deassert message. x2APIC will introduce 32-bit ID; see Section 10.12. 10.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or reset of the processor, the state of local APIC and its registers are as follows: • The following registers are reset to all 0s. • • • IRR, ISR, TMR, ICR, LDR, and TPR. Timer initial count and timer current count registers. Divide configuration register. • • • • The DFR register is reset to all 1s. • The spurious-interrupt vector register is initialized to 000000FFH. By setting bit 8 to 0, software disables the local APIC. • If the processor is the only processor in the system or it is the BSP in an MP system (see Section 8.4.1, “BSP and AP Processors”); the local APIC will respond normally to INIT and NMI messages, to INIT# signals and to STPCLK# signals. If the processor is in an MP system and has been designated as an AP; the local APIC will respond the same as for the BSP. In addition, it will respond to SIPI messages. For P6 family processors only, an AP will not respond to a STPCLK# signal. The LVT register is reset to 0s except for the mask bits; these are set to 1s. The local APIC version register is not affected. The local APIC ID register is set to a unique APIC ID. (Pentium and P6 family processors only). The Arb ID register is set to the value in the APIC ID register. 10-10 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.4.7.2 Local APIC State After It Has Been Software Disabled When the APIC software enable/disable flag in the spurious interrupt vector register has been explicitly cleared (as opposed to being cleared during a power up or reset), the local APIC is temporarily disabled (see Section 10.4.3, “Enabling or Disabling the Local APIC”). The operation and response of a local APIC while in this software-disabled state is as follows: • • • The local APIC will respond normally to INIT, NMI, SMI, and SIPI messages. • The reception of any interrupt or transmission of any IPIs that are in progress when the local APIC is disabled are completed before the local APIC enters the software-disabled state. • • The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored. Pending interrupts in the IRR and ISR registers are held and require masking or handling by the CPU. The local APIC can still issue IPIs. It is software’s responsibility to avoid issuing IPIs through the IPI mechanism and the ICR register if sending interrupts through this mechanism is not desired. (For Pentium and P6 family processors) The local APIC continues to listen to all bus messages in order to keep its arbitration ID synchronized with the rest of the system. 10.4.7.3 Local APIC State After an INIT Reset (“Wait-for-SIPI” State) An INIT reset of the processor can be initiated in either of two ways: • • By asserting the processor’s INIT# pin. By sending the processor an INIT IPI (an IPI with the delivery mode set to INIT). Upon receiving an INIT through either of these mechanisms, the processor responds by beginning the initialization process of the processor core and the local APIC. The state of the local APIC following an INIT reset is the same as it is after a power-up or hardware reset, except that the APIC ID and arbitration ID registers are not affected. This state is also referred to at the “wait-for-SIPI” state (see also: Section 8.4.2, “MP Initialization Protocol Requirements and Restrictions”). 10.4.7.4 Local APIC State After It Receives an INIT-Deassert IPI Only the Pentium and P6 family processors support the INIT-deassert IPI. An INIT-deassert IPI has no affect on the state of the APIC, other than to reload the arbitration ID register with the value in the APIC ID register. 10.4.8 Local APIC Version Register The local APIC contains a hardwired version register. Software can use this register to identify the APIC version (see Figure 10-7). In addition, the register specifies the number of entries in the local vector table (LVT) for a specific implementation. The fields in the local APIC version register are as follows: Version The version numbers of the local APIC: 0XH 82489DX discrete APIC. 10H - 15H Integrated APIC. Other values reserved. Max LVT Entry Shows the number of LVT entries minus 1. For the Pentium 4 and Intel Xeon processors (which have 6 LVT entries), the value returned in the Max LVT field is 5; for the P6 family processors (which have 5 LVT entries), the value returned is 4; for the Pentium processor (which has 4 LVT entries), the value returned is 3. For processors based on the Intel microarchitecture code name Nehalem (which has 7 LVT entries) and onward, the value returned is 6. Suppress EOI-broadcasts Indicates whether software can inhibit the broadcast of EOI message by setting bit 12 of the Spurious Interrupt Vector Register; see Section 10.8.5 and Section 10.9. Vol. 3A 10-11 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 31 25 24 23 Reserved 16 15 Max LVT Entry 8 7 Reserved 0 Version Support for EOI-broadcast suppression Value after reset: 00BN 00VVH V = Version, N = # of LVT entries minus 1, B = 1 if EOI-broadcast suppression supported Address: FEE0 0030H Figure 10-7. Local APIC Version Register 10.5 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring counters, the thermal sensor, and the internal APIC error detector. Local interrupt handling facilities include: the LVT, the error status register (ESR), the divide configuration register (DCR), and the initial count and current count registers. 10.5.1 Local Vector Table The local vector table (LVT) allows software to specify the manner in which the local interrupts are delivered to the processor core. It consists of the following 32-bit APIC registers (see Figure 10-8), one for each local interrupt: • LVT CMCI Register (FEE0 02F0H) — Specifies interrupt delivery when an overflow condition of corrected machine check error count reaching a threshold value occurred in a machine check bank supporting CMCI (see Section 15.5.1, “CMCI Local APIC Interface”). • LVT Timer Register (FEE0 0320H) — Specifies interrupt delivery when the APIC timer signals an interrupt (see Section 10.5.4, “APIC Timer”). • LVT Thermal Monitor Register (FEE0 0330H) — Specifies interrupt delivery when the thermal sensor generates an interrupt (see Section 14.7.2, “Thermal Monitor”). This LVT entry is implementation specific, not architectural. If implemented, it will always be at base address FEE0 0330H. • LVT Performance Counter Register (FEE0 0340H) — Specifies interrupt delivery when a performance counter generates an interrupt on overflow (see Section 18.6.3.5.8, “Generating an Interrupt on Overflow”). This LVT entry is implementation specific, not architectural. If implemented, it is not guaranteed to be at base address FEE0 0340H. • LVT LINT0 Register (FEE0 0350H) — Specifies interrupt delivery when an interrupt is signaled at the LINT0 pin. • LVT LINT1 Register (FEE0 0360H) — Specifies interrupt delivery when an interrupt is signaled at the LINT1 pin. • LVT Error Register (FEE0 0370H) — Specifies interrupt delivery when the APIC detects an internal error (see Section 10.5.3, “Error Handling”). The LVT performance counter register and its associated interrupt were introduced in the P6 processors and are also present in the Pentium 4 and Intel Xeon processors. The LVT thermal monitor register and its associated interrupt were introduced in the Pentium 4 and Intel Xeon processors. The LVT CMCI register and its associated interrupt were introduced in the Intel Xeon 5500 processors. As shown in Figures 10-8, some of these fields and flags are not available (and reserved) for some entries. The setup information that can be specified in the registers of the LVT table is as follows: Vector 10-12 Vol. 3A Interrupt vector number. ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 31 19 18 17 16 15 13 12 11 8 7 0 Timer Vector Address: FEE0 0320H Value after Reset: 0001 0000H Timer Mode 00: One-shot 01: Periodic 10: TSC-Deadline Delivery Status 0: Idle 1: Send Pending Mask† 0: Not Masked 1: Masked Interrupt Input Pin Polarity Delivery Mode 000: Fixed 010: SMI 100: NMI 111: ExtlNT 101: INIT All other combinations are reserved Remote IRR Trigger Mode 0: Edge 1: Level 31 17 11 10 8 7 0 CMCI Vector LINT0 Vector LINT1 Vector Error Vector Performance Mon. Counters Vector Thermal Sensor Vector 16 15 14 13 12 Reserved † (Pentium 4 and Intel Xeon processors.) When a performance monitoring counters interrupt is generated, the mask bit for its associated LVT entry is set. Address: FEE0 02F0H Address: FEE0 0350H Address: FEE0 0360H Address: FEE0 0370H Address: FEE0 0340H Address: FEE0 0330H Value After Reset: 0001 0000H Figure 10-8. Local Vector Table (LVT) Delivery Mode Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate as intended when used in conjunction with a specific trigger mode. The allowable delivery modes are as follows: 000 (Fixed) Delivers the interrupt specified in the vector field. 010 (SMI) Delivers an SMI interrupt to the processor core through the processor’s local SMI signal path. When using this delivery mode, the vector field should be set to 00H for future compatibility. 100 (NMI) Delivers an NMI interrupt to the processor. The vector information is ignored. 101 (INIT) Delivers an INIT request to the processor core, which causes the processor to perform an INIT. When using this delivery mode, the vector field should Vol. 3A 10-13 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) be set to 00H for future compatibility. Not supported for the LVT CMCI register, the LVT thermal monitor register, or the LVT performance counter register. 110 Reserved; not supported for any LVT register. 111 (ExtINT) Causes the processor to respond to the interrupt as if the interrupt originated in an externally connected (8259A-compatible) interrupt controller. A special INTA bus cycle corresponding to ExtINT, is routed to the external controller. The external controller is expected to supply the vector information. The APIC architecture supports only one ExtINT source in a system, usually contained in the compatibility bridge. Only one processor in the system should have an LVT entry configured to use the ExtINT delivery mode. Not supported for the LVT CMCI register, the LVT thermal monitor register, or the LVT performance counter register. Delivery Status (Read Only) Indicates the interrupt delivery status, as follows: 0 (Idle) There is currently no activity for this interrupt source, or the previous interrupt from this source was delivered to the processor core and accepted. 1 (Send Pending) Indicates that an interrupt from this source has been delivered to the processor core but has not yet been accepted (see Section 10.5.5, “Local Interrupt Acceptance”). Interrupt Input Pin Polarity Specifies the polarity of the corresponding interrupt pin: (0) active high or (1) active low. Remote IRR Flag (Read Only) For fixed mode, level-triggered interrupts; this flag is set when the local APIC accepts the interrupt for servicing and is reset when an EOI command is received from the processor. The meaning of this flag is undefined for edge-triggered interrupts and other delivery modes. Trigger Mode Selects the trigger mode for the local LINT0 and LINT1 pins: (0) edge sensitive and (1) level sensitive. This flag is only used when the delivery mode is Fixed. When the delivery mode is NMI, SMI, or INIT, the trigger mode is always edge sensitive. When the delivery mode is ExtINT, the trigger mode is always level sensitive. The timer and error interrupts are always treated as edge sensitive. If the local APIC is not used in conjunction with an I/O APIC and fixed delivery mode is selected; the Pentium 4, Intel Xeon, and P6 family processors will always use level-sensitive triggering, regardless if edge-sensitive triggering is selected. Software should always set the trigger mode in the LVT LINT1 register to 0 (edge sensitive). Level-sensitive interrupts are not supported for LINT1. Mask Interrupt mask: (0) enables reception of the interrupt and (1) inhibits reception of the interrupt. When the local APIC handles a performance-monitoring counters interrupt, it automatically sets the mask flag in the LVT performance counter register. This flag is set to 1 on reset. It can be cleared only by software. Timer Mode Bits 18:17 selects the timer mode (see Section 10.5.4): (00b) one-shot mode using a count-down value, (01b) periodic mode reloading a count-down value, (10b) TSC-Deadline mode using absolute target value in IA32_TSC_DEADLINE MSR (see Section 10.5.4.1), (11b) is reserved. 10.5.2 Valid Interrupt Vectors The Intel 64 and IA-32 architectures define 256 vector numbers, ranging from 0 through 255 (see Section 6.2, “Exception and Interrupt Vectors”). Local and I/O APICs support 240 of these vectors (in the range of 16 to 255) as valid interrupts. 10-14 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) When an interrupt vector in the range of 0 to 15 is sent or received through the local APIC, the APIC indicates an illegal vector in its Error Status Register (see Section 10.5.3, “Error Handling”). The Intel 64 and IA-32 architectures reserve vectors 16 through 31 for predefined interrupts, exceptions, and Intel-reserved encodings (see Table 6-1). However, the local APIC does not treat vectors in this range as illegal. When an illegal vector value (0 to 15) is written to an LVT entry and the delivery mode is Fixed (bits 8-11 equal 0), the APIC may signal an illegal vector error, without regard to whether the mask bit is set or whether an interrupt is actually seen on the input. 10.5.3 Error Handling The local APIC records errors detected during interrupt handling in the error status register (ESR). The format of the ESR is given in Figure 10-9; it contains the following flags: 8 7 6 5 4 3 2 1 0 31 Reserved Illegal Register Address1 Received Illegal Vector Send Illegal Vector Redirectable IPI2 Receive Accept Error3 Send Accept Error3 Receive Checksum Error3 Send Checksum Error3 Address: FEE0 0280H Value after reset: 0H NOTES: 1. Used only by Intel Core, Pentium 4, Intel Xeon, and P6 family processors; reserved on the Pentium processor. 2. Used only by some Intel Core and Intel Xeon processors; reserved on other processors. 3. Used only by the P6 family and Pentium processors; reserved on Intel Core, Pentium 4 and Intel Xeon processors. Figure 10-9. Error Status Register (ESR) • Bit 0: Send Checksum Error. Set when the local APIC detects a checksum error for a message that it sent on the APIC bus. Used only on P6 family and Pentium processors. • Bit 1: Receive Checksum Error. Set when the local APIC detects a checksum error for a message that it received on the APIC bus. Used only on P6 family and Pentium processors. • Bit 2: Send Accept Error. Set when the local APIC detects that a message it sent was not accepted by any APIC on the APIC bus. Used only on P6 family and Pentium processors. • Bit 3: Receive Accept Error. Set when the local APIC detects that the message it received was not accepted by any APIC on the APIC bus, including itself. Used only on P6 family and Pentium processors. • Bit 4: Redirectable IPI. Set when the local APIC detects an attempt to send an IPI with the lowest-priority delivery mode and the local APIC does not support the sending of such IPIs. This bit is used on some Intel Core and Intel Xeon processors. As noted in Section 10.6.2, the ability of a processor to send a lowest-priority IPI is model-specific and should be avoided. Vol. 3A 10-15 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • Bit 5: Send Illegal Vector. Set when the local APIC detects an illegal vector (one in the range 0 to 15) in the message that it is sending. This occurs as the result of a write to the ICR (in both xAPIC and x2APIC modes) or to SELF IPI register (x2APIC mode only) with an illegal vector. If the local APIC does not support the sending of lowest-priority IPIs and software writes the ICR to send a lowest-priority IPI with an illegal vector, the local APIC sets only the “redirectable IPI” error bit. The interrupt is not processed and hence the “Send Illegal Vector” bit is not set in the ESR. • Bit 6: Receive Illegal Vector. Set when the local APIC detects an illegal vector (one in the range 0 to 15) in an interrupt message it receives or in an interrupt generated locally from the local vector table or via a self IPI. Such interrupts are not delivered to the processor; the local APIC will never set an IRR bit in the range 0 to 15. • Bit 7: Illegal Register Address Set when the local APIC is in xAPIC mode and software attempts to access a register that is reserved in the processor's local-APIC register-address space; see Table 10-1. (The local-APIC register-address space comprises the 4 KBytes at the physical address specified in the IA32_APIC_BASE MSR.) Used only on Intel Core, Intel Atom™, Pentium 4, Intel Xeon, and P6 family processors. In x2APIC mode, software accesses the APIC registers using the RDMSR and WRMSR instructions. Use of one of these instructions to access a reserved register cause a general-protection exception (see Section 10.12.1.3). They do not set the “Illegal Register Access” bit in the ESR. The ESR is a write/read register. Before attempt to read from the ESR, software should first write to it. (The value written does not affect the values read subsequently; only zero may be written in x2APIC mode.) This write clears any previously logged errors and updates the ESR with any errors detected since the last write to the ESR. This write also rearms the APIC error interrupt triggering mechanism. The LVT Error Register (see Section 10.5.1) allows specification of the vector of the interrupt to be delivered to the processor core when APIC error is detected. The register also provides a means of masking an APIC-error interrupt. This masking only prevents delivery of APIC-error interrupts; the APIC continues to record errors in the ESR. 10.5.4 APIC Timer The local APIC unit contains a 32-bit programmable timer that is available to software to time events or operations. This timer is set up by programming four registers: the divide configuration register (see Figure 10-10), the initialcount and current-count registers (see Figure 10-11), and the LVT timer register (see Figure 10-8). If CPUID.06H:EAX.ARAT[bit 2] = 1, the processor’s APIC timer runs at a constant rate regardless of P-state transitions and it continues to run at the same rate in deep C-states. If CPUID.06H:EAX.ARAT[bit 2] = 0 or if CPUID 06H is not supported, the APIC timer may temporarily stop while the processor is in deep C-states or during transitions caused by Enhanced Intel SpeedStep® Technology. 4 3 2 1 0 31 0 Reserved Address: FEE0 03E0H Value after reset: 0H Divide Value (bits 0, 1 and 3) 000: Divide by 2 001: Divide by 4 010: Divide by 8 011: Divide by 16 100: Divide by 32 101: Divide by 64 110: Divide by 128 111: Divide by 1 Figure 10-10. Divide Configuration Register The APIC timer frequency will be the processor’s bus clock or core crystal clock frequency (when TSC/core crystal clock ratio is enumerated in CPUID leaf 0x15) divided by the value specified in the divide configuration register. 10-16 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 31 0 Initial Count Current Count Address: Initial Count FEE0 0380H Current Count FEE0 0390H Value after reset: 0H Figure 10-11. Initial Count and Current Count Registers The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one-shot mode, the timer is started by programming its initial-count register. The initial count value is then copied into the currentcount register and count-down begins. After the timer reaches zero, an timer interrupt is generated and the timer remains at its 0 value until reprogrammed. In periodic mode, the current-count register is automatically reloaded from the initial-count register when the count reaches 0 and a timer interrupt is generated, and the count-down is repeated. If during the count-down process the initial-count register is set, counting will restart, using the new initial-count value. The initial-count register is a read-write register; the current-count register is read only. A write of 0 to the initial-count register effectively stops the local APIC timer, in both one-shot and periodic mode. The LVT timer register determines the vector number that is delivered to the processor with the timer interrupt that is generated when the timer count reaches zero. The mask flag in the LVT timer register can be used to mask the timer interrupt. 10.5.4.1 TSC-Deadline Mode The mode of operation of the local-APIC timer is determined by the LVT Timer Register. Specifically: • • If CPUID.01H:ECX.TSC_Deadline[bit 24] = 0, the mode is determined by bit 17 of the register. If CPUID.01H:ECX.TSC_Deadline[bit 24] = 1, the mode is determined by bits 18:17. See Figure 10-8. (If CPUID.01H:ECX.TSC_Deadline[bit 24] = 0, bit 18 of the register is reserved.) The supported timer modes are given in Table 10-2. The three modes of the local APIC timer are mutually exclusive. Table 10-2. Local APIC Timer Modes LVT Bits [18:17] Timer Mode 00b One-shot mode, program count-down value in an initial-count register. See Section 10.5.4 01b Periodic mode, program interval value in an initial-count register. See Section 10.5.4 10b TSC-Deadline mode, program target value in IA32_TSC_DEADLINE MSR. 11b Reserved TSC-deadline mode allows software to use the local APIC timer to signal an interrupt at an absolute time. In TSCdeadline mode, writes to the initial-count register are ignored; and current-count register always reads 0. Instead, timer behavior is controlled using the IA32_TSC_DEADLINE MSR. The IA32_TSC_DEADLINE MSR (MSR address 6E0H) is a per-logical processor MSR that specifies the time at which a timer interrupt should occur. Writing a non-zero 64-bit value into IA32_TSC_DEADLINE arms the timer. An interrupt is generated when the logical processor’s time-stamp counter equals or exceeds the target value in the IA32_TSC_DEADLINE MSR.3 When the timer generates an interrupt, it disarms itself and clears the IA32_TSC_DEADLINE MSR. Thus, each write to the IA32_TSC_DEADLINE MSR generates at most one timer interrupt. In TSC-deadline mode, writing 0 to the IA32_TSC_DEADLINE MSR disarms the local-APIC timer. Transitioning between TSC-deadline mode and other timer modes also disarms the timer. Vol. 3A 10-17 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The hardware reset value of the IA32_TSC_DEADLINE MSR is 0. In other timer modes (LVT bit 18 = 0), the IA32_TSC_DEADLINE MSR reads zero and writes are ignored. Software can configure the TSC-deadline timer to deliver a single interrupt using the following algorithm: 1. Detect support for TSC-deadline mode by verifying CPUID.1:ECX.24 = 1. 2. Select the TSC-deadline mode by programming bits 18:17 of the LVT Timer register with 10b. 3. Program the IA32_TSC_DEADLINE MSR with the target TSC value at which the timer interrupt is desired. This causes the processor to arm the timer. 4. The processor generates a timer interrupt when the value of time-stamp counter is greater than or equal to that of IA32_TSC_DEADLINE. It then disarms the timer and clear the IA32_TSC_DEADLINE MSR. (Both the timestamp counter and the IA32_TSC_DEADLINE MSR are 64-bit unsigned integers.) 5. Software can re-arm the timer by repeating step 3. The following are usage guidelines for TSC-deadline mode: • Writes to the IA32_TSC_DEADLINE MSR are not serialized. Therefore, system software should not use WRMSR to the IA32_TSC_DEADLINE MSR as a serializing instruction. Read and write accesses to the IA32_TSC_DEADLINE and other MSR registers will occur in program order. • • Software can disarm the timer at any time by writing 0 to the IA32_TSC_DEADLINE MSR. • If software disarms the timer or postpones the deadline, race conditions may result in the delivery of a spurious timer interrupt. Software is expected to detect such spurious interrupts by checking the current value of the time-stamp counter to confirm that the interrupt was desired.4 • In xAPIC mode (in which the local-APIC registers are memory-mapped), software must order the memorymapped write to the LVT entry that enables TSC-deadline mode and any subsequent WRMSR to the IA32_TSC_DEADLINE MSR. Software can assure proper ordering by executing the MFENCE instruction after the memory-mapped write and before any WRMSR. (In x2APIC mode, the WRMSR instruction is used to write to the LVT entry. The processor ensures the ordering of this write and any subsequent WRMSR to the deadline; no fencing is required.) If timer is armed, software can change the deadline (forward or backward) by writing a new value to the IA32_TSC_DEADLINE MSR. 10.5.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 10-17. If the interrupt is accepted, it is logged into the IRR register and handled by the processor according to its priority (see Section 10.8.4, “Interrupt Acceptance for Fixed Interrupts”). If the interrupt is not accepted, it is sent back to the local APIC and retried. 10.6 ISSUING INTERPROCESSOR INTERRUPTS The following sections describe the local APIC facilities that are provided for issuing interprocessor interrupts (IPIs) from software. The primary local APIC facility for issuing IPIs is the interrupt command register (ICR). The ICR can be used for the following functions: 3. If the logical processor is in VMX non-root operation, a read of the time-stamp counter (using either RDMSR, RDTSC, or RDTSCP) may not return the actual value of the time-stamp counter; see Chapter 27 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. It is the responsibility of software operating in VMX root operation to coordinate the virtualization of the time-stamp counter and the IA32_TSC_DEADLINE MSR. 4. If the logical processor is in VMX non-root operation, a read of the time-stamp counter (using either RDMSR, RDTSC, or RDTSCP) may not return the actual value of the time-stamp counter; see Chapter 27 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. It is the responsibility of software operating in VMX root operation to coordinate the virtualization of the time-stamp counter and the IA32_TSC_DEADLINE MSR. 10-18 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • • To send an interrupt to another processor. • • To direct the processor to interrupt itself (perform a self interrupt). To allow a processor to forward an interrupt that it received but did not service to another processor for servicing. To deliver special IPIs, such as the start-up IPI (SIPI) message, to other processors. Interrupts generated with this facility are delivered to the other processors in the system through the system bus (for Pentium 4 and Intel Xeon processors) or the APIC bus (for P6 family and Pentium processors). The ability for a processor to send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. 10.6.1 Interrupt Command Register (ICR) The interrupt command register (ICR) is a 64-bit5 local APIC register (see Figure 10-12) that allows software running on the processor to specify and send interprocessor interrupts (IPIs) to other processors in the system. To send an IPI, software must set up the ICR to indicate the type of IPI message to be sent and the destination processor or processors. (All fields of the ICR are read-write by software with the exception of the delivery status field, which is read-only.) The act of writing to the low doubleword of the ICR causes the IPI to be sent. 63 56 55 32 Destination Field 31 Reserved 20 19 18 17 16 15 14 13 12 11 10 Reserved Destination Shorthand 00: No Shorthand 01: Self 10: All Including Self 11: All Excluding Self Reserved 8 7 0 Vector Delivery Mode 000: Fixed 001: Lowest Priority1 010: SMI 011: Reserved 100: NMI 101: INIT 110: Start Up 111: Reserved Destination Mode 0: Physical 1: Logical Delivery Status 0: Idle 1: Send Pending Address: FEE0 0300H (0 - 31) FEE0 0310H (32 - 63) Value after Reset: 0H Level 0 = De-assert 1 = Assert Trigger Mode 0: Edge 1: Level NOTE: 1. The ability of a processor to send Lowest Priority IPI is model specific. Figure 10-12. Interrupt Command Register (ICR) 5. In XAPIC mode the ICR is addressed as two 32-bit registers, ICR_LOW (FFE0 0300H) and ICR_HIGH (FFE0 0310H). In x2APIC mode, the ICR uses MSR 830H. Vol. 3A 10-19 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ICR consists of the following fields. Vector The vector number of the interrupt being sent. Delivery Mode Specifies the type of IPI to be sent. This field is also know as the IPI message type field. 000 (Fixed) Delivers the interrupt specified in the vector field to the target processor or processors. 001 (Lowest Priority) Same as fixed mode, except that the interrupt is delivered to the processor executing at the lowest priority among the set of processors specified in the destination field. The ability for a processor to send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. 010 (SMI) Delivers an SMI interrupt to the target processor or processors. The vector field must be programmed to 00H for future compatibility. 011 (Reserved) 100 (NMI) Delivers an NMI interrupt to the target processor or processors. The vector information is ignored. 101 (INIT) Delivers an INIT request to the target processor or processors, which causes them to perform an INIT. As a result of this IPI message, all the target processors perform an INIT. The vector field must be programmed to 00H for future compatibility. 101 (INIT Level De-assert) (Not supported in the Pentium 4 and Intel Xeon processors.) Sends a synchronization message to all the local APICs in the system to set their arbitration IDs (stored in their Arb ID registers) to the values of their APIC IDs (see Section 10.7, “System and APIC Bus Arbitration”). For this delivery mode, the level flag must be set to 0 and trigger mode flag to 1. This IPI is sent to all processors, regardless of the value in the destination field or the destination shorthand field; however, software should specify the “all including self” shorthand. 110 (Start-Up) Sends a special “start-up” IPI (called a SIPI) to the target processor or processors. The vector typically points to a start-up routine that is part of the BIOS boot-strap code (see Section 8.4, “Multiple-Processor (MP) Initialization”). IPIs sent with this delivery mode are not automatically retried if the source APIC is unable to deliver it. It is up to the software to determine if the SIPI was not successfully delivered and to reissue the SIPI if necessary. Destination Mode Selects either physical (0) or logical (1) destination mode (see Section 10.6.2, “Determining IPI Destination”). Delivery Status (Read Only) Indicates the IPI delivery status, as follows: 0 (Idle) Indicates that this local APIC has completed sending any previous IPIs. 1 (Send Pending) Indicates that this local APIC has not completed sending the last IPI. Level 10-20 Vol. 3A For the INIT level de-assert delivery mode this flag must be set to 0; for all other delivery modes it must be set to 1. (This flag has no meaning in Pentium 4 and Intel Xeon processors, and will always be issued as a 1.) ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Trigger Mode Selects the trigger mode when using the INIT level de-assert delivery mode: edge (0) or level (1). It is ignored for all other delivery modes. (This flag has no meaning in Pentium 4 and Intel Xeon processors, and will always be issued as a 0.) Destination Shorthand Indicates whether a shorthand notation is used to specify the destination of the interrupt and, if so, which shorthand is used. Destination shorthands are used in place of the 8-bit destination field, and can be sent by software using a single write to the low doubleword of the ICR. Shorthands are defined for the following cases: software self interrupt, IPIs to all processors in the system including the sender, IPIs to all processors in the system excluding the sender. 00: (No Shorthand) The destination is specified in the destination field. 01: (Self) The issuing APIC is the one and only destination of the IPI. This destination shorthand allows software to interrupt the processor on which it is executing. An APIC implementation is free to deliver the self-interrupt message internally or to issue the message to the bus and “snoop” it as with any other IPI message. 10: (All Including Self) The IPI is sent to all processors in the system including the processor sending the IPI. The APIC will broadcast an IPI message with the destination field set to FH for Pentium and P6 family processors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the IPI. The APIC broadcasts a message with the physical destination mode and destination field set to FH for Pentium and P6 family processors and to FFH for Pentium 4 and Intel Xeon processors. Support for this destination shorthand in conjunction with the lowest-priority delivery mode is model specific. For Pentium 4 and Intel Xeon processors, when this shorthand is used together with lowest priority delivery mode, the IPI may be redirected back to the issuing processor. Destination Specifies the target processor or processors. This field is only used when the destination shorthand field is set to 00B. If the destination mode is set to physical, then bits 56 through 59 contain the APIC ID of the target processor for Pentium and P6 family processors and bits 56 through 63 contain the APIC ID of the target processor the for Pentium 4 and Intel Xeon processors. If the destination mode is set to logical, the interpretation of the 8-bit destination field depends on the settings of the DFR and LDR registers of the local APICs in all the processors in the system (see Section 10.6.2, “Determining IPI Destination”). Not all combinations of options for the ICR are valid. Table 10-3 shows the valid combinations for the fields in the ICR for the Pentium 4 and Intel Xeon processors; Table 10-4 shows the valid combinations for the fields in the ICR for the P6 family processors. Also note that the lower half of the ICR may not be preserved over transitions to the deepest C-States. ICR operation in x2APIC mode is discussed in Section 10.12.9. Vol. 3A 10-21 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-3 Valid Combinations for the Pentium 4 and Intel Xeon Processors’ Local xAPIC Interrupt Command Register Destination Shorthand No Shorthand Valid/Invalid Valid 2 Trigger Mode Delivery Mode Destination Mode Modes1 Physical or Logical Edge All Level All Modes Physical or Logical No Shorthand Invalid Self Valid Edge Fixed X3 Self Invalid2 Level Fixed X Self Invalid X Lowest Priority, NMI, INIT, SMI, Start-Up X All Including Self Valid Edge Fixed X All Including Self Invalid2 Level Fixed X All Including Self Invalid X X All Excluding Self Valid Edge Lowest Priority, NMI, INIT, SMI, Start-Up Fixed, Lowest Priority1,4, NMI, INIT, SMI, Start-Up All Excluding Self Invalid2 Level FIxed, Lowest Priority4, NMI, INIT, SMI, Start-Up X X NOTES: 1. The ability of a processor to send a lowest priority IPI is model specific. 2. For these interrupts, if the trigger mode bit is 1 (Level), the local xAPIC will override the bit setting and issue the interrupt as an edge triggered interrupt. 3. X means the setting is ignored. 4. When using the “lowest priority” delivery mode and the “all excluding self” destination, the IPI can be redirected back to the issuing APIC, which is essentially the same as the “all including self” destination mode. Table 10-4 Valid Combinations for the P6 Family Processors’ Local APIC Interrupt Command Register Destination Shorthand No Shorthand Valid/Invalid Valid Trigger Mode Edge Delivery Mode Destination Mode 1 Physical or Logical All Modes No Shorthand Valid 2 Level Fixed, Lowest No Shorthand Valid3 Level INIT Physical or Logical Self Valid Edge Fixed X4 Self Valid2 Level Fixed X 5 Priority1, NMI Physical or Logical Self Invalid X Lowest Priority, NMI, INIT, SMI, Start-Up X All including Self Valid Edge Fixed X All including Self Valid2 Level Fixed X X Lowest Priority, NMI, INIT, SMI, Start-Up All including Self 5 Invalid 1 X All excluding Self Valid Edge All Modes X All excluding Self Valid2 Level Fixed, Lowest Priority1, NMI X Level SMI, Start-Up X Level INIT X Level SMI, Start-Up X All excluding Self Invalid All excluding Self 3 X Valid Invalid 5 5 NOTES: 1. The ability of a processor to send a lowest priority IPI is model specific. 2. Treated as edge triggered if level bit is set to 1, otherwise ignored. 3. Treated as edge triggered when Level bit is set to 1; treated as “INIT Level Deassert” message when level bit is set to 0 (deassert). Only INIT level deassert messages are allowed to have the level bit set to 0. For all other messages the level bit must be set to 1. 4. X means the setting is ignored. 5. The behavior of the APIC is undefined. 10-22 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.6.2 Determining IPI Destination The destination of an IPI6 can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields within the registers: • ICR Register — The following fields in the ICR register are used to specify the destination of an IPI. — Destination Mode — Selects one of two destination modes (physical or logical). — Destination Field — In physical destination mode, used to specify the APIC ID of the destination processor; in logical destination mode, used to specify a message destination address (MDA) that can be used to select specific processors in clusters. — Destination Shorthand — A quick method of specifying all processors, all excluding self, or self as the destination. — Delivery mode, Lowest Priority — Architecturally specifies that a lowest-priority arbitration mechanism be used to select a destination processor from a specified group of processors. The ability of a processor to send a lowest priority IPI is model specific and should be avoided by BIOS and operating system software. • Local destination register (LDR) — Used in conjunction with the logical destination mode and MDAs to select the destination processors. • Destination format register (DFR) — Used in conjunction with the logical destination mode and MDAs to select the destination processors. How the ICR, LDR, and DFR are used to select an IPI destination depends on the destination mode used: physical, logical, broadcast/self, or lowest-priority delivery mode. These destination modes are described in the following sections. 10.6.2.1 Physical Destination Mode In physical destination mode, the destination processor is specified by its local APIC ID (see Section 10.4.6, “Local APIC ID”). For Pentium 4 and Intel Xeon processors, either a single destination (local APIC IDs 00H through FEH) or a broadcast to all APICs (the APIC ID is FFH) may be specified in physical destination mode. A broadcast IPI (bits 28-31 of the MDA are 1's) or I/O subsystem initiated interrupt with lowest priority delivery mode is not supported in physical destination mode and must not be configured by software. Also, for any nonbroadcast IPI or I/O subsystem initiated interrupt with lowest priority delivery mode, software must ensure that APICs defined in the interrupt address are present and enabled to receive interrupts. For the P6 family and Pentium processors, a single destination is specified in physical destination mode with a local APIC ID of 0H through 0EH, allowing up to 15 local APICs to be addressed on the APIC bus. A broadcast to all local APICs is specified with 0FH. NOTE The number of local APICs that can be addressed on the system bus may be restricted by hardware. 10.6.2.2 Logical Destination Mode In logical destination mode, IPI destination is specified using an 8-bit message destination address (MDA), which is entered in the destination field of the ICR. Upon receiving an IPI message that was sent using logical destination mode, a local APIC compares the MDA in the message with the values in its LDR and DFR to determine if it should accept and handle the IPI. For both configurations of logical destination mode, when combined with lowest priority delivery mode, software is responsible for ensuring that all of the local APICs included in or addressed by the IPI or I/O subsystem interrupt are present and enabled to receive the interrupt. Figure 10-13 shows the layout of the logical destination register (LDR). The 8-bit logical APIC ID field in this register is used to create an identifier that can be compared with the MDA. 6. Determination of IPI destinations in x2APIC mode is discussed in Section 10.12.10. Vol. 3A 10-23 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) NOTE The logical APIC ID should not be confused with the local APIC ID that is contained in the local APIC ID register. 31 0 24 23 Logical APIC ID Reserved Address: 0FEE0 00D0H Value after reset: 0000 0000H Figure 10-13. Logical Destination Register (LDR) Figure 10-14 shows the layout of the destination format register (DFR). The 4-bit model field in this register selects one of two models (flat or cluster) that can be used to interpret the MDA when using logical destination mode. 31 0 28 Model Reserved (All 1s) Flat model: 1111B Cluster model: 0000B Address: 0FEE0 00E0H Value after reset: FFFF FFFFH Figure 10-14. Destination Format Register (DFR) The interpretation of MDA for the two models is described in the following paragraphs. 1. Flat Model — This model is selected by programming DFR bits 28 through 31 to 1111. Here, a unique logical APIC ID can be established for up to 8 local APICs by setting a different bit in the logical APIC ID field of the LDR for each local APIC. A group of local APICs can then be selected by setting one or more bits in the MDA. Each local APIC performs a bit-wise AND of the MDA and its logical APIC ID. If a true condition (non-zero) is detected, the local APIC accepts the IPI message. A broadcast to all APICs is achieved by setting the MDA to 1s. 2. Cluster Model — This model is selected by programming DFR bits 28 through 31 to 0000. This model supports two basic destination schemes: flat cluster and hierarchical cluster. The flat cluster destination model is only supported for P6 family and Pentium processors. Using this model, all APICs are assumed to be connected through the APIC bus. Bits 60 through 63 of the MDA contains the encoded address of the destination cluster and bits 56 through 59 identify up to four local APICs within the cluster (each bit is assigned to one local APIC in the cluster, as in the flat connection model). To identify one or more local APICs, bits 60 through 63 of the MDA are compared with bits 28 through 31 of the LDR to determine if a local APIC is part of the cluster. Bits 56 through 59 of the MDA are compared with Bits 24 through 27 of the LDR to identify a local APICs within the cluster. Sets of processors within a cluster can be specified by writing the target cluster address in bits 60 through 63 of the MDA and setting selected bits in bits 56 through 59 of the MDA, corresponding to the chosen members of the cluster. In this mode, 15 clusters (with cluster addresses of 0 through 14) each having 4 local APICs can be specified in the message. For the P6 and Pentium processor’s local APICs, however, the APIC arbitration ID supports only 15 APIC agents. Therefore, the total number of processors and their local APICs supported in this mode is limited to 15. Broadcast to all local APICs is achieved by setting all destination bits to one. This guarantees a match on all clusters and selects all APICs in each cluster. A broadcast IPI or I/O subsystem broadcast interrupt with lowest priority delivery mode is not supported in cluster mode and must not be configured by software. The hierarchical cluster destination model can be used with Pentium 4, Intel Xeon, P6 family, or Pentium processors. With this model, a hierarchical network can be created by connecting different flat clusters via 10-24 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) independent system or APIC buses. This scheme requires a cluster manager within each cluster, which is responsible for handling message passing between system or APIC buses. One cluster contains up to 4 agents. Thus 15 cluster managers, each with 4 agents, can form a network of up to 60 APIC agents. Note that hierarchical APIC networks requires a special cluster manager device, which is not part of the local or the I/O APIC units. NOTES All processors that have their APIC software enabled (using the spurious vector enable/disable bit) must have their DFRs (Destination Format Registers) programmed identically. The default mode for DFR is flat mode. If you are using cluster mode, DFRs must be programmed before the APIC is software enabled. Since some chipsets do not accurately track a system view of the logical mode, program DFRs as soon as possible after starting the processor. 10.6.2.3 Broadcast/Self Delivery Mode The destination shorthand field of the ICR allows the delivery mode to be by-passed in favor of broadcasting the IPI to all the processors on the system bus and/or back to itself (see Section 10.6.1, “Interrupt Command Register (ICR)”). Three destination shorthands are supported: self, all excluding self, and all including self. The destination mode is ignored when a destination shorthand is used. 10.6.2.4 Lowest Priority Delivery Mode With lowest priority delivery mode, the ICR is programmed to send an IPI to several processors on the system bus, using the logical or shorthand destination mechanism for selecting the processor. The selected processors then arbitrate with one another over the system bus or the APIC bus, with the lowest-priority processor accepting the IPI. For systems based on the Intel Xeon processor, the chipset bus controller accepts messages from the I/O APIC agents in the system and directs interrupts to the processors on the system bus. When using the lowest priority delivery mode, the chipset chooses a target processor to receive the interrupt out of the set of possible targets. The Pentium 4 processor provides a special bus cycle on the system bus that informs the chipset of the current task priority for each logical processor in the system. The chipset saves this information and uses it to choose the lowest priority processor when an interrupt is received. For systems based on P6 family processors, the processor priority used in lowest-priority arbitration is contained in the arbitration priority register (APR) in each local APIC. Figure 10-15 shows the layout of the APR. 31 8 7 4 3 0 Reserved Address: FEE0 0090H Value after reset: 0H Arbitration Priority Class Arbitration Priority Sub-Class Figure 10-15. Arbitration Priority Register (APR) The APR value is computed as follows: IF (TPR[7:4] ≥ IRRV[7:4]) AND (TPR[7:4] > ISRV[7:4]) THEN APR[7:0] ← TPR[7:0] ELSE APR[7:4] ← max(TPR[7:4] AND ISRV[7:4], IRRV[7:4]) APR[3:0] ← 0. Vol. 3A 10-25 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Here, the TPR value is the task priority value in the TPR (see Figure 10-18), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 10-20) or 00H (if no IRR bit is set), and the ISRV value is the vector number for the highest priority bit that is set in the ISR (see Figure 10-20). Following arbitration among the destination processors, the processor with the lowest value in its APR handles the IPI and the other processors ignore it. (P6 family and Pentium processors.) For these processors, if a focus processor exists, it may accept the interrupt, regardless of its priority. A processor is said to be the focus of an interrupt if it is currently servicing that interrupt or if it has a pending request for that interrupt. For Intel Xeon processors, the concept of a focus processor is not supported. In operating systems that use the lowest priority delivery mode but do not update the TPR, the TPR information saved in the chipset will potentially cause the interrupt to be always delivered to the same processor from the logical set. This behavior is functionally backward compatible with the P6 family processor but may result in unexpected performance implications. 10.6.3 IPI Delivery and Acceptance When the low double-word of the ICR is written to, the local APIC creates an IPI message from the information contained in the ICR and sends the message out on the system bus (Pentium 4 and Intel Xeon processors) or the APIC bus (P6 family and Pentium processors). The manner in which these IPIs are handled after being issues in described in Section 10.8, “Handling Interrupts.” 10.7 SYSTEM AND APIC BUS ARBITRATION When several local APICs and the I/O APIC are sending IPI and interrupt messages on the system bus (or APIC bus), the order in which the messages are sent and handled is determined through bus arbitration. For the Pentium 4 and Intel Xeon processors, the local and I/O APICs use the arbitration mechanism defined for the system bus to determine the order in which IPIs are handled. This mechanism is non-architectural and cannot be controlled by software. For the P6 family and Pentium processors, the local and I/O APICs use an APIC-based arbitration mechanism to determine the order in which IPIs are handled. Here, each local APIC is given an arbitration priority of from 0 to 15, which the I/O APIC uses during arbitration to determine which local APIC should be given access to the APIC bus. The local APIC with the highest arbitration priority always wins bus access. Upon completion of an arbitration round, the winning local APIC lowers its arbitration priority to 0 and the losing local APICs each raise theirs by 1. The current arbitration priority for a local APIC is stored in a 4-bit, software-transparent arbitration ID (Arb ID) register. During reset, this register is initialized to the APIC ID number (stored in the local APIC ID register). The INIT level-deassert IPI, which is issued with and ICR command, can be used to resynchronize the arbitration priorities of the local APICs by resetting Arb ID register of each agent to its current APIC ID value. (The Pentium 4 and Intel Xeon processors do not implement the Arb ID register.) Section 10.10, “APIC Bus Message Passing Mechanism and Protocol (P6 Family, Pentium Processors),” describes the APIC bus arbitration protocols and bus message formats, while Section 10.6.1, “Interrupt Command Register (ICR),” describes the INIT level de-assert IPI message. Note that except for the SIPI IPI (see Section 10.6.1, “Interrupt Command Register (ICR)”), all bus messages that fail to be delivered to their specified destination or destinations are automatically retried. Software should avoid situations in which IPIs are sent to disabled or nonexistent local APICs, causing the messages to be resent repeatedly. Additionally, interrupt sources that target the APIC should be masked or changed to no longer target the APIC. 10.8 HANDLING INTERRUPTS When a local APIC receives an interrupt from a local source, an interrupt message from an I/O APIC, or an IPI, the manner in which it handles the message depends on processor implementation, as described in the following sections. 10-26 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.8.1 Interrupt Handling with the Pentium 4 and Intel Xeon Processors With the Pentium 4 and Intel Xeon processors, the local APIC handles the local interrupts, interrupt messages, and IPIs it receives as follows: 1. It determines if it is the specified destination or not (see Figure 10-16). If it is the specified destination, it accepts the message; if it is not, it discards the message. Wait to Receive Bus Message Discard Message No Belong to Destination? Yes Accept Message Figure 10-16. Interrupt Acceptance Flow Chart for the Local APIC (Pentium 4 and Intel Xeon Processors) 2. If the local APIC determines that it is the designated destination for the interrupt and if the interrupt request is an NMI, SMI, INIT, ExtINT, or SIPI, the interrupt is sent directly to the processor core for handling. 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC sets the appropriate bit in the IRR. 4. When interrupts are pending in the IRR register, the local APIC dispatches them to the processor one at a time, based on their priority and the current processor priority in the PPR (see Section 10.8.3.1, “Task and Processor Priorities”). 5. When a fixed interrupt has been dispatched to the processor core for handling, the completion of the handler routine is indicated with an instruction in the instruction handler code that writes to the end-of-interrupt (EOI) register in the local APIC (see Section 10.8.5, “Signaling Interrupt Servicing Completion”). The act of writing to the EOI register causes the local APIC to delete the interrupt from its ISR queue and (for level-triggered interrupts) send a message on the bus indicating that the interrupt handling has been completed. (A write to the EOI register must not be included in the handler routine for an NMI, SMI, INIT, ExtINT, or SIPI.) 10.8.2 Interrupt Handling with the P6 Family and Pentium Processors With the P6 family and Pentium processors, the local APIC handles the local interrupts, interrupt messages, and IPIs it receives as follows (see Figure 10-17). 1. (IPIs only) The local APIC examines the IPI message to determines if it is the specified destination for the IPI as described in Section 10.6.2, “Determining IPI Destination.” If it is the specified destination, it continues its acceptance procedure; if it is not the destination, it discards the IPI message. When the message specifies lowest-priority delivery mode, the local APIC will arbitrate with the other processors that were designated as recipients of the IPI message (see Section 10.6.2.4, “Lowest Priority Delivery Mode”). 2. If the local APIC determines that it is the designated destination for the interrupt and if the interrupt request is an NMI, SMI, INIT, ExtINT, or INIT-deassert interrupt, or one of the MP protocol IPI messages (BIPI, FIPI, and SIPI), the interrupt is sent directly to the processor core for handling. 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC looks for an open slot in one of its two pending interrupt queues contained in the IRR and ISR registers (see Figure 10-20). If a slot is available (see Section 10.8.4, “Interrupt Acceptance for Fixed Interrupts”), places the interrupt in the slot. If a slot is not available, it rejects the interrupt request and sends it back to the sender with a retry message. 4. When interrupts are pending in the IRR register, the local APIC dispatches them to the processor one at a time, based on their priority and the current processor priority in the PPR (see Section 10.8.3.1, “Task and Processor Priorities”). 5. When a fixed interrupt has been dispatched to the processor core for handling, the completion of the handler routine is indicated with an instruction in the instruction handler code that writes to the end-of-interrupt (EOI) Vol. 3A 10-27 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Wait to Receive Bus Message No Discard Message Belong to Destination? Yes Is it NMI/SMI/INIT /ExtINT? Yes Accept Message No Fixed Delivery Lowest Priority P6 Family Processor Specific No Set Status to Retry Am I Focus? Is Interrupt Slot Available? Yes Yes Is Status a Retry? Yes Accept Message Yes Discard Message No No Other Focus? No Set Status to Retry No Accept Message Is Interrupt Slot Available? Yes No Arbitrate Am I Winner? Yes Accept Message Figure 10-17. Interrupt Acceptance Flow Chart for the Local APIC (P6 Family and Pentium Processors) register in the local APIC (see Section 10.8.5, “Signaling Interrupt Servicing Completion”). The act of writing to the EOI register causes the local APIC to delete the interrupt from its queue and (for level-triggered interrupts) send a message on the bus indicating that the interrupt handling has been completed. (A write to the EOI register must not be included in the handler routine for an NMI, SMI, INIT, ExtINT, or SIPI.) The following sections describe the acceptance of interrupts and their handling by the local APIC and processor in greater detail. 10.8.3 Interrupt, Task, and Processor Priority Each interrupt delivered to the processor through the local APIC has a priority based on its vector number. The local APIC uses this priority to determine when to service the interrupt relative to the other activities of the processor, including the servicing of other interrupts. Each interrupt vector is an 8-bit value. The interrupt-priority class is the value of bits 7:4 of the interrupt vector. The lowest interrupt-priority class is 1 and the highest is 15; interrupts with vectors in the range 0–15 (with interrupt-priority class 0) are illegal and are never delivered. Because vectors 0–31 are reserved for dedicated uses by the Intel 64 and IA-32 architectures, software should configure interrupt vectors to use interrupt-priority classes in the range 2–15. Each interrupt-priority class encompasses 16 vectors. The relative priority of interrupts within an interrupt-priority class is determined by the value of bits 3:0 of the vector number. The higher the value of those bits, the higher the 10-28 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) priority within that interrupt-priority class. Thus, each interrupt vector comprises two parts, with the high 4 bits indicating its interrupt-priority class and the low 4 bits indicating its ranking within the interrupt-priority class. 10.8.3.1 Task and Processor Priorities The local APIC also defines a task priority and a processor priority that determine the order in which interrupts are handled. The task-priority class is the value of bits 7:4 of the task-priority register (TPR), which can be written by software (TPR is a read/write register); see Figure 10-18. 31 8 7 4 3 0 Reserved Address: FEE0 0080H Value after reset: 0H Task-Priority Class Task-Priority Sub-Class Figure 10-18. Task-Priority Register (TPR) NOTE In this discussion, the term “task” refers to a software defined task, process, thread, program, or routine that is dispatched to run on the processor by the operating system. It does not refer to an IA-32 architecture defined task as described in Chapter 7, “Task Management.” The task priority allows software to set a priority threshold for interrupting the processor. This mechanism enables the operating system to temporarily block low priority interrupts from disturbing high-priority work that the processor is doing. The ability to block such interrupts using task priority results from the way that the TPR controls the value of the processor-priority register (PPR).7 The processor-priority class is a value in the range 0–15 that is maintained in bits 7:4 of the processor-priority register (PPR); see Figure 10-19. The PPR is a read-only register. The processor-priority class represents the current priority at which the processor is executing. 31 8 7 4 3 0 Reserved Address: FEE0 00A0H Value after reset: 0H Processor-Priority Class Processor-Priority Sub-Class Figure 10-19. Processor-Priority Register (PPR) The value of the PPR is based on the value of TPR and the value ISRV; ISRV is the vector number of the highest priority bit that is set in the ISR or 00H if no bit is set in the ISR. (See Section 10.8.4 for more details on the ISR.) The value of PPR is determined as follows: • PPR[7:4] (the processor-priority class) the maximum of TPR[7:4] (the task- priority class) and ISRV[7:4] (the priority of the highest priority interrupt in service). • PPR[3:0] (the processor-priority sub-class) is determined as follows: — If TPR[7:4] > ISRV[7:4], PPR[3:0] is TPR[3:0] (the task-priority sub-class). — If TPR[7:4] < ISRV[7:4], PPR[3:0] is 0. — If TPR[7:4] = ISRV[7:4], PPR[3:0] may be either TPR[3:0] or 0. The actual behavior is model-specific. 7. The TPR also determines the arbitration priority of the local processor; see Section 10.6.2.4, “Lowest Priority Delivery Mode.” Vol. 3A 10-29 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The processor-priority class determines the priority threshold for interrupting the processor. The processor will deliver only those interrupts that have an interrupt-priority class higher than the processor-priority class in the PPR. If the processor-priority class is 0, the PPR does not inhibit the delivery any interrupt; if it is 15, the processor inhibits the delivery of all interrupts. (The processor-priority mechanism does not affect the delivery of interrupts with the NMI, SMI, INIT, ExtINT, INIT-deassert, and start-up delivery modes.) The processor does not use the processor-priority sub-class to determine which interrupts to delivery and which to inhibit. (The processor uses the processor-priority sub-class only to satisfy reads of the PPR.) 10.8.4 Interrupt Acceptance for Fixed Interrupts The local APIC queues the fixed interrupts that it accepts in one of two interrupt pending registers: the interrupt request register (IRR) or in-service register (ISR). These two 256-bit read-only registers are shown in Figure 10-20. The 256 bits in these registers represent the 256 possible vectors; vectors 0 through 15 are reserved by the APIC (see also: Section 10.5.2, “Valid Interrupt Vectors”). NOTE All interrupts with an NMI, SMI, INIT, ExtINT, start-up, or INIT-deassert delivery mode bypass the IRR and ISR registers and are sent directly to the processor core for servicing. 255 16 15 0 Reserved IRR Reserved ISR Reserved TMR Addresses: IRR FEE0 0200H - FEE0 0270H ISR FEE0 0100H - FEE0 0170H TMR FEE0 0180H - FEE0 01F0H Value after reset: 0H Figure 10-20. IRR, ISR and TMR Registers The IRR contains the active interrupt requests that have been accepted, but not yet dispatched to the processor for servicing. When the local APIC accepts an interrupt, it sets the bit in the IRR that corresponds the vector of the accepted interrupt. When the processor core is ready to handle the next interrupt, the local APIC clears the highest priority IRR bit that is set and sets the corresponding ISR bit. The vector for the highest priority bit set in the ISR is then dispatched to the processor core for servicing. While the processor is servicing the highest priority interrupt, the local APIC can send additional fixed interrupts by setting bits in the IRR. When the interrupt service routine issues a write to the EOI register (see Section 10.8.5, “Signaling Interrupt Servicing Completion”), the local APIC responds by clearing the highest priority ISR bit that is set. It then repeats the process of clearing the highest priority bit in the IRR and setting the corresponding bit in the ISR. The processor core then begins executing the service routing for the highest priority bit set in the ISR. If more than one interrupt is generated with the same vector number, the local APIC can set the bit for the vector both in the IRR and the ISR. This means that for the Pentium 4 and Intel Xeon processors, the IRR and ISR can queue two interrupts for each interrupt vector: one in the IRR and one in the ISR. Any additional interrupts issued for the same interrupt vector are collapsed into the single bit in the IRR. For the P6 family and Pentium processors, the IRR and ISR registers can queue no more than two interrupts per interrupt vector and will reject other interrupts that are received within the same vector. If the local APIC receives an interrupt with an interrupt-priority class higher than that of the interrupt currently in service, and interrupts are enabled in the processor core, the local APIC dispatches the higher priority interrupt to the processor immediately (without waiting for a write to the EOI register). The currently executing interrupt handler is then interrupted so the higher-priority interrupt can be handled. When the handling of the higher-priority interrupt has been completed, the servicing of the interrupted interrupt is resumed. 10-30 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The trigger mode register (TMR) indicates the trigger mode of the interrupt (see Figure 10-20). Upon acceptance of an interrupt into the IRR, the corresponding TMR bit is cleared for edge-triggered interrupts and set for leveltriggered interrupts. If a TMR bit is set when an EOI cycle for its corresponding interrupt vector is generated, an EOI message is sent to all I/O APICs. 10.8.5 Signaling Interrupt Servicing Completion For all interrupts except those delivered with the NMI, SMI, INIT, ExtINT, the start-up, or INIT-Deassert delivery mode, the interrupt handler must include a write to the end-of-interrupt (EOI) register (see Figure 10-21). This write must occur at the end of the handler routine, sometime before the IRET instruction. This action indicates that the servicing of the current interrupt is complete and the local APIC can issue the next interrupt from the ISR. 31 0 Address: 0FEE0 00B0H Value after reset: 0H Figure 10-21. EOI Register Upon receiving an EOI, the APIC clears the highest priority bit in the ISR and dispatches the next highest priority interrupt to the processor. If the terminated interrupt was a level-triggered interrupt, the local APIC also sends an end-of-interrupt message to all I/O APICs. System software may prefer to direct EOIs to specific I/O APICs rather than having the local APIC send end-ofinterrupt messages to all I/O APICs. Software can inhibit the broadcast of EOI message by setting bit 12 of the Spurious Interrupt Vector Register (see Section 10.9). If this bit is set, a broadcast EOI is not generated on an EOI cycle even if the associated TMR bit indicates that the current interrupt was level-triggered. The default value for the bit is 0, indicating that EOI broadcasts are performed. Bit 12 of the Spurious Interrupt Vector Register is reserved to 0 if the processor does not support suppression of EOI broadcasts. Support for EOI-broadcast suppression is reported in bit 24 in the Local APIC Version Register (see Section 10.4.8); the feature is supported if that bit is set to 1. When supported, the feature is available in both xAPIC mode and x2APIC mode. System software desiring to perform directed EOIs for level-triggered interrupts should set bit 12 of the Spurious Interrupt Vector Register and follow each the EOI to the local xAPIC for a level triggered interrupt with a directed EOI to the I/O APIC generating the interrupt (this is done by writing to the I/O APIC’s EOI register). System software performing directed EOIs must retain a mapping associating level-triggered interrupts with the I/O APICs in the system. 10.8.6 Task Priority in IA-32e Mode In IA-32e mode, operating systems can manage the 16 interrupt-priority classes (see Section 10.8.3, “Interrupt, Task, and Processor Priority”) explicitly using the task priority register (TPR). Operating systems can use the TPR to temporarily block specific (low-priority) interrupts from interrupting a high-priority task. This is done by loading TPR with a value in which the task-priority class corresponds to the highest interrupt-priority class that is to be blocked. For example: • Loading the TPR with a task-priority class of 8 (01000B) blocks all interrupts with an interrupt-priority class of 8 or less while allowing all interrupts with an interrupt-priority class of 9 or more to be recognized. • • Loading the TPR with a task-priority class of 0 enables all external interrupts. Loading the TPR with a task-priority class of 0FH (01111B) disables all external interrupts. The TPR (shown in Figure 10-18) is cleared to 0 on reset. In 64-bit mode, software can read and write the TPR using an alternate interface, MOV CR8 instruction. The new task-priority class is established when the MOV CR8 Vol. 3A 10-31 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) instruction completes execution. Software does not need to force serialization after loading the TPR using MOV CR8. Use of the MOV CRn instruction requires a privilege level of 0. Programs running at privilege level greater than 0 cannot read or write the TPR. An attempt to do so causes a general-protection exception. The TPR is abstracted from the interrupt controller (IC), which prioritizes and manages external interrupt delivery to the processor. The IC can be an external device, such as an APIC or 8259. Typically, the IC provides a priority mechanism similar or identical to the TPR. The IC, however, is considered implementation-dependent with the under-lying priority mechanisms subject to change. CR8, by contrast, is part of the Intel 64 architecture. Software can depend on this definition remaining unchanged. Figure 10-22 shows the layout of CR8; only the low four bits are used. The remaining 60 bits are reserved and must be written with zeros. Failure to do this causes a general-protection exception. 63 4 3 0 Reserved Value after reset: 0H Figure 10-22. CR8 Register 10.8.6.1 Interaction of Task Priorities between CR8 and APIC The first implementation of Intel 64 architecture includes a local advanced programmable interrupt controller (APIC) that is similar to the APIC used with previous IA-32 processors. Some aspects of the local APIC affect the operation of the architecturally defined task priority register and the programming interface using CR8. Notable CR8 and APIC interactions are: • • The processor powers up with the local APIC enabled. • APIC.TPR[bits 7:4] = CR8[bits 3:0], APIC.TPR[bits 3:0] = 0. A read of CR8 returns a 64-bit value which is the value of TPR[bits 7:4], zero extended to 64 bits. The APIC must be enabled for CR8 to function as the TPR. Writes to CR8 are reflected into the APIC Task Priority Register. There are no ordering mechanisms between direct updates of the APIC.TPR and CR8. Operating software should implement either direct APIC TPR updates or CR8 style TPR updates but not mix them. Software can use a serializing instruction (for example, CPUID) to serialize updates between MOV CR8 and stores to the APIC. 10.9 SPURIOUS INTERRUPT A special situation may occur when a processor raises its task priority to be greater than or equal to the level of the interrupt for which the processor INTR signal is currently being asserted. If at the time the INTA cycle is issued, the interrupt that was to be dispensed has become masked (programmed by software), the local APIC will deliver a spurious-interrupt vector. Dispensing the spurious-interrupt vector does not affect the ISR, so the handler for this vector should return without an EOI. The vector number for the spurious-interrupt vector is specified in the spurious-interrupt vector register (see Figure 10-23). The functions of the fields in this register are as follows: Spurious Vector Determines the vector number to be delivered to the processor when the local APIC generates a spurious vector. (Pentium 4 and Intel Xeon processors.) Bits 0 through 7 of the this field are programmable by software. (P6 family and Pentium processors). Bits 4 through 7 of the this field are programmable by software, and bits 0 through 3 are hardwired to logical ones. Software writes to bits 0 through 3 have no effect. APIC Software Enable/Disable 10-32 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Allows software to temporarily enable (1) or disable (0) the local APIC (see Section 10.4.3, “Enabling or Disabling the Local APIC”). Focus Processor Checking Determines if focus processor checking is enabled (0) or disabled (1) when using the lowestpriority delivery mode. In Pentium 4 and Intel Xeon processors, this bit is reserved and should be cleared to 0. Suppress EOI Broadcasts Determines whether an EOI for a level-triggered interrupt causes EOI messages to be broadcast to the I/O APICs (0) or not (1). See Section 10.8.5. The default value for this bit is 0, indicating that EOI broadcasts are performed. This bit is reserved to 0 if the processor does not support EOI-broadcast suppression. NOTE Do not program an LVT or IOAPIC RTE with a spurious vector even if you set the mask bit. A spurious vector ISR does not do an EOI. If for some reason an interrupt is generated by an LVT or RTE entry, the bit in the in-service register will be left set for the spurious vector. This will mask all interrupts at the same or lower priority 31 12 11 10 9 8 7 0 Reserved EOI-Broadcast Suppression1 0: Disabled 1: Enabled Focus Processor Checking2 0: Enabled 1: Disabled APIC Software Enable/Disable 0: APIC Disabled 1: APIC Enabled Spurious Vector3 Address: FEE0 00F0H Value after reset: 0000 00FFH 1. Not supported on all processors. See bit 24 of Local APIC Version Register. 2. Not supported in Pentium 4 and Intel Xeon processors. 3. For the P6 family and Pentium processors, bits 0 through 3 are always 0. Figure 10-23. Spurious-Interrupt Vector Register (SVR) 10.10 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism and protocol. The P6 family and Pentium processors, pass messages among the local and I/O APICs on the serial APIC bus, as follows. Because only one message can be sent at a time on the APIC bus, the I/O APIC and local APICs employ a “rotating priority” arbitration protocol to gain permission to send a message on the APIC bus. One or more APICs may start sending their messages simultaneously. At the beginning of every message, each APIC presents the type of the message it is sending and its current arbitration priority on the APIC bus. This information is used for arbitration. After each arbitration cycle (within an arbitration round), only the potential winners keep driving the bus. Vol. 3A 10-33 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) By the time all arbitration cycles are completed, there will be only one APIC left driving the bus. Once a winner is selected, it is granted exclusive use of the bus, and will continue driving the bus to send its actual message. After each successfully transmitted message, all APICs increase their arbitration priority by 1. The previous winner (that is, the one that has just successfully transmitted its message) assumes a priority of 0 (lowest). An agent whose arbitration priority was 15 (highest) during arbitration, but did not send a message, adopts the previous winner’s arbitration priority, incremented by 1. Note that the arbitration protocol described above is slightly different if one of the APICs issues a special End-OfInterrupt (EOI). This high-priority message is granted the bus regardless of its sender’s arbitration priority, unless more than one APIC issues an EOI message simultaneously. In the latter case, the APICs sending the EOI messages arbitrate using their arbitration priorities. If the APICs are set up to use “lowest priority” arbitration (see Section 10.6.2.4, “Lowest Priority Delivery Mode”) and multiple APICs are currently executing at the lowest priority (the value in the APR register), the arbitration priorities (unique values in the Arb ID register) are used to break ties. All 8 bits of the APR are used for the lowest priority arbitration. 10.10.1 Bus Message Formats See Section 10.13, “APIC Bus Message Formats,” for a description of bus message formats used to transmit messages on the serial APIC bus. 10.11 MESSAGE SIGNALLED INTERRUPTS The PCI Local Bus Specification, Rev 2.2 (www.pcisig.com) introduces the concept of message signalled interrupts. As the specification indicates: “Message signalled interrupts (MSI) is an optional feature that enables PCI devices to request service by writing a system-specified message to a system-specified address (PCI DWORD memory write transaction). The transaction address specifies the message destination while the transaction data specifies the message. System software is expected to initialize the message destination and message during device configuration, allocating one or more non-shared messages to each MSI capable function.” The capabilities mechanism provided by the PCI Local Bus Specification is used to identify and configure MSI capable PCI devices. Among other fields, this structure contains a Message Data Register and a Message Address Register. To request service, the PCI device function writes the contents of the Message Data Register to the address contained in the Message Address Register (and the Message Upper Address register for 64-bit message addresses). Section 10.11.1 and Section 10.11.2 provide layout details for the Message Address Register and the Message Data Register. The operation issued by the device is a PCI write command to the Message Address Register with the Message Data Register contents. The operation follows semantic rules as defined for PCI write operations and is a DWORD operation. 10.11.1 Message Address Register Format The format of the Message Address Register (lower 32-bits) is shown in Figure 10-24. 31 20 19 0FEEH 12 11 Destination ID 4 Reserved 3 2 RH DM 1 0 XX Figure 10-24. Layout of the MSI Message Address Register 10-34 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Fields in the Message Address Register are as follows: 1. Bits 31-20 — These bits contain a fixed value for interrupt messages (0FEEH). This value locates interrupts at the 1-MByte area with a base address of 4G – 18M. All accesses to this region are directed as interrupt messages. Care must to be taken to ensure that no other device claims the region as I/O space. 2. Destination ID — This field contains an 8-bit destination ID. It identifies the message’s target processor(s). The destination ID corresponds to bits 63:56 of the I/O APIC Redirection Table Entry if the IOAPIC is used to dispatch the interrupt to the processor(s). 3. Redirection hint indication (RH) — When this bit is set, the message is directed to the processor with the lowest interrupt priority among processors that can receive the interrupt. • • When RH is 0, the interrupt is directed to the processor listed in the Destination ID field. When RH is 1 and the physical destination mode is used, the Destination ID field must not be set to FFH; it must point to a processor that is present and enabled to receive the interrupt. • When RH is 1 and the logical destination mode is active in a system using a flat addressing model, the Destination ID field must be set so that bits set to 1 identify processors that are present and enabled to receive the interrupt. • If RH is set to 1 and the logical destination mode is active in a system using cluster addressing model, then Destination ID field must not be set to FFH; the processors identified with this field must be present and enabled to receive the interrupt. 4. Destination mode (DM) — This bit indicates whether the Destination ID field should be interpreted as logical or physical APIC ID for delivery of the lowest priority interrupt. • If RH is 1 and DM is 0, the Destination ID field is in physical destination mode and only the processor in the system that has the matching APIC ID is considered for delivery of that interrupt (this means no redirection). • If RH is 1 and DM is 1, the Destination ID Field is interpreted as in logical destination mode and the redirection is limited to only those processors that are part of the logical group of processors based on the processor’s logical APIC ID and the Destination ID field in the message. The logical group of processors consists of those identified by matching the 8-bit Destination ID with the logical destination identified by the Destination Format Register and the Logical Destination Register in each local APIC. The details are similar to those described in Section 10.6.2, “Determining IPI Destination.” • If RH is 0, then the DM bit is ignored and the message is sent ahead independent of whether the physical or logical destination mode is used. 10.11.2 Message Data Register Format The layout of the Message Data Register is shown in Figure 10-25. Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below. 1. Vector — This 8-bit field contains the interrupt vector associated with the message. Values range from 010H to 0FEH. Software must guarantee that the field is not programmed with vector 00H to 0FH. 2. Delivery Mode — This 3-bit field specifies how the interrupt receipt is handled. Delivery Modes operate only in conjunction with specified Trigger Modes. Correct Trigger Modes must be guaranteed by software. Restrictions are indicated below: a. 000B (Fixed Mode) — Deliver the signal to all the agents listed in the destination. The Trigger Mode for fixed delivery mode can be edge or level. b. 001B (Lowest Priority) — Deliver the signal to the agent that is executing at the lowest priority of all agents listed in the destination field. The trigger mode can be edge or level. c. 010B (System Management Interrupt or SMI) — The delivery mode is edge only. For systems that rely on SMI semantics, the vector field is ignored but must be programmed to all zeroes for future compatibility. Vol. 3A 10-35 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 63 32 Reserved 31 16 15 Reserved Trigger Mode 0 - Edge 1 - Level Level for Trigger Mode = 0 X - Don’t care Level for Trigger Mode = 1 0 - Deassert 1 - Assert 14 13 11 10 Reserved 8 7 0 Vector Delivery Mode 000 - Fixed 001 - Lowest Priority 010 - SMI 011 - Reserved 100 - NMI 101 - INIT 110 - Reserved 111 - ExtINT Figure 10-25. Layout of the MSI Message Data Register d. 100B (NMI) — Deliver the signal to all the agents listed in the destination field. The vector information is ignored. NMI is an edge triggered interrupt regardless of the Trigger Mode Setting. e. 101B (INIT) — Deliver this signal to all the agents listed in the destination field. The vector information is ignored. INIT is an edge triggered interrupt regardless of the Trigger Mode Setting. f. 111B (ExtINT) — Deliver the signal to the INTR signal of all agents in the destination field (as an interrupt that originated from an 8259A compatible interrupt controller). The vector is supplied by the INTA cycle issued by the activation of the ExtINT. ExtINT is an edge triggered interrupt. 3. Level — Edge triggered interrupt messages are always interpreted as assert messages. For edge triggered interrupts this field is not used. For level triggered interrupts, this bit reflects the state of the interrupt input. 4. Trigger Mode — This field indicates the signal type that will trigger a message. a. 0 — Indicates edge sensitive. b. 1 — Indicates level sensitive. 10.12 EXTENDED XAPIC (X2APIC) The x2APIC architecture extends the xAPIC architecture (described in Section 10.4) in a backward compatible manner and provides forward extendability for future Intel platform innovations. Specifically, the x2APIC architecture does the following. • Retains all key elements of compatibility to the xAPIC architecture. — Delivery modes. — Interrupt and processor priorities. — Interrupt sources. — Interrupt destination types. • Provides extensions to scale processor addressability for both the logical and physical destination modes. 10-36 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • • • Adds new features to enhance performance of interrupt delivery. Reduces complexity of logical destination mode interrupt delivery on link based platform architectures. Uses MSR programming interface to access APIC registers in x2APIC mode instead of memory-mapped interfaces. Memory-mapped interface is supported when operating in xAPIC mode. 10.12.1 Detecting and Enabling x2APIC Mode Processor support for x2APIC mode can be detected by executing CPUID with EAX=1 and then checking ECX, bit 21 ECX. If CPUID.(EAX=1):ECX.21 is set , the processor supports the x2APIC capability and can be placed into the x2APIC mode. System software can place the local APIC in the x2APIC mode by setting the x2APIC mode enable bit (bit 10) in the IA32_APIC_BASE MSR at MSR address 01BH. The layout for the IA32_APIC_BASE MSR is shown in Figure 10-26. 63 36 35 Reserved 12 11 10 9 8 7 0 APIC Base APIC Base—Base physical address EN—xAPIC global enable/disable EXTD—Enable x2APIC mode BSP—Processor is BSP Reserved Figure 10-26. IA32_APIC_BASE MSR Supporting x2APIC Table 10-5, “x2APIC operating mode configurations” describe the possible combinations of the enable bit (EN - bit 11) and the extended mode bit (EXTD - bit 10) in the IA32_APIC_BASE MSR. Table 10-5. x2APIC Operating Mode Configurations xAPIC global enable (IA32_APIC_BASE[11]) x2APIC enable (IA32_APIC_BASE[10]) Description 0 0 local APIC is disabled 0 1 Invalid 1 0 local APIC is enabled in xAPIC mode 1 1 local APIC is enabled in x2APIC mode Once the local APIC has been switched to x2APIC mode (EN = 1, EXTD = 1), switching back to xAPIC mode would require system software to disable the local APIC unit. Specifically, attempting to write a value to the IA32_APIC_BASE MSR that has (EN= 1, EXTD = 0) when the local APIC is enabled and in x2APIC mode causes a general-protection exception. Once bit 10 in IA32_APIC_BASE MSR is set, the only way to leave x2APIC mode using IA32_APIC_BASE would require a WRMSR to set both bit 11 and bit 10 to zero. Section 10.12.5, “x2APIC State Transitions” provides a detailed state diagram for the state transitions allowed for the local APIC. 10.12.1.1 Instructions to Access APIC Registers In x2APIC mode, system software uses RDMSR and WRMSR to access the APIC registers. The MSR addresses for accessing the x2APIC registers are architecturally defined and specified in Section 10.12.1.2, “x2APIC Register Address Space”. Executing the RDMSR instruction with the APIC register address specified in ECX returns the content of bits 0 through 31 of the APIC registers in EAX. Bits 32 through 63 are returned in register EDX - these bits are reserved if the APIC register being read is a 32-bit register. Similarly executing the WRMSR instruction with the APIC register address in ECX, writes bits 0 to 31 of register EAX to bits 0 to 31 of the specified APIC register. If the register is a 64-bit register then bits 0 to 31 of register EDX are written to bits 32 to 63 of the APIC register. The Vol. 3A 10-37 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Interrupt Command Register is the only APIC register that is implemented as a 64-bit MSR. The semantics of handling reserved bits are defined in Section 10.12.1.3, “Reserved Bit Checking”. 10.12.1.2 x2APIC Register Address Space The MSR address range 800H through 8FFH is architecturally reserved and dedicated for accessing APIC registers in x2APIC mode. Table 10-6 lists the APIC registers that are available in x2APIC mode. When appropriate, the table also gives the offset at which each register is available on the page referenced by IA32_APIC_BASE[35:12] in xAPIC mode. There is a one-to-one mapping between the x2APIC MSRs and the legacy xAPIC register offsets with the following exceptions: • The Destination Format Register (DFR): The DFR, supported at offset 0E0H in xAPIC mode, is not supported in x2APIC mode. There is no MSR with address 80EH. • The Interrupt Command Register (ICR): The two 32-bit registers in xAPIC mode (at offsets 300H and 310H) are merged into a single 64-bit MSR in x2APIC mode (with MSR address 830H). There is no MSR with address 831H. • The SELF IPI register. This register is available only in x2APIC mode at address 83FH. In xAPIC mode, there is no register defined at offset 3F0H. MSR addresses in the range 800H–8FFH that are not listed in Table 10-6 (including 80EH and 831H) are reserved. Executions of RDMSR and WRMSR that attempt to access such addresses cause general-protection exceptions. The MSR address space is compressed to allow for future growth. Every 32 bit register on a 128-bit boundary in the legacy MMIO space is mapped to a single MSR in the local x2APIC MSR address space. The upper 32-bits of all x2APIC MSRs (except for the ICR) are reserved. Table 10-6. Local APIC Register Address Map Supported by x2APIC MSR Address (x2APIC mode) MMIO Offset (xAPIC mode) Register Name MSR R/W Semantics 802H 020H Local APIC ID register Read-only1 See Section 10.12.5.1 for initial values. 803H 030H Local APIC Version register Read-only Same version used in xAPIC mode and x2APIC mode. 808H 080H Task Priority Register (TPR) Read/write Bits 31:8 are reserved.2 80AH 0A0H Processor Priority Register (PPR) Read-only 80BH 0B0H EOI register Write-only3 WRMSR of a non-zero value causes #GP(0). 80DH 0D0H Logical Destination Register (LDR) Read-only Read/write in xAPIC mode. 80FH 0F0H Spurious Interrupt Vector Register (SVR) Read/write See Section 10.9 for reserved bits. 810H 100H In-Service Register (ISR); bits 31:0 Read-only 811H 110H ISR bits 63:32 Read-only 812H 120H ISR bits 95:64 Read-only 813H 130H ISR bits 127:96 Read-only 814H 140H ISR bits 159:128 Read-only 815H 150H ISR bits 191:160 Read-only 816H 160H ISR bits 223:192 Read-only 10-38 Vol. 3A Comments ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-6. Local APIC Register Address Map Supported by x2APIC (Contd.) MSR Address (x2APIC mode) MMIO Offset (xAPIC mode) Register Name MSR R/W Semantics 817H 170H ISR bits 255:224 Read-only 818H 180H Trigger Mode Register (TMR); bits 31:0 Read-only 819H 190H TMR bits 63:32 Read-only 81AH 1A0H TMR bits 95:64 Read-only 81BH 1B0H TMR bits 127:96 Read-only 81CH 1C0H TMR bits 159:128 Read-only 81DH 1D0H TMR bits 191:160 Read-only 81EH 1E0H TMR bits 223:192 Read-only 81FH 1F0H TMR bits 255:224 Read-only 820H 200H Interrupt Request Register (IRR); bits 31:0 Read-only 821H 210H IRR bits 63:32 Read-only 822H 220H IRR bits 95:64 Read-only 823H 230H IRR bits 127:96 Read-only 824H 240H IRR bits 159:128 Read-only 825H 250H IRR bits 191:160 Read-only 826H 260H IRR bits 223:192 Read-only 827H 270H IRR bits 255:224 Read-only 828H 280H Error Status Register (ESR) Read/write WRMSR of a non-zero value causes #GP(0). See Section 10.5.3. 82FH 2F0H LVT CMCI register Read/write See Figure 10-8 for reserved bits. 830H4 300H and 310H Interrupt Command Register (ICR) Read/write See Figure 10-28 for reserved bits 832H 320H LVT Timer register Read/write See Figure 10-8 for reserved bits. 833H 330H LVT Thermal Sensor register Read/write See Figure 10-8 for reserved bits. 834H 340H LVT Performance Monitoring register Read/write See Figure 10-8 for reserved bits. 835H 350H LVT LINT0 register Read/write See Figure 10-8 for reserved bits. 836H 360H LVT LINT1 register Read/write See Figure 10-8 for reserved bits. 837H 370H LVT Error register Read/write See Figure 10-8 for reserved bits. 838H 380H Initial Count register (for Timer) Read/write 839H 390H Current Count register (for Timer) Read-only 83EH 3E0H Divide Configuration Register (DCR; for Timer) Read/write See Figure 10-10 for reserved bits. 83FH Not available SELF IPI5 Write-only Available only in x2APIC mode. Comments NOTES: 1. WRMSR causes #GP(0) for read-only registers. Vol. 3A 10-39 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 2. WRMSR causes #GP(0) for attempts to set a reserved bit to 1 in a read/write register (including bits 63:32 of each register). 3. RDMSR causes #GP(0) for write-only registers. 4. MSR 831H is reserved; read/write operations cause general-protection exceptions. The contents of the APIC register at MMIO offset 310H are accessible in x2APIC mode through the MSR at address 830H. 5. SELF IPI register is supported only in x2APIC mode. 10.12.1.3 Reserved Bit Checking Section 10.12.1.2 and Table 10-6 specifies the reserved bit definitions for the APIC registers in x2APIC mode. Nonzero writes (by WRMSR instruction) to reserved bits to these registers will raise a general protection fault exception while reads return zeros (RsvdZ semantics). In x2APIC mode, the local APIC ID register is increased to 32 bits wide. This enables 232–1 processors to be addressable in physical destination mode. This 32-bit value is referred to as “x2APIC ID”. A processor implementation may choose to support less than 32 bits in its hardware. System software should be agnostic to the actual number of bits that are implemented. All non-implemented bits will return zeros on reads by software. The APIC ID value of FFFF_FFFFH and the highest value corresponding to the implemented bit-width of the local APIC ID register in the system are reserved and cannot be assigned to any logical processor. In x2APIC mode, the local APIC ID register is a read-only register to system software and will be initialized by hardware. It is accessed via the RDMSR instruction reading the MSR at address 0802H. Each logical processor in the system (including clusters with a communication fabric) must be configured with an unique x2APIC ID to avoid collisions of x2APIC IDs. On DP and high-end MP processors targeted to specific market segments and depending on the system configuration, it is possible that logical processors in different and “unconnected” clusters power up initialized with overlapping x2APIC IDs. In these configurations, a model-specific means may be provided in those product segments to enable BIOS and/or platform firmware to re-configure the x2APIC IDs in some clusters to provide for unique and non-overlapping system wide IDs before configuring the disconnected components into a single system. 10.12.2 x2APIC Register Availability The local APIC registers can be accessed via the MSR interface only when the local APIC has been switched to the x2APIC mode as described in Section 10.12.1. Accessing any APIC register in the MSR address range 0800H through 08FFH via RDMSR or WRMSR when the local APIC is not in x2APIC mode causes a general-protection exception. In x2APIC mode, the memory mapped interface is not available and any access to the MMIO interface will behave similar to that of a legacy xAPIC in globally disabled state. Table 10-7 provides the interactions between the legacy & extended modes and the legacy and register interfaces. Table 10-7. MSR/MMIO Interface of a Local x2APIC in Different Modes of Operation MMIO Interface MSR Interface xAPIC mode Available General-protection exception x2APIC mode Behavior identical to xAPIC in globally disabled state Available 10.12.3 MSR Access in x2APIC Mode To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use “WRMSR to APIC registers in x2APIC mode” as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction or the sequence MFENCE;LFENCE before the WRMSR. The RDMSR instruction is not serializing and this behavior is unchanged when reading APIC registers in x2APIC mode. System software accessing the APIC registers using the RDMSR instruction should not expect a serializing behavior. (Note: The MMIO-based xAPIC interface is mapped by system software as an un-cached region. Consequently, read/writes to the xAPIC-MMIO interface have serializing semantics in the xAPIC mode.) 10-40 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.12.4 VM-Exit Controls for MSRs and x2APIC Registers The VMX architecture allows a VMM to specify lists of MSRs to be loaded or stored on VMX transitions using the VMX-transition MSR areas (see VM-exit MSR-store address field, VM-exit MSR-load address field, and VM-entry MSR-load address field in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C). The X2APIC MSRs cannot to be loaded and stored on VMX transitions. A VMX transition fails if the VMM has specified that the transition should access any MSRs in the address range from 0000_0800H to 0000_08FFH (the range used for accessing the X2APIC registers). Specifically, processing of an 128-bit entry in any of the VMX-transition MSR areas fails if bits 31:0 of that entry (represented as ENTRY_LOW_DW) satisfies the expression: “ENTRY_LOW_DW & FFFFF800H = 00000800H”. Such a failure causes an associated VM entry to fail (by reloading host state) and causes an associated VM exit to lead to VMX abort. 10.12.5 x2APIC State Transitions This section provides a detailed description of the x2APIC states of a local x2APIC unit, transitions between these states as well as interactions of these states with INIT and reset. 10.12.5.1 x2APIC States The valid states for a local x2APIC unit are listed in Table 10-5. • • • • APIC disabled: IA32_APIC_BASE[EN]=0 and IA32_APIC_BASE[EXTD]=0. xAPIC mode: IA32_APIC_BASE[EN]=1 and IA32_APIC_BASE[EXTD]=0. x2APIC mode: IA32_APIC_BASE[EN]=1 and IA32_APIC_BASE[EXTD]=1. Invalid: IA32_APIC_BASE[EN]=0 and IA32_APIC_BASE[EXTD]=1. The state corresponding to EXTD=1 and EN=0 is not valid and it is not possible to get into this state. An execution of WRMSR to the IA32_APIC_BASE_MSR that attempts a transition from a valid state to this invalid state causes a general-protection exception. Figure 10-27 shows the comprehensive state transition diagram for a local x2APIC unit. On coming out of reset, the local APIC unit is enabled and is in the xAPIC mode: IA32_APIC_BASE[EN]=1 and IA32_APIC_BASE[EXTD]=0. The APIC registers are initialized as follows. • The local APIC ID is initialized by hardware with a 32 bit ID (x2APIC ID). The lowest 8 bits of the x2APIC ID are the legacy local xAPIC ID, and are stored in the upper 8 bits of the APIC register for access in xAPIC mode. • The following APIC registers are reset to all zeros for those fields that are defined in the xAPIC mode. — IRR, ISR, TMR, ICR, LDR, TPR, Divide Configuration Register (See Section 10.4 through Section 10.6 for details of individual APIC registers). — Timer initial count and timer current count registers. • • • • • The LVT registers are reset to 0s except for the mask bits; these are set to 1s. The local APIC version register is not affected. The Spurious Interrupt Vector Register is initialized to 000000FFH. The DFR (available only in xAPIC mode) is reset to all 1s. SELF IPI register is reset to zero. Vol. 3A 10-41 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reset Init Disabled EN = 0 Extd = 0 EN =1 xAPIC Mode EN = 0 EN=1, Extd=0 Reset Extd = 1 Init EN = 0 Extd = 0 Illegal Transition Extd = 1 Illegal Transition Extd = 1 Illegal Transition Extd = 0 Invalid State EN = 0 Illegal Transition EN = 0 Extended Mode EN=1, Extd=1 Reset Init Figure 10-27. Local x2APIC State Transitions with IA32_APIC_BASE, INIT, and Reset x2APIC After Reset The valid transitions from the xAPIC mode state are: • to the x2APIC mode by setting EXT to 1 (resulting EN=1, EXTD= 1). The physical x2APIC ID (see Figure 10-6) is preserved across this transition and the logical x2APIC ID (see Figure 10-29) is initialized by hardware during this transition as documented in Section 10.12.10.2. The state of the extended fields in other APIC registers, which was not initialized at reset, is not architecturally defined across this transition and system software should explicitly initialize those programmable APIC registers. • to the disabled state by setting EN to 0 (resulting EN=0, EXTD= 0). The result of an INIT in the xAPIC state places the APIC in the state with EN= 1, EXTD= 0. The state of the local APIC ID register is preserved (the 8-bit xAPIC ID is in the upper 8 bits of the APIC ID register). All the other APIC registers are initialized as a result of INIT. A reset in this state places the APIC in the state with EN= 1, EXTD= 0. The state of the local APIC ID register is initialized as described in Section 10.12.5.1. All the other APIC registers are initialized described in Section 10.12.5.1. x2APIC Transitions From x2APIC Mode From the x2APIC mode, the only valid x2APIC transition using IA32_APIC_BASE is to the state where the x2APIC is disabled by setting EN to 0 and EXTD to 0. The x2APIC ID (32 bits) and the legacy local xAPIC ID (8 bits) are preserved across this transition. A transition from the x2APIC mode to xAPIC mode is not valid, and the corresponding WRMSR to the IA32_APIC_BASE MSR causes a general-protection exception. A reset in this state places the x2APIC in xAPIC mode. All APIC registers (including the local APIC ID register) are initialized as described in Section 10.12.5.1. An INIT in this state keeps the x2APIC in the x2APIC mode. The state of the local APIC ID register is preserved (all 32 bits). However, all the other APIC registers are initialized as a result of the INIT transition. 10-42 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC Transitions From Disabled Mode From the disabled state, the only valid x2APIC transition using IA32_APIC_BASE is to the xAPIC mode (EN= 1, EXTD = 0). Thus the only means to transition from x2APIC mode to xAPIC mode is a two-step process: • • first transition from x2APIC mode to local APIC disabled mode (EN= 0, EXTD = 0), followed by another transition from disabled mode to xAPIC mode (EN= 1, EXTD= 0). Consequently, all the APIC register states in the x2APIC, except for the x2APIC ID (32 bits), are not preserved across mode transitions. A reset in the disabled state places the x2APIC in the xAPIC mode. All APIC registers (including the local APIC ID register) are initialized as described in Section 10.12.5.1. An INIT in the disabled state keeps the x2APIC in the disabled state. State Changes From xAPIC Mode to x2APIC Mode After APIC register states have been initialized by software in xAPIC mode, a transition from xAPIC mode to x2APIC mode does not affect most of the APIC register states, except the following: • • • The Logical Destination Register is not preserved. Any APIC ID value written to the memory-mapped local APIC ID register is not preserved. The high half of the Interrupt Command Register is not preserved. 10.12.6 Routing of Device Interrupts in x2APIC Mode The x2APIC architecture is intended to work with all existing IOxAPIC units as well as all PCI and PCI Express (PCIe) devices that support the capability for message-signaled interrupts (MSI). Support for x2APIC modifies only the following: • • • the local APIC units; the interconnects joining IOxAPIC units to the local APIC units; and the interconnects joining MSI-capable PCI and PCIe devices to the local APIC units. No modifications are required to MSI-capable PCI and PCIe devices. Similarly, no modifications are required to IOxAPIC units. This made possible through use of the interrupt-remapping architecture specified in the Intel® Virtualization Technology for Directed I/O, Revision 1.3 for the routing of interrupts from MSI-capable devices to local APIC units operating in x2APIC mode. 10.12.7 Initialization by System Software Routing of device interrupts to local APIC units operating in x2APIC mode requires use of the interrupt-remapping architecture specified in the Intel® Virtualization Technology for Directed I/O (Revision 1.3 and/or later versions). Because of this, BIOS must enumerate support for and software must enable this interrupt remapping with Extended Interrupt Mode Enabled before it enabling x2APIC mode in the local APIC units. The ACPI interfaces for the x2APIC are described in Section 5.2, “ACPI System Description Tables,” of the Advanced Configuration and Power Interface Specification, Revision 4.0a (http://www.acpi.info/spec.htm). The default behavior for BIOS is to pass the control to the operating system with the local x2APICs in xAPIC mode if all APIC IDs reported by CPUID.0BH:EDX are less than 255, and in x2APIC mode if there are any logical processor reporting an APIC ID of 255 or greater. 10.12.8 CPUID Extensions And Topology Enumeration For Intel 64 and IA-32 processors that support x2APIC, a value of 1 reported by CPUID.01H:ECX[21] indicates that the processor supports x2APIC and the extended topology enumeration leaf (CPUID.0BH). Vol. 3A 10-43 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The extended topology enumeration leaf can be accessed by executing CPUID with EAX = 0BH. Processors that do not support x2APIC may support CPUID leaf 0BH. Software can detect the availability of the extended topology enumeration leaf (0BH) by performing two steps: • Check maximum input value for basic CPUID information by executing CPUID with EAX= 0. If CPUID.0H:EAX is greater than or equal or 11 (0BH), then proceed to next step • Check CPUID.EAX=0BH, ECX=0H:EBX is non-zero. If both of the above conditions are true, extended topology enumeration leaf is available. If available, the extended topology enumeration leaf is the preferred mechanism for enumerating topology. The presence of CPUID leaf 0BH in a processor does not guarantee support for x2APIC. If CPUID.EAX=0BH, ECX=0H:EBX returns zero and maximum input value for basic CPUID information is greater than 0BH, then CPUID.0BH leaf is not supported on that processor. The extended topology enumeration leaf is intended to assist software with enumerating processor topology on systems that requires 32-bit x2APIC IDs to address individual logical processors. Details of CPUID leaf 0BH can be found in the reference pages of CPUID in Chapter 3 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. Processor topology enumeration algorithm for processors supporting the extended topology enumeration leaf of CPUID and processors that do not support CPUID leaf 0BH are treated in Section 8.9.4, “Algorithm for Three-Level Mappings of APIC_ID”. 10.12.8.1 Consistency of APIC IDs and CPUID The consistency of physical x2APIC ID in MSR 802H in x2APIC mode and the 32-bit value returned in CPUID.0BH:EDX is facilitated by processor hardware. CPUID.0BH:EDX will report the full 32 bit ID, in xAPIC and x2APIC mode. This allows BIOS to determine if a system has processors with IDs exceeding the 8-bit initial APIC ID limit (CPUID.01H:EBX[31:24]). Initial APIC ID (CPUID.01H:EBX[31:24]) is always equal to CPUID.0BH:EDX[7:0]. If the values of CPUID.0BH:EDX reported by all logical processors in a system are less than 255, BIOS can transfer control to OS in xAPIC mode. If the values of CPUID.0BH:EDX reported by some logical processors in a system are greater than or equal to 255, BIOS must support two options to hand off to OS. • If BIOS enables logical processors with x2APIC IDs greater than 255, then it should enable x2APIC in the Boot Strap Processor (BSP) and all Application Processors (AP) before passing control to the OS. Applications requiring processor topology information must use OS provided services based on x2APIC IDs or CPUID.0BH leaf. • If a BIOS transfers control to OS in xAPIC mode, then the BIOS must ensure that only logical processors with CPUID.0BH.EDX value less than 255 are enabled. BIOS initialization on all logical processors with CPUID.0B.EDX values greater than or equal to 255 must (a) disable APIC and execute CLI in each logical processor, and (b) leave these logical processor in the lowest power state so that these processors do not respond to INIT IPI during OS boot. The BSP and all the enabled logical processor operate in xAPIC mode after BIOS passed control to OS. Application requiring processor topology information can use OS provided legacy services based on 8-bit initial APIC IDs or legacy topology information from CPUID.01H and CPUID 04H leaves. Even if the BIOS passes control in xAPIC mode, an OS can switch the processors to x2APIC mode later. BIOS SMM handler should always read the APIC_BASE_MSR, determine the APIC mode and use the corresponding access method. 10.12.9 ICR Operation in x2APIC Mode In x2APIC mode, the layout of the Interrupt Command Register is shown in Figure 10-12. The lower 32 bits of ICR in x2APIC mode is identical to the lower half of the ICR in xAPIC mode, except the Delivery Status bit is removed since it is not needed in x2APIC mode. The destination ID field is expanded to 32 bits in x2APIC mode. To send an IPI using the ICR, software must set up the ICR to indicate the type of IPI message to be sent and the destination processor or processors. Self IPIs can also be sent using the SELF IPI register (see Section 10.12.11). 10-44 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 63 32 Destination Field 31 20 19 18 17 16 15 14 13 12 11 10 Reserved Destination Shorthand 00: No Shorthand 01: Self 10: All Including Self 11: All Excluding Self Reserved 8 7 0 Vector Delivery Mode 000: Fixed 001: Reserved 010: SMI 011: Reserved 100: NMI 101: INIT 110: Start Up 111: Reserved Destination Mode 0: Physical 1: Logical Address: 830H (63 - 0) Value after Reset: 0H Level 0 = De-assert 1 = Assert Trigger Mode 0: Edge 1: Level Figure 10-28. Interrupt Command Register (ICR) in x2APIC Mode A single MSR write to the Interrupt Command Register is required for dispatching an interrupt in x2APIC mode. With the removal of the Delivery Status bit, system software no longer has a reason to read the ICR. It remains readable only to aid in debugging; however, software should not assume the value returned by reading the ICR is the last written value. A destination ID value of FFFF_FFFFH is used for broadcast of interrupts in both logical destination and physical destination modes. 10.12.10 Determining IPI Destination in x2APIC Mode 10.12.10.1 Logical Destination Mode in x2APIC Mode In x2APIC mode, the Logical Destination Register (LDR) is increased to 32 bits wide. It is a read-only register to system software. This 32-bit value is referred to as “logical x2APIC ID”. System software accesses this register via the RDMSR instruction reading the MSR at address 80DH. Figure 10-29 provides the layout of the Logical Destination Register in x2APIC mode. Vol. 3A 10-45 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) MSR Address: 80DH 31 0 Logical x2APIC ID Figure 10-29. Logical Destination Register in x2APIC Mode In the xAPIC mode, the Destination Format Register (DFR) through the MMIO interface determines the choice of a flat logical mode or a clustered logical mode. Flat logical mode is not supported in the x2APIC mode. Hence the Destination Format Register (DFR) is eliminated in x2APIC mode. The 32-bit logical x2APIC ID field of LDR is partitioned into two sub-fields: • • Cluster ID (LDR[31:16]): is the address of the destination cluster Logical ID (LDR[15:0]): defines a logical ID of the individual local x2APIC within the cluster specified by LDR[31:16]. This layout enables 2^16-1 clusters each with up to 16 unique logical IDs - effectively providing an addressability of ((2^20) - 16) processors in logical destination mode. It is likely that processor implementations may choose to support less than 16 bits of the cluster ID or less than 16bits of the Logical ID in the Logical Destination Register. However system software should be agnostic to the number of bits implemented in the cluster ID and logical ID sub-fields. The x2APIC hardware initialization will ensure that the appropriately initialized logical x2APIC IDs are available to system software and reads of nonimplemented bits return zero. This is a read-only register that software must read to determine the logical x2APIC ID of the processor. Specifically, software can apply a 16-bit mask to the lowest 16 bits of the logical x2APIC ID to identify the logical address of a processor within a cluster without needing to know the number of implemented bits in cluster ID and Logical ID sub-fields. Similarly, software can create a message destination address for cluster model, by bit-Oring the Logical X2APIC ID (31:0) of processors that have matching Cluster ID(31:16). To enable cluster ID assignment in a fashion that matches the system topology characteristics and to enable efficient routing of logical mode lowest priority device interrupts in link based platform interconnects, the LDR are initialized by hardware based on the value of x2APIC ID upon x2APIC state transitions. Details of this initialization are provided in Section 10.12.10.2. 10.12.10.2 Deriving Logical x2APIC ID from the Local x2APIC ID In x2APIC mode, the 32-bit logical x2APIC ID, which can be read from LDR, is derived from the 32-bit local x2APIC ID. Specifically, the 16-bit logical ID sub-field is derived by shifting 1 by the lowest 4 bits of the x2APIC ID, i.e. Logical ID = 1 « x2APIC ID[3:0]. The remaining bits of the x2APIC ID then form the cluster ID portion of the logical x2APIC ID: Logical x2APIC ID = [(x2APIC ID[19:4] « 16) | (1 « x2APIC ID[3:0])] The use of the lowest 4 bits in the x2APIC ID implies that at least 16 APIC IDs are reserved for logical processors within a socket in multi-socket configurations. If more than 16 APIC IDS are reserved for logical processors in a socket/package then multiple cluster IDs can exist within the package. The LDR initialization occurs whenever the x2APIC mode is enabled (see Section 10.12.5). 10-46 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.12.11 SELF IPI Register SELF IPIs are used extensively by some system software. The x2APIC architecture introduces a new register interface. This new register is dedicated to the purpose of sending self-IPIs with the intent of enabling a highly optimized path for sending self-IPIs. Figure 10-30 provides the layout of the SELF IPI register. System software only specifies the vector associated with the interrupt to be sent. The semantics of sending a self-IPI via the SELF IPI register are identical to sending a self targeted edge triggered fixed interrupt with the specified vector. Specifically the semantics are identical to the following settings for an inter-processor interrupt sent via the ICR - Destination Shorthand (ICR[19:18] = 01 (Self)), Trigger Mode (ICR[15] = 0 (Edge)), Delivery Mode (ICR[10:8] = 000 (Fixed)), Vector (ICR[7:0] = Vector). MSR Address: 083FH 31 8 7 0 Vector Reserved Figure 10-30. SELF IPI register The SELF IPI register is a write-only register. A RDMSR instruction with address of the SELF IPI register causes a general-protection exception. The handling and prioritization of a self-IPI sent via the SELF IPI register is architecturally identical to that for an IPI sent via the ICR from a legacy xAPIC unit. Specifically the state of the interrupt would be tracked via the Interrupt Request Register (IRR) and In Service Register (ISR) and Trigger Mode Register (TMR) as if it were received from the system bus. Also sending the IPI via the Self Interrupt Register ensures that interrupt is delivered to the processor core. Specifically completion of the WRMSR instruction to the SELF IPI register implies that the interrupt has been logged into the IRR. As expected for edge triggered interrupts, depending on the processor priority and readiness to accept interrupts, it is possible that interrupts sent via the SELF IPI register or via the ICR with identical vectors can be combined. 10.13 APIC BUS MESSAGE FORMATS This section describes the message formats used when transmitting messages on the serial APIC bus. The information described here pertains only to the Pentium and P6 family processors. 10.13.1 Bus Message Formats The local and I/O APICs transmit three types of messages on the serial APIC bus: EOI message, short message, and non-focused lowest priority message. The purpose of each type of message and its format are described below. 10.13.2 EOI Message Local APICs send 14-cycle EOI messages to the I/O APIC to indicate that a level triggered interrupt has been accepted by the processor. This interrupt, in turn, is a result of software writing into the EOI register of the local APIC. Table 10-1 shows the cycles in an EOI message. Table 10-1. EOI Message (14 Cycles) Cycle Bit1 Bit0 1 1 1 11 = EOI 2 ArbID3 0 Arbitration ID bits 3 through 0 Vol. 3A 10-47 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-1. EOI Message (14 Cycles) (Contd.) Cycle Bit1 Bit0 3 ArbID2 0 4 ArbID1 0 5 ArbID0 0 6 V7 V6 7 V5 V4 8 V3 V2 9 V1 V0 Interrupt vector V7 - V0 10 C C 11 0 0 Checksum for cycles 6 - 9 12 A A Status Cycle 0 13 A1 A1 Status Cycle 1 14 0 0 Idle The checksum is computed for cycles 6 through 9. It is a cumulative sum of the 2-bit (Bit1:Bit0) logical data values. The carry out of all but the last addition is added to the sum. If any APIC computes a different checksum than the one appearing on the bus in cycle 10, it signals an error, driving 11 on the APIC bus during cycle 12. In this case, the APICs disregard the message. The sending APIC will receive an appropriate error indication (see Section 10.5.3, “Error Handling”) and resend the message. The status cycles are defined in Table 10-4. 10.13.2.1 Short Message Short messages (21-cycles) are used for sending fixed, NMI, SMI, INIT, start-up, ExtINT and lowest-priority-withfocus interrupts. Table 10-2 shows the cycles in a short message. Table 10-2. Short Message (21 Cycles) Cycle Bit1 Bit0 1 0 1 0 1 = normal 2 ArbID3 0 Arbitration ID bits 3 through 0 3 ArbID2 0 4 ArbID1 0 5 ArbID0 0 6 DM M2 DM = Destination Mode 7 M1 M0 M2-M0 = Delivery mode 8 L TM L = Level, TM = Trigger Mode 9 V7 V6 V7-V0 = Interrupt Vector 10 V5 V4 11 V3 V2 12 V1 V0 13 D7 D6 14 D5 D4 15 D3 D2 16 D1 D0 10-48 Vol. 3A D7-D0 = Destination ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-2. Short Message (21 Cycles) (Contd.) Cycle Bit1 Bit0 17 C C Checksum for cycles 6-16 18 0 0 19 A A Status cycle 0 20 A1 A1 Status cycle 1 21 0 0 Idle If the physical delivery mode is being used, then cycles 15 and 16 represent the APIC ID and cycles 13 and 14 are considered don't care by the receiver. If the logical delivery mode is being used, then cycles 13 through 16 are the 8-bit logical destination field. For shorthands of “all-incl-self” and “all-excl-self,” the physical delivery mode and an arbitration priority of 15 (D0:D3 = 1111) are used. The agent sending the message is the only one required to distinguish between the two cases. It does so using internal information. When using lowest priority delivery with an existing focus processor, the focus processor identifies itself by driving 10 during cycle 19 and accepts the interrupt. This is an indication to other APICs to terminate arbitration. If the focus processor has not been found, the short message is extended on-the-fly to the non-focused lowest-priority message. Note that except for the EOI message, messages generating a checksum or an acceptance error (see Section 10.5.3, “Error Handling”) terminate after cycle 21. 10.13.2.2 Non-focused Lowest Priority Message These 34-cycle messages (see Table 10-3) are used in the lowest priority delivery mode when a focus processor is not present. Cycles 1 through 20 are same as for the short message. If during the status cycle (cycle 19) the state of the (A:A) flags is 10B, a focus processor has been identified, and the short message format is used (see Table 10-2). If the (A:A) flags are set to 00B, lowest priority arbitration is started and the 34-cycles of the non-focused lowest priority message are competed. For other combinations of status flags, refer to Section 10.13.2.3, “APIC Bus Status Cycles.” Table 10-3. Non-Focused Lowest Priority Message (34 Cycles) Cycle Bit0 Bit1 1 0 1 0 1 = normal 2 ArbID3 0 Arbitration ID bits 3 through 0 3 ArbID2 0 4 ArbID1 0 5 ArbID0 0 6 DM M2 DM = Destination mode 7 M1 M0 M2-M0 = Delivery mode 8 L TM L = Level, TM = Trigger Mode 9 V7 V6 V7-V0 = Interrupt Vector 10 V5 V4 11 V3 V2 12 V1 V0 13 D7 D6 14 D5 D4 15 D3 D2 16 D1 D0 D7-D0 = Destination Vol. 3A 10-49 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-3. Non-Focused Lowest Priority Message (34 Cycles) (Contd.) Cycle Bit0 Bit1 17 C C Checksum for cycles 6-16 18 0 0 19 A A Status cycle 0 20 A1 A1 Status cycle 1 21 P7 0 22 P6 0 23 P5 0 24 P4 0 25 P3 0 26 P2 0 27 P1 0 28 P0 0 29 ArbID3 0 30 ArbID2 0 31 ArbID1 0 32 ArbID0 0 33 A2 A2 34 0 0 P7 - P0 = Inverted Processor Priority Arbitration ID 3 -0 Status Cycle Idle Cycles 21 through 28 are used to arbitrate for the lowest priority processor. The processors participating in the arbitration drive their inverted processor priority on the bus. Only the local APICs having free interrupt slots participate in the lowest priority arbitration. If no such APIC exists, the message will be rejected, requiring it to be tried at a later time. Cycles 29 through 32 are also used for arbitration in case two or more processors have the same lowest priority. In the lowest priority delivery mode, all combinations of errors in cycle 33 (A2 A2) will set the “accept error” bit in the error status register (see Figure 10-9). Arbitration priority update is performed in cycle 20, and is not affected by errors detected in cycle 33. Only the local APIC that wins in the lowest priority arbitration, drives cycle 33. An error in cycle 33 will force the sender to resend the message. 10.13.2.3 APIC Bus Status Cycles Certain cycles within an APIC bus message are status cycles. During these cycles the status flags (A:A) and (A1:A1) are examined. Table 10-4 shows how these status flags are interpreted, depending on the current delivery mode and existence of a focus processor. 10-50 Vol. 3A ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-4. APIC Bus Status Cycles Interpretation Delivery Mode A Status A1 Status A2 Status Update ArbID Message and Cycle# Length EOI 00: CS_OK 10: Accept XX: Yes, 13 14 Cycle No 00: CS_OK 11: Retry XX: Yes, 13 14 Cycle Yes 00: CS_OK 0X: Accept Error XX: No 14 Cycle Yes 11: CS_Error XX: XX: No 14 Cycle Yes 10: Error XX: XX: No 14 Cycle Yes 01: Error XX: XX: No 14 Cycle Yes 00: CS_OK 10: Accept XX: Yes, 20 21 Cycle No 00: CS_OK 11: Retry XX: Yes, 20 21 Cycle Yes 00: CS_OK 0X: Accept Error XX: No 21 Cycle Yes 11: CS_Error XX: XX: No 21 Cycle Yes 10: Error XX: XX: No 21 Cycle Yes Fixed NMI, SMI, INIT, ExtINT, Start-Up Lowest Retry 01: Error XX: XX: No 21 Cycle Yes 00: CS_OK 10: Accept XX: Yes, 20 21 Cycle No 00: CS_OK 11: Retry XX: Yes, 20 21 Cycle Yes 00: CS_OK 0X: Accept Error XX: No 21 Cycle Yes 11: CS_Error XX: XX: No 21 Cycle Yes 10: Error XX: XX: No 21 Cycle Yes 01: Error XX: XX: No 21 Cycle Yes 00: CS_OK, NoFocus 11: Do Lowest 10: Accept Yes, 20 34 Cycle No 00: CS_OK, NoFocus 11: Do Lowest 11: Error Yes, 20 34 Cycle Yes 00: CS_OK, NoFocus 11: Do Lowest 0X: Error Yes, 20 34 Cycle Yes 00: CS_OK, NoFocus 10: End and Retry XX: Yes, 20 34 Cycle Yes 00: CS_OK, NoFocus 0X: Error XX: No 34 Cycle Yes 10: CS_OK, Focus XX: XX: Yes, 20 34 Cycle No 11: CS_Error XX: XX: No 21 Cycle Yes 01: Error XX: XX: No 21 Cycle Yes Vol. 3A 10-51 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10-52 Vol. 3A 12.Updates to Chapter 15, Volume 3B Change bars and green text show changes to Chapter 15 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------Changes to this chapter: Typo corrections, changing “UNCA” to “UCNA”. Updates to document per-model channel decode based on bank number. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 23 CHAPTER 15 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, Intel Atom, and P6 family processors. See Chapter 6, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-check exceptions. A brief description of the Pentium processor’s machine check capability is also given. Additionally, a signaling mechanism for software to respond to hardware corrected machine check error is covered. 15.1 MACHINE-CHECK ARCHITECTURE The Pentium 4, Intel Xeon, Intel Atom, and P6 family processors implement a machine-check architecture that provides a mechanism for detecting and reporting hardware (machine) errors, such as: system bus errors, ECC errors, parity errors, cache errors, and TLB errors. It consists of a set of model-specific registers (MSRs) that are used to set up machine checking and additional banks of MSRs used for recording errors that are detected. The processor signals the detection of an uncorrected machine-check error by generating a machine-check exception (#MC), which is an abort class exception. The implementation of the machine-check architecture does not ordinarily permit the processor to be restarted reliably after generating a machine-check exception. However, the machine-check-exception handler can collect information about the machine-check error from the machine-check MSRs. Starting with 45 nm Intel 64 processor on which CPUID reports DisplayFamily_DisplayModel as 06H_1AH (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A), the processor can report information on corrected machine-check errors and deliver a programmable interrupt for software to respond to MC errors, referred to as corrected machine-check error interrupt (CMCI). See Section 15.5 for detail. Intel 64 processors supporting machine-check architecture and CMCI may also support an additional enhancement, namely, support for software recovery from certain uncorrected recoverable machine check errors. See Section 15.6 for detail. 15.2 COMPATIBILITY WITH PENTIUM PROCESSOR The Pentium 4, Intel Xeon, Intel Atom, and P6 family processors support and extend the machine-check exception mechanism introduced in the Pentium processor. The Pentium processor reports the following machine-check errors: • • data parity errors during read cycles unsuccessful completion of a bus cycle The above errors are reported using the P5_MC_TYPE and P5_MC_ADDR MSRs (implementation specific for the Pentium processor). Use the RDMSR instruction to read these MSRs. See Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4 for the addresses. The machine-check error reporting mechanism that Pentium processors use is similar to that used in Pentium 4, Intel Xeon, Intel Atom, and P6 family processors. When an error is detected, it is recorded in P5_MC_TYPE and P5_MC_ADDR; the processor then generates a machine-check exception (#MC). See Section 15.3.3, “Mapping of the Pentium Processor Machine-Check Errors to the Machine-Check Architecture,” and Section 15.10.2, “Pentium Processor Machine-Check Exception Handling,” for information on compatibility between machine-check code written to run on the Pentium processors and code written to run on P6 family processors. Vol. 3B 15-1 MACHINE-CHECK ARCHITECTURE 15.3 MACHINE-CHECK MSRS Machine check MSRs in the Pentium 4, Intel Atom, Intel Xeon, and P6 family processors consist of a set of global control and status registers and several error-reporting register banks. See Figure 15-1. Error-Reporting Bank Registers (One Set for Each Hardware Unit) Global Control MSRs 63 0 63 IA32_MCG_CAP MSR 63 0 63 IA32_MCG_STATUS MSR 63 IA32_MCG_CTL MSR 0 IA32_MCi_ADDR MSR 0 63 0 IA32_MCi_STATUS MSR 0 63 0 IA32_MCi_CTL MSR 63 IA32_MCG_EXT_CTL MSR 0 IA32_MCi_MISC MSR 0 63 IA32_MCi_CTL2 MSR Figure 15-1. Machine-Check MSRs Each error-reporting bank is associated with a specific hardware unit (or group of hardware units) in the processor. Use RDMSR and WRMSR to read and to write these registers. 15.3.1 Machine-Check Global Control MSRs The machine-check global control MSRs include the IA32_MCG_CAP, IA32_MCG_STATUS, and optionally IA32_MCG_CTL and IA32_MCG_EXT_CTL. See Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4 for the addresses of these registers. 15.3.1.1 IA32_MCG_CAP MSR The IA32_MCG_CAP MSR is a read-only register that provides information about the machine-check architecture of the processor. Figure 15-2 shows the layout of the register. 15-2 Vol. 3B MACHINE-CHECK ARCHITECTURE 63 27 26 25 24 23 16 15 0 12 11 10 9 8 7 Count Reserved MCG_LMCE_P[27] MCG_ELOG_P[26] MCG_EMC_P[25] MCG_SER_P[24] MCG_EXT_CNT[23:16] MCG_TES_P[11] MCG_CMCI_P[10] MCG_EXT_P[9] MCG_CTL_P[8] Figure 15-2. IA32_MCG_CAP Register Where: • Count field, bits 7:0 — Indicates the number of hardware unit error-reporting banks available in a particular processor implementation. • MCG_CTL_P (control MSR present) flag, bit 8 — Indicates that the processor implements the IA32_MCG_CTL MSR when set; this register is absent when clear. • MCG_EXT_P (extended MSRs present) flag, bit 9 — Indicates that the processor implements the extended machine-check state registers found starting at MSR address 180H; these registers are absent when clear. • MCG_CMCI_P (Corrected MC error counting/signaling extension present) flag, bit 10 — Indicates (when set) that extended state and associated MSRs necessary to support the reporting of an interrupt on a corrected MC error event and/or count threshold of corrected MC errors, is present. When this bit is set, it does not imply this feature is supported across all banks. Software should check the availability of the necessary logic on a bank by bank basis when using this signaling capability (i.e. bit 30 settable in individual IA32_MCi_CTL2 register). • MCG_TES_P (threshold-based error status present) flag, bit 11 — Indicates (when set) that bits 56:53 of the IA32_MCi_STATUS MSR are part of the architectural space. Bits 56:55 are reserved, and bits 54:53 are used to report threshold-based error status. Note that when MCG_TES_P is not set, bits 56:53 of the IA32_MCi_STATUS MSR are model-specific. • MCG_EXT_CNT, bits 23:16 — Indicates the number of extended machine-check state registers present. This field is meaningful only when the MCG_EXT_P flag is set. • MCG_SER_P (software error recovery support present) flag, bit 24 — Indicates (when set) that the processor supports software error recovery (see Section 15.6), and IA32_MCi_STATUS MSR bits 56:55 are used to report the signaling of uncorrected recoverable errors and whether software must take recovery actions for uncorrected errors. Note that when MCG_TES_P is not set, bits 56:53 of the IA32_MCi_STATUS MSR are model-specific. If MCG_TES_P is set but MCG_SER_P is not set, bits 56:55 are reserved. • MCG_EMC_P (Enhanced Machine Check Capability) flag, bit 25 — Indicates (when set) that the processor supports enhanced machine check capabilities for firmware first signaling. • MCG_ELOG_P (extended error logging) flag, bit 26 — Indicates (when set) that the processor allows platform firmware to be invoked when an error is detected so that it may provide additional platform specific information in an ACPI format “Generic Error Data Entry” that augments the data included in machine check bank registers. For additional information about extended error logging interface, see https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/enhanced-mca-loggingxeon-paper.pdf. Vol. 3B 15-3 MACHINE-CHECK ARCHITECTURE • MCG_LMCE_P (local machine check exception) flag, bit 27 — Indicates (when set) that the following interfaces are present: — an extended state LMCE_S (located in bit 3 of IA32_MCG_STATUS), and — the IA32_MCG_EXT_CTL MSR, necessary to support Local Machine Check Exception (LMCE). A non-zero MCG_LMCE_P indicates that, when LMCE is enabled as described in Section 15.3.1.5, some machine check errors may be delivered to only a single logical processor. The effect of writing to the IA32_MCG_CAP MSR is undefined. 15.3.1.2 IA32_MCG_STATUS MSR The IA32_MCG_STATUS MSR describes the current state of the processor after a machine-check exception has occurred (see Figure 15-3). 63 3 2 1 0 Reserved M C I P E R I I P P V V LMCE_S—Local machine check exception signaled MCIP—Machine check in progress flag EIPV—Error IP valid flag RIPV—Restart IP valid flag Figure 15-3. IA32_MCG_STATUS Register Where: • RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program execution can be restarted reliably at the instruction pointed to by the instruction pointer pushed on the stack when the machine-check exception is generated. When clear, the program cannot be reliably restarted at the pushed instruction pointer. • EIPV (error IP valid) flag, bit 1 — Indicates (when set) that the instruction pointed to by the instruction pointer pushed onto the stack when the machine-check exception is generated is directly associated with the error. When this flag is cleared, the instruction pointed to may not be associated with the error. • MCIP (machine check in progress) flag, bit 2 — Indicates (when set) that a machine-check exception was generated. Software can set or clear this flag. The occurrence of a second Machine-Check Event while MCIP is set will cause the processor to enter a shutdown state. For information on processor behavior in the shutdown state, please refer to the description in Chapter 6, “Interrupt and Exception Handling”: “Interrupt 8—Double Fault Exception (#DF)”. • LMCE_S (local machine check exception signaled), bit 3 — Indicates (when set) that a local machinecheck exception was generated. This indicates that the current machine-check event was delivered to only this logical processor. Bits 63:04 in IA32_MCG_STATUS are reserved. An attempt to write to IA32_MCG_STATUS with any value other than 0 would result in #GP. 15.3.1.3 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR. IA32_MCG_CTL controls the reporting of machine-check exceptions. If present, writing 1s to this register enables machine-check features and writing all 0s disables machine-check features. All other values are undefined and/or implementation specific. 15-4 Vol. 3B MACHINE-CHECK ARCHITECTURE 15.3.1.4 IA32_MCG_EXT_CTL MSR The IA32_MCG_EXT_CTL MSR is present if the capability flag MCG_LMCE_P is set in the IA32_MCG_CAP MSR. IA32_MCG_EXT_CTL.LMCE_EN (bit 0) allows the processor to signal some MCEs to only a single logical processor in the system. If MCG_LMCE_P is not set in IA32_MCG_CAP, or platform software has not enabled LMCE by setting IA32_FEATURE_CONTROL.LMCE_ON (bit 20), any attempt to write or read IA32_MCG_EXT_CTL will result in #GP. The IA32_MCG_EXT_CTL MSR is cleared on RESET. Figure 15-4 shows the layout of the IA32_MCG_EXT_CTL register 63 1 0 Reserved LMCE_EN - system software control to enable/disable LMCE Figure 15-4. IA32_MCG_EXT_CTL Register where • LMCE_EN (local machine check exception enable) flag, bit 0 - System software sets this to allow hardware to signal some MCEs to only a single logical processor. System software can set LMCE_EN only if the platform software has configured IA32_FEATURE_CONTROL as described in Section 15.3.1.5. 15.3.1.5 Enabling Local Machine Check The intended usage of LMCE requires proper configuration by both platform software and system software. Platform software can turn LMCE on by setting bit 20 (LMCE_ON) in IA32_FEATURE_CONTROL MSR (MSR address 3AH). System software must ensure that both IA32_FEATURE_CONTROL.Lock (bit 0)and IA32_FEATURE_CONTROL.LMCE_ON (bit 20) are set before attempting to set IA32_MCG_EXT_CTL.LMCE_EN (bit 0). When system software has enabled LMCE, then hardware will determine if a particular error can be delivered only to a single logical processor. Software should make no assumptions about the type of error that hardware can choose to deliver as LMCE. The severity and override rules stay the same as described in Table 15-8 to determine the recovery actions. 15.3.2 Error-Reporting Register Banks Each error-reporting register bank can contain the IA32_MCi_CTL, IA32_MCi_STATUS, IA32_MCi_ADDR, and IA32_MCi_MISC MSRs. The number of reporting banks is indicated by bits [7:0] of IA32_MCG_CAP MSR (address 0179H). The first error-reporting register (IA32_MC0_CTL) always starts at address 400H. See Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4 for addresses of the error-reporting registers in the Pentium 4, Intel Atom, and Intel Xeon processors; and for addresses of the error-reporting registers P6 family processors. 15.3.2.1 IA32_MCi_CTL MSRs The IA32_MCi_CTL MSR controls signaling of #MC for errors produced by a particular hardware unit (or group of hardware units). Each of the 64 flags (EEj) represents a potential error. Setting an EEj flag enables signaling #MC of the associated error and clearing it disables signaling of the error. Error logging happens regardless of the setting of these bits. The processor drops writes to bits that are not implemented. Figure 15-5 shows the bit fields of IA32_MCi_CTL. Vol. 3B 15-5 MACHINE-CHECK ARCHITECTURE 63 62 61 E E 6 3 E E 6 2 ..... E E 6 1 3 2 1 0 E E 0 2 E E E E 0 0 1 0 EEj—Error reporting enable flag (where j is 00 through 63) Figure 15-5. IA32_MCi_CTL Register NOTE For P6 family processors, processors based on Intel Core microarchitecture (excluding those on which on which CPUID reports DisplayFamily_DisplayModel as 06H_1AH and onward): the operating system or executive software must not modify the contents of the IA32_MC0_CTL MSR. This MSR is internally aliased to the EBL_CR_POWERON MSR and controls platform-specific error handling features. System specific firmware (the BIOS) is responsible for the appropriate initialization of the IA32_MC0_CTL MSR. P6 family processors only allow the writing of all 1s or all 0s to the IA32_MCi_CTL MSR. 15.3.2.2 IA32_MCi_STATUS MSRS Each IA32_MCi_STATUS MSR contains information related to a machine-check error if its VAL (valid) flag is set (see Figure 15-6). Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to them; writing 1s to them causes a general-protection exception. NOTE Figure 15-6 depicts the IA32_MCi_STATUS MSR when IA32_MCG_CAP[24] = 1, IA32_MCG_CAP[11] = 1 and IA32_MCG_CAP[10] = 1. When IA32_MCG_CAP[24] = 0 and IA32_MCG_CAP[11] = 1, bits 56:55 is reserved and bits 54:53 for threshold-based error reporting. When IA32_MCG_CAP[11] = 0, bits 56:53 are part of the “Other Information” field. The use of bits 54:53 for threshold-based error reporting began with Intel Core Duo processors, and is currently used for cache memory. See Section 15.4, “Enhanced Cache Error reporting,” for more information. When IA32_MCG_CAP[10] = 0, bits 52:38 are part of the “Other Information” field. The use of bits 52:38 for corrected MC error count is introduced with Intel 64 processor on which CPUID reports DisplayFamily_DisplayModel as 06H_1AH. Where: • MCA (machine-check architecture) error code field, bits 15:0 — Specifies the machine-check architecture-defined error code for the machine-check error condition detected. The machine-check architecturedefined error codes are guaranteed to be the same for all IA-32 processors that implement the machine-check architecture. See Section 15.9, “Interpreting the MCA Error Codes,” and Chapter 16, “Interpreting MachineCheck Error Codes”, for information on machine-check error codes. • Model-specific error code field, bits 31:16 — Specifies the model-specific error code that uniquely identifies the machine-check error condition detected. The model-specific error codes may differ among IA-32 processors for the same machine-check error condition. See Chapter 16, “Interpreting Machine-Check Error Codes”for information on model-specific error codes. • Reserved, Error Status, and Other Information fields, bits 56:32 — • If IA32_MCG_CAP.MCG_EMC_P[bit 25] is 0, bits 37:32 contain “Other Information” that is implementation-specific and is not part of the machine-check architecture. • If IA32_MCG_CAP.MCG_EMC_P is 1, “Other Information” is in bits 36:32. If bit 37 is 0, system firmware has not changed the contents of IA32_MCi_STATUS. If bit 37 is 1, system firmware may have edited the contents of IA32_MCi_STATUS. • If IA32_MCG_CAP.MCG_CMCI_P[bit 10] is 0, bits 52:38 also contain “Other Information” (in the same sense as bits 37:32). 15-6 Vol. 3B MACHINE-CHECK ARCHITECTURE 63 62 61 60 59 58 57 56 55 54 53 52 V O U E A V C N L E R P S A C R C Corrected Error Count 38 37 36 32 31 Other Info 16 15 MSCOD Model Specific Error Code 0 MCA Error Code Firmware updated error status indicator (37)* Threshold-based error status (54:53)** AR — Recovery action required for UCR error (55)*** S — Signaling an uncorrected recoverable (UCR) error (56)*** PCC — Processor context corrupted (57) ADDRV — MCi_ADDR register valid (58) MISCV — MCi_MISC register valid (59) EN — Error reporting enabled (60) UC — Uncorrected error (61) OVER — Error overflow (62) VAL — MCi_STATUS register valid (63) * When IA32_MCG_CAP[25] (MCG_EMC_P) is set, bit 37 is not part of “Other Information”. ** When IA32_MCG_CAP[11] (MCG_TES_P) is not set, these bits are model-specific (part of “Other Information”). *** When IA32_MCG_CAP[11] or IA32_MCG_CAP[24] are not set, these bits are reserved, or model-specific (part of “Other Information”). Figure 15-6. IA32_MCi_STATUS Register • If IA32_MCG_CAP[10] is 1, bits 52:38 are architectural (not model-specific). In this case, bits 52:38 reports the value of a 15 bit counter that increments each time a corrected error is observed by the MCA recording bank. This count value will continue to increment until cleared by software. The most significant bit, 52, is a sticky count overflow bit. • • If IA32_MCG_CAP[11] is 0, bits 56:53 also contain “Other Information” (in the same sense). If IA32_MCG_CAP[11] is 1, bits 56:53 are architectural (not model-specific). In this case, bits 56:53 have the following functionality: • • • If IA32_MCG_CAP[24] is 0, bits 56:55 are reserved. If IA32_MCG_CAP[24] is 1, bits 56:55 are defined as follows: S (Signaling) flag, bit 56 - Signals the reporting of UCR errors in this MC bank. See Section 15.6.2 for additional detail. • AR (Action Required) flag, bit 55 - Indicates (when set) that MCA error code specific recovery action must be performed by system software at the time this error was signaled. See Section 15.6.2 for additional detail. • • If the UC bit (Figure 15-6) is 1, bits 54:53 are undefined. If the UC bit (Figure 15-6) is 0, bits 54:53 indicate the status of the hardware structure that reported the threshold-based error. See Table 15-1. Table 15-1. Bits 54:53 in IA32_MCi_STATUS MSRs when IA32_MCG_CAP[11] = 1 and UC = 0 Bits 54:53 Meaning 00 No tracking - No hardware status tracking is provided for the structure reporting this event. 01 Green - Status tracking is provided for the structure posting the event; the current status is green (below threshold). For more information, see Section 15.4, “Enhanced Cache Error reporting”. 10 Yellow - Status tracking is provided for the structure posting the event; the current status is yellow (above threshold). For more information, see Section 15.4, “Enhanced Cache Error reporting”. 11 Reserved Vol. 3B 15-7 MACHINE-CHECK ARCHITECTURE • PCC (processor context corrupt) flag, bit 57 — Indicates (when set) that the state of the processor might have been corrupted by the error condition detected and that reliable restarting of the processor may not be possible. When clear, this flag indicates that the error did not affect the processor’s state, and software may be able to restart. When system software supports recovery, consult Section 15.10.4, “Machine-Check Software Handler Guidelines for Error Recovery” for additional rules that apply. • ADDRV (IA32_MCi_ADDR register valid) flag, bit 58 — Indicates (when set) that the IA32_MCi_ADDR register contains the address where the error occurred (see Section 15.3.2.3, “IA32_MCi_ADDR MSRs”). When clear, this flag indicates that the IA32_MCi_ADDR register is either not implemented or does not contain the address where the error occurred. Do not read these registers if they are not implemented in the processor. • MISCV (IA32_MCi_MISC register valid) flag, bit 59 — Indicates (when set) that the IA32_MCi_MISC register contains additional information regarding the error. When clear, this flag indicates that the IA32_MCi_MISC register is either not implemented or does not contain additional information regarding the error. Do not read these registers if they are not implemented in the processor. • EN (error enabled) flag, bit 60 — Indicates (when set) that the error was enabled by the associated EEj bit of the IA32_MCi_CTL register. • UC (error uncorrected) flag, bit 61 — Indicates (when set) that the processor did not or was not able to correct the error condition. When clear, this flag indicates that the processor was able to correct the error condition. • OVER (machine check overflow) flag, bit 62 — Indicates (when set) that a machine-check error occurred while the results of a previous error were still in the error-reporting register bank (that is, the VAL bit was already set in the IA32_MCi_STATUS register). The processor sets the OVER flag and software is responsible for clearing it. In general, enabled errors are written over disabled errors, and uncorrected errors are written over corrected errors. Uncorrected errors are not written over previous valid uncorrected errors. When MCG_CMCI_P is set, corrected errors may not set the OVER flag. Software can rely on corrected error count in IA32_MCi_Status[52:38] to determine if any additional corrected errors may have occurred. For more information, see Section 15.3.2.2.1, “Overwrite Rules for Machine Check Overflow”. • VAL (IA32_MCi_STATUS register valid) flag, bit 63 — Indicates (when set) that the information within the IA32_MCi_STATUS register is valid. When this flag is set, the processor follows the rules given for the OVER flag in the IA32_MCi_STATUS register when overwriting previously valid entries. The processor sets the VAL flag and software is responsible for clearing it. 15.3.2.2.1 Overwrite Rules for Machine Check Overflow Table 15-2 shows the overwrite rules for how to treat a second event if the cache has already posted an event to the MC bank – that is, what to do if the valid bit for an MC bank already is set to 1. When more than one structure posts events in a given bank, these rules specify whether a new event will overwrite a previous posting or not. These rules define a priority for uncorrected (highest priority), yellow, and green/unmonitored (lowest priority) status. In Table 15-2, the values in the two left-most columns are IA32_MCi_STATUS[54:53]. Table 15-2. Overwrite Rules for Enabled Errors First Event Second Event UC bit Color MCA Info 00/green 00/green 0 00/green either 00/green yellow 0 yellow second error yellow 00/green 0 yellow first error yellow yellow 0 yellow either 00/green/yellow UC 1 undefined second UC 00/green/yellow 1 undefined first If a second event overwrites a previously posted event, the information (as guarded by individual valid bits) in the MCi bank is entirely from the second event. Similarly, if a first event is retained, all of the information previously posted for that event is retained. In general, when the logged error or the recent error is a corrected error, the OVER bit (MCi_Status[62]) may be set to indicate an overflow. When MCG_CMCI_P is set in IA32_MCG_CAP, system software should consult IA32_MCi_STATUS[52:38] to determine if additional corrected errors may have 15-8 Vol. 3B MACHINE-CHECK ARCHITECTURE occurred. Software may re-read IA32_MCi_STATUS, IA32_MCi_ADDR and IA32_MCi_MISC appropriately to ensure data collected represent the last error logged. After software polls a posting and clears the register, the valid bit is no longer set and therefore the meaning of the rest of the bits, including the yellow/green/00 status field in bits 54:53, is undefined. The yellow/green indication will only be posted for events associated with monitored structures – otherwise the unmonitored (00) code will be posted in IA32_MCi_STATUS[54:53]. 15.3.2.3 IA32_MCi_ADDR MSRs The IA32_MCi_ADDR MSR contains the address of the code or data memory location that produced the machinecheck error if the ADDRV flag in the IA32_MCi_STATUS register is set (see Section 15-7, “IA32_MCi_ADDR MSR”). The IA32_MCi_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MCi_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general protection exception. The address returned is an offset into a segment, linear address, or physical address. This depends on the error encountered. When these registers are implemented, these registers can be cleared by explicitly writing 0s to these registers. Writing 1s to these registers will cause a general-protection exception. See Figure 15-7. Processor Without Support For Intel 64 Architecture 63 0 36 35 Address Reserved Processor With Support for Intel 64 Architecture 63 0 Address* * Useful bits in this field depend on the address methodology in use when the the register state is saved. Figure 15-7. IA32_MCi_ADDR MSR 15.3.2.4 IA32_MCi_MISC MSRs The IA32_MCi_MISC MSR contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. The IA32_MCi_MISC_MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MCi_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general protection exception. When implemented in a processor, these registers can be cleared by explicitly writing all 0s to them; writing 1s to them causes a general-protection exception to be generated. This register is not implemented in any of the errorreporting register banks for the P6 or Intel Atom family processors. If both MISCV and IA32_MCG_CAP[24] are set, the IA32_MCi_MISC_MSR is defined according to Figure 15-8 to support software recovery of uncorrected errors (see Section 15.6). Vol. 3B 15-9 MACHINE-CHECK ARCHITECTURE 9 8 63 6 5 0 Model Specific Information Address Mode Recoverable Address LSB Figure 15-8. UCR Support in IA32_MCi_MISC Register • Recoverable Address LSB (bits 5:0): The lowest valid recoverable address bit. Indicates the position of the least significant bit (LSB) of the recoverable error address. For example, if the processor logs bits [43:9] of the address, the LSB sub-field in IA32_MCi_MISC is 01001b (9 decimal). For this example, bits [8:0] of the recoverable error address in IA32_MCi_ADDR should be ignored. • Address Mode (bits 8:6): Address mode for the address logged in IA32_MCi_ADDR. The supported address modes are given in Table 15-3. Table 15-3. Address Mode in IA32_MCi_MISC[8:6] IA32_MCi_MISC[8:6] Encoding 000 Segment Offset 001 Linear Address 010 Physical Address 011 Memory Address 100 to 110 111 • Definition Reserved Generic Model Specific Information (bits 63:9): Not architecturally defined. 15.3.2.4.2 IOMCA Logging and Signaling of errors from PCI Express domain is governed by PCI Express Advanced Error Reporting (AER) architecture. PCI Express architecture divides errors in two categories: Uncorrectable errors and Correctable errors. Uncorrectable errors can further be classified as Fatal or Non-Fatal. Uncorrected IO errors are signaled to the system software either as AER Message Signaled Interrupt (MSI) or via platform specific mechanisms such as NMI. Generally, the signaling mechanism is controlled by BIOS and/or platform firmware. Certain processors support an error handling mode, called IOMCA mode, where Uncorrected PCI Express errors are signaled in the form of machine check exception and logged in machine check banks. When a processor is in this mode, Uncorrected PCI Express errors are logged in the MCACOD field of the IA32_MCi_STATUS register as Generic I/O error. The corresponding MCA error code is defined in Table 15-8. IA32_MCi_Status [15:0] Simple Error Code Encoding. Machine check logging complements and does not replace AER logging that occurs inside the PCI Express hierarchy. The PCI Express Root Complex and Endpoints continue to log the error in accordance with PCI Express AER mechanism. In IOMCA mode, MCi_MISC register in the bank that logged IOMCA can optionally contain information that link the Machine Check logs with the AER logs or proprietary logs. In such a scenario, the machine check handler can utilize the contents of MCi_MISC to locate the next level of error logs corresponding to the same error. Specifically, if MCi_Status.MISCV is 1 and MCACOD is 0x0E0B, MCi_MISC contains the PCI Express address of the Root Complex device containing the AER Logs. Software can consult the header type and class code registers in the Root Complex device's PCIe Configuration space to determine what type of device it is. This Root Complex device can either be a PCI Express Root Port, PCI Express Root Complex Event Collector or a proprietary device. 15-10 Vol. 3B MACHINE-CHECK ARCHITECTURE Errors that originate from PCI Express or Legacy Endpoints are logged in the corresponding Root Port in addition to the generating device. If MISCV=1 and MCi_MISC contains the address of the Root Port or a Root Complex Event collector, software can parse the AER logs to learn more about the error. If MISCV=1 and MCi_MISC points to a device that is neither a Root Complex Event Collector not a Root Port, software must consult the Vendor ID/Device ID and use device specific knowledge to locate and interpret the error log registers. In some cases, the Root Complex device configuration space may not be accessible to the software and both the Vendor and Device ID read as 0xFFFF. • The format of MCi_MISC for IOMCA errors is shown in Table 15-4. Table 15-4. Address Mode in IA32_MCi_MISC[8:6] 63:40 RSVD 39:32 PCI Express Segment number 31:16 15:9 PCI Express Requestor ID 8:6 RSVD ADDR MODE 5:0 1 RECOV ADDR LSB1 NOTES: 1. Not Applicable if ADDRV=0. Refer to PCI Express Specification 3.0 for definition of PCI Express Requestor ID and AER architecture. Refer to PCI Firmware Specification 3.0 for an explanation of PCI Ex-press Segment number and how software can access configuration space of a PCI Ex-press device given the segment number and Requestor ID. 15.3.2.5 IA32_MCi_CTL2 MSRs The IA32_MCi_CTL2 MSR provides the programming interface to use corrected MC error signaling capability that is indicated by IA32_MCG_CAP[10] = 1. Software must check for the presence of IA32_MCi_CTL2 on a per-bank basis. When IA32_MCG_CAP[10] = 1, the IA32_MCi_CTL2 MSR for each bank exists, i.e. reads and writes to these MSR are supported. However, signaling interface for corrected MC errors may not be supported in all banks. The layout of IA32_MCi_CTL2 is shown in Figure 15-9: 63 31 30 29 15 14 0 Reserved Reserved CMCI_EN—Enable/disable CMCI Corrected error count threshold Figure 15-9. IA32_MCi_CTL2 Register • Corrected error count threshold, bits 14:0 — Software must initialize this field. The value is compared with the corrected error count field in IA32_MCi_STATUS, bits 38 through 52. An overflow event is signaled to the CMCI LVT entry (see Table 10-1) in the APIC when the count value equals the threshold value. The new LVT entry in the APIC is at 02F0H offset from the APIC_BASE. If CMCI interface is not supported for a particular bank (but IA32_MCG_CAP[10] = 1), this field will always read 0. • CMCI_EN (Corrected error interrupt enable/disable/indicator), bits 30 — Software sets this bit to enable the generation of corrected machine-check error interrupt (CMCI). If CMCI interface is not supported for a particular bank (but IA32_MCG_CAP[10] = 1), this bit is writeable but will always return 0 for that bank. This bit also indicates CMCI is supported or not supported in the corresponding bank. See Section 15.5 for details of software detection of CMCI facility. Vol. 3B 15-11 MACHINE-CHECK ARCHITECTURE Some microarchitectural sub-systems that are the source of corrected MC errors may be shared by more than one logical processors. Consequently, the facilities for reporting MC errors and controlling mechanisms may be shared by more than one logical processors. For example, the IA32_MCi_CTL2 MSR is shared between logical processors sharing a processor core. Software is responsible to program IA32_MCi_CTL2 MSR in a consistent manner with CMCI delivery and usage. After processor reset, IA32_MCi_CTL2 MSRs are zero’ed. 15.3.2.6 IA32_MCG Extended Machine Check State MSRs The Pentium 4 and Intel Xeon processors implement a variable number of extended machine-check state MSRs. The MCG_EXT_P flag in the IA32_MCG_CAP MSR indicates the presence of these extended registers, and the MCG_EXT_CNT field indicates the number of these registers actually implemented. See Section 15.3.1.1, “IA32_MCG_CAP MSR.” Also see Table 15-5. Table 15-5. Extended Machine Check State MSRs in Processors Without Support for Intel 64 Architecture MSR Address Description IA32_MCG_EAX 180H Contains state of the EAX register at the time of the machine-check error. IA32_MCG_EBX 181H Contains state of the EBX register at the time of the machine-check error. IA32_MCG_ECX 182H Contains state of the ECX register at the time of the machine-check error. IA32_MCG_EDX 183H Contains state of the EDX register at the time of the machine-check error. IA32_MCG_ESI 184H Contains state of the ESI register at the time of the machine-check error. IA32_MCG_EDI 185H Contains state of the EDI register at the time of the machine-check error. IA32_MCG_EBP 186H Contains state of the EBP register at the time of the machine-check error. IA32_MCG_ESP 187H Contains state of the ESP register at the time of the machine-check error. IA32_MCG_EFLAGS 188H Contains state of the EFLAGS register at the time of the machine-check error. IA32_MCG_EIP 189H Contains state of the EIP register at the time of the machine-check error. IA32_MCG_MISC 18AH When set, indicates that a page assist or page fault occurred during DS normal operation. In processors with support for Intel 64 architecture, 64-bit machine check state MSRs are aliased to the legacy MSRs. In addition, there may be registers beyond IA32_MCG_MISC. These may include up to five reserved MSRs (IA32_MCG_RESERVED[1:5]) and save-state MSRs for registers introduced in 64-bit mode. See Table 15-6. Table 15-6. Extended Machine Check State MSRs In Processors With Support For Intel 64 Architecture MSR Address Description IA32_MCG_RAX 180H Contains state of the RAX register at the time of the machine-check error. IA32_MCG_RBX 181H Contains state of the RBX register at the time of the machine-check error. IA32_MCG_RCX 182H Contains state of the RCX register at the time of the machine-check error. IA32_MCG_RDX 183H Contains state of the RDX register at the time of the machine-check error. IA32_MCG_RSI 184H Contains state of the RSI register at the time of the machine-check error. IA32_MCG_RDI 185H Contains state of the RDI register at the time of the machine-check error. IA32_MCG_RBP 186H Contains state of the RBP register at the time of the machine-check error. IA32_MCG_RSP 187H Contains state of the RSP register at the time of the machine-check error. IA32_MCG_RFLAGS 188H Contains state of the RFLAGS register at the time of the machine-check error. IA32_MCG_RIP 189H Contains state of the RIP register at the time of the machine-check error. 15-12 Vol. 3B MACHINE-CHECK ARCHITECTURE Table 15-6. Extended Machine Check State MSRs In Processors With Support For Intel 64 Architecture (Contd.) MSR Address Description IA32_MCG_MISC 18AH When set, indicates that a page assist or page fault occurred during DS normal operation. IA32_MCG_ RSERVED[1:5] 18BH18FH These registers, if present, are reserved. IA32_MCG_R8 190H Contains state of the R8 register at the time of the machine-check error. IA32_MCG_R9 191H Contains state of the R9 register at the time of the machine-check error. IA32_MCG_R10 192H Contains state of the R10 register at the time of the machine-check error. IA32_MCG_R11 193H Contains state of the R11 register at the time of the machine-check error. IA32_MCG_R12 194H Contains state of the R12 register at the time of the machine-check error. IA32_MCG_R13 195H Contains state of the R13 register at the time of the machine-check error. IA32_MCG_R14 196H Contains state of the R14 register at the time of the machine-check error. IA32_MCG_R15 197H Contains state of the R15 register at the time of the machine-check error. When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a debugger to analyze the error. These registers are read/write to zero registers. This means software can read them; but if software writes to them, only all zeros is allowed. If software attempts to write a non-zero value into one of these registers, a generalprotection (#GP) exception is generated. These registers are cleared on a hardware reset (power-up or RESET), but maintain their contents following a soft reset (INIT reset). 15.3.3 Mapping of the Pentium Processor Machine-Check Errors to the Machine-Check Architecture The Pentium processor reports machine-check errors using two registers: P5_MC_TYPE and P5_MC_ADDR. The Pentium 4, Intel Xeon, Intel Atom, and P6 family processors map these registers to the IA32_MCi_STATUS and IA32_MCi_ADDR in the error-reporting register bank. This bank reports on the same type of external bus errors reported in P5_MC_TYPE and P5_MC_ADDR. The information in these registers can then be accessed in two ways: • By reading the IA32_MCi_STATUS and IA32_MCi_ADDR registers as part of a general machine-check exception handler written for Pentium 4, Intel Atom and P6 family processors. • By reading the P5_MC_TYPE and P5_MC_ADDR registers using the RDMSR instruction. The second capability permits a machine-check exception handler written to run on a Pentium processor to be run on a Pentium 4, Intel Xeon, Intel Atom, or P6 family processor. There is a limitation in that information returned by the Pentium 4, Intel Xeon, Intel Atom, and P6 family processors is encoded differently than information returned by the Pentium processor. To run a Pentium processor machine-check exception handler on a Pentium 4, Intel Xeon, Intel Atom, or P6 family processor; the handler must be written to interpret P5_MC_TYPE encodings correctly. 15.4 ENHANCED CACHE ERROR REPORTING Starting with Intel Core Duo processors, cache error reporting was enhanced. In earlier Intel processors, cache status was based on the number of correction events that occurred in a cache. In the new paradigm, called “threshold-based error status”, cache status is based on the number of lines (ECC blocks) in a cache that incur repeated corrections. The threshold is chosen by Intel, based on various factors. If a processor supports thresholdbased error status, it sets IA32_MCG_CAP[11] (MCG_TES_P) to 1; if not, to 0. Vol. 3B 15-13 MACHINE-CHECK ARCHITECTURE A processor that supports enhanced cache error reporting contains hardware that tracks the operating status of certain caches and provides an indicator of their “health”. The hardware reports a “green” status when the number of lines that incur repeated corrections is at or below a pre-defined threshold, and a “yellow” status when the number of affected lines exceeds the threshold. Yellow status means that the cache reporting the event is operating correctly, but you should schedule the system for servicing within a few weeks. Intel recommends that you rely on this mechanism for structures supported by threshold-base error reporting. The CPU/system/platform response to a yellow event should be less severe than its response to an uncorrected error. An uncorrected error means that a serious error has actually occurred, whereas the yellow condition is a warning that the number of affected lines has exceeded the threshold but is not, in itself, a serious event: the error was corrected and system state was not compromised. The green/yellow status indicator is not a foolproof early warning for an uncorrected error resulting from the failure of two bits in the same ECC block. Such a failure can occur and cause an uncorrected error before the yellow threshold is reached. However, the chance of an uncorrected error increases as the number of affected lines increases. 15.5 CORRECTED MACHINE CHECK ERROR INTERRUPT Corrected machine-check error interrupt (CMCI) is an architectural enhancement to the machine-check architecture. It provides capabilities beyond those of threshold-based error reporting (Section 15.4). With threshold-based error reporting, software is limited to use periodic polling to query the status of hardware corrected MC errors. CMCI provides a signaling mechanism to deliver a local interrupt based on threshold values that software can program using the IA32_MCi_CTL2 MSRs. CMCI is disabled by default. System software is required to enable CMCI for each IA32_MCi bank that support the reporting of hardware corrected errors if IA32_MCG_CAP[10] = 1. System software use IA32_MCi_CTL2 MSR to enable/disable the CMCI capability for each bank and program threshold values into IA32_MCi_CTL2 MSR. CMCI is not affected by the CR4.MCE bit, and it is not affected by the IA32_MCi_CTL MSRs. To detect the existence of thresholding for a given bank, software writes only bits 14:0 with the threshold value. If the bits persist, then thresholding is available (and CMCI is available). If the bits are all 0's, then no thresholding exists. To detect that CMCI signaling exists, software writes a 1 to bit 30 of the MCi_CTL2 register. Upon subsequent read, if bit 30 = 0, no CMCI is available for this bank and no corrected or UCNA errors will be reported on this bank. If bit 30 = 1, then CMCI is available and enabled. 15.5.1 CMCI Local APIC Interface The operation of CMCI is depicted in Figure 15-10. 63 Software write 1 to enable 31 30 29 Error threshold ?= 53 52 MCi_STATUS 0 14 MCi_CTL2 Count overflow threshold -> CMCI LVT in local APIC 38 37 0 Error count Figure 15-10. CMCI Behavior 15-14 Vol. 3B APIC_BASE + 2F0H MACHINE-CHECK ARCHITECTURE CMCI interrupt delivery is configured by writing to the LVT CMCI register entry in the local APIC register space at default address of APIC_BASE + 2F0H. A CMCI interrupt can be delivered to more than one logical processors if multiple logical processors are affected by the associated MC errors. For example, if a corrected bit error in a cache shared by two logical processors caused a CMCI, the interrupt will be delivered to both logical processors sharing that microarchitectural sub-system. Similarly, package level errors may cause CMCI to be delivered to all logical processors within the package. However, system level errors will not be handled by CMCI. See Section 10.5.1, “Local Vector Table” for details regarding the LVT CMCI register. 15.5.2 System Software Recommendation for Managing CMCI and Machine Check Resources System software must enable and manage CMCI, set up interrupt handlers to service CMCI interrupts delivered to affected logical processors, program CMCI LVT entry, and query machine check banks that are shared by more than one logical processors. This section describes techniques system software can implement to manage CMCI initialization, service CMCI interrupts in a efficient manner to minimize contentions to access shared MSR resources. 15.5.2.1 CMCI Initialization Although a CMCI interrupt may be delivered to more than one logical processors depending on the nature of the corrected MC error, only one instance of the interrupt service routine needs to perform the necessary service and make queries to the machine-check banks. The following steps describes a technique that limits the amount of work the system has to do in response to a CMCI. • To provide maximum flexibility, system software should define per-thread data structure for each logical processor to allow equal-opportunity and efficient response to interrupt delivery. Specifically, the per-thread data structure should include a set of per-bank fields to track which machine check bank it needs to access in response to a delivered CMCI interrupt. The number of banks that needs to be tracked is determined by IA32_MCG_CAP[7:0]. • Initialization of per-thread data structure. The initialization of per-thread data structure must be done serially on each logical processor in the system. The sequencing order to start the per-thread initialization between different logical processor is arbitrary. But it must observe the following specific detail to satisfy the shared nature of specific MSR resources: a. Each thread initializes its data structure to indicate that it does not own any MC bank registers. b. Each thread examines IA32_MCi_CTL2[30] indicator for each bank to determine if another thread has already claimed ownership of that bank. c. • • If IA32_MCi_CTL2[30] had been set by another thread. This thread can not own bank i and should proceed to step b. and examine the next machine check bank until all of the machine check banks are exhausted. • If IA32_MCi_CTL2[30] = 0, proceed to step c. Check whether writing a 1 into IA32_MCi_CTL2[30] can return with 1 on a subsequent read to determine this bank can support CMCI. • If IA32_MCi_CTL2[30] = 0, this bank does not support CMCI. This thread can not own bank i and should proceed to step b. and examine the next machine check bank until all of the machine check banks are exhausted. • If IA32_MCi_CTL2[30] = 1, modify the per-thread data structure to indicate this thread claims ownership to the MC bank; proceed to initialize the error threshold count (bits 15:0) of that bank as described in Chapter 15, “CMCI Threshold Management”. Then proceed to step b. and examine the next machine check bank until all of the machine check banks are exhausted. After the thread has examined all of the machine check banks, it sees if it owns any MC banks to service CMCI. If any bank has been claimed by this thread: — Ensure that the CMCI interrupt handler has been set up as described in Chapter 15, “CMCI Interrupt Handler”. — Initialize the CMCI LVT entry, as described in Section 15.5.1, “CMCI Local APIC Interface”. Vol. 3B 15-15 MACHINE-CHECK ARCHITECTURE — Log and clear all of IA32_MCi_Status registers for the banks that this thread owns. This will allow new errors to be logged. 15.5.2.2 CMCI Threshold Management The Corrected MC error threshold field, IA32_MCi_CTL2[15:0], is architecturally defined. Specifically, all these bits are writable by software, but different processor implementations may choose to implement less than 15 bits as threshold for the overflow comparison with IA32_MCi_STATUS[52:38]. The following describes techniques that software can manage CMCI threshold to be compatible with changes in implementation characteristics: • Software can set the initial threshold value to 1 by writing 1 to IA32_MCi_CTL2[15:0]. This will cause overflow condition on every corrected MC error and generates a CMCI interrupt. • To increase the threshold and reduce the frequency of CMCI servicing: a. Find the maximum threshold value a given processor implementation supports. The steps are: • • Write 7FFFH to IA32_MCi_CTL2[15:0], Read back IA32_MCi_CTL2[15:0], the lower 15 bits (14:0) is the maximum threshold supported by the processor. b. Increase the threshold to a value below the maximum value discovered using step a. 15.5.2.3 CMCI Interrupt Handler The following describes techniques system software may consider to implement a CMCI service routine: • The service routine examines its private per-thread data structure to check which set of MC banks it has ownership. If the thread does not have ownership of a given MC bank, proceed to the next MC bank. Ownership is determined at initialization time which is described in Section [Cross Reference to 14.5.2.1]. If the thread had claimed ownership to an MC bank, this technique will allow each logical processors to handle corrected MC errors independently and requires no synchronization to access shared MSR resources. Consult Example 15-5 for guidelines on logging when processing CMCI. 15.6 RECOVERY OF UNCORRECTED RECOVERABLE (UCR) ERRORS Recovery of uncorrected recoverable machine check errors is an enhancement in machine-check architecture. The first processor that supports this feature is 45 nm Intel 64 processor on which CPUID reports DisplayFamily_DisplayModel as 06H_2EH (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). This allow system software to perform recovery action on certain class of uncorrected errors and continue execution. 15.6.1 Detection of Software Error Recovery Support Software must use bit 24 of IA32_MCG_CAP (MCG_SER_P) to detect the presence of software error recovery support (see Figure 15-2). When IA32_MCG_CAP[24] is set, this indicates that the processor supports software error recovery. When this bit is clear, this indicates that there is no support for error recovery from the processor and the primary responsibility of the machine check handler is logging the machine check error information and shutting down the system. The new class of architectural MCA errors from which system software can attempt recovery is called Uncorrected Recoverable (UCR) Errors. UCR errors are uncorrected errors that have been detected and signaled but have not corrupted the processor context. For certain UCR errors, this means that once system software has performed a certain recovery action, it is possible to continue execution on this processor. UCR error reporting provides an error containment mechanism for data poisoning. The machine check handler will use the error log information from the error reporting registers to analyze and implement specific error recovery actions for UCR errors. 15-16 Vol. 3B MACHINE-CHECK ARCHITECTURE 15.6.2 UCR Error Reporting and Logging IA32_MCi_STATUS MSR is used for reporting UCR errors and existing corrected or uncorrected errors. The definitions of IA32_MCi_STATUS, including bit fields to identify UCR errors, is shown in Figure 15-6. UCR errors can be signaled through either the corrected machine check interrupt (CMCI) or machine check exception (MCE) path depending on the type of the UCR error. When IA32_MCG_CAP[24] is set, a UCR error is indicated by the following bit settings in the IA32_MCi_STATUS register: • • • Valid (bit 63) = 1 UC (bit 61) = 1 PCC (bit 57) = 0 Additional information from the IA32_MCi_MISC and the IA32_MCi_ADDR registers for the UCR error are available when the ADDRV and the MISCV flags in the IA32_MCi_STATUS register are set (see Section 15.3.2.4). The MCA error code field of the IA32_MCi_STATUS register indicates the type of UCR error. System software can interpret the MCA error code field to analyze and identify the necessary recovery action for the given UCR error. In addition, the IA32_MCi_STATUS register bit fields, bits 56:55, are defined (see Figure 15-6) to provide additional information to help system software to properly identify the necessary recovery action for the UCR error: • S (Signaling) flag, bit 56 - Indicates (when set) that a machine check exception was generated for the UCR error reported in this MC bank and system software needs to check the AR flag and the MCA error code fields in the IA32_MCi_STATUS register to identify the necessary recovery action for this error. When the S flag in the IA32_MCi_STATUS register is clear, this UCR error was not signaled via a machine check exception and instead was reported as a corrected machine check (CMC). System software is not required to take any recovery action when the S flag in the IA32_MCi_STATUS register is clear. • AR (Action Required) flag, bit 55 - Indicates (when set) that MCA error code specific recovery action must be performed by system software at the time this error was signaled. This recovery action must be completed successfully before any additional work is scheduled for this processor. When the RIPV flag in the IA32_MCG_STATUS is clear, an alternative execution stream needs to be provided; when the MCA error code specific recovery specific recovery action cannot be successfully completed, system software must shut down the system. When the AR flag in the IA32_MCi_STATUS register is clear, system software may still take MCA error code specific recovery action but this is optional; system software can safely resume program execution at the instruction pointer saved on the stack from the machine check exception when the RIPV flag in the IA32_MCG_STATUS register is set. Both the S and the AR flags in the IA32_MCi_STATUS register are defined to be sticky bits, which mean that once set, the processor does not clear them. Only software and good power-on reset can clear the S and the AR-flags. Both the S and the AR flags are only set when the processor reports the UCR errors (MCG_CAP[24] is set). 15.6.3 UCR Error Classification With the S and AR flag encoding in the IA32_MCi_STATUS register, UCR errors can be classified as: • Uncorrected no action required (UCNA) - is a UCR error that is not signaled via a machine check exception and, instead, is reported to system software as a corrected machine check error. UCNA errors indicate that some data in the system is corrupted, but the data has not been consumed and the processor state is valid and you may continue execution on this processor. UCNA errors require no action from system software to continue execution. A UCNA error is indicated with UC=1, PCC=0, S=0 and AR=0 in the IA32_MCi_STATUS register. • Software recoverable action optional (SRAO) - a UCR error is signaled either via a machine check exception or CMCI. System software recovery action is optional and not required to continue execution from this machine check exception. SRAO errors indicate that some data in the system is corrupt, but the data has not been consumed and the processor state is valid. SRAO errors provide the additional error information for system software to perform a recovery action. An SRAO error when signaled as a machine check is indicated with UC=1, PCC=0, S=1, EN=1 and AR=0 in the IA32_MCi_STATUS register. In cases when SRAO is signaled via CMCI the error signature is indicated via UC=1, PCC=0, S=0. Recovery actions for SRAO errors are MCA error code specific. The MISCV and the ADDRV flags in the IA32_MCi_STATUS register are set when the additional error information is available from the IA32_MCi_MISC and the IA32_MCi_ADDR registers. System software needs to inspect the MCA error code fields in the IA32_MCi_STATUS register to identify the specific recovery Vol. 3B 15-17 MACHINE-CHECK ARCHITECTURE action for a given SRAO error. If MISCV and ADDRV are not set, it is recommended that no system software error recovery be performed however, system software can resume execution. • Software recoverable action required (SRAR) - a UCR error that requires system software to take a recovery action on this processor before scheduling another stream of execution on this processor. SRAR errors indicate that the error was detected and raised at the point of the consumption in the execution flow. An SRAR error is indicated with UC=1, PCC=0, S=1, EN=1 and AR=1 in the IA32_MCi_STATUS register. Recovery actions are MCA error code specific. The MISCV and the ADDRV flags in the IA32_MCi_STATUS register are set when the additional error information is available from the IA32_MCi_MISC and the IA32_MCi_ADDR registers. System software needs to inspect the MCA error code fields in the IA32_MCi_STATUS register to identify the specific recovery action for a given SRAR error. If MISCV and ADDRV are not set, it is recommended that system software shutdown the system. Table 15-7 summarizes UCR, corrected, and uncorrected errors. Table 15-7. MC Error Classifications Type of Error1 UC EN PCC S AR Signaling Software Action Uncorrected Error (UC) 1 1 1 x x MCE If EN=1, reset the system, else log and OK to keep the system running. SRAR 1 1 0 1 1 MCE For known MCACOD, take specific recovery action; Example Cache to processor load error. For unknown MCACOD, must bugcheck. If OVER=1, reset system, else take specific recovery action. SRAO 1 x2 0 x2 0 MCE/CMC For known MCACOD, take specific recovery action; Patrol scrub and explicit writeback poison errors. For unknown MCACOD, OK to keep the system running. UCNA 1 x 0 0 0 CMC Log the error and Ok to keep the system running. Poison detection error. Corrected Error (CE) 0 x x x x CMC Log the error and no corrective action required. ECC in caches and memory. NOTES: 1. SRAR, SRAO and UCNA errors are supported by the processor only when IA32_MCG_CAP[24] (MCG_SER_P) is set. 2. EN=1, S=1 when signaled via MCE. EN=x, S=0 when signaled via CMC. 15.6.4 UCR Error Overwrite Rules In general, the overwrite rules are as follows: • • • • UCR errors will overwrite corrected errors. Uncorrected (PCC=1) errors overwrite UCR (PCC=0) errors. UCR errors are not written over previous UCR errors. Corrected errors do not write over previous UCR errors. Regardless of whether the 1st error is retained or the 2nd error is overwritten over the 1st error, the OVER flag in the IA32_MCi_STATUS register will be set to indicate an overflow condition. As the S flag and AR flag in the IA32_MCi_STATUS register are defined to be sticky flags, a second event cannot clear these 2 flags once set, however the MC bank information may be filled in for the 2nd error. The table below shows the overwrite rules and how to treat a second error if the first event is already logged in a MC bank along with the resulting bit setting of the UC, PCC, and AR flags in the IA32_MCi_STATUS register. As UCNA and SRA0 errors do not require recovery action from system software to continue program execution, a system reset by system software is not required unless the AR flag or PCC flag is set for the UCR overflow case (OVER=1, VAL=1, UC=1, PCC=0). 15-18 Vol. 3B MACHINE-CHECK ARCHITECTURE Table 15-8 lists overwrite rules for uncorrected errors, corrected errors, and uncorrected recoverable errors. Table 15-8. Overwrite Rules for UC, CE, and UCR Errors First Event Second Event UC PCC S MCA Bank Reset System CE UCR 1 0 0 if UCNA, else 1 1 if SRAR, else 0 AR second yes, if AR=1 UCR CE 1 0 0 if UCNA, else 1 1 if SRAR, else 0 first yes, if AR=1 UCNA UCNA 1 0 0 first no 0 UCNA SRAO 1 0 1 0 first no UCNA SRAR 1 0 1 1 first yes SRAO UCNA 1 0 1 0 first no SRAO SRAO 1 0 1 0 first no SRAO SRAR 1 0 1 1 first yes SRAR UCNA 1 0 1 1 first yes SRAR SRAO 1 0 1 1 first yes SRAR SRAR 1 0 1 1 first yes UCR UC 1 1 undefined undefined second yes UC UCR 1 1 undefined undefined first yes 15.7 MACHINE-CHECK AVAILABILITY The machine-check architecture and machine-check exception (#MC) are model-specific features. Software can execute the CPUID instruction to determine whether a processor implements these features. Following the execution of the CPUID instruction, the settings of the MCA flag (bit 14) and MCE flag (bit 7) in EDX indicate whether the processor implements the machine-check architecture and machine-check exception. 15.8 MACHINE-CHECK INITIALIZATION To use the processors machine-check architecture, software must initialize the processor to activate the machinecheck exception and the error-reporting mechanism. Example 15-1 gives pseudocode for performing this initialization. This pseudocode checks for the existence of the machine-check architecture and exception; it then enables machine-check exception and the error-reporting register banks. The pseudocode shown is compatible with the Pentium 4, Intel Xeon, Intel Atom, P6 family, and Pentium processors. Following power up or power cycling, IA32_MCi_STATUS registers are not guaranteed to have valid data until after they are initially cleared to zero by software (as shown in the initialization pseudocode in Example 15-1). In addition, when using P6 family processors, software must set MCi_STATUS registers to zero when doing a soft-reset. Example 15-1. Machine-Check Initialization Pseudocode Check CPUID Feature Flags for MCE and MCA support IF CPU supports MCE THEN IF CPU supports MCA THEN IF (IA32_MCG_CAP.MCG_CTL_P = 1) (* IA32_MCG_CTL register is present *) THEN IA32_MCG_CTL ← FFFFFFFFFFFFFFFFH; (* enables all MCA features *) FI IF (IA32_MCG_CAP.MCG_LMCE_P = 1 and IA32_FEATURE_CONTROL.LOCK = 1 and IA32_FEATURE_CONTROL.LMCE_ON= 1) Vol. 3B 15-19 MACHINE-CHECK ARCHITECTURE (* IA32_MCG_EXT_CTL register is present and platform has enabled LMCE to permit system software to use LMCE *) THEN IA32_MCG_EXT_CTL ← IA32_MCG_EXT_CTL | 01H; (* System software enables LMCE capability for hardware to signal MCE to a single logical processor*) FI (* Determine number of error-reporting banks supported *) COUNT← IA32_MCG_CAP.Count; MAX_BANK_NUMBER ← COUNT - 1; IF (Processor Family is 6H and Processor EXTMODEL:MODEL is less than 1AH) THEN (* Enable logging of all errors except for MC0_CTL register *) FOR error-reporting banks (1 through MAX_BANK_NUMBER) DO IA32_MCi_CTL ← 0FFFFFFFFFFFFFFFFH; OD ELSE (* Enable logging of all errors including MC0_CTL register *) FOR error-reporting banks (0 through MAX_BANK_NUMBER) DO IA32_MCi_CTL ← 0FFFFFFFFFFFFFFFFH; OD FI (* BIOS clears all errors only on power-on reset *) IF (BIOS detects Power-on reset) THEN FOR error-reporting banks (0 through MAX_BANK_NUMBER) DO IA32_MCi_STATUS ← 0; OD ELSE FOR error-reporting banks (0 through MAX_BANK_NUMBER) DO (Optional for BIOS and OS) Log valid errors (OS only) IA32_MCi_STATUS ← 0; OD FI FI Setup the Machine Check Exception (#MC) handler for vector 18 in IDT Set the MCE bit (bit 6) in CR4 register to enable Machine-Check Exceptions FI 15.9 INTERPRETING THE MCA ERROR CODES When the processor detects a machine-check error condition, it writes a 16-bit error code to the MCA error code field of one of the IA32_MCi_STATUS registers and sets the VAL (valid) flag in that register. The processor may also write a 16-bit model-specific error code in the IA32_MCi_STATUS register depending on the implementation of the machine-check architecture of the processor. The MCA error codes are architecturally defined for Intel 64 and IA-32 processors. To determine the cause of a machine-check exception, the machine-check exception handler must read the VAL flag for each IA32_MCi_STATUS register. If the flag is set, the machine check-exception handler must then read the MCA error code field of the register. It is the encoding of the MCA error code field [15:0] that determines the type of error being reported and not the register bank reporting it. There are two types of MCA error codes: simple error codes and compound error codes. 15-20 Vol. 3B MACHINE-CHECK ARCHITECTURE 15.9.1 Simple Error Codes Table 15-9 shows the simple error codes. These unique codes indicate global error information. Table 15-9. IA32_MCi_Status [15:0] Simple Error Code Encoding Error Code Binary Encoding Meaning No Error 0000 0000 0000 0000 No error has been reported to this bank of error-reporting registers. Unclassified 0000 0000 0000 0001 This error has not been classified into the MCA error classes. Microcode ROM Parity Error 0000 0000 0000 0010 Parity error in internal microcode ROM External Error 0000 0000 0000 0011 The BINIT# from another processor caused this processor to enter machine check.1 FRC Error 0000 0000 0000 0100 FRC (functional redundancy check) master/slave error Internal Parity Error 0000 0000 0000 0101 Internal parity error. SMM Handler Code Access Violation 0000 0000 0000 0110 An attempt was made by the SMM Handler to execute outside the ranges specified by SMRR. Internal Timer Error 0000 0100 0000 0000 Internal timer error. I/O Error 0000 1110 0000 1011 generic I/O error. Internal Unclassified 0000 01xx xxxx xxxx Internal unclassified errors. 2 NOTES: 1. BINIT# assertion will cause a machine check exception if the processor (or any processor on the same external bus) has BINIT# observation enabled during power-on configuration (hardware strapping) and if machine check exceptions are enabled (by setting CR4.MCE = 1). 2. At least one X must equal one. Internal unclassified errors have not been classified. 15.9.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all of compound errors. These sub-fields describe the type of access, level in the cache hierarchy, and type of request. Table 15-10 shows the general form of the compound error codes. Table 15-10. IA32_MCi_Status [15:0] Compound Error Code Encoding Type Form Interpretation Generic Cache Hierarchy 000F 0000 0000 11LL Generic cache hierarchy error TLB Errors 000F 0000 0001 TTLL {TT}TLB{LL}_ERR Memory Controller Errors 000F 0000 1MMM CCCC {MMM}_CHANNEL{CCCC}_ERR Cache Hierarchy Errors 000F 0001 RRRR TTLL {TT}CACHE{LL}_{RRRR}_ERR Extended Memory Errors 000F 0010 1MMM CCCC {MMM}_CHANNEL{CCCC}_ERR Bus and Interconnect Errors 000F 1PPT RRRR IILL BUS{LL}_{PP}_{RRRR}_{II}_{T}_ERR The “Interpretation” column in the table indicates the name of a compound error. The name is constructed by substituting mnemonics for the sub-field names given within curly braces. For example, the error code ICACHEL1_RD_ERR is constructed from the form: {TT}CACHE{LL}_{RRRR}_ERR, where {TT} is replaced by I, {LL} is replaced by L1, and {RRRR} is replaced by RD. For more information on the “Form” and “Interpretation” columns, see Sections Section 15.9.2.1, “Correction Report Filtering (F) Bit” through Section 15.9.2.5, “Bus and Interconnect Errors”. Vol. 3B 15-21 MACHINE-CHECK ARCHITECTURE 15.9.2.1 Correction Report Filtering (F) Bit Starting with Intel Core Duo processors, bit 12 in the “Form” column in Table 15-10 is used to indicate that a particular posting to a log may be the last posting for corrections in that line/entry, at least for some time: • • 0 in bit 12 indicates “normal” filtering (original P6/Pentium4/Atom/Xeon processor meaning). 1 in bit 12 indicates “corrected” filtering (filtering is activated for the line/entry in the posting). Filtering means that some or all of the subsequent corrections to this entry (in this structure) will not be posted. The enhanced error reporting introduced with the Intel Core Duo processors is based on tracking the lines affected by repeated corrections (see Section 15.4, “Enhanced Cache Error reporting”). This capability is indicated by IA32_MCG_CAP[11]. Only the first few correction events for a line are posted; subsequent redundant correction events to the same line are not posted. Uncorrected events are always posted. The behavior of error filtering after crossing the yellow threshold is model-specific. Filtering has meaning only for corrected errors (UC=0 in IA32_MCi_STATUS MSR). System software must ignore filtering bit (12) for uncorrected errors. 15.9.2.2 Transaction Type (TT) Sub-Field The 2-bit TT sub-field (Table 15-11) indicates the type of transaction (data, instruction, or generic). The sub-field applies to the TLB, cache, and interconnect error conditions. Note that interconnect error conditions are primarily associated with P6 family and Pentium processors, which utilize an external APIC bus separate from the system bus. The generic type is reported when the processor cannot determine the transaction type. Table 15-11. Encoding for TT (Transaction Type) Sub-Field Transaction Type Mnemonic Binary Encoding Instruction I 00 Data D 01 Generic G 10 15.9.2.3 Level (LL) Sub-Field The 2-bit LL sub-field (see Table 15-12) indicates the level in the memory hierarchy where the error occurred (level 0, level 1, level 2, or generic). The LL sub-field also applies to the TLB, cache, and interconnect error conditions. The Pentium 4, Intel Xeon, Intel Atom, and P6 family processors support two levels in the cache hierarchy and one level in the TLBs. Again, the generic type is reported when the processor cannot determine the hierarchy level. Table 15-12. Level Encoding for LL (Memory Hierarchy Level) Sub-Field Hierarchy Level Mnemonic Binary Encoding Level 0 L0 00 Level 1 L1 01 Level 2 L2 10 Generic LG 11 15.9.2.4 Request (RRRR) Sub-Field The 4-bit RRRR sub-field (see Table 15-13) indicates the type of action associated with the error. Actions include read and write operations, prefetches, cache evictions, and snoops. Generic error is returned when the type of error cannot be determined. Generic read and generic write are returned when the processor cannot determine the type of instruction or data request that caused the error. Eviction and snoop requests apply only to the caches. All of the other requests apply to TLBs, caches and interconnects. 15-22 Vol. 3B MACHINE-CHECK ARCHITECTURE Table 15-13. Encoding of Request (RRRR) Sub-Field Request Type Mnemonic Binary Encoding Generic Error ERR 0000 Generic Read RD 0001 Generic Write WR 0010 Data Read DRD 0011 Data Write DWR 0100 Instruction Fetch IRD 0101 Prefetch PREFETCH 0110 Eviction EVICT 0111 Snoop SNOOP 1000 15.9.2.5 Bus and Interconnect Errors The bus and interconnect errors are defined with the 2-bit PP (participation), 1-bit T (time-out), and 2-bit II (memory or I/O) sub-fields, in addition to the LL and RRRR sub-fields (see Table 15-14). The bus error conditions are implementation dependent and related to the type of bus implemented by the processor. Likewise, the interconnect error conditions are predicated on a specific implementation-dependent interconnect model that describes the connections between the different levels of the storage hierarchy. The type of bus is implementation dependent, and as such is not specified in this document. A bus or interconnect transaction consists of a request involving an address and a response. Table 15-14. Encodings of PP, T, and II Sub-Fields Sub-Field Transaction Mnemonic Binary Encoding PP (Participation) Local processor* originated request SRC 00 Local processor* responded to request RES 01 Local processor* observed error as third party OBS Generic T (Time-out) II (Memory or I/O) 10 11 Request timed out TIMEOUT 1 Request did not time out NOTIMEOUT 0 Memory Access M 00 Reserved I/O Other transaction 01 IO 10 11 NOTE: * Local processor differentiates the processor reporting the error from other system components (including the APIC, other processors, etc.). Vol. 3B 15-23 MACHINE-CHECK ARCHITECTURE 15.9.2.6 Memory Controller and Extended Memory Errors The memory controller errors are defined with the 3-bit MMM (memory transaction type), and 4-bit CCCC (channel) sub-fields. The encodings for MMM and CCCC are defined in Table 15-15. Extended Memory errors use the same encodings and are used to report errors in memory used as a cache. Table 15-15. Encodings of MMM and CCCC Sub-Fields Sub-Field Transaction Mnemonic Binary Encoding MMM Generic undefined request GEN 000 Memory read error RD 001 Memory write error WR 010 Address/Command Error AC 011 Memory Scrubbing Error MS 100 Reserved CCCC 101-111 Channel number CHN Channel not specified 0000-1110 1111 Note that the CCCC channel number may be enumerated from zero separately by each memory controller on a system. On a multi-socket system, or a system with multiple memory controllers per socket, it is necessary to also consider which machine check bank logged the error. See Chapter 16 for details on specific implementations. 15.9.3 Architecturally Defined UCR Errors Software recoverable compound error code are defined in this section. 15.9.3.1 Architecturally Defined SRAO Errors The following two SRAO errors are architecturally defined. • • UCR Errors detected by memory controller scrubbing; and UCR Errors detected during L3 cache (L3) explicit writebacks. The MCA error code encodings for these two architecturally-defined UCR errors corresponds to sub-classes of compound MCA error codes (see Table 15-10). Their values and compound encoding format are given in Table 15-16. Table 15-16. MCA Compound Error Code Encoding for SRAO Errors Type MCACOD Value Memory Scrubbing C0H - CFH MCA Error Code Encoding1 0000_0000_1100_CCCC 000F 0000 1MMM CCCC (Memory Controller Error), where Memory subfield MMM = 100B (memory scrubbing) Channel subfield CCCC = channel # or generic L3 Explicit Writeback 17AH 0000_0001_0111_1010 000F 0001 RRRR TTLL (Cache Hierarchy Error) where Request subfields RRRR = 0111B (Eviction) Transaction Type subfields TT = 10B (Generic) Level subfields LL = 10B NOTES: 1. Note that for both of these errors the correction report filtering (F) bit (bit 12) of the MCA error must be ignored. 15-24 Vol. 3B MACHINE-CHECK ARCHITECTURE Table 15-17 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAO errors. Table 15-17. IA32_MCi_STATUS Values for SRAO Errors SRAO Error Valid OVER UC EN MISCV ADDRV PCC S AR MCACOD Memory Scrubbing 1 0 1 x1 1 1 0 x1 0 C0H-CFH 1 1 0 1 0 17AH L3 Explicit Writeback 1 0 x 1 1 x NOTES: 1. When signaled as MCE, EN=1 and S=1. If error was signaled via CMC, then EN=x, and S=0. For both the memory scrubbing and L3 explicit writeback errors, the ADDRV and MISCV flags in the IA32_MCi_STATUS register are set to indicate that the offending physical address information is available from the IA32_MCi_MISC and the IA32_MCi_ADDR registers. For the memory scrubbing and L3 explicit writeback errors, the address mode in the IA32_MCi_MISC register should be set as physical address mode (010b) and the address LSB information in the IA32_MCi_MISC register should indicate the lowest valid address bit in the address information provided from the IA32_MCi_ADDR register. MCE signal is broadcast to all logical processors as outlined in Section 15.10.4.1. If LMCE is supported and enabled, some errors (not limited to UCR errors) may be delivered to only a single logical processor. System software should consult IA32_MCG_STATUS.LMCE_S to determine if the MCE signaled is only to this logical processor. IA32_MCi_STATUS banks can be shared by logical processors within a core or within the same package. So several logical processors may find an SRAO error in the shared IA32_MCi_STATUS bank but other processors do not find it in any of the IA32_MCi_STATUS banks. Table 15-18 shows the RIPV and EIPV flag indication in the IA32_MCG_STATUS register for the memory scrubbing and L3 explicit writeback errors on both the reporting and non-reporting logical processors. Table 15-18. IA32_MCG_STATUS Flag Indication for SRAO Errors SRAO Type Reporting Logical Processors Non-reporting Logical Processors RIPV EIPV RIPV EIPV Memory Scrubbing 1 0 1 0 L3 Explicit Writeback 1 0 1 0 15.9.3.2 Architecturally Defined SRAR Errors The following two SRAR errors are architecturally defined. • • UCR Errors detected on data load; and UCR Errors detected on instruction fetch. The MCA error code encodings for these two architecturally-defined UCR errors corresponds to sub-classes of compound MCA error codes (see Table 15-10). Their values and compound encoding format are given in Table 15-19. Table 15-19. MCA Compound Error Code Encoding for SRAR Errors Type MCACOD Value MCA Error Code Encoding1 Data Load 134H 0000_0001_0011_0100 000F 0001 RRRR TTLL (Cache Hierarchy Error), where Request subfield RRRR = 0011B (Data Load) Transaction Type subfield TT= 01B (Data) Level subfield LL = 00B (Level 0) Instruction Fetch 150H 0000_0001_0101_0000 000F 0001 RRRR TTLL (Cache Hierarchy Error), where Request subfield RRRR = 0101B (Instruction Fetch) Transaction Type subfield TT= 00B (Instruction) Level subfield LL = 00B (Level 0) Vol. 3B 15-25 MACHINE-CHECK ARCHITECTURE NOTES: 1. Note that for both of these errors the correction report filtering (F) bit (bit 12) of the MCA error must be ignored. Table 15-20 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAR errors. Table 15-20. IA32_MCi_STATUS Values for SRAR Errors SRAR Error Valid OVER UC EN MISCV ADDRV PCC S AR MCACOD Data Load 1 0 1 1 1 1 0 1 1 134H Instruction Fetch 1 0 1 1 1 1 0 1 1 150H For both the data load and instruction fetch errors, the ADDRV and MISCV flags in the IA32_MCi_STATUS register are set to indicate that the offending physical address information is available from the IA32_MCi_MISC and the IA32_MCi_ADDR registers. For the memory scrubbing and L3 explicit writeback errors, the address mode in the IA32_MCi_MISC register should be set as physical address mode (010b) and the address LSB information in the IA32_MCi_MISC register should indicate the lowest valid address bit in the address information provided from the IA32_MCi_ADDR register. MCE signal is broadcast to all logical processors on the system on which the UCR errors are supported, except when the processor supports LMCE and LMCE is enabled by system software (see Section 15.3.1.5). The IA32_MCG_STATUS MSR allows system software to distinguish the affected logical processor of an SRAR error amongst logical processors that observed SRAR via MCi_STATUS bank. Table 15-21 shows the RIPV and EIPV flag indication in the IA32_MCG_STATUS register for the data load and instruction fetch errors on both the reporting and non-reporting logical processors. The recoverable SRAR error reported by a processor may be continuable, where the system software can interpret the context of continuable as follows: the error was isolated, contained. If software can rectify the error condition in the current instruction stream, the execution context on that logical processor can be continued without loss of information. Table 15-21. IA32_MCG_STATUS Flag Indication for SRAR Errors SRAR Type Affected Logical Processor RIPV EIPV Non-Affected Logical Processors Continuable Recoverablecontinuable 1 1 Yes Recoverable-notcontinuable 0 x No RIPV EIPV Continuable 1 1 0 Yes NOTES: 1. see the definition of the context of “continuable” above and additional detail below. SRAR Error And Affected Logical Processors The affected logical processor is the one that has detected and raised an SRAR error at the point of the consumption in the execution flow. The affected logical processor should find the Data Load or the Instruction Fetch error information in the IA32_MCi_STATUS register that is reporting the SRAR error. Table 15-21 list the actionable scenarios that system software can respond to an SRAR error on an affected logical processor according to RIPV and EIPV values: • Recoverable-Continuable SRAR Error (RIPV=1, EIPV=1): For Recoverable-Continuable SRAR errors, the affected logical processor should find that both the IA32_MCG_STATUS.RIPV and the IA32_MCG_STATUS.EIPV flags are set, indicating that system software may be able to restart execution from the interrupted context if it is able to rectify the error condition. If system software cannot rectify the error condition then it must treat the error as a recoverable error where restarting execution with the interrupted context is not possible. Restarting without rectifying the error condition will result in most cases with another SRAR error on the same instruction. 15-26 Vol. 3B MACHINE-CHECK ARCHITECTURE • Recoverable-not-continuable SRAR Error (RIPV=0, EIPV=x): For Recoverable-not-continuable errors, the affected logical processor should find that either — IA32_MCG_STATUS.RIPV= 0, IA32_MCG_STATUS.EIPV=1, or — IA32_MCG_STATUS.RIPV= 0, IA32_MCG_STATUS.EIPV=0. In either case, this indicates that the error is detected at the instruction pointer saved on the stack for this machine check exception and restarting execution with the interrupted context is not possible. System software may take the following recovery actions for the affected logical processor: • The current executing thread cannot be continued. System software must terminate the interrupted stream of execution and provide a new stream of execution on return from the machine check handler for the affected logical processor. SRAR Error And Non-Affected Logical Processors The logical processors that observed but not affected by an SRAR error should find that the RIPV flag in the IA32_MCG_STATUS register is set and the EIPV flag in the IA32_MCG_STATUS register is cleared, indicating that it is safe to restart the execution at the instruction saved on the stack for the machine check exception on these processors after the recovery action is successfully taken by system software. 15.9.4 Multiple MCA Errors When multiple MCA errors are detected within a certain detection window, the processor may aggregate the reporting of these errors together as a single event, i.e. a single machine exception condition. If this occurs, system software may find multiple MCA errors logged in different MC banks on one logical processor or find multiple MCA errors logged across different processors for a single machine check broadcast event. In order to handle multiple UCR errors reported from a single machine check event and possibly recover from multiple errors, system software may consider the following: • Whether it can recover from multiple errors is determined by the most severe error reported on the system. If the most severe error is found to be an unrecoverable error (VAL=1, UC=1, PCC=1 and EN=1) after system software examines the MC banks of all processors to which the MCA signal is broadcast, recovery from the multiple errors is not possible and system software needs to reset the system. • When multiple recoverable errors are reported and no other fatal condition (e.g. overflowed condition for SRAR error) is found for the reported recoverable errors, it is possible for system software to recover from the multiple recoverable errors by taking necessary recovery action for each individual recoverable error. However, system software can no longer expect one to one relationship with the error information recorded in the IA32_MCi_STATUS register and the states of the RIPV and EIPV flags in the IA32_MCG_STATUS register as the states of the RIPV and the EIPV flags in the IA32_MCG_STATUS register may indicate the information for the most severe error recorded on the processor. System software is required to use the RIPV flag indication in the IA32_MCG_STATUS register to make a final decision of recoverability of the errors and find the restart-ability requirement after examining each IA32_MCi_STATUS register error information in the MC banks. In certain cases where system software observes more than one SRAR error logged for a single logical processor, it can no longer rely on affected threads as specified in Table 15-20 above. System software is recommended to reset the system if this condition is observed. 15.9.5 Machine-Check Error Codes Interpretation Chapter 16, “Interpreting Machine-Check Error Codes,” provides information on interpreting the MCA error code, model-specific error code, and other information error code fields. For P6 family processors, information has been included on decoding external bus errors. For Pentium 4 and Intel Xeon processors; information is included on external bus, internal timer and cache hierarchy errors. Vol. 3B 15-27 MACHINE-CHECK ARCHITECTURE 15.10 GUIDELINES FOR WRITING MACHINE-CHECK SOFTWARE The machine-check architecture and error logging can be used in three different ways: • • • To detect machine errors during normal instruction execution, using the machine-check exception (#MC). To periodically check and log machine errors. To examine recoverable UCR errors, determine software recoverability and perform recovery actions via a machine-check exception handler or a corrected machine-check interrupt handler. To use the machine-check exception, the operating system or executive software must provide a machine-check exception handler. This handler may need to be designed specifically for each family of processors. A special program or utility is required to log machine errors. Guidelines for writing a machine-check exception handler or a machine-error logging utility are given in the following sections. 15.10.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check exceptions, a trap gate must be added to the IDT. The pointer in the trap gate must point to a machine-check exception handler. Two approaches can be taken to designing the exception handler: 1. The handler can merely log all the machine status and error information, then call a debugger or shut down the system. 2. The handler can analyze the reported error information and, in some cases, attempt to correct the error and restart the processor. For Pentium 4, Intel Xeon, Intel Atom, P6 family, and Pentium processors; virtually all machine-check conditions cannot be corrected (they result in abort-type exceptions). The logging of status and error information is therefore a baseline implementation requirement. When IA32_MCG_CAP[24] is clear, consider the following when writing a machine-check exception handler: • To determine the nature of the error, the handler must read each of the error-reporting register banks. The count field in the IA32_MCG_CAP register gives number of register banks. The first register of register bank 0 is at address 400H. • The VAL (valid) flag in each IA32_MCi_STATUS register indicates whether the error information in the register is valid. If this flag is clear, the registers in that bank do not contain valid error information and do not need to be checked. • To write a portable exception handler, only the MCA error code field in the IA32_MCi_STATUS register should be checked. See Section 15.9, “Interpreting the MCA Error Codes,” for information that can be used to write an algorithm to interpret this field. • Correctable errors are corrected automatically by the processor. The UC flag in each IA32_MCi_STATUS register indicates whether the processor automatically corrected an error. • The RIPV, PCC, and OVER flags in each IA32_MCi_STATUS register indicate whether recovery from the error is possible. If PCC or OVER are set, recovery is not possible. If RIPV is not set, program execution can not be restarted reliably. When recovery is not possible, the handler typically records the error information and signals an abort to the operating system. • The RIPV flag in the IA32_MCG_STATUS register indicates whether the program can be restarted at the instruction indicated by the instruction pointer (the address of the instruction pushed on the stack when the exception was generated). If this flag is clear, the processor may still be able to be restarted (for debugging purposes) but not without loss of program continuity. • For unrecoverable errors, the EIPV flag in the IA32_MCG_STATUS register indicates whether the instruction indicated by the instruction pointer pushed on the stack (when the exception was generated) is related to the error. If the flag is clear, the pushed instruction may not be related to the error. • The MCIP flag in the IA32_MCG_STATUS register indicates whether a machine-check exception was generated. Before returning from the machine-check exception handler, software should clear this flag so that it can be used reliably by an error logging utility. The MCIP flag also detects recursion. The machine-check architecture 15-28 Vol. 3B MACHINE-CHECK ARCHITECTURE does not support recursion. When the processor detects machine-check recursion, it enters the shutdown state. Example 15-2 gives typical steps carried out by a machine-check exception handler. Example 15-2. Machine-Check Exception Handler Pseudocode IF CPU supports MCE THEN IF CPU supports MCA THEN call errorlogging routine; (* returns restartability *) FI; ELSE (* Pentium(R) processor compatible *) READ P5_MC_ADDR READ P5_MC_TYPE; report RESTARTABILITY to console; FI; IF error is not restartable THEN report RESTARTABILITY to console; abort system; FI; CLEAR MCIP flag in IA32_MCG_STATUS; 15.10.2 Pentium Processor Machine-Check Exception Handling Machine-check exception handler on P6 family, Intel Atom and later processor families, should follow the guidelines described in Section 15.10.1 and Example 15-2 that check the processor’s support of MCA. NOTE On processors that support MCA (CPUID.1.EDX.MCA = 1) reading the P5_MC_TYPE and P5_MC_ADDR registers may produce invalid data. When machine-check exceptions are enabled for the Pentium processor (MCE flag is set in control register CR4), the machine-check exception handler uses the RDMSR instruction to read the error type from the P5_MC_TYPE register and the machine check address from the P5_MC_ADDR register. The handler then normally reports these register values to the system console before aborting execution (see Example 15-2). 15.10.3 Logging Correctable Machine-Check Errors The error handling routine for servicing the machine-check exceptions is responsible for logging uncorrected errors. If a machine-check error is correctable, the processor does not generate a machine-check exception for it. To detect correctable machine-check errors, a utility program must be written that reads each of the machine-check error-reporting register banks and logs the results in an accounting file or data structure. This utility can be implemented in either of the following ways. • • A system daemon that polls the register banks on an infrequent basis, such as hourly or daily. • An interrupt service routine servicing CMCI can read the MC banks and log the error. Please refer to Section 15.10.4.2 for guidelines on logging correctable machine checks. A user-initiated application that polls the register banks and records the exceptions. Here, the actual polling service is provided by an operating-system driver or through the system call interface. Example 15-3 gives pseudocode for an error logging utility. Vol. 3B 15-29 MACHINE-CHECK ARCHITECTURE Example 15-3. Machine-Check Error Logging Pseudocode Assume that execution is restartable; IF the processor supports MCA THEN FOR each bank of machine-check registers DO READ IA32_MCi_STATUS; IF VAL flag in IA32_MCi_STATUS = 1 THEN IF ADDRV flag in IA32_MCi_STATUS = 1 THEN READ IA32_MCi_ADDR; FI; IF MISCV flag in IA32_MCi_STATUS = 1 THEN READ IA32_MCi_MISC; FI; IF MCIP flag in IA32_MCG_STATUS = 1 (* Machine-check exception is in progress *) AND PCC flag in IA32_MCi_STATUS = 1 OR RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *) THEN RESTARTABILITY = FALSE; return RESTARTABILITY to calling procedure; FI; Save time-stamp counter and processor ID; Set IA32_MCi_STATUS to all 0s; Execute serializing instruction (i.e., CPUID); FI; OD; FI; If the processor supports the machine-check architecture, the utility reads through the banks of error-reporting registers looking for valid register entries. It then saves the values of the IA32_MCi_STATUS, IA32_MCi_ADDR, IA32_MCi_MISC and IA32_MCG_STATUS registers for each bank that is valid. The routine minimizes processing time by recording the raw data into a system data structure or file, reducing the overhead associated with polling. User utilities analyze the collected data in an off-line environment. When the MCIP flag is set in the IA32_MCG_STATUS register, a machine-check exception is in progress and the machine-check exception handler has called the exception logging routine. Once the logging process has been completed the exception-handling routine must determine whether execution can be restarted, which is usually possible when damage has not occurred (The PCC flag is clear, in the IA32_MCi_STATUS register) and when the processor can guarantee that execution is restartable (the RIPV flag is set in the IA32_MCG_STATUS register). If execution cannot be restarted, the system is not recoverable and the exception-handling routine should signal the console appropriately before returning the error status to the Operating System kernel for subsequent shutdown. The machine-check architecture allows buffering of exceptions from a given error-reporting bank although the Pentium 4, Intel Xeon, Intel Atom, and P6 family processors do not implement this feature. The error logging routine should provide compatibility with future processors by reading each hardware error-reporting bank's IA32_MCi_STATUS register and then writing 0s to clear the OVER and VAL flags in this register. The error logging utility should re-read the IA32_MCi_STATUS register for the bank ensuring that the valid bit is clear. The processor will write the next error into the register bank and set the VAL flags. Additional information that should be stored by the exception-logging routine includes the processor’s time-stamp counter value, which provides a mechanism to indicate the frequency of exceptions. A multiprocessing operating system stores the identity of the processor node incurring the exception using a unique identifier, such as the processor’s APIC ID (see Section 10.8, “Handling Interrupts”). The basic algorithm given in Example 15-3 can be modified to provide more robust recovery techniques. For example, software has the flexibility to attempt recovery using information unavailable to the hardware. Specifically, the machine-check exception handler can, after logging carefully analyze the error-reporting registers when the error-logging routine reports an error that does not allow execution to be restarted. These recovery techniques 15-30 Vol. 3B MACHINE-CHECK ARCHITECTURE can use external bus related model-specific information provided with the error report to localize the source of the error within the system and determine the appropriate recovery strategy. 15.10.4 Machine-Check Software Handler Guidelines for Error Recovery 15.10.4.1 Machine-Check Exception Handler for Error Recovery When writing a machine-check exception (MCE) handler to support software recovery from Uncorrected Recoverable (UCR) errors, consider the following: • When IA32_MCG_CAP [24] is zero, there are no recoverable errors supported and all machine-check are fatal exceptions. The logging of status and error information is therefore a baseline implementation requirement. • When IA32_MCG_CAP [24] is 1, certain uncorrected errors called uncorrected recoverable (UCR) errors may be software recoverable. The handler can analyze the reported error information, and in some cases attempt to recover from the uncorrected error and continue execution. • For processors on which CPUID reports DisplayFamily_DisplayModel as 06H_0EH and onward, an MCA signal is broadcast to all logical processors in the system (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). Due to the potentially shared machine check MSR resources among the logical processors on the same package/core, the MCE handler may be required to synchronize with the other processors that received a machine check error and serialize access to the machine check registers when analyzing, logging and clearing the information in the machine check registers. — On processors that indicate ability for local machine-check exception (MCG_LMCE_P), hardware can choose to report the error to only a single logical processor if system software has enabled LMCE by setting IA32_MCG_EXT_CTL[LMCE_EN] = 1 as outlined in Section 15.3.1.5. • The VAL (valid) flag in each IA32_MCi_STATUS register indicates whether the error information in the register is valid. If this flag is clear, the registers in that bank do not contain valid error information and should not be checked. • The MCE handler is primarily responsible for processing uncorrected errors. The UC flag in each IA32_MCi_Status register indicates whether the reported error was corrected (UC=0) or uncorrected (UC=1). The MCE handler can optionally log and clear the corrected errors in the MC banks if it can implement software algorithm to avoid the undesired race conditions with the CMCI or CMC polling handler. • For uncorrectable errors, the EIPV flag in the IA32_MCG_STATUS register indicates (when set) that the instruction pointed to by the instruction pointer pushed onto the stack when the machine-check exception is generated is directly associated with the error. When this flag is cleared, the instruction pointed to may not be associated with the error. • The MCIP flag in the IA32_MCG_STATUS register indicates whether a machine-check exception was generated. When a machine check exception is generated, it is expected that the MCIP flag in the IA32_MCG_STATUS register is set to 1. If it is not set, this machine check was generated by either an INT 18 instruction or some piece of hardware signaling an interrupt with vector 18. When IA32_MCG_CAP [24] is 1, the following rules can apply when writing a machine check exception (MCE) handler to support software recovery: • The PCC flag in each IA32_MCi_STATUS register indicates whether recovery from the error is possible for uncorrected errors (UC=1). If the PCC flag is set for enabled uncorrected errors (UC=1 and EN=1), recovery is not possible. When recovery is not possible, the MCE handler typically records the error information and signals the operating system to reset the system. • The RIPV flag in the IA32_MCG_STATUS register indicates whether restarting the program execution from the instruction pointer saved on the stack for the machine check exception is possible. When the RIPV is set, program execution can be restarted reliably when recovery is possible. If the RIPV flag is not set, program execution cannot be restarted reliably. In this case the recovery algorithm may involve terminating the current program execution and resuming an alternate thread of execution upon return from the machine check handler when recovery is possible. When recovery is not possible, the MCE handler signals the operating system to reset the system. Vol. 3B 15-31 MACHINE-CHECK ARCHITECTURE • When the EN flag is zero but the VAL and UC flags are one in the IA32_MCi_STATUS register, the reported uncorrected error in this bank is not enabled. As uncorrected errors with the EN flag = 0 are not the source of machine check exceptions, the MCE handler should log and clear non-enabled errors when the S bit is set and should continue searching for enabled errors from the other IA32_MCi_STATUS registers. Note that when IA32_MCG_CAP [24] is 0, any uncorrected error condition (VAL =1 and UC=1) including the one with the EN flag cleared are fatal and the handler must signal the operating system to reset the system. For the errors that do not generate machine check exceptions, the EN flag has no meaning. • When the VAL flag is one, the UC flag is one, the EN flag is one and the PCC flag is zero in the IA32_MCi_STATUS register, the error in this bank is an uncorrected recoverable (UCR) error. The MCE handler needs to examine the S flag and the AR flag to find the type of the UCR error for software recovery and determine if software error recovery is possible. • When both the S and the AR flags are clear in the IA32_MCi_STATUS register for the UCR error (VAL=1, UC=1, EN=x and PCC=0), the error in this bank is an uncorrected no-action required error (UCNA). UCNA errors are uncorrected but do not require any OS recovery action to continue execution. These errors indicate that some data in the system is corrupt, but that data has not been consumed and may not be consumed. If that data is consumed a non-UCNA machine check exception will be generated. UCNA errors are signaled in the same way as corrected machine check errors and the CMCI and CMC polling handler is primarily responsible for handling UCNA errors. Like corrected errors, the MCA handler can optionally log and clear UCNA errors as long as it can avoid the undesired race condition with the CMCI or CMC polling handler. As UCNA errors are not the source of machine check exceptions, the MCA handler should continue searching for uncorrected or software recoverable errors in all other MC banks. • When the S flag in the IA32_MCi_STATUS register is set for the UCR error ((VAL=1, UC=1, EN=1 and PCC=0), the error in this bank is software recoverable and it was signaled through a machine-check exception. The AR flag in the IA32_MCi_STATUS register further clarifies the type of the software recoverable errors. • When the AR flag in the IA32_MCi_STATUS register is clear for the software recoverable error (VAL=1, UC=1, EN=1, PCC=0 and S=1), the error in this bank is a software recoverable action optional (SRAO) error. The MCE handler and the operating system can analyze the IA32_MCi_STATUS [15:0] to implement MCA error code specific optional recovery action, but this recovery action is optional. System software can resume the program execution from the instruction pointer saved on the stack for the machine check exception when the RIPV flag in the IA32_MCG_STATUS register is set. • Even if the OVER flag in the IA32_MCi_STATUS register is set for the SRAO error (VAL=1, UC=1, EN=1, PCC=0, S=1 and AR=0), the MCE handler can take recovery action for the SRAO error logged in the IA32_MCi_STATUS register. Since the recovery action for SRAO errors is optional, restarting the program execution from the instruction pointer saved on the stack for the machine check exception is still possible for the overflowed SRAO error if the RIPV flag in the IA32_MCG_STATUS is set. • When the AR flag in the IA32_MCi_STATUS register is set for the software recoverable error (VAL=1, UC=1, EN=1, PCC=0 and S=1), the error in this bank is a software recoverable action required (SRAR) error. The MCE handler and the operating system must take recovery action in order to continue execution after the machinecheck exception. The MCA handler and the operating system need to analyze the IA32_MCi_STATUS [15:0] to determine the MCA error code specific recovery action. If no recovery action can be performed, the operating system must reset the system. • When the OVER flag in the IA32_MCi_STATUS register is set for the SRAR error (VAL=1, UC=1, EN=1, PCC=0, S=1 and AR=1), the MCE handler cannot take recovery action as the information of the SRAR error in the IA32_MCi_STATUS register was potentially lost due to the overflow condition. Since the recovery action for SRAR errors must be taken, the MCE handler must signal the operating system to reset the system. • When the MCE handler cannot find any uncorrected (VAL=1, UC=1 and EN=1) or any software recoverable errors (VAL=1, UC=1, EN=1, PCC=0 and S=1) in any of the IA32_MCi banks of the processors, this is an unexpected condition for the MCE handler and the handler should signal the operating system to reset the system. • Before returning from the machine-check exception handler, software must clear the MCIP flag in the IA32_MCG_STATUS register. The MCIP flag is used to detect recursion. The machine-check architecture does not support recursion. When the processor receives a machine check when MCIP is set, it automatically enters the shutdown state. Example 15-4 gives pseudocode for an MC exception handler that supports recovery of UCR. 15-32 Vol. 3B MACHINE-CHECK ARCHITECTURE Example 15-4. Machine-Check Error Handler Pseudocode Supporting UCR MACHINE CHECK HANDLER: (* Called from INT 18 handler *) NOERROR = TRUE; ProcessorCount = 0; IF CPU supports MCA THEN RESTARTABILITY = TRUE; IF (Processor Family = 6 AND DisplayModel ≥ 0EH) OR (Processor Family > 6) THEN IF ( MCG_LMCE = 1) MCA_BROADCAST = FALSE; ELSE MCA_BROADCAST = TRUE; FI; Acquire SpinLock; ProcessorCount++; (* Allowing one logical processor at a time to examine machine check registers *) CALL MCA ERROR PROCESSING; (* returns RESTARTABILITY and NOERROR *) ELSE MCA_BROADCAST = FALSE; (* Implement a rendezvous mechanism with the other processors if necessary *) CALL MCA ERROR PROCESSING; FI; ELSE (* Pentium(R) processor compatible *) READ P5_MC_ADDR READ P5_MC_TYPE; RESTARTABILITY = FALSE; FI; IF NOERROR = TRUE THEN IF NOT (MCG_RIPV = 1 AND MCG_EIPV = 0) THEN RESTARTABILITY = FALSE; FI FI; IF RESTARTABILITY = FALSE THEN Report RESTARTABILITY to console; Reset system; FI; IF MCA_BROADCAST = TRUE THEN IF ProcessorCount = MAX_PROCESSORS AND NOERROR = TRUE THEN Report RESTARTABILITY to console; Reset system; FI; Release SpinLock; Wait till ProcessorCount = MAX_PROCESSRS on system; (* implement a timeout and abort function if necessary *) FI; CLEAR IA32_MCG_STATUS; RESUME Execution; (* End of MACHINE CHECK HANDLER*) MCA ERROR PROCESSING: (* MCA Error Processing Routine called from MCA Handler *) IF MCIP flag in IA32_MCG_STATUS = 0 THEN (* MCIP=0 upon MCA is unexpected *) RESTARTABILITY = FALSE; FI; Vol. 3B 15-33 MACHINE-CHECK ARCHITECTURE FOR each bank of machine-check registers DO CLEAR_MC_BANK = FALSE; READ IA32_MCi_STATUS; IF VAL Flag in IA32_MCi_STATUS = 1 THEN IF UC Flag in IA32_MCi_STATUS = 1 THEN IF Bit 24 in IA32_MCG_CAP = 0 THEN (* the processor does not support software error recovery *) RESTARTABILITY = FALSE; NOERROR = FALSE; GOTO LOG MCA REGISTER; FI; (* the processor supports software error recovery *) IF EN Flag in IA32_MCi_STATUS = 0 AND OVER Flag in IA32_MCi_STATUS=0 THEN (* It is a spurious MCA Log. Log and clear the register *) CLEAR_MC_BANK = TRUE; GOTO LOG MCA REGISTER; FI; IF PCC = 1 and EN = 1 in IA32_MCi_STATUS THEN (* processor context might have been corrupted *) RESTARTABILITY = FALSE; ELSE (* It is a uncorrected recoverable (UCR) error *) IF S Flag in IA32_MCi_STATUS = 0 THEN IF AR Flag in IA32_MCi_STATUS = 0 THEN (* It is a uncorrected no action required (UCNA) error *) GOTO CONTINUE; (* let CMCI and CMC polling handler to process *) ELSE RESTARTABILITY = FALSE; (* S=0, AR=1 is illegal *) FI FI; IF RESTARTABILITY = FALSE THEN (* no need to take recovery action if RESTARTABILITY is already false *) NOERROR = FALSE; GOTO LOG MCA REGISTER; FI; (* S in IA32_MCi_STATUS = 1 *) IF AR Flag in IA32_MCi_STATUS = 1 THEN (* It is a software recoverable and action required (SRAR) error *) IF OVER Flag in IA32_MCi_STATUS = 1 THEN RESTARTABILITY = FALSE; NOERROR = FALSE; GOTO LOG MCA REGISTER; FI IF MCACOD Value in IA32_MCi_STATUS is recognized AND Current Processor is an Affected Processor THEN Implement MCACOD specific recovery action; CLEAR_MC_BANK = TRUE; ELSE RESTARTABILITY = FALSE; FI; ELSE (* It is a software recoverable and action optional (SRAO) error *) IF OVER Flag in IA32_MCi_STATUS = 0 AND MCACOD in IA32_MCi_STATUS is recognized THEN Implement MCACOD specific recovery action; FI; CLEAR_MC_BANK = TRUE; FI; AR FI; PCC NOERROR = FALSE; 15-34 Vol. 3B MACHINE-CHECK ARCHITECTURE GOTO LOG MCA REGISTER; ELSE (* It is a corrected error; continue to the next IA32_MCi_STATUS *) GOTO CONTINUE; FI; UC FI; VAL LOG MCA REGISTER: SAVE IA32_MCi_STATUS; If MISCV in IA32_MCi_STATUS THEN SAVE IA32_MCi_MISC; FI; IF ADDRV in IA32_MCi_STATUS THEN SAVE IA32_MCi_ADDR; FI; IF CLEAR_MC_BANK = TRUE THEN SET all 0 to IA32_MCi_STATUS; If MISCV in IA32_MCi_STATUS THEN SET all 0 to IA32_MCi_MISC; FI; IF ADDRV in IA32_MCi_STATUS THEN SET all 0 to IA32_MCi_ADDR; FI; FI; CONTINUE: OD; ( *END FOR *) RETURN; (* End of MCA ERROR PROCESSING*) 15.10.4.2 Corrected Machine-Check Handler for Error Recovery When writing a corrected machine check handler, which is invoked as a result of CMCI or called from an OS CMC Polling dispatcher, consider the following: • The VAL (valid) flag in each IA32_MCi_STATUS register indicates whether the error information in the register is valid. If this flag is clear, the registers in that bank does not contain valid error information and does not need to be checked. • The CMCI or CMC polling handler is responsible for logging and clearing corrected errors. The UC flag in each IA32_MCi_Status register indicates whether the reported error was corrected (UC=0) or not (UC=1). • When IA32_MCG_CAP [24] is one, the CMC handler is also responsible for logging and clearing uncorrected noaction required (UCNA) errors. When the UC flag is one but the PCC, S, and AR flags are zero in the IA32_MCi_STATUS register, the reported error in this bank is an uncorrected no-action required (UCNA) error. In cases when SRAO error are signaled as UCNA error via CMCI, software can perform recovery for those errors identified in Table 15-16. • In addition to corrected errors and UCNA errors, the CMC handler optionally logs uncorrected (UC=1 and PCC=1), software recoverable machine check errors (UC=1, PCC=0 and S=1), but should avoid clearing those errors from the MC banks. Clearing these errors may result in accidentally removing these errors before these errors are actually handled and processed by the MCE handler for attempted software error recovery. Example 15-5 gives pseudocode for a CMCI handler with UCR support. Vol. 3B 15-35 MACHINE-CHECK ARCHITECTURE Example 15-5. Corrected Error Handler Pseudocode with UCR Support Corrected Error HANDLER: (* Called from CMCI handler or OS CMC Polling Dispatcher*) IF CPU supports MCA THEN FOR each bank of machine-check registers DO READ IA32_MCi_STATUS; IF VAL flag in IA32_MCi_STATUS = 1 THEN IF UC Flag in IA32_MCi_STATUS = 0 (* It is a corrected error *) THEN GOTO LOG CMC ERROR; ELSE IF Bit 24 in IA32_MCG_CAP = 0 THEN GOTO CONTINUE; FI; IF S Flag in IA32_MCi_STATUS = 0 AND AR Flag in IA32_MCi_STATUS = 0 THEN (* It is a uncorrected no action required error *) GOTO LOG CMC ERROR FI IF EN Flag in IA32_MCi_STATUS = 0 THEN (* It is a spurious MCA error *) GOTO LOG CMC ERROR FI; FI; FI; GOTO CONTINUE; LOG CMC ERROR: SAVE IA32_MCi_STATUS; If MISCV Flag in IA32_MCi_STATUS THEN SAVE IA32_MCi_MISC; SET all 0 to IA32_MCi_MISC; FI; IF ADDRV Flag in IA32_MCi_STATUS THEN SAVE IA32_MCi_ADDR; SET all 0 to IA32_MCi_ADDR FI; SET all 0 to IA32_MCi_STATUS; CONTINUE: OD; ( *END FOR *) FI; 15-36 Vol. 3B 13.Updates to Chapter 16, Volume 3B Change bars and green text show changes to Chapter 16 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to document per-model channel decode based on bank number. Updated processor naming as necessary. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 24 CHAPTER 16 INTERPRETING MACHINE-CHECK ERROR CODES Encoding of the model-specific and other information fields is different across processor families. The differences are documented in the following sections. 16.1 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY 06H MACHINE ERROR CODES FOR MACHINE CHECK Section 16.1 provides information for interpreting additional model-specific fields for external bus errors relating to processor family 06H. The references to processor family 06H refers to only IA-32 processors with CPUID signatures listed in Table 16-1. Table 16-1. CPUID DisplayFamily_DisplayModel Signatures for Processor Family 06H DisplayFamily_DisplayModel Processor Families/Processor Number Series 06_0EH Intel Core Duo, Intel Core Solo processors 06_0DH Intel Pentium M processor 06_09H Intel Pentium M processor 06_7H, 06_08H, 06_0AH, 06_0BH Intel Pentium III Xeon Processor, Intel Pentium III Processor 06_03H, 06_05H Intel Pentium II Xeon Processor, Intel Pentium II Processor 06_01H Intel Pentium Pro Processor These errors are reported in the IA32_MCi_STATUS MSRs. They are reported architecturally as compound errors with a general form of 0000 1PPT RRRR IILL in the MCA error code field. See Chapter 15 for information on the interpretation of compound error codes. Incremental decoding information is listed in Table 16-2. Table 16-2. Incremental Decoding Information: Processor Family 06H Machine Error Codes For Machine Check Type Bit No. Bit Function Bit Description MCA error codes1 15:0 Model specific errors 18:16 Reserved Reserved Model specific errors 24:19 Bus queue request type 000000 for BQ_DCU_READ_TYPE error 000010 for BQ_IFU_DEMAND_TYPE error 000011 for BQ_IFU_DEMAND_NC_TYPE error 000100 for BQ_DCU_RFO_TYPE error 000101 for BQ_DCU_RFO_LOCK_TYPE error 000110 for BQ_DCU_ITOM_TYPE error 001000 for BQ_DCU_WB_TYPE error 001010 for BQ_DCU_WCEVICT_TYPE error 001011 for BQ_DCU_WCLINE_TYPE error 001100 for BQ_DCU_BTM_TYPE error Vol. 3B 16-1 INTERPRETING MACHINE-CHECK ERROR CODES Table 16-2. Incremental Decoding Information: Processor Family 06H Machine Error Codes For Machine Check Type Bit No. Bit Function Bit Description 001101 for BQ_DCU_INTACK_TYPE error 001110 for BQ_DCU_INVALL2_TYPE error 001111 for BQ_DCU_FLUSHL2_TYPE error 010000 for BQ_DCU_PART_RD_TYPE error 010010 for BQ_DCU_PART_WR_TYPE error 010100 for BQ_DCU_SPEC_CYC_TYPE error 011000 for BQ_DCU_IO_RD_TYPE error 011001 for BQ_DCU_IO_WR_TYPE error 011100 for BQ_DCU_LOCK_RD_TYPE error 011110 for BQ_DCU_SPLOCK_RD_TYPE error 011101 for BQ_DCU_LOCK_WR_TYPE error Model specific errors 27:25 Bus queue error type 000 for BQ_ERR_HARD_TYPE error 001 for BQ_ERR_DOUBLE_TYPE error 010 for BQ_ERR_AERR2_TYPE error 100 for BQ_ERR_SINGLE_TYPE error 101 for BQ_ERR_AERR1_TYPE error Model specific errors Other information 28 FRC error 1 if FRC error active 29 BERR 1 if BERR is driven 30 Internal BINIT 1 if BINIT driven for this processor 31 Reserved Reserved 34:32 Reserved Reserved 35 External BINIT 1 if BINIT is received from external bus. 36 Response parity error This bit is asserted in IA32_MCi_STATUS if this component has received a parity error on the RS[2:0]# pins for a response transaction. The RS signals are checked by the RSP# external pin. 37 Bus BINIT This bit is asserted in IA32_MCi_STATUS if this component has received a hard error response on a split transaction one access that has needed to be split across the 64-bit external bus interface into two accesses). 38 Timeout BINIT This bit is asserted in IA32_MCi_STATUS if this component has experienced a ROB time-out, which indicates that no micro-instruction has been retired for a predetermined period of time. A ROB time-out occurs when the 15-bit ROB time-out counter carries a 1 out of its high order bit. 2 The timer is cleared when a micro-instruction retires, an exception is detected by the core processor, RESET is asserted, or when a ROB BINIT occurs. The ROB time-out counter is prescaled by the 8-bit PIC timer which is a divide by 128 of the bus clock the bus clock is 1:2, 1:3, 1:4 of the core clock). When a carry out of the 8-bit PIC timer occurs, the ROB counter counts up by one. While this bit is asserted, it cannot be overwritten by another error. 16-2 Vol. 3B 41:39 Reserved Reserved 42 Hard error This bit is asserted in IA32_MCi_STATUS if this component has initiated a bus transactions which has received a hard error response. While this bit is asserted, it cannot be overwritten. INTERPRETING MACHINE-CHECK ERROR CODES Table 16-2. Incremental Decoding Information: Processor Family 06H Machine Error Codes For Machine Check Type Bit No. Bit Function Bit Description 43 IERR This bit is asserted in IA32_MCi_STATUS if this component has experienced a failure that causes the IERR pin to be asserted. While this bit is asserted, it cannot be overwritten. 44 AERR This bit is asserted in IA32_MCi_STATUS if this component has initiated 2 failing bus transactions which have failed due to Address Parity Errors AERR asserted). While this bit is asserted, it cannot be overwritten. 45 UECC The Uncorrectable ECC error bit is asserted in IA32_MCi_STATUS for uncorrected ECC errors. While this bit is asserted, the ECC syndrome field will not be overwritten. 46 CECC The correctable ECC error bit is asserted in IA32_MCi_STATUS for corrected ECC errors. 54:47 ECC syndrome The ECC syndrome field in IA32_MCi_STATUS contains the 8-bit ECC syndrome only if the error was a correctable/uncorrectable ECC error and there wasn't a previous valid ECC error syndrome logged in IA32_MCi_STATUS. A previous valid ECC error in IA32_MCi_STATUS is indicated by IA32_MCi_STATUS.bit45 uncorrectable error occurred) being asserted. After processing an ECC error, machine-check handling software should clear IA32_MCi_STATUS.bit45 so that future ECC error syndromes can be logged. 56:55 Status register validity indicators1 Reserved Reserved. 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 2. For processors with a CPUID signature of 06_0EH, a ROB time-out occurs when the 23-bit ROB time-out counter carries a 1 out of its high order bit. 16.2 INCREMENTAL DECODING INFORMATION: INTEL CORE 2 PROCESSOR FAMILY MACHINE ERROR CODES FOR MACHINE CHECK Table 16-4 provides information for interpreting additional model-specific fields for external bus errors relating to processor based on Intel Core microarchitecture, which implements the P4 bus specification. Table 16-3 lists the CPUID signatures for Intel 64 processors that are covered by Table 16-4. These errors are reported in the IA32_MCi_STATUS MSRs. They are reported architecturally as compound errors with a general form of 0000 1PPT RRRR IILL in the MCA error code field. See Chapter 15 for information on the interpretation of compound error codes. Table 16-3. CPUID DisplayFamily_DisplayModel Signatures for Processors Based on Intel Core Microarchitecture DisplayFamily_DisplayModel Processor Families/Processor Number Series 06_1DH Intel Xeon Processor 7400 series. 06_17H Intel Xeon Processor 5200, 5400 series, Intel Core 2 Quad processor Q9650. 06_0FH Intel Xeon Processor 3000, 3200, 5100, 5300, 7300 series, Intel Core 2 Quad, Intel Core 2 Extreme, Intel Core 2 Duo processors, Intel Pentium dual-core processors. Vol. 3B 16-3 INTERPRETING MACHINE-CHECK ERROR CODES Table 16-4. Incremental Bus Error Codes of Machine Check for Processors Based on Intel Core Microarchitecture Type Bit No. Bit Function Bit Description MCA error codes1 15:0 Model specific errors 18:16 Reserved Reserved Model specific errors 24:19 Bus queue request type ‘000001 for BQ_PREF_READ_TYPE error 000000 for BQ_DCU_READ_TYPE error 000010 for BQ_IFU_DEMAND_TYPE error 000011 for BQ_IFU_DEMAND_NC_TYPE error 000100 for BQ_DCU_RFO_TYPE error 000101 for BQ_DCU_RFO_LOCK_TYPE error 000110 for BQ_DCU_ITOM_TYPE error 001000 for BQ_DCU_WB_TYPE error 001010 for BQ_DCU_WCEVICT_TYPE error 001011 for BQ_DCU_WCLINE_TYPE error 001100 for BQ_DCU_BTM_TYPE error 001101 for BQ_DCU_INTACK_TYPE error 001110 for BQ_DCU_INVALL2_TYPE error 001111 for BQ_DCU_FLUSHL2_TYPE error 010000 for BQ_DCU_PART_RD_TYPE error 010010 for BQ_DCU_PART_WR_TYPE error 010100 for BQ_DCU_SPEC_CYC_TYPE error 011000 for BQ_DCU_IO_RD_TYPE error 011001 for BQ_DCU_IO_WR_TYPE error 011100 for BQ_DCU_LOCK_RD_TYPE error 011110 for BQ_DCU_SPLOCK_RD_TYPE error 011101 for BQ_DCU_LOCK_WR_TYPE error 100100 for BQ_L2_WI_RFO_TYPE error 100110 for BQ_L2_WI_ITOM_TYPE error Model specific errors 27:25 Bus queue error type ‘001 for Address Parity Error Model specific errors 28 MCE Driven 1 if MCE is driven 29 MCE Observed 1 if MCE is observed 30 Internal BINIT 1 if BINIT driven for this processor 31 BINIT Observed 1 if BINIT is observed for this processor 33:32 Reserved Reserved 34 PIC and FSB data parity Data Parity detected on either PIC or FSB access 35 Reserved Reserved ‘010 for Response Hard Error ‘011 for Response Parity Error Other information 16-4 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Table 16-4. Incremental Bus Error Codes of Machine Check for Processors Based on Intel Core Microarchitecture (Contd.) Type Bit No. Bit Function Bit Description 36 Response parity error This bit is asserted in IA32_MCi_STATUS if this component has received a parity error on the RS[2:0]# pins for a response transaction. The RS signals are checked by the RSP# external pin. 37 FSB address parity Address parity error detected: 1 = Address parity error detected 0 = No address parity error 38 Timeout BINIT This bit is asserted in IA32_MCi_STATUS if this component has experienced a ROB time-out, which indicates that no micro-instruction has been retired for a predetermined period of time. A ROB time-out occurs when the 23-bit ROB time-out counter carries a 1 out of its high order bit. The timer is cleared when a micro-instruction retires, an exception is detected by the core processor, RESET is asserted, or when a ROB BINIT occurs. The ROB time-out counter is prescaled by the 8-bit PIC timer which is a divide by 128 of the bus clock the bus clock is 1:2, 1:3, 1:4 of the core clock). When a carry out of the 8-bit PIC timer occurs, the ROB counter counts up by one. While this bit is asserted, it cannot be overwritten by another error. Status register validity indicators1 41:39 Reserved Reserved 42 Hard error This bit is asserted in IA32_MCi_STATUS if this component has initiated a bus transactions which has received a hard error response. While this bit is asserted, it cannot be overwritten. 43 IERR This bit is asserted in IA32_MCi_STATUS if this component has experienced a failure that causes the IERR pin to be asserted. While this bit is asserted, it cannot be overwritten. 44 Reserved Reserved 45 Reserved Reserved 46 Reserved Reserved 54:47 Reserved Reserved 56:55 Reserved Reserved. 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.2.1 Model-Specific Machine Check Error Codes for Intel Xeon Processor 7400 Series Intel Xeon processor 7400 series has machine check register banks that generally follows the description of Chapter 15 and Section 16.2. Additional error codes specific to Intel Xeon processor 7400 series is describe in this section. MC4_STATUS[63:0] is the main error logging for the processor’s L3 and front side bus errors for Intel Xeon processor 7400 series. It supports the L3 Errors, Bus and Interconnect Errors Compound Error Codes in the MCA Error Code Field. Vol. 3B 16-5 INTERPRETING MACHINE-CHECK ERROR CODES 16.2.1.1 Processor Machine Check Status Register Incremental MCA Error Code Definition Intel Xeon processor 7400 series use compound MCA Error Codes for logging its Bus internal machine check errors, L3 Errors, and Bus/Interconnect Errors. It defines incremental Machine Check error types (IA32_MC6_STATUS[15:0]) beyond those defined in Chapter 15. Table 16-5 lists these incremental MCA error code types that apply to IA32_MC6_STATUS. Error code details are specified in MC6_STATUS [31:16] (see Section 16.2.2), the “Model Specific Error Code” field. The information in the “Other_Info” field (MC4_STATUS[56:32]) is common to the three processor error types and contains a correctable event count and specifies the MC6_MISC register format. Table 16-5. Incremental MCA Error Code Types for Intel Xeon Processor 7400 Processor MCA_Error_Code (MC6_STATUS[15:0]) Type Error Code Binary Encoding Meaning C Internal Error 0000 0100 0000 0000 Internal Error Type Code B Bus and Interconnect 0000 100x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations Error 0000 101x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations 0000 110x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations 0000 1110 0000 1111 Bus and Interconnection Error Type Code 0000 1111 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations The Bold faced binary encodings are the only encodings used by the processor for MC4_STATUS[15:0]. 16.2.2 Intel Xeon Processor 7400 Model Specific Error Code Field 16.2.2.1 Processor Model Specific Error Code Field Type B: Bus and Interconnect Error Note: The Model Specific Error Code field in MC6_STATUS (bits 31:16). Table 16-6. Type B Bus and Interconnect Error Codes Bit Num Sub-Field Name Description 16 FSB Request Parity Parity error detected during FSB request phase 19:17 Reserved 20 FSB Hard Fail Response “Hard Failure“ response received for a local transaction 21 FSB Response Parity Parity error on FSB response field detected 22 FSB Data Parity FSB data parity error on inbound data detected 31:23 --- Reserved 16-6 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.2.2.2 Processor Model Specific Error Code Field Type C: Cache Bus Controller Error Table 16-7. Type C Cache Bus Controller Error Codes MC4_STATUS[31:16] (MSCE) Value Error Description 0000_0000_0000_0001 0001H Inclusion Error from Core 0 0000_0000_0000_0010 0002H Inclusion Error from Core 1 0000_0000_0000_0011 0003H Write Exclusive Error from Core 0 0000_0000_0000_0100 0004H Write Exclusive Error from Core 1 0000_0000_0000_0101 0005H Inclusion Error from FSB 0000_0000_0000_0110 0006H SNP Stall Error from FSB 0000_0000_0000_0111 0007H Write Stall Error from FSB 0000_0000_0000_1000 0008H FSB Arb Timeout Error 0000_0000_0000_1010 000AH Inclusion Error from Core 2 0000_0000_0000_1011 000BH Write Exclusive Error from Core 2 0000_0010_0000_0000 0200H Internal Timeout error 0000_0011_0000_0000 0300H Internal Timeout Error 0000_0100_0000_0000 0400H Intel® Cache Safe Technology Queue Full Error or Disabled-ways-in-a-set overflow 0000_0101_0000_0000 0500H Quiet cycle Timeout Error (correctable) 1100_0000_0000_0010 C002H Correctable ECC event on outgoing Core 0 data 1100_0000_0000_0100 C004H Correctable ECC event on outgoing Core 1 data 1100_0000_0000_1000 C008H Correctable ECC event on outgoing Core 2 data 1110_0000_0000_0010 E002H Uncorrectable ECC error on outgoing Core 0 data 1110_0000_0000_0100 E004H Uncorrectable ECC error on outgoing Core 1 data 1110_0000_0000_1000 E008H Uncorrectable ECC error on outgoing Core 2 data — all other encodings — Reserved 16.3 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_1AH, MACHINE ERROR CODES FOR MACHINE CHECK Table 16-8 through Table 16-12 provide information for interpreting additional model-specific fields for memory controller errors relating to the processor family with CPUID DisplayFamily_DisplaySignature 06_1AH, which supports Intel QuickPath Interconnect links. Incremental MC error codes related to the Intel QPI links are reported in the register banks IA32_MC0 and IA32_MC1, incremental error codes for internal machine check is reported in the register bank IA32_MC7, and incremental error codes for the memory controller unit is reported in the register banks IA32_MC8. Vol. 3B 16-7 INTERPRETING MACHINE-CHECK ERROR CODES 16.3.1 Intel QPI Machine Check Errors Table 16-8. Intel QPI Machine Check Error Codes for IA32_MC0_STATUS and IA32_MC1_STATUS Type 1 MCA error codes Bit No. Bit Function Bit Description 15:0 MCACOD Bus error format: 1PPTRRRRIILL 16 Header Parity if 1, QPI Header had bad parity 17 Data Parity If 1, QPI Data packet had bad parity 18 Retries Exceeded If 1, number of QPI retries was exceeded 19 Received Poison if 1, Received a data packet that was marked as poisoned by the sender 21:20 Reserved Reserved 22 Unsupported Message If 1, QPI received a message encoding it does not support 23 Unsupported Credit If 1, QPI credit type is not supported. 24 Receive Flit Overrun If 1, Sender sent too many QPI flits to the receiver. 25 Received Failed Response If 1, Indicates that sender sent a failed response to receiver. 26 Receiver Clock Jitter If 1, clock jitter detected in the internal QPI clocking 56:27 Reserved Reserved Model specific errors Status register validity indicators1 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Table 16-9. Intel QPI Machine Check Error Codes for IA32_MC0_MISC and IA32_MC1_MISC Type Model specific Bit No. Bit Function Bit Description 7:0 QPI Opcode Message class and opcode from the packet with the error 13:8 RTId QPI Request Transaction ID 15:14 Reserved Reserved 18:16 RHNID QPI Requestor/Home Node ID 23:19 Reserved Reserved 24 IIB QPI Interleave/Head Indication Bit errors1 NOTES: 1. Which of these fields are valid depends on the error type. 16.3.2 Internal Machine Check Errors Table 16-10. Machine Check Error Codes for IA32_MC7_STATUS Type Bit No. Bit Function MCA error codes1 15:0 MCACOD Model specific errors 16-8 Vol. 3B Bit Description INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description 23:16 Reserved Reserved 31:24 Reserved except for the following 00h - No Error 03h - Reset firmware did not complete 08h - Received an invalid CMPD 0Ah - Invalid Power Management Request 0Dh - Invalid S-state transition 11h - VID controller does not match POC controller selected 1Ah - MSID from POC does not match CPU MSID 56:32 Status register validity indicators1 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.3.3 Memory Controller Errors Table 16-11. Incremental Memory Controller Error Codes of Machine Check for IA32_MC8_STATUS Type MCA error codes1 Bit No. Bit Function Bit Description 15:0 MCACOD Memory error format: 1MMMCCCC 16 Read ECC error if 1, ECC occurred on a read 17 RAS ECC error If 1, ECC occurred on a scrub 18 Write parity error If 1, bad parity on a write 19 Redundancy loss if 1, Error in half of redundant memory 20 Reserved Reserved 21 Memory range error If 1, Memory access out of range 22 RTID out of range If 1, Internal ID invalid 23 Address parity error If 1, bad address parity 24 Byte enable parity error If 1, bad enable parity 37:25 Reserved Reserved 52:38 CORE_ERR_CNT Corrected error count 56:53 Reserved Reserved Model specific errors Other information Status register validity indicators1 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Vol. 3B 16-9 INTERPRETING MACHINE-CHECK ERROR CODES Table 16-12. Incremental Memory Controller Error Codes of Machine Check for IA32_MC8_MISC Type Model specific Bit No. Bit Function Bit Description 7:0 RTId Transaction Tracker ID 15:8 Reserved Reserved 17:16 DIMM DIMM ID which got the error 19:18 Channel Channel ID which got the error 31:20 Reserved Reserved 63:32 Syndrome ECC Syndrome errors1 NOTES: 1. Which of these fields are valid depends on the error type. 16.4 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_2DH, MACHINE ERROR CODES FOR MACHINE CHECK Table 16-13 through Table 16-15 provide information for interpreting additional model-specific fields for memory controller errors relating to the processor family with CPUID DisplayFamily_DisplaySignature 06_2DH, which supports Intel QuickPath Interconnect links. Incremental MC error codes related to the Intel QPI links are reported in the register banks IA32_MC6 and IA32_MC7, incremental error codes for internal machine check error from PCU controller is reported in the register bank IA32_MC4, and incremental error codes for the memory controller unit is reported in the register banks IA32_MC8-IA32_MC11. 16.4.1 Internal Machine Check Errors Table 16-13. Machine Check Error Codes for IA32_MC4_STATUS Type Bit No. Bit Function MCA error codes1 15:0 MCACOD Model specific errors 19:16 Reserved except for the following Bit Description 0000b - No Error 0001b - Non_IMem_Sel 0010b - I_Parity_Error 0011b - Bad_OpCode 0100b - I_Stack_Underflow 0101b - I_Stack_Overflow 0110b - D_Stack_Underflow 0111b - D_Stack_Overflow 1000b - Non-DMem_Sel 1001b - D_Parity_Error 16-10 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description 23:20 Reserved Reserved 31:24 Reserved except for the following 00h - No Error 0Dh - MC_IMC_FORCE_SR_S3_TIMEOUT 0Eh - MC_CPD_UNCPD_ST_TIMEOUT 0Fh - MC_PKGS_SAFE_WP_TIMEOUT 43h - MC_PECI_MAILBOX_QUIESCE_TIMEOUT 5Ch - MC_MORE_THAN_ONE_LT_AGENT 60h - MC_INVALID_PKGS_REQ_PCH 61h - MC_INVALID_PKGS_REQ_QPI 62h - MC_INVALID_PKGS_RES_QPI 63h - MC_INVALID_PKGC_RES_PCH 64h - MC_INVALID_PKG_STATE_CONFIG 70h - MC_WATCHDG_TIMEOUT_PKGC_SLAVE 71h - MC_WATCHDG_TIMEOUT_PKGC_MASTER 72h - MC_WATCHDG_TIMEOUT_PKGS_MASTER 7ah - MC_HA_FAILSTS_CHANGE_DETECTED 81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT 56:32 Status register validity indicators1 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.4.2 Intel QPI Machine Check Errors Table 16-14. Intel QPI MC Error Codes for IA32_MC6_STATUS and IA32_MC7_STATUS Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Bus error format: 1PPTRRRRIILL 56:16 Reserved Reserved Model specific errors Status register validity indicators1 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.4.3 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC8_STATUSIA32_MC11_STATUS. The supported error codes are follows the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture,”). MSR_ERROR_CONTROL.[bit 1] can enable additional informaVol. 3B 16-11 INTERPRETING MACHINE-CHECK ERROR CODES tion logging of the IMC. The additional error information logged by the IMC is stored in IA32_MCi_STATUS and IA32_MCi_MISC; (i = 8, 11). Table 16-15. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 8, 11) Type Bit No. Bit Function Bit Description MCA error codes 15:0 MCACOD Bus error format: 1PPTRRRRIILL Model specific errors 31:16 Reserved except for the following 001H - Address parity error 002H - HA Wrt buffer Data parity error 004H - HA Wrt byte enable parity error 008H - Corrected patrol scrub error 010H - Uncorrected patrol scrub error 020H - Corrected spare error 040H - Uncorrected spare error Model specific errors 36:32 Other info When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first device error when corrected error is detected during normal read. 37 Reserved Reserved 1 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Table 16-16. Intel IMC MC Error Codes for IA32_MCi_MISC (i= 8, 11) Type Bit No. MCA addr info1 Bit Function Bit Description 8:0 See Chapter 15, “Machine-Check Architecture,” Model specific errors 13:9 • When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second device error when corrected error is detected during normal read. • Otherwise contain parity error if MCi_Status indicates HA_WB_Data or HA_W_BE parity error. Model specific errors 29:14 ErrMask_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error bit mask. Model specific errors 45:30 ErrMask_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error bit mask. 50:46 FailRank_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error failing rank. 55:51 FailRank_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error failing rank. 58:56 Reserved Reserved 61:59 Reserved Reserved 62 Valid_1stErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data from the first correctable error in a memory device. 63 Valid_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data due to a second correctable error in a memory device. Use this information only after there is valid first error info indicated by bit 62. NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-12 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.5 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_3EH, MACHINE ERROR CODES FOR MACHINE CHECK Intel Xeon processor E5 v2 family and Intel Xeon processor E7 v2 family are based on the Ivy Bridge-EP microarchitecture and can be identified with CPUID DisplayFamily_DisplaySignature 06_3EH. Incremental error codes for internal machine check error from PCU controller is reported in the register bank IA32_MC4, Table lists modelspecific fields to interpret error codes applicable to IA32_MC4_STATUS. Incremental MC error codes related to the Intel QPI links are reported in the register banks IA32_MC5. Information listed in Table 16-14 for QPI MC error code apply to IA32_MC5_STATUS. Incremental error codes for the memory controller unit is reported in the register banks IA32_MC9-IA32_MC16. Table 16-18 lists model-specific error codes apply to IA32_MCi_STATUS, i = 9-16. 16.5.1 Internal Machine Check Errors Table 16-17. Machine Check Error Codes for IA32_MC4_STATUS Type MCA error codes1 Model specific errors Bit No. Bit Function 15:0 MCACOD 19:16 Reserved except for the following Bit Description 0000b - No Error 0001b - Non_IMem_Sel 0010b - I_Parity_Error 0011b - Bad_OpCode 0100b - I_Stack_Underflow 0101b - I_Stack_Overflow 0110b - D_Stack_Underflow 0111b - D_Stack_Overflow 1000b - Non-DMem_Sel 1001b - D_Parity_Error 23:20 Reserved Reserved 31:24 Reserved except for the following 00h - No Error 0Dh - MC_IMC_FORCE_SR_S3_TIMEOUT 0Eh - MC_CPD_UNCPD_ST_TIMEOUT 0Fh - MC_PKGS_SAFE_WP_TIMEOUT 43h - MC_PECI_MAILBOX_QUIESCE_TIMEOUT 44h - MC_CRITICAL_VR_FAILED 45h - MC_ICC_MAX-NOTSUPPORTED 5Ch - MC_MORE_THAN_ONE_LT_AGENT 60h - MC_INVALID_PKGS_REQ_PCH 61h - MC_INVALID_PKGS_REQ_QPI 62h - MC_INVALID_PKGS_RES_QPI 63h - MC_INVALID_PKGC_RES_PCH 64h - MC_INVALID_PKG_STATE_CONFIG 70h - MC_WATCHDG_TIMEOUT_PKGC_SLAVE 71h - MC_WATCHDG_TIMEOUT_PKGC_MASTER 72h - MC_WATCHDG_TIMEOUT_PKGS_MASTER Vol. 3B 16-13 INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description 7Ah - MC_HA_FAILSTS_CHANGE_DETECTED 7Bh - MC_PCIE_R2PCIE-RW_BLOCK_ACK_TIMEOUT 81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT 56:32 Status register validity indicators1 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.5.2 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC9_STATUSIA32_MC16_STATUS. The supported error codes are follows the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture”). MSR_ERROR_CONTROL.[bit 1] can enable additional information logging of the IMC. The additional error information logged by the IMC is stored in IA32_MCi_STATUS and IA32_MCi_MISC; (i = 9-16). IA32_MCi_STATUS (i=9-12) log errors from the first memory controller. The second memory controller logs into IA32_MCi_STATUS (i=13-16). Table 16-18. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 9-16) Type MCA error codes1 Model specific errors Bit No. Bit Function Bit Description 15:0 MCACOD Memory Controller error format: 000F 0000 1MMM CCCC 31:16 Reserved except for the following 001H - Address parity error 002H - HA Wrt buffer Data parity error 004H - HA Wrt byte enable parity error 008H - Corrected patrol scrub error 010H - Uncorrected patrol scrub error 020H - Corrected spare error 040H - Uncorrected spare error 080H - Corrected memory read error. (Only applicable with iMC’s “Additional Error logging” Mode-1 enabled.) 100H - iMC, WDB, parity errors 36:32 Other info When MSR_ERROR_CONTROL.[1] is set, logs an encoded value from the first error device. 37 56:38 Status register validity indicators1 Reserved Reserved See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-14 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Table 16-19. Intel IMC MC Error Codes for IA32_MCi_MISC (i= 9-16) Type Bit No. MCA addr info1 8:0 Bit Function See Chapter 15, “Machine-Check Architecture,” Model specific errors 13:9 If the error logged is MCWrDataPar error or MCWrBEPar error, this field is the WDB ID that has the parity error. OR if the second error logged is a correctable read error, MC logs the second error device in this field. Model specific errors 29:14 ErrMask_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error bit mask. Model specific errors 45:30 ErrMask_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error bit mask. 50:46 FailRank_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error failing rank. 55:51 FailRank_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error failing rank. 61:56 Bit Description Reserved 62 Valid_1stErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data from a correctable error from memory read associated with first error device. 63 Valid_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data due to a second correctable error in a memory device. Use this information only after there is valid first error info indicated by bit 62. NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.5.3 Home Agent Machine Check Errors Memory errors from the first memory controller may be logged in the IA32_MC7_{STATUS,ADDR,MISC} registers, while the second memory controller logs to IA32_MC8_{STATUS,ADDR,MISC}. 16.6 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_3FH, MACHINE ERROR CODES FOR MACHINE CHECK Intel Xeon processor E5 v3 family is based on the Haswell-E microarchitecture and can be identified with CPUID DisplayFamily_DisplaySignature 06_3FH. Incremental error codes for internal machine check error from PCU controller is reported in the register bank IA32_MC4, Table 16-20 lists model-specific fields to interpret error codes applicable to IA32_MC4_STATUS. Incremental MC error codes related to the Intel QPI links are reported in the register banks IA32_MC5, IA32_MC20, and IA32_MC21. Information listed in Table 16-21 for QPI MC error codes. Incremental error codes for the memory controller unit is reported in the register banks IA32_MC9-IA32_MC16. Table 16-22 lists model-specific error codes apply to IA32_MCi_STATUS, i = 9-16. Vol. 3B 16-15 INTERPRETING MACHINE-CHECK ERROR CODES 16.6.1 Internal Machine Check Errors Table 16-20. Machine Check Error Codes for IA32_MC4_STATUS Type 1 MCA error codes 2 MCACOD Bit No. Bit Function 15:0 MCACOD 15:0 Internal Errors Bit Description 0402h - PCU internal Errors 0403h - PCU internal Errors 0406h - Intel TXT Errors 0407h - Other UBOX internal Errors. On an IERR caused by a core 3-strike the IA32_MC3_STATUS (MLC) is copied to the IA32_MC4_STATUS (After a 3-strike, the core MCA banks will be unavailable). Model specific errors 19:16 Reserved except for the following 0000b - No Error 23:20 Reserved Reserved 31:24 Reserved except for the following 00h - No Error 00xxb - PCU internal error 09h - MC_MESSAGE_CHANNEL_TIMEOUT 13h - MC_DMI_TRAINING_TIMEOUT 15h - MC_DMI_CPU_RESET_ACK_TIMEOUT 1Eh - MC_VR_ICC_MAX_LT_FUSED_ICC_MAX 25h - MC_SVID_COMMAND_TIMEOUT 29h - MC_VR_VOUT_MAC_LT_FUSED_SVID 2Bh - MC_PKGC_WATCHDOG_HANG_CBZ_DOWN 2Ch - MC_PKGC_WATCHDOG_HANG_CBZ_UP 44h - MC_CRITICAL_VR_FAILED 46h - MC_VID_RAMP_DOWN_FAILED 49h - MC_SVID_WRITE_REG_VOUT_MAX_FAILED 4Bh - MC_BOOT_VID_TIMEOUT. Timeout setting boot VID for DRAM 0. 4Fh - MC_SVID_COMMAND_ERROR. 52h - MC_FIVR_CATAS_OVERVOL_FAULT. 53h - MC_FIVR_CATAS_OVERCUR_FAULT. 57h - MC_SVID_PKGC_REQUEST_FAILED 58h - MC_SVID_IMON_REQUEST_FAILED 59h - MC_SVID_ALERT_REQUEST_FAILED 62h - MC_INVALID_PKGS_RSP_QPI 64h - MC_INVALID_PKG_STATE_CONFIG 67h - MC_HA_IMC_RW_BLOCK_ACK_TIMEOUT 6Ah - MC_MSGCH_PMREQ_CMP_TIMEOUT 72h - MC_WATCHDG_TIMEOUT_PKGS_MASTER 81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT 56:32 Status register validity indicators1 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-16 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 2. The internal error codes may be model-specific. 16.6.2 Intel QPI Machine Check Errors MC error codes associated with the Intel QPI agents are reported in the MSRs IA32_MC5_STATUS, IA32_MC20_STATUS, and IA32_MC21_STATUS. The supported error codes follow the architectural MCACOD definition type 1PPTRRRRIILL (see Chapter 15, “Machine-Check Architecture,”). Table 16-21 lists model-specific fields to interpret error codes applicable to IA32_MC5_STATUS, IA32_MC20_STATUS, and IA32_MC21_STATUS. Table 16-21. Intel QPI MC Error Codes for IA32_MCi_STATUS (i = 5, 20, 21) Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Bus error format: 1PPTRRRRIILL Model specific errors 31:16 MSCOD 02h - Intel QPI physical layer detected drift buffer alarm. 03h - Intel QPI physical layer detected latency buffer rollover. 10h - Intel QPI link layer detected control error from R3QPI. 11h - Rx entered LLR abort state on CRC error. 12h - Unsupported or undefined packet. 13h - Intel QPI link layer control error. 15h - RBT used un-initialized value. 20h - Intel QPI physical layer detected a QPI in-band reset but aborted initialization 21h - Link failover data self-healing 22h - Phy detected in-band reset (no width change). 23h - Link failover clock failover 30h -Rx detected CRC error - successful LLR after Phy re-init. 31h -Rx detected CRC error - successful LLR without Phy re-init. All other values are reserved. Status register validity indicators1 37:32 Reserved 52:38 Corrected Error Cnt 56:53 Reserved Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.6.3 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC9_STATUSIA32_MC16_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture,”). MSR_ERROR_CONTROL.[bit 1] can enable additional information logging of the IMC. The additional error information logged by the IMC is stored in IA32_MCi_STATUS and IA32_MCi_MISC; (i = 9-16). IA32_MCi_STATUS (i=9-12) log errors from the first memory controller. The second memory controller logs into IA32_MCi_STATUS (i=13-16). Vol. 3B 16-17 INTERPRETING MACHINE-CHECK ERROR CODES Table 16-22. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 9-16) Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Memory Controller error format: 0000 0000 1MMM CCCC Model specific errors 31:16 Reserved except for the following 0001H - DDR3 address parity error 0002H - Uncorrected HA write data error 0004H - Uncorrected HA data byte enable error 0008H - Corrected patrol scrub error 0010H - Uncorrected patrol scrub error 0020H - Corrected spare error 0040H - Uncorrected spare error 0080H - Corrected memory read error. (Only applicable with iMC’s “Additional Error logging” Mode-1 enabled.) 0100H - iMC, write data buffer parity errors 0200H - DDR4 command address parity error 36:32 Other info When MSR_ERROR_CONTROL.[1] is set, logs an encoded value from the first error device. 37 Reserved 56:38 Status register validity indicators1 Reserved See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Table 16-23. Intel IMC MC Error Codes for IA32_MCi_MISC (i= 9-16) Type Bit No. MCA addr info1 Bit Function 8:0 See Chapter 15, “Machine-Check Architecture,” Model specific errors 13:9 If the error logged is MCWrDataPar error or MCWrBEPar error, this field is the WDB ID that has the parity error. OR if the second error logged is a correctable read error, MC logs the second error device in this field. Model specific errors 29:14 ErrMask_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error bit mask. Model specific errors 45:30 ErrMask_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error bit mask. 50:46 FailRank_1stErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error failing rank. 55:51 FailRank_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error failing rank. 61:56 16-18 Vol. 3B Bit Description Reserved 62 Valid_1stErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data from a correctable error from memory read associated with first error device. 63 Valid_2ndErrDev When MSR_ERROR_CONTROL.[1] is set, indicates the iMC has logged valid data due to a second correctable error in a memory device. Use this information only after there is valid first error info indicated by bit 62. INTERPRETING MACHINE-CHECK ERROR CODES NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.6.4 Home Agent Machine Check Errors Memory errors from the first memory controller may be logged in the IA32_MC7_{STATUS,ADDR,MISC} registers, while the second memory controller logs to IA32_MC8_{STATUS,ADDR,MISC}. 16.7 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_56H, MACHINE ERROR CODES FOR MACHINE CHECK Intel Xeon processor D family is based on the Broadwell microarchitecture and can be identified with CPUID DisplayFamily_DisplaySignature 06_56H. Incremental error codes for internal machine check error from PCU controller is reported in the register bank IA32_MC4, Table 16-24 lists model-specific fields to interpret error codes applicable to IA32_MC4_STATUS. Incremental error codes for the memory controller unit is reported in the register banks IA32_MC9-IA32_MC10. Table 16-18 lists model-specific error codes apply to IA32_MCi_STATUS, i = 9-10. 16.7.1 Internal Machine Check Errors Table 16-24. Machine Check Error Codes for IA32_MC4_STATUS Type MCA error codes1 MCACOD2 Bit No. Bit Function 15:0 MCACOD 15:0 internal Errors Bit Description 0402h - PCU internal Errors 0403h - internal Errors 0406h - Intel TXT Errors 0407h - Other UBOX internal Errors. On an IERR caused by a core 3-strike the IA32_MC3_STATUS (MLC) is copied to the IA32_MC4_STATUS (After a 3-strike, the core MCA banks will be unavailable). Model specific errors 19:16 Reserved except for the following 0000b - No Error 00x1b - PCU internal error 001xb - PCU internal error Vol. 3B 16-19 INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description 23:20 Reserved except for the following x1xxb - UBOX error 31:24 Reserved except for the following 00h - No Error 09h - MC_MESSAGE_CHANNEL_TIMEOUT 13h - MC_DMI_TRAINING_TIMEOUT 15h - MC_DMI_CPU_RESET_ACK_TIMEOUT 1Eh - MC_VR_ICC_MAX_LT_FUSED_ICC_MAX 25h - MC_SVID_COMMAND_TIMEOUT 26h - MCA_PKGC_DIRECT_WAKE_RING_TIMEOUT 29h - MC_VR_VOUT_MAC_LT_FUSED_SVID 2Bh - MC_PKGC_WATCHDOG_HANG_CBZ_DOWN 2Ch - MC_PKGC_WATCHDOG_HANG_CBZ_UP 44h - MC_CRITICAL_VR_FAILED 46h - MC_VID_RAMP_DOWN_FAILED 49h - MC_SVID_WRITE_REG_VOUT_MAX_FAILED 4Bh - MC_PP1_BOOT_VID_TIMEOUT. Timeout setting boot VID for DRAM 0. 4Fh - MC_SVID_COMMAND_ERROR. 52h - MC_FIVR_CATAS_OVERVOL_FAULT. 53h - MC_FIVR_CATAS_OVERCUR_FAULT. 57h - MC_SVID_PKGC_REQUEST_FAILED 58h - MC_SVID_IMON_REQUEST_FAILED 59h - MC_SVID_ALERT_REQUEST_FAILED 62h - MC_INVALID_PKGS_RSP_QPI 64h - MC_INVALID_PKG_STATE_CONFIG 67h - MC_HA_IMC_RW_BLOCK_ACK_TIMEOUT 6Ah - MC_MSGCH_PMREQ_CMP_TIMEOUT 72h - MC_WATCHDG_TIMEOUT_PKGS_MASTER 81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT 56:32 Status register validity indicators1 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 2. The internal error codes may be model-specific. 16.7.2 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC9_STATUSIA32_MC10_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture,”). MSR_ERROR_CONTROL.[bit 1] can enable additional information logging of the IMC. The additional error information logged by the IMC is stored in IA32_MCi_STATUS and IA32_MCi_MISC; (i = 9-10). 16-20 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Table 16-25. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 9-10) Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Memory Controller error format: 0000 0000 1MMM CCCC Model specific errors 31:16 Reserved except for the following 0001H - DDR3 address parity error 0002H - Uncorrected HA write data error 0004H - Uncorrected HA data byte enable error 0008H - Corrected patrol scrub error 0010H - Uncorrected patrol scrub error 0100H - iMC, write data buffer parity errors 0200H - DDR4 command address parity error 36:32 Other info Reserved 37 Reserved Reserved 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.8 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_4FH, MACHINE ERROR CODES FOR MACHINE CHECK Next Generation Intel Xeon processor E5 family is based on the Broadwell microarchitecture and can be identified with CPUID DisplayFamily_DisplaySignature 06_4FH. Incremental error codes for internal machine check error from PCU controller is reported in the register bank IA32_MC4, Table 16-20 in Section 16.6.1lists model-specific fields to interpret error codes applicable to IA32_MC4_STATUS. Incremental MC error codes related to the Intel QPI links are reported in the register banks IA32_MC5, IA32_MC20, and IA32_MC21. Information listed in Table 16-21 of Section 16.6.1 covers QPI MC error codes. 16.8.1 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC9_STATUSIA32_MC16_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture”). Table 16-26 lists model-specific error codes apply to IA32_MCi_STATUS, i = 9-16. IA32_MCi_STATUS (i=9-12) log errors from the first memory controller. The second memory controller logs into IA32_MCi_STATUS (i=13-16). Vol. 3B 16-21 INTERPRETING MACHINE-CHECK ERROR CODES Table 16-26. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 9-16) Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Memory Controller error format: 0000 0000 1MMM CCCC Model specific errors 31:16 Reserved except for the following 0001H - DDR3 address parity error 0002H - Uncorrected HA write data error 0004H - Uncorrected HA data byte enable error 0008H - Corrected patrol scrub error 0010H - Uncorrected patrol scrub error 0020H - Corrected spare error 0040H - Uncorrected spare error 0100H - iMC, write data buffer parity errors 0200H - DDR4 command address parity error 36:32 Other info Reserved 37 Reserved Reserved 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.8.2 Home Agent Machine Check Errors MC error codes associated with mirrored memory corrections are reported in the MSRs IA32_MC7_MISC and IA32_MC8_MISC. Table 16-27 lists model-specific error codes apply to IA32_MCi_MISC, i = 7, 8. Memory errors from the first memory controller may be logged in the IA32_MC7_{STATUS,ADDR,MISC} registers, while the second memory controller logs to IA32_MC8_{STATUS,ADDR,MISC}. Table 16-27. Intel HA MC Error Codes for IA32_MCi_MISC (i= 7, 8) Bit No. Bit Function Bit Description 5:0 LSB See Figure 15-8. 8:6 Address Mode See Table 15-3. 40:9 Reserved Reserved 41 Failover Error occurred at a pair of mirrored memory channels. Error was corrected by mirroring with channel failover. 42 Mirrorcorr Error was corrected by mirroring and primary channel scrubbed successfully. 63:43 Reserved Reserved 16.9 INCREMENTAL DECODING INFORMATION: INTEL® XEON® PROCESSOR SCALABLE FAMILY, MACHINE ERROR CODES FOR MACHINE CHECK In the Intel® Xeon® Processor Scalable Family with CPUID DisplayFamily_DisplaySignature 06_55H, incremental error codes for internal machine check errors from the PCU controller are reported in the register bank IA32_MC4. Table 16-28 in Section 16.9.1 lists model-specific fields to interpret error codes applicable to IA32_MC4_STATUS. 16-22 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.9.1 Internal Machine Check Errors Table 16-28. Machine Check Error Codes for IA32_MC4_STATUS Type 1 MCA error codes 2 MCACOD Bit No. Bit Function 15:0 MCACOD 15:0 Internal Errors Bit Description 0402h - PCU internal Errors 0403h - PCU internal Errors 0406h - Intel TXT Errors 0407h - Other UBOX internal Errors. On an IERR caused by a core 3-strike the IA32_MC3_STATUS (MLC) is copied to the IA32_MC4_STATUS (After a 3-strike, the core MCA banks will be unavailable). Model specific errors 19:16 Reserved except for the following 0000b - No Error 23:20 Reserved Reserved 31:24 Reserved except for the following 00h - No Error 00xxb - PCU internal error 0Dh - MCA_DMI_TRAINING_TIMEOUT 0Fh - MCA_DMI_CPU_RESET_ACK_TIMEOUT 10h - MCA_MORE_THAN_ONE_LT_AGENT 1Eh - MCA_BIOS_RST_CPL_INVALID_SEQ 1Fh - MCA_BIOS_INVALID_PKG_STATE_CONFIG 25h - MCA_MESSAGE_CHANNEL_TIMEOUT 27h - MCA_MSGCH_PMREQ_CMP_TIMEOUT 30h - MCA_PKGC_DIRECT_WAKE_RING_TIMEOUT 31h - MCA_PKGC_INVALID_RSP_PCH 33h - MCA_PKGC_WATCHDOG_HANG_CBZ_DOWN 34h - MCA_PKGC_WATCHDOG_HANG_CBZ_UP 38h - MCA_PKGC_WATCHDOG_HANG_C3_UP_SF 40h - MCA_SVID_VCCIN_VR_ICC_MAX_FAILURE 41h - MCA_SVID_COMMAND_TIMEOUT 42h - MCA_SVID_VCCIN_VR_VOUT_MAX_FAILURE 43h - MCA_SVID_CPU_VR_CAPABILITY_ERROR 44h - MCA_SVID_CRITICAL_VR_FAILED 45h - MCA_SVID_SA_ITD_ERROR 46h - MCA_SVID_READ_REG_FAILED 47h - MCA_SVID_WRITE_REG_FAILED 48h - MCA_SVID_PKGC_INIT_FAILED Vol. 3B 16-23 INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description 49h - MCA_SVID_PKGC_CONFIG_FAILED 4Ah - MCA_SVID_PKGC_REQUEST_FAILED 4Bh - MCA_SVID_IMON_REQUEST_FAILED 4Ch - MCA_SVID_ALERT_REQUEST_FAILED 4Dh - MCA_SVID_MCP_VP_ABSENT_OR_RAMP_ERROR 4Eh - MCA_SVID_UNEXPECTED_MCP_VP_DETECTED 51h - MCA_FIVR_CATAS_OVERVOL_FAULT 52h - MCA_FIVR_CATAS_OVERCUR_FAULT 58h - MCA_WATCHDG_TIMEOUT_PKGC_SLAVE 59h - MCA_WATCHDG_TIMEOUT_PKGC_MASTER 5Ah - MCA_WATCHDG_TIMEOUT_PKGS_MASTER 61h - MCA_PKGS_CPD_UNPCD_TIMEOUT 63h - MCA_PKGS_INVALID_REQ_PCH 64h - MCA_PKGS_INVALID_REQ_INTERNAL 65h - MCA_PKGS_INVALID_RSP_INTERNAL 6Bh - MCA_PKGS_SMBUS_VPP_PAUSE_TIMEOUT 81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT Status register validity indicators1 52:32 Reserved Reserved 54:53 CORR_ERR_STATUS Reserved 56:55 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 2. The internal error codes may be model-specific. 16.9.2 Interconnect Machine Check Errors MC error codes associated with the link interconnect agents are reported in the MSRs IA32_MC5_STATUS, IA32_MC12_STATUS, IA32_MC19_STATUS. The supported error codes follow the architectural MCACOD definition type 1PPTRRRRIILL (see Chapter 15, “Machine-Check Architecture”). Table 16-29 lists model-specific fields to interpret error codes applicable to IA32_MCi_STATUS, i= 5, 12, 19. Table 16-29. Interconnect MC Error Codes for IA32_MCi_STATUS, i = 5, 12, 19 Type Bit No. Bit Function Bit Description MCA error codes1 15:0 MCACOD Bus error format: 1PPTRRRRIILL The two supported compound error codes: - 0x0C0F - Unsupported/Undefined Packet - 0x0E0F - For all other corrected and uncorrected errors 16-24 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Type Bit No. Bit Function Bit Description Model specific errors 21:16 MSCOD The encoding of Uncorrectable (UC) errors are: 00h - UC Phy Initialization Failure. 01h - UC Phy detected drift buffer alarm. 02h - UC Phy detected latency buffer rollover. 10h - UC link layer Rx detected CRC error: unsuccessful LLR entered abort state 11h - UC LL Rx unsupported or undefined packet. 12h - UC LL or Phy control error. 13h - UC LL Rx parameter exchange exception. 1fh - UC LL detected control error from the link-mesh interface The encoding of correctable (COR) errors are: 20h - COR Phy initialization abort 21h - COR Phy reset 22h - COR Phy lane failure, recovery in x8 width. 23h - COR Phy L0c error corrected without Phy reset 24h - COR Phy L0c error triggering Phy reset 25h - COR Phy L0p exit error corrected with Phy reset 30h - COR LL Rx detected CRC error - successful LLR without Phy re-init. 31h - COR LL Rx detected CRC error - successful LLR with Phy re-init. All other values are reserved. 31:22 MSCOD_SPARE The definition below applies to MSCOD 12h (UC LL or Phy Control Errors) [Bit 22] : Phy Control Error [Bit 23] : Unexpected Retry.Ack flit [Bit 24] : Unexpected Retry.Req flit [Bit 25] : RF parity error [Bit 26] : Routeback Table error [Bit 27] : unexpected Tx Protocol flit (EOP, Header or Data) [Bit 28] : Rx Header-or-Credit BGF credit overflow/underflow [Bit 29] : Link Layer Reset still in progress when Phy enters L0 (Phy training should not be enabled until after LL reset is complete as indicated by KTILCL.LinkLayerReset going back to 0). [Bit 30] : Link Layer reset initiated while protocol traffic not idle [Bit 31] : Link Layer Tx Parity Error Status register validity indicators1 37:32 Reserved 52:38 Corrected Error Cnt 56:53 Reserved Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Vol. 3B 16-25 INTERPRETING MACHINE-CHECK ERROR CODES 16.9.3 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC13_STATUSIA32_MC18_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture”). IA32_MCi_STATUS (i=13,14,17) log errors from the first memory controller. The second memory controller logs into IA32_MCi_STATUS (i=15,16,18). Table 16-30. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 13-18) Type Bit No. Bit Function Bit Description MCA error codes 15:0 MCACOD Memory Controller error format: 0000 0000 1MMM CCCC Model specific errors 31:16 Reserved except for the following 0001H - Address parity error 1 0002H - HA write data parity error 0004H - HA write byte enable parity error 0008H - Corrected patrol scrub error 0010H - Uncorrected patrol scrub error 0020H - Corrected spare error 0040H - Uncorrected spare error 0080H - Any HA read error 0100H - WDB read parity error 0200H - DDR4 command address parity error 0400H - Uncorrected address parity error 0800H - Unrecognized request type 0801H - Read response to an invalid scoreboard entry 0802H - Unexpected read response 0803H - DDR4 completion to an invalid scoreboard entry 0804H - Completion to an invalid scoreboard entry 0805H - Completion FIFO overflow 0806H - Correctable parity error 0807H - Uncorrectable error 0808H - Interrupt received while outstanding interrupt was not ACKed 0809H - ERID FIFO overflow 080aH - Error on Write credits 080bH - Error on Read credits 080cH - Scheduler error 080dH - Error event 36:32 Other info MC logs the first error device. This is an encoded 5-bit value of the device. 37 Reserved Reserved 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-26 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.9.4 M2M Machine Check Errors MC error codes associated with M2M are reported in the MSRs IA32_MC7_STATUS, IA32_MC8_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “MachineCheck Architecture,”). Table 16-31. M2M MC Error Codes for IA32_MCi_STATUS (i= 7-8) Type Bit No. Bit Function Bit Description MCA error codes 15:0 MCACOD Compound error format: 0000 0000 1MMM CCCC Model specific errors 16 MscodDataRdErr Logged an MC read data error 17 Reserved Reserved 18 MscodPtlWrErr Logged an MC partial write data error 19 MscodFullWrErr Logged a full write data error 20 MscodBgfErr Logged an M2M clock-domain-crossing buffer (BGF) error 21 MscodTimeOut Logged an M2M time out 22 MscodParErr Logged an M2M tracker parity error 23 MscodBucket1Err Logged a fatal Bucket1 error 31:24 Reserved Reserved 36:32 Other info MC logs the first error device. This is an encoded 5-bit value of the device. 37 Reserved Reserved 1 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture,” 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16.9.5 Home Agent Machine Check Errors MC error codes associated with mirrored memory corrections are reported in the MSRs IA32_MC7_MISC and IA32_MC8_MISC. Table 16-32 lists model-specific error codes apply to IA32_MCi_MISC, i = 7, 8. Memory errors from the first memory controller may be logged in the IA32_MC7_{STATUS,ADDR,MISC} registers, while the second memory controller logs to IA32_MC8_{STATUS,ADDR,MISC}. Table 16-32. Intel HA MC Error Codes for IA32_MCi_MISC (i= 7, 8) Bit No. Bit Function Bit Description 5:0 LSB See Figure 15-8. 8:6 Address Mode See Table 15-3. 40:9 Reserved Reserved 61:41 Reserved Reserved 62 Mirrorcorr Error was corrected by mirroring and primary channel scrubbed successfully. 63 Failover Error occurred at a pair of mirrored memory channels. Error was corrected by mirroring with channel failover. Vol. 3B 16-27 INTERPRETING MACHINE-CHECK ERROR CODES 16.10 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE 06_5FH, MACHINE ERROR CODES FOR MACHINE CHECK In Intel® Atom™ processors based on Goldmont Microarchitecture with CPUID DisplayFamily_DisplaySignature 06_5FH (code name Denverton), incremental error codes for the memory controller unit are reported in the register banks IA32_MC6 and IA32_MC7. Table 16-33 in Section 16.10.1 lists model-specific fields to interpret error codes applicable to IA32_MCi_STATUS, i = 6, 7. 16.10.1 Integrated Memory Controller Machine Check Errors MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC6_STATUS and IA32_MC7_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, “Machine-Check Architecture”). Table 16-33. Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 6, 7) Type MCA error codes1 Model specific errors Bit No. Bit Function 15:0 MCACOD 31:16 Reserved except for the following Bit Description 01h - Cmd/Addr parity 02h - Corrected Demand/Patrol Scrub Error 04h - Uncorrected patrol scrub error 08h - Uncorrected demand read error 10h - WDB read ECC 36:32 Other info 37 Reserved 56:38 Status register validity indicators1 See Chapter 15, “Machine-Check Architecture”. 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-28 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.11 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY 0FH MACHINE ERROR CODES FOR MACHINE CHECK Table 16-34 provides information for interpreting additional family 0FH model-specific fields for external bus errors. These errors are reported in the IA32_MCi_STATUS MSRs. They are reported architecturally) as compound errors with a general form of 0000 1PPT RRRR IILL in the MCA error code field. See Chapter 15 for information on the interpretation of compound error codes. Table 16-34. Incremental Decoding Information: Processor Family 0FH Machine Error Codes For Machine Check Type Bit No. Bit Function MCA error codes1 15:0 Model-specific error codes 16 FSB address parity Bit Description Address parity error detected: 1 = Address parity error detected 0 = No address parity error 17 Response hard fail Hardware failure detected on response 18 Response parity Parity error detected on response 19 PIC and FSB data parity Data Parity detected on either PIC or FSB access 20 Processor Signature = 00000F04H: Invalid PIC request Processor Signature = 00000F04H. Indicates error due to an invalid PIC request access was made to PIC space with WB memory): 1 = Invalid PIC request error 0 = No Invalid PIC request error Reserved All other processors: Reserved Other Information 21 Pad state machine The state machine that tracks P and N data-strobe relative timing has become unsynchronized or a glitch has been detected. 22 Pad strobe glitch Data strobe glitch 23 Pad address glitch Address strobe glitch 56:24 Reserved Reserved Status register 63:57 validity indicators1 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. Table 16-10 provides information on interpreting additional family 0FH, model specific fields for cache hierarchy errors. These errors are reported in one of the IA32_MCi_STATUS MSRs. These errors are reported, architecturally, as compound errors with a general form of 0000 0001 RRRR TTLL in the MCA error code field. See Chapter 15 for how to interpret the compound error code. 16.11.1 Model-Specific Machine Check Error Codes for Intel Xeon Processor MP 7100 Series Intel Xeon processor MP 7100 series has 5 register banks which contains information related to Machine Check Errors. MCi_STATUS[63:0] refers to all 5 register banks. MC0_STATUS[63:0] through MC3_STATUS[63:0] is the same as on previous generation of Intel Xeon processors within Family 0FH. MC4_STATUS[63:0] is the main error Vol. 3B 16-29 INTERPRETING MACHINE-CHECK ERROR CODES logging for the processor’s L3 and front side bus errors. It supports the L3 Errors, Bus and Interconnect Errors Compound Error Codes in the MCA Error Code Field. Table 16-35. MCi_STATUS Register Bit Definition Bit Field Name Bits Description MCA_Error_Code 15:0 Specifies the machine check architecture defined error code for the machine check error condition detected. The machine check architecture defined error codes are guaranteed to be the same for all Intel Architecture processors that implement the machine check architecture. See tables below Model_Specific_E 31:16 rror_Code Specifies the model specific error code that uniquely identifies the machine check error condition detected. The model specific error codes may differ among Intel Architecture processors for the same Machine Check Error condition. See tables below Other_Info 56:32 The functions of the bits in this field are implementation specific and are not part of the machine check architecture. Software that is intended to be portable among Intel Architecture processors should not rely on the values in this field. PCC 57 Processor Context Corrupt flag indicates that the state of the processor might have been corrupted by the error condition detected and that reliable restarting of the processor may not be possible. When clear, this flag indicates that the error did not affect the processor's state. This bit will always be set for MC errors which are not corrected. ADDRV 58 MC_ADDR register valid flag indicates that the MC_ADDR register contains the address where the error occurred. When clear, this flag indicates that the MC_ADDR register does not contain the address where the error occurred. The MC_ADDR register should not be read if the ADDRV bit is clear. MISCV 59 MC_MISC register valid flag indicates that the MC_MISC register contains additional information regarding the error. When clear, this flag indicates that the MC_MISC register does not contain additional information regarding the error. MC_MISC should not be read if the MISCV bit is not set. EN 60 Error enabled flag indicates that reporting of the machine check exception for this error was enabled by the associated flag bit of the MC_CTL register. Note that correctable errors do not have associated enable bits in the MC_CTL register so the EN bit should be clear when a correctable error is logged. UC 61 Error uncorrected flag indicates that the processor did not correct the error condition. When clear, this flag indicates that the processor was able to correct the event condition. OVER 62 Machine check overflow flag indicates that a machine check error occurred while the results of a previous error were still in the register bank (i.e., the VAL bit was already set in the MC_STATUS register). The processor sets the OVER flag and software is responsible for clearing it. Enabled errors are written over disabled errors, and uncorrected errors are written over corrected events. Uncorrected errors are not written over previous valid uncorrected errors. VAL 63 MC_STATUS register valid flag indicates that the information within the MC_STATUS register is valid. When this flag is set, the processor follows the rules given for the OVER flag in the MC_STATUS register when overwriting previously valid entries. The processor sets the VAL flag and software is responsible for clearing it. 16.11.1.1 Processor Machine Check Status Register MCA Error Code Definition Intel Xeon processor MP 7100 series use compound MCA Error Codes for logging its CBC internal machine check errors, L3 Errors, and Bus/Interconnect Errors. It defines additional Machine Check error types (IA32_MC4_STATUS[15:0]) beyond those defined in Chapter 15. Table 16-36 lists these model-specific MCA error codes. Error code details are specified in MC4_STATUS [31:16] (see Section 16.11.3), the “Model Specific Error Code” field. The information in the “Other_Info” field (MC4_STATUS[56:32]) is common to the three processor error types and contains a correctable event count and specifies the MC4_MISC register format. 16-30 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES Table 16-36. Incremental MCA Error Code for Intel Xeon Processor MP 7100 Processor MCA_Error_Code (MC4_STATUS[15:0]) Type Error Code Binary Encoding Meaning C Internal Error 0000 0100 0000 0000 Internal Error Type Code A L3 Tag Error 0000 0001 0000 1011 L3 Tag Error Type Code B Bus and Interconnect 0000 100x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations Error 0000 101x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations 0000 110x 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations 0000 1110 0000 1111 Bus and Interconnection Error Type Code 0000 1111 0000 1111 Not used but this encoding is reserved for compatibility with other MCA implementations The Bold faced binary encodings are the only encodings used by the processor for MC4_STATUS[15:0]. 16.11.2 Other_Info Field (all MCA Error Types) The MC4_STATUS[56:32] field is common to the processor's three MCA error types (A, B & C). Table 16-37. Other Information Field Bit Definition Bit Field Name Bits Description 39:32 8-bit Correctable Event Count Holds a count of the number of correctable events since cold reset. This is a saturating counter; the counter begins at 1 (with the first error) and saturates at a count of 255. 41:40 MC4_MISC format type The value in this field specifies the format of information in the MC4_MISC register. Currently, only two values are defined. Valid only when MISCV is asserted. 43:42 – Reserved 51:44 ECC syndrome ECC syndrome value for a correctable ECC event when the “Valid ECC syndrome” bit is asserted 52 Valid ECC syndrome Set when correctable ECC event supplies the ECC syndrome 54:53 Threshold-Based 00: No tracking - No hardware status tracking is provided for the structure reporting this event. Error Status 01: Green - Status tracking is provided for the structure posting the event; the current status is green (below threshold). 10: Yellow - Status tracking is provided for the structure posting the event; the current status is yellow (above threshold). 11: Reserved for future use Valid only if Valid bit (bit 63) is set Undefined if the UC bit (bit 61) is set 56:55 – Reserved Vol. 3B 16-31 INTERPRETING MACHINE-CHECK ERROR CODES 16.11.3 Processor Model Specific Error Code Field 16.11.3.1 MCA Error Type A: L3 Error Note: The Model Specific Error Code field in MC4_STATUS (bits 31:16). Table 16-38. Type A: L3 Error Codes Bit Num Sub-Field Name Description Legal Value(s) 18:16 L3 Error Code Describes the L3 error encountered 000 - No error 001 - More than one way reporting a correctable event 010 - More than one way reporting an uncorrectable error 011 - More than one way reporting a tag hit 100 - No error 101 - One way reporting a correctable event 110 - One way reporting an uncorrectable error 111 - One or more ways reporting a correctable event while one or more ways are reporting an uncorrectable error 20:19 – Reserved 00 31:21 – Fixed pattern 0010_0000_000 16.11.3.2 Processor Model Specific Error Code Field Type B: Bus and Interconnect Error Note: The Model Specific Error Code field in MC4_STATUS (bits 31:16). Table 16-39. Type B Bus and Interconnect Error Codes Bit Num Sub-Field Name Description 16 FSB Request Parity Parity error detected during FSB request phase 17 Core0 Addr Parity Parity error detected on Core 0 request’s address field 18 Core1 Addr Parity Parity error detected on Core 1 request’s address field 19 Reserved 20 FSB Response Parity Parity error on FSB response field detected 21 FSB Data Parity FSB data parity error on inbound data detected 22 Core0 Data Parity Data parity error on data received from Core 0 detected 23 Core1 Data Parity Data parity error on data received from Core 1 detected 24 IDS Parity Detected an Enhanced Defer parity error (phase A or phase B) 25 FSB Inbound Data ECC Data ECC event to error on inbound data (correctable or uncorrectable) 26 FSB Data Glitch Pad logic detected a data strobe ‘glitch’ (or sequencing error) 27 FSB Address Glitch Pad logic detected a request strobe ‘glitch’ (or sequencing error) 31:28 --- Reserved Exactly one of the bits defined in the preceding table will be set for a Bus and Interconnect Error. The Data ECC can be correctable or uncorrectable (the MC4_STATUS.UC bit, of course, distinguishes between correctable and uncorrectable cases with the Other_Info field possibly providing the ECC Syndrome for correctable errors). All other errors for this processor MCA Error Type are uncorrectable. 16-32 Vol. 3B INTERPRETING MACHINE-CHECK ERROR CODES 16.11.3.3 Processor Model Specific Error Code Field Type C: Cache Bus Controller Error Table 16-40. Type C Cache Bus Controller Error Codes MC4_STATUS[31:16] (MSCE) Value Error Description 0000_0000_0000_0001 0001H Inclusion Error from Core 0 0000_0000_0000_0010 0002H Inclusion Error from Core 1 0000_0000_0000_0011 0003H Write Exclusive Error from Core 0 0000_0000_0000_0100 0004H Write Exclusive Error from Core 1 0000_0000_0000_0101 0005H Inclusion Error from FSB 0000_0000_0000_0110 0006H SNP Stall Error from FSB 0000_0000_0000_0111 0007H Write Stall Error from FSB 0000_0000_0000_1000 0008H FSB Arb Timeout Error 0000_0000_0000_1001 0009H CBC OOD Queue Underflow/overflow 0000_0001_0000_0000 0100H Enhanced Intel SpeedStep Technology TM1-TM2 Error 0000_0010_0000_0000 0200H Internal Timeout error 0000_0011_0000_0000 0300H Internal Timeout Error 0000_0100_0000_0000 0400H Intel® Cache Safe Technology Queue Full Error or Disabled-ways-in-a-set overflow 1100_0000_0000_0001 C001H Correctable ECC event on outgoing FSB data 1100_0000_0000_0010 C002H Correctable ECC event on outgoing Core 0 data 1100_0000_0000_0100 C004H Correctable ECC event on outgoing Core 1 data 1110_0000_0000_0001 E001H Uncorrectable ECC error on outgoing FSB data 1110_0000_0000_0010 E002H Uncorrectable ECC error on outgoing Core 0 data 1110_0000_0000_0100 E004H Uncorrectable ECC error on outgoing Core 1 data — all other encodings — Reserved Vol. 3B 16-33 INTERPRETING MACHINE-CHECK ERROR CODES All errors - except for the correctable ECC types - in this table are uncorrectable. The correctable ECC events may supply the ECC syndrome in the Other_Info field of the MC4_STATUS MSR. Table 16-41. Decoding Family 0FH Machine Check Codes for Cache Hierarchy Errors Type Bit No. MCA error codes1 15:0 Model specific error codes 17:16 Bit Function Bit Description Tag Error Code Contains the tag error code for this machine check error: 00 = No error detected 01 = Parity error on tag miss with a clean line 10 = Parity error/multiple tag match on tag hit 11 = Parity error/multiple tag match on tag miss 19:18 Data Error Code Contains the data error code for this machine check error: 00 = No error detected 01 = Single bit error 10 = Double bit error on a clean line 11 = Double bit error on a modified line 20 L3 Error This bit is set if the machine check error originated in the L3 it can be ignored for invalid PIC request errors): 1 = L3 error 0 = L2 error 21 Invalid PIC Request Indicates error due to invalid PIC request access was made to PIC space with WB memory): 1 = Invalid PIC request error 0 = No invalid PIC request error Other Information Status register validity indicators1 31:22 Reserved Reserved 39:32 8-bit Error Count Holds a count of the number of errors since reset. The counter begins at 0 for the first error and saturates at a count of 255. 56:40 Reserved Reserved 63:57 NOTES: 1. These fields are architecturally defined. Refer to Chapter 15, “Machine-Check Architecture,” for more information. 16-34 Vol. 3B 14.Updates to Chapter 18, Volume 3B Change bars and green text show changes to Chapter 18 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------Changes to this chapter: Replaced “MSR_PERF_FIXED_CTR” naming with “IA32_FIXED_CTR” in various places. Update to Section 18.3.4.4.4 “Precise Distribution of Instructions Retired (PDIR)”. Update to Section 18.5.3.1.2 “Reduced Skid PEBS”. Update to Section 18.5.5.3 “Precise Distribution Support on Fixed Counter 0”. Intel® Atom updates to Section 18.5.3.2 “Offcore Response Event” and Section 18.5.5.4 “Compatibility Enhancements to Offcore Response MSRs”. Update to LBR Entries description in Table 18-91 “MSR_PEBS_CFG Programming”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 25 CHAPTER 18 PERFORMANCE MONITORING Intel 64 and IA-32 architectures provide facilities for monitoring performance via a PMU (Performance Monitoring Unit). 18.1 PERFORMANCE MONITORING OVERVIEW Performance monitoring was introduced in the Pentium processor with a set of model-specific performance-monitoring counter MSRs. These counters permit selection of processor performance parameters to be monitored and measured. The information obtained from these counters can be used for tuning system and compiler performance. In Intel P6 family of processors, the performance monitoring mechanism was enhanced to permit a wider selection of events to be monitored and to allow greater control events to be monitored. Next, Intel processors based on Intel NetBurst microarchitecture introduced a distributed style of performance monitoring mechanism and performance events. The performance monitoring mechanisms and performance events defined for the Pentium, P6 family, and Intel processors based on Intel NetBurst microarchitecture are not architectural. They are all model specific (not compatible among processor families). Intel Core Solo and Intel Core Duo processors support a set of architectural performance events and a set of non-architectural performance events. Newer Intel processor generations support enhanced architectural performance events and non-architectural performance events. Starting with Intel Core Solo and Intel Core Duo processors, there are two classes of performance monitoring capabilities. The first class supports events for monitoring performance using counting or interrupt-based event sampling usage. These events are non-architectural and vary from one processor model to another. They are similar to those available in Pentium M processors. These non-architectural performance monitoring events are specific to the microarchitecture and may change with enhancements. They are discussed in Section 18.6.3, “Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture).” Non-architectural events for a given microarchitecture cannot be enumerated using CPUID; and they are listed in Chapter 19, “Performance Monitoring Events.” The second class of performance monitoring capabilities is referred to as architectural performance monitoring. This class supports the same counting and Interrupt-based event sampling usages, with a smaller set of available events. The visible behavior of architectural performance events is consistent across processor implementations. Availability of architectural performance monitoring capabilities is enumerated using the CPUID.0AH. These events are discussed in Section 18.2. See also: — Section 18.2, “Architectural Performance Monitoring” — Section 18.3, “Performance Monitoring (Intel® Core™ Processors and Intel® Xeon® Processors)” • Section 18.3.1, “Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Nehalem” • Section 18.3.2, “Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Westmere” • • Section 18.3.3, “Intel® Xeon® Processor E7 Family Performance Monitoring Facility” • • • Section 18.3.4, “Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Sandy Bridge” Section 18.3.5, “3rd Generation Intel® Core™ Processor Performance Monitoring Facility” Section 18.3.6, “4th Generation Intel® Core™ Processor Performance Monitoring Facility” Section 18.3.7, “5th Generation Intel® Core™ Processor and Intel® Core™ M Processor Performance Monitoring Facility” Vol. 3B 18-1 PERFORMANCE MONITORING • Section 18.3.8, “6th Generation, 7th Generation and 8th Generation Intel® Core™ Processor Performance Monitoring Facility” • Section 18.3.9, “Next Generation Intel® Core™ Processor Performance Monitoring Facility” — Section 18.4, “Performance monitoring (Intel® Xeon™ Phi Processors)” • Section 18.4.1, “Intel® Xeon Phi™ Processor 7200/5200/3200 Performance Monitoring” — Section 18.5, “Performance Monitoring (Intel® Atom™ Processors)” • • • • • Section 18.5.1, “Performance Monitoring (45 nm and 32 nm Intel® Atom™ Processors)” Section 18.5.2, “Performance Monitoring for Silvermont Microarchitecture” Section 18.5.3, “Performance Monitoring for Goldmont Microarchitecture” Section 18.5.4, “Performance Monitoring for Goldmont Plus Microarchitecture” Section 18.5.5, “Performance Monitoring for Tremont Microarchitecture” — Section 18.6, “Performance Monitoring (Legacy Intel Processors)” • • • • Section 18.6.1, “Performance Monitoring (Intel® Core™ Solo and Intel® Core™ Duo Processors)” Section 18.6.2, “Performance Monitoring (Processors Based on Intel® Core™ Microarchitecture)” Section 18.6.3, “Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture)” Section 18.6.4, “Performance Monitoring and Intel Hyper-Threading Technology in Processors Based on Intel NetBurst® Microarchitecture” • • • • • • Section 18.6.4.5, “Counting Clocks on systems with Intel Hyper-Threading Technology in Processors Based on Intel NetBurst® Microarchitecture” Section 18.6.5, “Performance Monitoring and Dual-Core Technology” Section 18.6.6, “Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache” Section 18.6.7, “Performance Monitoring on L3 and Caching Bus Controller Sub-Systems” Section 18.6.8, “Performance Monitoring (P6 Family Processor)” Section 18.6.9, “Performance Monitoring (Pentium Processors)” — Section 18.7, “Counting Clocks” — Section 18.8, “IA32_PERF_CAPABILITIES MSR Enumeration” — Section 18.9, “PEBS Facility” 18.2 ARCHITECTURAL PERFORMANCE MONITORING Performance monitoring events are architectural when they behave consistently across microarchitectures. Intel Core Solo and Intel Core Duo processors introduced architectural performance monitoring. The feature provides a mechanism for software to enumerate performance events and provides configuration and counting facilities for events. Architectural performance monitoring does allow for enhancement across processor implementations. The CPUID.0AH leaf provides version ID for each enhancement. Intel Core Solo and Intel Core Duo processors support base level functionality identified by version ID of 1. Processors based on Intel Core microarchitecture support, at a minimum, the base level functionality of architectural performance monitoring. Intel Core 2 Duo processor T 7700 and newer processors based on Intel Core microarchitecture support both the base level functionality and enhanced architectural performance monitoring identified by version ID of 2. 45 nm and 32 nm Intel Atom processors and Intel Atom processors based on the Silvermont microarchitecture support the functionality provided by versionID 1, 2, and 3; CPUID.0AH:EAX[7:0] reports versionID = 3 to indicate the aggregate of architectural performance monitoring capabilities. Intel Atom processors based on the Airmont microarchitecture support the same performance monitoring capabilities as those based on the Silvermont microarchitecture. 18-2 Vol. 3B PERFORMANCE MONITORING Intel Core processors and related Intel Xeon processor families based on the Nehalem through Broadwell microarchitectures support version ID 1, 2, and 3. Intel processors based on the Skylake, Kaby Lake and Coffee Lake microarchitectures support versionID 4. Next generation Intel Atom processors are based on the Goldmont microarchitecture. Intel processors based on the Goldmont microarchitecture support versionID 4. 18.2.1 Architectural Performance Monitoring Version 1 Configuring an architectural performance monitoring event involves programming performance event select registers. There are a finite number of performance event select MSRs (IA32_PERFEVTSELx MSRs). The result of a performance monitoring event is reported in a performance monitoring counter (IA32_PMCx MSR). Performance monitoring counters are paired with performance monitoring select registers. Performance monitoring select registers and counters are architectural in the following respects: • • • • Bit field layout of IA32_PERFEVTSELx is consistent across microarchitectures. Addresses of IA32_PERFEVTSELx MSRs remain the same across microarchitectures. Addresses of IA32_PMC MSRs remain the same across microarchitectures. Each logical processor has its own set of IA32_PERFEVTSELx and IA32_PMCx MSRs. Configuration facilities and counters are not shared between logical processors sharing a processor core. Architectural performance monitoring provides a CPUID mechanism for enumerating the following information: • Number of performance monitoring counters available to software in a logical processor (each IA32_PERFEVTSELx MSR is paired to the corresponding IA32_PMCx MSR). • • Number of bits supported in each IA32_PMCx. Number of architectural performance monitoring events supported in a logical processor. Software can use CPUID to discover architectural performance monitoring availability (CPUID.0AH). The architectural performance monitoring leaf provides an identifier corresponding to the version number of architectural performance monitoring available in the processor. The version identifier is retrieved by querying CPUID.0AH:EAX[bits 7:0] (see Chapter 3, “Instruction Set Reference, A-L,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). If the version identifier is greater than zero, architectural performance monitoring capability is supported. Software queries the CPUID.0AH for the version identifier first; it then analyzes the value returned in CPUID.0AH.EAX, CPUID.0AH.EBX to determine the facilities available. In the initial implementation of architectural performance monitoring; software can determine how many IA32_PERFEVTSELx/ IA32_PMCx MSR pairs are supported per core, the bit-width of PMC, and the number of architectural performance monitoring events available. 18.2.1.1 Architectural Performance Monitoring Version 1 Facilities Architectural performance monitoring facilities include a set of performance monitoring counters and performance event select registers. These MSRs have the following properties: • IA32_PMCx MSRs start at address 0C1H and occupy a contiguous block of MSR address space; the number of MSRs per logical processor is reported using CPUID.0AH:EAX[15:8]. Note that this may vary from the number of physical counters present on the hardware, because an agent running at a higher privilege level (e.g., a VMM) may not expose all counters. • IA32_PERFEVTSELx MSRs start at address 186H and occupy a contiguous block of MSR address space. Each performance event select register is paired with a corresponding performance counter in the 0C1H address block. Note the number of IA32_PERFEVTSELx MSRs may vary from the number of physical counters present on the hardware, because an agent running at a higher privilege level (e.g., a VMM) may not expose all counters. • The bit width of an IA32_PMCx MSR is reported using the CPUID.0AH:EAX[23:16]. This the number of valid bits for read operation. On write operations, the lower-order 32 bits of the MSR may be written with any value, and the high-order bits are sign-extended from the value of bit 31. Vol. 3B 18-3 PERFORMANCE MONITORING • Bit field layout of IA32_PERFEVTSELx MSRs is defined architecturally. See Figure 18-1 for the bit field layout of IA32_PERFEVTSELx MSRs. The bit fields are: • Event select field (bits 0 through 7) — Selects the event logic unit used to detect microarchitectural conditions (see Table 18-1, for a list of architectural events and their 8-bit codes). The set of values for this field is defined architecturally; each value corresponds to an event logic unit for use with an architectural performance event. The number of architectural events is queried using CPUID.0AH:EAX. A processor may support only a subset of pre-defined values. 63 31 24 23 22 21 20 19 18 17 16 15 Counter Mask I E N (CMASK) V N INV—Invert counter mask EN—Enable counters INT—APIC interrupt enable PC—Pin control E—Edge detect OS—Operating system mode USR—User Mode 8 7 U I N P E O S Unit Mask (UMASK) S R T C 0 Event Select Reserved Figure 18-1. Layout of IA32_PERFEVTSELx MSRs • Unit mask (UMASK) field (bits 8 through 15) — These bits qualify the condition that the selected event logic unit detects. Valid UMASK values for each event logic unit are specific to the unit. For each architectural performance event, its corresponding UMASK value defines a specific microarchitectural condition. A pre-defined microarchitectural condition associated with an architectural event may not be applicable to a given processor. The processor then reports only a subset of pre-defined architectural events. Pre-defined architectural events are listed in Table 18-1; support for pre-defined architectural events is enumerated using CPUID.0AH:EBX. Architectural performance events available in the initial implementation are listed in Table 19-1. • USR (user mode) flag (bit 16) — Specifies that the selected microarchitectural condition is counted when the logical processor is operating at privilege levels 1, 2 or 3. This flag can be used with the OS flag. • OS (operating system mode) flag (bit 17) — Specifies that the selected microarchitectural condition is counted when the logical processor is operating at privilege level 0. This flag can be used with the USR flag. • E (edge detect) flag (bit 18) — Enables (when set) edge detection of the selected microarchitectural condition. The logical processor counts the number of deasserted to asserted transitions for any condition that can be expressed by the other fields. The mechanism does not permit back-to-back assertions to be distinguished. This mechanism allows software to measure not only the fraction of time spent in a particular state, but also the average length of time spent in such a state (for example, the time spent waiting for an interrupt to be serviced). • PC (pin control) flag (bit 19) — When set, the logical processor toggles the PMi pins and increments the counter when performance-monitoring events occur; when clear, the processor toggles the PMi pins when the counter overflows. The toggling of a pin is defined as assertion of the pin for a single bus clock followed by deassertion. • INT (APIC interrupt enable) flag (bit 20) — When set, the logical processor generates an exception through its local APIC on counter overflow. • EN (Enable Counters) Flag (bit 22) — When set, performance counting is enabled in the corresponding performance-monitoring counter; when clear, the corresponding counter is disabled. The event logic unit for a UMASK must be disabled by setting IA32_PERFEVTSELx[bit 22] = 0, before writing to IA32_PMCx. 18-4 Vol. 3B PERFORMANCE MONITORING • INV (invert) flag (bit 23) — When set, inverts the counter-mask (CMASK) comparison, so that both greater than or equal to and less than comparisons can be made (0: greater than or equal; 1: less than). Note if counter-mask is programmed to zero, INV flag is ignored. • Counter mask (CMASK) field (bits 24 through 31) — When this field is not zero, a logical processor compares this mask to the events count of the detected microarchitectural condition during a single cycle. If the event count is greater than or equal to this mask, the counter is incremented by one. Otherwise the counter is not incremented. This mask is intended for software to characterize microarchitectural conditions that can count multiple occurrences per cycle (for example, two or more instructions retired per clock; or bus queue occupations). If the counter-mask field is 0, then the counter is incremented each cycle by the event count associated with multiple occurrences. 18.2.1.2 Pre-defined Architectural Performance Events Table 18-1 lists architecturally defined events. Table 18-1. UMask and Event Select Encodings for Pre-Defined Architectural Performance Events Bit Position CPUID.AH.EBX Event Name UMask Event Select 0 UnHalted Core Cycles 00H 3CH 1 Instruction Retired 00H C0H 2 UnHalted Reference Cycles 01H 3CH 3 LLC Reference 4FH 2EH 4 LLC Misses 41H 2EH 5 Branch Instruction Retired 00H C4H 6 Branch Misses Retired 00H C5H 7 Topdown Slots 01H A4H A processor that supports architectural performance monitoring may not support all the predefined architectural performance events (Table 18-1). The number of architectural events is reported through CPUID.0AH:EAX[31:24], while non-zero bits in CPUID.0AH:EBX indicate any architectural events that are not available. The behavior of each architectural performance event is expected to be consistent on all processors that support that event. Minor variations between microarchitectures are noted below: • UnHalted Core Cycles — Event select 3CH, Umask 00H This event counts core clock cycles when the clock signal on a specific core is running (not halted). The counter does not advance in the following conditions: — an ACPI C-state other than C0 for normal operation — HLT — STPCLK# pin asserted — being throttled by TM1 — during the frequency switching phase of a performance state transition (see Chapter 14, “Power and Thermal Management”) The performance counter for this event counts across performance state transitions using different core clock frequencies • Instructions Retired — Event select C0H, Umask 00H This event counts the number of instructions at retirement. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. An instruction with a REP prefix counts as one instruction (not per iteration). Faults before the retirement of the last micro-op of a multi-ops instruction are not counted. Vol. 3B 18-5 PERFORMANCE MONITORING This event does not increment under VM-exit conditions. Counters continue counting during hardware interrupts, traps, and inside interrupt handlers. • UnHalted Reference Cycles — Event select 3CH, Umask 01H This event counts reference clock cycles at a fixed frequency while the clock signal on the core is running. The event counts at a fixed frequency, irrespective of core frequency changes due to performance state transitions. Processors may implement this behavior differently. Current implementations use the core crystal clock, TSC or the bus clock. Because the rate may differ between implementations, software should calibrate it to a time source with known frequency. • Last Level Cache References — Event select 2EH, Umask 4FH This event counts requests originating from the core that reference a cache line in the last level on-die cache. The event count includes speculation and cache line fills due to the first-level cache hardware prefetcher, but may exclude cache line fills due to other hardware-prefetchers. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. • Last Level Cache Misses — Event select 2EH, Umask 41H This event counts each cache miss condition for references to the last level on-die cache. The event count may include speculation and cache line fills due to the first-level cache hardware prefetcher, but may exclude cache line fills due to other hardware-prefetchers. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. • Branch Instructions Retired — Event select C4H, Umask 00H This event counts branch instructions at retirement. It counts the retirement of the last micro-op of a branch instruction. • All Branch Mispredict Retired — Event select C5H, Umask 00H This event counts mispredicted branch instructions at retirement. It counts the retirement of the last micro-op of a branch instruction in the architectural path of execution and experienced misprediction in the branch prediction hardware. Branch prediction hardware is implementation-specific across microarchitectures; value comparison to estimate performance differences is not recommended. • Topdown Slots — Event select A4H, Umask 01H This event counts the total number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core, in processors that support Intel Hyper-Threading Technology. Software can use this event as the denominator for the top-level metrics of the Top-down Microarchitecture Analysis method. NOTE Programming decisions or software precisians on functionality should not be based on the event values or dependent on the existence of performance monitoring events. 18.2.2 Architectural Performance Monitoring Version 2 The enhanced features provided by architectural performance monitoring version 2 include the following: • Fixed-function performance counter register and associated control register — Three of the architectural performance events are counted using three fixed-function MSRs (IA32_FIXED_CTR0 through IA32_FIXED_CTR2). Each of the fixed-function PMC can count only one architectural performance event. Configuring the fixed-function PMCs is done by writing to bit fields in the MSR (IA32_FIXED_CTR_CTRL) located at address 38DH. Unlike configuring performance events for general-purpose PMCs (IA32_PMCx) via UMASK 18-6 Vol. 3B PERFORMANCE MONITORING field in (IA32_PERFEVTSELx), configuring, programming IA32_FIXED_CTR_CTRL for fixed-function PMCs do not require any UMASK. • Simplified event programming — Most frequent operation in programming performance events are enabling/disabling event counting and checking the status of counter overflows. Architectural performance event version 2 provides three architectural MSRs: — IA32_PERF_GLOBAL_CTRL allows software to enable/disable event counting of all or any combination of fixed-function PMCs (IA32_FIXED_CTRx) or any general-purpose PMCs via a single WRMSR. — IA32_PERF_GLOBAL_STATUS allows software to query counter overflow conditions on any combination of fixed-function PMCs or general-purpose PMCs via a single RDMSR. — IA32_PERF_GLOBAL_OVF_CTRL allows software to clear counter overflow conditions on any combination of fixed-function PMCs or general-purpose PMCs via a single WRMSR. • PMI Overhead Mitigation — Architectural performance monitoring version 2 introduces two bit field interface in IA32_DEBUGCTL for PMI service routine to accumulate performance monitoring data and LBR records with reduced perturbation from servicing the PMI. The two bit fields are: — IA32_DEBUGCTL.Freeze_LBR_On_PMI(bit 11). In architectural performance monitoring version 2, only the legacy semantic behavior is supported. See Section 17.4.7 for details of the legacy Freeze LBRs on PMI control. — IA32_DEBUGCTL.Freeze_PerfMon_On_PMI(bit 12). In architectural performance monitoring version 2, only the legacy semantic behavior is supported. See Section 17.4.7 for details of the legacy Freeze LBRs on PMI control. The facilities provided by architectural performance monitoring version 2 can be queried from CPUID leaf 0AH by examining the content of register EDX: • Bits 0 through 4 of CPUID.0AH.EDX indicates the number of fixed-function performance counters available per core, • Bits 5 through 12 of CPUID.0AH.EDX indicates the bit-width of fixed-function performance counters. Bits beyond the width of the fixed-function counter are reserved and must be written as zeros. NOTE Early generation of processors based on Intel Core microarchitecture may report in CPUID.0AH:EDX of support for version 2 but indicating incorrect information of version 2 facilities. The IA32_FIXED_CTR_CTRL MSR include multiple sets of 4-bit field, each 4 bit field controls the operation of a fixed-function performance counter. Figure 18-2 shows the layout of 4-bit controls for each fixed-function PMC. Two sub-fields are currently defined within each control. The definitions of the bit fields are: 63 12 11 9 8 7 P M I P M I E N 5 43 2 1 0 E N P M I E N Cntr2 — Controls for IA32_FIXED_CTR2 Cntr1 — Controls for IA32_FIXED_CTR1 PMI — Enable PMI on overflow Cntr0 — Controls for IA32_FIXED_CTR0 ENABLE — 0: disable; 1: OS; 2: User; 3: All ring levels Reserved Figure 18-2. Layout of IA32_FIXED_CTR_CTRL MSR Vol. 3B 18-7 PERFORMANCE MONITORING • Enable field (lowest 2 bits within each 4-bit control) — When bit 0 is set, performance counting is enabled in the corresponding fixed-function performance counter to increment while the target condition associated with the architecture performance event occurred at ring 0. When bit 1 is set, performance counting is enabled in the corresponding fixed-function performance counter to increment while the target condition associated with the architecture performance event occurred at ring greater than 0. Writing 0 to both bits stops the performance counter. Writing a value of 11B enables the counter to increment irrespective of privilege levels. • PMI field (the fourth bit within each 4-bit control) — When set, the logical processor generates an exception through its local APIC on overflow condition of the respective fixed-function counter. IA32_PERF_GLOBAL_CTRL MSR provides single-bit controls to enable counting of each performance counter. Figure 18-3 shows the layout of IA32_PERF_GLOBAL_CTRL. Each enable bit in IA32_PERF_GLOBAL_CTRL is AND’ed with the enable bits for all privilege levels in the respective IA32_PERFEVTSELx or IA32_PERF_FIXED_CTR_CTRL MSRs to start/stop the counting of respective counters. Counting is enabled if the AND’ed results is true; counting is disabled when the result is false. 63 35 34 33 32 31 2 1 0 IA32_FIXED_CTR2 enable IA32_FIXED_CTR1 enable IA32_FIXED_CTR0 enable IA32_PMC1 enable IA32_PMC0 enable Reserved Figure 18-3. Layout of IA32_PERF_GLOBAL_CTRL MSR The behavior of the fixed function performance counters supported by architectural performance version 2 is expected to be consistent on all processors that support those counters, and is defined as follows. 18-8 Vol. 3B PERFORMANCE MONITORING Table 18-2. Association of Fixed-Function Performance Counters with Architectural Performance Events Fixed-Function Performance Counter Address Event Mask Mnemonic Description IA32_FIXED_CTR0 309H INST_RETIRED.ANY This event counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and in-side interrupt handlers. IA32_FIXED_CTR1 30AH CPU_CLK_UNHALTED.THREAD The CPU_CLK_UNHALTED.THREAD event counts the number of core cycles while the logical processor is not in a halt state. CPU_CLK_UNHALTED.CORE If there is only one logical processor in a processor core, CPU_CLK_UNHALTED.CORE counts the unhalted cycles of the processor core. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. IA32_FIXED_CTR2 30BH CPU_CLK_UNHALTED.REF_TSC This event counts the number of reference cycles at the TSC rate when the core is not in a halt state and not in a TM stop-clock state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (e.g., P states) but counts at the same frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state and not in a TM stopclock state. IA32_FIXED_CTR3 30CH TOPDOWN.SLOTS This event counts the number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core. Software can use this event as the denominator for the top-level metrics of the Top-down Microarchitecture Analysis method. IA32_PERF_GLOBAL_STATUS MSR provides single-bit status for software to query the overflow condition of each performance counter. IA32_PERF_GLOBAL_STATUS[bit 62] indicates overflow conditions of the DS area data buffer. IA32_PERF_GLOBAL_STATUS[bit 63] provides a CondChgd bit to indicate changes to the state of performance monitoring hardware. Figure 18-4 shows the layout of IA32_PERF_GLOBAL_STATUS. A value of 1 in bits 0, 1, 32 through 34 indicates a counter overflow condition has occurred in the associated counter. When a performance counter is configured for PEBS, overflow condition in the counter generates a performancemonitoring interrupt signaling a PEBS event. On a PEBS event, the processor stores data records into the buffer area (see Section 18.15.5), clears the counter overflow status., and sets the “OvfBuffer” bit in IA32_PERF_GLOBAL_STATUS. Vol. 3B 18-9 PERFORMANCE MONITORING 63 62 35 34 33 32 31 2 1 0 CondChgd OvfDSBuffer IA32_FIXED_CTR2 Overflow IA32_FIXED_CTR1 Overflow IA32_FIXED_CTR0 Overflow IA32_PMC1 Overflow IA32_PMC0 Overflow Reserved Figure 18-4. Layout of IA32_PERF_GLOBAL_STATUS MSR IA32_PERF_GLOBAL_OVF_CTL MSR allows software to clear overflow indicator(s) of any general-purpose or fixedfunction counters via a single WRMSR. Software should clear overflow indications when • • • Setting up new values in the event select and/or UMASK field for counting or interrupt-based event sampling. Reloading counter values to continue collecting next sample. Disabling event counting or interrupt-based event sampling. The layout of IA32_PERF_GLOBAL_OVF_CTL is shown in Figure 18-5. 63 62 35 34 33 32 31 2 1 0 ClrCondChgd ClrOvfDSBuffer IA32_FIXED_CTR2 ClrOverflow IA32_FIXED_CTR1 ClrOverflow IA32_FIXED_CTR0 ClrOverflow IA32_PMC1 ClrOverflow IA32_PMC0 ClrOverflow Reserved Figure 18-5. Layout of IA32_PERF_GLOBAL_OVF_CTRL MSR 18.2.3 Architectural Performance Monitoring Version 3 Processors supporting architectural performance monitoring version 3 also supports version 1 and 2, as well as capability enumerated by CPUID leaf 0AH. Specifically, version 3 provides the following enhancement in performance monitoring facilities if a processor core comprising of more than one logical processor, i.e. a processor core supporting Intel Hyper-Threading Technology or simultaneous multi-threading capability: • AnyThread counting for processor core supporting two or more logical processors. The interface that supports AnyThread counting include: — Each IA32_PERFEVTSELx MSR (starting at MSR address 186H) support the bit field layout defined in Figure 18-6. 18-10 Vol. 3B PERFORMANCE MONITORING 63 31 24 23 22 21 20 19 18 17 16 15 0 8 7 A I U Counter Mask I E N N N P E O S Unit Mask (UMASK) (CMASK) S R V N Y T C INV—Invert counter mask EN—Enable counters ANY—Any Thread INT—APIC interrupt enable PC—Pin control E—Edge detect OS—Operating system mode USR—User Mode Event Select Reserved Figure 18-6. Layout of IA32_PERFEVTSELx MSRs Supporting Architectural Performance Monitoring Version 3 Bit 21 (AnyThread) of IA32_PERFEVTSELx is supported in architectural performance monitoring version 3 for processor core comprising of two or more logical processors. When set to 1, it enables counting the associated event conditions (including matching the thread’s CPL with the OS/USR setting of IA32_PERFEVTSELx) occurring across all logical processors sharing a processor core. When bit 21 is 0, the counter only increments the associated event conditions (including matching the thread’s CPL with the OS/USR setting of IA32_PERFEVTSELx) occurring in the logical processor which programmed the IA32_PERFEVTSELx MSR. — Each fixed-function performance counter IA32_FIXED_CTRx (starting at MSR address 309H) is configured by a 4-bit control block in the IA32_PERF_FIXED_CTR_CTRL MSR. The control block also allow threadspecificity configuration using an AnyThread bit. The layout of IA32_PERF_FIXED_CTR_CTRL MSR is shown. 63 12 11 P A M N I Y 9 8 7 E N P A M N I Y 5 43 2 1 0 E N P A M N I Y E N Cntr2 — Controls for IA32_FIXED_CTR2 Cntr1 — Controls for IA32_FIXED_CTR1 PMI — Enable PMI on overflow on IA32_FIXED_CTR0 AnyThread — AnyThread for IA32_FIXED_CTR0 ENABLE — IA32_FIXED_CTR0. 0: disable; 1: OS; 2: User; 3: All ring levels Reserved Figure 18-7. IA32_FIXED_CTR_CTRL MSR Supporting Architectural Performance Monitoring Version 3 Each control block for a fixed-function performance counter provides an AnyThread (bit position 2 + 4*N, N= 0, 1, etc.) bit. When set to 1, it enables counting the associated event conditions (including matching the thread’s CPL with the ENABLE setting of the corresponding control block of IA32_PERF_FIXED_CTR_CTRL) occurring across all logical processors sharing a processor core. When an AnyThread bit is 0 in IA32_PERF_FIXED_CTR_CTRL, the corresponding fixed counter only increments the associated event conditions occurring in the logical processor which programmed the IA32_PERF_FIXED_CTR_CTRL MSR. • The IA32_PERF_GLOBAL_CTRL, IA32_PERF_GLOBAL_STATUS, IA32_PERF_GLOBAL_OVF_CTRL MSRs provide single-bit controls/status for each general-purpose and fixed-function performance counter. Figure 18-8 and Figure 18-9 show the layout of these MSRs for N general-purpose performance counters (where N is reported by CPUID.0AH:EAX[15:8]) and three fixed-function counters. NOTE The number of general-purpose performance monitoring counters (i.e., N in Figure 18-9) can vary across processor generations within a processor family, across processor families, or could be Vol. 3B 18-11 PERFORMANCE MONITORING different depending on the configuration chosen at boot time in the BIOS regarding Intel Hyper Threading Technology, (e.g. N=2 for 45 nm Intel Atom processors; N =4 for processors based on the Nehalem microarchitecture; for processors based on the Sandy Bridge microarchitecture, N = 4 if Intel Hyper Threading Technology is active and N=8 if not active). In addition, the number of counters may vary from the number of physical counters present on the hardware, because an agent running at a higher privilege level (e.g., a VMM) may not expose all counters. Global Enable Controls IA32_PERF_GLOBAL_CTRL 35 34 33 32 31 63 N .. .. 1 0 Reserved IA32_FIXED_CTR2 enable IA32_FIXED_CTR1 enable IA32_FIXED_CTR0 enable IA32_PMC(N-1) enable .................... enable IA32_PMC1 enable IA32_PMC0 enable Figure 18-8. Layout of Global Performance Monitoring Control MSR 63 62 61 Global Overflow Status IA32_PERF_GLOBAL_STATUS 35 34 33 32 31 CondChgd OvfDSBuffer OvfUncore IA32_FIXED_CTR2 Overflow IA32_FIXED_CTR1 Overflow IA32_FIXED_CTR0 Overflow IA32_PMC1 Overflow IA32_PMC0 Overflow 63 62 N .. .. 1 0 IA32_PMC(N-1) Overflow ...................... Overflow Global Overflow Status IA32_PERF_GLOBAL_OVF_CTRL 35 34 33 32 31 ClrCondChgd ClrOvfDSBuffer ClrOvfUncore IA32_FIXED_CTR2 ClrOverflow IA32_FIXED_CTR1 ClrOverflow IA32_FIXED_CTR0 ClrOverflow IA32_PMC1 ClrOverflow IA32_PMC0 ClrOverflow N .. .. 1 0 IA32_PMC(N-1) ClrOverflow ........................ ClrOverflow Figure 18-9. Global Performance Monitoring Overflow Status and Control MSRs 18.2.3.1 AnyThread Counting and Software Evolution The motivation for characterizing software workload over multiple software threads running on multiple logical processors of the same processor core originates from a time earlier than the introduction of the AnyThread interface in IA32_PERFEVTSELx and IA32_FIXED_CTR_CTRL. While AnyThread counting provides some benefits in 18-12 Vol. 3B PERFORMANCE MONITORING simple software environments of an earlier era, the evolution contemporary software environments introduce certain concepts and pre-requisites that AnyThread counting does not comply with. One example is the proliferation of software environments that support multiple virtual machines (VM) under VMX (see Chapter 23, “Introduction to Virtual-Machine Extensions”) where each VM represents a domain separated from one another. A Virtual Machine Monitor (VMM) that manages the VMs may allow individual VM to employ performance monitoring facilities to profiles the performance characteristics of a workload. The use of the Anythread interface in IA32_PERFEVTSELx and IA32_FIXED_CTR_CTRL is discouraged with software environments supporting virtualization or requiring domain separation. Specifically, Intel recommends VMM: • Configure the MSR bitmap to cause VM-exits for WRMSR to IA32_PERFEVTSELx and IA32_FIXED_CTR_CTRL in VMX non-Root operation (see CHAPTER 24 for additional information), • Clear the AnyThread bit of IA32_PERFEVTSELx and IA32_FIXED_CTR_CTRL in the MSR-load lists for VM exits and VM entries (see CHAPTER 24, CHAPTER 26, and CHAPTER 27). Even when operating in simpler legacy software environments which might not emphasize the pre-requisites of a virtualized software environment, the use of the AnyThread interface should be moderated and follow any eventspecific guidance where explicitly noted (see relevant sections of Chapter 19, “Performance Monitoring Events”). 18.2.4 Architectural Performance Monitoring Version 4 Processors supporting architectural performance monitoring version 4 also supports version 1, 2, and 3, as well as capability enumerated by CPUID leaf 0AH. Version 4 introduced a streamlined PMI overhead mitigation interface that replaces the legacy semantic behavior but retains the same control interface in IA32_DEBUGCTL.Freeze_LBRs_On_PMI and Freeze_PerfMon_On_PMI. Specifically version 4 provides the following enhancement: • • New indicators (LBR_FRZ, CTR_FRZ) in IA32_PERF_GLOBAL_STATUS, see Section 18.2.4.1. • Fine-grain separation of control interface to manage overflow/status of IA32_PERF_GLOBAL_STATUS and read-only performance counter enabling interface in IA32_PERF_GLOBAL_STATUS: see Section 18.2.4.2. • Performance monitoring resource in-use MSR to facilitate cooperative sharing protocol between perfmonmanaging privilege agents. Streamlined Freeze/PMI Overhead management interfaces to use IA32_DEBUGCTL.Freeze_LBRs_On_PMI and IA32_DEBUGCTL.Freeze_PerfMon_On_PMI: see Section 18.2.4.1. Legacy semantics of Freeze_LBRs_On_PMI and Freeze_PerfMon_On_PMI (applicable to version 2 and 3) are not supported with version 4 or higher. 18.2.4.1 Enhancement in IA32_PERF_GLOBAL_STATUS The IA32_PERF_GLOBAL_STATUS MSR provides the following indicators with architectural performance monitoring version 4: • IA32_PERF_GLOBAL_STATUS.LBR_FRZ[bit 58]: This bit is set due to the following conditions: — IA32_DEBUGCTL.FREEZE_LBR_ON_PMI has been set by the profiling agent, and — A performance counter, configured to generate PMI, has overflowed to signal a PMI. Consequently the LBR stack is frozen. Effectively, the IA32_PERF_GLOBAL_STATUS.LBR_FRZ bit also serves as a control to enable capturing data in the LBR stack. To enable capturing LBR records, the following expression must hold with architectural perfmon version 4 or higher: — (IA32_DEBUGCTL.LBR & (!IA32_PERF_GLOBAL_STATUS.LBR_FRZ) ) =1 • IA32_PERF_GLOBAL_STATUS.CTR_FRZ[bit 59]: This bit is set due to the following conditions: — IA32_DEBUGCTL.FREEZE_PERFMON_ON_PMI has been set by the profiling agent, and — A performance counter, configured to generate PMI, has overflowed to signal a PMI. Consequently, all the performance counters are frozen. Vol. 3B 18-13 PERFORMANCE MONITORING Effectively, the IA32_PERF_GLOBAL_STATUS.CTR_FRZ bit also serve as an read-only control to enable programmable performance counters and fixed counters in the core PMU. To enable counting with the performance counters, the following expression must hold with architectural perfmon version 4 or higher: • (IA32_PERFEVTSELn.EN & IA32_PERF_GLOBAL_CTRL.PMCn & (!IA32_PERF_GLOBAL_STATUS.CTR_FRZ) ) = 1 for programmable counter ‘n’, or • (IA32_PERF_FIXED_CRTL.ENi & IA32_PERF_GLOBAL_CTRL.FCi & (!IA32_PERF_GLOBAL_STATUS.CTR_FRZ) ) = 1 for fixed counter ‘i’ The read-only enable interface IA32_PERF_GLOBAL_STATUS.CTR_FRZ provides a more efficient flow for a PMI handler to use IA32_DEBUGCTL.Freeza_Perfmon_On_PMI to filter out data that may distort target workload analysis, see Table 17-3. It should be noted the IA32_PERF_GLOBAL_CTRL register continue to serve as the primary interface to control all performance counters of the logical processor. For example, when the Freeze-On-PMI mode is not being used, a PMI handler would be setting IA32_PERF_GLOBAL_CTRL as the very last step to commence the overall operation after configuring the individual counter registers, controls and PEBS facility. This does not only assure atomic monitoring but also avoids unnecessary complications (e.g. race conditions) when software attempts to change the core PMU configuration while some counters are kept enabled. Additionally, IA32_PERF_GLOBAL_STATUS.TraceToPAPMI[bit 55]: On processors that support Intel Processor Trace and configured to store trace output packets to physical memory using the ToPA scheme, bit 55 is set when a PMI occurred due to a ToPA entry memory buffer was completely filled. IA32_PERF_GLOBAL_STATUS also provides an indicator to distinguish interaction of performance monitoring operations with other side-band activities, which apply Intel SGX on processors that support SGX (For additional information about Intel SGX, see “Intel® Software Guard Extensions Programming Reference”.): • IA32_PERF_GLOBAL_STATUS.ASCI[bit 60]: This bit is set when data accumulated in any of the configured performance counters (i.e. IA32_PMCx or IA32_FIXED_CTRx) may include contributions from direct or indirect operation of Intel SGX to protect an enclave (since the last time IA32_PERF_GLOBAL_STATUS.ASCI was cleared). 63 62 61 60 59 58 CondChgd OvfDSBuffer OvfUncore ASCI CTR_Frz LBR_Frz TraceToPAPMI IA32_FIXED_CTR2 Overflow IA32_FIXED_CTR1 Overflow IA32_FIXED_CTR0 Overflow 55 35 34 33 32 31 N .. .. 1 0 IA32_PMC(N-1) Overflow ...................... Overflow IA32_PMC1 Overflow IA32_PMC0 Overflow Reserved Figure 18-10. IA32_PERF_GLOBAL_STATUS MSR and Architectural Perfmon Version 4 Note, a processor’s support for IA32_PERF_GLOBAL_STATUS.TraceToPAPMI[bit 55] is enumerated as a result of CPUID enumerated capability of Intel Processor Trace and the use of the ToPA buffer scheme. Support of IA32_PERF_GLOBAL_STATUS.ASCI[bit 60] is enumerated by the CPUID enumeration of Intel SGX. 18.2.4.2 IA32_PERF_GLOBAL_STATUS_RESET and IA32_PERF_GLOBAL_STATUS_SET MSRS With architectural performance monitoring version 3 and lower, clearing of the set bits in IA32_PERF_GLOBAL_STATUS MSR by software is done via IA32_PERF_GLOBAL_OVF_CTRL MSR. Starting with architectural performance monitoring version 4, software can manage the overflow and other indicators in IA32_PERF_GLOBAL_STATUS using separate interfaces to set or clear individual bits. 18-14 Vol. 3B PERFORMANCE MONITORING The address and the architecturally-defined bits of IA32_PERF_GLOBAL_OVF_CTRL is inherited by IA32_PERF_GLOBAL_STATUS_RESET (see Figure 18-11). Further, IA32_PERF_GLOBAL_STATUS_RESET provides additional bit fields to clear the new indicators in IA32_PERF_GLOBAL_STATUS described in Section 18.2.4.1. 63 62 61 60 59 58 55 35 34 33 32 31 Clr CondChgd Clr OvfDSBuffer Clr OvfUncore Clr ASCI Clr CTR_Frz Clr LBR_Frz Clr TraceToPAPMI Clr IA32_FIXED_CTR2 Ovf Clr IA32_FIXED_CTR1 Ovf Clr IA32_FIXED_CTR0 Ovf N .. .. 1 0 Clr IA32_PMC(N-1) Ovf Clr ...................... Ovf Clr IA32_PMC1 Ovf Clr IA32_PMC0 Ovf Reserved Figure 18-11. IA32_PERF_GLOBAL_STATUS_RESET MSR and Architectural Perfmon Version 4 The IA32_PERF_GLOBAL_STATUS_SET MSR is introduced with architectural performance monitoring version 4. It allows software to set individual bits in IA32_PERF_GLOBAL_STATUS. The IA32_PERF_GLOBAL_STATUS_SET interface can be used by a VMM to virtualize the state of IA32_PERF_GLOBAL_STATUS across VMs. 63 62 61 60 59 58 55 35 34 33 32 31 Set OvfDSBuffer Set OvfUncore Set ASCI Set CTR_Frz Set LBR_Frz Set TraceToPAPMI Set IA32_FIXED_CTR2 Ovf Set IA32_FIXED_CTR1 Ovf Set IA32_FIXED_CTR0 Ovf N .. .. 1 0 Set IA32_PMC(N-1) Ovf Set ...................... Ovf Set IA32_PMC1 Ovf Set IA32_PMC0 Ovf Reserved Figure 18-12. IA32_PERF_GLOBAL_STATUS_SET MSR and Architectural Perfmon Version 4 18.2.4.3 IA32_PERF_GLOBAL_INUSE MSR In a contemporary software environment, multiple privileged service agents may wish to employ the processor’s performance monitoring facilities. The IA32_MISC_ENABLE.PERFMON_AVAILABLE[bit 7] interface could not serve the need of multiple agent adequately. A white paper, “Performance Monitoring Unit Sharing Guideline”1, proposed a cooperative sharing protocol that is voluntary for participating software agents. Architectural performance monitoring version 4 introduces a new MSR, IA32_PERF_GLOBAL_INUSE, that simplifies the task of multiple cooperating agents to implement the sharing protocol. The layout of IA32_PERF_GLOBAL_INUSE is shown in Figure 18-13. 1. Available at http://www.intel.com/sdm Vol. 3B 18-15 PERFORMANCE MONITORING 63 35 34 33 32 31 PMI InUse FIXED_CTR2 InUse FIXED_CTR1 InUse FIXED_CTR0 InUse N .. .. 1 0 PERFEVTSEL(N-1) InUse ....................... InUse PERFEVTSEL1 InUse PERFEVTSEL0 InUse N = CPUID.0AH:EAX[15:8] Reserved Figure 18-13. IA32_PERF_GLOBAL_INUSE MSR and Architectural Perfmon Version 4 The IA32_PERF_GLOBAL_INUSE MSR provides an “InUse” bit for each programmable performance counter and fixed counter in the processor. Additionally, it includes an indicator if the PMI mechanism has been configured by a profiling agent. • IA32_PERF_GLOBAL_INUSE.PERFEVTSEL0_InUse[bit 0]: This bit reflects the logical state of (IA32_PERFEVTSEL0[7:0] != 0). • IA32_PERF_GLOBAL_INUSE.PERFEVTSEL1_InUse[bit 1]: This bit reflects the logical state of (IA32_PERFEVTSEL1[7:0] != 0). • IA32_PERF_GLOBAL_INUSE.PERFEVTSEL2_InUse[bit 2]: This bit reflects the logical state of (IA32_PERFEVTSEL2[7:0] != 0). • IA32_PERF_GLOBAL_INUSE.PERFEVTSELn_InUse[bit n]: This bit reflects the logical state of (IA32_PERFEVTSELn[7:0] != 0), n < CPUID.0AH:EAX[15:8]. • IA32_PERF_GLOBAL_INUSE.FC0_InUse[bit 32]: This bit reflects the logical state of (IA32_FIXED_CTR_CTRL[1:0] != 0). • IA32_PERF_GLOBAL_INUSE.FC1_InUse[bit 33]: This bit reflects the logical state of (IA32_FIXED_CTR_CTRL[5:4] != 0). • IA32_PERF_GLOBAL_INUSE.FC2_InUse[bit 34]: This bit reflects the logical state of (IA32_FIXED_CTR_CTRL[9:8] != 0). • IA32_PERF_GLOBAL_INUSE.PMI_InUse[bit 63]: This bit is set if any one of the following bit is set: — IA32_PERFEVTSELn.INT[bit 20], n < CPUID.0AH:EAX[15:8]. — IA32_FIXED_CTR_CTRL.ENi_PMI, i = 0, 1, 2. — Any IA32_PEBS_ENABLES bit which enables PEBS for a general-purpose or fixed-function performance counter. 18.2.5 Architectural Performance Monitoring Version 5 Processors supporting architectural performance monitoring version 5 also support versions 1, 2, 3 and 4, as well as capability enumerated by CPUID leaf 0AH. Specifically, version 5 provides the following enhancements: • • Deprecation of Anythread mode, see Section 18.2.5.1. Individual enumeration of Fixed counters in CPUID.0AH, see Section 18.2.5.2. 18.2.5.1 AnyThread Mode Deprecation With Architectural Performance Monitoring Version 5, a processor that supports AnyThread mode deprecation is enumerated by CPUID.0AH.EDX[15]. If set, software will not have to follow guidelines in Section 18.2.3.1. 18-16 Vol. 3B PERFORMANCE MONITORING 18.2.5.2 Fixed Counter Enumeration With Architectural Performance Monitoring Version 5, register CPUID.0AH.ECX indicates Fixed Counter enumeration. It is a bit mask which enumerates the supported Fixed Counters in a processor. If bit 'i' is set, it implies that Fixed Counter 'i' is supported. Software is recommended to use the following logic to check if a Fixed Counter is supported on a given processor: FxCtr[i]_is_supported := ECX[i] || (EDX[4:0] > i); 18.2.6 Full-Width Writes to Performance Counter Registers The general-purpose performance counter registers IA32_PMCx are writable via WRMSR instruction. However, the value written into IA32_PMCx by WRMSR is the signed extended 64-bit value of the EAX[31:0] input of WRMSR. A processor that supports full-width writes to the general-purpose performance counters enumerated by CPUID.0AH:EAX[15:8] will set IA32_PERF_CAPABILITIES[13] to enumerate its full-width-write capability See Figure 18-63. If IA32_PERF_CAPABILITIES.FW_WRITE[bit 13] =1, each IA32_PMCi is accompanied by a corresponding alias address starting at 4C1H for IA32_A_PMC0. The bit width of the performance monitoring counters is specified in CPUID.0AH:EAX[23:16]. Vol. 3B 18-17 PERFORMANCE MONITORING If IA32_A_PMCi is present, the 64-bit input value (EDX:EAX) of WRMSR to IA32_A_PMCi will cause IA32_PMCi to be updated by: COUNTERWIDTH = CPUID.0AH:EAX[23:16] bit width of the performance monitoring counter IA32_PMCi[COUNTERWIDTH-1:32] ← EDX[COUNTERWIDTH-33:0]); IA32_PMCi[31:0] ← EAX[31:0]; EDX[63:COUNTERWIDTH] are reserved 18.3 PERFORMANCE MONITORING (INTEL® CORE™ PROCESSORS AND INTEL® XEON® PROCESSORS) 18.3.1 Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Nehalem Intel Core i7 processor family2 supports architectural performance monitoring capability with version ID 3 (see Section 18.2.3) and a host of non-architectural monitoring capabilities. The Intel Core i7 processor family is based on Intel® microarchitecture code name Nehalem, and provides four general-purpose performance counters (IA32_PMC0, IA32_PMC1, IA32_PMC2, IA32_PMC3) and three fixed-function performance counters (IA32_FIXED_CTR0, IA32_FIXED_CTR1, IA32_FIXED_CTR2) in the processor core. Non-architectural performance monitoring in Intel Core i7 processor family uses the IA32_PERFEVTSELx MSR to configure a set of non-architecture performance monitoring events to be counted by the corresponding generalpurpose performance counter. The list of non-architectural performance monitoring events is listed in Table 19-31. Non-architectural performance monitoring events fall into two broad categories: • Performance monitoring events in the processor core: These include many events that are similar to performance monitoring events available to processor based on Intel Core microarchitecture. Additionally, there are several enhancements in the performance monitoring capability for detecting microarchitectural conditions in the processor core or in the interaction of the processor core to the off-core sub-systems in the physical processor package. The off-core sub-systems in the physical processor package is loosely referred to as “uncore“. • Performance monitoring events in the uncore: The uncore sub-system is shared by more than one processor cores in the physical processor package. It provides additional performance monitoring facility outside of IA32_PMCx and performance monitoring events that are specific to the uncore sub-system. Architectural and non-architectural performance monitoring events in Intel Core i7 processor family support thread qualification using bit 21 of IA32_PERFEVTSELx MSR. The bit fields within each IA32_PERFEVTSELx MSR are defined in Figure 18-6 and described in Section 18.2.1.1 and Section 18.2.3. 2. Intel Xeon processor 5500 series and 3400 series are also based on Intel microarchitecture code name Nehalem; the performance monitoring facilities described in this section generally also apply. 18-18 Vol. 3B PERFORMANCE MONITORING 63 62 61 60 353433 3231 8 7 6 5 43 2 1 0 CHG (R/W) OVF_PMI (R/W) OVF_FC2 (R/O) OVF_FC1 (R/O) OVF_FC0 (R/O) OVF_PC7 (R/O), if CCNT>7 OVF_PC6 (R/O), if CCNT>6 OVF_PC5 (R/O), if CCNT>5 OVF_PC4 (R/O), if CCNT>4 OVF_PC3 (R/O) OVF_PC2 (R/O) OVF_PC1 (R/O) OVF_PC0 (R/O) Reserved RESET Value — 00000000_00000000H CCNT: CPUID.AH:EAX[15:8] Figure 18-14. IA32_PERF_GLOBAL_STATUS MSR 18.3.1.1 Enhancements of Performance Monitoring in the Processor Core The notable enhancements in the monitoring of performance events in the processor core include: • Four general purpose performance counters, IA32_PMCx, associated counter configuration MSRs, IA32_PERFEVTSELx, and global counter control MSR supporting simplified control of four counters. Each of the four performance counter can support processor event based sampling (PEBS) and thread-qualification of architectural and non-architectural performance events. Width of IA32_PMCx supported by hardware has been increased. The width of counter reported by CPUID.0AH:EAX[23:16] is 48 bits. The PEBS facility in Intel microarchitecture code name Nehalem has been enhanced to include new data format to capture additional information, such as load latency. • Load latency sampling facility. Average latency of memory load operation can be sampled using load-latency facility in processors based on Intel microarchitecture code name Nehalem. This field measures the load latency from load's first dispatch of till final data writeback from the memory subsystem. The latency is reported for retired demand load operations and in core cycles (it accounts for re-dispatches). This facility is used in conjunction with the PEBS facility. • Off-core response counting facility. This facility in the processor core allows software to count certain transaction responses between the processor core to sub-systems outside the processor core (uncore). Counting off-core response requires additional event qualification configuration facility in conjunction with IA32_PERFEVTSELx. Two off-core response MSRs are provided to use in conjunction with specific event codes that must be specified with IA32_PERFEVTSELx. NOTE The number of counters available to software may vary from the number of physical counters present on the hardware, because an agent running at a higher privilege level (e.g., a VMM) may not expose all counters. CPUID.0AH:EAX[15:8] reports the MSRs available to software; see Section 18.2.1. 18.3.1.1.1 Processor Event Based Sampling (PEBS) All general-purpose performance counters, IA32_PMCx, can be used for PEBS if the performance event supports PEBS. Software uses IA32_MISC_ENABLE[7] and IA32_MISC_ENABLE[12] to detect whether the performance monitoring facility and PEBS functionality are supported in the processor. The MSR IA32_PEBS_ENABLE provides 4 bits that software must use to enable which IA32_PMCx overflow condition will cause the PEBS record to be captured. Vol. 3B 18-19 PERFORMANCE MONITORING Additionally, the PEBS record is expanded to allow latency information to be captured. The MSR IA32_PEBS_ENABLE provides 4 additional bits that software must use to enable latency data recording in the PEBS record upon the respective IA32_PMCx overflow condition. The layout of IA32_PEBS_ENABLE for processors based on Intel microarchitecture code name Nehalem is shown in Figure 18-15. When a counter is enabled to capture machine state (PEBS_EN_PMCx = 1), the processor will write machine state information to a memory buffer specified by software as detailed below. When the counter IA32_PMCx overflows from maximum count to zero, the PEBS hardware is armed. 36 3534 33 32 31 63 8 7 6 5 43 2 1 0 LL_EN_PMC3 (R/W) LL_EN_PMC2 (R/W) LL_EN_PMC1 (R/W) LL_EN_PMC0 (R/W) PEBS_EN_PMC3 (R/W) PEBS_EN_PMC2 (R/W) PEBS_EN_PMC1 (R/W) PEBS_EN_PMC0 (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-15. Layout of IA32_PEBS_ENABLE MSR Upon occurrence of the next PEBS event, the PEBS hardware triggers an assist and causes a PEBS record to be written. The format of the PEBS record is indicated by the bit field IA32_PERF_CAPABILITIES[11:8] (see Figure 18-63). The behavior of PEBS assists is reported by IA32_PERF_CAPABILITIES[6] (see Figure 18-63). The return instruction pointer (RIP) reported in the PEBS record will point to the instruction after (+1) the instruction that causes the PEBS assist. The machine state reported in the PEBS record is the machine state after the instruction that causes the PEBS assist is retired. For instance, if the instructions: mov eax, [eax] ; causes PEBS assist nop are executed, the PEBS record will report the address of the nop, and the value of EAX in the PEBS record will show the value read from memory, not the target address of the read operation. The PEBS record format is shown in Table 18-3, and each field in the PEBS record is 64 bits long. The PEBS record format, along with debug/store area storage format, does not change regardless of IA-32e mode is active or not. CPUID.01H:ECX.DTES64[bit 2] reports whether the processor's DS storage format support is mode-independent. When set, it uses 64-bit DS storage format. Table 18-3. PEBS Record Format for Intel Core i7 Processor Family Byte Offset Field Byte Offset Field 00H R/EFLAGS 58H R9 08H R/EIP 60H R10 10H R/EAX 68H R11 18H R/EBX 70H R12 20H R/ECX 78H R13 28H R/EDX 80H R14 30H R/ESI 88H R15 18-20 Vol. 3B PERFORMANCE MONITORING Table 18-3. PEBS Record Format for Intel Core i7 Processor Family Byte Offset Field Byte Offset Field 38H R/EDI 90H IA32_PERF_GLOBAL_STATUS 40H R/EBP 98H Data Linear Address 48H R/ESP A0H Data Source Encoding 50H R8 A8H Latency value (core cycles) In IA-32e mode, the full 64-bit value is written to the register. If the processor is not operating in IA-32e mode, 32bit value is written to registers with bits 63:32 zeroed. Registers not defined when the processor is not in IA-32e mode are written to zero. Bytes AFH:90H are enhancement to the PEBS record format. Support for this enhanced PEBS record format is indicated by IA32_PERF_CAPABILITIES[11:8] encoding of 0001B. The value written to bytes 97H:90H is the state of the IA32_PERF_GLOBAL_STATUS register before the PEBS assist occurred. This value is written so software can determine which counters overflowed when this PEBS record was written. Note that this field indicates the overflow status for all counters, regardless of whether they were programmed for PEBS or not. Programming PEBS Facility Only a subset of non-architectural performance events in the processor support PEBS. The subset of precise events are listed in Table 18-78. In addition to using IA32_PERFEVTSELx to specify event unit/mask settings and setting the EN_PMCx bit in the IA32_PEBS_ENABLE register for the respective counter, the software must also initialize the DS_BUFFER_MANAGEMENT_AREA data structure in memory to support capturing PEBS records for precise events. NOTE PEBS events are only valid when the following fields of IA32_PERFEVTSELx are all zero: AnyThread, Edge, Invert, CMask. The beginning linear address of the DS_BUFFER_MANAGEMENT_AREA data structure must be programmed into the IA32_DS_AREA register. The layout of the DS_BUFFER_MANAGEMENT_AREA is shown in Figure 18-16. • PEBS Buffer Base: This field is programmed with the linear address of the first byte of the PEBS buffer allocated by software. The processor reads this field to determine the base address of the PEBS buffer. Software should allocate this memory from the non-paged pool. • PEBS Index: This field is initially programmed with the same value as the PEBS Buffer Base field, or the beginning linear address of the PEBS buffer. The processor reads this field to determine the location of the next PEBS record to write to. After a PEBS record has been written, the processor also updates this field with the address of the next PEBS record to be written. The figure above illustrates the state of PEBS Index after the first PEBS record is written. • PEBS Absolute Maximum: This field represents the absolute address of the maximum length of the allocated PEBS buffer plus the starting address of the PEBS buffer. The processor will not write any PEBS record beyond the end of PEBS buffer, when PEBS Index equals PEBS Absolute Maximum. No signaling is generated when PEBS buffer is full. Software must reset the PEBS Index field to the beginning of the PEBS buffer address to continue capturing PEBS records. Vol. 3B 18-21 PERFORMANCE MONITORING IA32_DS_AREA MSR DS Buffer Management Area BTS Buffer Base 0H BTS Index 8H BTS Absolute Maximum BTS Interrupt Threshold BTS Buffer Branch Record 0 10H Branch Record 1 18H PEBS Buffer Base 20H PEBS Index PEBS Absolute Maximum PEBS Interrupt Threshold PEBS Counter0 Reset PEBS Counter1 Reset PEBS Counter2 Reset PEBS Counter3 Reset Reserved 28H 30H 38H Branch Record n 40H 48H PEBS Buffer PEBS Record 0 50H 58H PEBS Record 1 60H PEBS Record n Figure 18-16. PEBS Programming Environment • PEBS Interrupt Threshold: This field specifies the threshold value to trigger a performance interrupt and notify software that the PEBS buffer is nearly full. This field is programmed with the linear address of the first byte of the PEBS record within the PEBS buffer that represents the threshold record. After the processor writes a PEBS record and updates PEBS Index, if the PEBS Index reaches the threshold value of this field, the processor will generate a performance interrupt. This is the same interrupt that is generated by a performance counter overflow, as programmed in the Performance Monitoring Counters vector in the Local Vector Table of the Local APIC. When a performance interrupt due to PEBS buffer full is generated, the IA32_PERF_GLOBAL_STATUS.PEBS_Ovf bit will be set. • PEBS CounterX Reset: This field allows software to set up PEBS counter overflow condition to occur at a rate useful for profiling workload, thereby generating multiple PEBS records to facilitate characterizing the profile the execution of test code. After each PEBS record is written, the processor checks each counter to see if it overflowed and was enabled for PEBS (the corresponding bit in IA32_PEBS_ENABLED was set). If these conditions are met, then the reset value for each overflowed counter is loaded from the DS Buffer Management Area. For example, if counter IA32_PMC0 caused a PEBS record to be written, then the value of “PEBS Counter 0 Reset” would be written to counter IA32_PMC0. If a counter is not enabled for PEBS, its value will not be modified by the PEBS assist. Performance Counter Prioritization Performance monitoring interrupts are triggered by a counter transitioning from maximum count to zero (assuming IA32_PerfEvtSelX.INT is set). This same transition will cause PEBS hardware to arm, but not trigger. PEBS hardware triggers upon detection of the first PEBS event after the PEBS hardware has been armed (a 0 to 1 transition of the counter). At this point, a PEBS assist will be undertaken by the processor. 18-22 Vol. 3B PERFORMANCE MONITORING Performance counters (fixed and general-purpose) are prioritized in index order. That is, counter IA32_PMC0 takes precedence over all other counters. Counter IA32_PMC1 takes precedence over counters IA32_PMC2 and IA32_PMC3, and so on. This means that if simultaneous overflows or PEBS assists occur, the appropriate action will be taken for the highest priority performance counter. For example, if IA32_PMC1 cause an overflow interrupt and IA32_PMC2 causes an PEBS assist simultaneously, then the overflow interrupt will be serviced first. The PEBS threshold interrupt is triggered by the PEBS assist, and is by definition prioritized lower than the PEBS assist. Hardware will not generate separate interrupts for each counter that simultaneously overflows. Generalpurpose performance counters are prioritized over fixed counters. If a counter is programmed with a precise (PEBS-enabled) event and programmed to generate a counter overflow interrupt, the PEBS assist is serviced before the counter overflow interrupt is serviced. If in addition the PEBS interrupt threshold is met, the threshold interrupt is generated after the PEBS assist completes, followed by the counter overflow interrupt (two separate interrupts are generated). Uncore counters may be programmed to interrupt one or more processor cores (see Section 18.3.1.2). It is possible for interrupts posted from the uncore facility to occur coincident with counter overflow interrupts from the processor core. Software must check core and uncore status registers to determine the exact origin of counter overflow interrupts. 18.3.1.1.2 Load Latency Performance Monitoring Facility The load latency facility provides software a means to characterize the average load latency to different levels of cache/memory hierarchy. This facility requires processor supporting enhanced PEBS record format in the PEBS buffer, see Table 18-3. This field measures the load latency from load's first dispatch of till final data writeback from the memory subsystem. The latency is reported for retired demand load operations and in core cycles (it accounts for re-dispatches). To use this feature software must assure: • One of the IA32_PERFEVTSELx MSR is programmed to specify the event unit MEM_INST_RETIRED, and the LATENCY_ABOVE_THRESHOLD event mask must be specified (IA32_PerfEvtSelX[15:0] = 100H). The corresponding counter IA32_PMCx will accumulate event counts for architecturally visible loads which exceed the programmed latency threshold specified separately in a MSR. Stores are ignored when this event is programmed. The CMASK or INV fields of the IA32_PerfEvtSelX register used for counting load latency must be 0. Writing other values will result in undefined behavior. • The MSR_PEBS_LD_LAT_THRESHOLD MSR is programmed with the desired latency threshold in core clock cycles. Loads with latencies greater than this value are eligible for counting and latency data reporting. The minimum value that may be programmed in this register is 3 (the minimum detectable load latency is 4 core clock cycles). • The PEBS enable bit in the IA32_PEBS_ENABLE register is set for the corresponding IA32_PMCx counter register. This means that both the PEBS_EN_CTRX and LL_EN_CTRX bits must be set for the counter(s) of interest. For example, to enable load latency on counter IA32_PMC0, the IA32_PEBS_ENABLE register must be programmed with the 64-bit value 00000001_00000001H. When the load-latency facility is enabled, load operations are randomly selected by hardware and tagged to carry information related to data source locality and latency. Latency and data source information of tagged loads are updated internally. When a PEBS assist occurs, the last update of latency and data source information are captured by the assist and written as part of the PEBS record. The PEBS sample after value (SAV), specified in PEBS CounterX Reset, operates orthogonally to the tagging mechanism. Loads are randomly tagged to collect latency data. The SAV controls the number of tagged loads with latency information that will be written into the PEBS record field by the PEBS assists. The load latency data written to the PEBS record will be for the last tagged load operation which retired just before the PEBS assist was invoked. The load-latency information written into a PEBS record (see Table 18-3, bytes AFH:98H) consists of: • • Data Linear Address: This is the linear address of the target of the load operation. Latency Value: This is the elapsed cycles of the tagged load operation between dispatch to GO, measured in processor core clock domain. Vol. 3B 18-23 PERFORMANCE MONITORING • Data Source: The encoded value indicates the origin of the data obtained by the load instruction. The encoding is shown in Table 18-4. In the descriptions local memory refers to system memory physically attached to a processor package, and remote memory referrals to system memory physically attached to another processor package. Table 18-4. Data Source Encoding for Load Latency Record Encoding Description 00H Unknown L3 cache miss. 01H Minimal latency core cache hit. This request was satisfied by the L1 data cache. 02H Pending core cache HIT. Outstanding core cache miss to same cache-line address was already underway. 03H This data request was satisfied by the L2. 04H L3 HIT. Local or Remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping). 05H L3 HIT. Local or Remote home requests that hit the L3 cache and were serviced by another processor core with a cross core snoop where no modified copies were found. (clean). 06H L3 HIT. Local or Remote home requests that hit the L3 cache and were serviced by another processor core with a cross core snoop where no modified copies were found. 07H1 Reserved/LLC Snoop HitM. Local or Remote home requests that hit the last level cache and were serviced by another core with a cross core snoop where modified copies were found. 08H Reserved/L3 MISS. Local homed requests that missed the L3 cache and were serviced by forwarded data following a cross package snoop where no modified copies were found. (Remote home requests are not counted). 09H Reserved 0AH L3 MISS. Local home requests that missed the L3 cache and were serviced by local DRAM (go to shared state). 0BH L3 MISS. Remote home requests that missed the L3 cache and were serviced by remote DRAM (go to shared state). 0CH L3 MISS. Local home requests that missed the L3 cache and were serviced by local DRAM (go to exclusive state). 0DH L3 MISS. Remote home requests that missed the L3 cache and were serviced by remote DRAM (go to exclusive state). 0EH I/O, Request of input/output operation. 0FH The request was to un-cacheable memory. NOTES: 1. Bit 7 is supported only for processors with a CPUID DisplayFamily_DisplayModel signature of 06_2A, and 06_2E; otherwise it is reserved. The layout of MSR_PEBS_LD_LAT_THRESHOLD is shown in Figure 18-17. 63 1615 0 THRHLD - Load latency threshold Reserved RESET Value — 00000000_00000000H Figure 18-17. Layout of MSR_PEBS_LD_LAT MSR Bits 15:0 specifies the threshold load latency in core clock cycles. Performance events with latencies greater than this value are counted in IA32_PMCx and their latency information is reported in the PEBS record. Otherwise, they are ignored. The minimum value that may be programmed in this field is 3. 18-24 Vol. 3B PERFORMANCE MONITORING 18.3.1.1.3 Off-core Response Performance Monitoring in the Processor Core Programming a performance event using the off-core response facility can choose any of the four IA32_PERFEVTSELx MSR with specific event codes and predefine mask bit value. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_0. There is only one off-core response configuration MSR. Table 18-5 lists the event code, mask value and additional off-core configuration MSR that must be programmed to count off-core response events using IA32_PMCx. Table 18-5. Off-Core Response Event Encoding Event code in IA32_PERFEVTSELx Mask Value in IA32_PERFEVTSELx Required Off-core Response MSR B7H 01H MSR_OFFCORE_RSP_0 (address 1A6H) The layout of MSR_OFFCORE_RSP_0 is shown in Figure 18-18. Bits 7:0 specifies the request type of a transaction request to the uncore. Bits 15:8 specifies the response of the uncore subsystem. 63 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 RESPONSE TYPE — NON_DRAM (R/W) RESPONSE TYPE — LOCAL_DRAM (R/W) RESPONSE TYPE — REMOTE_DRAM (R/W) RESPONSE TYPE — REMOTE_CACHE_FWD (R/W) RESPONSE TYPE — RESERVED RESPONSE TYPE — OTHER_CORE_HITM (R/W) RESPONSE TYPE — OTHER_CORE_HIT_SNP (R/W) RESPONSE TYPE — UNCORE_HIT (R/W) REQUEST TYPE — OTHER (R/W) REQUEST TYPE — PF_IFETCH (R/W) REQUEST TYPE — PF_RFO (R/W) REQUEST TYPE — PF_DATA_RD (R/W) REQUEST TYPE — WB (R/W) REQUEST TYPE — DMND_IFETCH (R/W) REQUEST TYPE — DMND_RFO (R/W) REQUEST TYPE — DMND_DATA_RD (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-18. Layout of MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 to Configure Off-core Response Events Table 18-6. MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 Bit Field Definition Bit Name Offset Description DMND_DATA_RD 0 Counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches. DMND_RFO 1 Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO. DMND_IFETCH 2 Counts the number of demand instruction cacheline reads and L1 instruction cacheline prefetches. WB 3 Counts the number of writeback (modified to exclusive) transactions. PF_DATA_RD 4 Counts the number of data cacheline reads generated by L2 prefetchers. PF_RFO 5 Counts the number of RFO requests generated by L2 prefetchers. PF_IFETCH 6 Counts the number of code reads generated by L2 prefetchers. Vol. 3B 18-25 PERFORMANCE MONITORING Table 18-6. MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 Bit Field Definition (Contd.) Bit Name Offset Description OTHER 7 Counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock. UNCORE_HIT 8 L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping). OTHER_CORE_HI T_SNP 9 L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where no modified copies were found (clean). OTHER_CORE_HI TM 10 L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where modified copies were found (HITM). Reserved 11 Reserved REMOTE_CACHE_ 12 FWD L3 Miss: local homed requests that missed the L3 cache and was serviced by forwarded data following a cross package snoop where no modified copies found. (Remote home requests are not counted) REMOTE_DRAM 13 L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM. LOCAL_DRAM 14 L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM. NON_DRAM 15 Non-DRAM requests that were serviced by IOH. 18.3.1.2 Performance Monitoring Facility in the Uncore The “uncore” in Intel microarchitecture code name Nehalem refers to subsystems in the physical processor package that are shared by multiple processor cores. Some of the sub-systems in the uncore include the L3 cache, Intel QuickPath Interconnect link logic, and integrated memory controller. The performance monitoring facilities inside the uncore operates in the same clock domain as the uncore (U-clock domain), which is usually different from the processor core clock domain. The uncore performance monitoring facilities described in this section apply to Intel Xeon processor 5500 series and processors with the following CPUID signatures: 06_1AH, 06_1EH, 06_1FH (see Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4). An overview of the uncore performance monitoring facilities is described separately. The performance monitoring facilities available in the U-clock domain consist of: • Eight General-purpose counters (MSR_UNCORE_PerfCntr0 through MSR_UNCORE_PerfCntr7). The counters are 48 bits wide. Each counter is associated with a configuration MSR, MSR_UNCORE_PerfEvtSelx, to specify event code, event mask and other event qualification fields. A set of global uncore performance counter enabling/overflow/status control MSRs are also provided for software. • Performance monitoring in the uncore provides an address/opcode match MSR that provides event qualification control based on address value or QPI command opcode. • One fixed-function counter, MSR_UNCORE_FixedCntr0. The fixed-function uncore counter increments at the rate of the U-clock when enabled. The frequency of the uncore clock domain can be determined from the uncore clock ratio which is available in the PCI configuration space register at offset C0H under device number 0 and Function 0. 18.3.1.2.1 Uncore Performance Monitoring Management Facility MSR_UNCORE_PERF_GLOBAL_CTRL provides bit fields to enable/disable general-purpose and fixed-function counters in the uncore. Figure 18-19 shows the layout of MSR_UNCORE_PERF_GLOBAL_CTRL for an uncore that is shared by four processor cores in a physical package. • EN_PCn (bit n, n = 0, 7): When set, enables counting for the general-purpose uncore counter MSR_UNCORE_PerfCntr n. • • EN_FC0 (bit 32): When set, enables counting for the fixed-function uncore counter MSR_UNCORE_FixedCntr0. EN_PMI_COREn (bit n, n = 0, 3 if four cores are present): When set, processor core n is programmed to receive an interrupt signal from any interrupt enabled uncore counter. PMI delivery due to an uncore counter overflow is enabled by setting IA32_DEBUGCTL.Offcore_PMI_EN to 1. 18-26 Vol. 3B PERFORMANCE MONITORING • PMI_FRZ (bit 63): When set, all U-clock uncore counters are disabled when any one of them signals a performance interrupt. Software must explicitly re-enable the counter by setting the enable bits in MSR_UNCORE_PERF_GLOBAL_CTRL upon exit from the ISR. 63 62 51 50 49 48 32 31 8 7 6 5 43 2 1 0 PMI_FRZ (R/W) EN_PMI_CORE3 (R/W) EN_PMI_CORE2 (R/W) EN_PMI_CORE1 (R/W) EN_PMI_CORE0 (R/W) EN_FC0 (R/W) EN_PC7 (R/W) EN_PC6 (R/W) EN_PC5 (R/W) EN_PC4 (R/W) EN_PC3 (R/W) EN_PC2 (R/W) EN_PC1 (R/W) EN_PC0 (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-19. Layout of MSR_UNCORE_PERF_GLOBAL_CTRL MSR MSR_UNCORE_PERF_GLOBAL_STATUS provides overflow status of the U-clock performance counters in the uncore. This is a read-only register. If an overflow status bit is set the corresponding counter has overflowed. The register provides a condition change bit (bit 63) which can be quickly checked by software to determine if a significant change has occurred since the last time the condition change status was cleared. Figure 18-20 shows the layout of MSR_UNCORE_PERF_GLOBAL_STATUS. • OVF_PCn (bit n, n = 0, 7): When set, indicates general-purpose uncore counter MSR_UNCORE_PerfCntr n has overflowed. • OVF_FC0 (bit 32): When set, indicates the fixed-function uncore counter MSR_UNCORE_FixedCntr0 has overflowed. • • OVF_PMI (bit 61): When set indicates that an uncore counter overflowed and generated an interrupt request. CHG (bit 63): When set indicates that at least one status bit in MSR_UNCORE_PERF_GLOBAL_STATUS register has changed state. Vol. 3B 18-27 PERFORMANCE MONITORING MSR_UNCORE_PERF_GLOBAL_OVF_CTRL allows software to clear the status bits in the UNCORE_PERF_GLOBAL_STATUS register. This is a write-only register, and individual status bits in the global status register are cleared by writing a binary one to the corresponding bit in this register. Writing zero to any bit position in this register has no effect on the uncore PMU hardware. 63 62 61 60 32 31 8 7 6 5 43 2 1 0 CHG (R/W) OVF_PMI (R/W) OVF_FC0 (R/O) OVF_PC7 (R/O) OVF_PC6 (R/O) OVF_PC5 (R/O) OVF_PC4 (R/O) OVF_PC3 (R/O) OVF_PC2 (R/O) OVF_PC1 (R/O) OVF_PC0 (R/O) Reserved RESET Value — 00000000_00000000H Figure 18-20. Layout of MSR_UNCORE_PERF_GLOBAL_STATUS MSR Figure 18-21 shows the layout of MSR_UNCORE_PERF_GLOBAL_OVF_CTRL. 63 62 61 60 32 31 8 7 6 5 43 2 1 0 CLR_CHG (WO1) CLR_OVF_PMI (WO1) CLR_OVF_FC0 (WO1) CLR_OVF_PC7 (WO1) CLR_OVF_PC6 (WO1) CLR_OVF_PC5 (WO1) CLR_OVF_PC4 (WO1) CLR_OVF_PC3 (WO1) CLR_OVF_PC2 (WO1) CLR_OVF_PC1 (WO1) CLR_OVF_PC0 (WO1) Reserved RESET Value — 00000000_00000000H Figure 18-21. Layout of MSR_UNCORE_PERF_GLOBAL_OVF_CTRL MSR • CLR_OVF_PCn (bit n, n = 0, 7): Set this bit to clear the overflow status for general-purpose uncore counter MSR_UNCORE_PerfCntr n. Writing a value other than 1 is ignored. • CLR_OVF_FC0 (bit 32): Set this bit to clear the overflow status for the fixed-function uncore counter MSR_UNCORE_FixedCntr0. Writing a value other than 1 is ignored. • CLR_OVF_PMI (bit 61): Set this bit to clear the OVF_PMI flag in MSR_UNCORE_PERF_GLOBAL_STATUS. Writing a value other than 1 is ignored. • CLR_CHG (bit 63): Set this bit to clear the CHG flag in MSR_UNCORE_PERF_GLOBAL_STATUS register. Writing a value other than 1 is ignored. 18-28 Vol. 3B PERFORMANCE MONITORING 18.3.1.2.2 Uncore Performance Event Configuration Facility MSR_UNCORE_PerfEvtSel0 through MSR_UNCORE_PerfEvtSel7 are used to select performance event and configure the counting behavior of the respective uncore performance counter. Each uncore PerfEvtSel MSR is paired with an uncore performance counter. Each uncore counter must be locally configured using the corresponding MSR_UNCORE_PerfEvtSelx and counting must be enabled using the respective EN_PCx bit in MSR_UNCORE_PERF_GLOBAL_CTRL. Figure 18-22 shows the layout of MSR_UNCORE_PERFEVTSELx. 63 31 24 23 22 21 20 19 18 17 16 15 Counter Mask (CMASK) Unit Mask (UMASK) INV—Invert counter mask EN—Enable counters PMI—Enable PMI on overflow E—Edge detect OCC_CTR_RST—Rest Queue Occ 0 8 7 Event Select Reserved RESET Value — 00000000_00000000H Figure 18-22. Layout of MSR_UNCORE_PERFEVTSELx MSRs • • • Event Select (bits 7:0): Selects the event logic unit used to detect uncore events. • Edge Detect (bit 18): When set causes the counter to increment when a deasserted to asserted transition occurs for the conditions that can be expressed by any of the fields in this register. • PMI (bit 20): When set, the uncore will generate an interrupt request when this counter overflowed. This request will be routed to the logical processors as enabled in the PMI enable bits (EN_PMI_COREx) in the register MSR_UNCORE_PERF_GLOBAL_CTRL. • EN (bit 22): When clear, this counter is locally disabled. When set, this counter is locally enabled and counting starts when the corresponding EN_PCx bit in MSR_UNCORE_PERF_GLOBAL_CTRL is set. • INV (bit 23): When clear, the Counter Mask field is interpreted as greater than or equal to. When set, the Counter Mask field is interpreted as less than. • Counter Mask (bits 31:24): When this field is clear, it has no effect on counting. When set to a value other than zero, the logical processor compares this field to the event counts on each core clock cycle. If INV is clear and the event counts are greater than or equal to this field, the counter is incremented by one. If INV is set and the event counts are less than this field, the counter is incremented by one. Otherwise the counter is not incremented. Unit Mask (bits 15:8) : Condition qualifiers for the event selection logic specified in the Event Select field. OCC_CTR_RST (bit17): When set causes the queue occupancy counter associated with this event to be cleared (zeroed). Writing a zero to this bit will be ignored. It will always read as a zero. Figure 18-23 shows the layout of MSR_UNCORE_FIXED_CTR_CTRL. 63 8 7 6 5 43 2 1 0 PMI - Generate PMI on overflow EN - Enable Reserved RESET Value — 00000000_00000000H Figure 18-23. Layout of MSR_UNCORE_FIXED_CTR_CTRL MSR Vol. 3B 18-29 PERFORMANCE MONITORING • EN (bit 0): When clear, the uncore fixed-function counter is locally disabled. When set, it is locally enabled and counting starts when the EN_FC0 bit in MSR_UNCORE_PERF_GLOBAL_CTRL is set. • PMI (bit 2): When set, the uncore will generate an interrupt request when the uncore fixed-function counter overflowed. This request will be routed to the logical processors as enabled in the PMI enable bits (EN_PMI_COREx) in the register MSR_UNCORE_PERF_GLOBAL_CTRL. Both the general-purpose counters (MSR_UNCORE_PerfCntr) and the fixed-function counter (MSR_UNCORE_FixedCntr0) are 48 bits wide. They support both counting and interrupt based sampling usages. The event logic unit can filter event counts to specific regions of code or transaction types incoming to the home node logic. 18.3.1.2.3 Uncore Address/Opcode Match MSR The Event Select field [7:0] of MSR_UNCORE_PERFEVTSELx is used to select different uncore event logic unit. When the event “ADDR_OPCODE_MATCH” is selected in the Event Select field, software can filter uncore performance events according to transaction address and certain transaction responses. The address filter and transaction response filtering requires the use of MSR_UNCORE_ADDR_OPCODE_MATCH register. The layout is shown in Figure 18-24. 63 60 48 47 40 39 Opcode 3 2 0 ADDR MatchSel—Select addr/Opcode Opcode—Opcode and Message ADDR—Bits 39:4 of physical address Reserved RESET Value — 00000000_00000000H Figure 18-24. Layout of MSR_UNCORE_ADDR_OPCODE_MATCH MSR • Addr (bits 39:3): The physical address to match if “MatchSel“ field is set to select address match. The uncore performance counter will increment if the lowest 40-bit incoming physical address (excluding bits 2:0) for a transaction request matches bits 39:3. • Opcode (bits 47:40) : Bits 47:40 allow software to filter uncore transactions based on QPI link message class/packed header opcode. These bits are consists two sub-fields: — Bits 43:40 specify the QPI packet header opcode. — Bits 47:44 specify the QPI message classes. Table 18-7 lists the encodings supported in the opcode field. 18-30 Vol. 3B PERFORMANCE MONITORING Table 18-7. Opcode Field Encoding for MSR_UNCORE_ADDR_OPCODE_MATCH Opcode [43:40] QPI Message Class Home Request Snoop Response Data Response [47:44] = 0000B [47:44] = 0001B [47:44] = 1110B 1 DMND_IFETCH 2 2 WB 3 3 PF_DATA_RD 4 4 PF_RFO 5 5 PF_IFETCH 6 6 OTHER 7 7 NON_DRAM 15 15 • MatchSel (bits 63:61): Software specifies the match criteria according to the following encoding: — 000B: Disable addr_opcode match hardware. — 100B: Count if only the address field matches. — 010B: Count if only the opcode field matches. — 110B: Count if either opcode field matches or the address field matches. — 001B: Count only if both opcode and address field match. — Other encoding are reserved. 18.3.1.3 Intel® Xeon® Processor 7500 Series Performance Monitoring Facility The performance monitoring facility in the processor core of Intel® Xeon® processor 7500 series are the same as those supported in Intel Xeon processor 5500 series. The uncore subsystem in Intel Xeon processor 7500 series are significantly different The uncore performance monitoring facility consist of many distributed units associated with individual logic control units (referred to as boxes) within the uncore subsystem. A high level block diagram of the various box units of the uncore is shown in Figure 18-25. Uncore PMUs are programmed via MSR interfaces. Each of the distributed uncore PMU units have several generalpurpose counters. Each counter requires an associated event select MSR, and may require additional MSRs to configure sub-event conditions. The uncore PMU MSRs associated with each box can be categorized based on its functional scope: per-counter, per-box, or global across the uncore. The number counters available in each box type are different. Each box generally provides a set of MSRs to enable/disable, check status/overflow of multiple counters within each box. Vol. 3B 18-31 PERFORMANCE MONITORING L3 Cache CBox CBox CBox CBox CBox SBox CBox CBox CBox SBox SMI Channels PBox MBox BBox RBox BBox MBox PBox SMI Channels WBox PBox PBox PBox PBox UBox 4 Intel QPI Links Figure 18-25. Distributed Units of the Uncore of Intel® Xeon® Processor 7500 Series Table 18-8 summarizes the number MSRs for uncore PMU for each box. Table 18-8. Uncore PMU MSR Summary Box # of Boxes Counters per Box Counter Width General Purpose Global Enable Sub-control MSRs C-Box 8 6 48 Yes per-box None S-Box 2 4 48 Yes per-box Match/Mask B-Box 2 4 48 Yes per-box Match/Mask M-Box 2 6 48 Yes per-box Yes R-Box 1 16 ( 2 port, 8 per port) 48 Yes per-box Yes W-Box 1 4 48 Yes per-box None 1 48 No per-box None 1 48 Yes uncore None U-Box 1 The W-Box provides 4 general-purpose counters, each requiring an event select configuration MSR, similar to the general-purpose counters in other boxes. There is also a fixed-function counter that increments clockticks in the uncore clock domain. For C,S,B,M,R, and W boxes, each box provides an MSR to enable/disable counting, configuring PMI of multiple counters within the same box, this is somewhat similar the “global control“ programming interface, IA32_PERF_GLOBAL_CTRL, offered in the core PMU. Similarly status information and counter overflow control for multiple counters within the same box are also provided in C,S,B,M,R, and W boxes. In the U-Box, MSR_U_PMON_GLOBAL_CTL provides overall uncore PMU enable/disable and PMI configuration control. The scope of status information in the U-box is at per-box granularity, in contrast to the per-box status information MSR (in the C,S,B,M,R, and W boxes) providing status information of individual counter overflow. The difference in scope also apply to the overflow control MSR in the U-Box versus those in the other Boxes. 18-32 Vol. 3B PERFORMANCE MONITORING The individual MSRs that provide uncore PMU interfaces are listed in Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, Table 2-17 under the general naming style of MSR_%box#%_PMON_%scope_function%, where %box#% designates the type of box and zerobased index if there are more the one box of the same type, %scope_function% follows the examples below: • Multi-counter enabling MSRs: MSR_U_PMON_GLOBAL_CTL, MSR_S0_PMON_BOX_CTL, MSR_C7_PMON_BOX_CTL, etc. • Multi-counter status MSRs: MSR_U_PMON_GLOBAL_STATUS, MSR_S0_PMON_BOX_STATUS, MSR_C7_PMON_BOX_STATUS, etc. • Multi-counter overflow control MSRs: MSR_U_PMON_GLOBAL_OVF_CTL, MSR_S0_PMON_BOX_OVF_CTL, MSR_C7_PMON_BOX_OVF_CTL, etc. • Performance counters MSRs: the scope is implicitly per counter, e.g. MSR_U_PMON_CTR, MSR_S0_PMON_CTR0, MSR_C7_PMON_CTR5, etc. • Event select MSRs: the scope is implicitly per counter, e.g. MSR_U_PMON_EVNT_SEL, MSR_S0_PMON_EVNT_SEL0, MSR_C7_PMON_EVNT_SEL5, etc • Sub-control MSRs: the scope is implicitly per-box granularity, e.g. MSR_M0_PMON_TIMESTAMP, MSR_R0_PMON_IPERF0_P1, MSR_S1_PMON_MATCH. Details of uncore PMU MSR bit field definitions can be found in a separate document “Intel Xeon Processor 7500 Series Uncore Performance Monitoring Guide“. 18.3.2 Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Westmere All of the performance monitoring programming interfaces (architectural and non-architectural core PMU facilities, and uncore PMU) described in Section 18.6.3 also apply to processors based on Intel® microarchitecture code name Westmere. Table 18-5 describes a non-architectural performance monitoring event (event code 0B7H) and associated MSR_OFFCORE_RSP_0 (address 1A6H) in the core PMU. This event and a second functionally equivalent offcore response event using event code 0BBH and MSR_OFFCORE_RSP_1 (address 1A7H) are supported in processors based on Intel microarchitecture code name Westmere. The event code and event mask definitions of Non-architectural performance monitoring events are listed in Table 19-31. The load latency facility is the same as described in Section 18.3.1.1.2, but added enhancement to provide more information in the data source encoding field of each load latency record. The additional information relates to STLB_MISS and LOCK, see Table 18-13. 18.3.3 Intel® Xeon® Processor E7 Family Performance Monitoring Facility The performance monitoring facility in the processor core of the Intel® Xeon® processor E7 family is the same as those supported in the Intel Xeon processor 5600 series3. The uncore subsystem in the Intel Xeon processor E7 family is similar to those of the Intel Xeon processor 7500 series. The high level construction of the uncore subsystem is similar to that shown in Figure 18-25, with the additional capability that up to 10 C-Box units are supported. 3. Exceptions are indicated for event code 0FH in Table 19-23; and valid bits of data source encoding field of each load latency record is limited to bits 5:4 of Table 18-13. Vol. 3B 18-33 PERFORMANCE MONITORING Table 18-9 summarizes the number MSRs for uncore PMU for each box. Table 18-9. Uncore PMU MSR Summary for Intel® Xeon® Processor E7 Family Box # of Boxes Counters per Box Counter Width General Purpose Global Enable Sub-control MSRs C-Box 10 6 48 Yes per-box None S-Box 2 4 48 Yes per-box Match/Mask B-Box 2 4 48 Yes per-box Match/Mask M-Box 2 6 48 Yes per-box Yes R-Box 1 16 ( 2 port, 8 per port) 48 Yes per-box Yes W-Box 1 4 48 Yes per-box None 1 48 No per-box None 1 48 Yes uncore None U-Box 1 Details of the uncore performance monitoring facility of Intel Xeon Processor E7 family is available in the “Intel® Xeon® Processor E7 Uncore Performance Monitoring Programming Reference Manual”. 18.3.4 Performance Monitoring for Processors Based on Intel® Microarchitecture Code Name Sandy Bridge Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series, and Intel® Xeon® processor E3-1200 family are based on Intel microarchitecture code name Sandy Bridge; this section describes the performance monitoring facilities provided in the processor core. The core PMU supports architectural performance monitoring capability with version ID 3 (see Section 18.2.3) and a host of non-architectural monitoring capabilities. Architectural performance monitoring version 3 capabilities are described in Section 18.2.3. The core PMU’s capability is similar to those described in Section 18.3.1.1 and Section 18.6.3, with some differences and enhancements relative to Intel microarchitecture code name Westmere summarized in Table 18-10. Table 18-10. Core PMU Comparison Intel® microarchitecture code name Sandy Bridge Intel® microarchitecture code name Westmere # of Fixed counters per thread 3 3 Use CPUID to determine # of counters. See Section 18.2.1. # of general-purpose counters per core 8 8 Use CPUID to determine # of counters. See Section 18.2.1. Counter width (R,W) R:48, W: 32/48 R:48, W:32 See Section 18.2.2. # of programmable counters per thread 4 or (8 if a core not shared by two threads) 4 Use CPUID to determine # of counters. See Section 18.2.1. PMI Overhead Mitigation • Freeze_Perfmon_on_PMI with legacy semantics. • Freeze_on_LBR with legacy semantics for branch profiling. • Freeze_while_SMM. • Freeze_Perfmon_on_PMI with legacy semantics. • Freeze_on_LBR with legacy semantics for branch profiling. • Freeze_while_SMM. See Section 17.4.7. Processor Event Based Sampling (PEBS) Events See Table 18-12. See Table 18-78. IA32_PMC4-IA32_PMC7 do not support PEBS. Box 18-34 Vol. 3B Comment PERFORMANCE MONITORING Table 18-10. Core PMU Comparison (Contd.) Box Intel® microarchitecture code name Sandy Bridge Intel® microarchitecture code name Westmere PEBS-Load Latency See Section 18.3.4.4.2; Data source encoding Comment • Data source encoding • STLB miss encoding • Lock transaction encoding PEBS-Precise Store Section 18.3.4.4.3 No PEBS-PDIR Yes (using precise INST_RETIRED.ALL). No Off-core Response Event MSR 1A6H and 1A7H, extended request and response types. MSR 1A6H and 1A7H, limited response types. 18.3.4.1 Nehalem supports 1A6H only. Global Counter Control Facilities In Intel® Microarchitecture Code Name Sandy Bridge The number of general-purpose performance counters visible to a logical processor can vary across Processors based on Intel microarchitecture code name Sandy Bridge. Software must use CPUID to determine the number performance counters/event select registers (See Section 18.2.1.1). 63 35 34 33 32 31 8 7 6 5 4 3 2 10 FIXED_CTR2 enable FIXED_CTR1 enable FIXED_CTR0 enable PMC7_EN (if PMC7 present) PMC6_EN (if PMC6 present) PMC5_EN (if PMC5 present) PMC4_EN (if PMC4 present) PMC3_EN PMC2_EN PMC1_EN PMC0_EN Reserved Valid if CPUID.0AH:EAX[15:8] = 8, else reserved. Figure 18-26. IA32_PERF_GLOBAL_CTRL MSR in Intel® Microarchitecture Code Name Sandy Bridge Figure 18-42 depicts the layout of IA32_PERF_GLOBAL_CTRL MSR. The enable bits (PMC4_EN, PMC5_EN, PMC6_EN, PMC7_EN) corresponding to IA32_PMC4-IA32_PMC7 are valid only if CPUID.0AH:EAX[15:8] reports a value of ‘8’. If CPUID.0AH:EAX[15:8] = 4, attempts to set the invalid bits will cause #GP. Each enable bit in IA32_PERF_GLOBAL_CTRL is AND’ed with the enable bits for all privilege levels in the respective IA32_PERFEVTSELx or IA32_PERF_FIXED_CTR_CTRL MSRs to start/stop the counting of respective counters. Counting is enabled if the AND’ed results is true; counting is disabled when the result is false. IA32_PERF_GLOBAL_STATUS MSR provides single-bit status used by software to query the overflow condition of each performance counter. IA32_PERF_GLOBAL_STATUS[bit 62] indicates overflow conditions of the DS area data buffer (see Figure 18-27). A value of 1 in each bit of the PMCx_OVF field indicates an overflow condition has occurred in the associated counter. Vol. 3B 18-35 PERFORMANCE MONITORING 63 62 61 35 34 33 32 31 8 7 6 5 4 3 2 1 0 CondChgd Ovf_DSBuffer Ovf_UncorePMU FIXED_CTR2 Overflow (RO) FIXED_CTR1 Overflow (RO) FIXED_CTR0 Overflow (RO) PMC7_OVF (RO, If PMC7 present) PMC6_OVF (RO, If PMC6 present) PMC5_OVF (RO, If PMC5 present) PMC4_OVF (RO, If PMC4 present) PMC3_OVF (RO) PMC2_OVF (RO) PMC1_OVF (RO) PMC0_OVF (RO) Valid if CPUID.0AH:EAX[15:8] = 8; else reserved Reserved Figure 18-27. IA32_PERF_GLOBAL_STATUS MSR in Intel® Microarchitecture Code Name Sandy Bridge When a performance counter is configured for PEBS, an overflow condition in the counter will arm PEBS. On the subsequent event following overflow, the processor will generate a PEBS event. On a PEBS event, the processor will perform bounds checks based on the parameters defined in the DS Save Area (see Section 17.4.9). Upon successful bounds checks, the processor will store the data record in the defined buffer area, clear the counter overflow status, and reload the counter. If the bounds checks fail, the PEBS will be skipped entirely. In the event that the PEBS buffer fills up, the processor will set the OvfBuffer bit in MSR_PERF_GLOBAL_STATUS. IA32_PERF_GLOBAL_OVF_CTL MSR allows software to clear overflow the indicators for general-purpose or fixedfunction counters via a single WRMSR (see Figure 18-28). Clear overflow indications when: • • • Setting up new values in the event select and/or UMASK field for counting or interrupt based sampling. Reloading counter values to continue sampling. Disabling event counting or interrupt based sampling. 63 62 35 34 33 32 31 8 7 6 5 4 3 2 1 0 ClrCondChgd ClrOvfDSBuffer ClrOvfUncore FIXED_CTR2 ClrOverflow FIXED_CTR1 ClrOverflow FIXED_CTR0 ClrOverflow PMC7_ClrOvf (if PMC7 present) PMC6_ClrOvf (if PMC6 present) PMC5_ClrOvf (if PMC5 present) PMC4_ClrOvf (if PMC4 present) PMC3_ClrOvf PMC2_ClrOvf PMC1_ClrOvf PMC0_ClrOvf Reserved Valid if CPUID.0AH:EAX[15:8] = 8; else reserved Figure 18-28. IA32_PERF_GLOBAL_OVF_CTRL MSR in Intel microarchitecture code name Sandy Bridge 18-36 Vol. 3B PERFORMANCE MONITORING 18.3.4.2 Counter Coalescence In processors based on Intel microarchitecture code name Sandy Bridge, each processor core implements eight general-purpose counters. CPUID.0AH:EAX[15:8] will report the number of counters visible to software. If a processor core is shared by two logical processors, each logical processors can access up to four counters (IA32_PMC0-IA32_PMC3). This is the same as in the prior generation for processors based on Intel microarchitecture code name Nehalem. If a processor core is not shared by two logical processors, up to eight general-purpose counters are visible. If CPUID.0AH:EAX[15:8] reports 8 counters, then IA32_PMC4-IA32_PMC7 would occupy MSR addresses 0C5H through 0C8H. Each counter is accompanied by an event select MSR (IA32_PERFEVTSEL4-IA32_PERFEVTSEL7). If CPUID.0AH:EAX[15:8] report 4, access to IA32_PMC4-IA32_PMC7, IA32_PMC4-IA32_PMC7 will cause #GP. Writing 1’s to bit position 7:4 of IA32_PERF_GLOBAL_CTRL, IA32_PERF_GLOBAL_STATUS, or IA32_PERF_GLOBAL_OVF_CTL will also cause #GP. 18.3.4.3 Full Width Writes to Performance Counters Processors based on Intel microarchitecture code name Sandy Bridge support full-width writes to the generalpurpose counters, IA32_PMCx. Support of full-width writes are enumerated by IA32_PERF_CAPABILITIES.FW_WRITES[13] (see Section 18.2.4). The default behavior of IA32_PMCx is unchanged, i.e. WRMSR to IA32_PMCx results in a sign-extended 32-bit value of the input EAX written into IA32_PMCx. Full-width writes must issue WRMSR to a dedicated alias MSR address for each IA32_PMCx. Software must check the presence of full-width write capability and the presence of the alias address IA32_A_PMCx by testing IA32_PERF_CAPABILITIES[13]. 18.3.4.4 PEBS Support in Intel® Microarchitecture Code Name Sandy Bridge Processors based on Intel microarchitecture code name Sandy Bridge support PEBS, similar to those offered in prior generation, with several enhanced features. The key components and differences of PEBS facility relative to Intel microarchitecture code name Westmere is summarized in Table 18-11. Table 18-11. PEBS Facility Comparison Box Intel® microarchitecture code name Sandy Bridge Intel® microarchitecture code name Westmere Comment Valid IA32_PMCx PMC0-PMC3 PMC0-PMC3 No PEBS on PMC4-PMC7. PEBS Buffer Programming Section 18.3.1.1.1 Section 18.3.1.1.1 Unchanged IA32_PEBS_ENABLE Layout Figure 18-29 Figure 18-15 PEBS record layout Physical Layout same as Table 18-3. Table 18-3 Enhanced fields at offsets 98H, A0H, A8H. PEBS Events See Table 18-12. See Table 18-78. IA32_PMC4-IA32_PMC7 do not support PEBS. PEBS-Load Latency See Table 18-13. Table 18-4 PEBS-Precise Store Yes; see Section 18.3.4.4.3. No IA32_PMC3 only PEBS-PDIR Yes No IA32_PMC1 only PEBS skid from EventingIP 1 (or 2 if micro+macro fusion) 1 SAMPLING Restriction Small SAV(CountDown) value incur higher overhead than prior generation. Only IA32_PMC0 through IA32_PMC3 support PEBS. Vol. 3B 18-37 PERFORMANCE MONITORING NOTE PEBS events are only valid when the following fields of IA32_PERFEVTSELx are all zero: AnyThread, Edge, Invert, CMask. In a PMU with PDIR capability, PEBS behavior is unpredictable if IA32_PERFEVTSELx or IA32_PMCx is changed for a PEBS-enabled counter while an event is being counted. To avoid this, changes to the programming or value of a PEBS-enabled counter should be performed when the counter is disabled. In IA32_PEBS_ENABLE MSR, bit 63 is defined as PS_ENABLE: When set, this enables IA32_PMC3 to capture precise store information. Only IA32_PMC3 supports the precise store facility. In typical usage of PEBS, the bit fields in IA32_PEBS_ENABLE are written to when the agent software starts PEBS operation; the enabled bit fields should be modified only when re-programming another PEBS event or cleared when the agent uses the performance counters for non-PEBS operations. 36 3534 33 32 31 63 62 8 7 6 5 43 2 1 0 PS_EN (R/W) LL_EN_PMC3 (R/W) LL_EN_PMC2 (R/W) LL_EN_PMC1 (R/W) LL_EN_PMC0 (R/W) PEBS_EN_PMC3 (R/W) PEBS_EN_PMC2 (R/W) PEBS_EN_PMC1 (R/W) PEBS_EN_PMC0 (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-29. Layout of IA32_PEBS_ENABLE MSR 18.3.4.4.1 PEBS Record Format The layout of PEBS records physically identical to those shown in Table 18-3, but the fields at offset 98H, A0H and A8H have been enhanced to support additional PEBS capabilities. • Load/Store Data Linear Address (Offset 98H): This field will contain the linear address of the source of the load, or linear address of the destination of the store. • Data Source /Store Status (Offset A0H): When load latency is enabled, this field will contain three piece of information (including an encoded value indicating the source which satisfied the load operation). The source field encodings are detailed in Table 18-4. When precise store is enabled, this field will contain information indicating the status of the store, as detailed in Table 19. • Latency Value/0 (Offset A8H): When load latency is enabled, this field contains the latency in cycles to service the load. This field is not meaningful when precise store is enabled and will be written to zero in that case. Upon writing the PEBS record, microcode clears the overflow status bits in the IA32_PERF_GLOBAL_STATUS corresponding to those counters that both overflowed and were enabled in the IA32_PEBS_ENABLE register. The status bits of other counters remain unaffected. The number PEBS events has expanded. The list of PEBS events supported in Intel microarchitecture code name Sandy Bridge is shown in Table 18-12. 18-38 Vol. 3B PERFORMANCE MONITORING Table 18-12. PEBS Performance Events for Intel® Microarchitecture Code Name Sandy Bridge Event Name Event Select Sub-event UMask INST_RETIRED C0H PREC_DIST 01H1 UOPS_RETIRED C2H BR_INST_RETIRED BR_MISP_RETIRED MEM_UOPS_RETIRED MEM_LOAD_UOPS_RETIRED MEM_LOAD_UOPS_LLC_HIT_RETIRED C4H C5H D0H D1H D2H All 01H Retire_Slots 02H Conditional 01H Near_Call 02H All_branches 04H Near_Return 08H Near_Taken 20H Conditional 01H Near_Call 02H All_branches 04H Not_Taken 10H Taken 20H STLB_MISS_LOADS 11H STLB_MISS_STORE 12H LOCK_LOADS 21H SPLIT_LOADS 41H SPLIT_STORES 42H ALL_LOADS 81H ALL_STORES 82H L1_Hit 01H L2_Hit 02H L3_Hit 04H Hit_LFB 40H XSNP_Miss 01H XSNP_Hit 02H XSNP_Hitm 04H XSNP_None 08H NOTES: 1. Only available on IA32_PMC1. 18.3.4.4.2 Load Latency Performance Monitoring Facility The load latency facility in Intel microarchitecture code name Sandy Bridge is similar to that in prior microarchitecture. It provides software a means to characterize the average load latency to different levels of cache/memory hierarchy. This facility requires processor supporting enhanced PEBS record format in the PEBS buffer, see Table 18-3 and Section 18.3.4.4.1. This field measures the load latency from load's first dispatch of till final data writeback from the memory subsystem. The latency is reported for retired demand load operations and in core cycles (it accounts for re-dispatches). To use this feature software must assure: • One of the IA32_PERFEVTSELx MSR is programmed to specify the event unit MEM_TRANS_RETIRED, and the LATENCY_ABOVE_THRESHOLD event mask must be specified (IA32_PerfEvtSelX[15:0] = 1CDH). The corresponding counter IA32_PMCx will accumulate event counts for architecturally visible loads which exceed the programmed latency threshold specified separately in a MSR. Stores are ignored when this event is Vol. 3B 18-39 PERFORMANCE MONITORING programmed. The CMASK or INV fields of the IA32_PerfEvtSelX register used for counting load latency must be 0. Writing other values will result in undefined behavior. • The MSR_PEBS_LD_LAT_THRESHOLD MSR is programmed with the desired latency threshold in core clock cycles. Loads with latencies greater than this value are eligible for counting and latency data reporting. The minimum value that may be programmed in this register is 3 (the minimum detectable load latency is 4 core clock cycles). • The PEBS enable bit in the IA32_PEBS_ENABLE register is set for the corresponding IA32_PMCx counter register. This means that both the PEBS_EN_CTRX and LL_EN_CTRX bits must be set for the counter(s) of interest. For example, to enable load latency on counter IA32_PMC0, the IA32_PEBS_ENABLE register must be programmed with the 64-bit value 00000001.00000001H. • When Load latency event is enabled, no other PEBS event can be configured with other counters. When the load-latency facility is enabled, load operations are randomly selected by hardware and tagged to carry information related to data source locality and latency. Latency and data source information of tagged loads are updated internally. The MEM_TRANS_RETIRED event for load latency counts only tagged retired loads. If a load is cancelled it will not be counted and the internal state of the load latency facility will not be updated. In this case the hardware will tag the next available load. When a PEBS assist occurs, the last update of latency and data source information are captured by the assist and written as part of the PEBS record. The PEBS sample after value (SAV), specified in PEBS CounterX Reset, operates orthogonally to the tagging mechanism. Loads are randomly tagged to collect latency data. The SAV controls the number of tagged loads with latency information that will be written into the PEBS record field by the PEBS assists. The load latency data written to the PEBS record will be for the last tagged load operation which retired just before the PEBS assist was invoked. The physical layout of the PEBS records is the same as shown in Table 18-3. The specificity of Data Source entry at offset A0H has been enhanced to report three pieces of information. Table 18-13. Layout of Data Source Field of Load Latency Record Field Position Description Source 3:0 See Table 18-4 STLB_MISS 4 0: The load did not miss the STLB (hit the DTLB or STLB). 1: The load missed the STLB. Lock 5 Reserved 63:6 0: The load was not part of a locked transaction. 1: The load was part of a locked transaction. Reserved The layout of MSR_PEBS_LD_LAT_THRESHOLD is the same as shown in Figure 18-17. 18.3.4.4.3 Precise Store Facility Processors based on Intel microarchitecture code name Sandy Bridge offer a precise store capability that complements the load latency facility. It provides a means to profile store memory references in the system. Precise stores leverage the PEBS facility and provide additional information about sampled stores. Having precise memory reference events with linear address information for both loads and stores can help programmers improve data structure layout, eliminate remote node references, and identify cache-line conflicts in NUMA systems. Only IA32_PMC3 can be used to capture precise store information. After enabling this facility, counter overflows will initiate the generation of PEBS records as previously described in PEBS. Upon counter overflow hardware captures the linear address and other status information of the next store that retires. This information is then written to the PEBS record. To enable the precise store facility, software must complete the following steps. Please note that the precise store facility relies on the PEBS facility, so the PEBS configuration requirements must be completed before attempting to capture precise store information. • Complete the PEBS configuration steps. 18-40 Vol. 3B PERFORMANCE MONITORING • Program the MEM_TRANS_RETIRED.PRECISE_STORE event in IA32_PERFEVTSEL3. Only counter 3 (IA32_PMC3) supports collection of precise store information. • Set IA32_PEBS_ENABLE[3] and IA32_PEBS_ENABLE[63]. This enables IA32_PMC3 as a PEBS counter and enables the precise store facility, respectively. The precise store information written into a PEBS record affects entries at offset 98H, A0H and A8H of Table 18-3. The specificity of Data Source entry at offset A0H has been enhanced to report three piece of information. Table 18-14. Layout of Precise Store Information In PEBS Record Field Offset Description Store Data 98H Linear Address The linear address of the destination of the store. Store Status L1D Hit (Bit 0): The store hit the data cache closest to the core (lowest latency cache) if this bit is set, otherwise the store missed the data cache. A0H STLB Miss (bit 4): The store missed the STLB if set, otherwise the store hit the STLB Locked Access (bit 5): The store was part of a locked access if set, otherwise the store was not part of a locked access. Reserved A8H Reserved 18.3.4.4.4 Precise Distribution of Instructions Retired (PDIR) Upon triggering a PEBS assist, there will be a finite delay between the time the counter overflows and when the microcode starts to carry out its data collection obligations. INST_RETIRED is a very common event that is used to sample where performance bottleneck happened and to help identify its location in instruction address space. Even if the delay is constant in core clock space, it invariably manifest as variable “skids” in instruction address space. This creates a challenge for programmers to profile a workload and pinpoint the location of bottlenecks. The core PMU in processors based on Intel microarchitecture code name Sandy Bridge include a facility referred to as precise distribution of Instruction Retired (PDIR). The PDIR facility mitigates the “skid” problem by providing an early indication of when the INST_RETIRED counter is about to overflow, allowing the machine to more precisely trap on the instruction that actually caused the counter overflow. On processors based on Intel microarchitecture code name Sandy Bridge skid is significantly reduced, and can be as little as one instruction. On future implementations PDIR may eliminate skid. PDIR applies only to the INST_RETIRED.ALL precise event, and processors based on Sandy Bridge microarchitecture must use IA32_PMC1 with PerfEvtSel1 property configured and bit 1 in the IA32_PEBS_ENABLE set to 1. INST_RETIRED.ALL is a non-architectural performance event, it is not supported in prior generation microarchitectures. Additionally, on processors with CPUID DisplayFamily_DisplayModel signatures of 06_2A and 06_2D, the tool that programs PDIR should quiesce the rest of the programmable counters in the core when PDIR is active. 18.3.4.5 Off-core Response Performance Monitoring The core PMU in processors based on Intel microarchitecture code name Sandy Bridge provides off-core response facility similar to prior generation. Off-core response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the off-core transaction. Two event codes are dedicated for off-core response event programming. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_x. Table 18-15 lists the event code, mask value and additional off-core configuration MSR that must be programmed to count off-core response events using IA32_PMCx. Vol. 3B 18-41 PERFORMANCE MONITORING Table 18-15. Off-Core Response Event Encoding Counter Event code UMask Required Off-core Response MSR PMC0-3 B7H 01H MSR_OFFCORE_RSP_0 (address 1A6H) PMC0-3 BBH 01H MSR_OFFCORE_RSP_1 (address 1A7H) The layout of MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 are shown in Figure 18-30 and Figure 18-31. Bits 15:0 specifies the request type of a transaction request to the uncore. Bits 30:16 specifies supplier information, bits 37:31 specifies snoop response information. 63 37 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 See Figure 18-30 RESPONSE TYPE — Other (R/W) RESERVED REQUEST TYPE — STRM_ST (R/W) REQUEST TYPE — BUS_LOCKS (R/W) REQUEST TYPE — PF_LLC_IFETCH (R/W) REQUEST TYPE — PF_LLC_RFO (R/W) REQUEST TYPE — PF_LLC_DATA_RD (R/W) REQUEST TYPE — PF_IFETCH (R/W) REQUEST TYPE — PF_RFO (R/W) REQUEST TYPE — PF_DATA_RD (R/W) REQUEST TYPE — WB (R/W) REQUEST TYPE — DMND_IFETCH (R/W) REQUEST TYPE — DMND_RFO (R/W) REQUEST TYPE — DMND_DATA_RD (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-30. Request_Type Fields for MSR_OFFCORE_RSP_x Table 18-16. MSR_OFFCORE_RSP_x Request_Type Field Definition Bit Name Offset Description DMND_DATA_RD 0 Counts the number of demand data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches. DMND_RFO 1 Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches. DMND_IFETCH 2 Counts the number of demand instruction cacheline reads and L1 instruction cacheline prefetches. WB 3 Counts the number of writeback (modified to exclusive) transactions. PF_DATA_RD 4 Counts the number of data cacheline reads generated by L2 prefetchers. PF_RFO 5 Counts the number of RFO requests generated by L2 prefetchers. PF_IFETCH 6 Counts the number of code reads generated by L2 prefetchers. PF_LLC_DATA_RD 7 L2 prefetcher to L3 for loads. PF_LLC_RFO 8 RFO requests generated by L2 prefetcher PF_LLC_IFETCH 9 L2 prefetcher to L3 for instruction fetches. BUS_LOCKS 10 Bus lock and split lock requests STRM_ST 11 Streaming store requests OTHER 15 Any other request that crosses IDI, including I/O. 18-42 Vol. 3B PERFORMANCE MONITORING 63 37 36 35 34 33 32 31 22 21 20 19 18 17 16 RESPONSE TYPE — NON_DRAM (R/W) RSPNS_SNOOP — HITM (R/W) RSPNS_SNOOP — HIT_FWD RSPNS_SNOOP — HIT_NO_FWD (R/W) RSPNS_SNOOP — SNP_MISS (R/W) RSPNS_SNOOP — SNP_NOT_NEEDED (R/W) RSPNS_SNOOP — SNPl_NONE (R/W) RSPNS_SUPPLIER — RESERVED RSPNS_SUPPLIER — Local RSPNS_SUPPLIER — LLC_HITF (R/W) RSPNS_SUPPLIER — LLC_HITS (R/W) RSPNS_SUPPLIER — LLC_HITE (R/W) RSPNS_SUPPLIER — LLC_HITM (R/W) RSPNS_SUPPLIER — No_SUPP (R/W) RSPNS_SUPPLIER — ANY (R/W) Reserved RESET Value — 00000000_00000000H Figure 18-31. Response_Supplier and Snoop Info Fields for MSR_OFFCORE_RSP_x To properly program this extra register, software must set at least one request type bit and a valid response type pattern. Otherwise, the event count reported will be zero. It is permissible and useful to set multiple request and response type bits in order to obtain various classes of off-core response events. Although MSR_OFFCORE_RSP_x allow an agent software to program numerous combinations that meet the above guideline, not all combinations produce meaningful data. Table 18-17. MSR_OFFCORE_RSP_x Response Supplier Info Field Definition Subtype Bit Name Offset Description Common Any 16 Catch all value for any response types. Supplier Info NO_SUPP 17 No Supplier Information available. LLC_HITM 18 M-state initial lookup stat in L3. LLC_HITE 19 E-state LLC_HITS 20 S-state LLC_HITF 21 F-state LOCAL 22 Local DRAM Controller. Reserved 30:23 Reserved To specify a complete offcore response filter, software must properly program bits in the request and response type fields. A valid request type must have at least one bit set in the non-reserved bits of 15:0. A valid response type must be a non-zero value of the following expression: ANY | [(‘OR’ of Supplier Info Bits) & (‘OR’ of Snoop Info Bits)] If “ANY“ bit is set, the supplier and snoop info bits are ignored. Vol. 3B 18-43 PERFORMANCE MONITORING Table 18-18. MSR_OFFCORE_RSP_x Snoop Info Field Definition Subtype Bit Name Offset Description Snoop Info SNP_NONE 31 No details on snoop-related information. SNP_NOT_NEEDED 32 No snoop was needed to satisfy the request. SNP_MISS A snoop was needed and it missed all snooped caches: 33 -For LLC Hit, ReslHitl was returned by all cores -For LLC Miss, Rspl was returned by all sockets and data was returned from DRAM. SNP_NO_FWD 34 A snoop was needed and it hits in at least one snooped cache. Hit denotes a cache-line was valid before snoop effect. This includes: -Snoop Hit w/ Invalidation (LLC Hit, RFO) -Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD) -Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S) In the LLC Miss case, data is returned from DRAM. SNP_FWD 35 A snoop was needed and data was forwarded from a remote socket. This includes: -Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT). HITM 36 A snoop was needed and it HitM-ed in local or remote cache. HitM denotes a cache-line was in modified state before effect as a results of snoop. This includes: -Snoop HitM w/ WB (LLC miss, IFetch/Data_RD) -Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO) -Snoop MtoS (LLC Hit, IFetch/Data_RD). NON_DRAM 18.3.4.6 37 Target was non-DRAM system address. This includes MMIO transactions. Uncore Performance Monitoring Facilities In Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series The uncore sub-system in Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series provides a unified L3 that can support up to four processor cores. The L3 cache consists multiple slices, each slice interface with a processor via a coherence engine, referred to as a C-Box. Each C-Box provides dedicated facility of MSRs to select uncore performance monitoring events and each C-Box event select MSR is paired with a counter register, similar in style as those described in Section 18.3.1.2.2. The ARB unit in the uncore also provides its local performance counters and event select MSRs. The layout of the event select MSRs in the C-Boxes and the ARB unit are shown in Figure 18-32. 63 28 24 23 22 21 20 19 18 17 16 15 Counter Mask (CMASK) 0 8 7 Unit Mask (UMASK) Event Select INV—Invert counter mask EN—Enable counter OVF_EN—Overflow forwarding E—Edge detect Reserved RESET Value — 00000000_00000000H Figure 18-32. Layout of Uncore PERFEVTSEL MSR for a C-Box Unit or the ARB Unit 18-44 Vol. 3B PERFORMANCE MONITORING The bit fields of the uncore event select MSRs for a C-box unit or the ARB unit are summarized below: • Event_Select (bits 7:0) and UMASK (bits 15:8): Specifies the microarchitectural condition to count in a local uncore PMU counter, see Table 19-20. • • E (bit 18): Enables edge detection filtering, if 1. • • • EN (bit 22): Enables the local counter associated with this event select MSR. OVF_EN (bit 20): Enables the overflow indicator from the uncore counter forwarded to MSR_UNC_PERF_GLOBAL_CTRL, if 1. INV (bit 23): Event count increments with non-negative value if 0, with negated value if 1. CMASK (bits 28:24): Specifies a positive threshold value to filter raw event count input. At the uncore domain level, there is a master set of control MSRs that centrally manages all the performance monitoring facility of uncore units. Figure 18-33 shows the layout of the uncore domain global control. When an uncore counter overflows, a PMI can be routed to a processor core. Bits 3:0 of MSR_UNC_PERF_GLOBAL_CTRL can be used to select which processor core to handle the uncore PMI. Software must then write to bit 13 of IA32_DEBUGCTL (at address 1D9H) to enable this capability. • PMI_SEL_Core#: Enables the forwarding of an uncore PMI request to a processor core, if 1. If bit 30 (WakePMI) is ‘1’, a wake request is sent to the respective processor core prior to sending the PMI. • EN: Enables the fixed uncore counter, the ARB counters, and the CBO counters in the uncore PMU, if 1. This bit is cleared if bit 31 (FREEZE) is set and any enabled uncore counters overflow. • WakePMI: Controls sending a wake request to any halted processor core before issuing the uncore PMI request. If a processor core was halted and not sent a wake request, the uncore PMI will not be serviced by the processor core. • FREEZE: Provides the capability to freeze all uncore counters when an overflow condition occurs in a unit counter. When this bit is set, and a counter overflow occurs, the uncore PMU logic will clear the global enable bit (bit 29). 63 32 31 30 29 28 4 3 2 1 0 FREEZE—Freeze counters WakePMI—Wake cores on PMI EN—Enable all uncore counters PMI_Sel_Core3 — Uncore PMI to core 3 PMI_Sel_Core2 — Uncore PMI to core 2 PMI_Sel_Core1 — Uncore PMI to core 1 PMI_Sel_Core0 — Uncore PMI to core 0 Reserved RESET Value — 00000000_00000000H Figure 18-33. Layout of MSR_UNC_PERF_GLOBAL_CTRL MSR for Uncore Vol. 3B 18-45 PERFORMANCE MONITORING Additionally, there is also a fixed counter, counting uncore clockticks, for the uncore domain. Table 18-19 summarizes the number MSRs for uncore PMU for each box. Table 18-19. Uncore PMU MSR Summary Box # of Boxes Counters per Box Counter Width General Purpose Global Enable C-Box SKU specific 2 44 Yes Per-box ARB 1 2 44 Yes Uncore Fixed Counter N.A. N.A. 48 No Uncore Comment Up to 4, seeTable 2-21 MSR_UNC_CBO_CONFIG 18.3.4.6.1 Uncore Performance Monitoring Events There are certain restrictions on the uncore performance counters in each C-Box. Specifically, • • Occupancy events are supported only with counter 0 but not counter 1. Other uncore C-Box events can be programmed with either counter 0 or 1. The C-Box uncore performance events described in Table 19-20 can collect performance characteristics of transactions initiated by processor core. In that respect, they are similar to various sub-events in the OFFCORE_RESPONSE family of performance events in the core PMU. Information such as data supplier locality (LLC HIT/MISS) and snoop responses can be collected via OFFCORE_RESPONSE and qualified on a per-thread basis. On the other hand, uncore performance event logic cannot associate its counts with the same level of per-thread qualification attributes as the core PMU events can. Therefore, whenever similar event programming capabilities are available from both core PMU and uncore PMU, the recommendation is that utilizing the core PMU events may be less affected by artifacts, complex interactions and other factors. 18.3.4.7 Intel® Xeon® Processor E5 Family Performance Monitoring Facility The Intel® Xeon® Processor E5 Family (and Intel® Core™ i7-3930K Processor) are based on Intel microarchitecture code name Sandy Bridge-E. While the processor cores share the same microarchitecture as those of the Intel® Xeon® Processor E3 Family and 2nd generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx processor series, the uncore subsystems are different. An overview of the uncore performance monitoring facilities of the Intel Xeon processor E5 family (and Intel Core i7-3930K processor) is described in Section 18.3.4.8. Thus, the performance monitoring facilities in the processor core generally are the same as those described in Section 18.6.3 through Section 18.3.4.5. However, the MSR_OFFCORE_RSP_0/MSR_OFFCORE_RSP_1 Response Supplier Info field shown in Table 18-17 applies to Intel Core Processors with CPUID signature of DisplayFamily_DisplayModel encoding of 06_2AH; Intel Xeon processor with CPUID signature of DisplayFamily_DisplayModel encoding of 06_2DH supports an additional field for remote DRAM controller shown in Table 18-20. Additionally, there are some small differences in the non-architectural performance monitoring events (see Table 19-18). 18-46 Vol. 3B PERFORMANCE MONITORING Table 18-20. MSR_OFFCORE_RSP_x Supplier Info Field Definitions Subtype Bit Name Offset Description Common Any 16 Catch all value for any response types. Supplier Info 18.3.4.8 NO_SUPP 17 No Supplier Information available. LLC_HITM 18 M-state initial lookup stat in L3. LLC_HITE 19 E-state LLC_HITS 20 S-state LLC_HITF 21 F-state LOCAL 22 Local DRAM Controller. Remote 30:23 Remote DRAM Controller (either all 0s or all 1s). Intel® Xeon® Processor E5 Family Uncore Performance Monitoring Facility The uncore subsystem in the Intel Xeon processor E5-2600 product family has some similarities with those of the Intel Xeon processor E7 family. Within the uncore subsystem, localized performance counter sets are provided at logic control unit scope. For example, each Cbox caching agent has a set of local performance counters, and the power controller unit (PCU) has its own local performance counters. Up to 8 C-Box units are supported in the uncore sub-system. Table 18-21 summarizes the uncore PMU facilities providing MSR interfaces. Table 18-21. Uncore PMU MSR Summary for Intel® Xeon® Processor E5 Family Box # of Boxes Counters per Box Counter Width General Purpose Global Enable Sub-control MSRs C-Box 8 4 44 Yes per-box None PCU 1 4 48 Yes per-box Match/Mask U-Box 1 2 44 Yes uncore None Details of the uncore performance monitoring facility of Intel Xeon Processor E5 family is available in “Intel® Xeon® Processor E5 Uncore Performance Monitoring Programming Reference Manual”. The MSR-based uncore PMU interfaces are listed in Table 2-24. 18.3.5 3rd Generation Intel® Core™ Processor Performance Monitoring Facility The 3rd generation Intel® Core™ processor family and Intel® Xeon® processor E3-1200v2 product family are based on the Ivy Bridge microarchitecture. The performance monitoring facilities in the processor core generally are the same as those described in Section 18.6.3 through Section 18.3.4.5. The non-architectural performance monitoring events supported by the processor core are listed in Table 19-18. 18.3.5.1 Intel® Xeon® Processor E5 v2 and E7 v2 Family Uncore Performance Monitoring Facility The uncore subsystem in the Intel Xeon processor E5 v2 and Intel Xeon Processor E7 v2 product families are based on the Ivy Bridge-E microarchitecture. There are some similarities with those of the Intel Xeon processor E5 family based on the Sandy Bridge microarchitecture. Within the uncore subsystem, localized performance counter sets are provided at logic control unit scope. Details of the uncore performance monitoring facility of Intel Xeon Processor E5 v2 and Intel Xeon Processor E7 v2 families are available in “Intel® Xeon® Processor E5 v2 and E7 v2 Uncore Performance Monitoring Programming Reference Manual”. The MSR-based uncore PMU interfaces are listed in Table 2-28. Vol. 3B 18-47 PERFORMANCE MONITORING 18.3.6 4th Generation Intel® Core™ Processor Performance Monitoring Facility The 4th generation Intel® Core™ processor and Intel® Xeon® processor E3-1200 v3 product family are based on the Haswell microarchitecture. The core PMU supports architectural performance monitoring capability with version ID 3 (see Section 18.2.3) and a host of non-architectural monitoring capabilities. Architectural performance monitoring version 3 capabilities are described in Section 18.2.3. The core PMU’s capability is similar to those described in Section 18.6.3 through Section 18.3.4.5, with some differences and enhancements summarized in Table 18-22. Additionally, the core PMU provides some enhancement to support performance monitoring when the target workload contains instruction streams using Intel® Transactional Synchronization Extensions (TSX), see Section 18.3.6.5. For details of Intel TSX, see Chapter 16, “Programming with Intel® Transactional Synchronization Extensions” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Table 18-22. Core PMU Comparison Box Intel® microarchitecture code name Haswell Intel® microarchitecture code name Sandy Bridge # of Fixed counters per thread 3 3 Use CPUID to determine # of counters. See Section 18.2.1. # of general-purpose counters per core 8 8 Use CPUID to determine # of counters. See Section 18.2.1. Counter width (R,W) R:48, W: 32/48 R:48, W: 32/48 See Section 18.2.2. # of programmable counters per thread 4 or (8 if a core not shared by two threads) 4 or (8 if a core not shared by two threads) Use CPUID to determine # of counters. See Section 18.2.1. PMI Overhead Mitigation • Freeze_Perfmon_on_PMI with legacy semantics. • Freeze_on_LBR with legacy semantics for branch profiling. • Freeze_while_SMM. • Freeze_Perfmon_on_PMI with legacy semantics. • Freeze_on_LBR with legacy semantics for branch profiling. • Freeze_while_SMM. See Section 17.4.7. Processor Event Based Sampling (PEBS) Events See Table 18-12 and Section 18.3.6.5.1. See Table 18-12. IA32_PMC4-IA32_PMC7 do not support PEBS. PEBS-Load Latency See Section 18.3.4.4.2. See Section 18.3.4.4.2. PEBS-Precise Store No, replaced by Data Address profiling. Section 18.3.4.4.3 PEBS-PDIR Yes (using precise INST_RETIRED.ALL) Yes (using precise INST_RETIRED.ALL) PEBS-EventingIP Yes No Data Address Profiling Yes No LBR Profiling Yes Yes Call Stack Profiling Yes, see Section 17.11. No Off-core Response Event MSR 1A6H and 1A7H; extended request and response types. MSR 1A6H and 1A7H; extended request and response types. Intel TSX support for Perfmon See Section 18.3.6.5. No 18-48 Vol. 3B Comment Use LBR facility. PERFORMANCE MONITORING 18.3.6.1 Processor Event Based Sampling (PEBS) Facility The PEBS facility in the 4th Generation Intel Core processor is similar to those in processors based on Intel microarchitecture code name Sandy Bridge, with several enhanced features. The key components and differences of PEBS facility relative to Intel microarchitecture code name Sandy Bridge is summarized in Table 18-23. Table 18-23. PEBS Facility Comparison Box Intel® microarchitecture code name Haswell Intel® microarchitecture code name Sandy Bridge Comment Valid IA32_PMCx PMC0-PMC3 PMC0-PMC3 No PEBS on PMC4-PMC7 PEBS Buffer Programming Section 18.3.1.1.1 Section 18.3.1.1.1 Unchanged IA32_PEBS_ENABLE Layout Figure 18-15 Figure 18-29 PEBS record layout Table 18-24; enhanced fields at offsets 98H, A0H, A8H, B0H. Table 18-3; enhanced fields at offsets 98H, A0H, A8H. Precise Events See Table 18-12. See Table 18-12. PEBS-Load Latency See Table 18-13. Table 18-13 PEBS-Precise Store No, replaced by data address profiling. Yes; see Section 18.3.4.4.3. PEBS-PDIR Yes Yes PEBS skid from EventingIP 1 (or 2 if micro+macro fusion) 1 SAMPLING Restriction Small SAV(CountDown) value incur higher overhead than prior generation. IA32_PMC4-IA32_PMC7 do not support PEBS. IA32_PMC1 only. Only IA32_PMC0 through IA32_PMC3 support PEBS. NOTE PEBS events are only valid when the following fields of IA32_PERFEVTSELx are all zero: AnyThread, Edge, Invert, CMask. In a PMU with PDIR capability, PEBS behavior is unpredictable if IA32_PERFEVTSELx or IA32_PMCx is changed for a PEBS-enabled counter while an event is being counted. To avoid this, changes to the programming or value of a PEBS-enabled counter should be performed when the counter is disabled. 18.3.6.2 PEBS Data Format The PEBS record format for the 4th Generation Intel Core processor is shown in Table 18-24. The PEBS record format, along with debug/store area storage format, does not change regardless of whether IA-32e mode is active or not. CPUID.01H:ECX.DTES64[bit 2] reports whether the processor's DS storage format support is mode-independent. When set, it uses 64-bit DS storage format. Vol. 3B 18-49 PERFORMANCE MONITORING Table 18-24. PEBS Record Format for 4th Generation Intel Core Processor Family Byte Offset Field Byte Offset Field 00H R/EFLAGS 60H R10 08H R/EIP 68H R11 10H R/EAX 70H R12 18H R/EBX 78H R13 20H R/ECX 80H R14 28H R/EDX 88H R15 30H R/ESI 90H IA32_PERF_GLOBAL_STATUS 38H R/EDI 98H Data Linear Address 40H R/EBP A0H Data Source Encoding 48H R/ESP A8H Latency value (core cycles) 50H R8 B0H EventingIP 58H R9 B8H TX Abort Information (Section 18.3.6.5.1) The layout of PEBS records are almost identical to those shown in Table 18-3. Offset B0H is a new field that records the eventing IP address of the retired instruction that triggered the PEBS assist. The PEBS records at offsets 98H, A0H, and ABH record data gathered from three of the PEBS capabilities in prior processor generations: load latency facility (Section 18.3.4.4.2), PDIR (Section 18.3.4.4.4), and the equivalent capability of precise store in prior generation (see Section 18.3.6.3). In the core PMU of the 4th generation Intel Core processor, load latency facility and PDIR capabilities are unchanged. However, precise store is replaced by an enhanced capability, data address profiling, that is not restricted to store address. Data address profiling also records information in PEBS records at offsets 98H, A0H, and ABH. 18.3.6.3 PEBS Data Address Profiling The Data Linear Address facility is also abbreviated as DataLA. The facility is a replacement or extension of the precise store facility in previous processor generations. The DataLA facility complements the load latency facility by providing a means to profile load and store memory references in the system, leverages the PEBS facility, and provides additional information about sampled loads and stores. Having precise memory reference events with linear address information for both loads and stores provides information to improve data structure layout, eliminate remote node references, and identify cache-line conflicts in NUMA systems. The DataLA facility in the 4th generation processor supports the following events configured to use PEBS: Table 18-25. Precise Events That Supports Data Linear Address Profiling Event Name Event Name MEM_UOPS_RETIRED.STLB_MISS_LOADS MEM_UOPS_RETIRED.STLB_MISS_STORES MEM_UOPS_RETIRED.LOCK_LOADS MEM_UOPS_RETIRED.SPLIT_STORES MEM_UOPS_RETIRED.SPLIT_LOADS MEM_UOPS_RETIRED.ALL_STORES MEM_UOPS_RETIRED.ALL_LOADS MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM MEM_LOAD_UOPS_RETIRED.L1_HIT MEM_LOAD_UOPS_RETIRED.L2_HIT MEM_LOAD_UOPS_RETIRED.L3_HIT MEM_LOAD_UOPS_RETIRED.L1_MISS MEM_LOAD_UOPS_RETIRED.L2_MISS MEM_LOAD_UOPS_RETIRED.L3_MISS MEM_LOAD_UOPS_RETIRED.HIT_LFB MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS 18-50 Vol. 3B PERFORMANCE MONITORING Table 18-25. Precise Events That Supports Data Linear Address Profiling (Contd.) Event Name Event Name MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM UOPS_RETIRED.ALL (if load or store is tagged) MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE DataLA can use any one of the IA32_PMC0-IA32_PMC3 counters. Counter overflows will initiate the generation of PEBS records. Upon counter overflow, hardware captures the linear address and possible other status information of the retiring memory uop. This information is then written to the PEBS record that is subsequently generated. To enable the DataLA facility, software must complete the following steps. Please note that the DataLA facility relies on the PEBS facility, so the PEBS configuration requirements must be completed before attempting to capture DataLA information. • • • Complete the PEBS configuration steps. Program an event listed in Table 18-25 using any one of IA32_PERFEVTSEL0-IA32_PERFEVTSEL3. Set the corresponding IA32_PEBS_ENABLE.PEBS_EN_CTRx bit. This enables the corresponding IA32_PMCx as a PEBS counter and enables the DataLA facility. When the DataLA facility is enabled, the relevant information written into a PEBS record affects entries at offsets 98H, A0H and A8H, as shown in Table 18-26. Table 18-26. Layout of Data Linear Address Information In PEBS Record Field Offset Description Data Linear Address 98H The linear address of the load or the destination of the store. Store Status A0H • DCU Hit (Bit 0): The store hit the data cache closest to the core (L1 cache) if this bit is set, otherwise the store missed the data cache. This information is valid only for the following store events: UOPS_RETIRED.ALL (if store is tagged), MEM_UOPS_RETIRED.STLB_MISS_STORES, MEM_UOPS_RETIRED.SPLIT_STORES, MEM_UOPS_RETIRED.ALL_STORES • Other bits are zero, The STLB_MISS, LOCK bit information can be obtained by programming the corresponding store event in Table 18-25. Reserved A8H Always zero. 18.3.6.3.1 EventingIP Record The PEBS record layout for processors based on Intel microarchitecture code name Haswell adds a new field at offset 0B0H. This is the eventingIP field that records the IP address of the retired instruction that triggered the PEBS assist. The EIP/RIP field at offset 08H records the IP address of the next instruction to be executed following the PEBS assist. 18.3.6.4 Off-core Response Performance Monitoring The core PMU facility to collect off-core response events are similar to those described in Section 18.3.4.5. The event codes are listed in Table 18-15. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_x. Software must program MSR_OFFCORE_RSP_x according to: • • • Transaction request type encoding (bits 15:0): see Table 18-27. Supplier information (bits 30:16): see Table 18-28. Snoop response information (bits 37:31): see Table 18-18. Vol. 3B 18-51 PERFORMANCE MONITORING Table 18-27. MSR_OFFCORE_RSP_x Request_Type Definition (Haswell microarchitecture) Bit Name Offset Description DMND_DATA_RD 0 Counts the number of demand data reads and page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches. DMND_RFO 1 Counts demand read (RFO) and software prefetches (PREFETCHW) for exclusive ownership in anticipation of a write. DMND_IFETCH 2 Counts the number of demand instruction cacheline reads and L1 instruction cacheline prefetches. COREWB 3 Counts the number of modified cachelines written back. PF_DATA_RD 4 Counts the number of data cacheline reads generated by L2 prefetchers. PF_RFO 5 Counts the number of RFO requests generated by L2 prefetchers. PF_IFETCH 6 Counts the number of code reads generated by L2 prefetchers. PF_L3_DATA_RD 7 Counts the number of data cacheline reads generated by L3 prefetchers. PF_L3_RFO 8 Counts the number of RFO requests generated by L3 prefetchers. PF_L3_CODE_RD 9 Counts the number of code reads generated by L3 prefetchers. SPLIT_LOCK_UC_ LOCK 10 Counts the number of lock requests that split across two cachelines or are to UC memory. STRM_ST 11 Counts the number of streaming store requests electronically. Reserved 14:12 Reserved OTHER 15 Any other request that crosses IDI, including I/O. The supplier information field listed in Table 18-28. The fields vary across products (according to CPUID signatures) and is noted in the description. Table 18-28. MSR_OFFCORE_RSP_x Supplier Info Field Definition (CPUID Signature 06_3CH, 06_46H) Subtype Bit Name Offset Description Common Any 16 Catch all value for any response types. Supplier Info NO_SUPP 17 No Supplier Information available. L3_HITM 18 M-state initial lookup stat in L3. L3_HITE 19 E-state L3_HITS 20 S-state Reserved 21 Reserved LOCAL 22 Local DRAM Controller. Reserved 30:23 Reserved 18-52 Vol. 3B PERFORMANCE MONITORING Table 18-29. MSR_OFFCORE_RSP_x Supplier Info Field Definition (CPUID Signature 06_45H) Subtype Bit Name Offset Description Common Any 16 Catch all value for any response types. Supplier Info NO_SUPP 17 No Supplier Information available. L3_HITM 18 M-state initial lookup stat in L3. L3_HITE 19 E-state L3_HITS 20 S-state Reserved 21 Reserved L4_HIT_LOCAL_L4 22 L4 Cache L4_HIT_REMOTE_HOP0_L4 23 L4 Cache L4_HIT_REMOTE_HOP1_L4 24 L4 Cache L4_HIT_REMOTE_HOP2P_L4 25 L4 Cache Reserved 30:26 Reserved 18.3.6.4.1 Off-core Response Performance Monitoring in Intel Xeon Processors E5 v3 Series Table 18-28 lists the supplier information field that apply to Intel Xeon processor E5 v3 series (CPUID signature 06_3FH). Table 18-30. MSR_OFFCORE_RSP_x Supplier Info Field Definition Subtype Bit Name Offset Description Common Any 16 Catch all value for any response types. Supplier Info NO_SUPP 17 No Supplier Information available. L3_HITM 18 M-state initial lookup stat in L3. L3_HITE 19 E-state L3_HITS 20 S-state L3_HITF 21 F-state LOCAL 22 Local DRAM Controller. Reserved 26:23 Reserved L3_MISS_REMOTE_HOP0 27 Hop 0 Remote supplier. L3_MISS_REMOTE_HOP1 28 Hop 1 Remote supplier. L3_MISS_REMOTE_HOP2P 29 Hop 2 or more Remote supplier. Reserved 30 Reserved 18.3.6.5 Performance Monitoring and Intel® TSX Chapter 16 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 describes the details of Intel® Transactional Synchronization Extensions (Intel® TSX). This section describes performance monitoring support for Intel TSX. If a processor supports Intel TSX, the core PMU enhances its IA32_PERFEVTSELx MSR with two additional bit fields for event filtering. Support for Intel TSX is indicated by either (a) CPUID.(EAX=7, ECX=0):RTM[bit 11]=1, or (b) if CPUID.07H.EBX.HLE [bit 4] = 1. The TSX-enhanced layout of IA32_PERFEVTSELx is shown in Figure 18-34. The two additional bit fields are: Vol. 3B 18-53 PERFORMANCE MONITORING • IN_TX (bit 32): When set, the counter will only include counts that occurred inside a transactional region, regardless of whether that region was aborted or committed. This bit may only be set if the processor supports HLE or RTM. • IN_TXCP (bit 33): When set, the counter will not include counts that occurred inside of an aborted transactional region. This bit may only be set if the processor supports HLE or RTM. This bit may only be set for IA32_PERFEVTSEL2. When the IA32_PERFEVTSELx MSR is programmed with both IN_TX=0 and IN_TXCP=0 on a processor that supports Intel TSX, the result in a counter may include detectable conditions associated with a transaction code region for its aborted execution (if any) and completed execution. In the initial implementation, software may need to take pre-caution when using the IN_TXCP bit. See Table 2-29. 63 34 Reserved 31 24 23 22 21 20 19 18 17 16 15 8 7 A I U Counter Mask I E N N N P E O S Unit Mask (UMASK) (CMASK) S R V N Y T C 0 Event Select USR—User Mode OS—Operating system mode E—Edge detect PC—Pin control INT—APIC interrupt enable ANY—Any Thread EN—Enable counters INV—Invert counter mask IN_TX—In Trans. Rgn IN_TXCP—In Tx exclude abort (PERFEVTSEL2 Only) Figure 18-34. Layout of IA32_PERFEVTSELx MSRs Supporting Intel TSX A common usage of setting IN_TXCP=1 is to capture the number of events that were discarded due to a transactional abort. With IA32_PMC2 configured to count in such a manner, then when a transactional region aborts, the value for that counter is restored to the value it had prior to the aborted transactional region. As a result, any updates performed to the counter during the aborted transactional region are discarded. On the other hand, setting IN_TX=1 can be used to drill down on the performance characteristics of transactional code regions. When a PMCx is configured with the corresponding IA32_PERFEVTSELx.IN_TX=1, only eventing conditions that occur inside transactional code regions are propagated to the event logic and reflected in the counter result. Eventing conditions specified by IA32_PERFEVTSELx but occurring outside a transactional region are discarded. Additionally, a number of performance events are solely focused on characterizing the execution of Intel TSX transactional code, they are listed in Table 19-12. 18.3.6.5.1 Intel TSX and PEBS Support If a PEBS event would have occurred inside a transactional region, then the transactional region first aborts, and then the PEBS event is processed. Two of the TSX performance monitoring events in Table 19-12 also support using PEBS facility to capture additional information. They are: • • HLE_RETIRED.ABORT ED (encoding C8H mask 04H), RTM_RETIRED.ABORTED (encoding C9H mask 04H). A transactional abort (HLE_RETIRED.ABORTED,RTM_RETIRED.ABORTED) can also be programmed to cause PEBS events. In this scenario, a PEBS event is processed following the abort. 18-54 Vol. 3B PERFORMANCE MONITORING Pending a PEBS record inside of a transactional region will cause a transactional abort. If a PEBS record was pended at the time of the abort or on an overflow of the TSX PEBS events listed above, only the following PEBS entries will be valid (enumerated by PEBS entry offset B8H bits[33:32] to indicate an HLE abort or an RTM abort): • • Offset B0H: EventingIP, Offset B8H: TX Abort Information These fields are set for all PEBS events. • Offset 08H (RIP/EIP) corresponds to the instruction following the outermost XACQUIRE in HLE or the first instruction of the fallback handler of the outermost XBEGIN instruction in RTM. This is useful to identify the aborted transactional region. In the case of HLE, an aborted transaction will restart execution deterministically at the start of the HLE region. In the case of RTM, an aborted transaction will transfer execution to the RTM fallback handler. The layout of the TX Abort Information field is given in Table 18-31. Table 18-31. TX Abort Information Field Definition Bit Name Offset Description Cycles_Last_TX 31:0 The number of cycles in the last TSX region, regardless of whether that region had aborted or committed. HLE_Abort 32 If set, the abort information corresponds to an aborted HLE execution RTM_Abort 33 If set, the abort information corresponds to an aborted RTM execution Instruction_Abort 34 If set, the abort was associated with the instruction corresponding to the eventing IP (offset 0B0H) within the transactional region. Non_Instruction_Abort 35 If set, the instruction corresponding to the eventing IP may not necessarily be related to the transactional abort. Retry 36 If set, retrying the transactional execution may have succeeded. Data_Conflict 37 If set, another logical processor conflicted with a memory address that was part of the transactional region that aborted. Capacity Writes 38 If set, the transactional region aborted due to exceeding resources for transactional writes. Capacity Reads 39 If set, the transactional region aborted due to exceeding resources for transactional reads. Reserved 63:40 Reserved 18.3.6.6 Uncore Performance Monitoring Facilities in the 4th Generation Intel® Core™ Processors The uncore sub-system in the 4th Generation Intel® Core™ processors provides its own performance monitoring facility. The uncore PMU facility provides dedicated MSRs to select uncore performance monitoring events in a similar manner as those described in Section 18.3.4.6. The ARB unit and each C-Box provide local pairs of event select MSR and counter register. The layout of the event select MSRs in the C-Boxes are identical as shown in Figure 18-32. At the uncore domain level, there is a master set of control MSRs that centrally manages all the performance monitoring facility of uncore units. Figure 18-33 shows the layout of the uncore domain global control. Additionally, there is also a fixed counter, counting uncore clockticks, for the uncore domain. Table 18-19 summarizes the number MSRs for uncore PMU for each box. Vol. 3B 18-55 PERFORMANCE MONITORING Table 18-32. Uncore PMU MSR Summary Box # of Boxes Counters per Box Counter Width General Purpose Global Enable C-Box SKU specific 2 44 Yes Per-box ARB 1 2 44 Yes Uncore Fixed Counter N.A. N.A. 48 No Uncore Comment Up to 4, seeTable 2-21 MSR_UNC_CBO_CONFIG The uncore performance events for the C-Box and ARB units are listed in Table 19-13. 18.3.6.7 Intel® Xeon® Processor E5 v3 Family Uncore Performance Monitoring Facility Details of the uncore performance monitoring facility of Intel Xeon Processor E5 v3 families are available in “Intel® Xeon® Processor E5 v3 Uncore Performance Monitoring Programming Reference Manual”. The MSR-based uncore PMU interfaces are listed in Table 2-33. 18.3.7 5th Generation Intel® Core™ Processor and Intel® Core™ M Processor Performance Monitoring Facility The 5th Generation Intel® Core™ processor and the Intel® Core™ M processor families are based on the Broadwell microarchitecture. The core PMU supports architectural performance monitoring capability with version ID 3 (see Section 18.2.3) and a host of non-architectural monitoring capabilities. Architectural performance monitoring version 3 capabilities are described in Section 18.2.3. The core PMU has the same capability as those described in Section 18.3.6. IA32_PERF_GLOBAL_STATUS provide a bit indicator (bit 55) for PMI handler to distinguish PMI due to output buffer overflow condition due to accumulating packet data from Intel Processor Trace. 63 62 61 55 35 34 33 32 31 8 7 6 5 4 3 2 1 0 CondChgd Ovf_Buffer Ovf_UncorePMU Trace_ToPA_PMI FIXED_CTR2 Overflow (RO) FIXED_CTR1 Overflow (RO) FIXED_CTR0 Overflow (RO) PMC7_OVF (RO, If PMC7 present) PMC6_OVF (RO, If PMC6 present) PMC5_OVF (RO, If PMC5 present) PMC4_OVF (RO, If PMC4 present) PMC3_OVF (RO) PMC2_OVF (RO) PMC1_OVF (RO) PMC0_OVF (RO) Reserved Valid if CPUID.0AH:EAX[15:8] = 8; else reserved Figure 18-35. IA32_PERF_GLOBAL_STATUS MSR in Broadwell Microarchitecture Details of Intel Processor Trace is described in Chapter 35, “Intel® Processor Trace”. IA32_PERF_GLOBAL_OVF_CTRL MSR provide a corresponding reset control bit. 18-56 Vol. 3B PERFORMANCE MONITORING 63 62 61 55 35 34 33 32 31 8 7 6 5 4 3 2 1 0 ClrCondChgd ClrOvfDSBuffer ClrOvfUncore ClrTraceToPA_PMI FIXED_CTR2 ClrOverflow FIXED_CTR1 ClrOverflow FIXED_CTR0 ClrOverflow PMC7_ClrOvf (if PMC7 present) PMC6_ClrOvf (if PMC6 present) PMC5_ClrOvf (if PMC5 present) PMC4_ClrOvf (if PMC4 present) PMC3_ClrOvf PMC2_ClrOvf PMC1_ClrOvf PMC0_ClrOvf Reserved Valid if CPUID.0AH:EAX[15:8] = 8; else reserved Figure 18-36. IA32_PERF_GLOBAL_OVF_CTRL MSR in Broadwell microarchitecture The specifics of non-architectural performance events are listed in Chapter 19, “Performance Monitoring Events”. 18.3.8 6th Generation, 7th Generation and 8th Generation Intel® Core™ Processor Performance Monitoring Facility The 6th generation Intel® Core™ processor is based on the Skylake microarchitecture. The 7th generation Intel® Core™ processor is based on the Kaby Lake microarchitecture. The 8th generation Intel® Core™ processor is based on the Coffee Lake microarchitecture. For these microarchitectures, the core PMU supports architectural performance monitoring capability with version ID 4 (see Section 18.2.4) and a host of non-architectural monitoring capabilities. Architectural performance monitoring version 4 capabilities are described in Section 18.2.4. The core PMU’s capability is similar to those described in Section 18.6.3 through Section 18.3.4.5, with some differences and enhancements summarized in Table 18-22. Additionally, the core PMU provides some enhancement to support performance monitoring when the target workload contains instruction streams using Intel® Transactional Synchronization Extensions (TSX), see Section 18.3.6.5. For details of Intel TSX, see Chapter 16, “Programming with Intel® Transactional Synchronization Extensions” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. Performance monitoring result may be affected by side-band activity on processors that support Intel SGX, details are described in Chapter 42, “Enclave Code Debug and Profiling”. Vol. 3B 18-57 PERFORMANCE MONITORING Table 18-33. Core PMU Comparison Box Intel® Microarchitecture Code Name Skylake, Kaby Lake and Coffee Lake Intel® Microarchitecture Code Name Haswell and Broadwell Comment # of Fixed counters per thread 3 3 Use CPUID to determine # of counters. See Section 18.2.1. # of general-purpose counters per core 8 8 Use CPUID to determine # of counters. See Section 18.2.1. Counter width (R,W) R:48, W: 32/48 R:48, W: 32/48 See Section 18.2.2. # of programmable counters per thread 4 or (8 if a core not shared by two threads) 4 or (8 if a core not shared by two threads) Use CPUID to determine # of counters. See Section 18.2.1. Architectural Perfmon version 4 3 See Section 18.2.4 PMI Overhead Mitigation • Freeze_Perfmon_on_PMI with streamlined semantics. • Freeze_on_LBR with streamlined semantics. • Freeze_while_SMM. • Freeze_Perfmon_on_PMI with legacy semantics. • Freeze_on_LBR with legacy semantics for branch profiling. • Freeze_while_SMM. See Section 17.4.7. Legacy semantics not supported with version 4 or higher. Counter and Buffer Overflow Status Management See Section 18.2.4. • Query via • Query via IA32_PERF_GLOBAL_STATUS IA32_PERF_GLOBAL_STATUS • Reset via • Reset via IA32_PERF_GLOBAL_OVF_CTRL IA32_PERF_GLOBAL_STATUS_RESET • Set via IA32_PERF_GLOBAL_STATUS_SET IA32_PERF_GLOBAL_STATUS Indicators of Overflow/Overhead/Interferen ce • • • • Enable control in IA32_PERF_GLOBAL_STATUS • Individual counter overflow • PEBS buffer overflow • ToPA buffer overflow (applicable to Broadwell microarchitecture) See Section 18.2.4. • CTR_Frz • LBR_Frz NA See Section 18.2.4.1. Perfmon Counter In-Use Indicator Query IA32_PERF_GLOBAL_INUSE NA See Section 18.2.4.3. Precise Events See Table 18-36. See Table 18-12. IA32_PMC4-PMC7 do not support PEBS. PEBS for front end events See Section 18.3.8.1.4. No LBR Record Format Encoding 000101b 000100b LBR Size 32 entries 16 entries LBR Entry From_IP/To_IP/LBR_Info triplet From_IP/To_IP pair Section 17.12 LBR Timing Yes No Section 17.12.1 Call Stack Profiling Yes, see Section 17.11 Yes, see Section 17.11 Use LBR facility. Off-core Response Event MSR 1A6H and 1A7H; Extended request and response types. MSR 1A6H and 1A7H; Extended request and response types. Intel TSX support for Perfmon See Section 18.3.6.5. See Section 18.3.6.5. 18-58 Vol. 3B Individual counter overflow PEBS buffer overflow ToPA buffer overflow CTR_Frz, LBR_Frz, ASCI Section 17.4.8.1 PERFORMANCE MONITORING 18.3.8.1 Processor Event Based Sampling (PEBS) Facility The PEBS facility in the 6th generation, 7th generation and 8th generation Intel Core processors provides a number enhancement relative to PEBS in processors based on Haswell/Broadwell microarchitectures. The key components and differences of PEBS facility relative to Haswell/Broadwell microarchitecture is summarized in Table 18-34. Table 18-34. PEBS Facility Comparison Box Intel® Microarchitecture Code Name Skylake, Kaby Lake and Coffee Lake Intel® Microarchitecture Code Name Haswell and Broadwell Comment Valid IA32_PMCx PMC0-PMC3 PMC0-PMC3 No PEBS on PMC4-PMC7. PEBS Buffer Programming Section 18.3.1.1.1 Section 18.3.1.1.1 Unchanged IA32_PEBS_ENABLE Layout Figure 18-15 Figure 18-15 PEBS-EventingIP Yes Yes PEBS record format encoding 0011b 0010b PEBS record layout Table 18-35; enhanced fields at offsets 98H- B8H; and TSC record field at C0H. Table 18-24; enhanced fields at offsets 98H, A0H, A8H, B0H. Multi-counter PEBS resolution PEBS record 90H resolves the eventing counter overflow. PEBS record 90H reflects IA32_PERF_GLOBAL_STATUS. Precise Events See Table 18-36. See Table 18-12. IA32_PMC4-IA32_PMC7 do not support PEBS. PEBS-PDIR Yes Yes IA32_PMC1 only. PEBS-Load Latency See Section 18.3.4.4.2. See Section 18.3.4.4.2. Data Address Profiling Yes Yes FrontEnd event support FrontEnd_Retried event and MSR_PEBS_FRONTEND. No IA32_PMC0-PMC3 only. Only IA32_PMC0 through IA32_PMC3 support PEBS. NOTES Precise events are only valid when the following fields of IA32_PERFEVTSELx are all zero: AnyThread, Edge, Invert, CMask. In a PMU with PDIR capability, PEBS behavior is unpredictable if IA32_PERFEVTSELx or IA32_PMCx is changed for a PEBS-enabled counter while an event is being counted. To avoid this, changes to the programming or value of a PEBS-enabled counter should be performed when the counter is disabled. 18.3.8.1.1 PEBS Data Format The PEBS record format for the 6th generation, 7th generation and 8th generation Intel Core processors is reporting with encoding 0011b in IA32_PERF_CAPABILITIES[11:8]. The lay out is shown in Table 18-35. The PEBS record format, along with debug/store area storage format, does not change regardless of whether IA-32e mode is active or not. CPUID.01H:ECX.DTES64[bit 2] reports whether the processor's DS storage format support is modeindependent. When set, it uses 64-bit DS storage format. Vol. 3B 18-59 PERFORMANCE MONITORING Table 18-35. PEBS Record Format for 6th Generation, 7th Generation and 8th Generation Intel Core Processor Families Byte Offset Field Byte Offset Field 00H R/EFLAGS 68H R11 08H R/EIP 70H R12 10H R/EAX 78H R13 18H R/EBX 80H R14 20H R/ECX 88H R15 28H R/EDX 90H Applicable Counter 30H R/ESI 98H Data Linear Address 38H R/EDI A0H Data Source Encoding 40H R/EBP A8H Latency value (core cycles) 48H R/ESP B0H EventingIP 50H R8 B8H TX Abort Information (Section 18.3.6.5.1) 58H R9 C0H TSC 60H R10 The layout of PEBS records are largely identical to those shown in Table 18-24. The PEBS records at offsets 98H, A0H, and ABH record data gathered from three of the PEBS capabilities in prior processor generations: load latency facility (Section 18.3.4.4.2), PDIR (Section 18.3.4.4.4), and data address profiling (Section 18.3.6.3). In the core PMU of the 6th generation, 7th generation and 8th generation Intel Core processors, load latency facility and PDIR capabilities and data address profiling are unchanged relative to the 4th generation and 5th generation Intel Core processors. Similarly, precise store is replaced by data address profiling. With format 0010b, a snapshot of the IA32_PERF_GLOBAL_STATUS may be useful to resolve the situations when more than one of IA32_PMICx have been configured to collect PEBS data and two consecutive overflows of the PEBS-enabled counters are sufficiently far apart in time. It is also possible for the image at 90H to indicate multiple PEBS-enabled counters have overflowed. In the latter scenario, software cannot to correlate the PEBS record entry to the multiple overflowed bits. With PEBS record format encoding 0011b, offset 90H reports the “applicable counter” field, which is a multicounter PEBS resolution index allowing software to correlate the PEBS record entry with the eventing PEBS overflow when multiple counters are configured to record PEBS records. Additionally, offset C0H captures a snapshot of the TSC that provides a time line annotation for each PEBS record entry. 18.3.8.1.2 PEBS Events The list of precise events supported for PEBS in the Skylake, Kaby Lake and Coffee Lake microarchitectures is shown in Table 18-36. 18-60 Vol. 3B PERFORMANCE MONITORING Table 18-36. Precise Events for the Skylake, Kaby Lake and Coffee Lake Microarchitectures Event Name Event Select INST_RETIRED C0H Sub-event UMask 1 01H PREC_DIST 2 ALL_CYCLES 01H OTHER_ASSISTS C1H ANY 3FH BR_INST_RETIRED C4H CONDITIONAL 01H NEAR_CALL 02H BR_MISP_RETIRED C5H ALL_BRANCHES 04H NEAR_RETURN 08H NEAR_TAKEN 20H FAR_BRACHES 40H CONDITIONAL 01H ALL_BRANCHES 04H NEAR_TAKEN 20H 3> 01H FRONTEND_RETIRED C6H PEBS_Absolute_Maximum. Note that when an out of bound condition is encountered, the overflow bits in IA32_PERF_GLOBAL_STATUS will be cleared according to Applicable Counters, however the IA32_PMCx values will not be reloaded with the Reset values stored in the DS_AREA. 18.5.3.2 Offcore Response Event Event number 0B7H support offcore response monitoring using an associated configuration MSR, MSR_OFFCORE_RSP0 (address 1A6H) in conjunction with umask value 01H or MSR_OFFCORE_RSP1 (address 1A7H) in conjunction with umask value 02H. Table 18-56 lists the event code, mask value and additional off-core configuration MSR that must be programmed to count off-core response events using IA32_PMCx. The Goldmont microarchitecture provides unique pairs of MSR_OFFCORE_RSPx registers per core. The layout of MSR_OFFCORE_RSP0 and MSR_OFFCORE_RSP1 are organized as follows: 18-84 Vol. 3B PERFORMANCE MONITORING • • • Bits 15:0 specifies the request type of a transaction request to the uncore. This is described in Table 18-63. • For outstanding requests, bit 38 can enable measurement of average latency of specific type of offcore transaction requests using two programmable counter simultaneously; see Section 18.5.2.3 for details. Bits 30:16 specifies common supplier information or an L2 Hit, and is described in Table 18-58. If L2 misses, then Bits 37:31 can be used to specify snoop response information and is described in Table 18-64. Table 18-63. MSR_OFFCORE_RSPx Request_Type Field Definition Bit Name Offset Description DEMAND_DATA_RD 0 Counts cacheline read requests due to demand reads (excludes prefetches). DEMAND_RFO 1 Counts cacheline read for ownership (RFO) requests due to demand writes (excludes prefetches). DEMAND_CODE_RD 2 Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache. COREWB 3 Counts writeback transactions caused by L1 or L2 cache evictions. PF_L2_DATA_RD 4 Counts data cacheline reads generated by hardware L2 cache prefetcher. PF_L2_RFO 5 Counts reads for ownership (RFO) requests generated by L2 prefetcher. Reserved 6 Reserved. PARTIAL_READS 7 Counts demand data partial reads, including data in uncacheable (UC) or uncacheable (WC) write combining memory types. PARTIAL_WRITES 8 Counts partial writes, including uncacheable (UC), write through (WT) and write protected (WP) memory type writes. UC_CODE_READS 9 Counts code reads in uncacheable (UC) memory region. BUS_LOCKS 10 Counts bus lock and split lock requests. FULL_STREAMING_STORES 11 Counts full cacheline writes due to streaming stores. SW_PREFETCH 12 Counts cacheline requests due to software prefetch instructions. PF_L1_DATA_RD 13 Counts data cacheline reads generated by hardware L1 data cache prefetcher. PARTIAL_STREAMING_STORES 14 Counts partial cacheline writes due to streaming stores. ANY_REQUEST 15 Counts requests to the uncore subsystem. To properly program this extra register, software must set at least one request type bit (Table 18-57) and a valid response type pattern (either Table 18-58 or Table 18-64). Otherwise, the event count reported will be zero. It is permissible and useful to set multiple request and response type bits in order to obtain various classes of off-core response events. Although MSR_OFFCORE_RSPx allow an agent software to program numerous combinations that meet the above guideline, not all combinations produce meaningful data. Vol. 3B 18-85 PERFORMANCE MONITORING Table 18-64. MSR_OFFCORE_RSPx For L2 Miss and Outstanding Requests Subtype Bit Name Offset Description L2_MISS (Snoop Info) Reserved 32:31 Reserved L2_MISS.SNOOP_MISS_O 33 R_NO_SNOOP_NEEDED A true miss to this module, for which a snoop request missed the other module or no snoop was performed/needed. L2_MISS.HIT_OTHER_CO RE_NO_FWD 34 A snoop hit in the other processor module, but no data forwarding is required. Reserved 35 Reserved L2_MISS.HITM_OTHER_C 36 ORE Counts the number of snoops hit in the other module or other core's L1 where modified copies were found. L2_MISS.NON_DRAM 37 Target was a non-DRAM system address. This includes MMIO transactions. 38 Counts weighted cycles of outstanding offcore requests of the request type specified in bits 15:0, from the time the XQ receives the request and any response is received. Bits 37:16 must be set to 0. This bit is only available in MSR_OFFCORE_RESP0. Outstanding OUTSTANDING requests1 NOTES: 1. See Section 18.5.2.3, “Average Offcore Request Latency Measurement” for details on how to use this bit to extract average latency. To specify a complete offcore response filter, software must properly program bits in the request and response type fields. A valid request type must have at least one bit set in the non-reserved bits of 15:0. A valid response type must be a non-zero value of the following expression: Any_Response Bit | L2 Hit | ‘OR’ of Snoop Info Bits | Outstanding Bit 18.5.3.3 Average Offcore Request Latency Measurement In Goldmont microarchitecture, measurement of average latency of offcore transaction requests is the same as described in Section 18.5.2.3. 18.5.4 Performance Monitoring for Goldmont Plus Microarchitecture Intel Atom processors based on the Goldmont Plus microarchitecture report architectural performance monitoring versionID = 4 and support non-architectural monitoring capabilities described in this section. Architectural performance monitoring version 4 capabilities are described in Section 18.2.4. Goldmont Plus performance monitoring capabilities are similar to Goldmont capabilities. The differences are in specific events and in which counters support PEBS. Goldmont Plus introduces the ability for fixed performance monitoring counters to generate PEBS records. Goldmont Plus will set the AnyThread deprecation CPUID bit (CPUID.0AH:EDX[15]) to indicate that the Any-Thread bits in IA32_PERFEVTSELx and IA32_FIXED_CTR_CTRL have no effect. The core PMU's capability is similar to that of the Goldmont microarchitecture described in Section 18.6.3, with some differences and enhancements summarized in Table 18-65. Table 18-65. Core PMU Comparison Between the Goldmont Plus and Goldmont Microarchitectures Box Goldmont Plus Microarchitecture Goldmont Microarchitecture Comment # of Fixed counters per core 3 3 Use CPUID to determine # of counters. See Section 18.2.1. 18-86 Vol. 3B PERFORMANCE MONITORING Table 18-65. Core PMU Comparison Between the Goldmont Plus and Goldmont Microarchitectures Box Goldmont Plus Microarchitecture Goldmont Microarchitecture Comment # of general-purpose counters per core 4 4 Use CPUID to determine # of counters. See Section 18.2.1. Counter width (R,W) R:48, W: 32/48 R:48, W: 32/48 No change. Architectural Performance Monitoring version ID 4 4 No change. Processor Event Based Sampling (PEBS) Events All General-Purpose and Fixed counters. Each General-Purpose counter supports all events (precise and non-precise). General-Purpose Counter 0 only. Supports all events (precise and non-precise). Precise events are listed in Table 18-61. Goldmont Plus supports PEBS on all counters. PEBS record format encoding 0011b 0011b No change. 18.5.4.1 Extended PEBS The PEBS facility in Goldmont Plus microarchitecture provides a number of enhancements relative to PEBS in processors from previous generations. Enhancement of PEBS facility with the Extended PEBS feature are described in detail in section 18.9. 18.5.5 Performance Monitoring for Tremont Microarchitecture Intel Atom processors based on the Tremont microarchitecture report architectural performance monitoring versionID = 5 and support non-architectural monitoring capabilities described in this section. Architectural performance monitoring version 5 capabilities are described in Section 18.2.5. Tremont performance monitoring capabilities are similar to Goldmont Plus capabilities, with the following extensions: • • • • Support for Adaptive PEBS. Support for PEBS output to Intel® Processor Trace. Precise Distribution support on Fixed Counter0. Compatibility enhancements to off-core response MSRs, MSR_OFFCORE_RSPx. The differences and enhancements between Tremont microarchitecture and Goldmont Plus microarchitecture are summarized in Table 18-66. Table 18-66. Core PMU Comparison Between the Tremont and Goldmont Plus Microarchitectures Box Tremont Microarchitecture Goldmont Plus Microarchitecture Comment # of fixed counters per core 3 3 Use CPUID to determine # of counters. See Section 18.2.1. # of general-purpose counters per core 4 4 Use CPUID to determine # of counters. See Section 18.2.1. Counter width (R,W) R:48, W: 32/48 R:48, W: 32/48 No change. See Section 18.2.2. Architectural Performance Monitoring version ID 5 4 Vol. 3B 18-87 PERFORMANCE MONITORING Table 18-66. Core PMU Comparison Between the Tremont and Goldmont Plus Microarchitectures Box Tremont Microarchitecture Goldmont Plus Microarchitecture Comment PEBS record format encoding 0100b 0011b See Section 18.6.2.4.2. Reduce skid PEBS IA32_PMC0 and IA32_FIXED_CTR0 IA32_PMC0 only Extended PEBS Yes Yes See Section 18.5.4.1. Adaptive PEBS Yes No See Section 18.9.2. PEBS output DS Save Area or Intel® Processor Trace DS Save Area only See Section 18.5.5.2.1. PEBS record layout See Section 18.9.2.3 for output to DS, Section 18.5.5.2.2 for output to Intel PT. Table 18-62; enhanced fields at offsets 90H- 98H; and TSC record field at C0H. Off-core Response Event MSR 1A6H and 1A7H, each core has its own register, extended request and response types. MSR 1A6H and 1A7H, each core has its own register. 18.5.5.1 Adaptive PEBS The PEBS record format and configuration interface has changed versus Goldmont Plus, as the Tremont microarchitecture includes support for the configurable Adaptive PEBS records; see Section 18.9.2. 18.5.5.2 PEBS output to Intel® Processor Trace Intel Atom processors based on the Tremont microarchitecture introduce the following Precise Event-Based Sampling (PEBS) extensions: • A mechanism to direct PEBS output into the Intel® Processor Trace (Intel® PT) output stream. In this scenario, the PEBS record is written in packetized form, in order to co-exist with other Intel PT trace data. • New Performance Monitoring counter reload MSRs, which are used by PEBS in place of the counter reload values stored in the DS Management area when PEBS output is directed into the Intel PT output stream. Processors that indicate support for Intel PT by setting CPUID.07H.0.EBX[25]=1, and set the new IA32_PERF_CAPABILITIES.PEBS_OUTPUT_PT_AVAIL[16] bit, support these extensions. 18.5.5.2.1 PEBS Configuration PEBS output to Intel Processor Trace includes support for two new fields in IA32_PEBS_ENABLE. Table 18-67. New Fields in IA32_PEBS_ENABLE Field Description PMI_AFTER_EACH_RECORD[60] Pend a PerfMon Interrupt (PMI) after each PEBS event. PEBS_OUTPUT[62:61] Specifies PEBS output destination. Encodings: 00B: DS Save Area. Matches legacy PEBS behavior, output location defined by IA32_DS_AREA. 01B: Intel PT trace output. 10B: Reserved. 11B: Reserved. When PEBS_OUTPUT is set to 01B, the DS Management Area is not used and need not be configured. Instead, the output mechanism is configured through IA32_RTIT_CTL and other Intel PT MSRs, while counter reload values are configured in the MSR_RELOAD_PMCx MSRs. Details on configuring Intel PT can be found in Section 35.2.6. 18-88 Vol. 3B PERFORMANCE MONITORING 63 62 61 60 n 32 31 ● ● ● m 1 0 ● ● ● PEBS_OUTPUT (R/W) PMI_AFTER_EACH_RECORD (R/W) PEBS_EN_FIXEDn (R/W) PEBS_EN_FIXED1 (R/W) PEBS_EN_FIXED0 (R/W) PEBS_EN_PMCm (R/W) PEBS_EN_PMC1 (R/W) PEBS_EN_PMC0 (R/W) Reserved RESET Value – 00000000 _00000000 H Figure 18-40. IA32_PEBS_ENABLE MSR with PEBS Output to Intel® Processor Trace 18.5.5.2.2 PEBS Record Format in Intel® Processor Trace The format of the PEBS record changes when output to Intel PT, as the PEBS state is packetized. Each PEBS grouping is emitted as a Block Begin (BBP) and following Block Item (BIP) packets. A PEBS grouping ends when either a new PEBS grouping begins (indicated by a BBP packet) or a Block End (BEP) packet is encountered. See Section 35.4.1.1 for details of these Intel PT packets. Because the packet headers describe the state held in the packet payload, PEBS state ordering is not fixed. PEBS state groupings may be emitted in any order, and the PEBS state elements within those groupings may be emitted in any order. Further, there is no packet that provides indication of “Record Format” or “Record Size”. If Intel PT tracing is not enabled (IA32_RTIT_STATUS.TriggerEn=0), any PEBS records triggered will be dropped. PEBS packets do not depend on ContextEn or FilterEn in IA32_RTIT_STATUS, any filtering of PEBS must be enabled from within the PerfMon configuration. Counter reload will occur in all scenarios where PEBS is triggered, regardless of TriggerEn. The PEBS threshold mechanism for generating PerfMon Interrupts (PMIs) is not available in this mode. However, there exist other means to generate PMIs based on PEBS output. When the Intel PT ToPA output mechanism is chosen, a PMI can optionally be pended when a ToPA region is filled; see Section 35.2.6.2 for details. Further, software can opt to generate a PMI on each PEBS record by setting the new IA32_PEBS_ENABLE.PMI_AFTER_EACH_RECORD[60] bit. The IA32_PERF_GLOBAL_STATUS.OvfDSBuffer bit will not be set in this mode. 18.5.5.2.3 PEBS Counter Reload When PEBS output is directed into Intel PT (IA32_PEBS_ENABLE.PEBS_OUTPUT = 01B), new MSR_RELOAD_PMCx MSRs are used by the PEBS routine to reload PerfMon counters. The value from the associated reload MSR will be loaded to the appropriate counter on each PEBS event. Vol. 3B 18-89 PERFORMANCE MONITORING 18.5.5.3 Precise Distribution Support on Fixed Counter 0 The Tremont microarchitecture supports the PDIR (Precise Distribution of Retired Instructions) facility, as described in Section 18.3.4.4.4, on Fixed Counter 0. Fixed Counter 0 counts the INST_RETIRED.ALL event. PEBS skid for Fixed Counter 0 will be precisely one instruction. This is in addition to the reduced skid PEBS behavior on IA32_PMC0; see Section 18.5.3.1.2. 18.5.5.4 Compatibility Enhancements to Offcore Response MSRs The Off-core Response facility is similar to that described in Section 18.5.3.2. The layout of MSR_OFFCORE_RSP0 and MSR_OFFCORE_RSP1 are organized as shown below. RequestType bits are defined in Table 18-68, ResponseType bits in Table 18-69, and SnoopInfo bits in Table 18-70. Table 18-68. MSR_OFFCORE_RSPx Request Type Definition Bit Name Offset Description DEMAND_DATA_RD 0 Counts demand data reads. DEMAND_RFO 1 Counts all demand reads for ownership (RFO) requests and software based prefetches for exclusive ownership (prefetchw). DEMAND_CODE_RD 2 Counts demand instruction fetches and L1 instruction cache prefetches. COREWB_M 3 Counts modified write backs from L1 and L2. HWPF_L2_DATA_RD 4 Counts prefetch (that bring data to L2) data reads. HWPF_L2_RFO 5 Counts all prefetch (that bring data to L2) RFOs. HWPF_L2_CODE_RD 6 Counts all prefetch (that bring data to L2 only) code reads. Reserved 9:7 Reserved. HWPF_L1D_AND_SWPF 10 Counts L1 data cache hardware prefetch requests, read for ownership prefetch requests and software prefetch requests (except prefetchw). STREAMING_WR 11 Counts all streaming stores. COREWB_NONM 12 Counts non-modified write backs from L2. Reserved 14:13 Reserved. OTHER 15 Counts miscellaneous requests, such as I/O accesses that have any response type. UC_RD 44 Counts uncached memory reads (PRd, UCRdF). UC_WR 45 Counts uncached memory writes (WiL). PARTIAL_STREAMING_WR 46 Counts partial (less than 64 byte) streaming stores (WCiL). FULL_STREAMING_WR 47 Counts full, 64 byte streaming stores (WCiLF). L1WB_M 48 Counts modified WriteBacks from L1 that miss the L2. L2WB_M 49 Counts modified WriteBacks from L2. Table 18-69. MSR_OFFCORE_RSPx Response Type Definition Bit Name Offset Description ANY_RESPONSE 16 Catch all value for any response types. L3_HIT_M 18 LLC/L3 Hit - M-state. L3_HIT_E 19 LLC/L3 Hit - E-state. L3_HIT_S 20 LLC/L3 Hit - S-state. 18-90 Vol. 3B PERFORMANCE MONITORING Table 18-69. MSR_OFFCORE_RSPx Response Type Definition Bit Name Offset Description L3_HIT_F 21 LLC/L3 Hit - I-state. LOCAL_DRAM 26 LLC/L3 Miss, DRAM Hit. OUTSTANDING 63 Average latency of outstanding requests with the other counter counting number of occurrences; can also can be used to count occupancy. Table 18-70. MSR_OFFCORE_RSPx Snoop Info Definition Bit Name Offset Description SNOOP_NONE 31 None of the cores were snooped. • LLC miss and Dram data returned directly to the core. SNOOP_NOT_NEEDED 32 No snoop needed to satisfy the request. • LLC hit and CV bit(s) (core valid) was not set. • LLC miss and Dram data returned directly to the core. SNOOP_MISS 33 A snoop was sent but missed. • LLC hit and CV bit(s) was set but snoop missed (silent data drop in core), data returned from LLC. • LLC miss and Dram data returned directly to the core. SNOOP_HIT_NO_FWD 34 A snoop was sent but no data forward. • LLC hit and CV bit(s) was set but no data forward from the core, data returned from LLC. • LLC miss and Dram data returned directly to the core. SNOOP_HIT_WITH_FWD 35 A snoop was sent and non-modified data was forward. • LLC hit and CV bit(s) was set, non-modified data was forward from core. SNOOP_HITM 36 A snoop was sent and modified data was forward. • LLC hit E or M and the CV bit(s) was set, modified data was forward from core. NON_DRAM_BIT 37 Target was non-DRAM system address, MMIO access. • LLC miss and Non-Dram data returned. The Off-core Response capability behaves as follows: • To specify a complete offcore response filter, software must properly program at least one RequestType and one ResponseType. A valid request type must have at least one bit set in the non-reserved bits of 15:0 or 49:44. A valid response type must be a non-zero value of one the following expressions: • Read requests: Any_Response Bit | (‘OR’ of Supplier Info Bits) ‘AND’ ( ‘OR’ of Snoop Info Bits) | Outstanding Bit • Write requests: Any_Response Bit | (‘OR’ of Supplier Info Bits) | Outstanding Bit • • • When the ANY_RESPONSE bit in the ResponseType is set, all other response type bits will be ignored. • • Bits 30:16 specifies common supplier information. True Demand Cacheable Loads include neither L1 Prefetches nor Software Prefetches. Bits 15:0 and Bits 49:44 specifies the request type of a transaction request to the uncore. This is described in Table 18-68. “Outstanding Requests” (bit 63) is only available on MSR_OFFCORE_RSP0; a #GP fault will occur if software attempts to write a 1 to this bit in MSR_OFFCORE_RSP1. It is mutually exclusive with any ResponseType. Vol. 3B 18-91 PERFORMANCE MONITORING Software must guarantee that all other ResponseType bits are set to 0 when the “Outstanding Requests” bit is set. • “Outstanding Requests” bit 63 can enable measurement of the average latency of a specific type of off-core transaction; two programmable counters must be used simultaneously and the RequestType programming for MSR_OFFCORE_RSP0 and MSR_OFFCORE_RSP1 must be the same when using this Average Latency feature. See Section 18.5.2.3 for further details. 18.6 PERFORMANCE MONITORING (LEGACY INTEL PROCESSORS) 18.6.1 Performance Monitoring (Intel® Core™ Solo and Intel® Core™ Duo Processors) In Intel Core Solo and Intel Core Duo processors, non-architectural performance monitoring events are programmed using the same facilities (see Figure 18-1) used for architectural performance events. Non-architectural performance events use event select values that are model-specific. Event mask (Umask) values are also specific to event logic units. Some microarchitectural conditions detectable by a Umask value may have specificity related to processor topology (see Section 8.6, “Detecting Hardware Multi-Threading Support and Topology,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). As a result, the unit mask field (for example, IA32_PERFEVTSELx[bits 15:8]) may contain sub-fields that specify topology information of processor cores. The sub-field layout within the Umask field may support two-bit encoding that qualifies the relationship between a microarchitectural condition and the originating core. This data is shown in Table 18-71. The two-bit encoding for core-specificity is only supported for a subset of Umask values (see Chapter 19, “Performance Monitoring Events”) and for Intel Core Duo processors. Such events are referred to as core-specific events. Table 18-71. Core Specificity Encoding within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit 15:14 Encoding Description 11B All cores 10B Reserved 01B This core 00B Reserved Some microarchitectural conditions allow detection specificity only at the boundary of physical processors. Some bus events belong to this category, providing specificity between the originating physical processor (a bus agent) versus other agents on the bus. Sub-field encoding for agent specificity is shown in Table 18-72. Table 18-72. Agent Specificity Encoding within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit 13 Encoding Description 0 This agent 1 Include all agents Some microarchitectural conditions are detectable only from the originating core. In such cases, unit mask does not support core-specificity or agent-specificity encodings. These are referred to as core-only conditions. Some microarchitectural conditions allow detection specificity that includes or excludes the action of hardware prefetches. A two-bit encoding may be supported to qualify hardware prefetch actions. Typically, this applies only to some L2 or bus events. The sub-field encoding for hardware prefetch qualification is shown in Table 18-73. 18-92 Vol. 3B PERFORMANCE MONITORING Table 18-73. HW Prefetch Qualification Encoding within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit 13:12 Encoding Description 11B All inclusive 10B Reserved 01B Hardware prefetch only 00B Exclude hardware prefetch Some performance events may (a) support none of the three event-specific qualification encodings (b) may support core-specificity and agent specificity simultaneously (c) or may support core-specificity and hardware prefetch qualification simultaneously. Agent-specificity and hardware prefetch qualification are mutually exclusive. In addition, some L2 events permit qualifications that distinguish cache coherent states. The sub-field definition for cache coherency state qualification is shown in Table 18-74. If no bits in the MESI qualification sub-field are set for an event that requires setting MESI qualification bits, the event count will not increment. Table 18-74. MESI Qualification Definitions within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit Position 11:8 Description Bit 11 Counts modified state Bit 10 Counts exclusive state Bit 9 Counts shared state Bit 8 Counts Invalid state 18.6.2 Performance Monitoring (Processors Based on Intel® Core™ Microarchitecture) In addition to architectural performance monitoring, processors based on the Intel Core microarchitecture support non-architectural performance monitoring events. Architectural performance events can be collected using general-purpose performance counters. Non-architectural performance events can be collected using general-purpose performance counters (coupled with two IA32_PERFEVTSELx MSRs for detailed event configurations), or fixed-function performance counters (see Section 18.6.2.1). IA32_PERFEVTSELx MSRs are architectural; their layout is shown in Figure 18-1. Starting with Intel Core 2 processor T 7700, fixed-function performance counters and associated counter control and status MSR becomes part of architectural performance monitoring version 2 facilities (see also Section 18.2.2). Non-architectural performance events in processors based on Intel Core microarchitecture use event select values that are model-specific. Valid event mask (Umask) bits are listed in Chapter 19. The UMASK field may contain subfields identical to those listed in Table 18-71, Table 18-72, Table 18-73, and Table 18-74. One or more of these sub-fields may apply to specific events on an event-by-event basis. Details are listed in Table 19-27 in Chapter 19, “Performance Monitoring Events.” In addition, the UMASK filed may also contain a sub-field that allows detection specificity related to snoop responses. Bits of the snoop response qualification sub-field are defined in Table 18-75. Vol. 3B 18-93 PERFORMANCE MONITORING Table 18-75. Bus Snoop Qualification Definitions within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit Position 11:8 Description Bit 11 HITM response Bit 10 Reserved Bit 9 HIT response Bit 8 CLEAN response There are also non-architectural events that support qualification of different types of snoop operation. The corresponding bit field for snoop type qualification are listed in Table 18-76. Table 18-76. Snoop Type Qualification Definitions within a Non-Architectural Umask IA32_PERFEVTSELx MSRs Bit Position 9:8 Description Bit 9 CMP2I snoops Bit 8 CMP2S snoops No more than one sub-field of MESI, snoop response, and snoop type qualification sub-fields can be supported in a performance event. NOTE Software must write known values to the performance counters prior to enabling the counters. The content of general-purpose counters and fixed-function counters are undefined after INIT or RESET. 18.6.2.1 Fixed-function Performance Counters Processors based on Intel Core microarchitecture provide three fixed-function performance counters. Bits beyond the width of the fixed counter are reserved and must be written as zeros. Model-specific fixed-function performance counters on processors that support Architectural Perfmon version 1 are 40 bits wide. Each of the fixed-function counter is dedicated to count a pre-defined performance monitoring events. See Table 18-2 for details of the PMC addresses and what these events count. Programming the fixed-function performance counters does not involve any of the IA32_PERFEVTSELx MSRs, and does not require specifying any event masks. Instead, the MSR IA32_FIXED_CTR_CTRL provides multiple sets of 4-bit fields; each 4-bit field controls the operation of a fixed-function performance counter (PMC). See Figures 18-41. Two sub-fields are defined for each control. See Figure 18-41; bit fields are: • Enable field (low 2 bits in each 4-bit control) — When bit 0 is set, performance counting is enabled in the corresponding fixed-function performance counter to increment when the target condition associated with the architecture performance event occurs at ring 0. When bit 1 is set, performance counting is enabled in the corresponding fixed-function performance counter to increment when the target condition associated with the architecture performance event occurs at ring greater than 0. Writing 0 to both bits stops the performance counter. Writing 11B causes the counter to increment irrespective of privilege levels. 18-94 Vol. 3B PERFORMANCE MONITORING 63 12 11 9 8 7 P M I P M I E N 5 4 3 2 1 0 E N P M I E N Cntr2 — Controls for IA32_FIXED_CTR2 Cntr1 — Controls for IA32_FIXED_CTR1 PMI — Enable PMI on overflow Cntr0 — Controls for IA32_FIXED_CTR0 ENABLE — 0: disable; 1: OS; 2: User; 3: All ring levels Reserved SDM30265 Figure 18-41. Layout of IA32_FIXED_CTR_CTRL MSR • PMI field (fourth bit in each 4-bit control) — When set, the logical processor generates an exception through its local APIC on overflow condition of the respective fixed-function counter. 18.6.2.2 Global Counter Control Facilities Processors based on Intel Core microarchitecture provides simplified performance counter control that simplifies the most frequent operations in programming performance events, i.e. enabling/disabling event counting and checking the status of counter overflows. This is done by the following three MSRs: • MSR_PERF_GLOBAL_CTRL enables/disables event counting for all or any combination of fixed-function PMCs (IA32_FIXED_CTRx) or general-purpose PMCs via a single WRMSR. • MSR_PERF_GLOBAL_STATUS allows software to query counter overflow conditions on any combination of fixed-function PMCs (IA32_FIXED_CTRx) or general-purpose PMCs via a single RDMSR. • MSR_PERF_GLOBAL_OVF_CTRL allows software to clear counter overflow conditions on any combination of fixed-function PMCs (IA32_FIXED_CTRx) or general-purpose PMCs via a single WRMSR. MSR_PERF_GLOBAL_CTRL MSR provides single-bit controls to enable counting in each performance counter (see Figure 18-42). Each enable bit in MSR_PERF_GLOBAL_CTRL is AND’ed with the enable bits for all privilege levels in the respective IA32_PERFEVTSELx or IA32_FIXED_CTR_CTRL MSRs to start/stop the counting of respective counters. Counting is enabled if the AND’ed results is true; counting is disabled when the result is false. 63 35 34 33 32 31 2 1 0 FIXED_CTR2 enable FIXED_CTR1 enable FIXED_CTR0 enable PMC1 enable PMC0 enable Reserved Figure 18-42. Layout of MSR_PERF_GLOBAL_CTRL MSR Vol. 3B 18-95 PERFORMANCE MONITORING MSR_PERF_GLOBAL_STATUS MSR provides single-bit status used by software to query the overflow condition of each performance counter. MSR_PERF_GLOBAL_STATUS[bit 62] indicates overflow conditions of the DS area data buffer. MSR_PERF_GLOBAL_STATUS[bit 63] provides a CondChgd bit to indicate changes to the state of performance monitoring hardware (see Figure 18-43). A value of 1 in bits 34:32, 1, 0 indicates an overflow condition has occurred in the associated counter. 63 62 35 34 33 32 31 2 1 0 CondChgd OvfBuffer FIXED_CTR2 Overflow FIXED_CTR1 Overflow FIXED_CTR0 Overflow PMC1 Overflow PMC0 Overflow Reserved Figure 18-43. Layout of MSR_PERF_GLOBAL_STATUS MSR When a performance counter is configured for PEBS, an overflow condition in the counter will arm PEBS. On the subsequent event following overflow, the processor will generate a PEBS event. On a PEBS event, the processor will perform bounds checks based on the parameters defined in the DS Save Area (see Section 17.4.9). Upon successful bounds checks, the processor will store the data record in the defined buffer area, clear the counter overflow status, and reload the counter. If the bounds checks fail, the PEBS will be skipped entirely. In the event that the PEBS buffer fills up, the processor will set the OvfBuffer bit in MSR_PERF_GLOBAL_STATUS. MSR_PERF_GLOBAL_OVF_CTL MSR allows software to clear overflow the indicators for general-purpose or fixedfunction counters via a single WRMSR (see Figure 18-44). Clear overflow indications when: • • • Setting up new values in the event select and/or UMASK field for counting or interrupt-based event sampling. Reloading counter values to continue collecting next sample. Disabling event counting or interrupt-based event sampling. 63 62 35 34 33 32 31 2 1 0 ClrCondChgd ClrOvfBuffer FIXED_CTR2 ClrOverflow FIXED_CTR1 ClrOverflow FIXED_CTR0 ClrOverflow PMC1 ClrOverflow PMC0 ClrOverflow Reserved Figure 18-44. Layout of MSR_PERF_GLOBAL_OVF_CTRL MSR 18.6.2.3 At-Retirement Events Many non-architectural performance events are impacted by the speculative nature of out-of-order execution. A subset of non-architectural performance events on processors based on Intel Core microarchitecture are enhanced with a tagging mechanism (similar to that found in Intel NetBurst® microarchitecture) that exclude contributions that arise from speculative execution. The at-retirement events available in processors based on Intel Core micro- 18-96 Vol. 3B PERFORMANCE MONITORING architecture does not require special MSR programming control (see Section 18.6.3.6, “At-Retirement Counting”), but is limited to IA32_PMC0. See Table 18-77 for a list of events available to processors based on Intel Core microarchitecture. Table 18-77. At-Retirement Performance Events for Intel Core Microarchitecture Event Name UMask Event Select ITLB_MISS_RETIRED 00H C9H MEM_LOAD_RETIRED.L1D_MISS 01H CBH MEM_LOAD_RETIRED.L1D_LINE_MISS 02H CBH MEM_LOAD_RETIRED.L2_MISS 04H CBH MEM_LOAD_RETIRED.L2_LINE_MISS 08H CBH MEM_LOAD_RETIRED.DTLB_MISS 10H CBH 18.6.2.4 Processor Event Based Sampling (PEBS) Processors based on Intel Core microarchitecture also support processor event based sampling (PEBS). This feature was introduced by processors based on Intel NetBurst microarchitecture. PEBS uses a debug store mechanism and a performance monitoring interrupt to store a set of architectural state information for the processor. The information provides architectural state of the instruction executed after the instruction that caused the event (See Section 18.6.2.4.2 and Section 17.4.9). In cases where the same instruction causes BTS and PEBS to be activated, PEBS is processed before BTS are processed. The PMI request is held until the processor completes processing of PEBS and BTS. For processors based on Intel Core microarchitecture, precise events that can be used with PEBS are listed in Table 18-78. The procedure for detecting availability of PEBS is the same as described in Section 18.6.3.8.1. Table 18-78. PEBS Performance Events for Intel Core Microarchitecture Event Name UMask Event Select INSTR_RETIRED.ANY_P 00H C0H X87_OPS_RETIRED.ANY FEH C1H BR_INST_RETIRED.MISPRED 00H C5H SIMD_INST_RETIRED.ANY 1FH C7H MEM_LOAD_RETIRED.L1D_MISS 01H CBH MEM_LOAD_RETIRED.L1D_LINE_MISS 02H CBH MEM_LOAD_RETIRED.L2_MISS 04H CBH MEM_LOAD_RETIRED.L2_LINE_MISS 08H CBH MEM_LOAD_RETIRED.DTLB_MISS 10H CBH 18.6.2.4.1 Setting up the PEBS Buffer For processors based on Intel Core microarchitecture, PEBS is available using IA32_PMC0 only. Use the following procedure to set up the processor and IA32_PMC0 counter for PEBS: 1. Set up the precise event buffering facilities. Place values in the precise event buffer base, precise event index, precise event absolute maximum, precise event interrupt threshold, and precise event counter reset fields of the DS buffer management area. In processors based on Intel Core microarchitecture, PEBS records consist of 64-bit address entries. See Figure 17-8 to set up the precise event records buffer in memory. 2. Enable PEBS. Set the Enable PEBS on PMC0 flag (bit 0) in IA32_PEBS_ENABLE MSR. 3. Set up the IA32_PMC0 performance counter and IA32_PERFEVTSEL0 for an event listed in Table 18-78. Vol. 3B 18-97 PERFORMANCE MONITORING 18.6.2.4.2 PEBS Record Format The PEBS record format may be extended across different processor implementations. The IA32_PERF_CAPABILITES MSR defines a mechanism for software to handle the evolution of PEBS record format in processors that support architectural performance monitoring with version id equals 2 or higher. The bit fields of IA32_PERF_CAPABILITES are defined in Table 2-2 of Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. The relevant bit fields that governs PEBS are: • PEBSTrap [bit 6]: When set, PEBS recording is trap-like. After the PEBS-enabled counter has overflowed, PEBS record is recorded for the next PEBS-able event at the completion of the sampled instruction causing the PEBS event. When clear, PEBS recording is fault-like. The PEBS record is recorded before the sampled instruction causing the PEBS event. • PEBSSaveArchRegs [bit 7]: When set, PEBS will save architectural register and state information according to the encoded value of the PEBSRecordFormat field. When clear, only the return instruction pointer and flags are recorded. On processors based on Intel Core microarchitecture, this bit is always 1 • PEBSRecordFormat [bits 11:8]: Valid encodings are: — 0000B: Only general-purpose registers, instruction pointer and RFLAGS registers are saved in each PEBS record (See Section 18.6.3.8). — 0001B: PEBS record includes additional information of IA32_PERF_GLOBAL_STATUS and load latency data. (See Section 18.3.1.1.1). — 0010B: PEBS record includes additional information of IA32_PERF_GLOBAL_STATUS, load latency data, and TSX tuning information. (See Section 18.3.6.2). — 0011B: PEBS record includes additional information of load latency data, TSX tuning information, TSC data, and the applicable counter field replaces IA32_PERF_GLOBAL_STATUS at offset 90H. (See Section 18.3.8.1.1). — 0100B: PEBS record contents are defined by elections in MSR_PEBS_DATA_CFG. (See Section 18.9.2.3). 18.6.2.4.3 Writing a PEBS Interrupt Service Routine The PEBS facilities share the same interrupt vector and interrupt service routine (called the DS ISR) with the Interrupt-based event sampling and BTS facilities. To handle PEBS interrupts, PEBS handler code must be included in the DS ISR. See Section 17.4.9.1, “64 Bit Format of the DS Save Area,” for guidelines when writing the DS ISR. The service routine can query MSR_PERF_GLOBAL_STATUS to determine which counter(s) caused of overflow condition. The service routine should clear overflow indicator by writing to MSR_PERF_GLOBAL_OVF_CTL. A comparison of the sequence of requirements to program PEBS for processors based on Intel Core and Intel NetBurst microarchitectures is listed in Table 18-79. Table 18-79. Requirements to Program PEBS For Processors based on Intel Core microarchitecture For Processors based on Intel NetBurst microarchitecture Verify PEBS support of processor/OS. • IA32_MISC_ENABLE.EMON_AVAILABE (bit 7) is set. • IA32_MISC_ENABLE.PEBS_UNAVAILABE (bit 12) is clear. Ensure counters are in disabled. On initial set up or changing event configurations, write MSR_PERF_GLOBAL_CTRL MSR (38FH) with 0. Optional On subsequent entries: • Clear all counters if “Counter Freeze on PMI“ is not enabled. • If IA32_DebugCTL.Freeze is enabled, counters are automatically disabled. Counters MUST be stopped before writing.1 Disable PEBS. 18-98 Vol. 3B Clear ENABLE PMC0 bit in IA32_PEBS_ENABLE MSR (3F1H). Optional PERFORMANCE MONITORING Table 18-79. Requirements to Program PEBS (Contd.) For Processors based on Intel Core microarchitecture For Processors based on Intel NetBurst microarchitecture Check overflow conditions. Check MSR_PERF_GLOBAL_STATUS MSR (38EH) handle any overflow conditions. Check OVF flag of each CCCR for overflow condition Clear overflow status. Clear MSR_PERF_GLOBAL_STATUS MSR (38EH) using IA32_PERF_GLOBAL_OVF_CTRL MSR (390H). Clear OVF flag of each CCCR. Write “sample-after“ values. Configure the counter(s) with the sample after value. Configure specific counter configuration MSR. • Set local enable bit 22 - 1. • Do NOT set local counter PMI/INT bit, bit 20 - 0. • Event programmed must be PEBS capable. Allocate buffer for PEBS states. Allocate a buffer in memory for the precise information. • Set appropriate OVF_PMI bits - 1. • Only CCCR for MSR_IQ_COUNTER4 support PEBS. Program the IA32_DS_AREA MSR. Program the IA32_DS_AREA MSR. Configure the PEBS buffer management records. Configure the PEBS buffer management records in the DS buffer management area. Configure/Enable PEBS. Set Enable PMC0 bit in IA32_PEBS_ENABLE MSR (3F1H). Configure MSR_PEBS_ENABLE, MSR_PEBS_MATRIX_VERT and MSR_PEBS_MATRIX_HORZ as needed. Enable counters. Set Enable bits in MSR_PERF_GLOBAL_CTRL MSR (38FH). Set each CCCR enable bit 12 - 1. NOTES: 1. Counters read while enabled are not guaranteed to be precise with event counts that occur in timing proximity to the RDMSR. 18.6.2.4.4 Re-configuring PEBS Facilities When software needs to reconfigure PEBS facilities, it should allow a quiescent period between stopping the prior event counting and setting up a new PEBS event. The quiescent period is to allow any latent residual PEBS records to complete its capture at their previously specified buffer address (provided by IA32_DS_AREA). 18.6.3 Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture) The performance monitoring mechanism provided in processors based on Intel NetBurst microarchitecture is different from that provided in the P6 family and Pentium processors. While the general concept of selecting, filtering, counting, and reading performance events through the WRMSR, RDMSR, and RDPMC instructions is unchanged, the setup mechanism and MSR layouts are incompatible with the P6 family and Pentium processor mechanisms. Also, the RDPMC instruction has been extended to support faster reading of counters and to read all performance counters available in processors based on Intel NetBurst microarchitecture. The event monitoring mechanism consists of the following facilities: • The IA32_MISC_ENABLE MSR, which indicates the availability in an Intel 64 or IA-32 processor of the performance monitoring and processor event-based sampling (PEBS) facilities. • Event selection control (ESCR) MSRs for selecting events to be monitored with specific performance counters. The number available differs by family and model (43 to 45). • • 18 performance counter MSRs for counting events. • • • A debug store (DS) save area in memory for storing PEBS records. 18 counter configuration control (CCCR) MSRs, with one CCCR associated with each performance counter. CCCRs sets up an associated performance counter for a specific method of counting. The IA32_DS_AREA MSR, which establishes the location of the DS save area. The debug store (DS) feature flag (bit 21) returned by the CPUID instruction, which indicates the availability of the DS mechanism. Vol. 3B 18-99 PERFORMANCE MONITORING • The MSR_PEBS_ENABLE MSR, which enables the PEBS facilities and replay tagging used in at-retirement event counting. • A set of predefined events and event metrics that simplify the setting up of the performance counters to count specific events. Table 18-80 lists the performance counters and their associated CCCRs, along with the ESCRs that select events to be counted for each performance counter. Predefined event metrics and events are listed in Chapter 19, “Performance Monitoring Events.” Table 18-80. Performance Counter MSRs and Associated CCCR and ESCR MSRs (Processors Based on Intel NetBurst Microarchitecture) Counter CCCR ESCR Name No. Addr Name Addr Name No. Addr MSR_BPU_COUNTER0 0 300H MSR_BPU_CCCR0 360H MSR_BSU_ESCR0 MSR_FSB_ESCR0 MSR_MOB_ESCR0 MSR_PMH_ESCR0 MSR_BPU_ESCR0 MSR_IS_ESCR0 MSR_ITLB_ESCR0 MSR_IX_ESCR0 7 6 2 4 0 1 3 5 3A0H 3A2H 3AAH 3ACH 3B2H 3B4H 3B6H 3C8H MSR_BPU_COUNTER1 1 301H MSR_BPU_CCCR1 361H MSR_BSU_ESCR0 MSR_FSB_ESCR0 MSR_MOB_ESCR0 MSR_PMH_ESCR0 MSR_BPU_ESCR0 MSR_IS_ESCR0 MSR_ITLB_ESCR0 MSR_IX_ESCR0 7 6 2 4 0 1 3 5 3A0H 3A2H 3AAH 3ACH 3B2H 3B4H 3B6H 3C8H MSR_BPU_COUNTER2 2 302H MSR_BPU_CCCR2 362H MSR_BSU_ESCR1 MSR_FSB_ESCR1 MSR_MOB_ESCR1 MSR_PMH_ESCR1 MSR_BPU_ESCR1 MSR_IS_ESCR1 MSR_ITLB_ESCR1 MSR_IX_ESCR1 7 6 2 4 0 1 3 5 3A1H 3A3H 3ABH 3ADH 3B3H 3B5H 3B7H 3C9H MSR_BPU_COUNTER3 3 303H MSR_BPU_CCCR3 363H MSR_BSU_ESCR1 MSR_FSB_ESCR1 MSR_MOB_ESCR1 MSR_PMH_ESCR1 MSR_BPU_ESCR1 MSR_IS_ESCR1 MSR_ITLB_ESCR1 MSR_IX_ESCR1 7 6 2 4 0 1 3 5 3A1H 3A3H 3ABH 3ADH 3B3H 3B5H 3B7H 3C9H MSR_MS_COUNTER0 4 304H MSR_MS_CCCR0 364H MSR_MS_ESCR0 MSR_TBPU_ESCR0 MSR_TC_ESCR0 0 2 1 3C0H 3C2H 3C4H MSR_MS_COUNTER1 5 305H MSR_MS_CCCR1 365H MSR_MS_ESCR0 MSR_TBPU_ESCR0 MSR_TC_ESCR0 0 2 1 3C0H 3C2H 3C4H MSR_MS_COUNTER2 6 306H MSR_MS_CCCR2 366H MSR_MS_ESCR1 MSR_TBPU_ESCR1 MSR_TC_ESCR1 0 2 1 3C1H 3C3H 3C5H MSR_MS_COUNTER3 7 307H MSR_MS_CCCR3 367H MSR_MS_ESCR1 MSR_TBPU_ESCR1 MSR_TC_ESCR1 0 2 1 3C1H 3C3H 3C5H 18-100 Vol. 3B PERFORMANCE MONITORING Table 18-80. Performance Counter MSRs and Associated CCCR and ESCR MSRs (Processors Based on Intel NetBurst Microarchitecture) (Contd.) Counter CCCR ESCR Name No. Addr Name Addr Name No. Addr MSR_FLAME_COUNTER0 8 308H MSR_FLAME_CCCR0 368H MSR_FIRM_ESCR0 MSR_FLAME_ESCR0 MSR_DAC_ESCR0 MSR_SAAT_ESCR0 MSR_U2L_ESCR0 1 0 5 2 3 3A4H 3A6H 3A8H 3AEH 3B0H MSR_FLAME_COUNTER1 9 309H MSR_FLAME_CCCR1 369H MSR_FIRM_ESCR0 MSR_FLAME_ESCR0 MSR_DAC_ESCR0 MSR_SAAT_ESCR0 MSR_U2L_ESCR0 1 0 5 2 3 3A4H 3A6H 3A8H 3AEH 3B0H MSR_FLAME_COUNTER2 10 30AH MSR_FLAME_CCCR2 36AH MSR_FIRM_ESCR1 MSR_FLAME_ESCR1 MSR_DAC_ESCR1 MSR_SAAT_ESCR1 MSR_U2L_ESCR1 1 0 5 2 3 3A5H 3A7H 3A9H 3AFH 3B1H MSR_FLAME_COUNTER3 11 30BH MSR_FLAME_CCCR3 36BH MSR_FIRM_ESCR1 MSR_FLAME_ESCR1 MSR_DAC_ESCR1 MSR_SAAT_ESCR1 MSR_U2L_ESCR1 1 0 5 2 3 3A5H 3A7H 3A9H 3AFH 3B1H MSR_IQ_COUNTER0 12 30CH MSR_IQ_CCCR0 36CH MSR_CRU_ESCR0 MSR_CRU_ESCR2 MSR_CRU_ESCR4 MSR_IQ_ESCR01 MSR_RAT_ESCR0 MSR_SSU_ESCR0 MSR_ALF_ESCR0 4 5 6 0 2 3 1 3B8H 3CCH 3E0H 3BAH 3BCH 3BEH 3CAH MSR_IQ_COUNTER1 13 30DH MSR_IQ_CCCR1 36DH MSR_CRU_ESCR0 MSR_CRU_ESCR2 MSR_CRU_ESCR4 MSR_IQ_ESCR01 MSR_RAT_ESCR0 MSR_SSU_ESCR0 MSR_ALF_ESCR0 4 5 6 0 2 3 1 3B8H 3CCH 3E0H 3BAH 3BCH 3BEH 3CAH MSR_IQ_COUNTER2 14 30EH MSR_IQ_CCCR2 36EH MSR_CRU_ESCR1 MSR_CRU_ESCR3 MSR_CRU_ESCR5 MSR_IQ_ESCR11 MSR_RAT_ESCR1 MSR_ALF_ESCR1 4 5 6 0 2 1 3B9H 3CDH 3E1H 3BBH 3BDH 3CBH MSR_IQ_COUNTER3 15 30FH MSR_IQ_CCCR3 36FH MSR_CRU_ESCR1 MSR_CRU_ESCR3 MSR_CRU_ESCR5 MSR_IQ_ESCR11 MSR_RAT_ESCR1 MSR_ALF_ESCR1 4 5 6 3B9H 3CDH 3E1H 3BBH 3BDH 3CBH MSR_CRU_ESCR0 MSR_CRU_ESCR2 MSR_CRU_ESCR4 MSR_IQ_ESCR01 MSR_RAT_ESCR0 MSR_SSU_ESCR0 MSR_ALF_ESCR0 4 5 6 0 2 3 1 MSR_IQ_COUNTER4 16 310H MSR_IQ_CCCR4 370H 2 1 0 3B8H 3CCH 3E0H 3BAH 3BCH 3BEH 3CAH Vol. 3B 18-101 PERFORMANCE MONITORING Table 18-80. Performance Counter MSRs and Associated CCCR and ESCR MSRs (Processors Based on Intel NetBurst Microarchitecture) (Contd.) Counter CCCR ESCR Name No. Addr Name Addr Name No. Addr MSR_IQ_COUNTER5 17 311H MSR_IQ_CCCR5 371H MSR_CRU_ESCR1 MSR_CRU_ESCR3 MSR_CRU_ESCR5 MSR_IQ_ESCR11 MSR_RAT_ESCR1 MSR_ALF_ESCR1 4 5 6 0 2 1 3B9H 3CDH 3E1H 3BBH 3BDH 3CBH NOTES: 1. MSR_IQ_ESCR0 and MSR_IQ_ESCR1 are available only on early processor builds (family 0FH, models 01H-02H). These MSRs are not available on later versions. The types of events that can be counted with these performance monitoring facilities are divided into two classes: non-retirement events and at-retirement events. • Non-retirement events (see Table 19-33) are events that occur any time during instruction execution (such as bus transactions or cache transactions). • At-retirement events (see Table 19-34) are events that are counted at the retirement stage of instruction execution, which allows finer granularity in counting events and capturing machine state. The at-retirement counting mechanism includes facilities for tagging μops that have encountered a particular performance event during instruction execution. Tagging allows events to be sorted between those that occurred on an execution path that resulted in architectural state being committed at retirement as well as events that occurred on an execution path where the results were eventually cancelled and never committed to architectural state (such as, the execution of a mispredicted branch). The Pentium 4 and Intel Xeon processor performance monitoring facilities support the three usage models described below. The first two models can be used to count both non-retirement and at-retirement events; the third model is used to count a subset of at-retirement events: • Event counting — A performance counter is configured to count one or more types of events. While the counter is counting, software reads the counter at selected intervals to determine the number of events that have been counted between the intervals. • Interrupt-based event sampling — A performance counter is configured to count one or more types of events and to generate an interrupt when it overflows. To trigger an overflow, the counter is preset to a modulus value that will cause the counter to overflow after a specific number of events have been counted. When the counter overflows, the processor generates a performance monitoring interrupt (PMI). The interrupt service routine for the PMI then records the return instruction pointer (RIP), resets the modulus, and restarts the counter. Code performance can be analyzed by examining the distribution of RIPs with a tool like the VTune™ Performance Analyzer. • Processor event-based sampling (PEBS) — In PEBS, the processor writes a record of the architectural state of the processor to a memory buffer after the counter overflows. The records of architectural state provide additional information for use in performance tuning. Processor-based event sampling can be used to count only a subset of at-retirement events. PEBS captures more precise processor state information compared to interrupt based event sampling, because the latter need to use the interrupt service routine to re-construct the architectural states of processor. The following sections describe the MSRs and data structures used for performance monitoring in the Pentium 4 and Intel Xeon processors. 18.6.3.1 ESCR MSRs The 45 ESCR MSRs (see Table 18-80) allow software to select specific events to be countered. Each ESCR is usually associated with a pair of performance counters (see Table 18-80) and each performance counter has several ESCRs associated with it (allowing the events counted to be selected from a variety of events). Figure 18-45 shows the layout of an ESCR MSR. The functions of the flags and fields are: 18-102 Vol. 3B PERFORMANCE MONITORING • USR flag, bit 2 — When set, events are counted when the processor is operating at a current privilege level (CPL) of 1, 2, or 3. These privilege levels are generally used by application code and unprotected operating system code. • OS flag, bit 3 — When set, events are counted when the processor is operating at CPL of 0. This privilege level is generally reserved for protected operating system code. (When both the OS and USR flags are set, events are counted at all privilege levels.) 31 30 25 24 Event Select 5 4 3 2 1 0 9 8 Tag Value Event Mask Tag Enable OS USR Reserved 63 32 Reserved Figure 18-45. Event Selection Control Register (ESCR) for Pentium 4 and Intel Xeon Processors without Intel HT Technology Support • Tag enable, bit 4 — When set, enables tagging of μops to assist in at-retirement event counting; when clear, disables tagging. See Section 18.6.3.6, “At-Retirement Counting.” • Tag value field, bits 5 through 8 — Selects a tag value to associate with a μop to assist in at-retirement event counting. • Event mask field, bits 9 through 24 — Selects events to be counted from the event class selected with the event select field. • Event select field, bits 25 through 30) — Selects a class of events to be counted. The events within this class that are counted are selected with the event mask field. When setting up an ESCR, the event select field is used to select a specific class of events to count, such as retired branches. The event mask field is then used to select one or more of the specific events within the class to be counted. For example, when counting retired branches, four different events can be counted: branch not taken predicted, branch not taken mispredicted, branch taken predicted, and branch taken mispredicted. The OS and USR flags allow counts to be enabled for events that occur when operating system code and/or application code are being executed. If neither the OS nor USR flag is set, no events will be counted. The ESCRs are initialized to all 0s on reset. The flags and fields of an ESCR are configured by writing to the ESCR using the WRMSR instruction. Table 18-80 gives the addresses of the ESCR MSRs. Writing to an ESCR MSR does not enable counting with its associated performance counter; it only selects the event or events to be counted. The CCCR for the selected performance counter must also be configured. Configuration of the CCCR includes selecting the ESCR and enabling the counter. 18.6.3.2 Performance Counters The performance counters in conjunction with the counter configuration control registers (CCCRs) are used for filtering and counting the events selected by the ESCRs. Processors based on Intel NetBurst microarchitecture provide 18 performance counters organized into 9 pairs. A pair of performance counters is associated with a particular subset of events and ESCR’s (see Table 18-80). The counter pairs are partitioned into four groups: • The BPU group, includes two performance counter pairs: — MSR_BPU_COUNTER0 and MSR_BPU_COUNTER1. — MSR_BPU_COUNTER2 and MSR_BPU_COUNTER3. Vol. 3B 18-103 PERFORMANCE MONITORING • The MS group, includes two performance counter pairs: — MSR_MS_COUNTER0 and MSR_MS_COUNTER1. — MSR_MS_COUNTER2 and MSR_MS_COUNTER3. • The FLAME group, includes two performance counter pairs: — MSR_FLAME_COUNTER0 and MSR_FLAME_COUNTER1. — MSR_FLAME_COUNTER2 and MSR_FLAME_COUNTER3. • The IQ group, includes three performance counter pairs: — MSR_IQ_COUNTER0 and MSR_IQ_COUNTER1. — MSR_IQ_COUNTER2 and MSR_IQ_COUNTER3. — MSR_IQ_COUNTER4 and MSR_IQ_COUNTER5. The MSR_IQ_COUNTER4 counter in the IQ group provides support for the PEBS. Alternate counters in each group can be cascaded: the first counter in one pair can start the first counter in the second pair and vice versa. A similar cascading is possible for the second counters in each pair. For example, within the BPU group of counters, MSR_BPU_COUNTER0 can start MSR_BPU_COUNTER2 and vice versa, and MSR_BPU_COUNTER1 can start MSR_BPU_COUNTER3 and vice versa (see Section 18.6.3.5.6, “Cascading Counters”). The cascade flag in the CCCR register for the performance counter enables the cascading of counters. Each performance counter is 40-bits wide (see Figure 18-46). The RDPMC instruction is intended to allow reading of either the full counter-width (40-bits) or, if ECX[31] is set to 1, the low 32-bits of the counter. Reading the low 32-bits is faster than reading the full counter width and is appropriate in situations where the count is small enough to be contained in 32 bits. In such cases, counter bits 31:0 are written to EAX, while 0 is written to EDX. The RDPMC instruction can be used by programs or procedures running at any privilege level and in virtual-8086 mode to read these counters. The PCE flag in control register CR4 (bit 8) allows the use of this instruction to be restricted to only programs and procedures running at privilege level 0. 31 0 Counter 63 32 39 Reserved Counter Figure 18-46. Performance Counter (Pentium 4 and Intel Xeon Processors) The RDPMC instruction is not serializing or ordered with other instructions. Thus, it does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the RDPMC instruction operation is performed. Only the operating system, executing at privilege level 0, can directly manipulate the performance counters, using the RDMSR and WRMSR instructions. A secure operating system would clear the PCE flag during system initialization to disable direct user access to the performance-monitoring counters, but provide a user-accessible programming interface that emulates the RDPMC instruction. Some uses of the performance counters require the counters to be preset before counting begins (that is, before the counter is enabled). This can be accomplished by writing to the counter using the WRMSR instruction. To set a counter to a specified number of counts before overflow, enter a 2s complement negative integer in the counter. The counter will then count from the preset value up to -1 and overflow. Writing to a performance counter in a Pentium 4 or Intel Xeon processor with the WRMSR instruction causes all 40 bits of the counter to be written. 18-104 Vol. 3B PERFORMANCE MONITORING 18.6.3.3 CCCR MSRs Each of the 18 performance counters has one CCCR MSR associated with it (see Table 18-80). The CCCRs control the filtering and counting of events as well as interrupt generation. Figure 18-47 shows the layout of an CCCR MSR. The functions of the flags and fields are as follows: • Enable flag, bit 12 — When set, enables counting; when clear, the counter is disabled. This flag is cleared on reset. • ESCR select field, bits 13 through 15 — Identifies the ESCR to be used to select events to be counted with the counter associated with the CCCR. • Compare flag, bit 18 — When set, enables filtering of the event count; when clear, disables filtering. The filtering method is selected with the threshold, complement, and edge flags. • Complement flag, bit 19 — Selects how the incoming event count is compared with the threshold value. When set, event counts that are less than or equal to the threshold value result in a single count being delivered to the performance counter; when clear, counts greater than the threshold value result in a count being delivered to the performance counter (see Section 18.6.3.5.2, “Filtering Events”). The complement flag is not active unless the compare flag is set. • Threshold field, bits 20 through 23 — Selects the threshold value to be used for comparisons. The processor examines this field only when the compare flag is set, and uses the complement flag setting to determine the type of threshold comparison to be made. The useful range of values that can be entered in this field depend on the type of event being counted (see Section 18.6.3.5.2, “Filtering Events”). • Edge flag, bit 24 — When set, enables rising edge (false-to-true) edge detection of the threshold comparison output for filtering event counts; when clear, rising edge detection is disabled. This flag is active only when the compare flag is set. Reserved 31 30 29 27 26 25 24 23 20 19 18 17 16 15 Threshold 13 12 11 ESCR Select 0 Reserved Reserved Enable Reserved: Must be set to 11B Compare Complement Edge FORCE_OVF OVF_PMI Cascade OVF 63 32 Reserved Figure 18-47. Counter Configuration Control Register (CCCR) • FORCE_OVF flag, bit 25 — When set, forces a counter overflow on every counter increment; when clear, overflow only occurs when the counter actually overflows. • OVF_PMI flag, bit 26 — When set, causes a performance monitor interrupt (PMI) to be generated when the counter overflows occurs; when clear, disables PMI generation. Note that the PMI is generated on the next event count after the counter has overflowed. • Cascade flag, bit 30 — When set, enables counting on one counter of a counter pair when its alternate counter in the other the counter pair in the same counter group overflows (see Section 18.6.3.2, “Performance Counters,” for further details); when clear, disables cascading of counters. Vol. 3B 18-105 PERFORMANCE MONITORING • OVF flag, bit 31 — Indicates that the counter has overflowed when set. This flag is a sticky flag that must be explicitly cleared by software. The CCCRs are initialized to all 0s on reset. The events that an enabled performance counter actually counts are selected and filtered by the following flags and fields in the ESCR and CCCR registers and in the qualification order given: 1. The event select and event mask fields in the ESCR select a class of events to be counted and one or more event types within the class, respectively. 2. The OS and USR flags in the ESCR selected the privilege levels at which events will be counted. 3. The ESCR select field of the CCCR selects the ESCR. Since each counter has several ESCRs associated with it, one ESCR must be chosen to select the classes of events that may be counted. 4. The compare and complement flags and the threshold field of the CCCR select an optional threshold to be used in qualifying an event count. 5. The edge flag in the CCCR allows events to be counted only on rising-edge transitions. The qualification order in the above list implies that the filtered output of one “stage” forms the input for the next. For instance, events filtered using the privilege level flags can be further qualified by the compare and complement flags and the threshold field, and an event that matched the threshold criteria, can be further qualified by edge detection. The uses of the flags and fields in the CCCRs are discussed in greater detail in Section 18.6.3.5, “Programming the Performance Counters for Non-Retirement Events.” 18.6.3.4 Debug Store (DS) Mechanism The debug store (DS) mechanism was introduced with processors based on Intel NetBurst microarchitecture to allow various types of information to be collected in memory-resident buffers for use in debugging and tuning programs. The DS mechanism can be used to collect two types of information: branch records and processor eventbased sampling (PEBS) records. The availability of the DS mechanism in a processor is indicated with the DS feature flag (bit 21) returned by the CPUID instruction. See Section 17.4.5, “Branch Trace Store (BTS),” and Section 18.6.3.8, “Processor Event-Based Sampling (PEBS),” for a description of these facilities. Records collected with the DS mechanism are saved in the DS save area. See Section 17.4.9, “BTS and DS Save Area.” 18.6.3.5 Programming the Performance Counters for Non-Retirement Events The basic steps to program a performance counter and to count events include the following: 1. Select the event or events to be counted. 2. For each event, select an ESCR that supports the event using the values in the ESCR restrictions row in Table 19-33, Chapter 19. 3. Match the CCCR Select value and ESCR name in Table 19-33 to a value listed in Table 18-80; select a CCCR and performance counter. 4. Set up an ESCR for the specific event or events to be counted and the privilege levels at which they are to be counted. 5. Set up the CCCR for the performance counter by selecting the ESCR and the desired event filters. 6. Set up the CCCR for optional cascading of event counts, so that when the selected counter overflows its alternate counter starts. 7. Set up the CCCR to generate an optional performance monitor interrupt (PMI) when the counter overflows. If PMI generation is enabled, the local APIC must be set up to deliver the interrupt to the processor and a handler for the interrupt must be in place. 8. Enable the counter to begin counting. 18-106 Vol. 3B PERFORMANCE MONITORING 18.6.3.5.1 Selecting Events to Count Table 19-34 in Chapter 19 lists a set of at-retirement events for processors based on Intel NetBurst microarchitecture. For each event listed in Table 19-34, setup information is provided. Table 18-81 gives an example of one of the events. Table 18-81. Event Example Event Name Event Parameters Parameter Value Description branch_retired Counts the retirement of a branch. Specify one or more mask bits to select any combination of branch taken, not-taken, predicted and mispredicted. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 See Table 15-3 for the addresses of the ESCR MSRs. Counter numbers per ESCR2: 12, 13, 16 The counter numbers associated with each ESCR are provided. The ESCR ESCR3: 14, 15, 17 performance counters and corresponding CCCRs can be obtained from Table 15-3. ESCR Event Select 06H ESCR[24:9] ESCR Event Mask Bit 0: MMNP CCCR Select ESCR[31:25] 1: MMNM Branch Not-taken Mispredicted 2: MMTP Branch Taken Predicted 3: MMTM Branch Taken Mispredicted 05H Event Specific Notes Branch Not-taken Predicted CCCR[15:13] P6: EMON_BR_INST_RETIRED Can Support PEBS No Requires Additional MSRs for Tagging No For Table 19-33 and Table 19-34 in Chapter 19, the name of the event is listed in the Event Name column and parameters that define the event and other information are listed in the Event Parameters column. The Parameter Value and Description columns give specific parameters for the event and additional description information. Entries in the Event Parameters column are described below. • ESCR restrictions — Lists the ESCRs that can be used to program the event. Typically only one ESCR is needed to count an event. • Counter numbers per ESCR — Lists which performance counters are associated with each ESCR. Table 18-80 gives the name of the counter and CCCR for each counter number. Typically only one counter is needed to count the event. • • ESCR event select — Gives the value to be placed in the event select field of the ESCR to select the event. • CCCR select — Gives the value to be placed in the ESCR select field of the CCCR associated with the counter to select the ESCR to be used to define the event. This value is not the address of the ESCR; it is the number of the ESCR from the Number column in Table 18-80. • Event specific notes — Gives additional information about the event, such as the name of the same or a similar event defined for the P6 family processors. • Can support PEBS — Indicates if PEBS is supported for the event (only supplied for at-retirement events listed in Table 19-34.) • Requires additional MSR for tagging — Indicates which if any additional MSRs must be programmed to count the events (only supplied for the at-retirement events listed in Table 19-34.) ESCR event mask — Gives the value to be placed in the Event Mask field of the ESCR to select sub-events to be counted. The parameter value column defines the documented bits with relative bit position offset starting from 0, where the absolute bit position of relative offset 0 is bit 9 of the ESCR. All undocumented bits are reserved and should be set to 0. Vol. 3B 18-107 PERFORMANCE MONITORING NOTE The performance-monitoring events listed in Chapter 19, “Performance Monitoring Events,” are intended to be used as guides for performance tuning. The counter values reported are not guaranteed to be absolutely accurate and should be used as a relative guide for tuning. Known discrepancies are documented where applicable. The following procedure shows how to set up a performance counter for basic counting; that is, the counter is set up to count a specified event indefinitely, wrapping around whenever it reaches its maximum count. This procedure is continued through the following four sections. Using information in Table 19-33, Chapter 19, an event to be counted can be selected as follows: 1. Select the event to be counted. 2. Select the ESCR to be used to select events to be counted from the ESCRs field. 3. Select the number of the counter to be used to count the event from the Counter Numbers Per ESCR field. 4. Determine the name of the counter and the CCCR associated with the counter, and determine the MSR addresses of the counter, CCCR, and ESCR from Table 18-80. 5. Use the WRMSR instruction to write the ESCR Event Select and ESCR Event Mask values into the appropriate fields in the ESCR. At the same time set or clear the USR and OS flags in the ESCR as desired. 6. Use the WRMSR instruction to write the CCCR Select value into the appropriate field in the CCCR. NOTE Typically all the fields and flags of the CCCR will be written with one WRMSR instruction; however, in this procedure, several WRMSR writes are used to more clearly demonstrate the uses of the various CCCR fields and flags. This setup procedure is continued in the next section, Section 18.6.3.5.2, “Filtering Events.” 18.6.3.5.2 Filtering Events Each counter receives up to 4 input lines from the processor hardware from which it is counting events. The counter treats these inputs as binary inputs (input 0 has a value of 1, input 1 has a value of 2, input 3 has a value of 4, and input 3 has a value of 8). When a counter is enabled, it adds this binary input value to the counter value on each clock cycle. For each clock cycle, the value added to the counter can then range from 0 (no event) to 15. For many events, only the 0 input line is active, so the counter is merely counting the clock cycles during which the 0 input is asserted. However, for some events two or more input lines are used. Here, the counters threshold setting can be used to filter events. The compare, complement, threshold, and edge fields control the filtering of counter increments by input value. If the compare flag is set, then a “greater than” or a “less than or equal to” comparison of the input value vs. a threshold value can be made. The complement flag selects “less than or equal to” (flag set) or “greater than” (flag clear). The threshold field selects a threshold value of from 0 to 15. For example, if the complement flag is cleared and the threshold field is set to 6, than any input value of 7 or greater on the 4 inputs to the counter will cause the counter to be incremented by 1, and any value less than 7 will cause an increment of 0 (or no increment) of the counter. Conversely, if the complement flag is set, any value from 0 to 6 will increment the counter and any value from 7 to 15 will not increment the counter. Note that when a threshold condition has been satisfied, the input to the counter is always 1, not the input value that is presented to the threshold filter. The edge flag provides further filtering of the counter inputs when a threshold comparison is being made. The edge flag is only active when the compare flag is set. When the edge flag is set, the resulting output from the threshold filter (a value of 0 or 1) is used as an input to the edge filter. Each clock cycle, the edge filter examines the last and current input values and sends a count to the counter only when it detects a “rising edge” event; that is, a false-totrue transition. Figure 18-48 illustrates rising edge filtering. 18-108 Vol. 3B PERFORMANCE MONITORING The following procedure shows how to configure a CCCR to filter events using the threshold filter and the edge filter. This procedure is a continuation of the setup procedure introduced in Section 18.6.3.5.1, “Selecting Events to Count.” 7. (Optional) To set up the counter for threshold filtering, use the WRMSR instruction to write values in the CCCR compare and complement flags and the threshold field: — Set the compare flag. — Set or clear the complement flag for less than or equal to or greater than comparisons, respectively. — Enter a value from 0 to 15 in the threshold field. 8. (Optional) Select rising edge filtering by setting the CCCR edge flag. This setup procedure is continued in the next section, Section 18.6.3.5.3, “Starting Event Counting.” Processor Clock Output from Threshold Filter Counter Increments On Rising Edge (False-to-True) Figure 18-48. Effects of Edge Filtering 18.6.3.5.3 Starting Event Counting Event counting by a performance counter can be initiated in either of two ways. The typical way is to set the enable flag in the counter’s CCCR. Following the instruction to set the enable flag, event counting begins and continues until it is stopped (see Section 18.6.3.5.5, “Halting Event Counting”). The following procedural step shows how to start event counting. This step is a continuation of the setup procedure introduced in Section 18.6.3.5.2, “Filtering Events.” 9. To start event counting, use the WRMSR instruction to set the CCCR enable flag for the performance counter. This setup procedure is continued in the next section, Section 18.6.3.5.4, “Reading a Performance Counter’s Count.” The second way that a counter can be started by using the cascade feature. Here, the overflow of one counter automatically starts its alternate counter (see Section 18.6.3.5.6, “Cascading Counters”). 18.6.3.5.4 Reading a Performance Counter’s Count Performance counters can be read using either the RDPMC or RDMSR instructions. The enhanced functions of the RDPMC instruction (including fast read) are described in Section 18.6.3.2, “Performance Counters.” These instructions can be used to read a performance counter while it is counting or when it is stopped. The following procedural step shows how to read the event counter. This step is a continuation of the setup procedure introduced in Section 18.6.3.5.3, “Starting Event Counting.” 10. To read a performance counters current event count, execute the RDPMC instruction with the counter number obtained from Table 18-80 used as an operand. This setup procedure is continued in the next section, Section 18.6.3.5.5, “Halting Event Counting.” 18.6.3.5.5 Halting Event Counting After a performance counter has been started (enabled), it continues counting indefinitely. If the counter overflows (goes one count past its maximum count), it wraps around and continues counting. When the counter wraps Vol. 3B 18-109 PERFORMANCE MONITORING around, it sets its OVF flag to indicate that the counter has overflowed. The OVF flag is a sticky flag that indicates that the counter has overflowed at least once since the OVF bit was last cleared. To halt counting, the CCCR enable flag for the counter must be cleared. The following procedural step shows how to stop event counting. This step is a continuation of the setup procedure introduced in Section 18.6.3.5.4, “Reading a Performance Counter’s Count.” 11. To stop event counting, execute a WRMSR instruction to clear the CCCR enable flag for the performance counter. To halt a cascaded counter (a counter that was started when its alternate counter overflowed), either clear the Cascade flag in the cascaded counter’s CCCR MSR or clear the OVF flag in the alternate counter’s CCCR MSR. 18.6.3.5.6 Cascading Counters As described in Section 18.6.3.2, “Performance Counters,” eighteen performance counters are implemented in pairs. Nine pairs of counters and associated CCCRs are further organized as four blocks: BPU, MS, FLAME, and IQ (see Table 18-80). The first three blocks contain two pairs each. The IQ block contains three pairs of counters (12 through 17) with associated CCCRs (MSR_IQ_CCCR0 through MSR_IQ_CCCR5). The first 8 counter pairs (0 through 15) can be programmed using ESCRs to detect performance monitoring events. Pairs of ESCRs in each of the four blocks allow many different types of events to be counted. The cascade flag in the CCCR MSR allows nested monitoring of events to be performed by cascading one counter to a second counter located in another pair in the same block (see Figure 18-47 for the location of the flag). Counters 0 and 1 form the first pair in the BPU block. Either counter 0 or 1 can be programmed to detect an event via MSR_MO B_ESCR0. Counters 0 and 2 can be cascaded in any order, as can counters 1 and 3. It’s possible to set up 4 counters in the same block to cascade on two pairs of independent events. The pairing described also applies to subsequent blocks. Since the IQ PUB has two extra counters, cascading operates somewhat differently if 16 and 17 are involved. In the IQ block, counter 16 can only be cascaded from counter 14 (not from 12); counter 14 cannot be cascaded from counter 16 using the CCCR cascade bit mechanism. Similar restrictions apply to counter 17. Example 18-1. Counting Events Assume a scenario where counter X is set up to count 200 occurrences of event A; then counter Y is set up to count 400 occurrences of event B. Each counter is set up to count a specific event and overflow to the next counter. In the above example, counter X is preset for a count of -200 and counter Y for a count of -400; this setup causes the counters to overflow on the 200th and 400th counts respectively. Continuing this scenario, counter X is set up to count indefinitely and wraparound on overflow. This is described in the basic performance counter setup procedure that begins in Section 18.6.3.5.1, “Selecting Events to Count.” Counter Y is set up with the cascade flag in its associated CCCR MSR set to 1 and its enable flag set to 0. To begin the nested counting, the enable bit for the counter X is set. Once enabled, counter X counts until it overflows. At this point, counter Y is automatically enabled and begins counting. Thus counter X overflows after 200 occurrences of event A. Counter Y then starts, counting 400 occurrences of event B before overflowing. When performance counters are cascaded, the counter Y would typically be set up to generate an interrupt on overflow. This is described in Section 18.6.3.5.8, “Generating an Interrupt on Overflow.” The cascading counters mechanism can be used to count a single event. The counting begins on one counter then continues on the second counter after the first counter overflows. This technique doubles the number of event counts that can be recorded, since the contents of the two counters can be added together. 18.6.3.5.7 EXTENDED CASCADING Extended cascading is a model-specific feature in the Intel NetBurst microarchitecture with CPUID DisplayFamily_DisplayModel 0F_02, 0F_03, 0F_04, 0F_06. This feature uses bit 11 in CCCRs associated with the IQ 18-110 Vol. 3B PERFORMANCE MONITORING block. See Table 18-82. Table 18-82. CCR Names and Bit Positions CCCR Name:Bit Position Bit Name Description MSR_IQ_CCCR1|2:11 Reserved MSR_IQ_CCCR0:11 CASCNT4INTO0 Allow counter 4 to cascade into counter 0 MSR_IQ_CCCR3:11 CASCNT5INTO3 Allow counter 5 to cascade into counter 3 MSR_IQ_CCCR4:11 CASCNT5INTO4 Allow counter 5 to cascade into counter 4 MSR_IQ_CCCR5:11 CASCNT4INTO5 Allow counter 4 to cascade into counter 5 The extended cascading feature can be adapted to the Interrupt based sampling usage model for performance monitoring. However, it is known that performance counters do not generate PMI in cascade mode or extended cascade mode due to an erratum. This erratum applies to processors with CPUID DisplayFamily_DisplayModel signature of 0F_02. For processors with CPUID DisplayFamily_DisplayModel signature of 0F_00 and 0F_01, the erratum applies to processors with stepping encoding greater than 09H. Counters 16 and 17 in the IQ block are frequently used in processor event-based sampling or at-retirement counting of events indicating a stalled condition in the pipeline. Neither counter 16 or 17 can initiate the cascading of counter pairs using the cascade bit in a CCCR. Extended cascading permits performance monitoring tools to use counters 16 and 17 to initiate cascading of two counters in the IQ block. Extended cascading from counter 16 and 17 is conceptually similar to cascading other counters, but instead of using CASCADE bit of a CCCR, one of the four CASCNTxINTOy bits is used. Example 18-2. Scenario for Extended Cascading A usage scenario for extended cascading is to sample instructions retired on logical processor 1 after the first 4096 instructions retired on logical processor 0. A procedure to program extended cascading in this scenario is outlined below: 1. Write the value 0 to counter 12. 2. Write the value 04000603H to MSR_CRU_ESCR0 (corresponding to selecting the NBOGNTAG and NBOGTAG event masks with qualification restricted to logical processor 1). 3. Write the value 04038800H to MSR_IQ_CCCR0. This enables CASCNT4INTO0 and OVF_PMI. An ISR can sample on instruction addresses in this case (do not set ENABLE, or CASCADE). 4. Write the value FFFFF000H into counter 16.1. 5. Write the value 0400060CH to MSR_CRU_ESCR2 (corresponding to selecting the NBOGNTAG and NBOGTAG event masks with qualification restricted to logical processor 0). 6. Write the value 00039000H to MSR_IQ_CCCR4 (set ENABLE bit, but not OVF_PMI). Another use for cascading is to locate stalled execution in a multithreaded application. Assume MOB replays in thread B cause thread A to stall. Getting a sample of the stalled execution in this scenario could be accomplished by: 1. Set up counter B to count MOB replays on thread B. 2. Set up counter A to count resource stalls on thread A; set its force overflow bit and the appropriate CASCNTxINTOy bit. 3. Use the performance monitoring interrupt to capture the program execution data of the stalled thread. 18.6.3.5.8 Generating an Interrupt on Overflow Any performance counter can be configured to generate a performance monitor interrupt (PMI) if the counter overflows. The PMI interrupt service routine can then collect information about the state of the processor or program Vol. 3B 18-111 PERFORMANCE MONITORING when overflow occurred. This information can then be used with a tool like the Intel® VTune™ Performance Analyzer to analyze and tune program performance. To enable an interrupt on counter overflow, the OVR_PMI flag in the counter’s associated CCCR MSR must be set. When overflow occurs, a PMI is generated through the local APIC. (Here, the performance counter entry in the local vector table [LVT] is set up to deliver the interrupt generated by the PMI to the processor.) The PMI service routine can use the OVF flag to determine which counter overflowed when multiple counters have been configured to generate PMIs. Also, note that these processors mask PMIs upon receiving an interrupt. Clear this condition before leaving the interrupt handler. When generating interrupts on overflow, the performance counter being used should be preset to value that will cause an overflow after a specified number of events are counted plus 1. The simplest way to select the preset value is to write a negative number into the counter, as described in Section 18.6.3.5.6, “Cascading Counters.” Here, however, if an interrupt is to be generated after 100 event counts, the counter should be preset to minus 100 plus 1 (-100 + 1), or -99. The counter will then overflow after it counts 99 events and generate an interrupt on the next (100th) event counted. The difference of 1 for this count enables the interrupt to be generated immediately after the selected event count has been reached, instead of waiting for the overflow to be propagation through the counter. Because of latency in the microarchitecture between the generation of events and the generation of interrupts on overflow, it is sometimes difficult to generate an interrupt close to an event that caused it. In these situations, the FORCE_OVF flag in the CCCR can be used to improve reporting. Setting this flag causes the counter to overflow on every counter increment, which in turn triggers an interrupt after every counter increment. 18.6.3.5.9 Counter Usage Guideline There are some instances where the user must take care to configure counting logic properly, so that it is not powered down. To use any ESCR, even when it is being used just for tagging, (any) one of the counters that the particular ESCR (or its paired ESCR) can be connected to should be enabled. If this is not done, 0 counts may result. Likewise, to use any counter, there must be some event selected in a corresponding ESCR (other than no_event, which generally has a select value of 0). 18.6.3.6 At-Retirement Counting At-retirement counting provides a means counting only events that represent work committed to architectural state and ignoring work that was performed speculatively and later discarded. One example of this speculative activity is branch prediction. When a branch misprediction occurs, the results of instructions that were decoded and executed down the mispredicted path are canceled. If a performance counter was set up to count all executed instructions, the count would include instructions whose results were canceled as well as those whose results committed to architectural state. To provide finer granularity in event counting in these situations, the performance monitoring facilities provided in the Pentium 4 and Intel Xeon processors provide a mechanism for tagging events and then counting only those tagged events that represent committed results. This mechanism is called “at-retirement counting.” Tables 19-34 through 19-38 list predefined at-retirement events and event metrics that can be used to for tagging events when using at retirement counting. The following terminology is used in describing at-retirement counting: • Bogus, non-bogus, retire — In at-retirement event descriptions, the term “bogus” refers to instructions or μops that must be canceled because they are on a path taken from a mispredicted branch. The terms “retired” and “non-bogus” refer to instructions or μops along the path that results in committed architectural state changes as required by the program being executed. Thus instructions and μops are either bogus or non-bogus, but not both. Several of the Pentium 4 and Intel Xeon processors’ performance monitoring events (such as, Instruction_Retired and Uops_Retired in Table 19-34) can count instructions or μops that are retired based on the characterization of bogus” versus non-bogus. • Tagging — Tagging is a means of marking μops that have encountered a particular performance event so they can be counted at retirement. During the course of execution, the same event can happen more than once per μop and a direct count of the event would not provide an indication of how many μops encountered that event. The tagging mechanisms allow a μop to be tagged once during its lifetime and thus counted once at retirement. The retired suffix is used for performance metrics that increment a count once per μop, rather than once per 18-112 Vol. 3B PERFORMANCE MONITORING event. For example, a μop may encounter a cache miss more than once during its life time, but a “Miss Retired” metric (that counts the number of retired μops that encountered a cache miss) will increment only once for that μop. A “Miss Retired” metric would be useful for characterizing the performance of the cache hierarchy for a particular instruction sequence. Details of various performance metrics and how these can be constructed using the Pentium 4 and Intel Xeon processors performance events are provided in the Intel Pentium 4 Processor Optimization Reference Manual (see Section 1.4, “Related Literature”). • Replay — To maximize performance for the common case, the Intel NetBurst microarchitecture aggressively schedules μops for execution before all the conditions for correct execution are guaranteed to be satisfied. In the event that all of these conditions are not satisfied, μops must be reissued. The mechanism that the Pentium 4 and Intel Xeon processors use for this reissuing of μops is called replay. Some examples of replay causes are cache misses, dependence violations, and unforeseen resource constraints. In normal operation, some number of replays is common and unavoidable. An excessive number of replays is an indication of a performance problem. • Assist — When the hardware needs the assistance of microcode to deal with some event, the machine takes an assist. One example of this is an underflow condition in the input operands of a floating-point operation. The hardware must internally modify the format of the operands in order to perform the computation. Assists clear the entire machine of μops before they begin and are costly. 18.6.3.6.1 Using At-Retirement Counting Processors based on Intel NetBurst microarchitecture allow counting both events and μops that encountered a specified event. For a subset of the at-retirement events listed in Table 19-34, a μop may be tagged when it encounters that event. The tagging mechanisms can be used in Interrupt-based event sampling, and a subset of these mechanisms can be used in PEBS. There are four independent tagging mechanisms, and each mechanism uses a different event to count μops tagged with that mechanism: • Front-end tagging — This mechanism pertains to the tagging of μops that encountered front-end events (for example, trace cache and instruction counts) and are counted with the Front_end_event event. • Execution tagging — This mechanism pertains to the tagging of μops that encountered execution events (for example, instruction types) and are counted with the Execution_Event event. • Replay tagging — This mechanism pertains to tagging of μops whose retirement is replayed (for example, a cache miss) and are counted with the Replay_event event. Branch mispredictions are also tagged with this mechanism. • No tags — This mechanism does not use tags. It uses the Instr_retired and the Uops_ retired events. Each tagging mechanism is independent from all others; that is, a μop that has been tagged using one mechanism will not be detected with another mechanism’s tagged-μop detector. For example, if μops are tagged using the front-end tagging mechanisms, the Replay_event will not count those as tagged μops unless they are also tagged using the replay tagging mechanism. However, execution tags allow up to four different types of μops to be counted at retirement through execution tagging. The independence of tagging mechanisms does not hold when using PEBS. When using PEBS, only one tagging mechanism should be used at a time. Certain kinds of μops that cannot be tagged, including I/O, uncacheable and locked accesses, returns, and far transfers. Table 19-34 lists the performance monitoring events that support at-retirement counting: specifically the Front_end_event, Execution_event, Replay_event, Inst_retired and Uops_retired events. The following sections describe the tagging mechanisms for using these events to tag μop and count tagged μops. 18.6.3.6.2 Tagging Mechanism for Front_end_event The Front_end_event counts μops that have been tagged as encountering any of the following events: • μop decode events — Tagging μops for μop decode events requires specifying bits in the ESCR associated with the performance-monitoring event, Uop_type. • Trace cache events — Tagging μops for trace cache events may require specifying certain bits in the MSR_TC_PRECISE_EVENT MSR (see Table 19-36). Vol. 3B 18-113 PERFORMANCE MONITORING Table 19-34 describes the Front_end_event and Table 19-36 describes metrics that are used to set up a Front_end_event count. The MSRs specified in the Table 19-34 that are supported by the front-end tagging mechanism must be set and one or both of the NBOGUS and BOGUS bits in the Front_end_event event mask must be set to count events. None of the events currently supported requires the use of the MSR_TC_PRECISE_EVENT MSR. 18.6.3.6.3 Tagging Mechanism For Execution_event Table 19-34 describes the Execution_event and Table 19-37 describes metrics that are used to set up an Execution_event count. The execution tagging mechanism differs from other tagging mechanisms in how it causes tagging. One upstream ESCR is used to specify an event to detect and to specify a tag value (bits 5 through 8) to identify that event. A second downstream ESCR is used to detect μops that have been tagged with that tag value identifier using Execution_event for the event selection. The upstream ESCR that counts the event must have its tag enable flag (bit 4) set and must have an appropriate tag value mask entered in its tag value field. The 4-bit tag value mask specifies which of tag bits should be set for a particular μop. The value selected for the tag value should coincide with the event mask selected in the downstream ESCR. For example, if a tag value of 1 is set, then the event mask of NBOGUS0 should be enabled, correspondingly in the downstream ESCR. The downstream ESCR detects and counts tagged μops. The normal (not tag value) mask bits in the downstream ESCR specify which tag bits to count. If any one of the tag bits selected by the mask is set, the related counter is incremented by one. This mechanism is summarized in the Table 19-37 metrics that are supported by the execution tagging mechanism. The tag enable and tag value bits are irrelevant for the downstream ESCR used to select the Execution_event. The four separate tag bits allow the user to simultaneously but distinctly count up to four execution events at retirement. (This applies for interrupt-based event sampling. There are additional restrictions for PEBS as noted in Section 18.6.3.8.3, “Setting Up the PEBS Buffer.”) It is also possible to detect or count combinations of events by setting multiple tag value bits in the upstream ESCR or multiple mask bits in the downstream ESCR. For example, use a tag value of 3H in the upstream ESCR and use NBOGUS0/NBOGUS1 in the downstream ESCR event mask. 18.6.3.7 Tagging Mechanism for Replay_event Table 19-34 describes the Replay_event and Table 19-38 describes metrics that are used to set up an Replay_event count. The replay mechanism enables tagging of μops for a subset of all replays before retirement. Use of the replay mechanism requires selecting the type of μop that may experience the replay in the MSR_PEBS_MATRIX_VERT MSR and selecting the type of event in the MSR_PEBS_ENABLE MSR. Replay tagging must also be enabled with the UOP_Tag flag (bit 24) in the MSR_PEBS_ENABLE MSR. The Table 19-38 lists the metrics that are support the replay tagging mechanism and the at-retirement events that use the replay tagging mechanism, and specifies how the appropriate MSRs need to be configured. The replay tags defined in Table A-5 also enable Processor Event-Based Sampling (PEBS, see Section 17.4.9). Each of these replay tags can also be used in normal sampling by not setting Bit 24 nor Bit 25 in IA_32_PEBS_ENABLE_MSR. Each of these metrics requires that the Replay_Event (see Table 19-34) be used to count the tagged μops. 18.6.3.8 Processor Event-Based Sampling (PEBS) The debug store (DS) mechanism in processors based on Intel NetBurst microarchitecture allow two types of information to be collected for use in debugging and tuning programs: PEBS records and BTS records. See Section 17.4.5, “Branch Trace Store (BTS),” for a description of the BTS mechanism. PEBS permits the saving of precise architectural information associated with one or more performance events in the precise event records buffer, which is part of the DS save area (see Section 17.4.9, “BTS and DS Save Area”). To use this mechanism, a counter is configured to overflow after it has counted a preset number of events. After the counter overflows, the processor copies the current state of the general-purpose and EFLAGS registers and instruction pointer into a record in the precise event records buffer. The processor then resets the count in the performance counter and restarts the counter. When the precise event records buffer is nearly full, an interrupt is 18-114 Vol. 3B PERFORMANCE MONITORING generated, allowing the precise event records to be saved. A circular buffer is not supported for precise event records. PEBS is supported only for a subset of the at-retirement events: Execution_event, Front_end_event, and Replay_event. Also, PEBS can only be carried out using the one performance counter, the MSR_IQ_COUNTER4 MSR. In processors based on Intel Core microarchitecture, a similar PEBS mechanism is also supported using IA32_PMC0 and IA32_PERFEVTSEL0 MSRs (See Section 18.6.2.4). 18.6.3.8.1 Detection of the Availability of the PEBS Facilities The DS feature flag (bit 21) returned by the CPUID instruction indicates (when set) the availability of the DS mechanism in the processor, which supports the PEBS (and BTS) facilities. When this bit is set, the following PEBS facilities are available: • The PEBS_UNAVAILABLE flag in the IA32_MISC_ENABLE MSR indicates (when clear) the availability of the PEBS facilities, including the MSR_PEBS_ENABLE MSR. • The enable PEBS flag (bit 24) in the MSR_PEBS_ENABLE MSR allows PEBS to be enabled (set) or disabled (clear). • The IA32_DS_AREA MSR can be programmed to point to the DS save area. 18.6.3.8.2 Setting Up the DS Save Area Section 17.4.9.2, “Setting Up the DS Save Area,” describes how to set up and enable the DS save area. This procedure is common for PEBS and BTS. 18.6.3.8.3 Setting Up the PEBS Buffer Only the MSR_IQ_COUNTER4 performance counter can be used for PEBS. Use the following procedure to set up the processor and this counter for PEBS: 1. Set up the precise event buffering facilities. Place values in the precise event buffer base, precise event index, precise event absolute maximum, and precise event interrupt threshold, and precise event counter reset fields of the DS buffer management area (see Figure 17-5) to set up the precise event records buffer in memory. 2. Enable PEBS. Set the Enable PEBS flag (bit 24) in MSR_PEBS_ENABLE MSR. 3. Set up the MSR_IQ_COUNTER4 performance counter and its associated CCCR and one or more ESCRs for PEBS as described in Tables 19-34 through 19-38. 18.6.3.8.4 Writing a PEBS Interrupt Service Routine The PEBS facilities share the same interrupt vector and interrupt service routine (called the DS ISR) with the nonprecise event-based sampling and BTS facilities. To handle PEBS interrupts, PEBS handler code must be included in the DS ISR. See Section 17.4.9.5, “Writing the DS Interrupt Service Routine,” for guidelines for writing the DS ISR. 18.6.3.8.5 Other DS Mechanism Implications The DS mechanism is not available in the SMM. It is disabled on transition to the SMM mode. Similarly the DS mechanism is disabled on the generation of a machine check exception and is cleared on processor RESET and INIT. The DS mechanism is available in real address mode. 18.6.3.9 Operating System Implications The DS mechanism can be used by the operating system as a debugging extension to facilitate failure analysis. When using this facility, a 25 to 30 times slowdown can be expected due to the effects of the trace store occurring on every taken branch. Vol. 3B 18-115 PERFORMANCE MONITORING Depending upon intended usage, the instruction pointers that are part of the branch records or the PEBS records need to have an association with the corresponding process. One solution requires the ability for the DS specific operating system module to be chained to the context switch. A separate buffer can then be maintained for each process of interest and the MSR pointing to the configuration area saved and setup appropriately on each context switch. If the BTS facility has been enabled, then it must be disabled and state stored on transition of the system to a sleep state in which processor context is lost. The state must be restored on return from the sleep state. It is required that an interrupt gate be used for the DS interrupt as opposed to a trap gate to prevent the generation of an endless interrupt loop. Pages that contain buffers must have mappings to the same physical address for all processes/logical processors, such that any change to CR3 will not change DS addresses. If this requirement cannot be satisfied (that is, the feature is enabled on a per thread/process basis), then the operating system must ensure that the feature is enabled/disabled appropriately in the context switch code. 18.6.4 Performance Monitoring and Intel Hyper-Threading Technology in Processors Based on Intel NetBurst® Microarchitecture The performance monitoring capability of processors based on Intel NetBurst microarchitecture and supporting Intel Hyper-Threading Technology is similar to that described in Section 18.6.3. However, the capability is extended so that: • • Performance counters can be programmed to select events qualified by logical processor IDs. Performance monitoring interrupts can be directed to a specific logical processor within the physical processor. The sections below describe performance counters, event qualification by logical processor ID, and special purpose bits in ESCRs/CCCRs. They also describe MSR_PEBS_ENABLE, MSR_PEBS_MATRIX_VERT, and MSR_TC_PRECISE_EVENT. 18.6.4.1 ESCR MSRs Figure 18-49 shows the layout of an ESCR MSR in processors supporting Intel Hyper-Threading Technology. The functions of the flags and fields are as follows: • T1_USR flag, bit 0 — When set, events are counted when thread 1 (logical processor 1) is executing at a current privilege level (CPL) of 1, 2, or 3. These privilege levels are generally used by application code and unprotected operating system code. 31 30 25 24 Event Select 5 4 3 2 1 0 9 8 Tag Value Event Mask Tag Enable T0_OS T0_USR T1_OS T1_USR Reserved 63 32 Reserved Figure 18-49. Event Selection Control Register (ESCR) for the Pentium 4 Processor, Intel Xeon Processor and Intel Xeon Processor MP Supporting Hyper-Threading Technology 18-116 Vol. 3B PERFORMANCE MONITORING • T1_OS flag, bit 1 — When set, events are counted when thread 1 (logical processor 1) is executing at CPL of 0. This privilege level is generally reserved for protected operating system code. (When both the T1_OS and T1_USR flags are set, thread 1 events are counted at all privilege levels.) • T0_USR flag, bit 2 — When set, events are counted when thread 0 (logical processor 0) is executing at a CPL of 1, 2, or 3. • T0_OS flag, bit 3 — When set, events are counted when thread 0 (logical processor 0) is executing at CPL of 0. (When both the T0_OS and T0_USR flags are set, thread 0 events are counted at all privilege levels.) • Tag enable, bit 4 — When set, enables tagging of μops to assist in at-retirement event counting; when clear, disables tagging. See Section 18.6.3.6, “At-Retirement Counting.” • Tag value field, bits 5 through 8 — Selects a tag value to associate with a μop to assist in at-retirement event counting. • Event mask field, bits 9 through 24 — Selects events to be counted from the event class selected with the event select field. • Event select field, bits 25 through 30) — Selects a class of events to be counted. The events within this class that are counted are selected with the event mask field. The T0_OS and T0_USR flags and the T1_OS and T1_USR flags allow event counting and sampling to be specified for a specific logical processor (0 or 1) within an Intel Xeon processor MP (See also: Section 8.4.5, “Identifying Logical Processors in an MP System,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). Not all performance monitoring events can be detected within an Intel Xeon processor MP on a per logical processor basis (see Section 18.6.4.4, “Performance Monitoring Events”). Some sub-events (specified by an event mask bits) are counted or sampled without regard to which logical processor is associated with the detected event. 18.6.4.2 CCCR MSRs Figure 18-50 shows the layout of a CCCR MSR in processors supporting Intel Hyper-Threading Technology. The functions of the flags and fields are as follows: • Enable flag, bit 12 — When set, enables counting; when clear, the counter is disabled. This flag is cleared on reset • ESCR select field, bits 13 through 15 — Identifies the ESCR to be used to select events to be counted with the counter associated with the CCCR. • Active thread field, bits 16 and 17 — Enables counting depending on which logical processors are active (executing a thread). This field enables filtering of events based on the state (active or inactive) of the logical processors. The encodings of this field are as follows: 00 — None. Count only when neither logical processor is active. 01 — Single. Count only when one logical processor is active (either 0 or 1). 10 — Both. Count only when both logical processors are active. 11 — Any. Count when either logical processor is active. A halted logical processor or a logical processor in the “wait for SIPI” state is considered inactive. • Compare flag, bit 18 — When set, enables filtering of the event count; when clear, disables filtering. The filtering method is selected with the threshold, complement, and edge flags. Vol. 3B 18-117 PERFORMANCE MONITORING Reserved 31 30 29 27 26 25 24 23 20 19 18 17 16 15 Threshold 13 12 11 ESCR Select 0 Reserved Reserved Enable Active Thread Compare Complement Edge FORCE_OVF OVF_PMI_T0 OVF_PMI_T1 Cascade OVF 63 32 Reserved Figure 18-50. Counter Configuration Control Register (CCCR) • Complement flag, bit 19 — Selects how the incoming event count is compared with the threshold value. When set, event counts that are less than or equal to the threshold value result in a single count being delivered to the performance counter; when clear, counts greater than the threshold value result in a count being delivered to the performance counter (see Section 18.6.3.5.2, “Filtering Events”). The compare flag is not active unless the compare flag is set. • Threshold field, bits 20 through 23 — Selects the threshold value to be used for comparisons. The processor examines this field only when the compare flag is set, and uses the complement flag setting to determine the type of threshold comparison to be made. The useful range of values that can be entered in this field depend on the type of event being counted (see Section 18.6.3.5.2, “Filtering Events”). • Edge flag, bit 24 — When set, enables rising edge (false-to-true) edge detection of the threshold comparison output for filtering event counts; when clear, rising edge detection is disabled. This flag is active only when the compare flag is set. • FORCE_OVF flag, bit 25 — When set, forces a counter overflow on every counter increment; when clear, overflow only occurs when the counter actually overflows. • OVF_PMI_T0 flag, bit 26 — When set, causes a performance monitor interrupt (PMI) to be sent to logical processor 0 when the counter overflows occurs; when clear, disables PMI generation for logical processor 0. Note that the PMI is generate on the next event count after the counter has overflowed. • OVF_PMI_T1 flag, bit 27 — When set, causes a performance monitor interrupt (PMI) to be sent to logical processor 1 when the counter overflows occurs; when clear, disables PMI generation for logical processor 1. Note that the PMI is generate on the next event count after the counter has overflowed. • Cascade flag, bit 30 — When set, enables counting on one counter of a counter pair when its alternate counter in the other the counter pair in the same counter group overflows (see Section 18.6.3.2, “Performance Counters,” for further details); when clear, disables cascading of counters. • OVF flag, bit 31 — Indicates that the counter has overflowed when set. This flag is a sticky flag that must be explicitly cleared by software. 18.6.4.3 IA32_PEBS_ENABLE MSR In a processor supporting Intel Hyper-Threading Technology and based on the Intel NetBurst microarchitecture, PEBS is enabled and qualified with two bits in the MSR_PEBS_ENABLE MSR: bit 25 (ENABLE_PEBS_MY_THR) and 26 (ENABLE_PEBS_OTH_THR) respectively. These bits do not explicitly identify a specific logical processor by logic 18-118 Vol. 3B PERFORMANCE MONITORING processor ID(T0 or T1); instead, they allow a software agent to enable PEBS for subsequent threads of execution on the same logical processor on which the agent is running (“my thread”) or for the other logical processor in the physical package on which the agent is not running (“other thread”). PEBS is supported for only a subset of the at-retirement events: Execution_event, Front_end_event, and Replay_event. Also, PEBS can be carried out only with two performance counters: MSR_IQ_CCCR4 (MSR address 370H) for logical processor 0 and MSR_IQ_CCCR5 (MSR address 371H) for logical processor 1. Performance monitoring tools should use a processor affinity mask to bind the kernel mode components that need to modify the ENABLE_PEBS_MY_THR and ENABLE_PEBS_OTH_THR bits in the MSR_PEBS_ENABLE MSR to a specific logical processor. This is to prevent these kernel mode components from migrating between different logical processors due to OS scheduling. 18.6.4.4 Performance Monitoring Events All of the events listed in Table 19-33 and 19-34 are available in an Intel Xeon processor MP. When Intel HyperThreading Technology is active, many performance monitoring events can be can be qualified by the logical processor ID, which corresponds to bit 0 of the initial APIC ID. This allows for counting an event in any or all of the logical processors. However, not all the events have this logic processor specificity, or thread specificity. Here, each event falls into one of two categories: • • Thread specific (TS) — The event can be qualified as occurring on a specific logical processor. Thread independent (TI) — The event cannot be qualified as being associated with a specific logical processor. Table 19-39 gives logical processor specific information (TS or TI) for each of the events described in Tables 19-33 and 19-34. If for example, a TS event occurred in logical processor T0, the counting of the event (as shown in Table 18-83) depends only on the setting of the T0_USR and T0_OS flags in the ESCR being used to set up the event counter. The T1_USR and T1_OS flags have no effect on the count. Vol. 3B 18-119 PERFORMANCE MONITORING Table 18-83. Effect of Logical Processor and CPL Qualification for Logical-Processor-Specific (TS) Events T1_OS/T1_USR = 00 T1_OS/T1_USR = 01 T1_OS/T1_USR = 11 T1_OS/T1_USR = 10 T0_OS/T0_USR = 00 Zero count Counts while T1 in USR Counts while T1 in OS or USR Counts while T1 in OS T0_OS/T0_USR = 01 Counts while T0 in USR Counts while T0 in USR or T1 in USR Counts while (a) T0 in USR or (b) T1 in OS or (c) T1 in USR Counts while (a) T0 in OS or (b) T1 in OS T0_OS/T0_USR = 11 Counts while T0 in OS or USR Counts while (a) T0 in OS or (b) T0 in USR or (c) T1 in USR Counts irrespective of CPL, T0, T1 Counts while (a) T0 in OS or (b) or T0 in USR or (c) T1 in OS T0_OS/T0_USR = 10 Counts T0 in OS Counts T0 in OS or T1 in USR Counts while (a)T0 in Os or (b) T1 in OS or (c) T1 in USR Counts while (a) T0 in OS or (b) T1 in OS When a bit in the event mask field is TI, the effect of specifying bit-0-3 of the associated ESCR are described in Table 15-6. For events that are marked as TI in Chapter 19, the effect of selectively specifying T0_USR, T0_OS, T1_USR, T1_OS bits is shown in Table 18-84. Table 18-84. Effect of Logical Processor and CPL Qualification for Non-logical-Processor-specific (TI) Events T1_OS/T1_USR = 00 T1_OS/T1_USR = 01 T1_OS/T1_USR = 11 T1_OS/T1_USR = 10 T0_OS/T0_USR = 00 Zero count Counts while (a) T0 in USR or (b) T1 in USR Counts irrespective of CPL, T0, T1 Counts while (a) T0 in OS or (b) T1 in OS T0_OS/T0_USR = 01 Counts while (a) T0 in USR or (b) T1 in USR Counts while (a) T0 in USR or (b) T1 in USR Counts irrespective of CPL, T0, T1 Counts irrespective of CPL, T0, T1 T0_OS/T0_USR = 11 Counts irrespective of CPL, T0, T1 Counts irrespective of CPL, T0, T1 Counts irrespective of CPL, T0, T1 Counts irrespective of CPL, T0, T1 T0_OS/T0_USR = 0 Counts while (a) T0 in OS or (b) T1 in OS Counts irrespective of CPL, T0, T1 Counts irrespective of CPL, T0, T1 Counts while (a) T0 in OS or (b) T1 in OS 18.6.4.5 Counting Clocks on systems with Intel Hyper-Threading Technology in Processors Based on Intel NetBurst® Microarchitecture 18.6.4.5.1 Non-Halted Clockticks Use the following procedure to program ESCRs and CCCRs to obtain non-halted clockticks on processors based on Intel NetBurst microarchitecture: 1. Select an ESCR for the global_power_events and specify the RUNNING sub-event mask and the desired T0_OS/T0_USR/T1_OS/T1_USR bits for the targeted processor. 2. Select an appropriate counter. 3. Enable counting in the CCCR for that counter by setting the enable bit. 18.6.4.5.2 Non-Sleep Clockticks Performance monitoring counters can be configured to count clockticks whenever the performance monitoring hardware is not powered-down. To count Non-sleep Clockticks with a performance-monitoring counter, do the following: 1. Select one of the 18 counters. 2. Select any of the ESCRs whose events the selected counter can count. Set its event select to anything other than “no_event”; the counter may be disabled if this is not done. 18-120 Vol. 3B PERFORMANCE MONITORING 3. Turn threshold comparison on in the CCCR by setting the compare bit to “1”. 4. Set the threshold to “15” and the complement to “1” in the CCCR. Since no event can exceed this threshold, the threshold condition is met every cycle and the counter counts every cycle. Note that this overrides any qualification (e.g. by CPL) specified in the ESCR. 5. Enable counting in the CCCR for the counter by setting the enable bit. In most cases, the counts produced by the non-halted and non-sleep metrics are equivalent if the physical package supports one logical processor and is not placed in a power-saving state. Operating systems may execute an HLT instruction and place a physical processor in a power-saving state. On processors that support Intel Hyper-Threading Technology (Intel HT Technology), each physical package can support two or more logical processors. Current implementation of Intel HT Technology provides two logical processors for each physical processor. While both logical processors can execute two threads simultaneously, one logical processor may halt to allow the other logical processor to execute without sharing execution resources between two logical processors. Non-halted Clockticks can be set up to count the number of processor clock cycles for each logical processor whenever the logical processor is not halted (the count may include some portion of the clock cycles for that logical processor to complete a transition to a halted state). Physical processors that support Intel HT Technology enter into a power-saving state if all logical processors halt. The Non-sleep Clockticks mechanism uses a filtering mechanism in CCCRs. The mechanism will continue to increment as long as one logical processor is not halted or in a power-saving state. Applications may cause a processor to enter into a power-saving state by using an OS service that transfers control to an OS's idle loop. The idle loop then may place the processor into a power-saving state after an implementation-dependent period if there is no work for the processor. 18.6.5 Performance Monitoring and Dual-Core Technology The performance monitoring capability of dual-core processors duplicates the microarchitectural resources of a single-core processor implementation. Each processor core has dedicated performance monitoring resources. In the case of Pentium D processor, each logical processor is associated with dedicated resources for performance monitoring. In the case of Pentium processor Extreme edition, each processor core has dedicated resources, but two logical processors in the same core share performance monitoring resources (see Section 18.6.4, “Performance Monitoring and Intel Hyper-Threading Technology in Processors Based on Intel NetBurst® Microarchitecture”). 18.6.6 Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache The 64-bit Intel Xeon processor MP with up to 8-MByte L3 cache has a CPUID signature of family [0FH], model [03H or 04H]. Performance monitoring capabilities available to Pentium 4 and Intel Xeon processors with the same values (see Section 18.1 and Section 18.6.4) apply to the 64-bit Intel Xeon processor MP with an L3 cache. The level 3 cache is connected between the system bus and IOQ through additional control logic. See Figure 18-51. Vol. 3B 18-121 PERFORMANCE MONITORING System Bus iBUSQ and iSNPQ 3rd Level Cache 8 or 4 -way iFSB Processor Core (Front end, Execution, Retirement, L1, L2 IOQ Figure 18-51. Block Diagram of 64-bit Intel Xeon Processor MP with 8-MByte L3 Additional performance monitoring capabilities and facilities unique to 64-bit Intel Xeon processor MP with an L3 cache are described in this section. The facility for monitoring events consists of a set of dedicated model-specific registers (MSRs), each dedicated to a specific event. Programming of these MSRs requires using RDMSR/WRMSR instructions with 64-bit values. The lower 32-bits of the MSRs at addresses 107CC through 107D3 are treated as 32 bit performance counter registers. These performance counters can be accessed using RDPMC instruction with the index starting from 18 through 25. The EDX register returns zero when reading these 8 PMCs. The performance monitoring capabilities consist of four events. These are: • IBUSQ event — This event detects the occurrence of micro-architectural conditions related to the iBUSQ unit. It provides two MSRs: MSR_IFSB_IBUSQ0 and MSR_IFSB_IBUSQ1. Configure sub-event qualification and enable/disable functions using the high 32 bits of these MSRs. The low 32 bits act as a 32-bit event counter. Counting starts after software writes a non-zero value to one or more of the upper 32 bits. See Figure 18-52. MSR_IFSB_IBUSQx, Addresses: 107CCH and 107CDH 63 60 59 58 57 56 55 49 48 46 45 1 1 Saturate Fill_match Eviction_match L3_state_match Snoop_match Type_match T1_match T0_match 0 38 37 36 35 34 33 32 31 32 bit event count Reserved Figure 18-52. MSR_IFSB_IBUSQx, Addresses: 107CCH and 107CDH • ISNPQ event — This event detects the occurrence of microarchitectural conditions related to the iSNPQ unit. It provides two MSRs: MSR_IFSB_ISNPQ0 and MSR_IFSB_ISNPQ1. Configure sub-event qualifications and enable/disable functions using the high 32 bits of the MSRs. The low 32-bits act as a 32-bit event counter. Counting starts after software writes a non-zero value to one or more of the upper 32-bits. See Figure 18-53. 18-122 Vol. 3B PERFORMANCE MONITORING MSR_IFSB_ISNPQx, Addresses: 107CEH and 107CFH 63 60 59 58 57 56 55 48 46 45 Reserved 39 38 37 36 35 34 33 32 Saturate L3_state_match Snoop_match Type_match Agent_match T1_match T0_match 31 0 32 bit event count Figure 18-53. MSR_IFSB_ISNPQx, Addresses: 107CEH and 107CFH • EFSB event — This event can detect the occurrence of micro-architectural conditions related to the iFSB unit or system bus. It provides two MSRs: MSR_EFSB_DRDY0 and MSR_EFSB_DRDY1. Configure sub-event qualifications and enable/disable functions using the high 32 bits of the 64-bit MSR. The low 32-bit act as a 32-bit event counter. Counting starts after software writes a non-zero value to one or more of the qualification bits in the upper 32-bits of the MSR. See Figure 18-54. MSR_EFSB_DRDYx, Addresses: 107D0H and 107D1H 63 60 59 58 57 56 55 50 49 48 Saturate Other Own 39 38 37 36 35 34 33 32 Reserved 31 0 32 bit event count Figure 18-54. MSR_EFSB_DRDYx, Addresses: 107D0H and 107D1H • IBUSQ Latency event — This event accumulates weighted cycle counts for latency measurement of transactions in the iBUSQ unit. The count is enabled by setting MSR_IFSB_CTRL6[bit 26] to 1; the count freezes after software sets MSR_IFSB_CTRL6[bit 26] to 0. MSR_IFSB_CNTR7 acts as a 64-bit event counter for this event. See Figure 18-55. Vol. 3B 18-123 PERFORMANCE MONITORING MSR_IFSB_CTL6 Address: 107D2H 63 59 0 57 Enable MSR_IFSB_CNTR7 Address: 107D3H Reserved 0 63 64 bit event count Figure 18-55. MSR_IFSB_CTL6, Address: 107D2H; MSR_IFSB_CNTR7, Address: 107D3H 18.6.7 Performance Monitoring on L3 and Caching Bus Controller Sub-Systems The Intel Xeon processor 7400 series and Dual-Core Intel Xeon processor 7100 series employ a distinct L3/caching bus controller sub-system. These sub-system have a unique set of performance monitoring capability and programming interfaces that are largely common between these two processor families. Intel Xeon processor 7400 series are based on 45 nm enhanced Intel Core microarchitecture. The CPUID signature is indicated by DisplayFamily_DisplayModel value of 06_1DH (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). Intel Xeon processor 7400 series have six processor cores that share an L3 cache. Dual-Core Intel Xeon processor 7100 series are based on Intel NetBurst microarchitecture, have a CPUID signature of family [0FH], model [06H] and a unified L3 cache shared between two cores. Each core in an Intel Xeon processor 7100 series supports Intel Hyper-Threading Technology, providing two logical processors per core. Both Intel Xeon processor 7400 series and Intel Xeon processor 7100 series support multi-processor configurations using system bus interfaces. In Intel Xeon processor 7400 series, the L3/caching bus controller sub-system provides three Simple Direct Interface (SDI) to service transactions originated the XQ-replacement SDI logic in each dual-core modules. In Intel Xeon processor 7100 series, the IOQ logic in each processor core is replaced with a Simple Direct Interface (SDI) logic. The L3 cache is connected between the system bus and the SDI through additional control logic. See Figure 18-56 for the block configuration of six processor cores and the L3/Caching bus controller sub-system in Intel Xeon processor 7400 series. Figure 18-56 shows the block configuration of two processor cores (four logical processors) and the L3/Caching bus controller sub-system in Intel Xeon processor 7100 series. 18-124 Vol. 3B PERFORMANCE MONITORING FSB GBSQ, GSNPQ, GINTQ, ... L3 SDI SDI interface SDI interface L2 Core SDI interface L2 Core Core L2 Core Core Core Figure 18-56. Block Diagram of Intel Xeon Processor 7400 Series Almost all of the performance monitoring capabilities available to processor cores with the same CPUID signatures (see Section 18.1 and Section 18.6.4) apply to Intel Xeon processor 7100 series. The MSRs used by performance monitoring interface are shared between two logical processors in the same processor core. The performance monitoring capabilities available to processor with DisplayFamily_DisplayModel signature 06_17H also apply to Intel Xeon processor 7400 series. Each processor core provides its own set of MSRs for performance monitoring interface. The IOQ_allocation and IOQ_active_entries events are not supported in Intel Xeon processor 7100 series and 7400 series. Additional performance monitoring capabilities applicable to the L3/caching bus controller sub-system are described in this section. FSB GBSQ, GSNPQ, GINTQ, ... L3 SDI SDI interface SDI interface Processor core Processor core Logical processor Logical processor Logical processor Logical processor Figure 18-57. Block Diagram of Intel Xeon Processor 7100 Series Vol. 3B 18-125 PERFORMANCE MONITORING 18.6.7.1 Overview of Performance Monitoring with L3/Caching Bus Controller The facility for monitoring events consists of a set of dedicated model-specific registers (MSRs). There are eight event select/counting MSRs that are dedicated to counting events associated with specified microarchitectural conditions. Programming of these MSRs requires using RDMSR/WRMSR instructions with 64-bit values. In addition, an MSR MSR_EMON_L3_GL_CTL provides simplified interface to control freezing, resetting, re-enabling operation of any combination of these event select/counting MSRs. The eight MSRs dedicated to count occurrences of specific conditions are further divided to count three sub-classes of microarchitectural conditions: • Two MSRs (MSR_EMON_L3_CTR_CTL0 and MSR_EMON_L3_CTR_CTL1) are dedicated to counting GBSQ events. Up to two GBSQ events can be programmed and counted simultaneously. • Two MSRs (MSR_EMON_L3_CTR_CTL2 and MSR_EMON_L3_CTR_CTL3) are dedicated to counting GSNPQ events. Up to two GBSQ events can be programmed and counted simultaneously. • Four MSRs (MSR_EMON_L3_CTR_CTL4, MSR_EMON_L3_CTR_CTL5, MSR_EMON_L3_CTR_CTL6, and MSR_EMON_L3_CTR_CTL7) are dedicated to counting external bus operations. The bit fields in each of eight MSRs share the following common characteristics: • Bits 63:32 is the event control field that includes an event mask and other bit fields that control counter operation. The event mask field specifies details of the microarchitectural condition, and its definition differs across GBSQ, GSNPQ, FSB. • Bits 31:0 is the event count field. If the specified condition is met during each relevant clock domain of the event logic, the matched condition signals the counter logic to increment the associated event count field. The lower 32-bits of these 8 MSRs at addresses 107CC through 107D3 are treated as 32 bit performance counter registers. In Dual-Core Intel Xeon processor 7100 series, the uncore performance counters can be accessed using RDPMC instruction with the index starting from 18 through 25. The EDX register returns zero when reading these 8 PMCs. In Intel Xeon processor 7400 series, RDPMC with ECX between 2 and 9 can be used to access the eight uncore performance counter/control registers. 18.6.7.2 GBSQ Event Interface The layout of MSR_EMON_L3_CTR_CTL0 and MSR_EMON_L3_CTR_CTL1 is given in Figure 18-58. Counting starts after software writes a non-zero value to one or more of the upper 32 bits. The event mask field (bits 58:32) consists of the following eight attributes: • Agent_Select (bits 35:32): The definition of this field differs slightly between Intel Xeon processor 7100 and 7400. For Intel Xeon processor 7100 series, each bit specifies a logical processor in the physical package. The lower two bits corresponds to two logical processors in the first processor core, the upper two bits corresponds to two logical processors in the second processor core. 0FH encoding matches transactions from any logical processor. For Intel Xeon processor 7400 series, each bit of [34:32] specifies the SDI logic of a dual-core module as the originator of the transaction. A value of 0111B in bits [35:32] specifies transaction from any processor core. 18-126 Vol. 3B PERFORMANCE MONITORING MSR_EMON_L3_CTR_CTL0/1, Addresses: 107CCH/107CDH 63 60 59 58 57 56 55 54 53 47 46 44 43 Reserved 38 37 36 35 32 Saturate Cross_snoop Fill_eviction Core_module_select L3_state Snoop_match Type_match Data_flow Agent_select 31 0 32 bit event count Figure 18-58. MSR_EMON_L3_CTR_CTL0/1, Addresses: 107CCH/107CDH • • Data_Flow (bits 37:36): Bit 36 specifies demand transactions, bit 37 specifies prefetch transactions. • Snoop_Match: (bits 46:44): The three bits specify (in ascending bit position) clean snoop result, HIT snoop result, and HITM snoop results respectively. • • L3_State (bits 53:47): Each bit specifies an L2 coherency state. Type_Match (bits 43:38): Specifies transaction types. If all six bits are set, event count will include all transaction types. Core_Module_Select (bits 55:54): The valid encodings for L3 lookup differ slightly between Intel Xeon processor 7100 and 7400. For Intel Xeon processor 7100 series, — 00B: Match transactions from any core in the physical package — 01B: Match transactions from this core only — 10B: Match transactions from the other core in the physical package — 11B: Match transaction from both cores in the physical package For Intel Xeon processor 7400 series, — 00B: Match transactions from any dual-core module in the physical package — 01B: Match transactions from this dual-core module only — 10B: Match transactions from either one of the other two dual-core modules in the physical package — 11B: Match transaction from more than one dual-core modules in the physical package • Fill_Eviction (bits 57:56): The valid encodings are — 00B: Match any transactions — 01B: Match transactions that fill L3 — 10B: Match transactions that fill L3 without an eviction — 11B: Match transaction fill L3 with an eviction • Cross_Snoop (bit 58): The encodings are — 0B: Match any transactions — 1B: Match cross snoop transactions Vol. 3B 18-127 PERFORMANCE MONITORING For each counting clock domain, if all eight attributes match, event logic signals to increment the event count field. 18.6.7.3 GSNPQ Event Interface The layout of MSR_EMON_L3_CTR_CTL2 and MSR_EMON_L3_CTR_CTL3 is given in Figure 18-59. Counting starts after software writes a non-zero value to one or more of the upper 32 bits. The event mask field (bits 58:32) consists of the following six attributes: • Agent_Select (bits 37:32): The definition of this field differs slightly between Intel Xeon processor 7100 and 7400. • For Intel Xeon processor 7100 series, each of the lowest 4 bits specifies a logical processor in the physical package. The lowest two bits corresponds to two logical processors in the first processor core, the next two bits corresponds to two logical processors in the second processor core. Bit 36 specifies other symmetric agent transactions. Bit 37 specifies central agent transactions. 3FH encoding matches transactions from any logical processor. For Intel Xeon processor 7400 series, each of the lowest 3 bits specifies a dual-core module in the physical package. Bit 37 specifies central agent transactions. • Type_Match (bits 43:38): Specifies transaction types. If all six bits are set, event count will include any transaction types. • Snoop_Match: (bits 46:44): The three bits specify (in ascending bit position) clean snoop result, HIT snoop result, and HITM snoop results respectively. • • L2_State (bits 53:47): Each bit specifies an L3 coherency state. Core_Module_Select (bits 56:54): Bit 56 enables Core_Module_Select matching. If bit 56 is clear, Core_Module_Select encoding is ignored. The valid encodings for the lower two bits (bit 55, 54) differ slightly between Intel Xeon processor 7100 and 7400. For Intel Xeon processor 7100 series, if bit 56 is set, the valid encodings for the lower two bits (bit 55, 54) are — 00B: Match transactions from only one core (irrespective which core) in the physical package — 01B: Match transactions from this core and not the other core — 10B: Match transactions from the other core in the physical package, but not this core — 11B: Match transaction from both cores in the physical package For Intel Xeon processor 7400 series, if bit 56 is set, the valid encodings for the lower two bits (bit 55, 54) are — 00B: Match transactions from only one dual-core module (irrespective which module) in the physical package. — 01B: Match transactions from one or more dual-core modules. — 10B: Match transactions from two or more dual-core modules. — 11B: Match transaction from all three dual-core modules in the physical package. • Block_Snoop (bit 57): specifies blocked snoop. For each counting clock domain, if all six attributes match, event logic signals to increment the event count field. 18-128 Vol. 3B PERFORMANCE MONITORING MSR_EMON_L3_CTR_CTL2/3, Addresses: 107CEH/107CFH 63 60 59 58 57 56 55 54 53 47 46 44 43 Reserved 39 38 37 36 32 Saturate Block_snoop Core_select L2_state Snoop_match Type_match Agent_match 31 0 32 bit event count Figure 18-59. MSR_EMON_L3_CTR_CTL2/3, Addresses: 107CEH/107CFH 18.6.7.4 FSB Event Interface The layout of MSR_EMON_L3_CTR_CTL4 through MSR_EMON_L3_CTR_CTL7 is given in Figure 18-60. Counting starts after software writes a non-zero value to one or more of the upper 32 bits. The event mask field (bits 58:32) is organized as follows: • • Bit 58: must set to 1. FSB_Submask (bits 57:32): Specifies FSB-specific sub-event mask. The FSB sub-event mask defines a set of independent attributes. The event logic signals to increment the associated event count field if one of the attribute matches. Some of the sub-event mask bit counts durations. A duration event increments at most once per cycle. MSR_EMON_L3_CTR_CTL4/5/6/7, Addresses: 107D0H-107D3H 63 60 59 58 57 56 55 50 49 48 39 38 37 36 35 34 33 32 1 Reserved Saturate FSB submask 31 0 32 bit event count Figure 18-60. MSR_EMON_L3_CTR_CTL4/5/6/7, Addresses: 107D0H-107D3H 18.6.7.4.1 FSB Sub-Event Mask Interface • • FSB_type (bit 37:32): Specifies different FSB transaction types originated from this physical package. • FSB_L_hit (bit 39): Count HIT snoop results from any source for transaction originated from this physical package. FSB_L_clear (bit 38): Count clean snoop results from any source for transaction originated from this physical package. Vol. 3B 18-129 PERFORMANCE MONITORING • FSB_L_hitm (bit 40): Count HITM snoop results from any source for transaction originated from this physical package. • • • • • • • • • • • • • • • • • FSB_L_defer (bit 41): Count DEFER responses to this processor’s transactions. FSB_L_retry (bit 42): Count RETRY responses to this processor’s transactions. FSB_L_snoop_stall (bit 43): Count snoop stalls to this processor’s transactions. FSB_DBSY (bit 44): Count DBSY assertions by this processor (without a concurrent DRDY). FSB_DRDY (bit 45): Count DRDY assertions by this processor. FSB_BNR (bit 46): Count BNR assertions by this processor. FSB_IOQ_empty (bit 47): Counts each bus clocks when the IOQ is empty. FSB_IOQ_full (bit 48): Counts each bus clocks when the IOQ is full. FSB_IOQ_active (bit 49): Counts each bus clocks when there is at least one entry in the IOQ. FSB_WW_data (bit 50): Counts back-to-back write transaction’s data phase. FSB_WW_issue (bit 51): Counts back-to-back write transaction request pairs issued by this processor. FSB_WR_issue (bit 52): Counts back-to-back write-read transaction request pairs issued by this processor. FSB_RW_issue (bit 53): Counts back-to-back read-write transaction request pairs issued by this processor. FSB_other_DBSY (bit 54): Count DBSY assertions by another agent (without a concurrent DRDY). FSB_other_DRDY (bit 55): Count DRDY assertions by another agent. FSB_other_snoop_stall (bit 56): Count snoop stalls on the FSB due to another agent. FSB_other_BNR (bit 57): Count BNR assertions from another agent. 18.6.7.5 Common Event Control Interface The MSR_EMON_L3_GL_CTL MSR provides simplified access to query overflow status of the GBSQ, GSNPQ, FSB event counters. It also provides control bit fields to freeze, unfreeze, or reset those counters. The following bit fields are supported: • • • GL_freeze_cmd (bit 0): Freeze the event counters specified by the GL_event_select field. • GL_event_select (bit 23:16): Selects one or more event counters to subject to specified command operations indicated by bits 2:0. Bit 16 corresponds to MSR_EMON_L3_CTR_CTL0, bit 23 corresponds to MSR_EMON_L3_CTR_CTL7. • GL_event_status (bit 55:48): Indicates the overflow status of each event counters. Bit 48 corresponds to MSR_EMON_L3_CTR_CTL0, bit 55 corresponds to MSR_EMON_L3_CTR_CTL7. GL_unfreeze_cmd (bit 1): Unfreeze the event counters specified by the GL_event_select field. GL_reset_cmd (bit 2): Clear the event count field of the event counters specified by the GL_event_select field. The event select field is not affected. In the event control field (bits 63:32) of each MSR, if the saturate control (bit 59, see Figure 18-58 for example) is set, the event logic forces the value FFFF_FFFFH into the event count field instead of incrementing it. 18.6.8 Performance Monitoring (P6 Family Processor) The P6 family processors provide two 40-bit performance counters, allowing two types of events to be monitored simultaneously. These can either count events or measure duration. When counting events, a counter increments each time a specified event takes place or a specified number of events takes place. When measuring duration, it counts the number of processor clocks that occur while a specified condition is true. The counters can count events or measure durations that occur at any privilege level. Table 19-42, Chapter 19, lists the events that can be counted with the P6 family performance monitoring counters. 18-130 Vol. 3B PERFORMANCE MONITORING NOTE The performance-monitoring events listed in Chapter 19 are intended to be used as guides for performance tuning. Counter values reported are not guaranteed to be accurate and should be used as a relative guide for tuning. Known discrepancies are documented where applicable. The performance-monitoring counters are supported by four MSRs: the performance event select MSRs (PerfEvtSel0 and PerfEvtSel1) and the performance counter MSRs (PerfCtr0 and PerfCtr1). These registers can be read from and written to using the RDMSR and WRMSR instructions, respectively. They can be accessed using these instructions only when operating at privilege level 0. The PerfCtr0 and PerfCtr1 MSRs can be read from any privilege level using the RDPMC (read performance-monitoring counters) instruction. NOTE The PerfEvtSel0, PerfEvtSel1, PerfCtr0, and PerfCtr1 MSRs and the events listed in Table 19-42 are model-specific for P6 family processors. They are not guaranteed to be available in other IA-32 processors. 18.6.8.1 PerfEvtSel0 and PerfEvtSel1 MSRs The PerfEvtSel0 and PerfEvtSel1 MSRs control the operation of the performance-monitoring counters, with one register used to set up each counter. They specify the events to be counted, how they should be counted, and the privilege levels at which counting should take place. Figure 18-61 shows the flags and fields in these MSRs. The functions of the flags and fields in the PerfEvtSel0 and PerfEvtSel1 MSRs are as follows: • Event select field (bits 0 through 7) — Selects the event logic unit to detect certain microarchitectural conditions (see Table 19-42, for a list of events and their 8-bit codes). • Unit mask (UMASK) field (bits 8 through 15) — Further qualifies the event logic unit selected in the event select field to detect a specific microarchitectural condition. For example, for some cache events, the mask is used as a MESI-protocol qualifier of cache states (see Table 19-42). 31 24 23 22 21 20 19 18 17 16 15 Counter Mask (CMASK) INV—Invert counter mask EN—Enable counters* INT—APIC interrupt enable PC—Pin control E—Edge detect OS—Operating system mode USR—User Mode I N E V N 0 8 7 I U N P E O S Unit Mask (UMASK) S R T C Event Select Reserved * Only available in PerfEvtSel0. Figure 18-61. PerfEvtSel0 and PerfEvtSel1 MSRs • USR (user mode) flag (bit 16) — Specifies that events are counted only when the processor is operating at privilege levels 1, 2 or 3. This flag can be used in conjunction with the OS flag. • OS (operating system mode) flag (bit 17) — Specifies that events are counted only when the processor is operating at privilege level 0. This flag can be used in conjunction with the USR flag. • E (edge detect) flag (bit 18) — Enables (when set) edge detection of events. The processor counts the number of deasserted to asserted transitions of any condition that can be expressed by the other fields. The mechanism is limited in that it does not permit back-to-back assertions to be distinguished. This mechanism allows software to measure not only the fraction of time spent in a particular state, but also the average length of time spent in such a state (for example, the time spent waiting for an interrupt to be serviced). Vol. 3B 18-131 PERFORMANCE MONITORING • PC (pin control) flag (bit 19) — When set, the processor toggles the PMi pins and increments the counter when performance-monitoring events occur; when clear, the processor toggles the PMi pins when the counter overflows. The toggling of a pin is defined as assertion of the pin for a single bus clock followed by deassertion. • INT (APIC interrupt enable) flag (bit 20) — When set, the processor generates an exception through its local APIC on counter overflow. • EN (Enable Counters) Flag (bit 22) — This flag is only present in the PerfEvtSel0 MSR. When set, performance counting is enabled in both performance-monitoring counters; when clear, both counters are disabled. • INV (invert) flag (bit 23) — When set, inverts the counter-mask (CMASK) comparison, so that both greater than or equal to and less than comparisons can be made (0: greater than or equal; 1: less than). Note if counter-mask is programmed to zero, INV flag is ignored. • Counter mask (CMASK) field (bits 24 through 31) — When nonzero, the processor compares this mask to the number of events counted during a single cycle. If the event count is greater than or equal to this mask, the counter is incremented by one. Otherwise the counter is not incremented. This mask can be used to count events only if multiple occurrences happen per clock (for example, two or more instructions retired per clock). If the counter-mask field is 0, then the counter is incremented each cycle by the number of events that occurred that cycle. 18.6.8.2 PerfCtr0 and PerfCtr1 MSRs The performance-counter MSRs (PerfCtr0 and PerfCtr1) contain the event or duration counts for the selected events being counted. The RDPMC instruction can be used by programs or procedures running at any privilege level and in virtual-8086 mode to read these counters. The PCE flag in control register CR4 (bit 8) allows the use of this instruction to be restricted to only programs and procedures running at privilege level 0. The RDPMC instruction is not serializing or ordered with other instructions. Thus, it does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the RDPMC instruction operation is performed. Only the operating system, executing at privilege level 0, can directly manipulate the performance counters, using the RDMSR and WRMSR instructions. A secure operating system would clear the PCE flag during system initialization to disable direct user access to the performance-monitoring counters, but provide a user-accessible programming interface that emulates the RDPMC instruction. The WRMSR instruction cannot arbitrarily write to the performance-monitoring counter MSRs (PerfCtr0 and PerfCtr1). Instead, the lower-order 32 bits of each MSR may be written with any value, and the high-order 8 bits are sign-extended according to the value of bit 31. This operation allows writing both positive and negative values to the performance counters. 18.6.8.3 Starting and Stopping the Performance-Monitoring Counters The performance-monitoring counters are started by writing valid setup information in the PerfEvtSel0 and/or PerfEvtSel1 MSRs and setting the enable counters flag in the PerfEvtSel0 MSR. If the setup is valid, the counters begin counting following the execution of a WRMSR instruction that sets the enable counter flag. The counters can be stopped by clearing the enable counters flag or by clearing all the bits in the PerfEvtSel0 and PerfEvtSel1 MSRs. Counter 1 alone can be stopped by clearing the PerfEvtSel1 MSR. 18.6.8.4 Event and Time-Stamp Monitoring Software To use the performance-monitoring counters and time-stamp counter, the operating system needs to provide an event-monitoring device driver. This driver should include procedures for handling the following operations: • • • • • Feature checking. Initialize and start counters. Stop counters. Read the event counters. Read the time-stamp counter. 18-132 Vol. 3B PERFORMANCE MONITORING The event monitor feature determination procedure must check whether the current processor supports the performance-monitoring counters and time-stamp counter. This procedure compares the family and model of the processor returned by the CPUID instruction with those of processors known to support performance monitoring. (The Pentium and P6 family processors support performance counters.) The procedure also checks the MSR and TSC flags returned to register EDX by the CPUID instruction to determine if the MSRs and the RDTSC instruction are supported. The initialize and start counters procedure sets the PerfEvtSel0 and/or PerfEvtSel1 MSRs for the events to be counted and the method used to count them and initializes the counter MSRs (PerfCtr0 and PerfCtr1) to starting counts. The stop counters procedure stops the performance counters (see Section 18.6.8.3, “Starting and Stopping the Performance-Monitoring Counters”). The read counters procedure reads the values in the PerfCtr0 and PerfCtr1 MSRs, and a read time-stamp counter procedure reads the time-stamp counter. These procedures would be provided in lieu of enabling the RDTSC and RDPMC instructions that allow application code to read the counters. 18.6.8.5 Monitoring Counter Overflow The P6 family processors provide the option of generating a local APIC interrupt when a performance-monitoring counter overflows. This mechanism is enabled by setting the interrupt enable flag in either the PerfEvtSel0 or the PerfEvtSel1 MSR. The primary use of this option is for statistical performance sampling. To use this option, the operating system should do the following things on the processor for which performance events are required to be monitored: • • • Provide an interrupt vector for handling the counter-overflow interrupt. • Provide an event monitor driver that provides the actual interrupt handler and modifies the reserved IDT entry to point to its interrupt routine. Initialize the APIC PERF local vector entry to enable handling of performance-monitor counter overflow events. Provide an entry in the IDT that points to a stub exception handler that returns without executing any instructions. When interrupted by a counter overflow, the interrupt handler needs to perform the following actions: • Save the instruction pointer (EIP register), code-segment selector, TSS segment selector, counter values and other relevant information at the time of the interrupt. • Reset the counter to its initial setting and return from the interrupt. An event monitor application utility or another application program can read the information collected for analysis of the performance of the profiled application. 18.6.9 Performance Monitoring (Pentium Processors) The Pentium processor provides two 40-bit performance counters, which can be used to count events or measure duration. The counters are supported by three MSRs: the control and event select MSR (CESR) and the performance counter MSRs (CTR0 and CTR1). These can be read from and written to using the RDMSR and WRMSR instructions, respectively. They can be accessed using these instructions only when operating at privilege level 0. Each counter has an associated external pin (PM0/BP0 and PM1/BP1), which can be used to indicate the state of the counter to external hardware. NOTES The CESR, CTR0, and CTR1 MSRs and the events listed in Table 19-43 are model-specific for the Pentium processor. The performance-monitoring events listed in Chapter 19 are intended to be used as guides for performance tuning. Counter values reported are not guaranteed to be accurate and should be used as a relative guide for tuning. Known discrepancies are documented where applicable. Vol. 3B 18-133 PERFORMANCE MONITORING 18.6.9.1 Control and Event Select Register (CESR) The 32-bit control and event select MSR (CESR) controls the operation of performance-monitoring counters CTR0 and CTR1 and the associated pins (see Figure 18-62). To control each counter, the CESR register contains a 6-bit event select field (ES0 and ES1), a pin control flag (PC0 and PC1), and a 3-bit counter control field (CC0 and CC1). The functions of these fields are as follows: • ES0 and ES1 (event select) fields (bits 0-5, bits 16-21) — Selects (by entering an event code in the field) up to two events to be monitored. See Table 19-43 for a list of available event codes. 31 26 25 24 P C 1 22 21 CC1 16 15 ES1 10 9 8 P C 0 6 5 CC0 0 ESO PC1—Pin control 1 CC1—Counter control 1 ES1—Event select 1 PC0—Pin control 0 CC0—Counter control 0 ES0—Event select 0 Reserved Figure 18-62. CESR MSR (Pentium Processor Only) • CC0 and CC1 (counter control) fields (bits 6-8, bits 22-24) — Controls the operation of the counter. Control codes are as follows: 000 — Count nothing (counter disabled). 001 — Count the selected event while CPL is 0, 1, or 2. 010 — Count the selected event while CPL is 3. 011 — Count the selected event regardless of CPL. 100 — Count nothing (counter disabled). 101 — Count clocks (duration) while CPL is 0, 1, or 2. 110 — Count clocks (duration) while CPL is 3. 111 — Count clocks (duration) regardless of CPL. The highest order bit selects between counting events and counting clocks (duration); the middle bit enables counting when the CPL is 3; and the low-order bit enables counting when the CPL is 0, 1, or 2. • PC0 and PC1 (pin control) flags (bits 9, 25) — Selects the function of the external performance-monitoring counter pin (PM0/BP0 and PM1/BP1). Setting one of these flags to 1 causes the processor to assert its associated pin when the counter has overflowed; setting the flag to 0 causes the pin to be asserted when the counter has been incremented. These flags permit the pins to be individually programmed to indicate the overflow or incremented condition. The external signaling of the event on the pins will lag the internal event by a few clocks as the signals are latched and buffered. While a counter need not be stopped to sample its contents, it must be stopped and cleared or preset before switching to a new event. It is not possible to set one counter separately. If only one event needs to be changed, the CESR register must be read, the appropriate bits modified, and all bits must then be written back to CESR. At reset, all bits in the CESR register are cleared. 18.6.9.2 Use of the Performance-Monitoring Pins When performance-monitor pins PM0/BP0 and/or PM1/BP1 are configured to indicate when the performancemonitor counter has incremented and an “occurrence event” is being counted, the associated pin is asserted (high) each time the event occurs. When a “duration event” is being counted, the associated PM pin is asserted for the 18-134 Vol. 3B PERFORMANCE MONITORING entire duration of the event. When the performance-monitor pins are configured to indicate when the counter has overflowed, the associated PM pin is asserted when the counter has overflowed. When the PM0/BP0 and/or PM1/BP1 pins are configured to signal that a counter has incremented, it should be noted that although the counters may increment by 1 or 2 in a single clock, the pins can only indicate that the event occurred. Moreover, since the internal clock frequency may be higher than the external clock frequency, a single external clock may correspond to multiple internal clocks. A “count up to” function may be provided when the event pin is programmed to signal an overflow of the counter. Because the counters are 40 bits, a carry out of bit 39 indicates an overflow. A counter may be preset to a specific value less then 240 − 1. After the counter has been enabled and the prescribed number of events has transpired, the counter will overflow. Approximately 5 clocks later, the overflow is indicated externally and appropriate action, such as signaling an interrupt, may then be taken. The PM0/BP0 and PM1/BP1 pins also serve to indicate breakpoint matches during in-circuit emulation, during which time the counter increment or overflow function of these pins is not available. After RESET, the PM0/BP0 and PM1/BP1 pins are configured for performance monitoring, however a hardware debugger may reconfigure these pins to indicate breakpoint matches. 18.6.9.3 Events Counted Events that performance-monitoring counters can be set to count and record (using CTR0 and CTR1) are divided in two categories: occurrence and duration: • Occurrence events — Counts are incremented each time an event takes place. If PM0/BP0 or PM1/BP1 pins are used to indicate when a counter increments, the pins are asserted each clock counters increment. But if an event happens twice in one clock, the counter increments by 2 (the pins are asserted only once). • Duration events — Counters increment the total number of clocks that the condition is true. When used to indicate when counters increment, PM0/BP0 and/or PM1/BP1 pins are asserted for the duration. 18.7 COUNTING CLOCKS The count of cycles, also known as clockticks, forms the basis for measuring how long a program takes to execute. Clockticks are also used as part of efficiency ratios like cycles per instruction (CPI). Processor clocks may stop ticking under circumstances like the following: • The processor is halted when there is nothing for the CPU to do. For example, the processor may halt to save power while the computer is servicing an I/O request. When Intel Hyper-Threading Technology is enabled, both logical processors must be halted for performance-monitoring counters to be powered down. • The processor is asleep as a result of being halted or because of a power-management scheme. There are different levels of sleep. In the some deep sleep levels, the time-stamp counter stops counting. In addition, processor core clocks may undergo transitions at different ratios relative to the processor’s bus clock frequency. Some of the situations that can cause processor core clock to undergo frequency transitions include: • • TM2 transitions. Enhanced Intel SpeedStep Technology transitions (P-state transitions). For Intel processors that support TM2, the processor core clocks may operate at a frequency that differs from the Processor Base frequency (as indicated by processor frequency information reported by CPUID instruction). See Section 18.7.2 for more detail. Due to the above considerations there are several important clocks referenced in this manual: • Base Clock — The frequency of this clock is the frequency of the processor when the processor is not in turbo mode, and not being throttled via Intel SpeedStep. • • Maximum Clock — This is the maximum frequency of the processor when turbo mode is at the highest point. Bus Clock — These clockticks increment at a fixed frequency and help coordinate the bus on some systems. Vol. 3B 18-135 PERFORMANCE MONITORING • Core Crystal Clock — This is a clock that runs at fixed frequency; it coordinates the clocks on all packages across the system. • Non-halted Clockticks — Measures clock cycles in which the specified logical processor is not halted and is not in any power-saving state. When Intel Hyper-Threading Technology is enabled, ticks can be measured on a per-logical-processor basis. There are also performance events on dual-core processors that measure clockticks per logical processor when the processor is not halted. • Non-sleep Clockticks — Measures clock cycles in which the specified physical processor is not in a sleep mode or in a power-saving state. These ticks cannot be measured on a logical-processor basis. • • Time-stamp Counter — See Section 17.17, “Time-Stamp Counter”. Reference Clockticks — TM2 or Enhanced Intel SpeedStep technology are two examples of processor features that can cause processor core clockticks to represent non-uniform tick intervals due to change of bus ratios. Performance events that counts clockticks of a constant reference frequency was introduced Intel Core Duo and Intel Core Solo processors. The mechanism is further enhanced on processors based on Intel Core microarchitecture. Some processor models permit clock cycles to be measured when the physical processor is not in deep sleep (by using the time-stamp counter and the RDTSC instruction). Note that such ticks cannot be measured on a perlogical-processor basis. See Section 17.17, “Time-Stamp Counter,” for detail on processor capabilities. The first two methods use performance counters and can be set up to cause an interrupt upon overflow (for sampling). They may also be useful where it is easier for a tool to read a performance counter than to use a time stamp counter (the timestamp counter is accessed using the RDTSC instruction). For applications with a significant amount of I/O, there are two ratios of interest: • Non-halted CPI — Non-halted clockticks/instructions retired measures the CPI for phases where the CPU was being used. This ratio can be measured on a logical-processor basis when Intel Hyper-Threading Technology is enabled. • Nominal CPI — Time-stamp counter ticks/instructions retired measures the CPI over the duration of a program, including those periods when the machine halts while waiting for I/O. 18.7.1 Non-Halted Reference Clockticks Software can use UnHalted Reference Cycles on either a general purpose performance counter using event mask 0x3C and umask 0x01 or on fixed function performance counter 2 to count at a constant rate. These events count at a consistent rate irrespective of P-state, TM2, or frequency transitions that may occur to the processor. The UnHalted Reference Cycles event may count differently on the general purpose event and fixed counter. 18.7.2 Cycle Counting and Opportunistic Processor Operation As a result of the state transitions due to opportunistic processor performance operation (see Chapter 14, “Power and Thermal Management”), a logical processor or a processor core can operate at frequency different from the Processor Base frequency. The following items are expected to hold true irrespective of when opportunistic processor operation causes state transitions: • • The time stamp counter operates at a fixed-rate frequency of the processor. • The IA32_FIXED_CTR2 counter increments at the same TSC frequency irrespective of any transitions caused by opportunistic processor operation. • • The Local APIC timer operation is unaffected by opportunistic processor operation. The IA32_MPERF counter increments at a fixed frequency irrespective of any transitions caused by opportunistic processor operation. The TSC, IA32_MPERF, and IA32_FIXED_CTR2 operate at close to the maximum non-turbo frequency, which is equal to the product of scalable bus frequency and maximum non-turbo ratio. 18-136 Vol. 3B PERFORMANCE MONITORING 18.7.3 Determining the Processor Base Frequency For Intel processors in which the nominal core crystal clock frequency is enumerated in CPUID.15H.ECX and the core crystal clock ratio is encoded in CPUID.15H (see Table 3-8 “Information Returned by CPUID Instruction”), the nominal TSC frequency can be determined by using the following equation: Nominal TSC frequency = ( CPUID.15H.ECX[31:0] * CPUID.15H.EBX[31:0] ) ÷ CPUID.15H.EAX[31:0] For Intel processors in which CPUID.15H.EBX[31:0] ÷ CPUID.0x15.EAX[31:0] is enumerated but CPUID.15H.ECX is not enumerated, Table 18-85 can be used to look up the nominal core crystal clock frequency. Table 18-85. Nominal Core Crystal Clock Frequency Processor Families/Processor Number Series1 Nominal Core Crystal Clock Frequency Intel® Xeon® Processor Scalable Family with CPUID signature 06_55H. 25 MHz 6th and 7th generation Intel® Core™ processors and Intel® Xeon® W Processor Family. 24 MHz Next Generation Intel® Atom™ processors based on Goldmont Microarchitecture with CPUID signature 06_5CH (does not include Intel Xeon processors). 19.2 MHz NOTES: 1. For any processor in which CPUID.15H is enumerated and MSR_PLATFORM_INFO[15:8] (which gives the scalable bus frequency) is available, a more accurate frequency can be obtained by using CPUID.15H. 18.7.3.1 For Intel® Processors Based on Microarchitecture Code Name Sandy Bridge, Ivy Bridge, Haswell and Broadwell The scalable bus frequency is encoded in the bit field MSR_PLATFORM_INFO[15:8] and the nominal TSC frequency can be determined by multiplying this number by a bus speed of 100 MHz. 18.7.3.2 For Intel® Processors Based on Microarchitecture Code Name Nehalem The scalable bus frequency is encoded in the bit field MSR_PLATFORM_INFO[15:8] and the nominal TSC frequency can be determined by multiplying this number by a bus speed of 133.33 MHz. 18.7.3.3 For Intel® Atom™ Processors Based on the Silvermont Microarchitecture (Including Intel Processors Based on Airmont Microarchitecture) The scalable bus frequency is encoded in the bit field MSR_PLATFORM_INFO[15:8] and the nominal TSC frequency can be determined by multiplying this number by the scalable bus frequency. The scalable bus frequency is encoded in the bit field MSR_FSB_FREQ[2:0] for Intel Atom processors based on the Silvermont microarchitecture, and in bit field MSR_FSB_FREQ[3:0] for processors based on the Airmont microarchitecture; see Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. Vol. 3B 18-137 PERFORMANCE MONITORING 18.7.3.4 For Intel® Core™ 2 Processor Family and for Intel® Xeon® Processors Based on Intel Core Microarchitecture For processors based on Intel Core microarchitecture, the scalable bus frequency is encoded in the bit field MSR_FSB_FREQ[2:0] at (0CDH), see Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. The maximum resolved bus ratio can be read from the following bit field: • If XE operation is disabled, the maximum resolved bus ratio can be read in MSR_PLATFORM_ID[12:8]. It corresponds to the Processor Base frequency. • IF XE operation is enabled, the maximum resolved bus ratio is given in MSR_PERF_STATUS[44:40], it corresponds to the maximum XE operation frequency configured by BIOS. XE operation of an Intel 64 processor is implementation specific. XE operation can be enabled only by BIOS. If MSR_PERF_STATUS[31] is set, XE operation is enabled. The MSR_PERF_STATUS[31] field is read-only. 18.8 IA32_PERF_CAPABILITIES MSR ENUMERATION The layout of IA32_PERF_CAPABILITIES MSR is shown in Figure 18-63, it provides enumeration of a variety of interfaces: • IA32_PERF_CAPABILITIES.LBR_FMT[bits 5:0]: encodes the LBR format, details are described in Section 17.4.8.1. • IA32_PERF_CAPABILITIES.PEBSTrap[6]: Trap/Fault-like indicator of PEBS recording assist, see Section 18.6.2.4.2. • IA32_PERF_CAPABILITIES.PEBSArchRegs[7]: Indicator of PEBS assist save architectural registers, see Section 18.6.2.4.2. • IA32_PERF_CAPABILITIES.PEBS_FMT[bits 11:8]: Specifies the encoding of the layout of PEBS records, see Section 18.6.2.4.2. • IA32_PERF_CAPABILITIES.FREEZE_WHILE_SMM[12]: Indicates IA32_DEBUGCTL.FREEZE_WHILE_SMM is supported if 1, see Section 18.8.1. • IA32_PERF_CAPABILITIES.FULL_WRITE[13]: Indicates the processor supports IA32_A_PMCx interface for updating bits 32 and above of IA32_PMCx, see Section 18.2.6. • IA32_PERF_CAPABILITIES.PERF_METRICS_AVAILABLE[15]: If set, indicates that the architecture provides built in support for TMA L1 metrics through the IA32_PERF_METRICS MSR, see Section 18.3.9.3. • IA32_PERF_CAPABILITIES.PEBS_OUTPUT_PT_AVAIL[16]: If set on parts that enumerate support for Intel PT (CPUID.0x7.0.EBX[25]=1), setting IA32_PEBS_ENABLE.PEBS_OUTPUT to 01B will result in PEBS output being written into the Intel PT trace stream. See Section 18.5.5.2. 18-138 Vol. 3B PERFORMANCE MONITORING 63 16 15 13 12 11 8 7 6 5 4 3 2 1 0 PEBS_OUTPUT_PT_AVAIL (R/O) PERF_METRICS_AVAILABLE (R/O) FW_WRITE (R/O) SMM_FREEZE (R/O) PEBS_REC_FMT (R/O) PEBS_ARCH _REG (R/O) PEBS_TRAP (R/O) LBR_FMT (R/O) Reserved Figure 18-63. Layout of IA32_PERF_CAPABILITIES MSR 18.8.1 Filtering of SMM Handler Overhead When performance monitoring facilities and/or branch profiling facilities (see Section 17.5, “Last Branch, Interrupt, and Exception Recording (Intel® Core™ 2 Duo and Intel® Atom™ Processors)”) are enabled, these facilities capture event counts, branch records and branch trace messages occurring in a logical processor. The occurrence of interrupts, instruction streams due to various interrupt handlers all contribute to the results recorded by these facilities. If CPUID.01H:ECX.PDCM[bit 15] is 1, the processor supports the IA32_PERF_CAPABILITIES MSR. If IA32_PERF_CAPABILITIES.FREEZE_WHILE_SMM[Bit 12] is 1, the processor supports the ability for system software using performance monitoring and/or branch profiling facilities to filter out the effects of servicing system management interrupts. If the FREEZE_WHILE_SMM capability is enabled on a logical processor and after an SMI is delivered, the processor will clear all the enable bits of IA32_PERF_GLOBAL_CTRL, save a copy of the content of IA32_DEBUGCTL and disable LBR, BTF, TR, and BTS fields of IA32_DEBUGCTL before transferring control to the SMI handler. The enable bits of IA32_PERF_GLOBAL_CTRL will be set to 1, the saved copy of IA32_DEBUGCTL prior to SMI delivery will be restored , after the SMI handler issues RSM to complete its servicing. It is the responsibility of the SMM code to ensure the state of the performance monitoring and branch profiling facilities are preserved upon entry or until prior to exiting the SMM. If any of this state is modified due to actions by the SMM code, the SMM code is required to restore such state to the values present at entry to the SMM handler. System software is allowed to set IA32_DEBUGCTL.FREEZE_WHILE_SMM[bit 14] to 1 only supported as indicated by IA32_PERF_CAPABILITIES.FREEZE_WHILE_SMM[Bit 12] reporting 1. 18.9 PEBS FACILITY 18.9.1 Extended PEBS • The Extended PEBS feature supports Processor Event Based Sampling (PEBS) on all counters, both fixed function and general purpose; and all performance monitoring events, both precise and non-precise. PEBS can be enabled for the general purpose counters using PEBS_EN_PMCi bits of IA32_PEBS_ENABLE (i = 0, 1,..n). PEBS can be enabled for 'i' fixed function counters using the PEBS_EN_FIXEDi bits of IA32_PEBS_ENABLE (i = 0, 1, ...m). Vol. 3B 18-139 PERFORMANCE MONITORING n 63 32 31 ● ● ● 1 0 m ● ● ● PEBS_EN_FIXEDn (R/W) PEBS_EN_FIXED1 (R/W) PEBS_EN_FIXED0 (R/W) PEBS_EN_PMCm (R/W) PEBS_EN_PMC1 (R/W) PEBS_EN_PMC0 (R/W) Reserved RESET Value – 00000000 _00000000 H Figure 18-64. Layout of IA32_PEBS_ENABLE MSR A PEBS record due to a precise event will be generated after an instruction that causes the event when the counter has already overflowed. A PEBS record due to a non-precise event will occur at the next opportunity after the counter has overflowed, including immediately after an overflow is set by an MSR write. Currently, IA32_FIXED_CTR0 counts instructions retired and is a precise event. IA32_FIXED_CTR1, IA32_FIXED_CTR2 … IA32_FIXED_CTRm count as non-precise events. The Applicable Counter field in the Basic Info Group of the PEBS record indicates which counters caused the PEBS record to be generated. It is in the same format as the enable bits for each counter in IA32_PEBS_ENABLE. As an example, an Applicable Counter field with bits 2 and 32 set would indicate that both general purpose counter 2 and fixed function counter 0 generated the PEBS record. • To properly use PEBS for the additional counters, software will need to set up the counter reset values in PEBS portion of the DS_BUFFER_MANAGEMENT_AREA data structure that is indicated by the IA32_DS_AREA register. The layout of the DS_BUFFER_MANAGEMENT_AREA is shown in Figure 18-65. When a counter generates a PEBS records, the appropriate counter reset values will be loaded into that counter. In the above example where general purpose counter 2 and fixed function counter 0 generated the PEBS record, general purpose counter 2 would be reloaded with the value contained in PEBS GP Counter 2 Reset (offset 50H) and fixed function counter 0 would be reloaded with the value contained in PEBS Fixed Counter 0 Reset (offset 80H). 18-140 Vol. 3B PERFORMANCE MONITORING IA32_DS_AREA MSR BTS Buffer DS Buffer Management 63 BTS Buffer Base 0 00H BTS Index 08H BTS Absolute Maximum 10H BTS Interrupt Threshold 18H PEBS Buffer Base 20H PEBS Index 28H PEBS Absolute Maximum 30H PEBS Interrupt Threshold 38H PEBS GP Counter 0 Reset 40H ●●● PEBS GP Counter m Reset PEBS Fixed Counter 0 Reset ●●● 80H Branch Record 1 Branch Record N PEBS Config Buffer PEBS GP Counter 1 Reset Branch Record 0 PEBS Buffer PEBS Record 0 PEBS Record 1 PEBS Fixed Counter n Reset PEBS Record N Figure 18-65. PEBS Programming Environment Extended PEBS support debuts on Intel® Atom processors based on the Goldmont Plus microarchitecture and future Intel® Core™ processors based on the Ice Lake microarchitecture. 18.9.2 Adaptive PEBS The PEBS facility has been enhanced to collect the following CPU state in addition to GPRs, EventingIP, TSC and memory access related information collected by legacy PEBS: • • XMM registers LBR records (TO/FROM/INFO) The PEBS record is restructured where fields are grouped into Basic group, Memory group, GPR group, XMM group and LBR group. A new register MSR_PEBS_DATA_CFG provides software the capability to select data groups of interest and thus reduce the record size in memory and record generation latency. Hence, a PEBS record's size and layout vary based on the selected groups. The MSR also allows software to select LBR depth for branch data records. By default, the PEBS record will only contain the Basic group. Optionally, each counter can be configured to generate a PEBS records with the groups specified in MSR_PEBS_DATA_CFG. Vol. 3B 18-141 PERFORMANCE MONITORING Details and examples for the Adaptive PEBS capability follow below. 18.9.2.1 Adaptive_Record Counter Control • IA32_PERFEVTSELx.Adaptive_Record[34]: If this bit is set and IA32_PEBS_ENABLE.PEBS_EN_PMCx is set for the corresponding GP counter, an overflow of PMCx results in generation of an adaptive PEBS record with state information based on the selections made in MSR_PEBS_DATA_CFG. If this bit is not set, a basic record is generated. • IA32_FIXED_CTR_CTRL.FCx_Adaptive_Record: If this bit is set and IA32_PEBS_ENABLE.PEBS_EN_FIXEDx is set for the corresponding Fixed counter, an overflow of FixedCtrx results in generation of an adaptive PEBS record with state information based on the selections made in MSR_PEBS_DATA_CFG. If this bit is not set, a basic record is generated. EN_FC0 PMI_FC0 AnyThr_FC0 EN_FC1 PMI_FC1 AnyThr_FC1 EN_FC2 PMI_FC2 AnyThr_FC2 EN_FC3 PMI_FC3 AnyThr_FC3 Reserved 0 63 55 47 39 32 FC0_ADAPTIVE_RECORD 7 FC1_ADAPTIVE_RECORD 15 FC2_ADAPTIVE_RECORD 23 FC3_ADAPTIVE_RECORD 31 IA32_Fixed_CTR_CTRL Address: 38DH Scope: Thread Reset value : 0x00000000 .00000000 Figure 18-66. Layout of IA32_FIXED_CTR_CTRL MSR Supporting Adaptive PEBS 18.9.2.2 PEBS Record Format The data fields in the PEBS record are aggregated into five groups which are described in the sub-sections below. Processors that support Adaptive PEBS implement a new MSR called MSR_PEBS_DATA_CFG which allows software to select the data groups to be captured. The data groups are not placed at fixed locations in the PEBS record, but are positioned immediately after one another, thus making the record format/size variable based on the groups selected. 18-142 Vol. 3B PERFORMANCE MONITORING 18.9.2.2.1 Basic Info The Basic group contains essential information for software to parse a record along with several critical fields. It is always collected. Table 18-86. Basic Info Group Field Name Record Format Bit Width Description [47:0] This field indicates which data groups are included in the record. The field is zero if none of the counters that triggered the current PEBS record have their Adaptive_Record bit set. Otherwise it contains the value of MSR_PEBS_DATA_CFG. [63:48] This field provides the size of the current record in bytes. Selected groups are packed back-to-back in the record without gaps or padding for unselected groups. Instruction Pointer [63:0] This field reports the Eventing Instruction Pointer (EventingIP) of the retired instruction that triggered the PEBS record generation. Note that this field is different than R/EIP which records the instruction pointer of the next instruction to be executed after record generation. The legacy R/EIP field has been removed. Applicable Counters [63:0] The Applicable Counters field indicates which counters triggered the generation of the PEBS record, linking the record to specific events. This allows software to correlate the PEBS record entry properly with the instruction that caused the event, even when multiple counters are configured to generate PEBS records and multiple bits are set in the field. TSC [63:0] This field provides the time stamp counter value when the PEBS record was generated. 18.9.2.2.2 Memory Access Info This group contains the legacy PEBS memory-related fields; see Section 18.3.1.1.2. Table 18-87. Memory Access Info Group Field Name Bit Width Description Memory Access Address [63:0] This field contains the linear address of the source of the load, or linear address of the destination (target) of the store. This value is written as a 64-bit address in canonical form. Memory Auxiliary Info [63:0] When MEM_TRANS_RETIRED.* event is configured in a General Purpose counter, this field contains an encoded value indicating the memory hierarchy source which satisfied the load. These encodings are detailed in Table 18-4 and Table 18-13. If the PEBS assist was triggered for a store uop, this field will contain information indicating the status of the store, as detailed in Table 18-14. Memory Access Latency [63:0] When MEM_TRANS_RETIRED.* event is configured in a General Purpose counter, this field contains the latency to service the load in core clock cycles. TSX Auxiliary Info [31:0] This field contains the number of cycles in the last TSX region, regardless of whether that region had aborted or committed. [63:31] This field contains the abort details. Refer to Section 18.3.6.5.1. 18.9.2.2.3 GPRs This group is captured when the GPR bit is enabled in MSR_PEBS_DATA_CFG. GPRs are always 64 bits wide. If they are selected for non 64-bit mode, the upper 32-bit of the legacy RAX - RDI and all contents of R8-15 GPRs will be filled with 0s. In 64bit mode, the full 64 bit value of each register is written. Vol. 3B 18-143 PERFORMANCE MONITORING The order differs from legacy. The table below shows the order of the GPRs in Ice Lake microarchitecture. Table 18-88. GPRs in Ice Lake Microarchitecture Field Name Bit Width RFLAGS [63:0] RIP [63:0] RAX [63:0] RCX* [63:0] RDX* [63:0] RBX* [63:0] RSP* [63:0] RBP* [63:0] RSI* [63:0] RDI* [63:0] R8 [63:0] ... ... R15 [63:0] The machine state reported in the PEBS record is the committed machine state immediately after the instruction that triggers PEBS completes. For instance, consider the following instruction sequence: MOV eax, [eax]; triggers PEBS record generation NOP If the mov instruction triggers PEBS record generation, the EventingIP field in the PEBS record will report the address of the mov, and the value of EAX in the PEBS record will show the value read from memory, not the target address of the read operation. And the value of RIP will contain the linear address of the nop. 18.9.2.2.4 XMMs This group is captured when the XMM bit is enabled in MSR_PEBS_DATA_CFG and SSE is enabled. If SSE is not enabled, the fields will contain zeroes. XMM8-XMM15 will also contain zeroes if not in 64-bit mode. 18-144 Vol. 3B PERFORMANCE MONITORING Table 18-89. XMMs Field Name Bit Width XMM0 [127:0] ... ... XMM15 [127:0] 18.9.2.2.5 LBRs To capture LBR data in the PEBS record, the LBR bit in MSR_PEBS_DATA_CFG must be enabled. The number of LBR entries included in the record can be configured in the LBR_entries field of MSR_PEBS_DATA_CFG. Table 18-90. LBRs Field Name Bit Width Description LBR[].FROM [63:0] Branch from address. LBR[].TO [63:0] Branch to address. LBR[].INFO [63:0] Other LBR information, like timing. This field is described in more detail in Section 17.12.1, “MSR_LBR_INFO_x MSR”. LBR entries are recorded into the record starting at LBR[TOS] and proceeding to LBR[TOS-1] and following. Note that LBR index is modulo the number of LBRs supporting on the processor. 18.9.2.3 MSR_PEBS_DATA_CFG Bits in MSR_PEBS_DATA_CFG can be set to include data field blocks/groups into adaptive records. The Basic Info group is always included in the record. Additionally, the number of LBR entries included in the record is configurable. Vol. 3B 18-145 PERFORMANCE MONITORING GPRs Memory Info LBRs XMMs LBR Entries Reserved 31 23 15 7 31 63 55 47 39 63 MSR_PEBS_DATA_CFG Address: 3F2H Scope: Thread Reset value : 0x00000000 .00000000 Figure 18-67. MSR_PEBS_DATA_CFG Table 18-91. MSR_PEBS_CFG Programming1 Bit Bit Index Access Description Memory Info 0 R/W Setting this bit will capture memory information such as the linear address, data source and latency of the memory access in the PEBS record. GPRs 1 R/W Setting this bit will capture the contents of the General Purpose registers in the PEBS record. XMMs 2 R/W Setting this bit will capture the contents of the XMM registers in the PEBS record. LBRs 3 R/W Setting this bit will capture LBR TO, FROM and INFO in the PEBS record. Reserved2 23:4 NA Reserved LBR Entries 31:24 R/W Set the field to the desired number of entries minus 1. For example, if the LBR_entries field is 0, a single entry will be included in the record. To include 32 LBR entries, set the LBR_entries field to 31 (0x1F). To ensure all PEBS records are 16-byte aligned, it is recommended to select an even number of LBR entries (programmed into LBR_entries as an odd number). NOTES: 1. A write to the MSR will be ignored when IA32_MISC_ENABLES.PERFMON_AVAILABLE is zero (default). 2. Writing to the reserved bits will cause a GP fault. 18-146 Vol. 3B PERFORMANCE MONITORING 18.9.2.4 PEBS Record Examples The following example shows the layout of the PEBS record when all data groups are selected (all valid bits in MSR_PEBS_DATA_CFG are set) and maximum number of LBRs are selected. There are no gaps in the PEBS record when a subset of the groups are selected, thus keeping the layout compact. Implementations that do not support some features will have to pad zeroes in the corresponding fields. Table 18-92. PEBS Record Example 1 Offset 0x0 Group Name Legacy Name (If Different) Record Format New Record Size New 0x8 Instruction Pointer EventingRIP 0x10 Applicable Counters 0x18 TSC 0x20 Basic Info Field Name Memory Access Address DLA 0x28 Memory Auxiliary Info DATA_SRC 0x30 Memory Access Latency Load Latency 0x38 TSX Auxiliary Info HLE Information 0x40 Memory Info GPRs RFLAGS 0x48 RIP 0x50 RAX ... ... 0x88 RDI 0x90 R8 ... ... 0xC8 R15 0xD0 XMMs XMM0 ... ... 0x1C0 XMM15 0x1D0 LBRs LBR[TOS].FROM 0x1D8 LBR[TOS].TO 0x1E0 LBR[TOS].INFO ... ... 0x4B8 LBR[TOS +1].FROM 0x4C0 LBR[TOS +1].TO 0x4C8 LBR[TOS +1].INFO New New Vol. 3B 18-147 PERFORMANCE MONITORING The following example shows the layout of the PEBS record when Basic, GPR, and LBR group with 3 LBR entries are selected. Table 18-93. PEBS Record Example 2 Offset Group Name 0x0 Basic Info Field Name Legacy Name (If Different) Record Format New Record Size New 0x8 Instruction Pointer EventingRIP 0x10 Applicable Counters 0x18 TSC 0x20 GPRs RFLAGS 0x28 RIP 0x30 RAX ... ... 0x68 RDI 0x70 R8 ... ... 0xA8 R15 0xB0 LBRs LBR[TOS].FROM 0xB8 LBR[TOS].TO 0xC0 LBR[TOS].INFO ... ... 0xE0 LBR[TOS +1].FROM 0xE8 LBR[TOS +1].TO 0xF0 LBR[TOS +1].INFO 18.9.3 New Precise Distribution of Instructions Retired (PDIR) Facility Precise Distribution of Instructions Retired Facility is available via PEBS on some microarchitectures. Refer to Section 18.3.4.4.4. Counters that support PDIR also vary. See the processor specific sections for availability. 18.9.4 Reduced Skid PEBS Processors based on Goldmont Plus microarchitecture support the Reduced Skid PEBS feature described in Section 18.5.3.1.2 on the IA32_PMC0 counter. Although Extended PEBS adds support for generating PEBS records for precise events on additional general-purpose and fixed-function performance counters, those counters do not support the Reduced Skid PEBS feature. 18-148 Vol. 3B 15.Updates to Chapter 19, Volume 3B Change bars and green text show changes to Chapter 19 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------Changes to this chapter: Added performance monitoring events for processors based on Ice Lake microarchitecture. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 26 CHAPTER 19 PERFORMANCE MONITORING EVENTS NOTE The event tables listed this chapter provide information for tool developers to support architectural and model-specific performance monitoring events. The tables are up to date at processor launch, but are subject to changes. The most up to date event tables and additional details of performance event implementation can be found here: 1) In the document titled “Intel® 64 and IA32 Architectures Performance Monitoring Events”, Document number: 335279, located here: https://software.intel.com/sites/default/files/managed/8b/6e/335279_performance_monitoring_ events_guide.pdf. These event tables are also available on the web and are located at: https://download.01.org/perfmon/index/. 2) Performance monitoring event files for Intel processors are hosted by the Intel Open Source Technology Center. These files can be downloaded here: https://download.01.org/perfmon/. This chapter lists the performance monitoring events that can be monitored with the Intel 64 or IA-32 processors. The ability to monitor performance events and the events that can be monitored in these processors are mostly model-specific, except for architectural performance events, described in Section 19.1. Model-specific performance events are listed for each generation of microarchitecture: • • • • • • • • • • • • • • • • • • • • • • • • Section 19.2 - Processors based on Skylake microarchitecture and processors based on Cascade Lake product Section 19.3 - Processors based on Ice Lake microarchitecture Section 19.4 - Processors based on Skylake, Kaby Lake and Coffee Lake microarchitectures Section 19.5 - Processors based on Knights Landing and Knights Mill microarchitectures Section 19.6 - Processors based on Broadwell microarchitecture Section 19.7 - Processors based on Haswell microarchitecture Section 19.7.1 - Processors based on Haswell-E microarchitecture Section 19.8 - Processors based on Ivy Bridge microarchitecture Section 19.8.1 - Processors based on Ivy Bridge-E microarchitecture Section 19.9 - Processors based on Sandy Bridge microarchitecture Section 19.10 - Processors based on Intel® microarchitecture code name Nehalem Section 19.11 - Processors based on Intel® microarchitecture code name Westmere Section 19.12 - Processors based on Enhanced Intel® Core™ microarchitecture Section 19.13 - Processors based on Intel® Core™ microarchitecture Section 19.14 - Processors based on the Goldmont Plus microarchitecture Section 19.15 - Processors based on the Goldmont microarchitecture Section 19.16 - Processors based on the Silvermont microarchitecture Section 19.16.1 - Processors based on the Airmont microarchitecture Section 19.17 - 45 nm and 32 nm Intel® Atom™ Processors Section 19.18 - Intel® Core™ Solo and Intel® Core™ Duo processors Section 19.19 - Processors based on Intel NetBurst® microarchitecture Section 19.20 - Pentium® M family processors Section 19.21 - P6 family processors Section 19.22 - Pentium® processors Vol. 3B 19-1 PERFORMANCE MONITORING EVENTS NOTE These performance monitoring events are intended to be used as guides for performance tuning. The counter values reported by the performance monitoring events are approximate and believed to be useful as relative guides for tuning software. Known discrepancies are documented where applicable. All performance event encodings not documented in the appropriate tables for the given processor are considered reserved, and their use will result in undefined counter updates with associated overflow actions. 19.1 ARCHITECTURAL PERFORMANCE MONITORING EVENTS Architectural performance events are introduced in Intel Core Solo and Intel Core Duo processors. They are also supported on processors based on Intel Core microarchitecture. Table 19-1 lists pre-defined architectural performance events that can be configured using general-purpose performance counters and associated event-select registers. Table 19-1. Architectural Performance Events Event Num. Event Mask Name Umask Value 3CH UnHalted Core Cycles 00H Counts core clock cycles whenever the logical processor is in C0 state (not halted). The frequency of this event varies with state transitions in the core. 3CH UnHalted Reference Cycles1 01H Counts at a fixed frequency whenever the logical processor is in C0 state (not halted). C0H Instructions Retired 00H Counts when the last uop of an instruction retires. 2EH LLC Reference 4FH Counts requests originating from the core that reference a cache line in the last level on-die cache. 2EH LLC Misses 41H Counts each cache miss condition for references to the last level on-die cache. C4H Branch Instruction Retired 00H Counts when the last uop of a branch instruction retires. C5H Branch Misses Retired 00H Counts when the last uop of a branch instruction retires which corrected misprediction of the branch prediction hardware at execution time. Description NOTES: 1. Current implementations count at core crystal clock, TSC, or bus clock frequency. Fixed-function performance counters count only events defined in Table 19-2. Table 19-2. Fixed-Function Performance Counter and Pre-defined Performance Events Fixed-Function Performance Counter Address Event Mask Mnemonic Description IA32_PERF_FIXED_CTR0 309H Inst_Retired.Any This event counts the number of instructions that retire execution. For instructions that consist of multiple microops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. 19-2 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-2. Fixed-Function Performance Counter and Pre-defined Performance Events (Contd.) Fixed-Function Performance Counter Address Event Mask Mnemonic Description IA32_PERF_FIXED_CTR1 30AH CPU_CLK_UNHALTED.THRE AD/CPU_CLK_UNHALTED.C ORE/CPU_CLK_UNHALTED. THREAD_ANY The CPU_CLK_UNHALTED.THREAD event counts the number of core cycles while the logical processor is not in a halt state. If there is only one logical processor in a processor core, CPU_CLK_UNHALTED.CORE counts the unhalted cycles of the processor core. If there are more than one logical processor in a processor core, CPU_CLK_UNHALTED.THREAD_ANY is supported by programming IA32_FIXED_CTR_CTRL[bit 6]AnyThread = 1. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. IA32_PERF_FIXED_CTR2 19.2 30BH CPU_CLK_UNHALTED.REF_ TSC This event counts the number of reference cycles at the TSC rate when the core is not in a halt state and not in a TM stop-clock state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (e.g., P states) but counts at the same frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state and not in a TM stopclock state. PERFORMANCE MONITORING EVENTS FOR INTEL® XEON® PROCESSOR SCALABLE FAMILY The Intel® Xeon® Processor Scalable Family is based on the Skylake microarchitecture and includes processors based on the Cascade Lake product. These processors support the architectural performance monitoring events listed in Table 19-1. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Modelspecific performance monitoring events in the processor core are listed in Table 19-3. The events in Table 19-3 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following value: 06_55H . The comment column in Table 19-3 uses abbreviated letters to indicate additional conditions applicable to the Event Mask Mnemonic. For event umasks listed in Table 19-6 that do not show “AnyT”, users should refrain from programming “AnyThread =1” in IA32_PERF_EVTSELx. Vol. 3B 19-3 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product Event Num. Umask Value Event Mask Mnemonic Description Comment 00H 01H INST_RETIRED.ANY Counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, Counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions. Fixed Counter 00H 02H CPU_CLK_UNHALTED.THREAD Counts the number of core cycles while the thread is Fixed Counter not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. 00H 02H CPU_CLK_UNHALTED.THREAD_ ANY Core cycles when at least one thread on the physical core is not in halt state. 19-4 Vol. 3B AnyThread=1 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 00H 03H CPU_CLK_UNHALTED.REF_TSC Counts the number of reference cycles when the core is Fixed Counter not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during ‘throttling (TM)’ states duty off periods the processor is ‘halted’. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear ‘sticky’. After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip “high (1)” and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit ‘1 for bit 34 while the counter value is less than MAX. Software should ignore this case. 03H 02H LD_BLOCKS.STORE_FORWARD Counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when: a. preceding store conflicts with the load (incomplete overlap), b. store forwarding is impossible due to u-arch limitations, c. preceding lock RMW operations are not forwarded, d. store has the no-forward bit set (uncacheable/pagesplit/masked stores), e. all-blocking stores are used (mostly, fences and port I/O), and others. The most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events. See the table of not supported store forwards in the Optimization Guide. 03H 08H LD_BLOCKS.NO_SR The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS _ALIAS Counts false dependencies in MOB when the partial comparison upon loose net check and dependency was resolved by the Enhanced Loose net mechanism. This may not result in high performance penalties. Loose net checks can fail when loads and stores are 4k aliased. Vol. 3B 19-5 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic 08H 01H DTLB_LOAD_MISSES.MISS_CAUS Counts demand data loads that caused a page walk of ES_A_WALK any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed. 08H 02H DTLB_LOAD_MISSES.WALK_COM Counts demand data loads that caused a completed PLETED_4K page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault. 08H 04H DTLB_LOAD_MISSES.WALK_COM Counts demand data loads that caused a completed PLETED_2M_4M page walk (2M and 4M page sizes). This implies it missed in all TLB levels. The page walk can end with or without a fault. 08H 08H DTLB_LOAD_MISSES.WALK_COM Counts load misses in all DTLB levels that cause a PLETED_1G completed page walk (1G page size). The page walk can end with or without a fault. 08H 0EH DTLB_LOAD_MISSES.WALK_COM Counts demand data loads that caused a completed PLETED page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault. 08H 10H DTLB_LOAD_MISSES.WALK_PEN Counts 1 per cycle for each PMH that is busy with a DING page walk for a load. EPT page walk duration are excluded in Skylake microarchitecture. 08H 10H DTLB_LOAD_MISSES.WALK_ACT Counts cycles when at least one PMH (Page Miss IVE Handler) is busy with a page walk for a load. 08H 20H DTLB_LOAD_MISSES.STLB_HIT Counts loads that miss the DTLB (Data TLB) and hit the STLB (Second level TLB). 0DH 01H INT_MISC.RECOVERY_CYCLES Core cycles the Resource allocator was stalled due to recovery from an earlier branch misprediction or machine clear event. 0DH 01H INT_MISC.RECOVERY_CYCLES_A Core cycles the allocator was stalled due to recovery NY from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke). 0DH 80H INT_MISC.CLEAR_RESTEER_CYC LES Cycles the issue-stage is waiting for front-end to fetch from resteered path following branch misprediction or machine clear events. 0EH 01H UOPS_ISSUED.ANY Counts the number of uops that the Resource Allocation Table (RAT) issues to the Reservation Station (RS). 0EH 01H UOPS_ISSUED.STALL_CYCLES Counts cycles during which the Resource Allocation CounterMask=1 Table (RAT) does not issue any uops to the reservation Invert=1 CMSK1, INV station (RS) for the current thread. 0EH 02H UOPS_ISSUED.VECTOR_WIDTH_ Counts the number of Blend Uops issued by the MISMATCH Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. Starting with the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register. For more information, refer to Mixing Intel AVX and Intel SSE Code section of the Optimization Guide. 19-6 Vol. 3B Description Comment CounterMask=1 CMSK1 AnyThread=1 AnyT PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0EH 20H UOPS_ISSUED.SLOW_LEA Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not. 14H 01H ARITH.DIVIDER_ACTIVE Cycles when divide unit is busy executing divide or square root operations. Accounts for integer and floating-point operations. 24H 21H L2_RQSTS.DEMAND_DATA_RD_ MISS Counts the number of demand Data Read requests that miss L2 cache. Only not rejected loads are counted. 24H 22H L2_RQSTS.RFO_MISS Counts the RFO (Read-for-Ownership) requests that miss L2 cache. 24H 24H L2_RQSTS.CODE_RD_MISS Counts L2 cache misses when fetching instructions. 24H 27H L2_RQSTS.ALL_DEMAND_MISS Demand requests that miss L2 cache. 24H 38H L2_RQSTS.PF_MISS Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache. 24H 3FH L2_RQSTS.MISS All requests that miss L2 cache. 24H 41H L2_RQSTS.DEMAND_DATA_RD_ HIT Counts the number of demand Data Read requests that hit L2 cache. Only non rejected loads are counted. 24H 42H L2_RQSTS.RFO_HIT Counts the RFO (Read-for-Ownership) requests that hit L2 cache. 24H 44H L2_RQSTS.CODE_RD_HIT Counts L2 cache hits when fetching instructions, code reads. 24H D8H L2_RQSTS.PF_HIT Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache. 24H E1H L2_RQSTS.ALL_DEMAND_DATA _RD Counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only non rejected loads are counted. 24H E2H L2_RQSTS.ALL_RFO Counts the total number of RFO (read for ownership) requests to L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. 24H E4H L2_RQSTS.ALL_CODE_RD Counts the total number of L2 code requests. 24H E7H L2_RQSTS.ALL_DEMAND_REFE RENCES Demand requests to L2 cache. 24H F8H L2_RQSTS.ALL_PF Counts the total number of requests from the L2 hardware prefetchers. All L2 requests. 24H FFH L2_RQSTS.REFERENCES 28H 07H CORE_POWER.LVL0_TURBO_LIC Core cycles where the core was running with powerENSE delivery for baseline license level 0. This includes nonAVX codes, SSE, AVX 128-bit, and low-current AVX 256-bit codes. Comment CounterMask=1 Vol. 3B 19-7 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 28H 18H CORE_POWER.LVL1_TURBO_LIC Core cycles where the core was running with powerENSE delivery for license level 1. This includes high current AVX 256-bit instructions as well as low current AVX 512-bit instructions. 28H 20H CORE_POWER.LVL2_TURBO_LIC Core cycles where the core was running with powerENSE delivery for license level 2 (introduced in Skylake Server microarchitecture). This includes high current AVX 512-bit instructions. 28H 40H CORE_POWER.THROTTLE Core cycles the out-of-order engine was throttled due to a pending power level request. 2EH 41H LONGEST_LAT_CACHE.MISS Counts core-originated cacheable requests that miss See Table 19-1. the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2. It does not include all misses to the L3. 2EH 4FH LONGEST_LAT_CACHE.REFEREN Counts core-originated cacheable requests to the L3 See Table 19-1. CE cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2. It does not include all accesses to the L3. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P This is an architectural event that counts the number of See Table 19-1. thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P_ANY Core cycles when at least one thread on the physical core is not in halt state. 3CH 00H CPU_CLK_UNHALTED.RING0_TR Counts when the Current Privilege Level (CPL) ANS transitions from ring 1, 2 or 3 to ring 0 (Kernel). EdgeDetect=1 CounterMask=1 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK Core crystal clock cycles when the thread is unhalted. See Table 19-1. 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK_ANY Core crystal clock cycles when at least one thread on the physical core is unhalted. AnyThread=1 AnyT AnyThread=1 AnyT 3CH 01H CPU_CLK_UNHALTED.REF_XCLK Core crystal clock cycles when the thread is unhalted. See Table 19-1. 3CH 01H CPU_CLK_UNHALTED.REF_XCLK Core crystal clock cycles when at least one thread on _ANY the physical core is unhalted. AnyThread=1 AnyT 3CH 02H CPU_CLK_THREAD_UNHALTED. ONE_THREAD_ACTIVE Core crystal clock cycles when this thread is unhalted and the other thread is halted. 3CH 02H CPU_CLK_UNHALTED.ONE_THR EAD_ACTIVE Core crystal clock cycles when this thread is unhalted and the other thread is halted. 19-8 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 48H 01H L1D_PEND_MISS.PENDING Counts duration of L1D miss outstanding, that is each cycle number of Fill Buffers (FB) outstanding required by Demand Reads. FB either is held by demand loads, or it is held by non-demand loads and gets hit at least once by demand. The valid outstanding interval is defined until the FB deallocation by one of the following ways: from FB allocation, if FB is allocated by demand from the demand Hit FB, if it is allocated by hardware or software prefetch.Note: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulted from any request type. 48H 01H L1D_PEND_MISS.PENDING_CYCL Counts duration of L1D miss outstanding in cycles. ES CounterMask=1 CMSK1 48H 01H L1D_PEND_MISS.PENDING_CYCL Cycles with L1D load Misses outstanding from any ES_ANY thread on physical core. CounterMask=1 AnyThread=1 CMSK1, AnyT 48H 02H L1D_PEND_MISS.FB_FULL 49H 01H DTLB_STORE_MISSES.MISS_CAU Counts demand data stores that caused a page walk of SES_A_WALK any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed. 49H 02H DTLB_STORE_MISSES.WALK_CO Counts demand data stores that caused a completed MPLETED_4K page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault. 49H 04H DTLB_STORE_MISSES.WALK_CO Counts demand data stores that caused a completed MPLETED_2M_4M page walk (2M and 4M page sizes). This implies it missed in all TLB levels. The page walk can end with or without a fault. 49H 08H DTLB_STORE_MISSES.WALK_CO Counts store misses in all DTLB levels that cause a MPLETED_1G completed page walk (1G page size). The page walk can end with or without a fault. 49H 0EH DTLB_STORE_MISSES.WALK_CO Counts demand data stores that caused a completed MPLETED page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault. 49H 10H DTLB_STORE_MISSES.WALK_PE Counts 1 per cycle for each PMH that is busy with a NDING page walk for a store. EPT page walk duration are excluded in Skylake microarchitecture. 49H 10H DTLB_STORE_MISSES.WALK_AC Counts cycles when at least one PMH (Page Miss TIVE Handler) is busy with a page walk for a store. 49H 20H DTLB_STORE_MISSES.STLB_HIT Stores that miss the DTLB (Data TLB) and hit the STLB (2nd Level TLB). Number of times a request needed a FB (Fill Buffer) entry but there was no entry available for it. A request includes cacheable/uncacheable demands that are load, store or SW prefetch instructions. CounterMask=1 CMSK1 Vol. 3B 19-9 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 4CH 01H LOAD_HIT_PRE.SW_PF Counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by ASM (Assembly File) inspection of the nearby instructions. 4FH 10H EPT.WALK_PENDING Counts cycles for each PMH (Page Miss Handler) that is busy with an EPT (Extended Page Table) walk for any request type. 51H 01H L1D.REPLACEMENT Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace. 54H 01H TX_MEM.ABORT_CONFLICT Number of times a TSX line had a cache conflict. 54H 02H TX_MEM.ABORT_CAPACITY Number of times a transactional abort was signaled due to a data capacity limitation for transactional reads or writes. 54H 04H TX_MEM.ABORT_HLE_STORE_T Number of times a TSX Abort was triggered due to a O_ELIDED_LOCK non-release/commit store to lock. 54H 08H TX_MEM.ABORT_HLE_ELISION_ BUFFER_NOT_EMPTY Number of times a TSX Abort was triggered due to commit but Lock Buffer not empty. 54H 10H TX_MEM.ABORT_HLE_ELISION_ BUFFER_MISMATCH Number of times a TSX Abort was triggered due to release/commit but data and address mismatch. 54H 20H TX_MEM.ABORT_HLE_ELISION_ Number of times a TSX Abort was triggered due to BUFFER_UNSUPPORTED_ALIGN attempting an unsupported alignment from Lock MENT Buffer. 54H 40H TX_MEM.HLE_ELISION_BUFFER _FULL Number of times we could not allocate Lock Buffer. 5DH 01H TX_EXEC.MISC1 Unfriendly TSX abort triggered by a flowmarker. 5DH 02H TX_EXEC.MISC2 Unfriendly TSX abort triggered by a vzeroupper instruction. 5DH 04H TX_EXEC.MISC3 Unfriendly TSX abort triggered by a nest count that is too deep. 5DH 08H TX_EXEC.MISC4 RTM region detected inside HLE. 5DH 10H TX_EXEC.MISC5 Counts the number of times an HLE XACQUIRE instruction was executed inside an RTM transactional region. 5EH 01H RS_EVENTS.EMPTY_CYCLES Counts cycles during which the reservation station (RS) is empty for the thread.; Note: In ST-mode, not active thread should drive 0. This is usually caused by severely costly branch mispredictions, or allocator/FE issues. 5EH 01H RS_EVENTS.EMPTY_END Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate front-end Latency Bound issues. 19-10 Vol. 3B Comment EdgeDetect=1 CounterMask=1 Invert=1 CMSK1, INV PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 60H 01H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding Demand DING.DEMAND_DATA_RD Data Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor. See the corresponding Umask under OFFCORE_REQUESTS. Note: A prefetch promoted to Demand is counted from the promotion point. 60H 01H OFFCORE_REQUESTS_OUTSTAN Counts cycles when offcore outstanding Demand Data CounterMask=1 DING.CYCLES_WITH_DEMAND_D Read transactions are present in the super queue (SQ). CMSK1 ATA_RD A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). 60H 01H OFFCORE_REQUESTS_OUTSTAN Cycles with at least 6 offcore outstanding Demand Data CounterMask=6 DING.DEMAND_DATA_RD_GE_6 Read transactions in uncore queue. CMSK6 60H 02H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding Code Reads DING.DEMAND_CODE_RD transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS. CounterMask=1 CMSK1 60H 02H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding Code Reads DING.CYCLES_WITH_DEMAND_C transactions in the super queue every cycle. The ODE_RD 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS. CMSK1 60H 04H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding RFO (store) DING.DEMAND_RFO transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS. CounterMask=1 CMSK1 60H 04H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding demand rfo CMSK1 DING.CYCLES_WITH_DEMAND_R Reads transactions in the super queue every cycle. The FO 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS. 60H 08H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding cacheable DING.ALL_DATA_RD Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS. 60H 08H OFFCORE_REQUESTS_OUTSTAN Counts cycles when offcore outstanding cacheable Core CounterMask=1 DING.CYCLES_WITH_DATA_RD Data Read transactions are present in the super queue. CMSK1 A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS. Vol. 3B 19-11 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic 60H 10H OFFCORE_REQUESTS_OUTSTAN Counts number of Offcore outstanding Demand Data DING.L3_MISS_DEMAND_DATA_ Read requests that miss L3 cache in the superQ every RD cycle. 60H 10H OFFCORE_REQUESTS_OUTSTAN Cycles with at least 1 Demand Data Read requests who CounterMask=1 DING.CYCLES_WITH_L3_MISS_D miss L3 cache in the superQ. CMSK1 EMAND_DATA_RD 60H 10H OFFCORE_REQUESTS_OUTSTAN Cycles with at least 6 Demand Data Read requests that CounterMask=6 DING.L3_MISS_DEMAND_DATA_ miss L3 cache in the superQ. CMSK6 RD_GE_6 79H 04H IDQ.MITE_UOPS Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB). 79H 04H IDQ.MITE_CYCLES Counts cycles during which uops are being delivered to CounterMask=1 Instruction Decode Queue (IDQ) from the MITE path. CMSK1 Counting includes uops that may 'bypass' the IDQ. 79H 08H IDQ.DSB_UOPS Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may ‘bypass’ the IDQ. 79H 08H IDQ.DSB_CYCLES Counts cycles during which uops are being delivered to CounterMask=1 Instruction Decode Queue (IDQ) from the Decode CMSK1 Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ. 79H 10H IDQ.MS_DSB_CYCLES Counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ. 79H 18H IDQ.ALL_DSB_CYCLES_4_UOPS Counts the number of cycles 4 uops were delivered to CounterMask=4 Instruction Decode Queue (IDQ) from the Decode CMSK4 Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. 79H 18H IDQ.ALL_DSB_CYCLES_ANY_UO PS Counts the number of cycles uops were delivered to CounterMask=1 Instruction Decode Queue (IDQ) from the Decode CMSK1 Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. 79H 20H IDQ.MS_MITE_UOPS Counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ. 79H 24H IDQ.ALL_MITE_CYCLES_4_UOPS Counts the number of cycles 4 uops were delivered to CounterMask=4 the Instruction Decode Queue (IDQ) from the MITE CMSK4 (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB). 19-12 Vol. 3B Description Comment CounterMask=1 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 79H 24H IDQ.ALL_MITE_CYCLES_ANY_UO Counts the number of cycles uops were delivered to CounterMask=1 PS the Instruction Decode Queue (IDQ) from the MITE CMSK1 (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB). 79H 30H IDQ.MS_CYCLES Counts cycles during which uops are being delivered to CounterMask=1 Instruction Decode Queue (IDQ) while the Microcode CMSK1 Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE. 79H 30H IDQ.MS_SWITCHES Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer. 79H 30H IDQ.MS_UOPS Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS. 80H 04H ICACHE_16B.IFDATA_STALL Cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. 83H 01H ICACHE_64B.IFTAG_HIT Instruction fetch tag lookups that hit in the instruction cache (L1I). Counts at 64-byte cache-line granularity. 83H 02H ICACHE_64B.IFTAG_MISS Instruction fetch tag lookups that miss in the instruction cache (L1I). Counts at 64-byte cache-line granularity. 83H 04H ICACHE_64B.IFTAG_STALL Cycles where a code fetch is stalled due to L1 instruction cache tag miss. 85H 01H ITLB_MISSES.MISS_CAUSES_A_ WALK Counts page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB, but the walk need not have completed. 85H 02H ITLB_MISSES.WALK_COMPLETE D_4K Counts completed page walks (4K page size) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault. 85H 04H ITLB_MISSES.WALK_COMPLETE D_2M_4M Counts completed page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault. 85H 08H ITLB_MISSES.WALK_COMPLETE D_1G Counts store misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault. 85H 0EH ITLB_MISSES.WALK_COMPLETE D Counts completed page walks (2M and 4M page sizes) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault. EdgeDetect=1 CounterMask=1 EDGE Vol. 3B 19-13 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 85H 10H ITLB_MISSES.WALK_PENDING Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake microarchitecture. 85H 10H ITLB_MISSES.WALK_ACTIVE Cycles when at least one PMH is busy with a page walk CounterMask=1 for code (instruction fetch) request. EPT page walk duration are excluded in Skylake microarchitecture. 85H 20H ITLB_MISSES.STLB_HIT Instruction fetch requests that miss the ITLB and hit the STLB. 87H 01H ILD_STALL.LCP Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CO RE Counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding “4 – x” ? when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQResource Allocation Table (RAT) pipe serves the other thread. b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions). c. Instruction Decode Queue (IDQ) delivers four uops. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Counts, on the per-thread basis, cycles when no uops LES_0_UOPS_DELIV.CORE are delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core =4. CounterMask=4 CMSK4 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Counts, on the per-thread basis, cycles when less than LES_LE_1_UOP_DELIV.CORE 1 uop is delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core >= 3. CounterMask=3 CMSK3 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Cycles with less than 2 uops delivered by the front end. CounterMask=2 LES_LE_2_UOP_DELIV.CORE CMSK2 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Cycles with less than 3 uops delivered by the front end. CounterMask=1 LES_LE_3_UOP_DELIV.CORE CMSK1 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Counts cycles FE delivered 4 uops or Resource LES_FE_WAS_OK Allocation Table (RAT) was stalling FE. A1H 01H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _0 least one uop is dispatched from the Reservation Station (RS) to port 0. A1H 02H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _1 least one uop is dispatched from the Reservation Station (RS) to port 1. A1H 04H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _2 least one uop is dispatched from the Reservation Station (RS) to port 2. 19-14 Vol. 3B Comment CounterMask=1 Invert=1 CMSK, INV PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment A1H 08H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _3 least one uop is dispatched from the Reservation Station (RS) to port 3. A1H 10H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _4 least one uop is dispatched from the Reservation Station (RS) to port 4. A1H 20H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _5 least one uop is dispatched from the Reservation Station (RS) to port 5. A1H 40H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _6 least one uop is dispatched from the Reservation Station (RS) to port 6. A1H 80H UOPS_DISPATCHED_PORT.PORT Counts, on the per-thread basis, cycles during which at _7 least one uop is dispatched from the Reservation Station (RS) to port 7. A2H 01H RESOURCE_STALLS.ANY Counts resource-related stall cycles. Reasons for stalls can be as follows: a. *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots). b. *any* u-arch structure got empty (like INT/SIMD FreeLists). c. FPU control word (FPCW), MXCSR.and others. This counts cycles that the pipeline back end blocked uop delivery from the front end. A2H 08H RESOURCE_STALLS.SB Counts allocation stall cycles caused by the store buffer (SB) being full. This counts cycles that the pipeline back end blocked uop delivery from the front end. A3H 01H CYCLE_ACTIVITY.CYCLES_L2_MI SS Cycles while L2 cache miss demand load is outstanding. CounterMask=1 CMSK1 A3H 02H CYCLE_ACTIVITY.CYCLES_L3_MI SS Cycles while L3 cache miss demand load is outstanding. CounterMask=2 CMSK2 A3H 04H CYCLE_ACTIVITY.STALLS_TOTAL Total execution stalls. CounterMask=4 CMSK4 A3H 05H CYCLE_ACTIVITY.STALLS_L2_MI SS Execution stalls while L2 cache miss demand load is outstanding. CounterMask=5 CMSK5 A3H 06H CYCLE_ACTIVITY.STALLS_L3_MI SS Execution stalls while L3 cache miss demand load is outstanding. CounterMask=6 CMSK6 A3H 08H CYCLE_ACTIVITY.CYCLES_L1D_M Cycles while L1 cache miss demand load is outstanding. CounterMask=8 ISS CMSK8 A3H 0CH CYCLE_ACTIVITY.STALLS_L1D_M Execution stalls while L1 cache miss demand load is ISS outstanding. CounterMask=12 CMSK12 A3H 10H CYCLE_ACTIVITY.CYCLES_MEM_ ANY Cycles while memory subsystem has an outstanding load. CounterMask=16 CMSK16 A3H 14H CYCLE_ACTIVITY.STALLS_MEM_ ANY Execution stalls while memory subsystem has an outstanding load. CounterMask=20 CMSK20 A6H 01H EXE_ACTIVITY.EXE_BOUND_0_P Counts cycles during which no uops were executed on ORTS all ports and Reservation Station (RS) was not empty. Vol. 3B 19-15 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description A6H 02H EXE_ACTIVITY.1_PORTS_UTIL Counts cycles during which a total of 1 uop was executed on all ports and Reservation Station (RS) was not empty. A6H 04H EXE_ACTIVITY.2_PORTS_UTIL Counts cycles during which a total of 2 uops were executed on all ports and Reservation Station (RS) was not empty. A6H 08H EXE_ACTIVITY.3_PORTS_UTIL Cycles total of 3 uops are executed on all ports and Reservation Station (RS) was not empty. A6H 10H EXE_ACTIVITY.4_PORTS_UTIL Cycles total of 4 uops are executed on all ports and Reservation Station (RS) was not empty. A6H 40H EXE_ACTIVITY.BOUND_ON_STO RES Cycles where the Store Buffer was full and no outstanding load. A8H 01H LSD.UOPS Number of uops delivered to the back-end by the LSD (Loop Stream Detector). A8H 01H LSD.CYCLES_ACTIVE Counts the cycles when at least one uop is delivered by CounterMask=1 the LSD (Loop-stream detector). CMSK1 A8H 01H LSD.CYCLES_4_UOPS Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). ABH 02H DSB2MITE_SWITCHES.PENALTY _CYCLES Counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) preallocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. MM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.Penalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0 to 2 cycles. AEH 01H ITLB.ITLB_FLUSH Counts the number of flushes of the big or small ITLB pages. Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific). B0H 01H OFFCORE_REQUESTS.DEMAND_ Counts the Demand Data Read requests sent to uncore. DATA_RD Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore. B0H 02H OFFCORE_REQUESTS.DEMAND_ Counts both cacheable and non-cacheable code read CODE_RD requests. B0H 04H OFFCORE_REQUESTS.DEMAND_ Counts the demand RFO (read for ownership) requests RFO including regular RFOs, locks, ItoM. 19-16 Vol. 3B Comment CounterMask=4 CMSK4 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment B0H 08H OFFCORE_REQUESTS.ALL_DATA Counts the demand and prefetch data reads. All Core _RD Data Reads include cacheable 'Demands' and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type. B0H 10H OFFCORE_REQUESTS.L3_MISS_ DEMAND_DATA_RD B0H 80H OFFCORE_REQUESTS.ALL_REQU Counts memory transactions reached the super queue ESTS including requests initiated by the core, all L3 prefetches, page walks, etc. B1H 01H UOPS_EXECUTED.THREAD Number of uops to be executed per-thread each cycle. B1H 01H UOPS_EXECUTED.STALL_CYCLE S Counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread. B1H 01H UOPS_EXECUTED.CYCLES_GE_1 Cycles where at least 1 uop was executed per-thread. _UOP_EXEC B1H 01H UOPS_EXECUTED.CYCLES_GE_2 Cycles where at least 2 uops were executed per-thread. CounterMask=2 _UOPS_EXEC CMSK2 B1H 01H UOPS_EXECUTED.CYCLES_GE_3 Cycles where at least 3 uops were executed per-thread. CounterMask=3 _UOPS_EXEC CMSK3 B1H 01H UOPS_EXECUTED.CYCLES_GE_4 Cycles where at least 4 uops were executed per-thread. CounterMask=4 _UOPS_EXEC CMSK4 B1H 02H UOPS_EXECUTED.CORE B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles at least 1 micro-op is executed from any thread _GE_1 on physical core. CounterMask=1 CMSK1 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles at least 2 micro-op is executed from any thread _GE_2 on physical core. CounterMask=2 CMSK2 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles at least 3 micro-op is executed from any thread _GE_3 on physical core. CounterMask=3 CMSK3 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles at least 4 micro-op is executed from any thread _GE_4 on physical core. CounterMask=4 CMSK4 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles with no micro-ops executed from any thread on _NONE physical core. CounterMask=1 Invert=1 CMSK1, INV B1H 10H UOPS_EXECUTED.X87 B2H 01H OFFCORE_REQUESTS_BUFFER.S Counts the number of cases when the offcore requests Q_FULL buffer cannot take more entries for the core. This can happen when the superqueue does not contain eligible entries, or when L1D writeback pending FIFO requests is full. Note: Writeback pending FIFO has six entries. BDH 01H TLB_FLUSH.DTLB_THREAD Counts the number of DTLB flush attempts of the thread-specific entries. BDH 20H TLB_FLUSH.STLB_ANY Counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, etc.). C0H 00H INST_RETIRED.ANY_P Counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two). Demand Data Read requests who miss L3 cache. CounterMask=1 Invert=1 CMSK, INV CounterMask=1 CMSK1 Number of uops executed from any thread. Counts the number of x87 uops executed. See Table 19-1. Vol. 3B 19-17 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C0H 01H INST_RETIRED.PREC_DIST A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled. Precise event capable Requires PEBS on General Counter 1(PDIR). C1H 3FH OTHER_ASSISTS.ANY Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists. C2H 01H UOPS_RETIRED.STALL_CYCLES This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops. C2H 01H UOPS_RETIRED.TOTAL_CYCLES Number of cycles using always true condition (uops_ret CounterMask=10 < 16) applied to non PEBS uops retired event. Invert=1 CMSK10, INV C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the retirement slots used. C3H 01H MACHINE_CLEARS.COUNT Number of machine clears (nukes) of any type. C3H 02H MACHINE_CLEARS.MEMORY_OR DERING Counts the number of memory ordering Machine Clears detected. Memory Ordering Machine Clears can result from one of the following: a. memory disambiguation, b. external snoop, or c. cross SMT-HW-thread snoop (stores) hitting load buffer. C3H 04H MACHINE_CLEARS.SMC Counts self-modifying code (SMC) detected, which causes a machine clear. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Counts all (macro) branch instructions retired. C4H 01H BR_INST_RETIRED.CONDITIONA L This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired. C4H 02H BR_INST_RETIRED.NEAR_CALL This is a non-precise version (that is, does not use Precise event capable. PEBS) of the event that counts both direct and indirect PS near call instructions retired. C4H 08H BR_INST_RETIRED.NEAR_RETU RN This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired. C4H 10H BR_INST_RETIRED.NOT_TAKEN This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAKE N This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired. Precise event capable. PS C4H 40H BR_INST_RETIRED.FAR_BRANC H This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired. Precise event capable. PS 19-18 Vol. 3B CounterMask=1 Invert=1 CMSK1, INV EdgeDetect=1 CounterMask=1 CMSK1, EDG Precise event capable. See Table 19-1. Precise event capable. PS Precise event capable. PS PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Counts all the retired branch instructions that were Precise event capable. mispredicted by the processor. A branch misprediction See Table 19-1. occurs when the processor incorrectly predicts the destination of the branch. When the misprediction is discovered at execution, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path. C5H 01H BR_MISP_RETIRED.CONDITIONA This is a non-precise version (that is, does not use Precise event capable. L PEBS) of the event that counts mispredicted conditional PS branch instructions retired. C5H 02H BR_MISP_RETIRED.NEAR_CALL Counts both taken and not taken retired mispredicted Precise event capable. direct and indirect near calls, including both register and memory indirect. C5H 20H BR_MISP_RETIRED.NEAR_TAKE N Number of near branch instructions retired that were mispredicted and taken. Precise event capable. PS C6H 01H FRONTEND_RETIRED.DSB_MISS Counts retired Instructions that experienced DSB (Decode stream buffer, i.e. the decoded instructioncache) miss. Precise event capable. C6H 01H FRONTEND_RETIRED.L1I_MISS Retired Instructions who experienced Instruction L1 Cache true miss. Precise event capable. C6H 01H FRONTEND_RETIRED.L2_MISS Retired Instructions who experienced Instruction L2 Cache true miss. Precise event capable. C6H 01H FRONTEND_RETIRED.ITLB_MISS Counts retired Instructions that experienced iTLB (Instruction TLB) true miss. C6H 01H FRONTEND_RETIRED.STLB_MIS S C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval Precise event capable. GE_2 where the front end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval Precise event capable. GE_4 where the front end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_8 back end after a front-end stall of at least 8 cycles. During this period the front end delivered no uops. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_16 back end after a front-end stall of at least 16 cycles. During this period the front end delivered no uops. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_32 back end after a front-end stall of at least 32 cycles. During this period the front end delivered no uops. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval GE_64 where the front end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall. Precise event capable. Precise event capable. Counts retired Instructions that experienced STLB (2nd Precise event capable. level TLB) true miss. Vol. 3B 19-19 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval GE_128 where the front end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval GE_256 where the front end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval GE_512 where the front end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall. Precise event capable. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the Precise event capable. GE_2_BUBBLES_GE_1 back end after the front end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issuepipeline slot while there was no RAT stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval Precise event capable. GE_2_BUBBLES_GE_2 where the front end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a backend stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval Precise event capable. GE_2_BUBBLES_GE_3 where the front end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a backend stall. C7H 01H FP_ARITH_INST_RETIRED.SCAL AR_DOUBLE Number of SSE/AVX computational scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. Software may treat each count as one DP FLOP. C7H 02H FP_ARITH_INST_RETIRED.SCAL AR_SINGLE Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. Software may treat each count as one SP FLOP. C7H 04H FP_ARITH_INST_RETIRED.128B Number of SSE/AVX computational 128-bit packed Software may treat _PACKED_DOUBLE double precision floating-point instructions retired. each count as two DP Each count represents 2 computations. Applies to SSE* FLOPs. and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. 19-20 Vol. 3B Description Comment PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C7H 08H FP_ARITH_INST_RETIRED.128B Number of SSE/AVX computational 128-bit packed Software may treat _PACKED_SINGLE single precision floating-point instructions retired. Each each count as four SP count represents 4 computations. Applies to SSE* and FLOPs. AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. C7H 10H FP_ARITH_INST_RETIRED.256B Number of SSE/AVX computational 256-bit packed Software may treat _PACKED_DOUBLE double precision floating-point instructions retired. each count as four DP Each count represents 4 computations. Applies to SSE* FLOPs. and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. C7H 20H FP_ARITH_INST_RETIRED.256B Number of SSE/AVX computational 256-bit packed Software may treat _PACKED_SINGLE single precision floating-point instructions retired. Each each count as eight count represents 8 computations. Applies to SSE* and SP FLOPs. AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. C7H 40H FP_ARITH_INST_RETIRED.512B Number of Packed Double-Precision FP arithmetic _PACKED_DOUBLE instructions (use operation multiplier of 8). Only applicable when AVX-512 is enabled. C7H 80H FP_ARITH_INST_RETIRED.512B Number of Packed Single-Precision FP arithmetic _PACKED_SINGLE instructions (use operation multiplier of 16). Only applicable when AVX-512 is enabled. C8H 01H HLE_RETIRED.START Number of times we entered an HLE region. Does not count nested transactions. C8H 02H HLE_RETIRED.COMMIT Number of times HLE commit succeeded. C8H 04H HLE_RETIRED.ABORTED Number of times HLE abort was triggered. C8H 08H HLE_RETIRED.ABORTED_MEM Number of times an HLE execution aborted due to various memory events (e.g., read/write capacity and conflicts). C8H 10H HLE_RETIRED.ABORTED_TIMER Number of times an HLE execution aborted due to hardware timer expiration. C8H 20H HLE_RETIRED.ABORTED_UNFRI ENDLY Number of times an HLE execution aborted due to HLEunfriendly instructions and certain unfriendly events (such as AD assists etc.). C8H 40H HLE_RETIRED.ABORTED_MEMT YPE Number of times an HLE execution aborted due to incompatible memory type. C8H 80H HLE_RETIRED.ABORTED_EVENT Number of times an HLE execution aborted due to S unfriendly events (such as interrupts). C9H 01H RTM_RETIRED.START Number of times we entered an RTM region. Does not count nested transactions. C9H 02H RTM_RETIRED.COMMIT Number of times RTM commit succeeded. C9H 04H RTM_RETIRED.ABORTED Number of times RTM abort was triggered. Comment Precise event capable. Precise event capable. Vol. 3B 19-21 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C9H 08H RTM_RETIRED.ABORTED_MEM Number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts). C9H 10H RTM_RETIRED.ABORTED_TIMER Number of times an RTM execution aborted due to uncommon conditions. C9H 20H RTM_RETIRED.ABORTED_UNFRI Number of times an RTM execution aborted due to ENDLY HLE-unfriendly instructions. C9H 40H RTM_RETIRED.ABORTED_MEMT Number of times an RTM execution aborted due to YPE incompatible memory type. C9H 80H RTM_RETIRED.ABORTED_EVENT Number of times an RTM execution aborted due to S none of the previous 4 categories (e.g. interrupt). CAH 1EH FP_ASSIST.ANY Counts cycles with any input and output SSE or x87 FP CounterMask=1 assist. If an input and output assist are detected on the CMSK1 same cycle the event increments by 1. CBH 01H HW_INTERRUPTS.RECEIVED Counts the number of hardware interruptions received by the processor. CCH 20H ROB_MISC_EVENTS.LBR_INSERT Increments when an entry is added to the Last Branch S Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_4 Counts loads when the latency from first dispatch to completion is greater than 4 cycles. Reported latency may be longer than just the memory latency. Precise event capable. Specify threshold in MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_8 Counts loads when the latency from first dispatch to completion is greater than 8 cycles. Reported latency may be longer than just the memory latency. Precise event capable. Specify threshold in MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_16 Counts loads when the latency from first dispatch to completion is greater than 16 cycles. Reported latency may be longer than just the memory latency. Precise event capable. Specify threshold in MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_32 Counts loads when the latency from first dispatch to completion is greater than 32 cycles. Reported latency may be longer than just the memory latency. Precise event capable. Specify threshold in MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_64 Counts loads when the latency from first dispatch to completion is greater than 64 cycles. Reported latency may be longer than just the memory latency. Precise event capable. Specify threshold in MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_128 Counts loads when the latency from first dispatch to Precise event capable. completion is greater than 128 cycles. Reported latency Specify threshold in may be longer than just the memory latency. MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_256 Counts loads when the latency from first dispatch to Precise event capable. completion is greater than 256 cycles. Reported latency Specify threshold in may be longer than just the memory latency. MSR 3F6H. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_512 Counts loads when the latency from first dispatch to Precise event capable. completion is greater than 512 cycles. Reported latency Specify threshold in may be longer than just the memory latency. MSR 3F6H. D0H 11H MEM_INST_RETIRED.STLB_MISS Retired load instructions that miss the STLB. _LOADS 19-22 Vol. 3B Comment Precise event capable. PSDLA PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description D0H 12H MEM_INST_RETIRED.STLB_MISS Retired store instructions that miss the STLB. _STORES Precise event capable. PSDLA D0H 21H MEM_INST_RETIRED.LOCK_LOA DS Precise event capable. PSDLA D0H 41H MEM_INST_RETIRED.SPLIT_LOA Counts retired load instructions that split across a DS cacheline boundary. Precise event capable. PSDLA D0H 42H MEM_INST_RETIRED.SPLIT_STO Counts retired store instructions that split across a RES cacheline boundary. Precise event capable. PSDLA D0H 81H MEM_INST_RETIRED.ALL_LOAD S All retired load instructions. Precise event capable. PSDLA D0H 82H MEM_INST_RETIRED.ALL_STOR ES All retired store instructions. Precise event capable. PSDLA D1H 01H MEM_LOAD_RETIRED.L1_HIT Counts retired load instructions with at least one uop Precise event capable. that hit in the L1 data cache. This event includes all SW PSDLA prefetches and lock instructions regardless of the data source. D1H 02H MEM_LOAD_RETIRED.L2_HIT Retired load instructions with L2 cache hits as data sources. Precise event capable. PSDLA D1H 04H MEM_LOAD_RETIRED.L3_HIT Counts retired load instructions with at least one uop that hit in the L3 cache. Precise event capable. PSDLA D1H 08H MEM_LOAD_RETIRED.L1_MISS Counts retired load instructions with at least one uop that missed in the L1 cache. Precise event capable. PSDLA D1H 10H MEM_LOAD_RETIRED.L2_MISS Retired load instructions missed L2 cache as data sources. Precise event capable. PSDLA D1H 20H MEM_LOAD_RETIRED.L3_MISS Counts retired load instructions with at least one uop that missed in the L3 cache. Precise event capable. PSDLA D1H 40H MEM_LOAD_RETIRED.FB_HIT Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. Precise event capable. PSDLA D2H 01H MEM_LOAD_L3_HIT_RETIRED.X SNP_MISS Retired load instructions which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. Precise event capable. PSDLA D2H 02H MEM_LOAD_L3_HIT_RETIRED.X SNP_HIT Retired load instructions which data sources were L3 and cross-core snoop hits in on-pkg core cache. Precise event capable. PSDLA D2H 04H MEM_LOAD_L3_HIT_RETIRED.X SNP_HITM Retired load instructions which data sources were HitM Precise event capable. responses from shared L3. PSDLA D2H 08H MEM_LOAD_L3_HIT_RETIRED.X SNP_NONE Retired load instructions which data sources were hits in L3 without snoops required. D3H 01H MEM_LOAD_L3_MISS_RETIRED. LOCAL_DRAM Retired load instructions which data sources missed L3 Precise event capable. but serviced from local DRAM. D3H 02H MEM_LOAD_L3_MISS_RETIRED. REMOTE_DRAM Retired load instructions which data sources missed L3 Precise event capable. but serviced from remote dram. D3H 04H MEM_LOAD_L3_MISS_RETIRED. REMOTE_HITM Retired load instructions whose data sources was remote HITM. D3H 08H MEM_LOAD_L3_MISS_RETIRED. REMOTE_FWD Retired load instructions whose data sources was forwarded from a remote cache. Retired load instructions with locked access. Comment Precise event capable. PSDLA Precise event capable. Vol. 3B 19-23 PERFORMANCE MONITORING EVENTS Table 19-3. Performance Events of the Processor Core Supported in Intel® Xeon® Processor Scalable Family with Skylake Microarchitecture and Processors Based on the Cascade Lake Product (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment D4H 04H MEM_LOAD_MISC_RETIRED.UC Retired instructions with at least 1 uncacheable load or Precise event capable. lock. E6H 01H BACLEARS.ANY Counts the number of times the front-end is resteered when it finds a branch instruction in a fetch line. This occurs for the first time a branch instruction is fetched or when the branch is not tracked by the BPU (Branch Prediction Unit) anymore. F0H 40H L2_TRANS.L2_WB Counts L2 writebacks that access L2 cache. F1H 1FH L2_LINES_IN.ALL Counts the number of L2 cache lines filling the L2. Counting does not cover rejects. F2H 01H L2_LINES_OUT.SILENT Counts the number of lines that are silently dropped by L2 cache when triggered by an L2 cache fill. These lines are typically in Shared state. A non-threaded event. F2H 02H L2_LINES_OUT.NON_SILENT Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3. Clean lines may either be allocated in L3 or dropped. F2H 04H L2_LINES_OUT.USELESS_PREF Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache. F2H 04H L2_LINES_OUT.USELESS_HWPF Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache. F4H 10H SQ_MISC.SPLIT_LOCK Counts the number of cache line split locks sent to the uncore. FEH 02H IDI_MISC.WB_UPGRADE Counts number of cache lines that are allocated and written back to L3 with the intention that they are more likely to be reused shortly. FEH 04H IDI_MISC.WB_DOWNGRADE Counts number of cache lines that are dropped and not written back to L3 as they are deemed to be less likely to be reused shortly. Table 19-4 lists performance events applicable to processors based on the Cascade Lake product. Table 19-4. Performance Event Addendum in Processors Based on the Cascade Lake Product Event Num. Umask Value D1H D1H Event Mask Mnemonic Description Comment 80H MEM_LOAD_RETIRED.LOCAL_ PMM_PS Retired load instructions with local Intel® Optane™ DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache (Memory Mode). Precise Event 80H MEM_LOAD_RETIRED.LOCAL_ PMM Retired load instructions with local Intel® Optane™ DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode), L3 and DRAM cache (Memory Mode). 19-24 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-4. Performance Event Addendum in Processors Based on the Cascade Lake Product Event Num. Umask Value Event Mask Mnemonic Description Comment D3H 10H MEM_LOAD_RETIRED.LOCAL_ PMM_PS Retired load instructions with remote Intel® Optane™ persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache (Memory Mode). Precise Event D3H 10H MEM_LOAD_RETIRED.LOCAL_ PMM_PS Retired load instructions with remote Intel® Optane™ persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache (Memory Mode). 19.3 PERFORMANCE MONITORING EVENTS FOR FUTURE INTEL® CORE™ PROCESSORS Future Intel® Core™ processors are based on the Ice Lake microarchitecture. They support the architectural performance monitoring events listed in Table 19-1. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Model-specific performance monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_7DH and 06_7EH. The comment column in Table 19-5 uses abbreviated letters to indicate additional conditions applicable to the Event Mask Mnemonic. For event umasks listed in Table 19-5 that do not show “AnyT”, users should refrain from programming “AnyThread =1” in IA32_PERF_EVTSELx. Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description 00H 01H INST_RETIRED.ANY Counts the number of X86 instructions retired - an Architectural PerfMon event. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter freeing up programmable counters to count other events. INST_RETIRED.ANY_P is counted by a programmable counter. 00H 01H INST_RETIRED.PREC_DIST A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled. Use on Fixed Counter 0. 00H 02H CPU_CLK_UNHALTED.THREAD Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Comment Vol. 3B 19-25 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 00H 03H CPU_CLK_UNHALTED.REF_TSC Counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'. After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case. 00H 04H TOPDOWN.SLOTS Counts the number of available slots for an unhalted logical processor. The event increments by machinewidth of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core. Software can use this event as the denominator for the top-level metrics of the Top-down Microarchitecture Analysis method. This event is counted on a designated fixed counter (Fixed Counter 3) and is an architectural event. 19-26 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 03H 02H LD_BLOCKS.STORE_FORWARD Counts the number of times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when: a. preceding store conflicts with the load (incomplete overlap), b. store forwarding is impossible due to u-arch limitations, c. preceding lock RMW operations are not forwarded, d. store has the no-forward bit set (uncacheable/pagesplit/masked stores), e. all-blocking stores are used (mostly, fences and port I/O), and others. The most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events. See the table of not supported store forwards in the Optimization Guide. 03H 08H LD_BLOCKS.NO_SR Counts the number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS _ALIAS Counts the number of times a load got blocked due to false dependencies in MOB due to partial compare on address. 08H 02H DTLB_LOAD_MISSES.WALK_COM Counts page walks completed due to demand data PLETED_4K loads whose address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. 08H 04H DTLB_LOAD_MISSES.WALK_COM Counts page walks completed due to demand data PLETED_2M_4M loads whose address translations missed in the TLB and were mapped to 2M/4M pages. The page walks can end with or without a page fault. 08H 0EH DTLB_LOAD_MISSES.WALK_COM Counts demand data loads that caused a completed PLETED page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault. 08H 10H DTLB_LOAD_MISSES.WALK_PEN Counts the number of page walks outstanding for a DING demand load in the PMH (Page Miss Handler) each cycle. 08H 10H DTLB_LOAD_MISSES.WALK_ACT Counts cycles when at least one PMH (Page Miss IVE Handler) is busy with a page walk for a demand load. 08H 20H DTLB_LOAD_MISSES.STLB_HIT Counts loads that miss the DTLB (Data TLB) and hit the STLB (Second level TLB). 0DH 01H INT_MISC.RECOVERY_CYCLES Counts core cycles when the Resource allocator was stalled due to recovery from an earlier branch misprediction or machine clear event. 0DH 03H INT_MISC.ALL_RECOVERY_CYCL ES Counts cycles the back end cluster is recovering after a miss-speculation or a Store Buffer or Load Buffer drain stall. 0DH 80H INT_MISC.CLEAR_RESTEER_CYC LES Cycles after recovery from a branch misprediction or machine clear till the first uop is issued from the resteered path. Comment Vol. 3B 19-27 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0EH 01H UOPS_ISSUED.ANY Counts the number of uops that the Resource Allocation Table (RAT) issues to the Reservation Station (RS). 0EH 01H UOPS_ISSUED.STALL_CYCLES Counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread. 14H 09H ARITH.DIVIDER_ACTIVE Counts cycles when divide unit is busy executing divide or square root operations. Accounts for integer and floating-point operations. 24H 21H L2_RQSTS.DEMAND_DATA_RD_ Counts the number of demand Data Read requests that MISS miss L2 cache. Only not rejected loads are counted. 24H 22H L2_RQSTS.RFO_MISS Counts the RFO (Read-for-Ownership) requests that miss L2 cache. 24H 24H L2_RQSTS.CODE_RD_MISS Counts L2 cache misses when fetching instructions. 24H 27H L2_RQSTS.ALL_DEMAND_MISS Counts demand requests that miss L2 cache. 24H 28H L2_RQSTS.SWPF_MISS Counts Software prefetch requests that miss the L2 cache. This event accounts for PREFETCHNTA and PREFETCHT0/1/2 instructions. 24H C1H L2_RQSTS.DEMAND_DATA_RD_ Counts the number of demand Data Read requests HIT initiated by load instructions that hit L2 cache. 24H C2H L2_RQSTS.RFO_HIT Counts the RFO (Read-for-Ownership) requests that hit L2 cache. 24H C4H L2_RQSTS.CODE_RD_HIT Counts L2 cache hits when fetching instructions, code reads. 24H C8H L2_RQSTS.SWPF_HIT Counts Software prefetch requests that hit the L2 cache. This event accounts for PREFETCHNTA and PREFETCHT0/1/2 instructions. 24H E1H L2_RQSTS.ALL_DEMAND_DATA _RD Counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only nonrejected loads are counted. 24H E2H L2_RQSTS.ALL_RFO Counts the total number of RFO (read for ownership) requests to L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. 24H E4H L2_RQSTS.ALL_CODE_RD Counts the total number of L2 code requests. 24H E7H L2_RQSTS.ALL_DEMAND_REFE RENCES Counts demand requests to L2 cache. 28H 07H CORE_POWER.LVL0_TURBO_LIC Counts Core cycles where the core was running with ENSE power-delivery for baseline license level 0. This includes non-AVX codes, SSE, AVX 128-bit, and lowcurrent AVX 256-bit codes. 28H 18H CORE_POWER.LVL1_TURBO_LIC Counts Core cycles where the core was running with ENSE power-delivery for license level 1. This includes high current AVX 256-bit instructions as well as low current AVX 512-bit instructions. 19-28 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value 28H 20H CORE_POWER.LVL2_TURBO_LIC Core cycles where the core was running with powerENSE delivery for license level 2 (introduced in Skylake Server microarchitecture). This includes high current AVX 512-bit instructions. 32H 01H SW_PREFETCH_ACCESS.NTA Counts the number of PREFETCHNTA instructions executed. 32H 02H SW_PREFETCH_ACCESS.T0 Counts the number of PREFETCHT0 instructions executed. 32H 04H SW_PREFETCH_ACCESS.T1_T2 Counts the number of PREFETCHT1 or PREFETCHT2 instructions executed. 32H 08H SW_PREFETCH_ACCESS.PREFET Counts the number of PREFETCHW instructions CHW executed. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P 3CH 01H CPU_CLK_UNHALTED.REF_XCLK Counts core crystal clock cycles when the thread is unhalted. 3CH 02H CPU_CLK_UNHALTED.ONE_THR EAD_ACTIVE Counts Core crystal clock cycles when current thread is unhalted and the other thread is halted. 48H 01H L1D_PEND_MISS.PENDING Counts number of L1D misses that are outstanding in each cycle, that is each cycle the number of Fill Buffers (FB) outstanding required by Demand Reads. FB either is held by demand loads, or it is held by non-demand loads and gets hit at least once by demand. The valid outstanding interval is defined until the FB deallocation by one of the following ways: from FB allocation, if FB is allocated by demand from the demand Hit FB, if it is allocated by hardware or software prefetch. Note: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulted from any request type. 48H 01H L1D_PEND_MISS.PENDING_CYCL Counts duration of L1D miss outstanding in cycles. ES 48H 02H L1D_PEND_MISS.FB_FULL 48H 02H L1D_PEND_MISS.FB_FULL_PERI Counts number of phases a demand request has waited ODS due to L1D Fill Buffer (FB) unavailability. Demand requests include cacheable/uncacheable demand load, store, lock or SW prefetch accesses. 48H 04H L1D_PEND_MISS.L2_STALL Event Mask Mnemonic Description Comment This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. Counts number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability. Demand requests include cacheable/uncacheable demand load, store, lock or SW prefetch accesses. Counts number of cycles a demand request has waited due to L1D due to lack of L2 resources. Demand requests include cacheable/uncacheable demand load, store, lock or SW prefetch accesses. Vol. 3B 19-29 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value 49H 02H DTLB_STORE_MISSES.WALK_CO Counts page walks completed due to demand data MPLETED_4K stores whose address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. 49H 04H DTLB_STORE_MISSES.WALK_CO Counts page walks completed due to demand data MPLETED_2M_4M stores whose address translations missed in the TLB and were mapped to 2M/4M pages. The page walks can end with or without a page fault. 49H 0EH DTLB_STORE_MISSES.WALK_CO Counts demand data stores that caused a completed MPLETED page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault. 49H 10H DTLB_STORE_MISSES.WALK_PE Counts the number of page walks outstanding for a NDING store in the PMH (Page Miss Handler) each cycle. 49H 10H DTLB_STORE_MISSES.WALK_AC Counts cycles when at least one PMH (Page Miss TIVE Handler) is busy with a page walk for a store. 49H 20H DTLB_STORE_MISSES.STLB_HIT Counts stores that miss the DTLB (Data TLB) and hit the STLB (2nd Level TLB). 4CH 01H LOAD_HIT_PREFETCH.SWPF Counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by ASM (Assembly File) inspection of the nearby instructions. 51H 01H L1D.REPLACEMENT Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace. 54H 01H TX_MEM.ABORT_CONFLICT Counts the number of times a TSX line had a cache conflict. 54H 02H TX_MEM.ABORT_CAPACITY_WRI Speculatively counts the number Transactional TE Synchronization Extensions (TSX) Aborts due to a data capacity limitation for transactional writes. 54H 04H TX_MEM.ABORT_HLE_STORE_T Counts the number of times a TSX Abort was triggered O_ELIDED_LOCK due to a non-release/commit store to lock. 54H 08H TX_MEM.ABORT_HLE_ELISION_ BUFFER_NOT_EMPTY Counts the number of times a TSX Abort was triggered due to commit but Lock Buffer not empty. 54H 10H TX_MEM.ABORT_HLE_ELISION_ BUFFER_MISMATCH Counts the number of times a TSX Abort was triggered due to release/commit but data and address mismatch. 54H 20H TX_MEM.ABORT_HLE_ELISION_ Counts the number of times a TSX Abort was triggered BUFFER_UNSUPPORTED_ALIGN due to attempting an unsupported alignment from Lock MENT Buffer. 54H 40H TX_MEM.HLE_ELISION_BUFFER _FULL Counts the number of times we could not allocate Lock Buffer. 5DH 02H TX_EXEC.MISC2 Counts unfriendly TSX abort triggered by a vzeroupper instruction. 5DH 04H TX_EXEC.MISC3 Counts unfriendly TSX abort triggered by a nest count that is too deep. 19-30 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 5EH 01H RS_EVENTS.EMPTY_CYCLES Counts cycles during which the Reservation Station (RS) is empty for this logical processor. This is usually caused when the front-end pipeline runs into starvation periods (e.g., branch mispredictions or i-cache misses). 5EH 01H RS_EVENTS.EMPTY_END Counts end of periods where the Reservation Station (RS) was empty. Could be useful to closely sample on front-end latency issues (see the FRONTEND_RETIRED event of designated precise events). 60H 04H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding demand rfo DING.CYCLES_WITH_DEMAND_R Reads transactions in the super queue every cycle. The FO 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS. 60H 08H OFFCORE_REQUESTS_OUTSTAN Counts the number of offcore outstanding cacheable DING.ALL_DATA_RD Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS. 60H 08H OFFCORE_REQUESTS_OUTSTAN Counts cycles when offcore outstanding cacheable Core DING.CYCLES_WITH_DATA_RD Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS. 79H 04H IDQ.MITE_UOPS Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. This also means that uops are not being delivered from the Decode Stream Buffer (DSB). 79H 04H IDQ.MITE_CYCLES_OK Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB). 79H 04H IDQ.MITE_CYCLES_ANY Counts the number of cycles uops were delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB). 79H 08H IDQ.DSB_UOPS Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. 79H 08H IDQ.DSB_CYCLES_OK Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB). 79H 08H IDQ.DSB_CYCLES_ANY Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Comment Vol. 3B 19-31 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 79H 30H IDQ.MS_SWITCHES Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer. 79H 30H IDQ.MS_UOPS Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS. 79H 30H IDQ.MS_CYCLES_ANY Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE. 80H 04H ICACHE_16B.IFDATA_STALL Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. 83H 01H ICACHE_64B.IFTAG_HIT Counts instruction fetch tag lookups that hit in the instruction cache (L1I). Counts at 64-byte cache-line granularity. Accounts for both cacheable and uncacheable accesses. 83H 02H ICACHE_64B.IFTAG_MISS Counts instruction fetch tag lookups that miss in the instruction cache (L1I). Counts at 64-byte cache-line granularity. Accounts for both cacheable and uncacheable accesses. 83H 04H ICACHE_64B.IFTAG_STALL Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. 85H 02H ITLB_MISSES.WALK_COMPLETE D_4K Counts completed page walks (4K page size) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault. 85H 04H ITLB_MISSES.WALK_COMPLETE D_2M_4M Counts code misses in all ITLB (Instruction TLB) levels that caused a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault. 85H 0EH ITLB_MISSES.WALK_COMPLETE D Counts completed page walks (2M and 4M page sizes) caused by a code fetch. This implies it missed in the ITLB (Instruction TLB) and further levels of TLB. The page walk can end with or without a fault. 85H 10H ITLB_MISSES.WALK_PENDING Counts the number of page walks outstanding for an outstanding code (instruction fetch) request in the PMH (Page Miss Handler) each cycle. 85H 10H ITLB_MISSES.WALK_ACTIVE Counts cycles when at least one PMH (Page Miss Handler) is busy with a page walk for a code (instruction fetch) request. 85H 20H ITLB_MISSES.STLB_HIT Counts instruction fetch requests that miss the ITLB (Instruction TLB) and hit the STLB (Second-level TLB). 19-32 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 87H 01H ILD_STALL.LCP Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CO RE Counts the number of uops not delivered to by the Instruction Decode Queue (IDQ) to the back-end of the pipeline when there were no back-end stalls. This event counts for one SMT thread in a given cycle. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Counts the number of cycles when no uops were LES_0_UOPS_DELIV.CORE delivered by the Instruction Decode Queue (IDQ) to the back-end of the pipeline when there was no back-end stalls. This event counts for one SMT thread in a given cycle. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Counts the number of cycles when the optimal number LES_FE_WAS_OK of uops were delivered by the Instruction Decode Queue (IDQ) to the back-end of the pipeline when there was no back-end stalls. This event counts for one SMT thread in a given cycle. A1H 01H UOPS_DISPATCHED.PORT_0 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 0. A1H 02H UOPS_DISPATCHED.PORT_1 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 1. A1H 04H UOPS_DISPATCHED.PORT_2_3 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to ports 2 and 3. A1H 10H UOPS_DISPATCHED.PORT_4_9 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to ports 5 and 9. A1H 20H UOPS_DISPATCHED.PORT_5 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 5. A1H 40H UOPS_DISPATCHED.PORT_6 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 6. A1H 80H UOPS_DISPATCHED.PORT_7_8 Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to ports 7 and 8. A2H 02H RESOURCE_STALLS.SCOREBOAR Counts cycles where the pipeline is stalled due to D serializing operations. A2H 08H RESOURCE_STALLS.SB Counts allocation stall cycles caused by the store buffer (SB) being full. This counts cycles that the pipeline backend blocked uop delivery from the front-end. A3H 01H CYCLE_ACTIVITY.CYCLES_L2_MI SS Cycles while L2 cache miss demand load is outstanding. Comment Vol. 3B 19-33 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description A3H 02H CYCLE_ACTIVITY.CYCLES_L3_MI SS Cycles while L3 cache miss demand load is outstanding. A3H 04H CYCLE_ACTIVITY.STALLS_TOTAL Total execution stalls. A3H 05H CYCLE_ACTIVITY.STALLS_L2_MI SS Execution stalls while L2 cache miss demand load is outstanding. A3H 06H CYCLE_ACTIVITY.STALLS_L3_MI SS Execution stalls while L3 cache miss demand load is outstanding. A3H 08H CYCLE_ACTIVITY.CYCLES_L1D_M Cycles while L1 cache miss demand load is outstanding. ISS A3H 0CH CYCLE_ACTIVITY.STALLS_L1D_M Execution stalls while L1 cache miss demand load is ISS outstanding. A3H 10H CYCLE_ACTIVITY.CYCLES_MEM_ ANY Cycles while memory subsystem has an outstanding load. A3H 14H CYCLE_ACTIVITY.STALLS_MEM_ ANY Execution stalls while memory subsystem has an outstanding load. A4H 01H TOPDOWN.SLOTS_P Counts the number of available slots for an unhalted logical processor. The event increments by machinewidth of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core. A4H 02H TOPDOWN.BACKEND_BOUND_S LOTS Issue slots where no uops were being issued due to lack of back end resources. A6H 02H EXE_ACTIVITY.1_PORTS_UTIL Counts cycles during which a total of 1 uop was executed on all ports and Reservation Station (RS) was not empty. A6H 04H EXE_ACTIVITY.2_PORTS_UTIL Counts cycles during which a total of 2 uops were executed on all ports and Reservation Station (RS) was not empty. A6H 40H EXE_ACTIVITY.BOUND_ON_STO RES Counts cycles where the Store Buffer was full and no loads caused an execution stall. A6H 80H EXE_ACTIVITY.EXE_BOUND_0_P Counts cycles during which no uops were executed on ORTS all ports and Reservation Station (RS) was not empty. A8H 01H LSD.UOPS Counts the number of uops delivered to the back-end by the LSD (Loop Stream Detector). A8H 01H LSD.CYCLES_ACTIVE Counts the cycles when at least one uop is delivered by the LSD (Loop-stream detector). A8H 01H LSD.CYCLES_OK Counts the cycles when optimal number of uops is delivered by the LSD (Loop-stream detector). ABH 02H DSB2MITE_SWITCHES.PENALTY _CYCLES Decode Stream Buffer (DSB) is a Uop-cache that holds translations of previously fetched instructions that were decoded by the legacy x86 decode pipeline (MITE). This event counts fetch penalty cycles when a transition occurs from DSB to MITE. AEH 01H ITLB.ITLB_FLUSH Counts the number of flushes of the big or small ITLB pages. Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific). 19-34 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value B0H 01H OFFCORE_REQUESTS.DEMAND_ Counts the Demand Data Read requests sent to uncore. DATA_RD Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore. B0H 04H OFFCORE_REQUESTS.DEMAND_ Counts the demand RFO (read for ownership) requests RFO including regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DATA Counts the demand and prefetch data reads. All Core _RD Data Reads include cacheable 'Demands' and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type. B0H 10H OFFCORE_REQUESTS.L3_MISS_ DEMAND_DATA_RD B0H 80H OFFCORE_REQUESTS.ALL_REQU Counts memory transactions reached the super queue ESTS including requests initiated by the core, all L3 prefetches, page walks, etc. B1H 01H UOPS_EXECUTED.THREAD Counts the number of uops to be executed per-thread each cycle. B1H 01H UOPS_EXECUTED.STALL_CYCLE S Counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread. B1H 01H UOPS_EXECUTED.CYCLES_GE_1 Cycles where at least 1 uop was executed per-thread. B1H 01H UOPS_EXECUTED.CYCLES_GE_2 Cycles where at least 2 uops were executed per-thread. B1H 01H UOPS_EXECUTED.CYCLES_GE_3 Cycles where at least 3 uops were executed per-thread. B1H 01H UOPS_EXECUTED.CYCLES_GE_4 Cycles where at least 4 uops were executed per-thread. B1H 02H UOPS_EXECUTED.CORE B1H 02H UOPS_EXECUTED.CORE_CYCLES Counts cycles when at least 1 micro-op is executed _GE_1 from any thread on physical core. B1H 02H UOPS_EXECUTED.CORE_CYCLES Counts cycles when at least 2 micro-op is executed _GE_2 from any thread on physical core. B1H 02H UOPS_EXECUTED.CORE_CYCLES Counts cycles when at least 3 micro-op is executed _GE_3 from any thread on physical core. B1H 02H UOPS_EXECUTED.CORE_CYCLES Counts cycles when at least 4 micro-op is executed _GE_4 from any thread on physical core. B1H 10H UOPS_EXECUTED.X87 Counts the number of x87 uops executed. BDH 01H TLB_FLUSH.DTLB_THREAD Counts the number of DTLB flush attempts of the thread-specific entries. BDH 20H TLB_FLUSH.STLB_ANY Counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, etc.). C0H 00H INST_RETIRED.ANY_P Counts the number of X86 instructions retired - an Architectural PerfMon event. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter freeing up programmable counters to count other events. INST_RETIRED.ANY_P is counted by a programmable counter. C1H 02H ASSISTS.FP Counts all microcode Floating Point assists. Event Mask Mnemonic Description Comment Demand Data Read requests who miss L3 cache. Counts the number of uops executed from any thread. Vol. 3B 19-35 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C1H 07H ASSISTS.ANY Counts the number of occurrences where a microcode assist is invoked by hardware Examples include AD (page Access Dirty), FP and AVX related assists. C2H 02H UOPS_RETIRED.TOTAL_CYCLES Counts the number of cycles using always true condition (uops_ret & amp; lt; 16) applied to non PEBS uops retired event. C2H 02H UOPS_RETIRED.SLOTS Counts the retirement slots used each cycle. C3H 01H MACHINE_CLEARS.COUNT Counts the number of machine clears (nukes) of any type. C3H 02H MACHINE_CLEARS.MEMORY_OR DERING Counts the number of Machine Clears detected dye to memory ordering. Memory Ordering Machine Clears may apply when a memory read may not conform to the memory ordering rules of the x86 architecture. C3H 04H MACHINE_CLEARS.SMC Counts self-modifying code (SMC) detected, which causes a machine clear. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Counts all branch instructions retired. C4H 01H BR_INST_RETIRED.COND_TAKE N Counts taken conditional branch instructions retired. C4H 02H BR_INST_RETIRED.NEAR_CALL Counts both direct and indirect near call instructions retired. C4H 08H BR_INST_RETIRED.NEAR_RETU RN Counts return instructions retired. C4H 10H BR_INST_RETIRED.COND_NTAK EN Counts not taken branch instructions retired. C4H 11H BR_INST_RETIRED.COND Counts conditional branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAKE N Counts taken branch instructions retired. C4H 40H BR_INST_RETIRED.FAR_BRANC H Counts far branch instructions retired. C4H 80H BR_INST_RETIRED.INDIRECT Counts all indirect branch instructions retired (excluding RETs. TSX aborts is considered indirect branch). C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Counts all the retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor incorrectly predicts the destination of the branch. When the misprediction is discovered at execution, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path. C5H 01H BR_MISP_RETIRED.COND_TAKE N Counts taken conditional mispredicted branch instructions retired. C5H 11H BR_MISP_RETIRED.COND Counts mispredicted conditional branch instructions retired. C5H 20H BR_MISP_RETIRED.NEAR_TAKE N Counts number of near branch instructions retired that were mispredicted and taken. 19-36 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C5H 80H BR_MISP_RETIRED.INDIRECT Counts all miss-predicted indirect branch instructions retired (excluding RETs. TSX aborts is considered indirect branch). C6H 01H FRONTEND_RETIRED.DSB_MISS Counts retired Instructions that experienced DSB (Decode stream buffer, i.e., the decoded instructioncache) miss. C6H 01H FRONTEND_RETIRED.L1I_MISS Counts retired Instructions who experienced Instruction L1 Cache true miss. C6H 01H FRONTEND_RETIRED.L2_MISS Counts retired Instructions who experienced Instruction L2 Cache true miss. C6H 01H FRONTEND_RETIRED.ITLB_MISS Counts retired Instructions that experienced iTLB (Instruction TLB) true miss. C6H 01H FRONTEND_RETIRED.STLB_MIS S C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_2 interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a backend stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_4 interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a backend stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_8 back-end after a front-end stall of at least 8 cycles. During this period the front-end delivered no uops. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_16 back-end after a front-end stall of at least 16 cycles. During this period the front-end delivered no uops. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_32 back-end after a front-end stall of at least 32 cycles. During this period the front-end delivered no uops. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_64 interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_128 interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_256 interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall. C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are fetched after an GE_512 interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall. Comment Counts retired Instructions that experienced STLB (2nd level TLB) true miss. Vol. 3B 19-37 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value C6H 01H FRONTEND_RETIRED.LATENCY_ Counts retired instructions that are delivered to the GE_2_BUBBLES_GE_1 back-end after the front-end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issuepipeline slot while there was no RAT stall. C7H 01H FP_ARITH_INST_RETIRED.SCAL AR_DOUBLE Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 02H FP_ARITH_INST_RETIRED.SCAL AR_SINGLE Counts number of SSE/AVX computational scalar single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 04H FP_ARITH_INST_RETIRED.128B Counts number of SSE/AVX computational 128-bit _PACKED_DOUBLE packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 2 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 08H FP_ARITH_INST_RETIRED.128B Counts number of SSE/AVX computational 128-bit _PACKED_SINGLE packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 10H FP_ARITH_INST_RETIRED.256B Counts number of SSE/AVX computational 256-bit _PACKED_DOUBLE packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. 19-38 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value C7H 20H FP_ARITH_INST_RETIRED.256B Counts number of SSE/AVX computational 256-bit _PACKED_SINGLE packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 40H FP_ARITH_INST_RETIRED.512B Counts number of SSE/AVX computational 512-bit _PACKED_DOUBLE packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 RANGE FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C7H 80H FP_ARITH_INST_RETIRED.512B Counts number of SSE/AVX computational 512-bit _PACKED_SINGLE packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 16 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 RANGE FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. C8H 01H HLE_RETIRED.START Counts the number of times we entered an HLE region. Does not count nested transactions. C8H 02H HLE_RETIRED.COMMIT Counts the number of times HLE commit succeeded. C8H 04H HLE_RETIRED.ABORTED Counts the number of times HLE abort was triggered. C8H 08H HLE_RETIRED.ABORTED_MEM Counts the number of times an HLE execution aborted due to various memory events (e.g., read/write capacity and conflicts). C8H 20H HLE_RETIRED.ABORTED_UNFRI ENDLY Counts the number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.). C8H 80H HLE_RETIRED.ABORTED_EVENT Counts the number of times an HLE execution aborted S due to unfriendly events (such as interrupts). C9H 01H RTM_RETIRED.START Counts the number of times we entered an RTM region. Does not count nested transactions. C9H 02H RTM_RETIRED.COMMIT Counts the number of times RTM commit succeeded. C9H 04H RTM_RETIRED.ABORTED Counts the number of times RTM abort was triggered. C9H 08H RTM_RETIRED.ABORTED_MEM Counts the number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts). Event Mask Mnemonic Description Comment Vol. 3B 19-39 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value C9H 20H RTM_RETIRED.ABORTED_UNFRI Counts the number of times an RTM execution aborted ENDLY due to HLE-unfriendly instructions. C9H 40H RTM_RETIRED.ABORTED_MEMT Counts the number of times an RTM execution aborted YPE due to incompatible memory type. C9H 80H RTM_RETIRED.ABORTED_EVENT Counts the number of times an RTM execution aborted S due to none of the previous 4 categories (e.g., interrupt). CCH 20H MISC_RETIRED.LBR_INSERTS Increments when an entry is added to the Last Branch Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT. CCH 40H MISC_RETIRED.PAUSE_INST Counts number of retired PAUSE instructions (that do not end up with a VM Exit to the VMM; TSX aborted Instructions may be counted). CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_4 Counts randomly selected loads when the latency from first dispatch to completion is greater than 4 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_8 Counts randomly selected loads when the latency from first dispatch to completion is greater than 8 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_16 Counts randomly selected loads when the latency from first dispatch to completion is greater than 16 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_32 Counts randomly selected loads when the latency from first dispatch to completion is greater than 32 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_64 Counts randomly selected loads when the latency from first dispatch to completion is greater than 64 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_128 Counts randomly selected loads when the latency from first dispatch to completion is greater than 128 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_256 Counts randomly selected loads when the latency from first dispatch to completion is greater than 256 cycles. Reported latency may be longer than just the memory latency. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY_GT_512 Counts randomly selected loads when the latency from first dispatch to completion is greater than 512 cycles. Reported latency may be longer than just the memory latency. DOH 11H MEM_INST_RETIRED.STLB_MISS Counts retired load instructions that true miss the _LOADS STLB. 19-40 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value D0H 12H MEM_INST_RETIRED.STLB_MISS Counts retired store instructions that true miss the _STORES STLB. DOH 21H MEM_INST_RETIRED.LOCK_LOA DS D0H 41H MEM_INST_RETIRED.SPLIT_LOA Counts retired load instructions that split across a DS cacheline boundary. DOH 42H MEM_INST_RETIRED.SPLIT_STO Counts retired store instructions that split across a RES cacheline boundary. D0H 81H MEM_INST_RETIRED.ALL_LOAD S Counts all retired load instructions. This event accounts for SW prefetch instructions for loads. D0H 82H MEM_INST_RETIRED.ALL_STOR ES Counts all retired store instructions. This event account for SW prefetch instructions and PREFETCHW instruction for stores. D1H 01H MEM_LOAD_RETIRED.L1_HIT Counts retired load instructions with at least one uop that hit in the L1 data cache. This event includes all SW prefetches and lock instructions regardless of the data source. D1H 02H MEM_LOAD_RETIRED.L2_HIT Counts retired load instructions with L2 cache hits as data sources. D1H 04H MEM_LOAD_RETIRED.L3_HIT Counts retired load instructions with at least one uop that hit in the L3 cache. D1H 08H MEM_LOAD_RETIRED.L1_MISS Counts retired load instructions with at least one uop that missed in the L1 cache. D1H 10H MEM_LOAD_RETIRED.L2_MISS Counts retired load instructions missed L2 cache as data sources. D1H 20H MEM_LOAD_RETIRED.L3_MISS Counts retired load instructions with at least one uop that missed in the L3 cache. D1H 40H MEM_LOAD_RETIRED.FB_HIT Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. D2H 01H MEM_LOAD_L3_HIT_RETIRED.X SNP_MISS Counts the retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache. D2H 02H MEM_LOAD_L3_HIT_RETIRED.X SNP_HIT Counts retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache. D2H 04H MEM_LOAD_L3_HIT_RETIRED.X SNP_HITM Counts retired load instructions whose data sources were HitM responses from shared L3. D2H 08H MEM_LOAD_L3_HIT_RETIRED.X SNP_NONE Counts retired load instructions whose data sources were hits in L3 without snoops required. E6H 01H BACLEARS.ANY Counts the number of times the front-end is resteered when it finds a branch instruction in a fetch line. This occurs for the first time a branch instruction is fetched or when the branch is not tracked by the BPU (Branch Prediction Unit) anymore. Event Mask Mnemonic Description Comment Counts retired load instructions with locked access. Vol. 3B 19-41 PERFORMANCE MONITORING EVENTS Table 19-5. Performance Events of the Processor Core Supported by Ice Lake Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description ECH 02H CPU_CLK_UNHALTED.DISTRIBU TED This event distributes cycle counts between active hyperthreads, i.e., those in C0. A hyperthread becomes inactive when it executes the HLT or MWAIT instructions. If all other hyperthreads are inactive (or disabled or do not exist), all counts are attributed to this hyperthread. To obtain the full count when the Core is active, sum the counts from each hyperthread. F1H 1FH L2_LINES_IN.ALL Counts the number of L2 cache lines filling the L2. Counting does not cover rejects. F4H 04H SQ_MISC.SQ_FULL Counts the cycles for which the thread is active and the superQ cannot take any more entries. Comment CMSK1: Counter Mask = 1 required; CMSK4: CounterMask = 4 required; CMSK6: CounterMask = 6 required; CMSK8: CounterMask = 8 required; CMSK10: CounterMask = 10 required; CMSK12: CounterMask = 12 required; CMSK16: CounterMask = 16 required; CMSK20: CounterMask = 20 required. AnyT: AnyThread = 1 required. INV: Invert = 1 required. EDG: EDGE = 1 required. PSDLA: Also supports PEBS and DataLA. PS: Also supports PEBS. 19.4 PERFORMANCE MONITORING EVENTS FOR 6TH GENERATION, 7TH GENERATION AND 8TH GENERATION INTEL® CORE™ PROCESSORS 6th Generation Intel® Core™ processors are based on the Skylake microarchitecture. They support the architectural performance monitoring events listed in Table 19-1. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Model-specific performance monitoring events in the processor core are listed in Table 19-6. The events in Table 19-6 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_4EH and 06_5EH. Table 19-12 lists performance events supporting Intel TSX (see Section 18.3.6.5) and the events are applicable to processors based on Skylake microarchitecture. Where Skylake microarchitecture implements TSX-related event semantics that differ from Table 19-12, they are listed in Table 19-7. 7th Generation Intel® Core™ processors are based on the Kaby Lake microarchitecture. 8th Generation Intel® Core™ processors are based on the Coffee Lake microarchitecture. Model-specific performance monitoring events in the processor core for these processors are listed in Table 19-6. The events in Table 19-6 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_8EH and 06_9EH. The comment column in Table 19-6 uses abbreviated letters to indicate additional conditions applicable to the Event Mask Mnemonic. For event umasks listed in Table 19-6 that do not show “AnyT”, users should refrain from programming “AnyThread =1” in IA32_PERF_EVTSELx. 19-42 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures Event Num. Umask Value Event Mask Mnemonic Description 03H 02H LD_BLOCKS.STORE_FORWARD Loads blocked by overlapping with store buffer that cannot be forwarded. 03H 08H LD_BLOCKS.NO_SR The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS _ALIAS False dependencies in MOB due to partial compare on address. 08H 01H DTLB_LOAD_MISSES.MISS_CAUS Load misses in all TLB levels that cause a page walk of ES_A_WALK any page size. 08H 0EH DTLB_LOAD_MISSES.WALK_COM Load misses in all TLB levels causes a page walk that PLETED completes. (All page sizes.) 08H 10H DTLB_LOAD_MISSES.WALK_PEN Counts 1 per cycle for each PMH that is busy with a DING page walk for a load. 08H 10H DTLB_LOAD_MISSES.WALK_ACT Cycles when at least one PMH is busy with a walk for a CMSK1 IVE load. 08H 20H DTLB_LOAD_MISSES.STLB_HIT Loads that miss the DTLB but hit STLB. 0DH 01H INT_MISC.RECOVERY_CYCLES Core cycles the allocator was stalled due to recovery from earlier machine clear event for this thread (for example, misprediction or memory order conflict). 0DH 01H INT_MISC.RECOVERY_CYCLES_A Core cycles the allocator was stalled due to recovery NY from earlier machine clear event for any logical thread in this processor core. 0DH 80H INT_MISC.CLEAR_RESTEER_CYC LES Comment AnyT Cycles the issue-stage is waiting for front end to fetch from resteered path following branch misprediction or machine clear events. 0EH 01H UOPS_ISSUED.ANY The number of uops issued by the RAT to RS. 0EH 01H UOPS_ISSUED.STALL_CYCLES Cycles when the RAT does not issue uops to RS for the CMSK1, INV thread. 0EH 02H UOPS_ISSUED.VECTOR_WIDTH_ Uops inserted at issue-stage in order to preserve upper MISMATCH bits of vector registers. 0EH 20H UOPS_ISSUED.SLOW_LEA Number of slow LEA or similar uops allocated. Such uop has 3 sources (for example, 2 sources + immediate) regardless of whether it is a result of LEA instruction or not. 14H 01H ARITH.FPU_DIVIDER_ACTIVE Cycles when divider is busy executing divide or square root operations. Accounts for FP operations including integer divides. 24H 21H L2_RQSTS.DEMAND_DATA_RD_ MISS Demand Data Read requests that missed L2, no rejects. 24H 22H L2_RQSTS.RFO_MISS RFO requests that missed L2. 24H 24H L2_RQSTS.CODE_RD_MISS L2 cache misses when fetching instructions. 24H 27H L2_RQSTS.ALL_DEMAND_MISS Demand requests that missed L2. 24H 38H L2_RQSTS.PF_MISS Requests from the L1/L2/L3 hardware prefetchers or load software prefetches that miss L2 cache. 24H 3FH L2_RQSTS.MISS All requests that missed L2. Vol. 3B 19-43 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value 24H 41H L2_RQSTS.DEMAND_DATA_RD_ Demand Data Read requests that hit L2 cache. HIT 24H 42H L2_RQSTS.RFO_HIT RFO requests that hit L2 cache. 24H 44H L2_RQSTS.CODE_RD_HIT L2 cache hits when fetching instructions. 24H D8H L2_RQSTS.PF_HIT Prefetches that hit L2. 24H E1H L2_RQSTS.ALL_DEMAND_DATA _RD All demand data read requests to L2. 24H E2H L2_RQSTS.ALL_RFO All L RFO requests to L2. Event Mask Mnemonic Description Comment 24H E4H L2_RQSTS.ALL_CODE_RD All L2 code requests. 24H E7H L2_RQSTS.ALL_DEMAND_REFE RENCES All demand requests to L2. 24H F8H L2_RQSTS.ALL_PF All requests from the L1/L2/L3 hardware prefetchers or load software prefetches. 24H EFH L2_RQSTS.REFERENCES All requests to L2. 2EH 4FH LONGEST_LAT_CACHE.REFEREN This event counts requests originating from the core CE that reference a cache line in the L3 cache. See Table 19-1. 2EH 41H LONGEST_LAT_CACHE.MISS This event counts each cache miss condition for references to the L3 cache. See Table 19-1. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P Cycles while the logical processor is not in a halt state. See Table 19-1. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P_ANY Cycles while at least one logical processor is not in a halt state. AnyT 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK Core crystal clock cycles when the thread is unhalted. See Table 19-1. 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK_ANY Core crystal clock cycles when at least one thread on the physical core is unhalted. AnyT 3CH 02H CPU_CLK_THREAD_UNHALTED. ONE_THREAD_ACTIVE Core crystal clock cycles when this thread is unhalted and the other thread is halted. 48H 01H L1D_PEND_MISS.PENDING Increments the number of outstanding L1D misses every cycle. 48H 01H L1D_PEND_MISS.PENDING_CYCL Cycles with at least one outstanding L1D misses from ES this logical processor. CMSK1 48H 01H L1D_PEND_MISS.PENDING_CYCL Cycles with at least one outstanding L1D misses from ES_ANY any logical processor in this core. CMSK1, AnyT 48H 02H L1D_PEND_MISS.FB_FULL 49H 01H DTLB_STORE_MISSES.MISS_CAU Store misses in all TLB levels that cause page walks. SES_A_WALK 49H 0EH DTLB_STORE_MISSES.WALK_CO Counts completed page walks in any TLB levels due to MPLETED store misses (all page sizes). 19-44 Vol. 3B Number of times a request needed a FB entry but there was no entry available for it. That is, the FB unavailability was the dominant reason for blocking the request. A request includes cacheable/uncacheable demand that is load, store or SW prefetch. HWP are excluded. PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value 49H 10H DTLB_STORE_MISSES.WALK_PE Counts 1 per cycle for each PMH that is busy with a NDING page walk for a store. 49H 10H DTLB_STORE_MISSES.WALK_AC Cycles when at least one PMH is busy with a page walk CMSK1 TIVE for a store. 49H 20H DTLB_STORE_MISSES.STLB_HIT Store misses that missed DTLB but hit STLB. 4CH 01H LOAD_HIT_PRE.HW_PF Demand load dispatches that hit fill buffer allocated for software prefetch. 4FH 10H EPT.WALK_PENDING Counts 1 per cycle for each PMH that is busy with an EPT walk for any request type. 51H 01H L1D.REPLACEMENT Counts the number of lines brought into the L1 data cache. 5EH 01H RS_EVENTS.EMPTY_CYCLES Cycles the RS is empty for the thread. 5EH 01H RS_EVENTS.EMPTY_END Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Front-end Latency Bound issues. 60H 01H OFFCORE_REQUESTS_OUTSTAN Increment each cycle of the number of offcore DING.DEMAND_DATA_RD outstanding Demand Data Read transactions in SQ to uncore. 60H 01H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding Demand DING.CYCLES_WITH_DEMAND_D Data Read transactions in SQ to uncore. ATA_RD 60H 01H OFFCORE_REQUESTS_OUTSTAN Cycles with at least 6 offcore outstanding Demand Data CMSK6 DING.DEMAND_DATA_RD_GE_6 Read transactions in SQ to uncore. 60H 02H OFFCORE_REQUESTS_OUTSTAN Increment each cycle of the number of offcore DING.DEMAND_CODE_RD outstanding demand code read transactions in SQ to uncore. 60H 02H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding demand DING.CYCLES_WITH_DEMAND_C code read transactions in SQ to uncore. ODE_RD 60H 04H OFFCORE_REQUESTS_OUTSTAN Increment each cycle of the number of offcore DING.DEMAND_RFO outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 04H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding RFO DING.CYCLES_WITH_DEMAND_R transactions in SQ to uncore. FO 60H 08H OFFCORE_REQUESTS_OUTSTAN Increment each cycle of the number of offcore DING.ALL_DATA_RD outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 08H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding data read DING.CYCLES_WITH_DATA_RD transactions in SQ to uncore. 60H 10H OFFCORE_REQUESTS_OUTSTAN Increment each cycle of the number of offcore DING.L3_MISS_DEMAND_DATA_ outstanding demand data read requests from SQ that missed L3. RD 60H 10H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding demand DING.CYCLES_WITH_L3_MISS_D data read requests from SQ that missed L3. EMAND_DATA_RD Event Mask Mnemonic Description Comment CMSK1, INV CMSK1 CMSK1 CMSK1 CMSK1 CMSK1 Vol. 3B 19-45 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value 60H 10H OFFCORE_REQUESTS_OUTSTAN Cycles with at least one offcore outstanding demand DING.L3_MISS_DEMAND_DATA_ data read requests from SQ that missed L3. RD_GE_6 63H 02H LOCK_CYCLES.CACHE_LOCK_DU RATION Cycles in which the L1D is locked. 79H 04H IDQ.MITE_UOPS Increment each cycle # of uops delivered to IDQ from MITE path. 79H 04H IDQ.MITE_CYCLES Cycles when uops are being delivered to IDQ from MITE CMSK1 path. 79H 08H IDQ.DSB_UOPS Increment each cycle. # of uops delivered to IDQ from DSB path. 79H 08H IDQ.DSB_CYCLES Cycles when uops are being delivered to IDQ from DSB path. 79H 10H IDQ.MS_DSB_UOPS Increment each cycle # of uops delivered to IDQ by DSB when MS_busy. 79H 18H IDQ.ALL_DSB_CYCLES_ANY_UO PS Cycles DSB is delivered at least one uops. CMSK1 79H 18H IDQ.ALL_DSB_CYCLES_4_UOPS Cycles DSB is delivered four uops. CMSK4 79H 20H IDQ.MS_MITE_UOPS Increment each cycle # of uops delivered to IDQ by MITE when MS_busy. 79H 24H IDQ.ALL_MITE_CYCLES_ANY_UO Counts cycles MITE is delivered at least one uops. PS CMSK1 79H 24H IDQ.ALL_MITE_CYCLES_4_UOPS Counts cycles MITE is delivered four uops. CMSK4 79H 30H IDQ.MS_UOPS Increment each cycle # of uops delivered to IDQ while MS is busy. 79H 30H IDQ.MS_SWITCHES Number of switches from DSB or MITE to MS. EDG 79H 30H IDQ.MS_CYCLES Cycles MS is delivered at least one uops. CMSK1 80H 04H ICACHE_16B.IFDATA_STALL Cycles where a code fetch is stalled due to L1 instruction cache miss. 80H 04H ICACHE_64B.IFDATA_STALL Cycles where a code fetch is stalled due to L1 instruction cache tag miss. 83H 01H ICACHE_64B.IFTAG_HIT Instruction fetch tag lookups that hit in the instruction cache (L1I). Counts at 64-byte cache-line granularity. 83H 02H ICACHE_64B.IFTAG_MISS Instruction fetch tag lookups that miss in the instruction cache (L1I). Counts at 64-byte cache-line granularity. 85H 01H ITLB_MISSES.MISS_CAUSES_A_ WALK Misses at all ITLB levels that cause page walks. 85H 0EH ITLB_MISSES.WALK_COMPLETE D Counts completed page walks in any TLB level due to code fetch misses (all page sizes). 85H 10H ITLB_MISSES.WALK_PENDING Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. 85H 20H ITLB_MISSES.STLB_HIT ITLB misses that hit STLB. 87H 01H ILD_STALL.LCP Stalls caused by changing prefix length of the instruction. 19-46 Vol. 3B Event Mask Mnemonic Description Comment CMSK6 CMSK1 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 9CH 01H IDQ_UOPS_NOT_DELIVERED.CO RE Count issue pipeline slots where no uop was delivered from the front end to the back end when there is no back-end stall. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Cycles which 4 issue pipeline slots had no uop delivered CMSK4 LES_0_UOP_DELIV.CORE from the front end to the back end when there is no back-end stall. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Cycles which “4-n” issue pipeline slots had no uop LES_LE_n_UOP_DELIV.CORE delivered from the front end to the back end when there is no back-end stall. 9CH 01H IDQ_UOPS_NOT_DELIVERED.CYC Cycles which front end delivered 4 uops or the RAT was CMSK, INV LES_FE_WAS_OK stalling FE. A1H 01H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _0 dispatched to port 0. A1H 02H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _1 dispatched to port 1. A1H 04H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _2 dispatched to port 2. A1H 08H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _3 dispatched to port 3. A1H 10H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _4 dispatched to port 4. A1H 20H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _5 dispatched to port 5. A1H 40H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _6 dispatched to port 6. A1H 80H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _7 dispatched to port 7. A2H 01H RESOURCE_STALLS.ANY Resource-related stall cycles. A2H 08H RESOURCE_STALLS.SB Cycles stalled due to no store buffers available (not including draining form sync). A3H 01H CYCLE_ACTIVITY.CYCLES_L2_MI SS Cycles while L2 cache miss demand load is outstanding. CMSK1 A3H 02H CYCLE_ACTIVITY.CYCLES_L3_MI SS Cycles while L3 cache miss demand load is outstanding. CMSK2 A3H 04H CYCLE_ACTIVITY.STALLS_TOTAL Total execution stalls. CMSK4 A3H 05H CYCLE_ACTIVITY.STALLS_L2_MI SS Execution stalls while L2 cache miss demand load is outstanding. CMSK5 A3H 06H CYCLE_ACTIVITY.STALLS_L3_MI SS Execution stalls while L3 cache miss demand load is outstanding. CMSK6 A3H 08H CYCLE_ACTIVITY.CYCLES_L1D_M Cycles while L1 data cache miss demand load is ISS outstanding. CMSK8 A3H 0CH CYCLE_ACTIVITY.STALLS_L1D_M Execution stalls while L1 data cache miss demand load ISS is outstanding. CMSK12 A3H 10H CYCLE_ACTIVITY.CYCLES_MEM_ ANY CMSK16 Cycles while memory subsystem has an outstanding load. Comment Set CMSK = 4-n; n = 1, 2, 3 Vol. 3B 19-47 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment A3H 14H CYCLE_ACTIVITY.STALLS_MEM_ ANY Execution stalls while memory subsystem has an outstanding load. CMSK20 A6H 01H EXE_ACTIVITY.EXE_BOUND_0_P Cycles for which no uops began execution, the ORTS Reservation Station was not empty, the Store Buffer was full and there was no outstanding load. A6H 02H EXE_ACTIVITY.1_PORTS_UTIL Cycles for which one uop began execution on any port, and the Reservation Station was not empty. A6H 04H EXE_ACTIVITY.2_PORTS_UTIL Cycles for which two uops began execution, and the Reservation Station was not empty. A6H 08H EXE_ACTIVITY.3_PORTS_UTIL Cycles for which three uops began execution, and the Reservation Station was not empty. A6H 04H EXE_ACTIVITY.4_PORTS_UTIL Cycles for which four uops began execution, and the Reservation Station was not empty. A6H 40H EXE_ACTIVITY.BOUND_ON_STO RES Cycles where the Store Buffer was full and no outstanding load. A8H 01H LSD.UOPS Number of uops delivered by the LSD. A8H 01H LSD.CYCLES_ACTIVE Cycles with at least one uop delivered by the LSD and none from the decoder. A8H 01H LSD.CYCLES_4_UOPS Cycles with 4 uops delivered by the LSD and none from CMSK4 the decoder. ABH 02H DSB2MITE_SWITCHES.PENALTY _CYCLES DSB-to-MITE switch true penalty cycles. AEH 01H ITLB.ITLB_FLUSH Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages. B0H 01H OFFCORE_REQUESTS.DEMAND_ Demand data read requests sent to uncore. DATA_RD B0H 02H OFFCORE_REQUESTS.DEMAND_ Demand code read requests sent to uncore. CODE_RD B0H 04H OFFCORE_REQUESTS.DEMAND_ Demand RFO read requests sent to uncore, including RFO regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DATA Data read requests sent to uncore (demand and _RD prefetch). B0H 10H OFFCORE_REQUESTS.L3_MISS_ DEMAND_DATA_RD B0H 80H OFFCORE_REQUESTS.ALL_REQU Any memory transaction that reached the SQ. ESTS B1H 01H UOPS_EXECUTED.THREAD Counts the number of uops that begin execution across all ports. B1H 01H UOPS_EXECUTED.STALL_CYCLE S Cycles where there were no uops that began execution. CMSK, INV B1H 01H UOPS_EXECUTED.CYCLES_GE_1 Cycles where there was at least one uop that began _UOP_EXEC execution. CMSK1 B1H 01H UOPS_EXECUTED.CYCLES_GE_2 Cycles where there were at least two uops that began _UOP_EXEC execution. CMSK2 B1H 01H UOPS_EXECUTED.CYCLES_GE_3 Cycles where there were at least three uops that began CMSK3 _UOP_EXEC execution. 19-48 Vol. 3B CMSK1 Demand data read requests that missed L3. PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value B1H 01H UOPS_EXECUTED.CYCLES_GE_4 Cycles where there were at least four uops that began _UOP_EXEC execution. B1H 02H UOPS_EXECUTED.CORE B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles where there was at least one uop, from any _GE_1 logical processor in this core, that began execution. CMSK1 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles where there were at least two uops, from any _GE_2 logical processor in this core, that began execution. CMSK2 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles where there were at least three uops, from any _GE_3 logical processor in this core, that began execution. CMSK3 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles where there were at least four uops, from any _GE_4 logical processor in this core, that began execution. CMSK4 B1H 02H UOPS_EXECUTED.CORE_CYCLES Cycles where there were no uops from any logical _NONE processor in this core that began execution. CMSK1, INV B1H 10H UOPS_EXECUTED.X87 B2H 01H OFF_CORE_REQUEST_BUFFER.S Offcore requests buffer cannot take more entries for Q_FULL this core. B7H 01H OFF_CORE_RESPONSE_0 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A6H BBH 01H OFF_CORE_RESPONSE_1 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A7H BDH 01H TLB_FLUSH.DTLB_THREAD DTLB flush attempts of the thread-specific entries. BDH 01H TLB_FLUSH.STLB_ANY STLB flush attempts. C0H 00H INST_RETIRED.ANY_P Number of instructions at retirement. See Table 19-1. C0H 01H INST_RETIRED.PREC_DIST Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution. PMC1 only; C0H 01H INST_RETIRED.TOTAL_CYCLES Number of cycles using always true condition applied to CMSK10, PS PEBS instructions retired event. C1H 3FH OTHER_ASSISTS.ANY Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists. C2H 01H UOPS_RETIRED.STALL_CYCLES Cycles without actually retired uops. CMSK1, INV C2H 01H UOPS_RETIRED.TOTAL_CYCLES Cycles with less than 10 actually retired uops. CMSK10, INV C2H 02H UOPS_RETIRED.RETIRE_SLOTS Retirement slots used. C3H 01H MACHINE_CLEARS.COUNT Number of machine clears of any type. C3H 02H MACHINE_CLEARS.MEMORY_OR DERING Counts the number of machine clears due to memory order conflicts. C3H 04H MACHINE_CLEARS.SMC Number of self-modifying-code machine clears detected. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Branch instructions that retired. C4H 01H BR_INST_RETIRED.CONDITIONA Counts the number of conditional branch instructions L retired. PS C4H 02H BR_INST_RETIRED.NEAR_CALL PS Event Mask Mnemonic Description Comment CMSK4 Counts the number of uops from any logical processor in this core that begin execution. Counts the number of X87 uops that begin execution. Direct and indirect near call instructions retired. CMSK1, EDG See Table 19-1. Vol. 3B 19-49 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C4H 04H BR_INST_RETIRED.ALL_BRANC HES Counts the number of branch instructions retired. PS C4H 08H BR_INST_RETIRED.NEAR_RETU RN Counts the number of near return instructions retired. PS C4H 10H BR_INST_RETIRED.NOT_TAKEN Counts the number of not taken branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAKE N Number of near taken branches retired. PS C4H 40H BR_INST_RETIRED.FAR_BRANC H Number of far branches retired. PS C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted branch instructions at retirement. See Table 19-1. C5H 01H BR_MISP_RETIRED.CONDITIONA Mispredicted conditional branch instructions retired. L PS C5H 04H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted macro branch instructions retired. PS C5H 20H BR_MISP_RETIRED.NEAR_TAKE N Number of near branch instructions retired that were mispredicted and taken. PS C6H 01H FRONTEND_RETIRED.DSB_MISS Retired instructions which experienced DSB miss. Specify MSR_PEBS_FRONTEND.EVTSEL=11H. PS C6H 01H FRONTEND_RETIRED.L1I_MISS Retired instructions which experienced instruction L1 cache true miss. Specify MSR_PEBS_FRONTEND.EVTSEL=12H. PS C6H 01H FRONTEND_RETIRED.L2_MISS Retired instructions which experienced L2 cache true miss. Specify MSR_PEBS_FRONTEND.EVTSEL=13H. PS C6H 01H FRONTEND_RETIRED.ITLB_MISS Retired instructions which experienced ITLB true miss. Specify MSR_PEBS_FRONTEND.EVTSEL=14H. C6H 01H FRONTEND_RETIRED.STLB_MIS S C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval GE_16 where the front end delivered no uops for at least 16 cycles. Specify the following fields in MSR_PEBS_FRONTEND: EVTSEL=16H, IDQ_Bubble_Length =16, IDQ_Bubble_Width = 4. C6H 01H FRONTEND_RETIRED.LATENCY_ Retired instructions that are fetched after an interval PS, m = 1, 2, 3 GE_2_BUBBLES_GE_m where the front end had ‘m’ IDQ slots delivered, no uops for at least 2 cycles. Specify the following fields in MSR_PEBS_FRONTEND: EVTSEL=16H, IDQ_Bubble_Length =2, IDQ_Bubble_Width = m. C7H 01H FP_ARITH_INST_RETIRED.SCAL AR_DOUBLE Number of double-precision, floating-point, scalar SSE/AVX computational instructions that are retired. Each scalar FMA instruction counts as 2. Software may treat each count as one DP FLOP. C7H 02H FP_ARITH_INST_RETIRED.SCAL AR_SINGLE Number of single-precision, floating-point, scalar SSE/AVX computational instructions that are retired. Each scalar FMA instruction counts as 2. Software may treat each count as one SP FLOP. 19-50 Vol. 3B PS Retired instructions which experienced STLB true miss. PS Specify MSR_PEBS_FRONTEND.EVTSEL=15H. PS PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value C7H 04H FP_ARITH_INST_RETIRED.128B Number of double-precision, floating-point, 128-bit _PACKED_DOUBLE SSE/AVX computational instructions that are retired. Each 128-bit FMA or (V)DPPD instruction counts as 2. Software may treat each count as two DP FLOPs. C7H 08H FP_ARITH_INST_RETIRED.128B Number of single-precision, floating-point, 128-bit _PACKED_SINGLE SSE/AVX computational instructions that are retired. Each 128-bit FMA or (V)DPPS instruction counts as 2. Software may treat each count as four SP FLOPs. C7H 10H FP_ARITH_INST_RETIRED.256B Number of double-precision, floating-point, 256-bit _PACKED_DOUBLE SSE/AVX computational instructions that are retired. Each 256-bit FMA instruction counts as 2. Software may treat each count as four DP FLOPs. C7H 20H FP_ARITH_INST_RETIRED.256B Number of single-precision, floating-point, 256-bit _PACKED_SINGLE SSE/AVX computational instructions that are retired. Each 256-bit FMA or VDPPS instruction counts as 2. Software may treat each count as eight SP FLOPs. CAH 1EH FP_ASSIST.ANY Cycles with any input/output SSE* or FP assists. CMSK1 CBH 01H HW_INTERRUPTS.RECEIVED Number of hardware interrupts received by the processor. CCH 20H ROB_MISC_EVENTS.LBR_INSERT Increments when an entry is added to the Last Branch S Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY D0H 11H MEM_INST_RETIRED.STLB_MISS Retired load instructions that miss the STLB. _LOADS PSDLA D0H 12H MEM_INST_RETIRED.STLB_MISS Retired store instructions that miss the STLB. _STORES PSDLA D0H 21H MEM_INST_RETIRED.LOCK_LOA DS PSDLA D0H 41H MEM_INST_RETIRED.SPLIT_LOA Number of load instructions retired with cache-line DS splits that may impact performance. PSDLA D0H 42H MEM_INST_RETIRED.SPLIT_STO Number of store instructions retired with line-split. RES PSDLA D0H 81H MEM_INST_RETIRED.ALL_LOAD S All retired load instructions. PSDLA D0H 82H MEM_INST_RETIRED.ALL_STOR ES All retired store instructions. PSDLA D1H 01H MEM_LOAD_RETIRED.L1_HIT Retired load instructions with L1 cache hits as data sources. PSDLA D1H 02H MEM_LOAD_RETIRED.L2_HIT Retired load instructions with L2 cache hits as data sources. PSDLA D1H 04H MEM_LOAD_RETIRED.L3_HIT Retired load instructions with L3 cache hits as data sources. PSDLA D1H 08H MEM_LOAD_RETIRED.L1_MISS Retired load instructions missed L1 cache as data sources. PSDLA D1H 10H MEM_LOAD_RETIRED.L2_MISS Retired load instructions missed L2. Unknown data source excluded. PSDLA Event Mask Mnemonic Description Comment Randomly sampled loads whose latency is above a user Specify threshold in defined threshold. A small fraction of the overall loads MSR 3F6H. are sampled due to randomization. PSDLA Retired load instructions with locked access. Vol. 3B 19-51 PERFORMANCE MONITORING EVENTS Table 19-6. Performance Events of the Processor Core Supported by Skylake, Kaby Lake and Coffee Lake Microarchitectures (Contd.) Event Num. Umask Value Event Mask Mnemonic Description D1H 20H MEM_LOAD_RETIRED.L3_MISS Retired load instructions missed L3. Excludes unknown PSDLA data source. D1H 40H MEM_LOAD_RETIRED.FB_HIT Retired load instructions where data sources were load PSDLA uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. D2H 01H MEM_LOAD_L3_HIT_RETIRED.X SNP_MISS Retired load instructions where data sources were L3 hit and cross-core snoop missed in on-pkg core cache. PSDLA D2H 02H MEM_LOAD_L3_HIT_RETIRED.X SNP_HIT Retired load Instructions where data sources were L3 and cross-core snoop hits in on-pkg core cache. PSDLA D2H 04H MEM_LOAD_L3_HIT_RETIRED.X SNP_HITM Retired load instructions where data sources were HitM PSDLA responses from shared L3. D2H 08H MEM_LOAD_L3_HIT_RETIRED.X SNP_NONE Retired load instructions where data sources were hits PSDLA in L3 without snoops required. E6H 01H BACLEARS.ANY Number of front end re-steers due to BPU misprediction. F0H 40H L2_TRANS.L2_WB L2 writebacks that access L2 cache. F1H 07H L2_LINES_IN.ALL L2 cache lines filling L2. Comment CMSK1: Counter Mask = 1 required; CMSK4: CounterMask = 4 required; CMSK6: CounterMask = 6 required; CMSK8: CounterMask = 8 required; CMSK10: CounterMask = 10 required; CMSK12: CounterMask = 12 required; CMSK16: CounterMask = 16 required; CMSK20: CounterMask = 20 required. AnyT: AnyThread = 1 required. INV: Invert = 1 required. EDG: EDGE = 1 required. PSDLA: Also supports PEBS and DataLA. PS: Also supports PEBS. Table 19-12 lists performance events supporting Intel TSX (see Section 18.3.6.5) and the events are applicable to processors based on Skylake microarchitecture. Where Skylake microarchitecture implements TSX-related event semantics that differ from Table 19-12, they are listed in Table 19-7. Table 19-7. Intel® TSX Performance Event Addendum in Processors based on Skylake Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description 54H 02H TX_MEM.ABORT_CAPACITY Number of times a transactional abort was signaled due to a data capacity limitation for transactional reads or writes. 19.5 Comment PERFORMANCE MONITORING EVENTS FOR INTEL® XEON PHI™ PROCESSOR 3200, 5200, 7200 SERIES AND INTEL® XEON PHI™ PROCESSOR 7215, 7285, 7295 SERIES The Intel® Xeon Phi™ processor 3200/5200/7200 series is based on the Knights Landing microarchitecture with CPUID DisplayFamily_DisplayModel signature 06_57H. The Intel® Xeon Phi™ processor 7215/7285/7295 series is based on the Knights Mill microarchitecture with CPUID DisplayFamily_DisplayModel signature 06_85H. Modelspecific performance monitoring events in these processor cores are listed in Table 19-8. The events in Table 19-8 19-52 Vol. 3B PERFORMANCE MONITORING EVENTS apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_57H and 06_85H. Table 19-8. Performance Events of the Processor Core Supported by Knights Landing and Knights Mill Microarchitectures Event Num. Umask Value 03H 01H RECYCLEQ.LD_BLOCK_ST_FORW Counts the number of occurrences a retired load gets ARD blocked because its address partially overlaps with a store. 03H 02H RECYCLEQ.LD_BLOCK_STD_NOT Counts the number of occurrences a retired load gets READY blocked because its address overlaps with a store whose data is not ready. 03H 04H RECYCLEQ.ST_SPLITS Counts the number of occurrences a retired store that is a cache line split. Each split should be counted only once. 03H 08H RECYCLEQ.LD_SPLITS Counts the number of occurrences a retired load that is PSDLA a cache line split. Each split should be counted only once. 03H 10H RECYCLEQ.LOCK Counts all the retired locked loads. It does not include stores because we would double count if we count stores. 03H 20H RECYCLEQ.STA_FULL Counts the store micro-ops retired that were pushed in the recycle queue because the store address buffer is full. 03H 40H RECYCLEQ.ANY_LD Counts any retired load that was pushed into the recycle queue for any reason. 03H 80H RECYCLEQ.ANY_ST Counts any retired store that was pushed into the recycle queue for any reason. 04H 01H MEM_UOPS_RETIRED.L1_MISS_ LOADS Counts the number of load micro-ops retired that miss in L1 D cache. 04H 02H MEM_UOPS_RETIRED.L2_HIT_L OADS Counts the number of load micro-ops retired that hit in the L2. PSDLA 04H 04H MEM_UOPS_RETIRED.L2_MISS_ LOADS Counts the number of load micro-ops retired that miss in the L2. PSDLA 04H 08H MEM_UOPS_RETIRED.DTLB_MIS Counts the number of load micro-ops retired that cause PSDLA S_LOADS a DTLB miss. 04H 10H MEM_UOPS_RETIRED.UTLB_MIS Counts the number of load micro-ops retired that S_LOADS caused micro TLB miss. 04H 20H MEM_UOPS_RETIRED.HITM 04H 40H MEM_UOPS_RETIRED.ALL_LOAD Counts all the load micro-ops retired. S 04H 80H MEM_UOPS_RETIRED.ALL_STOR Counts all the store micro-ops retired. ES 05H 01H PAGE_WALKS.D_SIDE_WALKS Counts the total D-side page walks that are completed or started. The page walks started in the speculative path will also be counted. 05H 01H PAGE_WALKS.D_SIDE_CYCLES Counts the total number of core cycles for all the D-side page walks. The cycles for page walks started in speculative path will also be included. 05H 02H PAGE_WALKS.I_SIDE_WALKS Counts the total I-side page walks that are completed. Event Mask Mnemonic Description Comment PSDLA Counts the loads retired that get the data from the other core in the same tile in M state. EdgeDetect=1 EdgeDetect=1 Vol. 3B 19-53 PERFORMANCE MONITORING EVENTS Table 19-8. Performance Events of the Processor Core Supported by Knights Landing and Knights Mill Microarchitectures Event Num. Umask Value Event Mask Mnemonic Description 05H 02H PAGE_WALKS.I_SIDE_CYCLES Counts the total number of core cycles for all the I-side page walks. The cycles for page walks started in speculative path will also be included. 05H 03H PAGE_WALKS.WALKS Counts the total page walks that are completed (I-side and D-side). 05H 03H PAGE_WALKS.CYCLES Counts the total number of core cycles for all the page walks. The cycles for page walks started in speculative path will also be included. 2EH 41H LONGEST_LAT_CACHE.MISS Counts the number of L2 cache misses. Also called L2_REQUESTS_MISS. 2EH 4FH LONGEST_LAT_CACHE.REFEREN Counts the total number of L2 cache references. Also CE called L2_REQUESTS_REFERENCE. 30H 00H L2_REQUESTS_REJECT.ALL Counts the number of MEC requests from the L2Q that reference a cache line (cacheable requests) excluding SW prefetches filling only to L2 cache and L1 evictions (automatically excludes L2HWP, UC, WC) that were rejected - Multiple repeated rejects should be counted multiple times. 31H 00H CORE_REJECT_L2Q.ALL Counts the number of MEC requests that were not accepted into the L2Q because of any L2 queue reject condition. There is no concept of at-ret here. It might include requests due to instructions in the speculative path. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P Counts the number of unhalted core clock cycles. 3CH 01H CPU_CLK_UNHALTED.REF Counts the number of unhalted reference clock cycles. 3EH 04H L2_PREFETCHER.ALLOC_XQ Counts the number of L2HWP allocated into XQ GP. 80H 01H ICACHE.HIT Counts all instruction fetches that hit the instruction cache. 80H 02H ICACHE.MISSES Counts all instruction fetches that miss the instruction cache or produce memory requests. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 80H 03H ICACHE.ACCESSES Counts all instruction fetches, including uncacheable fetches. 86H 04H FETCH_STALL.ICACHE_FILL_PEN Counts the number of core cycles the fetch stalls DING_CYCLES because of an icache miss. This is a cumulative count of core cycles the fetch stalled for all icache misses. B7H 01H OFFCORE_RESPONSE_0 See Section 18.4.1.1.2. Requires MSR_OFFCORE_RESP 0 to specify request type and response. B7H 02H OFFCORE_RESPONSE_1 See Section 18.4.1.1.2. Requires MSR_OFFCORE_RESP 1 to specify request type and response. C0H 00H INST_RETIRED.ANY_P Counts the total number of instructions retired. PS 19-54 Vol. 3B Comment EdgeDetect=1 PERFORMANCE MONITORING EVENTS Table 19-8. Performance Events of the Processor Core Supported by Knights Landing and Knights Mill Microarchitectures Event Num. Umask Value Event Mask Mnemonic Description C2H 01H UOPS_RETIRED.MS Counts the number of micro-ops retired that are from the complex flows issued by the micro-sequencer (MS). C2H 10H UOPS_RETIRED.ALL Counts the number of micro-ops retired. C2H 20H UOPS_RETIRED.SCALAR_SIMD Counts the number of scalar SSE, AVX, AVX2, and AVX512 micro-ops except for loads (memory-to-register mov-type micro ops), division and sqrt. C2H 40H UOPS_RETIRED.PACKED_SIMD Counts the number of packed SSE, AVX, AVX2, and AVX-512 micro-ops (both floating point and integer) except for loads (memory-to-register mov-type microops), packed byte and word multiplies. C3H 01H MACHINE_CLEARS.SMC Counts the number of times that the machine clears due to program modifying data within 1K of a recently fetched code page. C3H 02H MACHINE_CLEARS.MEMORY_OR DERING Counts the number of times the machine clears due to memory ordering hazards. C3H 04H MACHINE_CLEARS.FP_ASSIST Counts the number of floating operations retired that required microcode assists. C3H 08H MACHINE_CLEARS.ALL Counts all machine clears. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Counts the number of branch instructions retired. Comment PS C4H 7EH BR_INST_RETIRED.JCC Counts the number of JCC branch instructions retired. PS C4H BFH BR_INST_RETIRED.FAR_BRANC H Counts the number of far branch instructions retired. PS C4H EBH BR_INST_RETIRED.NON_RETUR Counts the number of branch instructions retired that N_IND were near indirect CALL or near indirect JMP. PS C4H F7H BR_INST_RETIRED.RETURN Counts the number of near RET branch instructions retired. PS C4H F9H BR_INST_RETIRED.CALL Counts the number of near CALL branch instructions retired. PS C4H FBH BR_INST_RETIRED.IND_CALL Counts the number of near indirect CALL branch instructions retired. PS C4H FDH BR_INST_RETIRED.REL_CALL Counts the number of near relative CALL branch instructions retired. PS C4H FEH BR_INST_RETIRED.TAKEN_JCC Counts the number of branch instructions retired that were taken conditional jumps. PS C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Counts the number of mispredicted branch instructions PS retired. C5H 7EH BR_MISP_RETIRED.JCC Counts the number of mispredicted JCC branch instructions retired. PS C5H BFH BR_MISP_RETIRED.FAR_BRANC H Counts the number of mispredicted far branch instructions retired. PS C5H EBH BR_MISP_RETIRED.NON_RETUR Counts the number of mispredicted branch instructions PS N_IND retired that were near indirect CALL or near indirect JMP. C5H F7H BR_MISP_RETIRED.RETURN Counts the number of mispredicted near RET branch instructions retired. PS Vol. 3B 19-55 PERFORMANCE MONITORING EVENTS Table 19-8. Performance Events of the Processor Core Supported by Knights Landing and Knights Mill Microarchitectures Event Num. Umask Value Event Mask Mnemonic Description Comment C5H F9H BR_MISP_RETIRED.CALL Counts the number of mispredicted near CALL branch instructions retired. PS C5H FBH BR_MISP_RETIRED.IND_CALL Counts the number of mispredicted near indirect CALL branch instructions retired. PS C5H FDH BR_MISP_RETIRED.REL_CALL Counts the number of mispredicted near relative CALL branch instructions retired. PS C5H FEH BR_MISP_RETIRED.TAKEN_JCC Counts the number of mispredicted branch instructions PS retired that were taken conditional jumps. CAH 01H NO_ALLOC_CYCLES.ROB_FULL Counts the number of core cycles when no micro-ops are allocated and the ROB is full. CAH 04H NO_ALLOC_CYCLES.MISPREDICT Counts the number of core cycles when no micro-ops S are allocated and the alloc pipe is stalled waiting for a mispredicted branch to retire. CAH 20H NO_ALLOC_CYCLES.RAT_STALL CAH 90H NO_ALLOC_CYCLES.NOT_DELIVE Counts the number of core cycles when no micro-ops RED are allocated, the IQ is empty, and no other condition is blocking allocation. CAH 7FH NO_ALLOC_CYCLES.ALL Counts the total number of core cycles when no microops are allocated for any reason. CBH 01H RS_FULL_STALL.MEC Counts the number of core cycles when allocation pipeline is stalled and is waiting for a free MEC reservation station entry. CBH 1FH RS_FULL_STALL.ALL Counts the total number of core cycles the allocation pipeline is stalled when any one of the reservation stations is full. CDH 01H CYCLES_DIV_BUSY.ALL Cycles the number of core cycles when divider is busy. Does not imply a stall waiting for the divider. E6H 01H BACLEARS.ALL Counts the number of times the front end resteers for any branch as a result of another branch handling mechanism in the front end. E6H 08H BACLEARS.RETURN Counts the number of times the front end resteers for RET branches as a result of another branch handling mechanism in the front end. E6H 10H BACLEARS.COND Counts the number of times the front end resteers for conditional branches as a result of another branch handling mechanism in the front end. E7H 01H MS_DECODED.MS_ENTRY Counts the number of times the MSROM starts a flow of uops. PS: Also supports PEBS. PSDLA: Also supports PEBS and DataLA. 19-56 Vol. 3B Counts the number of core cycles when no micro-ops are allocated and a RATstall (caused by reservation station full) is asserted. PERFORMANCE MONITORING EVENTS 19.6 PERFORMANCE MONITORING EVENTS FOR THE INTEL® CORE™ M AND 5TH GENERATION INTEL® CORE™ PROCESSORS The Intel® Core™ M processors, the 5th generation Intel® Core™ processors and the Intel Xeon processor E3 1200 v4 product family are based on the Broadwell microarchitecture. They support the architectural performance monitoring events listed in Table 19-1. Model-specific performance monitoring events in the processor core are listed in Table 19-9. The events in Table 19-9 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3DH and 06_47H. Table 19-12 lists performance events supporting Intel TSX (see Section 18.3.6.5) and the events are available on processors based on Broadwell microarchitecture. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Model-specific performance monitoring events that are located in the uncore sub-system are implementation specific between different platforms using processors based on Broadwell microarchitecture and with different DisplayFamily_DisplayModel signatures. Processors with CPUID signature of DisplayFamily_DisplayModel 06_3DH and 06_47H support uncore performance events listed in Table 19-13. Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description 03H 02H LD_BLOCKS.STORE_FORWARD Loads blocked by overlapping with store buffer that cannot be forwarded. 03H 08H LD_BLOCKS.NO_SR The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 05H 01H MISALIGN_MEM_REF.LOADS Speculative cache-line split load uops dispatched to L1D. 05H 02H MISALIGN_MEM_REF.STORES Speculative cache-line split store-address uops dispatched to L1D. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS _ALIAS False dependencies in MOB due to partial compare on address. 08H 01H DTLB_LOAD_MISSES.MISS_CAUS Load misses in all TLB levels that cause a page walk ES_A_WALK of any page size. 08H 02H DTLB_LOAD_MISSES.WALK_COM Completed page walks due to demand load misses PLETED_4K that caused 4K page walks in any TLB levels. 08H 10H DTLB_LOAD_MISSES.WALK_DUR Cycle PMH is busy with a walk. ATION 08H 20H DTLB_LOAD_MISSES.STLB_HIT_ Load misses that missed DTLB but hit STLB (4K). 4K 0DH 03H INT_MISC.RECOVERY_CYCLES Cycles waiting to recover after Machine Clears except JEClear. Set Cmask= 1. Set Edge to count occurrences. 0EH 01H UOPS_ISSUED.ANY Increments each cycle the # of uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1to count stalled cycles of this core. Set Cmask = 1, Inv = 1to count stalled cycles. 0EH 10H UOPS_ISSUED.FLAGS_MERGE Number of flags-merge uops allocated. Such uops add delay. 0EH 20H UOPS_ISSUED.SLOW_LEA Number of slow LEA or similar uops allocated. Such uop has 3 sources (for example, 2 sources + immediate) regardless of whether it is a result of LEA instruction or not. 0EH 40H UOPS_ISSUED.SiNGLE_MUL Number of multiply packed/scalar single precision uops allocated. Comment Vol. 3B 19-57 PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 14H 01H ARITH.FPU_DIV_ACTIVE Cycles when divider is busy executing divide operations. 24H 21H L2_RQSTS.DEMAND_DATA_RD_ Demand data read requests that missed L2, no MISS rejects. 24H 41H L2_RQSTS.DEMAND_DATA_RD_ Demand data read requests that hit L2 cache. HIT 24H 50H L2_RQSTS.L2_PF_HIT Counts all L2 HW prefetcher requests that hit L2. 24H 30H L2_RQSTS.L2_PF_MISS Counts all L2 HW prefetcher requests that missed L2. 24H E1H L2_RQSTS.ALL_DEMAND_DATA _RD Counts any demand and L1 HW prefetch data load requests to L2. 24H E2H L2_RQSTS.ALL_RFO Counts all L2 store RFO requests. Comment 24H E4H L2_RQSTS.ALL_CODE_RD Counts all L2 code requests. 24H F8H L2_RQSTS.ALL_PF Counts all L2 HW prefetcher requests. 27H 50H L2_DEMAND_RQSTS.WB_HIT Not rejected writebacks that hit L2 cache. 2EH 4FH LONGEST_LAT_CACHE.REFEREN This event counts requests originating from the core See Table 19-1. CE that reference a cache line in the last level cache. 2EH 41H LONGEST_LAT_CACHE.MISS This event counts each cache miss condition for references to the last level cache. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P Counts the number of thread cycles while the thread See Table 19-1. is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK Increments at the frequency of XCLK (100 MHz) when not halted. See Table 19-1. 48H 01H L1D_PEND_MISS.PENDING Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge =1 to count occurrences. Counter 2 only. 49H 01H DTLB_STORE_MISSES.MISS_CAU Miss in all TLB levels causes a page walk of any page SES_A_WALK size (4K/2M/4M/1G). 49H 02H DTLB_STORE_MISSES.WALK_CO Completed page walks due to store misses in one or MPLETED_4K more TLB levels of 4K page structure. 49H 10H DTLB_STORE_MISSES.WALK_DU Cycles PMH is busy with this walk. RATION 49H 20H DTLB_STORE_MISSES.STLB_HIT Store misses that missed DTLB but hit STLB (4K). _4K 4CH 02H LOAD_HIT_PRE.HW_PF Non-SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. 4FH 10H EPT.WALK_CYCLES Cycles of Extended Page Table walks. 51H 01H L1D.REPLACEMENT Counts the number of lines brought into the L1 data cache. 58H 04H MOVE_ELIMINATION.INT_NOT_E Number of integer move elimination candidate uops LIMINATED that were not eliminated. 58H 08H MOVE_ELIMINATION.SIMD_NOT_ Number of SIMD move elimination candidate uops ELIMINATED that were not eliminated. 19-58 Vol. 3B See Table 19-1. Set Cmask = 1 to count cycles. PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value 58H 01H MOVE_ELIMINATION.INT_ELIMIN Number of integer move elimination candidate uops ATED that were eliminated. 58H 02H MOVE_ELIMINATION.SIMD_ELIMI Number of SIMD move elimination candidate uops NATED that were eliminated. 5CH 01H CPL_CYCLES.RING0 Unhalted core cycles when the thread is in ring 0. 5CH 02H CPL_CYCLES.RING123 Unhalted core cycles when the thread is not in ring 0. Cycles the RS is empty for the thread. Event Mask Mnemonic Description Comment Use Edge to count transition. 5EH 01H RS_EVENTS.EMPTY_CYCLES 60H 01H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding demand data read transactions DING.DEMAND_DATA_RD in SQ to uncore. Set Cmask=1 to count cycles. 60H 02H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding demand code read transactions Use only when HTT is off. DING.DEMAND_CODE_RD in SQ to uncore. Set Cmask=1 to count cycles. 60H 04H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding RFO store transactions in SQ to Use only when HTT is off. DING.DEMAND_RFO uncore. Set Cmask=1 to count cycles. 60H 08H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding cacheable data read DING.ALL_DATA_RD transactions in SQ to uncore. Set Cmask=1 to count cycles. 63H 01H LOCK_CYCLES.SPLIT_LOCK_UC_ LOCK_DURATION Cycles in which the L1D and L2 are locked, due to a UC lock or split lock. 63H 02H LOCK_CYCLES.CACHE_LOCK_DU RATION Cycles in which the L1D is locked. 79H 02H IDQ.EMPTY Counts cycles the IDQ is empty. 79H 04H IDQ.MITE_UOPS Increment each cycle # of uops delivered to IDQ from Can combine Umask 04H MITE path. Set Cmask = 1 to count cycles. and 20H. 79H 08H IDQ.DSB_UOPS Increment each cycle # of uops delivered to IDQ from Can combine Umask 08H DSB path. Set Cmask = 1 to count cycles. and 10H. 79H 10H IDQ.MS_DSB_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery. Can combine Umask 04H, 08H. 79H 20H IDQ.MS_MITE_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H, 08H. 79H 30H IDQ.MS_UOPS Increment each cycle # of uops delivered to IDQ from Can combine Umask 04H, MS by either DSB or MITE. Set Cmask = 1 to count 08H. cycles. 79H 18H IDQ.ALL_DSB_CYCLES_ANY_UO PS Counts cycles DSB is delivered at least one uops. Set Cmask = 1. 79H 18H IDQ.ALL_DSB_CYCLES_4_UOPS Counts cycles DSB is delivered four uops. Set Cmask = 4. 79H 24H IDQ.ALL_MITE_CYCLES_ANY_UO Counts cycles MITE is delivered at least one uop. Set PS Cmask = 1. 79H 24H IDQ.ALL_MITE_CYCLES_4_UOPS Counts cycles MITE is delivered four uops. Set Cmask = 4. 79H 3CH IDQ.MITE_ALL_UOPS Number of uops delivered to IDQ from any path. 80H 02H ICACHE.MISSES Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. Use only when HTT is off. Use only when HTT is off. Vol. 3B 19-59 PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 85H 01H ITLB_MISSES.MISS_CAUSES_A_ WALK Misses in ITLB that cause a page walk of any page size. 85H 02H ITLB_MISSES.WALK_COMPLETE D_4K Completed page walks due to misses in ITLB 4K page entries. 85H 10H ITLB_MISSES.WALK_DURATION Cycle PMH is busy with a walk. 85H 20H ITLB_MISSES.STLB_HIT_4K ITLB misses that hit STLB (4K). 87H 01H ILD_STALL.LCP Stalls caused by changing prefix length of the instruction. 88H 01H BR_INST_EXEC.COND Qualify conditional near branch instructions executed, but not necessarily retired. Must combine with umask 40H, 80H. 88H 02H BR_INST_EXEC.DIRECT_JMP Qualify all unconditional near branch instructions excluding calls and indirect branches. Must combine with umask 80H. 88H 04H BR_INST_EXEC.INDIRECT_JMP_ NON_CALL_RET Qualify executed indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 88H 08H BR_INST_EXEC.RETURN_NEAR Qualify indirect near branches that have a return mnemonic. Must combine with umask 80H. 88H 10H BR_INST_EXEC.DIRECT_NEAR_C Qualify unconditional near call branch instructions, ALL excluding non-call branch, executed. 88H 20H BR_INST_EXEC.INDIRECT_NEAR Qualify indirect near calls, including both register and Must combine with _CALL memory indirect, executed. umask 80H. 88H 40H BR_INST_EXEC.NONTAKEN Qualify non-taken near branches executed. 88H 80H BR_INST_EXEC.TAKEN Qualify taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 88H FFH BR_INST_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 89H 01H BR_MISP_EXEC.COND Qualify conditional near branch instructions mispredicted. Must combine with umask 40H, 80H. 89H 04H BR_MISP_EXEC.INDIRECT_JMP_ NON_CALL_RET Qualify mispredicted indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 89H 08H BR_MISP_EXEC.RETURN_NEAR Qualify mispredicted indirect near branches that have a return mnemonic. Must combine with umask 80H. 89H 10H BR_MISP_EXEC.DIRECT_NEAR_C Qualify mispredicted unconditional near call branch ALL instructions, excluding non-call branch, executed. Must combine with umask 80H. 89H 20H BR_MISP_EXEC.INDIRECT_NEAR Qualify mispredicted indirect near calls, including _CALL both register and memory indirect, executed. Must combine with umask 80H. 89H 40H BR_MISP_EXEC.NONTAKEN Qualify mispredicted non-taken near branches executed. Applicable to umask 01H only. 89H 80H BR_MISP_EXEC.TAKEN Qualify mispredicted taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 89H FFH BR_MISP_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 9CH 01H IDQ_UOPS_NOT_DELIVERED.CO RE A1H 01H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _0 dispatched to port 0. 19-60 Vol. 3B Count issue pipeline slots where no uop was delivered from the front end to the back end when there is no back end stall. Comment Must combine with umask 80H. Applicable to umask 01H only. Use Cmask to qualify uop b/w. Set AnyThread to count per core. PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value A1H 02H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _1 dispatched to port 1. Set AnyThread to count per core. A1H 04H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _2 dispatched to port 2. Set AnyThread to count per core. A1H 08H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _3 dispatched to port 3. Set AnyThread to count per core. A1H 10H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _4 dispatched to port 4. Set AnyThread to count per core. A1H 20H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _5 dispatched to port 5. Set AnyThread to count per core. A1H 40H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _6 dispatched to port 6. Set AnyThread to count per core. A1H 80H UOPS_DISPATCHED_PORT.PORT Counts the number of cycles in which a uop is _7 dispatched to port 7. Set AnyThread to count per core. A2H 01H RESOURCE_STALLS.ANY Cycles Allocation is stalled due to resource related reason. A2H 04H RESOURCE_STALLS.RS Cycles stalled due to no eligible RS entry available. A2H 08H RESOURCE_STALLS.SB Cycles stalled due to no store buffers available (not including draining form sync). A2H 10H RESOURCE_STALLS.ROB Cycles stalled due to re-order buffer full. A8H 01H LSD.UOPS Number of uops delivered by the LSD. ABH 02H DSB2MITE_SWITCHES.PENALTY _CYCLES Cycles of delay due to Decode Stream Buffer to MITE switches. AEH 01H ITLB.ITLB_FLUSH Counts the number of ITLB flushes; includes 4k/2M/4M pages. B0H 01H OFFCORE_REQUESTS.DEMAND_ Demand data read requests sent to uncore. DATA_RD Use only when HTT is off. B0H 02H OFFCORE_REQUESTS.DEMAND_ Demand code read requests sent to uncore. CODE_RD Use only when HTT is off. B0H 04H OFFCORE_REQUESTS.DEMAND_ Demand RFO read requests sent to uncore, including Use only when HTT is off. RFO regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DATA Data read requests sent to uncore (demand and _RD prefetch). Use only when HTT is off. B1H 01H UOPS_EXECUTED.THREAD Counts total number of uops to be executed perlogical-processor each cycle. Use Cmask to count stall cycles. B1H 02H UOPS_EXECUTED.CORE Counts total number of uops to be executed per-core Do not need to set ANY. each cycle. B7H 01H OFF_CORE_RESPONSE_0 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A6H. BBH 01H OFF_CORE_RESPONSE_1 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A7H. BCH 11H PAGE_WALKER_LOADS.DTLB_L1 Number of DTLB page walker loads that hit in the L1+FB. BCH 21H PAGE_WALKER_LOADS.ITLB_L1 BCH 12H PAGE_WALKER_LOADS.DTLB_L2 Number of DTLB page walker loads that hit in the L2. Event Mask Mnemonic Description Comment Number of ITLB page walker loads that hit in the L1+FB. Vol. 3B 19-61 PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description BCH 22H PAGE_WALKER_LOADS.ITLB_L2 Number of ITLB page walker loads that hit in the L2. BCH 14H PAGE_WALKER_LOADS.DTLB_L3 Number of DTLB page walker loads that hit in the L3. BCH 24H PAGE_WALKER_LOADS.ITLB_L3 Number of ITLB page walker loads that hit in the L3. BCH 18H PAGE_WALKER_LOADS.DTLB_M EMORY Number of DTLB page walker loads from memory. C0H 00H INST_RETIRED.ANY_P Number of instructions at retirement. See Table 19-1. C0H 01H INST_RETIRED.PREC_DIST Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution. PMC1 only. C0H 02H INST_RETIRED.X87 FP operations retired. X87 FP operations that have no exceptions. C1H 08H OTHER_ASSISTS.AVX_TO_SSE Number of transitions from AVX-256 to legacy SSE when penalty applicable. C1H 10H OTHER_ASSISTS.SSE_TO_AVX Number of transitions from SSE to AVX-256 when penalty applicable. C1H 40H OTHER_ASSISTS.ANY_WB_ASSI ST Number of microcode assists invoked by HW upon uop writeback. C2H 01H UOPS_RETIRED.ALL Counts the number of micro-ops retired. Use cmask=1 and invert to count active cycles or stalled cycles. Comment Supports PEBS and DataLA, use Any=1 for core granular. C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the number of retirement slots used each cycle. C3H 01H MACHINE_CLEARS.CYCLES Counts cycles while a machine clears stalled forward progress of a logical processor or a processor core. C3H 02H MACHINE_CLEARS.MEMORY_OR Counts the number of machine clears due to memory DERING order conflicts. C3H 04H MACHINE_CLEARS.SMC Number of self-modifying-code machine clears detected. C3H 20H MACHINE_CLEARS.MASKMOV Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Branch instructions at retirement. C4H 01H BR_INST_RETIRED.CONDITIONA Counts the number of conditional branch instructions Supports PEBS. L retired. C4H 02H BR_INST_RETIRED.NEAR_CALL Direct and indirect near call instructions retired. Supports PEBS. C4H 04H BR_INST_RETIRED.ALL_BRANC HES Counts the number of branch instructions retired. Supports PEBS. C4H 08H BR_INST_RETIRED.NEAR_RETU RN Counts the number of near return instructions retired. Supports PEBS. C4H 10H BR_INST_RETIRED.NOT_TAKEN Counts the number of not taken branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAKE N Number of near taken branches retired. C4H 40H BR_INST_RETIRED.FAR_BRANC H Number of far branches retired. 19-62 Vol. 3B Supports PEBS. See Table 19-1. Supports PEBS. PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted branch instructions at retirement. See Table 19-1. C5H 01H BR_MISP_RETIRED.CONDITIONA Mispredicted conditional branch instructions retired. L Supports PEBS. C5H 04H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted macro branch instructions retired. Supports PEBS. CAH 02H FP_ASSIST.X87_OUTPUT Number of X87 FP assists due to output values. CAH 04H FP_ASSIST.X87_INPUT Number of X87 FP assists due to input values. CAH 08H FP_ASSIST.SIMD_OUTPUT Number of SIMD FP assists due to output values. CAH 10H FP_ASSIST.SIMD_INPUT Number of SIMD FP assists due to input values. CAH 1EH FP_ASSIST.ANY Cycles with any input/output SSE* or FP assists. CCH 20H ROB_MISC_EVENTS.LBR_INSER TS Count cases of saving new LBR records by hardware. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY Randomly sampled loads whose latency is above a Specify threshold in MSR user defined threshold. A small fraction of the overall 3F6H. loads are sampled due to randomization. D0H 11H MEM_UOPS_RETIRED.STLB_MIS Retired load uops that miss the STLB. S_LOADS Supports PEBS and DataLA. D0H 12H MEM_UOPS_RETIRED.STLB_MIS Retired store uops that miss the STLB. S_STORES Supports PEBS and DataLA. D0H 21H MEM_UOPS_RETIRED.LOCK_LOA Retired load uops with locked access. DS Supports PEBS and DataLA. D0H 41H MEM_UOPS_RETIRED.SPLIT_LO ADS Retired load uops that split across a cacheline boundary. Supports PEBS and DataLA. D0H 42H MEM_UOPS_RETIRED.SPLIT_ST ORES Retired store uops that split across a cacheline boundary. Supports PEBS and DataLA. D0H 81H MEM_UOPS_RETIRED.ALL_LOAD All retired load uops. S Supports PEBS and DataLA. D0H 82H MEM_UOPS_RETIRED.ALL_STOR All retired store uops. ES Supports PEBS and DataLA. D1H 01H MEM_LOAD_UOPS_RETIRED.L1_ Retired load uops with L1 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 02H MEM_LOAD_UOPS_RETIRED.L2_ Retired load uops with L2 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 04H MEM_LOAD_UOPS_RETIRED.L3_ Retired load uops with L3 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 08H MEM_LOAD_UOPS_RETIRED.L1_ Retired load uops missed L1 cache as data sources. MISS Supports PEBS and DataLA. D1H 10H MEM_LOAD_UOPS_RETIRED.L2_ Retired load uops missed L2. Unknown data source MISS excluded. Supports PEBS and DataLA. D1H 20H MEM_LOAD_UOPS_RETIRED.L3_ Retired load uops missed L3. Excludes unknown data Supports PEBS and MISS source. DataLA. D1H 40H MEM_LOAD_UOPS_RETIRED.HIT Retired load uops where data sources were load _LFB uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. Supports PEBS and DataLA. Vol. 3B 19-63 PERFORMANCE MONITORING EVENTS Table 19-9. Performance Events of the Processor Core Supported by Broadwell Microarchitecture (Contd.) Event Num. Umask Value D2H 01H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops where data sources were L3 hit RED.XSNP_MISS and cross-core snoop missed in on-pkg core cache. Supports PEBS and DataLA. D2H 02H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops where data sources were L3 and RED.XSNP_HIT cross-core snoop hits in on-pkg core cache. Supports PEBS and DataLA. D2H 04H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops where data sources were HitM RED.XSNP_HITM responses from shared L3. Supports PEBS and DataLA. D2H 08H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops where data sources were hits in RED.XSNP_NONE L3 without snoops required. Supports PEBS and DataLA. D3H 01H MEM_LOAD_UOPS_L3_MISS_RE Retired load uops where data sources missed L3 but Supports PEBS and TIRED.LOCAL_DRAM serviced from local dram. DataLA. F0H 01H L2_TRANS.DEMAND_DATA_RD Demand data read requests that access L2 cache. F0H 02H L2_TRANS.RFO RFO requests that access L2 cache. F0H 04H L2_TRANS.CODE_RD L2 cache accesses when fetching instructions. F0H 08H L2_TRANS.ALL_PF Any MLC or L3 HW prefetch accessing L2, including rejects. F0H 10H L2_TRANS.L1D_WB L1D writebacks that access L2 cache. F0H 20H L2_TRANS.L2_FILL L2 fill requests that access L2 cache. F0H 40H L2_TRANS.L2_WB L2 writebacks that access L2 cache. F0H 80H L2_TRANS.ALL_REQUESTS Transactions accessing L2 pipe. F1H 01H L2_LINES_IN.I L2 cache lines in I state filling L2. Counting does not cover rejects. F1H 02H L2_LINES_IN.S L2 cache lines in S state filling L2. Counting does not cover rejects. F1H 04H L2_LINES_IN.E L2 cache lines in E state filling L2. Counting does not cover rejects. F1H 07H L2_LINES_IN.ALL L2 cache lines filling L2. Counting does not cover rejects. F2H 05H L2_LINES_OUT.DEMAND_CLEAN Clean L2 cache lines evicted by demand. Event Mask Mnemonic Description Comment Table 19-12 lists performance events supporting Intel TSX (see Section 18.3.6.5) and the events are applicable to processors based on Broadwell microarchitecture. Where Broadwell microarchitecture implements TSX-related event semantics that differ from Table 19-12, they are listed in Table 19-10. Table 19-10. Intel® TSX Performance Event Addendum in Processors Based on Broadwell Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description 54H 02H TX_MEM.ABORT_CAPACITY Number of times a transactional abort was signaled due to a data capacity limitation for transactional reads or writes. 19.7 Comment PERFORMANCE MONITORING EVENTS FOR THE 4TH GENERATION INTEL® CORE™ PROCESSORS 4th generation Intel® Core™ processors and Intel Xeon processor E3-1200 v3 product family are based on the Haswell microarchitecture. They support the architectural performance monitoring events listed in Table 19-1. Model-specific performance monitoring events in the processor core are listed in Table 19-11. The events in Table 19-64 Vol. 3B PERFORMANCE MONITORING EVENTS 19-11 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3CH, 06_45H and 06_46H. Table 19-12 lists performance events focused on supporting Intel TSX (see Section 18.3.6.5). Fixed counters in the core PMU support the architecture events defined in Table 19-2. Additional information on event specifics (e.g., derivative events using specific IA32_PERFEVTSELx modifiers, limitations, special notes and recommendations) can be found at https://software.intel.com/en-us/forums/softwaretuning-performance-optimization-platform-monitoring. Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors Event Num. Umask Value Event Mask Mnemonic Description 03H 02H LD_BLOCKS.STORE_FORWARD Loads blocked by overlapping with store buffer that cannot be forwarded. 03H 08H LD_BLOCKS.NO_SR The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 05H 01H MISALIGN_MEM_REF.LOADS Speculative cache-line split load uops dispatched to L1D. 05H 02H MISALIGN_MEM_REF.STORES Speculative cache-line split store-address uops dispatched to L1D. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS _ALIAS False dependencies in MOB due to partial compare on address. 08H 01H DTLB_LOAD_MISSES.MISS_CAUS Misses in all TLB levels that cause a page walk of any ES_A_WALK page size. 08H 02H DTLB_LOAD_MISSES.WALK_COM Completed page walks due to demand load misses PLETED_4K that caused 4K page walks in any TLB levels. 08H 04H DTLB_LOAD_MISSES.WALK_COM Completed page walks due to demand load misses PLETED_2M_4M that caused 2M/4M page walks in any TLB levels. 08H 0EH DTLB_LOAD_MISSES.WALK_COM Completed page walks in any TLB of any page size PLETED due to demand load misses. 08H 10H DTLB_LOAD_MISSES.WALK_DUR Cycle PMH is busy with a walk. ATION 08H 20H DTLB_LOAD_MISSES.STLB_HIT_ Load misses that missed DTLB but hit STLB (4K). 4K 08H 40H DTLB_LOAD_MISSES.STLB_HIT_ Load misses that missed DTLB but hit STLB (2M). 2M 08H 60H DTLB_LOAD_MISSES.STLB_HIT 08H 80H DTLB_LOAD_MISSES.PDE_CACH DTLB demand load misses with low part of linear-toE_MISS physical address translation missed. 0DH 03H INT_MISC.RECOVERY_CYCLES Cycles waiting to recover after Machine Clears except JEClear. Set Cmask= 1. Set Edge to count occurrences. 0EH 01H UOPS_ISSUED.ANY Increments each cycle the # of uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1 to count stalled cycles of this core. Set Cmask = 1, Inv = 1to count stalled cycles. 0EH 10H UOPS_ISSUED.FLAGS_MERGE Number of flags-merge uops allocated. Such uops add delay. 0EH 20H UOPS_ISSUED.SLOW_LEA Number of slow LEA or similar uops allocated. Such uop has 3 sources (for example, 2 sources + immediate) regardless of whether it is a result of LEA instruction or not. Comment Number of cache load STLB hits. No page walk. Vol. 3B 19-65 PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0EH 40H UOPS_ISSUED.SiNGLE_MUL Number of multiply packed/scalar single precision uops allocated. 24H 21H L2_RQSTS.DEMAND_DATA_RD_ Demand data read requests that missed L2, no MISS rejects. 24H 41H L2_RQSTS.DEMAND_DATA_RD_ Demand data read requests that hit L2 cache. HIT 24H E1H L2_RQSTS.ALL_DEMAND_DATA _RD Counts any demand and L1 HW prefetch data load requests to L2. 24H 42H L2_RQSTS.RFO_HIT Counts the number of store RFO requests that hit the L2 cache. 24H 22H L2_RQSTS.RFO_MISS Counts the number of store RFO requests that miss the L2 cache. 24H E2H L2_RQSTS.ALL_RFO Counts all L2 store RFO requests. 24H 44H L2_RQSTS.CODE_RD_HIT Number of instruction fetches that hit the L2 cache. 24H 24H L2_RQSTS.CODE_RD_MISS Number of instruction fetches that missed the L2 cache. 24H 27H L2_RQSTS.ALL_DEMAND_MISS Demand requests that miss L2 cache. 24H E7H L2_RQSTS.ALL_DEMAND_REFE RENCES Demand requests to L2 cache. 24H E4H L2_RQSTS.ALL_CODE_RD Counts all L2 code requests. 24H 50H L2_RQSTS.L2_PF_HIT Counts all L2 HW prefetcher requests that hit L2. 24H 30H L2_RQSTS.L2_PF_MISS Counts all L2 HW prefetcher requests that missed L2. 24H F8H L2_RQSTS.ALL_PF Counts all L2 HW prefetcher requests. Comment 24H 3FH L2_RQSTS.MISS All requests that missed L2. 24H FFH L2_RQSTS.REFERENCES All requests to L2 cache. 27H 50H L2_DEMAND_RQSTS.WB_HIT Not rejected writebacks that hit L2 cache. 2EH 4FH LONGEST_LAT_CACHE.REFEREN This event counts requests originating from the core See Table 19-1. CE that reference a cache line in the last level cache. 2EH 41H LONGEST_LAT_CACHE.MISS This event counts each cache miss condition for references to the last level cache. 3CH 00H CPU_CLK_UNHALTED.THREAD_ P Counts the number of thread cycles while the thread See Table 19-1. is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. 3CH 01H CPU_CLK_THREAD_UNHALTED. REF_XCLK Increments at the frequency of XCLK (100 MHz) when not halted. See Table 19-1. 48H 01H L1D_PEND_MISS.PENDING Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge =1 to count occurrences. Counter 2 only. 49H 01H DTLB_STORE_MISSES.MISS_CAU Miss in all TLB levels causes a page walk of any page SES_A_WALK size (4K/2M/4M/1G). 49H 02H DTLB_STORE_MISSES.WALK_CO Completed page walks due to store misses in one or MPLETED_4K more TLB levels of 4K page structure. 19-66 Vol. 3B See Table 19-1. Set Cmask = 1 to count cycles. PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value 49H 04H DTLB_STORE_MISSES.WALK_CO Completed page walks due to store misses in one or MPLETED_2M_4M more TLB levels of 2M/4M page structure. 49H 0EH DTLB_STORE_MISSES.WALK_CO Completed page walks due to store miss in any TLB MPLETED levels of any page size (4K/2M/4M/1G). 49H 10H DTLB_STORE_MISSES.WALK_DU Cycles PMH is busy with this walk. RATION 49H 20H DTLB_STORE_MISSES.STLB_HIT Store misses that missed DTLB but hit STLB (4K). _4K 49H 40H DTLB_STORE_MISSES.STLB_HIT Store misses that missed DTLB but hit STLB (2M). _2M 49H 60H DTLB_STORE_MISSES.STLB_HIT Store operations that miss the first TLB level but hit the second and do not cause page walks. 49H 80H DTLB_STORE_MISSES.PDE_CAC HE_MISS DTLB store misses with low part of linear-to-physical address translation missed. 4CH 01H LOAD_HIT_PRE.SW_PF Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch. 4CH 02H LOAD_HIT_PRE.HW_PF Non-SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. 51H 01H L1D.REPLACEMENT Counts the number of lines brought into the L1 data cache. 58H 04H MOVE_ELIMINATION.INT_NOT_E Number of integer move elimination candidate uops LIMINATED that were not eliminated. 58H 08H MOVE_ELIMINATION.SIMD_NOT_ Number of SIMD move elimination candidate uops ELIMINATED that were not eliminated. 58H 01H MOVE_ELIMINATION.INT_ELIMIN Number of integer move elimination candidate uops ATED that were eliminated. 58H 02H MOVE_ELIMINATION.SIMD_ELIMI Number of SIMD move elimination candidate uops NATED that were eliminated. 5CH 01H CPL_CYCLES.RING0 Unhalted core cycles when the thread is in ring 0. 5CH 02H CPL_CYCLES.RING123 Unhalted core cycles when the thread is not in ring 0. 5EH 01H RS_EVENTS.EMPTY_CYCLES Cycles the RS is empty for the thread. 60H 01H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding demand data read transactions DING.DEMAND_DATA_RD in SQ to uncore. Set Cmask=1 to count cycles. 60H 02H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding Demand code Read transactions Use only when HTT is off. DING.DEMAND_CODE_RD in SQ to uncore. Set Cmask=1 to count cycles. 60H 04H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding RFO store transactions in SQ to Use only when HTT is off. DING.DEMAND_RFO uncore. Set Cmask=1 to count cycles. 60H 08H OFFCORE_REQUESTS_OUTSTAN Offcore outstanding cacheable data read DING.ALL_DATA_RD transactions in SQ to uncore. Set Cmask=1 to count cycles. 63H 01H LOCK_CYCLES.SPLIT_LOCK_UC_ LOCK_DURATION Cycles in which the L1D and L2 are locked, due to a UC lock or split lock. 63H 02H LOCK_CYCLES.CACHE_LOCK_DU RATION Cycles in which the L1D is locked. 79H 02H IDQ.EMPTY Counts cycles the IDQ is empty. Event Mask Mnemonic Description Comment Use Edge to count transition. Use only when HTT is off. Use only when HTT is off. Vol. 3B 19-67 PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 79H 04H IDQ.MITE_UOPS Increment each cycle # of uops delivered to IDQ from Can combine Umask 04H MITE path. Set Cmask = 1 to count cycles. and 20H. 79H 08H IDQ.DSB_UOPS Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles. Can combine Umask 08H and 10H. 79H 10H IDQ.MS_DSB_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery. Can combine Umask 04H, 08H. 79H 20H IDQ.MS_MITE_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H, 08H. 79H 30H IDQ.MS_UOPS Increment each cycle # of uops delivered to IDQ from Can combine Umask 04H, MS by either DSB or MITE. Set Cmask = 1 to count 08H. cycles. 79H 18H IDQ.ALL_DSB_CYCLES_ANY_UO PS Counts cycles DSB is delivered at least one uops. Set Cmask = 1. 79H 18H IDQ.ALL_DSB_CYCLES_4_UOPS Counts cycles DSB is delivered four uops. Set Cmask = 4. 79H 24H IDQ.ALL_MITE_CYCLES_ANY_UO Counts cycles MITE is delivered at least one uop. Set PS Cmask = 1. 79H 24H IDQ.ALL_MITE_CYCLES_4_UOPS Counts cycles MITE is delivered four uops. Set Cmask = 4. 79H 3CH IDQ.MITE_ALL_UOPS # of uops delivered to IDQ from any path. 80H 02H ICACHE.MISSES Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. 85H 01H ITLB_MISSES.MISS_CAUSES_A_ WALK Misses in ITLB that causes a page walk of any page size. 85H 02H ITLB_MISSES.WALK_COMPLETE D_4K Completed page walks due to misses in ITLB 4K page entries. 85H 04H ITLB_MISSES.WALK_COMPLETE D_2M_4M Completed page walks due to misses in ITLB 2M/4M page entries. 85H 0EH ITLB_MISSES.WALK_COMPLETE D Completed page walks in ITLB of any page size. 85H 10H ITLB_MISSES.WALK_DURATION Cycle PMH is busy with a walk. 85H 20H ITLB_MISSES.STLB_HIT_4K ITLB misses that hit STLB (4K). 85H 40H ITLB_MISSES.STLB_HIT_2M ITLB misses that hit STLB (2M). 85H 60H ITLB_MISSES.STLB_HIT ITLB misses that hit STLB. No page walk. 87H 01H ILD_STALL.LCP Stalls caused by changing prefix length of the instruction. 87H 04H ILD_STALL.IQ_FULL Stall cycles due to IQ is full. 88H 01H BR_INST_EXEC.COND Qualify conditional near branch instructions executed, but not necessarily retired. Must combine with umask 40H, 80H. 88H 02H BR_INST_EXEC.DIRECT_JMP Qualify all unconditional near branch instructions excluding calls and indirect branches. Must combine with umask 80H. 88H 04H BR_INST_EXEC.INDIRECT_JMP_ NON_CALL_RET Qualify executed indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 19-68 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 88H 08H BR_INST_EXEC.RETURN_NEAR Qualify indirect near branches that have a return mnemonic. Must combine with umask 80H. 88H 10H BR_INST_EXEC.DIRECT_NEAR_C Qualify unconditional near call branch instructions, ALL excluding non-call branch, executed. 88H 20H BR_INST_EXEC.INDIRECT_NEAR Qualify indirect near calls, including both register and Must combine with _CALL memory indirect, executed. umask 80H. 88H 40H BR_INST_EXEC.NONTAKEN Qualify non-taken near branches executed. 88H 80H BR_INST_EXEC.TAKEN Qualify taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 88H FFH BR_INST_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 89H 01H BR_MISP_EXEC.COND Qualify conditional near branch instructions mispredicted. Must combine with umask 40H, 80H. 89H 04H BR_MISP_EXEC.INDIRECT_JMP_ NON_CALL_RET Qualify mispredicted indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 89H 08H BR_MISP_EXEC.RETURN_NEAR Qualify mispredicted indirect near branches that have a return mnemonic. Must combine with umask 80H. 89H 10H BR_MISP_EXEC.DIRECT_NEAR_C Qualify mispredicted unconditional near call branch ALL instructions, excluding non-call branch, executed. Must combine with umask 80H. 89H 20H BR_MISP_EXEC.INDIRECT_NEAR Qualify mispredicted indirect near calls, including _CALL both register and memory indirect, executed. Must combine with umask 80H. 89H 40H BR_MISP_EXEC.NONTAKEN Qualify mispredicted non-taken near branches executed. Applicable to umask 01H only. 89H 80H BR_MISP_EXEC.TAKEN Qualify mispredicted taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 89H FFH BR_MISP_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 9CH 01H IDQ_UOPS_NOT_DELIVERED.CO RE Count issue pipeline slots where no uop was delivered from the front end to the back end when there is no back-end stall. Use Cmask to qualify uop b/w. A1H 01H UOPS_EXECUTED_PORT.PORT_ 0 Cycles which a uop is dispatched on port 0 in this thread. Set AnyThread to count per core. A1H 02H UOPS_EXECUTED_PORT.PORT_ 1 Cycles which a uop is dispatched on port 1 in this thread. Set AnyThread to count per core. A1H 04H UOPS_EXECUTED_PORT.PORT_ 2 Cycles which a uop is dispatched on port 2 in this thread. Set AnyThread to count per core. A1H 08H UOPS_EXECUTED_PORT.PORT_ 3 Cycles which a uop is dispatched on port 3 in this thread. Set AnyThread to count per core. A1H 10H UOPS_EXECUTED_PORT.PORT_ 4 Cycles which a uop is dispatched on port 4 in this thread. Set AnyThread to count per core. A1H 20H UOPS_EXECUTED_PORT.PORT_ 5 Cycles which a uop is dispatched on port 5 in this thread. Set AnyThread to count per core. A1H 40H UOPS_EXECUTED_PORT.PORT_ 6 Cycles which a uop is dispatched on port 6 in this thread. Set AnyThread to count per core. A1H 80H UOPS_EXECUTED_PORT.PORT_ 7 Cycles which a uop is dispatched on port 7 in this thread Set AnyThread to count per core. Must combine with umask 80H. Applicable to umask 01H only. Vol. 3B 19-69 PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description A2H 01H RESOURCE_STALLS.ANY Cycles allocation is stalled due to resource related reason. A2H 04H RESOURCE_STALLS.RS Cycles stalled due to no eligible RS entry available. A2H 08H RESOURCE_STALLS.SB Cycles stalled due to no store buffers available (not including draining form sync). A2H 10H RESOURCE_STALLS.ROB Cycles stalled due to re-order buffer full. A3H 01H CYCLE_ACTIVITY.CYCLES_L2_PE Cycles with pending L2 miss loads. Set Cmask=2 to NDING count cycle. A3H 02H CYCLE_ACTIVITY.CYCLES_LDM_ PENDING A3H 05H CYCLE_ACTIVITY.STALLS_L2_PE Number of loads missed L2. NDING Use only when HTT is off. A3H 08H CYCLE_ACTIVITY.CYCLES_L1D_P Cycles with pending L1 data cache miss loads. Set ENDING Cmask=8 to count cycle. PMC2 only. A3H 0CH CYCLE_ACTIVITY.STALLS_L1D_P Execution stalls due to L1 data cache miss loads. Set PMC2 only. ENDING Cmask=0CH. A8H 01H LSD.UOPS Number of uops delivered by the LSD. AEH 01H ITLB.ITLB_FLUSH Counts the number of ITLB flushes, includes 4k/2M/4M pages. B0H 01H OFFCORE_REQUESTS.DEMAND_ Demand data read requests sent to uncore. DATA_RD Use only when HTT is off. B0H 02H OFFCORE_REQUESTS.DEMAND_ Demand code read requests sent to uncore. CODE_RD Use only when HTT is off. B0H 04H OFFCORE_REQUESTS.DEMAND_ Demand RFO read requests sent to uncore, including Use only when HTT is off. RFO regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DATA Data read requests sent to uncore (demand and _RD prefetch). B1H 02H UOPS_EXECUTED.CORE Counts total number of uops to be executed per-core Do not need to set ANY. each cycle. B7H 01H OFF_CORE_RESPONSE_0 See Table 18-28 or Table 18-29. Requires MSR 01A6H. BBH 01H OFF_CORE_RESPONSE_1 See Table 18-28 or Table 18-29. Requires MSR 01A7H. BCH 11H PAGE_WALKER_LOADS.DTLB_L1 Number of DTLB page walker loads that hit in the L1+FB. BCH 21H PAGE_WALKER_LOADS.ITLB_L1 BCH 12H PAGE_WALKER_LOADS.DTLB_L2 Number of DTLB page walker loads that hit in the L2. BCH 22H PAGE_WALKER_LOADS.ITLB_L2 BCH 14H PAGE_WALKER_LOADS.DTLB_L3 Number of DTLB page walker loads that hit in the L3. BCH 24H PAGE_WALKER_LOADS.ITLB_L3 Number of ITLB page walker loads that hit in the L3. BCH 18H PAGE_WALKER_LOADS.DTLB_M EMORY Number of DTLB page walker loads from memory. BCH 28H PAGE_WALKER_LOADS.ITLB_ME Number of ITLB page walker loads from memory. MORY BDH 01H TLB_FLUSH.DTLB_THREAD DTLB flush attempts of the thread-specific entries. BDH 20H TLB_FLUSH.STLB_ANY Count number of STLB flush attempts. 19-70 Vol. 3B Comment Use only when HTT is off. Cycles with pending memory loads. Set Cmask=2 to count cycle. Number of ITLB page walker loads that hit in the L1+FB. Number of ITLB page walker loads that hit in the L2. Use only when HTT is off. PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C0H 00H INST_RETIRED.ANY_P Number of instructions at retirement. See Table 19-1. C0H 01H INST_RETIRED.PREC_DIST Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution. PMC1 only. C1H 08H OTHER_ASSISTS.AVX_TO_SSE Number of transitions from AVX-256 to legacy SSE when penalty applicable. C1H 10H OTHER_ASSISTS.SSE_TO_AVX Number of transitions from SSE to AVX-256 when penalty applicable. C1H 40H OTHER_ASSISTS.ANY_WB_ASSI ST Number of microcode assists invoked by HW upon uop writeback. C2H 01H UOPS_RETIRED.ALL Counts the number of micro-ops retired. Use Supports PEBS and Cmask=1 and invert to count active cycles or stalled DataLA; use Any=1 for cycles. core granular. C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the number of retirement slots used each cycle. C3H 02H MACHINE_CLEARS.MEMORY_OR Counts the number of machine clears due to memory DERING order conflicts. C3H 04H MACHINE_CLEARS.SMC Number of self-modifying-code machine clears detected. C3H 20H MACHINE_CLEARS.MASKMOV Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. C4H 00H BR_INST_RETIRED.ALL_BRANC HES Branch instructions at retirement. C4H 01H BR_INST_RETIRED.CONDITIONA Counts the number of conditional branch instructions Supports PEBS. L retired. C4H 02H BR_INST_RETIRED.NEAR_CALL Direct and indirect near call instructions retired. Supports PEBS. C4H 04H BR_INST_RETIRED.ALL_BRANC HES Counts the number of branch instructions retired. Supports PEBS. C4H 08H BR_INST_RETIRED.NEAR_RETU RN Counts the number of near return instructions retired. Supports PEBS. C4H 10H BR_INST_RETIRED.NOT_TAKEN Counts the number of not taken branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAKE N Number of near taken branches retired. C4H 40H BR_INST_RETIRED.FAR_BRANC H Number of far branches retired. C5H 00H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted branch instructions at retirement. C5H 01H BR_MISP_RETIRED.CONDITIONA Mispredicted conditional branch instructions retired. L Supports PEBS. C5H 04H BR_MISP_RETIRED.ALL_BRANC HES Mispredicted macro branch instructions retired. Supports PEBS. C5H 20H BR_MISP_RETIRED.NEAR_TAKE N Number of near branch instructions retired that were taken but mispredicted. CAH 02H FP_ASSIST.X87_OUTPUT Number of X87 FP assists due to output values. CAH 04H FP_ASSIST.X87_INPUT Number of X87 FP assists due to input values. Supports PEBS. See Table 19-1. Supports PEBS. See Table 19-1. Vol. 3B 19-71 PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description CAH 08H FP_ASSIST.SIMD_OUTPUT Number of SIMD FP assists due to output values. CAH 10H FP_ASSIST.SIMD_INPUT Number of SIMD FP assists due to input values. CAH 1EH FP_ASSIST.ANY Cycles with any input/output SSE* or FP assists. CCH 20H ROB_MISC_EVENTS.LBR_INSER TS Count cases of saving new LBR records by hardware. CDH 01H MEM_TRANS_RETIRED.LOAD_L ATENCY Randomly sampled loads whose latency is above a Specify threshold in MSR user defined threshold. A small fraction of the overall 3F6H. loads are sampled due to randomization. D0H 11H MEM_UOPS_RETIRED.STLB_MIS Retired load uops that miss the STLB. S_LOADS Supports PEBS and DataLA. D0H 12H MEM_UOPS_RETIRED.STLB_MIS Retired store uops that miss the STLB. S_STORES Supports PEBS and DataLA. D0H 21H MEM_UOPS_RETIRED.LOCK_LOA Retired load uops with locked access. DS Supports PEBS and DataLA. D0H 41H MEM_UOPS_RETIRED.SPLIT_LO ADS Retired load uops that split across a cacheline boundary. Supports PEBS and DataLA. D0H 42H MEM_UOPS_RETIRED.SPLIT_ST ORES Retired store uops that split across a cacheline boundary. Supports PEBS and DataLA. D0H 81H MEM_UOPS_RETIRED.ALL_LOAD All retired load uops. S Supports PEBS and DataLA. D0H 82H MEM_UOPS_RETIRED.ALL_STOR All retired store uops. ES Supports PEBS and DataLA. D1H 01H MEM_LOAD_UOPS_RETIRED.L1_ Retired load uops with L1 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 02H MEM_LOAD_UOPS_RETIRED.L2_ Retired load uops with L2 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 04H MEM_LOAD_UOPS_RETIRED.L3_ Retired load uops with L3 cache hits as data sources. Supports PEBS and HIT DataLA. D1H 08H MEM_LOAD_UOPS_RETIRED.L1_ Retired load uops missed L1 cache as data sources. MISS Supports PEBS and DataLA. D1H 10H MEM_LOAD_UOPS_RETIRED.L2_ Retired load uops missed L2. Unknown data source MISS excluded. Supports PEBS and DataLA. D1H 20H MEM_LOAD_UOPS_RETIRED.L3_ Retired load uops missed L3. Excludes unknown data Supports PEBS and MISS source . DataLA. D1H 40H MEM_LOAD_UOPS_RETIRED.HIT Retired load uops which data sources were load uops Supports PEBS and _LFB missed L1 but hit FB due to preceding miss to the DataLA. same cache line with data not ready. D2H 01H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops which data sources were L3 hit RED.XSNP_MISS and cross-core snoop missed in on-pkg core cache. Supports PEBS and DataLA. D2H 02H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops which data sources were L3 and RED.XSNP_HIT cross-core snoop hits in on-pkg core cache. Supports PEBS and DataLA. D2H 04H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops which data sources were HitM responses from shared L3. RED.XSNP_HITM Supports PEBS and DataLA. D2H 08H MEM_LOAD_UOPS_L3_HIT_RETI Retired load uops which data sources were hits in L3 Supports PEBS and RED.XSNP_NONE without snoops required. DataLA. D3H 01H MEM_LOAD_UOPS_L3_MISS_RE Retired load uops which data sources missed L3 but TIRED.LOCAL_DRAM serviced from local dram. 19-72 Vol. 3B Comment Supports PEBS and DataLA. PERFORMANCE MONITORING EVENTS Table 19-11. Performance Events in the Processor Core of 4th Generation Intel® Core™ Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description E6H 1FH BACLEARS.ANY Number of front end re-steers due to BPU misprediction. F0H 01H L2_TRANS.DEMAND_DATA_RD Demand data read requests that access L2 cache. F0H 02H L2_TRANS.RFO RFO requests that access L2 cache. Comment F0H 04H L2_TRANS.CODE_RD L2 cache accesses when fetching instructions. F0H 08H L2_TRANS.ALL_PF Any MLC or L3 HW prefetch accessing L2, including rejects. F0H 10H L2_TRANS.L1D_WB L1D writebacks that access L2 cache. F0H 20H L2_TRANS.L2_FILL L2 fill requests that access L2 cache. F0H 40H L2_TRANS.L2_WB L2 writebacks that access L2 cache. F0H 80H L2_TRANS.ALL_REQUESTS Transactions accessing L2 pipe. F1H 01H L2_LINES_IN.I L2 cache lines in I state filling L2. Counting does not cover rejects. F1H 02H L2_LINES_IN.S L2 cache lines in S state filling L2. Counting does not cover rejects. F1H 04H L2_LINES_IN.E L2 cache lines in E state filling L2. Counting does not cover rejects. F1H 07H L2_LINES_IN.ALL L2 cache lines filling L2. Counting does not cover rejects. F2H 05H L2_LINES_OUT.DEMAND_CLEAN Clean L2 cache lines evicted by demand. F2H 06H L2_LINES_OUT.DEMAND_DIRTY Event Num. Umask Value Event Mask Mnemonic Description 54H 01H TX_MEM.ABORT_CONFLICT Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address. 54H 02H TX_MEM.ABORT_CAPACITY_W Number of times a transactional abort was signaled due RITE to a data capacity limitation for transactional writes. 54H 04H TX_MEM.ABORT_HLE_STORE_ Number of times a HLE transactional region aborted due TO_ELIDED_LOCK to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer. 54H 08H TX_MEM.ABORT_HLE_ELISION Number of times an HLE transactional execution aborted _BUFFER_NOT_EMPTY due to NoAllocatedElisionBuffer being non-zero. 54H 10H TX_MEM.ABORT_HLE_ELISION Number of times an HLE transactional execution aborted _BUFFER_MISMATCH due to XRELEASE lock not satisfying the address and value requirements in the elision buffer. 54H 20H TX_MEM.ABORT_HLE_ELISION Number of times an HLE transactional execution aborted _BUFFER_UNSUPPORTED_ALI due to an unsupported read alignment from the elision GNMENT buffer. 54H 40H TX_MEM.HLE_ELISION_BUFFE R_FULL Dirty L2 cache lines evicted by demand. Table 19-12. Intel TSX Performance Events in Processors Based on Haswell Microarchitecture Comment Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero. Vol. 3B 19-73 PERFORMANCE MONITORING EVENTS Table 19-12. Intel TSX Performance Events in Processors Based on Haswell Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description 5DH 01H TX_EXEC.MISC1 Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort. 5DH 02H TX_EXEC.MISC2 Counts the number of times a class of instructions (for example, vzeroupper) that may cause a transactional abort was executed inside a transactional region. 5DH 04H TX_EXEC.MISC3 Counts the number of times an instruction execution caused the transactional nest count supported to be exceeded. 5DH 08H TX_EXEC.MISC4 Counts the number of times an XBEGIN instruction was executed inside an HLE transactional region. 5DH 10H TX_EXEC.MISC5 Counts the number of times an instruction with HLEXACQUIRE semantic was executed inside an RTM transactional region. C8H 01H HLE_RETIRED.START Number of times an HLE execution started. C8H 02H HLE_RETIRED.COMMIT Number of times an HLE execution successfully committed. C8H 04H HLE_RETIRED.ABORTED Number of times an HLE execution aborted due to any reasons (multiple categories may count as one). Supports PEBS. C8H 08H HLE_RETIRED.ABORTED_MEM Number of times an HLE execution aborted due to various memory events (for example, read/write capacity and conflicts). C8H 10H HLE_RETIRED.ABORTED_TIME R Number of times an HLE execution aborted due to uncommon conditions. C8H 20H HLE_RETIRED.ABORTED_UNFR Number of times an HLE execution aborted due to HLEIENDLY unfriendly instructions. C8H 40H HLE_RETIRED.ABORTED_MEM TYPE C8H 80H HLE_RETIRED.ABORTED_EVEN Number of times an HLE execution aborted due to none TS of the previous 4 categories (for example, interrupts). IF HLE is supported. Number of times an HLE execution aborted due to incompatible memory type. C9H 01H RTM_RETIRED.START Number of times an RTM execution started. C9H 02H RTM_RETIRED.COMMIT Number of times an RTM execution successfully committed. C9H 04H RTM_RETIRED.ABORTED Number of times an RTM execution aborted due to any reasons (multiple categories may count as one). Supports PEBS. 19-74 Vol. 3B Comment IF RTM is supported. PERFORMANCE MONITORING EVENTS Table 19-12. Intel TSX Performance Events in Processors Based on Haswell Microarchitecture Event Num. Umask Value C9H 08H RTM_RETIRED.ABORTED_MEM Number of times an RTM execution aborted due to various memory events (for example, read/write capacity and conflicts). C9H 10H RTM_RETIRED.ABORTED_TIME Number of times an RTM execution aborted due to R uncommon conditions. C9H 20H RTM_RETIRED.ABORTED_UNF RIENDLY C9H 40H RTM_RETIRED.ABORTED_MEM Number of times an RTM execution aborted due to TYPE incompatible memory type. C9H 80H RTM_RETIRED.ABORTED_EVE NTS Event Mask Mnemonic Description Comment IF RTM is supported. Number of times an RTM execution aborted due to HLEunfriendly instructions. Number of times an RTM execution aborted due to none of the previous 4 categories (for example, interrupt). Model-specific performance monitoring events that are located in the uncore sub-system are implementation specific between different platforms using processors based on Haswell microarchitecture and with different DisplayFamily_DisplayModel signatures. Processors with CPUID signature of DisplayFamily_DisplayModel 06_3CH and 06_45H support performance events listed in Table 19-13. Table 19-13. Uncore Performance Events in the 4th Generation Intel® Core™ Processors Event Num.1 Umask Value 22H 01H UNC_CBO_XSNP_RESPONSE.M A snoop misses in some processor core. ISS 22H 02H UNC_CBO_XSNP_RESPONSE.I NVAL 22H 04H UNC_CBO_XSNP_RESPONSE.H A snoop hits a non-modified line in some processor IT core. 22H 08H UNC_CBO_XSNP_RESPONSE.H A snoop hits a modified line in some processor core. ITM 22H 10H UNC_CBO_XSNP_RESPONSE.I NVAL_M A snoop invalidates a modified line in some processor core. 22H 20H UNC_CBO_XSNP_RESPONSE.E XTERNAL_FILTER Filter on cross-core snoops initiated by this Cbox due to external snoop request. 22H 40H UNC_CBO_XSNP_RESPONSE.X Filter on cross-core snoops initiated by this Cbox due CORE_FILTER to processor core memory request. 22H 80H UNC_CBO_XSNP_RESPONSE.E VICTION_FILTER Filter on cross-core snoops initiated by this Cbox due to L3 eviction. 34H 01H UNC_CBO_CACHE_LOOKUP.M L3 lookup request that access cache and found line in M-state. 34H 06H UNC_CBO_CACHE_LOOKUP.ES 34H 08H UNC_CBO_CACHE_LOOKUP.I L3 lookup request that access cache and found line in Istate. 34H 10H UNC_CBO_CACHE_LOOKUP.RE AD_FILTER Filter on processor core initiated cacheable read requests. Must combine with at least one of 01H, 02H, 04H, 08H. Event Mask Mnemonic Description A snoop invalidates a non-modified line in some processor core. Comment Must combine with one of the umask values of 20H, 40H, 80H. Must combine with at least one of 01H, 02H, 04H, 08H, 10H. Must combine with one of the umask L3 lookup request that access cache and found line in E values of 10H, 20H, 40H, 80H. or S state. Vol. 3B 19-75 PERFORMANCE MONITORING EVENTS Table 19-13. Uncore Performance Events in the 4th Generation Intel® Core™ Processors (Contd.) Event Num.1 Umask Value 34H 20H UNC_CBO_CACHE_LOOKUP.WR Filter on processor core initiated cacheable write ITE_FILTER requests. Must combine with at least one of 01H, 02H, 04H, 08H. 34H 40H UNC_CBO_CACHE_LOOKUP.EX TSNP_FILTER 34H 80H UNC_CBO_CACHE_LOOKUP.AN Filter on any IRQ or IPQ initiated requests including Y_REQUEST_FILTER uncacheable, non-coherent requests. Must combine with at least one of 01H, 02H, 04H, 08H. 80H 01H UNC_ARB_TRK_OCCUPANCY.A Counts cycles weighted by the number of requests Counter 0 only. LL waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or L3. 81H 01H UNC_ARB_TRK_REQUEST.ALL Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or L3. 81H 20H UNC_ARB_TRK_REQUEST.WRI TES Counts the number of allocated write entries, include full, partial, and L3 evictions. 81H 80H UNC_ARB_TRK_REQUEST.EVIC Counts the number of L3 evictions allocated. TIONS 83H 01H UNC_ARB_COH_TRK_OCCUPA NCY.ALL Cycles weighted by number of requests pending in Coherency Tracker. 84H 01H UNC_ARB_COH_TRK_REQUES T.ALL Number of requests allocated in Coherency Tracker. Event Mask Mnemonic Description Comment Filter on external snoop requests. Must combine with at least one of 01H, 02H, 04H, 08H. Counter 0 only. NOTES: 1. The uncore events must be programmed using MSRs located in specific performance monitoring units in the uncore. UNC_CBO* events are supported using MSR_UNC_CBO* MSRs; UNC_ARB* events are supported using MSR_UNC_ARB*MSRs. 19.7.1 Performance Monitoring Events in the Processor Core of Intel Xeon Processor E5 v3 Family Model-specific performance monitoring events in the processor core that are applicable only to Intel Xeon processor E5 v3 family based on the Haswell-E microarchitecture, with CPUID signature of DisplayFamily_DisplayModel 06_3FH, are listed in Table 19-14. The performance events listed in Table 19-11 and Table 19-12 also apply Intel Xeon processor E5 v3 family, except that the OFF_CORE_RESPONSE_x event listed in Table 19-11 should reference Table 18-30. Uncore performance monitoring events for Intel Xeon Processor E5 v3 families are described in “Intel® Xeon® Processor E5 v3 Uncore Performance Monitoring Programming Reference Manual”. Table 19-14. Performance Events Applicable only to the Processor Core of Intel® Xeon® Processor E5 v3 Family Event Num. Umask Value D3H Event Mask Mnemonic Description Comment 04H MEM_LOAD_UOPS_L3_MISS_RE TIRED.REMOTE_DRAM Retired load uops whose data sources were remote DRAM (snoop not needed, Snoop Miss). Supports PEBS. D3H 10H MEM_LOAD_UOPS_L3_MISS_RE TIRED.REMOTE_HITM Retired load uops whose data sources were remote cache HITM. Supports PEBS. D3H 20H MEM_LOAD_UOPS_L3_MISS_RE TIRED.REMOTE_FWD Retired load uops whose data sources were forwards from a remote cache. Supports PEBS. 19-76 Vol. 3B PERFORMANCE MONITORING EVENTS 19.8 PERFORMANCE MONITORING EVENTS FOR 3RD GENERATION INTEL® CORE™ PROCESSORS 3rd generation Intel® Core™ processors and Intel Xeon processor E3-1200 v2 product family are based on Intel microarchitecture code name Ivy Bridge. They support architectural performance monitoring events listed in Table 19-1. Model-specific performance monitoring events in the processor core are listed in Table 19-15. The events in Table 19-15 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3AH. Fixed counters in the core PMU support the architecture events defined in Table 19-26. Additional information on event specifics (e.g. derivative events using specific IA32_PERFEVTSELx modifiers, limitations, special notes and recommendations) can be found at found at https://software.intel.com/enus/forums/software-tuning-performance-optimization-platform-monitoring. Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors Event Num. Umask Value Event Mask Mnemonic Description 03H 02H LD_BLOCKS.STORE_FORWARD Loads blocked by overlapping with store buffer that cannot be forwarded. 03H 08H LD_BLOCKS.NO_SR The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. 05H 01H MISALIGN_MEM_REF.LOADS Speculative cache-line split load uops dispatched to L1D. 05H 02H MISALIGN_MEM_REF.STORES Speculative cache-line split Store-address uops dispatched to L1D. 07H 01H LD_BLOCKS_PARTIAL.ADDRESS_ ALIAS False dependencies in MOB due to partial compare on address. 08H 81H DTLB_LOAD_MISSES.MISS_CAUSE Misses in all TLB levels that cause a page walk of S_A_WALK any page size from demand loads. 08H 82H DTLB_LOAD_MISSES.WALK_COM PLETED Misses in all TLB levels that caused page walk completed of any size by demand loads. 08H 84H DTLB_LOAD_MISSES.WALK_DUR ATION Cycle PMH is busy with a walk due to demand loads. 08H 88H DTLB_LOAD_MISSES.LARGE_PAG Page walk for a large page completed for Demand E_WALK_DURATION load. 0EH 01H UOPS_ISSUED.ANY Increments each cycle the # of Uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1to count stalled cycles of this core. 0EH 10H UOPS_ISSUED.FLAGS_MERGE Number of flags-merge uops allocated. Such uops adds delay. 0EH 20H UOPS_ISSUED.SLOW_LEA Number of slow LEA or similar uops allocated. Such uop has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not. 0EH 40H UOPS_ISSUED.SiNGLE_MUL Number of multiply packed/scalar single precision uops allocated. Counts number of X87 uops executed. 10H 01H FP_COMP_OPS_EXE.X87 10H 10H FP_COMP_OPS_EXE.SSE_FP_PAC Counts number of SSE* or AVX-128 double KED_DOUBLE precision FP packed uops executed. 10H 20H FP_COMP_OPS_EXE.SSE_FP_SCA Counts number of SSE* or AVX-128 single precision LAR_SINGLE FP scalar uops executed. 10H 40H FP_COMP_OPS_EXE.SSE_PACKED Counts number of SSE* or AVX-128 single precision SINGLE FP packed uops executed. Comment Set Cmask = 1, Inv = 1to count stalled cycles. Vol. 3B 19-77 PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value 10H 80H FP_COMP_OPS_EXE.SSE_SCALAR Counts number of SSE* or AVX-128 double _DOUBLE precision FP scalar uops executed. 11H 01H SIMD_FP_256.PACKED_SINGLE Counts 256-bit packed single-precision floatingpoint instructions. 11H 02H SIMD_FP_256.PACKED_DOUBLE Counts 256-bit packed double-precision floatingpoint instructions. 14H 01H ARITH.FPU_DIV_ACTIVE Cycles that the divider is active, includes INT and FP. Set 'edge =1, cmask=1' to count the number of divides. 24H 01H L2_RQSTS.DEMAND_DATA_RD_H Demand Data Read requests that hit L2 cache. IT 24H 03H L2_RQSTS.ALL_DEMAND_DATA_ RD Counts any demand and L1 HW prefetch data load requests to L2. 24H 04H L2_RQSTS.RFO_HITS Counts the number of store RFO requests that hit the L2 cache. 24H 08H L2_RQSTS.RFO_MISS Counts the number of store RFO requests that miss the L2 cache. 24H 0CH L2_RQSTS.ALL_RFO Counts all L2 store RFO requests. 24H 10H L2_RQSTS.CODE_RD_HIT Number of instruction fetches that hit the L2 cache. 24H 20H L2_RQSTS.CODE_RD_MISS Number of instruction fetches that missed the L2 cache. 24H 30H L2_RQSTS.ALL_CODE_RD Counts all L2 code requests. 24H 40H L2_RQSTS.PF_HIT Counts all L2 HW prefetcher requests that hit L2. 24H 80H L2_RQSTS.PF_MISS Counts all L2 HW prefetcher requests that missed L2. 24H C0H L2_RQSTS.ALL_PF Counts all L2 HW prefetcher requests. 27H 01H L2_STORE_LOCK_RQSTS.MISS RFOs that miss cache lines. 27H 08H L2_STORE_LOCK_RQSTS.HIT_M RFOs that hit cache lines in M state. 27H 0FH L2_STORE_LOCK_RQSTS.ALL RFOs that access cache lines in any state. 28H 01H L2_L1D_WB_RQSTS.MISS Not rejected writebacks that missed LLC. 28H 04H L2_L1D_WB_RQSTS.HIT_E Not rejected writebacks from L1D to L2 cache lines in E state. 28H 08H L2_L1D_WB_RQSTS.HIT_M Not rejected writebacks from L1D to L2 cache lines in M state. 28H 0FH L2_L1D_WB_RQSTS.ALL Not rejected writebacks from L1D to L2 cache lines in any state. 2EH 4FH LONGEST_LAT_CACHE.REFERENC This event counts requests originating from the E core that reference a cache line in the last level cache. See Table 19-1 2EH 41H LONGEST_LAT_CACHE.MISS This event counts each cache miss condition for references to the last level cache. See Table 19-1 3CH 00H CPU_CLK_UNHALTED.THREAD_P Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. See Table 19-1. 19-78 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value 3CH 01H CPU_CLK_THREAD_UNHALTED.R Increments at the frequency of XCLK (100 MHz) EF_XCLK when not halted. See Table 19-1. 48H 01H L1D_PEND_MISS.PENDING PMC2 only; Event Mask Mnemonic Description Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge =1 to count occurrences. 49H 01H DTLB_STORE_MISSES.MISS_CAUS Miss in all TLB levels causes a page walk of any ES_A_WALK page size (4K/2M/4M/1G). 49H 02H DTLB_STORE_MISSES.WALK_CO MPLETED 49H 04H DTLB_STORE_MISSES.WALK_DUR Cycles PMH is busy with this walk. ATION 49H 10H DTLB_STORE_MISSES.STLB_HIT Store operations that miss the first TLB level but hit the second and do not cause page walks. 4CH 01H LOAD_HIT_PRE.SW_PF Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch. 4CH 02H LOAD_HIT_PRE.HW_PF Non-SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. 51H 01H L1D.REPLACEMENT Counts the number of lines brought into the L1 data cache. 58H 04H MOVE_ELIMINATION.INT_NOT_EL Number of integer Move Elimination candidate uops IMINATED that were not eliminated. 58H 08H MOVE_ELIMINATION.SIMD_NOT_E Number of SIMD Move Elimination candidate uops LIMINATED that were not eliminated. 58H 01H MOVE_ELIMINATION.INT_ELIMINA Number of integer Move Elimination candidate uops TED that were eliminated. 58H 02H MOVE_ELIMINATION.SIMD_ELIMIN Number of SIMD Move Elimination candidate uops ATED that were eliminated. 5CH 01H CPL_CYCLES.RING0 Unhalted core cycles when the thread is in ring 0. 5CH 02H CPL_CYCLES.RING123 Unhalted core cycles when the thread is not in ring 0. 5EH 01H RS_EVENTS.EMPTY_CYCLES Cycles the RS is empty for the thread. 5FH 04H DTLB_LOAD_MISSES.STLB_HIT Counts load operations that missed 1st level DTLB but hit the 2nd level. 60H 01H OFFCORE_REQUESTS_OUTSTAN DING.DEMAND_DATA_RD Offcore outstanding Demand Data Read transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 02H OFFCORE_REQUESTS_OUTSTAN DING.DEMAND_CODE_RD Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 04H OFFCORE_REQUESTS_OUTSTAN DING.DEMAND_RFO Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 08H OFFCORE_REQUESTS_OUTSTAN DING.ALL_DATA_RD Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles. 63H 01H LOCK_CYCLES.SPLIT_LOCK_UC_L OCK_DURATION Cycles in which the L1D and L2 are locked, due to a UC lock or split lock. Comment Set Cmask = 1 to count cycles. Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G). Use Edge to count transition. Vol. 3B 19-79 PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value 63H 02H LOCK_CYCLES.CACHE_LOCK_DUR Cycles in which the L1D is locked. ATION 79H 02H IDQ.EMPTY Counts cycles the IDQ is empty. 79H 04H IDQ.MITE_UOPS Increment each cycle # of uops delivered to IDQ from MITE path. Set Cmask = 1 to count cycles. Can combine Umask 04H and 20H. 79H 08H IDQ.DSB_UOPS Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles. Can combine Umask 08H and 10H. 79H 10H IDQ.MS_DSB_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery. Can combine Umask 04H, 08H. 79H 20H IDQ.MS_MITE_UOPS Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H, 08H. 79H 30H IDQ.MS_UOPS Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H, 08H. 79H 18H IDQ.ALL_DSB_CYCLES_ANY_UOP S Counts cycles DSB is delivered at least one uops. Set Cmask = 1. 79H 18H IDQ.ALL_DSB_CYCLES_4_UOPS Counts cycles DSB is delivered four uops. Set Cmask = 4. 79H 24H IDQ.ALL_MITE_CYCLES_ANY_UOP Counts cycles MITE is delivered at least one uops. S Set Cmask = 1. 79H 24H IDQ.ALL_MITE_CYCLES_4_UOPS 79H 3CH IDQ.MITE_ALL_UOPS # of uops delivered to IDQ from any path. 80H 04H ICACHE.IFETCH_STALL Cycles where a code-fetch stalled due to L1 instruction-cache miss or an iTLB miss. 80H 02H ICACHE.MISSES Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. 85H 01H ITLB_MISSES.MISS_CAUSES_A_W Misses in all ITLB levels that cause page walks. ALK 85H 02H ITLB_MISSES.WALK_COMPLETED Misses in all ITLB levels that cause completed page walks. 85H 04H ITLB_MISSES.WALK_DURATION Cycle PMH is busy with a walk. 85H 10H ITLB_MISSES.STLB_HIT Number of cache load STLB hits. No page walk. 87H 01H ILD_STALL.LCP Stalls caused by changing prefix length of the instruction. 87H 04H ILD_STALL.IQ_FULL Stall cycles due to IQ is full. 88H 01H BR_INST_EXEC.COND Qualify conditional near branch instructions executed, but not necessarily retired. Must combine with umask 40H, 80H. 88H 02H BR_INST_EXEC.DIRECT_JMP Qualify all unconditional near branch instructions excluding calls and indirect branches. Must combine with umask 80H. 88H 04H BR_INST_EXEC.INDIRECT_JMP_N ON_CALL_RET Qualify executed indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 88H 08H BR_INST_EXEC.RETURN_NEAR Qualify indirect near branches that have a return mnemonic. Must combine with umask 80H. 19-80 Vol. 3B Event Mask Mnemonic Description Comment Counts cycles MITE is delivered four uops. Set Cmask = 4. PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 88H 10H BR_INST_EXEC.DIRECT_NEAR_C ALL Qualify unconditional near call branch instructions, excluding non-call branch, executed. Must combine with umask 80H. 88H 20H BR_INST_EXEC.INDIRECT_NEAR_ Qualify indirect near calls, including both register CALL and memory indirect, executed. Must combine with umask 80H. 88H 40H BR_INST_EXEC.NONTAKEN Qualify non-taken near branches executed. Applicable to umask 01H only. 88H 80H BR_INST_EXEC.TAKEN Qualify taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 88H FFH BR_INST_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 89H 01H BR_MISP_EXEC.COND Qualify conditional near branch instructions mispredicted. Must combine with umask 40H, 80H. 89H 04H BR_MISP_EXEC.INDIRECT_JMP_N ON_CALL_RET Qualify mispredicted indirect near branch instructions that are not calls or returns. Must combine with umask 80H. 89H 08H BR_MISP_EXEC.RETURN_NEAR Qualify mispredicted indirect near branches that have a return mnemonic. Must combine with umask 80H. 89H 10H BR_MISP_EXEC.DIRECT_NEAR_C ALL Qualify mispredicted unconditional near call branch instructions, excluding non-call branch, executed. Must combine with umask 80H. 89H 20H BR_MISP_EXEC.INDIRECT_NEAR_ Qualify mispredicted indirect near calls, including CALL both register and memory indirect, executed. Must combine with umask 80H. 89H 40H BR_MISP_EXEC.NONTAKEN Qualify mispredicted non-taken near branches executed. Applicable to umask 01H only. 89H 80H BR_MISP_EXEC.TAKEN Qualify mispredicted taken near branches executed. Must combine with 01H,02H, 04H, 08H, 10H, 20H. 89H FFH BR_MISP_EXEC.ALL_BRANCHES Counts all near executed branches (not necessarily retired). 9CH 01H IDQ_UOPS_NOT_DELIVERED.COR Count issue pipeline slots where no uop was E delivered from the front end to the back end when there is no back-end stall. A1H 01H UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 0. 0 A1H 02H UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 1. 1 A1H 0CH UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 2. 2 A1H 30H UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 3. 3 A1H 40H UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 4. 4 A1H 80H UOPS_DISPATCHED_PORT.PORT_ Cycles which a Uop is dispatched on port 5. 5 A2H 01H RESOURCE_STALLS.ANY Cycles Allocation is stalled due to Resource Related reason. A2H 04H RESOURCE_STALLS.RS Cycles stalled due to no eligible RS entry available. A2H 08H RESOURCE_STALLS.SB Cycles stalled due to no store buffers available (not including draining form sync). A2H 10H RESOURCE_STALLS.ROB Cycles stalled due to re-order buffer full. Use Cmask to qualify uop b/w. Vol. 3B 19-81 PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value A3H 01H CYCLE_ACTIVITY.CYCLES_L2_PEN Cycles with pending L2 miss loads. Set AnyThread DING to count per core. A3H 02H CYCLE_ACTIVITY.CYCLES_LDM_P ENDING Cycles with pending memory loads. Set AnyThread to count per core. Restricted to counters 03 when HTT is disabled. A3H 04H CYCLE_ACTIVITY.CYCLES_NO_EX ECUTE Cycles of dispatch stalls. Set AnyThread to count per core. Restricted to counters 03 when HTT is disabled. A3H 05H CYCLE_ACTIVITY.STALLS_L2_PEN Number of loads missed L2. DING Restricted to counters 03 when HTT is disabled. A3H 06H CYCLE_ACTIVITY.STALLS_LDM_P ENDING Restricted to counters 03 when HTT is disabled. A3H 08H CYCLE_ACTIVITY.CYCLES_L1D_PE Cycles with pending L1 cache miss loads. Set NDING AnyThread to count per core. PMC2 only. A3H 0CH CYCLE_ACTIVITY.STALLS_L1D_PE Execution stalls due to L1 data cache miss loads. NDING Set Cmask=0CH. PMC2 only. A8H 01H LSD.UOPS Number of Uops delivered by the LSD. ABH 01H DSB2MITE_SWITCHES.COUNT Number of DSB to MITE switches. ABH 02H DSB2MITE_SWITCHES.PENALTY_ CYCLES Cycles DSB to MITE switches caused delay. ACH 08H DSB_FILL.EXCEED_DSB_LINES DSB Fill encountered > 3 DSB lines. AEH 01H ITLB.ITLB_FLUSH Counts the number of ITLB flushes, includes 4k/2M/4M pages. B0H 01H OFFCORE_REQUESTS.DEMAND_D Demand data read requests sent to uncore. ATA_RD B0H 02H OFFCORE_REQUESTS.DEMAND_C Demand code read requests sent to uncore. ODE_RD B0H 04H OFFCORE_REQUESTS.DEMAND_R Demand RFO read requests sent to uncore, FO including regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DATA_ Data read requests sent to uncore (demand and RD prefetch). B1H 01H UOPS_EXECUTED.THREAD Counts total number of uops to be executed perthread each cycle. Set Cmask = 1, INV =1 to count stall cycles. B1H 02H UOPS_EXECUTED.CORE Counts total number of uops to be executed percore each cycle. Do not need to set ANY. B7H 01H OFFCORE_RESPONSE_0 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A6H. BBH 01H OFFCORE_RESPONSE_1 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A7H. BDH 01H TLB_FLUSH.DTLB_THREAD DTLB flush attempts of the thread-specific entries. BDH 20H TLB_FLUSH.STLB_ANY Count number of STLB flush attempts. Event Mask Mnemonic Description Comment C0H 00H INST_RETIRED.ANY_P Number of instructions at retirement. C0H 01H INST_RETIRED.PREC_DIST Precise instruction retired event with HW to reduce PMC1 only. effect of PEBS shadow in IP distribution. C1H 08H OTHER_ASSISTS.AVX_STORE Number of assists associated with 256-bit AVX store operations. 19-82 Vol. 3B See Table 19-1. PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C1H 10H OTHER_ASSISTS.AVX_TO_SSE Number of transitions from AVX-256 to legacy SSE when penalty applicable. C1H 20H OTHER_ASSISTS.SSE_TO_AVX Number of transitions from SSE to AVX-256 when penalty applicable. C1H 80H OTHER_ASSISTS.WB Number of times microcode assist is invoked by hardware upon uop writeback. C2H 01H UOPS_RETIRED.ALL Counts the number of micro-ops retired, Use Supports PEBS, use cmask=1 and invert to count active cycles or stalled Any=1 for core granular. cycles. C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the number of retirement slots used each cycle. C3H 02H MACHINE_CLEARS.MEMORY_ORD Counts the number of machine clears due to ERING memory order conflicts. C3H 04H MACHINE_CLEARS.SMC Number of self-modifying-code machine clears detected. C3H 20H MACHINE_CLEARS.MASKMOV Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. C4H 00H BR_INST_RETIRED.ALL_BRANCH ES Branch instructions at retirement. C4H 01H BR_INST_RETIRED.CONDITIONAL Counts the number of conditional branch instructions retired. Supports PEBS. C4H 02H BR_INST_RETIRED.NEAR_CALL Direct and indirect near call instructions retired. Supports PEBS. C4H 04H BR_INST_RETIRED.ALL_BRANCH ES Counts the number of branch instructions retired. Supports PEBS. C4H 08H BR_INST_RETIRED.NEAR_RETUR Counts the number of near return instructions N retired. C4H 10H BR_INST_RETIRED.NOT_TAKEN Counts the number of not taken branch instructions Supports PEBS. retired. C4H 20H BR_INST_RETIRED.NEAR_TAKEN Number of near taken branches retired. C4H 40H BR_INST_RETIRED.FAR_BRANCH Number of far branches retired. Supports PEBS. C5H 00H BR_MISP_RETIRED.ALL_BRANCH Mispredicted branch instructions at retirement. ES See Table 19-1. C5H 01H BR_MISP_RETIRED.CONDITIONAL Mispredicted conditional branch instructions retired. Supports PEBS. C5H 04H BR_MISP_RETIRED.ALL_BRANCH Mispredicted macro branch instructions retired. ES Supports PEBS. C5H 20H BR_MISP_RETIRED.NEAR_TAKEN Mispredicted taken branch instructions retired. Supports PEBS. CAH 02H FP_ASSIST.X87_OUTPUT Number of X87 FP assists due to output values. Supports PEBS. CAH 04H FP_ASSIST.X87_INPUT Number of X87 FP assists due to input values. Supports PEBS. CAH 08H FP_ASSIST.SIMD_OUTPUT Number of SIMD FP assists due to output values. Supports PEBS. CAH 10H FP_ASSIST.SIMD_INPUT Number of SIMD FP assists due to input values. CAH 1EH FP_ASSIST.ANY Cycles with any input/output SSE* or FP assists. CCH 20H ROB_MISC_EVENTS.LBR_INSERT S Count cases of saving new LBR records by hardware. Comment Supports PEBS. See Table 19-1. Supports PEBS. Supports PEBS. Vol. 3B 19-83 PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment CDH 01H MEM_TRANS_RETIRED.LOAD_LA TENCY Randomly sampled loads whose latency is above a user defined threshold. A small fraction of the overall loads are sampled due to randomization. Specify threshold in MSR 3F6H. PMC 3 only. CDH 02H MEM_TRANS_RETIRED.PRECISE_ Sample stores and collect precise store operation STORE via PEBS record. PMC3 only. See Section 18.3.4.4.3. D0H 11H MEM_UOPS_RETIRED.STLB_MISS Retired load uops that miss the STLB. _LOADS Supports PEBS. D0H 12H MEM_UOPS_RETIRED.STLB_MISS Retired store uops that miss the STLB. _STORES Supports PEBS. D0H 21H MEM_UOPS_RETIRED.LOCK_LOA DS Retired load uops with locked access. Supports PEBS. D0H 41H MEM_UOPS_RETIRED.SPLIT_LOA DS Retired load uops that split across a cacheline boundary. Supports PEBS. D0H 42H MEM_UOPS_RETIRED.SPLIT_STO RES Retired store uops that split across a cacheline boundary. Supports PEBS. D0H 81H MEM_UOPS_RETIRED.ALL_LOADS All retired load uops. Supports PEBS. D0H 82H MEM_UOPS_RETIRED.ALL_STORE All retired store uops. S Supports PEBS. D1H 01H MEM_LOAD_UOPS_RETIRED.L1_ HIT Retired load uops with L1 cache hits as data sources. Supports PEBS. D1H 02H MEM_LOAD_UOPS_RETIRED.L2_ HIT Retired load uops with L2 cache hits as data sources. Supports PEBS. D1H 04H MEM_LOAD_UOPS_RETIRED.LLC_ Retired load uops whose data source was LLC hit HIT with no snoop required. Supports PEBS. D1H 08H MEM_LOAD_UOPS_RETIRED.L1_ MISS Retired load uops whose data source followed an L1 miss. Supports PEBS. D1H 10H MEM_LOAD_UOPS_RETIRED.L2_ MISS Retired load uops that missed L2, excluding unknown sources. Supports PEBS. D1H 20H MEM_LOAD_UOPS_RETIRED.LLC_ Retired load uops whose data source is LLC miss. MISS D1H 40H MEM_LOAD_UOPS_RETIRED.HIT_ Retired load uops which data sources were load Supports PEBS. LFB uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. D2H 01H MEM_LOAD_UOPS_LLC_HIT_RETI Retired load uops whose data source was an onRED.XSNP_MISS package core cache LLC hit and cross-core snoop missed. Supports PEBS. D2H 02H MEM_LOAD_UOPS_LLC_HIT_RETI Retired load uops whose data source was an onRED.XSNP_HIT package LLC hit and cross-core snoop hits. Supports PEBS. D2H 04H MEM_LOAD_UOPS_LLC_HIT_RETI Retired load uops whose data source was an onRED.XSNP_HITM package core cache with HitM responses. Supports PEBS. D2H 08H MEM_LOAD_UOPS_LLC_HIT_RETI Retired load uops whose data source was LLC hit RED.XSNP_NONE with no snoop required. Supports PEBS. D3H 01H MEM_LOAD_UOPS_LLC_MISS_RE TIRED.LOCAL_DRAM Supports PEBS. Retired load uops whose data source was local memory (cross-socket snoop not needed or missed). E6H 1FH BACLEARS.ANY Number of front end re-steers due to BPU misprediction. 19-84 Vol. 3B Supports PEBS. Restricted to counters 03 when HTT is disabled. PERFORMANCE MONITORING EVENTS Table 19-15. Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors (Contd.) Event Num. Umask Value Event Mask Mnemonic Description F0H 01H L2_TRANS.DEMAND_DATA_RD Demand Data Read requests that access L2 cache. F0H 02H L2_TRANS.RFO RFO requests that access L2 cache. F0H 04H L2_TRANS.CODE_RD L2 cache accesses when fetching instructions. F0H 08H L2_TRANS.ALL_PF Any MLC or LLC HW prefetch accessing L2, including rejects. F0H 10H L2_TRANS.L1D_WB L1D writebacks that access L2 cache. F0H 20H L2_TRANS.L2_FILL L2 fill requests that access L2 cache. F0H 40H L2_TRANS.L2_WB L2 writebacks that access L2 cache. F0H 80H L2_TRANS.ALL_REQUESTS Transactions accessing L2 pipe. F1H 01H L2_LINES_IN.I L2 cache lines in I state filling L2. Counting does not cover rejects. F1H 02H L2_LINES_IN.S L2 cache lines in S state filling L2. Counting does not cover rejects. F1H 04H L2_LINES_IN.E L2 cache lines in E state filling L2. Counting does not cover rejects. F1H 07H L2_LINES_IN.ALL L2 cache lines filling L2. Counting does not cover rejects. F2H 01H L2_LINES_OUT.DEMAND_CLEAN Clean L2 cache lines evicted by demand. F2H 02H L2_LINES_OUT.DEMAND_DIRTY Dirty L2 cache lines evicted by demand. F2H 04H L2_LINES_OUT.PF_CLEAN Clean L2 cache lines evicted by the MLC prefetcher. F2H 08H L2_LINES_OUT.PF_DIRTY Dirty L2 cache lines evicted by the MLC prefetcher. F2H 0AH L2_LINES_OUT.DIRTY_ALL Dirty L2 cache lines filling the L2. 19.8.1 Comment Counting does not cover rejects. Performance Monitoring Events in the Processor Core of Intel Xeon Processor E5 v2 Family and Intel Xeon Processor E7 v2 Family Model-specific performance monitoring events in the processor core that are applicable only to Intel Xeon processor E5 v2 family and Intel Xeon processor E7 v2 family based on the Ivy Bridge-E microarchitecture, with CPUID signature of DisplayFamily_DisplayModel 06_3EH, are listed in Table 19-16. Table 19-16. Performance Events Applicable Only to the Processor Core of Intel® Xeon® Processor E5 v2 Family and Intel® Xeon® Processor E7 v2 Family Event Num. Umask Value D3H Event Mask Mnemonic Description Comment 03H MEM_LOAD_UOPS_LLC_MISS_R ETIRED.LOCAL_DRAM Retired load uops whose data sources were local DRAM (snoop not needed, Snoop Miss, or Snoop Hit data not forwarded). Supports PEBS. D3H 0CH MEM_LOAD_UOPS_LLC_MISS_R ETIRED.REMOTE_DRAM Retired load uops whose data source was remote DRAM (snoop not needed, Snoop Miss, or Snoop Hit data not forwarded). Supports PEBS. D3H 10H MEM_LOAD_UOPS_LLC_MISS_R ETIRED.REMOTE_HITM Retired load uops whose data sources were remote HITM. Supports PEBS. D3H 20H MEM_LOAD_UOPS_LLC_MISS_R ETIRED.REMOTE_FWD Retired load uops whose data sources were forwards from a remote cache. Supports PEBS. Vol. 3B 19-85 PERFORMANCE MONITORING EVENTS 19.9 PERFORMANCE MONITORING EVENTS FOR 2ND GENERATION INTEL® CORE™ I7-2XXX, INTEL® CORE™ I5-2XXX, INTEL® CORE™ I3-2XXX PROCESSOR SERIES 2nd generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series, and Intel Xeon processor E3-1200 product family are based on the Intel microarchitecture code name Sandy Bridge. They support architectural performance monitoring events listed in Table 19-1. Model-specific performance monitoring events in the processor core are listed in Table 19-17, Table 19-18, and Table 19-19. The events in Table 19-17 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_2AH and 06_2DH. The events in Table 19-18 apply to processors with CPUID signature 06_2AH. The events in Table 19-19 apply to processors with CPUID signature 06_2DH. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Additional information on event specifics (e.g. derivative events using specific IA32_PERFEVTSELx modifiers, limitations, special notes and recommendations) can be found at found at https://software.intel.com/enus/forums/software-tuning-performance-optimization-platform-monitoring. Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family Event Num. Umask Value Event Mask Mnemonic Description 03H 01H LD_BLOCKS.DATA_UNKNOWN Blocked loads due to store buffer blocks with unknown data. 03H 02H LD_BLOCKS.STORE_FORWARD Loads blocked by overlapping with store buffer that cannot be forwarded. 03H 08H LD_BLOCKS.NO_SR # of Split loads blocked due to resource not available. 03H 10H LD_BLOCKS.ALL_BLOCK Number of cases where any load is blocked but has no DCU miss. 05H 01H MISALIGN_MEM_REF.LOADS Speculative cache-line split load uops dispatched to L1D. 05H 02H MISALIGN_MEM_REF.STORES Speculative cache-line split Store-address uops dispatched to L1D. 07H 01H LD_BLOCKS_PARTIAL.ADDRES False dependencies in MOB due to partial compare S_ALIAS on address. 07H 08H LD_BLOCKS_PARTIAL.ALL_STA The number of times that load operations are _BLOCK temporarily blocked because of older stores, with addresses that are not yet known. A load operation may incur more than one block of this type. 08H 01H DTLB_LOAD_MISSES.MISS_CA USES_A_WALK 08H 02H DTLB_LOAD_MISSES.WALK_CO Misses in all TLB levels that caused page walk MPLETED completed of any size. 08H 04H DTLB_LOAD_MISSES.WALK_DU Cycle PMH is busy with a walk. RATION 08H 10H DTLB_LOAD_MISSES.STLB_HIT Number of cache load STLB hits. No page walk. 0DH 03H INT_MISC.RECOVERY_CYCLES Cycles waiting to recover after Machine Clears or JEClear. Set Cmask= 1. 0DH 40H INT_MISC.RAT_STALL_CYCLES Cycles RAT external stall is sent to IDQ for this thread. 19-86 Vol. 3B Comment Misses in all TLB levels that cause a page walk of any page size. Set Edge to count occurrences. PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 0EH 01H UOPS_ISSUED.ANY Increments each cycle the # of Uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1to count stalled cycles of this core. Set Cmask = 1, Inv = 1to count stalled cycles. 10H 01H FP_COMP_OPS_EXE.X87 Counts number of X87 uops executed. 10H 10H FP_COMP_OPS_EXE.SSE_FP_P Counts number of SSE* double precision FP packed ACKED_DOUBLE uops executed. 10H 20H FP_COMP_OPS_EXE.SSE_FP_S Counts number of SSE* single precision FP scalar CALAR_SINGLE uops executed. 10H 40H FP_COMP_OPS_EXE.SSE_PACK Counts number of SSE* single precision FP packed ED SINGLE uops executed. 10H 80H FP_COMP_OPS_EXE.SSE_SCAL Counts number of SSE* double precision FP scalar AR_DOUBLE uops executed. 11H 01H SIMD_FP_256.PACKED_SINGLE Counts 256-bit packed single-precision floatingpoint instructions. 11H 02H SIMD_FP_256.PACKED_DOUBL Counts 256-bit packed double-precision floatingE point instructions. 14H 01H ARITH.FPU_DIV_ACTIVE Cycles that the divider is active, includes INT and FP. Set 'edge =1, cmask=1' to count the number of divides. 17H 01H INSTS_WRITTEN_TO_IQ.INSTS Counts the number of instructions written into the IQ every cycle. 24H 01H L2_RQSTS.DEMAND_DATA_RD Demand Data Read requests that hit L2 cache. _HIT 24H 03H L2_RQSTS.ALL_DEMAND_DAT A_RD Counts any demand and L1 HW prefetch data load requests to L2. 24H 04H L2_RQSTS.RFO_HITS Counts the number of store RFO requests that hit the L2 cache. 24H 08H L2_RQSTS.RFO_MISS Counts the number of store RFO requests that miss the L2 cache. 24H 0CH L2_RQSTS.ALL_RFO Counts all L2 store RFO requests. 24H 10H L2_RQSTS.CODE_RD_HIT Number of instruction fetches that hit the L2 cache. 24H 20H L2_RQSTS.CODE_RD_MISS Number of instruction fetches that missed the L2 cache. 24H 30H L2_RQSTS.ALL_CODE_RD Counts all L2 code requests. 24H 40H L2_RQSTS.PF_HIT Requests from L2 Hardware prefetcher that hit L2. 24H 80H L2_RQSTS.PF_MISS Requests from L2 Hardware prefetcher that missed L2. 24H C0H L2_RQSTS.ALL_PF Any requests from L2 Hardware prefetchers. 27H 01H L2_STORE_LOCK_RQSTS.MISS RFOs that miss cache lines. 27H 04H L2_STORE_LOCK_RQSTS.HIT_ E RFOs that hit cache lines in E state. 27H 08H L2_STORE_LOCK_RQSTS.HIT_ M RFOs that hit cache lines in M state. 27H 0FH L2_STORE_LOCK_RQSTS.ALL RFOs that access cache lines in any state. Vol. 3B 19-87 PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 28H 01H L2_L1D_WB_RQSTS.MISS Not rejected writebacks from L1D to L2 cache lines that missed L2. 28H 02H L2_L1D_WB_RQSTS.HIT_S Not rejected writebacks from L1D to L2 cache lines in S state. 28H 04H L2_L1D_WB_RQSTS.HIT_E Not rejected writebacks from L1D to L2 cache lines in E state. 28H 08H L2_L1D_WB_RQSTS.HIT_M Not rejected writebacks from L1D to L2 cache lines in M state. Not rejected writebacks from L1D to L2 cache. Comment 28H 0FH L2_L1D_WB_RQSTS.ALL 2EH 4FH LONGEST_LAT_CACHE.REFERE This event counts requests originating from the NCE core that reference a cache line in the last level cache. See Table 19-1. 2EH 41H LONGEST_LAT_CACHE.MISS See Table 19-1. 3CH 00H CPU_CLK_UNHALTED.THREAD Counts the number of thread cycles while the _P thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. See Table 19-1. 3CH 01H CPU_CLK_THREAD_UNHALTED Increments at the frequency of XCLK (100 MHz) .REF_XCLK when not halted. See Table 19-1. 48H 01H L1D_PEND_MISS.PENDING PMC2 only; This event counts each cache miss condition for references to the last level cache. Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge =1 to count occurrences. 49H 01H DTLB_STORE_MISSES.MISS_CA Miss in all TLB levels causes a page walk of any page USES_A_WALK size (4K/2M/4M/1G). 49H 02H DTLB_STORE_MISSES.WALK_C Miss in all TLB levels causes a page walk that OMPLETED completes of any page size (4K/2M/4M/1G). 49H 04H DTLB_STORE_MISSES.WALK_D Cycles PMH is busy with this walk. URATION 49H 10H DTLB_STORE_MISSES.STLB_HI Store operations that miss the first TLB level but hit T the second and do not cause page walks. 4CH 01H LOAD_HIT_PRE.SW_PF Not SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch. 4CH 02H LOAD_HIT_PRE.HW_PF Not SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. 4EH 02H HW_PRE_REQ.DL1_MISS Hardware Prefetch requests that miss the L1D cache. A request is being counted each time it access the cache & miss it, including if a block is applicable or if hit the Fill Buffer for example. 51H 01H L1D.REPLACEMENT Counts the number of lines brought into the L1 data cache. 51H 02H L1D.ALLOCATED_IN_M Counts the number of allocations of modified L1D cache lines. 51H 04H L1D.EVICTION Counts the number of modified lines evicted from the L1 data cache due to replacement. 19-88 Vol. 3B Set Cmask = 1 to count cycles. This accounts for both L1 streamer and IP-based (IPP) HW prefetchers. PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 51H 08H L1D.ALL_M_REPLACEMENT Cache lines in M state evicted out of L1D due to Snoop HitM or dirty line replacement. 59H 20H PARTIAL_RAT_STALLS.FLAGS_ Increments the number of flags-merge uops in flight MERGE_UOP each cycle. Set Cmask = 1 to count cycles. 59H 40H PARTIAL_RAT_STALLS.SLOW_ LEA_WINDOW 59H 80H PARTIAL_RAT_STALLS.MUL_SI Number of Multiply packed/scalar single precision NGLE_UOP uops allocated. 5BH 0CH RESOURCE_STALLS2.ALL_FL_ EMPTY 5BH 0FH RESOURCE_STALLS2.ALL_PRF Cycles stalled due to control structures full for _CONTROL physical registers. 5BH 40H RESOURCE_STALLS2.BOB_FUL Cycles Allocator is stalled due Branch Order Buffer. L 5BH 4FH RESOURCE_STALLS2.OOO_RS RC Cycles stalled due to out of order resources full. 5CH 01H CPL_CYCLES.RING0 Unhalted core cycles when the thread is in ring 0. 5CH 02H CPL_CYCLES.RING123 Unhalted core cycles when the thread is not in ring 0. Cycles the RS is empty for the thread. Comment Cycles with at least one slow LEA uop allocated. Cycles stalled due to free list empty. PMC0-3 only regardless HTT. Use Edge to count transition. 5EH 01H RS_EVENTS.EMPTY_CYCLES 60H 01H OFFCORE_REQUESTS_OUTSTA Offcore outstanding Demand Data Read NDING.DEMAND_DATA_RD transactions in SQ to uncore. Set Cmask=1 to count cycles. 60H 04H OFFCORE_REQUESTS_OUTSTA Offcore outstanding RFO store transactions in SQ to NDING.DEMAND_RFO uncore. Set Cmask=1 to count cycles. 60H 08H OFFCORE_REQUESTS_OUTSTA Offcore outstanding cacheable data read NDING.ALL_DATA_RD transactions in SQ to uncore. Set Cmask=1 to count cycles. 63H 01H LOCK_CYCLES.SPLIT_LOCK_UC Cycles in which the L1D and L2 are locked, due to a _LOCK_DURATION UC lock or split lock. 63H 02H LOCK_CYCLES.CACHE_LOCK_D URATION Cycles in which the L1D is locked. 79H 02H IDQ.EMPTY Counts cycles the IDQ is empty. 79H 04H IDQ.MITE_UOPS Increment each cycle # of uops delivered to IDQ from MITE path. Set Cmask = 1 to count cycles. Can combine Umask 04H and 20H. 79H 08H IDQ.DSB_UOPS Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles. Can combine Umask 08H and 10H. 79H 10H IDQ.MS_DSB_UOPS Increment each cycle # of uops delivered to IDQ when MS busy by DSB. Set Cmask = 1 to count cycles MS is busy. Set Cmask=1 and Edge =1 to count MS activations. Can combine Umask 08H and 10H. 79H 20H IDQ.MS_MITE_UOPS Increment each cycle # of uops delivered to IDQ when MS is busy by MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H and 20H. Vol. 3B 19-89 PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 79H 30H IDQ.MS_UOPS Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles. Can combine Umask 04H, 08H and 30H. 80H 02H ICACHE.MISSES Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. 85H 01H ITLB_MISSES.MISS_CAUSES_A _WALK Misses in all ITLB levels that cause page walks. 85H 02H ITLB_MISSES.WALK_COMPLET ED Misses in all ITLB levels that cause completed page walks. 85H 04H ITLB_MISSES.WALK_DURATIO N Cycle PMH is busy with a walk. 85H 10H ITLB_MISSES.STLB_HIT Number of cache load STLB hits. No page walk. 87H 01H ILD_STALL.LCP Stalls caused by changing prefix length of the instruction. 87H 04H ILD_STALL.IQ_FULL Stall cycles due to IQ is full. 88H 41H BR_INST_EXEC.NONTAKEN_CO Not-taken macro conditional branches. NDITIONAL 88H 81H BR_INST_EXEC.TAKEN_CONDI TIONAL 88H 82H BR_INST_EXEC.TAKEN_DIRECT Taken speculative and retired conditional branches _JUMP excluding calls and indirects. 88H 84H BR_INST_EXEC.TAKEN_INDIRE Taken speculative and retired indirect branches CT_JUMP_NON_CALL_RET excluding calls and returns. 88H 88H BR_INST_EXEC.TAKEN_INDIRE Taken speculative and retired indirect branches that CT_NEAR_RETURN are returns. 88H 90H BR_INST_EXEC.TAKEN_DIRECT Taken speculative and retired direct near calls. _NEAR_CALL 88H A0H BR_INST_EXEC.TAKEN_INDIRE Taken speculative and retired indirect near calls. CT_NEAR_CALL 88H C1H BR_INST_EXEC.ALL_CONDITIO Speculative and retired conditional branches. NAL 88H C2H BR_INST_EXEC.ALL_DIRECT_J UMP Speculative and retired conditional branches excluding calls and indirects. 88H C4H BR_INST_EXEC.ALL_INDIRECT _JUMP_NON_CALL_RET Speculative and retired indirect branches excluding calls and returns. 88H C8H BR_INST_EXEC.ALL_INDIRECT _NEAR_RETURN Speculative and retired indirect branches that are returns. 88H D0H BR_INST_EXEC.ALL_NEAR_CA LL Speculative and retired direct near calls. 88H FFH BR_INST_EXEC.ALL_BRANCHE Speculative and retired branches. S 89H 41H BR_MISP_EXEC.NONTAKEN_CO Not-taken mispredicted macro conditional branches. NDITIONAL 89H 81H BR_MISP_EXEC.TAKEN_CONDI TIONAL 19-90 Vol. 3B Taken speculative and retired conditional branches. Taken speculative and retired mispredicted conditional branches. PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value 89H 84H BR_MISP_EXEC.TAKEN_INDIRE Taken speculative and retired mispredicted indirect CT_JUMP_NON_CALL_RET branches excluding calls and returns. 89H 88H BR_MISP_EXEC.TAKEN_RETUR Taken speculative and retired mispredicted indirect N_NEAR branches that are returns. 89H 90H BR_MISP_EXEC.TAKEN_DIRECT Taken speculative and retired mispredicted direct _NEAR_CALL near calls. 89H A0H BR_MISP_EXEC.TAKEN_INDIRE Taken speculative and retired mispredicted indirect CT_NEAR_CALL near calls. 89H C1H BR_MISP_EXEC.ALL_CONDITIO Speculative and retired mispredicted conditional NAL branches. 89H C4H BR_MISP_EXEC.ALL_INDIRECT _JUMP_NON_CALL_RET 89H D0H BR_MISP_EXEC.ALL_NEAR_CA Speculative and retired mispredicted direct near LL calls. 89H FFH BR_MISP_EXEC.ALL_BRANCHE Speculative and retired mispredicted branches. S 9CH 01H IDQ_UOPS_NOT_DELIVERED.C ORE A1H 01H UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 0. T_0 A1H 02H UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 1. T_1 A1H 0CH UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 2. T_2 A1H 30H UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 3. T_3 A1H 40H UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 4. T_4 A1H 80H UOPS_DISPATCHED_PORT.POR Cycles which a Uop is dispatched on port 5. T_5 A2H 01H RESOURCE_STALLS.ANY Cycles Allocation is stalled due to Resource Related reason. A2H 02H RESOURCE_STALLS.LB Counts the cycles of stall due to lack of load buffers. A2H 04H RESOURCE_STALLS.RS Cycles stalled due to no eligible RS entry available. A2H 08H RESOURCE_STALLS.SB Cycles stalled due to no store buffers available (not including draining form sync). A2H 10H RESOURCE_STALLS.ROB Cycles stalled due to re-order buffer full. A2H 20H RESOURCE_STALLS.FCSW Cycles stalled due to writing the FPU control word. A3H 01H CYCLE_ACTIVITY.CYCLES_L2_P Cycles with pending L2 miss loads. Set AnyThread ENDING to count per core. A3H 02H CYCLE_ACTIVITY.CYCLES_L1D_ Cycles with pending L1 cache miss loads. Set PENDING AnyThread to count per core. A3H 04H CYCLE_ACTIVITY.CYCLES_NO_ DISPATCH Event Mask Mnemonic Description Comment Speculative and retired mispredicted indirect branches excluding calls and returns. Count issue pipeline slots where no uop was delivered from the front end to the back end when there is no back-end stall. Use Cmask to qualify uop b/w. PMC2 only. Cycles of dispatch stalls. Set AnyThread to count per PMC0-3 only. core. Vol. 3B 19-91 PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value A3H 05H CYCLE_ACTIVITY.STALL_CYCLE S_L2_PENDING PMC0-3 only. A3H 06H CYCLE_ACTIVITY.STALL_CYCLE S_L1D_PENDING PMC2 only. Event Mask Mnemonic Description Comment A8H 01H LSD.UOPS Number of Uops delivered by the LSD. ABH 01H DSB2MITE_SWITCHES.COUNT Number of DSB to MITE switches. ABH 02H DSB2MITE_SWITCHES.PENALT Y_CYCLES Cycles DSB to MITE switches caused delay. ACH 02H DSB_FILL.OTHER_CANCEL Cases of cancelling valid DSB fill not because of exceeding way limit. ACH 08H DSB_FILL.EXCEED_DSB_LINES DSB Fill encountered > 3 DSB lines. AEH 01H ITLB.ITLB_FLUSH Counts the number of ITLB flushes; includes 4k/2M/4M pages. B0H 01H OFFCORE_REQUESTS.DEMAND Demand data read requests sent to uncore. _DATA_RD B0H 04H OFFCORE_REQUESTS.DEMAND Demand RFO read requests sent to uncore, including _RFO regular RFOs, locks, ItoM. B0H 08H OFFCORE_REQUESTS.ALL_DAT Data read requests sent to uncore (demand and A_RD prefetch). B1H 01H UOPS_DISPATCHED.THREAD Counts total number of uops to be dispatched perthread each cycle. Set Cmask = 1, INV =1 to count stall cycles. PMC0-3 only regardless HTT. B1H 02H UOPS_DISPATCHED.CORE Counts total number of uops to be dispatched percore each cycle. Do not need to set ANY. B2H 01H OFFCORE_REQUESTS_BUFFER Offcore requests buffer cannot take more entries .SQ_FULL for this thread core. B6H 01H AGU_BYPASS_CANCEL.COUNT Counts executed load operations with all the following traits: 1. Addressing of the format [base + offset], 2. The offset is between 1 and 2047, 3. The address specified in the base register is in one page and the address [base+offset] is in another page. B7H 01H OFF_CORE_RESPONSE_0 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A6H. BBH 01H OFF_CORE_RESPONSE_1 See Section 18.3.4.5, “Off-core Response Performance Monitoring”. Requires MSR 01A7H. BDH 01H TLB_FLUSH.DTLB_THREAD DTLB flush attempts of the thread-specific entries. BDH 20H TLB_FLUSH.STLB_ANY Count number of STLB flush attempts. BFH 05H L1D_BLOCKS.BANK_CONFLICT Cycles when dispatched loads are cancelled due to _CYCLES L1D bank conflicts with other load ports. Cmask=1. C0H 00H INST_RETIRED.ANY_P Number of instructions at retirement. See Table 19-1. C0H 01H INST_RETIRED.PREC_DIST Precise instruction retired event with HW to reduce PMC1 only; must quiesce effect of PEBS shadow in IP distribution. other PMCs. C1H 02H OTHER_ASSISTS.ITLB_MISS_R ETIRED Instructions that experienced an ITLB miss. 19-92 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description C1H 08H OTHER_ASSISTS.AVX_STORE Number of assists associated with 256-bit AVX store operations. C1H 10H OTHER_ASSISTS.AVX_TO_SSE Number of transitions from AVX-256 to legacy SSE when penalty applicable. C1H 20H OTHER_ASSISTS.SSE_TO_AVX Number of transitions from SSE to AVX-256 when penalty applicable. C2H 01H UOPS_RETIRED.ALL C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the number of retirement slots used each cycle. C3H 02H MACHINE_CLEARS.MEMORY_O Counts the number of machine clears due to RDERING memory order conflicts. C3H 04H MACHINE_CLEARS.SMC Counts the number of times that a program writes to a code section. C3H 20H MACHINE_CLEARS.MASKMOV Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. C4H 00H BR_INST_RETIRED.ALL_BRAN CHES Branch instructions at retirement. C4H 01H BR_INST_RETIRED.CONDITION Counts the number of conditional branch AL instructions retired. Supports PEBS. C4H 02H BR_INST_RETIRED.NEAR_CALL Direct and indirect near call instructions retired. Supports PEBS. C4H 04H BR_INST_RETIRED.ALL_BRAN CHES Counts the number of branch instructions retired. Supports PEBS. C4H 08H BR_INST_RETIRED.NEAR_RET URN Counts the number of near return instructions retired. Supports PEBS. C4H 10H BR_INST_RETIRED.NOT_TAKE N Counts the number of not taken branch instructions retired. C4H 20H BR_INST_RETIRED.NEAR_TAK EN Number of near taken branches retired. C4H 40H BR_INST_RETIRED.FAR_BRAN CH Number of far branches retired. C5H 00H BR_MISP_RETIRED.ALL_BRAN CHES Mispredicted branch instructions at retirement. C5H 01H BR_MISP_RETIRED.CONDITION Mispredicted conditional branch instructions retired. Supports PEBS. AL C5H 02H BR_MISP_RETIRED.NEAR_CAL L Direct and indirect mispredicted near call instructions retired. Supports PEBS. C5H 04H BR_MISP_RETIRED.ALL_BRAN CHES Mispredicted macro branch instructions retired. Supports PEBS. C5H 10H BR_MISP_RETIRED.NOT_TAKE N Mispredicted not taken branch instructions retired. Supports PEBS. C5H 20H BR_MISP_RETIRED.TAKEN Mispredicted taken branch instructions retired. Supports PEBS. CAH 02H FP_ASSIST.X87_OUTPUT Number of X87 assists due to output value. Comment Counts the number of micro-ops retired, Use Supports PEBS. cmask=1 and invert to count active cycles or stalled cycles. Supports PEBS. See Table 19-1. Supports PEBS. See Table 19-1. Vol. 3B 19-93 PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description CAH 04H FP_ASSIST.X87_INPUT Number of X87 assists due to input value. CAH 08H FP_ASSIST.SIMD_OUTPUT Number of SIMD FP assists due to output values. CAH 10H FP_ASSIST.SIMD_INPUT Number of SIMD FP assists due to input values. Comment CAH 1EH FP_ASSIST.ANY Cycles with any input/output SSE* or FP assists. CCH 20H ROB_MISC_EVENTS.LBR_INSE RTS Count cases of saving new LBR records by hardware. CDH 01H MEM_TRANS_RETIRED.LOAD_ LATENCY Randomly sampled loads whose latency is above a user defined threshold. A small fraction of the overall loads are sampled due to randomization. PMC3 only. CDH 02H MEM_TRANS_RETIRED.PRECIS Sample stores and collect precise store operation E_STORE via PEBS record. PMC3 only. See Section 18.3.4.4.3. D0H 11H MEM_UOPS_RETIRED.STLB_MI Retired load uops that miss the STLB. SS_LOADS Supports PEBS. PMC0-3 only regardless HTT. D0H 12H MEM_UOPS_RETIRED.STLB_MI Retired store uops that miss the STLB. SS_STORES Supports PEBS. PMC0-3 only regardless HTT. D0H 21H MEM_UOPS_RETIRED.LOCK_LO Retired load uops with locked access. ADS Supports PEBS. PMC0-3 only regardless HTT. D0H 41H MEM_UOPS_RETIRED.SPLIT_L OADS Retired load uops that split across a cacheline boundary. Supports PEBS. PMC0-3 only regardless HTT. D0H 42H MEM_UOPS_RETIRED.SPLIT_S TORES Retired store uops that split across a cacheline boundary. Supports PEBS. PMC0-3 only regardless HTT. D0H 81H MEM_UOPS_RETIRED.ALL_LOA All retired load uops. DS Supports PEBS. PMC0-3 only regardless HTT. D0H 82H MEM_UOPS_RETIRED.ALL_STO All retired store uops. RES Supports PEBS. PMC0-3 only regardless HTT. D1H 01H MEM_LOAD_UOPS_RETIRED.L 1_HIT Retired load uops with L1 cache hits as data sources. Supports PEBS. PMC0-3 only regardless HTT. D1H 02H MEM_LOAD_UOPS_RETIRED.L 2_HIT Retired load uops with L2 cache hits as data sources. Supports PEBS. D1H 04H MEM_LOAD_UOPS_RETIRED.LL Retired load uops which data sources were data hits Supports PEBS. C_HIT in LLC without snoops required. D1H 20H MEM_LOAD_UOPS_RETIRED.LL Retired load uops which data sources were data C_MISS missed LLC (excluding unknown data source). Supports PEBS. D1H 40H MEM_LOAD_UOPS_RETIRED.HI Retired load uops which data sources were load T_LFB uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. Supports PEBS. D2H 01H MEM_LOAD_UOPS_LLC_HIT_R ETIRED.XSNP_MISS Retired load uops whose data source was an onpackage core cache LLC hit and cross-core snoop missed. Supports PEBS. D2H 02H MEM_LOAD_UOPS_LLC_HIT_R ETIRED.XSNP_HIT Retired load uops whose data source was an onpackage LLC hit and cross-core snoop hits. Supports PEBS. D2H 04H MEM_LOAD_UOPS_LLC_HIT_R ETIRED.XSNP_HITM Retired load uops whose data source was an onpackage core cache with HitM responses. Supports PEBS. D2H 08H MEM_LOAD_UOPS_LLC_HIT_R ETIRED.XSNP_NONE Retired load uops whose data source was LLC hit with no snoop required. Supports PEBS. 19-94 Vol. 3B Specify threshold in MSR 3F6H. PERFORMANCE MONITORING EVENTS Table 19-17. Performance Events In the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5 Family (Contd.) Event Num. Umask Value Event Mask Mnemonic Description E6H 01H BACLEARS.ANY Counts the number of times the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end. F0H 01H L2_TRANS.DEMAND_DATA_RD Demand Data Read requests that access L2 cache. F0H 02H L2_TRANS.RFO RFO requests that access L2 cache. F0H 04H L2_TRANS.CODE_RD L2 cache accesses when fetching instructions. F0H 08H L2_TRANS.ALL_PF L2 or LLC HW prefetches that access L2 cache. F0H 10H L2_TRANS.L1D_WB L1D writebacks that access L2 cache. F0H 20H L2_TRANS.L2_FILL L2 fill requests that access L2 cache. F0H 40H L2_TRANS.L2_WB L2 writebacks that access L2 cache. F0H 80H L2_TRANS.ALL_REQUESTS Transactions accessing L2 pipe. F1H 01H L2_LINES_IN.I L2 cache lines in I state filling L2. Counting does not cover rejects. F1H 02H L2_LINES_IN.S L2 cache lines in S state filling L2. Counting does not cover rejects. F1H 04H L2_LINES_IN.E L2 cache lines in E state filling L2. Counting does not cover rejects. F1H 07H L2_LINES_IN.ALL L2 cache lines filling L2. Counting does not cover rejects. F2H 01H L2_LINES_OUT.DEMAND_CLEA Clean L2 cache lines evicted by demand. N F2H 02H L2_LINES_OUT.DEMAND_DIRT Y Comment Including rejects. Dirty L2 cache lines evicted by demand. F2H 04H L2_LINES_OUT.PF_CLEAN Clean L2 cache lines evicted by L2 prefetch. F2H 08H L2_LINES_OUT.PF_DIRTY Dirty L2 cache lines evicted by L2 prefetch. F2H 0AH L2_LINES_OUT.DIRTY_ALL Dirty L2 cache lines filling the L2. F4H 10H SQ_MISC.SPLIT_LOCK Split locks in SQ. Counting does not cover rejects. Non-architecture performance monitoring events in the processor core that are applicable only to Intel processors with CPUID signature of DisplayFamily_DisplayModel 06_2AH are listed in Table 19-18. Table 19-18. Performance Events applicable only to the Processor core for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series Event Num. Umask Value D2H 01H MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were LLC hit and Supports PEBS. PMC0ETIRED.XSNP_MISS cross-core snoop missed in on-pkg core cache. 3 only regardless HTT. D2H 02H MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were LLC and ETIRED.XSNP_HIT cross-core snoop hits in on-pkg core cache. Supports PEBS. D2H 04H MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were HitM ETIRED.XSNP_HITM responses from shared LLC. Supports PEBS. D2H 08H MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were hits in LLC ETIRED.XSNP_NONE without snoops required. Supports PEBS. Event Mask Mnemonic Description Comment Vol. 3B 19-95 PERFORMANCE MONITORING EVENTS Table 19-18. Performance Events applicable only to the Processor core for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series (Contd.) Event Num. Umask Value D4H 02H B7H/BBH 01H 19-96 Vol. 3B Event Mask Mnemonic Description Comment MEM_LOAD_UOPS_MISC_RETI RED.LLC_MISS Retired load uops with unknown information as data source in cache serviced the load. Supports PEBS. PMC03 only regardless HTT. OFF_CORE_RESPONSE_N Sub-events of OFF_CORE_RESPONSE_N (suffix N = 0, 1) programmed using MSR 01A6H/01A7H with values shown in the comment column. OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT_N 10003C0244H OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0244H OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT.SNOOP_MISS_N 2003C0244H OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT.MISS_DRAM_N 300400244H OFFCORE_RESPONSE.ALL_DATA_RD.LLC_HIT.ANY_RESPONSE_N 3F803C0091H OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.DRAM_N 300400091H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_HIT.ANY_RESPONSE_N 3F803C0240H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0240H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_HIT.HITM_OTHER_CORE_N 10003C0240H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0240H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_HIT.SNOOP_MISS_N 2003C0240H OFFCORE_RESPONSE.ALL_PF_CODE_RD.LLC_MISS.DRAM_N 300400240H OFFCORE_RESPONSE.ALL_PF_DATA_RD.LLC_MISS.DRAM_N 300400090H OFFCORE_RESPONSE.ALL_PF_RFO.LLC_HIT.ANY_RESPONSE_N 3F803C0120H OFFCORE_RESPONSE.ALL_PF_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0120H OFFCORE_RESPONSE.ALL_PF_RFO.LLC_HIT.HITM_OTHER_CORE_N 10003C0120H OFFCORE_RESPONSE.ALL_PF_RfO.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0120H OFFCORE_RESPONSE.ALL_PF_RFO.LLC_HIT.SNOOP_MISS_N 2003C0120H OFFCORE_RESPONSE.ALL_PF_RFO.LLC_MISS.DRAM_N 300400120H OFFCORE_RESPONSE.ALL_READS.LLC_MISS.DRAM_N 3004003F7H OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.ANY_RESPONSE_N 3F803C0122H OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0122H OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HITM_OTHER_CORE_N 10003C0122H OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0122H OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.SNOOP_MISS_N 2003C0122H OFFCORE_RESPONSE.ALL_RFO.LLC_MISS.DRAM_N 300400122H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.HITM_OTHER_CORE_N 10003C0004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.SNOOP_MISS_N 2003C0004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.DRAM_N 300400004H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.DRAM_N 300400001H OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.ANY_RESPONSE_N 3F803C0002H OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0002H OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.HITM_OTHER_CORE_N 10003C0002H PERFORMANCE MONITORING EVENTS Table 19-18. Performance Events applicable only to the Processor core for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0002H OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.SNOOP_MISS_N 2003C0002H OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.DRAM_N 300400002H OFFCORE_RESPONSE.OTHER.ANY_RESPONSE_N 18000H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0040H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_HIT.HITM_OTHER_CORE_N 10003C0040H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0040H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_HIT.SNOOP_MISS_N 2003C0040H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_MISS.DRAM_N 300400040H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.DRAM_N 300400010H OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.ANY_RESPONSE_N 3F803C0020H OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0020H OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.HITM_OTHER_CORE_N 10003C0020H OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0020H OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.SNOOP_MISS_N 2003C0020H OFFCORE_RESPONSE.PF_L2_RFO.LLC_MISS.DRAM_N 300400020H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0200H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.HITM_OTHER_CORE_N 10003C0200H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0200H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.SNOOP_MISS_N 2003C0200H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_MISS.DRAM_N 300400200H OFFCORE_RESPONSE.PF_LLC_DATA_RD.LLC_MISS.DRAM_N 300400080H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.ANY_RESPONSE_N 3F803C0100H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD_N 4003C0100H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.HITM_OTHER_CORE_N 10003C0100H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.NO_SNOOP_NEEDED_N 1003C0100H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.SNOOP_MISS_N 2003C0100H OFFCORE_RESPONSE.PF_LLC_RFO.LLC_MISS.DRAM_N 300400100H Non-architecture performance monitoring events in the processor core that are applicable only to Intel Xeon processor E5 family (and Intel Core i7-3930 processor) based on Intel microarchitecture code name Sandy Bridge, with CPUID signature of DisplayFamily_DisplayModel 06_2DH, are listed in Table 19-19. Table 19-19. Performance Events Applicable only to the Processor Core of Intel® Xeon® Processor E5 Family Event Num. Umask Value Event Mask Mnemonic Description CDH 01H MEM_TRANS_RETIRED.LOAD_ LATENCY Additional Configuration: Disable BL bypass and direct2core, and if the memory is remotely homed. The count is not reliable If the memory is locally homed. D1H 04H MEM_LOAD_UOPS_RETIRED.LL Additional Configuration: Disable BL bypass. Supports PEBS. C_HIT Comment Vol. 3B 19-97 PERFORMANCE MONITORING EVENTS Table 19-19. Performance Events Applicable only to the Processor Core of Intel® Xeon® Processor E5 Family Event Num. Umask Value D1H 20H MEM_LOAD_UOPS_RETIRED.LL Additional Configuration: Disable BL bypass and direct2core. Supports PEBS. C_MISS D2H 01H MEM_LOAD_UOPS_LLC_HIT_R Additional Configuration: Disable bypass. Supports PEBS. ETIRED.XSNP_MISS D2H 02H MEM_LOAD_UOPS_LLC_HIT_R Additional Configuration: Disable bypass. Supports PEBS. ETIRED.XSNP_HIT D2H 04H MEM_LOAD_UOPS_LLC_HIT_R Additional Configuration: Disable bypass. Supports PEBS. ETIRED.XSNP_HITM D2H 08H MEM_LOAD_UOPS_LLC_HIT_R Additional Configuration: Disable bypass. Supports PEBS. ETIRED.XSNP_NONE D3H 01H MEM_LOAD_UOPS_LLC_MISS_ RETIRED.LOCAL_DRAM Retired load uops which data sources were data missed LLC but serviced by local DRAM. Supports PEBS. Disable BL bypass and direct2core (see MSR 3C9H). D3H 04H MEM_LOAD_UOPS_LLC_MISS_ RETIRED.REMOTE_DRAM Retired load uops which data sources were data missed LLC but serviced by remote DRAM. Supports PEBS. Disable BL bypass and direct2core (see MSR 3C9H). OFF_CORE_RESPONSE_N Sub-events of OFF_CORE_RESPONSE_N (suffix N = 0, 1) programmed using MSR 01A6H/01A7H with values shown in the comment column. B7H/BB 01H H Event Mask Mnemonic Description OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.ANY_RESPONSE_N Comment 3FFFC00004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.LOCAL_DRAM_N 600400004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.REMOTE_DRAM_N 67F800004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.REMOTE_HIT_FWD_N 87F800004H OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.REMOTE_HITM_N 107FC00004H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.ANY_DRAM_N 67FC00001H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.ANY_RESPONSE_N 3F803C0001H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.LOCAL_DRAM_N 600400001H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.REMOTE_DRAM_N 67F800001H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.REMOTE_HIT_FWD_N 87F800001H OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.REMOTE_HITM_N 107FC00001H OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_MISS.ANY_RESPONSE_N 3F803C0040H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.ANY_DRAM_N 67FC00010H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.ANY_RESPONSE_N 3F803C0010H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.LOCAL_DRAM_N 600400010H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.REMOTE_DRAM_N 67F800010H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.REMOTE_HIT_FWD_N 87F800010H OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.REMOTE_HITM_N 107FC00010H OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_MISS.ANY_RESPONSE_N 3FFFC00200H OFFCORE_RESPONSE.PF_LLC_DATA_RD.LLC_MISS.ANY_RESPONSE_N 3FFFC00080H Model-specific performance monitoring events that are located in the uncore sub-system are implementation specific between different platforms using processors based on Intel microarchitecture code name Sandy Bridge. Processors with CPUID signature of DisplayFamily_DisplayModel 06_2AH support performance events listed in Table 19-20. 19-98 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-20. Performance Events In the Processor Uncore for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series Event Num.1 Umask Value 22H 01H UNC_CBO_XSNP_RESPONSE.M A snoop misses in some processor core. ISS 22H 02H UNC_CBO_XSNP_RESPONSE.I NVAL 22H 04H UNC_CBO_XSNP_RESPONSE.H A snoop hits a non-modified line in some processor IT core. 22H 08H UNC_CBO_XSNP_RESPONSE.H A snoop hits a modified line in some processor core. ITM 22H 10H UNC_CBO_XSNP_RESPONSE.I NVAL_M A snoop invalidates a modified line in some processor core. 22H 20H UNC_CBO_XSNP_RESPONSE.E XTERNAL_FILTER Filter on cross-core snoops initiated by this Cbox due to external snoop request. 22H 40H UNC_CBO_XSNP_RESPONSE.X Filter on cross-core snoops initiated by this Cbox due CORE_FILTER to processor core memory request. 22H 80H UNC_CBO_XSNP_RESPONSE.E VICTION_FILTER Filter on cross-core snoops initiated by this Cbox due to LLC eviction. 34H 01H UNC_CBO_CACHE_LOOKUP.M 34H 02H UNC_CBO_CACHE_LOOKUP.E LLC lookup request that access cache and found line in Must combine with M-state. one of the umask LLC lookup request that access cache and found line in values of 10H, 20H, 40H, 80H. E-state. 34H 04H UNC_CBO_CACHE_LOOKUP.S LLC lookup request that access cache and found line in S-state. 34H 08H UNC_CBO_CACHE_LOOKUP.I LLC lookup request that access cache and found line in I-state. 34H 10H UNC_CBO_CACHE_LOOKUP.RE AD_FILTER Filter on processor core initiated cacheable read requests. Must combine with at least one of 01H, 02H, 04H, 08H. 34H 20H UNC_CBO_CACHE_LOOKUP.WR Filter on processor core initiated cacheable write ITE_FILTER requests. Must combine with at least one of 01H, 02H, 04H, 08H. 34H 40H UNC_CBO_CACHE_LOOKUP.EX TSNP_FILTER 34H 80H UNC_CBO_CACHE_LOOKUP.AN Filter on any IRQ or IPQ initiated requests including Y_REQUEST_FILTER uncacheable, non-coherent requests. Must combine with at least one of 01H, 02H, 04H, 08H. 80H 01H UNC_ARB_TRK_OCCUPANCY.A Counts cycles weighted by the number of requests Counter 0 only. LL waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC. 81H 01H UNC_ARB_TRK_REQUEST.ALL Counts the number of coherent and in-coherent requests initiated by IA cores, processor graphic units, or LLC. 81H 20H UNC_ARB_TRK_REQUEST.WRI TES Counts the number of allocated write entries, include full, partial, and LLC evictions. 81H 80H UNC_ARB_TRK_REQUEST.EVIC Counts the number of LLC evictions allocated. TIONS Event Mask Mnemonic Description A snoop invalidates a non-modified line in some processor core. Comment Must combine with one of the umask values of 20H, 40H, 80H. Must combine with at least one of 01H, 02H, 04H, 08H, 10H. Filter on external snoop requests. Must combine with at least one of 01H, 02H, 04H, 08H. Vol. 3B 19-99 PERFORMANCE MONITORING EVENTS Table 19-20. Performance Events In the Processor Uncore for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series (Contd.) Event Num.1 Umask Value Event Mask Mnemonic Description Comment 83H 01H UNC_ARB_COH_TRK_OCCUPA NCY.ALL Cycles weighted by number of requests pending in Coherency Tracker. Counter 0 only. 84H 01H UNC_ARB_COH_TRK_REQUES T.ALL Number of requests allocated in Coherency Tracker. NOTES: 1. The uncore events must be programmed using MSRs located in specific performance monitoring units in the uncore. UNC_CBO* events are supported using MSR_UNC_CBO* MSRs; UNC_ARB* events are supported using MSR_UNC_ARB*MSRs. 19.10 PERFORMANCE MONITORING EVENTS FOR INTEL® CORE™ I7 PROCESSOR FAMILY AND INTEL® XEON® PROCESSOR FAMILY Processors based on the Intel microarchitecture code name Nehalem support the architectural and model-specific performance monitoring events listed in Table 19-1 and Table 19-21. The events in Table 19-21 generally applies to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_1AH, 06_1EH, 06_1FH, and 06_2EH. However, Intel Xeon processors with CPUID signature of DisplayFamily_DisplayModel 06_2EH have a small number of events that are not supported in processors with CPUID signature 06_1AH, 06_1EH, and 06_1FH. These events are noted in the comment column. In addition, these processors (CPUID signature of DisplayFamily_DisplayModel 06_1AH, 06_1EH, 06_1FH) also support the following model-specific, product-specific uncore performance monitoring events listed in Table 19-22. Fixed counters in the core PMU support the architecture events defined in Table 19-2. Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series Event Num. Umask Value Event Mask Mnemonic 04H 07H SB_DRAIN.ANY Counts the number of store buffer drains. 06H 04H STORE_BLOCKS.AT_RET Counts number of loads delayed with at-Retirement block code. The following loads need to be executed at retirement and wait for all senior stores on the same thread to be drained: load splitting across 4K boundary (page split), load accessing uncacheable (UC or WC) memory, load lock, and load with page table in UC or WC memory region. 06H 08H STORE_BLOCKS.L1D_BLOCK Cacheable loads delayed with L1D block code. 07H 01H PARTIAL_ADDRESS_ALIAS Counts false dependency due to partial address aliasing. 08H 01H DTLB_LOAD_MISSES.ANY Counts all load misses that cause a page walk. 08H 02H DTLB_LOAD_MISSES.WALK_CO Counts number of completed page walks due to load MPLETED miss in the STLB. 08H 10H DTLB_LOAD_MISSES.STLB_HIT Number of cache load STLB hits. 08H 20H DTLB_LOAD_MISSES.PDE_MIS S 08H 80H DTLB_LOAD_MISSES.LARGE_W Counts number of completed large page walks due ALK_COMPLETED to load miss in the STLB. 19-100 Vol. 3B Description Number of DTLB cache load misses where the low part of the linear to physical address translation was missed. Comment PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0BH 01H MEM_INST_RETIRED.LOADS Counts the number of instructions with an architecturally-visible load retired on the architected path. 0BH 02H MEM_INST_RETIRED.STORES Counts the number of instructions with an architecturally-visible store retired on the architected path. 0BH 10H MEM_INST_RETIRED.LATENCY Counts the number of instructions exceeding the _ABOVE_THRESHOLD latency specified with ld_lat facility. 0CH 01H MEM_STORE_RETIRED.DTLB_ MISS The event counts the number of retired stores that missed the DTLB. The DTLB miss is not counted if the store operation causes a fault. Does not counter prefetches. Counts both primary and secondary misses to the TLB. 0EH 01H UOPS_ISSUED.ANY Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i.e. the UOPs issued from the front end to the back end. 0EH 01H UOPS_ISSUED.STALLED_CYCLE Counts the number of cycles no Uops issued by the Set “invert=1, cmask = S Register Allocation Table to the Reservation 1“. Station, i.e. the UOPs issued from the front end to the back end. 0EH 02H UOPS_ISSUED.FUSED Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station. 0FH 01H MEM_UNCORE_RETIRED.L3_D ATA_MISS_UNKNOWN Counts number of memory load instructions retired Available only for CPUID where the memory reference missed L3 and data signature 06_2EH. source is unknown. 0FH 02H MEM_UNCORE_RETIRED.OTHE Counts number of memory load instructions retired R_CORE_L2_HITM where the memory reference hit modified data in a sibling core residing on the same socket. 0FH 08H MEM_UNCORE_RETIRED.REMO Counts number of memory load instructions retired TE_CACHE_LOCAL_HOME_HIT where the memory reference missed the L1, L2 and L3 caches and HIT in a remote socket's cache. Only counts locally homed lines. 0FH 10H MEM_UNCORE_RETIRED.REMO Counts number of memory load instructions retired TE_DRAM where the memory reference missed the L1, L2 and L3 caches and was remotely homed. This includes both DRAM access and HITM in a remote socket's cache for remotely homed lines. 0FH 20H MEM_UNCORE_RETIRED.LOCA L_DRAM 0FH 80H MEM_UNCORE_RETIRED.UNCA Counts number of memory load instructions retired Available only for CPUID CHEABLE where the memory reference missed the L1, L2 and signature 06_2EH. L3 caches and to perform I/O. Comment In conjunction with ld_lat facility. Counts number of memory load instructions retired where the memory reference missed the L1, L2 and L3 caches and required a local socket memory reference. This includes locally homed cachelines that were in a modified state in another socket. Vol. 3B 19-101 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 10H 01H FP_COMP_OPS_EXE.X87 Counts the number of FP Computational Uops Executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction. 10H 02H FP_COMP_OPS_EXE.MMX Counts number of MMX Uops executed. 10H 04H FP_COMP_OPS_EXE.SSE_FP Counts number of SSE and SSE2 FP uops executed. 10H 08H FP_COMP_OPS_EXE.SSE2_INT Counts number of SSE2 integer uops executed. EGER 10H 10H FP_COMP_OPS_EXE.SSE_FP_P Counts number of SSE FP packed uops executed. ACKED 10H 20H FP_COMP_OPS_EXE.SSE_FP_S Counts number of SSE FP scalar uops executed. CALAR 10H 40H FP_COMP_OPS_EXE.SSE_SING Counts number of SSE* FP single precision uops LE_PRECISION executed. 10H 80H FP_COMP_OPS_EXE.SSE_DOU BLE_PRECISION Counts number of SSE* FP double precision uops executed. 12H 01H SIMD_INT_128.PACKED_MPY Counts number of 128 bit SIMD integer multiply operations. 12H 02H SIMD_INT_128.PACKED_SHIFT Counts number of 128 bit SIMD integer shift operations. 12H 04H SIMD_INT_128.PACK Counts number of 128 bit SIMD integer pack operations. 12H 08H SIMD_INT_128.UNPACK Counts number of 128 bit SIMD integer unpack operations. 12H 10H SIMD_INT_128.PACKED_LOGIC Counts number of 128 bit SIMD integer logical AL operations. 12H 20H SIMD_INT_128.PACKED_ARITH Counts number of 128 bit SIMD integer arithmetic operations. 12H 40H SIMD_INT_128.SHUFFLE_MOV Counts number of 128 bit SIMD integer shuffle and E move operations. 13H 01H LOAD_DISPATCH.RS Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer. 13H 02H LOAD_DISPATCH.RS_DELAYED Counts the number of delayed RS dispatches at the stage latch. If an RS dispatch cannot bypass to LB, it has another chance to dispatch from the one-cycle delayed staging latch before it is written into the LB. 13H 04H LOAD_DISPATCH.MOB Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer. 13H 07H LOAD_DISPATCH.ANY Counts all loads dispatched from the Reservation Station. 19-102 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 14H 01H ARITH.CYCLES_DIV_BUSY Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Count may be incorrect When SMT is on. Set 'edge =1, invert=1, cmask=1' to count the number of divides. 14H 02H ARITH.MUL Counts the number of multiply operations executed. Count may be incorrect This includes integer as well as floating point When SMT is on. multiply operations but excludes DPPS mul and MPSAD. 17H 01H INST_QUEUE_WRITES Counts the number of instructions written into the instruction queue every cycle. 18H 01H INST_DECODED.DEC0 Counts number of instructions that require decoder 0 to be decoded. Usually, this means that the instruction maps to more than 1 uop. 19H 01H TWO_UOP_INSTS_DECODED An instruction that generates two uops was decoded. 1EH 01H INST_QUEUE_WRITE_CYCLES This event counts the number of cycles during which instructions are written to the instruction queue. Dividing this counter by the number of instructions written to the instruction queue (INST_QUEUE_WRITES) yields the average number of instructions decoded each cycle. If this number is less than four and the pipe stalls, this indicates that the decoder is failing to decode enough instructions per cycle to sustain the 4-wide pipeline. 20H 01H LSD_OVERFLOW Counts number of loops that can’t stream from the instruction queue. 24H 01H L2_RQSTS.LD_HIT Counts number of loads that hit the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches. L2 loads can be rejected for various reasons. Only non rejected loads are counted. 24H 02H L2_RQSTS.LD_MISS Counts the number of loads that miss the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches. 24H 03H L2_RQSTS.LOADS Counts all L2 load requests. L2 loads include both L1D demand misses as well as L1D prefetches. 24H 04H L2_RQSTS.RFO_HIT Counts the number of store RFO requests that hit the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. Count includes WC memory requests, where the data is not fetched but the permission to write the line is required. 24H 08H L2_RQSTS.RFO_MISS Counts the number of store RFO requests that miss the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. If SSE* instructions that are 6 bytes or longer arrive one after another, then front end throughput may limit execution speed. Vol. 3B 19-103 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 24H 0CH L2_RQSTS.RFOS Counts all L2 store RFO requests. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. 24H 10H L2_RQSTS.IFETCH_HIT Counts number of instruction fetches that hit the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 20H L2_RQSTS.IFETCH_MISS Counts number of instruction fetches that miss the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 30H L2_RQSTS.IFETCHES Counts all instruction fetches. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 40H L2_RQSTS.PREFETCH_HIT Counts L2 prefetch hits for both code and data. 24H 80H L2_RQSTS.PREFETCH_MISS Counts L2 prefetch misses for both code and data. 24H C0H L2_RQSTS.PREFETCHES Counts all L2 prefetches for both code and data. 24H AAH L2_RQSTS.MISS Counts all L2 misses for both code and data. 24H FFH L2_RQSTS.REFERENCES Counts all L2 requests for both code and data. 26H 01H L2_DATA_RQSTS.DEMAND.I_S TATE Counts number of L2 data demand loads where the cache line to be loaded is in the I (invalid) state, i.e., a cache miss. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 02H L2_DATA_RQSTS.DEMAND.S_S Counts number of L2 data demand loads where the TATE cache line to be loaded is in the S (shared) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 04H L2_DATA_RQSTS.DEMAND.E_S Counts number of L2 data demand loads where the TATE cache line to be loaded is in the E (exclusive) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 08H L2_DATA_RQSTS.DEMAND.M_ STATE Counts number of L2 data demand loads where the cache line to be loaded is in the M (modified) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 0FH L2_DATA_RQSTS.DEMAND.ME SI Counts all L2 data demand requests. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 10H L2_DATA_RQSTS.PREFETCH.I_ Counts number of L2 prefetch data loads where the STATE cache line to be loaded is in the I (invalid) state, i.e., a cache miss. 26H 20H L2_DATA_RQSTS.PREFETCH.S _STATE Counts number of L2 prefetch data loads where the cache line to be loaded is in the S (shared) state. A prefetch RFO will miss on an S state line, while a prefetch read will hit on an S state line. 26H 40H L2_DATA_RQSTS.PREFETCH.E _STATE Counts number of L2 prefetch data loads where the cache line to be loaded is in the E (exclusive) state. 19-104 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 26H 80H L2_DATA_RQSTS.PREFETCH.M Counts number of L2 prefetch data loads where the _STATE cache line to be loaded is in the M (modified) state. 26H F0H L2_DATA_RQSTS.PREFETCH.M Counts all L2 prefetch requests. ESI Event Mask Mnemonic Description Comment 26H FFH L2_DATA_RQSTS.ANY Counts all L2 data requests. 27H 01H L2_WRITE.RFO.I_STATE Counts number of L2 demand store RFO requests This is a demand RFO where the cache line to be loaded is in the I (invalid) request. state, i.e., a cache miss. The L1D prefetcher does not issue a RFO prefetch. 27H 02H L2_WRITE.RFO.S_STATE Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in the S (shared) state. request. The L1D prefetcher does not issue a RFO prefetch. 27H 08H L2_WRITE.RFO.M_STATE Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in the M (modified) state. request. The L1D prefetcher does not issue a RFO prefetch. 27H 0EH L2_WRITE.RFO.HIT Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in either the S, E or M request. states. The L1D prefetcher does not issue a RFO prefetch. 27H 0FH L2_WRITE.RFO.MESI Counts all L2 store RFO requests. The L1D prefetcher does not issue a RFO prefetch. 27H 10H L2_WRITE.LOCK.I_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, for example, a cache miss. 27H 20H L2_WRITE.LOCK.S_STATE Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state. 27H 40H L2_WRITE.LOCK.E_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state. 27H 80H L2_WRITE.LOCK.M_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state. 27H E0H L2_WRITE.LOCK.HIT Counts number of L2 demand lock RFO requests where the cache line to be loaded is in either the S, E, or M state. 27H F0H L2_WRITE.LOCK.MESI Counts all L2 demand lock RFO requests. 28H 01H L1D_WB_L2.I_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i.e., a cache miss. 28H 02H L1D_WB_L2.S_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the S state. 28H 04H L1D_WB_L2.E_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state. 28H 08H L1D_WB_L2.M_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state. This is a demand RFO request. Vol. 3B 19-105 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 28H 0FH L1D_WB_L2.MESI Counts all L1 writebacks to the L2 . 2EH 4FH L3_LAT_CACHE.REFERENCE This event counts requests originating from the See Table 19-1. core that reference a cache line in the last level cache. The event count includes speculative traffic but excludes cache line fills due to a L2 hardwareprefetch. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. 2EH 41H L3_LAT_CACHE.MISS This event counts each cache miss condition for references to the last level cache. The event count may include speculative traffic but excludes cache line fills due to L2 hardware-prefetches. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. See Table 19-1. 3CH 00H CPU_CLK_UNHALTED.THREAD Counts the number of thread cycles while the _P thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. See Table 19-1. 3CH 01H CPU_CLK_UNHALTED.REF_P Increments at the frequency of TSC when not halted. See Table 19-1. 40H 01H L1D_CACHE_LD.I_STATE Counts L1 data cache read requests where the cache line to be loaded is in the I (invalid) state, i.e. the read request missed the cache. Counter 0, 1 only. 40H 02H L1D_CACHE_LD.S_STATE Counts L1 data cache read requests where the cache line to be loaded is in the S (shared) state. Counter 0, 1 only. 40H 04H L1D_CACHE_LD.E_STATE Counts L1 data cache read requests where the cache line to be loaded is in the E (exclusive) state. Counter 0, 1 only. 40H 08H L1D_CACHE_LD.M_STATE Counts L1 data cache read requests where the cache line to be loaded is in the M (modified) state. Counter 0, 1 only. 40H 0FH L1D_CACHE_LD.MESI Counts L1 data cache read requests. Counter 0, 1 only. 41H 02H L1D_CACHE_ST.S_STATE Counts L1 data cache store RFO requests where the Counter 0, 1 only. cache line to be loaded is in the S (shared) state. 41H 04H L1D_CACHE_ST.E_STATE Counts L1 data cache store RFO requests where the Counter 0, 1 only. cache line to be loaded is in the E (exclusive) state. 41H 08H L1D_CACHE_ST.M_STATE Counts L1 data cache store RFO requests where cache line to be loaded is in the M (modified) state. Counter 0, 1 only. 42H 01H L1D_CACHE_LOCK.HIT Counts retired load locks that hit in the L1 data cache or hit in an already allocated fill buffer. The lock portion of the load lock transaction must hit in the L1D. The initial load will pull the lock into the L1 data cache. Counter 0, 1 only. 42H 02H L1D_CACHE_LOCK.S_STATE Counts L1 data cache retired load locks that hit the target cache line in the shared state. Counter 0, 1 only. 42H 04H L1D_CACHE_LOCK.E_STATE Counts L1 data cache retired load locks that hit the target cache line in the exclusive state. Counter 0, 1 only. 19-106 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 42H 08H L1D_CACHE_LOCK.M_STATE Counts L1 data cache retired load locks that hit the target cache line in the modified state. Counter 0, 1 only. 43H 01H L1D_ALL_REF.ANY Counts all references (uncached, speculated and retired) to the L1 data cache, including all loads and stores with any memory types. The event counts memory accesses only when they are actually performed. For example, a load blocked by unknown store address and later performed is only counted once. The event does not include non-memory accesses, such as I/O accesses. Counter 0, 1 only. 43H 02H L1D_ALL_REF.CACHEABLE Counts all data reads and writes (speculated and retired) from cacheable memory, including locked operations. Counter 0, 1 only. 49H 01H DTLB_MISSES.ANY Counts the number of misses in the STLB which causes a page walk. 49H 02H DTLB_MISSES.WALK_COMPLET Counts number of misses in the STLB which ED resulted in a completed page walk. 49H 10H DTLB_MISSES.STLB_HIT Counts the number of DTLB first level misses that hit in the second level TLB. This event is only relevant if the core contains multiple DTLB levels. 49H 20H DTLB_MISSES.PDE_MISS Number of DTLB misses caused by low part of address, includes references to 2M pages because 2M pages do not use the PDE. 49H 80H DTLB_MISSES.LARGE_WALK_C Counts number of misses in the STLB which OMPLETED resulted in a completed page walk for large pages. 4CH 01H LOAD_HIT_PRE Counts load operations sent to the L1 data cache while a previous SSE prefetch instruction to the same cache line has started prefetching but has not yet finished. 4EH 01H L1D_PREFETCH.REQUESTS Counts number of hardware prefetch requests dispatched out of the prefetch FIFO. 4EH 02H L1D_PREFETCH.MISS Counts number of hardware prefetch requests that miss the L1D. There are two prefetchers in the L1D. A streamer, which predicts lines sequentially after this one should be fetched, and the IP prefetcher that remembers access patterns for the current instruction. The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not. 4EH 04H L1D_PREFETCH.TRIGGERS Counts number of prefetch requests triggered by the Finite State Machine and pushed into the prefetch FIFO. Some of the prefetch requests are dropped due to overwrites or competition between the IP index prefetcher and streamer prefetcher. The prefetch FIFO contains 4 entries. 51H 01H L1D.REPL Counts the number of lines brought into the L1 data Counter 0, 1 only. cache. 51H 02H L1D.M_REPL Counts the number of modified lines brought into the L1 data cache. Counter 0, 1 only. 51H 04H L1D.M_EVICT Counts the number of modified lines evicted from the L1 data cache due to replacement. Counter 0, 1 only. Vol. 3B 19-107 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 51H 08H L1D.M_SNOOP_EVICT Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention. Counter 0, 1 only. 52H 01H L1D_CACHE_PREFETCH_LOCK _FB_HIT Counts the number of cacheable load lock speculated instructions accepted into the fill buffer. 53H 01H L1D_CACHE_LOCK_FB_HIT Counts the number of cacheable load lock speculated or retired instructions accepted into the fill buffer. 63H 01H CACHE_LOCK_CYCLES.L1D_L2 Cycle count during which the L1D and L2 are locked. A lock is asserted when there is a locked memory access, due to uncacheable memory, a locked operation that spans two cache lines, or a page walk from an uncacheable page table. Counter 0, 1 only. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such accesses. 63H 02H CACHE_LOCK_CYCLES.L1D Counts the number of cycles that cacheline in the L1 data cache unit is locked. Counter 0, 1 only. 6CH 01H IO_TRANSACTIONS Counts the number of completed I/O transactions. 80H 01H L1I.HITS Counts all instruction fetches that hit the L1 instruction cache. 80H 02H L1I.MISSES Counts all instruction fetches that miss the L1I cache. This includes instruction cache misses, streaming buffer misses, victim cache misses and uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 80H 03H L1I.READS Counts all instruction fetches, including uncacheable fetches that bypass the L1I. 80H 04H L1I.CYCLES_STALLED Cycle counts for which an instruction fetch stalls due to a L1I cache miss, ITLB miss or ITLB fault. 82H 01H LARGE_ITLB.HIT Counts number of large ITLB hits. 85H 01H ITLB_MISSES.ANY Counts the number of misses in all levels of the ITLB which causes a page walk. 85H 02H ITLB_MISSES.WALK_COMPLET ED Counts number of misses in all levels of the ITLB which resulted in a completed page walk. 87H 01H ILD_STALL.LCP Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX.W (for Intel 64) instructions which change the length of the decoded instruction. 87H 02H ILD_STALL.MRU Instruction Length Decoder stall cycles due to Brand Prediction Unit (PBU) Most Recently Used (MRU) bypass. 87H 04H ILD_STALL.IQ_FULL Stall cycles due to a full instruction queue. 87H 08H ILD_STALL.REGEN Counts the number of regen stalls. 87H 0FH ILD_STALL.ANY Counts any cycles the Instruction Length Decoder is stalled. 88H 01H BR_INST_EXEC.COND Counts the number of conditional near branch instructions executed, but not necessarily retired. 19-108 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 88H 02H BR_INST_EXEC.DIRECT Counts all unconditional near branch instructions excluding calls and indirect branches. 88H 04H BR_INST_EXEC.INDIRECT_NON Counts the number of executed indirect near _CALL branch instructions that are not calls. 88H 07H BR_INST_EXEC.NON_CALLS Counts all non-call near branch instructions executed, but not necessarily retired. 88H 08H BR_INST_EXEC.RETURN_NEA R Counts indirect near branches that have a return mnemonic. 88H 10H BR_INST_EXEC.DIRECT_NEAR _CALL Counts unconditional near call branch instructions, excluding non-call branch, executed. 88H 20H BR_INST_EXEC.INDIRECT_NEA Counts indirect near calls, including both register R_CALL and memory indirect, executed. 88H 30H BR_INST_EXEC.NEAR_CALLS Counts all near call branches executed, but not necessarily retired. 88H 40H BR_INST_EXEC.TAKEN Counts taken near branches executed, but not necessarily retired. 88H 7FH BR_INST_EXEC.ANY Counts all near executed branches (not necessarily retired). This includes only instructions and not micro-op branches. Frequent branching is not necessarily a major performance issue. However frequent branch mispredictions may be a problem. 89H 01H BR_MISP_EXEC.COND Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired. 89H 02H BR_MISP_EXEC.DIRECT Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0). 89H 04H BR_MISP_EXEC.INDIRECT_NO N_CALL Counts the number of executed mispredicted indirect near branch instructions that are not calls. 89H 07H BR_MISP_EXEC.NON_CALLS Counts mispredicted non-call near branches executed, but not necessarily retired. 89H 08H BR_MISP_EXEC.RETURN_NEA R Counts mispredicted indirect branches that have a rear return mnemonic. 89H 10H BR_MISP_EXEC.DIRECT_NEAR _CALL Counts mispredicted non-indirect near calls executed, (should always be 0). 89H 20H BR_MISP_EXEC.INDIRECT_NEA Counts mispredicted indirect near calls executed, R_CALL including both register and memory indirect. 89H 30H BR_MISP_EXEC.NEAR_CALLS Counts all mispredicted near call branches executed, but not necessarily retired. 89H 40H BR_MISP_EXEC.TAKEN Counts executed mispredicted near branches that are taken, but not necessarily retired. 89H 7FH BR_MISP_EXEC.ANY Counts the number of mispredicted near branch instructions that were executed, but not necessarily retired. Comment Vol. 3B 19-109 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment A2H 01H RESOURCE_STALLS.ANY Counts the number of Allocator resource related stalls. Includes register renaming buffer entries, memory buffer entries. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations. Does not include stalls due to SuperQ (off core) queue full, too many cache misses, etc. A2H 02H RESOURCE_STALLS.LOAD Counts the cycles of stall due to lack of load buffer for load operation. A2H 04H RESOURCE_STALLS.RS_FULL This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle. A high count of this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, or instructions dependent upon instructions further down the pipeline that have yet to retire. A2H 08H RESOURCE_STALLS.STORE This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline, (i.e. all store buffers are used). The stall ends when a store instruction commits its data to the cache or memory. A2H 10H RESOURCE_STALLS.ROB_FULL Counts the cycles of stall due to re-order buffer full. A2H 20H RESOURCE_STALLS.FPCW Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word. A2H 40H RESOURCE_STALLS.MXCSR Stalls due to the MXCSR register rename occurring to close to a previous MXCSR rename. The MXCSR provides control and status for the MMX registers. A2H 80H RESOURCE_STALLS.OTHER Counts the number of cycles while execution was stalled due to other resource issues. A6H 01H MACRO_INSTS.FUSIONS_DECO Counts the number of instructions decoded that are DED macro-fused but not necessarily executed or retired. A7H 01H BACLEAR_FORCE_IQ Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is also responsible for providing conditional branch prediction direction based on a static scheme and dynamic data provided by the L2 Branch Prediction Unit. If the conditional branch target is not found in the Target Array and the IQ predicts that the branch is taken, then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. A8H 01H LSD.UOPS Counts the number of micro-ops delivered by loop stream detector. AEH 01H ITLB_FLUSH Counts the number of ITLB flushes. 19-110 Vol. 3B When RS is full, new instructions cannot enter the reservation station and start execution. Use cmask=1 and invert to count cycles. PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value B0H 40H OFFCORE_REQUESTS.L1D_WR Counts number of L1D writebacks to the uncore. ITEBACK B1H 01H UOPS_EXECUTED.PORT0 Counts number of uops executed that were issued on port 0. Port 0 handles integer arithmetic, SIMD and FP add uops. B1H 02H UOPS_EXECUTED.PORT1 Counts number of uops executed that were issued on port 1. Port 1 handles integer arithmetic, SIMD, integer shift, FP multiply and FP divide uops. B1H 04H UOPS_EXECUTED.PORT2_COR Counts number of uops executed that were issued E on port 2. Port 2 handles the load uops. This is a core count only and cannot be collected per thread. B1H 08H UOPS_EXECUTED.PORT3_COR Counts number of uops executed that were issued E on port 3. Port 3 handles store uops. This is a core count only and cannot be collected per thread. B1H 10H UOPS_EXECUTED.PORT4_COR Counts number of uops executed that where issued E on port 4. Port 4 handles the value to be stored for the store uops issued on port 3. This is a core count only and cannot be collected per thread. B1H 1FH UOPS_EXECUTED.CORE_ACTIV Counts cycles when the uops executed were issued E_CYCLES_NO_PORT5 from any ports except port 5. Use Cmask=1 for active cycles; Cmask=0 for weighted cycles. Use CMask=1, Invert=1 to count P0-4 stalled cycles. Use Cmask=1, Edge=1, Invert=1 to count P0-4 stalls. B1H 20H UOPS_EXECUTED.PORT5 B1H 3FH UOPS_EXECUTED.CORE_ACTIV Counts cycles when the uops are executing. Use E_CYCLES Cmask=1 for active cycles; Cmask=0 for weighted cycles. Use CMask=1, Invert=1 to count P0-4 stalled cycles. Use Cmask=1, Edge=1, Invert=1 to count P04 stalls. B1H 40H UOPS_EXECUTED.PORT015 Counts number of uops executed that where issued Use cmask=1, invert=1 on port 0, 1, or 5. to count stall cycles. B1H 80H UOPS_EXECUTED.PORT234 Counts number of uops executed that where issued on port 2, 3, or 4. B2H 01H OFFCORE_REQUESTS_SQ_FUL Counts number of cycles the SQ is full to handle offL core requests. B7H 01H OFF_CORE_RESPONSE_0 See Section 18.3.1.1.3, “Off-core Response Performance Monitoring in the Processor Core”. B8H 01H SNOOP_RESPONSE.HIT Counts HIT snoop response sent by this thread in response to a snoop request. B8H 02H SNOOP_RESPONSE.HITE Counts HIT E snoop response sent by this thread in response to a snoop request. B8H 04H SNOOP_RESPONSE.HITM Counts HIT M snoop response sent by this thread in response to a snoop request. BBH 01H OFF_CORE_RESPONSE_1 See Section 18.6.3, “Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture)”. Event Mask Mnemonic Description Comment Counts number of uops executed that where issued on port 5. Requires programming MSR 01A6H. Requires programming MSR 01A7H. Vol. 3B 19-111 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment C0H 00H INST_RETIRED.ANY_P See Table 19-1. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions. Notes: INST_RETIRED.ANY is counted by a designated fixed counter. INST_RETIRED.ANY_P is counted by a programmable counter and is an architectural performance event. Event is supported if CPUID.A.EBX[1] = 0. C0H 02H INST_RETIRED.X87 Counts the number of MMX instructions retired. C0H 04H INST_RETIRED.MMX Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and suboperations of complex floating point instructions like transcendental instructions. C2H 01H UOPS_RETIRED.ANY Counts the number of micro-ops retired, (macroUse cmask=1 and invert fused=1, micro-fused=2, others=1; maximum count to count active cycles or of 8 per cycle). Most instructions are composed of stalled cycles. one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. C2H 02H UOPS_RETIRED.RETIRE_SLOTS Counts the number of retirement slots used each cycle. C2H 04H UOPS_RETIRED.MACRO_FUSE D Counts number of macro-fused uops retired. Counts the cycles machine clear is asserted. C3H 01H MACHINE_CLEARS.CYCLES C3H 02H MACHINE_CLEARS.MEM_ORDE Counts the number of machine clears due to R memory order conflicts. C3H 04H MACHINE_CLEARS.SMC Counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and L3caches. C4H 00H BR_INST_RETIRED.ALL_BRAN CHES Branch instructions at retirement. C4H 01H BR_INST_RETIRED.CONDITION Counts the number of conditional branch AL instructions retired. C4H 02H BR_INST_RETIRED.NEAR_CAL L Counts the number of direct & indirect near unconditional calls retired. C5H 00H BR_MISP_RETIRED.ALL_BRAN CHES Mispredicted branch instructions at retirement. C5H 02H BR_MISP_RETIRED.NEAR_CAL L Counts mispredicted direct & indirect near unconditional retired calls. C7H 01H SSEX_UOPS_RETIRED.PACKED Counts SIMD packed single-precision floating point _SINGLE Uops retired. C7H 02H SSEX_UOPS_RETIRED.SCALAR Counts SIMD scalar single-precision floating point _SINGLE Uops retired. C7H 04H SSEX_UOPS_RETIRED.PACKED Counts SIMD packed double-precision floating point _DOUBLE Uops retired. 19-112 Vol. 3B See Table 19-1. See Table 19-1. PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value C7H 08H SSEX_UOPS_RETIRED.SCALAR Counts SIMD scalar double-precision floating point _DOUBLE Uops retired. C7H 10H SSEX_UOPS_RETIRED.VECTOR Counts 128-bit SIMD vector integer Uops retired. _INTEGER C8H 20H ITLB_MISS_RETIRED CBH 01H MEM_LOAD_RETIRED.L1D_HIT Counts number of retired loads that hit the L1 data cache. CBH 02H MEM_LOAD_RETIRED.L2_HIT Counts number of retired loads that hit the L2 data cache. CBH 04H MEM_LOAD_RETIRED.L3_UNS HARED_HIT Counts number of retired loads that hit their own, unshared lines in the L3 cache. CBH 08H MEM_LOAD_RETIRED.OTHER_ CORE_L2_HIT_HITM Counts number of retired loads that hit in a sibling core's L2 (on die core). Since the L3 is inclusive of all cores on the package, this is an L3 hit. This counts both clean and modified hits. CBH 10H MEM_LOAD_RETIRED.L3_MISS Counts number of retired loads that miss the L3 cache. The load was satisfied by a remote socket, local memory or an IOH. CBH 40H MEM_LOAD_RETIRED.HIT_LFB Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache. This is counting secondary L1D misses. CBH 80H MEM_LOAD_RETIRED.DTLB_MI Counts the number of retired loads that missed the SS DTLB. The DTLB miss is not counted if the load operation causes a fault. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. Counts both primary and secondary misses to the TLB. CCH 01H FP_MMX_TRANS.TO_FP Counts the first floating-point instruction following any MMX instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. CCH 02H FP_MMX_TRANS.TO_MMX Counts the first MMX instruction following a floating-point instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. CCH 03H FP_MMX_TRANS.ANY Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. D0H 01H MACRO_INSTS.DECODED Counts the number of instructions decoded, (but not necessarily executed or retired). D1H 02H UOPS_DECODED.MS Counts the number of Uops decoded by the Microcode Sequencer, MS. The MS delivers uops when the instruction is more than 4 uops long or a microcode assist is occurring. Event Mask Mnemonic Description Comment Counts the number of retired instructions that missed the ITLB when the instruction was fetched. Vol. 3B 19-113 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value D1H 04H UOPS_DECODED.ESP_FOLDING Counts number of stack pointer (ESP) instructions decoded: push, pop, call, ret, etc. ESP instructions do not generate a Uop to increment or decrement ESP. Instead, they update an ESP_Offset register that keeps track of the delta to the current value of the ESP register. D1H 08H UOPS_DECODED.ESP_SYNC Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register. D2H 01H RAT_STALLS.FLAGS Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall. A partial register stall may occur when two conditions are met: 1) an instruction modifies some, but not all, of the flags in the flag register and 2) the next instruction, which depends on flags, depends on flags that were not modified by this instruction. D2H 02H RAT_STALLS.REGISTERS This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by previous instruction. D2H 04H RAT_STALLS.ROB_READ_POR T Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline. Note that, at this stage in the pipeline, additional stalls may occur at the same cycle and prevent the stalled micro-ops from entering the pipe. In such a case, micro-ops retry entering the execution pipe in the next cycle and the ROB-read port stall is counted again. D2H 08H RAT_STALLS.SCOREBOARD Counts the cycles where we stall due to microarchitecturally required serialization. Microcode scoreboarding stalls. D2H 0FH RAT_STALLS.ANY Counts all Register Allocation Table stall cycles due to: Cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the execution pipe. Cycles when partial register stalls occurred. Cycles when flag stalls occurred. Cycles floating-point unit (FPU) status word stalls occurred. To count each of these conditions separately use the events: RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and RAT_STALLS.FPSW. D4H 01H SEG_RENAME_STALLS Counts the number of stall cycles due to the lack of renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D5H 01H ES_REG_RENAMES Counts the number of times the ES segment register is renamed. 19-114 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description DBH 01H UOP_UNFUSION Counts unfusion events due to floating-point exception to a fused uop. E0H 01H BR_INST_DECODED Counts the number of branch instructions decoded. E5H 01H BPU_MISSED_CALL_RET Counts number of times the Branch Prediction Unit missed predicting a call or return branch. E6H 01H BACLEAR.CLEAR Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. This can occur if the code has many branches such that they cannot be consumed by the BPU. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. The effect on total execution time depends on the surrounding code. E6H 02H BACLEAR.BAD_TARGET Counts number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. E8H 01H BPU_CLEARS.EARLY Counts early (normal) Branch Prediction Unit clears: The BPU clear leads to 2 BPU predicted a taken branch after incorrectly cycle bubble in the front assuming that it was not taken. end. E8H 02H BPU_CLEARS.LATE Counts late Branch Prediction Unit clears due to Most Recently Used conflicts. The PBU clear leads to a 3 cycle bubble in the front end. F0H 01H L2_TRANSACTIONS.LOAD Counts L2 load operations due to HW prefetch or demand loads. F0H 02H L2_TRANSACTIONS.RFO Counts L2 RFO operations due to HW prefetch or demand RFOs. F0H 04H L2_TRANSACTIONS.IFETCH Counts L2 instruction fetch operations due to HW prefetch or demand ifetch. F0H 08H L2_TRANSACTIONS.PREFETCH Counts L2 prefetch operations. F0H 10H L2_TRANSACTIONS.L1D_WB Counts L1D writeback operations to the L2. F0H 20H L2_TRANSACTIONS.FILL Counts L2 cache line fill operations due to load, RFO, L1D writeback or prefetch. F0H 40H L2_TRANSACTIONS.WB Counts L2 writeback operations to the L3. F0H 80H L2_TRANSACTIONS.ANY Counts all L2 cache operations. F1H 02H L2_LINES_IN.S_STATE Counts the number of cache lines allocated in the L2 cache in the S (shared) state. F1H 04H L2_LINES_IN.E_STATE Counts the number of cache lines allocated in the L2 cache in the E (exclusive) state. F1H 07H L2_LINES_IN.ANY Counts the number of cache lines allocated in the L2 cache. F2H 01H L2_LINES_OUT.DEMAND_CLEA Counts L2 clean cache lines evicted by a demand N request. Comment Vol. 3B 19-115 PERFORMANCE MONITORING EVENTS Table 19-21. Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value F2H 02H L2_LINES_OUT.DEMAND_DIRT Counts L2 dirty (modified) cache lines evicted by a Y demand request. F2H 04H L2_LINES_OUT.PREFETCH_CLE Counts L2 clean cache line evicted by a prefetch AN request. F2H 08H L2_LINES_OUT.PREFETCH_DIR Counts L2 modified cache line evicted by a prefetch TY request. F2H 0FH L2_LINES_OUT.ANY Counts all L2 cache lines evicted for any reason. F4H 10H SQ_MISC.SPLIT_LOCK Counts the number of SQ lock splits across a cache line. F6H 01H SQ_FULL_STALL_CYCLES Counts cycles the Super Queue is full. Neither of the threads on this core will be able to access the uncore. F7H 01H FP_ASSIST.ALL Counts the number of floating point operations executed that required micro-code assist intervention. Assists are required in the following cases: SSE instructions (denormal input when the DAZ flag is off or underflow result when the FTZ flag is off); x87 instructions (NaN or denormal are loaded to a register or used as input from memory, division by 0 or underflow output). F7H 02H FP_ASSIST.OUTPUT Counts number of floating point micro-code assist when the output value (destination register) is invalid. F7H 04H FP_ASSIST.INPUT Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid. FDH 01H SIMD_INT_64.PACKED_MPY Counts number of SID integer 64 bit packed multiply operations. FDH 02H SIMD_INT_64.PACKED_SHIFT Counts number of SID integer 64 bit packed shift operations. FDH 04H SIMD_INT_64.PACK Counts number of SID integer 64 bit pack operations. FDH 08H SIMD_INT_64.UNPACK Counts number of SID integer 64 bit unpack operations. FDH 10H SIMD_INT_64.PACKED_LOGICA Counts number of SID integer 64 bit logical L operations. FDH 20H SIMD_INT_64.PACKED_ARITH FDH 40H SIMD_INT_64.SHUFFLE_MOVE Counts number of SID integer 64 bit shift or move operations. Event Mask Mnemonic Description Comment Counts number of SID integer 64 bit arithmetic operations. Model-specific performance monitoring events that are located in the uncore sub-system are implementation specific between different platforms using processors based on Intel microarchitecture code name Nehalem. Processors with CPUID signature of DisplayFamily_DisplayModel 06_1AH, 06_1EH, and 06_1FH support performance events listed in Table 19-22. 19-116 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series Event Num. Umask Value 00H 01H UNC_GQ_CYCLES_FULL.READ_ Uncore cycles Global Queue read tracker is full. TRACKER 00H 02H UNC_GQ_CYCLES_FULL.WRITE Uncore cycles Global Queue write tracker is full. _TRACKER 00H 04H UNC_GQ_CYCLES_FULL.PEER_ Uncore cycles Global Queue peer probe tracker is full. PROBE_TRACKER The peer probe tracker queue tracks snoops from the IOH and remote sockets. 01H 01H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue read tracker has at .READ_TRACKER least one valid entry. 01H 02H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue write tracker has at .WRITE_TRACKER least one valid entry. 01H 04H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue peer probe tracker .PEER_PROBE_TRACKER has at least one valid entry. The peer probe tracker queue tracks IOH and remote socket snoops. 03H 01H UNC_GQ_ALLOC.READ_TRACK Counts the number of tread tracker allocate to ER deallocate entries. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency. 03H 02H UNC_GQ_ALLOC.RT_L3_MISS 03H 04H UNC_GQ_ALLOC.RT_TO_L3_RE Counts the number of GQ read tracker entries that are SP allocated in the read tracker queue that hit or miss the L3. The GQ read tracker L3 hit occupancy count is divided by this count to obtain the average L3 hit latency. 03H 08H UNC_GQ_ALLOC.RT_TO_RTID_ Counts the number of GQ read tracker entries that are ACQUIRED allocated in the read tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ read tracker L3 miss to RTID acquired occupancy count is divided by this count to obtain the average latency for a read L3 miss to acquire an RTID. 03H 10H UNC_GQ_ALLOC.WT_TO_RTID _ACQUIRED Counts the number of GQ write tracker entries that are allocated in the write tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ write tracker L3 miss to RTID occupancy count is divided by this count to obtain the average latency for a write L3 miss to acquire an RTID. 03H 20H UNC_GQ_ALLOC.WRITE_TRAC KER Counts the number of GQ write tracker entries that are allocated in the write tracker queue that miss the L3. The GQ write tracker occupancy count is divided by this count to obtain the average L3 write miss latency. Event Mask Mnemonic Description Comment Counts the number GQ read tracker entries for which a full cache line read has missed the L3. The GQ read tracker L3 miss to fill occupancy count is divided by this count to obtain the average cache line read L3 miss latency. The latency represents the time after which the L3 has determined that the cache line has missed. The time between a GQ read tracker allocation and the L3 determining that the cache line has missed is the average L3 hit latency. The total L3 cache line read miss latency is the hit latency + L3 miss latency. Vol. 3B 19-117 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 03H 40H UNC_GQ_ALLOC.PEER_PROBE _TRACKER Counts the number of GQ peer probe tracker (snoop) entries that are allocated in the peer probe tracker queue that miss the L3. The GQ peer probe occupancy count is divided by this count to obtain the average L3 peer probe miss latency. 04H 01H UNC_GQ_DATA.FROM_QPI Cycles Global Queue Quickpath Interface input data port is busy importing data from the Quickpath Interface. Each cycle the input port can transfer 8 or 16 bytes of data. 04H 02H UNC_GQ_DATA.FROM_QMC Cycles Global Queue Quickpath Memory Interface input data port is busy importing data from the Quickpath Memory Interface. Each cycle the input port can transfer 8 or 16 bytes of data. 04H 04H UNC_GQ_DATA.FROM_L3 Cycles GQ L3 input data port is busy importing data from the Last Level Cache. Each cycle the input port can transfer 32 bytes of data. 04H 08H UNC_GQ_DATA.FROM_CORES_ Cycles GQ Core 0 and 2 input data port is busy 02 importing data from processor cores 0 and 2. Each cycle the input port can transfer 32 bytes of data. 04H 10H UNC_GQ_DATA.FROM_CORES_ Cycles GQ Core 1 and 3 input data port is busy 13 importing data from processor cores 1 and 3. Each cycle the input port can transfer 32 bytes of data. 05H 01H UNC_GQ_DATA.TO_QPI_QMC Cycles GQ QPI and QMC output data port is busy sending data to the Quickpath Interface or Quickpath Memory Interface. Each cycle the output port can transfer 32 bytes of data. 05H 02H UNC_GQ_DATA.TO_L3 Cycles GQ L3 output data port is busy sending data to the Last Level Cache. Each cycle the output port can transfer 32 bytes of data. 05H 04H UNC_GQ_DATA.TO_CORES Cycles GQ Core output data port is busy sending data to the Cores. Each cycle the output port can transfer 32 bytes of data. 06H 01H UNC_SNP_RESP_TO_LOCAL_H Number of snoop responses to the local home that L3 OME.I_STATE does not have the referenced cache line. 06H 02H UNC_SNP_RESP_TO_LOCAL_H Number of snoop responses to the local home that L3 OME.S_STATE has the referenced line cached in the S state. 06H 04H UNC_SNP_RESP_TO_LOCAL_H Number of responses to code or data read snoops to OME.FWD_S_STATE the local home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the local home in the S state. 06H 08H UNC_SNP_RESP_TO_LOCAL_H Number of responses to read invalidate snoops to the OME.FWD_I_STATE local home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the local home in the M state. 06H 10H UNC_SNP_RESP_TO_LOCAL_H Number of conflict snoop responses sent to the local OME.CONFLICT home. 19-118 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 06H 20H UNC_SNP_RESP_TO_LOCAL_H Number of responses to code or data read snoops to OME.WB the local home that the L3 has the referenced line cached in the M state. 07H 01H UNC_SNP_RESP_TO_REMOTE _HOME.I_STATE Number of snoop responses to a remote home that L3 does not have the referenced cache line. 07H 02H UNC_SNP_RESP_TO_REMOTE _HOME.S_STATE Number of snoop responses to a remote home that L3 has the referenced line cached in the S state. 07H 04H UNC_SNP_RESP_TO_REMOTE _HOME.FWD_S_STATE Number of responses to code or data read snoops to a remote home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the remote home in the S state. 07H 08H UNC_SNP_RESP_TO_REMOTE _HOME.FWD_I_STATE Number of responses to read invalidate snoops to a remote home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the remote home in the M state. 07H 10H UNC_SNP_RESP_TO_REMOTE _HOME.CONFLICT Number of conflict snoop responses sent to the local home. 07H 20H UNC_SNP_RESP_TO_REMOTE _HOME.WB Number of responses to code or data read snoops to a remote home that the L3 has the referenced line cached in the M state. 07H 24H UNC_SNP_RESP_TO_REMOTE _HOME.HITM Number of HITM snoop responses to a remote home. 08H 01H UNC_L3_HITS.READ Number of code read, data read and RFO requests that hit in the L3. 08H 02H UNC_L3_HITS.WRITE Number of writeback requests that hit in the L3. Writebacks from the cores will always result in L3 hits due to the inclusive property of the L3. 08H 04H UNC_L3_HITS.PROBE Number of snoops from IOH or remote sockets that hit in the L3. 08H 03H UNC_L3_HITS.ANY Number of reads and writes that hit the L3. 09H 01H UNC_L3_MISS.READ Number of code read, data read and RFO requests that miss the L3. 09H 02H UNC_L3_MISS.WRITE Number of writeback requests that miss the L3. Should always be zero as writebacks from the cores will always result in L3 hits due to the inclusive property of the L3. 09H 04H UNC_L3_MISS.PROBE Number of snoops from IOH or remote sockets that miss the L3. 09H 03H UNC_L3_MISS.ANY Number of reads and writes that miss the L3. 0AH 01H UNC_L3_LINES_IN.M_STATE Counts the number of L3 lines allocated in M state. The only time a cache line is allocated in the M state is when the line was forwarded in M state is forwarded due to a Snoop Read Invalidate Own request. 0AH 02H UNC_L3_LINES_IN.E_STATE Counts the number of L3 lines allocated in E state. 0AH 04H UNC_L3_LINES_IN.S_STATE Counts the number of L3 lines allocated in S state. Event Mask Mnemonic Description Comment Vol. 3B 19-119 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0AH 08H UNC_L3_LINES_IN.F_STATE Counts the number of L3 lines allocated in F state. 0AH 0FH UNC_L3_LINES_IN.ANY Counts the number of L3 lines allocated in any state. 0BH 01H UNC_L3_LINES_OUT.M_STATE Counts the number of L3 lines victimized that were in the M state. When the victim cache line is in M state, the line is written to its home cache agent which can be either local or remote. 0BH 02H UNC_L3_LINES_OUT.E_STATE Counts the number of L3 lines victimized that were in the E state. 0BH 04H UNC_L3_LINES_OUT.S_STATE Counts the number of L3 lines victimized that were in the S state. 0BH 08H UNC_L3_LINES_OUT.I_STATE Counts the number of L3 lines victimized that were in the I state. 0BH 10H UNC_L3_LINES_OUT.F_STATE Counts the number of L3 lines victimized that were in the F state. 0BH 1FH UNC_L3_LINES_OUT.ANY Counts the number of L3 lines victimized in any state. 20H 01H UNC_QHL_REQUESTS.IOH_RE ADS Counts number of Quickpath Home Logic read requests from the IOH. 20H 02H UNC_QHL_REQUESTS.IOH_WR Counts number of Quickpath Home Logic write ITES requests from the IOH. 20H 04H UNC_QHL_REQUESTS.REMOTE Counts number of Quickpath Home Logic read _READS requests from a remote socket. 20H 08H UNC_QHL_REQUESTS.REMOTE Counts number of Quickpath Home Logic write _WRITES requests from a remote socket. 20H 10H UNC_QHL_REQUESTS.LOCAL_ READS Counts number of Quickpath Home Logic read requests from the local socket. 20H 20H UNC_QHL_REQUESTS.LOCAL_ WRITES Counts number of Quickpath Home Logic write requests from the local socket. 21H 01H UNC_QHL_CYCLES_FULL.IOH Counts uclk cycles all entries in the Quickpath Home Logic IOH are full. 21H 02H UNC_QHL_CYCLES_FULL.REM OTE Counts uclk cycles all entries in the Quickpath Home Logic remote tracker are full. 21H 04H UNC_QHL_CYCLES_FULL.LOCA Counts uclk cycles all entries in the Quickpath Home L Logic local tracker are full. 22H 01H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.IOH Logic IOH is busy. 22H 02H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.REMOTE Logic remote tracker is busy. 22H 04H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.LOCAL Logic local tracker is busy. 23H 01H UNC_QHL_OCCUPANCY.IOH 23H 02H UNC_QHL_OCCUPANCY.REMOT QHL remote tracker allocate to deallocate read E occupancy. 23H 04H UNC_QHL_OCCUPANCY.LOCAL 19-120 Vol. 3B QHL IOH tracker allocate to deallocate read occupancy. QHL local tracker allocate to deallocate read occupancy. Comment PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 24H 02H UNC_QHL_ADDRESS_CONFLIC Counts number of QHL Active Address Table (AAT) TS.2WAY entries that saw a max of 2 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates. 24H 04H UNC_QHL_ADDRESS_CONFLIC Counts number of QHL Active Address Table (AAT) TS.3WAY entries that saw a max of 3 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates. 25H 01H UNC_QHL_CONFLICT_CYCLES.I Counts cycles the Quickpath Home Logic IOH Tracker OH contains two or more requests with an address conflict. A max of 3 requests can be in conflict. 25H 02H UNC_QHL_CONFLICT_CYCLES. REMOTE Counts cycles the Quickpath Home Logic Remote Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict. 25H 04H UNC_QHL_CONFLICT_CYCLES. LOCAL Counts cycles the Quickpath Home Logic Local Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict. 26H 01H UNC_QHL_TO_QMC_BYPASS Counts number or requests to the Quickpath Memory Controller that bypass the Quickpath Home Logic. All local accesses can be bypassed. For remote requests, only read requests can be bypassed. 27H 01H UNC_QMC_NORMAL_FULL.RE AD.CH0 Uncore cycles all the entries in the DRAM channel 0 medium or low priority queue are occupied with read requests. 27H 02H UNC_QMC_NORMAL_FULL.RE AD.CH1 Uncore cycles all the entries in the DRAM channel 1 medium or low priority queue are occupied with read requests. 27H 04H UNC_QMC_NORMAL_FULL.RE AD.CH2 Uncore cycles all the entries in the DRAM channel 2 medium or low priority queue are occupied with read requests. 27H 08H UNC_QMC_NORMAL_FULL.WRI Uncore cycles all the entries in the DRAM channel 0 TE.CH0 medium or low priority queue are occupied with write requests. 27H 10H UNC_QMC_NORMAL_FULL.WRI Counts cycles all the entries in the DRAM channel 1 TE.CH1 medium or low priority queue are occupied with write requests. 27H 20H UNC_QMC_NORMAL_FULL.WRI Uncore cycles all the entries in the DRAM channel 2 TE.CH2 medium or low priority queue are occupied with write requests. 28H 01H UNC_QMC_ISOC_FULL.READ.C H0 Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous read requests. 28H 02H UNC_QMC_ISOC_FULL.READ.C H1 Counts cycles all the entries in the DRAM channel 1high priority queue are occupied with isochronous read requests. Event Mask Mnemonic Description Comment Vol. 3B 19-121 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 28H 04H UNC_QMC_ISOC_FULL.READ.C H2 Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous read requests. 28H 08H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 0 H0 high priority queue are occupied with isochronous write requests. 28H 10H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 1 H1 high priority queue are occupied with isochronous write requests. 28H 20H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 2 H2 high priority queue are occupied with isochronous write requests. 29H 01H UNC_QMC_BUSY.READ.CH0 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 0. 29H 02H UNC_QMC_BUSY.READ.CH1 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 1. 29H 04H UNC_QMC_BUSY.READ.CH2 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 2. 29H 08H UNC_QMC_BUSY.WRITE.CH0 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 0. 29H 10H UNC_QMC_BUSY.WRITE.CH1 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 1. 29H 20H UNC_QMC_BUSY.WRITE.CH2 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 2. 2AH 01H UNC_QMC_OCCUPANCY.CH0 IMC channel 0 normal read request occupancy. 2AH 02H UNC_QMC_OCCUPANCY.CH1 IMC channel 1 normal read request occupancy. IMC channel 2 normal read request occupancy. 2AH 04H UNC_QMC_OCCUPANCY.CH2 2BH 01H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 0 issoc read request occupancy. CH0 2BH 02H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 1 issoc read request occupancy. CH1 2BH 04H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 2 issoc read request occupancy. CH2 2BH 07H UNC_QMC_ISSOC_READS.ANY IMC issoc read request occupancy. 2CH 01H UNC_QMC_NORMAL_READS.C H0 Counts the number of Quickpath Memory Controller channel 0 medium and low priority read requests. The QMC channel 0 normal read occupancy divided by this count provides the average QMC channel 0 read latency. 19-122 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 2CH 02H UNC_QMC_NORMAL_READS.C H1 Counts the number of Quickpath Memory Controller channel 1 medium and low priority read requests. The QMC channel 1 normal read occupancy divided by this count provides the average QMC channel 1 read latency. 2CH 04H UNC_QMC_NORMAL_READS.C H2 Counts the number of Quickpath Memory Controller channel 2 medium and low priority read requests. The QMC channel 2 normal read occupancy divided by this count provides the average QMC channel 2 read latency. 2CH 07H UNC_QMC_NORMAL_READS.A NY Counts the number of Quickpath Memory Controller medium and low priority read requests. The QMC normal read occupancy divided by this count provides the average QMC read latency. 2DH 01H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH0 channel 0 high priority isochronous read requests. 2DH 02H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH1 channel 1 high priority isochronous read requests. 2DH 04H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH2 channel 2 high priority isochronous read requests. 2DH 07H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.ANY high priority isochronous read requests. 2EH 01H UNC_QMC_CRITICAL_PRIORIT Y_READS.CH0 Counts the number of Quickpath Memory Controller channel 0 critical priority isochronous read requests. 2EH 02H UNC_QMC_CRITICAL_PRIORIT Y_READS.CH1 Counts the number of Quickpath Memory Controller channel 1 critical priority isochronous read requests. 2EH 04H UNC_QMC_CRITICAL_PRIORIT Y_READS.CH2 Counts the number of Quickpath Memory Controller channel 2 critical priority isochronous read requests. 2EH 07H UNC_QMC_CRITICAL_PRIORIT Y_READS.ANY Counts the number of Quickpath Memory Controller critical priority isochronous read requests. 2FH 01H UNC_QMC_WRITES.FULL.CH0 Counts number of full cache line writes to DRAM channel 0. 2FH 02H UNC_QMC_WRITES.FULL.CH1 Counts number of full cache line writes to DRAM channel 1. 2FH 04H UNC_QMC_WRITES.FULL.CH2 Counts number of full cache line writes to DRAM channel 2. 2FH 07H UNC_QMC_WRITES.FULL.ANY Counts number of full cache line writes to DRAM. 2FH 08H UNC_QMC_WRITES.PARTIAL.C H0 Counts number of partial cache line writes to DRAM channel 0. 2FH 10H UNC_QMC_WRITES.PARTIAL.C H1 Counts number of partial cache line writes to DRAM channel 1. 2FH 20H UNC_QMC_WRITES.PARTIAL.C H2 Counts number of partial cache line writes to DRAM channel 2. 2FH 38H UNC_QMC_WRITES.PARTIAL.A NY Counts number of partial cache line writes to DRAM. 30H 01H UNC_QMC_CANCEL.CH0 Counts number of DRAM channel 0 cancel requests. 30H 02H UNC_QMC_CANCEL.CH1 Counts number of DRAM channel 1 cancel requests. Comment Vol. 3B 19-123 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 30H 04H UNC_QMC_CANCEL.CH2 Counts number of DRAM channel 2 cancel requests. 30H 07H UNC_QMC_CANCEL.ANY Counts number of DRAM cancel requests. 31H 01H UNC_QMC_PRIORITY_UPDATE S.CH0 Counts number of DRAM channel 0 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 31H 02H UNC_QMC_PRIORITY_UPDATE S.CH1 Counts number of DRAM channel 1 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 31H 04H UNC_QMC_PRIORITY_UPDATE S.CH2 Counts number of DRAM channel 2 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 31H 07H UNC_QMC_PRIORITY_UPDATE S.ANY Counts number of DRAM priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 33H 04H UNC_QHL_FRC_ACK_CNFLTS.L Counts number of Force Acknowledge Conflict OCAL messages sent by the Quickpath Home Logic to the local home. 40H 01H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 0 HOME E_FLIT.HOME.LINK_0 virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 02H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 0 SNOOP E_FLIT.SNOOP.LINK_0 virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 04H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 0 non-data E_FLIT.NDR.LINK_0 response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 08H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 1 HOME E_FLIT.HOME.LINK_1 virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 19-124 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 40H 10H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 1 SNOOP E_FLIT.SNOOP.LINK_1 virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 20H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 1 non-data E_FLIT.NDR.LINK_1 response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 07H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 0 virtual E_FLIT.LINK_0 channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 38H UNC_QPI_TX_STALLED_SINGL Counts cycles the Quickpath outbound link 1 virtual E_FLIT.LINK_1 channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 01H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Data _FLIT.DRS.LINK_0 Response virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 02H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Non_FLIT.NCB.LINK_0 Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 04H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Non_FLIT.NCS.LINK_0 Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 08H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Data _FLIT.DRS.LINK_1 Response virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 10H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Non_FLIT.NCB.LINK_1 Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. Event Mask Mnemonic Description Comment Vol. 3B 19-125 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 41H 20H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Non_FLIT.NCS.LINK_1 Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 07H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 virtual _FLIT.LINK_0 channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 38H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 virtual _FLIT.LINK_1 channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 42H 02H UNC_QPI_TX_HEADER.BUSY.LI Number of cycles that the header buffer in the NK_0 Quickpath Interface outbound link 0 is busy. 42H 08H UNC_QPI_TX_HEADER.BUSY.LI Number of cycles that the header buffer in the NK_1 Quickpath Interface outbound link 1 is busy. 43H 01H UNC_QPI_RX_NO_PPT_CREDI T.STALLS.LINK_0 Number of cycles that snoop packets incoming to the Quickpath Interface link 0 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries. 43H 02H UNC_QPI_RX_NO_PPT_CREDI T.STALLS.LINK_1 Number of cycles that snoop packets incoming to the Quickpath Interface link 1 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries. 60H 01H UNC_DRAM_OPEN.CH0 Counts number of DRAM Channel 0 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 60H 02H UNC_DRAM_OPEN.CH1 Counts number of DRAM Channel 1 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 60H 04H UNC_DRAM_OPEN.CH2 Counts number of DRAM Channel 2 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 61H 01H UNC_DRAM_PAGE_CLOSE.CH0 DRAM channel 0 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 61H 02H UNC_DRAM_PAGE_CLOSE.CH1 DRAM channel 1 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 61H 04H UNC_DRAM_PAGE_CLOSE.CH2 DRAM channel 2 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 19-126 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 62H 01H UNC_DRAM_PAGE_MISS.CH0 Counts the number of precharges (PRE) that were issued to DRAM channel 0 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 62H 02H UNC_DRAM_PAGE_MISS.CH1 Counts the number of precharges (PRE) that were issued to DRAM channel 1 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 62H 04H UNC_DRAM_PAGE_MISS.CH2 Counts the number of precharges (PRE) that were issued to DRAM channel 2 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 63H 01H UNC_DRAM_READ_CAS.CH0 Counts the number of times a read CAS command was issued on DRAM channel 0. 63H 02H UNC_DRAM_READ_CAS.AUTO PRE_CH0 Counts the number of times a read CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode. 63H 04H UNC_DRAM_READ_CAS.CH1 Counts the number of times a read CAS command was issued on DRAM channel 1. 63H 08H UNC_DRAM_READ_CAS.AUTO PRE_CH1 Counts the number of times a read CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode. 63H 10H UNC_DRAM_READ_CAS.CH2 Counts the number of times a read CAS command was issued on DRAM channel 2. 63H 20H UNC_DRAM_READ_CAS.AUTO PRE_CH2 Counts the number of times a read CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode. 64H 01H UNC_DRAM_WRITE_CAS.CH0 Counts the number of times a write CAS command was issued on DRAM channel 0. 64H 02H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH0 issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode. 64H 04H UNC_DRAM_WRITE_CAS.CH1 64H 08H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH1 issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode. 64H 10H UNC_DRAM_WRITE_CAS.CH2 Comment Counts the number of times a write CAS command was issued on DRAM channel 1. Counts the number of times a write CAS command was issued on DRAM channel 2. Vol. 3B 19-127 PERFORMANCE MONITORING EVENTS Table 19-22. Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.) Event Num. Umask Value 64H 20H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH2 issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode. 65H 01H UNC_DRAM_REFRESH.CH0 Counts number of DRAM channel 0 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 65H 02H UNC_DRAM_REFRESH.CH1 Counts number of DRAM channel 1 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 65H 04H UNC_DRAM_REFRESH.CH2 Counts number of DRAM channel 2 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 66H 01H UNC_DRAM_PRE_ALL.CH0 Counts number of DRAM Channel 0 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. 66H 02H UNC_DRAM_PRE_ALL.CH1 Counts number of DRAM Channel 1 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. 66H 04H UNC_DRAM_PRE_ALL.CH2 Counts number of DRAM Channel 2 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. Event Mask Mnemonic Description Comment Intel Xeon processors with CPUID signature of DisplayFamily_DisplayModel 06_2EH have a distinct uncore subsystem that is significantly different from the uncore found in processors with CPUID signature 06_1AH, 06_1EH, and 06_1FH. Model-specific performance monitoring events for its uncore will be available in future documentation. 19.11 PERFORMANCE MONITORING EVENTS FOR PROCESSORS BASED ON INTEL® MICROARCHITECTURE CODE NAME WESTMERE Intel 64 processors based on Intel® microarchitecture code name Westmere support the architectural and modelspecific performance monitoring events listed in Table 19-1 and Table 19-23. Table 19-23 applies to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_25H, 06_2CH. In addition, these processors (CPUID signature of DisplayFamily_DisplayModel 06_25H, 06_2CH) also support the following model-specific, product-specific uncore performance monitoring events listed in Table 19-24. Fixed counters support the architecture events defined in Table 19-2. Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere Event Num. Umask Value 03H 02H LOAD_BLOCK.OVERLAP_STOR Loads that partially overlap an earlier store. E 04H 07H SB_DRAIN.ANY 19-128 Vol. 3B Event Mask Mnemonic Description All Store buffer stall cycles. Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 05H 02H MISALIGN_MEMORY.STORE All store referenced with misaligned address. 06H 04H STORE_BLOCKS.AT_RET Counts number of loads delayed with at-Retirement block code. The following loads need to be executed at retirement and wait for all senior stores on the same thread to be drained: load splitting across 4K boundary (page split), load accessing uncacheable (UC or WC) memory, load lock, and load with page table in UC or WC memory region. 06H 08H STORE_BLOCKS.L1D_BLOCK Cacheable loads delayed with L1D block code. 07H 01H PARTIAL_ADDRESS_ALIAS Counts false dependency due to partial address aliasing. 08H 01H DTLB_LOAD_MISSES.ANY Counts all load misses that cause a page walk. 08H 02H DTLB_LOAD_MISSES.WALK_C OMPLETED Counts number of completed page walks due to load miss in the STLB. 08H 04H DTLB_LOAD_MISSES.WALK_CY Cycles PMH is busy with a page walk due to a load CLES miss in the STLB. 08H 10H DTLB_LOAD_MISSES.STLB_HI T Number of cache load STLB hits. 08H 20H DTLB_LOAD_MISSES.PDE_MIS S Number of DTLB cache load misses where the low part of the linear to physical address translation was missed. 0BH 01H MEM_INST_RETIRED.LOADS Counts the number of instructions with an architecturally-visible load retired on the architected path. 0BH 02H MEM_INST_RETIRED.STORES Counts the number of instructions with an architecturally-visible store retired on the architected path. 0BH 10H MEM_INST_RETIRED.LATENCY Counts the number of instructions exceeding the _ABOVE_THRESHOLD latency specified with ld_lat facility. 0CH 01H MEM_STORE_RETIRED.DTLB_ MISS The event counts the number of retired stores that missed the DTLB. The DTLB miss is not counted if the store operation causes a fault. Does not counter prefetches. Counts both primary and secondary misses to the TLB. 0EH 01H UOPS_ISSUED.ANY Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i.e. the UOPs issued from the front end to the back end. 0EH 01H UOPS_ISSUED.STALLED_CYCL ES Counts the number of cycles no uops issued by the Register Allocation Table to the Reservation Station, i.e. the UOPs issued from the front end to the back end. 0EH 02H UOPS_ISSUED.FUSED Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station. 0FH 01H MEM_UNCORE_RETIRED.UNK NOWN_SOURCE Load instructions retired with unknown LLC miss (Precise Event). 0FH 02H MEM_UNCORE_RETIRED.OHTE Load instructions retired that HIT modified data in R_CORE_L2_HIT sibling core (Precise Event). Comment In conjunction with ld_lat facility. Set “invert=1, cmask = 1“. Applicable to one and two sockets. Applicable to one and two sockets. Vol. 3B 19-129 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 0FH 04H MEM_UNCORE_RETIRED.REMO Load instructions retired that HIT modified data in TE_HITM remote socket (Precise Event). Applicable to two sockets only. 0FH 08H MEM_UNCORE_RETIRED.LOCA Load instructions retired local dram and remote L_DRAM_AND_REMOTE_CACH cache HIT data sources (Precise Event). E_HIT Applicable to one and two sockets. 0FH 10H MEM_UNCORE_RETIRED.REMO Load instructions retired remote DRAM and remote TE_DRAM home-remote cache HITM (Precise Event). Applicable to two sockets only. 0FH 20H MEM_UNCORE_RETIRED.OTHE Load instructions retired other LLC miss (Precise R_LLC_MISS Event). Applicable to two sockets only. 0FH 80H MEM_UNCORE_RETIRED.UNCA Load instructions retired I/O (Precise Event). CHEABLE Applicable to one and two sockets. 10H 01H FP_COMP_OPS_EXE.X87 Counts the number of FP Computational Uops Executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction. 10H 02H FP_COMP_OPS_EXE.MMX Counts number of MMX Uops executed. Counts number of SSE and SSE2 FP uops executed. Event Mask Mnemonic Description 10H 04H FP_COMP_OPS_EXE.SSE_FP 10H 08H FP_COMP_OPS_EXE.SSE2_INT Counts number of SSE2 integer uops executed. EGER 10H 10H FP_COMP_OPS_EXE.SSE_FP_P Counts number of SSE FP packed uops executed. ACKED 10H 20H FP_COMP_OPS_EXE.SSE_FP_S Counts number of SSE FP scalar uops executed. CALAR 10H 40H FP_COMP_OPS_EXE.SSE_SING Counts number of SSE* FP single precision uops LE_PRECISION executed. 10H 80H FP_COMP_OPS_EXE.SSE_DOU Counts number of SSE* FP double precision uops BLE_PRECISION executed. 12H 01H SIMD_INT_128.PACKED_MPY 12H 02H SIMD_INT_128.PACKED_SHIFT Counts number of 128 bit SIMD integer shift operations. 12H 04H SIMD_INT_128.PACK Counts number of 128 bit SIMD integer pack operations. 12H 08H SIMD_INT_128.UNPACK Counts number of 128 bit SIMD integer unpack operations. 12H 10H SIMD_INT_128.PACKED_LOGIC Counts number of 128 bit SIMD integer logical AL operations. 12H 20H SIMD_INT_128.PACKED_ARIT H 12H 40H SIMD_INT_128.SHUFFLE_MOV Counts number of 128 bit SIMD integer shuffle and E move operations. 19-130 Vol. 3B Counts number of 128 bit SIMD integer multiply operations. Counts number of 128 bit SIMD integer arithmetic operations. Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 13H 01H LOAD_DISPATCH.RS Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer. 13H 02H LOAD_DISPATCH.RS_DELAYED Counts the number of delayed RS dispatches at the stage latch. If an RS dispatch cannot bypass to LB, it has another chance to dispatch from the one-cycle delayed staging latch before it is written into the LB. 13H 04H LOAD_DISPATCH.MOB Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer. 13H 07H LOAD_DISPATCH.ANY Counts all loads dispatched from the Reservation Station. 14H 01H ARITH.CYCLES_DIV_BUSY Counts the number of cycles the divider is busy Count may be incorrect executing divide or square root operations. The When SMT is on. divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Set 'edge =1, invert=1, cmask=1' to count the number of divides. 14H 02H ARITH.MUL Counts the number of multiply operations executed. Count may be incorrect This includes integer as well as floating point When SMT is on. multiply operations but excludes DPPS mul and MPSAD. 17H 01H INST_QUEUE_WRITES Counts the number of instructions written into the instruction queue every cycle. 18H 01H INST_DECODED.DEC0 Counts number of instructions that require decoder 0 to be decoded. Usually, this means that the instruction maps to more than 1 uop. 19H 01H TWO_UOP_INSTS_DECODED An instruction that generates two uops was decoded. 1EH 01H INST_QUEUE_WRITE_CYCLES This event counts the number of cycles during which instructions are written to the instruction queue. Dividing this counter by the number of instructions written to the instruction queue (INST_QUEUE_WRITES) yields the average number of instructions decoded each cycle. If this number is less than four and the pipe stalls, this indicates that the decoder is failing to decode enough instructions per cycle to sustain the 4-wide pipeline. 20H 01H LSD_OVERFLOW Number of loops that cannot stream from the instruction queue. 24H 01H L2_RQSTS.LD_HIT Counts number of loads that hit the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches. L2 loads can be rejected for various reasons. Only non rejected loads are counted. 24H 02H L2_RQSTS.LD_MISS Counts the number of loads that miss the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches. 24H 03H L2_RQSTS.LOADS Counts all L2 load requests. L2 loads include both L1D demand misses as well as L1D prefetches. Comment If SSE* instructions that are 6 bytes or longer arrive one after another, then front end throughput may limit execution speed. Vol. 3B 19-131 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 24H 04H L2_RQSTS.RFO_HIT Counts the number of store RFO requests that hit the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. Count includes WC memory requests, where the data is not fetched but the permission to write the line is required. 24H 08H L2_RQSTS.RFO_MISS Counts the number of store RFO requests that miss the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. 24H 0CH L2_RQSTS.RFOS Counts all L2 store RFO requests. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. 24H 10H L2_RQSTS.IFETCH_HIT Counts number of instruction fetches that hit the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 20H L2_RQSTS.IFETCH_MISS Counts number of instruction fetches that miss the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 30H L2_RQSTS.IFETCHES Counts all instruction fetches. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches. 24H 40H L2_RQSTS.PREFETCH_HIT Counts L2 prefetch hits for both code and data. 24H 80H L2_RQSTS.PREFETCH_MISS Counts L2 prefetch misses for both code and data. 24H C0H L2_RQSTS.PREFETCHES Counts all L2 prefetches for both code and data. 24H AAH L2_RQSTS.MISS Counts all L2 misses for both code and data. 24H FFH L2_RQSTS.REFERENCES Counts all L2 requests for both code and data. 26H 01H L2_DATA_RQSTS.DEMAND.I_S Counts number of L2 data demand loads where the TATE cache line to be loaded is in the I (invalid) state, i.e., a cache miss. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 02H L2_DATA_RQSTS.DEMAND.S_ STATE Counts number of L2 data demand loads where the cache line to be loaded is in the S (shared) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 04H L2_DATA_RQSTS.DEMAND.E_ STATE Counts number of L2 data demand loads where the cache line to be loaded is in the E (exclusive) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 08H L2_DATA_RQSTS.DEMAND.M_ Counts number of L2 data demand loads where the STATE cache line to be loaded is in the M (modified) state. L2 demand loads are both L1D demand misses and L1D prefetches. 26H 0FH L2_DATA_RQSTS.DEMAND.ME Counts all L2 data demand requests. L2 demand SI loads are both L1D demand misses and L1D prefetches. 19-132 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 26H 10H L2_DATA_RQSTS.PREFETCH.I_ Counts number of L2 prefetch data loads where the STATE cache line to be loaded is in the I (invalid) state, i.e., a cache miss. 26H 20H L2_DATA_RQSTS.PREFETCH.S Counts number of L2 prefetch data loads where the _STATE cache line to be loaded is in the S (shared) state. A prefetch RFO will miss on an S state line, while a prefetch read will hit on an S state line. 26H 40H L2_DATA_RQSTS.PREFETCH.E Counts number of L2 prefetch data loads where the _STATE cache line to be loaded is in the E (exclusive) state. 26H 80H L2_DATA_RQSTS.PREFETCH.M Counts number of L2 prefetch data loads where the _STATE cache line to be loaded is in the M (modified) state. 26H F0H L2_DATA_RQSTS.PREFETCH.M Counts all L2 prefetch requests. ESI 26H FFH L2_DATA_RQSTS.ANY Counts all L2 data requests. 27H 01H L2_WRITE.RFO.I_STATE Counts number of L2 demand store RFO requests This is a demand RFO where the cache line to be loaded is in the I (invalid) request. state, i.e., a cache miss. The L1D prefetcher does not issue a RFO prefetch. 27H 02H L2_WRITE.RFO.S_STATE Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in the S (shared) state. request. The L1D prefetcher does not issue a RFO prefetch. 27H 08H L2_WRITE.RFO.M_STATE Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in the M (modified) state. request. The L1D prefetcher does not issue a RFO prefetch. 27H 0EH L2_WRITE.RFO.HIT Counts number of L2 store RFO requests where the This is a demand RFO cache line to be loaded is in either the S, E or M request. states. The L1D prefetcher does not issue a RFO prefetch. 27H 0FH L2_WRITE.RFO.MESI Counts all L2 store RFO requests. The L1D prefetcher does not issue a RFO prefetch. 27H 10H L2_WRITE.LOCK.I_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, i.e., a cache miss. 27H 20H L2_WRITE.LOCK.S_STATE Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state. 27H 40H L2_WRITE.LOCK.E_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state. 27H 80H L2_WRITE.LOCK.M_STATE Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state. 27H E0H L2_WRITE.LOCK.HIT Counts number of L2 demand lock RFO requests where the cache line to be loaded is in either the S, E, or M state. 27H F0H L2_WRITE.LOCK.MESI Counts all L2 demand lock RFO requests. Event Mask Mnemonic Description Comment This is a demand RFO request. Vol. 3B 19-133 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 28H 01H L1D_WB_L2.I_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i.e., a cache miss. 28H 02H L1D_WB_L2.S_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the S state. 28H 04H L1D_WB_L2.E_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state. 28H 08H L1D_WB_L2.M_STATE Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state. 28H 0FH L1D_WB_L2.MESI Counts all L1 writebacks to the L2 . 2EH 41H L3_LAT_CACHE.MISS Counts uncore Last Level Cache misses. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. See Table 19-1. 2EH 4FH L3_LAT_CACHE.REFERENCE Counts uncore Last Level Cache references. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended. See Table 19-1. 3CH 00H CPU_CLK_UNHALTED.THREAD Counts the number of thread cycles while the _P thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. See Table 19-1. 3CH 01H CPU_CLK_UNHALTED.REF_P Increments at the frequency of TSC when not halted. See Table 19-1. 49H 01H DTLB_MISSES.ANY Counts the number of misses in the STLB which causes a page walk. 49H 02H DTLB_MISSES.WALK_COMPLE TED Counts number of misses in the STLB which resulted in a completed page walk. 49H 04H DTLB_MISSES.WALK_CYCLES Counts cycles of page walk due to misses in the STLB. 49H 10H DTLB_MISSES.STLB_HIT Counts the number of DTLB first level misses that hit in the second level TLB. This event is only relevant if the core contains multiple DTLB levels. 49H 20H DTLB_MISSES.PDE_MISS Number of DTLB misses caused by low part of address, includes references to 2M pages because 2M pages do not use the PDE. 49H 80H DTLB_MISSES.LARGE_WALK_C Counts number of completed large page walks due OMPLETED to misses in the STLB. 4CH 01H LOAD_HIT_PRE 19-134 Vol. 3B Comment Counter 0, 1 only. Counts load operations sent to the L1 data cache while a previous SSE prefetch instruction to the same cache line has started prefetching but has not yet finished. PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment 4EH 01H L1D_PREFETCH.REQUESTS Counts number of hardware prefetch requests dispatched out of the prefetch FIFO. Counter 0, 1 only. 4EH 02H L1D_PREFETCH.MISS Counts number of hardware prefetch requests that Counter 0, 1 only. miss the L1D. There are two prefetchers in the L1D. A streamer, which predicts lines sequentially after this one should be fetched, and the IP prefetcher that remembers access patterns for the current instruction. The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not. 4EH 04H L1D_PREFETCH.TRIGGERS Counts number of prefetch requests triggered by Counter 0, 1 only. the Finite State Machine and pushed into the prefetch FIFO. Some of the prefetch requests are dropped due to overwrites or competition between the IP index prefetcher and streamer prefetcher. The prefetch FIFO contains 4 entries. 4FH 10H EPT.WALK_CYCLES Counts Extended Page walk cycles. 51H 01H L1D.REPL Counts the number of lines brought into the L1 data Counter 0, 1 only. cache. 51H 02H L1D.M_REPL Counts the number of modified lines brought into the L1 data cache. Counter 0, 1 only. 51H 04H L1D.M_EVICT Counts the number of modified lines evicted from the L1 data cache due to replacement. Counter 0, 1 only. 51H 08H L1D.M_SNOOP_EVICT Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention. Counter 0, 1 only. 52H 01H L1D_CACHE_PREFETCH_LOCK Counts the number of cacheable load lock _FB_HIT speculated instructions accepted into the fill buffer. 60H 01H OFFCORE_REQUESTS_OUTST Counts weighted cycles of offcore demand data ANDING.DEMAND.READ_DATA read requests. Does not include L2 prefetch requests. Counter 0. 60H 02H OFFCORE_REQUESTS_OUTST Counts weighted cycles of offcore demand code ANDING.DEMAND.READ_CODE read requests. Does not include L2 prefetch requests. Counter 0. 60H 04H OFFCORE_REQUESTS_OUTST ANDING.DEMAND.RFO Counts weighted cycles of offcore demand RFO requests. Does not include L2 prefetch requests. Counter 0. 60H 08H OFFCORE_REQUESTS_OUTST ANDING.ANY.READ Counts weighted cycles of offcore read requests of any kind. Include L2 prefetch requests. Counter 0. 63H 01H CACHE_LOCK_CYCLES.L1D_L2 Cycle count during which the L1D and L2 are locked. A lock is asserted when there is a locked memory access, due to uncacheable memory, a locked operation that spans two cache lines, or a page walk from an uncacheable page table. This event does not cause locks, it merely detects them. Counter 0, 1 only. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such accesses. 63H 02H CACHE_LOCK_CYCLES.L1D Counts the number of cycles that cacheline in the L1 data cache unit is locked. Counter 0, 1 only. 6CH 01H IO_TRANSACTIONS Counts the number of completed I/O transactions. 80H 01H L1I.HITS Counts all instruction fetches that hit the L1 instruction cache. Vol. 3B 19-135 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 80H 02H L1I.MISSES Counts all instruction fetches that miss the L1I cache. This includes instruction cache misses, streaming buffer misses, victim cache misses and uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 80H 03H L1I.READS Counts all instruction fetches, including uncacheable fetches that bypass the L1I. 80H 04H L1I.CYCLES_STALLED Cycle counts for which an instruction fetch stalls due to a L1I cache miss, ITLB miss or ITLB fault. 82H 01H LARGE_ITLB.HIT Counts number of large ITLB hits. 85H 01H ITLB_MISSES.ANY Counts the number of misses in all levels of the ITLB which causes a page walk. 85H 02H ITLB_MISSES.WALK_COMPLET Counts number of misses in all levels of the ITLB ED which resulted in a completed page walk. 85H 04H ITLB_MISSES.WALK_CYCLES Counts ITLB miss page walk cycles. 85H 10H ITLB_MISSES.STLB_HIT Counts number of ITLB first level miss but second level hits. 85H 80H ITLB_MISSES.LARGE_WALK_C OMPLETED Counts number of completed large page walks due to misses in the STLB. 87H 01H ILD_STALL.LCP Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX.W (for Intel 64) instructions which change the length of the decoded instruction. 87H 02H ILD_STALL.MRU Instruction Length Decoder stall cycles due to Brand Prediction Unit (PBU) Most Recently Used (MRU) bypass. 87H 04H ILD_STALL.IQ_FULL Stall cycles due to a full instruction queue. 87H 08H ILD_STALL.REGEN Counts the number of regen stalls. 87H 0FH ILD_STALL.ANY Counts any cycles the Instruction Length Decoder is stalled. 88H 01H BR_INST_EXEC.COND Counts the number of conditional near branch instructions executed, but not necessarily retired. 88H 02H BR_INST_EXEC.DIRECT Counts all unconditional near branch instructions excluding calls and indirect branches. 88H 04H BR_INST_EXEC.INDIRECT_NO N_CALL Counts the number of executed indirect near branch instructions that are not calls. 88H 07H BR_INST_EXEC.NON_CALLS Counts all non-call near branch instructions executed, but not necessarily retired. 88H 08H BR_INST_EXEC.RETURN_NEA R Counts indirect near branches that have a return mnemonic. 88H 10H BR_INST_EXEC.DIRECT_NEAR _CALL Counts unconditional near call branch instructions, excluding non-call branch, executed. 88H 20H BR_INST_EXEC.INDIRECT_NEA Counts indirect near calls, including both register R_CALL and memory indirect, executed. 19-136 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 88H 30H BR_INST_EXEC.NEAR_CALLS Counts all near call branches executed, but not necessarily retired. 88H 40H BR_INST_EXEC.TAKEN Counts taken near branches executed, but not necessarily retired. 88H 7FH BR_INST_EXEC.ANY Counts all near executed branches (not necessarily retired). This includes only instructions and not micro-op branches. Frequent branching is not necessarily a major performance issue. However frequent branch mispredictions may be a problem. 89H 01H BR_MISP_EXEC.COND Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired. 89H 02H BR_MISP_EXEC.DIRECT Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0). 89H 04H BR_MISP_EXEC.INDIRECT_NO N_CALL Counts the number of executed mispredicted indirect near branch instructions that are not calls. 89H 07H BR_MISP_EXEC.NON_CALLS Counts mispredicted non-call near branches executed, but not necessarily retired. 89H 08H BR_MISP_EXEC.RETURN_NEA R Counts mispredicted indirect branches that have a rear return mnemonic. 89H 10H BR_MISP_EXEC.DIRECT_NEAR Counts mispredicted non-indirect near calls _CALL executed, (should always be 0). 89H 20H BR_MISP_EXEC.INDIRECT_NE AR_CALL Counts mispredicted indirect near calls executed, including both register and memory indirect. 89H 30H BR_MISP_EXEC.NEAR_CALLS Counts all mispredicted near call branches executed, but not necessarily retired. 89H 40H BR_MISP_EXEC.TAKEN Counts executed mispredicted near branches that are taken, but not necessarily retired. 89H 7FH BR_MISP_EXEC.ANY Counts the number of mispredicted near branch instructions that were executed, but not necessarily retired. A2H 01H RESOURCE_STALLS.ANY Counts the number of Allocator resource related stalls. Includes register renaming buffer entries, memory buffer entries. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations. A2H 02H RESOURCE_STALLS.LOAD Counts the cycles of stall due to lack of load buffer for load operation. Comment Does not include stalls due to SuperQ (off core) queue full, too many cache misses, etc. Vol. 3B 19-137 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment A2H 04H RESOURCE_STALLS.RS_FULL This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle. A high count of this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, or instructions dependent upon instructions further down the pipeline that have yet to retire. When RS is full, new instructions cannot enter the reservation station and start execution. A2H 08H RESOURCE_STALLS.STORE This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline, (i.e. all store buffers are used). The stall ends when a store instruction commits its data to the cache or memory. A2H 10H RESOURCE_STALLS.ROB_FULL Counts the cycles of stall due to re-order buffer full. A2H 20H RESOURCE_STALLS.FPCW Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word. A2H 40H RESOURCE_STALLS.MXCSR Stalls due to the MXCSR register rename occurring to close to a previous MXCSR rename. The MXCSR provides control and status for the MMX registers. A2H 80H RESOURCE_STALLS.OTHER Counts the number of cycles while execution was stalled due to other resource issues. A6H 01H MACRO_INSTS.FUSIONS_DECO Counts the number of instructions decoded that are DED macro-fused but not necessarily executed or retired. A7H 01H BACLEAR_FORCE_IQ Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is also responsible for providing conditional branch prediction direction based on a static scheme and dynamic data provided by the L2 Branch Prediction Unit. If the conditional branch target is not found in the Target Array and the IQ predicts that the branch is taken, then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. A8H 01H LSD.UOPS Counts the number of micro-ops delivered by loop stream detector. AEH 01H ITLB_FLUSH Counts the number of ITLB flushes. B0H 01H OFFCORE_REQUESTS.DEMAN D.READ_DATA Counts number of offcore demand data read requests. Does not count L2 prefetch requests. B0H 02H OFFCORE_REQUESTS.DEMAN D.READ_CODE Counts number of offcore demand code read requests. Does not count L2 prefetch requests. B0H 04H OFFCORE_REQUESTS.DEMAN D.RFO Counts number of offcore demand RFO requests. Does not count L2 prefetch requests. B0H 08H OFFCORE_REQUESTS.ANY.REA Counts number of offcore read requests. Includes D L2 prefetch requests. 19-138 Vol. 3B Use cmask=1 and invert to count cycles. PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value B0H 10H OFFCORE_REQUESTS.ANY.RFO Counts number of offcore RFO requests. Includes L2 prefetch requests. B0H 40H OFFCORE_REQUESTS.L1D_WR Counts number of L1D writebacks to the uncore. ITEBACK Event Mask Mnemonic Description Comment B0H 80H OFFCORE_REQUESTS.ANY Counts all offcore requests. B1H 01H UOPS_EXECUTED.PORT0 Counts number of uops executed that were issued on port 0. Port 0 handles integer arithmetic, SIMD and FP add uops. B1H 02H UOPS_EXECUTED.PORT1 Counts number of uops executed that were issued on port 1. Port 1 handles integer arithmetic, SIMD, integer shift, FP multiply and FP divide uops. B1H 04H UOPS_EXECUTED.PORT2_COR Counts number of uops executed that were issued E on port 2. Port 2 handles the load uops. This is a core count only and cannot be collected per thread. B1H 08H UOPS_EXECUTED.PORT3_COR Counts number of uops executed that were issued E on port 3. Port 3 handles store uops. This is a core count only and cannot be collected per thread. B1H 10H UOPS_EXECUTED.PORT4_COR Counts number of uops executed that where issued E on port 4. Port 4 handles the value to be stored for the store uops issued on port 3. This is a core count only and cannot be collected per thread. B1H 1FH UOPS_EXECUTED.CORE_ACTI VE_CYCLES_NO_PORT5 Counts number of cycles there are one or more uops being executed and were issued on ports 0-4. This is a core count only and cannot be collected per thread. B1H 20H UOPS_EXECUTED.PORT5 Counts number of uops executed that where issued on port 5. B1H 3FH UOPS_EXECUTED.CORE_ACTI VE_CYCLES Counts number of cycles there are one or more uops being executed on any ports. This is a core count only and cannot be collected per thread. B1H 40H UOPS_EXECUTED.PORT015 Counts number of uops executed that where issued Use cmask=1, invert=1 on port 0, 1, or 5. to count stall cycles. B1H 80H UOPS_EXECUTED.PORT234 Counts number of uops executed that where issued on port 2, 3, or 4. B2H 01H OFFCORE_REQUESTS_SQ_FUL Counts number of cycles the SQ is full to handle offL core requests. B3H 01H SNOOPQ_REQUESTS_OUTSTA Counts weighted cycles of snoopq requests for NDING.DATA data. Counter 0 only. Use cmask=1 to count cycles not empty. B3H 02H SNOOPQ_REQUESTS_OUTSTA Counts weighted cycles of snoopq invalidate NDING.INVALIDATE requests. Counter 0 only. Use cmask=1 to count cycles not empty. B3H 04H SNOOPQ_REQUESTS_OUTSTA Counts weighted cycles of snoopq requests for NDING.CODE code. Counter 0 only. Use cmask=1 to count cycles not empty. B4H 01H SNOOPQ_REQUESTS.CODE Counts the number of snoop code requests. B4H 02H SNOOPQ_REQUESTS.DATA Counts the number of snoop data requests. B4H 04H SNOOPQ_REQUESTS.INVALID ATE Counts the number of snoop invalidate requests. Vol. 3B 19-139 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description Comment B7H 01H OFF_CORE_RESPONSE_0 See Section 18.3.1.1.3, “Off-core Response Performance Monitoring in the Processor Core”. Requires programming MSR 01A6H. B8H 01H SNOOP_RESPONSE.HIT Counts HIT snoop response sent by this thread in response to a snoop request. B8H 02H SNOOP_RESPONSE.HITE Counts HIT E snoop response sent by this thread in response to a snoop request. B8H 04H SNOOP_RESPONSE.HITM Counts HIT M snoop response sent by this thread in response to a snoop request. BBH 01H OFF_CORE_RESPONSE_1 See Section 18.3.1.1.3, “Off-core Response Performance Monitoring in the Processor Core”. Use MSR 01A7H. C0H 00H INST_RETIRED.ANY_P See Table 19-1. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions. Notes: INST_RETIRED.ANY is counted by a designated fixed counter. INST_RETIRED.ANY_P is counted by a programmable counter and is an architectural performance event. Event is supported if CPUID.A.EBX[1] = 0. C0H 02H INST_RETIRED.X87 Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and suboperations of complex floating point instructions like transcendental instructions. C0H 04H INST_RETIRED.MMX Counts the number of retired: MMX instructions. C2H 01H UOPS_RETIRED.ANY Counts the number of micro-ops retired, (macroUse cmask=1 and invert fused=1, micro-fused=2, others=1; maximum count to count active cycles or of 8 per cycle). Most instructions are composed of stalled cycles. one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. C2H 02H UOPS_RETIRED.RETIRE_SLOT S Counts the number of retirement slots used each cycle. C2H 04H UOPS_RETIRED.MACRO_FUSE D Counts number of macro-fused uops retired. C3H 01H MACHINE_CLEARS.CYCLES Counts the cycles machine clear is asserted. C3H 02H MACHINE_CLEARS.MEM_ORDE Counts the number of machine clears due to R memory order conflicts. C3H 04H MACHINE_CLEARS.SMC C4H 00H BR_INST_RETIRED.ALL_BRAN Branch instructions at retirement. CHES C4H 01H BR_INST_RETIRED.CONDITION Counts the number of conditional branch AL instructions retired. C4H 02H BR_INST_RETIRED.NEAR_CAL Counts the number of direct & indirect near L unconditional calls retired. 19-140 Vol. 3B Counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and L3caches. See Table 19-1. PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value C5H 00H BR_MISP_RETIRED.ALL_BRAN Mispredicted branch instructions at retirement. CHES C5H 01H BR_MISP_RETIRED.CONDITION Counts mispredicted conditional retired calls. AL C5H 02H BR_MISP_RETIRED.NEAR_CAL Counts mispredicted direct & indirect near L unconditional retired calls. C5H 04H BR_MISP_RETIRED.ALL_BRAN Counts all mispredicted retired calls. CHES C7H 01H SSEX_UOPS_RETIRED.PACKED Counts SIMD packed single-precision floating-point _SINGLE uops retired. C7H 02H SSEX_UOPS_RETIRED.SCALAR Counts SIMD scalar single-precision floating-point _SINGLE uops retired. C7H 04H SSEX_UOPS_RETIRED.PACKED Counts SIMD packed double-precision floating-point _DOUBLE uops retired. C7H 08H SSEX_UOPS_RETIRED.SCALAR Counts SIMD scalar double-precision floating-point _DOUBLE uops retired. C7H 10H SSEX_UOPS_RETIRED.VECTOR Counts 128-bit SIMD vector integer uops retired. _INTEGER C8H 20H ITLB_MISS_RETIRED CBH 01H MEM_LOAD_RETIRED.L1D_HIT Counts number of retired loads that hit the L1 data cache. CBH 02H MEM_LOAD_RETIRED.L2_HIT Counts number of retired loads that hit the L2 data cache. CBH 04H MEM_LOAD_RETIRED.L3_UNS HARED_HIT Counts number of retired loads that hit their own, unshared lines in the L3 cache. CBH 08H MEM_LOAD_RETIRED.OTHER_ Counts number of retired loads that hit in a sibling CORE_L2_HIT_HITM core's L2 (on die core). Since the L3 is inclusive of all cores on the package, this is an L3 hit. This counts both clean and modified hits. CBH 10H MEM_LOAD_RETIRED.L3_MISS Counts number of retired loads that miss the L3 cache. The load was satisfied by a remote socket, local memory or an IOH. CBH 40H MEM_LOAD_RETIRED.HIT_LFB Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache. This is counting secondary L1D misses. CBH 80H MEM_LOAD_RETIRED.DTLB_MI Counts the number of retired loads that missed the SS DTLB. The DTLB miss is not counted if the load operation causes a fault. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. Counts both primary and secondary misses to the TLB. CCH 01H FP_MMX_TRANS.TO_FP Event Mask Mnemonic Description Comment See Table 19-1. Counts the number of retired instructions that missed the ITLB when the instruction was fetched. Counts the first floating-point instruction following any MMX instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. Vol. 3B 19-141 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description CCH 02H FP_MMX_TRANS.TO_MMX Counts the first MMX instruction following a floating-point instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. CCH 03H FP_MMX_TRANS.ANY Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states. D0H 01H MACRO_INSTS.DECODED Counts the number of instructions decoded, (but not necessarily executed or retired). D1H 01H UOPS_DECODED.STALL_CYCLE Counts the cycles of decoder stalls. INV=1, Cmask= S 1. D1H 02H UOPS_DECODED.MS Counts the number of Uops decoded by the Microcode Sequencer, MS. The MS delivers uops when the instruction is more than 4 uops long or a microcode assist is occurring. D1H 04H UOPS_DECODED.ESP_FOLDIN G Counts number of stack pointer (ESP) instructions decoded: push, pop, call, ret, etc. ESP instructions do not generate a Uop to increment or decrement ESP. Instead, they update an ESP_Offset register that keeps track of the delta to the current value of the ESP register. D1H 08H UOPS_DECODED.ESP_SYNC Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register. D2H 01H RAT_STALLS.FLAGS Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall. A partial register stall may occur when two conditions are met: 1) an instruction modifies some, but not all, of the flags in the flag register and 2) the next instruction, which depends on flags, depends on flags that were not modified by this instruction. D2H 02H RAT_STALLS.REGISTERS This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by previous instruction. D2H 04H RAT_STALLS.ROB_READ_POR T Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline. Note that, at this stage in the pipeline, additional stalls may occur at the same cycle and prevent the stalled micro-ops from entering the pipe. In such a case, micro-ops retry entering the execution pipe in the next cycle and the ROB-read port stall is counted again. D2H 08H RAT_STALLS.SCOREBOARD Counts the cycles where we stall due to microarchitecturally required serialization. Microcode scoreboarding stalls. 19-142 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description D2H 0FH RAT_STALLS.ANY Counts all Register Allocation Table stall cycles due to: Cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the execution pipe, Cycles when partial register stalls occurred, Cycles when flag stalls occurred, Cycles floating-point unit (FPU) status word stalls occurred. To count each of these conditions separately use the events: RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and RAT_STALLS.FPSW. D4H 01H SEG_RENAME_STALLS Counts the number of stall cycles due to the lack of renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D5H 01H ES_REG_RENAMES Counts the number of times the ES segment register is renamed. DBH 01H UOP_UNFUSION Counts unfusion events due to floating point exception to a fused uop. E0H 01H BR_INST_DECODED Counts the number of branch instructions decoded. E5H 01H BPU_MISSED_CALL_RET Counts number of times the Branch Prediction Unit missed predicting a call or return branch. E6H 01H BACLEAR.CLEAR Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. This can occur if the code has many branches such that they cannot be consumed by the BPU. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. The effect on total execution time depends on the surrounding code. E6H 02H BACLEAR.BAD_TARGET Counts number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. E8H 01H BPU_CLEARS.EARLY Counts early (normal) Branch Prediction Unit clears: The BPU clear leads to 2 BPU predicted a taken branch after incorrectly cycle bubble in the front assuming that it was not taken. end. E8H 02H BPU_CLEARS.LATE Counts late Branch Prediction Unit clears due to Most Recently Used conflicts. The PBU clear leads to a 3 cycle bubble in the front end. ECH 01H THREAD_ACTIVE Counts cycles threads are active. F0H 01H L2_TRANSACTIONS.LOAD Counts L2 load operations due to HW prefetch or demand loads. Comment Vol. 3B 19-143 PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description F0H 02H L2_TRANSACTIONS.RFO Counts L2 RFO operations due to HW prefetch or demand RFOs. F0H 04H L2_TRANSACTIONS.IFETCH Counts L2 instruction fetch operations due to HW prefetch or demand ifetch. F0H 08H L2_TRANSACTIONS.PREFETC H Counts L2 prefetch operations. F0H 10H L2_TRANSACTIONS.L1D_WB Counts L1D writeback operations to the L2. F0H 20H L2_TRANSACTIONS.FILL Counts L2 cache line fill operations due to load, RFO, L1D writeback or prefetch. F0H 40H L2_TRANSACTIONS.WB Counts L2 writeback operations to the L3. F0H 80H L2_TRANSACTIONS.ANY Counts all L2 cache operations. F1H 02H L2_LINES_IN.S_STATE Counts the number of cache lines allocated in the L2 cache in the S (shared) state. F1H 04H L2_LINES_IN.E_STATE Counts the number of cache lines allocated in the L2 cache in the E (exclusive) state. F1H 07H L2_LINES_IN.ANY Counts the number of cache lines allocated in the L2 cache. F2H 01H L2_LINES_OUT.DEMAND_CLEA Counts L2 clean cache lines evicted by a demand N request. F2H 02H L2_LINES_OUT.DEMAND_DIRT Counts L2 dirty (modified) cache lines evicted by a Y demand request. F2H 04H L2_LINES_OUT.PREFETCH_CL EAN F2H 08H L2_LINES_OUT.PREFETCH_DIR Counts L2 modified cache line evicted by a prefetch TY request. F2H 0FH L2_LINES_OUT.ANY Counts all L2 cache lines evicted for any reason. F4H 04H SQ_MISC.LRU_HINTS Counts number of Super Queue LRU hints sent to L3. F4H 10H SQ_MISC.SPLIT_LOCK Counts the number of SQ lock splits across a cache line. F6H 01H SQ_FULL_STALL_CYCLES Counts cycles the Super Queue is full. Neither of the threads on this core will be able to access the uncore. F7H 01H FP_ASSIST.ALL Counts the number of floating point operations executed that required micro-code assist intervention. Assists are required in the following cases: SSE instructions, (Denormal input when the DAZ flag is off or Underflow result when the FTZ flag is off): x87 instructions, (NaN or denormal are loaded to a register or used as input from memory, Division by 0 or Underflow output). F7H 02H FP_ASSIST.OUTPUT Counts number of floating point micro-code assist when the output value (destination register) is invalid. F7H 04H FP_ASSIST.INPUT Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid. 19-144 Vol. 3B Counts L2 clean cache line evicted by a prefetch request. Comment PERFORMANCE MONITORING EVENTS Table 19-23. Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description FDH 01H SIMD_INT_64.PACKED_MPY Counts number of SID integer 64 bit packed multiply operations. FDH 02H SIMD_INT_64.PACKED_SHIFT Counts number of SID integer 64 bit packed shift operations. FDH 04H SIMD_INT_64.PACK Counts number of SID integer 64 bit pack operations. FDH 08H SIMD_INT_64.UNPACK Counts number of SID integer 64 bit unpack operations. FDH 10H SIMD_INT_64.PACKED_LOGICA Counts number of SID integer 64 bit logical L operations. FDH 20H SIMD_INT_64.PACKED_ARITH FDH 40H SIMD_INT_64.SHUFFLE_MOVE Counts number of SID integer 64 bit shift or move operations. Comment Counts number of SID integer 64 bit arithmetic operations. Model-specific performance monitoring events of the uncore sub-system for processors with CPUID signature of DisplayFamily_DisplayModel 06_25H, 06_2CH, and 06_1FH support performance events listed in Table 19-24. Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere Event Num. Umask Value 00H 01H UNC_GQ_CYCLES_FULL.READ_ Uncore cycles Global Queue read tracker is full. TRACKER 00H 02H UNC_GQ_CYCLES_FULL.WRITE Uncore cycles Global Queue write tracker is full. _TRACKER 00H 04H UNC_GQ_CYCLES_FULL.PEER_ Uncore cycles Global Queue peer probe tracker is full. PROBE_TRACKER The peer probe tracker queue tracks snoops from the IOH and remote sockets. 01H 01H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue read tracker has at .READ_TRACKER least one valid entry. 01H 02H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue write tracker has at .WRITE_TRACKER least one valid entry. 01H 04H UNC_GQ_CYCLES_NOT_EMPTY Uncore cycles were Global Queue peer probe tracker .PEER_PROBE_TRACKER has at least one valid entry. The peer probe tracker queue tracks IOH and remote socket snoops. 02H 01H UNC_GQ_OCCUPANCY.READ_T Increments the number of queue entries (code read, RACKER data read, and RFOs) in the tread tracker. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency. 03H 01H UNC_GQ_ALLOC.READ_TRACK ER Event Mask Mnemonic Description Comment Counts the number of tread tracker allocate to deallocate entries. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency. Vol. 3B 19-145 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 03H 02H UNC_GQ_ALLOC.RT_L3_MISS Counts the number GQ read tracker entries for which a full cache line read has missed the L3. The GQ read tracker L3 miss to fill occupancy count is divided by this count to obtain the average cache line read L3 miss latency. The latency represents the time after which the L3 has determined that the cache line has missed. The time between a GQ read tracker allocation and the L3 determining that the cache line has missed is the average L3 hit latency. The total L3 cache line read miss latency is the hit latency + L3 miss latency. 03H 04H UNC_GQ_ALLOC.RT_TO_L3_RE Counts the number of GQ read tracker entries that are SP allocated in the read tracker queue that hit or miss the L3. The GQ read tracker L3 hit occupancy count is divided by this count to obtain the average L3 hit latency. 03H 08H UNC_GQ_ALLOC.RT_TO_RTID_ Counts the number of GQ read tracker entries that are ACQUIRED allocated in the read tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ read tracker L3 miss to RTID acquired occupancy count is divided by this count to obtain the average latency for a read L3 miss to acquire an RTID. 03H 10H UNC_GQ_ALLOC.WT_TO_RTID_ Counts the number of GQ write tracker entries that are ACQUIRED allocated in the write tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ write tracker L3 miss to RTID occupancy count is divided by this count to obtain the average latency for a write L3 miss to acquire an RTID. 03H 20H UNC_GQ_ALLOC.WRITE_TRAC KER Counts the number of GQ write tracker entries that are allocated in the write tracker queue that miss the L3. The GQ write tracker occupancy count is divided by this count to obtain the average L3 write miss latency. 03H 40H UNC_GQ_ALLOC.PEER_PROBE _TRACKER Counts the number of GQ peer probe tracker (snoop) entries that are allocated in the peer probe tracker queue that miss the L3. The GQ peer probe occupancy count is divided by this count to obtain the average L3 peer probe miss latency. 04H 01H UNC_GQ_DATA.FROM_QPI Cycles Global Queue Quickpath Interface input data port is busy importing data from the Quickpath Interface. Each cycle the input port can transfer 8 or 16 bytes of data. 04H 02H UNC_GQ_DATA.FROM_QMC Cycles Global Queue Quickpath Memory Interface input data port is busy importing data from the Quickpath Memory Interface. Each cycle the input port can transfer 8 or 16 bytes of data. 04H 04H UNC_GQ_DATA.FROM_L3 Cycles GQ L3 input data port is busy importing data from the Last Level Cache. Each cycle the input port can transfer 32 bytes of data. 04H 08H UNC_GQ_DATA.FROM_CORES_ Cycles GQ Core 0 and 2 input data port is busy 02 importing data from processor cores 0 and 2. Each cycle the input port can transfer 32 bytes of data. 19-146 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 04H 10H UNC_GQ_DATA.FROM_CORES_ Cycles GQ Core 1 and 3 input data port is busy 13 importing data from processor cores 1 and 3. Each cycle the input port can transfer 32 bytes of data. 05H 01H UNC_GQ_DATA.TO_QPI_QMC Cycles GQ QPI and QMC output data port is busy sending data to the Quickpath Interface or Quickpath Memory Interface. Each cycle the output port can transfer 32 bytes of data. 05H 02H UNC_GQ_DATA.TO_L3 Cycles GQ L3 output data port is busy sending data to the Last Level Cache. Each cycle the output port can transfer 32 bytes of data. 05H 04H UNC_GQ_DATA.TO_CORES Cycles GQ Core output data port is busy sending data to the Cores. Each cycle the output port can transfer 32 bytes of data. 06H 01H UNC_SNP_RESP_TO_LOCAL_H Number of snoop responses to the local home that L3 OME.I_STATE does not have the referenced cache line. 06H 02H UNC_SNP_RESP_TO_LOCAL_H Number of snoop responses to the local home that L3 OME.S_STATE has the referenced line cached in the S state. 06H 04H UNC_SNP_RESP_TO_LOCAL_H Number of responses to code or data read snoops to OME.FWD_S_STATE the local home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the local home in the S state. 06H 08H UNC_SNP_RESP_TO_LOCAL_H Number of responses to read invalidate snoops to the OME.FWD_I_STATE local home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the local home in the M state. 06H 10H UNC_SNP_RESP_TO_LOCAL_H Number of conflict snoop responses sent to the local OME.CONFLICT home. 06H 20H UNC_SNP_RESP_TO_LOCAL_H Number of responses to code or data read snoops to OME.WB the local home that the L3 has the referenced line cached in the M state. 07H 01H UNC_SNP_RESP_TO_REMOTE_ Number of snoop responses to a remote home that L3 HOME.I_STATE does not have the referenced cache line. 07H 02H UNC_SNP_RESP_TO_REMOTE_ Number of snoop responses to a remote home that L3 HOME.S_STATE has the referenced line cached in the S state. 07H 04H UNC_SNP_RESP_TO_REMOTE_ Number of responses to code or data read snoops to a HOME.FWD_S_STATE remote home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the remote home in the S state. 07H 08H UNC_SNP_RESP_TO_REMOTE_ Number of responses to read invalidate snoops to a HOME.FWD_I_STATE remote home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the remote home in the M state. 07H 10H UNC_SNP_RESP_TO_REMOTE_ Number of conflict snoop responses sent to the local HOME.CONFLICT home. Event Mask Mnemonic Description Comment Vol. 3B 19-147 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 07H 20H UNC_SNP_RESP_TO_REMOTE_ Number of responses to code or data read snoops to a HOME.WB remote home that the L3 has the referenced line cached in the M state. 07H 24H UNC_SNP_RESP_TO_REMOTE_ Number of HITM snoop responses to a remote home. HOME.HITM 08H 01H UNC_L3_HITS.READ Number of code read, data read and RFO requests that hit in the L3. 08H 02H UNC_L3_HITS.WRITE Number of writeback requests that hit in the L3. Writebacks from the cores will always result in L3 hits due to the inclusive property of the L3. 08H 04H UNC_L3_HITS.PROBE Number of snoops from IOH or remote sockets that hit in the L3. 08H 03H UNC_L3_HITS.ANY Number of reads and writes that hit the L3. 09H 01H UNC_L3_MISS.READ Number of code read, data read and RFO requests that miss the L3. 09H 02H UNC_L3_MISS.WRITE Number of writeback requests that miss the L3. Should always be zero as writebacks from the cores will always result in L3 hits due to the inclusive property of the L3. 09H 04H UNC_L3_MISS.PROBE Number of snoops from IOH or remote sockets that miss the L3. 09H 03H UNC_L3_MISS.ANY Number of reads and writes that miss the L3. 0AH 01H UNC_L3_LINES_IN.M_STATE Counts the number of L3 lines allocated in M state. The only time a cache line is allocated in the M state is when the line was forwarded in M state is forwarded due to a Snoop Read Invalidate Own request. 0AH 02H UNC_L3_LINES_IN.E_STATE Counts the number of L3 lines allocated in E state. 0AH 04H UNC_L3_LINES_IN.S_STATE Counts the number of L3 lines allocated in S state. 0AH 08H UNC_L3_LINES_IN.F_STATE Counts the number of L3 lines allocated in F state. 0AH 0FH UNC_L3_LINES_IN.ANY Counts the number of L3 lines allocated in any state. 0BH 01H UNC_L3_LINES_OUT.M_STATE Counts the number of L3 lines victimized that were in the M state. When the victim cache line is in M state, the line is written to its home cache agent which can be either local or remote. 0BH 02H UNC_L3_LINES_OUT.E_STATE Counts the number of L3 lines victimized that were in the E state. 0BH 04H UNC_L3_LINES_OUT.S_STATE Counts the number of L3 lines victimized that were in the S state. 0BH 08H UNC_L3_LINES_OUT.I_STATE Counts the number of L3 lines victimized that were in the I state. 0BH 10H UNC_L3_LINES_OUT.F_STATE Counts the number of L3 lines victimized that were in the F state. 0BH 1FH UNC_L3_LINES_OUT.ANY Counts the number of L3 lines victimized in any state. 0CH 01H UNC_GQ_SNOOP.GOTO_S Counts the number of remote snoops that have requested a cache line be set to the S state. 19-148 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 0CH 02H UNC_GQ_SNOOP.GOTO_I Counts the number of remote snoops that have requested a cache line be set to the I state. 0CH 04H UNC_GQ_SNOOP.GOTO_S_HIT_ Counts the number of remote snoops that have E requested a cache line be set to the S state from E state. Requires writing MSR 301H with mask = 2H. 0CH 04H UNC_GQ_SNOOP.GOTO_S_HIT_ Counts the number of remote snoops that have F requested a cache line be set to the S state from F (forward) state. Requires writing MSR 301H with mask = 8H. 0CH 04H UNC_GQ_SNOOP.GOTO_S_HIT_ Counts the number of remote snoops that have M requested a cache line be set to the S state from M state. Requires writing MSR 301H with mask = 1H. 0CH 04H UNC_GQ_SNOOP.GOTO_S_HIT_ Counts the number of remote snoops that have S requested a cache line be set to the S state from S state. Requires writing MSR 301H with mask = 4H. 0CH 08H UNC_GQ_SNOOP.GOTO_I_HIT_ Counts the number of remote snoops that have E requested a cache line be set to the I state from E state. Requires writing MSR 301H with mask = 2H. 0CH 08H UNC_GQ_SNOOP.GOTO_I_HIT_ Counts the number of remote snoops that have F requested a cache line be set to the I state from F (forward) state. Requires writing MSR 301H with mask = 8H. 0CH 08H UNC_GQ_SNOOP.GOTO_I_HIT_ Counts the number of remote snoops that have M requested a cache line be set to the I state from M state. Requires writing MSR 301H with mask = 1H. 0CH 08H UNC_GQ_SNOOP.GOTO_I_HIT_ Counts the number of remote snoops that have S requested a cache line be set to the I state from S state. Requires writing MSR 301H with mask = 4H. 20H 01H UNC_QHL_REQUESTS.IOH_RE ADS 20H 02H UNC_QHL_REQUESTS.IOH_WRI Counts number of Quickpath Home Logic write TES requests from the IOH. 20H 04H UNC_QHL_REQUESTS.REMOTE Counts number of Quickpath Home Logic read requests _READS from a remote socket. 20H 08H UNC_QHL_REQUESTS.REMOTE Counts number of Quickpath Home Logic write _WRITES requests from a remote socket. 20H 10H UNC_QHL_REQUESTS.LOCAL_ READS Counts number of Quickpath Home Logic read requests from the local socket. 20H 20H UNC_QHL_REQUESTS.LOCAL_ WRITES Counts number of Quickpath Home Logic write requests from the local socket. 21H 01H UNC_QHL_CYCLES_FULL.IOH Counts uclk cycles all entries in the Quickpath Home Logic IOH are full. 21H 02H UNC_QHL_CYCLES_FULL.REMO Counts uclk cycles all entries in the Quickpath Home TE Logic remote tracker are full. 21H 04H UNC_QHL_CYCLES_FULL.LOCA Counts uclk cycles all entries in the Quickpath Home L Logic local tracker are full. 22H 01H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.IOH Logic IOH is busy. Comment Counts number of Quickpath Home Logic read requests from the IOH. Vol. 3B 19-149 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 22H 02H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.REMOTE Logic remote tracker is busy. 22H 04H UNC_QHL_CYCLES_NOT_EMPT Counts uclk cycles all entries in the Quickpath Home Y.LOCAL Logic local tracker is busy. 23H 01H UNC_QHL_OCCUPANCY.IOH 23H 02H UNC_QHL_OCCUPANCY.REMOT QHL remote tracker allocate to deallocate read E occupancy. 23H 04H UNC_QHL_OCCUPANCY.LOCAL QHL local tracker allocate to deallocate read occupancy. 24H 02H UNC_QHL_ADDRESS_CONFLIC TS.2WAY Counts number of QHL Active Address Table (AAT) entries that saw a max of 2 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates. 24H 04H UNC_QHL_ADDRESS_CONFLIC TS.3WAY Counts number of QHL Active Address Table (AAT) entries that saw a max of 3 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates. 25H 01H UNC_QHL_CONFLICT_CYCLES.I Counts cycles the Quickpath Home Logic IOH Tracker OH contains two or more requests with an address conflict. A max of 3 requests can be in conflict. 25H 02H UNC_QHL_CONFLICT_CYCLES. REMOTE 25H 04H UNC_QHL_CONFLICT_CYCLES.L Counts cycles the Quickpath Home Logic Local Tracker OCAL contains two or more requests with an address conflict. A max of 3 requests can be in conflict. 26H 01H UNC_QHL_TO_QMC_BYPASS Counts number or requests to the Quickpath Memory Controller that bypass the Quickpath Home Logic. All local accesses can be bypassed. For remote requests, only read requests can be bypassed. 28H 01H UNC_QMC_ISOC_FULL.READ.C H0 Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous read requests. 28H 02H UNC_QMC_ISOC_FULL.READ.C H1 Counts cycles all the entries in the DRAM channel 1high priority queue are occupied with isochronous read requests. 28H 04H UNC_QMC_ISOC_FULL.READ.C H2 Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous read requests. 28H 08H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 0 H0 high priority queue are occupied with isochronous write requests. 28H 10H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 1 H1 high priority queue are occupied with isochronous write requests. 19-150 Vol. 3B Event Mask Mnemonic Description QHL IOH tracker allocate to deallocate read occupancy. Counts cycles the Quickpath Home Logic Remote Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict. Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 28H 20H UNC_QMC_ISOC_FULL.WRITE.C Counts cycles all the entries in the DRAM channel 2 H2 high priority queue are occupied with isochronous write requests. 29H 01H UNC_QMC_BUSY.READ.CH0 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 0. 29H 02H UNC_QMC_BUSY.READ.CH1 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 1. 29H 04H UNC_QMC_BUSY.READ.CH2 Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 2. 29H 08H UNC_QMC_BUSY.WRITE.CH0 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 0. 29H 10H UNC_QMC_BUSY.WRITE.CH1 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 1. 29H 20H UNC_QMC_BUSY.WRITE.CH2 Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 2. 2AH 01H UNC_QMC_OCCUPANCY.CH0 IMC channel 0 normal read request occupancy. 2AH 02H UNC_QMC_OCCUPANCY.CH1 IMC channel 1 normal read request occupancy. 2AH 04H UNC_QMC_OCCUPANCY.CH2 IMC channel 2 normal read request occupancy. Normal read request occupancy for any channel. Event Mask Mnemonic Description 2AH 07H UNC_QMC_OCCUPANCY.ANY 2BH 01H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 0 issoc read request occupancy. CH0 2BH 02H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 1 issoc read request occupancy. CH1 2BH 04H UNC_QMC_ISSOC_OCCUPANCY. IMC channel 2 issoc read request occupancy. CH2 2BH 07H UNC_QMC_ISSOC_READS.ANY IMC issoc read request occupancy. 2CH 01H UNC_QMC_NORMAL_READS.C H0 Counts the number of Quickpath Memory Controller channel 0 medium and low priority read requests. The QMC channel 0 normal read occupancy divided by this count provides the average QMC channel 0 read latency. 2CH 02H UNC_QMC_NORMAL_READS.C H1 Counts the number of Quickpath Memory Controller channel 1 medium and low priority read requests. The QMC channel 1 normal read occupancy divided by this count provides the average QMC channel 1 read latency. 2CH 04H UNC_QMC_NORMAL_READS.C H2 Counts the number of Quickpath Memory Controller channel 2 medium and low priority read requests. The QMC channel 2 normal read occupancy divided by this count provides the average QMC channel 2 read latency. Comment Vol. 3B 19-151 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 2CH 07H UNC_QMC_NORMAL_READS.A NY Counts the number of Quickpath Memory Controller medium and low priority read requests. The QMC normal read occupancy divided by this count provides the average QMC read latency. 2DH 01H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH0 channel 0 high priority isochronous read requests. 2DH 02H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH1 channel 1 high priority isochronous read requests. 2DH 04H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.CH2 channel 2 high priority isochronous read requests. 2DH 07H UNC_QMC_HIGH_PRIORITY_RE Counts the number of Quickpath Memory Controller ADS.ANY high priority isochronous read requests. 2EH 01H UNC_QMC_CRITICAL_PRIORITY Counts the number of Quickpath Memory Controller _READS.CH0 channel 0 critical priority isochronous read requests. 2EH 02H UNC_QMC_CRITICAL_PRIORITY Counts the number of Quickpath Memory Controller _READS.CH1 channel 1 critical priority isochronous read requests. 2EH 04H UNC_QMC_CRITICAL_PRIORITY Counts the number of Quickpath Memory Controller _READS.CH2 channel 2 critical priority isochronous read requests. 2EH 07H UNC_QMC_CRITICAL_PRIORITY Counts the number of Quickpath Memory Controller _READS.ANY critical priority isochronous read requests. 2FH 01H UNC_QMC_WRITES.FULL.CH0 Counts number of full cache line writes to DRAM channel 0. 2FH 02H UNC_QMC_WRITES.FULL.CH1 Counts number of full cache line writes to DRAM channel 1. 2FH 04H UNC_QMC_WRITES.FULL.CH2 Counts number of full cache line writes to DRAM channel 2. 2FH 07H UNC_QMC_WRITES.FULL.ANY Counts number of full cache line writes to DRAM. 2FH 08H UNC_QMC_WRITES.PARTIAL.C H0 Counts number of partial cache line writes to DRAM channel 0. 2FH 10H UNC_QMC_WRITES.PARTIAL.C H1 Counts number of partial cache line writes to DRAM channel 1. 2FH 20H UNC_QMC_WRITES.PARTIAL.C H2 Counts number of partial cache line writes to DRAM channel 2. 2FH 38H UNC_QMC_WRITES.PARTIAL.A NY Counts number of partial cache line writes to DRAM. 30H 01H UNC_QMC_CANCEL.CH0 Counts number of DRAM channel 0 cancel requests. 30H 02H UNC_QMC_CANCEL.CH1 Counts number of DRAM channel 1 cancel requests. 30H 04H UNC_QMC_CANCEL.CH2 Counts number of DRAM channel 2 cancel requests. 30H 07H UNC_QMC_CANCEL.ANY Counts number of DRAM cancel requests. 31H 01H UNC_QMC_PRIORITY_UPDATE S.CH0 Counts number of DRAM channel 0 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 19-152 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 31H 02H UNC_QMC_PRIORITY_UPDATE S.CH1 Counts number of DRAM channel 1 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 31H 04H UNC_QMC_PRIORITY_UPDATE S.CH2 Counts number of DRAM channel 2 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 31H 07H UNC_QMC_PRIORITY_UPDATE S.ANY Counts number of DRAM priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request. 32H 01H UNC_IMC_RETRY.CH0 Counts number of IMC DRAM channel 0 retries. DRAM retry only occurs when configured in RAS mode. 32H 02H UNC_IMC_RETRY.CH1 Counts number of IMC DRAM channel 1 retries. DRAM retry only occurs when configured in RAS mode. 32H 04H UNC_IMC_RETRY.CH2 Counts number of IMC DRAM channel 2 retries. DRAM retry only occurs when configured in RAS mode. 32H 07H UNC_IMC_RETRY.ANY Counts number of IMC DRAM retries from any channel. DRAM retry only occurs when configured in RAS mode. 33H 01H UNC_QHL_FRC_ACK_CNFLTS.I OH Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic to the IOH. 33H 02H UNC_QHL_FRC_ACK_CNFLTS.R Counts number of Force Acknowledge Conflict EMOTE messages sent by the Quickpath Home Logic to the remote home. 33H 04H UNC_QHL_FRC_ACK_CNFLTS.L Counts number of Force Acknowledge Conflict OCAL messages sent by the Quickpath Home Logic to the local home. 33H 07H UNC_QHL_FRC_ACK_CNFLTS.A Counts number of Force Acknowledge Conflict NY messages sent by the Quickpath Home Logic. 34H 01H UNC_QHL_SLEEPS.IOH_ORDER Counts number of occurrences a request was put to sleep due to IOH ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. 34H 02H UNC_QHL_SLEEPS.REMOTE_O RDER 34H 04H UNC_QHL_SLEEPS.LOCAL_ORD Counts number of occurrences a request was put to ER sleep due to local socket ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. Comment Counts number of occurrences a request was put to sleep due to remote socket ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. Vol. 3B 19-153 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 34H 08H UNC_QHL_SLEEPS.IOH_CONFLI Counts number of occurrences a request was put to CT sleep due to IOH address conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. 34H 10H UNC_QHL_SLEEPS.REMOTE_C ONFLICT 34H 20H UNC_QHL_SLEEPS.LOCAL_CON Counts number of occurrences a request was put to FLICT sleep due to local socket address conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. 35H 01H UNC_ADDR_OPCODE_MATCH.I OH Event Mask Mnemonic Description Comment Counts number of occurrences a request was put to sleep due to remote socket address conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC. Counts number of requests from the IOH, address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: Match opcode/address by writing MSR 396H with mask supported mask value. 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB 35H 02H UNC_ADDR_OPCODE_MATCH.R Counts number of requests from the remote socket, EMOTE address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: Match opcode/address by writing MSR 396H with mask supported mask value. 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB 35H 04H UNC_ADDR_OPCODE_MATCH.L Counts number of requests from the local socket, OCAL address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB 40H 01H UNC_QPI_TX_STALLED_SINGL E_FLIT.HOME.LINK_0 Counts cycles the Quickpath outbound link 0 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 02H UNC_QPI_TX_STALLED_SINGL E_FLIT.SNOOP.LINK_0 Counts cycles the Quickpath outbound link 0 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 19-154 Vol. 3B Match opcode/address by writing MSR 396H with mask supported mask value. PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 40H 04H UNC_QPI_TX_STALLED_SINGL E_FLIT.NDR.LINK_0 Counts cycles the Quickpath outbound link 0 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 08H UNC_QPI_TX_STALLED_SINGL E_FLIT.HOME.LINK_1 Counts cycles the Quickpath outbound link 1 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 10H UNC_QPI_TX_STALLED_SINGL E_FLIT.SNOOP.LINK_1 Counts cycles the Quickpath outbound link 1 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 20H UNC_QPI_TX_STALLED_SINGL E_FLIT.NDR.LINK_1 Counts cycles the Quickpath outbound link 1 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 07H UNC_QPI_TX_STALLED_SINGL E_FLIT.LINK_0 Counts cycles the Quickpath outbound link 0 virtual channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 40H 38H UNC_QPI_TX_STALLED_SINGL E_FLIT.LINK_1 Counts cycles the Quickpath outbound link 1 virtual channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 01H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Data _FLIT.DRS.LINK_0 Response virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 02H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Non_FLIT.NCB.LINK_0 Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 04H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 Non_FLIT.NCS.LINK_0 Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. Comment Vol. 3B 19-155 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 41H 08H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Data _FLIT.DRS.LINK_1 Response virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 10H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Non_FLIT.NCB.LINK_1 Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 20H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 Non_FLIT.NCS.LINK_1 Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 07H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 0 virtual _FLIT.LINK_0 channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 41H 38H UNC_QPI_TX_STALLED_MULTI Counts cycles the Quickpath outbound link 1 virtual _FLIT.LINK_1 channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated. 42H 01H UNC_QPI_TX_HEADER.FULL.LI Number of cycles that the header buffer in the NK_0 Quickpath Interface outbound link 0 is full. 42H 02H UNC_QPI_TX_HEADER.BUSY.LI Number of cycles that the header buffer in the NK_0 Quickpath Interface outbound link 0 is busy. 42H 04H UNC_QPI_TX_HEADER.FULL.LI Number of cycles that the header buffer in the NK_1 Quickpath Interface outbound link 1 is full. 42H 08H UNC_QPI_TX_HEADER.BUSY.LI Number of cycles that the header buffer in the NK_1 Quickpath Interface outbound link 1 is busy. 43H 01H UNC_QPI_RX_NO_PPT_CREDI T.STALLS.LINK_0 Number of cycles that snoop packets incoming to the Quickpath Interface link 0 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries. 43H 02H UNC_QPI_RX_NO_PPT_CREDI T.STALLS.LINK_1 Number of cycles that snoop packets incoming to the Quickpath Interface link 1 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries. 60H 01H UNC_DRAM_OPEN.CH0 Counts number of DRAM Channel 0 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 19-156 Vol. 3B Event Mask Mnemonic Description Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 60H 02H UNC_DRAM_OPEN.CH1 Counts number of DRAM Channel 1 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 60H 04H UNC_DRAM_OPEN.CH2 Counts number of DRAM Channel 2 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened. 61H 01H UNC_DRAM_PAGE_CLOSE.CH0 DRAM channel 0 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 61H 02H UNC_DRAM_PAGE_CLOSE.CH1 DRAM channel 1 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 61H 04H UNC_DRAM_PAGE_CLOSE.CH2 DRAM channel 2 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge. 62H 01H UNC_DRAM_PAGE_MISS.CH0 Counts the number of precharges (PRE) that were issued to DRAM channel 0 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 62H 02H UNC_DRAM_PAGE_MISS.CH1 Counts the number of precharges (PRE) that were issued to DRAM channel 1 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 62H 04H UNC_DRAM_PAGE_MISS.CH2 Counts the number of precharges (PRE) that were issued to DRAM channel 2 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge. 63H 01H UNC_DRAM_READ_CAS.CH0 Counts the number of times a read CAS command was issued on DRAM channel 0. 63H 02H UNC_DRAM_READ_CAS.AUTO PRE_CH0 Counts the number of times a read CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode. 63H 04H UNC_DRAM_READ_CAS.CH1 Counts the number of times a read CAS command was issued on DRAM channel 1. 63H 08H UNC_DRAM_READ_CAS.AUTO PRE_CH1 Counts the number of times a read CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode. 63H 10H UNC_DRAM_READ_CAS.CH2 Counts the number of times a read CAS command was issued on DRAM channel 2. Comment Vol. 3B 19-157 PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value Event Mask Mnemonic Description 63H 20H UNC_DRAM_READ_CAS.AUTO PRE_CH2 Counts the number of times a read CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode. 64H 01H UNC_DRAM_WRITE_CAS.CH0 Counts the number of times a write CAS command was issued on DRAM channel 0. 64H 02H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH0 issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode. 64H 04H UNC_DRAM_WRITE_CAS.CH1 64H 08H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH1 issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode. 64H 10H UNC_DRAM_WRITE_CAS.CH2 64H 20H UNC_DRAM_WRITE_CAS.AUTO Counts the number of times a write CAS command was PRE_CH2 issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode. 65H 01H UNC_DRAM_REFRESH.CH0 Counts number of DRAM channel 0 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 65H 02H UNC_DRAM_REFRESH.CH1 Counts number of DRAM channel 1 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 65H 04H UNC_DRAM_REFRESH.CH2 Counts number of DRAM channel 2 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically. 66H 01H UNC_DRAM_PRE_ALL.CH0 Counts number of DRAM Channel 0 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. 66H 02H UNC_DRAM_PRE_ALL.CH1 Counts number of DRAM Channel 1 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. 66H 04H UNC_DRAM_PRE_ALL.CH2 Counts number of DRAM Channel 2 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode. 67H 01H UNC_DRAM_THERMAL_THROT Uncore cycles DRAM was throttled due to its TLED temperature being above the thermal throttling threshold. 80H 01H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 0 is above the TEMP.CORE_0 thermal throttling threshold temperature. 19-158 Vol. 3B Counts the number of times a write CAS command was issued on DRAM channel 1. Counts the number of times a write CAS command was issued on DRAM channel 2. Comment PERFORMANCE MONITORING EVENTS Table 19-24. Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.) Event Num. Umask Value 80H 02H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 1 is above the TEMP.CORE_1 thermal throttling threshold temperature. 80H 04H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 2 is above the TEMP.CORE_2 thermal throttling threshold temperature. 80H 08H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 3 is above the TEMP.CORE_3 thermal throttling threshold temperature. 81H 01H UNC_THERMAL_THROTTLED_ TEMP.CORE_0 Cycles that the PCU records that core 0 is in the power throttled state due to core’s temperature being above the thermal throttling threshold. 81H 02H UNC_THERMAL_THROTTLED_ TEMP.CORE_1 Cycles that the PCU records that core 1 is in the power throttled state due to core’s temperature being above the thermal throttling threshold. 81H 04H UNC_THERMAL_THROTTLED_ TEMP.CORE_2 Cycles that the PCU records that core 2 is in the power throttled state due to core’s temperature being above the thermal throttling threshold. 81H 08H UNC_THERMAL_THROTTLED_ TEMP.CORE_3 Cycles that the PCU records that core 3 is in the power throttled state due to core’s temperature being above the thermal throttling threshold. 82H 01H UNC_PROCHOT_ASSERTION Number of system assertions of PROCHOT indicating the entire processor has exceeded the thermal limit. 83H 01H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 0 is a low power PROCHOT.CORE_0 state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit. 83H 02H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 1 is a low power PROCHOT.CORE_1 state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit. 83H 04H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 2 is a low power PROCHOT.CORE_2 state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit. 83H 08H UNC_THERMAL_THROTTLING_ Cycles that the PCU records that core 3 is a low power PROCHOT.CORE_3 state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit. 84H 01H UNC_TURBO_MODE.CORE_0 Uncore cycles that core 0 is operating in turbo mode. 84H 02H UNC_TURBO_MODE.CORE_1 Uncore cycles that core 1 is operating in turbo mode. 84H 04H UNC_TURBO_MODE.CORE_2 Uncore cycles that core 2 is operating in turbo mode. 84H 08H UNC_TURBO_MODE.CORE_3 Uncore cycles that core 3 is operating in turbo mode. 85H 02H UNC_CYCLES_UNHALTED_L3_ FLL_ENABLE Uncore cycles that at least one core is unhalted and all L3 ways are enabled. 86H 01H UNC_CYCLES_UNHALTED_L3_ FLL_DISABLE Uncore cycles that at least one core is unhalted and all L3 ways are disabled. Event Mask Mnemonic Description Comment Vol. 3B 19-159 PERFORMANCE MONITORING EVENTS 19.12 PERFORMANCE MONITORING EVENTS FOR INTEL® XEON® PROCESSOR 5200, 5400 SERIES AND INTEL® CORE™2 EXTREME PROCESSORS QX 9000 SERIES Processors based on the Enhanced Intel Core microarchitecture support the architectural and model-specific performance monitoring events listed in Table 19-1 and Table 19-27. In addition, they also support the following model-specific performance monitoring events listed in Table 19-25. Fixed counters support the architecture events defined in Table 19-26. Table 19-25. Performance Events for Processors Based on Enhanced Intel Core Microarchitecture Event Num. Umask Value Event Mask Mnemonic Description Comment C0H 08H INST_RETIRED.VM_HOST Instruction retired while in VMX root operations. D2H 10H RAT_STAALS.OTHER_SERIALIZ ATION_STALLS This event counts the number of stalls due to other RAT resource serialization not counted by Umask value 0FH. 19.13 PERFORMANCE MONITORING EVENTS FOR INTEL® XEON® PROCESSOR 3000, 3200, 5100, 5300 SERIES AND INTEL® CORE™2 DUO PROCESSORS Processors based on the Intel® Core™ microarchitecture support architectural and model-specific performance monitoring events. Fixed-function performance counters are introduced first on processors based on Intel Core microarchitecture. Table 19-26 lists pre-defined performance events that can be counted using fixed-function performance counters. Table 19-26. Fixed-Function Performance Counter and Pre-defined Performance Events Fixed-Function Performance Counter Address Event Mask Mnemonic Description MSR_PERF_FIXED_ CTR0/IA32_PERF_FIXED_CTR0 309H Inst_Retired.Any This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. MSR_PERF_FIXED_ CTR1/IA32_PERF_FIXED_CTR1 30AH CPU_CLK_UNHALTED.CORE This event counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in halt state. MSR_PERF_FIXED_ CTR2/IA32_PERF_FIXED_CTR2 19-160 Vol. 3B 30BH CPU_CLK_UNHALTED.REF This event counts the number of reference cycles when the core is not in a halt state and not in a TM stop-clock state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. PERFORMANCE MONITORING EVENTS Table 19-26. Fixed-Function Performance Counter and Pre-defined Performance Events (Contd.) Fixed-Function Performance Counter Address Event Mask Mnemonic Description This event is not affected by core frequency changes (e.g., P states) but counts at the same frequency as the time stamp counter. This event can approximate elapsed time while the core was not in halt state and not in a TM stop-clock state. This event has a constant ratio with the CPU_CLK_UNHALTED.BUS event. Table 19-27 lists general-purpose model-specific performance monitoring events supported in processors based on Intel® Core™ microarchitecture. For convenience, Table 19-27 also includes architectural events and describes minor model-specific behavior where applicable. Software must use a general-purpose performance counter to count events listed in Table 19-27. Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture Event Num Umask Value Event Name Definition 03H 02H LOAD_BLOCK.STA Loads blocked by a preceding store with unknown address. Description and Comment This event indicates that loads are blocked by preceding stores. A load is blocked when there is a preceding store to an address that is not yet calculated. The number of events is greater or equal to the number of load operations that were blocked. If the load and the store are always to different addresses, check why the memory disambiguation mechanism is not working. To avoid such blocks, increase the distance between the store and the following load so that the store address is known at the time the load is dispatched. 03H 04H LOAD_BLOCK.STD Loads blocked by a preceding store with unknown data. This event indicates that loads are blocked by preceding stores. A load is blocked when there is a preceding store to the same address and the stored data value is not yet known. The number of events is greater or equal to the number of load operations that were blocked. To avoid such blocks, increase the distance between the store and the dependent load, so that the store data is known at the time the load is dispatched. 03H 08H LOAD_BLOCK. OVERLAP_STORE Loads that partially overlap an earlier store, or 4-Kbyte aliased with a previous store. This event indicates that loads are blocked due to a variety of reasons. Some of the triggers for this event are when a load is blocked by a preceding store, in one of the following: • Some of the loaded byte locations are written by the preceding store and some are not. • The load is from bytes written by the preceding store, the store is aligned to its size and either: • The load’s data size is one or two bytes and it is not aligned to the store. • The load’s data size is of four or eight bytes and the load is misaligned. Vol. 3B 19-161 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment • The load is from bytes written by the preceding store, the store is misaligned and the load is not aligned on the beginning of the store. • The load is split over an eight byte boundary (excluding 16-byte loads). • The load and store have the same offset relative to the beginning of different 4-KByte pages. This case is also called 4-KByte aliasing. • In all these cases the load is blocked until after the blocking store retires and the stored data is committed to the cache hierarchy. 03H 10H LOAD_BLOCK. UNTIL_RETIRE Loads blocked until retirement. This event indicates that load operations were blocked until retirement. The number of events is greater or equal to the number of load operations that were blocked. This includes mainly uncacheable loads and split loads (loads that cross the cache line boundary) but may include other cases where loads are blocked until retirement. 03H 20H LOAD_BLOCK.L1D Loads blocked by the L1 data cache. This event indicates that loads are blocked due to one or more reasons. Some triggers for this event are: • The number of L1 data cache misses exceeds the maximum number of outstanding misses supported by the processor. This includes misses generated as result of demand fetches, software prefetches or hardware prefetches. • Cache line split loads. • Partial reads, such as reads to un-cacheable memory, I/O instructions and more. • A locked load operation is in progress. The number of events is greater or equal to the number of load operations that were blocked. 04H 01H SB_DRAIN_ CYCLES Cycles while stores are This event counts every cycle during which the store buffer blocked due to store is draining. This includes: buffer drain. • Serializing operations such as CPUID • Synchronizing operations such as XCHG • Interrupt acknowledgment • Other conditions, such as cache flushing 04H 02H STORE_BLOCK. ORDER Cycles while store is waiting for a preceding store to be globally observed. This event counts the total duration, in number of cycles, which stores are waiting for a preceding stored cache line to be observed by other cores. This situation happens as a result of the strong store ordering behavior, as defined in “Memory Ordering,” Chapter 8, Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. The stall may occur and be noticeable if there are many cases when a store either misses the L1 data cache or hits a cache line in the Shared state. If the store requires a bus transaction to read the cache line then the stall ends when snoop response for the bus transaction arrives. 04H 08H 19-162 Vol. 3B STORE_BLOCK. SNOOP A store is blocked due to a conflict with an external or internal snoop. This event counts the number of cycles the store port was used for snooping the L1 data cache and a store was stalled by the snoop. The store is typically resubmitted one cycle later. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 06H 00H Event Name Definition SEGMENT_REG_ LOADS Number of segment register loads. Description and Comment This event counts the number of segment register load operations. Instructions that load new values into segment registers cause a penalty. This event indicates performance issues in 16-bit code. If this event occurs frequently, it may be useful to calculate the number of instructions retired per segment register load. If the resulting calculation is low (on average a small number of instructions are executed between segment register loads), then the code’s segment register usage should be optimized. As a result of branch misprediction, this event is speculative and may include segment register loads that do not actually occur. However, most segment register loads are internally serialized and such speculative effects are minimized. 07H 00H SSE_PRE_EXEC. NTA Streaming SIMD This event counts the number of times the SSE instruction Extensions (SSE) prefetchNTA is executed. Prefetch NTA This instruction prefetches the data to the L1 data cache. instructions executed. 07H 01H SSE_PRE_EXEC.L1 Streaming SIMD This event counts the number of times the SSE instruction Extensions (SSE) prefetchT0 is executed. This instruction prefetches the data PrefetchT0 to the L1 data cache and L2 cache. instructions executed. 07H 02H SSE_PRE_EXEC.L2 Streaming SIMD This event counts the number of times the SSE instructions Extensions (SSE) prefetchT1 and prefetchT2 are executed. These PrefetchT1 and instructions prefetch the data to the L2 cache. PrefetchT2 instructions executed. 07H 03H SSE_PRE_ EXEC.STORES Streaming SIMD This event counts the number of times SSE non-temporal Extensions (SSE) store instructions are executed. Weakly-ordered store instructions executed. 08H 01H DTLB_MISSES. ANY Memory accesses that This event counts the number of Data Table Lookaside missed the DTLB. Buffer (DTLB) misses. The count includes misses detected as a result of speculative accesses. Typically a high count for this event indicates that the code accesses a large number of data pages. 08H 02H DTLB_MISSES .MISS_LD DTLB misses due to load operations. This event counts the number of Data Table Lookaside Buffer (DTLB) misses due to load operations. This count includes misses detected as a result of speculative accesses. 08H 04H DTLB_MISSES.L0_MISS_LD L0 DTLB misses due to This event counts the number of level 0 Data Table load operations. Lookaside Buffer (DTLB0) misses due to load operations. This count includes misses detected as a result of speculative accesses. Loads that miss that DTLB0 and hit the DTLB1 can incur two-cycle penalty. Vol. 3B 19-163 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 08H 08H Event Name Definition Description and Comment DTLB_MISSES. MISS_ST TLB misses due to store operations. This event counts the number of Data Table Lookaside Buffer (DTLB) misses due to store operations. This count includes misses detected as a result of speculative accesses. Address translation for store operations is performed in the DTLB1. 09H 01H MEMORY_ DISAMBIGUATION.RESET Memory disambiguation reset cycles. This event counts the number of cycles during which memory disambiguation misprediction occurs. As a result the execution pipeline is cleaned and execution of the mispredicted load instruction and all succeeding instructions restarts. This event occurs when the data address accessed by a load instruction, collides infrequently with preceding stores, but usually there is no collision. It happens rarely, and may have a penalty of about 20 cycles. 09H 02H MEMORY_DISAMBIGUATIO N.SUCCESS Number of loads successfully disambiguated. This event counts the number of load operations that were successfully disambiguated. Loads are preceded by a store with an unknown address, but they are not blocked. 0CH 01H PAGE_WALKS .COUNT Number of page-walks This event counts the number of page-walks executed due executed. to either a DTLB or ITLB miss. The page walk duration, PAGE_WALKS.CYCLES, divided by number of page walks is the average duration of a page walk. The average can hint whether most of the page-walks are satisfied by the caches or cause an L2 cache miss. 0CH 02H PAGE_WALKS. CYCLES Duration of pagewalks in core cycles. This event counts the duration of page-walks in core cycles. The paging mode in use typically affects the duration of page walks. Page walk duration divided by number of page walks is the average duration of page-walks. The average can hint at whether most of the page-walks are satisfied by the caches or cause an L2 cache miss. 10H 11H 00H 00H FP_COMP_OPS _EXE FP_ASSIST Floating point computational microops executed. This event counts the number of floating point computational micro-ops executed. Floating point assists. This event counts the number of floating point operations executed that required micro-code assist intervention. Assists are required in the following cases: Use IA32_PMC0 only. • Streaming SIMD Extensions (SSE) instructions: • Denormal input when the DAZ (Denormals Are Zeros) flag is off • Underflow result when the FTZ (Flush To Zero) flag is off • X87 instructions: • NaN or denormal are loaded to a register or used as input from memory • Division by 0 • Underflow output Use IA32_PMC1 only. 12H 00H MUL Multiply operations executed. This event counts the number of multiply operations executed. This includes integer as well as floating point multiply operations. Use IA32_PMC1 only. 19-164 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition 13H 00H DIV Divide operations executed. Description and Comment This event counts the number of divide operations executed. This includes integer divides, floating point divides and square-root operations executed. Use IA32_PMC1 only. 14H 00H CYCLES_DIV _BUSY Cycles the divider busy. This event counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Use IA32_PMC0 only. 18H 00H IDLE_DURING _DIV Cycles the divider is busy and all other execution units are idle. This event counts the number of cycles the divider is busy (with a divide or a square root operation) and no other execution unit or load operation is in progress. Load operations are assumed to hit the L1 data cache. This event considers only micro-ops dispatched after the divider started operating. Use IA32_PMC0 only. 19H 00H DELAYED_ BYPASS.FP Delayed bypass to FP operation. This event counts the number of times floating point operations use data immediately after the data was generated by a non-floating point execution unit. Such cases result in one penalty cycle due to data bypass between the units. Use IA32_PMC1 only. 19H 01H DELAYED_ BYPASS.SIMD Delayed bypass to SIMD operation. 19H 02H DELAYED_ BYPASS.LOAD Delayed bypass to load operation. This event counts the number of times SIMD operations use data immediately after the data was generated by a nonSIMD execution unit. Such cases result in one penalty cycle due to data bypass between the units. Use IA32_PMC1 only. This event counts the number of delayed bypass penalty cycles that a load operation incurred. When load operations use data immediately after the data was generated by an integer execution unit, they may (pending on certain dynamic internal conditions) incur one penalty cycle due to delayed data bypass between the units. Use IA32_PMC1 only. 21H See Table 18-71 L2_ADS.(Core) Cycles L2 address bus is in use. This event counts the number of cycles the L2 address bus is being used for accesses to the L2 cache or bus queue. It can count occurrences for this core or both cores. 23H See Table 18-71 L2_DBUS_BUSY _RD.(Core) Cycles the L2 transfers data to the core. This event counts the number of cycles during which the L2 data bus is busy transferring data from the L2 cache to the core. It counts for all L1 cache misses (data and instruction) that hit the L2 cache. This event can count occurrences for this core or both cores. Vol. 3B 19-165 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 24H Combin L2_LINES_IN. ed mask (Core, Prefetch) from Table 18-71 and Table 18-73 25H 26H 27H 28H 29H 2AH Event Name Definition L2 cache misses. Description and Comment This event counts the number of cache lines allocated in the L2 cache. Cache lines are allocated in the L2 cache as a result of requests from the L1 data and instruction caches and the L2 hardware prefetchers to cache lines that are missing in the L2 cache. This event can count occurrences for this core or both cores. It can also count demand requests and L2 hardware prefetch requests together or separately. See Table 18-71 L2_M_LINES_IN. (Core) L2 cache line modifications. See Table 18-71 and Table 18-73 L2_LINES_OUT. (Core, Prefetch) L2 cache lines evicted. This event counts the number of L2 cache lines evicted. See Table 18-71 and Table 18-73 L2_M_LINES_OUT.(Core, Prefetch) Modified lines evicted from the L2 cache. Combined mask from Table 18-71 and Table 18-74 L2_IFETCH.(Core, Cache Line State) This event can count occurrences for this core or both cores. This event can count occurrences for this core or both cores. It can also count evictions due to demand requests and L2 hardware prefetch requests together or separately. This event counts the number of L2 modified cache lines evicted. These lines are written back to memory unless they also exist in a modified-state in one of the L1 data caches. This event can count occurrences for this core or both cores. It can also count evictions due to demand requests and L2 hardware prefetch requests together or separately. L2 cacheable instruction fetch requests. This event counts the number of instruction cache line requests from the IFU. It does not include fetch requests from uncacheable memory. It does not include ITLB miss accesses. This event can count occurrences for this core or both cores. It can also count accesses to cache lines at different MESI states. Combin L2_LD.(Core, Prefetch, ed mask Cache Line State) from Table 18-71, Table 18-73, and Table 18-74 L2 cache reads. See Table 18-71 and Table 18-74 L2 store requests. 19-166 Vol. 3B This event counts whenever a modified cache line is written back from the L1 data cache to the L2 cache. L2_ST.(Core, Cache Line State) This event counts L2 cache read requests coming from the L1 data cache and L2 prefetchers. The event can count occurrences: • For this core or both cores. • Due to demand requests and L2 hardware prefetch requests together or separately. • Of accesses to cache lines at different MESI states. This event counts all store operations that miss the L1 data cache and request the data from the L2 cache. The event can count occurrences for this core or both cores. It can also count accesses to cache lines at different MESI states. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 2BH 2EH 2EH Event Name Definition See Table 18-71 and Table 18-74 L2_LOCK.(Core, Cache Line State) L2 locked accesses. See Table 18-71, Table 18-73, and Table 18-74 L2_RQSTS.(Core, Prefetch, Cache Line State) 41H L2_RQSTS.SELF. DEMAND.I_STATE Description and Comment This event counts all locked accesses to cache lines that miss the L1 data cache. The event can count occurrences for this core or both cores. It can also count accesses to cache lines at different MESI states. L2 cache requests. This event counts all completed L2 cache requests. This includes L1 data cache reads, writes, and locked accesses, L1 data prefetch requests, instruction fetches, and all L2 hardware prefetch requests. This event can count occurrences: • For this core or both cores. • Due to demand requests and L2 hardware prefetch requests together, or separately. • Of accesses to cache lines at different MESI states. L2 cache demand requests from this core that missed the L2. This event counts all completed L2 cache demand requests from this core that miss the L2 cache. This includes L1 data cache reads, writes, and locked accesses, L1 data prefetch requests, and instruction fetches. This is an architectural performance event. 2EH 4FH L2_RQSTS.SELF. DEMAND.MESI L2 cache demand requests from this core. This event counts all completed L2 cache demand requests from this core. This includes L1 data cache reads, writes, and locked accesses, L1 data prefetch requests, and instruction fetches. This is an architectural performance event. 30H See Table 18-71, Table 18-73, and Table 18-74 L2_REJECT_BUSQ.(Core, Rejected L2 cache Prefetch, Cache Line State) requests. This event indicates that a pending L2 cache request that requires a bus transaction is delayed from moving to the bus queue. Some of the reasons for this event are: • The bus queue is full. • The bus queue already holds an entry for a cache line in the same set. The number of events is greater or equal to the number of requests that were rejected. • For this core or both cores. • Due to demand requests and L2 hardware prefetch requests together, or separately. • Of accesses to cache lines at different MESI states. 32H See Table 18-71 L2_NO_REQ.(Core) Cycles no L2 cache requests are pending. This event counts the number of cycles that no L2 cache requests were pending from a core. When using the BOTH_CORE modifier, the event counts only if none of the cores have a pending request. The event counts also when one core is halted and the other is not halted. The event can count occurrences for this core or both cores. 3AH 00H EIST_TRANS Number of Enhanced Intel SpeedStep Technology (EIST) transitions. This event counts the number of transitions that include a frequency change, either with or without voltage change. This includes Enhanced Intel SpeedStep Technology (EIST) and TM2 transitions. The event is incremented only while the counting core is in C0 state. Since transitions to higher-numbered CxE states and TM2 transitions include a frequency change or voltage transition, the event is incremented accordingly. Vol. 3B 19-167 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition 3BH C0H THERMAL_TRIP Number of thermal trips. Description and Comment This event counts the number of thermal trips. A thermal trip occurs whenever the processor temperature exceeds the thermal trip threshold temperature. Following a thermal trip, the processor automatically reduces frequency and voltage. The processor checks the temperature every millisecond and returns to normal when the temperature falls below the thermal trip threshold temperature. 3CH 00H CPU_CLK_ UNHALTED. CORE_P Core cycles when core This event counts the number of core cycles while the core is not halted. is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason, this event may have a changing ratio in regard to time. When the core frequency is constant, this event can give approximate elapsed time while the core not in halt state. This is an architectural performance event. 3CH 01H CPU_CLK_ UNHALTED.BUS Bus cycles when core is not halted. This event counts the number of bus cycles while the core is not in the halt state. This event can give a measurement of the elapsed time while the core was not in the halt state. The core enters the halt state when it is running the HLT instruction. The event also has a constant ratio with CPU_CLK_UNHALTED.REF event, which is the maximum bus to processor frequency ratio. Non-halted bus cycles are a component in many key event ratios. 3CH 02H CPU_CLK_ UNHALTED.NO _OTHER Bus cycles when core This event counts the number of bus cycles during which is active and the other the core remains non-halted and the other core on the is halted. processor is halted. This event can be used to determine the amount of parallelism exploited by an application or a system. Divide this event count by the bus frequency to determine the amount of time that only one core was in use. 40H See Table 18-74 L1D_CACHE_LD. (Cache Line State) L1 cacheable data reads. This event counts the number of data reads from cacheable memory. Locked reads are not counted. 41H See Table 18-74 L1D_CACHE_ST. (Cache Line State) L1 cacheable data writes. This event counts the number of data writes to cacheable memory. Locked writes are not counted. 42H See Table 18-74 L1D_CACHE_ LOCK.(Cache Line State) L1 data cacheable locked reads. This event counts the number of locked data reads from cacheable memory. 19-168 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 42H 10H 43H 01H Description and Comment Event Name Definition L1D_CACHE_ LOCK_DURATION Duration of L1 data cacheable locked operation. This event counts the number of cycles during which any cache line is locked by any locking instruction. All references to the L1 data cache. This event counts all references to the L1 data cache, including all loads and stores with any memory types. L1D_ALL_REF Locking happens at retirement and therefore the event does not occur for instructions that are speculatively executed. Locking duration is shorter than locked instruction execution duration. The event counts memory accesses only when they are actually performed. For example, a load blocked by unknown store address and later performed is only counted once. The event includes non-cacheable accesses, such as I/O accesses. 43H 02H L1D_ALL_ CACHE_REF L1 Data cacheable reads and writes. This event counts the number of data reads and writes from cacheable memory, including locked operations. This event is a sum of: • L1D_CACHE_LD.MESI • L1D_CACHE_ST.MESI • L1D_CACHE_LOCK.MESI 45H 0FH L1D_REPL Cache lines allocated in the L1 data cache. This event counts the number of lines brought into the L1 data cache. 46H 00H L1D_M_REPL Modified cache lines allocated in the L1 data cache. This event counts the number of modified lines brought into the L1 data cache. 47H 00H L1D_M_EVICT Modified cache lines evicted from the L1 data cache. This event counts the number of modified lines evicted from the L1 data cache, whether due to replacement or by snoop HITM intervention. 48H 00H L1D_PEND_ MISS Total number of outstanding L1 data cache misses at any cycle. This event counts the number of outstanding L1 data cache misses at any cycle. An L1 data cache miss is outstanding from the cycle on which the miss is determined until the first chunk of data is available. This event counts: • All cacheable demand requests. • L1 data cache hardware prefetch requests. • Requests to write through memory. • Requests to write combine memory. Uncacheable requests are not counted. The count of this event divided by the number of L1 data cache misses, L1D_REPL, is the average duration in core cycles of an L1 data cache miss. 49H 01H L1D_SPLIT.LOADS Cache line split loads from the L1 data cache. This event counts the number of load operations that span two cache lines. Such load operations are also called split loads. Split load operations are executed at retirement. 49H 02H L1D_SPLIT. STORES Cache line split stores to the L1 data cache. This event counts the number of store operations that span two cache lines. 4BH 00H SSE_PRE_ MISS.NTA Streaming SIMD Extensions (SSE) Prefetch NTA instructions missing all cache levels. This event counts the number of times the SSE instructions prefetchNTA were executed and missed all cache levels. Due to speculation an executed instruction might not retire. This instruction prefetches the data to the L1 data cache. Vol. 3B 19-169 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 4BH 01H 4BH 02H Description and Comment Event Name Definition SSE_PRE_ MISS.L1 Streaming SIMD Extensions (SSE) PrefetchT0 instructions missing all cache levels. This event counts the number of times the SSE instructions prefetchT0 were executed and missed all cache levels. Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions missing all cache levels. This event counts the number of times the SSE instructions prefetchT1 and prefetchT2 were executed and missed all cache levels. This event counts load operations sent to the L1 data cache while a previous Streaming SIMD Extensions (SSE) prefetch instruction to the same cache line has started prefetching but has not yet finished. SSE_PRE_ MISS.L2 Due to speculation executed instruction might not retire. The prefetchT0 instruction prefetches data to the L2 cache and L1 data cache. Due to speculation, an executed instruction might not retire. The prefetchT1 and PrefetchNT2 instructions prefetch data to the L2 cache. 4CH 00H LOAD_HIT_PRE Load operations conflicting with a software prefetch to the same address. 4EH 10H L1D_PREFETCH. REQUESTS L1 data cache prefetch This event counts the number of times the L1 data cache requests. requested to prefetch a data cache line. Requests can be rejected when the L2 cache is busy and resubmitted later or lost. All requests are counted, including those that are rejected. 60H 61H See Table 18-71 and Table 18-72. BUS_REQUEST_ OUTSTANDING. (Core and Bus Agents) See Table 18-72. BUS_BNR_DRV. (Bus Agents) Outstanding cacheable This event counts the number of pending full cache line read data read bus transactions on the bus occurring in each cycle. A read requests duration. transaction is pending from the cycle it is sent on the bus until the full cache line is received by the processor. The event counts only full-line cacheable read requests from either the L1 data cache or the L2 prefetchers. It does not count Read for Ownership transactions, instruction byte fetch transactions, or any other bus transaction. Number of Bus Not Ready signals asserted. This event counts the number of Bus Not Ready (BNR) signals that the processor asserts on the bus to suspend additional bus requests by other bus agents. A bus agent asserts the BNR signal when the number of data and snoop transactions is close to the maximum that the bus can handle. To obtain the number of bus cycles during which the BNR signal is asserted, multiply the event count by two. While this signal is asserted, new transactions cannot be submitted on the bus. As a result, transaction latency may have higher impact on program performance. 62H See Table 18-72. BUS_DRDY_ CLOCKS.(Bus Agents) Bus cycles when data is sent on the bus. This event counts the number of bus cycles during which the DRDY (Data Ready) signal is asserted on the bus. The DRDY signal is asserted when data is sent on the bus. With the 'THIS_AGENT' mask this event counts the number of bus cycles during which this agent (the processor) writes data on the bus back to memory or to other bus agents. This includes all explicit and implicit data writebacks, as well as partial writes. With the 'ALL_AGENTS' mask, this event counts the number of bus cycles during which any bus agent sends data on the bus. This includes all data reads and writes on the bus. 19-170 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 63H Description and Comment Event Name Definition See Table 18-71 and Table 18-72. BUS_LOCK_ CLOCKS.(Core and Bus Agents) Bus cycles when a LOCK signal asserted. 64H See Table 18-71. BUS_DATA_ RCV.(Core) Bus cycles while processor receives data. This event counts the number of bus cycles during which the processor is busy receiving data. 65H See Table 18-71 and Table 18-72. BUS_TRANS_BRD.(Core and Bus Agents) Burst read bus transactions. This event counts the number of burst read transactions including: 66H See Table 18-71 and Table 18-72. BUS_TRANS_RFO.(Core and Bus Agents) RFO bus transactions. This event counts the number of Read For Ownership (RFO) bus transactions, due to store operations that miss the L1 data cache and the L2 cache. It also counts RFO bus transactions due to locked operations. 67H See Table 18-71 and Table 18-72. BUS_TRANS_WB. (Core and Bus Agents) Explicit writeback bus transactions. This event counts all explicit writeback bus transactions due to dirty line evictions. It does not count implicit writebacks due to invalidation by a snoop request. 68H See Table 18-71 and Table 18-72. BUS_TRANS_ IFETCH.(Core and Bus Agents) Instruction-fetch bus transactions. This event counts all instruction fetch full cache line bus transactions. 69H See Table 18-71 and Table 18-72. BUS_TRANS_ INVAL.(Core and Bus Agents) Invalidate bus transactions. This event counts all invalidate transactions. Invalidate transactions are generated when: See Table 18-71 and Table 18-72. BUS_TRANS_ Partial write bus PWR.(Core and Bus Agents) transaction. 6AH This event counts the number of bus cycles, during which the LOCK signal is asserted on the bus. A LOCK signal is asserted when there is a locked memory access, due to: • Uncacheable memory. • Locked operation that spans two cache lines. • Page-walk from an uncacheable page table. Bus locks have a very high performance penalty and it is highly recommended to avoid such accesses. • L1 data cache read misses (and L1 data cache hardware prefetches). • L2 hardware prefetches by the DPL and L2 streamer. • IFU read misses of cacheable lines. It does not include RFO transactions. • A store operation hits a shared line in the L2 cache. • A full cache line write misses the L2 cache or hits a shared line in the L2 cache. This event counts partial write bus transactions. Vol. 3B 19-171 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment 6BH See Table 18-71 and Table 18-72. BUS_TRANS _P.(Core and Bus Agents) Partial bus transactions. This event counts all (read and write) partial bus transactions. 6CH See Table 18-71 and Table 18-72. BUS_TRANS_IO.(Core and Bus Agents) IO bus transactions. This event counts the number of completed I/O bus transactions as a result of IN and OUT instructions. The count does not include memory mapped IO. 6DH See Table 18-71 and Table 18-72. BUS_TRANS_ DEF.(Core and Bus Agents) Deferred bus transactions. This event counts the number of deferred transactions. 6EH See Table 18-71 and Table 18-72. BUS_TRANS_ BURST.(Core and Bus Agents) Burst (full cache-line) bus transactions. This event counts burst (full cache line) transactions including: See Table 18-71 and Table 18-72. BUS_TRANS_ Memory bus MEM.(Core and Bus Agents) transactions. This event counts all memory bus transactions including: See Table 18-71 and Table 18-72. BUS_TRANS_ All bus transactions. ANY.(Core and Bus Agents) This event counts all bus transactions. This includes: See Table 18-71 and Table 18-75. EXT_SNOOP. (Bus Agents, Snoop Response) This event counts the snoop responses to bus transactions. Responses can be counted separately by type and by bus agent. See Table 18-71 and Table 18-76. CMP_SNOOP.(Core, Snoop Type) 6FH 70H 77H 78H 19-172 Vol. 3B • • • • External snoops. Burst reads. RFOs. Explicit writebacks. Write combine lines. • Burst transactions. • Partial reads and writes - invalidate transactions. The BUS_TRANS_MEM count is the sum of BUS_TRANS_BURST, BUS_TRANS_P and BUS_TRANS_IVAL. • • • • Memory transactions. IO transactions (non memory-mapped). Deferred transaction completion. Other less frequent transactions, such as interrupts. With the 'THIS_AGENT' mask, the event counts snoop responses from this processor to bus transactions sent by this processor. With the 'ALL_AGENTS' mask the event counts all snoop responses seen on the bus. L1 data cache This event counts the number of times the L1 data cache is snooped by other core. snooped for a cache line that is needed by the other core in the same processor. The cache line is either missing in the L1 instruction or data caches of the other core, or is available for reading only and the other core wishes to write the cache line. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment The snoop operation may change the cache line state. If the other core issued a read request that hit this core in E state, typically the state changes to S state in this core. If the other core issued a read for ownership request (due a write miss or hit to S state) that hits this core's cache line in E or S state, this typically results in invalidation of the cache line in this core. If the snoop hits a line in M state, the state is changed at a later opportunity. These snoops are performed through the L1 data cache store port. Therefore, frequent snoops may conflict with extensive stores to the L1 data cache, which may increase store latency and impact performance. 7AH 7BH 7DH See Table 18-72. BUS_HIT_DRV. See Table 18-72. BUS_HITM_DRV. See Table 18-71. BUSQ_EMPTY. HIT signal asserted. This event counts the number of bus cycles during which the processor drives the HIT# pin to signal HIT snoop response. HITM signal asserted. This event counts the number of bus cycles during which the processor drives the HITM# pin to signal HITM snoop response. Bus queue empty. This event counts the number of cycles during which the core did not have any pending transactions in the bus queue. It also counts when the core is halted and the other core is not halted. (Bus Agents) (Bus Agents) (Core) This event can count occurrences for this core or both cores. 7EH 7FH See Table 18-71 and Table 18-72. SNOOP_STALL_ Bus stalled for snoops. This event counts the number of times that the bus snoop DRV.(Core and Bus Agents) stall signal is asserted. To obtain the number of bus cycles during which snoops on the bus are prohibited, multiply the event count by two. See Table 18-71. BUS_IO_WAIT. (Core) During the snoop stall cycles, no new bus transactions requiring a snoop response can be initiated on the bus. A bus agent asserts a snoop stall signal if it cannot response to a snoop request within three bus cycles. IO requests waiting in the bus queue. This event counts the number of core cycles during which IO requests wait in the bus queue. With the SELF modifier this event counts IO requests per core. With the BOTH_CORE modifier, this event increments by one for any cycle for which there is a request from either core. 80H 00H L1I_READS Instruction fetches. This event counts all instruction fetches, including uncacheable fetches that bypass the Instruction Fetch Unit (IFU). 81H 00H L1I_MISSES Instruction Fetch Unit misses. This event counts all instruction fetches that miss the Instruction Fetch Unit (IFU) or produce memory requests. This includes uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 82H 02H ITLB.SMALL_MISS ITLB small page misses. This event counts the number of instruction fetches from small pages that miss the ITLB. 82H 10H ITLB.LARGE_MISS ITLB large page misses. This event counts the number of instruction fetches from large pages that miss the ITLB. Vol. 3B 19-173 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment 82H 40H ITLB.FLUSH ITLB flushes. This event counts the number of ITLB flushes. This usually happens upon CR3 or CR0 writes, which are executed by the operating system during process switches. 82H 12H ITLB.MISSES ITLB misses. This event counts the number of instruction fetches from either small or large pages that miss the ITLB. 83H 02H INST_QUEUE.FULL Cycles during which the instruction queue is full. This event counts the number of cycles during which the instruction queue is full. In this situation, the core front end stops fetching more instructions. This is an indication of very long stalls in the back-end pipeline stages. 86H 00H CYCLES_L1I_ MEM_STALLED Cycles during which instruction fetches stalled. This event counts the number of cycles for which an instruction fetch stalls, including stalls due to any of the following reasons: • Instruction Fetch Unit cache misses. • Instruction TLB misses. • Instruction TLB faults. 87H 00H ILD_STALL Instruction Length Decoder stall cycles due to a length changing prefix. This event counts the number of cycles during which the instruction length decoder uses the slow length decoder. Usually, instruction length decoding is done in one cycle. When the slow decoder is used, instruction decoding requires 6 cycles. The slow decoder is used in the following cases: • Operand override prefix (66H) preceding an instruction with immediate data. • Address override prefix (67H) preceding an instruction with a modr/m in real, big real, 16-bit protected or 32-bit protected modes. To avoid instruction length decoding stalls, generate code using imm8 or imm32 values instead of imm16 values. If you must use an imm16 value, store the value in a register using “mov reg, imm32” and use the register format of the instruction. 88H 00H BR_INST_EXEC Branch instructions executed. This event counts all executed branches (not necessarily retired). This includes only instructions and not micro-op branches. Frequent branching is not necessarily a major performance issue. However frequent branch mispredictions may be a problem. 89H 00H BR_MISSP_EXEC Mispredicted branch This event counts the number of mispredicted branch instructions executed. instructions that were executed. 8AH 00H BR_BAC_ MISSP_EXEC Branch instructions mispredicted at decoding. 8BH 00H BR_CND_EXEC Conditional branch This event counts the number of conditional branch instructions executed. instructions executed, but not necessarily retired. 8CH 00H BR_CND_ MISSP_EXEC Mispredicted This event counts the number of mispredicted conditional conditional branch branch instructions that were executed. instructions executed. 8DH 00H BR_IND_EXEC Indirect branch This event counts the number of indirect branch instructions instructions executed. that were executed. 19-174 Vol. 3B This event counts the number of branch instructions that were mispredicted at decoding. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value 8EH Description and Comment Event Name Definition 00H BR_IND_MISSP _EXEC Mispredicted indirect branch instructions executed. This event counts the number of mispredicted indirect branch instructions that were executed. 8FH 00H BR_RET_EXEC RET instructions executed. This event counts the number of RET instructions that were executed. 90H 00H BR_RET_ MISSP_EXEC Mispredicted RET This event counts the number of mispredicted RET instructions executed. instructions that were executed. 91H 00H BR_RET_BAC_ MISSP_EXEC This event counts the number of RET instructions that were RET instructions executed mispredicted executed and were mispredicted at decoding. at decoding. 92H 00H BR_CALL_EXEC CALL instructions executed. 93H 00H BR_CALL_ MISSP_EXEC Mispredicted CALL This event counts the number of mispredicted CALL instructions executed. instructions that were executed. 94H 00H BR_IND_CALL_ EXEC Indirect CALL This event counts the number of indirect CALL instructions instructions executed. that were executed. 97H 00H BR_TKN_ BUBBLE_1 Branch predicted taken with bubble 1. This event counts the number of CALL instructions executed. The events BR_TKN_BUBBLE_1 and BR_TKN_BUBBLE_2 together count the number of times a taken branch prediction incurred a one-cycle penalty. The penalty incurs when: • Too many taken branches are placed together. To avoid this, unroll loops and add a non-taken branch in the middle of the taken sequence. • The branch target is unaligned. To avoid this, align the branch target. 98H 00H BR_TKN_ BUBBLE_2 Branch predicted taken with bubble 2. The events BR_TKN_BUBBLE_1 and BR_TKN_BUBBLE_2 together count the number of times a taken branch prediction incurred a one-cycle penalty. The penalty incurs when: • Too many taken branches are placed together. To avoid this, unroll loops and add a non-taken branch in the middle of the taken sequence. • The branch target is unaligned. To avoid this, align the branch target. A0H 00H RS_UOPS_ DISPATCHED Micro-ops dispatched for execution. This event counts the number of micro-ops dispatched for execution. Up to six micro-ops can be dispatched in each cycle. A1H 01H RS_UOPS_ DISPATCHED.PORT0 Cycles micro-ops dispatched for execution on port 0. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Issue Ports are described in Intel® 64 and IA-32 Architectures Optimization Reference Manual. Use IA32_PMC0 only. A1H 02H RS_UOPS_ DISPATCHED.PORT1 Cycles micro-ops dispatched for execution on port 1. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Use IA32_PMC0 only. A1H 04H RS_UOPS_ DISPATCHED.PORT2 Cycles micro-ops dispatched for execution on port 2. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Use IA32_PMC0 only. Vol. 3B 19-175 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment A1H 08H RS_UOPS_ DISPATCHED.PORT3 Cycles micro-ops dispatched for execution on port 3. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Use IA32_PMC0 only. A1H 10H RS_UOPS_ DISPATCHED.PORT4 Cycles micro-ops dispatched for execution on port 4. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Use IA32_PMC0 only. A1H 20H RS_UOPS_ DISPATCHED.PORT5 Cycles micro-ops dispatched for execution on port 5. This event counts the number of cycles for which micro-ops dispatched for execution. Each cycle, at most one micro-op can be dispatched on the port. Use IA32_PMC0 only. AAH 01H MACRO_INSTS. DECODED Instructions decoded. This event counts the number of instructions decoded (but not necessarily executed or retired). AAH 08H MACRO_INSTS. CISC_DECODED CISC Instructions decoded. This event counts the number of complex instructions decoded. Complex instructions usually have more than four micro-ops. Only one complex instruction can be decoded at a time. ABH 01H ESP.SYNCH ESP register content synchron-ization. This event counts the number of times that the ESP register is explicitly used in the address expression of a load or store operation, after it is implicitly used, for example by a push or a pop instruction. ESP synch micro-op uses resources from the rename pipestage and up to retirement. The expected ratio of this event divided by the number of ESP implicit changes is 0,2. If the ratio is higher, consider rearranging your code to avoid ESP synchronization events. ABH 02H ESP.ADDITIONS ESP register automatic This event counts the number of ESP additions performed additions. automatically by the decoder. A high count of this event is good, since each automatic addition performed by the decoder saves a micro-op from the execution units. To maximize the number of ESP additions performed automatically by the decoder, choose instructions that implicitly use the ESP, such as PUSH, POP, CALL, and RET instructions whenever possible. B0H 00H SIMD_UOPS_EXEC SIMD micro-ops executed (excluding stores). This event counts all the SIMD micro-ops executed. It does not count MOVQ and MOVD stores from register to memory. B1H 00H SIMD_SAT_UOP_ EXEC SIMD saturated arithmetic micro-ops executed. This event counts the number of SIMD saturated arithmetic micro-ops executed. B3H 01H SIMD_UOP_ TYPE_EXEC.MUL SIMD packed multiply micro-ops executed. This event counts the number of SIMD packed multiply micro-ops executed. B3H 02H SIMD_UOP_TYPE_EXEC.SHI SIMD packed shift FT micro-ops executed. This event counts the number of SIMD packed shift microops executed. B3H 04H SIMD_UOP_TYPE_EXEC.PA CK This event counts the number of SIMD pack micro-ops executed. B3H 08H SIMD_UOP_TYPE_EXEC.UN SIMD unpack microPACK ops executed. This event counts the number of SIMD unpack micro-ops executed. B3H 10H SIMD_UOP_TYPE_EXEC.LO GICAL This event counts the number of SIMD packed logical microops executed. 19-176 Vol. 3B SIMD pack micro-ops executed. SIMD packed logical micro-ops executed. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value B3H 20H SIMD_UOP_TYPE_EXEC.ARI SIMD packed THMETIC arithmetic micro-ops executed. This event counts the number of SIMD packed arithmetic micro-ops executed. C0H 00H INST_RETIRED. ANY_P This event counts the number of instructions that retire execution. For instructions that consist of multiple microops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. Event Name Definition Instructions retired. Description and Comment INST_RETIRED.ANY_P is an architectural performance event. C0H 01H INST_RETIRED. LOADS Instructions retired, which contain a load. This event counts the number of instructions retired that contain a load operation. C0H 02H INST_RETIRED. STORES Instructions retired, which contain a store. This event counts the number of instructions retired that contain a store operation. C0H 04H INST_RETIRED. OTHER Instructions retired, with no load or store operation. This event counts the number of instructions retired that do not contain a load or a store operation. C1H 01H X87_OPS_ RETIRED.FXCH FXCH instructions retired. This event counts the number of FXCH instructions retired. Modern compilers generate more efficient code and are less likely to use this instruction. If you obtain a high count for this event consider recompiling the code. C1H FEH X87_OPS_ RETIRED.ANY Retired floating-point computational operations (precise event). This event counts the number of floating-point computational operations retired. It counts: • Floating point computational operations executed by the assist handler. • Sub-operations of complex floating-point instructions like transcendental instructions. This event does not count: • Floating-point computational operations that cause traps or assists. • Floating-point loads and stores. When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. C2H 01H UOPS_RETIRED. LD_IND_BR Fused load+op or load+indirect branch retired. This event counts the number of retired micro-ops that fused a load with another operation. This includes: • Fusion of a load and an arithmetic operation, such as with the following instruction: ADD EAX, [EBX] where the content of the memory location specified by EBX register is loaded, added to EXA register, and the result is stored in EAX. • Fusion of a load and a branch in an indirect branch operation, such as with the following instructions: • JMP [RDI+200] • RET • Fusion decreases the number of micro-ops in the processor pipeline. A high value for this event count indicates that the code is using the processor resources effectively. Vol. 3B 19-177 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value C2H 02H Description and Comment Event Name Definition UOPS_RETIRED. STD_STA Fused store address + This event counts the number of store address calculations data retired. that are fused with store data emission into one micro-op. Traditionally, each store operation required two micro-ops. This event counts fusion of retired micro-ops only. Fusion decreases the number of micro-ops in the processor pipeline. A high value for this event count indicates that the code is using the processor resources effectively. C2H 04H UOPS_RETIRED. MACRO_FUSION Retired instruction pairs fused into one micro-op. This event counts the number of times CMP or TEST instructions were fused with a conditional branch instruction into one micro-op. It counts fusion by retired micro-ops only. Fusion decreases the number of micro-ops in the processor pipeline. A high value for this event count indicates that the code uses the processor resources more effectively. C2H 07H UOPS_RETIRED. FUSED Fused micro-ops retired. This event counts the total number of retired fused microops. The counts include the following fusion types: • Fusion of load operation with an arithmetic operation or with an indirect branch (counted by event UOPS_RETIRED.LD_IND_BR) • Fusion of store address and data (counted by event UOPS_RETIRED.STD_STA) • Fusion of CMP or TEST instruction with a conditional branch instruction (counted by event UOPS_RETIRED.MACRO_FUSION) Fusion decreases the number of micro-ops in the processor pipeline. A high value for this event count indicates that the code is using the processor resources effectively. C2H 08H UOPS_RETIRED. NON_FUSED Non-fused micro-ops retired. This event counts the number of micro-ops retired that were not fused. C2H 0FH UOPS_RETIRED. ANY Micro-ops retired. This event counts the number of micro-ops retired. The processor decodes complex macro instructions into a sequence of simpler micro-ops. Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. In some cases micro-op sequences are fused or whole instructions are fused into one micro-op. See other UOPS_RETIRED events for differentiating retired fused and non-fused micro-ops. C3H 01H 19-178 Vol. 3B MACHINE_ NUKES.SMC Self-Modifying Code detected. This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value C3H 04H Event Name Definition MACHINE_NUKES.MEM_OR DER Execution pipeline restart due to memory ordering conflict or memory disambiguation misprediction. Description and Comment This event counts the number of times the pipeline is restarted due to either multi-threaded memory ordering conflicts or memory disambiguation misprediction. A multi-threaded memory ordering conflict occurs when a store, which is executed in another core, hits a load that is executed out of order in this core but not yet retired. As a result, the load needs to be restarted to satisfy the memory ordering model. See Chapter 8, “Multiple-Processor Management” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. To count memory disambiguation mispredictions, use the event MEMORY_DISAMBIGUATION.RESET. C4H 00H BR_INST_RETIRED.ANY Retired branch instructions. This event counts the number of branch instructions retired. This is an architectural performance event. C4H 01H BR_INST_RETIRED.PRED_N Retired branch This event counts the number of branch instructions retired OT_ instructions that were that were correctly predicted to be not-taken. TAKEN predicted not-taken. C4H 02H BR_INST_RETIRED.MISPRE D_NOT_ TAKEN C4H 04H BR_INST_RETIRED.PRED_T Retired branch This event counts the number of branch instructions retired AKEN instructions that were that were correctly predicted to be taken. predicted taken. C4H 08H BR_INST_RETIRED.MISPRE D_TAKEN Retired branch This event counts the number of branch instructions retired instructions that were that were mispredicted and taken. mispredicted taken. C4H 0CH BR_INST_RETIRED.TAKEN Retired taken branch instructions. This event counts the number of branches retired that were taken. C5H 00H BR_INST_RETIRED.MISPRE D Retired mispredicted branch instructions. (precise event) This event counts the number of retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. Retired branch This event counts the number of branch instructions retired instructions that were that were mispredicted and not-taken. mispredicted nottaken. This is an architectural performance event. C6H 01H CYCLES_INT_ MASKED Cycles during which interrupts are disabled. This event counts the number of cycles during which interrupts are disabled. C6H 02H CYCLES_INT_ PENDING_AND _MASKED Cycles during which This event counts the number of cycles during which there interrupts are pending are pending interrupts but interrupts are disabled. and disabled. C7H 01H SIMD_INST_ RETIRED.PACKED_SINGLE Retired SSE packedsingle instructions. This event counts the number of SSE packed-single instructions retired. C7H 02H SIMD_INST_ RETIRED.SCALAR_SINGLE Retired SSE scalarsingle instructions. This event counts the number of SSE scalar-single instructions retired. C7H 04H SIMD_INST_ RETIRED.PACKED_DOUBLE Retired SSE2 packeddouble instructions. This event counts the number of SSE2 packed-double instructions retired. C7H 08H SIMD_INST_ RETIRED.SCALAR_DOUBLE Retired SSE2 scalardouble instructions. This event counts the number of SSE2 scalar-double instructions retired. Vol. 3B 19-179 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment C7H 10H SIMD_INST_ RETIRED.VECTOR Retired SSE2 vector integer instructions. This event counts the number of SSE2 vector integer instructions retired. C7H 1FH SIMD_INST_ RETIRED.ANY Retired Streaming SIMD instructions (precise event). This event counts the overall number of retired SIMD instructions that use XMM registers. To count each type of SIMD instruction separately, use the following events: • SIMD_INST_RETIRED.PACKED_SINGLE • SIMD_INST_RETIRED.SCALAR_SINGLE • SIMD_INST_RETIRED.PACKED_DOUBLE • SIMD_INST_RETIRED.SCALAR_DOUBLE • and SIMD_INST_RETIRED.VECTOR When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. C8H 00H HW_INT_RCV Hardware interrupts received. This event counts the number of hardware interrupts received by the processor. C9H 00H ITLB_MISS_ RETIRED Retired instructions that missed the ITLB. This event counts the number of retired instructions that missed the ITLB when they were fetched. CAH 01H SIMD_COMP_ INST_RETIRED. PACKED_SINGLE Retired computational This event counts the number of computational SSE packedSSE packed-single single instructions retired. Computational instructions instructions. perform arithmetic computations (for example: add, multiply and divide). Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CAH 02H SIMD_COMP_ INST_RETIRED. SCALAR_SINGLE Retired computational This event counts the number of computational SSE scalarSSE scalar-single single instructions retired. Computational instructions instructions. perform arithmetic computations (for example: add, multiply and divide). Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CAH 04H SIMD_COMP_ INST_RETIRED. PACKED_DOUBLE Retired computational This event counts the number of computational SSE2 SSE2 packed-double packed-double instructions retired. Computational instructions. instructions perform arithmetic computations (for example: add, multiply and divide). Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CAH 08H SIMD_COMP_INST_RETIRE D.SCALAR_DOUBLE Retired computational This event counts the number of computational SSE2 scalarSSE2 scalar-double double instructions retired. Computational instructions instructions. perform arithmetic computations (for example: add, multiply and divide). Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. 19-180 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value CBH 01H Description and Comment Event Name Definition MEM_LOAD_ RETIRED.L1D _MISS Retired loads that miss This event counts the number of retired load operations the L1 data cache that missed the L1 data cache. This includes loads from (precise event). cache lines that are currently being fetched, due to a previous L1 data cache miss to the same cache line. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. Use IA32_PMC0 only. CBH 02H MEM_LOAD_ RETIRED.L1D_ LINE_MISS L1 data cache line missed by retired loads (precise event). This event counts the number of load operations that miss the L1 data cache and send a request to the L2 cache to fetch the missing cache line. That is the missing cache line fetching has not yet started. The event count is equal to the number of cache lines fetched from the L2 cache by retired loads. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. The event might not be counted if the load is blocked (see LOAD_BLOCK events). When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. Use IA32_PMC0 only. CBH 04H MEM_LOAD_ RETIRED.L2_MISS Retired loads that miss This event counts the number of retired load operations the L2 cache (precise that missed the L2 cache. event). This event counts loads from cacheable memory only. It does not count loads by software prefetches. When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. Use IA32_PMC0 only. Vol. 3B 19-181 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value CBH 08H Event Name Definition MEM_LOAD_ RETIRED.L2_LINE_MISS L2 cache line missed by retired loads (precise event). Description and Comment This event counts the number of load operations that miss the L2 cache and result in a bus request to fetch the missing cache line. That is the missing cache line fetching has not yet started. This event count is equal to the number of cache lines fetched from memory by retired loads. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. The event might not be counted if the load is blocked (see LOAD_BLOCK events). When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. Use IA32_PMC0 only. CBH 10H MEM_LOAD_ RETIRED.DTLB_ MISS Retired loads that miss This event counts the number of retired loads that missed the DTLB (precise the DTLB. The DTLB miss is not counted if the load event). operation causes a fault. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event. Use IA32_PMC0 only. CCH 01H FP_MMX_TRANS_TO_MMX Transitions from This event counts the first MMX instructions following a Floating Point to MMX floating-point instruction. Use this event to estimate the Instructions. penalties for the transitions between floating-point and MMX states. CCH 02H FP_MMX_TRANS_TO_FP Transitions from MMX Instructions to Floating Point Instructions. This event counts the first floating-point instructions following any MMX instruction. Use this event to estimate the penalties for the transitions between floating-point and MMX states. CDH 00H SIMD_ASSIST SIMD assists invoked. This event counts the number of SIMD assists invoked. SIMD assists are invoked when an EMMS instruction is executed, changing the MMX state in the floating point stack. CEH 00H SIMD_INSTR_ RETIRED SIMD Instructions retired. This event counts the number of retired SIMD instructions that use MMX registers. CFH 00H SIMD_SAT_INSTR_RETIRED Saturated arithmetic instructions retired. This event counts the number of saturated arithmetic SIMD instructions that retired. D2H 01H RAT_STALLS. ROB_READ_PORT This event counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline. ROB read port stalls cycles. Note that, at this stage in the pipeline, additional stalls may occur at the same cycle and prevent the stalled micro-ops from entering the pipe. In such a case, micro-ops retry entering the execution pipe in the next cycle and the ROBread-port stall is counted again. 19-182 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value D2H D2H Description and Comment Event Name Definition 02H RAT_STALLS. PARTIAL_CYCLES Partial register stall cycles. This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction uses a register that was partially written by previous instructions. 04H RAT_STALLS. FLAGS Flag stall cycles. This event counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall. A partial register stall may occur when two conditions are met: • An instruction modifies some, but not all, of the flags in the flag register. • The next instruction, which depends on flags, depends on flags that were not modified by this instruction. D2H 08H RAT_STALLS. FPSW FPU status word stall. This event indicates that the FPU status word (FPSW) is written. To obtain the number of times the FPSW is written divide the event count by 2. The FPSW is written by instructions with long latency; a small count may indicate a high penalty. D2H 0FH RAT_STALLS. ANY All RAT stall cycles. This event counts the number of stall cycles due to conditions described by: • • • • RAT_STALLS.ROB_READ_PORT RAT_STALLS.PARTIAL RAT_STALLS.FLAGS RAT_STALLS.FPSW. D4H 01H SEG_RENAME_ STALLS.ES Segment rename stalls This event counts the number of stalls due to the lack of - ES. renaming resources for the ES segment register. If a segment is renamed, but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D4H 02H SEG_RENAME_ STALLS.DS Segment rename stalls This event counts the number of stalls due to the lack of - DS. renaming resources for the DS segment register. If a segment is renamed, but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D4H 04H SEG_RENAME_ STALLS.FS Segment rename stalls This event counts the number of stalls due to the lack of - FS. renaming resources for the FS segment register. If a segment is renamed, but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D4H 08H SEG_RENAME_ STALLS.GS Segment rename stalls This event counts the number of stalls due to the lack of - GS. renaming resources for the GS segment register. If a segment is renamed, but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. D4H 0FH SEG_RENAME_ STALLS.ANY Any (ES/DS/FS/GS) This event counts the number of stalls due to the lack of segment rename stall. renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. Vol. 3B 19-183 PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value Event Name Definition Description and Comment D5H 01H SEG_REG_ RENAMES.ES Segment renames ES. This event counts the number of times the ES segment register is renamed. D5H 02H SEG_REG_ RENAMES.DS Segment renames DS. This event counts the number of times the DS segment register is renamed. D5H 04H SEG_REG_ RENAMES.FS Segment renames FS. This event counts the number of times the FS segment register is renamed. D5H 08H SEG_REG_ RENAMES.GS Segment renames GS. This event counts the number of times the GS segment register is renamed. D5H 0FH SEG_REG_ RENAMES.ANY Any (ES/DS/FS/GS) segment rename. This event counts the number of times any of the four segment registers (ES/DS/FS/GS) is renamed. DCH 01H RESOURCE_ STALLS.ROB_FULL Cycles during which the ROB full. This event counts the number of cycles when the number of instructions in the pipeline waiting for retirement reaches the limit the processor can handle. A high count for this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, and other instructions that depend on these cannot execute until the former instructions complete execution). In this situation new instructions cannot enter the pipe and start execution. DCH 02H RESOURCE_ STALLS.RS_FULL Cycles during which the RS full. This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle. A high count of this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, and other instructions that depend on these cannot execute until the former instructions complete execution). In this situation new instructions cannot enter the pipe and start execution. DCH 04 RESOURCE_ STALLS.LD_ST Cycles during which This event counts the number of cycles while resourcethe pipeline has related stalls occur due to: exceeded load or store • The number of load instructions in the pipeline reached limit or waiting to the limit the processor can handle. The stall ends when a commit all stores. loading instruction retires. • The number of store instructions in the pipeline reached the limit the processor can handle. The stall ends when a storing instruction commits its data to the cache or memory. • There is an instruction in the pipe that can be executed only when all previous stores complete and their data is committed in the caches or memory. For example, the SFENCE and MFENCE instructions require this behavior. DCH 08H RESOURCE_ STALLS.FPCW Cycles stalled due to FPU control word write. This event counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word. DCH 10H RESOURCE_ STALLS.BR_MISS_CLEAR Cycles stalled due to branch misprediction. This event counts the number of cycles after a branch misprediction is detected at execution until the branch and all older micro-ops retire. During this time new micro-ops cannot enter the out-of-order pipeline. 19-184 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-27. Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.) Event Num Umask Value DCH 1FH Event Name Definition RESOURCE_ STALLS.ANY Resource related stalls. Description and Comment This event counts the number of cycles while resourcerelated stalls occurs for any conditions described by the following events: • • • • • RESOURCE_STALLS.ROB_FULL RESOURCE_STALLS.RS_FULL RESOURCE_STALLS.LD_ST RESOURCE_STALLS.FPCW RESOURCE_STALLS.BR_MISS_CLEAR E0H 00H BR_INST_ DECODED Branch instructions decoded. This event counts the number of branch instructions decoded. E4H 00H BOGUS_BR Bogus branches. This event counts the number of byte sequences that were mistakenly detected as taken branch instructions. This results in a BACLEAR event. This occurs mainly after task switches. E6H 00H BACLEARS BACLEARS asserted. This event counts the number of times the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front and. This can occur if the code has many branches such that they cannot be consumed by the BPU. Each BACLEAR asserted costs approximately 7 cycles of instruction fetch. The effect on total execution time depends on the surrounding code. F0H 00H PREF_RQSTS_UP Upward prefetches issued from DPL. This event counts the number of upward prefetches issued from the Data Prefetch Logic (DPL) to the L2 cache. A prefetch request issued to the L2 cache cannot be cancelled and the requested cache line is fetched to the L2 cache. F8H 00H PREF_RQSTS_DN Downward prefetches This event counts the number of downward prefetches issued from DPL. issued from the Data Prefetch Logic (DPL) to the L2 cache. A prefetch request issued to the L2 cache cannot be cancelled and the requested cache line is fetched to the L2 cache. Vol. 3B 19-185 PERFORMANCE MONITORING EVENTS 19.14 PERFORMANCE MONITORING EVENTS FOR PROCESSORS BASED ON THE GOLDMONT PLUS MICROARCHITECTURE Intel Atom processors based on the Goldmont Plus microarchitecture support the architectural performance monitoring events listed in Table 19-1 and fixed-function performance events using a fixed counter. They also support the following performance monitoring events listed in Table 19-29. These events apply to processors with CPUID signature of 06_7AH. In addition, processors based on the Goldmont Plus microarchitecture also support the events listed in Table 19-29 (see Section 19.15, “Performance Monitoring Events for Processors Based on the Goldmont Microarchitecture”). For an event listed in Table 19-29 that also appears in the model-specific tables of prior generations, Table 19-29 supersedes prior generation tables. Performance monitoring event descriptions may refer to terminology described in Section B.2, “Intel® Xeon® processor 5500 Series,” in Appendix B of the Intel® 64 and IA-32 Architectures Optimization Reference Manual. In Goldmont Plus microarchitecture, performance monitoring events that support Processor Event Based Sampling (PEBS) and PEBS records that contain processor state information that are associated with at-retirement tagging are marked by “Precise Event”. Table 19-28. Performance Events for the Goldmont Plus Microarchitecture Event Num. Umask Value Event Name Description Comment 00H 01H INST_RETIRED.ANY Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. This event uses fixed counter 0. Fixed Event, Precise Event, Not Reduced Skid 08H 02H DTLB_LOAD_MISSES.W ALK_COMPLETED_4K Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 4K pages. The page walks can end with or without a page fault. 08H 04H DTLB_LOAD_MISSES.W ALK_COMPLETED_2M_ 4M Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 2M or 4M pages. The page walks can end with or without a page fault. 08H 08H DTLB_LOAD_MISSES.W ALK_COMPLETED_1GB Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 1GB pages. The page walks can end with or without a page fault. 08H 10H DTLB_LOAD_MISSES.W ALK_PENDING Counts once per cycle for each page walk occurring due to a load (demand data loads or SW prefetches). Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. 49H 02H DTLB_STORE_MISSES.W Counts page walks completed due to demand data stores whose ALK_COMPLETED_4K address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. 49H 04H DTLB_STORE_MISSES.W Counts page walks completed due to demand data stores whose ALK_COMPLETED_2M_ address translations missed in the TLB and were mapped to 2M or 4M 4M pages. The page walks can end with or without a page fault. 49H 08H DTLB_STORE_MISSES.W Counts page walks completed due to demand data stores whose ALK_COMPLETED_1GB address translations missed in the TLB and were mapped to 1GB pages. The page walks can end with or without a page fault. 49H 10H DTLB_STORE_MISSES.W Counts once per cycle for each page walk occurring due to a demand ALK_PENDING data store. Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. 19-186 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-28. Performance Events for the Goldmont Plus Microarchitecture (Contd.) Event Num. Umask Value Event Name Description 4FH 10H EPT.WALK_PENDING Counts once per cycle for each page walk only while traversing the Extended Page Table (EPT), and does not count during the rest of the translation. The EPT is used for translating Guest-Physical Addresses to Physical Addresses for Virtual Machine Monitors (VMMs). Average cycles per walk can be calculated by dividing the count by number of walks. 85H 02H ITLB_MISSES.WALK_CO MPLETED_4K Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 4K pages. The page walks can end with or without a page fault. 85H 04H ITLB_MISSES.WALK_CO MPLETED_2M_4M Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 2M or 4M pages. The page walks can end with or without a page fault. 85H 08H ITLB_MISSES.WALK_CO MPLETED_1GB Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 1GB pages. The page walks can end with or without a page fault. 85H 10H ITLB_MISSES.WALK_PE NDING Counts once per cycle for each page walk occurring due to an instruction fetch. Includes cycles spent traversing the Extended Page Table (EPT). Average cycles per walk can be calculated by dividing by the number of walks. BDH 20H TLB_FLUSHES.STLB_AN Counts STLB flushes. The TLBs are flushed on instructions like INVLPG Y and MOV to CR3. C3H 20H MACHINE_CLEARS.PAGE Counts the number of times that the machines clears due to a page _FAULT fault. Covers both I-side and D-side (Loads/Stores) page faults. A page fault occurs when either page is not present, or an access violation. 19.15 Comment PERFORMANCE MONITORING EVENTS FOR PROCESSORS BASED ON THE GOLDMONT MICROARCHITECTURE Intel Atom processors based on the Goldmont microarchitecture support the architectural performance monitoring events listed in Table 19-1 and fixed-function performance events using a fixed counter. In addition, they also support the following model-specific performance monitoring events listed in Table 19-29. These events apply to processors with CPUID signatures of 06_5CH, 06_5FH, and 06_7AH. Performance monitoring event descriptions may refer to terminology described in Section B.2, “Intel® Xeon® processor 5500 Series,” in Appendix B of the Intel® 64 and IA-32 Architectures Optimization Reference Manual. In Goldmont microarchitecture, performance monitoring events that support Processor Event Based Sampling (PEBS) and PEBS records that contain processor state information that are associated with at-retirement tagging are marked by “Precise Event”. Table 19-29. Performance Events for the Goldmont Microarchitecture Event Num. Umask Value Event Name 03H 10H LD_BLOCKS.ALL_BLOCK Counts anytime a load that retires is blocked for any reason. Precise Event 03H 08H LD_BLOCKS.UTLB_MISS Counts loads blocked because they are unable to find their physical address in the micro TLB (UTLB). Precise Event 03H 02H LD_BLOCKS.STORE_FO RWARD Precise Event Description Counts a load blocked from using a store forward because of an address/size mismatch; only one of the loads blocked from each store will be counted. Comment Vol. 3B 19-187 PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value Event Name Description 03H 01H LD_BLOCKS.DATA_UNK NOWN Counts a load blocked from using a store forward, but did not occur Precise Event because the store data was not available at the right time. The forward might occur subsequently when the data is available. 03H 04H LD_BLOCKS.4K_ALIAS Counts loads that block because their address modulo 4K matches a pending store. 05H 01H PAGE_WALKS.D_SIDE_C Counts every core cycle when a Data-side (walks due to data operation) YCLES page walk is in progress. 05H 02H PAGE_WALKS.I_SIDE_CY Counts every core cycle when an Instruction-side (walks due to an CLES instruction fetch) page walk is in progress. 05H 03H PAGE_WALKS.CYCLES Counts every core cycle a page-walk is in progress due to either a data memory operation, or an instruction fetch. 0EH 00H UOPS_ISSUED.ANY Counts uops issued by the front end and allocated into the back end of the machine. This event counts uops that retire as well as uops that were speculatively executed but didn't retire. The sort of speculative uops that might be counted includes, but is not limited to those uops issued in the shadow of a mispredicted branch, those uops that are inserted during an assist (such as for a denormal floating-point result), and (previously allocated) uops that might be canceled during a machine clear. 13H 02H MISALIGN_MEM_REF.LO Counts when a memory load of a uop that spans a page boundary (a AD_PAGE_SPLIT split) is retired. Precise Event 13H 04H MISALIGN_MEM_REF.ST Counts when a memory store of a uop that spans a page boundary (a ORE_PAGE_SPLIT split) is retired. Precise Event 2EH 4FH LONGEST_LAT_CACHE. REFERENCE Counts memory requests originating from the core that reference a cache line in the L2 cache. 2EH 41H LONGEST_LAT_CACHE. MISS Counts memory requests originating from the core that miss in the L2 cache. 30H 00H L2_REJECT_XQ.ALL Counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the intra-die interconnect (IDI) fabric. The XQ may reject transactions from the L2Q (non-cacheable requests), L2 misses and L2 write-back victims. 31H 00H CORE_REJECT_L2Q.ALL Counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to ensure fairness between cores, or to delay a core's dirty eviction when the address conflicts with incoming external snoops. 3CH 00H CPU_CLK_UNHALTED.C ORE_P Core cycles when core is not halted. This event uses a programmable general purpose performance counter. 3CH 01H CPU_CLK_UNHALTED.R EF Reference cycles when core is not halted. This event uses a programmable general purpose performance counter. 51H 01H DL1.DIRTY_EVICTION Counts when a modified (dirty) cache line is evicted from the data L1 cache and needs to be written back to memory. No count will occur if the evicted line is clean, and hence does not require a writeback. 19-188 Vol. 3B Comment Precise Event PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value Event Name Description 80H 01H ICACHE.HIT Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line and that cache line is in the Icache (hit). The event strives to count on a cache line basis, so that multiple accesses which hit in a single cache line count as one ICACHE.HIT. Specifically, the event counts when straight line code crosses the cache line boundary, or when a branch target is to a new line, and that cache line is in the ICache. This event counts differently than Intel processors based on the Silvermont microarchitecture. 80H 02H ICACHE.MISSES Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line and that cache line is not in the Icache (miss). The event strives to count on a cache line basis, so that multiple accesses which miss in a single cache line count as one ICACHE.MISS. Specifically, the event counts when straight line code crosses the cache line boundary, or when a branch target is to a new line, and that cache line is not in the ICache. This event counts differently than Intel processors based on the Silvermont microarchitecture. 80H 03H ICACHE.ACCESSES Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line. The event strives to count on a cache line basis, so that multiple fetches to a single cache line count as one ICACHE.ACCESS. Specifically, the event counts when accesses from straight line code crosses the cache line boundary, or when a branch target is to a new line. This event counts differently than Intel processors based on the Silvermont microarchitecture. 81H 04H ITLB.MISS Counts the number of times the machine was unable to find a translation in the Instruction Translation Lookaside Buffer (ITLB) for a linear address of an instruction fetch. It counts when new translations are filled into the ITLB. The event is speculative in nature, but will not count translations (page walks) that are begun and not finished, or translations that are finished but not filled into the ITLB. 86H 00H FETCH_STALL.ALL Counts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes. This will include cycles due to an ITLB miss, ICache miss and other events. 86H 01H FETCH_STALL.ITLB_FIL L_PENDING_CYCLES Counts cycles that fetch is stalled due to an outstanding ITLB miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ITLB miss. Note: this event is not the same as page walk cycles to retrieve an instruction translation. 86H 02H FETCH_STALL.ICACHE_F Counts cycles that an ICache miss is outstanding, and instruction fetch ILL_PENDING_CYCLES is stalled. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes, while an Icache miss is outstanding. Note this event is not the same as cycles to retrieve an instruction due to an Icache miss. Rather, it is the part of the Instruction Cache (ICache) miss time where no bytes are available for the decoder. Comment Vol. 3B 19-189 PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value 9CH 00H Event Name Description Comment UOPS_NOT_DELIVERED. This event is used to measure front-end inefficiencies, i.e., when the ANY front end of the machine is not delivering uops to the back end and the back end has not stalled. This event can be used to identify if the machine is truly front-end bound. When this event occurs, it is an indication that the front end of the machine is operating at less than its theoretical peak performance. Background: We can think of the processor pipeline as being divided into 2 broader parts: the front end and the back end. The front end is responsible for fetching the instruction, decoding into uops in machine understandable format and putting them into a uop queue to be consumed by the back end. The back end then takes these uops and allocates the required resources. When all resources are ready, uops are executed. If the back end is not ready to accept uops from the front end, then we do not want to count these as front-end bottlenecks. However, whenever we have bottlenecks in the back end, we will have allocation unit stalls and eventually force the front end to wait until the back end is ready to receive more uops. This event counts only when the back end is requesting more micro-uops and the front end is not able to provide them. When 3 uops are requested and no uops are delivered, the event counts 3. When 3 are requested, and only 1 is delivered, the event counts 2. When only 2 are delivered, the event counts 1. Alternatively stated, the event will not count if 3 uops are delivered, or if the back end is stalled and not requesting any uops at all. Counts indicate missed opportunities for the front end to deliver a uop to the back end. Some examples of conditions that cause front-end efficiencies are: Icache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth. Known Issues: Some uops require multiple allocation slots. These uops will not be charged as a front end 'not delivered' opportunity, and will be regarded as a back-end problem. For example, the INC instruction has one uop that requires 2 issue slots. A stream of INC instructions will not count as UOPS_NOT_DELIVERED, even though only one instruction can be issued per clock. The low uop issue rate for a stream of INC instructions is considered to be a back-end issue. B7H 01H, 02H OFFCORE_RESPONSE Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (Duplicated for both MSRs.) C0H 00H INST_RETIRED.ANY_P Counts the number of instructions that retire execution. For Precise Event instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers. This is an architectural performance event. This event uses a programmable general purpose performance counter. *This event is a Precise Event: the EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event. Note: Because PEBS records can be collected only on IA32_PMC0, only one event can use the PEBS facility at a time. C2H 00H 19-190 Vol. 3B UOPS_RETIRED.ANY Counts uops which have retired. Precise Event, Not Reduced Skid PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value Event Name Description Comment C2H 01H UOPS_RETIRED.MS Counts uops retired that are from the complex flows issued by the micro-sequencer (MS). Counts both the uops from a micro-coded instruction, and the uops that might be generated from a micro-coded assist. Precise Event, Not Reduced Skid C2H 08H UOPS_RETIRED.FPDIV Counts the number of floating point divide uops retired. Precise Event C2H 10H UOPS_RETIRED.IDIV Counts the number of integer divide uops retired. Precise Event C3H 01H MACHINE_CLEARS.SMC Counts the number of times that the processor detects that a program is writing to a code section and has to perform a machine clear because of that modification. Self-modifying code (SMC) causes a severe penalty in all Intel architecture processors. C3H 02H MACHINE_CLEARS.MEM Counts machine clears due to memory ordering issues. This occurs ORY_ORDERING when a snoop request happens and the machine is uncertain if memory ordering will be preserved as another core is in the process of modifying the data. C3H 04H MACHINE_CLEARS.FP_A Counts machine clears due to floating-point (FP) operations needing SSIST assists. For instance, if the result was a floating-point denormal, the hardware clears the pipeline and reissues uops to produce the correct IEEE compliant denormal result. C3H 08H MACHINE_CLEARS.DISA Counts machine clears due to memory disambiguation. Memory MBIGUATION disambiguation happens when a load which has been issued conflicts with a previous un-retired store in the pipeline whose address was not known at issue time, but is later resolved to be the same as the load address. C3H 00H MACHINE_CLEARS.ALL C4H 00H BR_INST_RETIRED.ALL_ Counts branch instructions retired for all branch types. This is an BRANCHES architectural performance event. Precise Event C4H 7EH BR_INST_RETIRED.JCC Precise Event C4H 80H BR_INST_RETIRED.ALL_ Counts the number of taken branch instructions retired. TAKEN_BRANCHES Precise Event C4H FEH BR_INST_RETIRED.TAK EN_JCC Precise Event C4H F9H BR_INST_RETIRED.CALL Counts near CALL branch instructions retired. Precise Event C4H FDH BR_INST_RETIRED.REL_ Counts near relative CALL branch instructions retired. CALL Precise Event C4H FBH BR_INST_RETIRED.IND_ Counts near indirect CALL branch instructions retired. CALL Precise Event C4H F7H BR_INST_RETIRED.RET URN Precise Event C4H EBH BR_INST_RETIRED.NON Counts near indirect call or near indirect jmp branch instructions retired. Precise Event _RETURN_IND C4H BFH BR_INST_RETIRED.FAR _BRANCH Counts far branch instructions retired. This includes far jump, far call and return, and Interrupt call and return. Precise Event C5H 00H BR_MISP_RETIRED.ALL _BRANCHES Counts mispredicted branch instructions retired including all branch types. Precise Event Counts machine clears for any reason. Counts retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was taken and when it was not taken. Counts Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were taken and does not count when the Jcc branch instruction were not taken. Counts near return branch instructions retired. Vol. 3B 19-191 PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value Event Name Description C5H 7EH BR_MISP_RETIRED.JCC Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Precise Event Condition is Met) branch instructions retired, including both when the branch was supposed to be taken and when it was not supposed to be taken (but the processor predicted the opposite condition). C5H FEH BR_MISP_RETIRED.TAK EN_JCC Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were supposed to be taken but the processor predicted that it would not be taken. C5H FBH BR_MISP_RETIRED.IND_ Counts mispredicted near indirect CALL branch instructions retired, Precise Event CALL where the target address taken was not what the processor predicted. C5H F7H BR_MISP_RETIRED.RET URN C5H EBH BR_MISP_RETIRED.NON Counts mispredicted branch instructions retired that were near indirect Precise Event _RETURN_IND call or near indirect jmp, where the target address taken was not what the processor predicted. CAH 01H ISSUE_SLOTS_NOT_CO Counts the number of issue slots per core cycle that were not NSUMED.RESOURCE_FU consumed because of a full resource in the back end. Including but not LL limited to resources include the Re-order Buffer (ROB), reservation stations (RS), load/store buffers, physical registers, or any other needed machine resource that is currently unavailable. Note that uops must be available for consumption in order for this event to fire. If a uop is not available (Instruction Queue is empty), this event will not count. CAH 02H ISSUE_SLOTS_NOT_CO NSUMED.RECOVERY Counts the number of issue slots per core cycle that were not consumed by the back end because allocation is stalled waiting for a mispredicted jump to retire or other branch-like conditions (e.g. the event is relevant during certain microcode flows). Counts all issue slots blocked while within this window, including slots where uops were not available in the Instruction Queue. CAH 00H ISSUE_SLOTS_NOT_CO NSUMED.ANY Counts the number of issue slots per core cycle that were not consumed by the back end due to either a full resource in the back end (RESOURCE_FULL), or due to the processor recovering from some event (RECOVERY). CBH 01H HW_INTERRUPTS.RECEI Counts hardware interrupts received by the processor. VED CBH 02H HW_INTERRUPTS.MASK Counts the number of core cycles during which interrupts are masked ED (disabled). Increments by 1 each core cycle that EFLAGS.IF is 0, regardless of whether interrupts are pending or not. CBH 04H HW_INTERRUPTS.PENDI Counts core cycles during which there are pending interrupts, but NG_AND_MASKED interrupts are masked (EFLAGS.IF = 0). CDH 00H CYCLES_DIV_BUSY.ALL Counts core cycles if either divide unit is busy. CDH 01H CYCLES_DIV_BUSY.IDIV Counts core cycles if the integer divide unit is busy. CDH 02H CYCLES_DIV_BUSY.FPDI Counts core cycles if the floating point divide unit is busy. V D0H 81H MEM_UOPS_RETIRED.A LL_LOADS Counts the number of load uops retired. Precise Event D0H 82H MEM_UOPS_RETIRED.A LL_STORES Counts the number of store uops retired. Precise Event 19-192 Vol. 3B Counts mispredicted near RET branch instructions retired, where the return address taken was not what the processor predicted. Comment Precise Event Precise Event PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value Event Name Description Comment D0H 83H MEM_UOPS_RETIRED.A LL Counts the number of memory uops retired that are either a load or a store or both. Precise Event D0H 11H MEM_UOPS_RETIRED.D TLB_MISS_LOADS Counts load uops retired that caused a DTLB miss. Precise Event D0H 12H MEM_UOPS_RETIRED.D TLB_MISS_STORES Counts store uops retired that caused a DTLB miss. Precise Event D0H 13H MEM_UOPS_RETIRED.D TLB_MISS Counts uops retired that had a DTLB miss on load, store or either. Precise Event Note that when two distinct memory operations to the same page miss the DTLB, only one of them will be recorded as a DTLB miss. D0H 21H MEM_UOPS_RETIRED.L OCK_LOADS Counts locked memory uops retired. This includes 'regular' locks and Precise Event bus locks. To specifically count bus locks only, see the offcore response event. A locked access is one with a lock prefix, or an exchange to memory. D0H 41H MEM_UOPS_RETIRED.S PLIT_LOADS Counts load uops retired where the data requested spans a 64 byte cache line boundary. Precise Event D0H 42H MEM_UOPS_RETIRED.S PLIT_STORES Counts store uops retired where the data requested spans a 64 byte cache line boundary. Precise Event D0H 43H MEM_UOPS_RETIRED.S PLIT Counts memory uops retired where the data requested spans a 64 byte cache line boundary. Precise Event D1H 01H MEM_LOAD_UOPS_RETI Counts load uops retired that hit the L1 data cache. RED.L1_HIT Precise Event D1H 08H MEM_LOAD_UOPS_RETI Counts load uops retired that miss the L1 data cache. RED.L1_MISS Precise Event D1H 02H MEM_LOAD_UOPS_RETI Counts load uops retired that hit in the L2 cache. RED.L2_HIT Precise Event 0xD1H 10H MEM_LOAD_UOPS_RETI Counts load uops retired that miss in the L2 cache. RED.L2_MISS Precise Event D1H 20H MEM_LOAD_UOPS_RETI Counts load uops retired where the cache line containing the data was Precise Event RED.HITM in the modified state of another core or modules cache (HITM). More specifically, this means that when the load address was checked by other caching agents (typically another processor) in the system, one of those caching agents indicated that they had a dirty copy of the data. Loads that obtain a HITM response incur greater latency than most that is typical for a load. In addition, since HITM indicates that some other processor had this data in its cache, it implies that the data was shared between processors, or potentially was a lock or semaphore value. This event is useful for locating sharing, false sharing, and contended locks. Vol. 3B 19-193 PERFORMANCE MONITORING EVENTS Table 19-29. Performance Events for the Goldmont Microarchitecture (Contd.) Event Num. Umask Value D1H 40H MEM_LOAD_UOPS_RETI Counts memory load uops retired where the data is retrieved from the Precise Event RED.WCB_HIT WCB (or fill buffer), indicating that the load found its data while that data was in the process of being brought into the L1 cache. Typically a load will receive this indication when some other load or prefetch missed the L1 cache and was in the process of retrieving the cache line containing the data, but that process had not yet finished (and written the data back to the cache). For example, consider load X and Y, both referencing the same cache line that is not in the L1 cache. If load X misses cache first, it obtains and WCB (or fill buffer) begins the process of requesting the data. When load Y requests the data, it will either hit the WCB, or the L1 cache, depending on exactly what time the request to Y occurs. D1H 80H MEM_LOAD_UOPS_RETI Counts memory load uops retired where the data is retrieved from Precise Event RED.DRAM_HIT DRAM. Event is counted at retirement, so the speculative loads are ignored. A memory load can hit (or miss) the L1 cache, hit (or miss) the L2 cache, hit DRAM, hit in the WCB or receive a HITM response. E6H 01H BACLEARS.ALL E6H 08H BACLEARS.RETURN Counts BACLEARS on return instructions. E6H 10H BACLEARS.COND Counts BACLEARS on Jcc (Jump on Conditional Code/Jump if Condition is Met) branches. E7H 01H MS_DECODED.MS_ENTR Counts the number of times the Microcode Sequencer (MS) starts a Y flow of uops from the MSROM. It does not count every time a uop is read from the MSROM. The most common case that this counts is when a micro-coded instruction is encountered by the front end of the machine. Other cases include when an instruction encounters a fault, trap, or microcode assist of any sort that initiates a flow of uops. The event will count MS startups for uops that are speculative, and subsequently cleared by branch mispredict or a machine clear. E9H 01H DECODE_RESTRICTION. PREDECODE_WRONG 19.16 Event Name Description Comment Counts the number of times a BACLEAR is signaled for any reason, including, but not limited to indirect branch/call, Jcc (Jump on Conditional Code/Jump if Condition is Met) branch, unconditional branch/call, and returns. Counts the number of times the prediction (from the pre-decode cache) for instruction length is incorrect. PERFORMANCE MONITORING EVENTS FOR PROCESSORS BASED ON THE SILVERMONT MICROARCHITECTURE Processors based on the Silvermont microarchitecture support the architectural performance monitoring events listed in Table 19-1 and fixed-function performance events using fixed counter. In addition, they also support the following model-specific performance monitoring events listed in Table 19-30. These processors have the CPUID signatures of 06_37H, 06_4AH, 06_4DH, 06_5AH, and 06_5DH. Performance monitoring event descriptions may refer to terminology described in Section B.2, “Intel® Xeon® processor 5500 Series,” in Appendix B of the Intel® 64 and IA-32 Architectures Optimization Reference Manual. 19-194 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-30. Performance Events for Silvermont Microarchitecture Event Num. Umask Value Event Name Definition Description and Comment 03H 01H REHABQ.LD_BLOCK_S T_FORWARD Loads blocked due to store forward restriction. This event counts the number of retired loads that were prohibited from receiving forwarded data from the store because of address mismatch. 03H 02H REHABQ.LD_BLOCK_S TD_NOTREADY Loads blocked due to store data not ready. This event counts the cases where a forward was technically possible, but did not occur because the store data was not available at the right time. 03H 04H REHABQ.ST_SPLITS Store uops that split cache line boundary. This event counts the number of retire stores that experienced cache line boundary splits. 03H 08H REHABQ.LD_SPLITS Load uops that split cache line boundary. This event counts the number of retire loads that experienced cache line boundary splits. 03H 10H REHABQ.LOCK Uops with lock semantics. This event counts the number of retired memory operations with lock semantics. These are either implicit locked instructions such as the XCHG instruction or instructions with an explicit LOCK prefix (F0H). 03H 20H REHABQ.STA_FULL Store address buffer full. This event counts the number of retired stores that are delayed because there is not a store address buffer available. 03H 40H REHABQ.ANY_LD Any reissued load uops. This event counts the number of load uops reissued from Rehabq. 03H 80H REHABQ.ANY_ST Any reissued store uops. 04H 01H MEM_UOPS_RETIRED.L Loads retired that 1_MISS_LOADS missed L1 data cache. This event counts the number of load ops retired that miss in L1 Data cache. Note that prefetch misses will not be counted. 04H 02H MEM_UOPS_RETIRED.L Loads retired that hit 2_HIT_LOADS L2. This event counts the number of load micro-ops retired that hit L2. 04H 04H MEM_UOPS_RETIRED.L Loads retired that 2_MISS_LOADS missed L2. This event counts the number of load micro-ops retired that missed L2. 04H 08H MEM_UOPS_RETIRED. DTLB_MISS_LOADS Loads missed DTLB. This event counts the number of load ops retired that had DTLB miss. 04H 10H MEM_UOPS_RETIRED. UTLB_MISS Loads missed UTLB. This event counts the number of load ops retired that had UTLB miss. 04H 20H MEM_UOPS_RETIRED. HITM Cross core or cross module hitm. This event counts the number of load ops retired that got data from the other core or from the other module. 04H 40H MEM_UOPS_RETIRED. ALL_LOADS All Loads. This event counts the number of load ops retired. 04H 80H MEM_UOP_RETIRED.A LL_STORES All Stores. This event counts the number of store ops retired. 05H 01H PAGE_WALKS.D_SIDE_ CYCLES Duration of D-side page-walks in core cycles. This event counts every cycle when a D-side (walks due to a load) page walk is in progress. Page walk duration divided by number of page walks is the average duration of page-walks. This event counts the number of store uops reissued from Rehabq. Edge trigger bit must be cleared. Set Edge to count the number of page walks. 05H 02H PAGE_WALKS.I_SIDE_C Duration of I-side page- This event counts every cycle when an I-side (walks due to an YCLES walks in core cycles. instruction fetch) page walk is in progress. Page walk duration divided by number of page walks is the average duration of page-walks. Edge trigger bit must be cleared. Set Edge to count the number of page walks. Vol. 3B 19-195 PERFORMANCE MONITORING EVENTS Table 19-30. Performance Events for Silvermont Microarchitecture Event Num. Umask Value Event Name Definition Description and Comment 05H 03H PAGE_WALKS.WALKS Total number of pagewalks that are completed (I-side and D-side). This event counts when a data (D) page walk or an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks. Edge trigger bit must be set. Clear Edge to count the number of cycles. 2EH 41H LONGEST_LAT_CACHE. L2 cache request MISS misses. This event counts the total number of L2 cache references and the number of L2 cache misses respectively. L3 is not supported in Silvermont microarchitecture. 2EH 4FH LONGEST_LAT_CACHE. L2 cache requests REFERENCE from this core. This event counts requests originating from the core that references a cache line in the L2 cache. L3 is not supported in Silvermont microarchitecture. 30H 00H L2_REJECT_XQ.ALL Counts the number of request from the L2 that were not accepted into the XQ. This event counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the IDI link. The XQ may reject transactions from the L2Q (non-cacheable requests), BBS (L2 misses) and WOB (L2 write-back victims). 31H 00H CORE_REJECT_L2Q.ALL Counts the number of request that were not accepted into the L2Q because the L2Q is FULL. This event counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to insure fairness between cores, or to delay a core's dirty eviction when the address conflicts incoming external snoops. (Note that L2 prefetcher requests that are dropped are not counted by this event.). 3CH 00H CPU_CLK_UNHALTED.C Core cycles when core ORE_P is not halted. This event counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time. N/A N/A CPU_CLK_UNHALTED.C Core cycles when core ORE is not halted. This uses the fixed counter 1 to count the same condition as CPU_CLK_UNHALTED.CORE_P does. 3CH 01H CPU_CLK_UNHALTED.R Bus cycles when core is This event counts the number of bus cycles that the core is not EF_P not halted. in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time. This event is not affected by core frequency changes. N/A N/A CPU_CLK_UNHALTED.R Reference cycles when This event counts the number of reference cycles at a TSC rate EF_TSC core is not halted. that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time. This event is not affected by core frequency changes. 80H 01H ICACHE.HIT Instruction fetches from Icache. This event counts all instruction fetches from the instruction cache. 80H 02H ICACHE.MISSES Icache miss. This event counts all instruction fetches that miss the Instruction cache or produce memory requests. This includes uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 19-196 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-30. Performance Events for Silvermont Microarchitecture Event Num. Umask Value Event Name Definition Description and Comment 80H 03H ICACHE.ACCESSES Instruction fetches. This event counts all instruction fetches, including uncacheable fetches. B7H 01H OFFCORE_RESPONSE_ See Section 18.5.2.2. 0 Requires MSR_OFFCORE_RESP0 to specify request type and response. B7H 02H OFFCORE_RESPONSE_ See Section 18.5.2.2. 1 Requires MSR_OFFCORE_RESP1 to specify request type and response. C0H 00H INST_RETIRED.ANY_P Instructions retired (PEBS supported with IA32_PMC0). This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. N/A N/A INST_RETIRED.ANY Instructions retired. This uses the fixed counter 0 to count the same condition as INST_RETIRED.ANY_P does. C2H 01H UOPS_RETIRED.MS MSROM micro-ops retired. This event counts the number of micro-ops retired that were supplied from MSROM. C2H 10H UOPS_RETIRED.ALL Micro-ops retired. This event counts the number of micro-ops retired. C3H 01H MACHINE_CLEARS.SMC Self-Modifying Code detected. This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel® architecture processors. C3H 02H MACHINE_CLEARS.ME MORY_ORDERING This event counts the number of times that pipeline was cleared due to memory ordering issues. C3H 04H MACHINE_CLEARS.FP_ Stalls due to FP assists. This event counts the number of times that pipeline stalled due ASSIST to FP operations needing assists. C3H 08H MACHINE_CLEARS.ALL Stalls due to any causes. This event counts the number of times that pipeline stalled due to due to any causes (including SMC, MO, FP assist, etc.). C4H 00H BR_INST_RETIRED.ALL Retired branch _BRANCHES instructions. This event counts the number of branch instructions retired. C4H 7EH BR_INST_RETIRED.JCC This event counts the number of branch instructions retired that were conditional jumps. C4H BFH BR_INST_RETIRED.FAR Retired far branch _BRANCH instructions. This event counts the number of far branch instructions retired. C4H EBH BR_INST_RETIRED.NO N_RETURN_IND This event counts the number of branch instructions retired that were near indirect call or near indirect jmp. C4H F7H BR_INST_RETIRED.RET Retired near return URN instructions. This event counts the number of near RET branch instructions retired. C4H F9H BR_INST_RETIRED.CAL Retired near call L instructions. This event counts the number of near CALL branch instructions retired. C4H FBH BR_INST_RETIRED.IND Retired near indirect _CALL call instructions. This event counts the number of near indirect CALL branch instructions retired. C4H FDH BR_INST_RETIRED.REL Retired near relative _CALL call instructions. This event counts the number of near relative CALL branch instructions retired. C4H FEH This event counts the number of branch instructions retired that BR_INST_RETIRED.TAK Retired conditional jumps that were taken. were conditional jumps and taken. EN_JCC C5H 00H BR_MISP_RETIRED.ALL Retired mispredicted _BRANCHES branch instructions. Stalls due to Memory ordering. Retired branch instructions that were conditional jumps. Retired instructions of near indirect Jmp or call. This event counts the number of mispredicted branch instructions retired. Vol. 3B 19-197 PERFORMANCE MONITORING EVENTS Table 19-30. Performance Events for Silvermont Microarchitecture Event Num. Umask Value Event Name Definition Description and Comment C5H 7EH BR_MISP_RETIRED.JCC Retired mispredicted conditional jumps. This event counts the number of mispredicted branch instructions retired that were conditional jumps. C5H BFH BR_MISP_RETIRED.FA R Retired mispredicted This event counts the number of mispredicted far branch far branch instructions. instructions retired. C5H EBH BR_MISP_RETIRED.NO N_RETURN_IND Retired mispredicted instructions of near indirect Jmp or call. This event counts the number of mispredicted branch instructions retired that were near indirect call or near indirect jmp. C5H F7H BR_MISP_RETIRED.RE TURN Retired mispredicted near return instructions. This event counts the number of mispredicted near RET branch instructions retired. C5H F9H BR_MISP_RETIRED.CAL Retired mispredicted L near call instructions. This event counts the number of mispredicted near CALL branch instructions retired. C5H FBH BR_MISP_RETIRED.IND Retired mispredicted _CALL near indirect call instructions. This event counts the number of mispredicted near indirect CALL branch instructions retired. C5H FDH BR_MISP_RETIRED.REL Retired mispredicted _CALL near relative call instructions This event counts the number of mispredicted near relative CALL branch instructions retired. C5H FEH BR_MISP_RETIRED.TA KEN_JCC This event counts the number of mispredicted branch instructions retired that were conditional jumps and taken. CAH 01H NO_ALLOC_CYCLES.RO Counts the number of B_FULL cycles when no uops are allocated and the ROB is full (less than 2 entries available). Counts the number of cycles when no uops are allocated and the ROB is full (less than 2 entries available). CAH 20H NO_ALLOC_CYCLES.RA Counts the number of T_STALL cycles when no uops are allocated and a RATstall is asserted. Counts the number of cycles when no uops are allocated and a RATstall is asserted. CAH 3FH NO_ALLOC_CYCLES.AL L This event counts the number of cycles when the front end does not provide any instructions to be allocated for any reason. CAH 50H NO_ALLOC_CYCLES.NO Front end not This event counts the number of cycles when the front end does T_DELIVERED delivering back end not not provide any instructions to be allocated but the back end is stalled. not stalled. CBH 01H RS_FULL_STALL.MEC MEC RS full. This event counts the number of cycles the allocation pipe line stalled due to the RS for the MEC cluster is full. CBH 1FH RS_FULL_STALL.ALL Any RS full. This event counts the number of cycles that the allocation pipe line stalled due to any one of the RS is full. CDH 01H CYCLES_DIV_BUSY.AN Y Divider Busy. This event counts the number of cycles the divider is busy. E6H 01H BACLEARS.ALL BACLEARS asserted for This event counts the number of baclears for any type of branch. any branch. E6H 08H BACLEARS.RETURN BACLEARS asserted for This event counts the number of baclears for return branches. return branch. 19-198 Vol. 3B Retired mispredicted conditional jumps that were taken. Front end not delivering. PERFORMANCE MONITORING EVENTS Table 19-30. Performance Events for Silvermont Microarchitecture Event Num. Umask Value Event Name Definition E6H 10H BACLEARS.COND BACLEARS asserted for This event counts the number of baclears for conditional conditional branch. branches. E7H 01H MS_DECODED.MS_ENT MS Decode starts. RY Description and Comment This event counts the number of times the MSROM starts a flow of UOPS. 19.16.1 Performance Monitoring Events for Processors Based on the Airmont Microarchitecture Intel processors based on the Airmont microarchitecture support the same architectural and the model-specific performance monitoring events as processors based on the Silvermont microarchitecture. All of the events listed in Table 19-30 apply. These processors have the CPUID signatures that include 06_4CH. 19.17 PERFORMANCE MONITORING EVENTS FOR 45 NM AND 32 NM INTEL® ATOM™ PROCESSORS 45 nm and 32 nm processors based on the Intel® Atom™ microarchitecture support the architectural performance monitoring events listed in Table 19-1 and fixed-function performance events using fixed counter listed in Table 19-26. In addition, they also support the following model-specific performance monitoring events listed in Table 19-31. Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors Event Num. Umask Value 02H 81H STORe_FORWARDS.GO Good store forwards. OD This event counts the number of times store data was forwarded directly to a load. 06H 00H SEGMENT_REG_ LOADS.ANY This event counts the number of segment register load operations. Instructions that load new values into segment registers cause a penalty. This event indicates performance issues in 16-bit code. If this event occurs frequently, it may be useful to calculate the number of instructions retired per segment register load. If the resulting calculation is low (on average a small number of instructions are executed between segment register loads), then the code’s segment register usage should be optimized. Event Name Definition Number of segment register loads. Description and Comment As a result of branch misprediction, this event is speculative and may include segment register loads that do not actually occur. However, most segment register loads are internally serialized and such speculative effects are minimized. 07H 01H PREFETCH.PREFETCHT Streaming SIMD 0 Extensions (SSE) PrefetchT0 instructions executed. This event counts the number of times the SSE instruction prefetchT0 is executed. This instruction prefetches the data to the L1 data cache and L2 cache. 07H 06H PREFETCH.SW_L2 This event counts the number of times the SSE instructions prefetchT1 and prefetchT2 are executed. These instructions prefetch the data to the L2 cache. Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed. Vol. 3B 19-199 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value 07H 08H PREFETCH.PREFETCHN Streaming SIMD TA Extensions (SSE) Prefetch NTA instructions executed. This event counts the number of times the SSE instruction prefetchNTA is executed. This instruction prefetches the data to the L1 data cache. 08H 07H DATA_TLB_MISSES.DT LB_MISS Memory accesses that missed the DTLB. This event counts the number of Data Table Lookaside Buffer (DTLB) misses. The count includes misses detected as a result of speculative accesses. Typically a high count for this event indicates that the code accesses a large number of data pages. 08H 05H DATA_TLB_MISSES.DT LB_MISS_LD DTLB misses due to load operations. This event counts the number of Data Table Lookaside Buffer (DTLB) misses due to load operations. This count includes misses detected as a result of speculative accesses. 08H 09H DATA_TLB_MISSES.L0 _DTLB_MISS_LD L0_DTLB misses due to This event counts the number of L0_DTLB misses due to load load operations. operations. This count includes misses detected as a result of speculative accesses. 08H 06H DATA_TLB_MISSES.DT LB_MISS_ST DTLB misses due to store operations. This event counts the number of Data Table Lookaside Buffer (DTLB) misses due to store operations. This count includes misses detected as a result of speculative accesses. 0CH 03H PAGE_WALKS.WALKS Number of page-walks executed. This event counts the number of page-walks executed due to either a DTLB or ITLB miss. The page walk duration, PAGE_WALKS.CYCLES, divided by number of page walks is the average duration of a page walk. This can hint to whether most of the page-walks are satisfied by the caches or cause an L2 cache miss. 0CH 03H PAGE_WALKS.CYCLES Duration of page-walks This event counts the duration of page-walks in core cycles. The in core cycles. paging mode in use typically affects the duration of page walks. Page walk duration divided by number of page walks is the average duration of page-walks. This can hint at whether most of the page-walks are satisfied by the caches or cause an L2 cache miss. 10H 01H X87_COMP_OPS_EXE. ANY.S Floating point computational microops executed. This event counts the number of x87 floating point computational micro-ops executed. 10H 81H X87_COMP_OPS_EXE. ANY.AR Floating point computational microops retired. This event counts the number of x87 floating point computational micro-ops retired. 11H 01H FP_ASSIST Floating point assists. This event counts the number of floating point operations executed that required micro-code assist intervention. These assists are required in the following cases. Event Name Definition Description and Comment Edge trigger bit must be set. Edge trigger bit must be cleared. X87 instructions: 1. NaN or denormal are loaded to a register or used as input from memory. 2. Division by 0. 3. Underflow output. 19-200 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment 11H 81H FP_ASSIST.AR Floating point assists. This event counts the number of floating point operations executed that required micro-code assist intervention. These assists are required in the following cases. X87 instructions: 1. NaN or denormal are loaded to a register or used as input from memory. 2. Division by 0. 3. Underflow output. 12H 01H MUL.S Multiply operations executed. This event counts the number of multiply operations executed. This includes integer as well as floating point multiply operations. 12H 81H MUL.AR Multiply operations retired. This event counts the number of multiply operations retired. This includes integer as well as floating point multiply operations. 13H 01H DIV.S Divide operations executed. This event counts the number of divide operations executed. This includes integer divides, floating point divides and squareroot operations executed. 13H 81H DIV.AR Divide operations retired. This event counts the number of divide operations retired. This includes integer divides, floating point divides and square-root operations executed. 14H 01H CYCLES_DIV_BUSY Cycles the driver is busy. This event counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. 21H See Table 18-71 L2_ADS Cycles L2 address bus is in use. This event counts the number of cycles the L2 address bus is being used for accesses to the L2 cache or bus queue. 22H See Table 18-71 L2_DBUS_BUSY Cycles the L2 cache data bus is busy. This event counts core cycles during which the L2 cache data bus is busy transferring data from the L2 cache to the core. It counts for all L1 cache misses (data and instruction) that hit the L2 cache. The count will increment by two for a full cache-line request. 24H See Table 18-71 and Table 18-73 L2_LINES_IN L2 cache misses. This event counts the number of cache lines allocated in the L2 cache. Cache lines are allocated in the L2 cache as a result of requests from the L1 data and instruction caches and the L2 hardware prefetchers to cache lines that are missing in the L2 cache. See Table 18-71 L2_M_LINES_IN 25H This event can count occurrences for this core or both cores. This event can count occurrences for this core or both cores. This event can also count demand requests and L2 hardware prefetch requests together or separately. L2 cache line modifications. This event counts whenever a modified cache line is written back from the L1 data cache to the L2 cache. This event can count occurrences for this core or both cores. Vol. 3B 19-201 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment 26H See Table 18-71 and Table 18-73 L2_LINES_OUT L2 cache lines evicted. This event counts the number of L2 cache lines evicted. See Table 18-71 and Table 18-73 L2_M_LINES_OUT See Table 18-71 and Table 18-74 L2_IFETCH See Table 18-71, Table 18-73 and Table 18-74 L2_LD 27H 28H 29H This event can count occurrences for this core or both cores. This event can also count evictions due to demand requests and L2 hardware prefetch requests together or separately. Modified lines evicted from the L2 cache. This event counts the number of L2 modified cache lines evicted. These lines are written back to memory unless they also exist in a shared-state in one of the L1 data caches. This event can count occurrences for this core or both cores. This event can also count evictions due to demand requests and L2 hardware prefetch requests together or separately. L2 cacheable instruction fetch requests. This event counts the number of instruction cache line requests from the ICache. It does not include fetch requests from uncacheable memory. It does not include ITLB miss accesses. This event can count occurrences for this core or both cores. This event can also count accesses to cache lines at different MESI states. L2 cache reads. This event counts L2 cache read requests coming from the L1 data cache and L2 prefetchers. This event can count occurrences for this core or both cores. This event can count occurrences - for this core or both cores. - due to demand requests and L2 hardware prefetch requests together or separately. - of accesses to cache lines at different MESI states. 2AH 2BH 2EH See Table 18-71 and Table 18-74 L2_ST See Table 18-71 and Table 18-74 L2_LOCK See Table 18-71, Table 18-73 and Table 18-74 L2_RQSTS L2 store requests. This event counts all store operations that miss the L1 data cache and request the data from the L2 cache. This event can count occurrences for this core or both cores. This event can also count accesses to cache lines at different MESI states. L2 locked accesses. This event counts all locked accesses to cache lines that miss the L1 data cache. This event can count occurrences for this core or both cores. This event can also count accesses to cache lines at different MESI states. L2 cache requests. This event counts all completed L2 cache requests. This includes L1 data cache reads, writes, and locked accesses, L1 data prefetch requests, instruction fetches, and all L2 hardware prefetch requests. This event can count occurrences - for this core or both cores. - due to demand requests and L2 hardware prefetch requests together, or separately. - of accesses to cache lines at different MESI states. 19-202 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value 2EH 41H Event Name Definition L2_RQSTS.SELF.DEMA ND.I_STATE L2 cache demand This event counts all completed L2 cache demand requests requests from this core from this core that miss the L2 cache. This includes L1 data that missed the L2. cache reads, writes, and locked accesses, L1 data prefetch requests, and instruction fetches. Description and Comment This is an architectural performance event. 2EH 4FH L2_RQSTS.SELF.DEMA ND.MESI L2 cache demand requests from this core. This event counts all completed L2 cache demand requests from this core. This includes L1 data cache reads, writes, and locked accesses, L1 data prefetch requests, and instruction fetches. This is an architectural performance event. 30H See Table 18-71, Table 18-73 and Table 18-74 L2_REJECT_BUSQ Rejected L2 cache requests. This event indicates that a pending L2 cache request that requires a bus transaction is delayed from moving to the bus queue. Some of the reasons for this event are: - The bus queue is full. - The bus queue already holds an entry for a cache line in the same set. The number of events is greater or equal to the number of requests that were rejected. - For this core or both cores. - Due to demand requests and L2 hardware prefetch requests together, or separately. - Of accesses to cache lines at different MESI states. 32H See Table 18-71 L2_NO_REQ Cycles no L2 cache requests are pending. This event counts the number of cycles that no L2 cache requests are pending. 3AH 00H EIST_TRANS Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions. This event counts the number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions that include a frequency change, either with or without VID change. This event is incremented only while the counting core is in C0 state. In situations where an EIST transition was caused by hardware as a result of CxE state transitions, those EIST transitions will also be registered in this event. Enhanced Intel Speedstep Technology transitions are commonly initiated by OS, but can be initiated by HW internally. For example: CxE states are C-states (C1,C2,C3…) which not only place the CPU into a sleep state by turning off the clock and other components, but also lower the voltage (which reduces the leakage power consumption). The same is true for thermal throttling transition which uses Enhanced Intel Speedstep Technology internally. 3BH C0H THERMAL_TRIP Number of thermal trips. This event counts the number of thermal trips. A thermal trip occurs whenever the processor temperature exceeds the thermal trip threshold temperature. Following a thermal trip, the processor automatically reduces frequency and voltage. The processor checks the temperature every millisecond, and returns to normal when the temperature falls below the thermal trip threshold temperature. Vol. 3B 19-203 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value 3CH 00H Event Name Definition CPU_CLK_UNHALTED.C Core cycles when core ORE_P is not halted. Description and Comment This event counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time. In systems with a constant core frequency, this event can give you a measurement of the elapsed time while the core was not in halt state by dividing the event count by the core frequency. -This is an architectural performance event. - The event CPU_CLK_UNHALTED.CORE_P is counted by a programmable counter. - The event CPU_CLK_UNHALTED.CORE is counted by a designated fixed counter, leaving the two programmable counters available for other events. 3CH 01H CPU_CLK_UNHALTED.B Bus cycles when core is This event counts the number of bus cycles while the core is not US not halted. in the halt state. This event can give you a measurement of the elapsed time while the core was not in the halt state, by dividing the event count by the bus frequency. The core enters the halt state when it is running the HLT instruction. The event also has a constant ratio with CPU_CLK_UNHALTED.REF event, which is the maximum bus to processor frequency ratio. Non-halted bus cycles are a component in many key event ratios. 3CH 02H CPU_CLK_UNHALTED. NO_OTHER Bus cycles when core is This event counts the number of bus cycles during which the active and the other is core remains non-halted, and the other core on the processor is halted. halted. This event can be used to determine the amount of parallelism exploited by an application or a system. Divide this event count by the bus frequency to determine the amount of time that only one core was in use. 40H 21H L1D_CACHE.LD L1 Cacheable Data Reads. This event counts the number of data reads from cacheable memory. 40H 22H L1D_CACHE.ST L1 Cacheable Data Writes. This event counts the number of data writes to cacheable memory. 60H See Table 18-71 and Table 18-72. BUS_REQUEST_OUTST Outstanding cacheable This event counts the number of pending full cache line read ANDING data read bus requests transactions on the bus occurring in each cycle. A read duration. transaction is pending from the cycle it is sent on the bus until the full cache line is received by the processor. NOTE: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. 19-204 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value 61H See Table 18-72. Event Name Definition BUS_BNR_DRV Number of Bus Not This event counts the number of Bus Not Ready (BNR) signals Ready signals asserted. that the processor asserts on the bus to suspend additional bus requests by other bus agents. A bus agent asserts the BNR signal when the number of data and snoop transactions is close to the maximum that the bus can handle. Description and Comment While this signal is asserted, new transactions cannot be submitted on the bus. As a result, transaction latency may have higher impact on program performance. NOTE: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. 62H See Table 18-72. BUS_DRDY_CLOCKS Bus cycles when data is sent on the bus. This event counts the number of bus cycles during which the DRDY (Data Ready) signal is asserted on the bus. The DRDY signal is asserted when data is sent on the bus. This event counts the number of bus cycles during which this agent (the processor) writes data on the bus back to memory or to other bus agents. This includes all explicit and implicit data writebacks, as well as partial writes. Note: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. 63H See Table 18-71 and Table 18-72. BUS_LOCK_CLOCKS Bus cycles when a This event counts the number of bus cycles, during which the LOCK signal is asserted. LOCK signal is asserted on the bus. A LOCK signal is asserted when there is a locked memory access, due to: - Uncacheable memory. - Locked operation that spans two cache lines. - Page-walk from an uncacheable page table. Bus locks have a very high performance penalty and it is highly recommended to avoid such accesses. NOTE: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. 64H See Table 18-71. BUS_DATA_RCV Bus cycles while processor receives data. This event counts the number of cycles during which the processor is busy receiving data. NOTE: This event is threadindependent and will not provide a count per logical processor when AnyThr is disabled. 65H See Table 18-71 and Table 18-72. BUS_TRANS_BRD Burst read bus transactions. This event counts the number of burst read transactions including: - L1 data cache read misses (and L1 data cache hardware prefetches). - L2 hardware prefetches by the DPL and L2 streamer. - IFU read misses of cacheable lines. It does not include RFO transactions. 66H See Table 18-71 and Table 18-72. BUS_TRANS_RFO RFO bus transactions. This event counts the number of Read For Ownership (RFO) bus transactions, due to store operations that miss the L1 data cache and the L2 cache. This event also counts RFO bus transactions due to locked operations. Vol. 3B 19-205 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment 67H See Table 18-71 and Table 18-72. BUS_TRANS_WB Explicit writeback bus transactions. This event counts all explicit writeback bus transactions due to dirty line evictions. It does not count implicit writebacks due to invalidation by a snoop request. 68H See Table 18-71 and Table 18-72. BUS_TRANS_IFETCH Instruction-fetch bus transactions. This event counts all instruction fetch full cache line bus transactions. 69H See Table 18-71 and Table 18-72. BUS_TRANS_INVAL Invalidate bus transactions. This event counts all invalidate transactions. Invalidate transactions are generated when: 6AH See Table 18-71 and Table 18-72. BUS_TRANS_PWR Partial write bus transaction. This event counts partial write bus transactions. 6BH See Table 18-71 and Table 18-72. BUS_TRANS_P Partial bus transactions. This event counts all (read and write) partial bus transactions. 6CH See Table 18-71 and Table 18-72. BUS_TRANS_IO IO bus transactions. This event counts the number of completed I/O bus transactions as a result of IN and OUT instructions. The count does not include memory mapped IO. 6DH See Table 18-71 and Table 18-72. BUS_TRANS_DEF Deferred bus transactions. This event counts the number of deferred transactions. 6EH See Table 18-71 and Table 18-72. BUS_TRANS_BURST Burst (full cache-line) bus transactions. This event counts burst (full cache line) transactions including: 19-206 Vol. 3B - A store operation hits a shared line in the L2 cache. - A full cache line write misses the L2 cache or hits a shared line in the L2 cache. - Burst reads. - RFOs. - Explicit writebacks. - Write combine lines. PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment 6FH See Table 18-71 and Table 18-72. BUS_TRANS_MEM Memory bus transactions. This event counts all memory bus transactions including: See Table 18-71 and Table 18-72. BUS_TRANS_ANY See Table 18-71 and Table 18-74. EXT_SNOOP See Table 18-72. BUS_HIT_DRV 7BH See Table 18-72. BUS_HITM_DRV HITM signal asserted. This event counts the number of bus cycles during which the processor drives the HITM# pin to signal HITM snoop response. NOTE: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. 7DH See Table 18-71. BUSQ_EMPTY Bus queue is empty. This event counts the number of cycles during which the core did not have any pending transactions in the bus queue. See Table 18-71 and Table 18-72. SNOOP_STALL_DRV 7FH See Table 18-71. BUS_IO_WAIT IO requests waiting in the bus queue. This event counts the number of core cycles during which IO requests wait in the bus queue. This event counts IO requests from the core. 80H 03H ICACHE.ACCESSES Instruction fetches. This event counts all instruction fetches, including uncacheable fetches. 80H 02H ICACHE.MISSES Icache miss. This event counts all instruction fetches that miss the Instruction cache or produce memory requests. This includes uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding. 82H 04H ITLB.FLUSH ITLB flushes. This event counts the number of ITLB flushes. 82H 02H ITLB.MISSES ITLB misses. This event counts the number of instruction fetches that miss the ITLB. 70H 77H 7AH 7EH - Burst transactions. - Partial reads and writes. - Invalidate transactions. The BUS_TRANS_MEM count is the sum of BUS_TRANS_BURST, BUS_TRANS_P and BUS_TRANS_INVAL. All bus transactions. This event counts all bus transactions. This includes: - Memory transactions. - IO transactions (non memory-mapped). - Deferred transaction completion. - Other less frequent transactions, such as interrupts. External snoops. This event counts the snoop responses to bus transactions. Responses can be counted separately by type and by bus agent. Note: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. HIT signal asserted. This event counts the number of bus cycles during which the processor drives the HIT# pin to signal HIT snoop response. Note: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. Note: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. Bus stalled for snoops. This event counts the number of times that the bus snoop stall signal is asserted. During the snoop stall cycles no new bus transactions requiring a snoop response can be initiated on the bus. Note: This event is thread-independent and will not provide a count per logical processor when AnyThr is disabled. Vol. 3B 19-207 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value AAH 02H MACRO_INSTS.CISC_DE CISC macro instructions This event counts the number of complex instructions decoded, CODED decoded. but not necessarily executed or retired. Only one complex instruction can be decoded at a time. AAH 03H MACRO_INSTS.ALL_DE All Instructions CODED decoded. This event counts the number of instructions decoded. B0H 00H SIMD_UOPS_EXEC.S SIMD micro-ops executed (excluding stores). This event counts all the SIMD micro-ops executed. This event does not count MOVQ and MOVD stores from register to memory. B0H 80H SIMD_UOPS_EXEC.AR SIMD micro-ops retired This event counts the number of SIMD saturated arithmetic (excluding stores). micro-ops executed. B1H 00H SIMD_SAT_UOP_EXEC. SIMD saturated S arithmetic micro-ops executed. This event counts the number of SIMD saturated arithmetic micro-ops executed. B1H 80H SIMD_SAT_UOP_EXEC. SIMD saturated AR arithmetic micro-ops retired. This event counts the number of SIMD saturated arithmetic micro-ops retired. B3H 01H SIMD_UOP_TYPE_EXE C.MUL.S SIMD packed multiply micro-ops executed. This event counts the number of SIMD packed multiply microops executed. B3H 81H SIMD_UOP_TYPE_EXE C.MUL.AR SIMD packed multiply micro-ops retired. This event counts the number of SIMD packed multiply microops retired. B3H 02H SIMD_UOP_TYPE_EXE C.SHIFT.S SIMD packed shift micro-ops executed. This event counts the number of SIMD packed shift micro-ops executed. B3H 82H SIMD_UOP_TYPE_EXE C.SHIFT.AR SIMD packed shift micro-ops retired. This event counts the number of SIMD packed shift micro-ops retired. B3H 04H SIMD_UOP_TYPE_EXE C.PACK.S SIMD pack micro-ops executed. This event counts the number of SIMD pack micro-ops executed. B3H 84H SIMD_UOP_TYPE_EXE C.PACK.AR SIMD pack micro-ops retired. This event counts the number of SIMD pack micro-ops retired. B3H 08H SIMD_UOP_TYPE_EXE C.UNPACK.S SIMD unpack micro-ops This event counts the number of SIMD unpack micro-ops executed. executed. B3H 88H SIMD_UOP_TYPE_EXE C.UNPACK.AR SIMD unpack micro-ops This event counts the number of SIMD unpack micro-ops retired. retired. B3H 10H SIMD_UOP_TYPE_EXE C.LOGICAL.S SIMD packed logical micro-ops executed. This event counts the number of SIMD packed logical micro-ops executed. B3H 90H SIMD_UOP_TYPE_EXE C.LOGICAL.AR SIMD packed logical micro-ops retired. This event counts the number of SIMD packed logical micro-ops retired. B3H 20H SIMD_UOP_TYPE_EXE C.ARITHMETIC.S SIMD packed arithmetic This event counts the number of SIMD packed arithmetic micromicro-ops executed. ops executed. B3H A0H SIMD_UOP_TYPE_EXE C.ARITHMETIC.AR SIMD packed arithmetic This event counts the number of SIMD packed arithmetic micromicro-ops retired. ops retired. C0H 00H INST_RETIRED.ANY_P Instructions retired (precise event). 19-208 Vol. 3B Event Name Definition Description and Comment This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment N/A 00H INST_RETIRED.ANY Instructions retired. This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. C2H 10H UOPS_RETIRED.ANY Micro-ops retired. This event counts the number of micro-ops retired. The processor decodes complex macro instructions into a sequence of simpler micro-ops. Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. In some cases micro-op sequences are fused or whole instructions are fused into one micro-op. See other UOPS_RETIRED events for differentiating retired fused and non-fused micro-ops. C3H 01H MACHINE_CLEARS.SMC Self-Modifying Code detected. This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel® architecture processors. C4H 00H BR_INST_RETIRED.AN Y This event counts the number of branch instructions retired. Retired branch instructions. This is an architectural performance event. C4H 01H BR_INST_RETIRED.PRE Retired branch D_NOT_TAKEN instructions that were predicted not-taken. This event counts the number of branch instructions retired that were correctly predicted to be not-taken. C4H 02H BR_INST_RETIRED.MIS Retired branch PRED_NOT_TAKEN instructions that were mispredicted nottaken. This event counts the number of branch instructions retired that were mispredicted and not-taken. C4H 04H BR_INST_RETIRED.PRE Retired branch D_TAKEN instructions that were predicted taken. This event counts the number of branch instructions retired that were correctly predicted to be taken. C4H 08H BR_INST_RETIRED.MIS Retired branch PRED_TAKEN instructions that were mispredicted taken. This event counts the number of branch instructions retired that were mispredicted and taken. C4H 0AH BR_INST_RETIRED.MIS Retired mispredicted PRED branch instructions (precise event). This event counts the number of retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. Mispredicted branches degrade the performance because the processor starts executing instructions along a wrong path it predicts. When the misprediction is discovered, all the instructions executed in the wrong path must be discarded, and the processor must start again on the correct path. Using the Profile-Guided Optimization (PGO) features of the Intel® C++ compiler may help reduce branch mispredictions. See the compiler documentation for more information on this feature. Vol. 3B 19-209 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value Event Name Definition Description and Comment To determine the branch misprediction ratio, divide the BR_INST_RETIRED.MISPRED event count by the number of BR_INST_RETIRED.ANY event count. To determine the number of mispredicted branches per instruction, divide the number of mispredicted branches by the INST_RETIRED.ANY event count. To measure the impact of the branch mispredictions use the event RESOURCE_STALLS.BR_MISS_CLEAR. Tips: - See the optimization guide for tips on reducing branch mispredictions. - PGO's purpose is to have straight line code for the most frequent execution paths, reducing branches taken and increasing the “basic block” size, possibly also reducing the code footprint or working-set. C4H 0CH BR_INST_RETIRED.TAK Retired taken branch EN instructions. This event counts the number of branches retired that were taken. C4H 0FH BR_INST_RETIRED.AN Y1 This event counts the number of branch instructions retired that were mispredicted. This event is a duplicate of BR_INST_RETIRED.MISPRED. C5H 00H BR_INST_RETIRED.MIS Retired mispredicted PRED branch instructions (precise event). Retired branch instructions. This event counts the number of retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. Mispredicted branches degrade the performance because the processor starts executing instructions along a wrong path it predicts. When the misprediction is discovered, all the instructions executed in the wrong path must be discarded, and the processor must start again on the correct path. Using the Profile-Guided Optimization (PGO) features of the Intel® C++ compiler may help reduce branch mispredictions. See the compiler documentation for more information on this feature. To determine the branch misprediction ratio, divide the BR_INST_RETIRED.MISPRED event count by the number of BR_INST_RETIRED.ANY event count. To determine the number of mispredicted branches per instruction, divide the number of mispredicted branches by the INST_RETIRED.ANY event count. To measure the impact of the branch mispredictions use the event RESOURCE_STALLS.BR_MISS_CLEAR. Tips: - See the optimization guide for tips on reducing branch mispredictions. - PGO's purpose is to have straight line code for the most frequent execution paths, reducing branches taken and increasing the “basic block” size, possibly also reducing the code footprint or working-set. C6H 01H CYCLES_INT_MASKED. CYCLES_INT_MASKED Cycles during which This event counts the number of cycles during which interrupts interrupts are disabled. are disabled. C6H 02H CYCLES_INT_MASKED. CYCLES_INT_PENDING _AND_MASKED Cycles during which interrupts are pending and disabled. 19-210 Vol. 3B This event counts the number of cycles during which there are pending interrupts but interrupts are disabled. PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value C7H 01H SIMD_INST_RETIRED.P Retired Streaming ACKED_SINGLE SIMD Extensions (SSE) packed-single instructions. This event counts the number of SSE packed-single instructions retired. C7H 02H SIMD_INST_RETIRED.S Retired Streaming CALAR_SINGLE SIMD Extensions (SSE) scalar-single instructions. This event counts the number of SSE scalar-single instructions retired. C7H 04H SIMD_INST_RETIRED.P Retired Streaming ACKED_DOUBLE SIMD Extensions 2 (SSE2) packed-double instructions. This event counts the number of SSE2 packed-double instructions retired. C7H 08H SIMD_INST_RETIRED.S Retired Streaming CALAR_DOUBLE SIMD Extensions 2 (SSE2) scalar-double instructions. This event counts the number of SSE2 scalar-double instructions retired. C7H 10H SIMD_INST_RETIRED.V Retired Streaming ECTOR SIMD Extensions 2 (SSE2) vector instructions. This event counts the number of SSE2 vector instructions retired. C7H 1FH SIMD_INST_RETIRED.A Retired Streaming NY SIMD instructions. This event counts the overall number of SIMD instructions retired. To count each type of SIMD instruction separately, use the following events: Event Name Definition Description and Comment SIMD_INST_RETIRED.PACKED_SINGLE SIMD_INST_RETIRED.SCALAR_SINGLE SIMD_INST_RETIRED.PACKED_DOUBLE SIMD_INST_RETIRED.SCALAR_DOUBLE SIMD_INST_RETIRED.VECTOR. C8H 00H HW_INT_RCV Hardware interrupts received. This event counts the number of hardware interrupts received by the processor. This event will count twice for dual-pipe micro-ops. CAH 01H SIMD_COMP_INST_RET Retired computational IRED.PACKED_SINGLE Streaming SIMD Extensions (SSE) packed-single instructions. This event counts the number of computational SSE packedsingle instructions retired. Computational instructions perform arithmetic computations, like add, multiply and divide. Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CAH 02H SIMD_COMP_INST_RET Retired computational IRED.SCALAR_SINGLE Streaming SIMD Extensions (SSE) scalar-single instructions. This event counts the number of computational SSE scalarsingle instructions retired. Computational instructions perform arithmetic computations, like add, multiply and divide. Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CAH 04H SIMD_COMP_INST_RET Retired computational IRED.PACKED_DOUBLE Streaming SIMD Extensions 2 (SSE2) packed-double instructions. This event counts the number of computational SSE2 packeddouble instructions retired. Computational instructions perform arithmetic computations, like add, multiply and divide. Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. Vol. 3B 19-211 PERFORMANCE MONITORING EVENTS Table 19-31. Performance Events for 45 nm, 32 nm Intel® Atom™ Processors (Contd.) Event Num. Umask Value CAH 08H SIMD_COMP_INST_RET Retired computational IRED.SCALAR_DOUBLE Streaming SIMD Extensions 2 (SSE2) scalar-double instructions. This event counts the number of computational SSE2 scalardouble instructions retired. Computational instructions perform arithmetic computations, like add, multiply and divide. Instructions that perform load and store operations or logical operations, like XOR, OR, and AND are not counted by this event. CBH 01H MEM_LOAD_RETIRED.L Retired loads that hit 2_HIT the L2 cache (precise event). This event counts the number of retired load operations that missed the L1 data cache and hit the L2 cache. CBH 02H MEM_LOAD_RETIRED.L Retired loads that miss This event counts the number of retired load operations that 2_MISS the L2 cache (precise missed the L2 cache. event). CBH 04H MEM_LOAD_RETIRED.D Retired loads that miss This event counts the number of retired loads that missed the DTLB. The DTLB miss is not counted if the load operation causes TLB_MISS the DTLB (precise a fault. event). CDH 00H SIMD_ASSIST Event Name Definition SIMD assists invoked. Description and Comment This event counts the number of SIMD assists invoked. SIMD assists are invoked when an EMMS instruction is executed after MMX™ technology code has changed the MMX state in the floating point stack. For example, these assists are required in the following cases. Streaming SIMD Extensions (SSE) instructions: 1. Denormal input when the DAZ (Denormals Are Zeros) flag is off. 2. Underflow result when the FTZ (Flush To Zero) flag is off. CEH 00H SIMD_INSTR_RETIRED CFH 00H SIMD_SAT_INSTR_RETI Saturated arithmetic RED instructions retired. This event counts the number of saturated arithmetic SIMD instructions that retired. E0H 01H BR_INST_DECODED Branch instructions decoded. This event counts the number of branch instructions decoded. E4H 01H BOGUS_BR Bogus branches. This event counts the number of byte sequences that were mistakenly detected as taken branch instructions. This results in a BACLEAR event and the BTB is flushed. This occurs mainly after task switches. E6H 01H BACLEARS.ANY BACLEARS asserted. This event counts the number of times the front end is redirected for a branch prediction, mainly when an early branch prediction is corrected by other branch handling mechanisms in the front end. This can occur if the code has many branches such that they cannot be consumed by the branch predictor. Each Baclear asserted costs approximately 7 cycles. The effect on total execution time depends on the surrounding code. 19-212 Vol. 3B SIMD Instructions retired. This event counts the number of SIMD instructions that retired. PERFORMANCE MONITORING EVENTS 19.18 PERFORMANCE MONITORING EVENTS FOR INTEL® CORE™ SOLO AND INTEL® CORE™ DUO PROCESSORS Table 19-32 lists model-specific performance events for Intel® Core™ Duo processors. If a model-specific event requires qualification in core specificity, it is indicated in the comment column. Table 19-32 also applies to Intel® Core™ Solo processors; bits in the unit mask corresponding to core-specificity are reserved and should be 00B. Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors Event Num. Event Mask Mnemonic Umask Value 03H LD_Blocks 00H Description Comment Load operations delayed due to store buffer blocks. The preceding store may be blocked due to unknown address, unknown data, or conflict due to partial overlap between the load and store. 04H SD_Drains 00H Cycles while draining store buffers. 05H Misalign_Mem_Ref 00H Misaligned data memory references (MOB splits of loads and stores). 06H Seg_Reg_Loads 00H Segment register loads. 07H SSE_PrefNta_Ret 00H SSE software prefetch instruction PREFETCHNTA retired. 07H SSE_PrefT1_Ret 01H SSE software prefetch instruction PREFETCHT1 retired. 07H SSE_PrefT2_Ret 02H SSE software prefetch instruction PREFETCHT2 retired. 07H SSE_NTStores_Ret 03H SSE streaming store instruction retired. 10H FP_Comps_Op_Exe 00H FP computational Instruction executed. FADD, FSUB, FCOM, FMULs, MUL, IMUL, FDIVs, DIV, IDIV, FPREMs, FSQRT are included; but exclude FADD or FMUL used in the middle of a transcendental instruction. 11H FP_Assist 00H FP exceptions experienced microcode assists. IA32_PMC1 only. 12H Mul 00H Multiply operations (a speculative count, including FP and integer multiplies). IA32_PMC1 only. 13H Div 00H Divide operations (a speculative count, including FP IA32_PMC1 only. and integer divisions). 14H Cycles_Div_Busy 00H Cycles the divider is busy. IA32_PMC0 only. 21H L2_ADS 00H L2 Address strobes. Requires corespecificity. 22H Dbus_Busy 00H Core cycle during which data bus was busy (increments by 4). Requires corespecificity. 23H Dbus_Busy_Rd 00H Cycles data bus is busy transferring data to a core (increments by 4). Requires corespecificity. 24H L2_Lines_In 00H L2 cache lines allocated. Requires core-specificity and HW prefetch qualification. 25H L2_M_Lines_In 00H L2 Modified-state cache lines allocated. Requires corespecificity. 26H L2_Lines_Out 00H L2 cache lines evicted. 27H L2_M_Lines_Out 00H L2 Modified-state cache lines evicted. Requires core-specificity and HW prefetch qualification. Vol. 3B 19-213 PERFORMANCE MONITORING EVENTS Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors (Contd.) Event Num. Event Mask Mnemonic Umask Value Description Comment 28H L2_IFetch Requires MESI qualification L2 instruction fetches from instruction fetch unit (includes speculative fetches). Requires corespecificity. 29H L2_LD Requires MESI qualification L2 cache reads. Requires corespecificity. 2AH L2_ST Requires MESI qualification L2 cache writes (includes speculation). Requires corespecificity. 2EH L2_Rqsts Requires MESI qualification L2 cache reference requests. 30H L2_Reject_Cycles Requires MESI qualification Cycles L2 is busy and rejecting new requests. Requires corespecificity, HW prefetch qualification. 32H L2_No_Request_ Cycles Requires MESI qualification Cycles there is no request to access L2. 3AH EST_Trans_All 00H Any Intel Enhanced SpeedStep(R) Technology transitions. 3AH EST_Trans_All 10H Intel Enhanced SpeedStep Technology frequency transitions. 3BH Thermal_Trip C0H Duration in a thermal trip based on the current core Use edge trigger to clock. count occurrence. 3CH NonHlt_Ref_Cycles 01H Non-halted bus cycles. 3CH Serial_Execution_ Cycles 02H Non-halted bus cycles of this core executing code while the other core is halted. 40H DCache_Cache_LD Requires MESI qualification L1 cacheable data read operations. 41H DCache_Cache_ST Requires MESI qualification L1 cacheable data write operations. 42H DCache_Cache_ Lock Requires MESI qualification L1 cacheable lock read operations to invalid state. 43H Data_Mem_Ref 01H L1 data read and writes of cacheable and noncacheable types. 44H Data_Mem_Cache_ Ref 02H L1 data cacheable read and write operations. 45H DCache_Repl 0FH L1 data cache line replacements. 46H DCache_M_Repl 00H L1 data M-state cache line allocated. 47H DCache_M_Evict 00H L1 data M-state cache line evicted. 48H DCache_Pend_Miss 00H Weighted cycles of L1 miss outstanding. 49H Dtlb_Miss 00H Data references that missed TLB. 4BH SSE_PrefNta_Miss 00H PREFETCHNTA missed all caches. 4BH SSE_PrefT1_Miss 01H PREFETCHT1 missed all caches. 4BH SSE_PrefT2_Miss 02H PREFETCHT2 missed all caches. 4BH SSE_NTStores_ Miss 03H SSE streaming store instruction missed all caches. 4FH L1_Pref_Req 00H L1 prefetch requests due to DCU cache misses. 19-214 Vol. 3B Use Cmask =1 to count duration. May overcount if request re-submitted. PERFORMANCE MONITORING EVENTS Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors (Contd.) Event Num. Event Mask Mnemonic Umask Value 60H Bus_Req_ Outstanding 00; Requires coreWeighted cycles of cacheable bus data read specificity, and agent requests. This event counts full-line read request specificity from DCU or HW prefetcher, but not RFO, write, instruction fetches, or others. Description Comment Use Cmask =1 to count duration. Use Umask bit 12 to include HWP or exclude HWP separately. 61H Bus_BNR_Clocks 00H External bus cycles while BNR asserted. 62H Bus_DRDY_Clocks 00H External bus cycles while DRDY asserted. Requires agent specificity. 63H Bus_Locks_Clocks 00H External bus cycles while bus lock signal asserted. Requires core specificity. 64H Bus_Data_Rcv 40H Number of data chunks received by this processor. 65H Bus_Trans_Brd See comment. Burst read bus transactions (data or code). Requires core specificity. 66H Bus_Trans_RFO See comment. Completed read for ownership (RFO) transactions. 68H Bus_Trans_Ifetch See comment. Completed instruction fetch transactions. Requires agent specificity. 69H Bus_Trans_Inval See comment. Completed invalidate transactions. 6AH Bus_Trans_Pwr See comment. Completed partial write transactions. 6BH Bus_Trans_P See comment. 6CH Bus_Trans_IO See comment. 6DH Bus_Trans_Def 20H Requires core specificity. Each transaction counts Completed partial transactions (include partial read its address strobe. + partial write + line write). Retried transaction may be counted more than Completed I/O transactions (read and write). once. Completed defer transactions. Requires core specificity. Retried transaction may be counted more than once. 67H Bus_Trans_WB C0H Completed writeback transactions from DCU (does not include L2 writebacks). Requires agent specificity. 6EH Bus_Trans_Burst C0H 6FH Bus_Trans_Mem C0H 70H Bus_Trans_Any C0H Completed burst transactions (full line transactions Each transaction counts its address strobe. include reads, write, RFO, and writebacks). Retried transaction may Completed memory transactions. This includes Bus_Trans_Burst + Bus_Trans_P+Bus_Trans_Inval. be counted more than once. Any completed bus transactions. 77H Bus_Snoops 00H Counts any snoop on the bus. Requires MESI qualification. Requires agent specificity. 78H DCU_Snoop_To_ Share 01H DCU snoops to share-state L1 cache line due to L1 misses. Requires core specificity. 7DH Bus_Not_In_Use 00H Number of cycles there is no transaction from the core. Requires core specificity. 7EH Bus_Snoop_Stall 00H Number of bus cycles while bus snoop is stalled. 80H ICache_Reads 00H Number of instruction fetches from ICache, streaming buffers (both cacheable and uncacheable fetches). Vol. 3B 19-215 PERFORMANCE MONITORING EVENTS Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors (Contd.) Event Num. Event Mask Mnemonic Umask Value 81H ICache_Misses 00H Number of instruction fetch misses from ICache, streaming buffers. 85H ITLB_Misses 00H Number of iITLB misses. 86H IFU_Mem_Stall 00H Cycles IFU is stalled while waiting for data from memory. 87H ILD_Stall 00H Number of instruction length decoder stalls (Counts number of LCP stalls). 88H Br_Inst_Exec 00H Branch instruction executed (includes speculation). 89H Br_Missp_Exec 00H Branch instructions executed and mispredicted at execution (includes branches that do not have prediction or mispredicted). 8AH Br_BAC_Missp_ Exec 00H Branch instructions executed that were mispredicted at front end. 8BH Br_Cnd_Exec 00H Conditional branch instructions executed. 8CH Br_Cnd_Missp_ Exec 00H Conditional branch instructions executed that were mispredicted. 8DH Br_Ind_Exec 00H Indirect branch instructions executed. 8EH Br_Ind_Missp_Exec 00H Indirect branch instructions executed that were mispredicted. 8FH Br_Ret_Exec 00H Return branch instructions executed. 90H Br_Ret_Missp_Exec 00H Return branch instructions executed that were mispredicted. 91H Br_Ret_BAC_Missp_ Exec 00H Return branch instructions executed that were mispredicted at the front end. Description 92H Br_Call_Exec 00H Return call instructions executed. 93H Br_Call_Missp_Exec 00H Return call instructions executed that were mispredicted. 94H Br_Ind_Call_Exec 00H Indirect call branch instructions executed. A2H Resource_Stall 00H Cycles while there is a resource related stall (renaming, buffer entries) as seen by allocator. B0H MMX_Instr_Exec 00H Number of MMX instructions executed (does not include MOVQ and MOVD stores). B1H SIMD_Int_Sat_Exec 00H Number of SIMD Integer saturating instructions executed. B3H SIMD_Int_Pmul_ Exec 01H Number of SIMD Integer packed multiply instructions executed. B3H SIMD_Int_Psft_Exec 02H Number of SIMD Integer packed shift instructions executed. B3H SIMD_Int_Pck_Exec 04H Number of SIMD Integer pack operations instruction executed. B3H SIMD_Int_Upck_ Exec 08H Number of SIMD Integer unpack instructions executed. B3H SIMD_Int_Plog_ Exec 10H Number of SIMD Integer packed logical instructions executed. B3H SIMD_Int_Pari_Exec 20H Number of SIMD Integer packed arithmetic instructions executed. 19-216 Vol. 3B Comment PERFORMANCE MONITORING EVENTS Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors (Contd.) Event Num. Event Mask Mnemonic Umask Value C0H Instr_Ret 00H Number of instruction retired (Macro fused instruction count as 2). C1H FP_Comp_Instr_Ret 00H Number of FP compute instructions retired (X87 instruction or instruction that contains X87 operations). C2H Uops_Ret 00H Number of micro-ops retired (include fused uops). C3H SMC_Detected 00H Number of times self-modifying code condition detected. C4H Br_Instr_Ret 00H Number of branch instructions retired. C5H Br_MisPred_Ret 00H Number of mispredicted branch instructions retired. C6H Cycles_Int_Masked 00H Cycles while interrupt is disabled. C7H Cycles_Int_Pedning_ 00H Masked Cycles while interrupt is disabled and interrupts are pending. C8H HW_Int_Rx 00H Number of hardware interrupts received. C9H Br_Taken_Ret 00H Number of taken branch instruction retired. CAH Br_MisPred_Taken_ Ret 00H Number of taken and mispredicted branch instructions retired. CCH MMX_FP_Trans 00H Number of transitions from MMX to X87. CCH FP_MMX_Trans 01H Number of transitions from X87 to MMX. CDH MMX_Assist 00H Number of EMMS executed. CEH MMX_Instr_Ret 00H Number of MMX instruction retired. D0H Instr_Decoded 00H Number of instruction decoded. D7H ESP_Uops 00H Number of ESP folding instruction decoded. D8H SIMD_FP_SP_Ret 00H Number of SSE/SSE2 single precision instructions retired (packed and scalar). D8H SIMD_FP_SP_S_ Ret 01H Number of SSE/SSE2 scalar single precision instructions retired. D8H SIMD_FP_DP_P_ Ret 02H Number of SSE/SSE2 packed double precision instructions retired. D8H SIMD_FP_DP_S_ Ret 03H Number of SSE/SSE2 scalar double precision instructions retired. D8H SIMD_Int_128_Ret 04H Number of SSE2 128 bit integer instructions retired. D9H SIMD_FP_SP_P_ Comp_Ret 00H Number of SSE/SSE2 packed single precision compute instructions retired (does not include AND, OR, XOR). D9H SIMD_FP_SP_S_ Comp_Ret 01H Number of SSE/SSE2 scalar single precision compute instructions retired (does not include AND, OR, XOR). D9H SIMD_FP_DP_P_ Comp_Ret 02H Number of SSE/SSE2 packed double precision compute instructions retired (does not include AND, OR, XOR). D9H SIMD_FP_DP_S_ Comp_Ret 03H Number of SSE/SSE2 scalar double precision compute instructions retired (does not include AND, OR, XOR). Description Comment Use IA32_PMC0 only. Vol. 3B 19-217 PERFORMANCE MONITORING EVENTS Table 19-32. Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors (Contd.) Event Num. Event Mask Mnemonic Umask Value Description DAH Fused_Uops_Ret 00H All fused uops retired. DAH Fused_Ld_Uops_ Ret 01H Fused load uops retired. DAH Fused_St_Uops_Ret 02H Fused store uops retired. DBH Unfusion 00H Number of unfusion events in the ROB (due to exception). E0H Br_Instr_Decoded 00H Branch instructions decoded. E2H BTB_Misses 00H Number of branches the BTB did not produce a prediction. E4H Br_Bogus 00H Number of bogus branches. E6H BAClears 00H Number of BAClears asserted. F0H Pref_Rqsts_Up 00H Number of hardware prefetch requests issued in forward streams. F8H Pref_Rqsts_Dn 00H Number of hardware prefetch requests issued in backward streams. 19.19 Comment PENTIUM® 4 AND INTEL® XEON® PROCESSOR PERFORMANCE MONITORING EVENTS Tables 19-33, 19-34 and 19-35 list performance monitoring events that can be counted or sampled on processors based on Intel NetBurst® microarchitecture. Table 19-33 lists the non-retirement events, and Table 19-34 lists the at-retirement events. Tables 19-36, 19-37, and 19-38 describes three sets of parameters that are available for three of the at-retirement counting events defined in Table 19-34. Table 19-39 shows which of the non-retirement and at retirement events are logical processor specific (TS) (see Section 18.6.4.4, “Performance Monitoring Events”) and which are non-logical processor specific (TI). Some of the Pentium 4 and Intel Xeon processor performance monitoring events may be available only to specific models. The performance monitoring events listed in Tables 19-33 and 19-34 apply to processors with CPUID signature that matches family encoding 15, model encoding 0, 1, 2 3, 4, or 6. Table applies to processors with a CPUID signature that matches family encoding 15, model encoding 3, 4 or 6. The functionality of performance monitoring events in Pentium 4 and Intel Xeon processors is also available when IA-32e mode is enabled. Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting Event Name Event Parameters Parameter Value TC_deliver_mode Description This event counts the duration (in clock cycles) of the operating modes of the trace cache and decode engine in the processor package. The mode is specified by one or more of the event mask bits. ESCR restrictions MSR_TC_ESCR0 MSR_TC_ESCR1 19-218 Vol. 3B Counter numbers per ESCR ESCR0: 4, 5 ESCR Event Select 01H ESCR1: 6, 7 ESCR[31:25] PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ESCR Event Mask Description ESCR[24:9] Bit CCCR Select 0: DD Both logical processors are in deliver mode. 1: DB Logical processor 0 is in deliver mode and logical processor 1 is in build mode. 2: DI Logical processor 0 is in deliver mode and logical processor 1 is either halted, under a machine clear condition or transitioning to a long microcode flow. 3: BD Logical processor 0 is in build mode and logical processor 1 is in deliver mode. 4: BB Both logical processors are in build mode. 5: BI Logical processor 0 is in build mode and logical processor 1 is either halted, under a machine clear condition or transitioning to a long microcode flow. 6: ID Logical processor 0 is either halted, under a machine clear condition or transitioning to a long microcode flow. Logical processor 1 is in deliver mode. 7: IB Logical processor 0 is either halted, under a machine clear condition or transitioning to a long microcode flow. Logical processor 1 is in build mode. 01H CCCR[15:13] Event Specific Notes If only one logical processor is available from a physical processor package, the event mask should be interpreted as logical processor 1 is halted. Event mask bit 2 was previously known as “DELIVER”, bit 5 was previously known as “BUILD”. BPU_fetch_ request This event counts instruction fetch requests of specified request type by the Branch Prediction unit. Specify one or more mask bits to qualify the request type(s). ESCR restrictions MSR_BPU_ESCR0 MSR_BPU_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 03H ESCR1: 2, 3 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 0: TCMISS Trace cache lookup miss 00H CCCR[15:13] Vol. 3B 19-219 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ITLB_reference Description This event counts translations using the Instruction Translation Look-aside Buffer (ITLB). ESCR restrictions MSR_ITLB_ESCR0 MSR_ITLB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 18H ESCR1: 2, 3 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: HIT ITLB hit 1: MISS ITLB miss 2: HIT_UC Uncacheable ITLB hit 03H CCCR[15:13] Event Specific Notes All page references regardless of the page size are looked up as actual 4-KByte pages. Use the page_walk_type event with the ITMISS mask for a more conservative count. memory_cancel This event counts the canceling of various type of request in the Data cache Address Control unit (DAC). Specify one or more mask bits to select the type of requests that are canceled. ESCR restrictions MSR_DAC_ESCR0 MSR_DAC_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 02H ESCR1: 10, 11 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit 2: ST_RB_FULL 3: 64K_CONF CCCR Select 05H Event Specific Notes Replayed because no store request buffer is available. Conflicts due to 64-KByte aliasing. CCCR[15:13] All_CACHE_MISS includes uncacheable memory in count. memory_ complete This event counts the completion of a load split, store split, uncacheable (UC) split, or UC load. Specify one or more mask bits to select the operations to be counted. ESCR restrictions MSR_SAAT_ESCR0 MSR_SAAT_ESCR1 19-220 Vol. 3B Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 08H ESCR1: 10, 11 ESCR[31:25] PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ESCR Event Mask Description ESCR[24:9] Bit 0: LSC Load split completed, excluding UC/WC loads. Any split stores completed. 1: SSC CCCR Select 02H load_port_replay CCCR[15:13] This event counts replayed events at the load port. Specify one or more mask bits to select the cause of the replay. ESCR restrictions MSR_SAAT_ESCR0 MSR_SAAT_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 04H ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 1: SPLIT_LD Split load. 02H CCCR[15:13] Event Specific Notes Must use ESCR1 for at-retirement counting. store_port_replay This event counts replayed events at the store port. Specify one or more mask bits to select the cause of the replay. ESCR restrictions MSR_SAAT_ESCR0 MSR_SAAT_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 05H ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 1: SPLIT_ST Split store 02H CCCR[15:13] Event Specific Notes Must use ESCR1 for at-retirement counting. MOB_load_replay This event triggers if the memory order buffer (MOB) caused a load operation to be replayed. Specify one or more mask bits to select the cause of the replay. ESCR restrictions MSR_MOB_ESCR0 MSR_MOB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 03H ESCR1: 2, 3 ESCR[31:25] Vol. 3B 19-221 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ESCR Event Mask Description ESCR[24:9] Bit 1: NO_STA Replayed because of unknown store address. Replayed because of unknown store data. 3: NO_STD 4: PARTIAL_DATA CCCR Select Replayed because of partially overlapped data access between the load and store operations. 5: UNALGN_ADDR Replayed because the lower 4 bits of the linear address do not match between the load and store operations. 02H CCCR[15:13] page_walk_type This event counts various types of page walks that the page miss handler (PMH) performs. ESCR restrictions MSR_PMH_ ESCR0 MSR_PMH_ ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 01H ESCR1: 2, 3 ESCR Event Mask ESCR[31:25] ESCR[24:9] Bit 0: DTMISS Page walk for a data TLB miss (either load or store). Page walk for an instruction TLB miss. 1: ITMISS CCCR Select 04H BSQ_cache _reference CCCR[15:13] This event counts cache references (2nd level cache or 3rd level cache) as seen by the bus unit. Specify one or more mask bit to select an access according to the access type (read type includes both load and RFO, write type includes writebacks and evictions) and the access result (hit, misses). ESCR restrictions MSR_BSU_ ESCR0 MSR_BSU_ ESCR1 19-222 Vol. 3B Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 0CH ESCR1: 2, 3 ESCR[31:25] PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description ESCR[24:9] Bit 0: RD_2ndL_HITS Read 2nd level cache hit Shared (includes load and RFO). 1: RD_2ndL_HITE Read 2nd level cache hit Exclusive (includes load and RFO). 2: RD_2ndL_HITM Read 2nd level cache hit Modified (includes load and RFO). 3: RD_3rdL_HITS Read 3rd level cache hit Shared (includes load and RFO). 4: RD_3rdL_HITE Read 3rd level cache hit Exclusive (includes load and RFO). 5: RD_3rdL_HITM ESCR Event Mask CCCR Select Event Specific Notes Read 3rd level cache hit Modified (includes load and RFO). 8: RD_2ndL_MISS Read 2nd level cache miss (includes load and RFO). 9: RD_3rdL_MISS Read 3rd level cache miss (includes load and RFO). 10: WR_2ndL_MISS A Writeback lookup from DAC misses the 2nd level cache (unlikely to happen). 07H CCCR[15:13] 1: The implementation of this event in current Pentium 4 and Xeon processors treats either a load operation or a request for ownership (RFO) request as a “read” type operation. 2: Currently this event causes both over and undercounting by as much as a factor of two due to an erratum. 3: It is possible for a transaction that is started as a prefetch to change the transaction's internal status, making it no longer a prefetch. or change the access result status (hit, miss) as seen by this event. IOQ_allocation This event counts the various types of transactions on the bus. A count is generated each time a transaction is allocated into the IOQ that matches the specified mask bits. An allocated entry can be a sector (64 bytes) or a chunks of 8 bytes. Requests are counted once per retry. The event mask bits constitute 4 bit fields. A transaction type is specified by interpreting the values of each bit field. Specify one or more event mask bits in a bit field to select the value of the bit field. Each field (bits 0-4 are one field) are independent of and can be ORed with the others. The request type field is further combined with bit 5 and 6 to form a binary expression. Bits 7 and 8 form a bit field to specify the memory type of the target address. Bits 13 and 14 form a bit field to specify the source agent of the request. Bit 15 affects read operation only. The event is triggered by evaluating the logical expression: (((Request type) OR Bit 5 OR Bit 6) OR (Memory type)) AND (Source agent). Vol. 3B 19-223 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ESCR restrictions MSR_FSB_ESCR0, MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1; ESCR Event Select 03H Description ESCR1: 2, 3 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bits CCCR Select Event Specific Notes 0-4 (single field) Bus request type (use 00001 for invalid or default). 5: ALL_READ Count read entries. 6: ALL_WRITE Count write entries. 7: MEM_UC Count UC memory access entries. 8: MEM_WC Count WC memory access entries. 9: MEM_WT Count write-through (WT) memory access entries. 10: MEM_WP Count write-protected (WP) memory access entries. 11: MEM_WB Count WB memory access entries. 13: OWN Count all store requests driven by processor, as opposed to other processor or DMA. 14: OTHER Count all requests driven by other processors or DMA. 15: PREFETCH Include HW and SW prefetch requests in the count. 06H CCCR[15:13] 1: If PREFETCH bit is cleared, sectors fetched using prefetch are excluded in the counts. If PREFETCH bit is set, all sectors or chunks read are counted. 2: Specify the edge trigger in CCCR to avoid double counting. 3: The mapping of interpreted bit field values to transaction types may differ with different processor model implementations of the Pentium 4 processor family. Applications that program performance monitoring events should use CPUID to determine processor models when using this event. The logic equations that trigger the event are model-specific (see 4a and 4b below). 4a:For Pentium 4 and Xeon Processors starting with CPUID Model field encoding equal to 2 or greater, this event is triggered by evaluating the logical expression ((Request type) and (Bit 5 or Bit 6) and (Memory type) and (Source agent)). 4b:For Pentium 4 and Xeon Processors with CPUID Model field encoding less than 2, this event is triggered by evaluating the logical expression [((Request type) or Bit 5 or Bit 6) or (Memory type)] and (Source agent). Note that event mask bits for memory type are ignored if either ALL_READ or ALL_WRITE is specified. 5: This event is known to ignore CPL in early implementations of Pentium 4 and Xeon Processors. Both user requests and OS requests are included in the count. This behavior is fixed starting with Pentium 4 and Xeon Processors with CPUID signature F27H (Family 15, Model 2, Stepping 7). 19-224 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description 6: For write-through (WT) and write-protected (WP) memory types, this event counts reads as the number of 64-byte sectors. Writes are counted by individual chunks. 7: For uncacheable (UC) memory types, this event counts the number of 8-byte chunks allocated. 8: For Pentium 4 and Xeon Processors with CPUID Signature less than F27H, only MSR_FSB_ESCR0 is available. IOQ_active_ entries This event counts the number of entries (clipped at 15) in the IOQ that are active. An allocated entry can be a sector (64 bytes) or a chunks of 8 bytes. The event must be programmed in conjunction with IOQ_allocation. Specify one or more event mask bits to select the transactions that is counted. ESCR restrictions MSR_FSB_ESCR1 Counter numbers per ESCR ESCR1: 2, 3 ESCR Event Select 01AH ESCR[30:25] ESCR[24:9] ESCR Event Mask Bits CCCR Select Event Specific Notes 0-4 (single field) Bus request type (use 00001 for invalid or default). 5: ALL_READ Count read entries. 6: ALL_WRITE Count write entries. 7: MEM_UC Count UC memory access entries. 8: MEM_WC Count WC memory access entries. 9: MEM_WT Count write-through (WT) memory access entries. 10: MEM_WP Count write-protected (WP) memory access entries. 11: MEM_WB Count WB memory access entries. 13: OWN Count all store requests driven by processor, as opposed to other processor or DMA. 14: OTHER Count all requests driven by other processors or DMA. 15: PREFETCH Include HW and SW prefetch requests in the count. 06H CCCR[15:13] 1: Specified desired mask bits in ESCR0 and ESCR1. 2: See the ioq_allocation event for descriptions of the mask bits. 3: Edge triggering should not be used when counting cycles. 4: The mapping of interpreted bit field values to transaction types may differ across different processor model implementations of the Pentium 4 processor family. Applications that programs performance monitoring events should use the CPUID instruction to detect processor models when using this event. The logical expression that triggers this event as describe below: 5a:For Pentium 4 and Xeon Processors starting with CPUID MODEL field encoding equal to 2 or greater, this event is triggered by evaluating the logical expression ((Request type) and (Bit 5 or Bit 6) and (Memory type) and (Source agent)). Vol. 3B 19-225 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description 5b:For Pentium 4 and Xeon Processors starting with CPUID MODEL field encoding less than 2, this event is triggered by evaluating the logical expression [((Request type) or Bit 5 or Bit 6) or (Memory type)] and (Source agent). Event mask bits for memory type are ignored if either ALL_READ or ALL_WRITE is specified. 5c: This event is known to ignore CPL in the current implementations of Pentium 4 and Xeon Processors Both user requests and OS requests are included in the count. 6: An allocated entry can be a full line (64 bytes) or in individual chunks of 8 bytes. FSB_data_ activity This event increments once for each DRDY or DBSY event that occurs on the front side bus. The event allows selection of a specific DRDY or DBSY event. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 17H ESCR1: 2, 3 ESCR Event Mask ESCR[31:25] ESCR[24:9] Bit 0: DRDY_DRV Count when this processor drives data onto the bus - includes writes and implicit writebacks. Asserted two processor clock cycles for partial writes and 4 processor clocks (usually in consecutive bus clocks) for full line writes. 1: DRDY_OWN Count when this processor reads data from the bus - includes loads and some PIC transactions. Asserted two processor clock cycles for partial reads and 4 processor clocks (usually in consecutive bus clocks) for full line reads. Count DRDY events that we drive. Count DRDY events sampled that we own. 2: DRDY_OTHER Count when data is on the bus but not being sampled by the processor. It may or may not be being driven by this processor. Asserted two processor clock cycles for partial transactions and 4 processor clocks (usually in consecutive bus clocks) for full line transactions. 3: DBSY_DRV Count when this processor reserves the bus for use in the next bus cycle in order to drive data. Asserted for two processor clock cycles for full line writes and not at all for partial line writes. May be asserted multiple times (in consecutive bus clocks) if we stall the bus waiting for a cache lock to complete. 4: DBSY_OWN Count when some agent reserves the bus for use in the next bus cycle to drive data that this processor will sample. Asserted for two processor clock cycles for full line writes and not at all for partial line writes. May be asserted multiple times (all one bus clock apart) if we stall the bus for some reason. 19-226 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description 5:DBSY_OTHER Count when some agent reserves the bus for use in the next bus cycle to drive data that this processor will NOT sample. It may or may not be being driven by this processor. Asserted two processor clock cycles for partial transactions and 4 processor clocks (usually in consecutive bus clocks) for full line transactions. CCCR Select 06H Event Specific Notes CCCR[15:13] Specify edge trigger in the CCCR MSR to avoid double counting. DRDY_OWN and DRDY_OTHER are mutually exclusive; similarly for DBSY_OWN and DBSY_OTHER. BSQ_allocation This event counts allocations in the Bus Sequence Unit (BSQ) according to the specified mask bit encoding. The event mask bits consist of four sub-groups: • • • • Request type. Request length. Memory type. Sub-group consisting mostly of independent bits (bits 5, 6, 7, 8, 9, and 10). Specify an encoding for each sub-group. ESCR restrictions MSR_BSU_ESCR0 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 05H ESCR[31:25] ESCR Event Mask Bit ESCR[24:9] 0: REQ_TYPE0 1: REQ_TYPE1 Request type encoding (bit 0 and 1) are: 2: REQ_LEN0 3: REQ_LEN1 Request length encoding (bit 2, 3) are: 5: REQ_IO_TYPE Request type is input or output. 6: REQ_LOCK_ TYPE Request type is bus lock. 7: REQ_CACHE_ TYPE Request type is cacheable. 8: REQ_SPLIT_ TYPE 9: REQ_DEM_TYPE Request type is a bus 8-byte chunk split across 8-byte boundary. 10: REQ_ORD_ TYPE Request is an ordered type. 0 – Read (excludes read invalidate). 1 – Read invalidate. 2 – Write (other than writebacks). 3 – Writeback (evicted from cache). (public) 0 – 0 chunks 1 – 1 chunks 3 – 8 chunks Request type is a demand if set. Request type is HW.SW prefetch if 0. Vol. 3B 19-227 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters CCCR Select Parameter Value Description 11: MEM_TYPE0 12: MEM_TYPE1 13: MEM_TYPE2 Memory type encodings (bit 11-13) are: 07H CCCR[15:13] Event Specific Notes 0 – UC 1 – WC 4 – WT 5 – WP 6 – WB 1: Specify edge trigger in CCCR to avoid double counting. 2: A writebacks to 3rd level cache from 2nd level cache counts as a separate entry, this is in additional to the entry allocated for a request to the bus. 3: A read request to WB memory type results in a request to the 64-byte sector, containing the target address, followed by a prefetch request to an adjacent sector. 4: For Pentium 4 and Xeon processors with CPUID model encoding value equals to 0 and 1, an allocated BSQ entry includes both the demand sector and prefetched 2nd sector. 5: An allocated BSQ entry for a data chunk is any request less than 64 bytes. 6a:This event may undercount for requests of split type transactions if the data address straddled across modulo-64 byte boundary. 6b:This event may undercount for requests of read request of 16-byte operands from WC or UC address. 6c: This event may undercount WC partial requests originated from store operands that are dwords. bsq_active_ entries This event represents the number of BSQ entries (clipped at 15) currently active (valid) which meet the subevent mask criteria during allocation in the BSQ. Active request entries are allocated on the BSQ until de-allocated. De-allocation of an entry does not necessarily imply the request is filled. This event must be programmed in conjunction with BSQ_allocation. Specify one or more event mask bits to select the transactions that is counted. ESCR restrictions ESCR1 Counter numbers per ESCR ESCR1: 2, 3 ESCR Event Select 06H ESCR Event Mask CCCR Select Event Specific Notes ESCR[30:25] ESCR[24:9] 07H CCCR[15:13] 1: Specified desired mask bits in ESCR0 and ESCR1. 2: See the BSQ_allocation event for descriptions of the mask bits. 3: Edge triggering should not be used when counting cycles. 4: This event can be used to estimate the latency of a transaction from allocation to de-allocation in the BSQ. The latency observed by BSQ_allocation includes the latency of FSB, plus additional overhead. 19-228 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description 5: Additional overhead may include the time it takes to issue two requests (the sector by demand and the adjacent sector via prefetch). Since adjacent sector prefetches have lower priority that demand fetches, on a heavily used system there is a high probability that the adjacent sector prefetch will have to wait until the next bus arbitration. 6: For Pentium 4 and Xeon processors with CPUID model encoding value less than 3, this event is updated every clock. 7: For Pentium 4 and Xeon processors with CPUID model encoding value equals to 3 or 4, this event is updated every other clock. SSE_input_assist This event counts the number of times an assist is requested to handle problems with input operands for SSE/SSE2/SSE3 operations; most notably denormal source operands when the DAZ bit is not set. Set bit 15 of the event mask to use this event. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 34H ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] 15: ALL Count assists for SSE/SSE2/SSE3 μops. 01H CCCR[15:13] Event Specific Notes 1: Not all requests for assists are actually taken. This event is known to overcount in that it counts requests for assists from instructions on the non-retired path that do not incur a performance penalty. An assist is actually taken only for nonbogus μops. Any appreciable counts for this event are an indication that the DAZ or FTZ bit should be set and/or the source code should be changed to eliminate the condition. 2: Two common situations for an SSE/SSE2/SSE3 operation needing an assist are: (1) when a denormal constant is used as an input and the Denormals-Are-Zero (DAZ) mode is not set, (2) when the input operand uses the underflowed result of a previous SSE/SSE2/SSE3 operation and neither the DAZ nor Flush-To-Zero (FTZ) modes are set. 3: Enabling the DAZ mode prevents SSE/SSE2/SSE3 operations from needing assists in the first situation. Enabling the FTZ mode prevents SSE/SSE2/SSE3 operations from needing assists in the second situation. packed_SP_uop This event increments for each packed single-precision μop, specified through the event mask for detection. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 08H ESCR1: 10, 11 ESCR[31:25] Vol. 3B 19-229 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value ESCR Event Mask CCCR Select Description ESCR[24:9] Bit 15: ALL Count all μops operating on packed single-precision operands. 01H CCCR[15:13] Event Specific Notes 1: If an instruction contains more than one packed SP μops, each packed SP μop that is specified by the event mask will be counted. 2: This metric counts instances of packed memory μops in a repeat move string. packed_DP_uop This event increments for each packed double-precision μop, specified through the event mask for detection. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 0CH ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 15: ALL Count all μops operating on packed double-precision operands. 01H CCCR[15:13] Event Specific Notes If an instruction contains more than one packed DP μops, each packed DP μop that is specified by the event mask will be counted. scalar_SP_uop This event increments for each scalar single-precision μop, specified through the event mask for detection. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 0AH ESCR1: 10, 11 ESCR Event Mask ESCR[24:9] Bit 15: ALL CCCR Select ESCR[31:25] 01H Event Specific Notes Count all μops operating on scalar single-precision operands. CCCR[15:13] If an instruction contains more than one scalar SP μops, each scalar SP μop that is specified by the event mask will be counted. scalar_DP_uop This event increments for each scalar double-precision μop, specified through the event mask for detection. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 0EH ESCR1: 10, 11 ESCR Event Mask CCCR Select 19-230 Vol. 3B ESCR[31:25] ESCR[24:9] Bit 15: ALL Count all μops operating on scalar double-precision operands. 01H CCCR[15:13] PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Event Specific Notes Description If an instruction contains more than one scalar DP μops, each scalar DP μop that is specified by the event mask is counted. 64bit_MMX_uop This event increments for each MMX instruction, which operate on 64-bit SIMD operands. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 02H ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 15: ALL Count all μops operating on 64- bit SIMD integer operands in memory or MMX registers. 01H CCCR[15:13] Event Specific Notes If an instruction contains more than one 64-bit MMX μops, each 64bit MMX μop that is specified by the event mask will be counted. 128bit_MMX_uop This event increments for each integer SIMD SSE2 instruction, which operate on 128-bit SIMD operands. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 1AH ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 15: ALL Count all μops operating on 128-bit SIMD integer operands in memory or XMM registers. 01H CCCR[15:13] Event Specific Notes If an instruction contains more than one 128-bit MMX μops, each 128-bit MMX μop that is specified by the event mask will be counted. x87_FP_uop This event increments for each x87 floating-point μop, specified through the event mask for detection. ESCR restrictions MSR_FIRM_ESCR0 MSR_FIRM_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 04H ESCR1: 10, 11 ESCR Event Mask CCCR Select ESCR[31:25] ESCR[24:9] Bit 15: ALL Count all x87 FP μops. 01H CCCR[15:13] Vol. 3B 19-231 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Event Specific Notes Description 1: If an instruction contains more than one x87 FP μops, each x87 FP μop that is specified by the event mask will be counted. 2: This event does not count x87 FP μop for load, store, move between registers. TC_misc This event counts miscellaneous events detected by the TC. The counter will count twice for each occurrence. ESCR restrictions MSR_TC_ESCR0 MSR_TC_ESCR1 Counter numbers per ESCR ESCR0: 4, 5 ESCR Event Select 06H ESCR[31:25] CCCR Select 01H CCCR[15:13] ESCR1: 6, 7 ESCR Event Mask ESCR[24:9] Bit 4: FLUSH global_power _events Number of flushes This event accumulates the time during which a processor is not stopped. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 013H ESCR Event Mask Bit 0: Running ESCR1: 2, 3 ESCR[31:25] ESCR[24:9] The processor is active (includes the handling of HLT STPCLK and throttling. CCCR Select 06H tc_ms_xfer CCCR[15:13] This event counts the number of times that uop delivery changed from TC to MS ROM. ESCR restrictions MSR_MS_ESCR0 MSR_MS_ESCR1 Counter numbers per ESCR ESCR0: 4, 5 ESCR Event Select 05H ESCR1: 6, 7 ESCR Event Mask CCCR Select ESCR[24:9] Bit 0: CISC A TC to MS transfer occurred. 0H CCCR[15:13] uop_queue_ writes This event counts the number of valid uops written to the uop queue. Specify one or more mask bits to select the source type of writes. ESCR restrictions MSR_MS_ESCR0 MSR_MS_ESCR1 Counter numbers per ESCR 19-232 Vol. 3B ESCR[31:25] ESCR0: 4, 5 ESCR1: 6, 7 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description ESCR Event Select 09H ESCR[31:25] ESCR Event Mask ESCR[24:9] Bit 0: FROM_TC_ BUILD The uops being written are from TC build mode. 1: FROM_TC_ DELIVER The uops being written are from TC deliver mode. 2: FROM_ROM CCCR Select 0H retired_mispred The uops being written are from microcode ROM. CCCR[15:13] This event counts retiring mispredicted branches by type. _branch_type ESCR restrictions MSR_TBPU_ESCR0 MSR_TBPU_ESCR1 Counter numbers per ESCR ESCR0: 4, 5 ESCR Event Select 05H ESCR1: 6, 7 ESCR[30:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 1: CONDITIONAL Conditional jumps. 2: CALL Indirect call branches. 3: RETURN Return branches. 4: INDIRECT Returns, indirect calls, or indirect jumps. 02H CCCR[15:13] Event Specific Notes This event may overcount conditional branches if: • Mispredictions cause the trace cache and delivery engine to build new traces. • When the processor's pipeline is being cleared. retired_branch This event counts retiring branches by type. Specify one or more mask bits to qualify the branch by its type. _type ESCR restrictions MSR_TBPU_ESCR0 MSR_TBPU_ESCR1 Counter numbers per ESCR ESCR0: 4, 5 ESCR Event Select 04H ESCR1: 6, 7 ESCR[30:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 1: CONDITIONAL Conditional jumps. 2: CALL Direct or indirect calls. 3: RETURN Return branches. 4: INDIRECT Returns, indirect calls, or indirect jumps. 02H CCCR[15:13] Vol. 3B 19-233 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Event Specific Notes Description This event may overcount conditional branches if : • Mispredictions cause the trace cache and delivery engine to build new traces. • When the processor's pipeline is being cleared. resource_stall This event monitors the occurrence or latency of stalls in the Allocator. ESCR restrictions MSR_ALF_ESCR0 MSR_ALF_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR1: 14, 15, 17 ESCR Event Select 01H Event Masks ESCR[30:25] ESCR[24:9] Bit CCCR Select 5: SBFULL A Stall due to lack of store buffers. 01H CCCR[15:13] Event Specific Notes This event may not be supported in all models of the processor family. WC_Buffer This event counts Write Combining Buffer operations that are selected by the event mask. ESCR restrictions MSR_DAC_ESCR0 MSR_DAC_ESCR1 Counter numbers per ESCR ESCR0: 8, 9 ESCR Event Select 05H ESCR1: 10, 11 Event Masks ESCR[30:25] ESCR[24:9] Bit CCCR Select 0: WCB_EVICTS WC Buffer evictions of all causes. 1: WCB_FULL_ EVICT WC Buffer eviction: no WC buffer is available. 05H CCCR[15:13] Event Specific Notes This event is useful for detecting the subset of 64K aliasing cases that are more costly (i.e. 64K aliasing cases involving stores) as long as there are no significant contributions due to write combining buffer full or hit-modified conditions. b2b_cycles This event can be configured to count the number back-to-back bus cycles using sub-event mask bits 1 through 6. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 19-234 Vol. 3B Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 016H ESCR[30:25] Event Masks Bit ESCR[24:9] ESCR1: 2, 3 PERFORMANCE MONITORING EVENTS Table 19-33. Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description CCCR Select 03H CCCR[15:13] Event Specific Notes This event may not be supported in all models of the processor family. bnr This event can be configured to count bus not ready conditions using sub-event mask bits 0 through 2. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 08H ESCR[30:25] Event Masks Bit ESCR[24:9] CCCR Select 03H CCCR[15:13] ESCR1: 2, 3 Event Specific Notes This event may not be supported in all models of the processor family. snoop This event can be configured to count snoop hit modified bus traffic using sub-event mask bits 2, 6 and 7. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 06H ESCR[30:25] Event Masks Bit ESCR[24:9] CCCR Select 03H CCCR[15:13] ESCR1: 2, 3 Event Specific Notes This event may not be supported in all models of the processor family. Response This event can be configured to count different types of responses using sub-event mask bits 1,2, 8, and 9. ESCR restrictions MSR_FSB_ESCR0 MSR_FSB_ESCR1 Counter numbers per ESCR ESCR0: 0, 1 ESCR Event Select 04H ESCR[30:25] Event Masks Bit ESCR[24:9] CCCR Select 03H CCCR[15:13] Event Specific Notes ESCR1: 2, 3 This event may not be supported in all models of the processor family. Vol. 3B 19-235 PERFORMANCE MONITORING EVENTS Table 19-34. Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting Event Name Event Parameters Parameter Value front_end_event Description This event counts the retirement of tagged μops, which are specified through the front-end tagging mechanism. The event mask specifies bogus or non-bogus μops. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR Event Select 08H ESCR3: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit 0: NBOGUS The marked μops are not bogus. 1: BOGUS The marked μops are bogus. CCCR Select 05H CCCR[15:13] Can Support PEBS Yes Require Additional MSRs for tagging Selected ESCRs and/or MSR_TC_ PRECISE_EVENT execution_event See list of metrics supported by Front_end tagging in Table A-3 This event counts the retirement of tagged μops, which are specified through the execution tagging mechanism. The event mask allows from one to four types of μops to be specified as either bogus or non-bogus μops to be tagged. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR Event Select 0CH ESCR3: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: NBOGUS0 The marked μops are not bogus. 1: NBOGUS1 The marked μops are not bogus. 2: NBOGUS2 The marked μops are not bogus. 3: NBOGUS3 The marked μops are not bogus. 4: BOGUS0 The marked μops are bogus. 5: BOGUS1 The marked μops are bogus. 6: BOGUS2 The marked μops are bogus. 7: BOGUS3 The marked μops are bogus. 05H Event Specific Notes Can Support PEBS 19-236 Vol. 3B CCCR[15:13] Each of the 4 slots to specify the bogus/non-bogus μops must be coordinated with the 4 TagValue bits in the ESCR (for example, NBOGUS0 must accompany a ‘1’ in the lowest bit of the TagValue field in ESCR, NBOGUS1 must accompany a ‘1’ in the next but lowest bit of the TagValue field). Yes PERFORMANCE MONITORING EVENTS Table 19-34. Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting (Contd.) Event Name Event Parameters Require Additional MSRs for tagging Parameter Value An ESCR for an upstream event replay_event Description See list of metrics supported by execution tagging in Table A-4. This event counts the retirement of tagged μops, which are specified through the replay tagging mechanism. The event mask specifies bogus or non-bogus μops. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR Event Select 09H ESCR3: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: NBOGUS The marked μops are not bogus. 1: BOGUS The marked μops are bogus. 05H CCCR[15:13] Event Specific Notes Supports counting tagged μops with additional MSRs. Can Support PEBS Yes Require Additional MSRs for tagging IA32_PEBS_ ENABLE See list of metrics supported by replay tagging in Table A-5. MSR_PEBS_ MATRIX_VERT Selected ESCR instr_retired This event counts instructions that are retired during a clock cycle. Mask bits specify bogus or non-bogus (and whether they are tagged using the front-end tagging mechanism). ESCR restrictions MSR_CRU_ESCR0 MSR_CRU_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR Event Select 02H ESCR1: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit 0: NBOGUSNTAG 1: NBOGUSTAG 2: BOGUSNTAG 3: BOGUSTAG CCCR Select Event Specific Notes 04H Non-bogus instructions that are not tagged. Non-bogus instructions that are tagged. Bogus instructions that are not tagged. Bogus instructions that are tagged. CCCR[15:13] 1: The event count may vary depending on the microarchitectural states of the processor when the event detection is enabled. 2: The event may count more than once for some instructions with complex uop flows and were interrupted before retirement. Vol. 3B 19-237 PERFORMANCE MONITORING EVENTS Table 19-34. Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting (Contd.) Event Name Event Parameters Can Support PEBS Parameter Value Description No uops_retired This event counts μops that are retired during a clock cycle. Mask bits specify bogus or non-bogus. ESCR restrictions MSR_CRU_ESCR0 MSR_CRU_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR Event Select 01H ESCR1: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: NBOGUS The marked μops are not bogus. 1: BOGUS The marked μops are bogus. 04H Event Specific Notes Can Support PEBS CCCR[15:13] P6: EMON_UOPS_RETIRED No uop_type This event is used in conjunction with the front-end at-retirement mechanism to tag load and store μops. ESCR restrictions MSR_RAT_ESCR0 MSR_RAT_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR Event Select 02H ESCR1: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 1: TAGLOADS The μop is a load operation. 2: TAGSTORES The μop is a store operation. 02H CCCR[15:13] Event Specific Notes Can Support PEBS Setting the TAGLOADS and TAGSTORES mask bits does not cause a counter to increment. They are only used to tag uops. No branch_retired 19-238 Vol. 3B This event counts the retirement of a branch. Specify one or more mask bits to select any combination of taken, not-taken, predicted and mispredicted. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 See Table 18-80 for the addresses of the ESCR MSRs Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR3: 14, 15, 17 The counter numbers associated with each ESCR are provided. The performance counters and corresponding CCCRs can be obtained from Table 18-80. ESCR Event Select 06H ESCR[31:25] PERFORMANCE MONITORING EVENTS Table 19-34. Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value Description ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: MMNP Branch not-taken predicted 1: MMNM Branch not-taken mispredicted 2: MMTP Branch taken predicted 3: MMTM Branch taken mispredicted 05H CCCR[15:13] Event Specific Notes Can Support PEBS P6: EMON_BR_INST_RETIRED No mispred_branch_ retired This event represents the retirement of mispredicted branch instructions. ESCR restrictions MSR_CRU_ESCR0 MSR_CRU_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR Event Select 03H ESCR1: 14, 15, 17 ESCR Event Mask ESCR[31:25] ESCR[24:9] Bit 0: NBOGUS The retired instruction is not bogus. CCCR Select 04H CCCR[15:13] Can Support PEBS No x87_assist This event counts the retirement of x87 instructions that required special handling. Specifies one or more event mask bits to select the type of assistance. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR Event Select 03H ESCR3: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit 0: FPSU Handle FP stack underflow. 1: FPSO Handle FP stack overflow. 2: POAO Handle x87 output overflow. 3: POAU Handle x87 output underflow. 4: PREA Handle x87 input assist. CCCR Select 05H CCCR[15:13] Can Support PEBS No Vol. 3B 19-239 PERFORMANCE MONITORING EVENTS Table 19-34. Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting (Contd.) Event Name Event Parameters Parameter Value machine_clear Description This event increments according to the mask bit specified while the entire pipeline of the machine is cleared. Specify one of the mask bit to select the cause. ESCR restrictions MSR_CRU_ESCR2 MSR_CRU_ESCR3 Counter numbers per ESCR ESCR2: 12, 13, 16 ESCR Event Select 02H ESCR3: 14, 15, 17 ESCR Event Mask ESCR[31:25] ESCR[24:9] Bit 0: CLEAR Counts for a portion of the many cycles while the machine is cleared for any cause. Use Edge triggering for this bit only to get a count of occurrence versus a duration. 2: MOCLEAR Increments each time the machine is cleared due to memory ordering issues. 6: SMCLEAR Increments each time the machine is cleared due to self-modifying code issues. CCCR Select 05H CCCR[15:13] Can Support PEBS No Table 19-35. Intel NetBurst® Microarchitecture Model-Specific Performance Monitoring Events (For Model Encoding 3, 4 or 6) Event Name Event Parameters Parameter Value instr_completed Description This event counts instructions that have completed and retired during a clock cycle. Mask bits specify whether the instruction is bogus or non-bogus and whether they are: ESCR restrictions MSR_CRU_ESCR0 MSR_CRU_ESCR1 Counter numbers per ESCR ESCR0: 12, 13, 16 ESCR Event Select 07H ESCR1: 14, 15, 17 ESCR[31:25] ESCR[24:9] ESCR Event Mask Bit CCCR Select 0: NBOGUS Non-bogus instructions 1: BOGUS Bogus instructions 04H CCCR[15:13] Event Specific Notes Can Support PEBS 19-240 Vol. 3B This metric differs from instr_retired, since it counts instructions completed, rather than the number of times that instructions started. No PERFORMANCE MONITORING EVENTS Table 19-36. List of Metrics Available for Front_end Tagging (For Front_end Event Only) Front-end metric1 MSR_ TC_PRECISE_EVENT MSR Bit field Additional MSR Event mask value for Front_end_event memory_loads None Set TAGLOADS bit in ESCR corresponding to event Uop_Type. NBOGUS memory_stores None Set TAGSTORES bit in the ESCR corresponding to event Uop_Type. NBOGUS NOTES: 1. There may be some undercounting of front end events when there is an overflow or underflow of the floating point stack. Table 19-37. List of Metrics Available for Execution Tagging (For Execution Event Only) Execution metric Upstream ESCR TagValue in Upstream ESCR Event mask value for execution_event packed_SP_retired Set ALL bit in event mask, TagUop bit in ESCR of packed_SP_uop. 1 NBOGUS0 packed_DP_retired Set ALL bit in event mask, TagUop bit in ESCR of packed_DP_uop. 1 NBOGUS0 scalar_SP_retired Set ALL bit in event mask, TagUop bit in ESCR of scalar_SP_uop. 1 NBOGUS0 scalar_DP_retired Set ALL bit in event mask, TagUop bit in ESCR of scalar_DP_uop. 1 NBOGUS0 128_bit_MMX_retired Set ALL bit in event mask, TagUop bit in ESCR of 128_bit_MMX_uop. 1 NBOGUS0 64_bit_MMX_retired Set ALL bit in event mask, TagUop bit in ESCR of 64_bit_MMX_uop. 1 NBOGUS0 X87_FP_retired Set ALL bit in event mask, TagUop bit in ESCR of x87_FP_uop. 1 NBOGUS0 1 NBOGUS0 X87_SIMD_memory_m Set ALLP0, ALLP2 bits in event mask, TagUop bit in oves_retired ESCR of X87_SIMD_ moves_uop. Table 19-38. List of Metrics Available for Replay Tagging (For Replay Event Only) Replay metric1 IA32_PEBS_ ENABLE Field to Set MSR_PEBS_ MATRIX_VERT Bit Field to Set Additional MSR/ Event Event Mask Value for Replay_event 1stL_cache_load _miss_retired Bit 0, Bit 24, Bit 25 Bit 0 None NBOGUS 2ndL_cache_load _miss_retired2 Bit 1, Bit 24, Bit 25 Bit 0 None NBOGUS DTLB_load_miss _retired Bit 2, Bit 24, Bit 25 Bit 0 None NBOGUS DTLB_store_miss _retired Bit 2, Bit 24, Bit 25 Bit 1 None NBOGUS DTLB_all_miss _retired Bit 2, Bit 24, Bit 25 Bit 0, Bit 1 None NBOGUS Tagged_mispred_ branch Bit 15, Bit 16, Bit 24, Bit 25 Bit 4 None NBOGUS Vol. 3B 19-241 PERFORMANCE MONITORING EVENTS Table 19-38. List of Metrics Available for Replay Tagging (For Replay Event Only) (Contd.) Replay metric1 IA32_PEBS_ ENABLE Field to Set MSR_PEBS_ MATRIX_VERT Bit Field to Set MOB_load _replay_retired3 Bit 9, Bit 24, Bit 25 Bit 0 Select MOB_load_replay event and set PARTIAL_DATA and UNALGN_ADDR bit. NBOGUS split_load_retired Bit 10, Bit 24, Bit 25 Bit 0 Select load_port_replay event with the MSR_SAAT_ESCR1 MSR and set the SPLIT_LD mask bit. NBOGUS split_store_retired Bit 10, Bit 24, Bit 25 Bit 1 Select store_port_replay event with the MSR_SAAT_ESCR0 MSR and set the SPLIT_ST mask bit. NBOGUS Additional MSR/ Event Event Mask Value for Replay_event NOTES: 1. Certain kinds of μops cannot be tagged. These include I/O operations, UC and locked accesses, returns, and far transfers. 2. 2nd-level misses retired does not count all 2nd-level misses. It only includes those references that are found to be misses by the fast detection logic and not those that are later found to be misses. 3. While there are several causes for a MOB replay, the event counted with this event mask setting is the case where the data from a load that would otherwise be forwarded is not an aligned subset of the data from a preceding store. 19-242 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-39. Event Mask Qualification for Logical Processors Event Type Event Name Event Masks, ESCR[24:9] Non-Retirement BPU_fetch_request Bit 0: TCMISS Non-Retirement BSQ_allocation Non-Retirement Non-Retirement BSQ_cache_reference memory_cancel Bit TS or TI TS 0: REQ_TYPE0 TS 1: REQ_TYPE1 TS 2: REQ_LEN0 TS 3: REQ_LEN1 TS 5: REQ_IO_TYPE TS 6: REQ_LOCK_TYPE TS 7: REQ_CACHE_TYPE TS 8: REQ_SPLIT_TYPE TS 9: REQ_DEM_TYPE TS 10: REQ_ORD_TYPE TS 11: MEM_TYPE0 TS 12: MEM_TYPE1 TS 13: MEM_TYPE2 TS Bit 0: RD_2ndL_HITS TS 1: RD_2ndL_HITE TS 2: RD_2ndL_HITM TS 3: RD_3rdL_HITS TS 4: RD_3rdL_HITE TS 5: RD_3rdL_HITM TS 6: WR_2ndL_HIT TS 7: WR_3rdL_HIT TS 8: RD_2ndL_MISS TS 9: RD_3rdL_MISS TS 10: WR_2ndL_MISS TS 11: WR_3rdL_MISS TS Bit 2: ST_RB_FULL TS 3: 64K_CONF TS Non-Retirement SSE_input_assist Bit 15: ALL TI Non-Retirement 64bit_MMX_uop Bit 15: ALL TI Non-Retirement packed_DP_uop Bit 15: ALL TI Non-Retirement packed_SP_uop Bit 15: ALL TI Non-Retirement scalar_DP_uop Bit 15: ALL TI Non-Retirement scalar_SP_uop Bit 15: ALL TI Non-Retirement 128bit_MMX_uop Bit 15: ALL TI Non-Retirement x87_FP_uop Bit 15: ALL TI Vol. 3B 19-243 PERFORMANCE MONITORING EVENTS Table 19-39. Event Mask Qualification for Logical Processors (Contd.) Event Type Event Name Non-Retirement x87_SIMD_moves_uop Non-Retirement Non-Retirement Non-Retirement FSB_data_activity IOQ_allocation IOQ_active_entries Event Masks, ESCR[24:9] Bit TS or TI 3: ALLP0 TI 4: ALLP2 TI Bit 0: DRDY_DRV TI 1: DRDY_OWN TI 2: DRDY_OTHER TI 3: DBSY_DRV TI 4: DBSY_OWN TI 5: DBSY_OTHER TI Bit 0: ReqA0 TS 1: ReqA1 TS 2: ReqA2 TS 3: ReqA3 TS 4: ReqA4 TS 5: ALL_READ TS 6: ALL_WRITE TS 7: MEM_UC TS 8: MEM_WC TS 9: MEM_WT TS 10: MEM_WP TS 11: MEM_WB TS 13: OWN TS 14: OTHER TS 15: PREFETCH TS Bit TS 0: ReqA0 19-244 Vol. 3B 1:ReqA1 TS 2: ReqA2 TS 3: ReqA3 TS 4: ReqA4 TS 5: ALL_READ TS 6: ALL_WRITE TS 7: MEM_UC TS 8: MEM_WC TS 9: MEM_WT TS 10: MEM_WP TS 11: MEM_WB TS PERFORMANCE MONITORING EVENTS Table 19-39. Event Mask Qualification for Logical Processors (Contd.) Event Type Event Name Event Masks, ESCR[24:9] 13: OWN Non-Retirement global_power_events Non-Retirement ITLB_reference Non-Retirement Non-Retirement Non-Retirement MOB_load_replay page_walk_type uop_type TS or TI TS 14: OTHER TS 15: PREFETCH TS Bit 0: RUNNING TS Bit 0: HIT TS 1: MISS TS 2: HIT_UC TS Bit 1: NO_STA TS 3: NO_STD TS 4: PARTIAL_DATA TS 5: UNALGN_ADDR TS Bit 0: DTMISS TI 1: ITMISS TI Bit 1: TAGLOADS TS 2: TAGSTORES TS Non-Retirement load_port_replay Bit 1: SPLIT_LD TS Non-Retirement store_port_replay Bit 1: SPLIT_ST TS Non-Retirement memory_complete Non-Retirement Non-Retirement retired_mispred_branch_ type retired_branch_type Bit 0: LSC TS 1: SSC TS 2: USC TS 3: ULC TS Bit 0: UNCONDITIONAL TS 1: CONDITIONAL TS 2: CALL TS 3: RETURN TS 4: INDIRECT TS Bit 0: UNCONDITIONAL TS 1: CONDITIONAL TS 2: CALL TS 3: RETURN TS 4: INDIRECT TS Vol. 3B 19-245 PERFORMANCE MONITORING EVENTS Table 19-39. Event Mask Qualification for Logical Processors (Contd.) Event Type Event Name Event Masks, ESCR[24:9] Non-Retirement tc_ms_xfer Bit 0: CISC Non-Retirement tc_misc Bit 4: FLUSH Non-Retirement Non-Retirement TC_deliver_mode uop_queue_writes Bit TS or TI TS TS 0: DD TI 1: DB TI 2: DI TI 3: BD TI 4: BB TI 5: BI TI 6: ID TI 7: IB TI Bit 0: FROM_TC_BUILD TS 1: FROM_TC_DELIVER TS 2: FROM_ROM TS Non-Retirement resource_stall Bit 5: SBFULL TS Non-Retirement WC_Buffer Bit TI 0: WCB_EVICTS TI 1: WCB_FULL_EVICT TI 2: WCB_HITM_EVICT TI At Retirement At Retirement At Retirement At Retirement At Retirement 19-246 Vol. 3B instr_retired machine_clear front_end_event replay_event execution_event Bit 0: NBOGUSNTAG TS 1: NBOGUSTAG TS 2: BOGUSNTAG TS 3: BOGUSTAG TS Bit 0: CLEAR TS 2: MOCLEAR TS 6: SMCCLEAR TS Bit 0: NBOGUS TS 1: BOGUS TS Bit 0: NBOGUS TS 1: BOGUS TS Bit 0: NONBOGUS0 TS 1: NONBOGUS1 TS PERFORMANCE MONITORING EVENTS Table 19-39. Event Mask Qualification for Logical Processors (Contd.) Event Type Event Name At Retirement At Retirement x87_assist branch_retired At Retirement mispred_branch_retired At Retirement uops_retired At Retirement 19.20 instr_completed Event Masks, ESCR[24:9] TS or TI 2: NONBOGUS2 TS 3: NONBOGUS3 TS 4: BOGUS0 TS 5: BOGUS1 TS 6: BOGUS2 TS 7: BOGUS3 TS Bit 0: FPSU TS 1: FPSO TS 2: POAO TS 3: POAU TS 4: PREA TS Bit 0: MMNP TS 1: MMNM TS 2: MMTP TS 3: MMTM TS Bit 0: NBOGUS TS Bit 0: NBOGUS TS 1: BOGUS TS Bit 0: NBOGUS TS 1: BOGUS TS PERFORMANCE MONITORING EVENTS FOR INTEL® PENTIUM® M PROCESSORS The Pentium M processor’s performance monitoring events are based on monitoring events for the P6 family of processors. All of these performance events are model specific for the Pentium M processor and are not available in this form in other processors. Table 19-40 lists the performance monitoring events that were added in the Pentium M processor. Vol. 3B 19-247 PERFORMANCE MONITORING EVENTS Table 19-40. Performance Monitoring Events on Intel® Pentium® M Processors Name Hex Values Descriptions Power Management EMON_EST_TRANS 58H Number of Enhanced Intel SpeedStep technology transitions: Mask = 00H - All transitions Mask = 02H - Only Frequency transitions EMON_THERMAL_TRIP 59H Duration/Occurrences in thermal trip; to count number of thermal trips: bit 22 in PerfEvtSel0/1 needs to be set to enable edge detect. BR_INST_EXEC 88H Branch instructions that were executed (not necessarily retired). BR_MISSP_EXEC 89H Branch instructions executed that were mispredicted at execution. BR_BAC_MISSP_EXEC 8AH Branch instructions executed that were mispredicted at front end (BAC). BR_CND_EXEC 8BH Conditional branch instructions that were executed. BR_CND_MISSP_EXEC 8CH Conditional branch instructions executed that were mispredicted. BR_IND_EXEC 8DH Indirect branch instructions executed. BR_IND_MISSP_EXEC 8EH Indirect branch instructions executed that were mispredicted. BR_RET_EXEC 8FH Return branch instructions executed. BR_RET_MISSP_EXEC 90H Return branch instructions executed that were mispredicted at execution. BR_RET_BAC_MISSP_EXEC 91H Return branch instructions executed that were mispredicted at front end (BAC). BR_CALL_EXEC 92H CALL instruction executed. BR_CALL_MISSP_EXEC 93H CALL instruction executed and miss predicted. BR_IND_CALL_EXEC 94H Indirect CALL instructions executed. EMON_SIMD_INSTR_RETIRED CEH Number of retired MMX instructions. EMON_SYNCH_UOPS D3H Sync micro-ops EMON_ESP_UOPS D7H Total number of micro-ops EMON_FUSED_UOPS_RET DAH Number of retired fused micro-ops: BPU Decoder Mask = 0 - Fused micro-ops Mask = 1 - Only load+Op micro-ops Mask = 2 - Only std+sta micro-ops EMON_UNFUSION DBH Number of unfusion events in the ROB, happened on a FP exception to a fused µop. EMON_PREF_RQSTS_UP F0H Number of upward prefetches issued. EMON_PREF_RQSTS_DN F8H Number of downward prefetches issued. Prefetcher 19-248 Vol. 3B PERFORMANCE MONITORING EVENTS A number of P6 family processor performance monitoring events are modified for the Pentium M processor. Table 19-41 lists the performance monitoring events that were changed in the Pentium M processor, and differ from performance monitoring events for the P6 family of processors. Table 19-41. Performance Monitoring Events Modified on Intel® Pentium® M Processors Name Hex Values Descriptions CPU_CLK_UNHALTED 79H Number of cycles during which the processor is not halted, and not in a thermal trip. EMON_SSE_SSE2_INST_ RETIRED D8H Streaming SIMD Extensions Instructions Retired: Mask = 0 – SSE packed single and scalar single Mask = 1 – SSE scalar-single Mask = 2 – SSE2 packed-double Mask = 3 – SSE2 scalar-double EMON_SSE_SSE2_COMP_INST_ RETIRED D9H Computational SSE Instructions Retired: Mask = 0 – SSE packed single Mask = 1 – SSE Scalar-single Mask = 2 – SSE2 packed-double Mask = 3 – SSE2 scalar-double L2_LD 29H L2 data loads Mask[0] = 1 – count I state lines L2_LINES_IN 24H L2 lines allocated Mask[1] = 1 – count S state lines L2_LINES_OUT 26H L2 lines evicted Mask[2] = 1 – count E state lines L2_M_LINES_OUT 27H Lw M-state lines evicted Mask[3] = 1 – count M state lines Mask[5:4]: 00H – Excluding hardware-prefetched lines 01H - Hardware-prefetched lines only 02H/03H – All (HW-prefetched lines and non HW -Prefetched lines) 19.21 P6 FAMILY PROCESSOR PERFORMANCE MONITORING EVENTS Table 19-42 lists the events that can be counted with the performance monitoring counters and read with the RDPMC instruction for the P6 family processors. The unit column gives the microarchitecture or bus unit that produces the event; the event number column gives the hexadecimal number identifying the event; the mnemonic event name column gives the name of the event; the unit mask column gives the unit mask required (if any); the description column describes the event; and the comments column gives additional information about the event. All of these performance events are model specific for the P6 family processors and are not available in this form in the Pentium 4 processors or the Pentium processors. Some events (such as those added in later generations of the P6 family processors) are only available in specific processors in the P6 family. All performance event encodings not listed in Table 19-42 are reserved and their use will result in undefined counter results. See the end of the table for notes related to certain entries in the table. Vol. 3B 19-249 PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters Unit Data Cache Unit (DCU) Event Mnemonic Event Num. Name Unit Mask 43H 00H DATA_MEM_REFS Description Comments All loads from any memory type. All stores to any memory type. Each part of a split is counted separately. The internal logic counts not only memory loads and stores, but also internal retries. 80-bit floating-point accesses are double counted, since they are decomposed into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the same address, and which finally gets performed, is only counted once). Does not include I/O accesses, or other nonmemory accesses. 45H DCU_LINES_IN 00H Total lines allocated in DCU. 46H DCU_M_LINES_IN 00H Number of M state lines allocated in DCU. 47H DCU_M_LINES_ OUT 00H DCU_MISS_ OUTSTANDING 00H 48H Number of M state lines evicted from DCU. This includes evictions via snoop HITM, intervention or replacement. Weighted number of cycles while a DCU miss is outstanding, incremented by the number of outstanding cache misses at any particular time. An access that also misses the L2 is short-changed by 2 cycles (i.e., if counts N cycles, should be N+2 cycles). Cacheable read requests only are considered. Subsequent loads to the same cache line will not result in any additional counts. Uncacheable requests are excluded. Read-for-ownerships are counted, as well as line fills, invalidates, and stores. Instruction Fetch Unit (IFU) 80H IFU_IFETCH 00H Number of instruction fetches, both cacheable and noncacheable, including UC fetches. 81H IFU_IFETCH_ MISS 00H Number of instruction fetch misses 85H ITLB_MISS 00H Number of ITLB misses. 86H IFU_MEM_STALL 00H Number of cycles instruction fetch is stalled, for any reason. All instruction fetches that do not hit the IFU (i.e., that produce memory requests). This includes UC accesses. Includes IFU cache misses, ITLB misses, ITLB faults, and other minor stalls. L2 Cache1 19-250 Vol. 3B 87H ILD_STALL 00H Number of cycles that the instruction length decoder is stalled. 28H L2_IFETCH MESI 0FH Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by the L2. Count value not precise, but still useful. PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask Description Comments The count includes only L2 cacheable instruction fetches; it does not include UC instruction fetches. It does not include ITLB miss accesses. 29H L2_LD MESI 0FH Number of L2 data loads. This event indicates that a normal, unlocked, load memory access was received by the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other nonmemory accesses, or memory accesses such as UC/WT memory accesses. It does include L2 cacheable TLB miss memory accesses. 2AH L2_ST MESI 0FH Number of L2 data stores. This event indicates that a normal, unlocked, store memory access was received by the L2. it indicates that the DCU sent a read-forownership request to the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other nonmemory accesses, or memory accesses such as UC/WT memory accesses. It includes TLB miss memory accesses. External Bus Logic (EBL)2 24H L2_LINES_IN 00H Number of lines allocated in the L2. 26H L2_LINES_OUT 00H Number of lines removed from the L2 for any reason. 25H L2_M_LINES_INM 00H Number of modified lines allocated in the L2. 27H L2_M_LINES_ OUTM 00H Number of modified lines removed from the L2 for any reason. 2EH L2_RQSTS MESI 0FH Total number of L2 requests. 21H L2_ADS 00H Number of L2 address strobes. 22H L2_DBUS_BUSY 00H Number of cycles during which the L2 cache data bus was busy. 23H L2_DBUS_BUSY_ RD 00H Number of cycles during which the data bus was busy transferring read data from L2 to the processor. 62H BUS_DRDY_ CLOCKS 00H (Self) Number of clocks during which DRDY# is asserted. 20H (Any) Utilization of the external system data bus during data transfers. Unit Mask = 00H counts bus clocks when the processor is driving DRDY#. Unit Mask = 20H counts in processor clocks when any agent is driving DRDY#. Vol. 3B 19-251 PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask Description Comments 63H 00H (Self) Number of clocks during which LOCK# is asserted on the external system bus.3 Always counts in processor clocks. Counts only DCU full-line cacheable reads, not RFOs, writes, instruction fetches, or anything else. Counts “waiting for bus to complete” (last data chunk received). BUS_LOCK_ CLOCKS 20H (Any) 60H 65H BUS_REQ_ OUTSTANDING 00H (Self) Number of bus requests outstanding. BUS_TRAN_BRD 00H (Self) Number of burst read transactions. This counter is incremented by the number of cacheable read bus requests outstanding in any given cycle. 20H (Any) 66H BUS_TRAN_RFO 00H (Self) Number of completed read for ownership transactions. 20H (Any) 67H BUS_TRANS_WB 00H (Self) Number of completed write back transactions. 20H (Any) 68H BUS_TRAN_ IFETCH 00H (Self) Number of completed instruction fetch transactions. 20H (Any) 69H BUS_TRAN_INVA L 00H (Self) Number of completed invalidate transactions. 20H (Any) 6AH BUS_TRAN_PWR 00H (Self) Number of completed partial write transactions. 20H (Any) 6BH BUS_TRANS_P 00H (Self) Number of completed partial transactions. 20H (Any) 6CH BUS_TRANS_IO 00H (Self) Number of completed I/O transactions. 20H (Any) 6DH BUS_TRAN_DEF 00H (Self) 20H (Any) 19-252 Vol. 3B Number of completed deferred transactions. PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask 6EH 00H (Self) BUS_TRAN_ BURST Description Comments Number of completed burst transactions. 20H (Any) 70H BUS_TRAN_ANY 00H (Self) 20H (Any) Number of all completed bus transactions. Address bus utilization can be calculated knowing the minimum address bus occupancy. Includes special cycles, etc. 6FH BUS_TRAN_MEM 00H (Self) Number of completed memory transactions. 20H (Any) 64H BUS_DATA_RCV 00H (Self) Number of bus clock cycles during which this processor is receiving data. 61H BUS_BNR_DRV 00H (Self) Number of bus clock cycles during which this processor is driving the BNR# pin. 7AH BUS_HIT_DRV 00H (Self) Number of bus clock cycles during which this processor is driving the HIT# pin. Includes cycles due to snoop stalls. The event counts correctly, but BPMi (breakpoint monitor) pins function as follows based on the setting of the PC bits (bit 19 in the PerfEvtSel0 and PerfEvtSel1 registers): • If the core-clock-to- bus-clock ratio is 2:1 or 3:1, and a PC bit is set, the BPMi pins will be asserted for a single clock when the counters overflow. • If the PC bit is clear, the processor toggles the BPMi pins when the counter overflows. • If the clock ratio is not 2:1 or 3:1, the BPMi pins will not function for these performance monitoring counter events. 7BH BUS_HITM_DRV 00H (Self) Number of bus clock cycles during which this processor is driving the HITM# pin. Includes cycles due to snoop stalls. The event counts correctly, but BPMi (breakpoint monitor) pins function as follows based on the setting of the PC bits (bit 19 in the PerfEvtSel0 and PerfEvtSel1 registers): • If the core-clock-to- bus-clock ratio is 2:1 or 3:1, and a PC bit is set, the BPMi pins will be asserted for a single clock when the counters overflow. Vol. 3B 19-253 PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask Description Comments • If the PC bit is clear, the processor toggles the BPMipins when the counter overflows. • If the clock ratio is not 2:1 or 3:1, the BPMi pins will not function for these performance monitoring counter events. FloatingPoint Unit 7EH BUS_SNOOP_ STALL 00H (Self) Number of clock cycles during which the bus is snoop stalled. C1H FLOPS 00H Number of computational floating-point operations retired. Counter 0 only. Excludes floating-point computational operations that cause traps or assists. Includes floating-point computational operations executed by the assist handler. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating-point loads and stores. 10H FP_COMP_OPS_ EXE 00H Number of computational floating-point operations executed. Counter 0 only. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. This number does not include the number of cycles, but the number of operations. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction. 11H 12H FP_ASSIST MUL 00H 00H Number of floating-point exception cases handled by microcode. Counter 1 only. Number of multiplies. Counter 1 only. This event includes counts due to speculative execution. This count includes integer as well as FP multiplies and is speculative. 13H DIV 00H Number of divides. Counter 1 only. This count includes integer as well as FP divides and is speculative. 14H CYCLES_DIV_ BUSY 00H Number of cycles during which the divider is busy, and cannot accept new divides. This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative. 19-254 Vol. 3B Counter 0 only. PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Memory Ordering Event Mnemonic Event Num. Name Unit Mask 03H 00H LD_BLOCKS Description Comments Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflicts with the load but which incompletely overlap the load. 04H SB_DRAINS 00H Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing). 05H MISALIGN_ MEM_REF 00H Number of misaligned data memory references. Incremented by 1 every cycle, during which either the processor’s load or store pipeline dispatches a misaligned μop. Counting is performed if it is the first or second half, or if it is blocked, squashed, or missed. MISALIGN_MEM_ REF is only an approximation to the true number of misaligned memory references. The value returned is roughly proportional to the number of misaligned memory accesses (the size of the problem). In this context, misaligned means crossing a 64-bit boundary. 07H EMON_KNI_PREF _DISPATCHED Number of Streaming SIMD extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting): 00H 0: prefetch NTA 01H 1: prefetch T1 02H 2: prefetch T2 03H 4BH Instruction Decoding and Retirement C0H EMON_KNI_PREF _MISS INST_RETIRED Counters 0 and 1. Pentium III processor only. 3: weakly ordered stores Number of prefetch/weakly-ordered instructions that miss all caches: 00H 0: prefetch NTA 01H 1: prefetch T1 02H 2: prefetch T2 03H 3: weakly ordered stores 00H Number of instructions retired. Counters 0 and 1. Pentium III processor only. A hardware interrupt received during/after the last iteration of the REP STOS flow causes the counter to undercount by 1 instruction. An SMI received while executing a HLT instruction will cause the performance counter to not count the RSM instruction and undercount by 1. Vol. 3B 19-255 PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask Description C2H UOPS_RETIRED 00H Number of μops retired. D0H INST_DECODED 00H Number of instructions decoded. D8H EMON_KNI_INST_ RETIRED D9H Number of Streaming SIMD extensions retired: 00H 0: packed & scalar 01H 1: scalar EMON_KNI_ COMP_ INST_RET Number of Streaming SIMD extensions computation instructions retired: 0: packed and scalar 00H 1: scalar 01H Interrupts Branches C8H HW_INT_RX 00H Number of hardware interrupts received. C6H CYCLES_INT_ MASKED 00H Number of processor cycles for which interrupts are disabled. C7H CYCLES_INT_ PENDING_ AND_MASKED 00H Number of processor cycles for which interrupts are disabled and interrupts are pending. C4H BR_INST_ RETIRED 00H Number of branch instructions retired. C5H BR_MISS_PRED_ RETIRED 00H Number of mispredicted branches retired. C9H BR_TAKEN_ RETIRED 00H Number of taken branches retired. CAH BR_MISS_PRED_ TAKEN_RET 00H Number of taken mispredictions branches retired. E0H BR_INST_ DECODED 00H Number of branch instructions decoded. E2H BTB_MISSES 00H Number of branches for which the BTB did not produce a prediction. E4H BR_BOGUS 00H Number of bogus branches. E6H BACLEARS 00H Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not. Stalls A2H RESOURCE_ STALLS 00H Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. 19-256 Vol. 3B Comments Counters 0 and 1. Pentium III processor only. Counters 0 and 1. Pentium III processor only. PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Unit Event Mnemonic Event Num. Name Unit Mask Description Comments Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations. D2H PARTIAL_RAT_ STALLS 00H Number of cycles or events for partial stalls. This includes flag partial stalls. Segment Register Loads 06H SEGMENT_REG_ LOADS 00H Number of segment register loads. Clocks 79H CPU_CLK_ UNHALTED 00H Number of cycles during which the processor is not halted. MMX Unit B0H MMX_INSTR_ EXEC 00H Number of MMX Instructions Executed. Available in Intel Celeron, Pentium II and Pentium II Xeon processors only. Does not account for MOVQ and MOVD stores from register to memory. B1H MMX_SAT_ INSTR_EXEC 00H Number of MMX Saturating Instructions Executed. Available in Pentium II and Pentium III processors only. B2H MMX_UOPS_ EXEC 0FH Number of MMX μops Executed. Available in Pentium II and Pentium III processors only. B3H MMX_INSTR_ TYPE_EXEC 01H MMX packed multiply instructions executed. Available in Pentium II and Pentium III processors only. 02H MMX packed shift instructions executed. MMX pack operation instructions executed. 04H CCH FP_MMX_TRANS 08H MMX unpack operation instructions executed. 10H MMX packed logical instructions executed. 20H MMX packed arithmetic instructions executed. 00H 01H Segment Register Renaming Transitions from MMX instruction to floating-point instructions. Available in Pentium II and Pentium III processors only. Transitions from floating-point instructions to MMX instructions. CDH MMX_ASSIST 00H Number of MMX Assists (that is, the number of EMMS instructions executed). Available in Pentium II and Pentium III processors only. CEH MMX_INSTR_RET 00H Number of MMX Instructions Retired. Available in Pentium II processors only. D4H SEG_RENAME_ STALLS Number of Segment Register Renaming Stalls: Available in Pentium II and Pentium III processors only. Vol. 3B 19-257 PERFORMANCE MONITORING EVENTS Table 19-42. Events That Can Be Counted with the P6 Family Performance Monitoring Counters (Contd.) Event Mnemonic Event Num. Name Unit Unit Mask Description 02H Segment register ES 04H Segment register DS 08H Segment register FS 0FH Segment register FS Comments Segment registers ES + DS + FS + GS D5H D6H SEG_REG_ RENAMES RET_SEG_ RENAMES Number of Segment Register Renames: 01H Segment register ES 02H Segment register DS 04H Segment register FS 08H Segment register FS 0FH Segment registers ES + DS + FS + GS 00H Number of segment register rename events retired. Available in Pentium II and Pentium III processors only. Available in Pentium II and Pentium III processors only. NOTES: 1. Several L2 cache events, where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. The lower 4 bits of the Unit Mask field are used in conjunction with L2 events to indicate the cache state or cache states involved. The P6 family processors identify cache states using the “MESI” protocol and consequently each bit in the Unit Mask field represents one of the four states: UMSK[3] = M (8H) state, UMSK[2] = E (4H) state, UMSK[1] = S (2H) state, and UMSK[0] = I (1H) state. UMSK[3:0] = MESI” (FH) should be used to collect data for all states; UMSK = 0H, for the applicable events, will result in nothing being counted. 2. All of the external bus logic (EBL) events, except where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. Bit 5 of the UMSK field is used in conjunction with the EBL events to indicate whether the processor should count transactions that are self- generated (UMSK[5] = 0) or transactions that result from any processor on the bus (UMSK[5] = 1). 3. L2 cache locks, so it is possible to have a zero count. 19.22 PENTIUM PROCESSOR PERFORMANCE MONITORING EVENTS Table 19-43 lists the events that can be counted with the performance monitoring counters for the Pentium processor. The Event Number column gives the hexadecimal code that identifies the event and that is entered in the ES0 or ES1 (event select) fields of the CESR MSR. The Mnemonic Event Name column gives the name of the event, and the Description and Comments columns give detailed descriptions of the events. Most events can be counted with either counter 0 or counter 1; however, some events can only be counted with only counter 0 or only counter 1 (as noted). NOTE The events in the table that are shaded are implemented only in the Pentium processor with MMX technology. 19-258 Vol. 3B PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters Event Num. Mnemonic Event Name Description Comments 00H DATA_READ Number of memory data reads (internal data cache hit and miss combined). Split cycle reads are counted individually. Data Memory Reads that are part of TLB miss processing are not included. These events may occur at a maximum of two per clock. I/O is not included. 01H DATA_WRITE Number of memory data writes (internal data cache hit and miss combined); I/O not included. Split cycle writes are counted individually. These events may occur at a maximum of two per clock. I/O is not included. 0H2 DATA_TLB_MISS Number of misses to the data cache translation look-aside buffer. 03H DATA_READ_MISS Number of memory read accesses that miss the internal data cache whether or not the access is cacheable or noncacheable. Additional reads to the same cache line after the first BRDY# of the burst line fill is returned but before the final (fourth) BRDY# has been returned, will not cause the counter to be incremented additional times. Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. 04H DATA WRITE MISS Number of memory write accesses that miss the internal data cache whether or not the access is cacheable or noncacheable. Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. 05H WRITE_HIT_TO_ M-_OR_ESTATE_LINES Number of write hits to exclusive or modified lines in the data cache. These are the writes that may be held up if EWBE# is inactive. These events may occur a maximum of two per clock. 06H DATA_CACHE_ LINES_ WRITTEN_BACK Number of dirty lines (all) that are written back, regardless of the cause. Replacements and internal and external snoops can all cause writeback and are counted. 07H EXTERNAL_ SNOOPS Number of accepted external snoops whether they hit in the code cache or data cache or neither. Assertions of EADS# outside of the sampling interval are not counted, and no internal snoops are counted. 08H EXTERNAL_DATA_ CACHE_SNOOP_ HITS Number of external snoops to the data cache. Snoop hits to a valid line in either the data cache, the data line fill buffer, or one of the write back buffers are all counted as hits. 09H MEMORY ACCESSES IN BOTH PIPES Number of data memory reads or writes that are paired in both pipes of the pipeline. These accesses are not necessarily run in parallel due to cache misses, bank conflicts, etc. 0AH BANK CONFLICTS Number of actual bank conflicts. 0BH MISALIGNED DATA MEMORY OR I/O REFERENCES Number of memory or I/O reads or writes that are misaligned. 0CH CODE READ Number of instruction reads; whether Individual 8-byte noncacheable instruction reads are the read is cacheable or noncacheable. counted. 0DH CODE TLB MISS Number of instruction reads that miss the code TLB whether the read is cacheable or noncacheable. Individual 8-byte noncacheable instruction reads are counted. 0EH CODE CACHE MISS Number of instruction reads that miss the internal code cache; whether the read is cacheable or noncacheable. Individual 8-byte noncacheable instruction reads are counted. A 2- or 4-byte access is misaligned when it crosses a 4byte boundary; an 8-byte access is misaligned when it crosses an 8-byte boundary. Ten byte accesses are treated as two separate accesses of 8 and 2 bytes each. Vol. 3B 19-259 PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. Mnemonic Event Name 0FH ANY SEGMENT REGISTER LOADED 10H Reserved 11H Reserved 12H Branches Description Comments Number of writes into any segment register in real or protected mode including the LDTR, GDTR, IDTR, and TR. Segment loads are caused by explicit segment register load instructions, far control transfers, and task switches. Far control transfers and task switches causing a privilege level change will signal this event twice. Interrupts and exceptions may initiate a far control transfer. Number of taken and not taken branches, including: conditional branches, jumps, calls, returns, software interrupts, and interrupt returns. Also counted as taken branches are serializing instructions, VERR and VERW instructions, some segment descriptor loads, hardware interrupts (including FLUSH#), and programmatic exceptions that invoke a trap or fault handler. The pipe is not necessarily flushed. The number of branches actually executed is measured, not the number of predicted branches. 13H BTB_HITS Number of BTB hits that occur. Hits are counted only for those instructions that are actually executed. 14H TAKEN_BRANCH_ OR_BTB_HIT Number of taken branches or BTB hits that occur. This event type is a logical OR of taken branches and BTB hits. It represents an event that may cause a hit in the BTB. Specifically, it is either a candidate for a space in the BTB or it is already in the BTB. 15H PIPELINE FLUSHES Number of pipeline flushes that occur The counter will not be incremented for serializing instructions (serializing instructions cause the prefetch queue to be flushed but will not trigger the Pipeline Flushed event counter) and software interrupts (software interrupts do not flush the pipeline). Pipeline flushes are caused by BTB misses on taken branches, mispredictions, exceptions, interrupts, and some segment descriptor loads. 16H INSTRUCTIONS_ EXECUTED Number of instructions executed (up to two per clock). Invocations of a fault handler are considered instructions. All hardware and software interrupts and exceptions will also cause the count to be incremented. Repeat prefixed string instructions will only increment this counter once despite the fact that the repeat loop executes the same instruction multiple times until the loop criteria is satisfied. This applies to all the Repeat string instruction prefixes (i.e., REP, REPE, REPZ, REPNE, and REPNZ). This counter will also only increment once per each HLT instruction executed regardless of how many cycles the processor remains in the HALT state. 17H INSTRUCTIONS_ EXECUTED_ V PIPE Number of instructions executed in the V_pipe. The event indicates the number of instructions that were paired. 18H BUS_CYCLE_ DURATION Number of clocks while a bus cycle is in progress. This event is the same as the 16H event except it only counts the number of instructions actually executed in the V-pipe. The count includes HLDA, AHOLD, and BOFF# clocks. This event measures bus use. 19H WRITE_BUFFER_ FULL_STALL_ DURATION 19-260 Vol. 3B Number of clocks while the pipeline is stalled due to full write buffers. Full write buffers stall data memory read misses, data memory write misses, and data memory write hits to Sstate lines. Stalls on I/O accesses are not included. PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. Mnemonic Event Name Description Comments Data TLB Miss processing is also included in the count. The pipeline stalls while a data memory read is in progress including attempts to read that are not bypassed while a line is being filled. 1AH WAITING_FOR_ DATA_MEMORY_ READ_STALL_ DURATION Number of clocks while the pipeline is stalled while waiting for data memory reads. 1BH STALL ON WRITE TO AN E- OR MSTATE LINE Number of stalls on writes to E- or Mstate lines. 1CH LOCKED BUS CYCLE Number of locked bus cycles that occur as the result of the LOCK prefix or LOCK instruction, page-table updates, and descriptor table updates. Only the read portion of the locked read-modify-write is counted. Split locked cycles (SCYC active) count as two separate accesses. Cycles restarted due to BOFF# are not re-counted. 1DH I/O READ OR WRITE CYCLE Number of bus cycles directed to I/O space. Misaligned I/O accesses will generate two bus cycles. Bus cycles restarted due to BOFF# are not re-counted. 1EH NONCACHEABLE_ MEMORY_READS Number of noncacheable instruction or data memory read bus cycles. Cycles restarted due to BOFF# are not re-counted. The count includes read cycles caused by TLB misses, but does not include read cycles to I/O space. 1FH PIPELINE_AGI_ STALLS Number of address generation interlock (AGI) stalls. An AGI occurring in both the U- and Vpipelines in the same clock signals this event twice. 20H Reserved 21H Reserved 22H FLOPS Number of floating-point operations that occur. An AGI occurs when the instruction in the execute stage of either of U- or V-pipelines is writing to either the index or base address register of an instruction in the D2 (address generation) stage of either the U- or V- pipelines. Number of floating-point adds, subtracts, multiplies, divides, remainders, and square roots are counted. The transcendental instructions consist of multiple adds and multiplies and will signal this event multiple times. Instructions generating the divide-by-zero, negative square root, special operand, or stack exceptions will not be counted. Instructions generating all other floating-point exceptions will be counted. The integer multiply instructions and other instructions which use the x87 FPU will be counted. 23H BREAKPOINT MATCH ON DR0 REGISTER Number of matches on register DR0 breakpoint. The counters is incremented regardless if the breakpoints are enabled or not. However, if breakpoints are not enabled, code breakpoint matches will not be checked for instructions executed in the V-pipe and will not cause this counter to be incremented. (They are checked on instruction executed in the U-pipe only when breakpoints are not enabled.) These events correspond to the signals driven on the BP[3:0] pins. Refer to Chapter 17, “Debug, Branch Profile, TSC, and Intel® Resource Director Technology (Intel® RDT) Features” for more information. 24H BREAKPOINT MATCH ON DR1 REGISTER Number of matches on register DR1 breakpoint. See comment for 23H event. Vol. 3B 19-261 PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. Mnemonic Event Name Description Comments 25H BREAKPOINT MATCH ON DR2 REGISTER Number of matches on register DR2 breakpoint. See comment for 23H event. 26H BREAKPOINT MATCH ON DR3 REGISTER Number of matches on register DR3 breakpoint. See comment for 23H event. 27H HARDWARE INTERRUPTS Number of taken INTR and NMI interrupts. 28H DATA_READ_OR_ WRITE Number of memory data reads and/or writes (internal data cache hit and miss combined). Split cycle reads and writes are counted individually. Data Memory Reads that are part of TLB miss processing are not included. These events may occur at a maximum of two per clock. I/O is not included. 29H DATA_READ_MISS OR_WRITE MISS Number of memory read and/or write accesses that miss the internal data cache, whether or not the access is cacheable or noncacheable. Additional reads to the same cache line after the first BRDY# of the burst line fill is returned but before the final (fourth) BRDY# has been returned, will not cause the counter to be incremented additional times. Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. 2AH BUS_OWNERSHIP_ LATENCY (Counter 0) The time from LRM bus ownership request to bus ownership granted (that is, the time from the earlier of a PBREQ (0), PHITM# or HITM# assertion to a PBGNT assertion) The ratio of the 2AH events counted on counter 0 and counter 1 is the average stall time due to bus ownership conflict. 2AH BUS OWNERSHIP TRANSFERS (Counter 1) The number of buss ownership transfers (that is, the number of PBREQ (0) assertions The ratio of the 2AH events counted on counter 0 and counter 1 is the average stall time due to bus ownership conflict. 2BH MMX_ INSTRUCTIONS_ EXECUTED_ U-PIPE (Counter 0) Number of MMX instructions executed in the U-pipe 2BH MMX_ INSTRUCTIONS_ EXECUTED_ V-PIPE (Counter 1) Number of MMX instructions executed in the V-pipe 2CH CACHE_MSTATE_LINE_ SHARING (Counter 0) Number of times a processor identified a hit to a modified line due to a memory access in the other processor (PHITM (O)) 2CH CACHE_LINE_ SHARING (Counter 1) Number of shared data lines in the L1 cache (PHIT (O)) 2DH EMMS_ INSTRUCTIONS_ EXECUTED (Counter 0) Number of EMMS instructions executed 19-262 Vol. 3B If the average memory latencies of the system are known, this event enables the user to count the Write Backs on PHITM(O) penalty and the Latency on Hit Modified(I) penalty. PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. 2DH Mnemonic Event Name Description Comments TRANSITIONS_ BETWEEN_MMX_ AND_FP_ INSTRUCTIONS (Counter 1) Number of transitions between MMX and floating-point instructions or vice versa This event counts the first floating-point instruction following an MMX instruction or first MMX instruction following a floating-point instruction. An even count indicates the processor is in MMX state. an odd count indicates it is in FP state. The count may be used to estimate the penalty in transitions between floating-point state and MMX state. 2EH BUS_UTILIZATION_ DUE_TO_ PROCESSOR_ ACTIVITY (Counter 0) Number of clocks the bus is busy due to the processor’s own activity (the bus activity that is caused by the processor) 2EH WRITES_TO_ NONCACHEABLE_ MEMORY (Counter 1) Number of write accesses to noncacheable memory 2FH SATURATING_ MMX_ INSTRUCTIONS_ EXECUTED (Counter 0) Number of saturating MMX instructions executed, independently of whether they actually saturated. 2FH SATURATIONS_ PERFORMED (Counter 1) Number of MMX instructions that used saturating arithmetic when at least one of its results actually saturated If an MMX instruction operating on 4 doublewords saturated in three out of the four results, the counter will be incremented by one only. 30H NUMBER_OF_ CYCLES_NOT_IN_ HALT_STATE (Counter 0) Number of cycles the processor is not idle due to HLT instruction This event will enable the user to calculate “net CPI”. Note that during the time that the processor is executing the HLT instruction, the Time-Stamp Counter is not disabled. Since this event is controlled by the Counter Controls CC0, CC1 it can be used to calculate the CPI at CPL=3, which the TSC cannot provide. 30H DATA_CACHE_ TLB_MISS_ STALL_DURATION (Counter 1) Number of clocks the pipeline is stalled due to a data cache translation lookaside buffer (TLB) miss 31H MMX_ INSTRUCTION_ DATA_READS (Counter 0) Number of MMX instruction data reads 31H MMX_ INSTRUCTION_ DATA_READ_ MISSES (Counter 1) Number of MMX instruction data read misses 32H FLOATING_POINT_S Number of clocks while pipe is stalled TALLS_DURATION due to a floating-point freeze (Counter 0) 32H TAKEN_BRANCHES (Counter 1) The count includes write cycles caused by TLB misses and I/O write cycles. Cycles restarted due to BOFF# are not re-counted. Number of taken branches Vol. 3B 19-263 PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. Mnemonic Event Name Description Comments 33H D1_STARVATION_ AND_FIFO_IS_ EMPTY (Counter 0) Number of times D1 stage cannot issue ANY instructions since the FIFO buffer is empty The D1 stage can issue 0, 1, or 2 instructions per clock if those are available in an instructions FIFO buffer. 33H D1_STARVATION_ AND_ONLY_ONE_ INSTRUCTION_IN_ FIFO (Counter 1) Number of times the D1 stage issues a single instruction (since the FIFO buffer had just one instruction ready) The D1 stage can issue 0, 1, or 2 instructions per clock if those are available in an instructions FIFO buffer. 34H MMX_ INSTRUCTION_ DATA_WRITES (Counter 0) Number of data writes caused by MMX instructions 34H MMX_ INSTRUCTION_ DATA_WRITE_ MISSES (Counter 1) Number of data write misses caused by MMX instructions 35H PIPELINE_ FLUSHES_DUE_ TO_WRONG_ BRANCH_ PREDICTIONS (Counter 0) Number of pipeline flushes due to wrong branch predictions resolved in either the E-stage or the WB-stage 35H PIPELINE_ FLUSHES_DUE_ TO_WRONG_ BRANCH_ PREDICTIONS_ RESOLVED_IN_ WB-STAGE (Counter 1) Number of pipeline flushes due to wrong branch predictions resolved in the WB-stage 36H MISALIGNED_ DATA_MEMORY_ REFERENCE_ON_ MMX_ INSTRUCTIONS (Counter 0) Number of misaligned data memory references when executing MMX instructions 36H PIPELINE_ ISTALL_FOR_MMX_ INSTRUCTION_ DATA_MEMORY_ READS (Counter 1) Number clocks during pipeline stalls caused by waits form MMX instruction data memory reads 19-264 Vol. 3B When combined with the previously defined events, Instruction Executed (16H) and Instruction Executed in the V-pipe (17H), this event enables the user to calculate the numbers of time pairing rules prevented issuing of two instructions. The count includes any pipeline flush due to a branch that the pipeline did not follow correctly. It includes cases where a branch was not in the BTB, cases where a branch was in the BTB but was mispredicted, and cases where a branch was correctly predicted but to the wrong address. Branches are resolved in either the Execute stage (E-stage) or the Writeback stage (WB-stage). In the later case, the misprediction penalty is larger by one clock. The difference between the 35H event count in counter 0 and counter 1 is the number of E-stage resolved branches. See note for event 35H (Counter 0). T3: PERFORMANCE MONITORING EVENTS Table 19-43. Events That Can Be Counted with Pentium Processor Performance Monitoring Counters (Contd.) Event Num. Mnemonic Event Name Description Comments 37H MISPREDICTED_ OR_ UNPREDICTED_ RETURNS (Counter 1) Number of returns predicted incorrectly or not predicted at all The count is the difference between the total number of executed returns and the number of returns that were correctly predicted. Only RET instructions are counted (for example, IRET instructions are not counted). 37H PREDICTED_ RETURNS (Counter 1) Number of predicted returns (whether they are predicted correctly and incorrectly Only RET instructions are counted (for example, IRET instructions are not counted). 38H MMX_MULTIPLY_ UNIT_INTERLOCK (Counter 0) Number of clocks the pipe is stalled since the destination of previous MMX multiply instruction is not ready yet The counter will not be incremented if there is another cause for a stall. For each occurrence of a multiply interlock, this event will be counted twice (if the stalled instruction comes on the next clock after the multiply) or by once (if the stalled instruction comes two clocks after the multiply). 38H MOVD/MOVQ_ STORE_STALL_ DUE_TO_ PREVIOUS_MMX_ OPERATION (Counter 1) Number of clocks a MOVD/MOVQ instruction store is stalled in D2 stage due to a previous MMX operation with a destination to be used in the store instruction. 39H RETURNS (Counter 0) Number or returns executed. Only RET instructions are counted; IRET instructions are not counted. Any exception taken on a RET instruction and any interrupt recognized by the processor on the instruction boundary prior to the execution of the RET instruction will also cause this counter to be incremented. 39H Reserved 3AH BTB_FALSE_ ENTRIES (Counter 0) Number of false entries in the Branch Target Buffer False entries are causes for misprediction other than a wrong prediction. 3AH BTB_MISS_ PREDICTION_ON_ NOT-TAKEN_ BRANCH (Counter 1) Number of times the BTB predicted a not-taken branch as taken 3BH FULL_WRITE_ BUFFER_STALL_ DURATION_ WHILE_ EXECUTING_MMX_I NSTRUCTIONS (Counter 0) Number of clocks while the pipeline is stalled due to full write buffers while executing MMX instructions 3BH STALL_ON_MMX_ INSTRUCTION_ WRITE_TO E-_OR_ M-STATE_LINE (Counter 1) Number of clocks during stalls on MMX instructions writing to E- or M-state lines Vol. 3B 19-265 PERFORMANCE MONITORING EVENTS 19-266 Vol. 3B 16.Updates to Chapter 24, Volume 3C Change bars and green text show changes to Chapter 24 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to this chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 27 CHAPTER 24 VIRTUAL MACHINE CONTROL STRUCTURES 24.1 OVERVIEW A logical processor uses virtual-machine control data structures (VMCSs) while it is in VMX operation. These manage transitions into and out of VMX non-root operation (VM entries and VM exits) as well as processor behavior in VMX non-root operation. This structure is manipulated by the new instructions VMCLEAR, VMPTRLD, VMREAD, and VMWRITE. A VMM can use a different VMCS for each virtual machine that it supports. For a virtual machine with multiple logical processors (virtual processors), the VMM can use a different VMCS for each virtual processor. A logical processor associates a region in memory with each VMCS. This region is called the VMCS region.1 Software references a specific VMCS using the 64-bit physical address of the region (a VMCS pointer). VMCS pointers must be aligned on a 4-KByte boundary (bits 11:0 must be zero). These pointers must not set bits beyond the processor’s physical-address width.2,3 A logical processor may maintain a number of VMCSs that are active. The processor may optimize VMX operation by maintaining the state of an active VMCS in memory, on the processor, or both. At any given time, at most one of the active VMCSs is the current VMCS. (This document frequently uses the term “the VMCS” to refer to the current VMCS.) The VMLAUNCH, VMREAD, VMRESUME, and VMWRITE instructions operate only on the current VMCS. The following items describe how a logical processor determines which VMCSs are active and which is current: • The memory operand of the VMPTRLD instruction is the address of a VMCS. After execution of the instruction, that VMCS is both active and current on the logical processor. Any other VMCS that had been active remains so, but no other VMCS is current. • The VMCS link pointer field in the current VMCS (see Section 24.4.2) is itself the address of a VMCS. If VM entry is performed successfully with the 1-setting of the “VMCS shadowing” VM-execution control, the VMCS referenced by the VMCS link pointer field becomes active on the logical processor. The identity of the current VMCS does not change. • The memory operand of the VMCLEAR instruction is also the address of a VMCS. After execution of the instruction, that VMCS is neither active nor current on the logical processor. If the VMCS had been current on the logical processor, the logical processor no longer has a current VMCS. The VMPTRST instruction stores the address of the logical processor’s current VMCS into a specified memory location (it stores the value FFFFFFFF_FFFFFFFFH if there is no current VMCS). The launch state of a VMCS determines which VM-entry instruction should be used with that VMCS: the VMLAUNCH instruction requires a VMCS whose launch state is “clear”; the VMRESUME instruction requires a VMCS whose launch state is “launched”. A logical processor maintains a VMCS’s launch state in the corresponding VMCS region. The following items describe how a logical processor manages the launch state of a VMCS: • If the launch state of the current VMCS is “clear”, successful execution of the VMLAUNCH instruction changes the launch state to “launched”. • The memory operand of the VMCLEAR instruction is the address of a VMCS. After execution of the instruction, the launch state of that VMCS is “clear”. • There are no other ways to modify the launch state of a VMCS (it cannot be modified using VMWRITE) and there is no direct way to discover it (it cannot be read using VMREAD). 1. The amount of memory required for a VMCS region is at most 4 KBytes. The exact size is implementation specific and can be determined by consulting the VMX capability MSR IA32_VMX_BASIC to determine the size of the VMCS region (see Appendix A.1). 2. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 3. If IA32_VMX_BASIC[48] is read as 1, these pointers must not set any bits in the range 63:32; see Appendix A.1. Vol. 3C 24-1 VIRTUAL MACHINE CONTROL STRUCTURES Figure 24-1 illustrates the different states of a VMCS. It uses “X” to refer to the VMCS and “Y” to refer to any other VMCS. Thus: “VMPTRLD X” always makes X current and active; “VMPTRLD Y” always makes X not current (because it makes Y current); VMLAUNCH makes the launch state of X “launched” if X was current and its launch state was “clear”; and VMCLEAR X always makes X inactive and not current and makes its launch state “clear”. The figure does not illustrate operations that do not modify the VMCS state relative to these parameters (e.g., execution of VMPTRLD X when X is already current). Note that VMCLEAR X makes X “inactive, not current, and clear,” even if X’s current state is not defined (e.g., even if X has not yet been initialized). See Section 24.11.3. VMCLEAR X VMCLEAR X X VMLAUNCH VMPTRLD Y R Anything Else Active Not Current Launched VMPTRLD X VMCLEAR X A LE Active Current Clear Inactive Not Current Clear C VM VMPTRLD Y VMPTRLD X V VM M P CL TR EA LD R X X Active Not Current Clear Active Current Launched Figure 24-1. States of VMCS X Because a shadow VMCS (see Section 24.10) cannot be used for VM entry, the launch state of a shadow VMCS is not meaningful. Figure 24-1 does not illustrate all the ways in which a shadow VMCS may be made active. 24.2 FORMAT OF THE VMCS REGION A VMCS region comprises up to 4-KBytes.1 The format of a VMCS region is given in Table 24-1. Table 24-1. Format of the VMCS Region Byte Offset 0 Contents Bits 30:0: VMCS revision identifier Bit 31: shadow-VMCS indicator (see Section 24.10) 4 VMX-abort indicator 8 VMCS data (implementation-specific format) The first 4 bytes of the VMCS region contain the VMCS revision identifier at bits 30:0.2 Processors that maintain VMCS data in different formats (see below) use different VMCS revision identifiers. These identifiers enable soft- 1. The exact size is implementation specific and can be determined by consulting the VMX capability MSR IA32_VMX_BASIC to determine the size of the VMCS region (see Appendix A.1). 24-2 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES ware to avoid using a VMCS region formatted for one processor on a processor that uses a different format.1 Bit 31 of this 4-byte region indicates whether the VMCS is a shadow VMCS (see Section 24.10). Software should write the VMCS revision identifier to the VMCS region before using that region for a VMCS. The VMCS revision identifier is never written by the processor; VMPTRLD fails if its operand references a VMCS region whose VMCS revision identifier differs from that used by the processor. (VMPTRLD also fails if the shadow-VMCS indicator is 1 and the processor does not support the 1-setting of the “VMCS shadowing” VM-execution control; see Section 24.6.2) Software can discover the VMCS revision identifier that a processor uses by reading the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). Software should clear or set the shadow-VMCS indicator depending on whether the VMCS is to be an ordinary VMCS or a shadow VMCS (see Section 24.10). VMPTRLD fails if the shadow-VMCS indicator is set and the processor does not support the 1-setting of the “VMCS shadowing” VM-execution control. Software can discover support for this setting by reading the VMX capability MSR IA32_VMX_PROCBASED_CTLS2 (see Appendix A.3.3). The next 4 bytes of the VMCS region are used for the VMX-abort indicator. The contents of these bits do not control processor operation in any way. A logical processor writes a non-zero value into these bits if a VMX abort occurs (see Section 27.7). Software may also write into this field. The remainder of the VMCS region is used for VMCS data (those parts of the VMCS that control VMX non-root operation and the VMX transitions). The format of these data is implementation-specific. VMCS data are discussed in Section 24.3 through Section 24.9. To ensure proper behavior in VMX operation, software should maintain the VMCS region and related structures (enumerated in Section 24.11.4) in writeback cacheable memory. Future implementations may allow or require a different memory type2. Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). 24.3 ORGANIZATION OF VMCS DATA The VMCS data are organized into six logical groups: • Guest-state area. Processor state is saved into the guest-state area on VM exits and loaded from there on VM entries. • • Host-state area. Processor state is loaded from the host-state area on VM exits. • • • VM-execution control fields. These fields control processor behavior in VMX non-root operation. They determine in part the causes of VM exits. VM-exit control fields. These fields control VM exits. VM-entry control fields. These fields control VM entries. VM-exit information fields. These fields receive information on VM exits and describe the cause and the nature of VM exits. On some processors, these fields are read-only.3 The VM-execution control fields, the VM-exit control fields, and the VM-entry control fields are sometimes referred to collectively as VMX controls. 2. Earlier versions of this manual specified that the VMCS revision identifier was a 32-bit field. For all processors produced prior to this change, bit 31 of the VMCS revision identifier was 0. 1. Logical processors that use the same VMCS revision identifier use the same size for VMCS regions. 2. Alternatively, software may map any of these regions or structures with the UC memory type. Doing so is strongly discouraged unless necessary as it will cause the performance of transitions using those structures to suffer significantly. In addition, the processor will continue to use the memory type reported in the VMX capability MSR IA32_VMX_BASIC with exceptions noted in Appendix A.1. 3. Software can discover whether these fields can be written by reading the VMX capability MSR IA32_VMX_MISC (see Appendix A.6). Vol. 3C 24-3 VIRTUAL MACHINE CONTROL STRUCTURES 24.4 GUEST-STATE AREA This section describes fields contained in the guest-state area of the VMCS. VM entries load processor state from these fields and VM exits store processor state into these fields. See Section 26.3.2 and Section 27.3 for details. 24.4.1 Guest Register State The following fields in the guest-state area correspond to processor registers: • Control registers CR0, CR3, and CR4 (64 bits each; 32 bits on processors that do not support Intel 64 architecture). • • • Debug register DR7 (64 bits; 32 bits on processors that do not support Intel 64 architecture). RSP, RIP, and RFLAGS (64 bits each; 32 bits on processors that do not support Intel 64 architecture).1 The following fields for each of the registers CS, SS, DS, ES, FS, GS, LDTR, and TR: — Selector (16 bits). — Base address (64 bits; 32 bits on processors that do not support Intel 64 architecture). The base-address fields for CS, SS, DS, and ES have only 32 architecturally-defined bits; nevertheless, the corresponding VMCS fields have 64 bits on processors that support Intel 64 architecture. — Segment limit (32 bits). The limit field is always a measure in bytes. — Access rights (32 bits). The format of this field is given in Table 24-2 and detailed as follows: • The low 16 bits correspond to bits 23:8 of the upper 32 bits of a 64-bit segment descriptor. While bits 19:16 of code-segment and data-segment descriptors correspond to the upper 4 bits of the segment limit, the corresponding bits (bits 11:8) are reserved in this VMCS field. • Bit 16 indicates an unusable segment. Attempts to use such a segment fault except in 64-bit mode. In general, a segment register is unusable if it has been loaded with a null selector.2 • Bits 31:17 are reserved. Table 24-2. Format of Access Rights Bit Position(s) Field 3:0 Segment type 4 S — Descriptor type (0 = system; 1 = code or data) 6:5 DPL — Descriptor privilege level 7 P — Segment present 11:8 Reserved 12 AVL — Available for use by system software 1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For processors that do not support Intel 64 architecture, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to lower 32 bits of the indicated register. 2. There are a few exceptions to this statement. For example, a segment with a non-null selector may be unusable following a task switch that fails after its commit point; see “Interrupt 10—Invalid TSS Exception (#TS)” in Section 6.14, “Exception and Interrupt Handling in 64-bit Mode,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. In contrast, the TR register is usable after processor reset despite having a null selector; see Table 10-1 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 24-4 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES Table 24-2. Format of Access Rights (Contd.) Bit Position(s) Field 13 Reserved (except for CS) L — 64-bit mode active (for CS only) 14 D/B — Default operation size (0 = 16-bit segment; 1 = 32-bit segment) 15 G — Granularity 16 Segment unusable (0 = usable; 1 = unusable) 31:17 Reserved The base address, segment limit, and access rights compose the “hidden” part (or “descriptor cache”) of each segment register. These data are included in the VMCS because it is possible for a segment register’s descriptor cache to be inconsistent with the segment descriptor in memory (in the GDT or the LDT) referenced by the segment register’s selector. The value of the DPL field for SS is always equal to the logical processor’s current privilege level (CPL).1 On some processors, executions of VMWRITE ignore attempts to write non-zero values to any of bits 11:8 or bits 31:17. On such processors, VMREAD always returns 0 for those bits, and VM entry treats those bits as if they were all 0 (see Section 26.3.1.2). • The following fields for each of the registers GDTR and IDTR: — Base address (64 bits; 32 bits on processors that do not support Intel 64 architecture). — Limit (32 bits). The limit fields contain 32 bits even though these fields are specified as only 16 bits in the architecture. • The following MSRs: — IA32_DEBUGCTL (64 bits) — IA32_SYSENTER_CS (32 bits) — IA32_SYSENTER_ESP and IA32_SYSENTER_EIP (64 bits; 32 bits on processors that do not support Intel 64 architecture) — IA32_PERF_GLOBAL_CTRL (64 bits). This field is supported only on processors that support the 1-setting of the “load IA32_PERF_GLOBAL_CTRL” VM-entry control. — IA32_PAT (64 bits). This field is supported only on processors that support either the 1-setting of the “load IA32_PAT” VM-entry control or that of the “save IA32_PAT” VM-exit control. — IA32_EFER (64 bits). This field is supported only on processors that support either the 1-setting of the “load IA32_EFER” VM-entry control or that of the “save IA32_EFER” VM-exit control. — IA32_BNDCFGS (64 bits). This field is supported only on processors that support either the 1-setting of the “load IA32_BNDCFGS” VM-entry control or that of the “clear IA32_BNDCFGS” VM-exit control. — IA32_RTIT_CTL (64 bits). This field is supported only on processors that support either the 1-setting of the “load IA32_RTIT_CTL” VM-entry control or that of the “clear IA32_RTIT_CTL” VM-exit control. • The register SMBASE (32 bits). This register contains the base address of the logical processor’s SMRAM image. 1. In protected mode, CPL is also associated with the RPL field in the CS selector. However, the RPL fields are not meaningful in realaddress mode or in virtual-8086 mode. Vol. 3C 24-5 VIRTUAL MACHINE CONTROL STRUCTURES 24.4.2 Guest Non-Register State In addition to the register state described in Section 24.4.1, the guest-state area includes the following fields that characterize guest state but which do not correspond to processor registers: • Activity state (32 bits). This field identifies the logical processor’s activity state. When a logical processor is executing instructions normally, it is in the active state. Execution of certain instructions and the occurrence of certain events may cause a logical processor to transition to an inactive state in which it ceases to execute instructions. The following activity states are defined:1 — 0: Active. The logical processor is executing instructions normally. — 1: HLT. The logical processor is inactive because it executed the HLT instruction. — 2: Shutdown. The logical processor is inactive because it incurred a triple fault2 or some other serious error. — 3: Wait-for-SIPI. The logical processor is inactive because it is waiting for a startup-IPI (SIPI). Future processors may include support for other activity states. Software should read the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine what activity states are supported. • Interruptibility state (32 bits). The IA-32 architecture includes features that permit certain events to be blocked for a period of time. This field contains information about such blocking. Details and the format of this field are given in Table 24-3. Table 24-3. Format of Interruptibility State Bit Position(s) Bit Name Notes 0 Blocking by STI See the “STI—Set Interrupt Flag” section in Chapter 4 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. Execution of STI with RFLAGS.IF = 0 blocks maskable interrupts on the instruction boundary following its execution.1 Setting this bit indicates that this blocking is in effect. 1 Blocking by MOV SS See Section 6.8.3, “Masking Exceptions and Interrupts When Switching Stacks,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Execution of a MOV to SS or a POP to SS blocks or suppresses certain debug exceptions as well as interrupts (maskable and nonmaskable) on the instruction boundary following its execution. Setting this bit indicates that this blocking is in effect.2 This document uses the term “blocking by MOV SS,” but it applies equally to POP SS. 2 Blocking by SMI See Section 34.2, “System Management Interrupt (SMI).” System-management interrupts (SMIs) are disabled while the processor is in system-management mode (SMM). Setting this bit indicates that blocking of SMIs is in effect. 3 Blocking by NMI See Section 6.7.1, “Handling Multiple NMIs,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A and Section 34.8, “NMI Handling While in SMM.” Delivery of a non-maskable interrupt (NMI) or a system-management interrupt (SMI) blocks subsequent NMIs until the next execution of IRET. See Section 25.3 for how this behavior of IRET may change in VMX non-root operation. Setting this bit indicates that blocking of NMIs is in effect. Clearing this bit does not imply that NMIs are not (temporarily) blocked for other reasons. If the “virtual NMIs” VM-execution control (see Section 24.6.1) is 1, this bit does not control the blocking of NMIs. Instead, it refers to “virtual-NMI blocking” (the fact that guest software is not ready for an NMI). 1. Execution of the MWAIT instruction may put a logical processor into an inactive state. However, this VMCS field never reflects this state. See Section 27.1. 2. A triple fault occurs when a logical processor encounters an exception while attempting to deliver a double fault. 24-6 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES Table 24-3. Format of Interruptibility State (Contd.) Bit Position(s) Bit Name Notes 4 Enclave interruption Set to 1 if the VM exit occurred while the logical processor was in enclave mode. Such VM exits includes those caused by interrupts, non-maskable interrupts, systemmanagement interrupts, INIT signals, and exceptions occurring in enclave mode as well as exceptions encountered during the delivery of such events incident to enclave mode. A VM exit that is incident to delivery of an event injected by VM entry leaves this bit unmodified. 31:5 Reserved VM entry will fail if these bits are not 0. See Section 26.3.1.5. NOTES: 1. Nonmaskable interrupts and system-management interrupts may also be inhibited on the instruction boundary following such an execution of STI. 2. System-management interrupts may also be inhibited on the instruction boundary following such an execution of MOV or POP. • Pending debug exceptions (64 bits; 32 bits on processors that do not support Intel 64 architecture). IA-32 processors may recognize one or more debug exceptions without immediately delivering them.1 This field contains information about such exceptions. This field is described in Table 24-4. Table 24-4. Format of Pending-Debug-Exceptions Bit Position(s) Bit Name Notes 3:0 B3 – B0 When set, each of these bits indicates that the corresponding breakpoint condition was met. Any of these bits may be set even if the corresponding enabling bit in DR7 is not set. 11:4 Reserved VM entry fails if these bits are not 0. See Section 26.3.1.5. 12 Enabled breakpoint When set, this bit indicates that at least one data or I/O breakpoint was met and was enabled in DR7. 13 Reserved VM entry fails if this bit is not 0. See Section 26.3.1.5. 14 BS When set, this bit indicates that a debug exception would have been triggered by single-step execution mode. 15 Reserved VM entry fails if this bit is not 0. See Section 26.3.1.5. 16 RTM When set, this bit indicates that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled (see Section 16.3.7, “RTM-Enabled Debugger Support,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1).1 63:17 Reserved VM entry fails if these bits are not 0. See Section 26.3.1.5. Bits 63:32 exist only on processors that support Intel 64 architecture. NOTES: 1. In general, the format of this field matches that of DR6. However, DR6 clears bit 16 to indicate an RTM-related exception, while this field sets the bit to indicate that condition. 1. For example, execution of a MOV to SS or a POP to SS may inhibit some debug exceptions for one instruction. See Section 6.8.3 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. In addition, certain events incident to an instruction (for example, an INIT signal) may take priority over debug traps generated by that instruction. See Table 6-2 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Vol. 3C 24-7 VIRTUAL MACHINE CONTROL STRUCTURES • VMCS link pointer (64 bits). If the “VMCS shadowing” VM-execution control is 1, the VMREAD and VMWRITE instructions access the VMCS referenced by this pointer (see Section 24.10). Otherwise, software should set this field to FFFFFFFF_FFFFFFFFH to avoid VM-entry failures (see Section 26.3.1.5). • VMX-preemption timer value (32 bits). This field is supported only on processors that support the 1-setting of the “activate VMX-preemption timer” VM-execution control. This field contains the value that the VMXpreemption timer will use following the next VM entry with that setting. See Section 25.5.1 and Section 26.7.4. • Page-directory-pointer-table entries (PDPTEs; 64 bits each). These four (4) fields (PDPTE0, PDPTE1, PDPTE2, and PDPTE3) are supported only on processors that support the 1-setting of the “enable EPT” VMexecution control. They correspond to the PDPTEs referenced by CR3 when PAE paging is in use (see Section 4.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). They are used only if the “enable EPT” VM-execution control is 1. • Guest interrupt status (16 bits). This field is supported only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. It characterizes part of the guest’s virtual-APIC state and does not correspond to any processor or APIC registers. It comprises two 8-bit subfields: — Requesting virtual interrupt (RVI). This is the low byte of the guest interrupt status. The processor treats this value as the vector of the highest priority virtual interrupt that is requesting service. (The value 0 implies that there is no such interrupt.) — Servicing virtual interrupt (SVI). This is the high byte of the guest interrupt status. The processor treats this value as the vector of the highest priority virtual interrupt that is in service. (The value 0 implies that there is no such interrupt.) See Chapter 29 for more information on the use of this field. • PML index (16 bits). This field is supported only on processors that support the 1-setting of the “enable PML” VM-execution control. It contains the logical index of the next entry in the page-modification log. Because the page-modification log comprises 512 entries, the PML index is typically a value in the range 0–511. Details of the page-modification log and use of the PML index are given in Section 28.2.6. 24.5 HOST-STATE AREA This section describes fields contained in the host-state area of the VMCS. As noted earlier, processor state is loaded from these fields on every VM exit (see Section 27.5). All fields in the host-state area correspond to processor registers: • • • CR0, CR3, and CR4 (64 bits each; 32 bits on processors that do not support Intel 64 architecture). • Base-address fields for FS, GS, TR, GDTR, and IDTR (64 bits each; 32 bits on processors that do not support Intel 64 architecture). • The following MSRs: RSP and RIP (64 bits each; 32 bits on processors that do not support Intel 64 architecture). Selector fields (16 bits each) for the segment registers CS, SS, DS, ES, FS, GS, and TR. There is no field in the host-state area for the LDTR selector. — IA32_SYSENTER_CS (32 bits) — IA32_SYSENTER_ESP and IA32_SYSENTER_EIP (64 bits; 32 bits on processors that do not support Intel 64 architecture). — IA32_PERF_GLOBAL_CTRL (64 bits). This field is supported only on processors that support the 1-setting of the “load IA32_PERF_GLOBAL_CTRL” VM-exit control. — IA32_PAT (64 bits). This field is supported only on processors that support the 1-setting of the “load IA32_PAT” VM-exit control. — IA32_EFER (64 bits). This field is supported only on processors that support the 1-setting of the “load IA32_EFER” VM-exit control. In addition to the state identified here, some processor state components are loaded with fixed values on every VM exit; there are no fields corresponding to these components in the host-state area. See Section 27.5 for details of how state is loaded on VM exits. 24-8 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES 24.6 VM-EXECUTION CONTROL FIELDS The VM-execution control fields govern VMX non-root operation. These are described in Section 24.6.1 through Section 24.6.8. 24.6.1 Pin-Based VM-Execution Controls The pin-based VM-execution controls constitute a 32-bit vector that governs the handling of asynchronous events (for example: interrupts).1 Table 24-5 lists the controls. See Chapter 27 for how these controls affect processor behavior in VMX non-root operation. Table 24-5. Definitions of Pin-Based VM-Execution Controls Bit Position(s) Name Description 0 External-interrupt exiting If this control is 1, external interrupts cause VM exits. Otherwise, they are delivered normally through the guest interrupt-descriptor table (IDT). If this control is 1, the value of RFLAGS.IF does not affect interrupt blocking. 3 NMI exiting If this control is 1, non-maskable interrupts (NMIs) cause VM exits. Otherwise, they are delivered normally using descriptor 2 of the IDT. This control also determines interactions between IRET and blocking by NMI (see Section 25.3). 5 Virtual NMIs If this control is 1, NMIs are never blocked and the “blocking by NMI” bit (bit 3) in the interruptibility-state field indicates “virtual-NMI blocking” (see Table 24-3). This control also interacts with the “NMI-window exiting” VM-execution control (see Section 24.6.2). 6 Activate VMXpreemption timer If this control is 1, the VMX-preemption timer counts down in VMX non-root operation; see Section 25.5.1. A VM exit occurs when the timer counts down to zero; see Section 25.2. 7 Process posted interrupts If this control is 1, the processor treats interrupts with the posted-interrupt notification vector (see Section 24.6.8) specially, updating the virtual-APIC page with posted-interrupt requests (see Section 29.6). All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_PINBASED_CTLS and IA32_VMX_TRUE_PINBASED_CTLS (see Appendix A.3.1) to determine how to set reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.1). The first processors to support the virtual-machine extensions supported only the 1-settings of bits 1, 2, and 4. The VMX capability MSR IA32_VMX_PINBASED_CTLS will always report that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_PINBASED_CTLS MSR, and software should consult this MSR to discover support for the 0settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1. 24.6.2 Processor-Based VM-Execution Controls The processor-based VM-execution controls constitute two 32-bit vectors that govern the handling of synchronous events, mainly those caused by the execution of specific instructions.2 These are the primary processor-based VM-execution controls and the secondary processor-based VM-execution controls. Table 24-6 lists the primary processor-based VM-execution controls. See Chapter 25 for more details of how these controls affect processor behavior in VMX non-root operation. 1. Some asynchronous events cause VM exits regardless of the settings of the pin-based VM-execution controls (see Section 25.2). 2. Some instructions cause VM exits regardless of the settings of the processor-based VM-execution controls (see Section 25.1.2), as do task switches (see Section 25.2). Vol. 3C 24-9 VIRTUAL MACHINE CONTROL STRUCTURES Table 24-6. Definitions of Primary Processor-Based VM-Execution Controls Bit Position(s) Name Description 2 Interrupt-window exiting If this control is 1, a VM exit occurs at the beginning of any instruction if RFLAGS.IF = 1 and there are no other blocking of interrupts (see Section 24.4.2). 3 Use TSC offsetting This control determines whether executions of RDTSC, executions of RDTSCP, and executions of RDMSR that read from the IA32_TIME_STAMP_COUNTER MSR return a value modified by the TSC offset field (see Section 24.6.5 and Section 25.3). 7 HLT exiting This control determines whether executions of HLT cause VM exits. 9 INVLPG exiting This determines whether executions of INVLPG cause VM exits. 10 MWAIT exiting This control determines whether executions of MWAIT cause VM exits. 11 RDPMC exiting This control determines whether executions of RDPMC cause VM exits. 12 RDTSC exiting This control determines whether executions of RDTSC and RDTSCP cause VM exits. 15 CR3-load exiting In conjunction with the CR3-target controls (see Section 24.6.7), this control determines whether executions of MOV to CR3 cause VM exits. See Section 25.1.3. The first processors to support the virtual-machine extensions supported only the 1-setting of this control. 16 CR3-store exiting This control determines whether executions of MOV from CR3 cause VM exits. The first processors to support the virtual-machine extensions supported only the 1-setting of this control. 19 CR8-load exiting This control determines whether executions of MOV to CR8 cause VM exits. 20 CR8-store exiting This control determines whether executions of MOV from CR8 cause VM exits. 21 Use TPR shadow Setting this control to 1 enables TPR virtualization and other APIC-virtualization features. See Chapter 29. 22 NMI-window exiting If this control is 1, a VM exit occurs at the beginning of any instruction if there is no virtualNMI blocking (see Section 24.4.2). 23 MOV-DR exiting This control determines whether executions of MOV DR cause VM exits. 24 Unconditional I/O exiting This control determines whether executions of I/O instructions (IN, INS/INSB/INSW/INSD, OUT, and OUTS/OUTSB/OUTSW/OUTSD) cause VM exits. 25 Use I/O bitmaps This control determines whether I/O bitmaps are used to restrict executions of I/O instructions (see Section 24.6.4 and Section 25.1.3). For this control, “0” means “do not use I/O bitmaps” and “1” means “use I/O bitmaps.” If the I/O bitmaps are used, the setting of the “unconditional I/O exiting” control is ignored. 27 Monitor trap flag If this control is 1, the monitor trap flag debugging feature is enabled. See Section 25.5.2. 28 Use MSR bitmaps This control determines whether MSR bitmaps are used to control execution of the RDMSR and WRMSR instructions (see Section 24.6.9 and Section 25.1.3). For this control, “0” means “do not use MSR bitmaps” and “1” means “use MSR bitmaps.” If the MSR bitmaps are not used, all executions of the RDMSR and WRMSR instructions cause VM exits. 29 MONITOR exiting This control determines whether executions of MONITOR cause VM exits. 30 PAUSE exiting This control determines whether executions of PAUSE cause VM exits. 31 Activate secondary This control determines whether the secondary processor-based VM-execution controls are controls used. If this control is 0, the logical processor operates as if all the secondary processor-based VM-execution controls were also 0. 24-10 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_PROCBASED_CTLS and IA32_VMX_TRUE_PROCBASED_CTLS (see Appendix A.3.2) to determine how to set reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.1). The first processors to support the virtual-machine extensions supported only the 1-settings of bits 1, 4–6, 8, 13– 16, and 26. The VMX capability MSR IA32_VMX_PROCBASED_CTLS will always report that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_PROCBASED_CTLS MSR, and software should consult this MSR to discover support for the 0settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1. Bit 31 of the primary processor-based VM-execution controls determines whether the secondary processor-based VM-execution controls are used. If that bit is 0, VM entry and VMX non-root operation function as if all the secondary processor-based VM-execution controls were 0. Processors that support only the 0-setting of bit 31 of the primary processor-based VM-execution controls do not support the secondary processor-based VM-execution controls. Table 24-7 lists the secondary processor-based VM-execution controls. See Chapter 25 for more details of how these controls affect processor behavior in VMX non-root operation. Table 24-7. Definitions of Secondary Processor-Based VM-Execution Controls Bit Position(s) Name Description 0 Virtualize APIC accesses If this control is 1, the logical processor treats specially accesses to the page with the APICaccess address. See Section 29.4. 1 Enable EPT If this control is 1, extended page tables (EPT) are enabled. See Section 28.2. 2 Descriptor-table exiting This control determines whether executions of LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, and STR cause VM exits. 3 Enable RDTSCP If this control is 0, any execution of RDTSCP causes an invalid-opcode exception (#UD). 4 Virtualize x2APIC mode If this control is 1, the logical processor treats specially RDMSR and WRMSR to APIC MSRs (in the range 800H–8FFH). See Section 29.5. 5 Enable VPID If this control is 1, cached translations of linear addresses are associated with a virtualprocessor identifier (VPID). See Section 28.1. 6 WBINVD exiting This control determines whether executions of WBINVD cause VM exits. 7 Unrestricted guest This control determines whether guest software may run in unpaged protected mode or in realaddress mode. 8 APIC-register virtualization If this control is 1, the logical processor virtualizes certain APIC accesses. See Section 29.4 and Section 29.5. 9 Virtual-interrupt delivery This controls enables the evaluation and delivery of pending virtual interrupts as well as the emulation of writes to the APIC registers that control interrupt prioritization. 10 PAUSE-loop exiting This control determines whether a series of executions of PAUSE can cause a VM exit (see Section 24.6.13 and Section 25.1.3). 11 RDRAND exiting This control determines whether executions of RDRAND cause VM exits. 12 Enable INVPCID If this control is 0, any execution of INVPCID causes a #UD. 13 Enable VM functions Setting this control to 1 enables use of the VMFUNC instruction in VMX non-root operation. See Section 25.5.6. 14 VMCS shadowing If this control is 1, executions of VMREAD and VMWRITE in VMX non-root operation may access a shadow VMCS (instead of causing VM exits). See Section 24.10 and Section 30.3. 15 Enable ENCLS exiting If this control is 1, executions of ENCLS consult the ENCLS-exiting bitmap to determine whether the instruction causes a VM exit. See Section 24.6.16 and Section 25.1.3. 16 RDSEED exiting This control determines whether executions of RDSEED cause VM exits. 17 Enable PML If this control is 1, an access to a guest-physical address that sets an EPT dirty bit first adds an entry to the page-modification log. See Section 28.2.6. Vol. 3C 24-11 VIRTUAL MACHINE CONTROL STRUCTURES Table 24-7. Definitions of Secondary Processor-Based VM-Execution Controls (Contd.) Bit Position(s) Name Description 18 EPT-violation #VE If this control is 1, EPT violations may cause virtualization exceptions (#VE) instead of VM exits. See Section 25.5.7. 19 Conceal VMX from PT If this control is 1, Intel Processor Trace suppresses from PIPs an indication that the processor was in VMX non-root operation and omits a VMCS packet from any PSB+ produced in VMX nonroot operation (see Chapter 35). 20 Enable XSAVES/XRSTORS If this control is 0, any execution of XSAVES or XRSTORS causes a #UD. 22 Mode-based If this control is 1, EPT execute permissions are based on whether the linear address being execute control for accessed is supervisor mode or user mode. See Chapter 28. EPT 23 Sub-page write permissions for EPT 24 Intel PT uses guest If this control is 1, all output addresses used by Intel Processor Trace are treated as guestphysical addresses physical addresses and translated using EPT. See Section 25.5.4. 25 Use TSC scaling This control determines whether executions of RDTSC, executions of RDTSCP, and executions of RDMSR that read from the IA32_TIME_STAMP_COUNTER MSR return a value modified by the TSC multiplier field (see Section 24.6.5 and Section 25.3). 26 Enable user wait and pause If this control is 0, any execution of TPAUSE, UMONITOR, or UMWAIT causes a #UD. 28 Enable ENCLV exiting If this control is 1, executions of ENCLV consult the ENCLV-exiting bitmap to determine whether the instruction causes a VM exit. See Section 24.6.17 and Section 25.1.3. If this control is 1, EPT write permissions may be specified at the granularity of 128 bytes. See Section 28.2.4. All other bits in this field are reserved to 0. Software should consult the VMX capability MSR IA32_VMX_PROCBASED_CTLS2 (see Appendix A.3.3) to determine which bits may be set to 1. Failure to clear reserved bits causes subsequent VM entries to fail (see Section 26.2.1.1). 24.6.3 Exception Bitmap The exception bitmap is a 32-bit field that contains one bit for each exception. When an exception occurs, its vector is used to select a bit in this field. If the bit is 1, the exception causes a VM exit. If the bit is 0, the exception is delivered normally through the IDT, using the descriptor corresponding to the exception’s vector. Whether a page fault (exception with vector 14) causes a VM exit is determined by bit 14 in the exception bitmap as well as the error code produced by the page fault and two 32-bit fields in the VMCS (the page-fault error-code mask and page-fault error-code match). See Section 25.2 for details. 24.6.4 I/O-Bitmap Addresses The VM-execution control fields include the 64-bit physical addresses of I/O bitmaps A and B (each of which are 4 KBytes in size). I/O bitmap A contains one bit for each I/O port in the range 0000H through 7FFFH; I/O bitmap B contains bits for ports in the range 8000H through FFFFH. A logical processor uses these bitmaps if and only if the “use I/O bitmaps” control is 1. If the bitmaps are used, execution of an I/O instruction causes a VM exit if any bit in the I/O bitmaps corresponding to a port it accesses is 1. See Section 25.1.3 for details. If the bitmaps are used, their addresses must be 4-KByte aligned. 24.6.5 Time-Stamp Counter Offset and Multiplier The VM-execution control fields include a 64-bit TSC-offset field. If the “RDTSC exiting” control is 0 and the “use TSC offsetting” control is 1, this field controls executions of the RDTSC and RDTSCP instructions. It also controls 24-12 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES executions of the RDMSR instruction that read from the IA32_TIME_STAMP_COUNTER MSR. For all of these, the value of the TSC offset is added to the value of the time-stamp counter, and the sum is returned to guest software in EDX:EAX. Processors that support the 1-setting of the “use TSC scaling” control also support a 64-bit TSC-multiplier field. If this control is 1 (and the “RDTSC exiting” control is 0 and the “use TSC offsetting” control is 1), this field also affects the executions of the RDTSC, RDTSCP, and RDMSR instructions identified above. Specifically, the contents of the time-stamp counter is first multiplied by the TSC multiplier before adding the TSC offset. See Chapter 27 for a detailed treatment of the behavior of RDTSC, RDTSCP, and RDMSR in VMX non-root operation. 24.6.6 Guest/Host Masks and Read Shadows for CR0 and CR4 VM-execution control fields include guest/host masks and read shadows for the CR0 and CR4 registers. These fields control executions of instructions that access those registers (including CLTS, LMSW, MOV CR, and SMSW). They are 64 bits on processors that support Intel 64 architecture and 32 bits on processors that do not. In general, bits set to 1 in a guest/host mask correspond to bits “owned” by the host: • Guest attempts to set them (using CLTS, LMSW, or MOV to CR) to values differing from the corresponding bits in the corresponding read shadow cause VM exits. • Guest reads (using MOV from CR or SMSW) return values for these bits from the corresponding read shadow. Bits cleared to 0 correspond to bits “owned” by the guest; guest attempts to modify them succeed and guest reads return values for these bits from the control register itself. See Chapter 27 for details regarding how these fields affect VMX non-root operation. 24.6.7 CR3-Target Controls The VM-execution control fields include a set of 4 CR3-target values and a CR3-target count. The CR3-target values each have 64 bits on processors that support Intel 64 architecture and 32 bits on processors that do not. The CR3-target count has 32 bits on all processors. An execution of MOV to CR3 in VMX non-root operation does not cause a VM exit if its source operand matches one of these values. If the CR3-target count is n, only the first n CR3-target values are considered; if the CR3-target count is 0, MOV to CR3 always causes a VM exit There are no limitations on the values that can be written for the CR3-target values. VM entry fails (see Section 26.2) if the CR3-target count is greater than 4. Future processors may support a different number of CR3-target values. Software should read the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine the number of values supported. 24.6.8 Controls for APIC Virtualization There are three mechanisms by which software accesses registers of the logical processor’s local APIC: • If the local APIC is in xAPIC mode, it can perform memory-mapped accesses to addresses in the 4-KByte page referenced by the physical address in the IA32_APIC_BASE MSR (see Section 10.4.4, “Local APIC Status and Location” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A and Intel® 64 Architecture Processor Topology Enumeration).1 • If the local APIC is in x2APIC mode, it can accesses the local APIC’s registers using the RDMSR and WRMSR instructions (see Intel® 64 Architecture Processor Topology Enumeration). • In 64-bit mode, it can access the local APIC’s task-priority register (TPR) using the MOV CR8 instruction. There are five processor-based VM-execution controls (see Section 24.6.2) that control such accesses. There are “use TPR shadow”, “virtualize APIC accesses”, “virtualize x2APIC mode”, “virtual-interrupt delivery”, and “APICregister virtualization”. These controls interact with the following fields: 1. If the local APIC does not support x2APIC mode, it is always in xAPIC mode. Vol. 3C 24-13 VIRTUAL MACHINE CONTROL STRUCTURES • APIC-access address (64 bits). This field contains the physical address of the 4-KByte APIC-access page. If the “virtualize APIC accesses” VM-execution control is 1, access to this page may cause VM exits or be virtualized by the processor. See Section 29.4. The APIC-access address exists only on processors that support the 1-setting of the “virtualize APIC accesses” VM-execution control. • Virtual-APIC address (64 bits). This field contains the physical address of the 4-KByte virtual-APIC page. The processor uses the virtual-APIC page to virtualize certain accesses to APIC registers and to manage virtual interrupts; see Chapter 29. Depending on the setting of the controls indicated earlier, the virtual-APIC page may be accessed by the following operations: — The MOV CR8 instructions (see Section 29.3). — Accesses to the APIC-access page if, in addition, the “virtualize APIC accesses” VM-execution control is 1 (see Section 29.4). — The RDMSR and WRMSR instructions if, in addition, the value of ECX is in the range 800H–8FFH (indicating an APIC MSR) and the “virtualize x2APIC mode” VM-execution control is 1 (see Section 29.5). If the “use TPR shadow” VM-execution control is 1, VM entry ensures that the virtual-APIC address is 4-KByte aligned. The virtual-APIC address exists only on processors that support the 1-setting of the “use TPR shadow” VM-execution control. • TPR threshold (32 bits). Bits 3:0 of this field determine the threshold below which bits 7:4 of VTPR (see Section 29.1.1) cannot fall. If the “virtual-interrupt delivery” VM-execution control is 0, a VM exit occurs after an operation (e.g., an execution of MOV to CR8) that reduces the value of those bits below the TPR threshold. See Section 29.1.2. The TPR threshold exists only on processors that support the 1-setting of the “use TPR shadow” VM-execution control. • EOI-exit bitmap (4 fields; 64 bits each). These fields are supported only on processors that support the 1setting of the “virtual-interrupt delivery” VM-execution control. They are used to determine which virtualized writes to the APIC’s EOI register cause VM exits: — EOI_EXIT0 contains bits for vectors from 0 (bit 0) to 63 (bit 63). — EOI_EXIT1 contains bits for vectors from 64 (bit 0) to 127 (bit 63). — EOI_EXIT2 contains bits for vectors from 128 (bit 0) to 191 (bit 63). — EOI_EXIT3 contains bits for vectors from 192 (bit 0) to 255 (bit 63). See Section 29.1.4 for more information on the use of this field. • Posted-interrupt notification vector (16 bits). This field is supported only on processors that support the 1setting of the “process posted interrupts” VM-execution control. Its low 8 bits contain the interrupt vector that is used to notify a logical processor that virtual interrupts have been posted. See Section 29.6 for more information on the use of this field. • Posted-interrupt descriptor address (64 bits). This field is supported only on processors that support the 1setting of the “process posted interrupts” VM-execution control. It is the physical address of a 64-byte aligned posted interrupt descriptor. See Section 29.6 for more information on the use of this field. 24.6.9 MSR-Bitmap Address On processors that support the 1-setting of the “use MSR bitmaps” VM-execution control, the VM-execution control fields include the 64-bit physical address of four contiguous MSR bitmaps, which are each 1-KByte in size. This field does not exist on processors that do not support the 1-setting of that control. The four bitmaps are: • Read bitmap for low MSRs (located at the MSR-bitmap address). This contains one bit for each MSR address in the range 00000000H to 00001FFFH. The bit determines whether an execution of RDMSR applied to that MSR causes a VM exit. 24-14 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES • Read bitmap for high MSRs (located at the MSR-bitmap address plus 1024). This contains one bit for each MSR address in the range C0000000H toC0001FFFH. The bit determines whether an execution of RDMSR applied to that MSR causes a VM exit. • Write bitmap for low MSRs (located at the MSR-bitmap address plus 2048). This contains one bit for each MSR address in the range 00000000H to 00001FFFH. The bit determines whether an execution of WRMSR applied to that MSR causes a VM exit. • Write bitmap for high MSRs (located at the MSR-bitmap address plus 3072). This contains one bit for each MSR address in the range C0000000H toC0001FFFH. The bit determines whether an execution of WRMSR applied to that MSR causes a VM exit. A logical processor uses these bitmaps if and only if the “use MSR bitmaps” control is 1. If the bitmaps are used, an execution of RDMSR or WRMSR causes a VM exit if the value of RCX is in neither of the ranges covered by the bitmaps or if the appropriate bit in the MSR bitmaps (corresponding to the instruction and the RCX value) is 1. See Section 25.1.3 for details. If the bitmaps are used, their address must be 4-KByte aligned. 24.6.10 Executive-VMCS Pointer The executive-VMCS pointer is a 64-bit field used in the dual-monitor treatment of system-management interrupts (SMIs) and system-management mode (SMM). SMM VM exits save this field as described in Section 34.15.2. VM entries that return from SMM use this field as described in Section 34.15.4. 24.6.11 Extended-Page-Table Pointer (EPTP) The extended-page-table pointer (EPTP) contains the address of the base of EPT PML4 table (see Section 28.2.2), as well as other EPT configuration information. The format of this field is shown in Table 24-8. Table 24-8. Format of Extended-Page-Table Pointer Bit Position(s) Field 2:0 EPT paging-structure memory type (see Section 28.2.7): 0 = Uncacheable (UC) 6 = Write-back (WB) Other values are reserved.1 5:3 This value is 1 less than the EPT page-walk length (see Section 28.2.2) 6 Setting this control to 1 enables accessed and dirty flags for EPT (see Section 28.2.5)2 11:7 Reserved N–1:12 Bits N–1:12 of the physical address of the 4-KByte aligned EPT PML4 table3 63:N Reserved NOTES: 1. Software should read the VMX capability MSR IA32_VMX_EPT_VPID_CAP (see Appendix A.10) to determine what EPT paging-structure memory types are supported. 2. Not all processors support accessed and dirty flags for EPT. Software should read the VMX capability MSR IA32_VMX_EPT_VPID_CAP (see Appendix A.10) to determine whether the processor supports this feature. 3. N is the physical-address width supported by the logical processor. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. The EPTP exists only on processors that support the 1-setting of the “enable EPT” VM-execution control. Vol. 3C 24-15 VIRTUAL MACHINE CONTROL STRUCTURES 24.6.12 Virtual-Processor Identifier (VPID) The virtual-processor identifier (VPID) is a 16-bit field. It exists only on processors that support the 1-setting of the “enable VPID” VM-execution control. See Section 28.1 for details regarding the use of this field. 24.6.13 Controls for PAUSE-Loop Exiting On processors that support the 1-setting of the “PAUSE-loop exiting” VM-execution control, the VM-execution control fields include the following 32-bit fields: • PLE_Gap. Software can configure this field as an upper bound on the amount of time between two successive executions of PAUSE in a loop. • PLE_Window. Software can configure this field as an upper bound on the amount of time a guest is allowed to execute in a PAUSE loop. These fields measure time based on a counter that runs at the same rate as the timestamp counter (TSC). See Section 25.1.3 for more details regarding PAUSE-loop exiting. 24.6.14 VM-Function Controls The VM-function controls constitute a 64-bit vector that governs use of the VMFUNC instruction in VMX non-root operation. This field is supported only on processors that support the 1-settings of both the “activate secondary controls” primary processor-based VM-execution control and the “enable VM functions” secondary processorbased VM-execution control. Table 24-9 lists the VM-function controls. See Section 25.5.6 for more details of how these controls affect processor behavior in VMX non-root operation. Table 24-9. Definitions of VM-Function Controls Bit Position(s) Name Description 0 The EPTP-switching VM function changes the EPT pointer to a value chosen from the EPTP list. See Section 25.5.6.3. EPTP switching All other bits in this field are reserved to 0. Software should consult the VMX capability MSR IA32_VMX_VMFUNC (see Appendix A.11) to determine which bits are reserved. Failure to clear reserved bits causes subsequent VM entries to fail (see Section 26.2.1.1). Processors that support the 1-setting of the “EPTP switching” VM-function control also support a 64-bit field called the EPTP-list address. This field contains the physical address of the 4-KByte EPTP list. The EPTP list comprises 512 8-Byte entries (each an EPTP value) and is used by the EPTP-switching VM function (see Section 25.5.6.3). 24.6.15 VMCS Shadowing Bitmap Addresses On processors that support the 1-setting of the “VMCS shadowing” VM-execution control, the VM-execution control fields include the 64-bit physical addresses of the VMREAD bitmap and the VMWRITE bitmap. Each bitmap is 4 KBytes in size and thus contains 32 KBits. The addresses are the VMREAD-bitmap address and the VMWRITEbitmap address. If the “VMCS shadowing” VM-execution control is 1, executions of VMREAD and VMWRITE may consult these bitmaps (see Section 24.10 and Section 30.3). 24.6.16 ENCLS-Exiting Bitmap The ENCLS-exiting bitmap is a 64-bit field. If the “enable ENCLS exiting” VM-execution control is 1, execution of ENCLS causes a VM exit if the bit in this field corresponding to the value of EAX is 1. If the bit is 0, the instruction executes normally. See Section 25.1.3 for more information. 24-16 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES 24.6.17 ENCLV-Exiting Bitmap The ENCLV-exiting bitmap is a 64-bit field. If the “enable ENCLV exiting” VM-execution control is 1, execution of ENCLV causes a VM exit if the bit in this field corresponding to the value of EAX is 1. If the bit is 0, the instruction executes normally. See Section 25.1.3 for more information. 24.6.18 Control Field for Page-Modification Logging The PML address is a 64-bit field. It is the 4-KByte aligned address of the page-modification log. The pagemodification log consists of 512 64-bit entries. It is used for the page-modification logging feature. Details of the page-modification logging are given in Section 28.2.6. If the “enable PML” VM-execution control is 1, VM entry ensures that the PML address is 4-KByte aligned. The PML address exists only on processors that support the 1-setting of the “enable PML” VM-execution control. 24.6.19 Controls for Virtualization Exceptions On processors that support the 1-setting of the “EPT-violation #VE” VM-execution control, the VM-execution control fields include the following: • Virtualization-exception information address (64 bits). This field contains the physical address of the virtualization-exception information area. When a logical processor encounters a virtualization exception, it saves virtualization-exception information at the virtualization-exception information address; see Section 25.5.7.2. • EPTP index (16 bits). When an EPT violation causes a virtualization exception, the processor writes the value of this field to the virtualization-exception information area. The EPTP-switching VM function updates this field (see Section 25.5.6.3). 24.6.20 XSS-Exiting Bitmap On processors that support the 1-setting of the “enable XSAVES/XRSTORS” VM-execution control, the VM-execution control fields include a 64-bit XSS-exiting bitmap. If the “enable XSAVES/XRSTORS” VM-execution control is 1, executions of XSAVES and XRSTORS may consult this bitmap (see Section 25.1.3 and Section 25.3). 24.6.21 Sub-Page-Permission-Table Pointer (SPPTP) If the sub-page write-permission feature of EPT is enabled, EPT write permissions may be determined at a 128byte granularity (see Section 28.2.4). These permissions are determined using a hierarchy of sub-page-permission structures in memory. The root of this hierarchy is referenced by a VM-execution control field called the sub-page-permission-table pointer (SPPTP). The SPPTP contains the address of the base of the root SPP table (see Section 28.2.4.2). The format of this field is shown in Table 24-8. Table 24-10. Format of Sub-Page-Permission-Table Pointer Bit Position(s) Field 11:0 Reserved N–1:12 Bits N–1:12 of the physical address of the 4-KByte aligned root SPP table 63:N1 Reserved Vol. 3C 24-17 VIRTUAL MACHINE CONTROL STRUCTURES NOTES: 1. N is the processor’s physical-address width. Software can determine this width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. The SPPTP exists only on processors that support the 1-setting of the “sub-page write permissions for EPT” VMexecution control. 24.7 VM-EXIT CONTROL FIELDS The VM-exit control fields govern the behavior of VM exits. They are discussed in Section 24.7.1 and Section 24.7.2. 24.7.1 VM-Exit Controls The VM-exit controls constitute a 32-bit vector that governs the basic operation of VM exits. Table 24-11 lists the controls supported. See Chapter 27 for complete details of how these controls affect VM exits. Table 24-11. Definitions of VM-Exit Controls Bit Position(s) Name Description 2 This control determines whether DR7 and the IA32_DEBUGCTL MSR are saved on VM exit. Save debug controls The first processors to support the virtual-machine extensions supported only the 1setting of this control. 9 Host address-space size On processors that support Intel 64 architecture, this control determines whether a logical processor is in 64-bit mode after the next VM exit. Its value is loaded into CS.L, IA32_EFER.LME, and IA32_EFER.LMA on every VM exit.1 This control must be 0 on processors that do not support Intel 64 architecture. 12 Load IA32_PERF_GLOBAL_ CTRL This control determines whether the IA32_PERF_GLOBAL_CTRL MSR is loaded on VM exit. 15 Acknowledge interrupt on exit This control affects VM exits due to external interrupts: • If such a VM exit occurs and this control is 1, the logical processor acknowledges the interrupt controller, acquiring the interrupt’s vector. The vector is stored in the VM-exit interruption-information field, which is marked valid. • If such a VM exit occurs and this control is 0, the interrupt is not acknowledged and the VM-exit interruption-information field is marked invalid. 18 Save IA32_PAT This control determines whether the IA32_PAT MSR is saved on VM exit. 19 Load IA32_PAT This control determines whether the IA32_PAT MSR is loaded on VM exit. 20 Save IA32_EFER This control determines whether the IA32_EFER MSR is saved on VM exit. 21 Load IA32_EFER This control determines whether the IA32_EFER MSR is loaded on VM exit. 22 Save VMXpreemption timer value This control determines whether the value of the VMX-preemption timer is saved on VM exit. 23 Clear IA32_BNDCFGS This control determines whether the IA32_BNDCFGS MSR is cleared on VM exit. 24 Conceal VMX from PT If this control is 1, Intel Processor Trace does not produce a paging information packet (PIP) on a VM exit or a VMCS packet on an SMM VM exit (see Chapter 35). 25 Clear IA32_RTIT_CTL This control determines whether the IA32_RTIT_CTL MSR is cleared on VM exit. NOTES: 1. Since the Intel 64 architecture specifies that IA32_EFER.LMA is always set to the logical-AND of CR0.PG and IA32_EFER.LME, and since CR0.PG is always 1 in VMX root operation, IA32_EFER.LMA is always identical to IA32_EFER.LME in VMX root operation. 24-18 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_EXIT_CTLS and IA32_VMX_TRUE_EXIT_CTLS (see Appendix A.4) to determine how it should set the reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.2). The first processors to support the virtual-machine extensions supported only the 1-settings of bits 0–8, 10, 11, 13, 14, 16, and 17. The VMX capability MSR IA32_VMX_EXIT_CTLS always reports that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_EXIT_CTLS MSR, and software should consult this MSR to discover support for the 0-settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1. 24.7.2 VM-Exit Controls for MSRs A VMM may specify lists of MSRs to be stored and loaded on VM exits. The following VM-exit control fields determine how MSRs are stored on VM exits: • VM-exit MSR-store count (32 bits). This field specifies the number of MSRs to be stored on VM exit. It is recommended that this count not exceed 512.1 Otherwise, unpredictable processor behavior (including a machine check) may result during VM exit. • VM-exit MSR-store address (64 bits). This field contains the physical address of the VM-exit MSR-store area. The area is a table of entries, 16 bytes per entry, where the number of entries is given by the VM-exit MSRstore count. The format of each entry is given in Table 24-12. If the VM-exit MSR-store count is not zero, the address must be 16-byte aligned. Table 24-12. Format of an MSR Entry Bit Position(s) Contents 31:0 MSR index 63:32 Reserved 127:64 MSR data See Section 27.4 for how this area is used on VM exits. The following VM-exit control fields determine how MSRs are loaded on VM exits: • VM-exit MSR-load count (32 bits). This field contains the number of MSRs to be loaded on VM exit. It is recommended that this count not exceed 512. Otherwise, unpredictable processor behavior (including a machine check) may result during VM exit.2 • VM-exit MSR-load address (64 bits). This field contains the physical address of the VM-exit MSR-load area. The area is a table of entries, 16 bytes per entry, where the number of entries is given by the VM-exit MSR-load count (see Table 24-12). If the VM-exit MSR-load count is not zero, the address must be 16-byte aligned. See Section 27.6 for how this area is used on VM exits. 24.8 VM-ENTRY CONTROL FIELDS The VM-entry control fields govern the behavior of VM entries. They are discussed in Sections 24.8.1 through 24.8.3. 1. Future implementations may allow more MSRs to be stored reliably. Software should consult the VMX capability MSR IA32_VMX_MISC to determine the number supported (see Appendix A.6). 2. Future implementations may allow more MSRs to be loaded reliably. Software should consult the VMX capability MSR IA32_VMX_MISC to determine the number supported (see Appendix A.6). Vol. 3C 24-19 VIRTUAL MACHINE CONTROL STRUCTURES 24.8.1 VM-Entry Controls The VM-entry controls constitute a 32-bit vector that governs the basic operation of VM entries. Table 24-13 lists the controls supported. See Chapter 24 for how these controls affect VM entries. Table 24-13. Definitions of VM-Entry Controls Bit Position(s) Name Description 2 This control determines whether DR7 and the IA32_DEBUGCTL MSR are loaded on VM entry. 9 Load debug controls The first processors to support the virtual-machine extensions supported only the 1-setting of this control. IA-32e mode guest On processors that support Intel 64 architecture, this control determines whether the logical processor is in IA-32e mode after VM entry. Its value is loaded into IA32_EFER.LMA as part of VM entry.1 This control must be 0 on processors that do not support Intel 64 architecture. 10 Entry to SMM This control determines whether the logical processor is in system-management mode (SMM) after VM entry. This control must be 0 for any VM entry from outside SMM. 11 Deactivate dualmonitor treatment If set to 1, the default treatment of SMIs and SMM is in effect after the VM entry (see Section 34.15.7). This control must be 0 for any VM entry from outside SMM. 13 This control determines whether the IA32_PERF_GLOBAL_CTRL MSR is loaded on VM entry. Load IA32_PERF_GLOBA L_CTRL 14 Load IA32_PAT This control determines whether the IA32_PAT MSR is loaded on VM entry. 15 Load IA32_EFER This control determines whether the IA32_EFER MSR is loaded on VM entry. 16 Load IA32_BNDCFGS This control determines whether the IA32_BNDCFGS MSR is loaded on VM entry. 17 Conceal VMX from PT If this control is 1, Intel Processor Trace does not produce a paging information packet (PIP) on a VM entry or a VMCS packet on a VM entry that returns from SMM (see Chapter 35). 18 Load IA32_RTIT_CTL This control determines whether the IA32_RTIT_CTL MSR is loaded on VM entry. NOTES: 1. Bit 5 of the IA32_VMX_MISC MSR is read as 1 on any logical processor that supports the 1-setting of the “unrestricted guest” VMexecution control. If it is read as 1, every VM exit stores the value of IA32_EFER.LMA into the “IA-32e mode guest” VM-entry control (see Section 27.2). All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_ENTRY_CTLS and IA32_VMX_TRUE_ENTRY_CTLS (see Appendix A.5) to determine how it should set the reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.3). The first processors to support the virtual-machine extensions supported only the 1-settings of bits 0–8 and 12. The VMX capability MSR IA32_VMX_ENTRY_CTLS always reports that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_ENTRY_CTLS MSR, and software should consult this MSR to discover support for the 0-settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1. 24-20 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES 24.8.2 VM-Entry Controls for MSRs A VMM may specify a list of MSRs to be loaded on VM entries. The following VM-entry control fields manage this functionality: • VM-entry MSR-load count (32 bits). This field contains the number of MSRs to be loaded on VM entry. It is recommended that this count not exceed 512. Otherwise, unpredictable processor behavior (including a machine check) may result during VM entry.1 • VM-entry MSR-load address (64 bits). This field contains the physical address of the VM-entry MSR-load area. The area is a table of entries, 16 bytes per entry, where the number of entries is given by the VM-entry MSR-load count. The format of entries is described in Table 24-12. If the VM-entry MSR-load count is not zero, the address must be 16-byte aligned. See Section 26.4 for details of how this area is used on VM entries. 24.8.3 VM-Entry Controls for Event Injection VM entry can be configured to conclude by delivering an event through the IDT (after all guest state and MSRs have been loaded). This process is called event injection and is controlled by the following three VM-entry control fields: • VM-entry interruption-information field (32 bits). This field provides details about the event to be injected. Table 24-14 describes the field. Table 24-14. Format of the VM-Entry Interruption-Information Field Bit Position(s) Content 7:0 Vector of interrupt or exception 10:8 Interruption type: 0: External interrupt 1: Reserved 2: Non-maskable interrupt (NMI) 3: Hardware exception (e.g,. #PF) 4: Software interrupt (INT n) 5: Privileged software exception (INT1) 6: Software exception (INT3 or INTO) 7: Other event 11 Deliver error code (0 = do not deliver; 1 = deliver) 30:12 Reserved 31 Valid — The vector (bits 7:0) determines which entry in the IDT is used or which other event is injected. — The interruption type (bits 10:8) determines details of how the injection is performed. In general, a VMM should use the type hardware exception for all exceptions other than the following: • • • breakpoint exceptions (#BP; a VMM should use the type software exception); overflow exceptions (#OF a VMM should use the use type software exception); and those debug exceptions (#DB) that are generated by INT1 (a VMM should use the use type privileged software exception).2 The type other event is used for injection of events that are not delivered through the IDT.3 1. Future implementations may allow more MSRs to be loaded reliably. Software should consult the VMX capability MSR IA32_VMX_MISC to determine the number supported (see Appendix A.6). 2. The type hardware exception should be used for all other debug exceptions. 3. INT1 and INT3 refer to the instructions with opcodes F1 and CC, respectively, and not to INT n with values 1 or 3 for n. Vol. 3C 24-21 VIRTUAL MACHINE CONTROL STRUCTURES — For exceptions, the deliver-error-code bit (bit 11) determines whether delivery pushes an error code on the guest stack. — VM entry injects an event if and only if the valid bit (bit 31) is 1. The valid bit in this field is cleared on every VM exit (see Section 27.2). • VM-entry exception error code (32 bits). This field is used if and only if the valid bit (bit 31) and the delivererror-code bit (bit 11) are both set in the VM-entry interruption-information field. • VM-entry instruction length (32 bits). For injection of events whose type is software interrupt, software exception, or privileged software exception, this field is used to determine the value of RIP that is pushed on the stack. See Section 26.6 for details regarding the mechanics of event injection, including the use of the interruption type and the VM-entry instruction length. VM exits clear the valid bit (bit 31) in the VM-entry interruption-information field. 24.9 VM-EXIT INFORMATION FIELDS The VMCS contains a section of fields that contain information about the most recent VM exit. On some processors, attempts to write to these fields with VMWRITE fail (see “VMWRITE—Write Field to VirtualMachine Control Structure” in Chapter 30).1 24.9.1 Basic VM-Exit Information The following VM-exit information fields provide basic information about a VM exit: • Exit reason (32 bits). This field encodes the reason for the VM exit and has the structure given in Table 24-15. Table 24-15. Format of Exit Reason Bit Position(s) Contents 15:0 Basic exit reason 16 Always cleared to 0 26:17 Reserved (cleared to 0) 27 A VM exit saves this bit as 1 to indicate that the VM exit was incident to enclave mode. 28 Pending MTF VM exit 29 VM exit from VMX root operation 30 Reserved (cleared to 0) 31 VM-entry failure (0 = true VM exit; 1 = VM-entry failure) — Bits 15:0 provide basic information about the cause of the VM exit (if bit 31 is clear) or of the VM-entry failure (if bit 31 is set). Appendix C enumerates the basic exit reasons. — Bit 16 is always cleared to 0. — Bit 27 is set to 1 if the VM exit occurred while the logical processor was in enclave mode. A VM exit also sets this bit if it is incident to delivery of an event injected by VM entry and the guest interruptibility-state field indicates an enclave interrupt (bit 4 of the field is 1). See Section 27.2.1 for details. 1. Software can discover whether these fields can be written by reading the VMX capability MSR IA32_VMX_MISC (see Appendix A.6). 24-22 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES — Bit 28 is set only by an SMM VM exit (see Section 34.15.2) that took priority over an MTF VM exit (see Section 25.5.2) that would have occurred had the SMM VM exit not occurred. See Section 34.15.2.3. — Bit 29 is set if and only if the processor was in VMX root operation at the time the VM exit occurred. This can happen only for SMM VM exits. See Section 34.15.2. — Because some VM-entry failures load processor state from the host-state area (see Section 26.8), software must be able to distinguish such cases from true VM exits. Bit 31 is used for that purpose. • Exit qualification (64 bits; 32 bits on processors that do not support Intel 64 architecture). This field contains additional information about the cause of VM exits due to the following: debug exceptions; page-fault exceptions; start-up IPIs (SIPIs); task switches; INVEPT; INVLPG;INVVPID; LGDT; LIDT; LLDT; LTR; SGDT; SIDT; SLDT; STR; VMCLEAR; VMPTRLD; VMPTRST; VMREAD; VMWRITE; VMXON; XRSTORS; XSAVES; controlregister accesses; MOV DR; I/O instructions; and MWAIT. The format of the field depends on the cause of the VM exit. See Section 27.2.1 for details. • Guest-linear address (64 bits; 32 bits on processors that do not support Intel 64 architecture). This field is used in the following cases: — VM exits due to attempts to execute LMSW with a memory operand. — VM exits due to attempts to execute INS or OUTS. — VM exits due to system-management interrupts (SMIs) that arrive immediately after retirement of I/O instructions. — Certain VM exits due to EPT violations See Section 27.2.1 and Section 34.15.2.3 for details of when and how this field is used. • Guest-physical address (64 bits). This field is used VM exits due to EPT violations and EPT misconfigurations. See Section 27.2.1 for details of when and how this field is used. 24.9.2 Information for VM Exits Due to Vectored Events Event-specific information is provided for VM exits due to the following vectored events: exceptions (including those generated by the instructions INT3, INTO, INT1, BOUND, UD0, UD1, and UD2); external interrupts that occur while the “acknowledge interrupt on exit” VM-exit control is 1; and non-maskable interrupts (NMIs). This information is provided in the following fields: • VM-exit interruption information (32 bits). This field receives basic information associated with the event causing the VM exit. Table 24-16 describes this field. Table 24-16. Format of the VM-Exit Interruption-Information Field Bit Position(s) Content 7:0 Vector of interrupt or exception 10:8 Interruption type: 0: External interrupt 1: Not used 2: Non-maskable interrupt (NMI) 3: Hardware exception 4: Not used 5: Privileged software exception 6: Software exception 7: Not used 11 Error code valid (0 = invalid; 1 = valid) 12 NMI unblocking due to IRET 30:13 Reserved (cleared to 0) 31 Valid Vol. 3C 24-23 VIRTUAL MACHINE CONTROL STRUCTURES • VM-exit interruption error code (32 bits). For VM exits caused by hardware exceptions that would have delivered an error code on the stack, this field receives that error code. Section 27.2.2 provides details of how these fields are saved on VM exits. 24.9.3 Information for VM Exits That Occur During Event Delivery Additional information is provided for VM exits that occur during event delivery in VMX non-root operation.1 This information is provided in the following fields: • IDT-vectoring information (32 bits). This field receives basic information associated with the event that was being delivered when the VM exit occurred. Table 24-17 describes this field. Table 24-17. Format of the IDT-Vectoring Information Field Bit Position(s) Content 7:0 Vector of interrupt or exception 10:8 Interruption type: 0: External interrupt 1: Not used 2: Non-maskable interrupt (NMI) 3: Hardware exception 4: Software interrupt 5: Privileged software exception 6: Software exception 7: Not used 11 Error code valid (0 = invalid; 1 = valid) 12 Undefined 30:13 Reserved (cleared to 0) 31 Valid • IDT-vectoring error code (32 bits). For VM exits the occur during delivery of hardware exceptions that would have delivered an error code on the stack, this field receives that error code. See Section 27.2.4 provides details of how these fields are saved on VM exits. 24.9.4 Information for VM Exits Due to Instruction Execution The following fields are used for VM exits caused by attempts to execute certain instructions in VMX non-root operation: • VM-exit instruction length (32 bits). For VM exits resulting from instruction execution, this field receives the length in bytes of the instruction whose execution led to the VM exit.2 See Section 27.2.5 for details of when and how this field is used. • VM-exit instruction information (32 bits). This field is used for VM exits due to attempts to execute INS, INVEPT, INVVPID, LIDT, LGDT, LLDT, LTR, OUTS, SIDT, SGDT, SLDT, STR, VMCLEAR, VMPTRLD, VMPTRST, VMREAD, VMWRITE, or VMXON.3 The format of the field depends on the cause of the VM exit. See Section 27.2.5 for details. 1. This includes cases in which the event delivery was caused by event injection as part of VM entry; see Section 26.6.1.2. 2. This field is also used for VM exits that occur during the delivery of a software interrupt or software exception. 3. Whether the processor provides this information on VM exits due to attempts to execute INS or OUTS can be determined by consulting the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). 24-24 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES The following fields (64 bits each; 32 bits on processors that do not support Intel 64 architecture) are used only for VM exits due to SMIs that arrive immediately after retirement of I/O instructions. They provide information about that I/O instruction: • • • • I/O RCX. The value of RCX before the I/O instruction started. I/O RSI. The value of RSI before the I/O instruction started. I/O RDI. The value of RDI before the I/O instruction started. I/O RIP. The value of RIP before the I/O instruction started (the RIP that addressed the I/O instruction). 24.9.5 VM-Instruction Error Field The 32-bit VM-instruction error field does not provide information about the most recent VM exit. In fact, it is not modified on VM exits. Instead, it provides information about errors encountered by a non-faulting execution of one of the VMX instructions. 24.10 VMCS TYPES: ORDINARY AND SHADOW Every VMCS is either an ordinary VMCS or a shadow VMCS. A VMCS’s type is determined by the shadow-VMCS indicator in the VMCS region (this is the value of bit 31 of the first 4 bytes of the VMCS region; see Table 24-1): 0 indicates an ordinary VMCS, while 1 indicates a shadow VMCS. Shadow VMCSs are supported only on processors that support the 1-setting of the “VMCS shadowing” VM-execution control (see Section 24.6.2). A shadow VMCS differs from an ordinary VMCS in two ways: • An ordinary VMCS can be used for VM entry but a shadow VMCS cannot. Attempts to perform VM entry when the current VMCS is a shadow VMCS fail (see Section 26.1). • The VMREAD and VMWRITE instructions can be used in VMX non-root operation to access a shadow VMCS but not an ordinary VMCS. This fact results from the following: — If the “VMCS shadowing” VM-execution control is 0, execution of the VMREAD and VMWRITE instructions in VMX non-root operation always cause VM exits (see Section 25.1.3). — If the “VMCS shadowing” VM-execution control is 1, execution of the VMREAD and VMWRITE instructions in VMX non-root operation can access the VMCS referenced by the VMCS link pointer (see Section 30.3). — If the “VMCS shadowing” VM-execution control is 1, VM entry ensures that any VMCS referenced by the VMCS link pointer is a shadow VMCS (see Section 26.3.1.5). In VMX root operation, both types of VMCSs can be accessed with the VMREAD and VMWRITE instructions. Software should not modify the shadow-VMCS indicator in the VMCS region of a VMCS that is active. Doing so may cause the VMCS to become corrupted (see Section 24.11.1). Before modifying the shadow-VMCS indicator, software should execute VMCLEAR for the VMCS to ensure that it is not active. 24.11 SOFTWARE USE OF THE VMCS AND RELATED STRUCTURES This section details guidelines that software should observe when using a VMCS and related structures. It also provides descriptions of consequences for failing to follow guidelines. 24.11.1 Software Use of Virtual-Machine Control Structures To ensure proper processor behavior, software should observe certain guidelines when using an active VMCS. No VMCS should ever be active on more than one logical processor. If a VMCS is to be “migrated” from one logical processor to another, the first logical processor should execute VMCLEAR for the VMCS (to make it inactive on that logical processor and to ensure that all VMCS data are in memory) before the other logical processor executes VMPTRLD for the VMCS (to make it active on the second logical processor).1 A VMCS that is made active on more than one logical processor may become corrupted (see below). Vol. 3C 24-25 VIRTUAL MACHINE CONTROL STRUCTURES Software should not modify the shadow-VMCS indicator (see Table 24-1) in the VMCS region of a VMCS that is active. Doing so may cause the VMCS to become corrupted. Before modifying the shadow-VMCS indicator, software should execute VMCLEAR for the VMCS to ensure that it is not active. Software should use the VMREAD and VMWRITE instructions to access the different fields in the current VMCS (see Section 24.11.2). Software should never access or modify the VMCS data of an active VMCS using ordinary memory operations, in part because the format used to store the VMCS data is implementation-specific and not architecturally defined, and also because a logical processor may maintain some VMCS data of an active VMCS on the processor and not in the VMCS region. The following items detail some of the hazards of accessing VMCS data using ordinary memory operations: • Any data read from a VMCS with an ordinary memory read does not reliably reflect the state of the VMCS. Results may vary from time to time or from logical processor to logical processor. • Writing to a VMCS with an ordinary memory write is not guaranteed to have a deterministic effect on the VMCS. Doing so may cause the VMCS to become corrupted (see below). (Software can avoid these hazards by removing any linear-address mappings to a VMCS region before executing a VMPTRLD for that region and by not remapping it until after executing VMCLEAR for that region.) If a logical processor leaves VMX operation, any VMCSs active on that logical processor may be corrupted (see below). To prevent such corruption of a VMCS that may be used either after a return to VMX operation or on another logical processor, software should execute VMCLEAR for that VMCS before executing the VMXOFF instruction or removing power from the processor (e.g., as part of a transition to the S3 and S4 power states). This section has identified operations that may cause a VMCS to become corrupted. These operations may cause the VMCS’s data to become undefined. Behavior may be unpredictable if that VMCS used subsequently on any logical processor. The following items detail some hazards of VMCS corruption: • • VM entries may fail for unexplained reasons or may load undesired processor state. • VM exits may load undesired processor state, save incorrect state into the VMCS, or cause the logical processor to transition to a shutdown state. The processor may not correctly support VMX non-root operation as documented in Chapter 25 and may generate unexpected VM exits. 24.11.2 VMREAD, VMWRITE, and Encodings of VMCS Fields Every field of the VMCS is associated with a 32-bit value that is its encoding. The encoding is provided in an operand to VMREAD and VMWRITE when software wishes to read or write that field. These instructions fail if given, in 64-bit mode, an operand that sets an encoding bit beyond bit 32. See Chapter 30 for a description of these instructions. The structure of the 32-bit encodings of the VMCS components is determined principally by the width of the fields and their function in the VMCS. See Table 24-18. Table 24-18. Structure of VMCS Component Encoding Bit Position(s) Contents 0 Access type (0 = full; 1 = high); must be full for 16-bit, 32-bit, and natural-width fields 9:1 Index 11:10 Type: 0: control 1: VM-exit information 2: guest state 3: host state 1. As noted in Section 24.1, execution of the VMPTRLD instruction makes a VMCS is active. In addition, VM entry makes active any shadow VMCS referenced by the VMCS link pointer in the current VMCS. If a shadow VMCS is made active by VM entry, it is necessary to execute VMCLEAR for that VMCS before allowing that VMCS to become active on another logical processor. 24-26 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES Table 24-18. Structure of VMCS Component Encoding (Contd.) Bit Position(s) Contents 12 Reserved (must be 0) 14:13 Width: 0: 16-bit 1: 64-bit 2: 32-bit 3: natural-width 31:15 Reserved (must be 0) The following items detail the meaning of the bits in each encoding: • Field width. Bits 14:13 encode the width of the field. — A value of 0 indicates a 16-bit field. — A value of 1 indicates a 64-bit field. — A value of 2 indicates a 32-bit field. — A value of 3 indicates a natural-width field. Such fields have 64 bits on processors that support Intel 64 architecture and 32 bits on processors that do not. Fields whose encodings use value 1 are specially treated to allow 32-bit software access to all 64 bits of the field. Such access is allowed by defining, for each such field, an encoding that allows direct access to the high 32 bits of the field. See below. • Field type. Bits 11:10 encode the type of VMCS field: control, guest-state, host-state, or VM-exit information. (The last category also includes the VM-instruction error field.) • • Index. Bits 9:1 distinguish components with the same field width and type. Access type. Bit 0 must be 0 for all fields except for 64-bit fields (those with field-width 1; see above). A VMREAD or VMWRITE using an encoding with this bit cleared to 0 accesses the entire field. For a 64-bit field with field-width 1, a VMREAD or VMWRITE using an encoding with this bit set to 1 accesses only the high 32 bits of the field. Appendix B gives the encodings of all fields in the VMCS. The following describes the operation of VMREAD and VMWRITE based on processor mode, VMCS-field width, and access type: • 16-bit fields: — A VMREAD returns the value of the field in bits 15:0 of the destination operand; other bits of the destination operand are cleared to 0. — A VMWRITE writes the value of bits 15:0 of the source operand into the VMCS field; other bits of the source operand are not used. • 32-bit fields: — A VMREAD returns the value of the field in bits 31:0 of the destination operand; in 64-bit mode, bits 63:32 of the destination operand are cleared to 0. — A VMWRITE writes the value of bits 31:0 of the source operand into the VMCS field; in 64-bit mode, bits 63:32 of the source operand are not used. • 64-bit fields and natural-width fields using the full access type outside IA-32e mode. — A VMREAD returns the value of bits 31:0 of the field in its destination operand; bits 63:32 of the field are ignored. — A VMWRITE writes the value of its source operand to bits 31:0 of the field and clears bits 63:32 of the field. • 64-bit fields and natural-width fields using the full access type in 64-bit mode (only on processors that support Intel 64 architecture). Vol. 3C 24-27 VIRTUAL MACHINE CONTROL STRUCTURES — A VMREAD returns the value of the field in bits 63:0 of the destination operand — A VMWRITE writes the value of bits 63:0 of the source operand into the VMCS field. • 64-bit fields using the high access type. — A VMREAD returns the value of bits 63:32 of the field in bits 31:0 of the destination operand; in 64-bit mode, bits 63:32 of the destination operand are cleared to 0. — A VMWRITE writes the value of bits 31:0 of the source operand to bits 63:32 of the field; in 64-bit mode, bits 63:32 of the source operand are not used. Software seeking to read a 64-bit field outside IA-32e mode can use VMREAD with the full access type (reading bits 31:0 of the field) and VMREAD with the high access type (reading bits 63:32 of the field); the order of the two VMREAD executions is not important. Software seeking to modify a 64-bit field outside IA-32e mode should first use VMWRITE with the full access type (establishing bits 31:0 of the field while clearing bits 63:32) and then use VMWRITE with the high access type (establishing bits 63:32 of the field). 24.11.3 Initializing a VMCS Software should initialize fields in a VMCS (using VMWRITE) before using the VMCS for VM entry. Failure to do so may result in unpredictable behavior; for example, a VM entry may fail for unexplained reasons, or a successful transition (VM entry or VM exit) may load processor state with unexpected values. It is not necessary to initialize fields that the logical processor will not use. (For example, it is not necessary to unitize the MSR-bitmap address if the “use MSR bitmaps” VM-execution control is 0.) A processor maintains some VMCS information that cannot be modified with the VMWRITE instruction; this includes a VMCS’s launch state (see Section 24.1). Such information may be stored in the VMCS data portion of a VMCS region. Because the format of this information is implementation-specific, there is no way for software to know, when it first allocates a region of memory for use as a VMCS region, how the processor will determine this information from the contents of the memory region. In addition to its other functions, the VMCLEAR instruction initializes any implementation-specific information in the VMCS region referenced by its operand. To avoid the uncertainties of implementation-specific behavior, software should execute VMCLEAR on a VMCS region before making the corresponding VMCS active with VMPTRLD for the first time. (Figure 24-1 illustrates how execution of VMCLEAR puts a VMCS into a well-defined state.) The following software usage is consistent with these limitations: • • VMCLEAR should be executed for a VMCS before it is used for VM entry for the first time. • VMRESUME should be used for any subsequent VM entry using a VMCS (until the next execution of VMCLEAR for the VMCS). VMLAUNCH should be used for the first VM entry using a VMCS after VMCLEAR has been executed for that VMCS. It is expected that, in general, VMRESUME will have lower latency than VMLAUNCH. Since “migrating” a VMCS from one logical processor to another requires use of VMCLEAR (see Section 24.11.1), which sets the launch state of the VMCS to “clear”, such migration requires the next VM entry to be performed using VMLAUNCH. Software developers can avoid the performance cost of increased VM-entry latency by avoiding unnecessary migration of a VMCS from one logical processor to another. 24.11.4 Software Access to Related Structures In addition to data in the VMCS region itself, VMX non-root operation can be controlled by data structures that are referenced by pointers in a VMCS (for example, the I/O bitmaps). While the pointers to these data structures are parts of the VMCS, the data structures themselves are not. They are not accessible using VMREAD and VMWRITE but by ordinary memory writes. Software should ensure that each such data structure is modified only when no logical processor with a current VMCS that references it is in VMX non-root operation. Doing otherwise may lead to unpredictable behavior (including behaviors identified in Section 24.11.1). 24-28 Vol. 3C VIRTUAL MACHINE CONTROL STRUCTURES 24.11.5 VMXON Region Before executing VMXON, software allocates a region of memory (called the VMXON region)1 that the logical processor uses to support VMX operation. The physical address of this region (the VMXON pointer) is provided in an operand to VMXON. The VMXON pointer is subject to the limitations that apply to VMCS pointers: • • The VMXON pointer must be 4-KByte aligned (bits 11:0 must be zero). The VMXON pointer must not set any bits beyond the processor’s physical-address width.2,3 Before executing VMXON, software should write the VMCS revision identifier (see Section 24.2) to the VMXON region. (Specifically, it should write the 31-bit VMCS revision identifier to bits 30:0 of the first 4 bytes of the VMXON region; bit 31 should be cleared to 0.) It need not initialize the VMXON region in any other way. Software should use a separate region for each logical processor and should not access or modify the VMXON region of a logical processor between execution of VMXON and VMXOFF on that logical processor. Doing otherwise may lead to unpredictable behavior (including behaviors identified in Section 24.11.1). 1. The amount of memory required for the VMXON region is the same as that required for a VMCS region. This size is implementation specific and can be determined by consulting the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). 2. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 3. If IA32_VMX_BASIC[48] is read as 1, the VMXON pointer must not set any bits in the range 63:32; see Appendix A.1. Vol. 3C 24-29 VIRTUAL MACHINE CONTROL STRUCTURES 24-30 Vol. 3C 17.Updates to Chapter 25, Volume 3C Change bars and green text show changes to Chapter 25 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to this chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 28 CHAPTER 25 VMX NON-ROOT OPERATION In a virtualized environment using VMX, the guest software stack typically runs on a logical processor in VMX nonroot operation. This mode of operation is similar to that of ordinary processor operation outside of the virtualized environment. This chapter describes the differences between VMX non-root operation and ordinary processor operation with special attention to causes of VM exits (which bring a logical processor from VMX non-root operation to root operation). The differences between VMX non-root operation and ordinary processor operation are described in the following sections: • • • • • • Section 25.1, “Instructions That Cause VM Exits” Section 25.2, “Other Causes of VM Exits” Section 25.3, “Changes to Instruction Behavior in VMX Non-Root Operation” Section 25.4, “Other Changes in VMX Non-Root Operation” Section 25.5, “Features Specific to VMX Non-Root Operation” Section 25.6, “Unrestricted Guests” Chapter 26, “VM Entries,” describes the data control structures that govern VMX non-root operation. Chapter 26, “VM Entries,” describes the operation of VM entries by which the processor transitions from VMX root operation to VMX non-root operation. Chapter 25, “VMX Non-Root Operation,” describes the operation of VM exits by which the processor transitions from VMX non-root operation to VMX root operation. Chapter 28, “VMX Support for Address Translation,” describes two features that support address translation in VMX non-root operation. Chapter 29, “APIC Virtualization and Virtual Interrupts,” describes features that support virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC) in VMX non-root operation. 25.1 INSTRUCTIONS THAT CAUSE VM EXITS Certain instructions may cause VM exits if executed in VMX non-root operation. Unless otherwise specified, such VM exits are “fault-like,” meaning that the instruction causing the VM exit does not execute and no processor state is updated by the instruction. Section 27.1 details architectural state in the context of a VM exit. Section 25.1.1 defines the prioritization between faults and VM exits for instructions subject to both. Section 25.1.2 identifies instructions that cause VM exits whenever they are executed in VMX non-root operation (and thus can never be executed in VMX non-root operation). Section 25.1.3 identifies instructions that cause VM exits depending on the settings of certain VM-execution control fields (see Section 24.6). 25.1.1 Relative Priority of Faults and VM Exits The following principles describe the ordering between existing faults and VM exits: • Certain exceptions have priority over VM exits. These include invalid-opcode exceptions, faults based on privilege level,1 and general-protection exceptions that are based on checking I/O permission bits in the taskstate segment (TSS). For example, execution of RDMSR with CPL = 3 generates a general-protection exception and not a VM exit.2 • Faults incurred while fetching instruction operands have priority over VM exits that are conditioned based on the contents of those operands (see LMSW in Section 25.1.3). • VM exits caused by execution of the INS and OUTS instructions (resulting either because the “unconditional I/O exiting” VM-execution control is 1 or because the “use I/O bitmaps control is 1) have priority over the following faults: 1. These include faults generated by attempts to execute, in virtual-8086 mode, privileged instructions that are not recognized in that mode. 2. MOV DR is an exception to this rule; see Section 25.1.3. Vol. 3C 25-1 VMX NON-ROOT OPERATION — A general-protection fault due to the relevant segment (ES for INS; DS for OUTS unless overridden by an instruction prefix) being unusable — A general-protection fault due to an offset beyond the limit of the relevant segment — An alignment-check exception • Fault-like VM exits have priority over exceptions other than those mentioned above. For example, RDMSR of a non-existent MSR with CPL = 0 generates a VM exit and not a general-protection exception. When Section 25.1.2 or Section 25.1.3 (below) identify an instruction execution that may lead to a VM exit, it is assumed that the instruction does not incur a fault that takes priority over a VM exit. 25.1.2 Instructions That Cause VM Exits Unconditionally The following instructions cause VM exits when they are executed in VMX non-root operation: CPUID, GETSEC,1 INVD, and XSETBV. This is also true of instructions introduced with VMX, which include: INVEPT, INVVPID, VMCALL,2 VMCLEAR, VMLAUNCH, VMPTRLD, VMPTRST, VMRESUME, VMXOFF, and VMXON. 25.1.3 Instructions That Cause VM Exits Conditionally Certain instructions cause VM exits in VMX non-root operation depending on the setting of the VM-execution controls. The following instructions can cause “fault-like” VM exits based on the conditions described:3 • CLTS. The CLTS instruction causes a VM exit if the bits in position 3 (corresponding to CR0.TS) are set in both the CR0 guest/host mask and the CR0 read shadow. • ENCLS. The ENCLS instruction causes a VM exit if the “enable ENCLS exiting” VM-execution control is 1 and one of the following is true: — The value of EAX is less than 63 and the corresponding bit in the ENCLS-exiting bitmap is 1 (see Section 24.6.16). — The value of EAX is greater than or equal to 63 and bit 63 in the ENCLS-exiting bitmap is 1. • ENCLV. The ENCLV instruction causes a VM exit if the “enable ENCLV exiting” VM-execution control is 1 and one of the following is true: — The value of EAX is less than 63 and the corresponding bit in the ENCLV-exiting bitmap is 1 (see Section 24.6.17). — The value of EAX is greater than or equal to 63 and bit 63 in the ENCLV-exiting bitmap is 1. • • HLT. The HLT instruction causes a VM exit if the “HLT exiting” VM-execution control is 1. IN, INS/INSB/INSW/INSD, OUT, OUTS/OUTSB/OUTSW/OUTSD. The behavior of each of these instructions is determined by the settings of the “unconditional I/O exiting” and “use I/O bitmaps” VM-execution controls: — If both controls are 0, the instruction executes normally. — If the “unconditional I/O exiting” VM-execution control is 1 and the “use I/O bitmaps” VM-execution control is 0, the instruction causes a VM exit. — If the “use I/O bitmaps” VM-execution control is 1, the instruction causes a VM exit if it attempts to access an I/O port corresponding to a bit set to 1 in the appropriate I/O bitmap (see Section 24.6.4). If an I/O operation “wraps around” the 16-bit I/O-port space (accesses ports FFFFH and 0000H), the I/O instruction 1. An execution of GETSEC in VMX non-root operation causes a VM exit if CR4.SMXE[Bit 14] = 1 regardless of the value of CPL or RAX. An execution of GETSEC causes an invalid-opcode exception (#UD) if CR4.SMXE[Bit 14] = 0. 2. Under the dual-monitor treatment of SMIs and SMM, executions of VMCALL cause SMM VM exits in VMX root operation outside SMM. See Section 34.15.2. 3. Many of the items in this section refer to secondary processor-based VM-execution controls. If bit 31 of the primary processorbased VM-execution controls is 0, VMX non-root operation functions as if these controls were all 0. See Section 24.6.2. 25-2 Vol. 3C VMX NON-ROOT OPERATION causes a VM exit (the “unconditional I/O exiting” VM-execution control is ignored if the “use I/O bitmaps” VM-execution control is 1). See Section 25.1.1 for information regarding the priority of VM exits relative to faults that may be caused by the INS and OUTS instructions. • • INVLPG. The INVLPG instruction causes a VM exit if the “INVLPG exiting” VM-execution control is 1. • LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, STR. These instructions cause VM exits if the “descriptor-table exiting” VM-execution control is 1. • LMSW. In general, the LMSW instruction causes a VM exit if it would write, for any bit set in the low 4 bits of the CR0 guest/host mask, a value different than the corresponding bit in the CR0 read shadow. LMSW never clears bit 0 of CR0 (CR0.PE); thus, LMSW causes a VM exit if either of the following are true: INVPCID. The INVPCID instruction causes a VM exit if the “INVLPG exiting” and “enable INVPCID” VM-execution controls are both 1. — The bits in position 0 (corresponding to CR0.PE) are set in both the CR0 guest/host mask and the source operand, and the bit in position 0 is clear in the CR0 read shadow. — For any bit position in the range 3:1, the bit in that position is set in the CR0 guest/host mask and the values of the corresponding bits in the source operand and the CR0 read shadow differ. • • MONITOR. The MONITOR instruction causes a VM exit if the “MONITOR exiting” VM-execution control is 1. • MOV from CR8. The MOV from CR8 instruction causes a VM exit if the “CR8-store exiting” VM-execution control is 1. • MOV to CR0. The MOV to CR0 instruction causes a VM exit unless the value of its source operand matches, for the position of each bit set in the CR0 guest/host mask, the corresponding bit in the CR0 read shadow. (If every bit is clear in the CR0 guest/host mask, MOV to CR0 cannot cause a VM exit.) • MOV to CR3. The MOV to CR3 instruction causes a VM exit unless the “CR3-load exiting” VM-execution control is 0 or the value of its source operand is equal to one of the CR3-target values specified in the VMCS. Only the first n CR3-target values are considered, where n is the CR3-target count. If the “CR3-load exiting” VMexecution control is 1 and the CR3-target count is 0, MOV to CR3 always causes a VM exit. MOV from CR3. The MOV from CR3 instruction causes a VM exit if the “CR3-store exiting” VM-execution control is 1. The first processors to support the virtual-machine extensions supported only the 1-setting of this control. The first processors to support the virtual-machine extensions supported only the 1-setting of the “CR3-load exiting” VM-execution control. These processors always consult the CR3-target controls to determine whether an execution of MOV to CR3 causes a VM exit. • MOV to CR4. The MOV to CR4 instruction causes a VM exit unless the value of its source operand matches, for the position of each bit set in the CR4 guest/host mask, the corresponding bit in the CR4 read shadow. • • MOV to CR8. The MOV to CR8 instruction causes a VM exit if the “CR8-load exiting” VM-execution control is 1. • MWAIT. The MWAIT instruction causes a VM exit if the “MWAIT exiting” VM-execution control is 1. If this control is 0, the behavior of the MWAIT instruction may be modified (see Section 25.3). • PAUSE. The behavior of each of this instruction depends on CPL and the settings of the “PAUSE exiting” and “PAUSE-loop exiting” VM-execution controls: MOV DR. The MOV DR instruction causes a VM exit if the “MOV-DR exiting” VM-execution control is 1. Such VM exits represent an exception to the principles identified in Section 25.1.1 in that they take priority over the following: general-protection exceptions based on privilege level; and invalid-opcode exceptions that occur because CR4.DE=1 and the instruction specified access to DR4 or DR5. — CPL = 0. • If the “PAUSE exiting” and “PAUSE-loop exiting” VM-execution controls are both 0, the PAUSE instruction executes normally. • If the “PAUSE exiting” VM-execution control is 1, the PAUSE instruction causes a VM exit (the “PAUSEloop exiting” VM-execution control is ignored if CPL = 0 and the “PAUSE exiting” VM-execution control is 1). Vol. 3C 25-3 VMX NON-ROOT OPERATION • If the “PAUSE exiting” VM-execution control is 0 and the “PAUSE-loop exiting” VM-execution control is 1, the following treatment applies. The processor determines the amount of time between this execution of PAUSE and the previous execution of PAUSE at CPL 0. If this amount of time exceeds the value of the VM-execution control field PLE_Gap, the processor considers this execution to be the first execution of PAUSE in a loop. (It also does so for the first execution of PAUSE at CPL 0 after VM entry.) Otherwise, the processor determines the amount of time since the most recent execution of PAUSE that was considered to be the first in a loop. If this amount of time exceeds the value of the VM-execution control field PLE_Window, a VM exit occurs. For purposes of these computations, time is measured based on a counter that runs at the same rate as the timestamp counter (TSC). — CPL > 0. • • If the “PAUSE exiting” VM-execution control is 0, the PAUSE instruction executes normally. If the “PAUSE exiting” VM-execution control is 1, the PAUSE instruction causes a VM exit. The “PAUSE-loop exiting” VM-execution control is ignored if CPL > 0. • RDMSR. The RDMSR instruction causes a VM exit if any of the following are true: — The “use MSR bitmaps” VM-execution control is 0. — The value of ECX is not in the ranges 00000000H – 00001FFFH and C0000000H – C0001FFFH. — The value of ECX is in the range 00000000H – 00001FFFH and bit n in read bitmap for low MSRs is 1, where n is the value of ECX. — The value of ECX is in the range C0000000H – C0001FFFH and bit n in read bitmap for high MSRs is 1, where n is the value of ECX & 00001FFFH. See Section 24.6.9 for details regarding how these bitmaps are identified. • • • • • RDPMC. The RDPMC instruction causes a VM exit if the “RDPMC exiting” VM-execution control is 1. • • RSM. The RSM instruction causes a VM exit if executed in system-management mode (SMM).1 • UMWAIT. The UMWAIT instruction causes a VM exit if the “RDTSC exiting” and “enable user wait and pause” VM-execution controls are both 1. • VMREAD. The VMREAD instruction causes a VM exit if any of the following are true: RDRAND. The RDRAND instruction causes a VM exit if the “RDRAND exiting” VM-execution control is 1. RDSEED. The RDSEED instruction causes a VM exit if the “RDSEED exiting” VM-execution control is 1. RDTSC. The RDTSC instruction causes a VM exit if the “RDTSC exiting” VM-execution control is 1. RDTSCP. The RDTSCP instruction causes a VM exit if the “RDTSC exiting” and “enable RDTSCP” VM-execution controls are both 1. TPAUSE. The TPAUSE instruction causes a VM exit if the “RDTSC exiting” and “enable user wait and pause” VM-execution controls are both 1. — The “VMCS shadowing” VM-execution control is 0. — Bits 63:15 (bits 31:15 outside 64-bit mode) of the register source operand are not all 0. — Bit n in VMREAD bitmap is 1, where n is the value of bits 14:0 of the register source operand. See Section 24.6.15 for details regarding how the VMREAD bitmap is identified. If the VMREAD instruction does not cause a VM exit, it reads from the VMCS referenced by the VMCS link pointer. See Chapter 30, “VMREAD—Read Field from Virtual-Machine Control Structure” for details of the operation of the VMREAD instruction. • VMWRITE. The VMWRITE instruction causes a VM exit if any of the following are true: — The “VMCS shadowing” VM-execution control is 0. 1. Execution of the RSM instruction outside SMM causes an invalid-opcode exception regardless of whether the processor is in VMX operation. It also does so in VMX root operation in SMM; see Section 34.15.3. 25-4 Vol. 3C VMX NON-ROOT OPERATION — Bits 63:15 (bits 31:15 outside 64-bit mode) of the register source operand are not all 0. — Bit n in VMWRITE bitmap is 1, where n is the value of bits 14:0 of the register source operand. See Section 24.6.15 for details regarding how the VMWRITE bitmap is identified. If the VMWRITE instruction does not cause a VM exit, it writes to the VMCS referenced by the VMCS link pointer. See Chapter 30, “VMWRITE—Write Field to Virtual-Machine Control Structure” for details of the operation of the VMWRITE instruction. • • WBINVD. The WBINVD instruction causes a VM exit if the “WBINVD exiting” VM-execution control is 1. WRMSR. The WRMSR instruction causes a VM exit if any of the following are true: — The “use MSR bitmaps” VM-execution control is 0. — The value of ECX is not in the ranges 00000000H – 00001FFFH and C0000000H – C0001FFFH. — The value of ECX is in the range 00000000H – 00001FFFH and bit n in write bitmap for low MSRs is 1, where n is the value of ECX. — The value of ECX is in the range C0000000H – C0001FFFH and bit n in write bitmap for high MSRs is 1, where n is the value of ECX & 00001FFFH. See Section 24.6.9 for details regarding how these bitmaps are identified. • XRSTORS. The XRSTORS instruction causes a VM exit if the “enable XSAVES/XRSTORS” VM-execution control is 1and any bit is set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSS-exiting bitmap (see Section 24.6.20). • XSAVES. The XSAVES instruction causes a VM exit if the “enable XSAVES/XRSTORS” VM-execution control is 1 and any bit is set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSSexiting bitmap (see Section 24.6.20). 25.2 OTHER CAUSES OF VM EXITS In addition to VM exits caused by instruction execution, the following events can cause VM exits: • Exceptions. Exceptions (faults, traps, and aborts) cause VM exits based on the exception bitmap (see Section 24.6.3). If an exception occurs, its vector (in the range 0–31) is used to select a bit in the exception bitmap. If the bit is 1, a VM exit occurs; if the bit is 0, the exception is delivered normally through the guest IDT. This use of the exception bitmap applies also to exceptions generated by the instructions INT1, INT3, INTO, BOUND, UD0, UD1, and UD2.1 Page faults (exceptions with vector 14) are specially treated. When a page fault occurs, a processor consults (1) bit 14 of the exception bitmap; (2) the error code produced with the page fault [PFEC]; (3) the page-fault error-code mask field [PFEC_MASK]; and (4) the page-fault error-code match field [PFEC_MATCH]. It checks if PFEC & PFEC_MASK = PFEC_MATCH. If there is equality, the specification of bit 14 in the exception bitmap is followed (for example, a VM exit occurs if that bit is set). If there is inequality, the meaning of that bit is reversed (for example, a VM exit occurs if that bit is clear). Thus, if software desires VM exits on all page faults, it can set bit 14 in the exception bitmap to 1 and set the page-fault error-code mask and match fields each to 00000000H. If software desires VM exits on no page faults, it can set bit 14 in the exception bitmap to 1, the page-fault error-code mask field to 00000000H, and the page-fault error-code match field to FFFFFFFFH. • Triple fault. A VM exit occurs if the logical processor encounters an exception while attempting to call the double-fault handler and that exception itself does not cause a VM exit due to the exception bitmap. This applies to the case in which the double-fault exception was generated within VMX non-root operation, the case in which the double-fault exception was generated during event injection by VM entry, and to the case in which VM entry is injecting a double-fault exception. • External interrupts. An external interrupt causes a VM exit if the “external-interrupt exiting” VM-execution control is 1. (See Section 25.6 for an exception.) Otherwise, the interrupt is delivered normally through the IDT. (If a logical processor is in the shutdown state or the wait-for-SIPI state, external interrupts are blocked. The interrupt is not delivered through the IDT and no VM exit occurs.) 1. INT1 and INT3 refer to the instructions with opcodes F1 and CC, respectively, and not to INT n with value 1 or 3 for n. Vol. 3C 25-5 VMX NON-ROOT OPERATION • Non-maskable interrupts (NMIs). An NMI causes a VM exit if the “NMI exiting” VM-execution control is 1. Otherwise, it is delivered using descriptor 2 of the IDT. (If a logical processor is in the wait-for-SIPI state, NMIs are blocked. The NMI is not delivered through the IDT and no VM exit occurs.) • INIT signals. INIT signals cause VM exits. A logical processor performs none of the operations normally associated with these events. Such exits do not modify register state or clear pending events as they would outside of VMX operation. (If a logical processor is in the wait-for-SIPI state, INIT signals are blocked. They do not cause VM exits in this case.) • Start-up IPIs (SIPIs). SIPIs cause VM exits. If a logical processor is not in the wait-for-SIPI activity state when a SIPI arrives, no VM exit occurs and the SIPI is discarded. VM exits due to SIPIs do not perform any of the normal operations associated with those events: they do not modify register state as they would outside of VMX operation. (If a logical processor is not in the wait-for-SIPI state, SIPIs are blocked. They do not cause VM exits in this case.) • Task switches. Task switches are not allowed in VMX non-root operation. Any attempt to effect a task switch in VMX non-root operation causes a VM exit. See Section 25.4.2. • System-management interrupts (SMIs). If the logical processor is using the dual-monitor treatment of SMIs and system-management mode (SMM), SMIs cause SMM VM exits. See Section 34.15.2.1 • VMX-preemption timer. A VM exit occurs when the timer counts down to zero. See Section 25.5.1 for details of operation of the VMX-preemption timer. Debug-trap exceptions and higher priority events take priority over VM exits caused by the VMX-preemption timer. VM exits caused by the VMX-preemption timer take priority over VM exits caused by the “NMI-window exiting” VM-execution control and lower priority events. These VM exits wake a logical processor from the same inactive states as would a non-maskable interrupt. Specifically, they wake a logical processor from the shutdown state and from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the wait-for-SIPI state. In addition, there are controls that cause VM exits based on the readiness of guest software to receive interrupts: • If the “interrupt-window exiting” VM-execution control is 1, a VM exit occurs before execution of any instruction if RFLAGS.IF = 1 and there is no blocking of events by STI or by MOV SS (see Table 24-3). Such a VM exit occurs immediately after VM entry if the above conditions are true (see Section 26.7.5). Non-maskable interrupts (NMIs) and higher priority events take priority over VM exits caused by this control. VM exits caused by this control take priority over external interrupts and lower priority events. These VM exits wake a logical processor from the same inactive states as would an external interrupt. Specifically, they wake a logical processor from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the shutdown state or the wait-for-SIPI state. • If the “NMI-window exiting” VM-execution control is 1, a VM exit occurs before execution of any instruction if there is no virtual-NMI blocking and there is no blocking of events by MOV SS (see Table 24-3). (A logical processor may also prevent such a VM exit if there is blocking of events by STI.) Such a VM exit occurs immediately after VM entry if the above conditions are true (see Section 26.7.6). VM exits caused by the VMX-preemption timer and higher priority events take priority over VM exits caused by this control. VM exits caused by this control take priority over non-maskable interrupts (NMIs) and lower priority events. These VM exits wake a logical processor from the same inactive states as would an NMI. Specifically, they wake a logical processor from the shutdown state and from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the wait-for-SIPI state. 25.3 CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION The behavior of some instructions is changed in VMX non-root operation. Some of these changes are determined by the settings of certain VM-execution control fields. The following items detail such changes:2 1. Under the dual-monitor treatment of SMIs and SMM, SMIs also cause SMM VM exits if they occur in VMX root operation outside SMM. If the processor is using the default treatment of SMIs and SMM, SMIs are delivered as described in Section 34.14.1. 25-6 Vol. 3C VMX NON-ROOT OPERATION • CLTS. Behavior of the CLTS instruction is determined by the bits in position 3 (corresponding to CR0.TS) in the CR0 guest/host mask and the CR0 read shadow: — If bit 3 in the CR0 guest/host mask is 0, CLTS clears CR0.TS normally (the value of bit 3 in the CR0 read shadow is irrelevant in this case), unless CR0.TS is fixed to 1 in VMX operation (see Section 23.8), in which case CLTS causes a general-protection exception. — If bit 3 in the CR0 guest/host mask is 1 and bit 3 in the CR0 read shadow is 0, CLTS completes but does not change the contents of CR0.TS. — If the bits in position 3 in the CR0 guest/host mask and the CR0 read shadow are both 1, CLTS causes a VM exit. • INVPCID. Behavior of the INVPCID instruction is determined first by the setting of the “enable INVPCID” VM-execution control: — If the “enable INVPCID” VM-execution control is 0, INVPCID causes an invalid-opcode exception (#UD). This exception takes priority over any other exception the instruction may incur. — If the “enable INVPCID” VM-execution control is 1, treatment is based on the setting of the “INVLPG exiting” VM-execution control: • • • If the “INVLPG exiting” VM-execution control is 0, INVPCID operates normally. If the “INVLPG exiting” VM-execution control is 1, INVPCID causes a VM exit. IRET. Behavior of IRET with regard to NMI blocking (see Table 24-3) is determined by the settings of the “NMI exiting” and “virtual NMIs” VM-execution controls: — If the “NMI exiting” VM-execution control is 0, IRET operates normally and unblocks NMIs. (If the “NMI exiting” VM-execution control is 0, the “virtual NMIs” control must be 0; see Section 26.2.1.1.) — If the “NMI exiting” VM-execution control is 1, IRET does not affect blocking of NMIs. If, in addition, the “virtual NMIs” VM-execution control is 1, the logical processor tracks virtual-NMI blocking. In this case, IRET removes any virtual-NMI blocking. The unblocking of NMIs or virtual NMIs specified above occurs even if IRET causes a fault. • LMSW. Outside of VMX non-root operation, LMSW loads its source operand into CR0[3:0], but it does not clear CR0.PE if that bit is set. In VMX non-root operation, an execution of LMSW that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR0[3:0] corresponding to a bit set in the CR0 guest/host mask. An attempt to set any other bit in CR0[3:0] to a value not supported in VMX operation (see Section 23.8) causes a general-protection exception. Attempts to clear CR0.PE are ignored without fault. • MOV from CR0. The behavior of MOV from CR0 is determined by the CR0 guest/host mask and the CR0 read shadow. For each position corresponding to a bit clear in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR0. For each position corresponding to a bit set in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR0 read shadow. Thus, if every bit is cleared in the CR0 guest/host mask, MOV from CR0 reads normally from CR0; if every bit is set in the CR0 guest/host mask, MOV from CR0 returns the value of the CR0 read shadow. Depending on the contents of the CR0 guest/host mask and the CR0 read shadow, bits may be set in the destination that would never be set when reading directly from CR0. • MOV from CR3. If the “enable EPT” VM-execution control is 1 and an execution of MOV from CR3 does not cause a VM exit (see Section 25.1.3), the value loaded from CR3 is a guest-physical address; see Section 28.2.1. • MOV from CR4. The behavior of MOV from CR4 is determined by the CR4 guest/host mask and the CR4 read shadow. For each position corresponding to a bit clear in the CR4 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR4. For each position corresponding to a bit set in the CR4 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR4 read shadow. Thus, if every bit is cleared in the CR4 guest/host mask, MOV from CR4 reads normally from CR4; if every bit is set in the CR4 guest/host mask, MOV from CR4 returns the value of the CR4 read shadow. 2. Some of the items in this section refer to secondary processor-based VM-execution controls. If bit 31 of the primary processorbased VM-execution controls is 0, VMX non-root operation functions as if these controls were all 0. See Section 24.6.2. Vol. 3C 25-7 VMX NON-ROOT OPERATION Depending on the contents of the CR4 guest/host mask and the CR4 read shadow, bits may be set in the destination that would never be set when reading directly from CR4. • MOV from CR8. If the MOV from CR8 instruction does not cause a VM exit (see Section 25.1.3), its behavior is modified if the “use TPR shadow” VM-execution control is 1; see Section 29.3. • MOV to CR0. An execution of MOV to CR0 that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR0 corresponding to a bit set in the CR0 guest/host mask. Treatment of attempts to modify other bits in CR0 depends on the setting of the “unrestricted guest” VM-execution control: — If the control is 0, MOV to CR0 causes a general-protection exception if it attempts to set any bit in CR0 to a value not supported in VMX operation (see Section 23.8). — If the control is 1, MOV to CR0 causes a general-protection exception if it attempts to set any bit in CR0 other than bit 0 (PE) or bit 31 (PG) to a value not supported in VMX operation. It remains the case, however, that MOV to CR0 causes a general-protection exception if it would result in CR0.PE = 0 and CR0.PG = 1 or if it would result in CR0.PG = 1, CR4.PAE = 0, and IA32_EFER.LME = 1. • MOV to CR3. If the “enable EPT” VM-execution control is 1 and an execution of MOV to CR3 does not cause a VM exit (see Section 25.1.3), the value loaded into CR3 is treated as a guest-physical address; see Section 28.2.1. — If PAE paging is not being used, the instruction does not use the guest-physical address to access memory and it does not cause it to be translated through EPT.1 — If PAE paging is being used, the instruction translates the guest-physical address through EPT and uses the result to load the four (4) page-directory-pointer-table entries (PDPTEs). The instruction does not use the guest-physical addresses the PDPTEs to access memory and it does not cause them to be translated through EPT. • MOV to CR4. An execution of MOV to CR4 that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR4 corresponding to a bit set in the CR4 guest/host mask. Such an execution causes a general-protection exception if it attempts to set any bit in CR4 (not corresponding to a bit set in the CR4 guest/host mask) to a value not supported in VMX operation (see Section 23.8). • MOV to CR8. If the MOV to CR8 instruction does not cause a VM exit (see Section 25.1.3), its behavior is modified if the “use TPR shadow” VM-execution control is 1; see Section 29.3. • MWAIT. Behavior of the MWAIT instruction (which always causes an invalid-opcode exception—#UD—if CPL > 0) is determined by the setting of the “MWAIT exiting” VM-execution control: — If the “MWAIT exiting” VM-execution control is 1, MWAIT causes a VM exit. — If the “MWAIT exiting” VM-execution control is 0, MWAIT operates normally if one of the following are true: (1) ECX[0] is 0; (2) RFLAGS.IF = 1; or both of the following are true: (a) the “interrupt-window exiting” VMexecution control is 0; and (b) the logical processor has not recognized a pending virtual interrupt (see Section 29.2.1). — If the “MWAIT exiting” VM-execution control is 0, ECX[0] = 1, and RFLAGS.IF = 0, MWAIT does not cause the processor to enter an implementation-dependent optimized state if either the “interrupt-window exiting” VM-execution control is 1 or the logical processor has recognized a pending virtual interrupt; instead, control passes to the instruction following the MWAIT instruction. • RDMSR. Section 25.1.3 identifies when executions of the RDMSR instruction cause VM exits. If such an execution causes neither a fault due to CPL > 0 nor a VM exit, the instruction’s behavior may be modified for certain values of ECX: — If ECX contains 10H (indicating the IA32_TIME_STAMP_COUNTER MSR), the value returned by the instruction is determined by the setting of the “use TSC offsetting” VM-execution control: • If the control is 0, RDMSR operates normally, loading EAX:EDX with the value of the IA32_TIME_STAMP_COUNTER MSR. • If the control is 1, the value returned is determined by the setting of the “use TSC scaling” VM-execution control: 1. A logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1 and IA32_EFER.LMA = 0. See Section 4.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 25-8 Vol. 3C VMX NON-ROOT OPERATION — If the control is 0, RDMSR loads EAX:EDX with the sum of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset. — If the control is 1, RDMSR first computes the product of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC multiplier. It then shifts the value of the product right 48 bits and loads EAX:EDX with the sum of that shifted value and the value of the TSC offset. The 1-setting of the “use TSC-offsetting” VM-execution control does not affect executions of RDMSR if ECX contains 6E0H (indicating the IA32_TSC_DEADLINE MSR). Such executions return the APIC-timer deadline relative to the actual timestamp counter without regard to the TSC offset. — If ECX is in the range 800H–8FFH (indicating an APIC MSR), instruction behavior may be modified if the “virtualize x2APIC mode” VM-execution control is 1; see Section 29.5. • RDPID. Behavior of the RDPID instruction is determined first by the setting of the “enable RDTSCP” VM-execution control: — If the “enable RDTSCP” VM-execution control is 0, RDPID causes an invalid-opcode exception (#UD). — If the “enable RDTSCP” VM-execution control is 1, RDPID operates normally. • RDTSC. Behavior of the RDTSC instruction is determined by the settings of the “RDTSC exiting” and “use TSC offsetting” VM-execution controls: — If both controls are 0, RDTSC operates normally. — If the “RDTSC exiting” VM-execution control is 0 and the “use TSC offsetting” VM-execution control is 1, the value returned is determined by the setting of the “use TSC scaling” VM-execution control: • If the control is 0, RDTSC loads EAX:EDX with the sum of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset. • If the control is 1, RDTSC first computes the product of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC multiplier. It then shifts the value of the product right 48 bits and loads EAX:EDX with the sum of that shifted value and the value of the TSC offset. — If the “RDTSC exiting” VM-execution control is 1, RDTSC causes a VM exit. • RDTSCP. Behavior of the RDTSCP instruction is determined first by the setting of the “enable RDTSCP” VM-execution control: — If the “enable RDTSCP” VM-execution control is 0, RDTSCP causes an invalid-opcode exception (#UD). This exception takes priority over any other exception the instruction may incur. — If the “enable RDTSCP” VM-execution control is 1, treatment is based on the settings of the “RDTSC exiting” and “use TSC offsetting” VM-execution controls: • • If both controls are 0, RDTSCP operates normally. If the “RDTSC exiting” VM-execution control is 0 and the “use TSC offsetting” VM-execution control is 1, the value returned is determined by the setting of the “use TSC scaling” VM-execution control: — If the control is 0, RDTSCP loads EAX:EDX with the sum of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset. — If the control is 1, RDTSCP first computes the product of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC multiplier. It then shifts the value of the product right 48 bits and loads EAX:EDX with the sum of that shifted value and the value of the TSC offset. In either case, RDTSCP also loads ECX with the value of bits 31:0 of the IA32_TSC_AUX MSR. • • If the “RDTSC exiting” VM-execution control is 1, RDTSCP causes a VM exit. SMSW. The behavior of SMSW is determined by the CR0 guest/host mask and the CR0 read shadow. For each position corresponding to a bit clear in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR0. For each position corresponding to a bit set in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR0 read shadow. Thus, if every bit is cleared in the CR0 guest/host mask, SMSW reads normally from CR0; if every bit is set in the CR0 guest/host mask, SMSW returns the value of the CR0 read shadow. Vol. 3C 25-9 VMX NON-ROOT OPERATION Note the following: (1) for any memory destination or for a 16-bit register destination, only the low 16 bits of the CR0 guest/host mask and the CR0 read shadow are used (bits 63:16 of a register destination are left unchanged); (2) for a 32-bit register destination, only the low 32 bits of the CR0 guest/host mask and the CR0 read shadow are used (bits 63:32 of the destination are cleared); and (3) depending on the contents of the CR0 guest/host mask and the CR0 read shadow, bits may be set in the destination that would never be set when reading directly from CR0. • TPAUSE. Behavior of the TPAUSE instruction is determined first by the setting of the “enable user wait and pause” VM-execution control: — If the “enable user wait and pause” VM-execution control is 0, TPAUSE causes an invalid-opcode exception (#UD). This exception takes priority over any exception the instruction may incur. — If the “enable user wait and pause” VM-execution control is 1, treatment is based on the setting of the “RDTSC exiting” VM-execution control: • If the “RDTSC exiting” VM-execution control is 0, the instruction delays for an amount of time called here the physical delay. The physical delay is first computed by determining the virtual delay (the time to delay relative to the guest’s timestamp counter). If IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in EDX:EAX minus the value that RDTSC would return (see above); if IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum of that difference and AND(IA32_UMWAIT_CONTROL,FFFFFFFCH). The physical delay depends upon the settings of the “use TSC offsetting” and “use TSC scaling” VM-execution controls: • • — If either control is 0, the physical delay is the virtual delay. — If both controls are 1, the virtual delay is multiplied by 248 (using a shift) to produce a 128-bit integer. That product is then divided by the TSC multiplier to produce a 64-bit integer. The physical delay is that quotient. If the “RDTSC exiting” VM-execution control is 1, TPAUSE causes a VM exit. UMONITOR. Behavior of the UMONITOR instruction is determined by the setting of the “enable user wait and pause” VM-execution control: — If the “enable user wait and pause” VM-execution control is 0, UMONITOR causes an invalid-opcode exception (#UD). This exception takes priority over any exception the instruction may incur. — If the “enable user wait and pause” VM-execution control is 1, UMONITOR operates normally. • UMWAIT. Behavior of the UMWAIT instruction is determined first by the setting of the “enable user wait and pause” VM-execution control: — If the “enable user wait and pause” VM-execution control is 0, UMWAIT causes an invalid-opcode exception (#UD). This exception takes priority over any exception the instruction may incur. — If the “enable user wait and pause” VM-execution control is 1, treatment is based on the setting of the “RDTSC exiting” VM-execution control: • If the “RDTSC exiting” VM-execution control is 0, and if the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay (the time to delay relative to the guest’s timestamp counter). If IA32_UMWAIT_CONTROL[31:2] is zero, the virtual delay is the value in EDX:EAX minus the value that RDTSC would return (see above); if IA32_UMWAIT_CONTROL[31:2] is not zero, the virtual delay is the minimum of that difference and AND(IA32_UMWAIT_CONTROL,FFFFFFFCH). The physical delay depends upon the settings of the “use TSC offsetting” and “use TSC scaling” VM-execution controls: • 25-10 Vol. 3C — If either control is 0, the physical delay is the virtual delay. — If both controls are 1, the virtual delay is multiplied by 248 (using a shift) to produce a 128-bit integer. That product is then divided by the TSC multiplier to produce a 64-bit integer. The physical delay is that quotient. If the “RDTSC exiting” VM-execution control is 1, UMWAIT causes a VM exit. VMX NON-ROOT OPERATION • WRMSR. Section 25.1.3 identifies when executions of the WRMSR instruction cause VM exits. If such an execution neither a fault due to CPL > 0 nor a VM exit, the instruction’s behavior may be modified for certain values of ECX: — If ECX contains 79H (indicating IA32_BIOS_UPDT_TRIG MSR), no microcode update is loaded, and control passes to the next instruction. This implies that microcode updates cannot be loaded in VMX non-root operation. — On processors that support Intel PT but which do not allow it to be used in VMX operation, if ECX contains 570H (indicating the IA32_RTIT_CTL MSR), the instruction causes a general-protection exception.1 — If ECX contains 808H (indicating the TPR MSR), 80BH (the EOI MSR), or 83FH (self-IPI MSR), instruction behavior may modified if the “virtualize x2APIC mode” VM-execution control is 1; see Section 29.5. • XRSTORS. Behavior of the XRSTORS instruction is determined first by the setting of the “enable XSAVES/XRSTORS” VM-execution control: — If the “enable XSAVES/XRSTORS” VM-execution control is 0, XRSTORS causes an invalid-opcode exception (#UD). — If the “enable XSAVES/XRSTORS” VM-execution control is 1, treatment is based on the value of the XSSexiting bitmap (see Section 24.6.20): • • XRSTORS causes a VM exit if any bit is set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSS-exiting bitmap. • Otherwise, XRSTORS operates normally. XSAVES. Behavior of the XSAVES instruction is determined first by the setting of the “enable XSAVES/XRSTORS” VM-execution control: — If the “enable XSAVES/XRSTORS” VM-execution control is 0, XSAVES causes an invalid-opcode exception (#UD). — If the “enable XSAVES/XRSTORS” VM-execution control is 1, treatment is based on the value of the XSSexiting bitmap (see Section 24.6.20): • XSAVES causes a VM exit if any bit is set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSS-exiting bitmap. • Otherwise, XSAVES operates normally. 25.4 OTHER CHANGES IN VMX NON-ROOT OPERATION Treatments of event blocking and of task switches differ in VMX non-root operation as described in the following sections. 25.4.1 Event Blocking Event blocking is modified in VMX non-root operation as follows: • If the “external-interrupt exiting” VM-execution control is 1, RFLAGS.IF does not control the blocking of external interrupts. In this case, an external interrupt that is not blocked for other reasons causes a VM exit (even if RFLAGS.IF = 0). • If the “external-interrupt exiting” VM-execution control is 1, external interrupts may or may not be blocked by STI or by MOV SS (behavior is implementation-specific). • If the “NMI exiting” VM-execution control is 1, non-maskable interrupts (NMIs) may or may not be blocked by STI or by MOV SS (behavior is implementation-specific). 1. Software should read the VMX capability MSR IA32_VMX_MISC to determine whether the processor allows Intel PT to be used in VMX operation (see Appendix A.6). Vol. 3C 25-11 VMX NON-ROOT OPERATION 25.4.2 Treatment of Task Switches Task switches are not allowed in VMX non-root operation. Any attempt to effect a task switch in VMX non-root operation causes a VM exit. However, the following checks are performed (in the order indicated), possibly resulting in a fault, before there is any possibility of a VM exit due to task switch: 1. If a task gate is being used, appropriate checks are made on its P bit and on the proper values of the relevant privilege fields. The following cases detail the privilege checks performed: a. If CALL, INT n, INT1, INT3, INTO, or JMP accesses a task gate in IA-32e mode, a general-protection exception occurs. b. If CALL, INT n, INT3, INTO, or JMP accesses a task gate outside IA-32e mode, privilege-levels checks are performed on the task gate but, if they pass, privilege levels are not checked on the referenced task-state segment (TSS) descriptor. c. If CALL or JMP accesses a TSS descriptor directly in IA-32e mode, a general-protection exception occurs. d. If CALL or JMP accesses a TSS descriptor directly outside IA-32e mode, privilege levels are checked on the TSS descriptor. e. If a non-maskable interrupt (NMI), an exception, or an external interrupt accesses a task gate in the IDT in IA-32e mode, a general-protection exception occurs. f. If a non-maskable interrupt (NMI), an exception other than breakpoint exceptions (#BP) and overflow exceptions (#OF), or an external interrupt accesses a task gate in the IDT outside IA-32e mode, no privilege checks are performed. g. If IRET is executed with RFLAGS.NT = 1 in IA-32e mode, a general-protection exception occurs. h. If IRET is executed with RFLAGS.NT = 1 outside IA-32e mode, a TSS descriptor is accessed directly and no privilege checks are made. 2. Checks are made on the new TSS selector (for example, that is within GDT limits). 3. The new TSS descriptor is read. (A page fault results if a relevant GDT page is not present). 4. The TSS descriptor is checked for proper values of type (depends on type of task switch), P bit, S bit, and limit. Only if checks 1–4 all pass (do not generate faults) might a VM exit occur. However, the ordering between a VM exit due to a task switch and a page fault resulting from accessing the old TSS or the new TSS is implementationspecific. Some processors may generate a page fault (instead of a VM exit due to a task switch) if accessing either TSS would cause a page fault. Other processors may generate a VM exit due to a task switch even if accessing either TSS would cause a page fault. If an attempt at a task switch through a task gate in the IDT causes an exception (before generating a VM exit due to the task switch) and that exception causes a VM exit, information about the event whose delivery that accessed the task gate is recorded in the IDT-vectoring information fields and information about the exception that caused the VM exit is recorded in the VM-exit interruption-information fields. See Section 27.2. The fact that a task gate was being accessed is not recorded in the VMCS. If an attempt at a task switch through a task gate in the IDT causes VM exit due to the task switch, information about the event whose delivery accessed the task gate is recorded in the IDT-vectoring fields of the VMCS. Since the cause of such a VM exit is a task switch and not an interruption, the valid bit for the VM-exit interruption information field is 0. See Section 27.2. 25.5 FEATURES SPECIFIC TO VMX NON-ROOT OPERATION Some VM-execution controls support features that are specific to VMX non-root operation. These are the VMXpreemption timer (Section 25.5.1) and the monitor trap flag (Section 25.5.2), translation of guest-physical addresses (Section 25.5.3 and Section 25.5.4), APIC virtualization (Section 25.5.5), VM functions (Section 25.5.6), and virtualization exceptions (Section 25.5.7). 25-12 Vol. 3C VMX NON-ROOT OPERATION 25.5.1 VMX-Preemption Timer If the last VM entry was performed with the 1-setting of “activate VMX-preemption timer” VM-execution control, the VMX-preemption timer counts down (from the value loaded by VM entry; see Section 26.7.4) in VMX nonroot operation. When the timer counts down to zero, it stops counting down and a VM exit occurs (see Section 25.2). The VMX-preemption timer counts down at rate proportional to that of the timestamp counter (TSC). Specifically, the timer counts down by 1 every time bit X in the TSC changes due to a TSC increment. The value of X is in the range 0–31 and can be determined by consulting the VMX capability MSR IA32_VMX_MISC (see Appendix A.6). The VMX-preemption timer operates in the C-states C0, C1, and C2; it also operates in the shutdown and wait-forSIPI states. If the timer counts down to zero in any state other than the wait-for SIPI state, the logical processor transitions to the C0 C-state and causes a VM exit; the timer does not cause a VM exit if it counts down to zero in the wait-for-SIPI state. The timer is not decremented in C-states deeper than C2. Treatment of the timer in the case of system management interrupts (SMIs) and system-management mode (SMM) depends on whether the treatment of SMIs and SMM: • If the default treatment of SMIs and SMM (see Section 34.14) is active, the VMX-preemption timer counts across an SMI to VMX non-root operation, subsequent execution in SMM, and the return from SMM via the RSM instruction. However, the timer can cause a VM exit only from VMX non-root operation. If the timer expires during SMI, in SMM, or during RSM, a timer-induced VM exit occurs immediately after RSM with its normal priority unless it is blocked based on activity state (Section 25.2). • If the dual-monitor treatment of SMIs and SMM (see Section 34.15) is active, transitions into and out of SMM are VM exits and VM entries, respectively. The treatment of the VMX-preemption timer by those transitions is mostly the same as for ordinary VM exits and VM entries; Section 34.15.2 and Section 34.15.4 detail some differences. 25.5.2 Monitor Trap Flag The monitor trap flag is a debugging feature that causes VM exits to occur on certain instruction boundaries in VMX non-root operation. Such VM exits are called MTF VM exits. An MTF VM exit may occur on an instruction boundary in VMX non-root operation as follows: • If the “monitor trap flag” VM-execution control is 1 and VM entry is injecting a vectored event (see Section 26.6.1), an MTF VM exit is pending on the instruction boundary before the first instruction following the VM entry. • If VM entry is injecting a pending MTF VM exit (see Section 26.6.2), an MTF VM exit is pending on the instruction boundary before the first instruction following the VM entry. This is the case even if the “monitor trap flag” VM-execution control is 0. • If the “monitor trap flag” VM-execution control is 1, VM entry is not injecting an event, and a pending event (e.g., debug exception or interrupt) is delivered before an instruction can execute, an MTF VM exit is pending on the instruction boundary following delivery of the event (or any nested exception). • Suppose that the “monitor trap flag” VM-execution control is 1, VM entry is not injecting an event, and the first instruction following VM entry is a REP-prefixed string instruction: — If the first iteration of the instruction causes a fault, an MTF VM exit is pending on the instruction boundary following delivery of the fault (or any nested exception). — If the first iteration of the instruction does not cause a fault, an MTF VM exit is pending on the instruction boundary after that iteration. • Suppose that the “monitor trap flag” VM-execution control is 1, VM entry is not injecting an event, and the first instruction following VM entry is the XBEGIN instruction. In this case, an MTF VM exit is pending at the fallback instruction address of the XBEGIN instruction. This behavior applies regardless of whether advanced debugging of RTM transactional regions has been enabled (see Section 16.3.7, “RTM-Enabled Debugger Support,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). • Suppose that the “monitor trap flag” VM-execution control is 1, VM entry is not injecting an event, and the first instruction following VM entry is neither a REP-prefixed string instruction or the XBEGIN instruction: Vol. 3C 25-13 VMX NON-ROOT OPERATION — If the instruction causes a fault, an MTF VM exit is pending on the instruction boundary following delivery of the fault (or any nested exception).1 — If the instruction does not cause a fault, an MTF VM exit is pending on the instruction boundary following execution of that instruction. If the instruction is INT1, INT3, or INTO, this boundary follows delivery of any software exception. If the instruction is INT n, this boundary follows delivery of a software interrupt. If the instruction is HLT, the MTF VM exit will be from the HLT activity state. No MTF VM exit occurs if another VM exit occurs before reaching the instruction boundary on which an MTF VM exit would be pending (e.g., due to an exception or triple fault). An MTF VM exit occurs on the instruction boundary on which it is pending unless a higher priority event takes precedence or the MTF VM exit is blocked due to the activity state: • System-management interrupts (SMIs), INIT signals, and higher priority events take priority over MTF VM exits. MTF VM exits take priority over debug-trap exceptions and lower priority events. • No MTF VM exit occurs if the processor is in either the shutdown activity state or wait-for-SIPI activity state. If a non-maskable interrupt subsequently takes the logical processor out of the shutdown activity state without causing a VM exit, an MTF VM exit is pending after delivery of that interrupt. Special treatment may apply to Intel SGX instructions or if the logical processor is in enclave mode. See Section 42.2 for details. 25.5.3 Translation of Guest-Physical Addresses Using EPT The extended page-table mechanism (EPT) is a feature that can be used to support the virtualization of physical memory. When EPT is in use, certain physical addresses are treated as guest-physical addresses and are not used to access memory directly. Instead, guest-physical addresses are translated by traversing a set of EPT paging structures to produce physical addresses that are used to access memory. Details of the EPT mechanism are given in Section 28.2. 25.5.4 Translation of Guest-Physical Addresses Used by Intel Processor Trace As described in Chapter 35, Intel® Processor Trace (Intel PT) captures information about software execution using dedicated hardware facilities. Intel PT can be configured so that the trace output is written to memory using physical addresses. For example, when the ToPA (table of physical addresses) output mechanism is used, the IA32_RTIT_OUTPUT_BASE MSR contains the physical address of the base of the current ToPA. Each entry in that table contains the physical address of an output region in memory. When an output region becomes full, the ToPA output mechanism directs subsequent trace output to the next output region as indicated in the ToPA. When the “Intel PT uses guest physical addresses” VM-execution control is 1, the logical processor treats the addresses used by Intel PT (the output addresses as well as those used to discover the output addresses) as guestphysical addresses, translating to physical addresses using EPT before trace output is written to memory. Translating these addresses through EPT implies that the trace-output mechanism may cause EPT violations and VM exits; details are provided in Section 25.5.4.1. Section 25.5.4.2 describes a mechanism that ensures that these VM exits do not cause loss of trace data. 25.5.4.1 Guest-Physical Address Translation for Intel PT: Details When the “Intel PT uses guest physical addresses” VM-execution control is 1, the addresses used by Intel PT are treated as guest-physical addresses and translated using EPT. These addresses include the addresses of the output regions as well as the addresses of the ToPA entries that contain the output-region addresses. 1. This item includes the cases of an invalid opcode exception—#UD— generated by the UD0, UD1, and UD2 instructions and a BOUNDrange exceeded exception—#BR—generated by the BOUND instruction. 25-14 Vol. 3C VMX NON-ROOT OPERATION Translation of accesses by the trace-output process may result in EPT violations or EPT misconfigurations (Section 28.2.3), resulting in VM exits. EPT violations resulting from the trace-output process always cause VM exits and are never converted to virtualization exceptions (Section 25.5.7.1). If no EPT violation or EPT misconfiguration occurs and if page-modification logging (Section 28.2.6) is enabled, the address of an output region may be added to the page-modification log. If the log is full, a page-modification logfull event occurs, resulting in a VM exit. If the “virtualize APIC accesses” VM-execution control is 1, a guest-physical address used by the trace-output process may be translated to an address on the APIC-access page. In this case, the access by the trace-output process causes an APIC-access VM exit as discussed in Section 29.4.6.1. 25.5.4.2 Trace-Address Pre-Translation (TAPT) Because it buffers trace data produced by Intel PT before it is written to memory, the processor ensures that buffered data is not lost when a VM exit disables Intel PT. Specifically, the processor ensures that there is sufficient space left in the current output page for the buffered data. If this were not done, buffered trace data could be lost and the resulting trace corrupted. To prevent the loss of buffered trace data, the processor uses a mechanism called trace-address pre-translation (TAPT). With TAPT, the processor translates using EPT the guest-physical address of the current output region before that address would be used to write buffered trace data to memory. Because of TAPT, no translation (and thus no EPT violation) occurs at the time output is written to memory; the writes to memory use translations that were cached as part of TAPT. (The details given in Section 25.5.4.1 apply to TAPT.) TAPT ensures that, if a write to the output region would cause an EPT violation, the resulting VM exit is delivered at the time of TAPT, before the region would be used. This allows software to resolve the EPT violation at that time and ensures that, when it is necessary to write buffered trace data to memory, that data will not be lost due to an EPT violation. TAPT (and resulting VM exits) may occur at any of the following times: • When software in VMX non-root operation enables tracing by loading the IA32_RTIT_CTL MSR to set the TraceEn bit, using the WRMSR instruction or the XRSTORS instruction. Any VM exit resulting from TAPT in this case is trap-like: the WRMSR or XRSTORS completes before the VM exit occurs (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction). • At an instruction boundary when one output region becomes full and Intel PT transitions to the next output region. VM exits resulting from TAPT in this case take priority over any pending debug exceptions. Such a VM exit will save information about such exceptions in the guest-state area of the VMCS. • As part of a VM entry that enables Intel PT. See Section 26.5 for details. TAPT may translate not only the guest-physical address of the current output region but those of subsequent output regions as well. (Doing so may provide better protection of trace data.) This implies that any VM exits resulting from TAPT may result from the translation of output-region addresses other than that of the current output region. 25.5.5 APIC Virtualization APIC virtualization is a collection of features that can be used to support the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC). When APIC virtualization is enabled, the processor emulates many accesses to the APIC, tracks the state of the virtual APIC, and delivers virtual interrupts — all in VMX nonroot operation without a VM exit. Details of the APIC virtualization are given in Chapter 29. Vol. 3C 25-15 VMX NON-ROOT OPERATION 25.5.6 VM Functions A VM function is an operation provided by the processor that can be invoked from VMX non-root operation without a VM exit. VM functions are enabled and configured by the settings of different fields in the VMCS. Software in VMX non-root operation invokes a VM function with the VMFUNC instruction; the value of EAX selects the specific VM function being invoked. Section 25.5.6.1 explains how VM functions are enabled. Section 25.5.6.2 specifies the behavior of the VMFUNC instruction. Section 25.5.6.3 describes a specific VM function called EPTP switching. 25.5.6.1 Enabling VM Functions Software enables VM functions generally by setting the “enable VM functions” VM-execution control. A specific VM function is enabled by setting the corresponding VM-function control. Suppose, for example, that software wants to enable EPTP switching (VM function 0; see Section 24.6.14).To do so, it must set the “activate secondary controls” VM-execution control (bit 31 of the primary processor-based VMexecution controls), the “enable VM functions” VM-execution control (bit 13 of the secondary processor-based VMexecution controls) and the “EPTP switching” VM-function control (bit 0 of the VM-function controls). 25.5.6.2 General Operation of the VMFUNC Instruction The VMFUNC instruction causes an invalid-opcode exception (#UD) if the “enable VM functions” VM-execution controls is 01 or the value of EAX is greater than 63 (only VM functions 0–63 can be enable). Otherwise, the instruction causes a VM exit if the bit at position EAX is 0 in the VM-function controls (the selected VM function is not enabled). If such a VM exit occurs, the basic exit reason used is 59 (3BH), indicating “VMFUNC”, and the length of the VMFUNC instruction is saved into the VM-exit instruction-length field. If the instruction causes neither an invalid-opcode exception nor a VM exit due to a disabled VM function, it performs the functionality of the VM function specified by the value in EAX. Individual VM functions may perform additional fault checking (e.g., one might cause a general-protection exception if CPL > 0). In addition, specific VM functions may include checks that might result in a VM exit. If such a VM exit occurs, VM-exit information is saved as described in the previous paragraph. The specification of a VM function may indicate that additional VM-exit information is provided. The specific behavior of the EPTP-switching VM function (including checks that result in VM exits) is given in Section 25.5.6.3. 25.5.6.3 EPTP Switching EPTP switching is VM function 0. This VM function allows software in VMX non-root operation to load a new value for the EPT pointer (EPTP), thereby establishing a different EPT paging-structure hierarchy (see Section 28.2 for details of the operation of EPT). Software is limited to selecting from a list of potential EPTP values configured in advance by software in VMX root operation. Specifically, the value of ECX is used to select an entry from the EPTP list, the 4-KByte structure referenced by the EPTP-list address (see Section 24.6.14; because this structure contains 512 8-Byte entries, VMFUNC causes a VM exit if ECX ≥ 512). If the selected entry is a valid EPTP value (it would not cause VM entry to fail; see Section 26.2.1.1), it is stored in the EPTP field of the current VMCS and is used for subsequent accesses using guest-physical addresses. The following pseudocode provides details: IF ECX ≥ 512 THEN VM exit; ELSE tent_EPTP ← 8 bytes from EPTP-list address + 8 * ECX; IF tent_EPTP is not a valid EPTP value (would cause VM entry to fail if in EPTP) THEN VM exit; 1. “Enable VM functions” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the “enable VM functions” VM-execution control were 0. See Section 24.6.2. 25-16 Vol. 3C VMX NON-ROOT OPERATION ELSE write tent_EPTP to the EPTP field in the current VMCS; use tent_EPTP as the new EPTP value for address translation; IF processor supports the 1-setting of the “EPT-violation #VE” VM-execution control THEN write ECX[15:0] to EPTP-index field in current VMCS; use ECX[15:0] as EPTP index for subsequent EPT-violation virtualization exceptions (see Section 25.5.7.2); FI; FI; FI; Execution of the EPTP-switching VM function does not modify the state of any registers; no flags are modified. If the “Intel PT uses guest physical addresses” VM-execution control is 1 and IA32_RTIT_CTL.TraceEn = 1, any execution of the EPTP-switching VM function causes a VM exit.1 As noted in Section 25.5.6.2, an execution of the EPTP-switching VM function that causes a VM exit (as specified above), uses the basic exit reason 59, indicating “VMFUNC”. The length of the VMFUNC instruction is saved into the VM-exit instruction-length field. No additional VM-exit information is provided. An execution of VMFUNC loads EPTP from the EPTP list (and thus does not cause a fault or VM exit) is called an EPTP-switching VMFUNC. After an EPTP-switching VMFUNC, control passes to the next instruction. The logical processor starts creating and using guest-physical and combined mappings associated with the new value of bits 51:12 of EPTP; the combined mappings created and used are associated with the current VPID and PCID (these are not changed by VMFUNC).2 If the “enable VPID” VM-execution control is 0, an EPTP-switching VMFUNC invalidates combined mappings associated with VPID 0000H (for all PCIDs and for all EP4TA values, where EP4TA is the value of bits 51:12 of EPTP). Because an EPTP-switching VMFUNC may change the translation of guest-physical addresses, it may affect use of the guest-physical address in CR3. The EPTP-switching VMFUNC cannot itself cause a VM exit due to an EPT violation or an EPT misconfiguration due to the translation of that guest-physical address through the new EPT paging structures. The following items provide details that apply if CR0.PG = 1: • If 32-bit paging or 4-level paging3 is in use (either CR4.PAE = 0 or IA32_EFER.LMA = 1), the next memory access with a linear address uses the translation of the guest-physical address in CR3 through the new EPT paging structures. As a result, this access may cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during that translation. • If PAE paging is in use (CR4.PAE = 1 and IA32_EFER.LMA = 0), an EPTP-switching VMFUNC does not load the four page-directory-pointer-table entries (PDPTEs) from the guest-physical address in CR3. The logical processor continues to use the four guest-physical addresses already present in the PDPTEs. The guestphysical address in CR3 is not translated through the new EPT paging structures (until some operation that would load the PDPTEs). The EPTP-switching VMFUNC cannot itself cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during the translation of a guest-physical address in any of the PDPTEs. A subsequent memory access with a linear address uses the translation of the guest-physical address in the appropriate PDPTE through the new EPT paging structures. As a result, such an access may cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during that translation. If an EPTP-switching VMFUNC establishes an EPTP value that enables accessed and dirty flags for EPT (by setting bit 6), subsequent memory accesses may fail to set those flags as specified if there has been no appropriate execution of INVEPT since the last use of an EPTP value that does not enable accessed and dirty flags for EPT (because bit 6 is clear) and that is identical to the new value on bits 51:12. IF the processor supports the 1-setting of the “EPT-violation #VE” VM-execution control, an EPTP-switching VMFUNC loads the value in ECX[15:0] into to EPTP-index field in current VMCS. Subsequent EPT-violation virtualization exceptions will save this value into the virtualization-exception information area (see Section 25.5.7.2); 1. Such a VM exit ensures the proper recording of trace data that might otherwise be lost during the change of EPT paging-structure hierarchy. Software handling the VM exit can change emulate the VM function and then resume the guest. 2. If the “enable VPID” VM-execution control is 0, the current VPID is 0000H; if CR4.PCIDE = 0, the current PCID is 000H. 3. Earlier versions of this manual used the term “IA-32e paging” to identify 4-level paging. Vol. 3C 25-17 VMX NON-ROOT OPERATION 25.5.7 Virtualization Exceptions A virtualization exception is a new processor exception. It uses vector 20 and is abbreviated #VE. A virtualization exception can occur only in VMX non-root operation. Virtualization exceptions occur only with certain settings of certain VM-execution controls. Generally, these settings imply that certain conditions that would normally cause VM exits instead cause virtualization exceptions In particular, the 1-setting of the “EPT-violation #VE” VM-execution control causes some EPT violations to generate virtualization exceptions instead of VM exits. Section 25.5.7.1 provides the details of how the processor determines whether an EPT violation causes a virtualization exception or a VM exit. When the processor encounters a virtualization exception, it saves information about the exception to the virtualization-exception information area; see Section 25.5.7.2. After saving virtualization-exception information, the processor delivers a virtualization exception as it would any other exception; see Section 25.5.7.3 for details. 25.5.7.1 Convertible EPT Violations If the “EPT-violation #VE” VM-execution control is 0 (e.g., on processors that do not support this feature), EPT violations always cause VM exits. If instead the control is 1, certain EPT violations may be converted to cause virtualization exceptions instead; such EPT violations are convertible. The values of certain EPT paging-structure entries determine which EPT violations are convertible. Specifically, bit 63 of certain EPT paging-structure entries may be defined to mean suppress #VE: • If bits 2:0 of an EPT paging-structure entry are all 0, the entry is not present.1 If the processor encounters such an entry while translating a guest-physical address, it causes an EPT violation. The EPT violation is convertible if and only if bit 63 of the entry is 0. • If an EPT paging-structure entry is present, the following cases apply: — If the value of the EPT paging-structure entry is not supported, the entry is misconfigured. If the processor encounters such an entry while translating a guest-physical address, it causes an EPT misconfiguration (not an EPT violation). EPT misconfigurations always cause VM exits. — If the value of the EPT paging-structure entry is supported, the following cases apply: • If bit 7 of the entry is 1, or if the entry is an EPT PTE, the entry maps a page. If the processor uses such an entry to translate a guest-physical address, and if an access to that address causes an EPT violation, the EPT violation is convertible if and only if bit 63 of the entry is 0. • If bit 7 of the entry is 0 and the entry is not an EPT PTE, the entry references another EPT paging structure. The processor does not use the value of bit 63 of the entry to determine whether any subsequent EPT violation is convertible. If an access to a guest-physical address causes an EPT violation, bit 63 of exactly one of the EPT paging-structure entries used to translate that address is used to determine whether the EPT violation is convertible: either a entry that is not present (if the guest-physical address does not translate to a physical address) or an entry that maps a page (if it does). A convertible EPT violation instead causes a virtualization exception if the following all hold: • • • • CR0.PE = 1; the logical processor is not in the process of delivering an event through the IDT; the EPT violation results from the output process of Intel Processor Trace (Section 25.5.4); and the 32 bits at offset 4 in the virtualization-exception information area are all 0. Delivery of virtualization exceptions writes the value FFFFFFFFH to offset 4 in the virtualization-exception information area (see Section 25.5.7.2). Thus, once a virtualization exception occurs, another can occur only if software clears this field. 1. If the “mode-based execute control for EPT” VM-execution control is 1, an EPT paging-structure entry is present if any of bits 2:0 or bit 10 is 1. 25-18 Vol. 3C VMX NON-ROOT OPERATION 25.5.7.2 Virtualization-Exception Information Virtualization exceptions save data into the virtualization-exception information area (see Section 24.6.19). Table 25-1 enumerates the data saved and the format of the area. Table 25-1. Format of the Virtualization-Exception Information Area Byte Offset Contents 0 The 32-bit value that would have been saved into the VMCS as an exit reason had a VM exit occurred instead of the virtualization exception. For EPT violations, this value is 48 (00000030H) 4 FFFFFFFFH 8 The 64-bit value that would have been saved into the VMCS as an exit qualification had a VM exit occurred instead of the virtualization exception 16 The 64-bit value that would have been saved into the VMCS as a guest-linear address had a VM exit occurred instead of the virtualization exception 24 The 64-bit value that would have been saved into the VMCS as a guest-physical address had a VM exit occurred instead of the virtualization exception 32 The current 16-bit value of the EPTP index VM-execution control (see Section 24.6.19 and Section 25.5.6.3) 25.5.7.3 Delivery of Virtualization Exceptions After saving virtualization-exception information, the processor treats a virtualization exception as it does other exceptions: • If bit 20 (#VE) is 1 in the exception bitmap in the VMCS, a virtualization exception causes a VM exit (see below). If the bit is 0, the virtualization exception is delivered using gate descriptor 20 in the IDT. • Virtualization exceptions produce no error code. Delivery of a virtualization exception pushes no error code on the stack. • With respect to double faults, virtualization exceptions have the same severity as page faults. If delivery of a virtualization exception encounters a nested fault that is either contributory or a page fault, a double fault (#DF) is generated. See Chapter 6, “Interrupt 8—Double Fault Exception (#DF)” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. It is not possible for a virtualization exception to be encountered while delivering another exception (see Section 25.5.7.1). If a virtualization exception causes a VM exit directly (because bit 20 is 1 in the exception bitmap), information about the exception is saved normally in the VM-exit interruption information field in the VMCS (see Section 27.2.2). Specifically, the event is reported as a hardware exception with vector 20 and no error code. Bit 12 of the field (NMI unblocking due to IRET) is set normally. If a virtualization exception causes a VM exit indirectly (because bit 20 is 0 in the exception bitmap and delivery of the exception generates an event that causes a VM exit), information about the exception is saved normally in the IDT-vectoring information field in the VMCS (see Section 27.2.4). Specifically, the event is reported as a hardware exception with vector 20 and no error code. 25.6 UNRESTRICTED GUESTS The first processors to support VMX operation require CR0.PE and CR0.PG to be 1 in VMX operation (see Section 23.8). This restriction implies that guest software cannot be run in unpaged protected mode or in real-address mode. Later processors support a VM-execution control called “unrestricted guest”.1 If this control is 1, CR0.PE and CR0.PG may be 0 in VMX non-root operation. Such processors allow guest software to run in unpaged protected mode or in real-address mode. The following items describe the behavior of such software: Vol. 3C 25-19 VMX NON-ROOT OPERATION • The MOV CR0 instructions does not cause a general-protection exception simply because it would set either CR0.PE and CR0.PG to 0. See Section 25.3 for details. • A logical processor treats the values of CR0.PE and CR0.PG in VMX non-root operation just as it does outside VMX operation. Thus, if CR0.PE = 0, the processor operates as it does normally in real-address mode (for example, it uses the 16-bit interrupt table to deliver interrupts and exceptions). If CR0.PG = 0, the processor operates as it does normally when paging is disabled. • Processor operation is modified by the fact that the processor is in VMX non-root operation and by the settings of the VM-execution controls just as it is in protected mode or when paging is enabled. Instructions, interrupts, and exceptions that cause VM exits in protected mode or when paging is enabled also do so in real-address mode or when paging is disabled. The following examples should be noted: — If CR0.PG = 0, page faults do not occur and thus cannot cause VM exits. — If CR0.PE = 0, invalid-TSS exceptions do not occur and thus cannot cause VM exits. — If CR0.PE = 0, the following instructions cause invalid-opcode exceptions and do not cause VM exits: INVEPT, INVVPID, LLDT, LTR, SLDT, STR, VMCLEAR, VMLAUNCH, VMPTRLD, VMPTRST, VMREAD, VMRESUME, VMWRITE, VMXOFF, and VMXON. • If CR0.PG = 0, each linear address is passed directly to the EPT mechanism for translation to a physical address.1 The guest memory type passed on to the EPT mechanism is WB (writeback). 1. “Unrestricted guest” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the “unrestricted guest” VM-execution control were 0. See Section 24.6.2. 1. As noted in Section 26.2.1.1, the “enable EPT” VM-execution control must be 1 if the “unrestricted guest” VM-execution control is 1. 25-20 Vol. 3C 18.Updates to Chapter 26, Volume 3C Change bars and green text show changes to Chapter 26 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 29 CHAPTER 26 VM ENTRIES Software can enter VMX non-root operation using either of the VM-entry instructions VMLAUNCH and VMRESUME. VMLAUNCH can be used only with a VMCS whose launch state is clear and VMRESUME can be used only with a VMCS whose the launch state is launched. VMLAUNCH should be used for the first VM entry after VMCLEAR; VMRESUME should be used for subsequent VM entries with the same VMCS. Each VM entry performs the following steps in the order indicated: 1. Basic checks are performed to ensure that VM entry can commence (Section 26.1). 2. The control and host-state areas of the VMCS are checked to ensure that they are proper for supporting VMX non-root operation and that the VMCS is correctly configured to support the next VM exit (Section 26.2). 3. The following may be performed in parallel or in any order (Section 26.3): • The guest-state area of the VMCS is checked to ensure that, after the VM entry completes, the state of the logical processor is consistent with IA-32 and Intel 64 architectures. • • Processor state is loaded from the guest-state area and based on controls in the VMCS. Address-range monitoring is cleared. 4. MSRs are loaded from the VM-entry MSR-load area (Section 26.4). 5. If VMLAUNCH is being executed, the launch state of the VMCS is set to “launched.” 6. If the “Intel PT uses guest physical addresses” VM-execution control is 1, trace-address pre-translation (TAPT) may occur (see Section 25.5.4 and Section 26.5). 7. An event may be injected in the guest context (Section 26.6). Steps 1–4 above perform checks that may cause VM entry to fail. Such failures occur in one of the following three ways: • Some of the checks in Section 26.1 may generate ordinary faults (for example, an invalid-opcode exception). Such faults are delivered normally. • Some of the checks in Section 26.1 and all the checks in Section 26.2 cause control to pass to the instruction following the VM-entry instruction. The failure is indicated by setting RFLAGS.ZF1 (if there is a current VMCS) or RFLAGS.CF (if there is no current VMCS). If there is a current VMCS, an error number indicating the cause of the failure is stored in the VM-instruction error field. See Chapter 30 for the error numbers. • The checks in Section 26.3 and Section 26.4 cause processor state to be loaded from the host-state area of the VMCS (as would be done on a VM exit). Information about the failure is stored in the VM-exit information fields. See Section 26.8 for details. EFLAGS.TF = 1 causes a VM-entry instruction to generate a single-step debug exception only if failure of one of the checks in Section 26.1 and Section 26.2 causes control to pass to the following instruction. A VM-entry does not generate a single-step debug exception in any of the following cases: (1) the instruction generates a fault; (2) failure of one of the checks in Section 26.3 or in loading MSRs causes processor state to be loaded from the hoststate area of the VMCS; or (3) the instruction passes all checks in Section 26.1, Section 26.2, and Section 26.3 and there is no failure in loading MSRs. Section 34.15 describes the dual-monitor treatment of system-management interrupts (SMIs) and systemmanagement mode (SMM). Under this treatment, code running in SMM returns using VM entries instead of the RSM instruction. A VM entry returns from SMM if it is executed in SMM and the “entry to SMM” VM-entry control is 0. VM entries that return from SMM differ from ordinary VM entries in ways that are detailed in Section 34.15.4. 1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For IA-32 processors, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to lower 32 bits of the indicated register. Vol. 3C 26-1 VM ENTRIES 26.1 BASIC VM-ENTRY CHECKS Before a VM entry commences, the current state of the logical processor is checked in the following order: 1. If the logical processor is in virtual-8086 mode or compatibility mode, an invalid-opcode exception is generated. 2. If the current privilege level (CPL) is not zero, a general-protection exception is generated. 3. If there is no current VMCS, RFLAGS.CF is set to 1 and control passes to the next instruction. 4. If there is a current VMCS but the current VMCS is a shadow VMCS (see Section 24.10), RFLAGS.CF is set to 1 and control passes to the next instruction. 5. If there is a current VMCS that is not a shadow VMCS, the following conditions are evaluated in order; any of these cause VM entry to fail: a. if there is MOV-SS blocking (see Table 24-3) b. if the VM entry is invoked by VMLAUNCH and the VMCS launch state is not clear c. if the VM entry is invoked by VMRESUME and the VMCS launch state is not launched If any of these checks fail, RFLAGS.ZF is set to 1 and control passes to the next instruction. An error number indicating the cause of the failure is stored in the VM-instruction error field. See Chapter 30 for the error numbers. 26.2 CHECKS ON VMX CONTROLS AND HOST-STATE AREA If the checks in Section 26.1 do not cause VM entry to fail, the control and host-state areas of the VMCS are checked to ensure that they are proper for supporting VMX non-root operation, that the VMCS is correctly configured to support the next VM exit, and that, after the next VM exit, the processor’s state is consistent with the Intel 64 and IA-32 architectures. VM entry fails if any of these checks fail. When such failures occur, control is passed to the next instruction, RFLAGS.ZF is set to 1 to indicate the failure, and the VM-instruction error field is loaded with an error number that indicates whether the failure was due to the controls or the host-state area (see Chapter 30). These checks may be performed in any order. Thus, an indication by error number of one cause (for example, host state) does not imply that there are not also other errors. Different processors may thus give different error numbers for the same VMCS. Some checks prevent establishment of settings (or combinations of settings) that are currently reserved. Future processors may allow such settings (or combinations) and may not perform the corresponding checks. The correctness of software should not rely on VM-entry failures resulting from the checks documented in this section. The checks on the controls and the host-state area are presented in Section 26.2.1 through Section 26.2.4. These sections reference VMCS fields that correspond to processor state. Unless otherwise stated, these references are to fields in the host-state area. 26.2.1 Checks on VMX Controls This section identifies VM-entry checks on the VMX control fields. 26.2.1.1 VM-Execution Control Fields VM entries perform the following checks on the VM-execution control fields:1 • Reserved bits in the pin-based VM-execution controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.3.1). 1. If the “activate secondary controls” primary processor-based VM-execution control is 0, VM entry operates as if each secondary processor-based VM-execution control were 0. 26-2 Vol. 3C VM ENTRIES • Reserved bits in the primary processor-based VM-execution controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.3.2). • If the “activate secondary controls” primary processor-based VM-execution control is 1, reserved bits in the secondary processor-based VM-execution controls must be cleared. Software may consult the VMX capability MSRs to determine which bits are reserved (see Appendix A.3.3). If the “activate secondary controls” primary processor-based VM-execution control is 0 (or if the processor does not support the 1-setting of that control), no checks are performed on the secondary processor-based VM-execution controls. The logical processor operates as if all the secondary processor-based VM-execution controls were 0. • The CR3-target count must not be greater than 4. Future processors may support a different number of CR3target values. Software should read the VMX capability MSR IA32_VMX_MISC to determine the number of values supported (see Appendix A.6). • If the “use I/O bitmaps” VM-execution control is 1, bits 11:0 of each I/O-bitmap address must be 0. Neither address should set any bits beyond the processor’s physical-address width.1,2 • If the “use MSR bitmaps” VM-execution control is 1, bits 11:0 of the MSR-bitmap address must be 0. The address should not set any bits beyond the processor’s physical-address width.3 • If the “use TPR shadow” VM-execution control is 1, the virtual-APIC address must satisfy the following checks: — Bits 11:0 of the address must be 0. — The address should not set any bits beyond the processor’s physical-address width.4 If all of the above checks are satisfied and the “use TPR shadow” VM-execution control is 1, bytes 3:1 of VTPR (see Section 29.1.1) may be cleared (behavior may be implementation-specific). The clearing of these bytes may occur even if the VM entry fails. This is true either if the failure causes control to pass to the instruction following the VM-entry instruction or if it causes processor state to be loaded from the host-state area of the VMCS. • If the “use TPR shadow” VM-execution control is 1 and the “virtual-interrupt delivery” VM-execution control is 0, bits 31:4 of the TPR threshold VM-execution control field must be 0.5 • The following check is performed if the “use TPR shadow” VM-execution control is 1 and the “virtualize APIC accesses” and “virtual-interrupt delivery” VM-execution controls are both 0: the value of bits 3:0 of the TPR threshold VM-execution control field should not be greater than the value of bits 7:4 of VTPR (see Section 29.1.1). • • • If the “NMI exiting” VM-execution control is 0, the “virtual NMIs” VM-execution control must be 0. If the “virtual NMIs” VM-execution control is 0, the “NMI-window exiting” VM-execution control must be 0. If the “virtualize APIC-accesses” VM-execution control is 1, the APIC-access address must satisfy the following checks: — Bits 11:0 of the address must be 0. — The address should not set any bits beyond the processor’s physical-address width.6 • If the “use TPR shadow” VM-execution control is 0, the following VM-execution controls must also be 0: “virtualize x2APIC mode”, “APIC-register virtualization”, and “virtual-interrupt delivery”.7 1. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 2. If IA32_VMX_BASIC[48] is read as 1, these addresses must not set any bits in the range 63:32; see Appendix A.1. 3. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1. 4. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1. 5. “Virtual-interrupt delivery” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “virtual-interrupt delivery” VM-execution control were 0. See Section 24.6.2. 6. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1. 7. “Virtualize x2APIC mode” and “APIC-register virtualization” are secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if these controls were 0. See Section 24.6.2. Vol. 3C 26-3 VM ENTRIES • If the “virtualize x2APIC mode” VM-execution control is 1, the “virtualize APIC accesses” VM-execution control must be 0. • If the “virtual-interrupt delivery” VM-execution control is 1, the “external-interrupt exiting” VM-execution control must be 1. • If the “process posted interrupts” VM-execution control is 1, the following must be true:1 — The “virtual-interrupt delivery” VM-execution control is 1. — The “acknowledge interrupt on exit” VM-exit control is 1. — The posted-interrupt notification vector has a value in the range 0–255 (bits 15:8 are all 0). — Bits 5:0 of the posted-interrupt descriptor address are all 0. — The posted-interrupt descriptor address does not set any bits beyond the processor's physical-address width.2 • If the “enable VPID” VM-execution control is 1, the value of the VPID VM-execution control field must not be 0000H.3 • If the “enable EPT” VM-execution control is 1, the EPTP VM-execution control field (see Table 24-8 in Section 24.6.11) must satisfy the following checks:4 — The EPT memory type (bits 2:0) must be a value supported by the processor as indicated in the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A.10). — Bits 5:3 (1 less than the EPT page-walk length) must be 3, indicating an EPT page-walk length of 4; see Section 28.2.2. — Bit 6 (enable bit for accessed and dirty flags for EPT) must be 0 if bit 21 of the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A.10) is read as 0, indicating that the processor does not support accessed and dirty flags for EPT. — Reserved bits 11:7 and 63:N (where N is the processor’s physical-address width) must all be 0. • If the “enable PML” VM-execution control is 1, the “enable EPT” VM-execution control must also be 1.5 In addition, the PML address must satisfy the following checks: — Bits 11:0 of the address must be 0. — The address should not set any bits beyond the processor’s physical-address width. • If either the “unrestricted guest” VM-execution control or the “mode-based execute control for EPT” VMexecution control is 1, the “enable EPT” VM-execution control must also be 1.6 • If the “sub-page write permissions for EPT” VM-execution control is 1, the “enable EPT” VM-execution control must also be 1.7 In addition, the SPPTP VM-execution control field (see Table 24-10 in Section 24.6.21) must satisfy the following checks: — Bits 11:0 of the address must be 0. 1. “Process posted interrupts” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “process posted interrupts” VM-execution control were 0. See Section 24.6.2. 2. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1. 3. “Enable VPID” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “enable VPID” VM-execution control were 0. See Section 24.6.2. 4. “Enable EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “enable EPT” VM-execution control were 0. See Section 24.6.2. 5. “Enable PML” and “enable EPT” are both secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if both these controls were 0. See Section 24.6.2. 6. All these controls are secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if all these controls were 0. See Section 24.6.2. 7. “Sub-page write permissions for EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “sub-page write permissions for EPT” VM-execution control were 0. See Section 24.6.2. 26-4 Vol. 3C VM ENTRIES — The address should not set any bits beyond the processor’s physical-address width. • If the “enable VM functions” processor-based VM-execution control is 1, reserved bits in the VM-function controls must be clear.1 Software may consult the VMX capability MSRs to determine which bits are reserved (see Appendix A.11). In addition, the following check is performed based on the setting of bits in the VMfunction controls (see Section 24.6.14): — If “EPTP switching” VM-function control is 1, the “enable EPT” VM-execution control must also be 1. In addition, the EPTP-list address must satisfy the following checks: • • Bits 11:0 of the address must be 0. The address must not set any bits beyond the processor’s physical-address width. If the “enable VM functions” processor-based VM-execution control is 0, no checks are performed on the VMfunction controls. • If the “VMCS shadowing” VM-execution control is 1, the VMREAD-bitmap and VMWRITE-bitmap addresses must each satisfy the following checks:2 — Bits 11:0 of the address must be 0. — The address must not set any bits beyond the processor’s physical-address width. • If the “EPT-violation #VE” VM-execution control is 1, the virtualization-exception information address must satisfy the following checks:3 — Bits 11:0 of the address must be 0. — The address must not set any bits beyond the processor’s physical-address width. • If the logical processor is operating with Intel PT enabled (if IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the “load IA32_RTIT_CTL” VM-entry control must be 0. • If the “Intel PT uses guest physical addresses” VM-execution control is 1, the following controls must also be 1: the “enable EPT” VM-execution control; the “load IA32_RTIT_CTL” VM-entry control; and the “clear IA32_RTIT_CTL” VM-exit control.4 26.2.1.2 VM-Exit Control Fields VM entries perform the following checks on the VM-exit control fields. • Reserved bits in the VM-exit controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.4). • If the “activate VMX-preemption timer” VM-execution control is 0, the “save VMX-preemption timer value” VMexit control must also be 0. • The following checks are performed for the VM-exit MSR-store address if the VM-exit MSR-store count field is non-zero: — The lower 4 bits of the VM-exit MSR-store address must be 0. The address should not set any bits beyond the processor’s physical-address width.5 1. “Enable VM functions” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “enable VM functions” VM-execution control were 0. See Section 24.6.2. 2. “VMCS shadowing” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “VMCS shadowing” VM-execution control were 0. See Section 24.6.2. 3. “EPT-violation #VE” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “EPT-violation #VE” VM-execution control were 0. See Section 24.6.2. 4. “Intel PT uses guest physical addresses” is a secondary processor-based VM-execution control. If bit 31 of the primary processorbased VM-execution controls is 0, VM entry functions as if the “Intel PT uses guest physical addresses” VM-execution control were 0. See Section 24.6.2. 5. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. Vol. 3C 26-5 VM ENTRIES — The address of the last byte in the VM-exit MSR-store area should not set any bits beyond the processor’s physical-address width. The address of this last byte is VM-exit MSR-store address + (MSR count * 16) – 1. (The arithmetic used for the computation uses more bits than the processor’s physical-address width.) If IA32_VMX_BASIC[48] is read as 1, neither address should set any bits in the range 63:32; see Appendix A.1. • The following checks are performed for the VM-exit MSR-load address if the VM-exit MSR-load count field is non-zero: — The lower 4 bits of the VM-exit MSR-load address must be 0. The address should not set any bits beyond the processor’s physical-address width. — The address of the last byte in the VM-exit MSR-load area should not set any bits beyond the processor’s physical-address width. The address of this last byte is VM-exit MSR-load address + (MSR count * 16) – 1. (The arithmetic used for the computation uses more bits than the processor’s physical-address width.) If IA32_VMX_BASIC[48] is read as 1, neither address should set any bits in the range 63:32; see Appendix A.1. 26.2.1.3 VM-Entry Control Fields VM entries perform the following checks on the VM-entry control fields. • Reserved bits in the VM-entry controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.5). • Fields relevant to VM-entry event injection must be set properly. These fields are the VM-entry interruptioninformation field (see Table 24-14 in Section 24.8.3), the VM-entry exception error code, and the VM-entry instruction length. If the valid bit (bit 31) in the VM-entry interruption-information field is 1, the following must hold: — The field’s interruption type (bits 10:8) is not set to a reserved value. Value 1 is reserved on all logical processors; value 7 (other event) is reserved on logical processors that do not support the 1-setting of the “monitor trap flag” VM-execution control. — The field’s vector (bits 7:0) is consistent with the interruption type: • • • If the interruption type is non-maskable interrupt (NMI), the vector is 2. If the interruption type is hardware exception, the vector is at most 31. If the interruption type is other event, the vector is 0 (pending MTF VM exit). — The field's deliver-error-code bit (bit 11) is 1 if and only if (1) either (a) the "unrestricted guest" VMexecution control is 0; or (b) bit 0 (corresponding to CR0.PE) is set in the CR0 field in the guest-state area; (2) the interruption type is hardware exception; and (3) the vector indicates an exception that would normally deliver an error code (8 = #DF; 10 = TS; 11 = #NP; 12 = #SS; 13 = #GP; 14 = #PF; or 17 = #AC). — Reserved bits in the field (30:12) are 0. — If the deliver-error-code bit (bit 11) is 1, bits 31:15 of the VM-entry exception error-code field are 0. — If the interruption type is software interrupt, software exception, or privileged software exception, the VM-entry instruction-length field is in the range 0–15. A VM-entry instruction length of 0 is allowed only if IA32_VMX_MISC[30] is read as 1; see Appendix A.6. • The following checks are performed for the VM-entry MSR-load address if the VM-entry MSR-load count field is non-zero: — The lower 4 bits of the VM-entry MSR-load address must be 0. The address should not set any bits beyond the processor’s physical-address width.1 1. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 26-6 Vol. 3C VM ENTRIES — The address of the last byte in the VM-entry MSR-load area should not set any bits beyond the processor’s physical-address width. The address of this last byte is VM-entry MSR-load address + (MSR count * 16) – 1. (The arithmetic used for the computation uses more bits than the processor’s physical-address width.) If IA32_VMX_BASIC[48] is read as 1, neither address should set any bits in the range 63:32; see Appendix A.1. • If the processor is not in SMM, the “entry to SMM” and “deactivate dual-monitor treatment” VM-entry controls must be 0. • The “entry to SMM” and “deactivate dual-monitor treatment” VM-entry controls cannot both be 1. 26.2.2 Checks on Host Control Registers and MSRs The following checks are performed on fields in the host-state area that correspond to control registers and MSRs: • • • The CR0 field must not set any bit to a value not supported in VMX operation (see Section 23.8).1 • On processors that support Intel 64 architecture, the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field must each contain a canonical address. • If the “load IA32_PERF_GLOBAL_CTRL” VM-exit control is 1, bits reserved in the IA32_PERF_GLOBAL_CTRL MSR must be 0 in the field for that register (see Figure 18-3). • If the “load IA32_PAT” VM-exit control is 1, the value of the field for the IA32_PAT MSR must be one that could be written by WRMSR without fault at CPL 0. Specifically, each of the 8 bytes in the field must have one of the values 0 (UC), 1 (WC), 4 (WT), 5 (WP), 6 (WB), or 7 (UC-). • If the “load IA32_EFER” VM-exit control is 1, bits reserved in the IA32_EFER MSR must be 0 in the field for that register. In addition, the values of the LMA and LME bits in the field must each be that of the “host addressspace size” VM-exit control. The CR4 field must not set any bit to a value not supported in VMX operation (see Section 23.8). On processors that support Intel 64 architecture, the CR3 field must be such that bits 63:52 and bits in the range 51:32 beyond the processor’s physical-address width must be 0.2,3 26.2.3 Checks on Host Segment and Descriptor-Table Registers The following checks are performed on fields in the host-state area that correspond to segment and descriptortable registers: • In the selector field for each of CS, SS, DS, ES, FS, GS and TR, the RPL (bits 1:0) and the TI flag (bit 2) must be 0. • • • The selector fields for CS and TR cannot be 0000H. The selector field for SS cannot be 0000H if the “host address-space size” VM-exit control is 0. On processors that support Intel 64 architecture, the base-address fields for FS, GS, GDTR, IDTR, and TR must contain canonical addresses. 26.2.4 Checks Related to Address-Space Size On processors that support Intel 64 architecture, the following checks related to address-space size are performed on VMX controls and fields in the host-state area: 1. The bits corresponding to CR0.NW (bit 29) and CR0.CD (bit 30) are never checked because the values of these bits are not changed by VM exit; see Section 27.5.1. 2. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 3. Bit 63 of the CR3 field in the host-state area must be 0. This is true even though, If CR4.PCIDE = 1, bit 63 of the source operand to MOV to CR3 is used to determine whether cached translation information is invalidated. Vol. 3C 26-7 VM ENTRIES • If the logical processor is outside IA-32e mode (if IA32_EFER.LMA = 0) at the time of VM entry, the following must hold: — The “IA-32e mode guest” VM-entry control is 0. — The “host address-space size” VM-exit control is 0. • If the logical processor is in IA-32e mode (if IA32_EFER.LMA = 1) at the time of VM entry, the “host addressspace size” VM-exit control must be 1. • If the “host address-space size” VM-exit control is 0, the following must hold: — The “IA-32e mode guest” VM-entry control is 0. — Bit 17 of the CR4 field (corresponding to CR4.PCIDE) is 0. — Bits 63:32 in the RIP field is 0. • If the “host address-space size” VM-exit control is 1, the following must hold: — Bit 5 of the CR4 field (corresponding to CR4.PAE) is 1. — The RIP field contains a canonical address. On processors that do not support Intel 64 architecture, checks are performed to ensure that the “IA-32e mode guest” VM-entry control and the “host address-space size” VM-exit control are both 0. 26.3 CHECKING AND LOADING GUEST STATE If all checks on the VMX controls and the host-state area pass (see Section 26.2), the following operations take place concurrently: (1) the guest-state area of the VMCS is checked to ensure that, after the VM entry completes, the state of the logical processor is consistent with IA-32 and Intel 64 architectures; (2) processor state is loaded from the guest-state area or as specified by the VM-entry control fields; and (3) address-range monitoring is cleared. Because the checking and the loading occur concurrently, a failure may be discovered only after some state has been loaded. For this reason, the logical processor responds to such failures by loading state from the host-state area, as it would for a VM exit. See Section 26.8. 26.3.1 Checks on the Guest State Area This section describes checks performed on fields in the guest-state area. These checks may be performed in any order. Some checks prevent establishment of settings (or combinations of settings) that are currently reserved. Future processors may allow such settings (or combinations) and may not perform the corresponding checks. The correctness of software should not rely on VM-entry failures resulting from the checks documented in this section. The following subsections reference fields that correspond to processor state. Unless otherwise stated, these references are to fields in the guest-state area. 26.3.1.1 Checks on Guest Control Registers, Debug Registers, and MSRs The following checks are performed on fields in the guest-state area corresponding to control registers, debug registers, and MSRs: • The CR0 field must not set any bit to a value not supported in VMX operation (see Section 23.8). The following are exceptions: — Bit 0 (corresponding to CR0.PE) and bit 31 (PG) are not checked if the “unrestricted guest” VM-execution control is 1.1 — Bit 29 (corresponding to CR0.NW) and bit 30 (CD) are never checked because the values of these bits are not changed by VM entry; see Section 26.3.2.1. 1. “Unrestricted guest” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “unrestricted guest” VM-execution control were 0. See Section 24.6.2. 26-8 Vol. 3C VM ENTRIES • • • If bit 31 in the CR0 field (corresponding to PG) is 1, bit 0 in that field (PE) must also be 1.1 • The following checks are performed on processors that support Intel 64 architecture: The CR4 field must not set any bit to a value not supported in VMX operation (see Section 23.8). If the “load debug controls” VM-entry control is 1, bits reserved in the IA32_DEBUGCTL MSR must be 0 in the field for that register. The first processors to support the virtual-machine extensions supported only the 1setting of this control and thus performed this check unconditionally. — If the “IA-32e mode guest” VM-entry control is 1, bit 31 in the CR0 field (corresponding to CR0.PG) and bit 5 in the CR4 field (corresponding to CR4.PAE) must each be 1.2 — If the “IA-32e mode guest” VM-entry control is 0, bit 17 in the CR4 field (corresponding to CR4.PCIDE) must be 0. — The CR3 field must be such that bits 63:52 and bits in the range 51:32 beyond the processor’s physicaladdress width are 0.3,4 — If the “load debug controls” VM-entry control is 1, bits 63:32 in the DR7 field must be 0. The first processors to support the virtual-machine extensions supported only the 1-setting of this control and thus performed this check unconditionally (if they supported Intel 64 architecture). — The IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field must each contain a canonical address. • If the “load IA32_PERF_GLOBAL_CTRL” VM-entry control is 1, bits reserved in the IA32_PERF_GLOBAL_CTRL MSR must be 0 in the field for that register (see Figure 18-3). • If the “load IA32_PAT” VM-entry control is 1, the value of the field for the IA32_PAT MSR must be one that could be written by WRMSR without fault at CPL 0. Specifically, each of the 8 bytes in the field must have one of the values 0 (UC), 1 (WC), 4 (WT), 5 (WP), 6 (WB), or 7 (UC-). • If the “load IA32_EFER” VM-entry control is 1, the following checks are performed on the field for the IA32_EFER MSR: — Bits reserved in the IA32_EFER MSR must be 0. — Bit 10 (corresponding to IA32_EFER.LMA) must equal the value of the “IA-32e mode guest” VM-entry control. It must also be identical to bit 8 (LME) if bit 31 in the CR0 field (corresponding to CR0.PG) is 1.5 • If the “load IA32_BNDCFGS” VM-entry control is 1, the following checks are performed on the field for the IA32_BNDCFGS MSR: — Bits reserved in the IA32_BNDCFGS MSR must be 0. — The linear address in bits 63:12 must be canonical. • If the “load IA32_RTIT_CTL” VM-entry control is 1, bits reserved in the IA32_RTIT_CTL MSR must be 0 in the field for that register (see Table 35-6). 26.3.1.2 Checks on Guest Segment Registers This section specifies the checks on the fields for CS, SS, DS, ES, FS, GS, TR, and LDTR. The following terms are used in defining these checks: • The guest will be virtual-8086 if the VM flag (bit 17) is 1 in the RFLAGS field in the guest-state area. 1. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation, bit 0 in the CR0 field must be 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 2. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PG must be 1 in VMX operation, bit 31 in the CR0 field must be 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 3. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 4. Bit 63 of the CR3 field in the guest-state area must be 0. This is true even though, If CR4.PCIDE = 1, bit 63 of the source operand to MOV to CR3 is used to determine whether cached translation information is invalidated. 5. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PG must be 1 in VMX operation, bit 31 in the CR0 field must be 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. Vol. 3C 26-9 VM ENTRIES • The guest will be IA-32e mode if the “IA-32e mode guest” VM-entry control is 1. (This is possible only on processors that support Intel 64 architecture.) • Any one of these registers is said to be usable if the unusable bit (bit 16) is 0 in the access-rights field for that register. The following are the checks on these fields: • Selector fields. — TR. The TI flag (bit 2) must be 0. — LDTR. If LDTR is usable, the TI flag (bit 2) must be 0. — SS. If the guest will not be virtual-8086 and the “unrestricted guest” VM-execution control is 0, the RPL (bits 1:0) must equal the RPL of the selector field for CS.1 • Base-address fields. — CS, SS, DS, ES, FS, GS. If the guest will be virtual-8086, the address must be the selector field shifted left 4 bits (multiplied by 16). — The following checks are performed on processors that support Intel 64 architecture: • • • • • • TR, FS, GS. The address must be canonical. LDTR. If LDTR is usable, the address must be canonical. CS. Bits 63:32 of the address must be zero. SS, DS, ES. If the register is usable, bits 63:32 of the address must be zero. Limit fields for CS, SS, DS, ES, FS, GS. If the guest will be virtual-8086, the field must be 0000FFFFH. Access-rights fields. — CS, SS, DS, ES, FS, GS. • If the guest will be virtual-8086, the field must be 000000F3H. This implies the following: — Bits 3:0 (Type) must be 3, indicating an expand-up read/write accessed data segment. — Bit 4 (S) must be 1. — Bits 6:5 (DPL) must be 3. — Bit 7 (P) must be 1. — Bits 11:8 (reserved), bit 12 (software available), bit 13 (reserved/L), bit 14 (D/B), bit 15 (G), bit 16 (unusable), and bits 31:17 (reserved) must all be 0. • If the guest will not be virtual-8086, the different sub-fields are considered separately: — Bits 3:0 (Type). • CS. The values allowed depend on the setting of the “unrestricted guest” VM-execution control: — If the control is 0, the Type must be 9, 11, 13, or 15 (accessed code segment). — If the control is 1, the Type must be either 3 (read/write accessed expand-up data segment) or one of 9, 11, 13, and 15 (accessed code segment). • • SS. If SS is usable, the Type must be 3 or 7 (read/write, accessed data segment). DS, ES, FS, GS. The following checks apply if the register is usable: — Bit 0 of the Type must be 1 (accessed). — If bit 3 of the Type is 1 (code segment), then bit 1 of the Type must be 1 (readable). — Bit 4 (S). If the register is CS or if the register is usable, S must be 1. — Bits 6:5 (DPL). 1. “Unrestricted guest” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “unrestricted guest” VM-execution control were 0. See Section 24.6.2. 26-10 Vol. 3C VM ENTRIES • CS. — If the Type is 3 (read/write accessed expand-up data segment), the DPL must be 0. The Type can be 3 only if the “unrestricted guest” VM-execution control is 1. — If the Type is 9 or 11 (non-conforming code segment), the DPL must equal the DPL in the access-rights field for SS. — If the Type is 13 or 15 (conforming code segment), the DPL cannot be greater than the DPL in the access-rights field for SS. • SS. — If the “unrestricted guest” VM-execution control is 0, the DPL must equal the RPL from the selector field. — The DPL must be 0 either if the Type in the access-rights field for CS is 3 (read/write accessed expand-up data segment) or if bit 0 in the CR0 field (corresponding to CR0.PE) is 0.1 • DS, ES, FS, GS. The DPL cannot be less than the RPL in the selector field if (1) the “unrestricted guest” VM-execution control is 0; (2) the register is usable; and (3) the Type in the access-rights field is in the range 0 – 11 (data segment or non-conforming code segment). — Bit 7 (P). If the register is CS or if the register is usable, P must be 1. — Bits 11:8 (reserved). If the register is CS or if the register is usable, these bits must all be 0. — Bit 14 (D/B). For CS, D/B must be 0 if the guest will be IA-32e mode and the L bit (bit 13) in the access-rights field is 1. — Bit 15 (G). The following checks apply if the register is CS or if the register is usable: • • If any bit in the limit field in the range 11:0 is 0, G must be 0. If any bit in the limit field in the range 31:20 is 1, G must be 1. — Bits 31:17 (reserved). If the register is CS or if the register is usable, these bits must all be 0. — TR. The different sub-fields are considered separately: • Bits 3:0 (Type). — If the guest will not be IA-32e mode, the Type must be 3 (16-bit busy TSS) or 11 (32-bit busy TSS). — If the guest will be IA-32e mode, the Type must be 11 (64-bit busy TSS). • • • • Bit 4 (S). S must be 0. Bit 7 (P). P must be 1. Bits 11:8 (reserved). These bits must all be 0. Bit 15 (G). — If any bit in the limit field in the range 11:0 is 0, G must be 0. — If any bit in the limit field in the range 31:20 is 1, G must be 1. • • Bit 16 (Unusable). The unusable bit must be 0. Bits 31:17 (reserved). These bits must all be 0. — LDTR. The following checks on the different sub-fields apply only if LDTR is usable: • • • Bits 3:0 (Type). The Type must be 2 (LDT). Bit 4 (S). S must be 0. Bit 7 (P). P must be 1. 1. The following apply if either the “unrestricted guest” VM-execution control or bit 31 of the primary processor-based VM-execution controls is 0: (1) bit 0 in the CR0 field must be 1 if the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation; and (2) the Type in the access-rights field for CS cannot be 3. Vol. 3C 26-11 VM ENTRIES • • Bits 11:8 (reserved). These bits must all be 0. Bit 15 (G). — If any bit in the limit field in the range 11:0 is 0, G must be 0. — If any bit in the limit field in the range 31:20 is 1, G must be 1. • Bits 31:17 (reserved). These bits must all be 0. 26.3.1.3 Checks on Guest Descriptor-Table Registers The following checks are performed on the fields for GDTR and IDTR: • • On processors that support Intel 64 architecture, the base-address fields must contain canonical addresses. Bits 31:16 of each limit field must be 0. 26.3.1.4 Checks on Guest RIP and RFLAGS The following checks are performed on fields in the guest-state area corresponding to RIP and RFLAGS: • RIP. The following checks are performed on processors that support Intel 64 architecture: — Bits 63:32 must be 0 if the “IA-32e mode guest” VM-entry control is 0 or if the L bit (bit 13) in the accessrights field for CS is 0. — If the processor supports N < 64 linear-address bits, bits 63:N must be identical if the “IA-32e mode guest” VM-entry control is 1 and the L bit in the access-rights field for CS is 1.1 (No check applies if the processor supports 64 linear-address bits.) • RFLAGS. — Reserved bits 63:22 (bits 31:22 on processors that do not support Intel 64 architecture), bit 15, bit 5 and bit 3 must be 0 in the field, and reserved bit 1 must be 1. — The VM flag (bit 17) must be 0 either if the “IA-32e mode guest” VM-entry control is 1 or if bit 0 in the CR0 field (corresponding to CR0.PE) is 0.2 — The IF flag (RFLAGS[bit 9]) must be 1 if the valid bit (bit 31) in the VM-entry interruption-information field is 1 and the interruption type (bits 10:8) is external interrupt. 26.3.1.5 Checks on Guest Non-Register State The following checks are performed on fields in the guest-state area corresponding to non-register state: • Activity state. — The activity-state field must contain a value in the range 0 – 3, indicating an activity state supported by the implementation (see Section 24.4.2). Future processors may include support for other activity states. Software should read the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine what activity states are supported. — The activity-state field must not indicate the HLT state if the DPL (bits 6:5) in the access-rights field for SS is not 0.3 — The activity-state field must indicate the active state if the interruptibility-state field indicates blocking by either MOV-SS or by STI (if either bit 0 or bit 1 in that field is 1). — If the valid bit (bit 31) in the VM-entry interruption-information field is 1, the interruption to be delivered (as defined by interruption type and vector) must not be one that would normally be blocked while a logical processor is in the activity state corresponding to the contents of the activity-state field. The following 1. Software can determine the number N by executing CPUID with 80000008H in EAX. The number of linear-address bits supported is returned in bits 15:8 of EAX. 2. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation, bit 0 in the CR0 field must be 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 3. As noted in Section 24.4.1, SS.DPL corresponds to the logical processor’s current privilege level (CPL). 26-12 Vol. 3C VM ENTRIES items enumerate the interruptions (as specified in the VM-entry interruption-information field) whose injection is allowed for the different activity states: • • Active. Any interruption is allowed. HLT. The only events allowed are the following: — Those with interruption type external interrupt or non-maskable interrupt (NMI). — Those with interruption type hardware exception and vector 1 (debug exception) or vector 18 (machine-check exception). — Those with interruption type other event and vector 0 (pending MTF VM exit). See Table 24-14 in Section 24.8.3 for details regarding the format of the VM-entry interruptioninformation field. • • Shutdown. Only NMIs and machine-check exceptions are allowed. Wait-for-SIPI. No interruptions are allowed. — The activity-state field must not indicate the wait-for-SIPI state if the “entry to SMM” VM-entry control is 1. • Interruptibility state. — The reserved bits (bits 31:5) must be 0. — The field cannot indicate blocking by both STI and MOV SS (bits 0 and 1 cannot both be 1). — Bit 0 (blocking by STI) must be 0 if the IF flag (bit 9) is 0 in the RFLAGS field. — Bit 0 (blocking by STI) and bit 1 (blocking by MOV-SS) must both be 0 if the valid bit (bit 31) in the VM-entry interruption-information field is 1 and the interruption type (bits 10:8) in that field has value 0, indicating external interrupt. — Bit 1 (blocking by MOV-SS) must be 0 if the valid bit (bit 31) in the VM-entry interruption-information field is 1 and the interruption type (bits 10:8) in that field has value 2, indicating non-maskable interrupt (NMI). — Bit 2 (blocking by SMI) must be 0 if the processor is not in SMM. — Bit 2 (blocking by SMI) must be 1 if the “entry to SMM” VM-entry control is 1. — A processor may require bit 0 (blocking by STI) to be 0 if the valid bit (bit 31) in the VM-entry interruptioninformation field is 1 and the interruption type (bits 10:8) in that field has value 2, indicating NMI. Other processors may not make this requirement. — Bit 3 (blocking by NMI) must be 0 if the “virtual NMIs” VM-execution control is 1, the valid bit (bit 31) in the VM-entry interruption-information field is 1, and the interruption type (bits 10:8) in that field has value 2 (indicating NMI). — If bit 4 (enclave interruption) is 1, bit 1 (blocking by MOV-SS) must be 0 and the processor must support for SGX by enumerating CPUID.(EAX=07H,ECX=0):EBX.SGX[bit 2] as 1. NOTE If the “virtual NMIs” VM-execution control is 0, there is no requirement that bit 3 be 0 if the valid bit in the VM-entry interruption-information field is 1 and the interruption type in that field has value 2. • Pending debug exceptions. — Bits 11:4, bit 13, bit 15, and bits 63:17 (bits 31:17 on processors that do not support Intel 64 architecture) must be 0. — The following checks are performed if any of the following holds: (1) the interruptibility-state field indicates blocking by STI (bit 0 in that field is 1); (2) the interruptibility-state field indicates blocking by MOV SS (bit 1 in that field is 1); or (3) the activity-state field indicates HLT: • Bit 14 (BS) must be 1 if the TF flag (bit 8) in the RFLAGS field is 1 and the BTF flag (bit 1) in the IA32_DEBUGCTL field is 0. Vol. 3C 26-13 VM ENTRIES • Bit 14 (BS) must be 0 if the TF flag (bit 8) in the RFLAGS field is 0 or the BTF flag (bit 1) in the IA32_DEBUGCTL field is 1. — The following checks are performed if bit 16 (RTM) is 1: • • Bits 11:0, bits 15:13, and bits 63:17 (bits 31:17 on processors that do not support Intel 64 architecture) must be 0; bit 12 must be 1. • • The processor must support for RTM by enumerating CPUID.(EAX=07H,ECX=0):EBX[bit 11] as 1. The interruptibility-state field must not indicate blocking by MOV SS (bit 1 in that field must be 0). VMCS link pointer. The following checks apply if the field contains a value other than FFFFFFFF_FFFFFFFFH: — Bits 11:0 must be 0. — Bits beyond the processor’s physical-address width must be 0.1,2 — The 4 bytes located in memory referenced by the value of the field (as a physical address) must satisfy the following: • • Bits 30:0 must contain the processor’s VMCS revision identifier (see Section 24.2).3 Bit 31 must contain the setting of the “VMCS shadowing” VM-execution control.4 This implies that the referenced VMCS is a shadow VMCS (see Section 24.10) if and only if the “VMCS shadowing” VMexecution control is 1. — If the processor is not in SMM or the “entry to SMM” VM-entry control is 1, the field must not contain the current VMCS pointer. — If the processor is in SMM and the “entry to SMM” VM-entry control is 0, the field must differ from the executive-VMCS pointer. 26.3.1.6 Checks on Guest Page-Directory-Pointer-Table Entries If CR0.PG =1, CR4.PAE = 1, and IA32_EFER.LME = 0, the logical processor uses PAE paging (see Section 4.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A).5 When PAE paging is in use, the physical address in CR3 references a table of page-directory-pointer-table entries (PDPTEs). A MOV to CR3 when PAE paging is in use checks the validity of the PDPTEs. A VM entry is to a guest that uses PAE paging if (1) bit 31 (corresponding to CR0.PG) is set in the CR0 field in the guest-state area; (2) bit 5 (corresponding to CR4.PAE) is set in the CR4 field; and (3) the “IA-32e mode guest” VM-entry control is 0. Such a VM entry checks the validity of the PDPTEs: • If the “enable EPT” VM-execution control is 0, VM entry checks the validity of the PDPTEs referenced by the CR3 field in the guest-state area if either (1) PAE paging was not in use before the VM entry; or (2) the value of CR3 is changing as a result of the VM entry. VM entry may check their validity even if neither (1) nor (2) hold.6 • If the “enable EPT” VM-execution control is 1, VM entry checks the validity of the PDPTE fields in the guest-state area (see Section 24.4.2). A VM entry to a guest that does not use PAE paging does not check the validity of any PDPTEs. 1. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 2. If IA32_VMX_BASIC[48] is read as 1, this field must not set any bits in the range 63:32; see Appendix A.1. 3. Earlier versions of this manual specified that the VMCS revision identifier was a 32-bit field. For all processors produced prior to this change, bit 31 of the VMCS revision identifier was 0. 4. “VMCS shadowing” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “VMCS shadowing” VM-execution control were 0. See Section 24.6.2. 5. On processors that support Intel 64 architecture, the physical-address extension may support more than 36 physical-address bits. Software can determine the number physical-address bits supported by executing CPUID with 80000008H in EAX. The physicaladdress width is returned in bits 7:0 of EAX. 6. “Enable EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “enable EPT” VM-execution control were 0. See Section 24.6.2. 26-14 Vol. 3C VM ENTRIES A VM entry that checks the validity of the PDPTEs uses the same checks that are used when CR3 is loaded with MOV to CR3 when PAE paging is in use.1 If MOV to CR3 would cause a general-protection exception due to the PDPTEs that would be loaded (e.g., because a reserved bit is set), the VM entry fails. 26.3.2 Loading Guest State Processor state is updated on VM entries in the following ways: • • • Some state is loaded from the guest-state area. Some state is determined by VM-entry controls. The page-directory pointers are loaded based on the values of certain control registers. This loading may be performed in any order and in parallel with the checking of VMCS contents (see Section 26.3.1). The loading of guest state is detailed in Section 26.3.2.1 to Section 26.3.2.4. These sections reference VMCS fields that correspond to processor state. Unless otherwise stated, these references are to fields in the guest-state area. In addition to the state loading described in this section, VM entries may load MSRs from the VM-entry MSR-load area (see Section 26.4). This loading occurs only after the state loading described in this section and the checking of VMCS contents described in Section 26.3.1. 26.3.2.1 Loading Guest Control Registers, Debug Registers, and MSRs The following items describe how guest control registers, debug registers, and MSRs are loaded on VM entry: • CR0 is loaded from the CR0 field with the exception of the following bits, which are never modified on VM entry: ET (bit 4); reserved bits 15:6, 17, and 28:19; NW (bit 29) and CD (bit 30).2 The values of these bits in the CR0 field are ignored. • • CR3 and CR4 are loaded from the CR3 field and the CR4 field, respectively. If the “load debug controls” VM-entry control is 1, DR7 is loaded from the DR7 field with the exception that bit 12 and bits 15:14 are always 0 and bit 10 is always 1. The values of these bits in the DR7 field are ignored. The first processors to support the virtual-machine extensions supported only the 1-setting of the “load debug controls” VM-entry control and thus always loaded DR7 from the DR7 field. • The following describes how certain MSRs are loaded using fields in the guest-state area: — If the “load debug controls” VM-entry control is 1, the IA32_DEBUGCTL MSR is loaded from the IA32_DEBUGCTL field. The first processors to support the virtual-machine extensions supported only the 1setting of this control and thus always loaded the IA32_DEBUGCTL MSR from the IA32_DEBUGCTL field. — The IA32_SYSENTER_CS MSR is loaded from the IA32_SYSENTER_CS field. Since this field has only 32 bits, bits 63:32 of the MSR are cleared to 0. — The IA32_SYSENTER_ESP and IA32_SYSENTER_EIP MSRs are loaded from the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field, respectively. On processors that do not support Intel 64 architecture, these fields have only 32 bits; bits 63:32 of the MSRs are cleared to 0. — The following are performed on processors that support Intel 64 architecture: • The MSRs FS.base and GS.base are loaded from the base-address fields for FS and GS, respectively (see Section 26.3.2.2). • If the “load IA32_EFER” VM-entry control is 0, bits in the IA32_EFER MSR are modified as follows: — IA32_EFER.LMA is loaded with the setting of the “IA-32e mode guest” VM-entry control. 1. This implies that (1) bits 11:9 in each PDPTE are ignored; and (2) if bit 0 (present) is clear in one of the PDPTEs, bits 63:1 of that PDPTE are ignored. 2. Bits 15:6, bit 17, and bit 28:19 of CR0 and CR0.ET are unchanged by executions of MOV to CR0. Bits 15:6, bit 17, and bit 28:19 of CR0 are always 0 and CR0.ET is always 1. Vol. 3C 26-15 VM ENTRIES — If CR0 is being loaded so that CR0.PG = 1, IA32_EFER.LME is also loaded with the setting of the “IA-32e mode guest” VM-entry control.1 Otherwise, IA32_EFER.LME is unmodified. See below for the case in which the “load IA32_EFER” VM-entry control is 1 — If the “load IA32_PERF_GLOBAL_CTRL” VM-entry control is 1, the IA32_PERF_GLOBAL_CTRL MSR is loaded from the IA32_PERF_GLOBAL_CTRL field. — If the “load IA32_PAT” VM-entry control is 1, the IA32_PAT MSR is loaded from the IA32_PAT field. — If the “load IA32_EFER” VM-entry control is 1, the IA32_EFER MSR is loaded from the IA32_EFER field. — If the “load IA32_BNDCFGS” VM-entry control is 1, the IA32_BNDCFGS MSR is loaded from the IA32_BNDCFGS field. — If the “load IA32_RTIT_CTL” VM-entry control is 1, the IA32_RTIT_CTL MSR is loaded from the IA32_RTIT_CTL field. With the exception of FS.base and GS.base, any of these MSRs is subsequently overwritten if it appears in the VM-entry MSR-load area. See Section 26.4. • The SMBASE register is unmodified by all VM entries except those that return from SMM. 26.3.2.2 Loading Guest Segment Registers and Descriptor-Table Registers For each of CS, SS, DS, ES, FS, GS, TR, and LDTR, fields are loaded from the guest-state area as follows: • The unusable bit is loaded from the access-rights field. This bit can never be set for TR (see Section 26.3.1.2). If it is set for one of the other registers, the following apply: — For each of CS, SS, DS, ES, FS, and GS, uses of the segment cause faults (general-protection exception or stack-fault exception) outside 64-bit mode, just as they would had the segment been loaded using a null selector. This bit does not cause accesses to fault in 64-bit mode. — If this bit is set for LDTR, uses of LDTR cause general-protection exceptions in all modes, just as they would had LDTR been loaded using a null selector. If this bit is clear for any of CS, SS, DS, ES, FS, GS, TR, and LDTR, a null selector value does not cause a fault (general-protection exception or stack-fault exception). • • TR. The selector, base, limit, and access-rights fields are loaded. CS. — The following fields are always loaded: selector, base address, limit, and (from the access-rights field) the L, D, and G bits. — For the other fields, the unusable bit of the access-rights field is consulted: • • • If the unusable bit is 0, all of the access-rights field is loaded. If the unusable bit is 1, the remainder of CS access rights are undefined after VM entry. SS, DS, ES, FS, GS, and LDTR. — The selector fields are loaded. — For the other fields, the unusable bit of the corresponding access-rights field is consulted: • • If the unusable bit is 0, the base-address, limit, and access-rights fields are loaded. If the unusable bit is 1, the base address, the segment limit, and the remainder of the access rights are undefined after VM entry with the following exceptions: — Bits 3:0 of the base address for SS are cleared to 0. — SS.DPL is always loaded from the SS access-rights field. This will be the current privilege level (CPL) after the VM entry completes. 1. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PG must be 1 in VMX operation, VM entry must be loading CR0 so that CR0.PG = 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 26-16 Vol. 3C VM ENTRIES — SS.B is always set to 1. — The base addresses for FS and GS are loaded from the corresponding fields in the VMCS. On processors that support Intel 64 architecture, the values loaded for base addresses for FS and GS are also manifest in the FS.base and GS.base MSRs. — On processors that support Intel 64 architecture, the base address for LDTR is set to an undefined but canonical value. — On processors that support Intel 64 architecture, bits 63:32 of the base addresses for SS, DS, and ES are cleared to 0. GDTR and IDTR are loaded using the base and limit fields. 26.3.2.3 Loading Guest RIP, RSP, and RFLAGS RSP, RIP, and RFLAGS are loaded from the RSP field, the RIP field, and the RFLAGS field, respectively. The following items regard the upper 32 bits of these fields on VM entries that are not to 64-bit mode: • Bits 63:32 of RSP are undefined outside 64-bit mode. Thus, a logical processor may ignore the contents of bits 63:32 of the RSP field on VM entries that are not to 64-bit mode. • As noted in Section 26.3.1.4, bits 63:32 of the RIP and RFLAGS fields must be 0 on VM entries that are not to 64-bit mode. 26.3.2.4 Loading Page-Directory-Pointer-Table Entries As noted in Section 26.3.1.6, the logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 0. A VM entry to a guest that uses PAE paging loads the PDPTEs into internal, non-architectural registers based on the setting of the “enable EPT” VM-execution control: • If the control is 0, the PDPTEs are loaded from the page-directory-pointer table referenced by the physical address in the value of CR3 being loaded by the VM entry (see Section 26.3.2.1). The values loaded are treated as physical addresses in VMX non-root operation. • If the control is 1, the PDPTEs are loaded from corresponding fields in the guest-state area (see Section 24.4.2). The values loaded are treated as guest-physical addresses in VMX non-root operation. 26.3.2.5 Updating Non-Register State Section 28.3 describes how the VMX architecture controls how a logical processor manages information in the TLBs and paging-structure caches. The following items detail how VM entries invalidate cached mappings: • If the “enable VPID” VM-execution control is 0, the logical processor invalidates linear mappings and combined mappings associated with VPID 0000H (for all PCIDs); combined mappings for VPID 0000H are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP). • VM entries are not required to invalidate any guest-physical mappings, nor are they required to invalidate any linear mappings or combined mappings if the “enable VPID” VM-execution control is 1. If the “virtual-interrupt delivery” VM-execution control is 1, VM entry loads the values of RVI and SVI from the guest interrupt-status field in the VMCS (see Section 24.4.2). After doing so, the logical processor first causes PPR virtualization (Section 29.1.3) and then evaluates pending virtual interrupts (Section 29.2.1). If a virtual interrupt is recognized, it may be delivered in VMX non-root operation immediately after VM entry (including any specified event injection) completes; see Section 26.7.5. See Section 29.2.2 for details regarding the delivery of virtual interrupts. 26.3.3 Clearing Address-Range Monitoring The Intel 64 and IA-32 architectures allow software to monitor a specified address range using the MONITOR and MWAIT instructions. See Section 8.10.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. VM entries clear any address-range monitoring that may be in effect. Vol. 3C 26-17 VM ENTRIES 26.4 LOADING MSRS VM entries may load MSRs from the VM-entry MSR-load area (see Section 24.8.2). Specifically each entry in that area (up to the number specified in the VM-entry MSR-load count) is processed in order by loading the MSR indexed by bits 31:0 with the contents of bits 127:64 as they would be written by WRMSR.1 Processing of an entry fails in any of the following cases: • • The value of bits 31:0 is either C0000100H (the IA32_FS_BASE MSR) or C0000101 (the IA32_GS_BASE MSR). • The value of bits 31:0 indicates an MSR that can be written only in system-management mode (SMM) and the VM entry did not commence in SMM. (IA32_SMM_MONITOR_CTL is an MSR that can be written only in SMM.) • The value of bits 31:0 indicates an MSR that cannot be loaded on VM entries for model-specific reasons. A processor may prevent loading of certain MSRs even if they can normally be written by WRMSR. Such modelspecific behavior is documented in Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. • • Bits 63:32 are not all 0. The value of bits 31:8 is 000008H, meaning that the indexed MSR is one that allows access to an APIC register when the local APIC is in x2APIC mode. An attempt to write bits 127:64 to the MSR indexed by bits 31:0 of the entry would cause a general-protection exception if executed via WRMSR with CPL = 0.2 The VM entry fails if processing fails for any entry. The logical processor responds to such failures by loading state from the host-state area, as it would for a VM exit. See Section 26.8. If any MSR is being loaded in such a way that would architecturally require a TLB flush, the TLBs are updated so that, after VM entry, the logical processor will not use any translations that were cached before the transition. 26.5 TRACE-ADDRESS PRE-TRANSLATION (TAPT) When the “Intel PT uses guest physical addresses” VM-execution control is 1, the addresses used by Intel PT are treated as guest-physical addresses, and these are translated to physical addresses using EPT. VM entry uses trace-address pre-translation (TAPT) to prevent buffered trace data from being lost due to an EPT violation; see Section 25.5.4.2. VM entry uses TAPT only if Intel PT will be enabled following VM entry (IA32_RTIT_CTL.TraceEn = 1) and only if the “Intel PT uses guest physical addresses” VM-execution control is 1 As noted in Section 25.5.4, TAPT may cause a VM exit due to an EPT violation, EPT misconfiguration, page-modification log-full event, or APIC access. If such a VM exit occurs as a result of TAPT during VM entry, the VM exit operates as if it had occurred in VMX non-root operation after the VM entry completed (in the guest context). If TAPT during VM entry causes a VM exit, the VM entry does not perform event injection (Section 26.6), even if the valid bit in the VM-entry interruption-information field is 1. Such VM exits save the contents of VM-entry interruption-information and VM-entry exception error code fields into the IDT-vectoring information and IDT-vectoring error code fields, respectively. 26.6 EVENT INJECTION If the valid bit in the VM-entry interruption-information field (see Section 24.8.3) is 1, VM entry causes an event to be delivered (or made pending) after all components of guest state have been loaded (including MSRs) and after the VM-execution control fields have been established. 1. Because attempts to modify the value of IA32_EFER.LMA by WRMSR are ignored, attempts to modify it using the VM-entry MSRload area are also ignored. 2. If CR0.PG = 1, WRMSR to the IA32_EFER MSR causes a general-protection exception if it would modify the LME bit. If VM entry has established CR0.PG = 1, the IA32_EFER MSR should not be included in the VM-entry MSR-load area for the purpose of modifying the LME bit. 26-18 Vol. 3C VM ENTRIES • If the interruption type in the field is 0 (external interrupt), 2 (non-maskable interrupt); 3 (hardware exception), 4 (software interrupt), 5 (privileged software exception), or 6 (software exception), the event is delivered as described in Section 26.6.1. • If the interruption type in the field is 7 (other event) and the vector field is 0, an MTF VM exit is pending after VM entry. See Section 26.6.2. 26.6.1 Vectored-Event Injection VM entry delivers an injected vectored event within the guest context established by VM entry. This means that delivery occurs after all components of guest state have been loaded (including MSRs) and after the VM-execution control fields have been established.1 The event is delivered using the vector in that field to select a descriptor in the IDT. Since event injection occurs after loading IDTR from the guest-state area, this is the guest IDT. Section 26.6.1.1 provides details of vectored-event injection. In general, the event is delivered exactly as if it had been generated normally. If event delivery encounters a nested exception (for example, a general-protection exception because the vector indicates a descriptor beyond the IDT limit), the exception bitmap is consulted using the vector of that exception: • If the bit for the nested exception is 0, the nested exception is delivered normally. If the nested exception is benign, it is delivered through the IDT. If it is contributory or a page fault, a double fault may be generated, depending on the nature of the event whose delivery encountered the nested exception. See Chapter 6, “Interrupt 8—Double Fault Exception (#DF)” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.2 • If the bit for the nested exception is 1, a VM exit occurs. Section 26.6.1.2 details cases in which event injection causes a VM exit. 26.6.1.1 Details of Vectored-Event Injection The event-injection process is controlled by the contents of the VM-entry interruption information field (format given in Table 24-14), the VM-entry exception error-code field, and the VM-entry instruction-length field. The following items provide details of the process: • The value pushed on the stack for RFLAGS is generally that which was loaded from the guest-state area. The value pushed for the RF flag is not modified based on the type of event being delivered. However, the pushed value of RFLAGS may be modified if a software interrupt is being injected into a guest that will be in virtual8086 mode (see below). After RFLAGS is pushed on the stack, the value in the RFLAGS register is modified as is done normally when delivering an event through the IDT. • The instruction pointer that is pushed on the stack depends on the type of event and whether nested exceptions occur during its delivery. The term current guest RIP refers to the value to be loaded from the guest-state area. The value pushed is determined as follows:3 — If VM entry successfully injects (with no nested exception) an event with interruption type external interrupt, NMI, or hardware exception, the current guest RIP is pushed on the stack. — If VM entry successfully injects (with no nested exception) an event with interruption type software interrupt, privileged software exception, or software exception, the current guest RIP is incremented by the VM-entry instruction length before being pushed on the stack. — If VM entry encounters an exception while injecting an event and that exception does not cause a VM exit, the current guest RIP is pushed on the stack regardless of event type or VM-entry instruction length. If the encountered exception does cause a VM exit that saves RIP, the saved RIP is current guest RIP. 1. This does not imply that injection of an exception or interrupt will cause a VM exit due to the settings of VM-execution control fields (such as the exception bitmap) that would cause a VM exit if the event had occurred in VMX non-root operation. In contrast, a nested exception encountered during event delivery may cause a VM exit; see Section 26.6.1.1. 2. Hardware exceptions with the following unused vectors are considered benign: 15 and 21–31. A hardware exception with vector 20 is considered benign unless the processor supports the 1-setting of the “EPT-violation #VE” VM-execution control; in that case, it has the same severity as page faults. 3. While these items refer to RIP, the width of the value pushed (16 bits, 32 bits, or 64 bits) is determined normally. Vol. 3C 26-19 VM ENTRIES • If the deliver-error-code bit (bit 11) is set in the VM-entry interruption-information field, the contents of the VM-entry exception error-code field is pushed on the stack as an error code would be pushed during delivery of an exception. • DR6, DR7, and the IA32_DEBUGCTL MSR are not modified by event injection, even if the event has vector 1 (normal deliveries of debug exceptions, which have vector 1, do update these registers). • If VM entry is injecting a software interrupt and the guest will be in virtual-8086 mode (RFLAGS.VM = 1), no general-protection exception can occur due to RFLAGS.IOPL < 3. A VM monitor should check RFLAGS.IOPL before injecting such an event and, if desired, inject a general-protection exception instead of a software interrupt. • If VM entry is injecting a software interrupt and the guest will be in virtual-8086 mode with virtual-8086 mode extensions (RFLAGS.VM = CR4.VME = 1), event delivery is subject to VME-based interrupt redirection based on the software interrupt redirection bitmap in the task-state segment (TSS) as follows: — If bit n in the bitmap is clear (where n is the number of the software interrupt), the interrupt is directed to an 8086 program interrupt handler: the processor uses a 16-bit interrupt-vector table (IVT) located at linear address zero. If the value of RFLAGS.IOPL is less than 3, the following modifications are made to the value of RFLAGS that is pushed on the stack: IOPL is set to 3, and IF is set to the value of VIF. — If bit n in the bitmap is set (where n is the number of the software interrupt), the interrupt is directed to a protected-mode interrupt handler. (In other words, the injection is treated as described in the next item.) In this case, the software interrupt does not invoke such a handler if RFLAGS.IOPL < 3 (a generalprotection exception occurs instead). However, as noted above, RFLAGS.IOPL cannot cause an injected software interrupt to cause such a exception. Thus, in this case, the injection invokes a protected-mode interrupt handler independent of the value of RFLAGS.IOPL. Injection of events of other types are not subject to this redirection. • If VM entry is injecting a software interrupt (not redirected as described above) or software exception, privilege checking is performed on the IDT descriptor being accessed as would be the case for executions of INT n, INT3, or INTO (the descriptor’s DPL cannot be less than CPL). There is no checking of RFLAGS.IOPL, even if the guest will be in virtual-8086 mode. Failure of this check may lead to a nested exception. Injection of an event with interruption type external interrupt, NMI, hardware exception, and privileged software exception, or with interruption type software interrupt and being redirected as described above, do not perform these checks. • If VM entry is injecting a non-maskable interrupt (NMI) and the “virtual NMIs” VM-execution control is 1, virtual-NMI blocking is in effect after VM entry. • The transition causes a last-branch record to be logged if the LBR bit is set in the IA32_DEBUGCTL MSR. This is true even for events such as debug exceptions, which normally clear the LBR bit before delivery. • The last-exception record MSRs (LERs) may be updated based on the setting of the LBR bit in the IA32_DEBUGCTL MSR. Events such as debug exceptions, which normally clear the LBR bit before they are delivered, and therefore do not normally update the LERs, may do so as part of VM-entry event injection. • If injection of an event encounters a nested exception, the value of the EXT bit (bit 0) in any error code for that nested exception is determined as follows: — If event being injected has interruption type external interrupt, NMI, hardware exception, or privileged software exception and encounters a nested exception (but does not produce a double fault), the error code for that exception sets the EXT bit. — If event being injected is a software interrupt or a software exception and encounters a nested exception, the error code for that exception clears the EXT bit. — If event delivery encounters a nested exception and delivery of that exception encounters another exception (but does not produce a double fault), the error code for that exception sets the EXT bit. — If a double fault is produced, the error code for the double fault is 0000H (the EXT bit is clear). 26.6.1.2 VM Exits During Event Injection An event being injected never causes a VM exit directly regardless of the settings of the VM-execution controls. For example, setting the “NMI exiting” VM-execution control to 1 does not cause a VM exit due to injection of an NMI. However, the event-delivery process may lead to a VM exit: 26-20 Vol. 3C VM ENTRIES • If the vector in the VM-entry interruption-information field identifies a task gate in the IDT, the attempted task switch may cause a VM exit just as it would had the injected event occurred during normal execution in VMX non-root operation (see Section 25.4.2). • If event delivery encounters a nested exception, a VM exit may occur depending on the contents of the exception bitmap (see Section 25.2). • If event delivery generates a double-fault exception (due to a nested exception); the logical processor encounters another nested exception while attempting to call the double-fault handler; and that exception does not cause a VM exit due to the exception bitmap; then a VM exit occurs due to triple fault (see Section 25.2). • If event delivery injects a double-fault exception and encounters a nested exception that does not cause a VM exit due to the exception bitmap, then a VM exit occurs due to triple fault (see Section 25.2). • If the “virtualize APIC accesses” VM-execution control is 1 and event delivery generates an access to the APICaccess page, that access is treated as described in Section 29.4 and may cause a VM exit.1 If the event-delivery process does cause a VM exit, the processor state before the VM exit is determined just as it would be had the injected event occurred during normal execution in VMX non-root operation. If the injected event directly accesses a task gate that cause a VM exit or if the first nested exception encountered causes a VM exit, information about the injected event is saved in the IDT-vectoring information field (see Section 27.2.4). 26.6.1.3 Event Injection for VM Entries to Real-Address Mode If VM entry is loading CR0.PE with 0, any injected vectored event is delivered as would normally be done in realaddress mode.2 Specifically, VM entry uses the vector provided in the VM-entry interruption-information field to select a 4-byte entry from an interrupt-vector table at the linear address in IDTR.base. Further details are provided in Section 15.1.4 in Volume 3A of the IA-32 Intel® Architecture Software Developer’s Manual. Because bit 11 (deliver error code) in the VM-entry interruption-information field must be 0 if CR0.PE will be 0 after VM entry (see Section 26.2.1.3), vectored events injected with CR0.PE = 0 do not push an error code on the stack. This is consistent with event delivery in real-address mode. If event delivery encounters a fault (due to a violation of IDTR.limit or of SS.limit), the fault is treated as if it had occurred during event delivery in VMX non-root operation. Such a fault may lead to a VM exit as discussed in Section 26.6.1.2. 26.6.2 Injection of Pending MTF VM Exits If the interruption type in the VM-entry interruption-information field is 7 (other event) and the vector field is 0, VM entry causes an MTF VM exit to be pending on the instruction boundary following VM entry. This is the case even if the “monitor trap flag” VM-execution control is 0. See Section 25.5.2 for the treatment of pending MTF VM exits. 26.7 SPECIAL FEATURES OF VM ENTRY This section details a variety of features of VM entry. It uses the following terminology: a VM entry is vectoring if the valid bit (bit 31) of the VM-entry interruption information field is 1 and the interruption type in the field is 0 (external interrupt), 2 (non-maskable interrupt); 3 (hardware exception), 4 (software interrupt), 5 (privileged software exception), or 6 (software exception). 1. “Virtualize APIC accesses” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “virtualize APIC accesses” VM-execution control were 0. See Section 24.6.2. 2. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation, VM entry must be loading CR0.PE with 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. Vol. 3C 26-21 VM ENTRIES 26.7.1 Interruptibility State The interruptibility-state field in the guest-state area (see Table 24-3) contains bits that control blocking by STI, blocking by MOV SS, and blocking by NMI. This field impacts event blocking after VM entry as follows: • If the VM entry is vectoring, there is no blocking by STI or by MOV SS following the VM entry, regardless of the contents of the interruptibility-state field. • If the VM entry is not vectoring, the following apply: — Events are blocked by STI if and only if bit 0 in the interruptibility-state field is 1. This blocking is cleared after the guest executes one instruction or incurs an exception (including a debug exception made pending by VM entry; see Section 26.7.3). — Events are blocked by MOV SS if and only if bit 1 in the interruptibility-state field is 1. This may affect the treatment of pending debug exceptions; see Section 26.7.3. This blocking is cleared after the guest executes one instruction or incurs an exception (including a debug exception made pending by VM entry). • The blocking of non-maskable interrupts (NMIs) is determined as follows: — If the “virtual NMIs” VM-execution control is 0, NMIs are blocked if and only if bit 3 (blocking by NMI) in the interruptibility-state field is 1. If the “NMI exiting” VM-execution control is 0, execution of the IRET instruction removes this blocking (even if the instruction generates a fault). If the “NMI exiting” control is 1, IRET does not affect this blocking. — The following items describe the use of bit 3 (blocking by NMI) in the interruptibility-state field if the “virtual NMIs” VM-execution control is 1: • The bit’s value does not affect the blocking of NMIs after VM entry. NMIs are not blocked in VMX nonroot operation (except for ordinary blocking for other reasons, such as by the MOV SS instruction, the wait-for-SIPI state, etc.) • The bit’s value determines whether there is virtual-NMI blocking after VM entry. If the bit is 1, virtualNMI blocking is in effect after VM entry. If the bit is 0, there is no virtual-NMI blocking after VM entry unless the VM entry is injecting an NMI (see Section 26.6.1.1). Execution of IRET removes virtual-NMI blocking (even if the instruction generates a fault). If the “NMI exiting” VM-execution control is 0, the “virtual NMIs” control must be 0; see Section 26.2.1.1. • Blocking of system-management interrupts (SMIs) is determined as follows: — If the VM entry was not executed in system-management mode (SMM), SMI blocking is unchanged by VM entry. — If the VM entry was executed in SMM, SMIs are blocked after VM entry if and only if the bit 2 in the interruptibility-state field is 1. 26.7.2 Activity State The activity-state field in the guest-state area controls whether, after VM entry, the logical processor is active or in one of the inactive states identified in Section 24.4.2. The use of this field is determined as follows: • If the VM entry is vectoring, the logical processor is in the active state after VM entry. While the consistency checks described in Section 26.3.1.5 on the activity-state field do apply in this case, the contents of the activity-state field do not determine the activity state after VM entry. • If the VM entry is not vectoring, the logical processor ends VM entry in the activity state specified in the gueststate area. If VM entry ends with the logical processor in an inactive activity state, the VM entry generates any special bus cycle that is normally generated when that activity state is entered from the active state. If VM entry would end with the logical processor in the shutdown state and the logical processor is in SMX operation,1 an Intel® TXT shutdown condition occurs. The error code used is 0000H, indicating “legacy shutdown.” See Intel® Trusted Execution Technology Preliminary Architecture Specification. 1. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. 26-22 Vol. 3C VM ENTRIES • Some activity states unconditionally block certain events. The following blocking is in effect after any VM entry that puts the processor in the indicated state: — The active state blocks start-up IPIs (SIPIs). SIPIs that arrive while a logical processor is in the active state and in VMX non-root operation are discarded and do not cause VM exits. — The HLT state blocks start-up IPIs (SIPIs). SIPIs that arrive while a logical processor is in the HLT state and in VMX non-root operation are discarded and do not cause VM exits. — The shutdown state blocks external interrupts and SIPIs. External interrupts that arrive while a logical processor is in the shutdown state and in VMX non-root operation do not cause VM exits even if the “external-interrupt exiting” VM-execution control is 1. SIPIs that arrive while a logical processor is in the shutdown state and in VMX non-root operation are discarded and do not cause VM exits. — The wait-for-SIPI state blocks external interrupts, non-maskable interrupts (NMIs), INIT signals, and system-management interrupts (SMIs). Such events do not cause VM exits if they arrive while a logical processor is in the wait-for-SIPI state and in VMX non-root operation. 26.7.3 Delivery of Pending Debug Exceptions after VM Entry The pending debug exceptions field in the guest-state area indicates whether there are debug exceptions that have not yet been delivered (see Section 24.4.2). This section describes how these are treated on VM entry. There are no pending debug exceptions after VM entry if any of the following are true: • The VM entry is vectoring with one of the following interruption types: external interrupt, non-maskable interrupt (NMI), hardware exception, or privileged software exception. • The interruptibility-state field does not indicate blocking by MOV SS and the VM entry is vectoring with either of the following interruption type: software interrupt or software exception. • The VM entry is not vectoring and the activity-state field indicates either shutdown or wait-for-SIPI. If none of the above hold, the pending debug exceptions field specifies the debug exceptions that are pending for the guest. There are valid pending debug exceptions if either the BS bit (bit 14) or the enable-breakpoint bit (bit 12) is 1. If there are valid pending debug exceptions, they are handled as follows: • If the VM entry is not vectoring, the pending debug exceptions are treated as they would had they been encountered normally in guest execution: — If the logical processor is not blocking such exceptions (the interruptibility-state field indicates no blocking by MOV SS), a debug exception is delivered after VM entry (see below). — If the logical processor is blocking such exceptions (due to blocking by MOV SS), the pending debug exceptions are held pending or lost as would normally be the case. • If the VM entry is vectoring (with interruption type software interrupt or software exception and with blocking by MOV SS), the following items apply: — For injection of a software interrupt or of a software exception with vector 3 (#BP) or vector 4 (#OF) — or a privileged software exception with vector 1 (#DB) — the pending debug exceptions are treated as they would had they been encountered normally in guest execution if the corresponding instruction (INT1, INT3, or INTO) were executed after a MOV SS that encountered a debug trap. — For injection of a software exception with a vector other than 3 and 4, the pending debug exceptions may be lost or they may be delivered after injection (see below). If there are no valid pending debug exceptions (as defined above), no pending debug exceptions are delivered after VM entry. If a pending debug exception is delivered after VM entry, it has the priority of “traps on the previous instruction” (see Section 6.9 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A). Thus, INIT signals and system-management interrupts (SMIs) take priority of such an exception, as do VM exits induced by the TPR threshold (see Section 26.7.7) and pending MTF VM exits (see Section 26.7.8. The exception takes priority over any pending non-maskable interrupt (NMI) or external interrupt and also over VM exits due to the 1-settings of the “interrupt-window exiting” and “NMI-window exiting” VM-execution controls. Vol. 3C 26-23 VM ENTRIES A pending debug exception delivered after VM entry causes a VM exit if the bit 1 (#DB) is 1 in the exception bitmap. If it does not cause a VM exit, it updates DR6 normally. 26.7.4 VMX-Preemption Timer If the “activate VMX-preemption timer” VM-execution control is 1, VM entry starts the VMX-preemption timer with the unsigned value in the VMX-preemption timer-value field. It is possible for the VMX-preemption timer to expire during VM entry (e.g., if the value in the VMX-preemption timer-value field is zero). If this happens (and if the VM entry was not to the wait-for-SIPI state), a VM exit occurs with its normal priority after any event injection and before execution of any instruction following VM entry. For example, any pending debug exceptions established by VM entry (see Section 26.7.3) take priority over a timerinduced VM exit. (The timer-induced VM exit will occur after delivery of the debug exception, unless that exception or its delivery causes a different VM exit.) See Section 25.5.1 for details of the operation of the VMX-preemption timer in VMX non-root operation, including the blocking and priority of the VM exits that it causes. 26.7.5 Interrupt-Window Exiting and Virtual-Interrupt Delivery If “interrupt-window exiting” VM-execution control is 1, an open interrupt window may cause a VM exit immediately after VM entry (see Section 25.2 for details). If the “interrupt-window exiting” VM-execution control is 0 but the “virtual-interrupt delivery” VM-execution control is 1, a virtual interrupt may be delivered immediately after VM entry (see Section 26.3.2.5 and Section 29.2.1). The following items detail the treatment of these events: • • These events occur after any event injection specified for VM entry. • These events wake the logical processor if it just entered the HLT state because of a VM entry (see Section 26.7.2). They do not occur if the logical processor just entered the shutdown state or the wait-for-SIPI state. Non-maskable interrupts (NMIs) and higher priority events take priority over these events. These events take priority over external interrupts and lower priority events. 26.7.6 NMI-Window Exiting The “NMI-window exiting” VM-execution control may cause a VM exit to occur immediately after VM entry (see Section 25.2 for details). The following items detail the treatment of these VM exits: • • These VM exits follow event injection if such injection is specified for VM entry. • VM exits caused by this control wake the logical processor if it just entered either the HLT state or the shutdown state because of a VM entry (see Section 26.7.2). They do not occur if the logical processor just entered the wait-for-SIPI state. Debug-trap exceptions (see Section 26.7.3) and higher priority events take priority over VM exits caused by this control. VM exits caused by this control take priority over non-maskable interrupts (NMIs) and lower priority events. 26.7.7 VM Exits Induced by the TPR Threshold If the “use TPR shadow” and “virtualize APIC accesses” VM-execution controls are both 1 and the “virtual-interrupt delivery” VM-execution control is 0, a VM exit occurs immediately after VM entry if the value of bits 3:0 of the TPR threshold VM-execution control field is greater than the value of bits 7:4 of VTPR (see Section 29.1.1).1 1. “Virtualize APIC accesses” and “virtual-interrupt delivery” are secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if these controls were 0. See Section 24.6.2. 26-24 Vol. 3C VM ENTRIES The following items detail the treatment of these VM exits: • The VM exits are not blocked if RFLAGS.IF = 0 or by the setting of bits in the interruptibility-state field in gueststate area. • • The VM exits follow event injection if such injection is specified for VM entry. • These VM exits wake the logical processor if it just entered the HLT state as part of a VM entry (see Section 26.7.2). They do not occur if the logical processor just entered the shutdown state or the wait-for-SIPI state. VM exits caused by this control take priority over system-management interrupts (SMIs), INIT signals, and lower priority events. They thus have priority over the VM exits described in Section 26.7.5, Section 26.7.6, and Section 26.7.8, as well as any interrupts or debug exceptions that may be pending at the time of VM entry. If such a VM exit is suppressed because the processor just entered the shutdown state, it occurs after the delivery of any event that cause the logical processor to leave the shutdown state while remaining in VMX non-root operation (e.g., due to an NMI that occurs while the “NMI-exiting” VM-execution control is 0). • The basic exit reason is “TPR below threshold.” 26.7.8 Pending MTF VM Exits As noted in Section 26.6.2, VM entry may cause an MTF VM exit to be pending immediately after VM entry. The following items detail the treatment of these VM exits: • System-management interrupts (SMIs), INIT signals, and higher priority events take priority over these VM exits. These VM exits take priority over debug-trap exceptions and lower priority events. • These VM exits wake the logical processor if it just entered the HLT state because of a VM entry (see Section 26.7.2). They do not occur if the logical processor just entered the shutdown state or the wait-for-SIPI state. 26.7.9 VM Entries and Advanced Debugging Features VM entries are not logged with last-branch records, do not produce branch-trace messages, and do not update the branch-trace store. 26.8 VM-ENTRY FAILURES DURING OR AFTER LOADING GUEST STATE VM-entry failures due to the checks identified in Section 26.3.1 and failures during the MSR loading identified in Section 26.4 are treated differently from those that occur earlier in VM entry. In these cases, the following steps take place: 1. Information about the VM-entry failure is recorded in the VM-exit information fields: — Exit reason. • Bits 15:0 of this field contain the basic exit reason. It is loaded with a number indicating the general cause of the VM-entry failure. The following numbers are used: 33. VM-entry failure due to invalid guest state. A VM entry failed one of the checks identified in Section 26.3.1. 34. VM-entry failure due to MSR loading. A VM entry failed in an attempt to load MSRs (see Section 26.4). 41. VM-entry failure due to machine-check event. A machine-check event occurred during VM entry (see Section 26.9). • • Bit 31 is set to 1 to indicate a VM-entry failure. The remainder of the field (bits 30:16) is cleared. — Exit qualification. This field is set based on the exit reason. • VM-entry failure due to invalid guest state. In most cases, the exit qualification is cleared to 0. The following non-zero values are used in the cases indicated: Vol. 3C 26-25 VM ENTRIES 1. Not used. 2. Failure was due to a problem loading the PDPTEs (see Section 26.3.1.6). 3. Failure was due to an attempt to inject a non-maskable interrupt (NMI) into a guest that is blocking events through the STI blocking bit in the interruptibility-state field. Such failures are implementation-specific (see Section 26.3.1.5). 4. Failure was due to an invalid VMCS link pointer (see Section 26.3.1.5). VM-entry checks on guest-state fields may be performed in any order. Thus, an indication by exit qualification of one cause does not imply that there are not also other errors. Different processors may give different exit qualifications for the same VMCS. • VM-entry failure due to MSR loading. The exit qualification is loaded to indicate which entry in the VM-entry MSR-load area caused the problem (1 for the first entry, 2 for the second, etc.). — All other VM-exit information fields are unmodified. 2. Processor state is loaded as would be done on a VM exit (see Section 27.5). If this results in [CR4.PAE & CR0.PG & ~IA32_EFER.LMA] = 1, page-directory-pointer-table entries (PDPTEs) may be checked and loaded (see Section 27.5.4). 3. The state of blocking by NMI is what it was before VM entry. 4. MSRs are loaded as specified in the VM-exit MSR-load area (see Section 27.6). Although this process resembles that of a VM exit, many steps taken during a VM exit do not occur for these VM-entry failures: • • • • Most VM-exit information fields are not updated (see step 1 above). The valid bit in the VM-entry interruption-information field is not cleared. The guest-state area is not modified. No MSRs are saved into the VM-exit MSR-store area. 26.9 MACHINE-CHECK EVENTS DURING VM ENTRY If a machine-check event occurs during a VM entry, one of the following occurs: • The machine-check event is handled as if it occurred before the VM entry: — If CR4.MCE = 0, operation of the logical processor depends on whether the logical processor is in SMX operation:1 • If the logical processor is in SMX operation, an Intel® TXT shutdown condition occurs. The error code used is 000CH, indicating “unrecoverable machine-check condition.” • If the logical processor is outside SMX operation, it goes to the shutdown state. — If CR4.MCE = 1, a machine-check exception (#MC) is delivered through the IDT. • The machine-check event is handled after VM entry completes: — If the VM entry ends with CR4.MCE = 0, operation of the logical processor depends on whether the logical processor is in SMX operation: • If the logical processor is in SMX operation, an Intel® TXT shutdown condition occurs with error code 000CH (unrecoverable machine-check condition). • If the logical processor is outside SMX operation, it goes to the shutdown state. — If the VM entry ends with CR4.MCE = 1, a machine-check exception (#MC) is generated: • If bit 18 (#MC) of the exception bitmap is 0, the exception is delivered through the guest IDT. 1. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. 26-26 Vol. 3C VM ENTRIES • • If bit 18 of the exception bitmap is 1, the exception causes a VM exit. A VM-entry failure occurs as described in Section 26.8. The basic exit reason is 41, for “VM-entry failure due to machine-check event.” The first option is not used if the machine-check event occurs after any guest state has been loaded. The second option is used only if VM entry is able to load all guest state. Vol. 3C 26-27 VM ENTRIES 26-28 Vol. 3C 19.Updates to Chapter 27, Volume 3C Change bars and green text show changes to Appendix C of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 30 CHAPTER 27 VM EXITS VM exits occur in response to certain instructions and events in VMX non-root operation as detailed in Section 25.1 through Section 25.2. VM exits perform the following operations: 1. Information about the cause of the VM exit is recorded in the VM-exit information fields and VM-entry control fields are modified as described in Section 27.2. 2. Processor state is saved in the guest-state area (Section 27.3). 3. MSRs may be saved in the VM-exit MSR-store area (Section 27.4). This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM. 4. The following may be performed in parallel and in any order (Section 27.5): — Processor state is loaded based in part on the host-state area and some VM-exit controls. This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM. See Section 34.15.6 for information on how processor state is loaded by such VM exits. — Address-range monitoring is cleared. 5. MSRs may be loaded from the VM-exit MSR-load area (Section 27.6). This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM. VM exits are not logged with last-branch records, do not produce branch-trace messages, and do not update the branch-trace store. Section 27.1 clarifies the nature of the architectural state before a VM exit begins. The steps described above are detailed in Section 27.2 through Section 27.6. Section 34.15 describes the dual-monitor treatment of system-management interrupts (SMIs) and systemmanagement mode (SMM). Under this treatment, ordinary transitions to SMM are replaced by VM exits to a separate SMM monitor. Called SMM VM exits, these are caused by the arrival of an SMI or the execution of VMCALL in VMX root operation. SMM VM exits differ from other VM exits in ways that are detailed in Section 34.15.2. 27.1 ARCHITECTURAL STATE BEFORE A VM EXIT This section describes the architectural state that exists before a VM exit, especially for VM exits caused by events that would normally be delivered through the IDT. Note the following: • An exception causes a VM exit directly if the bit corresponding to that exception is set in the exception bitmap. A non-maskable interrupt (NMI) causes a VM exit directly if the “NMI exiting” VM-execution control is 1. An external interrupt causes a VM exit directly if the “external-interrupt exiting” VM-execution control is 1. A startup IPI (SIPI) that arrives while a logical processor is in the wait-for-SIPI activity state causes a VM exit directly. INIT signals that arrive while the processor is not in the wait-for-SIPI activity state cause VM exits directly. • An exception, NMI, external interrupt, or software interrupt causes a VM exit indirectly if it does not do so directly but delivery of the event causes a nested exception, double fault, task switch, APIC access (see Section 29.4), EPT violation, EPT misconfiguration, page-modification log-full event (see Section 28.2.6), or SPPrelated event (see Section 28.2.4) that causes a VM exit. • An event results in a VM exit if it causes a VM exit (directly or indirectly). The following bullets detail when architectural state is and is not updated in response to VM exits: • If an event causes a VM exit directly, it does not update architectural state as it would have if it had it not caused the VM exit: — A debug exception does not update DR6, DR7.GD, or IA32_DEBUGCTL.LBR. (Information about the nature of the debug exception is saved in the exit qualification field.) — A page fault does not update CR2. (The linear address causing the page fault is saved in the exit-qualification field.) Vol. 3C 27-1 VM EXITS — An NMI causes subsequent NMIs to be blocked, but only after the VM exit completes. — An external interrupt does not acknowledge the interrupt controller and the interrupt remains pending, unless the “acknowledge interrupt on exit” VM-exit control is 1. In such a case, the interrupt controller is acknowledged and the interrupt is no longer pending. — The flags L0 – L3 in DR7 (bit 0, bit 2, bit 4, and bit 6) are not cleared when a task switch causes a VM exit. — If a task switch causes a VM exit, none of the following are modified by the task switch: old task-state segment (TSS); new TSS; old TSS descriptor; new TSS descriptor; RFLAGS.NT1; or the TR register. — No last-exception record is made if the event that would do so directly causes a VM exit. — If a machine-check exception causes a VM exit directly, this does not prevent machine-check MSRs from being updated. These are updated by the machine-check event itself and not the resulting machine-check exception. — If the logical processor is in an inactive state (see Section 24.4.2) and not executing instructions, some events may be blocked but others may return the logical processor to the active state. Unblocked events may cause VM exits.2 If an unblocked event causes a VM exit directly, a return to the active state occurs only after the VM exit completes.3 The VM exit generates any special bus cycle that is normally generated when the active state is entered from that activity state. MTF VM exits (see Section 25.5.2 and Section 26.7.8) are not blocked in the HLT activity state. If an MTF VM exit occurs in the HLT activity state, the logical processor returns to the active state only after the VM exit completes. MTF VM exits are blocked the shutdown state and the wait-for-SIPI state. • If an event causes a VM exit indirectly, the event does update architectural state: — A debug exception updates DR6, DR7, and the IA32_DEBUGCTL MSR. No debug exceptions are considered pending. — A page fault updates CR2. — An NMI causes subsequent NMIs to be blocked before the VM exit commences. — An external interrupt acknowledges the interrupt controller and the interrupt is no longer pending. — If the logical processor had been in an inactive state, it enters the active state and, before the VM exit commences, generates any special bus cycle that is normally generated when the active state is entered from that activity state. — There is no blocking by STI or by MOV SS when the VM exit commences. — Processor state that is normally updated as part of delivery through the IDT (CS, RIP, SS, RSP, RFLAGS) is not modified. However, the incomplete delivery of the event may write to the stack. — The treatment of last-exception records is implementation dependent: • Some processors make a last-exception record when beginning the delivery of an event through the IDT (before it can encounter a nested exception). Such processors perform this update even if the event encounters a nested exception that causes a VM exit (including the case where nested exceptions lead to a triple fault). • Other processors delay making a last-exception record until event delivery has reached some event handler successfully (perhaps after one or more nested exceptions). Such processors do not update the last-exception record if a VM exit or triple fault occurs before an event handler is reached. 1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For processors that do not support Intel 64 architecture, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to lower 32 bits of the indicated register. 2. If a VM exit takes the processor from an inactive state resulting from execution of a specific instruction (HLT or MWAIT), the value saved for RIP by that VM exit will reference the following instruction. 3. An exception is made if the logical processor had been inactive due to execution of MWAIT; in this case, it is considered to have become active before the VM exit. 27-2 Vol. 3C VM EXITS • If the “virtual NMIs” VM-execution control is 1, VM entry injects an NMI, and delivery of the NMI causes a nested exception, double fault, task switch, EPT violation, EPT misconfiguration, page-modification log-full event, or SPP-related event, or APIC access that causes a VM exit, virtual-NMI blocking is in effect before the VM exit commences. • If a VM exit results from a fault, EPT violation, EPT misconfiguration, page-modification log-full event, or SPPrelated event that is encountered during execution of IRET and the “NMI exiting” VM-execution control is 0, any blocking by NMI is cleared before the VM exit commences. However, the previous state of blocking by NMI may be recorded in the exit qualification or in the VM-exit interruption-information field; see Section 27.2.3. • If a VM exit results from a fault, EPT violation, EPT misconfiguration, page-modification log-full event, or SPPrelated event that is encountered during execution of IRET and the “virtual NMIs” VM-execution control is 1, virtual-NMI blocking is cleared before the VM exit commences. However, the previous state of blocking by NMI may be recorded in the exit qualification or in the VM-exit interruption-information field; see Section 27.2.3. • Suppose that a VM exit is caused directly by an x87 FPU Floating-Point Error (#MF) or by any of the following events if the event was unblocked due to (and given priority over) an x87 FPU Floating-Point Error: an INIT signal, an external interrupt, an NMI, an SMI; or a machine-check exception. In these cases, there is no blocking by STI or by MOV SS when the VM exit commences. • Normally, a last-branch record may be made when an event is delivered through the IDT. However, if such an event results in a VM exit before delivery is complete, no last-branch record is made. • If machine-check exception results in a VM exit, processor state is suspect and may result in suspect state being saved to the guest-state area. A VM monitor should consult the RIPV and EIPV bits in the IA32_MCG_STATUS MSR before resuming a guest that caused a VM exit resulting from a machine-check exception. • If a VM exit results from a fault, APIC access (see Section 29.4), EPT violation, EPT misconfiguration, pagemodification log-full event, or SPP-related event that is encountered while executing an instruction, data breakpoints due to that instruction may have been recognized and information about them may be saved in the pending debug exceptions field (unless the VM exit clears that field; see Section 27.3.4). • The following VM exits are considered to happen after an instruction is executed: — VM exits resulting from debug traps (single-step, I/O breakpoints, and data breakpoints). — VM exits resulting from debug exceptions (data breakpoints) whose recognition was delayed by blocking by MOV SS. — VM exits resulting from some machine-check exceptions. — Trap-like VM exits due to execution of MOV to CR8 when the “CR8-load exiting” VM-execution control is 0 and the “use TPR shadow” VM-execution control is 1 (see Section 29.3). (Such VM exits can occur only from 64-bit mode and thus only on processors that support Intel 64 architecture.) — Trap-like VM exits due to execution of WRMSR when the “use MSR bitmaps” VM-execution control is 1; the value of ECX is in the range 800H–8FFH; and the bit corresponding to the ECX value in write bitmap for low MSRs is 0; and the “virtualize x2APIC mode” VM-execution control is 1. See Section 29.5. — VM exits caused by APIC-write emulation (see Section 29.4.3.2) that result from APIC accesses as part of instruction execution. For these VM exits, the instruction’s modifications to architectural state complete before the VM exit occurs. Such modifications include those to the logical processor’s interruptibility state (see Table 24-3). If there had been blocking by MOV SS, POP SS, or STI before the instruction executed, such blocking is no longer in effect. A VM exit that occurs in enclave mode sets bit 27 of the exit-reason field and bit 4 of the guest interruptibility-state field. Before such a VM exit is delivered, an Asynchronous Enclave Exit (AEX) occurs (see Chapter 39, “Enclave Exiting Events”). An AEX modifies architectural state (Section 39.3). In particular, the processor establishes the following architectural state as indicated: • • • • The following bits in RFLAGS are cleared: CF, PF, AF, ZF, SF, OF, and RF. FS and GS are restored to the values they had prior to the most recent enclave entry. RIP is loaded with the AEP of interrupted enclave thread. RSP is loaded from the URSP field in the enclave’s state-save area (SSA). Vol. 3C 27-3 VM EXITS 27.2 RECORDING VM-EXIT INFORMATION AND UPDATING VM-ENTRY CONTROL FIELDS VM exits begin by recording information about the nature of and reason for the VM exit in the VM-exit information fields. Section 27.2.1 to Section 27.2.5 detail the use of these fields. In addition to updating the VM-exit information fields, the valid bit (bit 31) is cleared in the VM-entry interruptioninformation field. If bit 5 of the IA32_VMX_MISC MSR (index 485H) is read as 1 (see Appendix A.6), the value of IA32_EFER.LMA is stored into the “IA-32e mode guest” VM-entry control.1 27.2.1 Basic VM-Exit Information Section 24.9.1 defines the basic VM-exit information fields. The following items detail their use. • Exit reason. — Bits 15:0 of this field contain the basic exit reason. It is loaded with a number indicating the general cause of the VM exit. Appendix C lists the numbers used and their meaning. — Bit 27 of this field is set to 1 if the VM exit occurred while the logical processor was in enclave mode. Such VM exits include those caused by interrupts, non-maskable interrupts, system-management interrupts, INIT signals, and exceptions occurring in enclave mode as well as exceptions encountered during the delivery of such events incident to enclave mode. A VM exit also sets this bit if it is incident to delivery of an event injected by VM entry and the guest interruptibility-state field indicates an enclave interrupt (bit 4 of the field is 1). — The remainder of the field (bits 31:28 and bits 26:16) is cleared to 0 (certain SMM VM exits may set some of these bits; see Section 34.15.2.3).2 • Exit qualification. This field is saved for VM exits due to the following causes: debug exceptions; page-fault exceptions; start-up IPIs (SIPIs); system-management interrupts (SMIs) that arrive immediately after the execution of I/O instructions; task switches; INVEPT; INVLPG; INVPCID; INVVPID; LGDT; LIDT; LLDT; LTR; SGDT; SIDT; SLDT; STR; VMCLEAR; VMPTRLD; VMPTRST; VMREAD; VMWRITE; VMXON; XRSTORS; XSAVES; control-register accesses; MOV DR; I/O instructions; MWAIT; accesses to the APIC-access page (see Section 29.4); EPT violations (see Section 28.2.3.2); EOI virtualization (see Section 29.1.4); APIC-write emulation (see Section 29.4.3.3); page-modification log full (see Section 28.2.6); and SPP-related events (see Section 28.2.4). For all other VM exits, this field is cleared. The following items provide details: — For a debug exception, the exit qualification contains information about the debug exception. The information has the format given in Table 27-1. Table 27-1. Exit Qualification for Debug Exceptions Bit Position(s) Contents 3:0 B3 – B0. When set, each of these bits indicates that the corresponding breakpoint condition was met. Any of these bits may be set even if its corresponding enabling bit in DR7 is not set. 12:4 Reserved (cleared to 0). 13 BD. When set, this bit indicates that the cause of the debug exception is “debug register access detected.” 14 BS. When set, this bit indicates that the cause of the debug exception is either the execution of a single instruction (if RFLAGS.TF = 1 and IA32_DEBUGCTL.BTF = 0) or a taken branch (if RFLAGS.TF = DEBUGCTL.BTF = 1). 15 Reserved (cleared to 0). 1. Bit 5 of the IA32_VMX_MISC MSR is read as 1 on any logical processor that supports the 1-setting of the “unrestricted guest” VMexecution control. 2. Bit 31 of this field is set on certain VM-entry failures; see Section 26.8. 27-4 Vol. 3C VM EXITS Table 27-1. Exit Qualification for Debug Exceptions (Contd.) Bit Position(s) Contents 16 RTM. When set, this bit indicates that a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions was enabled (see Section 16.3.7, “RTM-Enabled Debugger Support,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1).1 63:17 Reserved (cleared to 0). Bits 63:32 exist only on processors that support Intel 64 architecture. NOTES: 1. In general, the format of this field matches that of DR6. However, DR6 clears bit 16 to indicate an RTM-related exception, while this field sets the bit to indicate that condition. — For a page-fault exception, the exit qualification contains the linear address that caused the page fault. On processors that support Intel 64 architecture, bits 63:32 are cleared if the logical processor was not in 64bit mode before the VM exit. If the page-fault exception occurred during execution of an instruction in enclave mode (and not during delivery of an event incident to enclave mode), bits 11:0 of the exit qualification are cleared. — For a start-up IPI (SIPI), the exit qualification contains the SIPI vector information in bits 7:0. Bits 63:8 of the exit qualification are cleared to 0. — For a task switch, the exit qualification contains details about the task switch, encoded as shown in Table 27-2. — For INVLPG, the exit qualification contains the linear-address operand of the instruction. • On processors that support Intel 64 architecture, bits 63:32 are cleared if the logical processor was not in 64-bit mode before the VM exit. • If the INVLPG source operand specifies an unusable segment, the linear address specified in the exit qualification will match the linear address that the INVLPG would have used if no VM exit occurred. This address is not architecturally defined and may be implementation-specific. Table 27-2. Exit Qualification for Task Switch Bit Position(s) Contents 15:0 Selector of task-state segment (TSS) to which the guest attempted to switch 29:16 Reserved (cleared to 0) 31:30 Source of task switch initiation: 0: CALL instruction 1: IRET instruction 2: JMP instruction 3: Task gate in IDT 63:32 Reserved (cleared to 0). These bits exist only on processors that support Intel 64 architecture. — For INVEPT, INVPCID, INVVPID, LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, STR, VMCLEAR, VMPTRLD, VMPTRST, VMREAD, VMWRITE, VMXON, XRSTORS, and XSAVES, the exit qualification receives the value of the instruction’s displacement field, which is sign-extended to 64 bits if necessary (32 bits on processors that do not support Intel 64 architecture). If the instruction has no displacement (for example, has a register operand), zero is stored into the exit qualification. On processors that support Intel 64 architecture, an exception is made for RIP-relative addressing (used only in 64-bit mode). Such addressing causes an instruction to use an address that is the sum of the displacement field and the value of RIP that references the following instruction. In this case, the exit qualification is loaded with the sum of the displacement field and the appropriate RIP value. Vol. 3C 27-5 VM EXITS In all cases, bits of this field beyond the instruction’s address size are undefined. For example, suppose that the address-size field in the VM-exit instruction-information field (see Section 24.9.4 and Section 27.2.5) reports an n-bit address size. Then bits 63:n (bits 31:n on processors that do not support Intel 64 architecture) of the instruction displacement are undefined. — For a control-register access, the exit qualification contains information about the access and has the format given in Table 27-3. — For MOV DR, the exit qualification contains information about the instruction and has the format given in Table 27-4. — For an I/O instruction, the exit qualification contains information about the instruction and has the format given in Table 27-5. — For MWAIT, the exit qualification contains a value that indicates whether address-range monitoring hardware was armed. The exit qualification is set either to 0 (if address-range monitoring hardware is not armed) or to 1 (if address-range monitoring hardware is armed). — For an APIC-access VM exit resulting from a linear access or a guest-physical access to the APIC-access page (see Section 29.4), the exit qualification contains information about the access and has the format given in Table 27-6.1 If the access to the APIC-access page occurred during execution of an instruction in enclave mode (and not during delivery of an event incident to enclave mode), bits 11:0 of the exit qualification are cleared. Such a VM exit that set bits 15:12 of the exit qualification to 0000b (data read during instruction execution) or 0001b (data write during instruction execution) set bit 12—which distinguishes data read from data write—to that which would have been stored in bit 1—W/R—of the page-fault error code had the access caused a page fault instead of an APIC-access VM exit. This implies the following: • For an APIC-access VM exit caused by the CLFLUSH and CLFLUSHOPT instructions, the access type is “data read during instruction execution.” • For an APIC-access VM exit caused by the ENTER instruction, the access type is “data write during instruction execution.” Table 27-3. Exit Qualification for Control-Register Accesses Bit Positions Contents 3:0 Number of control register (0 for CLTS and LMSW). Bit 3 is always 0 on processors that do not support Intel 64 architecture as they do not support CR8. 5:4 Access type: 0 = MOV to CR 1 = MOV from CR 2 = CLTS 3 = LMSW 6 LMSW operand type: 0 = register 1 = memory For CLTS and MOV CR, cleared to 0 7 Reserved (cleared to 0) 1. The exit qualification is undefined if the access was part of the logging of a branch record or a processor-event-based-sampling (PEBS) record to the DS save area. It is recommended that software configure the paging structures so that no address in the DS save area translates to an address on the APIC-access page. 27-6 Vol. 3C VM EXITS Table 27-3. Exit Qualification for Control-Register Accesses (Contd.) Bit Positions Contents 11:8 For MOV CR, the general-purpose register: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) For CLTS and LMSW, cleared to 0 15:12 Reserved (cleared to 0) 31:16 For LMSW, the LMSW source data For CLTS and MOV CR, cleared to 0 63:32 Reserved (cleared to 0). These bits exist only on processors that support Intel 64 architecture. • For an APIC-access VM exit caused by the MASKMOVQ instruction or the MASKMOVDQU instruction, the access type is “data write during instruction execution.” • For an APIC-access VM exit caused by the MONITOR instruction, the access type is “data read during instruction execution.” Such a VM exit stores 1 for bit 31 for IDT-vectoring information field (see Section 27.2.4) if and only if it sets bits 15:12 of the exit qualification to 0011b (linear access during event delivery) or 1010b (guestphysical access during event delivery). See Section 29.4.4 for further discussion of these instructions and APIC-access VM exits. For APIC-access VM exits resulting from physical accesses to the APIC-access page (see Section 29.4.6), the exit qualification is undefined. — For an EPT violation, the exit qualification contains information about the access causing the EPT violation and has the format given in Table 27-7. As noted in that table, the format and meaning of the exit qualification depends on the setting of the “mode-based execute control for EPT” VM-execution control and whether the processor supports advanced VM-exit information for EPT violations.1 An EPT violation that occurs during as a result of execution of a read-modify-write operation sets bit 1 (data write). Whether it also sets bit 0 (data read) is implementation-specific and, for a given implementation, may differ for different kinds of read-modify-write operations. Table 27-4. Exit Qualification for MOV DR Bit Position(s) Contents 2:0 Number of debug register 3 Reserved (cleared to 0) 4 Direction of access (0 = MOV to DR; 1 = MOV from DR) 1. Software can determine whether advanced VM-exit information for EPT violations is supported by consulting the VMX capability MSR IA32_VMX_EPT_VPID_CAP (see Appendix A.10). Vol. 3C 27-7 VM EXITS Table 27-4. Exit Qualification for MOV DR (Contd.) Bit Position(s) Contents 7:5 Reserved (cleared to 0) 11:8 General-purpose register: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8 –15 = R8 – R15, respectively 63:12 Reserved (cleared to 0) Table 27-5. Exit Qualification for I/O Instructions Bit Position(s) Contents 2:0 Size of access: 0 = 1-byte 1 = 2-byte 3 = 4-byte Other values not used 3 Direction of the attempted access (0 = OUT, 1 = IN) 4 String instruction (0 = not string; 1 = string) 5 REP prefixed (0 = not REP; 1 = REP) 6 Operand encoding (0 = DX, 1 = immediate) 15:7 Reserved (cleared to 0) 31:16 Port number (as specified in DX or in an immediate operand) 63:32 Reserved (cleared to 0). These bits exist only on processors that support Intel 64 architecture. Table 27-6. Exit Qualification for APIC-Access VM Exits from Linear Accesses and Guest-Physical Accesses Bit Position(s) Contents 11:0 • If the APIC-access VM exit is due to a linear access, the offset of access within the APIC page. • Undefined if the APIC-access VM exit is due a guest-physical access 27-8 Vol. 3C VM EXITS Table 27-6. Exit Qualification for APIC-Access VM Exits from Linear Accesses and Guest-Physical Accesses (Contd.) Bit Position(s) Contents 15:12 Access type: 0 = linear access for a data read during instruction execution 1 = linear access for a data write during instruction execution 2 = linear access for an instruction fetch 3 = linear access (read or write) during event delivery 10 = guest-physical access during event delivery 15 = guest-physical access for an instruction fetch or during instruction execution Other values not used 16 If the APIC-access VM exit is due to a guest-physical access, this bit is set if the access was asynchronous to instruction execution and not part of event delivery. (The bit is set if the access is related to trace output by Intel PT; see Section 25.5.4.) Otherwise, this bit is cleared. 63:17 Reserved (cleared to 0). Bits 63:32 exist only on processors that support Intel 64 architecture. Bit 12 reports “NMI unblocking due to IRET”; see Section 27.2.3. Bit 16 is set if the VM exit occurs during trace-address pre-translation (TAPT); see Section 25.5.4. — For VM exits caused as part of EOI virtualization (Section 29.1.4), bits 7:0 of the exit qualification are set to vector of the virtual interrupt that was dismissed by the EOI virtualization. Bits above bit 7 are cleared. — For APIC-write VM exits (Section 29.4.3.3), bits 11:0 of the exit qualification are set to the page offset of the write access that caused the VM exit.1 Bits above bit 11 are cleared. — For a VM exit due to a page-modification log-full event (Section 28.2.6), bit 12 of the exit qualification reports “NMI unblocking due to IRET.” Bit 16 is set if the VM exit occurs during TAPT. All other bits of the exit qualification are undefined. — For a VM exit due to an SPP-related event (Section 28.2.4), bit 11 of the exit qualification indicates the type of event: 0 indicates an SPP misconfiguration and 1 indicates an SPP miss. Bit 12 of the exit qualification reports “NMI unblocking due to IRET.” Bit 16 is set if the VM exit occurs during TAPT. All other bits of the exit qualification are undefined. • Guest linear address. For some VM exits, this field receives a linear address that pertains to the VM exit. The field is set for different VM exits as follows: — VM exits due to attempts to execute LMSW with a memory operand. In these cases, this field receives the linear address of that operand. Bits 63:32 are cleared if the logical processor was not in 64-bit mode before the VM exit. — VM exits due to attempts to execute INS or OUTS for which the relevant segment is usable (if the relevant segment is not usable, the value is undefined). (ES is always the relevant segment for INS; for OUTS, the relevant segment is DS unless overridden by an instruction prefix.) The linear address is the base address of relevant segment plus (E)DI (for INS) or (E)SI (for OUTS). Bits 63:32 are cleared if the logical processor was not in 64-bit mode before the VM exit. Table 27-7. Exit Qualification for EPT Violations Bit Position(s) Contents 0 Set if the access causing the EPT violation was a data read.1 1 Set if the access causing the EPT violation was a data write.1 2 Set if the access causing the EPT violation was an instruction fetch. 1. Execution of WRMSR with ECX = 83FH (self-IPI MSR) can lead to an APIC-write VM exit; the exit qualification for such an APIC-write VM exit is 3F0H. Vol. 3C 27-9 VM EXITS Table 27-7. Exit Qualification for EPT Violations (Contd.) Bit Position(s) Contents 3 The logical-AND of bit 0 in the EPT paging-structure entries used to translate the guest-physical address of the access causing the EPT violation (indicates whether the guest-physical address was readable).2 4 The logical-AND of bit 1 in the EPT paging-structure entries used to translate the guest-physical address of the access causing the EPT violation (indicates whether the guest-physical address was writeable). 5 The logical-AND of bit 2 in the EPT paging-structure entries used to translate the guest-physical address of the access causing the EPT violation. If the “mode-based execute control for EPT” VM-execution control is 0, this indicates whether the guest-physical address was executable. If that control is 1, this indicates whether the guest-physical address was executable for supervisor-mode linear addresses. 6 If the “mode-based execute control” VM-execution control is 0, the value of this bit is undefined. If that control is 1, this bit is the logical-AND of bit 10 in the EPT paging-structures entries used to translate the guest-physical address of the access causing the EPT violation. In this case, it indicates whether the guest-physical address was executable for user-mode linear addresses. 7 Set if the guest linear-address field is valid. The guest linear-address field is valid for all EPT violations except those resulting from an attempt to load the guest PDPTEs as part of the execution of the MOV CR instruction and those due to trace-address pre-translation (TAPT; Section 25.5.4). 8 If bit 7 is 1: • Set if the access causing the EPT violation is to a guest-physical address that is the translation of a linear address. • Clear if the access causing the EPT violation is to a paging-structure entry as part of a page walk or the update of an accessed or dirty bit. Reserved if bit 7 is 0 (cleared to 0). 9 If bit 7 is 1, bit 8 is 1, and the processor supports advanced VM-exit information for EPT violations,3 this bit is 0 if the linear address is a supervisor-mode linear address and 1 if it is a user-mode linear address. (If CR0.PG = 0, the translation of every linear address is a user-mode linear address and thus this bit will be 1.) Otherwise, this bit is undefined. 10 If bit 7 is 1, bit 8 is 1, and the processor supports advanced VM-exit information for EPT violations,3 this bit is 0 if paging translates the linear address to a read-only page and 1 if it translates to a read/write page. (If CR0.PG = 0, every linear address is read/write and thus this bit will be 1.) Otherwise, this bit is undefined. 11 If bit 7 is 1, bit 8 is 1, and the processor supports advanced VM-exit information for EPT violations,3 this bit is 0 if paging translates the linear address to an executable page and 1 if it translates to an execute-disable page. (If CR0.PG = 0, CR4.PAE = 0, or IA32_EFER.NXE = 0, every linear address is executable and thus this bit will be 0.) Otherwise, this bit is undefined. 12 NMI unblocking due to IRET (see Section 27.2.3). 15:13 Reserved (cleared to 0). 16 This bit is set if the access was asynchronous to instruction execution not the result of event delivery. (The bit is set if the access is related to trace output by Intel PT; see Section 25.5.4.) Otherwise, this bit is cleared. 63:17 Reserved (cleared to 0). NOTES: 1. If accessed and dirty flags for EPT are enabled, processor accesses to guest paging-structure entries are treated as writes with regard to EPT violations (see Section 28.2.3.2). If such an access causes an EPT violation, the processor sets both bit 0 and bit 1 of the exit qualification. 27-10 Vol. 3C VM EXITS 2. Bits 5:3 are cleared to 0 if any of EPT paging-structure entries used to translate the guest-physical address of the access causing the EPT violation is not present (see Section 28.2.2). 3. Software can determine whether advanced VM-exit information for EPT violations is supported by consulting the VMX capability MSR IA32_VMX_EPT_VPID_CAP (see Appendix A.10). — VM exits due to EPT violations that set bit 7 of the exit qualification (see Table 27-7; these are all EPT violations except those resulting from an attempt to load the PDPTEs as of execution of the MOV CR instruction and those due to TAPT). The linear address may translate to the guest-physical address whose access caused the EPT violation. Alternatively, translation of the linear address may reference a pagingstructure entry whose access caused the EPT violation. Bits 63:32 are cleared if the logical processor was not in 64-bit mode before the VM exit. If the EPT violation occurred during execution of an instruction in enclave mode (and not during delivery of an event incident to enclave mode), bits 11:0 of this field are cleared. — VM exits due to SPP-related events. — For all other VM exits, the field is undefined. • Guest-physical address. For a VM exit due to an EPT violation, an EPT misconfiguration, or an SPP-related event, this field receives the guest-physical address that caused the EPT violation or EPT misconfiguration. For all other VM exits, the field is undefined. If the EPT violation or EPT misconfiguration occurred during execution of an instruction in enclave mode (and not during delivery of an event incident to enclave mode), bits 11:0 of this field are cleared. 27.2.2 Information for VM Exits Due to Vectored Events Section 24.9.2 defines fields containing information for VM exits due to the following events: exceptions (including those generated by the instructions INT1, INT3, INTO, BOUND, UD0, UD1, and UD2); external interrupts that occur while the “acknowledge interrupt on exit” VM-exit control is 1; and non-maskable interrupts (NMIs).1 Such VM exits include those that occur on an attempt at a task switch that causes an exception before generating the VM exit due to the task switch that causes the VM exit. The following items detail the use of these fields: • VM-exit interruption information (format given in Table 24-16). The following items detail how this field is established for VM exits due to these events: — For an exception, bits 7:0 receive the exception vector (at most 31). For an NMI, bits 7:0 are set to 2. For an external interrupt, bits 7:0 receive the vector. — Bits 10:8 are set to 0 (external interrupt), 2 (non-maskable interrupt), 3 (hardware exception), 5 (privileged software exception), or 6 (software exception). Hardware exceptions comprise all exceptions except the following: • Debug exceptions (#DB) generated by the INT1 instruction; these are privileged software exceptions. (Other debug exceptions are considered hardware exceptions, as are those caused by executions of INT1 in enclave mode.) • Breakpoint exceptions (#BP; generated by INT3) and overflow exceptions (#OF; generated by INTO); these are software exceptions. (A #BP that occurs in enclave mode is considered a hardware exception.) BOUND-range exceeded exceptions (#BR; generated by BOUND) and invalid opcode exceptions (#UD) generated by UD0, UD1, and UD2 are hardware exceptions. — Bit 11 is set to 1 if the VM exit is caused by a hardware exception that would have delivered an error code on the stack. This bit is always 0 if the VM exit occurred while the logical processor was in real-address mode (CR0.PE=0).2 If bit 11 is set to 1, the error code is placed in the VM-exit interruption error code (see below). 1. INT1 and INT3 refer to the instructions with opcodes F1 and CC, respectively, and not to INT n with value 1 or 3 for n. Vol. 3C 27-11 VM EXITS — Bit 12 reports “NMI unblocking due to IRET”; see Section 27.2.3. The value of this bit is undefined if the VM exit is due to a double fault (the interruption type is hardware exception and the vector is 8). — Bits 30:13 are always set to 0. — Bit 31 is always set to 1. For other VM exits (including those due to external interrupts when the “acknowledge interrupt on exit” VM-exit control is 0), the field is marked invalid (by clearing bit 31) and the remainder of the field is undefined. • VM-exit interruption error code. — For VM exits that set both bit 31 (valid) and bit 11 (error code valid) in the VM-exit interruption-information field, this field receives the error code that would have been pushed on the stack had the event causing the VM exit been delivered normally through the IDT. The EXT bit is set in this field exactly when it would be set normally. For exceptions that occur during the delivery of double fault (if the IDT-vectoring information field indicates a double fault), the EXT bit is set to 1, assuming that (1) that the exception would produce an error code normally (if not incident to double-fault delivery) and (2) that the error code uses the EXT bit (not for page faults, which use a different format). — For other VM exits, the value of this field is undefined. 27.2.3 Information About NMI Unblocking Due to IRET A VM exit may occur during execution of the IRET instruction for reasons including the following: faults, EPT violations, page-modification log-full events, or SPP-related events. An execution of IRET that commences while non-maskable interrupts (NMIs) are blocked will unblock NMIs even if a fault or VM exit occurs; the state saved by such a VM exit will indicate that NMIs were not blocked. VM exits for the reasons enumerated above provide more information to software by saving a bit called “NMI unblocking due to IRET.” This bit is defined if (1) either the “NMI exiting” VM-execution control is 0 or the “virtual NMIs” VM-execution control is 1; (2) the VM exit does not set the valid bit in the IDT-vectoring information field (see Section 27.2.4); and (3) the VM exit is not due to a double fault. In these cases, the bit is defined as follows: • The bit is 1 if the VM exit resulted from a memory access as part of execution of the IRET instruction and one of the following holds: — The “virtual NMIs” VM-execution control is 0 and blocking by NMI (see Table 24-3) was in effect before execution of IRET. — The “virtual NMIs” VM-execution control is 1 and virtual-NMI blocking was in effect before execution of IRET. • The bit is 0 for all other relevant VM exits. For VM exits due to faults, NMI unblocking due to IRET is saved in bit 12 of the VM-exit interruption-information field (Section 27.2.2). For VM exits due to EPT violations, page-modification log-full events, and SPP-related events, NMI unblocking due to IRET is saved in bit 12 of the exit qualification (Section 27.2.1). (Executions of IRET may also incur VM exits due to APIC accesses and EPT misconfigurations. These VM exits do not report information about NMI unblocking due to IRET.) 27.2.4 Information for VM Exits During Event Delivery Section 24.9.3 defined fields containing information for VM exits that occur while delivering an event through the IDT and as a result of any of the following cases:1 2. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation, a logical processor cannot be in realaddress mode unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 1. This includes the case in which a VM exit occurs while delivering a software interrupt (INT n) through the 16-bit IVT (interrupt vector table) that is used in virtual-8086 mode with virtual-machine extensions (if RFLAGS.VM = CR4.VME = 1). 27-12 Vol. 3C VM EXITS • A fault occurs during event delivery and causes a VM exit (because the bit associated with the fault is set to 1 in the exception bitmap). • A task switch is invoked through a task gate in the IDT. The VM exit occurs due to the task switch only after the initial checks of the task switch pass (see Section 25.4.2). • • Event delivery causes an APIC-access VM exit (see Section 29.4). An EPT violation, EPT misconfiguration, page-modification log-full event, or SPP-related event that occurs during event delivery. These fields are used for VM exits that occur during delivery of events injected as part of VM entry (see Section 26.6.1.2). A VM exit is not considered to occur during event delivery in any of the following circumstances: • The original event causes the VM exit directly (for example, because the original event is a non-maskable interrupt (NMI) and the “NMI exiting” VM-execution control is 1). • • • The original event results in a double-fault exception that causes the VM exit directly. The VM exit occurred as a result of fetching the first instruction of the handler invoked by the event delivery. The VM exit is caused by a triple fault. The following items detail the use of these fields: • IDT-vectoring information (format given in Table 24-17). The following items detail how this field is established for VM exits that occur during event delivery: — If the VM exit occurred during delivery of an exception, bits 7:0 receive the exception vector (at most 31). If the VM exit occurred during delivery of an NMI, bits 7:0 are set to 2. If the VM exit occurred during delivery of an external interrupt, bits 7:0 receive the vector. — Bits 10:8 are set to indicate the type of event that was being delivered when the VM exit occurred: 0 (external interrupt), 2 (non-maskable interrupt), 3 (hardware exception), 4 (software interrupt), 5 (privileged software interrupt), or 6 (software exception). Hardware exceptions comprise all exceptions except the following:1 • Debug exceptions (#DB) generated by the INT1 instruction; these are privileged software exceptions. (Other debug exceptions are considered hardware exceptions, as are those caused by executions of INT1 in enclave mode.) • Breakpoint exceptions (#BP; generated by INT3) and overflow exceptions (#OF; generated by INTO); these are software exceptions. (A #BP that occurs in enclave mode is considered a hardware exception.) BOUND-range exceeded exceptions (#BR; generated by BOUND) and invalid opcode exceptions (#UD) generated by UD0, UD1, and UD2 are hardware exceptions. — Bit 11 is set to 1 if the VM exit occurred during delivery of a hardware exception that would have delivered an error code on the stack. This bit is always 0 if the VM exit occurred while the logical processor was in real-address mode (CR0.PE=0).2 If bit 11 is set to 1, the error code is placed in the IDT-vectoring error code (see below). — Bit 12 is undefined. — Bits 30:13 are always set to 0. — Bit 31 is always set to 1. For other VM exits, the field is marked invalid (by clearing bit 31) and the remainder of the field is undefined. • IDT-vectoring error code. 1. In the following items, INT1 and INT3 refer to the instructions with opcodes F1 and CC, respectively, and not to INT n with value 1 or 3 for n. 2. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PE must be 1 in VMX operation, a logical processor cannot be in realaddress mode unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. Vol. 3C 27-13 VM EXITS — For VM exits that set both bit 31 (valid) and bit 11 (error code valid) in the IDT-vectoring information field, this field receives the error code that would have been pushed on the stack by the event that was being delivered through the IDT at the time of the VM exit. The EXT bit is set in this field when it would be set normally. — For other VM exits, the value of this field is undefined. 27.2.5 Information for VM Exits Due to Instruction Execution Section 24.9.4 defined fields containing information for VM exits that occur due to instruction execution. (The VMexit instruction length is also used for VM exits that occur during the delivery of a software interrupt or software exception.) The following items detail their use. • VM-exit instruction length. This field is used in the following cases: — For fault-like VM exits due to attempts to execute one of the following instructions that cause VM exits unconditionally (see Section 25.1.2) or based on the settings of VM-execution controls (see Section 25.1.3): CLTS, CPUID, ENCLS, GETSEC, HLT, IN, INS, INVD, INVEPT, INVLPG, INVPCID, INVVPID, LGDT, LIDT, LLDT, LMSW, LTR, MONITOR, MOV CR, MOV DR, MWAIT, OUT, OUTS, PAUSE, RDMSR, RDPMC, RDRAND, RDSEED, RDTSC, RDTSCP, RSM, SGDT, SIDT, SLDT, STR, TPAUSE, UMWAIT, VMCALL, VMCLEAR, VMLAUNCH, VMPTRLD, VMPTRST, VMREAD, VMRESUME, VMWRITE, VMXOFF, VMXON, WBINVD, WRMSR, XRSTORS, XSETBV, and XSAVES.1 — For VM exits due to software exceptions (those generated by executions of INT3 or INTO) or privileged software exceptions (those generated by executions of INT1). — For VM exits due to faults encountered during delivery of a software interrupt, privileged software exception, or software exception. — For VM exits due to attempts to effect a task switch via instruction execution. These are VM exits that produce an exit reason indicating task switch and either of the following: • • An exit qualification indicating execution of CALL, IRET, or JMP instruction. An exit qualification indicating a task gate in the IDT and an IDT-vectoring information field indicating that the task gate was encountered during delivery of a software interrupt, privileged software exception, or software exception. — For APIC-access VM exits and for VM exits caused by EPT violations, page-modification log-full events, and SPP-related events encountered during delivery of a software interrupt, privileged software exception, or software exception.2 — For VM exits due executions of VMFUNC that fail because one of the following is true: • EAX indicates a VM function that is not enabled (the bit at position EAX is 0 in the VM-function controls; see Section 25.5.6.2). • EAX = 0 and either ECX ≥ 512 or the value of ECX selects an invalid tentative EPTP value (see Section 25.5.6.3). In all the above cases, this field receives the length in bytes (1–15) of the instruction (including any instruction prefixes) whose execution led to the VM exit (see the next paragraph for one exception). The cases of VM exits encountered during delivery of a software interrupt, privileged software exception, or software exception include those encountered during delivery of events injected as part of VM entry (see Section 26.6.1.2). If the original event was injected as part of VM entry, this field receives the value of the VMentry instruction length. All VM exits other than those listed in the above items leave this field undefined. 1. This item applies only to fault-like VM exits. It does not apply to trap-like VM exits following executions of the MOV to CR8 instruction when the “use TPR shadow” VM-execution control is 1 or to those following executions of the WRMSR instruction when the “virtualize x2APIC mode” VM-execution control is 1. 2. The VM-exit instruction-length field is not defined following APIC-access VM exits resulting from physical accesses (see Section 29.4.6) even if encountered during delivery of a software interrupt, privileged software exception, or software exception. 27-14 Vol. 3C VM EXITS If the VM exit occurred in enclave mode, this field is cleared (none of the previous items apply). Table 27-8. Format of the VM-Exit Instruction-Information Field as Used for INS and OUTS Bit Position(s) Content 6:0 Undefined. 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. 14:10 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. Undefined for VM exits due to execution of INS. 31:18 • Undefined. VM-exit instruction information. For VM exits due to attempts to execute INS, INVEPT, INVPCID, INVVPID, LIDT, LGDT, LLDT, LTR, OUTS, RDRAND, RDSEED, SIDT, SGDT, SLDT, STR, VMCLEAR, VMPTRLD, VMPTRST, VMREAD, VMWRITE, VMXON, XRSTORS, or XSAVES, this field receives information about the instruction that caused the VM exit. The format of the field depends on the identity of the instruction causing the VM exit: — For VM exits due to attempts to execute INS or OUTS, the field has the format is given in Table 27-8.1 — For VM exits due to attempts to execute INVEPT, INVPCID, or INVVPID, the field has the format is given in Table 27-9. — For VM exits due to attempts to execute LIDT, LGDT, SIDT, or SGDT, the field has the format is given in Table 27-10. — For VM exits due to attempts to execute LLDT, LTR, SLDT, or STR, the field has the format is given in Table 27-11. — For VM exits due to attempts to execute RDRAND, RDSEED, TPAUSE, or UMWAIT, the field has the format is given in Table 27-12. — For VM exits due to attempts to execute VMCLEAR, VMPTRLD, VMPTRST, VMXON, XRSTORS, or XSAVES, the field has the format is given in Table 27-13. — For VM exits due to attempts to execute VMREAD or VMWRITE, the field has the format is given in Table 27-14. For all other VM exits, the field is undefined, unless the VM exit occurred in enclave mode, in which case the field is cleared. • I/O RCX, I/O RSI, I/O RDI, I/O RIP. These fields are undefined except for SMM VM exits due to systemmanagement interrupts (SMIs) that arrive immediately after retirement of I/O instructions. See Section 34.15.2.3. Note that, if the VM exit occurred in enclave mode, these fields are all cleared. 1. The format of the field was undefined for these VM exits on the first processors to support the virtual-machine extensions. Software can determine whether the format specified in Table 27-8 is used by consulting the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). Vol. 3C 27-15 VM EXITS Table 27-9. Format of the VM-Exit Instruction-Information Field as Used for INVEPT, INVPCID, and INVVPID Bit Position(s) Content 1:0 Scaling: 0: no scaling 1: scale by 2 2: scale by 4 3: scale by 8 (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 6:2 Undefined. 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. 10 Cleared to 0. 14:11 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. 21:18 IndexReg: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 22 IndexReg invalid (0 = valid; 1 = invalid) 26:23 BaseReg (encoded as IndexReg above) Undefined for memory instructions with no base register (bit 27 is set). 27 BaseReg invalid (0 = valid; 1 = invalid) 31:28 Reg2 (same encoding as IndexReg above) Table 27-10. Format of the VM-Exit Instruction-Information Field as Used for LIDT, LGDT, SIDT, or SGDT Bit Position(s) Content 1:0 Scaling: 0: no scaling 1: scale by 2 2: scale by 4 3: scale by 8 (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 27-16 Vol. 3C VM EXITS Table 27-10. Format of the VM-Exit Instruction-Information Field as Used for LIDT, LGDT, SIDT, or SGDT (Contd.) Bit Position(s) Content 6:2 Undefined. 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. 10 Cleared to 0. 11 Operand size: 0: 16-bit 1: 32-bit Undefined for VM exits from 64-bit mode. 14:12 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. 21:18 IndexReg: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 22 IndexReg invalid (0 = valid; 1 = invalid) 26:23 BaseReg (encoded as IndexReg above) Undefined for instructions with no base register (bit 27 is set). 27 BaseReg invalid (0 = valid; 1 = invalid) 29:28 Instruction identity: 0: SGDT 1: SIDT 2: LGDT 3: LIDT 31:30 Undefined. Vol. 3C 27-17 VM EXITS Table 27-11. Format of the VM-Exit Instruction-Information Field as Used for LLDT, LTR, SLDT, and STR Bit Position(s) Content 1:0 Scaling: 0: no scaling 1: scale by 2 2: scale by 4 3: scale by 8 (used only on processors that support Intel 64 architecture) Undefined for register instructions (bit 10 is set) and for memory instructions with no index register (bit 10 is clear and bit 22 is set). 2 Undefined. 6:3 Reg1: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) Undefined for memory instructions (bit 10 is clear). 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. Undefined for register instructions (bit 10 is set). 10 Mem/Reg (0 = memory; 1 = register). 14:11 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. Undefined for register instructions (bit 10 is set). 21:18 IndexReg (encoded as Reg1 above) Undefined for register instructions (bit 10 is set) and for memory instructions with no index register (bit 10 is clear and bit 22 is set). 22 IndexReg invalid (0 = valid; 1 = invalid) Undefined for register instructions (bit 10 is set). 26:23 BaseReg (encoded as Reg1 above) Undefined for register instructions (bit 10 is set) and for memory instructions with no base register (bit 10 is clear and bit 27 is set). 27 BaseReg invalid (0 = valid; 1 = invalid) Undefined for register instructions (bit 10 is set). 27-18 Vol. 3C VM EXITS Table 27-11. Format of the VM-Exit Instruction-Information Field as Used for LLDT, LTR, SLDT, and STR (Contd.) Bit Position(s) Content 29:28 Instruction identity: 0: SLDT 1: STR 2: LLDT 3: LTR 31:30 Undefined. Table 27-12. Format of the VM-Exit Instruction-Information Field as Used for RDRAND, RDSEED, TPAUSE, and UMWAIT Bit Position(s) Content 2:0 Undefined. 6:3 Operand register (destination for RDRAND and RDSEED; source for TPAUSE and UMWAIT): 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) 10:7 Undefined. 12:11 Operand size: 0: 16-bit 1: 32-bit 2: 64-bit The value 3 is not used. 31:13 Undefined. Table 27-13. Format of the VM-Exit Instruction-Information Field as Used for VMCLEAR, VMPTRLD, VMPTRST, VMXON, XRSTORS, and XSAVES Bit Position(s) Content 1:0 Scaling: 0: no scaling 1: scale by 2 2: scale by 4 3: scale by 8 (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 6:2 Undefined. 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. 10 Cleared to 0. Vol. 3C 27-19 VM EXITS Table 27-13. Format of the VM-Exit Instruction-Information Field as Used for VMCLEAR, VMPTRLD, VMPTRST, VMXON, XRSTORS, and XSAVES (Contd.) Bit Position(s) Content 14:11 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. 21:18 IndexReg: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) Undefined for instructions with no index register (bit 22 is set). 22 IndexReg invalid (0 = valid; 1 = invalid) 26:23 BaseReg (encoded as IndexReg above) Undefined for instructions with no base register (bit 27 is set). 27 BaseReg invalid (0 = valid; 1 = invalid) 31:28 Undefined. Table 27-14. Format of the VM-Exit Instruction-Information Field as Used for VMREAD and VMWRITE Bit Position(s) Content 1:0 Scaling: 0: no scaling 1: scale by 2 2: scale by 4 3: scale by 8 (used only on processors that support Intel 64 architecture) Undefined for register instructions (bit 10 is set) and for memory instructions with no index register (bit 10 is clear and bit 22 is set). 2 27-20 Vol. 3C Undefined. VM EXITS Table 27-14. Format of the VM-Exit Instruction-Information Field as Used for VMREAD and VMWRITE (Contd.) Bit Position(s) Content 6:3 Reg1: 0 = RAX 1 = RCX 2 = RDX 3 = RBX 4 = RSP 5 = RBP 6 = RSI 7 = RDI 8–15 represent R8–R15, respectively (used only on processors that support Intel 64 architecture) Undefined for memory instructions (bit 10 is clear). 9:7 Address size: 0: 16-bit 1: 32-bit 2: 64-bit (used only on processors that support Intel 64 architecture) Other values not used. Undefined for register instructions (bit 10 is set). 10 Mem/Reg (0 = memory; 1 = register). 14:11 Undefined. 17:15 Segment register: 0: ES 1: CS 2: SS 3: DS 4: FS 5: GS Other values not used. Undefined for register instructions (bit 10 is set). 21:18 IndexReg (encoded as Reg1 above) Undefined for register instructions (bit 10 is set) and for memory instructions with no index register (bit 10 is clear and bit 22 is set). 22 IndexReg invalid (0 = valid; 1 = invalid) Undefined for register instructions (bit 10 is set). 26:23 BaseReg (encoded as Reg1 above) Undefined for register instructions (bit 10 is set) and for memory instructions with no base register (bit 10 is clear and bit 27 is set). 27 BaseReg invalid (0 = valid; 1 = invalid) Undefined for register instructions (bit 10 is set). 31:28 27.3 Reg2 (same encoding as Reg1 above) SAVING GUEST STATE VM exits save certain components of processor state into corresponding fields in the guest-state area of the VMCS (see Section 24.4). On processors that support Intel 64 architecture, the full value of each natural-width field (see Section 24.11.2) is saved regardless of the mode of the logical processor before and after the VM exit. In general, the state saved is that which was in the logical processor at the time the VM exit commences. See Section 27.1 for a discussion of which architectural updates occur at that time. Section 27.3.1 through Section 27.3.4 provide details for how various components of processor state are saved. These sections reference VMCS fields that correspond to processor state. Unless otherwise stated, these references are to fields in the guest-state area. Vol. 3C 27-21 VM EXITS 27.3.1 Saving Control Registers, Debug Registers, and MSRs Contents of certain control registers, debug registers, and MSRs is saved as follows: • The contents of CR0, CR3, CR4, and the IA32_SYSENTER_CS, IA32_SYSENTER_ESP, and IA32_SYSENTER_EIP MSRs are saved into the corresponding fields. Bits 63:32 of the IA32_SYSENTER_CS MSR are not saved. On processors that do not support Intel 64 architecture, bits 63:32 of the IA32_SYSENTER_ESP and IA32_SYSENTER_EIP MSRs are not saved. • If the “save debug controls” VM-exit control is 1, the contents of DR7 and the IA32_DEBUGCTL MSR are saved into the corresponding fields. The first processors to support the virtual-machine extensions supported only the 1-setting of this control and thus always saved data into these fields. • If the “save IA32_PAT” VM-exit control is 1, the contents of the IA32_PAT MSR are saved into the corresponding field. • If the “save IA32_EFER” VM-exit control is 1, the contents of the IA32_EFER MSR are saved into the corresponding field. • If the processor supports either the 1-setting of the “load IA32_BNDCFGS” VM-entry control or that of the “clear IA32_BNDCFGS” VM-exit control, the contents of the IA32_BNDCFGS MSR are saved into the corresponding field. • If the processor supports either the 1-setting of the “load IA32_RTIT_CTL” VM-entry control or that of the “clear IA32_RTIT_CTL” VM-exit control, the contents of the IA32_RTIT_CTL MSR are saved into the corresponding field. • The value of the SMBASE field is undefined after all VM exits except SMM VM exits. See Section 34.15.2. 27.3.2 Saving Segment Registers and Descriptor-Table Registers For each segment register (CS, SS, DS, ES, FS, GS, LDTR, or TR), the values saved for the base-address, segmentlimit, and access rights are based on whether the register was unusable (see Section 24.4.1) before the VM exit: • If the register was unusable, the values saved into the following fields are undefined: (1) base address; (2) segment limit; and (3) bits 7:0 and bits 15:12 in the access-rights field. The following exceptions apply: — CS. • • The base-address and segment-limit fields are saved. The L, D, and G bits are saved in the access-rights field. — SS. • • DPL is saved in the access-rights field. On processors that support Intel 64 architecture, bits 63:32 of the value saved for the base address are always zero. — DS and ES. On processors that support Intel 64 architecture, bits 63:32 of the values saved for the base addresses are always zero. — FS and GS. The base-address field is saved. — LDTR. The value saved for the base address is always canonical. • If the register was not unusable, the values saved into the following fields are those which were in the register before the VM exit: (1) base address; (2) segment limit; and (3) bits 7:0 and bits 15:12 in access rights. • Bits 31:17 and 11:8 in the access-rights field are always cleared. Bit 16 is set to 1 if and only if the segment is unusable. The contents of the GDTR and IDTR registers are saved into the corresponding base-address and limit fields. 27.3.3 Saving RIP, RSP, and RFLAGS The contents of the RIP, RSP, and RFLAGS registers are saved as follows: • The value saved in the RIP field is determined by the nature and cause of the VM exit: 27-22 Vol. 3C VM EXITS — If the VM exit occurred in enclave mode, the value saved is the AEP of interrupted enclave thread (the remaining items do not apply). — If the VM exit occurs due to by an attempt to execute an instruction that causes VM exits unconditionally or that has been configured to cause a VM exit via the VM-execution controls, the value saved references that instruction. — If the VM exit is caused by an occurrence of an INIT signal, a start-up IPI (SIPI), or system-management interrupt (SMI), the value saved is that which was in RIP before the event occurred. — If the VM exit occurs due to the 1-setting of either the “interrupt-window exiting” VM-execution control or the “NMI-window exiting” VM-execution control, the value saved is that which would be in the register had the VM exit not occurred. — If the VM exit is due to an external interrupt, non-maskable interrupt (NMI), or hardware exception (as defined in Section 27.2.2), the value saved is the return pointer that would have been saved (either on the stack had the event been delivered through a trap or interrupt gate,1 or into the old task-state segment had the event been delivered through a task gate). — If the VM exit is due to a triple fault, the value saved is the return pointer that would have been saved (either on the stack had the event been delivered through a trap or interrupt gate, or into the old task-state segment had the event been delivered through a task gate) had delivery of the double fault not encountered the nested exception that caused the triple fault. — If the VM exit is due to a software exception (due to an execution of INT3 or INTO) or a privileged software exception (due to an execution of INT1), the value saved references the INT3, INTO, or INT1 instruction that caused that exception. — Suppose that the VM exit is due to a task switch that was caused by execution of CALL, IRET, or JMP or by execution of a software interrupt (INT n), software exception (due to execution of INT3 or INTO), or privileged software exception (due to execution of INT1) that encountered a task gate in the IDT. The value saved references the instruction that caused the task switch (CALL, IRET, JMP, INT n, INT3, INTO, INT1). — Suppose that the VM exit is due to a task switch that was caused by a task gate in the IDT that was encountered for any reason except the direct access by a software interrupt or software exception. The value saved is that which would have been saved in the old task-state segment had the task switch completed normally. — If the VM exit is due to an execution of MOV to CR8 or WRMSR that reduced the value of bits 7:4 of VTPR (see Section 29.1.1) below that of TPR threshold VM-execution control field (see Section 29.1.2), the value saved references the instruction following the MOV to CR8 or WRMSR. — If the VM exit was caused by APIC-write emulation (see Section 29.4.3.2) that results from an APIC access as part of instruction execution, the value saved references the instruction following the one whose execution caused the APIC-write emulation. • • The contents of the RSP register are saved into the RSP field. With the exception of the resume flag (RF; bit 16), the contents of the RFLAGS register is saved into the RFLAGS field. RFLAGS.RF is saved as follows: — If the VM exit occurred in enclave mode, the value saved is 0 (the remaining items do not apply). — If the VM exit is caused directly by an event that would normally be delivered through the IDT, the value saved is that which would appear in the saved RFLAGS image (either that which would be saved on the stack had the event been delivered through a trap or interrupt gate2 or into the old task-state segment had the event been delivered through a task gate) had the event been delivered through the IDT. See below for VM exits due to task switches caused by task gates in the IDT. — If the VM exit is caused by a triple fault, the value saved is that which the logical processor would have in RF in the RFLAGS register had the triple fault taken the logical processor to the shutdown state. 1. The reference here is to the full value of RIP before any truncation that would occur had the stack width been only 32 bits or 16 bits. 2. The reference here is to the full value of RFLAGS before any truncation that would occur had the stack width been only 32 bits or 16 bits. Vol. 3C 27-23 VM EXITS — If the VM exit is caused by a task switch (including one caused by a task gate in the IDT), the value saved is that which would have been saved in the RFLAGS image in the old task-state segment (TSS) had the task switch completed normally without exception. — If the VM exit is caused by an attempt to execute an instruction that unconditionally causes VM exits or one that was configured to do with a VM-execution control, the value saved is 0.1 — For APIC-access VM exits and for VM exits caused by EPT violations, EPT misconfigurations, page-modification log-full events, or SPP-related events, the value saved depends on whether the VM exit occurred during delivery of an event through the IDT: • If the VM exit stored 0 for bit 31 for IDT-vectoring information field (because the VM exit did not occur during delivery of an event through the IDT; see Section 27.2.4), the value saved is 1. • If the VM exit stored 1 for bit 31 for IDT-vectoring information field (because the VM exit did occur during delivery of an event through the IDT), the value saved is the value that would have appeared in the saved RFLAGS image had the event been delivered through the IDT (see above). — For all other VM exits, the value saved is the value RFLAGS.RF had before the VM exit occurred. 27.3.4 Saving Non-Register State Information corresponding to guest non-register state is saved as follows: • The activity-state field is saved with the logical processor’s activity state before the VM exit.2 See Section 27.1 for details of how events leading to a VM exit may affect the activity state. • The interruptibility-state field is saved to reflect the logical processor’s interruptibility before the VM exit. — See Section 27.1 for details of how events leading to a VM exit may affect this state. — VM exits that end outside system-management mode (SMM) save bit 2 (blocking by SMI) as 0 regardless of the state of such blocking before the VM exit. — Bit 3 (blocking by NMI) is treated specially if the “virtual NMIs” VM-execution control is 1. In this case, the value saved for this field does not indicate the blocking of NMIs but rather the state of virtual-NMI blocking. — Bit 4 (enclave interruption) is set to 1 if the VM exit occurred while the logical processor was in enclave mode. Such VM exits includes those caused by interrupts, non-maskable interrupts, system-management interrupts, INIT signals, and exceptions occurring in enclave mode as well as exceptions encountered during the delivery of such events incident to enclave mode. A VM exit that is incident to delivery of an event injected by VM entry leaves this bit unmodified. • The pending debug exceptions field is saved as clear for all VM exits except the following: — A VM exit caused by an INIT signal, a machine-check exception, or a system-management interrupt (SMI). — A VM exit with basic exit reason “TPR below threshold”,3 “virtualized EOI”, “APIC write”, or “monitor trap flag.” — A VM exit due to trace-address pre-translation (TAPT; see Section 25.5.4). Such VM exits can have basic exit reason “APIC access,” “EPT violation,” “EPT misconfiguration,” “page-modification log full,” or “SPPrelated event.” When due to TAPT, these VM exits (with the exception of those due to EPT misconfigurations) set bit 16 of the exit qualification, indicating that they are asynchronous to instruction execution and not part of event delivery. 1. This is true even if RFLAGS.RF was 1 before the instruction was executed. If, in response to such a VM exit, a VM monitor re-enters the guest to re-execute the instruction that caused the VM exit (for example, after clearing the VM-execution control that caused the VM exit), the instruction may encounter a code breakpoint that has already been processed. A VM monitor can avoid this by setting the guest value of RFLAGS.RF to 1 before resuming guest software. 2. If this activity state was an inactive state resulting from execution of a specific instruction (HLT or MWAIT), the value saved for RIP by that VM exit will reference the following instruction. 3. This item includes VM exits that occur as a result of certain VM entries (Section 26.7.7). 27-24 Vol. 3C VM EXITS — VM exits that are not caused by debug exceptions and that occur while there is MOV-SS blocking of debug exceptions. For VM exits that do not clear the field, the value saved is determined as follows: — Each of bits 3:0 may be set if it corresponds to a matched breakpoint. This may be true even if the corresponding breakpoint is not enabled in DR7. — Suppose that a VM exit is due to an INIT signal, a machine-check exception, or an SMI; or that a VM exit has basic exit reason “TPR below threshold” or “monitor trap flag.” In this case, the value saved sets bits corresponding to the causes of any debug exceptions that were pending at the time of the VM exit. If the VM exit occurs immediately after VM entry, the value saved may match that which was loaded on VM entry (see Section 26.7.3). Otherwise, the following items apply: • Bit 12 (enabled breakpoint) is set to 1 in any of the following cases: — If there was at least one matched data or I/O breakpoint that was enabled in DR7. — If it had been set on VM entry, causing there to be valid pending debug exceptions (see Section 26.7.3) and the VM exit occurred before those exceptions were either delivered or lost. — If the XBEGIN instruction was executed immediately before the VM exit and advanced debugging of RTM transactional regions had been enabled (see Section 16.3.7, “RTM-Enabled Debugger Support,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). (This does not apply to VM exits with basic exit reason “monitor trap flag.”) In other cases, bit 12 is cleared to 0. • Bit 14 (BS) is set if RFLAGS.TF = 1 in either of the following cases: — IA32_DEBUGCTL.BTF = 0 and the cause of a pending debug exception was the execution of a single instruction. — IA32_DEBUGCTL.BTF = 1 and the cause of a pending debug exception was a taken branch. • Bit 16 (RTM) is set if a debug exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM region while advanced debugging of RTM transactional regions had been enabled. (This does not apply to VM exits with basic exit reason “monitor trap flag.”) — Suppose that a VM exit is due to another reason (but not a debug exception) and occurs while there is MOVSS blocking of debug exceptions. In this case, the value saved sets bits corresponding to the causes of any debug exceptions that were pending at the time of the VM exit. If the VM exit occurs immediately after VM entry (no instructions were executed in VMX non-root operation), the value saved may match that which was loaded on VM entry (see Section 26.7.3). Otherwise, the following items apply: • Bit 12 (enabled breakpoint) is set to 1 if there was at least one matched data or I/O breakpoint that was enabled in DR7. Bit 12 is also set if it had been set on VM entry, causing there to be valid pending debug exceptions (see Section 26.7.3) and the VM exit occurred before those exceptions were either delivered or lost. In other cases, bit 12 is cleared to 0. • The setting of bit 14 (BS) is implementation-specific. However, it is not set if RFLAGS.TF = 0 or IA32_DEBUGCTL.BTF = 1. — The reserved bits in the field are cleared. • If the “save VMX-preemption timer value” VM-exit control is 1, the value of timer is saved into the VMXpreemption timer-value field. This is the value loaded from this field on VM entry as subsequently decremented (see Section 25.5.1). VM exits due to timer expiration save the value 0. Other VM exits may also save the value 0 if the timer expired during VM exit. (If the “save VMX-preemption timer value” VM-exit control is 0, VM exit does not modify the value of the VMX-preemption timer-value field.) • If the logical processor supports the 1-setting of the “enable EPT” VM-execution control, values are saved into the four (4) PDPTE fields as follows: — If the “enable EPT” VM-execution control is 1 and the logical processor was using PAE paging at the time of the VM exit, the PDPTE values currently in use are saved:1 • The values saved into bits 11:9 of each of the fields is undefined. Vol. 3C 27-25 VM EXITS • If the value saved into one of the fields has bit 0 (present) clear, the value saved into bits 63:1 of that field is undefined. That value need not correspond to the value that was loaded by VM entry or to any value that might have been loaded in VMX non-root operation. • If the value saved into one of the fields has bit 0 (present) set, the value saved into bits 63:12 of the field is a guest-physical address. — If the “enable EPT” VM-execution control is 0 or the logical processor was not using PAE paging at the time of the VM exit, the values saved are undefined. 27.4 SAVING MSRS After processor state is saved to the guest-state area, values of MSRs may be stored into the VM-exit MSR-store area (see Section 24.7.2). Specifically each entry in that area (up to the number specified in the VM-exit MSR-store count) is processed in order by storing the value of the MSR indexed by bits 31:0 (as they would be read by RDMSR) into bits 127:64. Processing of an entry fails in either of the following cases: • The value of bits 31:8 is 000008H, meaning that the indexed MSR is one that allows access to an APIC register when the local APIC is in x2APIC mode. • The value of bits 31:0 indicates an MSR that can be read only in system-management mode (SMM) and the VM exit will not end in SMM. (IA32_SMBASE is an MSR that can be read only in SMM.) • The value of bits 31:0 indicates an MSR that cannot be saved on VM exits for model-specific reasons. A processor may prevent certain MSRs (based on the value of bits 31:0) from being stored on VM exits, even if they can normally be read by RDMSR. Such model-specific behavior is documented in Chapter 2, “ModelSpecific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. • • Bits 63:32 of the entry are not all 0. An attempt to read the MSR indexed by bits 31:0 would cause a general-protection exception if executed via RDMSR with CPL = 0. A VMX abort occurs if processing fails for any entry. See Section 27.7. 27.5 LOADING HOST STATE Processor state is updated on VM exits in the following ways: • • • • Some state is loaded from or otherwise determined by the contents of the host-state area. Some state is determined by VM-exit controls. Some state is established in the same way on every VM exit. The page-directory pointers are loaded based on the values of certain control registers. This loading may be performed in any order. On processors that support Intel 64 architecture, the full values of each 64-bit field loaded (for example, the base address for GDTR) is loaded regardless of the mode of the logical processor before and after the VM exit. The loading of host state is detailed in Section 27.5.1 to Section 27.5.5. These sections reference VMCS fields that correspond to processor state. Unless otherwise stated, these references are to fields in the host-state area. A logical processor is in IA-32e mode after a VM exit only if the “host address-space size” VM-exit control is 1. If the logical processor was in IA-32e mode before the VM exit and this control is 0, a VMX abort occurs. See Section 27.7. In addition to loading host state, VM exits clear address-range monitoring (Section 27.5.6). 1. A logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1 and IA32_EFER.LMA = 0. See Section 4.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. “Enable EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM exit functions as if the “enable EPT” VM-execution control were 0. See Section 24.6.2. 27-26 Vol. 3C VM EXITS After the state loading described in this section, VM exits may load MSRs from the VM-exit MSR-load area (see Section 27.6). This loading occurs only after the state loading described in this section. 27.5.1 Loading Host Control Registers, Debug Registers, MSRs VM exits load new values for controls registers, debug registers, and some MSRs: • CR0, CR3, and CR4 are loaded from the CR0 field, the CR3 field, and the CR4 field, respectively, with the following exceptions: — The following bits are not modified: • For CR0, ET, CD, NW; bits 63:32 (on processors that support Intel 64 architecture), 28:19, 17, and 15:6; and any bits that are fixed in VMX operation (see Section 23.8).1 • For CR3, bits 63:52 and bits in the range 51:32 beyond the processor’s physical-address width (they are cleared to 0).2 (This item applies only to processors that support Intel 64 architecture.) • For CR4, any bits that are fixed in VMX operation (see Section 23.8). — CR4.PAE is set to 1 if the “host address-space size” VM-exit control is 1. — CR4.PCIDE is set to 0 if the “host address-space size” VM-exit control is 0. • • DR7 is set to 400H. The following MSRs are established as follows: — The IA32_DEBUGCTL MSR is cleared to 00000000_00000000H. — The IA32_SYSENTER_CS MSR is loaded from the IA32_SYSENTER_CS field. Since that field has only 32 bits, bits 63:32 of the MSR are cleared to 0. — IA32_SYSENTER_ESP MSR and IA32_SYSENTER_EIP MSR are loaded from the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field, respectively. If the processor does not support the Intel 64 architecture, these fields have only 32 bits; bits 63:32 of the MSRs are cleared to 0. If the processor does support the Intel 64 architecture and the processor supports N < 64 linear-address bits, each of bits 63:N is set to the value of bit N–1.3 — The following steps are performed on processors that support Intel 64 architecture: • The MSRs FS.base and GS.base are loaded from the base-address fields for FS and GS, respectively (see Section 27.5.2). • The LMA and LME bits in the IA32_EFER MSR are each loaded with the setting of the “host addressspace size” VM-exit control. — If the “load IA32_PERF_GLOBAL_CTRL” VM-exit control is 1, the IA32_PERF_GLOBAL_CTRL MSR is loaded from the IA32_PERF_GLOBAL_CTRL field. Bits that are reserved in that MSR are maintained with their reserved values. — If the “load IA32_PAT” VM-exit control is 1, the IA32_PAT MSR is loaded from the IA32_PAT field. Bits that are reserved in that MSR are maintained with their reserved values. — If the “load IA32_EFER” VM-exit control is 1, the IA32_EFER MSR is loaded from the IA32_EFER field. Bits that are reserved in that MSR are maintained with their reserved values. — If the “clear IA32_BNDCFGS” VM-exit control is 1, the IA32_BNDCFGS MSR is cleared to 00000000_00000000H; otherwise, it is not modified. 1. Bits 28:19, 17, and 15:6 of CR0 and CR0.ET are unchanged by executions of MOV to CR0. CR0.ET is always 1 and the other bits are always 0. 2. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 3. Software can determine the number N by executing CPUID with 80000008H in EAX. The number of linear-address bits supported is returned in bits 15:8 of EAX. Vol. 3C 27-27 VM EXITS — If the “clear IA32_RTIT_CTL” VM-exit control is 1, the IA32_RTIT_CTL MSR is cleared to 00000000_00000000H; otherwise, it is not modified. With the exception of FS.base and GS.base, any of these MSRs is subsequently overwritten if it appears in the VM-exit MSR-load area. See Section 27.6. 27.5.2 Loading Host Segment and Descriptor-Table Registers Each of the registers CS, SS, DS, ES, FS, GS, and TR is loaded as follows (see below for the treatment of LDTR): • The selector is loaded from the selector field. The segment is unusable if its selector is loaded with zero. The checks specified Section 26.3.1.2 limit the selector values that may be loaded. In particular, CS and TR are never loaded with zero and are thus never unusable. SS can be loaded with zero only on processors that support Intel 64 architecture and only if the VM exit is to 64-bit mode (64-bit mode allows use of segments marked unusable). • The base address is set as follows: — CS. Cleared to zero. — SS, DS, and ES. Undefined if the segment is unusable; otherwise, cleared to zero. — FS and GS. Undefined (but, on processors that support Intel 64 architecture, canonical) if the segment is unusable and the VM exit is not to 64-bit mode; otherwise, loaded from the base-address field. If the processor supports the Intel 64 architecture and the processor supports N < 64 linear-address bits, each of bits 63:N is set to the value of bit N–1.1 The values loaded for base addresses for FS and GS are also manifest in the FS.base and GS.base MSRs. — TR. Loaded from the host-state area. If the processor supports the Intel 64 architecture and the processor supports N < 64 linear-address bits, each of bits 63:N is set to the value of bit N–1. • The segment limit is set as follows: — CS. Set to FFFFFFFFH (corresponding to a descriptor limit of FFFFFH and a G-bit setting of 1). — SS, DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, set to FFFFFFFFH. — TR. Set to 00000067H. • The type field and S bit are set as follows: — CS. Type set to 11 and S set to 1 (execute/read, accessed, non-conforming code segment). — SS, DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, type set to 3 and S set to 1 (read/write, accessed, expand-up data segment). — TR. Type set to 11 and S set to 0 (busy 32-bit task-state segment). • The DPL is set as follows: — CS, SS, and TR. Set to 0. The current privilege level (CPL) will be 0 after the VM exit completes. — DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, set to 0. • The P bit is set as follows: — CS, TR. Set to 1. — SS, DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, set to 1. • On processors that support Intel 64 architecture, CS.L is loaded with the setting of the “host address-space size” VM-exit control. Because the value of this control is also loaded into IA32_EFER.LMA (see Section 27.5.1), no VM exit is ever to compatibility mode (which requires IA32_EFER.LMA = 1 and CS.L = 0). • D/B. — CS. Loaded with the inverse of the setting of the “host address-space size” VM-exit control. For example, if that control is 0, indicating a 32-bit guest, CS.D/B is set to 1. 1. Software can determine the number N by executing CPUID with 80000008H in EAX. The number of linear-address bits supported is returned in bits 15:8 of EAX. 27-28 Vol. 3C VM EXITS — SS. Set to 1. — DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, set to 1. — TR. Set to 0. • G. — CS. Set to 1. — SS, DS, ES, FS, and GS. Undefined if the segment is unusable; otherwise, set to 1. — TR. Set to 0. The host-state area does not contain a selector field for LDTR. LDTR is established as follows on all VM exits: the selector is cleared to 0000H, the segment is marked unusable and is otherwise undefined (although the base address is always canonical). The base addresses for GDTR and IDTR are loaded from the GDTR base-address field and the IDTR base-address field, respectively. If the processor supports the Intel 64 architecture and the processor supports N < 64 linearaddress bits, each of bits 63:N of each base address is set to the value of bit N–1 of that base address. The GDTR and IDTR limits are each set to FFFFH. 27.5.3 Loading Host RIP, RSP, and RFLAGS RIP and RSP are loaded from the RIP field and the RSP field, respectively. RFLAGS is cleared, except bit 1, which is always set. 27.5.4 Checking and Loading Host Page-Directory-Pointer-Table Entries If CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LMA = 0, the logical processor uses PAE paging. See Section 4.4 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.1 When in PAE paging is in use, the physical address in CR3 references a table of page-directory-pointer-table entries (PDPTEs). A MOV to CR3 when PAE paging is in use checks the validity of the PDPTEs and, if they are valid, loads them into the processor (into internal, non-architectural registers). A VM exit is to a VMM that uses PAE paging if (1) bit 5 (corresponding to CR4.PAE) is set in the CR4 field in the host-state area of the VMCS; and (2) the “host address-space size” VM-exit control is 0. Such a VM exit may check the validity of the PDPTEs referenced by the CR3 field in the host-state area of the VMCS. Such a VM exit must check their validity if either (1) PAE paging was not in use before the VM exit; or (2) the value of CR3 is changing as a result of the VM exit. A VM exit to a VMM that does not use PAE paging must not check the validity of the PDPTEs. A VM exit that checks the validity of the PDPTEs uses the same checks that are used when CR3 is loaded with MOV to CR3 when PAE paging is in use. If MOV to CR3 would cause a general-protection exception due to the PDPTEs that would be loaded (e.g., because a reserved bit is set), a VMX abort occurs (see Section 27.7). If a VM exit to a VMM that uses PAE does not cause a VMX abort, the PDPTEs are loaded into the processor as would MOV to CR3, using the value of CR3 being load by the VM exit. 27.5.5 Updating Non-Register State VM exits affect the non-register state of a logical processor as follows: • • A logical processor is always in the active state after a VM exit. Event blocking is affected as follows: — There is no blocking by STI or by MOV SS after a VM exit. 1. On processors that support Intel 64 architecture, the physical-address extension may support more than 36 physical-address bits. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. Vol. 3C 27-29 VM EXITS — VM exits caused directly by non-maskable interrupts (NMIs) cause blocking by NMI (see Table 24-3). Other VM exits do not affect blocking by NMI. (See Section 27.1 for the case in which an NMI causes a VM exit indirectly.) • There are no pending debug exceptions after a VM exit. Section 28.3 describes how the VMX architecture controls how a logical processor manages information in the TLBs and paging-structure caches. The following items detail how VM exits invalidate cached mappings: • If the “enable VPID” VM-execution control is 0, the logical processor invalidates linear mappings and combined mappings associated with VPID 0000H (for all PCIDs); combined mappings for VPID 0000H are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP). • VM exits are not required to invalidate any guest-physical mappings, nor are they required to invalidate any linear mappings or combined mappings if the “enable VPID” VM-execution control is 1. 27.5.6 Clearing Address-Range Monitoring The Intel 64 and IA-32 architectures allow software to monitor a specified address range using the MONITOR and MWAIT instructions. See Section 8.10.4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. VM exits clear any address-range monitoring that may be in effect. 27.6 LOADING MSRS VM exits may load MSRs from the VM-exit MSR-load area (see Section 24.7.2). Specifically each entry in that area (up to the number specified in the VM-exit MSR-load count) is processed in order by loading the MSR indexed by bits 31:0 with the contents of bits 127:64 as they would be written by WRMSR. Processing of an entry fails in any of the following cases: • The value of bits 31:0 is either C0000100H (the IA32_FS_BASE MSR) or C0000101H (the IA32_GS_BASE MSR). • The value of bits 31:8 is 000008H, meaning that the indexed MSR is one that allows access to an APIC register when the local APIC is in x2APIC mode. • The value of bits 31:0 indicates an MSR that can be written only in system-management mode (SMM) and the VM exit will not end in SMM. (IA32_SMM_MONITOR_CTL is an MSR that can be written only in SMM.) • The value of bits 31:0 indicates an MSR that cannot be loaded on VM exits for model-specific reasons. A processor may prevent loading of certain MSRs even if they can normally be written by WRMSR. Such modelspecific behavior is documented in Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. • • Bits 63:32 are not all 0. An attempt to write bits 127:64 to the MSR indexed by bits 31:0 of the entry would cause a general-protection exception if executed via WRMSR with CPL = 0.1 If processing fails for any entry, a VMX abort occurs. See Section 27.7. If any MSR is being loaded in such a way that would architecturally require a TLB flush, the TLBs are updated so that, after VM exit, the logical processor does not use any translations that were cached before the transition. 27.7 VMX ABORTS A problem encountered during a VM exit leads to a VMX abort. A VMX abort takes a logical processor into a shutdown state as described below. 1. Note the following about processors that support Intel 64 architecture. If CR0.PG = 1, WRMSR to the IA32_EFER MSR causes a general-protection exception if it would modify the LME bit. Since CR0.PG is always 1 in VMX operation, the IA32_EFER MSR should not be included in the VM-exit MSR-load area for the purpose of modifying the LME bit. 27-30 Vol. 3C VM EXITS A VMX abort does not modify the VMCS data in the VMCS region of any active VMCS. The contents of these data are thus suspect after the VMX abort. On a VMX abort, a logical processor saves a nonzero 32-bit VMX-abort indicator field at byte offset 4 in the VMCS region of the VMCS whose misconfiguration caused the failure (see Section 24.2). The following values are used: 1. There was a failure in saving guest MSRs (see Section 27.4). 2. Host checking of the page-directory-pointer-table entries (PDPTEs) failed (see Section 27.5.4). 3. The current VMCS has been corrupted (through writes to the corresponding VMCS region) in such a way that the logical processor cannot complete the VM exit properly. 4. There was a failure on loading host MSRs (see Section 27.6). 5. There was a machine-check event during VM exit (see Section 27.8). 6. The logical processor was in IA-32e mode before the VM exit and the “host address-space size” VM-entry control was 0 (see Section 27.5). Some of these causes correspond to failures during the loading of state from the host-state area. Because the loading of such state may be done in any order (see Section 27.5) a VM exit that might lead to a VMX abort for multiple reasons (for example, the current VMCS may be corrupt and the host PDPTEs might not be properly configured). In such cases, the VMX-abort indicator could correspond to any one of those reasons. A logical processor never reads the VMX-abort indicator in a VMCS region and writes it only with one of the nonzero values mentioned above. The VMX-abort indicator allows software on one logical processor to diagnose the VMX-abort on another. For this reason, it is recommended that software running in VMX root operation zero the VMX-abort indicator in the VMCS region of any VMCS that it uses. After saving the VMX-abort indicator, operation of a logical processor experiencing a VMX abort depends on whether the logical processor is in SMX operation:1 • If the logical processor is in SMX operation, an Intel® TXT shutdown condition occurs. The error code used is 000DH, indicating “VMX abort.” See Intel® Trusted Execution Technology Measured Launched Environment Programming Guide. • If the logical processor is outside SMX operation, it issues a special bus cycle (to notify the chipset) and enters the VMX-abort shutdown state. RESET is the only event that wakes a logical processor from the VMX-abort shutdown state. The following events do not affect a logical processor in this state: machine-check events; INIT signals; external interrupts; non-maskable interrupts (NMIs); start-up IPIs (SIPIs); and systemmanagement interrupts (SMIs). 27.8 MACHINE-CHECK EVENTS DURING VM EXIT If a machine-check event occurs during VM exit, one of the following occurs: • The machine-check event is handled as if it occurred before the VM exit: — If CR4.MCE = 0, operation of the logical processor depends on whether the logical processor is in SMX operation:2 • If the logical processor is in SMX operation, an Intel® TXT shutdown condition occurs. The error code used is 000CH, indicating “unrecoverable machine-check condition.” • If the logical processor is outside SMX operation, it goes to the shutdown state. 1. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. 2. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. Vol. 3C 27-31 VM EXITS — If CR4.MCE = 1, a machine-check exception (#MC) is generated: • • • If bit 18 (#MC) of the exception bitmap is 0, the exception is delivered through the guest IDT. If bit 18 of the exception bitmap is 1, the exception causes a VM exit. The machine-check event is handled after VM exit completes: — If the VM exit ends with CR4.MCE = 0, operation of the logical processor depends on whether the logical processor is in SMX operation: • If the logical processor is in SMX operation, an Intel® TXT shutdown condition occurs with error code 000CH (unrecoverable machine-check condition). • If the logical processor is outside SMX operation, it goes to the shutdown state. — If the VM exit ends with CR4.MCE = 1, a machine-check exception (#MC) is delivered through the host IDT. • A VMX abort is generated (see Section 27.7). The logical processor blocks events as done normally in VMX abort. The VMX abort indicator is 5, for “machine-check event during VM exit.” The first option is not used if the machine-check event occurs after any host state has been loaded. The second option is used only if VM entry is able to load all host state. 27-32 Vol. 3C 20.Updates to Chapter 29, Volume 3C Change bars show changes to Chapter 29 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 31 CHAPTER 29 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS The VMCS includes controls that enable the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC). When these controls are used, the processor will emulate many accesses to the APIC, track the state of the virtual APIC, and deliver virtual interrupts — all in VMX non-root operation with out a VM exit.1 The processor tracks the state of the virtual APIC using a virtual-APIC page identified by the virtual-machine monitor (VMM). Section 29.1 discusses the virtual-APIC page and how the processor uses it to track the state of the virtual APIC. The following are the VM-execution controls relevant to APIC virtualization and virtual interrupts (see Section 24.6 for information about the locations of these controls): • Virtual-interrupt delivery. This controls enables the evaluation and delivery of pending virtual interrupts (Section 29.2). It also enables the emulation of writes (memory-mapped or MSR-based, as enabled) to the APIC registers that control interrupt prioritization. • Use TPR shadow. This control enables emulation of accesses to the APIC’s task-priority register (TPR) via CR8 (Section 29.3) and, if enabled, via the memory-mapped or MSR-based interfaces. • Virtualize APIC accesses. This control enables virtualization of memory-mapped accesses to the APIC (Section 29.4) by causing VM exits on accesses to a VMM-specified APIC-access page. Some of the other controls, if set, may cause some of these accesses to be emulated rather than causing VM exits. • Virtualize x2APIC mode. This control enables virtualization of MSR-based accesses to the APIC (Section 29.5). • APIC-register virtualization. This control allows memory-mapped and MSR-based reads of most APIC registers (as enabled) by satisfying them from the virtual-APIC page. It directs memory-mapped writes to the APIC-access page to the virtual-APIC page, following them by VM exits for VMM emulation. • Process posted interrupts. This control allows software to post virtual interrupts in a data structure and send a notification to another logical processor; upon receipt of the notification, the target processor will process the posted interrupts by copying them into the virtual-APIC page (Section 29.6). “Virtualize APIC accesses”, “virtualize x2APIC mode”, “virtual-interrupt delivery”, and “APIC-register virtualization” are all secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, the processor operates as if these controls were all 0. See Section 24.6.2. 29.1 VIRTUAL APIC STATE The virtual-APIC page is a 4-KByte region of memory that the processor uses to virtualize certain accesses to APIC registers and to manage virtual interrupts. The physical address of the virtual-APIC page is the virtual-APIC address, a 64-bit VM-execution control field in the VMCS (see Section 24.6.8). Depending on the settings of certain VM-execution controls, the processor may virtualize certain fields on the virtual-APIC page with functionality analogous to that performed by the local APIC. Section 29.1.1 identifies and defines these fields. Section 29.1.2, Section 29.1.3, Section 29.1.4, and Section 29.1.5 detail the actions taken to virtualize updates to some of these fields. 29.1.1 Virtualized APIC Registers Depending on the setting of certain VM-execution controls, a logical processor may virtualize certain accesses to APIC registers using the following fields on the virtual-APIC page: • Virtual task-priority register (VTPR): the 32-bit field located at offset 080H on the virtual-APIC page. 1. In most cases, it is not necessary for a virtual-machine monitor (VMM) to inject virtual interrupts as part of VM entry. Vol. 3C 29-1 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS • Virtual processor-priority register (VPPR): the 32-bit field located at offset 0A0H on the virtual-APIC page. • • Virtual end-of-interrupt register (VEOI): the 32-bit field located at offset 0B0H on the virtual-APIC page. • Virtual interrupt-request register (VIRR): the 256-bit value comprising eight non-contiguous 32-bit fields at offsets 200H, 210H, 220H, 230H, 240H, 250H, 260H, and 270H on the virtual-APIC page. Bit x of the VIRR is at bit position (x & 1FH) at offset (200H | ((x & E0H) » 1)). The processor uses only the low 4 bytes of each of the 16-Byte fields at offsets 200H, 210H, 220H, 230H, 240H, 250H, 260H, and 270H. • Virtual interrupt-command register (VICR_LO): the 32-bit field located at offset 300H on the virtual-APIC page • Virtual interrupt-command register (VICR_HI): the 32-bit field located at offset 310H on the virtual-APIC page. Virtual interrupt-service register (VISR): the 256-bit value comprising eight non-contiguous 32-bit fields at offsets 100H, 110H, 120H, 130H, 140H, 150H, 160H, and 170H on the virtual-APIC page. Bit x of the VISR is at bit position (x & 1FH) at offset (100H | ((x & E0H) » 1)). The processor uses only the low 4 bytes of each of the 16-byte fields at offsets 100H, 110H, 120H, 130H, 140H, 150H, 160H, and 170H. 29.1.2 TPR Virtualization The processor performs TPR virtualization in response to the following operations: (1) virtualization of the MOV to CR8 instruction; (2) virtualization of a write to offset 080H on the APIC-access page; and (3) virtualization of the WRMSR instruction with ECX = 808H. See Section 29.3, Section 29.4.3, and Section 29.5 for details of when TPR virtualization is performed. The following pseudocode details the behavior of TPR virtualization: IF “virtual-interrupt delivery” is 0 THEN IF VTPR[7:4] < TPR threshold (see Section 24.6.8) THEN cause VM exit due to TPR below threshold; FI; ELSE perform PPR virtualization (see Section 29.1.3); evaluate pending virtual interrupts (see Section 29.2.1); FI; Any VM exit caused by TPR virtualization is trap-like: the instruction causing TPR virtualization completes before the VM exit occurs (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction). 29.1.3 PPR Virtualization The processor performs PPR virtualization in response to the following operations: (1) VM entry; (2) TPR virtualization; and (3) EOI virtualization. See Section 26.3.2.5, Section 29.1.2, and Section 29.1.4 for details of when PPR virtualization is performed. PPR virtualization uses the guest interrupt status (specifically, SVI; see Section 24.4.2) and VTPR. The following pseudocode details the behavior of PPR virtualization: IF VTPR[7:4] ≥ SVI[7:4] THEN VPPR ← VTPR & FFH; ELSE VPPR ← SVI & F0H; FI; PPR virtualization always clears bytes 3:1 of VPPR. PPR virtualization is caused only by TPR virtualization, EOI virtualization, and VM entry. Delivery of a virtual interrupt also modifies VPPR, but in a different way (see Section 29.2.2). No other operations modify VPPR, even if they modify SVI, VISR, or VTPR. 29-2 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS 29.1.4 EOI Virtualization The processor performs EOI virtualization in response to the following operations: (1) virtualization of a write to offset 0B0H on the APIC-access page; and (2) virtualization of the WRMSR instruction with ECX = 80BH. See Section 29.4.3 and Section 29.5 for details of when EOI virtualization is performed. EOI virtualization occurs only if the “virtual-interrupt delivery” VM-execution control is 1. EOI virtualization uses and updates the guest interrupt status (specifically, SVI; see Section 24.4.2). The following pseudocode details the behavior of EOI virtualization: Vector ← SVI; VISR[Vector] ← 0; (see Section 29.1.1 for definition of VISR) IF any bits set in VISR THEN SVI ← highest index of bit set in VISR ELSE SVI ← 0; FI; perform PPR virtualiation (see Section 29.1.3); IF EOI_exit_bitmap[Vector] = 1 (see Section 24.6.8 for definition of EOI_exit_bitmap) THEN cause EOI-induced VM exit with Vector as exit qualification; ELSE evaluate pending virtual interrupts; (see Section 29.2.1) FI; Any VM exit caused by EOI virtualization is trap-like: the instruction causing EOI virtualization completes before the VM exit occurs (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction). 29.1.5 Self-IPI Virtualization The processor performs self-IPI virtualization in response to the following operations: (1) virtualization of a write to offset 300H on the APIC-access page; and (2) virtualization of the WRMSR instruction with ECX = 83FH. See Section 29.4.3 and Section 29.5 for details of when self-IPI virtualization is performed. Self-IPI virtualization occurs only if the “virtual-interrupt delivery” VM-execution control is 1. Each operation that leads to self-IPI virtualization provides an 8-bit vector (see Section 29.4.3 and Section 29.5). Self-IPI virtualization updates the guest interrupt status (specifically, RVI; see Section 24.4.2). The following pseudocode details the behavior of self-IPI virtualization: VIRR[Vector] ← 1; (see Section 29.1.1 for definition of VIRR) RVI ← max{RVI,Vector}; evaluate pending virtual interrupts; (see Section 29.2.1) 29.2 EVALUATION AND DELIVERY OF VIRTUAL INTERRUPTS If the “virtual-interrupt delivery” VM-execution control is 1, certain actions in VMX non-root operation or during VM entry cause the processor to evaluate and deliver virtual interrupts. Evaluation of virtual interrupts is triggered by certain actions change the state of the virtual-APIC page and is described in Section 29.2.1. This evaluation may result in recognition of a virtual interrupt. Once a virtual interrupt is recognized, the processor may deliver it within VMX non-root operation without a VM exit. Virtual-interrupt delivery is described in Section 29.2.2. 29.2.1 Evaluation of Pending Virtual Interrupts If the “virtual-interrupt delivery” VM-execution control is 1, certain actions cause a logical processor to evaluate pending virtual interrupts. The following actions cause the evaluation of pending virtual interrupts: VM entry; TPR virtualization; EOI virtualization; self-IPI virtualization; and posted-interrupt processing. See Section 26.3.2.5, Section 29.1.2, Section Vol. 3C 29-3 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS 29.1.4, Section 29.1.5, and Section 29.6 for details of when evaluation of pending virtual interrupts is performed. No other operations cause the evaluation of pending virtual interrupts, even if they modify RVI or VPPR. Evaluation of pending virtual interrupts uses the guest interrupt status (specifically, RVI; see Section 24.4.2). The following pseudocode details the evaluation of pending virtual interrupts: IF “interrupt-window exiting” is 0 AND RVI[7:4] > VPPR[7:4] (see Section 29.1.1 for definition of VPPR) THEN recognize a pending virtual interrupt; ELSE do not recognize a pending virtual interrupt; FI; Once recognized, a virtual interrupt may be delivered in VMX non-root operation; see Section 29.2.2. Evaluation of pending virtual interrupts is caused only by VM entry, TPR virtualization, EOI virtualization, self-IPI virtualization, and posted-interrupt processing. No other operations do so, even if they modify RVI or VPPR. The logical processor ceases recognition of a pending virtual interrupt following the delivery of a virtual interrupt. 29.2.2 Virtual-Interrupt Delivery If a virtual interrupt has been recognized (see Section 29.2.1), it is delivered at an instruction boundary when the following conditions all hold: (1) RFLAGS.IF = 1; (2) there is no blocking by STI; (3) there is no blocking by MOV SS or by POP SS; and (4) the “interrupt-window exiting” VM-execution control is 0. Virtual-interrupt delivery has the same priority as that of VM exits due to the 1-setting of the “interrupt-window exiting” VM-execution control.2 Thus, non-maskable interrupts (NMIs) and higher priority events take priority over delivery of a virtual interrupt; delivery of a virtual interrupt takes priority over external interrupts and lower priority events. Virtual-interrupt delivery wakes a logical processor from the same inactive activity states as would an external interrupt. Specifically, it wakes a logical processor from the states entered using the HLT and MWAIT instructions. It does not wake a logical processor in the shutdown state or in the wait-for-SIPI state. Virtual-interrupt delivery updates the guest interrupt status (both RVI and SVI; see Section 24.4.2) and delivers an event within VMX non-root operation without a VM exit. The following pseudocode details the behavior of virtualinterrupt delivery (see Section 29.1.1 for definition of VISR, VIRR, and VPPR): Vector ← RVI; VISR[Vector] ← 1; SVI ← Vector; VPPR ← Vector & F0H; VIRR[Vector] ← 0; IF any bits set in VIRR THEN RVI ← highest index of bit set in VIRR ELSE RVI ← 0; FI; deliver interrupt with Vector through IDT; cease recognition of any pending virtual interrupt; If a logical processor is in enclave mode, an Asynchronous Enclave Exit (AEX) occurs before delivery of a virtual interrupt (see Chapter 39, “Enclave Exiting Events”). 2. A logical processor never recognizes or delivers a virtual interrupt if the “interrupt-window exiting” VM-execution control is 1. Because of this, the relative priority of virtual-interrupt delivery and VM exits due to the 1-setting of that control is not defined. 29-4 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS 29.3 VIRTUALIZING CR8-BASED TPR ACCESSES In 64-bit mode, software can access the local APIC’s task-priority register (TPR) through CR8. Specifically, software uses the MOV from CR8 and MOV to CR8 instructions (see Section 10.8.6, “Task Priority in IA-32e Mode”). This section describes how these accesses can be virtualized. A virtual-machine monitor can virtualize these CR8-based APIC accesses by setting the “CR8-load exiting” and “CR8-store exiting” VM-execution controls, ensuring that the accesses cause VM exits (see Section 25.1.3). Alternatively, there are methods for virtualizing some CR8-based APIC accesses without VM exits. Normally, an execution of MOV from CR8 or MOV to CR8 that does not fault or cause a VM exit accesses the APIC’s TPR. However, such an execution are treated specially if the “use TPR shadow” VM-execution control is 1. The following items provide details: • MOV from CR8. The instruction loads bits 3:0 of its destination operand with bits 7:4 of VTPR (see Section 29.1.1). Bits 63:4 of the destination operand are cleared. • MOV to CR8. The instruction stores bits 3:0 of its source operand into bits 7:4 of VTPR; the remainder of VTPR (bits 3:0 and bits 31:8) are cleared. Following this, the processor performs TPR virtualization (see Section 29.1.2). 29.4 VIRTUALIZING MEMORY-MAPPED APIC ACCESSES When the local APIC is in xAPIC mode, software accesses the local APIC’s control registers using a memorymapped interface. Specifically, software uses linear addresses that translate to physical addresses on page frame indicated by the base address in the IA32_APIC_BASE MSR (see Section 10.4.4, “Local APIC Status and Location”). This section describes how these accesses can be virtualized. A virtual-machine monitor (VMM) can virtualize these memory-mapped APIC accesses by ensuring that any access to a linear address that would access the local APIC instead causes a VM exit. This could be done using paging or the extended page-table mechanism (EPT). Another way is by using the 1-setting of the “virtualize APIC accesses” VM-execution control. If the “virtualize APIC accesses” VM-execution control is 1, the logical processor treats specially memory accesses using linear addresses that translate to physical addresses in the 4-KByte APIC-access page.3,4 (The APICaccess page is identified by the APIC-access address, a field in the VMCS; see Section 24.6.8.) In general, an access to the APIC-access page causes an APIC-access VM exit. APIC-access VM exits provide a VMM with information about the access causing the VM exit. Section 29.4.1 discusses the priority of APIC-access VM exits. Certain VM-execution controls enable the processor to virtualize certain accesses to the APIC-access page without a VM exit. In general, this virtualization causes these accesses to be made to the virtual-APIC page instead of the APIC-access page. NOTES Unless stated otherwise, this section characterizes only linear accesses to the APIC-access page; an access to the APIC-access page is a linear access if (1) it results from a memory access using a 3. Even when addresses are translated using EPT (see Section 28.2), the determination of whether an APIC-access VM exit occurs depends on an access’s physical address, not its guest-physical address. Even when CR0.PG = 0, ordinary memory accesses by software use linear addresses; the fact that CR0.PG = 0 means only that the identity translation is used to convert linear addresses to physical (or guest-physical) addresses. 4. If EPT is enabled and there is write to a guest-physical address that translates to an address on the APIC-access page that is eligible for sub-page write permissions (see Section 28.2.4.1), the processor may treat the write as if the “virtualize APIC accesses” VM-execution control were 0 (and not apply the treatment specified in this section). For that reason, it is recommended that software not configure any guest-physical address that translates to an address on the APIC-access page to be eligible for sub-page write permissions. Vol. 3C 29-5 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS linear address; and (2) the access’s physical address is the translation of that linear address. Section 29.4.6 discusses accesses to the APIC-access page that are not linear accesses. The distinction between the APIC-access page and the virtual-APIC page allows a VMM to share paging structures or EPT paging structures among the virtual processors of a virtual machine (the shared paging structures referencing the same APIC-access address, which appears in the VMCS of all the virtual processors) while giving each virtual processor its own virtual APIC (the VMCS of each virtual processor will have a unique virtual-APIC address). Section 29.4.2 discusses when and how the processor may virtualize read accesses from the APIC-access page. Section 29.4.3 does the same for write accesses. When virtualizing a write to the APIC-access page, the processor typically takes actions in addition to passing the write through to the virtual-APIC page. The discussion in those sections uses the concept of an operation within which these memory accesses may occur. For those discussions, an “operation” can be an iteration of a REP-prefixed string instruction, an execution of any other instruction, or delivery of an event through the IDT. The 1-setting of the “virtualize APIC accesses” VM-execution control may also affect accesses to the APIC-access page that do not result directly from linear addresses. This is discussed in Section 29.4.6. Special treatment may apply to Intel SGX instructions or if the logical processor is in enclave mode. See Section 41.5.3 for details. 29.4.1 Priority of APIC-Access VM Exits The following items specify the priority of APIC-access VM exits relative to other events. • The priority of an APIC-access VM exit due to a memory access is below that of any page fault or EPT violation that that access may incur. That is, an access does not cause an APIC-access VM exit if it would cause a page fault or an EPT violation. • A memory access does not cause an APIC-access VM exit until after the accessed flags are set in the paging structures (including EPT paging structures, if enabled). • A write access does not cause an APIC-access VM exit until after the dirty flags are set in the appropriate paging structure and EPT paging structure (if enabled). • With respect to all other events, any APIC-access VM exit due to a memory access has the same priority as any page fault or EPT violation that the access could cause. (This item applies to other events that the access may generate as well as events that may be generated by other accesses by the same operation.) These principles imply, among other things, that an APIC-access VM exit may occur during the execution of a repeated string instruction (including INS and OUTS). Suppose, for example, that the first n iterations (n may be 0) of such an instruction do not access the APIC-access page and that the next iteration does access that page. As a result, the first n iterations may complete and be followed by an APIC-access VM exit. The instruction pointer saved in the VMCS references the repeated string instruction and the values of the general-purpose registers reflect the completion of n iterations. 29.4.2 Virtualizing Reads from the APIC-Access Page A read access from the APIC-access page causes an APIC-access VM exit if any of the following are true: • • • • The “use TPR shadow” VM-execution control is 0. • The access is not entirely contained within the low 4 bytes of a naturally aligned 16-byte region. That is, bits 3:2 of the access’s address are 0, and the same is true of the address of the highest byte accessed. The access is for an instruction fetch. The access is more than 32 bits in size. The access is part of an operation for which the processor has already virtualized a write to the APIC-access page. 29-6 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS If none of the above are true, whether a read access is virtualized depends on the setting of the “APIC-register virtualization” VM-execution control: • If “APIC-register virtualization” and “virtual-interrupt delivery” VM-execution controls are both 0, a read access is virtualized if its page offset is 080H (task priority); otherwise, the access causes an APIC-access VM exit. • If the “APIC-register virtualization” VM-execution control is 0 and the “virtual-interrupt delivery” VM-execution control is 1, a read access is virtualized if its page offset is 080H (task priority), 0B0H (end of interrupt), or 300H (interrupt command — low); otherwise, the access causes an APIC-access VM exit. • If “APIC-register virtualization” is 1, a read access is virtualized if it is entirely within one the following ranges of offsets: — 020H–023H (local APIC ID); — 030H–033H (local APIC version); — 080H–083H (task priority); — 0B0H–0B3H (end of interrupt); — 0D0H–0D3H (logical destination); — 0E0H–0E3H (destination format); — 0F0H–0F3H (spurious-interrupt vector); — 100H–103H, 110H–113H, 120H–123H, 130H–133H, 140H–143H, 150H–153H, 160H–163H, or 170H– 173H (in-service); — 180H–183H, 190H–193H, 1A0H–1A3H, 1B0H–1B3H, 1C0H–1C3H, 1D0H–1D3H, 1E0H–1E3H, or 1F0H– 1F3H (trigger mode); — 200H–203H, 210H–213H, 220H–223H, 230H–233H, 240H–243H, 250H–253H, 260H–263H, or 270H– 273H (interrupt request); — 280H–283H (error status); — 300H–303H or 310H–313H (interrupt command); — 320H–323H, 330H–333H, 340H–343H, 350H–353H, 360H–363H, or 370H–373H (LVT entries); — 380H–383H (initial count); or — 3E0H–3E3H (divide configuration). In all other cases, the access causes an APIC-access VM exit. A read access from the APIC-access page that is virtualized returns data from the corresponding page offset on the virtual-APIC page.5 29.4.3 Virtualizing Writes to the APIC-Access Page Whether a write access to the APIC-access page is virtualized depends on the settings of the VM-execution controls and the page offset of the access. Section 29.4.3.1 details when APIC-write virtualization occurs. Unlike reads, writes to the local APIC have side effects; because of this, virtualization of writes to the APIC-access page may require emulation specific to the access’s page offset (which identifies the APIC register being accessed). Section 29.4.3.2 describes this APIC-write emulation. For some page offsets, it is necessary for software to complete the virtualization after a write completes. In these cases, the processor causes an APIC-write VM exit to invoke VMM software. Section 29.4.3.3 discusses APICwrite VM exits. 29.4.3.1 Determining Whether a Write Access is Virtualized A write access to the APIC-access page causes an APIC-access VM exit if any of the following are true: 5. The memory type used for accesses that read from the virtual-APIC page is reported in bits 53:50 of the IA32_VMX_BASIC MSR (see Appendix A.1). Vol. 3C 29-7 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS • • • The “use TPR shadow” VM-execution control is 0. • The access is not entirely contained within the low 4 bytes of a naturally aligned 16-byte region. That is, bits 3:2 of the access’s address are 0, and the same is true of the address of the highest byte accessed. The access is more than 32 bits in size. The access is part of an operation for which the processor has already virtualized a write (with a different page offset or a different size) to the APIC-access page. If none of the above are true, whether a write access is virtualized depends on the settings of the “APIC-register virtualization” and “virtual-interrupt delivery” VM-execution controls: • If the “APIC-register virtualization” and “virtual-interrupt delivery” VM-execution controls are both 0, a write access is virtualized if its page offset is 080H; otherwise, the access causes an APIC-access VM exit. • If the “APIC-register virtualization” VM-execution control is 0 and the “virtual-interrupt delivery” VM-execution control is 1, a write access is virtualized if its page offset is 080H (task priority), 0B0H (end of interrupt), and 300H (interrupt command — low); otherwise, the access causes an APIC-access VM exit. • If the “APIC-register virtualization” VM-execution control is 1, a write access is virtualized if it is entirely within one the following ranges of offsets: — 020H–023H (local APIC ID); — 080H–083H (task priority); — 0B0H–0B3H (end of interrupt); — 0D0H–0D3H (logical destination); — 0E0H–0E3H (destination format); — 0F0H–0F3H (spurious-interrupt vector); — 280H–283H (error status); — 300H–303H or 310H–313H (interrupt command); — 320H–323H, 330H–333H, 340H–343H, 350H–353H, 360H–363H, or 370H–373H (LVT entries); — 380H–383H (initial count); or — 3E0H–3E3H (divide configuration). In all other cases, the access causes an APIC-access VM exit. The processor virtualizes a write access to the APIC-access page by writing data to the corresponding page offset on the virtual-APIC page.6 Following this, the processor performs certain actions after completion of the operation of which the access was a part.7 APIC-write emulation is described in Section 29.4.3.2. 29.4.3.2 APIC-Write Emulation If the processor virtualizes a write access to the APIC-access page, it performs additional actions after completion of an operation of which the access was a part. These actions are called APIC-write emulation. The details of APIC-write emulation depend upon the page offset of the virtualized write access:8 • 080H (task priority). The processor clears bytes 3:1 of VTPR and then causes TPR virtualization (Section 29.1.2). • 0B0H (end of interrupt). If the “virtual-interrupt delivery” VM-execution control is 1, the processor clears VEOI and then causes EOI virtualization (Section 29.1.4); otherwise, the processor causes an APIC-write VM exit (Section 29.4.3.3). 6. The memory type used for accesses that write to the virtual-APIC page is reported in bits 53:50 of the IA32_VMX_BASIC MSR (see Appendix A.1). 7. Recall that, for the purposes of this discussion, an operation is an iteration of a REP-prefixed string instruction, an execution of any other instruction, or delivery of an event through the IDT. 8. For any operation, there can be only one page offset for which a write access was virtualized. This is because a write access is not virtualized if the processor has already virtualized a write access for the same operation with a different page offset. 29-8 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS • 300H (interrupt command — low). If the “virtual-interrupt delivery” VM-execution control is 1, the processor checks the value of VICR_LO to determine whether the following are all true: — Reserved bits (31:20, 17:16, 13) and bit 12 (delivery status) are all 0. — Bits 19:18 (destination shorthand) are 01B (self). — Bit 15 (trigger mode) is 0 (edge). — Bits 10:8 (delivery mode) are 000B (fixed). — Bits 7:4 (the upper half of the vector) are not 0000B. If all of the items above are true, the processor performs self-IPI virtualization using the 8-bit vector in byte 0 of VICR_LO (Section 29.1.5). If the “virtual-interrupt delivery” VM-execution control is 0, or if any of the items above are false, the processor causes an APIC-write VM exit (Section 29.4.3.3). • 310H–313H (interrupt command — high). The processor clears bytes 2:0 of VICR_HI. No other virtualization or VM exit occurs. • Any other page offset. The processor causes an APIC-write VM exit (Section 29.4.3.3). APIC-write emulation takes priority over system-management interrupts (SMIs), INIT signals, and lower priority events. APIC-write emulation is not blocked if RFLAGS.IF = 0 or by the MOV SS, POP SS, or STI instructions. If an operation causes a fault after a write access to the APIC-access page and before APIC-write emulation, and that fault is delivered without a VM exit, APIC-write emulation occurs after the fault is delivered and before the fault handler can execute. If an operation causes a VM exit (perhaps due to a fault) after a write access to the APICaccess page and before APIC-write emulation, the APIC-write emulation does not occur. 29.4.3.3 APIC-Write VM Exits In certain cases, VMM software must be invoked to complete the virtualization of a write access to the APIC-access page. In this case, APIC-write emulation causes an APIC-write VM exit. (Section 29.4.3.2 details the cases that causes APIC-write VM exits.) APIC-write VM exits are invoked by APIC-write emulation, and APIC-write emulation occurs after an operation that performs a write access to the APIC-access page. Because of this, every APIC-write VM exit is trap-like: it occurs after completion of the operation containing the write access that caused the VM exit (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction). The basic exit reason for an APIC-write VM exit is “APIC write.” The exit qualification is the page offset of the write access that led to the VM exit. As noted in Section 29.5, execution of WRMSR with ECX = 83FH (self-IPI MSR) can lead to an APIC-write VM exit if the “virtual-interrupt delivery” VM-execution control is 1. The exit qualification for such an APIC-write VM exit is 3F0H. 29.4.4 Instruction-Specific Considerations Certain instructions that use linear address may cause page faults even though they do not use those addresses to access memory. The APIC-virtualization features may affect these instructions as well: • CLFLUSH, CLFLUSHOPT. With regard to faulting, the processor operates as if each of these instructions reads from the linear address in its source operand. If that address translates to one on the APIC-access page, the instruction may cause an APIC-access VM exit. If it does not, it will flush the corresponding cache line on the virtual-APIC page instead of the APIC-access page. • ENTER. With regard to faulting, the processor operates if ENTER writes to the byte referenced by the final value of the stack pointer (even though it does not if its size operand is non-zero). If that value translates to an address on the APIC-access page, the instruction may cause an APIC-access VM exit. If it does not, it will cause the APIC-write emulation appropriate to the address’s page offset. Vol. 3C 29-9 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS • MASKMOVQ and MASKMOVDQU. Even if the instruction’s mask is zero, the processor may operate with regard to faulting as if MASKMOVQ or MASKMOVDQU writes to memory (the behavior is implementationspecific). In such a situation, an APIC-access VM exit may occur. • MONITOR. With regard to faulting, the processor operates as if MONITOR reads from the effective address in RAX. If the resulting linear address translates to one on the APIC-access page, the instruction may cause an APIC-access VM exit.9 If it does not, it will monitor the corresponding address on the virtual-APIC page instead of the APIC-access page. • PREFETCH. An execution of the PREFETCH instruction that would result in an access to the APIC-access page does not cause an APIC-access VM exit. Such an access may prefetch data; if so, it is from the corresponding address on the virtual-APIC page. Virtualization of accesses to the APIC-access page is principally intended for basic instructions such as AND, MOV, OR, TEST, XCHG, and XOR. Use of an instruction that normally operates on floating-point, SSE, AVX, or AVX-512 registers may cause an APIC-access VM exit unconditionally regardless of the page offset it accesses on the APICaccess page. 29.4.5 Issues Pertaining to Page Size and TLB Management The 1-setting of the “virtualize APIC accesses” VM-execution is guaranteed to apply only if translations to the APICaccess address use a 4-KByte page. The following items provide details: • If EPT is not in use, any linear address that translates to an address on the APIC-access page should use a 4KByte page. Any access to a linear address that translates to the APIC-access page using a larger page may operate as if the “virtualize APIC accesses” VM-execution control were 0. • If EPT is in use, any guest-physical address that translates to an address on the APIC-access page should use a 4-KByte page. Any access to a linear address that translates to a guest-physical address that in turn translates to the APIC-access page using a larger page may operate as if the “virtualize APIC accesses” VM-execution control were 0. (This is true also for guest-physical accesses to the APIC-access page; see Section 29.4.6.1.) In addition, software should perform appropriate TLB invalidation when making changes that may affect APICvirtualization. The specifics depend on whether VPIDs or EPT is being used: • VPIDs being used but EPT not being used. Suppose that there is a VPID that has been used before and that software has since made either of the following changes: (1) set the “virtualize APIC accesses” VM-execution control when it had previously been 0; or (2) changed the paging structures so that some linear address translates to the APIC-access address when it previously did not. In that case, software should execute INVVPID (see “INVVPID— Invalidate Translations Based on VPID” in Section 30.3) before performing on the same logical processor and with the same VPID.10 • EPT being used. Suppose that there is an EPTP value that has been used before and that software has since made either of the following changes: (1) set the “virtualize APIC accesses” VM-execution control when it had previously been 0; or (2) changed the EPT paging structures so that some guest-physical address translates to the APIC-access address when it previously did not. In that case, software should execute INVEPT (see “INVEPT— Invalidate Translations Derived from EPT” in Section 30.3) before performing on the same logical processor and with the same EPTP value.11 • Neither VPIDs nor EPT being used. No invalidation is required. Failure to perform the appropriate TLB invalidation may result in the logical processor operating as if the “virtualize APIC accesses” VM-execution control were 0 in responses to accesses to the affected address. (No invalidation is necessary if neither VPIDs nor EPT is being used.) 9. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For IA-32 processors, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to lower 32 bits of the indicated register. 10. INVVPID should use either (1) the all-contexts INVVPID type; (2) the single-context INVVPID type with the VPID in the INVVPID descriptor; or (3) the individual-address INVVPID type with the linear address and the VPID in the INVVPID descriptor. 11. INVEPT should use either (1) the global INVEPT type; or (2) the single-context INVEPT type with the EPTP value in the INVEPT descriptor. 29-10 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS 29.4.6 APIC Accesses Not Directly Resulting From Linear Addresses Section 29.4 has described the treatment of accesses that use linear addresses that translate to addresses on the APIC-access page. This section considers memory accesses that do not result directly from linear addresses. • An access is called a guest-physical access if (1) CR0.PG = 1;12 (2) the “enable EPT” VM-execution control is 1;13 (3) the access’s physical address is the result of an EPT translation; and (4) either (a) the access was not generated by a linear address; or (b) the access’s guest-physical address is not the translation of the access’s linear address. Section 29.4.6.1 discusses the treatment of guest-physical accesses to the APICaccess page. • An access is called a physical access if (1) either (a) the “enable EPT” VM-execution control is 0; or (b) the access’s physical address is not the result of a translation through the EPT paging structures; and (2) either (a) the access is not generated by a linear address; or (b) the access’s physical address is not the translation of its linear address. Section 29.4.6.2 discusses the treatment of physical accesses to the APIC-access page. 29.4.6.1 Guest-Physical Accesses to the APIC-Access Page Guest-physical accesses include the following when guest-physical addresses are being translated using EPT: • Reads from the guest paging structures when translating a linear address (such an access uses a guestphysical address that is not the translation of that linear address). • Loads of the page-directory-pointer-table entries by MOV to CR when the logical processor is using (or that causes the logical processor to use) PAE paging (see Section 4.4). • Updates to the accessed and dirty flags in the guest paging structures when using a linear address (such an access uses a guest-physical address that is not the translation of that linear address). • Memory accesses by Intel Processor Trace when the “Intel PT uses guest physical addresses” VM-execution control is 1 (see Section 25.5.4). Every guest-physical access using a guest-physical address that translates to an address on the APIC-access page causes an APIC-access VM exit. Such accesses are never virtualized regardless of the page offset. The following items specify the priority relative to other events of APIC-access VM exits caused by guest-physical accesses to the APIC-access page. • The priority of an APIC-access VM exit caused by a guest-physical access to memory is below that of any EPT violation that that access may incur. That is, a guest-physical access does not cause an APIC-access VM exit if it would cause an EPT violation. • With respect to all other events, any APIC-access VM exit caused by a guest-physical access has the same priority as any EPT violation that the guest-physical access could cause. 29.4.6.2 Physical Accesses to the APIC-Access Page Physical accesses include the following: • If the “enable EPT” VM-execution control is 0: — Reads from the paging structures when translating a linear address. — Loads of the page-directory-pointer-table entries by MOV to CR when the logical processor is using (or that causes the logical processor to use) PAE paging (see Section 4.4). — Updates to the accessed and dirty flags in the paging structures. • If the “enable EPT” VM-execution control is 1, accesses to the EPT paging structures (including updates to the accessed and dirty flags for EPT). • Any of the following accesses made by the processor to support VMX non-root operation: 12. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PG must be 1 in VMX operation, CR0.PG must be 1 unless the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 13. “Enable EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the “enable EPT” VM-execution control were 0. See Section 24.6.2. Vol. 3C 29-11 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS — Accesses to the VMCS region. — Accesses to data structures referenced (directly or indirectly) by physical addresses in VM-execution control fields in the VMCS. These include the I/O bitmaps, the MSR bitmaps, and the virtual-APIC page. • Accesses that effect transitions into and out of SMM.14 These include the following: — Accesses to SMRAM during SMI delivery and during execution of RSM. — Accesses during SMM VM exits (including accesses to MSEG) and during VM entries that return from SMM. A physical access to the APIC-access page may or may not cause an APIC-access VM exit. If it does not cause an APIC-access VM exit, it may access the APIC-access page or the virtual-APIC page. Physical write accesses to the APIC-access page may or may not cause APIC-write emulation or APIC-write VM exits. The priority of an APIC-access VM exit caused by physical access is not defined relative to other events that the access may cause. It is recommended that software not set the APIC-access address to any of the addresses used by physical memory accesses (identified above). For example, it should not set the APIC-access address to the physical address of any of the active paging structures if the “enable EPT” VM-execution control is 0. 29.5 VIRTUALIZING MSR-BASED APIC ACCESSES When the local APIC is in x2APIC mode, software accesses the local APIC’s control registers using the MSR interface. Specifically, software uses the RDMSR and WRMSR instructions, setting ECX (identifying the MSR being accessed) to values in the range 800H–8FFH (see Section 10.12, “Extended XAPIC (x2APIC)”). This section describes how these accesses can be virtualized. A virtual-machine monitor can virtualize these MSR-based APIC accesses by configuring the MSR bitmaps (see Section 24.6.9) to ensure that the accesses cause VM exits (see Section 25.1.3). Alternatively, there are methods for virtualizing some MSR-based APIC accesses without VM exits. Normally, an execution of RDMSR or WRMSR that does not fault or cause a VM exit accesses the MSR indicated in ECX. However, such an execution treats some values of ECX in the range 800H–8FFH specially if the “virtualize x2APIC mode” VM-execution control is 1. The following items provide details: • RDMSR. The instruction’s behavior depends on the setting of the “APIC-register virtualization” VM-execution control. — If the “APIC-register virtualization” VM-execution control is 0, behavior depends upon the value of ECX. • If ECX contains 808H (indicating the TPR MSR), the instruction reads the 8 bytes from offset 080H on the virtual-APIC page (VTPR and the 4 bytes above it) into EDX:EAX. This occurs even if the local APIC is not in x2APIC mode (no general-protection fault occurs because the local APIC is not x2APIC mode). • If ECX contains any other value in the range 800H–8FFH, the instruction operates normally. If the local APIC is in x2APIC mode and ECX indicates a readable APIC register, EDX and EAX are loaded with the value of that register. If the local APIC is not in x2APIC mode or ECX does not indicate a readable APIC register, a general-protection fault occurs. — If “APIC-register virtualization” is 1 and ECX contains a value in the range 800H–8FFH, the instruction reads the 8 bytes from offset X on the virtual-APIC page into EDX:EAX, where X = (ECX & FFH) « 4. This occurs even if the local APIC is not in x2APIC mode (no general-protection fault occurs because the local APIC is not in x2APIC mode). • WRMSR. The instruction’s behavior depends on the value of ECX and the setting of the “virtual-interrupt delivery” VM-execution control. Special processing applies in the following cases: (1) ECX contains 808H (indicating the TPR MSR); (2) ECX contains 80BH (indicating the EOI MSR) and the “virtual-interrupt delivery” VM-execution control is 1; and (3) ECX contains 83FH (indicating the self-IPI MSR) and the “virtual-interrupt delivery” VM-execution control is 1. 14. Technically, these accesses do not occur in VMX non-root operation. They are included here for clarity. 29-12 Vol. 3C APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS If special processing applies, no general-protection exception is produced due to the fact that the local APIC is in xAPIC mode. However, WRMSR does perform the normal reserved-bit checking: — If ECX contains 808H or 83FH, a general-protection fault occurs if either EDX or EAX[31:8] is non-zero. — If ECX contains 80BH, a general-protection fault occurs if either EDX or EAX is non-zero. If there is no fault, WRMSR stores EDX:EAX at offset X on the virtual-APIC page, where X = (ECX & FFH) « 4. Following this, the processor performs an operation depending on the value of ECX: — If ECX contains 808H, the processor performs TPR virtualization (see Section 29.1.2). — If ECX contains 80BH, the processor performs EOI virtualization (see Section 29.1.4). — If ECX contains 83FH, the processor then checks the value of EAX[7:4] and proceeds as follows: • If the value is non-zero, the logical processor performs self-IPI virtualization with the 8-bit vector in EAX[7:0] (see Section 29.1.5). • If the value is zero, the logical processor causes an APIC-write VM exit as if there had been a write access to page offset 3F0H on the APIC-access page (see Section 29.4.3.3). If special processing does not apply, the instruction operates normally. If the local APIC is in x2APIC mode and ECX indicates a writable APIC register, the value in EDX:EAX is written to that register. If the local APIC is not in x2APIC mode or ECX does not indicate a writable APIC register, a general-protection fault occurs. 29.6 POSTED-INTERRUPT PROCESSING Posted-interrupt processing is a feature by which a processor processes the virtual interrupts by recording them as pending on the virtual-APIC page. Posted-interrupt processing is enabled by setting the “process posted interrupts” VM-execution control. The processing is performed in response to the arrival of an interrupt with the posted-interrupt notification vector. In response to such an interrupt, the processor processes virtual interrupts recorded in a data structure called a posted-interrupt descriptor. The posted-interrupt notification vector and the address of the posted-interrupt descriptor are fields in the VMCS; see Section 24.6.8. If the “process posted interrupts” VM-execution control is 1, a logical processor uses a 64-byte posted-interrupt descriptor located at the posted-interrupt descriptor address. The posted-interrupt descriptor has the following format: Table 29-1. Format of Posted-Interrupt Descriptor Bit Position(s) Name Description 255:0 Posted-interrupt requests One bit for each interrupt vector. There is a posted-interrupt request for a vector if the corresponding bit is 1 256 Outstanding notification If this bit is set, there is a notification outstanding for one or more posted interrupts in bits 255:0 511:257 Reserved for software and other agents These bits may be used by software and by other agents in the system (e.g., chipset). The processor does not modify these bits. The notation PIR (posted-interrupt requests) refers to the 256 posted-interrupt bits in the posted-interrupt descriptor. Use of the posted-interrupt descriptor differs from that of other data structures that are referenced by pointers in a VMCS. There is a general requirement that software ensure that each such data structure is modified only when no logical processor with a current VMCS that references it is in VMX non-root operation. That requirement does not apply to the posted-interrupt descriptor. There is a requirement, however, that such modifications be done using locked read-modify-write instructions. If the “external-interrupt exiting” VM-execution control is 1, any unmasked external interrupt causes a VM exit (see Section 25.2). If the “process posted interrupts” VM-execution control is also 1, this behavior is changed and the processor handles an external interrupt as follows:15 Vol. 3C 29-13 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS 1. The local APIC is acknowledged; this provides the processor core with an interrupt vector, called here the physical vector. 2. If the physical vector equals the posted-interrupt notification vector, the logical processor continues to the next step. Otherwise, a VM exit occurs as it would normally due to an external interrupt; the vector is saved in the VM-exit interruption-information field. 3. The processor clears the outstanding-notification bit in the posted-interrupt descriptor. This is done atomically so as to leave the remainder of the descriptor unmodified (e.g., with a locked AND operation). 4. The processor writes zero to the EOI register in the local APIC; this dismisses the interrupt with the postedinterrupt notification vector from the local APIC. 5. The logical processor performs a logical-OR of PIR into VIRR and clears PIR. No other agent can read or write a PIR bit (or group of bits) between the time it is read (to determine what to OR into VIRR) and when it is cleared. 6. The logical processor sets RVI to be the maximum of the old value of RVI and the highest index of all bits that were set in PIR; if no bit was set in PIR, RVI is left unmodified. 7. The logical processor evaluates pending virtual interrupts as described in Section 29.2.1. The logical processor performs the steps above in an uninterruptible manner. If step #7 leads to recognition of a virtual interrupt, the processor may deliver that interrupt immediately. Steps #1 to #7 above occur when the interrupt controller delivers an unmasked external interrupt to the CPU core. The following items consider certain cases of interrupt delivery: • Interrupt delivery can occur between iterations of a REP-prefixed instruction (after at least one iteration has completed but before all iterations have completed). If this occurs, the following items characterize processor state after posted-interrupt processing completes and before guest execution resumes: — RIP references the REP-prefixed instruction; — RCX, RSI, and RDI are updated to reflect the iterations completed; and — RFLAGS.RF = 1. • Interrupt delivery can occur when the logical processor is in the active, HLT, or MWAIT states. If the logical processor had been in the active or MWAIT state before the arrival of the interrupt, it is in the active state following completion of step #7; if it had been in the HLT state, it returns to the HLT state after step #7 (if a pending virtual interrupt was recognized, the logical processor may immediately wake from the HLT state). • Interrupt delivery can occur while the logical processor is in enclave mode. If the logical processor had been in enclave mode before the arrival of the interrupt, an Asynchronous Enclave Exit (AEX) may occur before the steps #1 to #7 (see Chapter 39, “Enclave Exiting Events”). If no AEX occurs before step #1 and a VM exit occurs at step #2, an AEX occurs before the VM exit is delivered. 15. VM entry ensures that the “process posted interrupts” VM-execution control is 1 only if the “external-interrupt exiting” VM-execution control is also 1. SeeSection 26.2.1.1. 29-14 Vol. 3C 21.Updates to Chapter 30, Volume 3C Change bars show changes to Chapter 30 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to this chapter: Added operand encoding to instructions. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 27 CHAPTER 30 VMX INSTRUCTION REFERENCE NOTE This chapter was previously located in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B as chapter 5. 30.1 OVERVIEW This chapter describes the virtual-machine extensions (VMX) for the Intel 64 and IA-32 architectures. VMX is intended to support virtualization of processor hardware and a system software layer acting as a host to multiple guest software environments. The virtual-machine extensions (VMX) includes five instructions that manage the virtual-machine control structure (VMCS), four instructions that manage VMX operation, two TLB-management instructions, and two instructions for use by guest software. Additional details of VMX are described in Chapter 23 through Chapter 29. The behavior of the VMCS-maintenance instructions is summarized below: • VMPTRLD — This instruction takes a single 64-bit source operand that is in memory. It makes the referenced VMCS active and current, loading the current-VMCS pointer with this operand and establishes the current VMCS based on the contents of VMCS-data area in the referenced VMCS region. Because this makes the referenced VMCS active, a logical processor may start maintaining on the processor some of the VMCS data for the VMCS. • VMPTRST — This instruction takes a single 64-bit destination operand that is in memory. The current-VMCS pointer is stored into the destination operand. • VMCLEAR — This instruction takes a single 64-bit operand that is in memory. The instruction sets the launch state of the VMCS referenced by the operand to “clear”, renders that VMCS inactive, and ensures that data for the VMCS have been written to the VMCS-data area in the referenced VMCS region. If the operand is the same as the current-VMCS pointer, that pointer is made invalid. • VMREAD — This instruction reads a component from a VMCS (the encoding of that field is given in a register operand) and stores it into a destination operand that may be a register or in memory. • VMWRITE — This instruction writes a component to a VMCS (the encoding of that field is given in a register operand) from a source operand that may be a register or in memory. The behavior of the VMX management instructions is summarized below: • VMLAUNCH — This instruction launches a virtual machine managed by the VMCS. A VM entry occurs, transferring control to the VM. • VMRESUME — This instruction resumes a virtual machine managed by the VMCS. A VM entry occurs, transferring control to the VM. • • VMXOFF — This instruction causes the processor to leave VMX operation. VMXON — This instruction takes a single 64-bit source operand that is in memory. It causes a logical processor to enter VMX root operation and to use the memory referenced by the operand to support VMX operation. The behavior of the VMX-specific TLB-management instructions is summarized below: • INVEPT — This instruction invalidates entries in the TLBs and paging-structure caches that were derived from extended page tables (EPT). • INVVPID — This instruction invalidates entries in the TLBs and paging-structure caches based on a VirtualProcessor Identifier (VPID). None of the instructions above can be executed in compatibility mode; they generate invalid-opcode exceptions if executed in compatibility mode. The behavior of the guest-available instructions is summarized below: • VMCALL — This instruction allows software in VMX non-root operation to call the VMM for service. A VM exit occurs, transferring control to the VMM. Vol. 3C 30-1 VMX INSTRUCTION REFERENCE • VMFUNC — This instruction allows software in VMX non-root operation to invoke a VM function (processor functionality enabled and configured by software in VMX root operation) without a VM exit. 30.2 CONVENTIONS The operation sections for the VMX instructions in Section 30.3 use the pseudo-function VMexit, which indicates that the logical processor performs a VM exit. The operation sections also use the pseudo-functions VMsucceed, VMfail, VMfailInvalid, and VMfailValid. These pseudo-functions signal instruction success or failure by setting or clearing bits in RFLAGS and, in some cases, by writing the VM-instruction error field. The following pseudocode fragments detail these functions: VMsucceed: CF ← 0; PF ← 0; AF ← 0; ZF ← 0; SF ← 0; OF ← 0; VMfail(ErrorNumber): IF VMCS pointer is valid THEN VMfailValid(ErrorNumber); ELSE VMfailInvalid; FI; VMfailInvalid: CF ← 1; PF ← 0; AF ← 0; ZF ← 0; SF ← 0; OF ← 0; VMfailValid(ErrorNumber):// executed only if there is a current VMCS CF ← 0; PF ← 0; AF ← 0; ZF ← 1; SF ← 0; OF ← 0; Set the VM-instruction error field to ErrorNumber; The different VM-instruction error numbers are enumerated in Section 30.4, “VM Instruction Error Numbers”. 30.3 VMX INSTRUCTIONS This section provides detailed descriptions of the VMX instructions. 30-2 Vol. 3C VMX INSTRUCTION REFERENCE INVEPT— Invalidate Translations Derived from EPT Opcode/ Instruction Op/En Description 66 0F 38 80 RM Invalidates EPT-derived entries in the TLBs and paging-structure caches (in 64-bit mode). RM Invalidates EPT-derived entries in the TLBs and paging-structure caches (outside 64-bit mode). INVEPT r64, m128 66 0F 38 80 INVEPT r32, m128 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 RM ModRM:reg (r) ModRM:r/m (r) NA NA Description Invalidates mappings in the translation lookaside buffers (TLBs) and paging-structure caches that were derived from extended page tables (EPT). (See Chapter 28, “VMX Support for Address Translation”.) Invalidation is based on the INVEPT type specified in the register operand and the INVEPT descriptor specified in the memory operand. Outside IA-32e mode, the register operand is always 32 bits, regardless of the value of CS.D; in 64-bit mode, the register operand has 64 bits (the instruction cannot be executed in compatibility mode). The INVEPT types supported by a logical processors are reported in the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A, “VMX Capability Reporting Facility”). There are two INVEPT types currently defined: • Single-context invalidation. If the INVEPT type is 1, the logical processor invalidates all mappings associated with bits 51:12 of the EPT pointer (EPTP) specified in the INVEPT descriptor. It may invalidate other mappings as well. • Global invalidation: If the INVEPT type is 2, the logical processor invalidates mappings associated with all EPTPs. If an unsupported INVEPT type is specified, the instruction fails. INVEPT invalidates all the specified mappings for the indicated EPTP(s) regardless of the VPID and PCID values with which those mappings may be associated. The INVEPT descriptor comprises 128 bits and contains a 64-bit EPTP value in bits 63:0 (see Figure 30-1). 127 64 63 Reserved (must be zero) 0 EPT pointer (EPTP) Figure 30-1. INVEPT Descriptor Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VM exit; ELSIF CPL > 0 THEN #GP(0); ELSE Vol. 3C 30-3 VMX INSTRUCTION REFERENCE INVEPT_TYPE ← value of register operand; IF IA32_VMX_EPT_VPID_CAP MSR indicates that processor does not support INVEPT_TYPE THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE // INVEPT_TYPE must be 1 or 2 INVEPT_DESC ← value of memory operand; EPTP ← INVEPT_DESC[63:0]; CASE INVEPT_TYPE OF 1: // single-context invalidation IF VM entry with the “enable EPT“ VM execution control set to 1 would fail due to the EPTP value THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE Invalidate mappings associated with EPTP[51:12]; VMsucceed; FI; BREAK; 2: // global invalidation Invalidate mappings associated with all EPTPs; VMsucceed; BREAK; ESAC; FI; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the source operand is located in an execute-only code segment. #PF(fault-code) If a page fault occurs in accessing the memory operand. #SS(0) If the memory operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If not in VMX operation. If the logical processor does not support EPT (IA32_VMX_PROCBASED_CTLS2[33]=0). If the logical processor supports EPT (IA32_VMX_PROCBASED_CTLS2[33]=1) but does not support the INVEPT instruction (IA32_VMX_EPT_VPID_CAP[20]=0). Real-Address Mode Exceptions #UD The INVEPT instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The INVEPT instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD 30-4 Vol. 3C The INVEPT instruction is not recognized in compatibility mode. VMX INSTRUCTION REFERENCE 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing the memory operand. #SS(0) If the memory operand is in the SS segment and the memory address is in a non-canonical form. #UD If not in VMX operation. If the logical processor does not support EPT (IA32_VMX_PROCBASED_CTLS2[33]=0). If the logical processor supports EPT (IA32_VMX_PROCBASED_CTLS2[33]=1) but does not support the INVEPT instruction (IA32_VMX_EPT_VPID_CAP[20]=0). Vol. 3C 30-5 VMX INSTRUCTION REFERENCE INVVPID— Invalidate Translations Based on VPID Opcode/ Instruction Op/En Description 66 0F 38 81 RM Invalidates entries in the TLBs and paging-structure caches based on VPID (in 64-bit mode). RM Invalidates entries in the TLBs and paging-structure caches based on VPID (outside 64-bit mode). INVVPID r64, m128 66 0F 38 81 INVVPID r32, m128 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 RM ModRM:reg (r) ModRM:r/m (r) NA NA Description Invalidates mappings in the translation lookaside buffers (TLBs) and paging-structure caches based on virtualprocessor identifier (VPID). (See Chapter 28, “VMX Support for Address Translation”.) Invalidation is based on the INVVPID type specified in the register operand and the INVVPID descriptor specified in the memory operand. Outside IA-32e mode, the register operand is always 32 bits, regardless of the value of CS.D; in 64-bit mode, the register operand has 64 bits (the instruction cannot be executed in compatibility mode). The INVVPID types supported by a logical processors are reported in the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A, “VMX Capability Reporting Facility”). There are four INVVPID types currently defined: • Individual-address invalidation: If the INVVPID type is 0, the logical processor invalidates mappings for the linear address and VPID specified in the INVVPID descriptor. In some cases, it may invalidate mappings for other linear addresses (or other VPIDs) as well. • Single-context invalidation: If the INVVPID type is 1, the logical processor invalidates all mappings tagged with the VPID specified in the INVVPID descriptor. In some cases, it may invalidate mappings for other VPIDs as well. • All-contexts invalidation: If the INVVPID type is 2, the logical processor invalidates all mappings tagged with all VPIDs except VPID 0000H. In some cases, it may invalidate translations with VPID 0000H as well. • Single-context invalidation, retaining global translations: If the INVVPID type is 3, the logical processor invalidates all mappings tagged with the VPID specified in the INVVPID descriptor except global translations. In some cases, it may invalidate global translations (and mappings with other VPIDs) as well. See the “Caching Translation Information” section in Chapter 4 of the IA-32 Intel Architecture Software Developer’s Manual, Volumes 3A for information about global translations. If an unsupported INVVPID type is specified, the instruction fails. INVVPID invalidates all the specified mappings for the indicated VPID(s) regardless of the EPTP and PCID values with which those mappings may be associated. The INVVPID descriptor comprises 128 bits and consists of a VPID and a linear address as shown in Figure 30-2. 127 64 63 Linear Address Reserved (must be zero) Figure 30-2. INVVPID Descriptor 30-6 Vol. 3C 16 15 VPID 0 VMX INSTRUCTION REFERENCE Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VM exit; ELSIF CPL > 0 THEN #GP(0); ELSE INVVPID_TYPE ← value of register operand; IF IA32_VMX_EPT_VPID_CAP MSR indicates that processor does not support INVVPID_TYPE THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE // INVVPID_TYPE must be in the range 0–3 INVVPID_DESC ← value of memory operand; IF INVVPID_DESC[63:16] ≠ 0 THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE CASE INVVPID_TYPE OF 0: // individual-address invalidation VPID ← INVVPID_DESC[15:0]; IF VPID = 0 THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE GL_ADDR ← INVVPID_DESC[127:64]; IF (GL_ADDR is not in a canonical form) THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE Invalidate mappings for GL_ADDR tagged with VPID; VMsucceed; FI; FI; BREAK; 1: // single-context invalidation VPID ← INVVPID_DESC[15:0]; IF VPID = 0 THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE Invalidate all mappings tagged with VPID; VMsucceed; FI; BREAK; 2: // all-context invalidation Invalidate all mappings tagged with all non-zero VPIDs; VMsucceed; BREAK; 3: // single-context invalidation retaining globals VPID ← INVVPID_DESC[15:0]; IF VPID = 0 THEN VMfail(Invalid operand to INVEPT/INVVPID); ELSE Invalidate all mappings tagged with VPID except global translations; VMsucceed; Vol. 3C 30-7 VMX INSTRUCTION REFERENCE FI; BREAK; ESAC; FI; FI; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the source operand is located in an execute-only code segment. #PF(fault-code) If a page fault occurs in accessing the memory operand. #SS(0) If the memory operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If not in VMX operation. If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0). If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0). Real-Address Mode Exceptions #UD The INVVPID instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The INVVPID instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The INVVPID instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing the memory operand. #SS(0) If the memory destination operand is in the SS segment and the memory address is in a noncanonical form. #UD If not in VMX operation. If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0). If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0). 30-8 Vol. 3C VMX INSTRUCTION REFERENCE VMCALL—Call to VM Monitor Opcode/ Instruction Op/En Description 0F 01 C1 ZO Call to VM monitor by causing VM exit. VMCALL Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA Description This instruction allows guest software can make a call for service into an underlying VM monitor. The details of the programming interface for such calls are VMM-specific; this instruction does nothing more than cause a VM exit, registering the appropriate exit reason. Use of this instruction in VMX root operation invokes an SMM monitor (see Section 34.15.2). This invocation will activate the dual-monitor treatment of system-management interrupts (SMIs) and system-management mode (SMM) if it is not already active (see Section 34.15.6). Operation IF not in VMX operation THEN #UD; ELSIF in VMX non-root operation THEN VM exit; ELSIF (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF CPL > 0 THEN #GP(0); ELSIF in SMM or the logical processor does not support the dual-monitor treatment of SMIs and SMM or the valid bit in the IA32_SMM_MONITOR_CTL MSR is clear THEN VMfail (VMCALL executed in VMX root operation); ELSIF dual-monitor treatment of SMIs and SMM is active THEN perform an SMM VM exit (see Section 34.15.2); ELSIF current-VMCS pointer is not valid THEN VMfailInvalid; ELSIF launch state of current VMCS is not clear THEN VMfailValid(VMCALL with non-clear VMCS); ELSIF VM-exit control fields are not valid (see Section 34.15.6.1) THEN VMfailValid (VMCALL with invalid VM-exit control fields); ELSE enter SMM; read revision identifier in MSEG; IF revision identifier does not match that supported by processor THEN leave SMM; VMfailValid(VMCALL with incorrect MSEG revision identifier); ELSE read SMM-monitor features field in MSEG (see Section 34.15.6.1); IF features field is invalid THEN leave SMM; Vol. 3C 30-9 VMX INSTRUCTION REFERENCE VMfailValid(VMCALL with invalid SMM-monitor features); ELSE activate dual-monitor treatment of SMIs and SMM (see Section 34.15.6); FI; FI; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0 and the logical processor is in VMX root operation. #UD If executed outside VMX operation. Real-Address Mode Exceptions #UD If executed outside VMX operation. Virtual-8086 Mode Exceptions #UD If executed outside VMX non-root operation. Compatibility Mode Exceptions #UD If executed outside VMX non-root operation. 64-Bit Mode Exceptions #UD 30-10 Vol. 3C If executed outside VMX operation. VMX INSTRUCTION REFERENCE VMCLEAR—Clear Virtual-Machine Control Structure Opcode/ Instruction Op/En Description 66 0F C7 /6 M Copy VMCS data to VMCS region in memory. VMCLEAR m64 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r) NA NA NA Description This instruction applies to the VMCS whose VMCS region resides at the physical address contained in the instruction operand. The instruction ensures that VMCS data for that VMCS (some of these data may be currently maintained on the processor) are copied to the VMCS region in memory. It also initializes parts of the VMCS region (for example, it sets the launch state of that VMCS to clear). See Chapter 24, “Virtual-Machine Control Structures”. The operand of this instruction is always 64 bits and is always in memory. If the operand is the current-VMCS pointer, then that pointer is made invalid (set to FFFFFFFF_FFFFFFFFH). Note that the VMCLEAR instruction might not explicitly write any VMCS data to memory; the data may be already resident in memory before the VMCLEAR is executed. Operation IF (register operand) or (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VM exit; ELSIF CPL > 0 THEN #GP(0); ELSE addr ← contents of 64-bit in-memory operand; IF addr is not 4KB-aligned OR addr sets any bits beyond the physical-address width1 THEN VMfail(VMCLEAR with invalid physical address); ELSIF addr = VMXON pointer THEN VMfail(VMCLEAR with VMXON pointer); ELSE ensure that data for VMCS referenced by the operand is in memory; initialize implementation-specific data in VMCS region; launch state of VMCS referenced by the operand ← “clear” IF operand addr = current-VMCS pointer THEN current-VMCS pointer ← FFFFFFFF_FFFFFFFFH; FI; VMsucceed; FI; FI; Flags Affected See the operation section and Section 30.2. 1. If IA32_VMX_BASIC[48] is read as 1, VMfail occurs if addr sets any bits in the range 63:32; see Appendix A.1. Vol. 3C 30-11 VMX INSTRUCTION REFERENCE Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the operand is located in an execute-only code segment. #PF(fault-code) #SS(0) If a page fault occurs in accessing the memory operand. If the memory operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If operand is a register. If not in VMX operation. Real-Address Mode Exceptions #UD The VMCLEAR instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMCLEAR instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMCLEAR instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the source operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing the memory operand. #SS(0) If the source operand is in the SS segment and the memory address is in a non-canonical form. #UD If operand is a register. If not in VMX operation. 30-12 Vol. 3C VMX INSTRUCTION REFERENCE VMFUNC—Invoke VM function Opcode/ Instruction Op/En Description NP 0F 01 D4 ZO Invoke VM function specified in EAX. VMFUNC Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA Description This instruction allows software in VMX non-root operation to invoke a VM function, which is processor functionality enabled and configured by software in VMX root operation. The value of EAX selects the specific VM function being invoked. The behavior of each VM function (including any additional fault checking) is specified in Section 25.5.6, “VM Functions”. Operation Perform functionality of the VM function specified in EAX; Flags Affected Depends on the VM function specified in EAX. See Section 25.5.6, “VM Functions”. Protected Mode Exceptions (not including those defined by specific VM functions) #UD If executed outside VMX non-root operation. If “enable VM functions” VM-execution control is 0. If EAX ≥ 64. Real-Address Mode Exceptions Same exceptions as in protected mode. Virtual-8086 Exceptions Same exceptions as in protected mode. Compatibility Mode Exceptions Same exceptions as in protected mode. 64-Bit Mode Exceptions Same exceptions as in protected mode. Vol. 3C 30-13 VMX INSTRUCTION REFERENCE VMLAUNCH/VMRESUME—Launch/Resume Virtual Machine Opcode/ Instruction Op/En Description 0F 01 C2 ZO Launch virtual machine managed by current VMCS. ZO Resume virtual machine managed by current VMCS. VMLAUNCH 0F 01 C3 VMRESUME Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA Description Effects a VM entry managed by the current VMCS. • VMLAUNCH fails if the launch state of current VMCS is not “clear”. If the instruction is successful, it sets the launch state to “launched.” • VMRESUME fails if the launch state of the current VMCS is not “launched.” If VM entry is attempted, the logical processor performs a series of consistency checks as detailed in Chapter 26, “VM Entries”. Failure to pass checks on the VMX controls or on the host-state area passes control to the instruction following the VMLAUNCH or VMRESUME instruction. If these pass but checks on the guest-state area fail, the logical processor loads state from the host-state area of the VMCS, passing control to the instruction referenced by the RIP field in the host-state area. VM entry is not allowed when events are blocked by MOV SS or POP SS. Neither VMLAUNCH nor VMRESUME should be used immediately after either MOV to SS or POP to SS. Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSIF current-VMCS pointer is not valid THEN VMfailInvalid; ELSIF events are being blocked by MOV SS THEN VMfailValid(VM entry with events blocked by MOV SS); ELSIF (VMLAUNCH and launch state of current VMCS is not “clear”) THEN VMfailValid(VMLAUNCH with non-clear VMCS); ELSIF (VMRESUME and launch state of current VMCS is not “launched”) THEN VMfailValid(VMRESUME with non-launched VMCS); ELSE Check settings of VMX controls and host-state area; IF invalid settings THEN VMfailValid(VM entry with invalid VMX-control field(s)) or VMfailValid(VM entry with invalid host-state field(s)) or VMfailValid(VM entry with invalid executive-VMCS pointer)) or VMfailValid(VM entry with non-launched executive VMCS) or VMfailValid(VM entry with executive-VMCS pointer not VMXON pointer) or 30-14 Vol. 3C VMX INSTRUCTION REFERENCE VMfailValid(VM entry with invalid VM-execution control fields in executive VMCS) as appropriate; ELSE Attempt to load guest state and PDPTRs as appropriate; clear address-range monitoring; IF failure in checking guest state or PDPTRs THEN VM entry fails (see Section 26.8); ELSE Attempt to load MSRs from VM-entry MSR-load area; IF failure THEN VM entry fails (see Section 26.8); ELSE IF VMLAUNCH THEN launch state of VMCS ← “launched”; FI; IF in SMM and “entry to SMM” VM-entry control is 0 THEN IF “deactivate dual-monitor treatment” VM-entry control is 0 THEN SMM-transfer VMCS pointer ← current-VMCS pointer; FI; IF executive-VMCS pointer is VMXON pointer THEN current-VMCS pointer ← VMCS-link pointer; ELSE current-VMCS pointer ← executive-VMCS pointer; FI; leave SMM; FI; VM entry succeeds; FI; FI; FI; FI; Further details of the operation of the VM-entry appear in Chapter 26. Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. #UD If executed outside VMX operation. Real-Address Mode Exceptions #UD The VMLAUNCH and VMRESUME instructions are not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMLAUNCH and VMRESUME instructions are not recognized in virtual-8086 mode. Vol. 3C 30-15 VMX INSTRUCTION REFERENCE Compatibility Mode Exceptions #UD The VMLAUNCH and VMRESUME instructions are not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. #UD If executed outside VMX operation. 30-16 Vol. 3C VMX INSTRUCTION REFERENCE VMPTRLD—Load Pointer to Virtual-Machine Control Structure Opcode/ Instruction Op/En Description NP 0F C7 /6 M Loads the current VMCS pointer from memory. VMPTRLD m64 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r) NA NA NA Description Marks the current-VMCS pointer valid and loads it with the physical address in the instruction operand. The instruction fails if its operand is not properly aligned, sets unsupported physical-address bits, or is equal to the VMXON pointer. In addition, the instruction fails if the 32 bits in memory referenced by the operand do not match the VMCS revision identifier supported by this processor.1 The operand of this instruction is always 64 bits and is always in memory. Operation IF (register operand) or (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSE addr ← contents of 64-bit in-memory source operand; IF addr is not 4KB-aligned OR addr sets any bits beyond the physical-address width2 THEN VMfail(VMPTRLD with invalid physical address); ELSIF addr = VMXON pointer THEN VMfail(VMPTRLD with VMXON pointer); ELSE rev ← 32 bits located at physical address addr; IF rev[30:0] ≠ VMCS revision identifier supported by processor OR rev[31] = 1 AND processor does not support 1-setting of “VMCS shadowing” THEN VMfail(VMPTRLD with incorrect VMCS revision identifier); ELSE current-VMCS pointer ← addr; VMsucceed; FI; FI; FI; Flags Affected See the operation section and Section 30.2. 1. Software should consult the VMX capability MSR VMX_BASIC to discover the VMCS revision identifier supported by this processor (see Appendix A, “VMX Capability Reporting Facility”). 2. If IA32_VMX_BASIC[48] is read as 1, VMfail occurs if addr sets any bits in the range 63:32; see Appendix A.1. Vol. 3C 30-17 VMX INSTRUCTION REFERENCE Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory source operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the source operand is located in an execute-only code segment. #PF(fault-code) #SS(0) If a page fault occurs in accessing the memory source operand. If the memory source operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If operand is a register. If not in VMX operation. Real-Address Mode Exceptions #UD The VMPTRLD instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMPTRLD instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMPTRLD instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the source operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing the memory source operand. #SS(0) If the source operand is in the SS segment and the memory address is in a non-canonical form. #UD If operand is a register. If not in VMX operation. 30-18 Vol. 3C VMX INSTRUCTION REFERENCE VMPTRST—Store Pointer to Virtual-Machine Control Structure Opcode/ Instruction Op/En Description NP 0F C7 /7 M Stores the current VMCS pointer into memory. VMPTRST m64 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (w) NA NA NA Description Stores the current-VMCS pointer into a specified memory address. The operand of this instruction is always 64 bits and is always in memory. Operation IF (register operand) or (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSE 64-bit in-memory destination operand ← current-VMCS pointer; VMsucceed; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory destination operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the destination operand is located in a read-only data segment or any code segment. #PF(fault-code) #SS(0) If a page fault occurs in accessing the memory destination operand. If the memory destination operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If operand is a register. If not in VMX operation. Real-Address Mode Exceptions #UD The VMPTRST instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMPTRST instruction is not recognized in virtual-8086 mode. Vol. 3C 30-19 VMX INSTRUCTION REFERENCE Compatibility Mode Exceptions #UD The VMPTRST instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the destination operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing the memory destination operand. #SS(0) If the destination operand is in the SS segment and the memory address is in a non-canonical form. #UD If operand is a register. If not in VMX operation. 30-20 Vol. 3C VMX INSTRUCTION REFERENCE VMREAD—Read Field from Virtual-Machine Control Structure Opcode/ Instruction Op/En Description NP 0F 78 MR Reads a specified VMCS field (in 64-bit mode). MR Reads a specified VMCS field (outside 64-bit mode). VMREAD r/m64, r64 NP 0F 78 VMREAD r/m32, r32 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 MR ModRM:r/m (w) ModRM:reg (r) NA NA Description Reads a specified field from a VMCS and stores it into a specified destination operand (register or memory). In VMX root operation, the instruction reads from the current VMCS. If executed in VMX non-root operation, the instruction reads from the VMCS referenced by the VMCS link pointer field in the current VMCS. The VMCS field is specified by the VMCS-field encoding contained in the register source operand. Outside IA-32e mode, the source operand has 32 bits, regardless of the value of CS.D. In 64-bit mode, the source operand has 64 bits. The effective size of the destination operand, which may be a register or in memory, is always 32 bits outside IA32e mode (the setting of CS.D is ignored with respect to operand size) and 64 bits in 64-bit mode. If the VMCS field specified by the source operand is shorter than this effective operand size, the high bits of the destination operand are cleared to 0. If the VMCS field is longer, then the high bits of the field are not read. Note that any faults resulting from accessing a memory destination operand can occur only after determining, in the operation section below, that the relevant VMCS pointer is valid and that the specified VMCS field is supported. Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation AND (“VMCS shadowing” is 0 OR source operand sets bits in range 63:15 OR VMREAD bit corresponding to bits 14:0 of source operand is 1)1 THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSIF (in VMX root operation AND current-VMCS pointer is not valid) OR (in VMX non-root operation AND VMCS link pointer is not valid) THEN VMfailInvalid; ELSIF source operand does not correspond to any VMCS field THEN VMfailValid(VMREAD/VMWRITE from/to unsupported VMCS component); ELSE IF in VMX root operation THEN destination operand ← contents of field indexed by source operand in current VMCS; ELSE destination operand ← contents of field indexed by source operand in VMCS referenced by VMCS link pointer; FI; VMsucceed; FI; 1. The VMREAD bit for a source operand is defined as follows. Let x be the value of bits 14:0 of the source operand and let addr be the VMREAD-bitmap address. The corresponding VMREAD bit is in bit position x & 7 of the byte at physical address addr | (x » 3). Vol. 3C 30-21 VMX INSTRUCTION REFERENCE Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If a memory destination operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the destination operand is located in a read-only data segment or any code segment. #PF(fault-code) #SS(0) If a page fault occurs in accessing a memory destination operand. If a memory destination operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If not in VMX operation. Real-Address Mode Exceptions #UD The VMREAD instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMREAD instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMREAD instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory destination operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing a memory destination operand. #SS(0) If the memory destination operand is in the SS segment and the memory address is in a noncanonical form. #UD If not in VMX operation. 30-22 Vol. 3C VMX INSTRUCTION REFERENCE VMRESUME—Resume Virtual Machine See VMLAUNCH/VMRESUME—Launch/Resume Virtual Machine. Vol. 3C 30-23 VMX INSTRUCTION REFERENCE VMWRITE—Write Field to Virtual-Machine Control Structure Opcode/ Instruction Op/En Description NP 0F 79 RM Writes a specified VMCS field (in 64-bit mode). RM Writes a specified VMCS field (outside 64-bit mode). VMWRITE r64, r/m64 NP 0F 79 VMWRITE r32, r/m32 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 RM ModRM:reg (r) ModRM:r/m (r) NA NA Description Writes the contents of a primary source operand (register or memory) to a specified field in a VMCS. In VMX root operation, the instruction writes to the current VMCS. If executed in VMX non-root operation, the instruction writes to the VMCS referenced by the VMCS link pointer field in the current VMCS. The VMCS field is specified by the VMCS-field encoding contained in the register secondary source operand. Outside IA-32e mode, the secondary source operand is always 32 bits, regardless of the value of CS.D. In 64-bit mode, the secondary source operand has 64 bits. The effective size of the primary source operand, which may be a register or in memory, is always 32 bits outside IA-32e mode (the setting of CS.D is ignored with respect to operand size) and 64 bits in 64-bit mode. If the VMCS field specified by the secondary source operand is shorter than this effective operand size, the high bits of the primary source operand are ignored. If the VMCS field is longer, then the high bits of the field are cleared to 0. Note that any faults resulting from accessing a memory source operand occur after determining, in the operation section below, that the relevant VMCS pointer is valid but before determining if the destination VMCS field is supported. Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation AND (“VMCS shadowing” is 0 OR secondary source operand sets bits in range 63:15 OR VMWRITE bit corresponding to bits 14:0 of secondary source operand is 1)1 THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSIF (in VMX root operation AND current-VMCS pointer is not valid) OR (in VMX non-root operation AND VMCS-link pointer is not valid) THEN VMfailInvalid; ELSIF secondary source operand does not correspond to any VMCS field THEN VMfailValid(VMREAD/VMWRITE from/to unsupported VMCS component); ELSIF VMCS field indexed by secondary source operand is a VM-exit information field AND processor does not support writing to such fields2 THEN VMfailValid(VMWRITE to read-only VMCS component); ELSE 1. The VMWRITE bit for a secondary source operand is defined as follows. Let x be the value of bits 14:0 of the secondary source operand and let addr be the VMWRITE-bitmap address. The corresponding VMWRITE bit is in bit position x & 7 of the byte at physical address addr | (x » 3). 2. Software can discover whether these fields can be written by reading the VMX capability MSR IA32_VMX_MISC (see Appendix A.6). 30-24 Vol. 3C VMX INSTRUCTION REFERENCE IF in VMX root operation THEN field indexed by secondary source operand in current VMCS ← primary source operand; ELSE field indexed by secondary source operand in VMCS referenced by VMCS link pointer ← primary source operand; FI; VMsucceed; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If the current privilege level is not 0. If a memory source operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the source operand is located in an execute-only code segment. #PF(fault-code) If a page fault occurs in accessing a memory source operand. #SS(0) If a memory source operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If not in VMX operation. Real-Address Mode Exceptions #UD The VMWRITE instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMWRITE instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMWRITE instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If the current privilege level is not 0. If the memory source operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. #PF(fault-code) If a page fault occurs in accessing a memory source operand. #SS(0) If the memory source operand is in the SS segment and the memory address is in a noncanonical form. #UD If not in VMX operation. Vol. 3C 30-25 VMX INSTRUCTION REFERENCE VMXOFF—Leave VMX Operation Opcode/ Instruction Op/En Description 0F 01 C4 ZO Leaves VMX operation. VMXOFF Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 ZO NA NA NA NA Description Takes the logical processor out of VMX operation, unblocks INIT signals, conditionally re-enables A20M, and clears any address-range monitoring.1 Operation IF (not in VMX operation) or (CR0.PE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSIF dual-monitor treatment of SMIs and SMM is active THEN VMfail(VMXOFF under dual-monitor treatment of SMIs and SMM); ELSE leave VMX operation; unblock INIT; IF IA32_SMM_MONITOR_CTL[2] = 02 THEN unblock SMIs; IF outside SMX operation3 THEN unblock and enable A20M; FI; clear address-range monitoring; VMsucceed; FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If executed in VMX root operation with CPL > 0. 1. See the information on MONITOR/MWAIT in Chapter 8, “Multiple-Processor Management,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 2. Setting IA32_SMM_MONITOR_CTL[bit 2] to 1 prevents VMXOFF from unblocking SMIs regardless of the value of the register’s value bit (bit 0). Not all processors allow this bit to be set to 1. Software should consult the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine whether this is allowed. 3. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference.” 30-26 Vol. 3C VMX INSTRUCTION REFERENCE #UD If executed outside VMX operation. Real-Address Mode Exceptions #UD The VMXOFF instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMXOFF instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMXOFF instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If executed in VMX root operation with CPL > 0. #UD If executed outside VMX operation. Vol. 3C 30-27 VMX INSTRUCTION REFERENCE VMXON—Enter VMX Operation Opcode/ Instruction Op/En Description F3 0F C7 /6 M Enter VMX root operation. VMXON m64 Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 M ModRM:r/m (r) NA NA NA Description Puts the logical processor in VMX operation with no current VMCS, blocks INIT signals, disables A20M, and clears any address-range monitoring established by the MONITOR instruction.1 The operand of this instruction is a 4KB-aligned physical address (the VMXON pointer) that references the VMXON region, which the logical processor may use to support VMX operation. This operand is always 64 bits and is always in memory. Operation IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or (RFLAGS.VM = 1) or (IA32_EFER.LMA = 1 and CS.L = 0) THEN #UD; ELSIF not in VMX operation THEN IF (CPL > 0) or (in A20M mode) or (the values of CR0 and CR4 are not supported in VMX operation; see Section 23.8) or (bit 0 (lock bit) of IA32_FEATURE_CONTROL MSR is clear) or (in SMX operation2 and bit 1 of IA32_FEATURE_CONTROL MSR is clear) or (outside SMX operation and bit 2 of IA32_FEATURE_CONTROL MSR is clear) THEN #GP(0); ELSE addr ← contents of 64-bit in-memory source operand; IF addr is not 4KB-aligned or addr sets any bits beyond the physical-address width3 THEN VMfailInvalid; ELSE rev ← 32 bits located at physical address addr; IF rev[30:0] ≠ VMCS revision identifier supported by processor OR rev[31] = 1 THEN VMfailInvalid; ELSE current-VMCS pointer ← FFFFFFFF_FFFFFFFFH; enter VMX operation; block INIT signals; block and disable A20M; 1. See the information on MONITOR/MWAIT in Chapter 8, “Multiple-Processor Management,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. 2. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference.” 3. If IA32_VMX_BASIC[48] is read as 1, VMfailInvalid occurs if addr sets any bits in the range 63:32; see Appendix A.1. 30-28 Vol. 3C VMX INSTRUCTION REFERENCE clear address-range monitoring; IF the processor supports Intel PT but does not allow it to be used in VMX operation1 THEN IA32_RTIT_CTL.TraceEn ← 0; FI; VMsucceed; FI; FI; FI; ELSIF in VMX non-root operation THEN VMexit; ELSIF CPL > 0 THEN #GP(0); ELSE VMfail(“VMXON executed in VMX root operation”); FI; Flags Affected See the operation section and Section 30.2. Protected Mode Exceptions #GP(0) If executed outside VMX operation with CPL>0 or with invalid CR0 or CR4 fixed bits. If executed in A20M mode. If the memory source operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If the DS, ES, FS, or GS register contains an unusable segment. If the source operand is located in an execute-only code segment. If the value of the IA32_FEATURE_CONTROL MSR does not support entry to VMX operation in the current processor mode. #PF(fault-code) If a page fault occurs in accessing the memory source operand. #SS(0) If the memory source operand effective address is outside the SS segment limit. If the SS register contains an unusable segment. #UD If operand is a register. If executed with CR4.VMXE = 0. Real-Address Mode Exceptions #UD The VMXON instruction is not recognized in real-address mode. Virtual-8086 Mode Exceptions #UD The VMXON instruction is not recognized in virtual-8086 mode. Compatibility Mode Exceptions #UD The VMXON instruction is not recognized in compatibility mode. 64-Bit Mode Exceptions #GP(0) If executed outside VMX operation with CPL > 0 or with invalid CR0 or CR4 fixed bits. If executed in A20M mode. If the source operand is in the CS, DS, ES, FS, or GS segments and the memory address is in a non-canonical form. 1. Software should read the VMX capability MSR IA32_VMX_MISC to determine whether the processor allows Intel PT to be used in VMX operation (see Appendix A.6). Vol. 3C 30-29 VMX INSTRUCTION REFERENCE If the value of the IA32_FEATURE_CONTROL MSR does not support entry to VMX operation in the current processor mode. #PF(fault-code) If a page fault occurs in accessing the memory source operand. #SS(0) If the source operand is in the SS segment and the memory address is in a non-canonical form. #UD If operand is a register. If executed with CR4.VMXE = 0. 30-30 Vol. 3C VMX INSTRUCTION REFERENCE 30.4 VM INSTRUCTION ERROR NUMBERS For certain error conditions, the VM-instruction error field is loaded with an error number to indicate the source of the error. Table 30-1 lists VM-instruction error numbers. Table 30-1. VM-Instruction Error Numbers Error Description Number 1 VMCALL executed in VMX root operation 2 VMCLEAR with invalid physical address 3 VMCLEAR with VMXON pointer 4 VMLAUNCH with non-clear VMCS 5 VMRESUME with non-launched VMCS 6 VMRESUME after VMXOFF (VMXOFF and VMXON between VMLAUNCH and VMRESUME)a 7 VM entry with invalid control field(s)b,c 8 VM entry with invalid host-state field(s)b 9 VMPTRLD with invalid physical address 10 VMPTRLD with VMXON pointer 11 VMPTRLD with incorrect VMCS revision identifier 12 VMREAD/VMWRITE from/to unsupported VMCS component 13 VMWRITE to read-only VMCS component 15 VMXON executed in VMX root operation 16 VM entry with invalid executive-VMCS pointerb 17 VM entry with non-launched executive VMCSb 18 VM entry with executive-VMCS pointer not VMXON pointer (when attempting to deactivate the dual-monitor treatment of SMIs and SMM)b 19 VMCALL with non-clear VMCS (when attempting to activate the dual-monitor treatment of SMIs and SMM) 20 VMCALL with invalid VM-exit control fields 22 VMCALL with incorrect MSEG revision identifier (when attempting to activate the dual-monitor treatment of SMIs and SMM) 23 VMXOFF under dual-monitor treatment of SMIs and SMM 24 VMCALL with invalid SMM-monitor features (when attempting to activate the dual-monitor treatment of SMIs and SMM) 25 VM entry with invalid VM-execution control fields in executive VMCS (when attempting to return from SMM)b,c 26 VM entry with events blocked by MOV SS. 28 Invalid operand to INVEPT/INVVPID. NOTES: a. Earlier versions of this manual described this error as “VMRESUME with a corrupted VMCS”. b. VM-entry checks on control fields and host-state fields may be performed in any order. Thus, an indication by error number of one cause does not imply that there are not also other errors. Different processors may give different error numbers for the same VMCS. c. Error number 7 is not used for VM entries that return from SMM that fail due to invalid VM-execution control fields in the executive VMCS. Error number 25 is used for these cases. Vol. 3C 30-31 VMX INSTRUCTION REFERENCE 30-32 Vol. 3C 22.Updates to Chapter 34, Volume 3C Change bars show changes to Chapter 34 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to chapter: Updates across chapter describing VMX support improvements made for Intel® Processor Trace (Intel® PT). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 32 CHAPTER 34 SYSTEM MANAGEMENT MODE This chapter describes aspects of IA-64 and IA-32 architecture used in system management mode (SMM). SMM provides an alternate operating environment that can be used to monitor and manage various system resources for more efficient energy usage, to control system hardware, and/or to run proprietary code. It was introduced into the IA-32 architecture in the Intel386 SL processor (a mobile specialized version of the Intel386 processor). It is also available in the Pentium M, Pentium 4, Intel Xeon, P6 family, and Pentium and Intel486 processors (beginning with the enhanced versions of the Intel486 SL and Intel486 processors). 34.1 SYSTEM MANAGEMENT MODE OVERVIEW SMM is a special-purpose operating mode provided for handling system-wide functions like power management, system hardware control, or proprietary OEM-designed code. It is intended for use only by system firmware, not by applications software or general-purpose systems software. The main benefit of SMM is that it offers a distinct and easily isolated processor environment that operates transparently to the operating system or executive and software applications. When SMM is invoked through a system management interrupt (SMI), the processor saves the current state of the processor (the processor’s context), then switches to a separate operating environment defined by a new address space. The system management software executive (SMI handler) starts execution in that environment, and the critical code and data of the SMI handler reside in a physical memory region (SMRAM) within that address space. While in SMM, the processor executes SMI handler code to perform operations such as powering down unused disk drives or monitors, executing proprietary code, or placing the whole system in a suspended state. When the SMI handler has completed its operations, it executes a resume (RSM) instruction. This instruction causes the processor to reload the saved context of the processor, switch back to protected or real mode, and resume executing the interrupted application or operating-system program or task. The following SMM mechanisms make it transparent to applications programs and operating systems: • • The only way to enter SMM is by means of an SMI. • • • Upon entering SMM, the processor saves the context of the interrupted program or task. The processor executes SMM code in a separate address space that can be made inaccessible from the other operating modes. All interrupts normally handled by the operating system are disabled upon entry into SMM. The RSM instruction can be executed only in SMM. Section 34.3 describes transitions into and out of SMM. The execution environment after entering SMM is in realaddress mode with paging disabled (CR0.PE = CR0.PG = 0). In this initial execution environment, the SMI handler can address up to 4 GBytes of memory and can execute all I/O and system instructions. Section 34.5 describes in detail the initial SMM execution environment for an SMI handler and operation within that environment. The SMI handler may subsequently switch to other operating modes while remaining in SMM. NOTES Software developers should be aware that, even if a logical processor was using the physicaladdress extension (PAE) mechanism (introduced in the P6 family processors) or was in IA-32e mode before an SMI, this will not be the case after the SMI is delivered. This is because delivery of an SMI disables paging (see Table 34-4). (This does not apply if the dual-monitor treatment of SMIs and SMM is active; see Section 34.15.) 34.1.1 System Management Mode and VMX Operation Traditionally, SMM services system management interrupts and then resumes program execution (back to the software stack consisting of executive and application software; see Section 34.2 through Section 34.13). Vol. 3C 34-1 SYSTEM MANAGEMENT MODE A virtual machine monitor (VMM) using VMX can act as a host to multiple virtual machines and each virtual machine can support its own software stack of executive and application software. On processors that support VMX, virtualmachine extensions may use system-management interrupts (SMIs) and system-management mode (SMM) in one of two ways: • Default treatment. System firmware handles SMIs. The processor saves architectural states and critical states relevant to VMX operation upon entering SMM. When the firmware completes servicing SMIs, it uses RSM to resume VMX operation. • Dual-monitor treatment. Two VM monitors collaborate to control the servicing of SMIs: one VMM operates outside of SMM to provide basic virtualization in support for guests; the other VMM operates inside SMM (while in VMX operation) to support system-management functions. The former is referred to as executive monitor, the latter SMM-transfer monitor (STM).1 The default treatment is described in Section 34.14, “Default Treatment of SMIs and SMM with VMX Operation and SMX Operation”. Dual-monitor treatment of SMM is described in Section 34.15, “Dual-Monitor Treatment of SMIs and SMM”. 34.2 SYSTEM MANAGEMENT INTERRUPT (SMI) The only way to enter SMM is by signaling an SMI through the SMI# pin on the processor or through an SMI message received through the APIC bus. The SMI is a nonmaskable external interrupt that operates independently from the processor’s interrupt- and exception-handling mechanism and the local APIC. The SMI takes precedence over an NMI and a maskable interrupt. SMM is non-reentrant; that is, the SMI is disabled while the processor is in SMM. NOTES In the Pentium 4, Intel Xeon, and P6 family processors, when a processor that is designated as an application processor during an MP initialization sequence is waiting for a startup IPI (SIPI), it is in a mode where SMIs are masked. However if a SMI is received while an application processor is in the wait for SIPI mode, the SMI will be pended. The processor then responds on receipt of a SIPI by immediately servicing the pended SMI and going into SMM before handling the SIPI. An SMI may be blocked for one instruction following execution of STI, MOV to SS, or POP into SS. 34.3 SWITCHING BETWEEN SMM AND THE OTHER PROCESSOR OPERATING MODES Figure 2-3 shows how the processor moves between SMM and the other processor operating modes (protected, real-address, and virtual-8086). Signaling an SMI while the processor is in real-address, protected, or virtual-8086 modes always causes the processor to switch to SMM. Upon execution of the RSM instruction, the processor always returns to the mode it was in when the SMI occurred. 34.3.1 Entering SMM The processor always handles an SMI on an architecturally defined “interruptible” point in program execution (which is commonly at an IA-32 architecture instruction boundary). When the processor receives an SMI, it waits for all instructions to retire and for all stores to complete. The processor then saves its current context in SMRAM (see Section 34.4), enters SMM, and begins to execute the SMI handler. Upon entering SMM, the processor signals external hardware that SMI handling has begun. The signaling mechanism used is implementation dependent. For the P6 family processors, an SMI acknowledge transaction is gener- 1. The dual-monitor treatment may not be supported by all processors. Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1) to determine whether it is supported. 34-2 Vol. 3C SYSTEM MANAGEMENT MODE ated on the system bus and the multiplexed status signal EXF4 is asserted each time a bus transaction is generated while the processor is in SMM. For the Pentium and Intel486 processors, the SMIACT# pin is asserted. An SMI has a greater priority than debug exceptions and external interrupts. Thus, if an NMI, maskable hardware interrupt, or a debug exception occurs at an instruction boundary along with an SMI, only the SMI is handled. Subsequent SMI requests are not acknowledged while the processor is in SMM. The first SMI interrupt request that occurs while the processor is in SMM (that is, after SMM has been acknowledged to external hardware) is latched and serviced when the processor exits SMM with the RSM instruction. The processor will latch only one SMI while in SMM. See Section 34.5 for a detailed description of the execution environment when in SMM. 34.3.2 Exiting From SMM The only way to exit SMM is to execute the RSM instruction. The RSM instruction is only available to the SMI handler; if the processor is not in SMM, attempts to execute the RSM instruction result in an invalid-opcode exception (#UD) being generated. The RSM instruction restores the processor’s context by loading the state save image from SMRAM back into the processor’s registers. The processor then returns an SMIACK transaction on the system bus and returns program control back to the interrupted program. Upon successful completion of the RSM instruction, the processor signals external hardware that SMM has been exited. For the P6 family processors, an SMI acknowledge transaction is generated on the system bus and the multiplexed status signal EXF4 is no longer generated on bus cycles. For the Pentium and Intel486 processors, the SMIACT# pin is deserted. If the processor detects invalid state information saved in the SMRAM, it enters the shutdown state and generates a special bus cycle to indicate it has entered shutdown state. Shutdown happens only in the following situations: • A reserved bit in control register CR4 is set to 1 on a write to CR4. This error should not happen unless SMI handler code modifies reserved areas of the SMRAM saved state map (see Section 34.4.1). CR4 is saved in the state map in a reserved location and cannot be read or modified in its saved state. • An illegal combination of bits is written to control register CR0, in particular PG set to 1 and PE set to 0, or NW set to 1 and CD set to 0. • • CR4.PCIDE would be set to 1 and IA32_EFER.LMA to 0. (For the Pentium and Intel486 processors only.) If the address stored in the SMBASE register when an RSM instruction is executed is not aligned on a 32-KByte boundary. This restriction does not apply to the P6 family processors. In the shutdown state, Intel processors stop executing instructions until a RESET#, INIT# or NMI# is asserted. While Pentium family processors recognize the SMI# signal in shutdown state, P6 family and Intel486 processors do not. Intel does not support using SMI# to recover from shutdown states for any processor family; the response of processors in this circumstance is not well defined. On Pentium 4 and later processors, shutdown will inhibit INTR and A20M but will not change any of the other inhibits. On these processors, NMIs will be inhibited if no action is taken in the SMI handler to uninhibit them (see Section 34.8). If the processor is in the HALT state when the SMI is received, the processor handles the return from SMM slightly differently (see Section 34.10). Also, the SMBASE address can be changed on a return from SMM (see Section 34.11). 34.4 SMRAM Upon entering SMM, the processor switches to a new address space. Because paging is disabled upon entering SMM, this initial address space maps all memory accesses to the low 4 GBytes of the processor's physical address space. The SMI handler's critical code and data reside in a memory region referred to as system-management RAM (SMRAM). The processor uses a pre-defined region within SMRAM to save the processor's pre-SMI context. SMRAM can also be used to store system management information (such as the system configuration and specific information about powered-down devices) and OEM-specific information. Vol. 3C 34-3 SYSTEM MANAGEMENT MODE The default SMRAM size is 64 KBytes beginning at a base physical address in physical memory called the SMBASE (see Figure 34-1). The SMBASE default value following a hardware reset is 30000H. The processor looks for the first instruction of the SMI handler at the address [SMBASE + 8000H]. It stores the processor’s state in the area from [SMBASE + FE00H] to [SMBASE + FFFFH]. See Section 34.4.1 for a description of the mapping of the state save area. The system logic is minimally required to decode the physical address range for the SMRAM from [SMBASE + 8000H] to [SMBASE + FFFFH]. A larger area can be decoded if needed. The size of this SMRAM can be between 32 KBytes and 4 GBytes. The location of the SMRAM can be changed by changing the SMBASE value (see Section 34.11). It should be noted that all processors in a multiple-processor system are initialized with the same SMBASE value (30000H). Initialization software must sequentially place each processor in SMM and change its SMBASE so that it does not overlap those of other processors. The actual physical location of the SMRAM can be in system memory or in a separate RAM memory. The processor generates an SMI acknowledge transaction (P6 family processors) or asserts the SMIACT# pin (Pentium and Intel486 processors) when the processor receives an SMI (see Section 34.3.1). System logic can use the SMI acknowledge transaction or the assertion of the SMIACT# pin to decode accesses to the SMRAM and redirect them (if desired) to specific SMRAM memory. If a separate RAM memory is used for SMRAM, system logic should provide a programmable method of mapping the SMRAM into system memory space when the processor is not in SMM. This mechanism will enable start-up procedures to initialize the SMRAM space (that is, load the SMI handler) before executing the SMI handler during SMM. 34.4.1 SMRAM State Save Map When an IA-32 processor that does not support Intel 64 architecture initially enters SMM, it writes its state to the state save area of the SMRAM. The state save area begins at [SMBASE + 8000H + 7FFFH] and extends down to [SMBASE + 8000H + 7E00H]. Table 34-1 shows the state save map. The offset in column 1 is relative to the SMBASE value plus 8000H. Reserved spaces should not be used by software. Some of the registers in the SMRAM state save area (marked YES in column 3) may be read and changed by the SMI handler, with the changed values restored to the processor registers by the RSM instruction. Some register images are read-only, and must not be modified (modifying these registers will result in unpredictable behavior). An SMI handler should not rely on any values stored in an area that is marked as reserved. SMRAM SMBASE + FFFFH SMBASE + 8000H Start of State Save Area SMI Handler Entry Point SMBASE Figure 34-1. SMRAM Usage 34-4 Vol. 3C SYSTEM MANAGEMENT MODE Table 34-1. SMRAM State Save Map Offset (Added to SMBASE + 8000H) Register Writable? 7FFCH CR0 No 7FF8H CR3 No 7FF4H EFLAGS Yes 7FF0H EIP Yes 7FECH EDI Yes 7FE8H ESI Yes 7FE4H EBP Yes 7FE0H ESP Yes 7FDCH EBX Yes 7FD8H EDX Yes 7FD4H ECX Yes 7FD0H EAX Yes 7FCCH DR6 No 7FC8H DR7 No 7FC4H TR 1 No 7FC0H Reserved No 7FBCH GS1 No 7FB8H 1 No 7FB4H DS 1 No 7FB0H SS1 No 7FACH 1 No 1 FS CS 7FA8H ES No 7FA4H I/O State Field, see Section 34.7 No 7FA0H I/O Memory Address Field, see Section 34.7 No 7F9FH-7F03H Reserved No 7F02H Auto HALT Restart Field (Word) Yes 7F00H I/O Instruction Restart Field (Word) Yes 7EFCH SMM Revision Identifier Field (Doubleword) No 7EF8H SMBASE Field (Doubleword) Yes 7EF7H - 7E00H Reserved No NOTE: 1. The two most significant bytes are reserved. The following registers are saved (but not readable) and restored upon exiting SMM: • • Control register CR4. (This register is cleared to all 0s when entering SMM). The hidden segment descriptor information stored in segment registers CS, DS, ES, FS, GS, and SS. If an SMI request is issued for the purpose of powering down the processor, the values of all reserved locations in the SMM state save must be saved to nonvolatile memory. The following state is not automatically saved and restored following an SMI and the RSM instruction, respectively: Vol. 3C 34-5 SYSTEM MANAGEMENT MODE • • • • • Debug registers DR0 through DR3. • • • • The state of the trap controller. The x87 FPU registers. The MTRRs. Control register CR2. The model-specific registers (for the P6 family and Pentium processors) or test registers TR3 through TR7 (for the Pentium and Intel486 processors). The machine-check architecture registers. The APIC internal interrupt state (ISR, IRR, etc.). The microcode update state. If an SMI is used to power down the processor, a power-on reset will be required before returning to SMM, which will reset much of this state back to its default values. So an SMI handler that is going to trigger power down should first read these registers listed above directly, and save them (along with the rest of RAM) to nonvolatile storage. After the power-on reset, the continuation of the SMI handler should restore these values, along with the rest of the system's state. Anytime the SMI handler changes these registers in the processor, it must also save and restore them. NOTES A small subset of the MSRs (such as, the time-stamp counter and performance-monitoring counters) are not arbitrarily writable and therefore cannot be saved and restored. SMM-based power-down and restoration should only be performed with operating systems that do not use or rely on the values of these registers. Operating system developers should be aware of this fact and insure that their operating-system assisted power-down and restoration software is immune to unexpected changes in these register values. 34.4.1.1 SMRAM State Save Map and Intel 64 Architecture When the processor initially enters SMM, it writes its state to the state save area of the SMRAM. The state save area on an Intel 64 processor at [SMBASE + 8000H + 7FFFH] and extends to [SMBASE + 8000H + 7C00H]. Support for Intel 64 architecture is reported by CPUID.80000001:EDX[29] = 1. The layout of the SMRAM state save map is shown in Table 34-3. Additionally, the SMRAM state save map shown in Table 34-3 also applies to processors with the following CPUID signatures listed in Table 34-2, irrespective of the value in CPUID.80000001:EDX[29]. Table 34-2. Processor Signatures and 64-bit SMRAM State Save Map Format DisplayFamily_DisplayModel Processor Families/Processor Number Series 06_17H Intel Xeon Processor 5200, 5400 series, Intel Core 2 Quad processor Q9xxx, Intel Core 2 Duo processors E8000, T9000, 06_0FH Intel Xeon Processor 3000, 3200, 5100, 5300, 7300 series, Intel Core 2 Quad, Intel Core 2 Extreme, Intel Core 2 Duo processors, Intel Pentium dual-core processors 06_1CH 45 nm Intel® Atom™ processors 34-6 Vol. 3C SYSTEM MANAGEMENT MODE Table 34-3. SMRAM State Save Map for Intel 64 Architecture Offset (Added to SMBASE + 8000H) Register Writable? 7FF8H CR0 No 7FF0H CR3 No 7FE8H RFLAGS Yes 7FE0H IA32_EFER Yes 7FD8H RIP Yes 7FD0H DR6 No 7FC8H DR7 7FC4H No TR SEL 1 No 1 7FC0H LDTR SEL 7FBCH 1 GS SEL No 7FB8H FS SEL1 No 7FB4H 1 No 7FB0H SS SEL 1 No 7FACH CS SEL1 No 7FA8H 1 ES SEL No 7FA4H IO_MISC No 7F9CH IO_MEM_ADDR No 7F94H RDI Yes 7F8CH RSI Yes 7F84H RBP Yes 7F7CH RSP Yes 7F74H RBX Yes 7F6CH RDX Yes 7F64H RCX Yes 7F5CH RAX Yes 7F54H R8 Yes 7F4CH R9 Yes 7F44H R10 Yes 7F3CH R11 Yes 7F34H R12 Yes 7F2CH R13 Yes 7F24H R14 Yes 7F1CH R15 Yes 7F1BH-7F04H Reserved No 7F02H Auto HALT Restart Field (Word) Yes 7F00H I/O Instruction Restart Field (Word) Yes 7EFCH SMM Revision Identifier Field (Doubleword) No 7EF8H SMBASE Field (Doubleword) Yes DS SEL No Vol. 3C 34-7 SYSTEM MANAGEMENT MODE Table 34-3. SMRAM State Save Map for Intel 64 Architecture (Contd.) Offset (Added to SMBASE + 8000H) Register Writable? 7EF7H - 7EE4H Reserved No 7EE0H Setting of “enable EPT” VM-execution control No 7ED8H Value of EPTP VM-execution control field No 7ED7H - 7EA0H Reserved No 7E9CH LDT Base (lower 32 bits) No 7E98H Reserved No 7E94H IDT Base (lower 32 bits) No 7E90H Reserved No 7E8CH GDT Base (lower 32 bits) No 7E8BH - 7E44H Reserved No 7E40H CR4 No 7E3FH - 7DF0H Reserved No 7DE8H IO_RIP Yes 7DE7H - 7DDCH Reserved No 7DD8H IDT Base (Upper 32 bits) No 7DD4H LDT Base (Upper 32 bits) No 7DD0H GDT Base (Upper 32 bits) No 7DCFH - 7C00H Reserved No NOTE: 1. The two most significant bytes are reserved. 34.4.2 SMRAM Caching An IA-32 processor does not automatically write back and invalidate its caches before entering SMM or before exiting SMM. Because of this behavior, care must be taken in the placement of the SMRAM in system memory and in the caching of the SMRAM to prevent cache incoherence when switching back and forth between SMM and protected mode operation. Any of the following three methods of locating the SMRAM in system memory will guarantee cache coherency. • Place the SMRAM in a dedicated section of system memory that the operating system and applications are prevented from accessing. Here, the SMRAM can be designated as cacheable (WB, WT, or WC) for optimum processor performance, without risking cache incoherence when entering or exiting SMM. • Place the SMRAM in a section of memory that overlaps an area used by the operating system (such as the video memory), but designate the SMRAM as uncacheable (UC). This method prevents cache access when in SMM to maintain cache coherency, but the use of uncacheable memory reduces the performance of SMM code. • Place the SMRAM in a section of system memory that overlaps an area used by the operating system and/or application code, but explicitly flush (write back and invalidate) the caches upon entering and exiting SMM mode. This method maintains cache coherency, but incurs the overhead of two complete cache flushes. For Pentium 4, Intel Xeon, and P6 family processors, a combination of the first two methods of locating the SMRAM is recommended. Here the SMRAM is split between an overlapping and a dedicated region of memory. Upon entering SMM, the SMRAM space that is accessed overlaps video memory (typically located in low memory). This SMRAM section is designated as UC memory. The initial SMM code then jumps to a second SMRAM section that is located in a dedicated region of system memory (typically in high memory). This SMRAM section can be cached for optimum processor performance. 34-8 Vol. 3C SYSTEM MANAGEMENT MODE For systems that explicitly flush the caches upon entering SMM (the third method described above), the cache flush can be accomplished by asserting the FLUSH# pin at the same time as the request to enter SMM (generally initiated by asserting the SMI# pin). The priorities of the FLUSH# and SMI# pins are such that the FLUSH# is serviced first. To guarantee this behavior, the processor requires that the following constraints on the interaction of FLUSH# and SMI# be met. In a system where the FLUSH# and SMI# pins are synchronous and the set up and hold times are met, then the FLUSH# and SMI# pins may be asserted in the same clock. In asynchronous systems, the FLUSH# pin must be asserted at least one clock before the SMI# pin to guarantee that the FLUSH# pin is serviced first. Upon leaving SMM (for systems that explicitly flush the caches), the WBINVD instruction should be executed prior to leaving SMM to flush the caches. NOTES In systems based on the Pentium processor that use the FLUSH# pin to write back and invalidate cache contents before entering SMM, the processor will prefetch at least one cache line in between when the Flush Acknowledge cycle is run and the subsequent recognition of SMI# and the assertion of SMIACT#. It is the obligation of the system to ensure that these lines are not cached by returning KEN# inactive to the Pentium processor. 34.4.2.1 System Management Range Registers (SMRR) SMI handler code and data stored by SMM code resides in SMRAM. The SMRR interface is an enhancement in Intel 64 architecture to limit cacheable reference of addresses in SMRAM to code running in SMM. The SMRR interface can be configured only by code running in SMM. Details of SMRR is described in Section 11.11.2.4. 34.5 SMI HANDLER EXECUTION ENVIRONMENT Section 34.5.1 describes the initial execution environment for an SMI handler. An SMI handler may re-configure its execution environment to other supported operating modes. Section 34.5.2 discusses modifications an SMI handler can make to its execution environment. 34.5.1 Initial SMM Execution Environment After saving the current context of the processor, the processor initializes its core registers to the values shown in Table 34-4. Upon entering SMM, the PE and PG flags in control register CR0 are cleared, which places the processor in an environment similar to real-address mode. The differences between the SMM execution environment and the real-address mode execution environment are as follows: • • • The addressable address space ranges from 0 to FFFFFFFFH (4 GBytes). The normal 64-KByte segment limit for real-address mode is increased to 4 GBytes. The default operand and address sizes are set to 16 bits, which restricts the addressable SMRAM address space to the 1-MByte real-address mode limit for native real-address-mode code. However, operand-size and address-size override prefixes can be used to access the address space beyond the 1-MByte. Table 34-4. Processor Register Initialization in SMM Register Contents General-purpose registers Undefined EFLAGS 00000002H EIP 00008000H CS selector SMM Base shifted right 4 bits (default 3000H) CS base SMM Base (default 30000H) DS, ES, FS, GS, SS Selectors 0000H Vol. 3C 34-9 SYSTEM MANAGEMENT MODE Table 34-4. Processor Register Initialization in SMM DS, ES, FS, GS, SS Bases 000000000H DS, ES, FS, GS, SS Limits 0FFFFFFFFH CR0 PE, EM, TS, and PG flags set to 0; others unmodified CR4 Cleared to zero DR6 Undefined DR7 00000400H • Near jumps and calls can be made to anywhere in the 4-GByte address space if a 32-bit operand-size override prefix is used. Due to the real-address-mode style of base-address formation, a far call or jump cannot transfer control to a segment with a base address of more than 20 bits (1 MByte). However, since the segment limit in SMM is 4 GBytes, offsets into a segment that go beyond the 1-MByte limit are allowed when using 32-bit operand-size override prefixes. Any program control transfer that does not have a 32-bit operand-size override prefix truncates the EIP value to the 16 low-order bits. • Data and the stack can be located anywhere in the 4-GByte address space, but can be accessed only with a 32bit address-size override if they are located above 1 MByte. As with the code segment, the base address for a data or stack segment cannot be more than 20 bits. The value in segment register CS is automatically set to the default of 30000H for the SMBASE shifted 4 bits to the right; that is, 3000H. The EIP register is set to 8000H. When the EIP value is added to shifted CS value (the SMBASE), the resulting linear address points to the first instruction of the SMI handler. The other segment registers (DS, SS, ES, FS, and GS) are cleared to 0 and their segment limits are set to 4 GBytes. In this state, the SMRAM address space may be treated as a single flat 4-GByte linear address space. If a segment register is loaded with a 16-bit value, that value is then shifted left by 4 bits and loaded into the segment base (hidden part of the segment register). The limits and attributes are not modified. Maskable hardware interrupts, exceptions, NMI interrupts, SMI interrupts, A20M interrupts, single-step traps, breakpoint traps, and INIT operations are inhibited when the processor enters SMM. Maskable hardware interrupts, exceptions, single-step traps, and breakpoint traps can be enabled in SMM if the SMM execution environment provides and initializes an interrupt table and the necessary interrupt and exception handlers (see Section 34.6). 34.5.2 SMI Handler Operating Mode Switching Within SMM, an SMI handler may change the processor's operating mode (e.g., to enable PAE paging, enter 64-bit mode, etc.) after it has made proper preparation and initialization to do so. For example, if switching to 32-bit protected mode, the SMI handler should follow the guidelines provided in Chapter 9, “Processor Management and Initialization”. If the SMI handler does wish to change operating mode, it is responsible for executing the appropriate mode-transition code after each SMI. It is recommended that the SMI handler make use of all means available to protect the integrity of its critical code and data. In particular, it should use the system-management range register (SMRR) interface if it is available (see Section 11.11.2.4). The SMRR interface can protect only the first 4 GBytes of the physical address space. The SMI handler should take that fact into account if it uses operating modes that allow access to physical addresses beyond that 4-GByte limit (e.g. PAE paging or 64-bit mode). Execution of the RSM instruction restores the pre-SMI processor state from the SMRAM state-state map (see Section 34.4.1) into which it was stored when the processor entered SMM. (The SMBASE field in the SMRAM statesave map does not determine the state following RSM but rather the initial environment following the next entry to SMM.) Any required change to operating mode is performed by the RSM instruction; there is no need for the SMI handler to change modes explicitly prior to executing RSM. 34.6 EXCEPTIONS AND INTERRUPTS WITHIN SMM When the processor enters SMM, all hardware interrupts are disabled in the following manner: 34-10 Vol. 3C SYSTEM MANAGEMENT MODE • The IF flag in the EFLAGS register is cleared, which inhibits maskable hardware interrupts from being generated. • • The TF flag in the EFLAGS register is cleared, which disables single-step traps. • NMI, SMI, and A20M interrupts are blocked by internal SMM logic. (See Section 34.8 for more information about how NMIs are handled in SMM.) Debug register DR7 is cleared, which disables breakpoint traps. (This action prevents a debugger from accidentally breaking into an SMI handler if a debug breakpoint is set in normal address space that overlays code or data in SMRAM.) Software-invoked interrupts and exceptions can still occur, and maskable hardware interrupts can be enabled by setting the IF flag. Intel recommends that SMM code be written in so that it does not invoke software interrupts (with the INT n, INTO, INT1, INT3, or BOUND instructions) or generate exceptions. If the SMI handler requires interrupt and exception handling, an SMM interrupt table and the necessary exception and interrupt handlers must be created and initialized from within SMM. Until the interrupt table is correctly initialized (using the LIDT instruction), exceptions and software interrupts will result in unpredictable processor behavior. The following restrictions apply when designing SMM interrupt and exception-handling facilities: • The interrupt table should be located at linear address 0 and must contain real-address mode style interrupt vectors (4 bytes containing CS and IP). • Due to the real-address mode style of base address formation, an interrupt or exception cannot transfer control to a segment with a base address of more that 20 bits. • • An interrupt or exception cannot transfer control to a segment offset of more than 16 bits (64 KBytes). • The SMBASE relocation feature affects the way the processor will return from an interrupt or exception generated while the SMI handler is executing. For example, if the SMBASE is relocated to above 1 MByte, but the exception handlers are below 1 MByte, a normal return to the SMI handler is not possible. One solution is to provide the exception handler with a mechanism for calculating a return address above 1 MByte from the 16bit return address on the stack, then use a 32-bit far call to return to the interrupted procedure. • If an SMI handler needs access to the debug trap facilities, it must insure that an SMM accessible debug handler is available and save the current contents of debug registers DR0 through DR3 (for later restoration). Debug registers DR0 through DR3 and DR7 must then be initialized with the appropriate values. • If an SMI handler needs access to the single-step mechanism, it must insure that an SMM accessible singlestep handler is available, and then set the TF flag in the EFLAGS register. • If the SMI design requires the processor to respond to maskable hardware interrupts or software-generated interrupts while in SMM, it must ensure that SMM accessible interrupt handlers are available and then set the IF flag in the EFLAGS register (using the STI instruction). Software interrupts are not blocked upon entry to SMM, so they do not need to be enabled. When an exception or interrupt occurs, only the 16 least-significant bits of the return address (EIP) are pushed onto the stack. If the offset of the interrupted procedure is greater than 64 KBytes, it is not possible for the interrupt/exception handler to return control to that procedure. (One solution to this problem is for a handler to adjust the return address on the stack.) 34.7 MANAGING SYNCHRONOUS AND ASYNCHRONOUS SYSTEM MANAGEMENT INTERRUPTS When coding for a multiprocessor system or a system with Intel HT Technology, it was not always possible for an SMI handler to distinguish between a synchronous SMI (triggered during an I/O instruction) and an asynchronous SMI. To facilitate the discrimination of these two events, incremental state information has been added to the SMM state save map. Processors that have an SMM revision ID of 30004H or higher have the incremental state information described below. Vol. 3C 34-11 SYSTEM MANAGEMENT MODE 34.7.1 I/O State Implementation Within the extended SMM state save map, a bit (IO_SMI) is provided that is set only when an SMI is either taken immediately after a successful I/O instruction or is taken after a successful iteration of a REP I/O instruction (the successful notion pertains to the processor point of view; not necessarily to the corresponding platform function). When set, the IO_SMI bit provides a strong indication that the corresponding SMI was synchronous. In this case, the SMM State Save Map also supplies the port address of the I/O operation. The IO_SMI bit and the I/O Port Address may be used in conjunction with the information logged by the platform to confirm that the SMI was indeed synchronous. The IO_SMI bit by itself is a strong indication, not a guarantee, that the SMI is synchronous. This is because an asynchronous SMI might coincidentally be taken after an I/O instruction. In such a case, the IO_SMI bit would still be set in the SMM state save map. Information characterizing the I/O instruction is saved in two locations in the SMM State Save Map (Table 34-5). The IO_SMI bit also serves as a valid bit for the rest of the I/O information fields. The contents of these I/O information fields are not defined when the IO_SMI bit is not set. Table 34-5. I/O Instruction Information in the SMM State Save Map State (SMM Rev. ID: 30004H or higher) Format 31 15 8 I/O Memory Address When IO_SMI is set, the other fields may be interpreted as follows: • I/O length: • • • 001 – Byte 010 – Word 100 – Dword I/O instruction type (Table 34-6) Table 34-6. I/O Instruction Type Encodings Instruction Encoding IN Immediate 1001 IN DX 0001 OUT Immediate 1000 OUT DX 0000 INS 0011 OUTS 0010 REP INS 0111 REP OUTS 0110 34-12 Vol. 3C 0 0 SMRAM offset 7FA0 • 1 IO_SMI I/O Memory Address Field 3 I/O Length 31 4 I/O Type SMRAM offset 7FA4 7 Reserved I/O Port I/0 State Field 16 SYSTEM MANAGEMENT MODE 34.8 NMI HANDLING WHILE IN SMM NMI interrupts are blocked upon entry to the SMI handler. If an NMI request occurs during the SMI handler, it is latched and serviced after the processor exits SMM. Only one NMI request will be latched during the SMI handler. If an NMI request is pending when the processor executes the RSM instruction, the NMI is serviced before the next instruction of the interrupted code sequence. This assumes that NMIs were not blocked before the SMI occurred. If NMIs were blocked before the SMI occurred, they are blocked after execution of RSM. Although NMI requests are blocked when the processor enters SMM, they may be enabled through software by executing an IRET instruction. If the SMI handler requires the use of NMI interrupts, it should invoke a dummy interrupt service routine for the purpose of executing an IRET instruction. Once an IRET instruction is executed, NMI interrupt requests are serviced in the same “real mode” manner in which they are handled outside of SMM. A special case can occur if an SMI handler nests inside an NMI handler and then another NMI occurs. During NMI interrupt handling, NMI interrupts are disabled, so normally NMI interrupts are serviced and completed with an IRET instruction one at a time. When the processor enters SMM while executing an NMI handler, the processor saves the SMRAM state save map but does not save the attribute to keep NMI interrupts disabled. Potentially, an NMI could be latched (while in SMM or upon exit) and serviced upon exit of SMM even though the previous NMI handler has still not completed. One or more NMIs could thus be nested inside the first NMI handler. The NMI interrupt handler should take this possibility into consideration. Also, for the Pentium processor, exceptions that invoke a trap or fault handler will enable NMI interrupts from inside of SMM. This behavior is implementation specific for the Pentium processor and is not part of the IA-32 architecture. 34.9 SMM REVISION IDENTIFIER The SMM revision identifier field is used to indicate the version of SMM and the SMM extensions that are supported by the processor (see Figure 34-2). The SMM revision identifier is written during SMM entry and can be examined in SMRAM space at offset 7EFCH. The lower word of the SMM revision identifier refers to the version of the base SMM architecture. Register Offset 7EFCH 31 0 18 17 16 15 Reserved SMM Revision Identifier SMBASE Relocation I/O Instruction Restart Figure 34-2. SMM Revision Identifier The upper word of the SMM revision identifier refers to the extensions available. If the I/O instruction restart flag (bit 16) is set, the processor supports the I/O instruction restart (see Section 34.12); if the SMBASE relocation flag (bit 17) is set, SMRAM base address relocation is supported (see Section 34.11). 34.10 AUTO HALT RESTART If the processor is in a HALT state (due to the prior execution of a HLT instruction) when it receives an SMI, the processor records the fact in the auto HALT restart flag in the saved processor state (see Figure 34-3). (This flag is located at offset 7F02H and bit 0 in the state save area of the SMRAM.) If the processor sets the auto HALT restart flag upon entering SMM (indicating that the SMI occurred when the processor was in the HALT state), the SMI handler has two options: Vol. 3C 34-13 SYSTEM MANAGEMENT MODE • It can leave the auto HALT restart flag set, which instructs the RSM instruction to return program control to the HLT instruction. This option in effect causes the processor to re-enter the HALT state after handling the SMI. (This is the default operation.) • It can clear the auto HALT restart flag, which instructs the RSM instruction to return program control to the instruction following the HLT instruction. 1 0 15 Reserved Register Offset 7F02H Auto HALT Restart Figure 34-3. Auto HALT Restart Field These options are summarized in Table 34-7. If the processor was not in a HALT state when the SMI was received (the auto HALT restart flag is cleared), setting the flag to 1 will cause unpredictable behavior when the RSM instruction is executed. Table 34-7. Auto HALT Restart Flag Values Value of Flag After Entry to SMM Value of Flag When Exiting SMM Action of Processor When Exiting SMM 0 0 Returns to next instruction in interrupted program or task. 0 1 Unpredictable. 1 0 Returns to next instruction after HLT instruction. 1 1 Returns to HALT state. If the HLT instruction is restarted, the processor will generate a memory access to fetch the HLT instruction (if it is not in the internal cache), and execute a HLT bus transaction. This behavior results in multiple HLT bus transactions for the same HLT instruction. 34.10.1 Executing the HLT Instruction in SMM The HLT instruction should not be executed during SMM, unless interrupts have been enabled by setting the IF flag in the EFLAGS register. If the processor is halted in SMM, the only event that can remove the processor from this state is a maskable hardware interrupt or a hardware reset. 34.11 SMBASE RELOCATION The default base address for the SMRAM is 30000H. This value is contained in an internal processor register called the SMBASE register. The operating system or executive can relocate the SMRAM by setting the SMBASE field in the saved state map (at offset 7EF8H) to a new value (see Figure 34-4). The RSM instruction reloads the internal SMBASE register with the value in the SMBASE field each time it exits SMM. All subsequent SMI requests will use the new SMBASE value to find the starting address for the SMI handler (at SMBASE + 8000H) and the SMRAM state save area (from SMBASE + FE00H to SMBASE + FFFFH). (The processor resets the value in its internal SMBASE register to 30000H on a RESET, but does not change it on an INIT.) 34-14 Vol. 3C SYSTEM MANAGEMENT MODE 31 0 Register Offset 7EF8H SMM Base Figure 34-4. SMBASE Relocation Field In multiple-processor systems, initialization software must adjust the SMBASE value for each processor so that the SMRAM state save areas for each processor do not overlap. (For Pentium and Intel486 processors, the SMBASE values must be aligned on a 32-KByte boundary or the processor will enter shutdown state during the execution of a RSM instruction.) If the SMBASE relocation flag in the SMM revision identifier field is set, it indicates the ability to relocate the SMBASE (see Section 34.9). 34.12 I/O INSTRUCTION RESTART If the I/O instruction restart flag in the SMM revision identifier field is set (see Section 34.9), the I/O instruction restart mechanism is present on the processor. This mechanism allows an interrupted I/O instruction to be reexecuted upon returning from SMM mode. For example, if an I/O instruction is used to access a powered-down I/O device, a chip set supporting this device can intercept the access and respond by asserting SMI#. This action invokes the SMI handler to power-up the device. Upon returning from the SMI handler, the I/O instruction restart mechanism can be used to re-execute the I/O instruction that caused the SMI. The I/O instruction restart field (at offset 7F00H in the SMM state-save area, see Figure 34-5) controls I/O instruction restart. When an RSM instruction is executed, if this field contains the value FFH, then the EIP register is modified to point to the I/O instruction that received the SMI request. The processor will then automatically re-execute the I/O instruction that the SMI trapped. (The processor saves the necessary machine state to insure that reexecution of the instruction is handled coherently.) 15 0 I/O Instruction Restart Field Register Offset 7F00H Figure 34-5. I/O Instruction Restart Field If the I/O instruction restart field contains the value 00H when the RSM instruction is executed, then the processor begins program execution with the instruction following the I/O instruction. (When a repeat prefix is being used, the next instruction may be the next I/O instruction in the repeat loop.) Not re-executing the interrupted I/O instruction is the default behavior; the processor automatically initializes the I/O instruction restart field to 00H upon entering SMM. Table 34-8 summarizes the states of the I/O instruction restart field. Table 34-8. I/O Instruction Restart Field Values Value of Flag After Entry to SMM Value of Flag When Exiting SMM Action of Processor When Exiting SMM 00H 00H Does not re-execute trapped I/O instruction. 00H FFH Re-executes trapped I/O instruction. The I/O instruction restart mechanism does not indicate the cause of the SMI. It is the responsibility of the SMI handler to examine the state of the processor to determine the cause of the SMI and to determine if an I/O instruction was interrupted and should be restarted upon exiting SMM. If an SMI interrupt is signaled on a non-I/O instruction boundary, setting the I/O instruction restart field to FFH prior to executing the RSM instruction will likely result in a program error. Vol. 3C 34-15 SYSTEM MANAGEMENT MODE 34.12.1 Back-to-Back SMI Interrupts When I/O Instruction Restart Is Being Used If an SMI interrupt is signaled while the processor is servicing an SMI interrupt that occurred on an I/O instruction boundary, the processor will service the new SMI request before restarting the originally interrupted I/O instruction. If the I/O instruction restart field is set to FFH prior to returning from the second SMI handler, the EIP will point to an address different from the originally interrupted I/O instruction, which will likely lead to a program error. To avoid this situation, the SMI handler must be able to recognize the occurrence of back-to-back SMI interrupts when I/O instruction restart is being used and insure that the handler sets the I/O instruction restart field to 00H prior to returning from the second invocation of the SMI handler. 34.13 SMM MULTIPLE-PROCESSOR CONSIDERATIONS The following should be noted when designing multiple-processor systems: • • • Any processor in a multiprocessor system can respond to an SMM. • • The SMI handler will need to initialize the SMBASE for each processor. • • Each processor needs its own SMRAM space. This space can be in system memory or in a separate RAM. The SMRAMs for different processors can be overlapped in the same memory space. The only stipulation is that each processor needs its own state save area and its own dynamic data storage area. (Also, for the Pentium and Intel486 processors, the SMBASE address must be located on a 32-KByte boundary.) Code and static data can be shared among processors. Overlapping SMRAM spaces can be done more efficiently with the P6 family processors because they do not require that the SMBASE address be on a 32-KByte boundary. Processors can respond to local SMIs through their SMI# pins or to SMIs received through the APIC interface. The APIC interface can distribute SMIs to different processors. Two or more processors can be executing in SMM at the same time. When operating Pentium processors in dual processing (DP) mode, the SMIACT# pin is driven only by the MRM processor and should be sampled with ADS#. For additional details, see Chapter 14 of the Pentium Processor Family User’s Manual, Volume 1. SMM is not re-entrant, because the SMRAM State Save Map is fixed relative to the SMBASE. If there is a need to support two or more processors in SMM mode at the same time then each processor should have dedicated SMRAM spaces. This can be done by using the SMBASE Relocation feature (see Section 34.11). 34.14 DEFAULT TREATMENT OF SMIS AND SMM WITH VMX OPERATION AND SMX OPERATION Under the default treatment, the interactions of SMIs and SMM with VMX operation are few. This section details those interactions. It also explains how this treatment affects SMX operation. 34.14.1 Default Treatment of SMI Delivery Ordinary SMI delivery saves processor state into SMRAM and then loads state based on architectural definitions. Under the default treatment, processors that support VMX operation perform SMI delivery as follows: enter SMM; save the following internal to the processor: CR4.VMXE an indication of whether the logical processor was in VMX operation (root or non-root) IF the logical processor is in VMX operation THEN save current VMCS pointer internal to the processor; leave VMX operation; save VMX-critical state defined below; 34-16 Vol. 3C SYSTEM MANAGEMENT MODE FI; IF the logical processor supports SMX operation THEN save internal to the logical processor an indication of whether the Intel® TXT private space is locked; IF the TXT private space is unlocked THEN lock the TXT private space; FI; FI; CR4.VMXE ← 0; perform ordinary SMI delivery: save processor state in SMRAM; set processor state to standard SMM values;1 invalidate linear mappings and combined mappings associated with VPID 0000H (for all PCIDs); combined mappings for VPID 0000H are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP; see Section 28.3); The pseudocode above makes reference to the saving of VMX-critical state. This state consists of the following: (1) SS.DPL (the current privilege level); (2) RFLAGS.VM2; (3) the state of blocking by STI and by MOV SS (see Table 24-3 in Section 24.4.2); (4) the state of virtual-NMI blocking (only if the processor is in VMX non-root operation and the “virtual NMIs” VM-execution control is 1); and (5) an indication of whether an MTF VM exit is pending (see Section 25.5.2). These data may be saved internal to the processor or in the VMCS region of the current VMCS. Processors that do not support SMI recognition while there is blocking by STI or by MOV SS need not save the state of such blocking. If the logical processor supports the 1-setting of the “enable EPT” VM-execution control and the logical processor was in VMX non-root operation at the time of an SMI, it saves the value of that control into bit 0 of the 32-bit field at offset SMBASE + 8000H + 7EE0H (SMBASE + FEE0H; see Table 34-3).3 If the logical processor was not in VMX non-root operation at the time of the SMI, it saves 0 into that bit. If the logical processor saves 1 into that bit (it was in VMX non-root operation and the “enable EPT” VM-execution control was 1), it saves the value of the EPT pointer (EPTP) into the 64-bit field at offset SMBASE + 8000H + 7ED8H (SMBASE + FED8H). Because SMI delivery causes a logical processor to leave VMX operation, all the controls associated with VMX nonroot operation are disabled in SMM and thus cannot cause VM exits while the logical processor in SMM. 34.14.2 Default Treatment of RSM Ordinary execution of RSM restores processor state from SMRAM. Under the default treatment, processors that support VMX operation perform RSM as follows: IF VMXE = 1 in CR4 image in SMRAM THEN fail and enter shutdown state; ELSE restore state normally from SMRAM; invalidate linear mappings and combined mappings associated with all VPIDs and all PCIDs; combined mappings are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP; see Section 28.3); IF the logical processor supports SMX operation andthe Intel® TXT private space was unlocked at the time of the last SMI (as saved) THEN unlock the TXT private space; FI; CR4.VMXE ← value stored internally; 1. This causes the logical processor to block INIT signals, NMIs, and SMIs. 2. Section 34.14 and Section 34.15 use the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For processors that do not support Intel 64 architecture, this notation refers to the 32-bit forms of these registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to the lower 32 bits of the register. 3. “Enable EPT” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, SMI functions as the “enable EPT” VM-execution control were 0. See Section 24.6.2. Vol. 3C 34-17 SYSTEM MANAGEMENT MODE IF internal storage indicates that the logical processor had been in VMX operation (root or non-root) THEN enter VMX operation (root or non-root); restore VMX-critical state as defined in Section 34.14.1; set to their fixed values any bits in CR0 and CR4 whose values must be fixed in VMX operation (see Section 23.8);1 IF RFLAGS.VM = 0 AND (in VMX root operation OR the “unrestricted guest” VM-execution control is 0)2 THEN CS.RPL ← SS.DPL; SS.RPL ← SS.DPL; FI; restore current VMCS pointer; FI; leave SMM; IF logical processor will be in VMX operation or in SMX operation after RSM THEN block A20M and leave A20M mode; FI; FI; RSM unblocks SMIs. It restores the state of blocking by NMI (see Table 24-3 in Section 24.4.2) as follows: • If the RSM is not to VMX non-root operation or if the “virtual NMIs” VM-execution control will be 0, the state of NMI blocking is restored normally. • If the RSM is to VMX non-root operation and the “virtual NMIs” VM-execution control will be 1, NMIs are not blocked after RSM. The state of virtual-NMI blocking is restored as part of VMX-critical state. INIT signals are blocked after RSM if and only if the logical processor will be in VMX root operation. If RSM returns a logical processor to VMX non-root operation, it re-establishes the controls associated with the current VMCS. If the “interrupt-window exiting” VM-execution control is 1, a VM exit occurs immediately after RSM if the enabling conditions apply. The same is true for the “NMI-window exiting” VM-execution control. Such VM exits occur with their normal priority. See Section 25.2. If an MTF VM exit was pending at the time of the previous SMI, an MTF VM exit is pending on the instruction boundary following execution of RSM. The following items detail the treatment of MTF VM exits that may be pending following RSM: • System-management interrupts (SMIs), INIT signals, and higher priority events take priority over these MTF VM exits. These MTF VM exits take priority over debug-trap exceptions and lower priority events. • These MTF VM exits wake the logical processor if RSM caused the logical processor to enter the HLT state (see Section 34.10). They do not occur if the logical processor just entered the shutdown state. 34.14.3 Protection of CR4.VMXE in SMM Under the default treatment, CR4.VMXE is treated as a reserved bit while a logical processor is in SMM. Any attempt by software running in SMM to set this bit causes a general-protection exception. In addition, software cannot use VMX instructions or enter VMX operation while in SMM. 34.14.4 VMXOFF and SMI Unblocking The VMXOFF instruction can be executed only with the default treatment (see Section 34.15.1) and only outside SMM. If SMIs are blocked when VMXOFF is executed, VMXOFF unblocks them unless 1. If the RSM is to VMX non-root operation and both the “unrestricted guest” VM-execution control and bit 31 of the primary processor-based VM-execution controls will be 1, CR0.PE and CR0.PG retain the values that were loaded from SMRAM regardless of what is reported in the capability MSR IA32_VMX_CR0_FIXED0. 2. “Unrestricted guest” is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the “unrestricted guest” VM-execution control were 0. See Section 24.6.2. 34-18 Vol. 3C SYSTEM MANAGEMENT MODE IA32_SMM_MONITOR_CTL[bit 2] is 1 (see Section 34.15.5 for details regarding this MSR).1 Section 34.15.7 identifies a case in which SMIs may be blocked when VMXOFF is executed. Not all processors allow this bit to be set to 1. Software should consult the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine whether this is allowed. 34.15 DUAL-MONITOR TREATMENT OF SMIs AND SMM Dual-monitor treatment is activated through the cooperation of the executive monitor (the VMM that operates outside of SMM to provide basic virtualization) and the SMM-transfer monitor (STM; the VMM that operates inside SMM—while in VMX operation—to support system-management functions). Control is transferred to the STM through VM exits; VM entries are used to return from SMM. The dual-monitor treatment may not be supported by all processors. Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1) to determine whether it is supported. 34.15.1 Dual-Monitor Treatment Overview The dual-monitor treatment uses an executive monitor and an SMM-transfer monitor (STM). Transitions from the executive monitor or its guests to the STM are called SMM VM exits and are discussed in Section 34.15.2. SMM VM exits are caused by SMIs as well as executions of VMCALL in VMX root operation. The latter allow the executive monitor to call the STM for service. The STM runs in VMX root operation and uses VMX instructions to establish a VMCS and perform VM entries to its own guests. This is done all inside SMM (see Section 34.15.3). The STM returns from SMM, not by using the RSM instruction, but by using a VM entry that returns from SMM. Such VM entries are described in Section 34.15.4. Initially, there is no STM and the default treatment (Section 34.14) is used. The dual-monitor treatment is not used until it is enabled and activated. The steps to do this are described in Section 34.15.5 and Section 34.15.6. It is not possible to leave VMX operation under the dual-monitor treatment; VMXOFF will fail if executed. The dualmonitor treatment must be deactivated first. The STM deactivates dual-monitor treatment using a VM entry that returns from SMM with the “deactivate dual-monitor treatment” VM-entry control set to 1 (see Section 34.15.7). The executive monitor configures any VMCS that it uses for VM exits to the executive monitor. SMM VM exits, which transfer control to the STM, use a different VMCS. Under the dual-monitor treatment, each logical processor uses a separate VMCS called the SMM-transfer VMCS. When the dual-monitor treatment is active, the logical processor maintains another VMCS pointer called the SMM-transfer VMCS pointer. The SMM-transfer VMCS pointer is established when the dual-monitor treatment is activated. 34.15.2 SMM VM Exits An SMM VM exit is a VM exit that begins outside SMM and that ends in SMM. Unlike other VM exits, SMM VM exits can begin in VMX root operation. SMM VM exits result from the arrival of an SMI outside SMM or from execution of VMCALL in VMX root operation outside SMM. Execution of VMCALL in VMX root operation causes an SMM VM exit only if the valid bit is set in the IA32_SMM_MONITOR_CTL MSR (see Section 34.15.5). Execution of VMCALL in VMX root operation causes an SMM VM exit even under the default treatment. This SMM VM exit activates the dual-monitor treatment (see Section 34.15.6). Differences between SMM VM exits and other VM exits are detailed in Sections 34.15.2.1 through 34.15.2.5. Differences between SMM VM exits that activate the dual-monitor treatment and other SMM VM exits are described in Section 34.15.6. 1. Setting IA32_SMM_MONITOR_CTL[bit 2] to 1 prevents VMXOFF from unblocking SMIs regardless of the value of the register’s valid bit (bit 0). Vol. 3C 34-19 SYSTEM MANAGEMENT MODE 34.15.2.1 Architectural State Before a VM Exit System-management interrupts (SMIs) that cause SMM VM exits always do so directly. They do not save state to SMRAM as they do under the default treatment. 34.15.2.2 Updating the Current-VMCS and Executive-VMCS Pointers SMM VM exits begin by performing the following steps: 1. The executive-VMCS pointer field in the SMM-transfer VMCS is loaded as follows: — If the SMM VM exit commenced in VMX non-root operation, it receives the current-VMCS pointer. — If the SMM VM exit commenced in VMX root operation, it receives the VMXON pointer. 2. The current-VMCS pointer is loaded with the value of the SMM-transfer VMCS pointer. The last step ensures that the current VMCS is the SMM-transfer VMCS. VM-exit information is recorded in that VMCS, and VM-entry control fields in that VMCS are updated. State is saved into the guest-state area of that VMCS. The VM-exit controls and host-state area of that VMCS determine how the VM exit operates. 34.15.2.3 Recording VM-Exit Information SMM VM exits differ from other VM exit with regard to the way they record VM-exit information. The differences follow. • Exit reason. — Bits 15:0 of this field contain the basic exit reason. The field is loaded with the reason for the SMM VM exit: I/O SMI (an SMI arrived immediately after retirement of an I/O instruction), other SMI, or VMCALL. See Appendix C, “VMX Basic Exit Reasons”. — SMM VM exits are the only VM exits that may occur in VMX root operation. Because the SMM-transfer monitor may need to know whether it was invoked from VMX root or VMX non-root operation, this information is stored in bit 29 of the exit-reason field (see Table 24-15 in Section 24.9.1). The bit is set by SMM VM exits from VMX root operation. — If the SMM VM exit occurred in VMX non-root operation and an MTF VM exit was pending, bit 28 of the exitreason field is set; otherwise, it is cleared. — Bits 27:16 and bits 31:30 are cleared. • Exit qualification. For an SMM VM exit due an SMI that arrives immediately after the retirement of an I/O instruction, the exit qualification contains information about the I/O instruction that retired immediately before the SMI. It has the format given in Table 34-9. Table 34-9. Exit Qualification for SMIs That Arrive Immediately After the Retirement of an I/O Instruction Bit Position(s) Contents 2:0 Size of access: 0 = 1-byte 1 = 2-byte 3 = 4-byte Other values not used. 3 Direction of the attempted access (0 = OUT, 1 = IN) 4 String instruction (0 = not string; 1 = string) 5 REP prefixed (0 = not REP; 1 = REP) 6 Operand encoding (0 = DX, 1 = immediate) 34-20 Vol. 3C SYSTEM MANAGEMENT MODE Table 34-9. Exit Qualification for SMIs That Arrive Immediately After the Retirement of an I/O Instruction (Contd.) Bit Position(s) Contents 15:7 Reserved (cleared to 0) 31:16 Port number (as specified in the I/O instruction) 63:32 Reserved (cleared to 0). These bits exist only on processors that support Intel 64 architecture. • Guest linear address. This field is used for VM exits due to SMIs that arrive immediately after the retirement of an INS or OUTS instruction for which the relevant segment (ES for INS; DS for OUTS unless overridden by an instruction prefix) is usable. The field receives the value of the linear address generated by ES:(E)DI (for INS) or segment:(E)SI (for OUTS; the default segment is DS but can be overridden by a segment override prefix) at the time the instruction started. If the relevant segment is not usable, the value is undefined. On processors that support Intel 64 architecture, bits 63:32 are clear if the logical processor was not in 64-bit mode before the VM exit. • I/O RCX, I/O RSI, I/O RDI, and I/O RIP. For an SMM VM exit due an SMI that arrives immediately after the retirement of an I/O instruction, these fields receive the values that were in RCX, RSI, RDI, and RIP, respectively, before the I/O instruction executed. Thus, the value saved for I/O RIP addresses the I/O instruction. 34.15.2.4 Saving Guest State SMM VM exits save the contents of the SMBASE register into the corresponding field in the guest-state area. The value of the VMX-preemption timer is saved into the corresponding field in the guest-state area if the “save VMX-preemption timer value” VM-exit control is 1. That field becomes undefined if, in addition, either the SMM VM exit is from VMX root operation or the SMM VM exit is from VMX non-root operation and the “activate VMXpreemption timer” VM-execution control is 0. 34.15.2.5 Updating State If an SMM VM exit is from VMX non-root operation and the “Intel PT uses guest physical addresses” VM-execution control is 1, the IA32_RTIT_CTL MSR is cleared to 00000000_00000000H.1 This is done even if the “clear IA32_RTIT_CTL” VM-exit control is 0. SMM VM exits affect the non-register state of a logical processor as follows: • SMM VM exits cause non-maskable interrupts (NMIs) to be blocked; they may be unblocked through execution of IRET or through a VM entry (depending on the value loaded for the interruptibility state and the setting of the “virtual NMIs” VM-execution control). • SMM VM exits cause SMIs to be blocked; they may be unblocked by a VM entry that returns from SMM (see Section 34.15.4). SMM VM exits invalidate linear mappings and combined mappings associated with VPID 0000H for all PCIDs. Combined mappings for VPID 0000H are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP; see Section 28.3). (Ordinary VM exits are not required to perform such invalidation if the “enable VPID” VM-execution control is 1; see Section 27.5.5.) 34.15.3 Operation of the SMM-Transfer Monitor Once invoked, the SMM-transfer monitor (STM) is in VMX root operation and can use VMX instructions to configure VMCSs and to cause VM entries to virtual machines supported by those structures. As noted in Section 34.15.1, the VMXOFF instruction cannot be used under the dual-monitor treatment and thus cannot be used by the STM. 1. In this situation, the value of this MSR was saved earlier into the guest-state area. All VM exits save this MSR if the 1-setting of the “load IA32_RTIT_CTL” VM-entry control is supported (see Section 27.3.1), which must be the case if the “Intel PT uses guest physical addresses” VM-execution control is 1 (see Section 26.2.1.1). Vol. 3C 34-21 SYSTEM MANAGEMENT MODE The RSM instruction also cannot be used under the dual-monitor treatment. As noted in Section 25.1.3, it causes a VM exit if executed in SMM in VMX non-root operation. If executed in VMX root operation, it causes an invalidopcode exception. The STM uses VM entries to return from SMM (see Section 34.15.4). 34.15.4 VM Entries that Return from SMM The SMM-transfer monitor (STM) returns from SMM using a VM entry with the “entry to SMM” VM-entry control clear. VM entries that return from SMM reverse the effects of an SMM VM exit (see Section 34.15.2). VM entries that return from SMM may differ from other VM entries in that they do not necessarily enter VMX nonroot operation. If the executive-VMCS pointer field in the current VMCS contains the VMXON pointer, the logical processor remains in VMX root operation after VM entry. For differences between VM entries that return from SMM and other VM entries see Sections 34.15.4.1 through 34.15.4.10. 34.15.4.1 Checks on the Executive-VMCS Pointer Field VM entries that return from SMM perform the following checks on the executive-VMCS pointer field in the current VMCS: • • • Bits 11:0 must be 0. The pointer must not set any bits beyond the processor’s physical-address width.1,2 The 32 bits located in memory referenced by the physical address in the pointer must contain the processor’s VMCS revision identifier (see Section 24.2). The checks above are performed before the checks described in Section 34.15.4.2 and before any of the following checks: • 'If the “deactivate dual-monitor treatment” VM-entry control is 0 and the executive-VMCS pointer field does not contain the VMXON pointer, the launch state of the executive VMCS (the VMCS referenced by the executiveVMCS pointer field) must be launched (see Section 24.11.3). • If the “deactivate dual-monitor treatment” VM-entry control is 1, the executive-VMCS pointer field must contain the VMXON pointer (see Section 34.15.7).3 34.15.4.2 Checks on VM-Execution Control Fields VM entries that return from SMM differ from other VM entries with regard to the checks performed on the VMexecution control fields specified in Section 26.2.1.1. They do not apply the checks to the current VMCS. Instead, VM-entry behavior depends on whether the executive-VMCS pointer field contains the VMXON pointer: • If the executive-VMCS pointer field contains the VMXON pointer (the VM entry remains in VMX root operation), the checks are not performed at all. • If the executive-VMCS pointer field does not contain the VMXON pointer (the VM entry enters VMX non-root operation), the checks are performed on the VM-execution control fields in the executive VMCS (the VMCS referenced by the executive-VMCS pointer field in the current VMCS). These checks are performed after checking the executive-VMCS pointer field itself (for proper alignment). Other VM entries ensure that, if “activate VMX-preemption timer” VM-execution control is 0, the “save VMXpreemption timer value” VM-exit control is also 0. This check is not performed by VM entries that return from SMM. 1. Software can determine a processor’s physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX. 2. If IA32_VMX_BASIC[48] is read as 1, this pointer must not set any bits in the range 63:32; see Appendix A.1. 3. The STM can determine the VMXON pointer by reading the executive-VMCS pointer field in the current VMCS after the SMM VM exit that activates the dual-monitor treatment. 34-22 Vol. 3C SYSTEM MANAGEMENT MODE 34.15.4.3 Checks on VM-Entry Control Fields VM entries that return from SMM differ from other VM entries with regard to the checks performed on the VM-entry control fields specified in Section 26.2.1.3. Specifically, if the executive-VMCS pointer field contains the VMXON pointer (the VM entry remains in VMX root operation), the VM-entry interruption-information field must not indicate injection of a pending MTF VM exit (see Section 26.6.2). Specifically, the following cannot all be true for that field: • • • the valid bit (bit 31) is 1 the interruption type (bits 10:8) is 7 (other event); and the vector (bits 7:0) is 0 (pending MTF VM exit). 34.15.4.4 Checks on the Guest State Area Section 26.3.1 specifies checks performed on fields in the guest-state area of the VMCS. Some of these checks are conditioned on the settings of certain VM-execution controls (e.g., “virtual NMIs” or “unrestricted guest”). VM entries that return from SMM modify these checks based on whether the executive-VMCS pointer field contains the VMXON pointer:1 • If the executive-VMCS pointer field contains the VMXON pointer (the VM entry remains in VMX root operation), the checks are performed as all relevant VM-execution controls were 0. (As a result, some checks may not be performed at all.) • If the executive-VMCS pointer field does not contain the VMXON pointer (the VM entry enters VMX non-root operation), this check is performed based on the settings of the VM-execution controls in the executive VMCS (the VMCS referenced by the executive-VMCS pointer field in the current VMCS). For VM entries that return from SMM, the activity-state field must not indicate the wait-for-SIPI state if the executive-VMCS pointer field contains the VMXON pointer (the VM entry is to VMX root operation). 34.15.4.5 Loading Guest State VM entries that return from SMM load the SMBASE register from the SMBASE field. VM entries that return from SMM invalidate linear mappings and combined mappings associated with all VPIDs. Combined mappings are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP; see Section 28.3). (Ordinary VM entries are required to perform such invalidation only for VPID 0000H and are not required to do even that if the “enable VPID” VM-execution control is 1; see Section 26.3.2.5.) 34.15.4.6 VMX-Preemption Timer A VM entry that returns from SMM activates the VMX-preemption timer only if the executive-VMCS pointer field does not contain the VMXON pointer (the VM entry enters VMX non-root operation) and the “activate VMX-preemption timer” VM-execution control is 1 in the executive VMCS (the VMCS referenced by the executive-VMCS pointer field). In this case, VM entry starts the VMX-preemption timer with the value in the VMX-preemption timer-value field in the current VMCS. 34.15.4.7 Updating the Current-VMCS and SMM-Transfer VMCS Pointers Successful VM entries (returning from SMM) load the SMM-transfer VMCS pointer with the current-VMCS pointer. Following this, they load the current-VMCS pointer from a field in the current VMCS: • If the executive-VMCS pointer field contains the VMXON pointer (the VM entry remains in VMX root operation), the current-VMCS pointer is loaded from the VMCS-link pointer field. • If the executive-VMCS pointer field does not contain the VMXON pointer (the VM entry enters VMX non-root operation), the current-VMCS pointer is loaded with the value of the executive-VMCS pointer field. 1. The STM can determine the VMXON pointer by reading the executive-VMCS pointer field in the current VMCS after the SMM VM exit that activates the dual-monitor treatment. Vol. 3C 34-23 SYSTEM MANAGEMENT MODE If the VM entry successfully enters VMX non-root operation, the VM-execution controls in effect after the VM entry are those from the new current VMCS. This includes any structures external to the VMCS referenced by VM-execution control fields. The updating of these VMCS pointers occurs before event injection. Event injection is determined, however, by the VM-entry control fields in the VMCS that was current when the VM entry commenced. 34.15.4.8 VM Exits Induced by VM Entry Section 26.6.1.2 describes how the event-delivery process invoked by event injection may lead to a VM exit. Section 26.7.3 to Section 26.7.7 describe other situations that may cause a VM exit to occur immediately after a VM entry. Whether these VM exits occur is determined by the VM-execution control fields in the current VMCS. For VM entries that return from SMM, they can occur only if the executive-VMCS pointer field does not contain the VMXON pointer (the VM entry enters VMX non-root operation). In this case, determination is based on the VM-execution control fields in the VMCS that is current after the VM entry. This is the VMCS referenced by the value of the executive-VMCS pointer field at the time of the VM entry (see Section 34.15.4.7). This VMCS also controls the delivery of such VM exits. Thus, VM exits induced by a VM entry returning from SMM are to the executive monitor and not to the STM. 34.15.4.9 SMI Blocking VM entries that return from SMM determine the blocking of system-management interrupts (SMIs) as follows: • If the “deactivate dual-monitor treatment” VM-entry control is 0, SMIs are blocked after VM entry if and only if the bit 2 in the interruptibility-state field is 1. • If the “deactivate dual-monitor treatment” VM-entry control is 1, the blocking of SMIs depends on whether the logical processor is in SMX operation:1 — If the logical processor is in SMX operation, SMIs are blocked after VM entry. — If the logical processor is outside SMX operation, SMIs are unblocked after VM entry. VM entries that return from SMM and that do not deactivate the dual-monitor treatment may leave SMIs blocked. This feature exists to allow the STM to invoke functionality outside of SMM without unblocking SMIs. 34.15.4.10 Failures of VM Entries That Return from SMM Section 26.8 describes the treatment of VM entries that fail during or after loading guest state. Such failures record information in the VM-exit information fields and load processor state as would be done on a VM exit. The VMCS used is the one that was current before the VM entry commenced. Control is thus transferred to the STM and the logical processor remains in SMM. 34.15.5 Enabling the Dual-Monitor Treatment Code and data for the SMM-transfer monitor (STM) reside in a region of SMRAM called the monitor segment (MSEG). Code running in SMM determines the location of MSEG and establishes its content. This code is also responsible for enabling the dual-monitor treatment. SMM code enables the dual-monitor treatment and specifies the location of MSEG by writing to the IA32_SMM_MONITOR_CTL MSR (index 9BH). The MSR has the following format: 1. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference‚” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2D. 34-24 Vol. 3C SYSTEM MANAGEMENT MODE • Bit 0 is the register’s valid bit. The STM may be invoked using VMCALL only if this bit is 1. Because VMCALL is used to activate the dual-monitor treatment (see Section 34.15.6), the dual-monitor treatment cannot be activated if the bit is 0. This bit is cleared when the logical processor is reset. • • Bit 1 is reserved. Bit 2 determines whether executions of VMXOFF unblock SMIs under the default treatment of SMIs and SMM. Executions of VMXOFF unblock SMIs unless bit 2 is 1 (the value of bit 0 is irrelevant). See Section 34.14.4. Certain leaf functions of the GETSEC instruction clear this bit (see Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2D). • • Bits 11:3 are reserved. • Bits 63:32 are reserved. Bits 31:12 contain a value that, when shifted left 12 bits, is the physical address of MSEG (the MSEG base address). The following items detail use of this MSR: • The IA32_SMM_MONITOR_CTL MSR is supported only on processors that support the dual-monitor treatment.1 On other processors, accesses to the MSR using RDMSR or WRMSR generate a general-protection fault (#GP(0)). • A write to the IA32_SMM_MONITOR_CTL MSR using WRMSR generates a general-protection fault (#GP(0)) if executed outside of SMM or if an attempt is made to set any reserved bit. An attempt to write to the IA32_SMM_MONITOR_CTL MSR fails if made as part of a VM exit that does not end in SMM or part of a VM entry that does not begin in SMM. • Reads from the IA32_SMM_MONITOR_CTL MSR using RDMSR are allowed any time RDMSR is allowed. The MSR may be read as part of any VM exit. • The dual-monitor treatment can be activated only if the valid bit in the MSR is set to 1. The 32 bytes located at the MSEG base address are called the MSEG header. The format of the MSEG header is given in Table 34-10 (each field is 32 bits). Table 34-10. Format of MSEG Header Byte Offset Field 0 MSEG-header revision identifier 4 SMM-transfer monitor features 8 GDTR limit 12 GDTR base offset 16 CS selector 20 EIP offset 24 ESP offset 28 CR3 offset To ensure proper behavior in VMX operation, software should maintain the MSEG header in writeback cacheable memory. Future implementations may allow or require a different memory type.2 Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1). SMM code should enable the dual-monitor treatment (by setting the valid bit in IA32_SMM_MONITOR_CTL MSR) only after establishing the content of the MSEG header as follows: 1. Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1) to determine whether the dual-monitor treatment is supported. Vol. 3C 34-25 SYSTEM MANAGEMENT MODE • Bytes 3:0 contain the MSEG revision identifier. Different processors may use different MSEG revision identifiers. These identifiers enable software to avoid using an MSEG header formatted for one processor on a processor that uses a different format. Software can discover the MSEG revision identifier that a processor uses by reading the VMX capability MSR IA32_VMX_MISC (see Appendix A.6). • Bytes 7:4 contain the SMM-transfer monitor features field. Bits 31:1 of this field are reserved and must be zero. Bit 0 of the field is the IA-32e mode SMM feature bit. It indicates whether the logical processor will be in IA-32e mode after the STM is activated (see Section 34.15.6). • Bytes 31:8 contain fields that determine how processor state is loaded when the STM is activated (see Section 34.15.6.5). SMM code should establish these fields so that activating of the STM invokes the STM’s initialization code. 34.15.6 Activating the Dual-Monitor Treatment The dual-monitor treatment may be enabled by SMM code as described in Section 34.15.5. The dual-monitor treatment is activated only if it is enabled and only by the executive monitor. The executive monitor activates the dualmonitor treatment by executing VMCALL in VMX root operation. When VMCALL activates the dual-monitor treatment, it causes an SMM VM exit. Differences between this SMM VM exit and other SMM VM exits are discussed in Sections 34.15.6.1 through 34.15.6.6. See also “VMCALL—Call to VM Monitor” in Chapter 30. 34.15.6.1 Initial Checks An execution of VMCALL attempts to activate the dual-monitor treatment if (1) the processor supports the dualmonitor treatment;1 (2) the logical processor is in VMX root operation; (3) the logical processor is outside SMM and the valid bit is set in the IA32_SMM_MONITOR_CTL MSR; (4) the logical processor is not in virtual-8086 mode and not in compatibility mode; (5) CPL = 0; and (6) the dual-monitor treatment is not active. Such an execution of VMCALL begins with some initial checks. These checks are performed before updating the current-VMCS pointer and the executive-VMCS pointer field (see Section 34.15.2.2). The VMCS that manages SMM VM exit caused by this VMCALL is the current VMCS established by the executive monitor. The VMCALL performs the following checks on the current VMCS in the order indicated: 1. There must be a current VMCS pointer. 2. The launch state of the current VMCS must be clear. 3. Reserved bits in the VM-exit controls in the current VMCS must be set properly. Software may consult the VMX capability MSR IA32_VMX_EXIT_CTLS to determine the proper settings (see Appendix A.4). If any of these checks fail, subsequent checks are skipped and VMCALL fails. If all these checks succeed, the logical processor uses the IA32_SMM_MONITOR_CTL MSR to determine the base address of MSEG. The following checks are performed in the order indicated: 1. The logical processor reads the 32 bits at the base of MSEG and compares them to the processor’s MSEG revision identifier. 2. The logical processor reads the SMM-transfer monitor features field: — Bit 0 of the field is the IA-32e mode SMM feature bit, and it indicates whether the logical processor will be in IA-32e mode after the SMM-transfer monitor (STM) is activated. • If the VMCALL is executed on a processor that does not support Intel 64 architecture, the IA-32e mode SMM feature bit must be 0. 2. Alternatively, software may map the MSEG header with the UC memory type; this may be necessary, depending on how memory is organized. Doing so is strongly discouraged unless necessary as it will cause the performance of transitions using those structures to suffer significantly. In addition, the processor will continue to use the memory type reported in the VMX capability MSR IA32_VMX_BASIC with exceptions noted in Appendix A.1. 1. Software should consult the VMX capability MSR IA32_VMX_BASIC (see Appendix A.1) to determine whether the dual-monitor treatment is supported. 34-26 Vol. 3C SYSTEM MANAGEMENT MODE • If the VMCALL is executed in 64-bit mode, the IA-32e mode SMM feature bit must be 1. — Bits 31:1 of this field are currently reserved and must be zero. If any of these checks fail, subsequent checks are skipped and the VMCALL fails. 34.15.6.2 Updating the Current-VMCS and Executive-VMCS Pointers Before performing the steps in Section 34.15.2.2, SMM VM exits that activate the dual-monitor treatment begin by loading the SMM-transfer VMCS pointer with the value of the current-VMCS pointer. 34.15.6.3 Saving Guest State As noted in Section 34.15.2.4, SMM VM exits save the contents of the SMBASE register into the corresponding field in the guest-state area. While this is true also for SMM VM exits that activate the dual-monitor treatment, the VMCS used for those VM exits exists outside SMRAM. The SMM-transfer monitor (STM) can also discover the current value of the SMBASE register by using the RDMSR instruction to read the IA32_SMBASE MSR (MSR address 9EH). The following items detail use of this MSR: • • The MSR is supported only if IA32_VMX_MISC[15] = 1 (see Appendix A.6). • A read from the IA32_SMBASE MSR using RDMSR generates a general-protection fault (#GP(0)) if executed outside of SMM. An attempt to read from the IA32_SMBASE MSR fails if made as part of a VM exit that does not end in SMM. A write to the IA32_SMBASE MSR using WRMSR generates a general-protection fault (#GP(0)). An attempt to write to the IA32_SMBASE MSR fails if made as part of a VM exit or part of a VM entry. 34.15.6.4 Saving MSRs The VM-exit MSR-store area is not used by SMM VM exits that activate the dual-monitor treatment. No MSRs are saved into that area. 34.15.6.5 Loading Host State The VMCS that is current during an SMM VM exit that activates the dual-monitor treatment was established by the executive monitor. It does not contain the VM-exit controls and host state required to initialize the STM. For this reason, such SMM VM exits do not load processor state as described in Section 27.5. Instead, state is set to fixed values or loaded based on the content of the MSEG header (see Table 34-10): • CR0 is set to as follows: — PG, NE, ET, MP, and PE are all set to 1. — CD and NW are left unchanged. — All other bits are cleared to 0. • CR3 is set as follows: — Bits 63:32 are cleared on processors that support IA-32e mode. — Bits 31:12 are set to bits 31:12 of the sum of the MSEG base address and the CR3-offset field in the MSEG header. — Bits 11:5 and bits 2:0 are cleared (the corresponding bits in the CR3-offset field in the MSEG header are ignored). — Bits 4:3 are set to bits 4:3 of the CR3-offset field in the MSEG header. • CR4 is set as follows: — MCE, PGE, and PCIDE are cleared. — PAE is set to the value of the IA-32e mode SMM feature bit. Vol. 3C 34-27 SYSTEM MANAGEMENT MODE — If the IA-32e mode SMM feature bit is clear, PSE is set to 1 if supported by the processor; if the bit is set, PSE is cleared. — All other bits are unchanged. • • • DR7 is set to 400H. The IA32_DEBUGCTL MSR is cleared to 00000000_00000000H. The registers CS, SS, DS, ES, FS, and GS are loaded as follows: — All registers are usable. — CS.selector is loaded from the corresponding field in the MSEG header (the high 16 bits are ignored), with bits 2:0 cleared to 0. If the result is 0000H, CS.selector is set to 0008H. — The selectors for SS, DS, ES, FS, and GS are set to CS.selector+0008H. If the result is 0000H (if the CS selector was FFF8H), these selectors are instead set to 0008H. — The base addresses of all registers are cleared to zero. — The segment limits for all registers are set to FFFFFFFFH. — The AR bytes for the registers are set as follows: • • • • • • • • • CS.Type is set to 11 (execute/read, accessed, non-conforming code segment). For SS, DS, ES, FS, and GS, the Type is set to 3 (read/write, accessed, expand-up data segment). The S bits for all registers are set to 1. The DPL for each register is set to 0. The P bits for all registers are set to 1. On processors that support Intel 64 architecture, CS.L is loaded with the value of the IA-32e mode SMM feature bit. CS.D is loaded with the inverse of the value of the IA-32e mode SMM feature bit. For each of SS, DS, ES, FS, and GS, the D/B bit is set to 1. The G bits for all registers are set to 1. • LDTR is unusable. The LDTR selector is cleared to 0000H, and the register is otherwise undefined (although the base address is always canonical) • GDTR.base is set to the sum of the MSEG base address and the GDTR base-offset field in the MSEG header (bits 63:32 are always cleared on processors that support IA-32e mode). GDTR.limit is set to the corresponding field in the MSEG header (the high 16 bits are ignored). • • IDTR.base is unchanged. IDTR.limit is cleared to 0000H. • RSP is set to the sum of the MSEG base address and the value of the RSP-offset field in the MSEG header (bits 63:32 are always cleared on logical processor that supports IA-32e mode). • • • RFLAGS is cleared, except bit 1, which is always set. RIP is set to the sum of the MSEG base address and the value of the RIP-offset field in the MSEG header (bits 63:32 are always cleared on logical processors that support IA-32e mode). The logical processor is left in the active state. Event blocking after the SMM VM exit is as follows: — There is no blocking by STI or by MOV SS. — There is blocking by non-maskable interrupts (NMIs) and by SMIs. • • There are no pending debug exceptions after the SMM VM exit. For processors that support IA-32e mode, the IA32_EFER MSR is modified so that LME and LMA both contain the value of the IA-32e mode SMM feature bit. If any of CR3[63:5], CR4.PAE, CR4.PSE, or IA32_EFER.LMA is changing, the TLBs are updated so that, after VM exit, the logical processor does not use translations that were cached before the transition. This is not neces- 34-28 Vol. 3C SYSTEM MANAGEMENT MODE sary for changes that would not affect paging due to the settings of other bits (for example, changes to CR4.PSE if IA32_EFER.LMA was 1 before and after the transition). 34.15.6.6 Loading MSRs The VM-exit MSR-load area is not used by SMM VM exits that activate the dual-monitor treatment. No MSRs are loaded from that area. 34.15.7 Deactivating the Dual-Monitor Treatment The SMM-transfer monitor may deactivate the dual-monitor treatment and return the processor to default treatment of SMIs and SMM (see Section 34.14). It does this by executing a VM entry with the “deactivate dual-monitor treatment” VM-entry control set to 1. As noted in Section 26.2.1.3 and Section 34.15.4.1, an attempt to deactivate the dual-monitor treatment fails in the following situations: (1) the processor is not in SMM; (2) the “entry to SMM” VM-entry control is 1; or (3) the executive-VMCS pointer does not contain the VMXON pointer (the VM entry is to VMX non-root operation). As noted in Section 34.15.4.9, VM entries that deactivate the dual-monitor treatment ignore the SMI bit in the interruptibility-state field of the guest-state area. Instead, the blocking of SMIs following such a VM entry depends on whether the logical processor is in SMX operation:1 • If the logical processor is in SMX operation, SMIs are blocked after VM entry. SMIs may later be unblocked by the VMXOFF instruction (see Section 34.14.4) or by certain leaf functions of the GETSEC instruction (see Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2D). • If the logical processor is outside SMX operation, SMIs are unblocked after VM entry. 34.16 SMI AND PROCESSOR EXTENDED STATE MANAGEMENT On processors that support processor extended states using XSAVE/XRSTOR (see Chapter 13, “Managing State Using the XSAVE Feature Set” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1), the processor does not save any XSAVE/XRSTOR related state on an SMI. It is the responsibility of the SMI handler code to properly preserve the state information (including CR4.OSXSAVE, XCR0, and possibly processor extended states using XSAVE/XRSTOR). Therefore, the SMI handler must follow the rules described in Chapter 13, “Managing State Using the XSAVE Feature Set” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. 34.17 MODEL-SPECIFIC SYSTEM MANAGEMENT ENHANCEMENT This section describes enhancement of system management features that apply only to the 4th generation Intel Core processors. These features are model-specific. BIOS and SMM handler must use CPUID to enumerate DisplayFamily_DisplayModel signature when programming with these interfaces. 34.17.1 SMM Handler Code Access Control The BIOS may choose to restrict the address ranges of code that SMM handler executes. When SMM handler code execution check is enabled, an attempt by the SMM handler to execute outside the ranges specified by SMRR (see Section 34.4.2.1) will cause the assertion of an unrecoverable machine check exception (MCE). 1. A logical processor is in SMX operation if GETSEC[SEXIT] has not been executed since the last execution of GETSEC[SENTER]. A logical processor is outside SMX operation if GETSEC[SENTER] has not been executed or if GETSEC[SEXIT] was executed after the last execution of GETSEC[SENTER]. See Chapter 6, “Safer Mode Extensions Reference,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. Vol. 3C 34-29 SYSTEM MANAGEMENT MODE The interface to enable SMM handler code access check resides in a per-package scope model-specific register MSR_SMM_FEATURE_CONTROL at address 4E0H. An attempt to access MSR_SMM_FEATURE_CONTROL outside of SMM will cause a #GP. Writes to MSR_SMM_FEATURE_CONTROL is further protected by configuration interface of MSR_SMM_MCA_CAP at address 17DH. Details of the interface of MSR_SMM_FEATURE_CONTROL and MSR_SMM_MCA_CAP are described in Table 2-29 in Chapter 2, “Model-Specific Registers (MSRs)” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. 34.17.2 SMI Delivery Delay Reporting Entry into the system management mode occurs at instruction boundary. In situations where a logical processor is executing an instruction involving a long flow of internal operations, servicing an SMI by that logical processor will be delayed. Delayed servicing of SMI of each logical processor due to executing long flows of internal operation in a physical processor can be queried via a package-scope register MSR_SMM_DELAYED at address 4E2H. The interface to enable reporting of SMI delivery delay due to long internal flows resides in a per-package scope model-specific register MSR_SMM_DELAYED. An attempt to access MSR_SMM_DELAYED outside of SMM will cause a #GP. Availability to MSR_SMM_DELAYED is protected by configuration interface of MSR_SMM_MCA_CAP at address 17DH. Details of the interface of MSR_SMM_DELAYED and MSR_SMM_MCA_CAP are described in Table 2-29 in Chapter 2, “Model-Specific Registers (MSRs)” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. 34.17.3 Blocked SMI Reporting A logical processor may have entered into a state and blocked from servicing other interrupts (including SMI). Logical processors in a physical processor that are blocked in serving SMI can be queried in a package-scope register MSR_SMM_BLOCKED at address 4E3H. An attempt to access MSR_SMM_BLOCKED outside of SMM will cause a #GP. Details of the interface of MSR_SMM_BLOCKED is described in Table 2-29 in Chapter 2, “Model-Specific Registers (MSRs)” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4. 34-30 Vol. 3C 23.Updates to Chapter 35, Volume 3C Change bars show changes to Chapter 35 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------Changes to chapter: Typo corrections across chapter. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 33 INTEL® PROCESSOR TRACE CHAPTER 35 INTEL® PROCESSOR TRACE 35.1 OVERVIEW ® Intel Processor Trace (Intel PT) is an extension of Intel® Architecture that captures information about software execution using dedicated hardware facilities that cause only minimal performance perturbation to the software being traced. This information is collected in data packets. The initial implementations of Intel PT offer control flow tracing, which generates a variety of packets to be processed by a software decoder. The packets include timing, program flow information (e.g. branch targets, branch taken/not taken indications) and program-induced mode related information (e.g. Intel TSX state transitions, CR3 changes). These packets may be buffered internally before being sent to the memory subsystem or other output mechanism available in the platform. Debug software can process the trace data and reconstruct the program flow. Later generations include additional trace sources, including software trace instrumentation using PTWRITE, and Power Event tracing. 35.1.1 Features and Capabilities Intel PT’s control flow trace generates a variety of packets that, when combined with the binaries of a program by a post-processing tool, can be used to produce an exact execution trace. The packets record flow information such as instruction pointers (IP), indirect branch targets, and directions of conditional branches within contiguous code regions (basic blocks). Intel PT can also be configured to log software-generated packets using PTWRITE, and packets describing processor power management events. Further, Precise Event-Based Sampling (PEBS) can be configured to log PEBS records in the Intel PT trace; see Section 18.5.5.2. In addition, the packets record other contextual, timing, and bookkeeping information that enables both functional and performance debugging of applications. Intel PT has several control and filtering capabilities available to customize the tracing information collected and to append other processor state and timing information to enable debugging. For example, there are modes that allow packets to be filtered based on the current privilege level (CPL) or the value of CR3. Configuration of the packet generation and filtering capabilities are programmed via a set of MSRs. The MSRs generally follow the naming convention of IA32_RTIT_*. The capability provided by these configuration MSRs are enumerated by CPUID, see Section 35.3. Details of the MSRs for configuring Intel PT are described in Section 35.2.7. 35.1.1.1 Packet Summary After a tracing tool has enabled and configured the appropriate MSRs, the processor will collect and generate trace information in the following categories of packets (for more details on the packets, see Section 35.4): • Packets about basic information on program execution; these include: — Packet Stream Boundary (PSB) packets: PSB packets act as ‘heartbeats’ that are generated at regular intervals (e.g., every 4K trace packet bytes). These packets allow the packet decoder to find the packet boundaries within the output data stream; a PSB packet should be the first packet that a decoder looks for when beginning to decode a trace. — Paging Information Packet (PIP): PIPs record modifications made to the CR3 register. This information, along with information from the operating system on the CR3 value of each process, allows the debugger to attribute linear addresses to their correct application source. — Time-Stamp Counter (TSC) packets: TSC packets aid in tracking wall-clock time, and contain some portion of the software-visible time-stamp counter. — Core Bus Ratio (CBR) packets: CBR packets contain the core:bus clock ratio. Vol. 3C 35-1 INTEL® PROCESSOR TRACE — Overflow (OVF) packets: OVF packets are sent when the processor experiences an internal buffer overflow, resulting in packets being dropped. This packet notifies the decoder of the loss and can help the decoder to respond to this situation. • Packets about control flow information: — Taken Not-Taken (TNT) packets: TNT packets track the “direction” of direct conditional branches (taken or not taken). — Target IP (TIP) packets: TIP packets record the target IP of indirect branches, exceptions, interrupts, and other branches or events. These packets can contain the IP, although that IP value may be compressed by eliminating upper bytes that match the last IP. There are various types of TIP packets; they are covered in more detail in Section 35.4.2.2. — Flow Update Packets (FUP): FUPs provide the source IP addresses for asynchronous events (interrupt and exceptions), as well as other cases where the source address cannot be determined from the binary. — MODE packets: These packets provide the decoder with important processor execution information so that it can properly interpret the dis-assembled binary and trace log. MODE packets have a variety of formats that indicate details such as the execution mode (16-bit, 32-bit, or 64-bit). • Packets inserted by software: — PTWRITE (PTW) packets: includes the value of the operand passed to the PTWRITE instruction (see “PTWRITE - Write Data to a Processor Trace Packet” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B). • Packets about processor power management events: — MWAIT packets: Indicate successful completion of an MWAIT operation to a C-state deeper than C0.0. — Power State Entry (PWRE) packets: Indicate entry to a C-state deeper than C0.0. — Power State Exit (PWRX) packets: Indicate exit from a C-state deeper than C0.0, returning to C0. — Execution Stopped (EXSTOP) packets: Indicate that software execution has stopped, due to events such as P-state change, C-state change, or thermal throttling. • Packets containing groups of processor state values: — Block Begin Packets (BBP): Indicate the type of state held in the following group. — Block Item Packets (BIP): Indicate the state values held in the group. — Block End Packets (BEP): Indicate the end of the current group. 35.2 INTEL® PROCESSOR TRACE OPERATIONAL MODEL This section describes the overall Intel Processor Trace mechanism and the essential concepts relevant to how it operates. 35.2.1 Change of Flow Instruction (COFI) Tracing A basic program block is a section of code where no jumps or branches occur. The instruction pointers (IPs) in this block of code need not be traced, as the processor will execute them from start to end without redirecting code flow. Instructions such as branches, and events such as exceptions or interrupts, can change the program flow. These instructions and events that change program flow are called Change of Flow Instructions (COFI). There are three categories of COFI: • • • Direct transfer COFI. Indirect transfer COFI. Far transfer COFI. The following subsections describe the COFI events that result in trace packet generation. Table 35-1 lists branch instruction by COFI types. For detailed description of specific instructions, see Intel® 64 and IA-32 Architectures Software Developer’s Manual. 35-2 Vol. 3C INTEL® PROCESSOR TRACE Table 35-1. COFI Type for Branch Instructions COFI Type Instructions Conditional Branch JA, JAE, JB, JBE, JC, JCXZ, JECXZ, JRCXZ, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ, LOOP, LOOPE, LOOPNE, LOOPNZ, LOOPZ Unconditional Direct Branch JMP (E9 xx, EB xx), CALL (E8 xx) Indirect Branch JMP (FF /4), CALL (FF /2) Near Ret RET (C3, C2 xx) Far Transfers INT1, INT3, INT n, INTO, IRET, IRETD, IRETQ, JMP (EA xx, FF /5), CALL (9A xx, FF /3), RET (CB, CA xx), SYSCALL, SYSRET, SYSENTER, SYSEXIT, VMLAUNCH, VMRESUME 35.2.1.1 Direct Transfer COFI Direct Transfer COFI are relative branches. This means that their target is an IP whose offset from the current IP is embedded in the instruction bytes. It is not necessary to indicate target of these instructions in the trace output since it can be obtained through the source disassembly. Conditional branches need to indicate only whether the branch is taken or not. Unconditional branches do not need any recording in the trace output. There are two subcategories: • Conditional Branch (Jcc, J*CXZ) and LOOP To track this type of instruction, the processor encodes a single bit (taken or not taken — TNT) to indicate the program flow after the instruction. Jcc, J*CXZ, and LOOP can be traced with TNT bits. To improve the trace packet output efficiency, the processor will compact several TNT bits into a single packet. • Unconditional Direct Jumps There is no trace output required for direct unconditional jumps (like JMP near relative or CALL near relative) since they can be directly inferred from the application assembly. Direct unconditional jumps do not generate a TNT bit or a Target IP packet, though TIP.PGD and TIP.PGE packets can be generated by unconditional direct jumps that toggle Intel PT enables (see Section 35.2.5). 35.2.1.2 Indirect Transfer COFI Indirect transfer instructions involve updating the IP from a register or memory location. Since the register or memory contents can vary at any time during execution, there is no way to know the target of the indirect transfer until the register or memory contents are read. As a result, the disassembled code is not sufficient to determine the target of this type of COFI. Therefore, tracing hardware must send out the destination IP in the trace packet for debug software to determine the target address of the COFI. Note that this IP may be a linear or effective address (see Section 35.3.1.1). An indirect transfer instruction generates a Target IP Packet (TIP) that contains the target address of the branch. There are two sub-categories: • Near JMP Indirect and Near Call Indirect As previously mentioned, the target of an indirect COFI resides in the contents of either a register or memory location. Therefore, the processor must generate a packet that includes this target address to allow the decoder to determine the program flow. • Near RET When a CALL instruction executes, it pushes onto the stack the address of the next instruction following the CALL. Upon completion of the call procedure, the RET instruction is often used to pop the return address off of the call stack and redirect code flow back to the instruction following the CALL. A RET instruction simply transfers program flow to the address it popped off the stack. Because a called procedure may change the return address on the stack before executing the RET instruction, debug software can be misled if it assumes that code flow will return to the instruction following the last CALL. Therefore, even for near RET, a Target IP Packet may be sent. — RET Compression Vol. 3C 35-3 INTEL® PROCESSOR TRACE A special case is applied if the target of the RET is consistent with what would be expected from tracking the CALL stack. If it is assured that the decoder has seen the corresponding CALL (with “corresponding” defined as the CALL with matching stack depth), and the RET target is the instruction after that CALL, the RET target may be “compressed”. In this case, only a single TNT bit of “taken” is generated instead of a Target IP Packet. To ensure that the decoder will not be confused in cases of RET compression, only RETs that correspond to CALLs which have been seen since the last PSB packet may be compressed in a given logical processor. For details, see “Indirect Transfer Compression for Returns (RET)” in Section 35.4.2.2. 35.2.1.3 Far Transfer COFI All operations that change the instruction pointer and are not near jumps are “far transfers”. This includes exceptions, interrupts, traps, TSX aborts, and instructions that do far transfers. All far transfers will produce a Target IP (TIP) packet, which provides the destination IP address. For those far transfers that cannot be inferred from the binary source (e.g., asynchronous events such as exceptions and interrupts), the TIP will be preceded by a Flow Update packet (FUP), which provides the source IP address at which the event was taken. Table 35-24 indicates exactly which IP will be included in the FUP generated by a far transfer. 35.2.2 Software Trace Instrumentation with PTWRITE PTWRITE provides a mechanism by which software can instrument the Intel PT trace. PTWRITE is a ring3-accessible instruction that can be passed to a register or memory variable, see “PTWRITE - Write Data to a Processor Trace Packet” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B for details. The contents of that variable will be used as the payload for the PTW packet (see Table 35-41 “PTW Packet Definition”), inserted at the time of PTWRITE retirement, assuming PTWRITE is enabled and all other filtering conditions are met. Decode and analysis software will then be able to determine the meaning of the PTWRITE packet based on the IP of the associated PTWRITE instruction. PTWRITE is enabled via IA32_RTIT_CTL.PTWEn[12] (see Table 35-6). Optionally, the user can use IA32_RTIT_CTL.FUPonPTW[5] to enable PTW packets to be followed by FUP packets containing the IP of the associated PTWRITE instruction. Support for PTWRITE is introduced in Intel® Atom™ processors based on the Goldmont Plus microarchitecture. 35.2.3 Power Event Tracing Power Event Trace is a capability that exposes core- and thread-level sleep state and power down transition information. When this capability is enabled, the trace will expose information about: — Scenarios where software execution stops. • • Due to sleep state entry, frequency change, or other powerdown. Includes the IP, when in the tracing context. — The requested and resolved hardware thread C-state. • Including indication of hardware autonomous C-state entry. — The last and deepest core C-state achieved during a sleep session. — The reason for C-state wake. This information is in addition to the bus ratio (CBR) information provided by default after any powerdown, and the timing information (TSC, TMA, MTC, CYC) provided during or after a powerdown state. Power Event Trace is enabled via IA32_RTIT_CTL.PwrEvtEn[4]. Support for Power Event Tracing is introduced in Intel® Atom™ processors based on the Goldmont Plus microarchitecture. 35.2.4 Trace Filtering Intel Processor Trace provides filtering capabilities, by which the debug/profile tool can control what code is traced. 35-4 Vol. 3C INTEL® PROCESSOR TRACE 35.2.4.1 Filtering by Current Privilege Level (CPL) Intel PT provides the ability to configure a logical processor to generate trace packets only when CPL = 0, when CPL > 0, or regardless of CPL. CPL filtering ensures that no IPs or other architectural state information associated with the filtered CPL can be seen in the log. For example, if the processor is configured to trace only when CPL > 0, and software executes SYSCALL (changing the CPL to 0), the destination IP of the SYSCALL will be suppressed from the generated packet (see the discussion of TIP.PGD in Section 35.4.2.5). It should be noted that CPL is always 0 in real-address mode and that CPL is always 3 in virtual-8086 mode. To trace code in these modes, filtering should be configured accordingly. When software is executing in a non-enabled CPL, ContextEn is cleared. See Section 35.2.5.1 for details. 35.2.4.2 Filtering by CR3 Intel PT supports a CR3-filtering mechanism by which the generation of packets containing architectural states can be enabled or disabled based on the value of CR3. A debugger can use CR3 filtering to trace only a single application without context switching the state of the RTIT MSRs. For the reconstruction of traces from software with multiple threads, debug software may wish to context-switch for the state of the RTIT MSRs (if the operating system does not provide context-switch support) to separate the output for the different threads (see Section 35.3.5, “Context Switch Consideration”). To trace for only a single CR3 value, software can write that value to the IA32_RTIT_CR3_MATCH MSR, and set IA32_RTIT_CTL.CR3Filter. When CR3 value does not match IA32_RTIT_CR3_MATCH and IA32_RTIT_CTL.CR3Filter is 1, ContextEn is forced to 0, and packets containing architectural states will not be generated. Some other packets can be generated when ContextEn is 0; see Section 35.2.5.3 for details. When CR3 does match IA32_RTIT_CR3_MATCH (or when IA32_RTIT_CTL.CR3Filter is 0), CR3 filtering does not force ContextEn to 0 (although it could be 0 due to other filters or modes). CR3 matches IA32_RTIT_CR3_MATCH if the two registers are identical for bits 63:12, or 63:5 when in PAE paging mode; the lower 5 bits of CR3 and IA32_RTIT_CR3_MATCH are ignored. CR3 filtering is independent of the value of CR0.PG. When CR3 filtering is in use, PIP packets may still be seen in the log if the processor is configured to trace when CPL = 0 (IA32_RTIT_CTL.OS = 1). If not, no PIP packets will be seen. 35.2.4.3 Filtering by IP Trace packet generation with configurable filtering by IP is supported if CPUID.(EAX=14H, ECX=0):EBX[bit 2] = 1. Intel PT can be configured to enable the generation of packets containing architectural states only when the processor is executing code within certain IP ranges. If the IP is outside of these ranges, generation of some packets is blocked. IP filtering is enabled using the ADDRn_CFG fields in the IA32_RTIT_CTL MSR (Section 35.2.7.2), where the digit 'n' is a zero-based number that selects which address range is being configured. Each ADDRn_CFG field configures the use of the register pair IA32_RTIT_ADDRn_A and IA32_RTIT_ADDRn_B (Section 35.2.7.5). IA32_RTIT_ADDRn_A defines the base and IA32_RTIT_ADDRn_B specifies the limit of the range in which tracing is enabled. Thus each range, referred to as the ADDRn range, is defined by [IA32_RTIT_ADDRn_A, IA32_RTIT_ADDRn_B]. There can be multiple such ranges, software can query CPUID (Section 35.3.1) for the number of ranges supported on a processor. Default behavior (ADDRn_CFG=0) defines no IP filter range, meaning FilterEn is always set. In this case code at any IP can be traced, though other filters, such as CR3 or CPL, could limit tracing. When ADDRn_CFG is set to enable IP filtering (see Section 35.3.1), tracing will commence when a taken branch or event is seen whose target address is in the ADDRn range. While inside a tracing region and with FilterEn is set, leaving the tracing region may only be detected once a taken branch or event with a target outside the range is retired. If an ADDRn range is entered or exited by executing the next sequential instruction, rather than by a control flow transfer, FilterEn may not toggle immediately. See Section 35.2.5.5 for more details on FilterEn. Note that these address range base and limit values are inclusive, such that the range includes the first and last instruction whose first instruction byte is in the ADDRn range. Vol. 3C 35-5 INTEL® PROCESSOR TRACE Depending upon processor implementation, IP filtering may be based on linear or effective address. This can cause different behavior between implementations if CSbase is not equal to zero or in real mode. See Section 35.3.1.1 for details. Software can query CPUID to determine filters are based on linear or effective address (Section 35.3.1). Note that some packets, such as MTC (Section 35.3.7) and other timing packets, do not depend on FilterEn. For details on which packets depend on FilterEn, and hence are impacted by IP filtering, see Section 35.4.1. TraceStop The ADDRn ranges can also be configured to cause tracing to be disabled upon entry to the specified region. This is intended for cases where unexpected code is executed, and the user wishes to immediately stop generating packets in order to avoid overwriting previously written packets. The TraceStop mechanism works much the same way that IP filtering does, and uses the same address comparison logic. The TraceStop region base and limit values are programmed into one or more ADDRn ranges, but IA32_RTIT_CTL.ADDRn_CFG is configured with the TraceStop encoding. Like FilterEn, TraceStop is detected when a taken branch or event lands in a TraceStop region. Further, TraceStop requires that TriggerEn=1 at the beginning of the branch/event, and ContextEn=1 upon completion of the branch/event. When this happens, the CPU will set IA32_RTIT_STATUS.Stopped, thereby clearing TriggerEn and hence disabling packet generation. This may generate a TIP.PGD packet with the target IP of the branch or event that entered the TraceStop region. Finally, a TraceStop packet will be inserted, to indicate that the condition was hit. If a TraceStop condition is encountered during buffer overflow (Section 35.3.8), it will not be dropped, but will instead be signaled once the overflow has resolved. Note that a TraceStop event does not guarantee that all internally buffered packets are flushed out of internal buffers. To ensure that this has occurred, the user should clear TraceEn. To resume tracing after a TraceStop event, the user must first disable Intel PT by clearing IA32_RTIT_CTL.TraceEn before the IA32_RTIT_STATUS.Stopped bit can be cleared. At this point Intel PT can be reconfigured, and tracing resumed. Note that the IA32_RTIT_STATUS.Stopped bit can also be set using the ToPA STOP bit. See Section 35.2.6.2. IP Filtering Example The following table gives an example of IP filtering behavior. Assume that IA32_RTIT_ADDRn_A = the IP of RangeBase, and that IA32_RTIT_ADDRn_B = the IP of RangeLimit, while IA32_RTIT_CTL.ADDRn_CFG = 0x1 (enable ADDRn range as a FilterEn range). Table 35-2. IP Filtering Packet Example Code Flow Bar: jmp RangeBase // jump into filter range RangeBase: jcc Foo // not taken add eax, 1 Foo: jmp RangeLimit+1 // jump out of filter range RangeLimit: nop jcc Bar Packets TIP.PGE(RangeBase) TNT(0) TIP.PGD(RangeLimit+1) IP Filtering and TraceStop It is possible for the user to configure IP filter range(s) and TraceStop range(s) that overlap. In this case, code executing in the non-overlapping portion of either range will behave as would be expected from that range. Code executing in the overlapping range will get TraceStop behavior. 35-6 Vol. 3C INTEL® PROCESSOR TRACE 35.2.5 Packet Generation Enable Controls Intel Processor Trace includes a variety of controls that determine whether a packet is generated. In general, most packets are sent only if Packet Enable (PacketEn) is set. PacketEn is an internal state maintained in hardware in response to software configurable enable controls, PacketEn is not visible to software directly. The relationship of PacketEn to the software-visible controls in the configuration MSRs is described in this section. 35.2.5.1 Packet Enable (PacketEn) When PacketEn is set, the processor is in the mode that Intel PT is monitoring and all packets can be generated to log what is being executed. PacketEn is composed of other states according to this relationship: PacketEn TriggerEn AND ContextEn AND FilterEn AND BranchEn These constituent controls are detailed in the following subsections. PacketEn ultimately determines when the processor is tracing. When PacketEn is set, all control flow packets are enabled. When PacketEn is clear, no control flow packets are generated, though other packets (timing and bookkeeping packets) may still be sent. See Section 35.2.6 for details of PacketEn and packet generation. Note that, on processors that do not support IP filtering (i.e., CPUID.(EAX=14H, ECX=0):EBX.IPFILT_WRSTPRSV[bit 2] = 0), FilterEn is treated as always set. 35.2.5.2 Trigger Enable (TriggerEn) Trigger Enable (TriggerEn) is the primary indicator that trace packet generation is active. TriggerEn is set when IA32_RTIT_CTL.TraceEn is set, and cleared by any of the following conditions: • • • TraceEn is cleared by software. A TraceStop condition is encountered and IA32_RTIT_STATUS.Stopped is set. IA32_RTIT_STATUS.Error is set due to an operational error (see Section 35.3.9). Software can discover the current TriggerEn value by reading the IA32_RTIT_STATUS.TriggerEn bit. When TriggerEn is clear, tracing is inactive and no packets are generated. 35.2.5.3 Context Enable (ContextEn) Context Enable (ContextEn) indicates whether the processor is in the state or mode that software configured hardware to trace. For example, if execution with CPL = 0 code is not being traced (IA32_RTIT_CTL.OS = 0), then ContextEn will be 0 when the processor is in CPL0. Software can discover the current ContextEn value by reading the IA32_RTIT_STATUS.ContextEn bit. ContextEn is defined as follows: ContextEn = !((IA32_RTIT_CTL.OS = 0 AND CPL = 0) OR (IA32_RTIT_CTL.USER = 0 AND CPL > 0) OR (IS_IN_A_PRODUCTION_ENCLAVE1) OR (IA32_RTIT_CTL.CR3Filter = 1 AND IA32_RTIT_CR3_MATCH does not match CR3) If the clearing of ContextEn causes PacketEn to be cleared, a Packet Generation Disable (TIP.PGD) packet is generated, but its IP payload is suppressed. If the setting of ContextEn causes PacketEn to be set, a Packet Generation Enable (TIP.PGE) packet is generated. When ContextEn is 0, control flow packets (TNT, FUP, TIP.*, MODE.*) are not generated, and no Linear Instruction Pointers (LIPs) are exposed. However, some packets, such as MTC and PSB (see Section 35.4.2.16 and Section 35.4.2.17), may still be generated while ContextEn is 0. For details of which packets are generated only when ContextEn is set, see Section 35.4.1. The processor does not update ContextEn when TriggerEn = 0. The value of ContextEn will toggle only when TriggerEn = 1. 1. Trace packets generation is disabled in a production enclave, see Section 35.2.8.5. See Intel® Software Guard Extensions Programming Reference about differences between a production enclave and a debug enclave. Vol. 3C 35-7 INTEL® PROCESSOR TRACE 35.2.5.4 Branch Enable (BranchEn) This value is based purely on the IA32_RTIT_CTL.BranchEn value. If BranchEn is not set, then relevant COFI packets (TNT, TIP*, FUP, MODE.*) are suppressed. Other packets related to timing (TSC, TMA, MTC, CYC), as well as PSB, will be generated normally regardless. Further, PIP and VMCS continue to be generated, as indicators of what software is running. 35.2.5.5 Filter Enable (FilterEn) Filter Enable indicates that the Instruction Pointer (IP) is within the range of IPs that Intel PT is configured to watch. Software can get the state of Filter Enable by a RDMSR of IA32_RTIT_STATUS.FilterEn. For details on configuration and use of IP filtering, see Section 35.2.4.3. On clearing of FilterEn that also clears PacketEn, a Packet Generation Disable (TIP.PGD) will be generated, but unlike the ContextEn case, the IP payload may not be suppressed. For direct, unconditional branches, as well as for indirect branches (including RETs), the PGD generated by leaving the tracing region and clearing FilterEn will contain the target IP. This means that IPs from outside the configured range can be exposed in the trace, as long as they are within context. When FilterEn is 0, control flow packets are not generated (e.g., TNT, TIP). However, some packets, such as PIP, MTC, and PSB, may still be generated while FilterEn is clear. For details on packet enable dependencies, see Section 35.4.1. After TraceEn is set, FilterEn is set to 1 at all times if there is no IP filter range configured by software (IA32_RTIT_CTL.ADDRn_CFG != 1, for all n), or if the processor does not support IP filtering (i.e., CPUID.(EAX=14H, ECX=0):EBX.IPFILT_WRSTPRSV[bit 2] = 0). FilterEn will toggle only when TraceEn=1 and ContextEn=1, and when at least one range is configured for IP filtering. 35.2.6 Trace Output Intel PT output should be viewed independently from trace content and filtering mechanisms. The options available for trace output can vary across processor generations and platforms. Trace output is written out using one of the following output schemes, as configured by the ToPA and FabricEn bit fields of IA32_RTIT_CTL (see Section 35.2.7.2): • • • A single, contiguous region of physical address space. A collection of variable-sized regions of physical memory. These regions are linked together by tables of pointers to those regions, referred to as Table of Physical Addresses (ToPA). The trace output stores bypass the caches and the TLBs, but are not serializing. This is intended to minimize the performance impact of the output. A platform-specific trace transport subsystem. Regardless of the output scheme chosen, Intel PT stores bypass the processor caches by default. This ensures that they don't consume precious cache space, but they do not have the serializing aspects associated with un-cacheable (UC) stores. Software should avoid using MTRRs to mark any portion of the Intel PT output region as UC, as this may override the behavior described above and force Intel PT stores to UC, thereby incurring severe performance impact. There is no guarantee that a packet will be written to memory or other trace endpoint after some fixed number of cycles after a packet-producing instruction executes. The only way to assure that all packets generated have reached their endpoint is to clear TraceEn and follow that with a store, fence, or serializing instruction; doing so ensures that all buffered packets are flushed out of the processor. 35.2.6.1 Single Range Output When IA32_RTIT_CTL.ToPA and IA32_RTIT_CTL.FabricEn bits are clear, trace packet output is sent to a single, contiguous memory (or MMIO if DRAM is not available) range defined by a base address in IA32_RTIT_OUTPUT_BASE (Section 35.2.7.7) and mask value in IA32_RTIT_OUTPUT_MASK_PTRS (Section 35.2.7.8). The current write pointer in this range is also stored in IA32_RTIT_OUTPUT_MASK_PTRS. This output range is circular, meaning that when the writes wrap around the end of the buffer they begin again at the base address. 35-8 Vol. 3C INTEL® PROCESSOR TRACE This output method is best suited for cases where Intel PT output is either: • • Configured to be directed to a sufficiently large contiguous region of DRAM. Configured to go to an MMIO debug port, in order to route Intel PT output to a platform-specific trace endpoint (e.g., JTAG). In this scenario, a specific range of addresses is written in a circular manner, and SoC will intercept these writes and direct them to the proper device. Repeated writes to the same address do not overwrite each other, but are accumulated by the debugger, and hence no data is lost by the circular nature of the buffer. The processor will determine the address to which to write the next trace packet output byte as follows: OutputBase[63:0] IA32_RTIT_OUTPUT_BASE[63:0] OutputMask[63:0] ZeroExtend64(IA32_RTIT_OUTPUT_MASK_PTRS[31:0]) OutputOffset[63:0] ZeroExtend64(IA32_RTIT_OUTPUT_MASK_PTRS[63:32]) trace_store_phys_addr (OutputBase & ~OutputMask) + (OutputOffset & OutputMask) Single-Range Output Errors If the output base and mask are not properly configured by software, an operational error (see Section 35.3.9) will be signaled, and tracing disabled. Error scenarios with single-range output are: • Mask value is non-contiguous. IA32_RTIT_OUTPUT_MASK_PTRS.MaskOrTablePointer value has a 0 in a less significant bit position than the most significant bit containing a 1. • Base address and Mask are mis-aligned, and have overlapping bits set. IA32_RTIT_OUTPUT_BASE && IA32_RTIT_OUTPUT_MASK_PTRS[31:0] > 0. • Illegal Output Offset IA32_RTIT_OUTPUT_MASK_PTRS.OutputOffset is greater than the mask value IA32_RTIT_OUTPUT_MASK_PTRS[31:0]. Also note that errors can be signaled due to trace packet output overlapping with restricted memory, see Section 35.2.6.4. 35.2.6.2 Table of Physical Addresses (ToPA) When IA32_RTIT_CTL.ToPA is set and IA32_RTIT_CTL.FabricEn is clear, the ToPA output mechanism is utilized. The ToPA mechanism uses a linked list of tables; see Figure 35-1 for an illustrative example. Each entry in the table contains some attribute bits, a pointer to an output region, and the size of the region. The last entry in the table may hold a pointer to the next table. This pointer can either point to the top of the current table (for circular array) or to the base of another table. The table size is not fixed, since the link to the next table can exist at any entry. The processor treats the various output regions referenced by the ToPA table(s) as a unified buffer. This means that a single packet may span the boundary between one output region and the next. The ToPA mechanism is controlled by three values maintained by the processor: • proc_trace_table_base. This is the physical address of the base of the current ToPA table. When tracing is enabled, the processor loads this value from the IA32_RTIT_OUTPUT_BASE MSR. While tracing is enabled, the processor updates the IA32_RTIT_OUTPUT_BASE MSR with changes to proc_trace_table_base, but these updates may not be synchronous to software execution. When tracing is disabled, the processor ensures that the MSR contains the latest value of proc_trace_table_base. • proc_trace_table_offset. This indicates the entry of the current table that is currently in use. (This entry contains the address of the current output region.) When tracing is enabled, the processor loads the value from bits 31:7 (MaskOrTableOffset) of the IA32_RTIT_OUTPUT_MASK_PTRS into bits 27:3 of proc_trace_table_offset. While tracing is enabled, the processor updates IA32_RTIT_OUTPUT_MASK_PTRS.MaskOrTableOffset with changes to proc_trace_table_offset, but these updates may not be synchronous to software execution. When tracing is disabled, the processor ensures that the MSR contains the latest value of proc_trace_table_offset. Vol. 3C 35-9 INTEL® PROCESSOR TRACE • proc_trace_output_offset. This a pointer into the current output region and indicates the location of the next write. When tracing is enabled, the processor loads this value from bits 63:32 (OutputOffset) of the IA32_RTIT_OUTPUT_MASK_PTRS. While tracing is enabled, the processor updates IA32_RTIT_OUTPUT_MASK_PTRS.OutputOffset with changes to proc_trace_output_offset, but these updates may not be synchronous to software execution. When tracing is disabled, the processor ensures that the MSR contains the latest value of proc_trace_output_offset. Figure 35-1 provides an illustration (not to scale) of the table and associated pointers. 0FF_FFFF _FFFFH Physical Memory proc_trace_output_offset: IA32_RTIT_OUTPUT_MASK_PTRS.OutputOffset OutputRegionX END=1 4K TableBaseB OutputBaseY 64K OutputBaseX proc_trace_table_offset: IA32_RTIT_OUTPUT_MASK_PRS.MaskOrTableOffset<<3 ToPA Table A proc_trace_table_base: IA32_RTIT_OUTPUT_BASE OutputRegionY STOP=1 ToPA Table B 0 Figure 35-1. ToPA Memory Illustration With the ToPA mechanism, the processor writes packets to the current output region (identified by proc_trace_table_base and the proc_trace_table_offset). The offset within that region to which the next byte will be written is identified by proc_trace_output_offset. When that region is filled with packet output (thus proc_trace_output_offset = RegionSize–1), proc_trace_table_offset is moved to the next ToPA entry, proc_trace_output_offset is set to 0, and packet writes begin filling the new output region specified by proc_trace_table_offset. As packets are written out, each store derives its physical address as follows: trace_store_phys_addr Base address from current ToPA table entry + proc_trace_output_offset Eventually, the regions represented by all entries in the table may become full, and the final entry of the table is reached. An entry can be identified as the final entry because it has either the END or STOP attribute. The END attribute indicates that the address in the entry does not point to another output region, but rather to another ToPA table. The STOP attribute indicates that tracing will be disabled once the corresponding region is filled. See Table 35-3 and the section that follows for details on STOP. When an END entry is reached, the processor loads proc_trace_table_base with the base address held in this END entry, thereby moving the current table pointer to this new table. The proc_trace_table_offset is reset to 0, as is the proc_trace_output_offset, and packet writes will resume at the base address indicated in the first entry. If the table has no STOP or END entry, and trace-packet generation remains enabled, eventually the maximum table size will be reached (proc_trace_table_offset = 0FFFFFF8H). In this case, the proc_trace_table_offset and 35-10 Vol. 3C INTEL® PROCESSOR TRACE proc_trace_output_offset are reset to 0 (wrapping back to the beginning of the current table) once the last output region is filled. It is important to note that processor updates to the IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS MSRs are asynchronous to instruction execution. Thus, reads of these MSRs while Intel PT is enabled may return stale values. Like all IA32_RTIT_* MSRs, the values of these MSRs should not be trusted or saved unless trace packet generation is first disabled by clearing IA32_RTIT_CTL.TraceEn. This ensures that the output MSR values account for all packets generated to that point, after which the processor will cease updating the output MSR values until tracing resumes. 1 The processor may cache internally any number of entries from the current table or from tables that it references (directly or indirectly). If tracing is enabled, the processor may ignore or delay detection of modifications to these tables. To ensure that table changes are detected by the processor in a predictable manner, software should clear TraceEn before modifying the current table (or tables that it references) and only then re-enable packet generation. Single Output Region ToPA Implementation The first processor generation to implement Intel PT supports only ToPA configurations with a single ToPA entry followed by an END entry that points back to the first entry (creating one circular output buffer). Such processors enumerate CPUID.(EAX=14H,ECX=0):ECX.MENTRY[bit 1] = 0 and CPUID.(EAX=14H,ECX=0):ECX.TOPAOUT[bit 0] = 1. If CPUID.(EAX=14H,ECX=0):ECX.MENTRY[bit 1] = 0, ToPA tables can hold only one output entry, which must be followed by an END=1 entry which points back to the base of the table. Hence only one contiguous block can be used as output. The lone output entry can have INT or STOP set, but nonetheless must be followed by an END entry as described above. Note that, if INT=1, the PMI will actually be delivered before the region is filled. ToPA Table Entry Format The format of ToPA table entries is shown in Figure 35-2. The size of the address field is determined by the processor’s physical-address width (MAXPHYADDR) in bits, as reported in CPUID.80000008H:EAX[7:0]. 63 12 11 10 9 MAXPHYADDR–1 6 5 4 3 2 1 0 Output Region Base Physical Address 9:6 Size 4 : STOP 2 : INT 0 : END Reserved Figure 35-2. Layout of ToPA Table Entry Table 35-3 describes the details of the ToPA table entry fields. If reserved bits are set to 1, an error is signaled. Table 35-3. ToPA Table Entry Fields ToPA Entry Field Output Region Base Physical Address Description If END=0, this is the base physical address of the output region specified by this entry. Note that all regions must be aligned based on their size. Thus a 2M region must have bits 20:12 clear. If the region is not properly aligned, an operational error will be signaled when the entry is reached. If END=1, this is the 4K-aligned base physical address of the next ToPA table (which may be the base of the current table, or the first table in the linked list if a circular buffer is desired). If the processor supports only a single ToPA output region (see above), this address must be the value currently in the IA32_RTIT_OUTPUT_BASE MSR. 1. Although WRMSR is a serializing instruction, the execution of WRMSR that forces packet writes by clearing TraceEn does not itself cause these writes to be globally observed. Vol. 3C 35-11 INTEL® PROCESSOR TRACE Table 35-3. ToPA Table Entry Fields (Contd.) ToPA Entry Field Description Size Indicates the size of the associated output region. Encodings are: 0: 4K, 1: 8K, 2: 16K, 3: 32K, 4: 64K, 5: 128K, 6: 256K, 8: 1M, 9: 2M, 10: 4M, 11: 8M, 12: 16M, 13: 32M, 14: 64M, This field is ignored if END=1. STOP When the output region indicated by this entry is filled, software should disable packet generation. This will be accomplished by setting IA32_RTIT_STATUS.Stopped, which clears TriggerEn. This bit must be 0 if END=1; otherwise it is treated as reserved bit violation (see ToPA Errors). INT When the output region indicated by this entry is filled, signal Perfmon LVT interrupt. Note that if both INT and STOP are set in the same entry, the STOP will happen before the INT. Thus the interrupt handler should expect that the IA32_RTIT_STATUS.Stopped bit will be set, and will need to be reset before tracing can be resumed. This bit must be 0 if END=1; otherwise it is treated as reserved bit violation (see ToPA Errors). END If set, indicates that this is an END entry, and thus the address field points to a table base rather than an output region base. If END=1, INT and STOP must be set to 0; otherwise it is treated as reserved bit violation (see ToPA Errors). The Size field is ignored in this case. If the processor supports only a single ToPA output region (see above), END must be set in the second table entry. 7: 512K, 15: 128M ToPA STOP Each ToPA entry has a STOP bit. If this bit is set, the processor will set the IA32_RTIT_STATUS.Stopped bit when the corresponding trace output region is filled. This will clear TriggerEn and thereby cease packet generation. See Section 35.2.7.4 for details on IA32_RTIT_STATUS.Stopped. This sequence is known as “ToPA Stop”. No TIP.PGD packet will be seen in the output when the ToPA stop occurs, since the disable happens only when the region is already full. When this occurs, output ceases after the last byte of the region is filled, which may mean that a packet is cut off in the middle. Any packets remaining in internal buffers are lost and cannot be recovered. When ToPA stop occurs, the IA32_RTIT_OUTPUT_BASE MSR will hold the base address of the table whose entry had STOP=1. IA32_RTIT_OUTPUT_MASK_PTRS.MaskOrTableOffset will hold the index value for that entry, and the IA32_RTIT_OUTPUT_MASK_PTRS.OutputOffset should be set to the size of the region. Note that this means the offset pointer is pointing to the next byte after the end of the region, a configuration that would produce an operational error if the configuration remained when tracing is re-enabled with IA32_RTIT_STATUS.Stopped cleared. ToPA PMI Each ToPA entry has an INT bit. If this bit is set, the processor will signal a performance-monitoring interrupt (PMI) when the corresponding trace output region is filled. This interrupt is not precise, and it is thus likely that writes to the next region will occur by the time the interrupt is taken. The following steps should be taken to configure this interrupt: 1. Enable PMI via the LVT Performance Monitor register (at MMIO offset 340H in xAPIC mode; via MSR 834H in x2APIC mode). See Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B for more details on this register. For ToPA PMI, set all fields to 0, save for the interrupt vector, which can be selected by software. 2. Set up an interrupt handler to service the interrupt vector that a ToPA PMI can raise. 3. Set the interrupt flag by executing STI. 4. Set the INT bit in the ToPA entry of interest and enable packet generation, using the ToPA output option. Thus, TraceEn=ToPA=1 in the IA32_RTIT_CTL MSR. Once the INT region has been filled with packet output data, the interrupt will be signaled. This PMI can be distinguished from others by checking bit 55 (Trace_ToPA_PMI) of the IA32_PERF_GLOBAL_STATUS MSR (MSR 38EH). Once the ToPA PMI handler has serviced the relevant buffer, writing 1 to bit 55 of the MSR at 390H (IA32_GLOBAL_STATUS_RESET) clears IA32_PERF_GLOBAL_STATUS.Trace_ToPA_PMI. 35-12 Vol. 3C INTEL® PROCESSOR TRACE Intel PT is not frozen on PMI, and thus the interrupt handler will be traced (though filtering can prevent this). The Freeze_Perfmon_on_PMI and Freeze_LBRs_on_PMI settings in IA32_DEBUGCTL will be applied on ToPA PMI just as on other PMIs, and hence Perfmon counters are frozen. Assuming the PMI handler wishes to read any buffered packets for persistent output, or wishes to modify any Intel PT MSRs, software should first disable packet generation by clearing TraceEn. This ensures that all buffered packets are written to memory and avoids tracing of the PMI handler. The configuration MSRs can then be used to determine where tracing has stopped. If packet generation is disabled by the handler, it should then be manually re-enabled before the IRET if continued tracing is desired. In rare cases, it may be possible to trigger a second ToPA PMI before the first is handled. This can happen if another ToPA region with INT=1 is filled before, or shortly after, the first PMI is taken, perhaps due to EFLAGS.IF being cleared for an extended period of time. This can manifest in two ways: either the second PMI is triggered before the first is taken, and hence only one PMI is taken, or the second is triggered after the first is taken, and thus will be taken when the handler for the first completes. Software can minimize the likelihood of the second case by clearing TraceEn at the beginning of the PMI handler. Further, it can detect such cases by then checking the Interrupt Request Register (IRR) for PMI pending, and checking the ToPA table base and off-set pointers (in IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS) to see if multiple entries with INT=1 have been filled. When IA32_RTIT_CTL.InjectPsbPmiOnEnable[56] = 1, the PMI handler should take the following actions: 1. Ignore ToPA PMIs that are taken when TraceEn = 0, because the Intel PT MSR state may have already been saved by XSAVES, and because the PMI will be re-injected when Intel PT is re-enabled. 2. Clear the new IA32_RTIT_STATUS.PendTopaPMI[7] bit once the PMI has been handled. This bit should not be cleared in cases where a PMI is ignored due to TraceEn = 0. ToPA PMI and Single Output Region ToPA Implementation A processor that supports only a single ToPA output region implementation (such that only one output region is supported; see above) will attempt to signal a ToPA PMI interrupt before the output wraps and overwrites the top of the buffer. To support this functionality, the PMI handler should disable packet generation as soon as possible. Due to PMI skid, it is possible that, in rare cases, the wrap will have occurred before the PMI is delivered. Software can avoid this by setting the STOP bit in the ToPA entry (see Table 35-3); this will disable tracing once the region is filled, and no wrap will occur. This approach has the downside of disabling packet generation so that some of the instructions that led up to the PMI will not be traced. If the PMI skid is significant enough to cause the region to fill and tracing to be disabled, the PMI handler will need to clear the IA32_RTIT_STATUS.Stopped indication before tracing can resume. ToPA PMI and XSAVES/XRSTORS State Handling In some cases the ToPA PMI may be taken after completion of an XSAVES instruction that switches Intel PT state, and in such cases any modification of Intel PT MSRs within the PMI handler will not persist when the saved Intel PT context is later restored with XRSTORS. To account for such a scenario, it is recommended that the Intel PT output configuration be modified by altering the ToPA tables themselves, rather than the Intel PT output MSRs. On processors that support PMI preservation (CPUID.(EAX=14H, ECX=0):EBX.INJECTPSBPMI[6] = 1), setting IA32_RTIT_CTL.InjectPsbPmiOnEnable[56] = 1 will ensure that a PMI that is pending at the time PT is disabled will be recorded by setting IA32_RTIT_STATUS.PendTopaPMI[7] = 1. A PMI will then be pended when the saved PT context is later restored. Table 35-4 depicts a recommended PMI handler algorithm for managing multi-region ToPA output and handling ToPA PMIs that may arrive between XSAVES and XRSTORS, if PMI preservation is not in use. This algorithm is flexible to allow software to choose between adding entries to the current ToPA table, adding a new ToPA table, or using the current ToPA table as a circular buffer. It assumes that the ToPA entry that triggers the PMI is not the last entry in the table, which is the recommended treatment. Vol. 3C 35-13 INTEL® PROCESSOR TRACE Table 35-4. Algorithm to Manage Intel PT ToPA PMI and XSAVES/XRSTORS Pseudo Code Flow IF (IA32_PERF_GLOBAL_STATUS.ToPA) Save IA32_RTIT_CTL value; IF ( IA32_RTIT_CTL.TraceEN ) Disable Intel PT by clearing TraceEn; FI; IF ( there is space available to grow the current ToPA table ) Add one or more ToPA entries after the last entry in the ToPA table; Point new ToPA entry address field(s) to new output region base(s); ELSE Modify an upcoming ToPA entry in the current table to have END=1; IF (output should transition to a new ToPA table ) Point the address of the “END=1” entry of the current table to the new table base; ELSE /* Continue to use the current ToPA table, make a circular. */ Point the address of the “END=1”l entry to the base of the current table; Modify the ToPA entry address fields for filled output regions to point to new, unused output regions; /* Filled regions are those with index in the range of 0 to (IA32_RTIT_MASK_PTRS.MaskOrTableOffset -1). */ FI; FI; Restore saved IA32_RTIT_CTL.value; FI; ToPA Errors When a malformed ToPA entry is found, an operational error results (see Section 35.3.9). A malformed entry can be any of the following: 1. ToPA entry reserved bit violation. This describes cases where a bit marked as reserved in Section 35.2.6.2 above is set to 1. 2. ToPA alignment violation. This includes cases where illegal ToPA entry base address bits are set to 1: a. ToPA table base address is not 4KB-aligned. The table base can be from a WRMSR to IA32_RTIT_OUTPUT_BASE, or from a ToPA entry with END=1. b. ToPA entry base address is not aligned to the ToPA entry size (e.g., a 2MB region with base address[20:12] not equal to 0), for ToPA entries with END=0. c. ToPA entry base address sets upper physical address bits not supported by the processor. 3. Illegal ToPA Output Offset. IA32_RTIT_OUTPUT_MASK_PTRS.OutputOffset is greater than or equal to the size of the current ToPA output region size. 4. ToPA rules violations. These are similar to ToPA entry reserved bit violations; they are cases when a ToPA entry is encountered with illegal field combinations. They include the following: a. Setting the STOP or INT bit on an entry with END=1. b. Setting the END bit in entry 0 of a ToPA table. c. On processors that support only a single ToPA entry (see above), two additional illegal settings apply: i) ToPA table entry 1 with END=0. ii) ToPA table entry 1 with base address not matching the table base. 35-14 Vol. 3C INTEL® PROCESSOR TRACE In all cases, the error will be logged by setting IA32_RTIT_STATUS.Error, thereby disabling tracing when the problematic ToPA entry is reached (when proc_trace_table_offset points to the entry containing the error). Any packet bytes that are internally buffered when the error is detected may be lost. Note that operational errors may also be signaled due to attempts to access restricted memory. See Section 35.2.6.4 for details. A tracing software have a range of flexibility using ToPA to manage the interaction of Intel PT with application buffers, see Section 35.4.2.26. 35.2.6.3 Trace Transport Subsystem When IA32_RTIT_CTL.FabricEn is set, the IA32_RTIT_CTL.ToPA bit is ignored, and trace output is written to the trace transport subsystem. The endpoints of this transport are platform-specific, and details of configuration options should refer to the specific platform documentation. The FabricEn bit is available to be set if CPUID(EAX=14H,ECX=0):EBX[bit 3] = 1. 35.2.6.4 Restricted Memory Access Packet output cannot be directed to any regions of memory that are restricted by the platform. In particular, all memory accesses on behalf of packet output are checked against the SMRR regions. If there is any overlap with these regions, trace data collection will not function properly. Exact processor behavior is implementation-dependent; Table 35-5 summarizes several scenarios. Table 35-5. Behavior on Restricted Memory Access Scenario Description ToPA output region overlaps with SMRR Stores to the restricted memory region will be dropped, and that packet data will be lost. Any attempt to read from that restricted region will return all 1s. The processor also may signal an error (Section 35.3.9) and disable tracing when the output pointer reaches the restricted region. If packet generation remains enabled, then packet output may continue once stores are no longer directed to restricted memory (on wrap, or if the output region is larger than the restricted memory region). ToPA table overlaps with SMRR The processor will signal an error (Section 35.3.9) and disable tracing when the ToPA write pointer (IA32_RTIT_OUTPUT_BASE + proc_trace_table_offset) enters the restricted region. It should also be noted that packet output should not be routed to the 4KB APIC MMIO region, as defined by the IA32_APIC_BASE MSR. For details about the APIC, refer to Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. No error is signaled for this case. Modifications to Restricted Memory Regions It is recommended that software disable packet generation before modifying the SMRRs to change the scope of the SMRR regions. This is because the processor reserves the right to cache any number of ToPA table entries internally, after checking them against restricted memory ranges. Once cached, the entries will not be checked again, meaning one could potentially route packet output to a newly restricted region. Software can ensure that any cached entries are written to memory by clearing IA32_RTIT_CTL.TraceEn. 35.2.7 Enabling and Configuration MSRs 35.2.7.1 General Considerations Trace packet generation is enabled and configured by a collection of model-specific registers (MSRs), which are detailed below. Some notes on the configuration MSR behavior: • If Intel Processor Trace is not supported by the processor (see Section 35.3.1), RDMSR or WRMSR of the IA32_RTIT_* MSRs will cause #GP. • A WRMSR to any of these configuration MSRs that begins and ends with IA32_RTIT_CTL.TraceEn set will #GP fault. Packet generation must be disabled before the configuration MSRs can be changed. Vol. 3C 35-15 INTEL® PROCESSOR TRACE Note: Software may write the same value back to IA32_RTIT_CTL without #GP, even if TraceEn=1. • • All configuration MSRs for Intel PT are duplicated per logical processor • All configuration MSRs for Intel PT are cleared on a cold RESET. For each configuration MSR, any MSR write that attempts to change bits marked reserved, or utilize encodings marked reserved, will cause a #GP fault. — If CPUID.(EAX=14H, ECX=0):EBX.IPFILT_WRSTPRSV[bit 2] = 1, only the TraceEn bit is cleared on warm RESET; though this may have the impact of clearing other bits in IA32_RTIT_STATUS. Other MSR values of the trace configuration MSRs are preserved on warm RESET. • The semantics of MSR writes to trace configuration MSRs in this chapter generally apply to explicit WRMSR to these registers, using VM-exit or VM-entry MSR load list to these MSRs, XRSTORS with requested feature bit map including XSAVE map component of state_8 (corresponding to IA32_XSS[bit 8]), and the write to IA32_RTIT_CTL.TraceEn by XSAVES (Section 35.3.5.2). 35.2.7.2 IA32_RTIT_CTL MSR IA32_RTIT_CTL, at address 570H, is the primary enable and control MSR for trace packet generation. Bit positions are listed in Table 35-6. Table 35-6. IA32_RTIT_CTL MSR Position 0 Bit Name TraceEn At Reset 0 Bit Description If 1, enables tracing; else tracing is disabled. When this bit transitions from 1 to 0, all buffered packets are flushed out of internal buffers. A further store, fence, or architecturally serializing instruction may be required to ensure that packet data can be observed at the trace endpoint. See Section 35.2.7.3 for details of enabling and disabling packet generation. Note that the processor will clear this bit on #SMI (Section ) and warm reset. Other MSR bits of IA32_RTIT_CTL (and other trace configuration MSRs) are not impacted by these events. 1 CYCEn 0 0: Disables CYC Packet (see Section 35.4.2.14). 1: Enables CYC Packet. This bit is reserved if CPUID.(EAX=14H, ECX=0):EBX.CPSB_CAM[bit 1] = 0. 2 OS 0 0: Packet generation is disabled when CPL = 0. 1: Packet generation may be enabled when CPL = 0. 3 User 0 0: Packet generation is disabled when CPL > 0. 1: Packet generation may be enabled when CPL > 0. 4 PwrEvtEn 0 0: Power Event Trace packets are disabled. 1: Power Event Trace packets are enabled (see Section 35.2.3, “Power Event Tracing”). 5 FUPonPTW 0 0: PTW packets are not followed by FUPs. 1: PTW packets are followed by FUPs. This bit is reserved when CPUID.(EAX=14H, ECX=0):EBX[bit 4] (“PTWRITE Supported”) is 0. 6 FabricEn 0 0: Trace output is directed to the memory subsystem, mechanism depends on IA32_RTIT_CTL.ToPA. 1: Trace output is directed to the trace transport subsystem, IA32_RTIT_CTL.ToPA is ignored. This bit is reserved if CPUID.(EAX=14H, ECX=0):ECX[bit 3] = 0. 7 CR3Filter 0 0: Disables CR3 filtering. 1: Enables CR3 filtering. This bit is reserved if CPUID.(EAX=14H, ECX=0):EBX[bit 0] (“CR3 Filtering Support”) is 0. 35-16 Vol. 3C INTEL® PROCESSOR TRACE Table 35-6. IA32_RTIT_CTL MSR (Contd.) Position 8 Bit Name ToPA At Reset 0 Bit Description 0: Single-range output scheme enabled if CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 1 and IA32_RTIT_CTL.FabricEn=0. 1: ToPA output scheme enabled (see Section 35.2.6.2) if CPUID.(EAX=14H, ECX=0):ECX.TOPA[bit 0] = 1, and IA32_RTIT_CTL.FabricEn=0. Note: WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit and FabricEn would cause #GP, if CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0. WRMSR to IA32_RTIT_CTL that sets this bit causes #GP, if CPUID.(EAX=14H, ECX=0):ECX.TOPA[bit 0] = 0. 9 MTCEn 0 0: Disables MTC Packet (see Section 35.4.2.16). 1: Enables MTC Packet. This bit is reserved if CPUID.(EAX=14H, ECX=0):EBX.MTC[bit 3] = 0. 10 TSCEn 0 0: Disable TSC packets. 1: Enable TSC packets (see Section 35.4.2.11). 11 DisRETC 0 0: Enable RET compression. 1: Disable RET compression (see Section 35.2.1.2). 12 PTWEn 0 0: PTWRITE packet generation disabled. 1: PTWRITE packet generation enabled (see Table 35-41 “PTW Packet Definition”). This bit is reserved when CPUID.(EAX=14H, ECX=0):EBX[bit 4] (“PTWRITE Supported”) is 0. 13 BranchEn 0 0: Disable COFI-based packets. 1: Enable COFI-based packets: FUP, TIP, TIP.PGE, TIP.PGD, TNT, MODE.Exec, MODE.TSX. See Section 35.2.5.4 for details on BranchEn. 17:14 MTCFreq 0 Defines MTC packet Frequency, which is based on the core crystal clock, or Always Running Timer (ART). MTC will be sent each time the selected ART bit toggles. The following Encodings are defined: 0: ART(0), 1: ART(1), 2: ART(2), 3: ART(3), 4: ART(4), 5: ART(5), 6: ART(6), 7: ART(7), 8: ART(8), 9: ART(9), 10: ART(10), 11: ART(11), 12: ART(12), 13: ART(13), 14: ART(14), 15: ART(15) Software must use CPUID to query the supported encodings in the processor, see Section 35.3.1. Use of unsupported encodings will result in a #GP fault. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.MTC[bit 3] = 0. 18 Reserved 0 Must be 0. 22:19 CycThresh 0 CYC packet threshold, see Section 35.3.6 for details. CYC packets will be sent with the first eligible packet after N cycles have passed since the last CYC packet. If CycThresh is 0 then N=0, otherwise N is defined as 2(CycThresh-1). The following Encodings are defined: 0: 0, 1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32, 7: 64, 8: 128, 9: 256, 10: 512, 11: 1024, 12: 2048, 13: 4096, 14: 8192, 15: 16384 Software must use CPUID to query the supported encodings in the processor, see Section 35.3.1. Use of unsupported encodings will result in a #GP fault. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.CPSB_CAM[bit 1] = 0. 23 Reserved 0 Must be 0. Vol. 3C 35-17 INTEL® PROCESSOR TRACE Table 35-6. IA32_RTIT_CTL MSR (Contd.) Position 27:24 Bit Name PSBFreq At Reset 0 Bit Description Indicates the frequency of PSB packets. PSB packet frequency is based on the number of Intel PT packet bytes output, so this field allows the user to determine the increment of IA32_IA32_RTIT_STATUS.PacketByteCnt that should cause a PSB to be generated. Note that PSB insertion is not precise, but the average output bytes per PSB should approximate the SW selected period. The following Encodings are defined: 0: 2K, 1: 4K, 2: 8K, 3: 16K, 4: 32K, 5: 64K, 6: 128K, 7: 256K, 8: 512K, 9: 1M, 10: 2M, 11: 4M, 12: 8M, 13: 16M, 14: 32M, 15: 64M Software must use CPUID to query the supported encodings in the processor, see Section 35.3.1. Use of unsupported encodings will result in a #GP fault. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.CPSB_CAM[bit 1] = 0. 31:28 Reserved 0 Must be 0. 35:32 ADDR0_CFG 0 Configures the base/limit register pair IA32_RTIT_ADDR0_A/B based on the following encodings: 0: ADDR0 range unused. 1: The [IA32_RTIT_ADDR0_A..IA32_RTIT_ADDR0_B] range defines a FilterEn range. FilterEn will only be set when the IP is within this range, though other FilterEn ranges can additionally be used. See Section 35.2.4.3 for details on IP filtering. 2: The [IA32_RTIT_ADDR0_A..IA32_RTIT_ADDR0_B] range defines a TraceStop range. TraceStop will be asserted if code branches into this range. See 4.2.8 for details on TraceStop. 3..15: Reserved (#GP). This field is reserved if CPUID.(EAX=14H, ECX=1):EBX.RANGECNT[2:0] < 1. 39:36 ADDR1_CFG 0 Configures the base/limit register pair IA32_RTIT_ADDR1_A/B based on the following encodings: 0: ADDR1 range unused. 1: The [IA32_RTIT_ADDR1_A..IA32_RTIT_ADDR1_B] range defines a FilterEn range. FilterEn will only be set when the IP is within this range, though other FilterEn ranges can additionally be used. See Section 35.2.4.3 for details on IP filtering. 2: The [IA32_RTIT_ADDR1_A..IA32_RTIT_ADDR1_B] range defines a TraceStop range. TraceStop will be asserted if code branches into this range. See Section 35.4.2.10 for details on TraceStop. 3..15: Reserved (#GP). This field is reserved if CPUID.(EAX=14H, ECX=1):EBX.RANGECNT[2:0] < 2. 43:40 ADDR2_CFG 0 Configures the base/limit register pair IA32_RTIT_ADDR2_A/B based on the following encodings: 0: ADDR2 range unused. 1: The [IA32_RTIT_ADDR2_A..IA32_RTIT_ADDR2_B] range defines a FilterEn range. FilterEn will only be set when the IP is within this range, though other FilterEn ranges can additionally be used. See Section 35.2.4.3 for details on IP filtering. 2: The [IA32_RTIT_ADDR2_A..IA32_RTIT_ADDR2_B] range defines a TraceStop range. TraceStop will be asserted if code branches into this range. See Section 35.4.2.10 for details on TraceStop. 3..15: Reserved (#GP). This field is reserved if CPUID.(EAX=14H, ECX=1):EBX.RANGECNT[2:0] < 3. 35-18 Vol. 3C INTEL® PROCESSOR TRACE Table 35-6. IA32_RTIT_CTL MSR (Contd.) Position 47:44 Bit Name ADDR3_CFG At Reset 0 Bit Description Configures the base/limit register pair IA32_RTIT_ADDR3_A/B based on the following encodings: 0: ADDR3 range unused. 1: The [IA32_RTIT_ADDR3_A..IA32_RTIT_ADDR3_B] range defines a FilterEn range. FilterEn will only be set when the IP is within this range, though other FilterEn ranges can additionally be used. See Section 35.2.4.3 for details on IP filtering. 2: The [IA32_RTIT_ADDR3_A..IA32_RTIT_ADDR3_B] range defines a TraceStop range. TraceStop will be asserted if code branches into this range. See Section 35.4.2.10 for details on TraceStop. 3..15: Reserved (#GP). This field is reserved if CPUID.(EAX=14H, ECX=1):EBX.RANGECNT[2:0] < 4. 55:48 56 Reserved 0 Reserved only for future trace content enables, or address filtering configuration enables. Must be 0. InjectPsbPmi OnEnable 0 1: Enables use of IA32_RTIT_STATUS bits PendPSB[6] and PendTopaPMI[7], see Section 35.2.7.4, “IA32_RTIT_STATUS MSR” for behavior of these bits. 0: IA32_RTIT_STATUS bits 6 and 7 are ignored. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.INJECTPSBPMI[6] = 0. 59:57 Reserved 0 Reserved only for future trace content enables, or address filtering configuration enables. Must be 0. 63:60 Reserved 0 Must be 0. 35.2.7.3 Enabling and Disabling Packet Generation with TraceEn When TraceEn transitions from 0 to 1, Intel Processor Trace is enabled, and a series of packets may be generated. These packets help ensure that the decoder is aware of the state of the processor when the trace begins, and that it can keep track of any timing or state changes that may have occurred while packet generation was disabled. A full PSB+ (see Section 35.4.2.17) will be generated if IA32_RTIT_STATUS.PacketByteCnt=0, and may be generated in other cases as well. Otherwise, timing packets will be generated, including TSC, TMA, and CBR (see Section 35.4.1.1). In addition to the packets discussed above, if and when PacketEn (Section 35.2.5.1) transitions from 0 to 1 (which may happen immediately, depending on filtering settings), a TIP.PGE packet (Section 35.4.2.3) will be generated. When TraceEn is set, the processor may read ToPA entries from memory and cache them internally. For this reason, software should disable packet generation before making modifications to the ToPA tables (or changing the configuration of restricted memory regions). See Section 35.7 for more details of packets that may be generated with modifications to TraceEn. Disabling Packet Generation Clearing TraceEn causes any packet data buffered within the logical processor to be flushed out, after which the output MSRs (IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS) will have stable values. When output is directed to memory, a store, fence, or architecturally serializing instruction may be required to ensure that the packet data is globally observed. No special packets are generated by disabling packet generation, though a TIP.PGD may result if PacketEn=1 at the time of disable. Other Writes to IA32_RTIT_CTL Any attempt to modify IA32_RTIT_CTL while TraceEn is set will result in a general-protection fault (#GP) unless the same write also clears TraceEn. However, writes to IA32_RTIT_CTL that do not modify any bits will not cause a #GP, even if TraceEn remains set. Vol. 3C 35-19 INTEL® PROCESSOR TRACE 35.2.7.4 IA32_RTIT_STATUS MSR The IA32_RTIT_STATUS MSR is readable and writable by software, but some bits (ContextEn, TriggerEn) are readonly and cannot be directly modified. The WRMSR instruction ignores these bits in the source operand (attempts to modify these bits are ignored and do not cause WRMSR to fault). This MSR can only be written when IA32_RTIT_CTL.TraceEn is 0; otherwise WRMSR causes a general-protection fault (#GP). The processor does not modify the value of this MSR while TraceEn is 0 (software can modify it with WRMSR). Table 35-7. IA32_RTIT_STATUS MSR Position Bit Name At Reset Bit Description 0 FilterEn 0 This bit is written by the processor, and indicates that tracing is allowed for the current IP, see Section 35.2.5.5. Writes are ignored. 1 ContextEn 0 The processor sets this bit to indicate that tracing is allowed for the current context. See Section 35.2.5.3. Writes are ignored. 2 TriggerEn 0 The processor sets this bit to indicate that tracing is enabled. See Section 35.2.5.2. Writes are ignored. 3 Reserved 0 Must be 0. 4 Error 0 The processor sets this bit to indicate that an operational error has been encountered. When this bit is set, TriggerEn is cleared to 0 and packet generation is disabled. For details, see “ToPA Errors” in Section 35.2.6.2. When TraceEn is cleared, software can write this bit. Once it is set, only software can clear it. It is not recommended that software ever set this bit, except in cases where it is restoring a prior saved state. 5 Stopped 0 The processor sets this bit to indicate that a ToPA Stop condition has been encountered. When this bit is set, TriggerEn is cleared to 0 and packet generation is disabled. For details, see “ToPA STOP” in Section 35.2.6.2. When TraceEn is cleared, software can write this bit. Once it is set, only software can clear it. It is not recommended that software ever set this bit, except in cases where it is restoring a prior saved state. 6 PendPSB 0 If IA32_RTIT_CTL.InjectPsbPmiOnEnable[56] = 1, the processor sets this bit when the threshold for a PSB+ to be inserted has been reached. The processor will clear this bit when the PSB+ has been inserted into the trace. If PendPSB = 1 and InjectPsbPmiOnEnable = 1 when IA32_RTIT_CTL.TraceEn[0] transitions from 0 to 1, a PSB+ will be inserted into the trace. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.INJECTPSBPMI[6] = 1. 7 PendTopaPMI 0 If IA32_RTIT_CTL.InjectPsbPmiOnEnable[56] = 1, the processor sets this bit when the threshold for a ToPA PMI to be inserted has been reached. Software should clear this bit once the ToPA PMI has been handled, see “ToPA PMI” for details. If PendTopaPMI = 1 and InjectPsbPmiOnEnable = 1 when IA32_RTIT_CTL.TraceEn[0] transitions from 0 to 1, a PMI will be pended. This field is reserved if CPUID.(EAX=14H, ECX=0):EBX.INJECTPSBPMI[6] = 1. 31:8 Reserved 0 48:32 PacketByteCnt 0 Must be 0. This field is written by the processor, and holds a count of packet bytes that have been sent out. The processor also uses this field to determine when the next PSB packet should be inserted. Note that the processor may clear or modify this field at any time while IA32_RTIT_CTL.TraceEn=1. It will have a stable value when IA32_RTIT_CTL.TraceEn=0. See Section 35.4.2.17 for details. This field is reserved when CPUID.(EAX=14H,ECX=0):EBX[bit 1] (“Configurable PSB and CycleAccurate Mode Supported”) is 0. 63:49 Reserved 35-20 Vol. 3C 0 Must be 0. INTEL® PROCESSOR TRACE 35.2.7.5 IA32_RTIT_ADDRn_A and IA32_RTIT_ADDRn_B MSRs The role of the IA32_RTIT_ADDRn_A/B register pairs, for each n, is determined by the corresponding ADDRn_CFG fields in IA32_RTIT_CTL (see Section 35.2.7.2). The number of these register pairs is enumerated by CPUID.(EAX=14H, ECX=1):EAX.RANGECNT[2:0]. • Processors that enumerate support for 1 range support: • IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B Processors that enumerate support for 2 ranges support: IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B • IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B Processors that enumerate support for 3 ranges support: IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B • IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B Processors that enumerate support for 4 ranges support: IA32_RTIT_ADDR0_A, IA32_RTIT_ADDR0_B IA32_RTIT_ADDR1_A, IA32_RTIT_ADDR1_B IA32_RTIT_ADDR2_A, IA32_RTIT_ADDR2_B IA32_RTIT_ADDR3_A, IA32_RTIT_ADDR3_B Each register has a single 64-bit field that holds a linear address value. Writes must ensure that the address is in canonical form, otherwise a #GP fault will result. 35.2.7.6 IA32_RTIT_CR3_MATCH MSR The IA32_RTIT_CR3_MATCH register is compared against CR3 when IA32_RTIT_CTL.CR3Filter is 1. Bits 63:5 hold the CR3 address value to match, bits 4:0 are reserved to 0. For more details on CR3 filtering and the treatment of this register, see Section 35.2.4.2. This MSR is accessible if CPUID.(EAX=14H, ECX=0):EBX[bit 0], “CR3 Filtering Support”, is 1. This MSR can be written only when IA32_RTIT_CTL.TraceEn is 0; otherwise WRMSR causes a general-protection fault (#GP). IA32_RTIT_CR3_MATCH[4:0] are reserved and must be 0; an attempt to set those bits using WRMSR causes a #GP. 35.2.7.7 IA32_RTIT_OUTPUT_BASE MSR This MSR is used to configure the trace output destination, when output is directed to memory (IA32_RTIT_CTL.FabricEn = 0). The size of the address field is determined by the maximum physical address width (MAXPHYADDR), as reported by CPUID.80000008H:EAX[7:0]. When the ToPA output scheme is used, the processor may update this MSR when packet generation is enabled, and those updates are asynchronous to instruction execution. Therefore, the values in this MSR should be considered unreliable unless packet generation is disabled (IA32_RTIT_CTL.TraceEn = 0). Accesses to this MSR are supported only if Intel PT output to memory is supported, hence when either CPUID.(EAX=14H, ECX=0):ECX[bit 0] or CPUID.(EAX=14H, ECX=0):ECX[bit 2] are set. Otherwise WRMSR or RDMSR cause a general-protection fault (#GP). If supported, this MSR can be written only when IA32_RTIT_CTL.TraceEn is 0; otherwise WRMSR causes a general-protection fault (#GP). Vol. 3C 35-21 INTEL® PROCESSOR TRACE Table 35-8. IA32_RTIT_OUTPUT_BASE MSR Position 6:0 Bit Name At Reset Reserved MAXPHYADDR-1:7 0 BasePhysAddr 0 Bit Description Must be 0. The base physical address. How this address is used depends on the value of IA32_RTIT_CTL.ToPA: 0: This is the base physical address of a single, contiguous physical output region. This could be mapped to DRAM or to MMIO, depending on the value. The base address should be aligned with the size of the region, such that none of the 1s in the mask value(Section 35.2.7.8) overlap with 1s in the base address. If the base is not aligned, an operational error will result (see Section 35.3.9). 1: The base physical address of the current ToPA table. The address must be 4K aligned. Writing an address in which bits 11:7 are non-zero will not cause a #GP, but an operational error will be signaled once TraceEn is set. See “ToPA Errors” in Section 35.2.6.2 as well as Section 35.3.9. 63:MAXPHYADDR 35.2.7.8 Reserved 0 Must be 0. IA32_RTIT_OUTPUT_MASK_PTRS MSR This MSR holds any mask or pointer values needed to indicate where the next byte of trace output should be written. The meaning of the values held in this MSR depend on whether the ToPA output mechanism is in use. See Section 35.2.6.2 for details. The processor updates this MSR while when packet generation is enabled, and those updates are asynchronous to instruction execution. Therefore, the values in this MSR should be considered unreliable unless packet generation is disabled (IA32_RTIT_CTL.TraceEn = 0). Accesses to this MSR are supported only if Intel PT output to memory is supported, hence when either CPUID.(EAX=14H, ECX=0):ECX[bit 0] or CPUID.(EAX=14H, ECX=0):ECX[bit 2] are set. Otherwise WRMSR or RDMSR cause a general-protection fault (#GP). If supported, this MSR can be written only when IA32_RTIT_CTL.TraceEn is 0; otherwise WRMSR causes a general-protection fault (#GP). Table 35-9. IA32_RTIT_OUTPUT_MASK_PTRS MSR Position Bit Name At Reset Bit Description 6:0 LowerMask 7FH Forced to 1, writes are ignored. 31:7 MaskOrTableO ffset 0 The use of this field depends on the value of IA32_RTIT_CTL.ToPA: 0: This field holds bits 31:7 of the mask value for the single, contiguous physical output region. The size of this field indicates that regions can be of size 128B up to 4GB. This value (combined with the lower 7 bits, which are reserved to 1) will be ANDed with the OutputOffset field to determine the next write address. All 1s in this field should be consecutive and starting at bit 7, otherwise the region will not be contiguous, and an operational error (Section 35.3.9) will be signaled when TraceEn is set. 1: This field holds bits 27:3 of the offset pointer into the current ToPA table. This value can be added to the IA32_RTIT_OUTPUT_BASE value to produce a pointer to the current ToPA table entry, which itself is a pointer to the current output region. In this scenario, the lower 7 reserved bits are ignored. This field supports tables up to 256 MBytes in size. 35-22 Vol. 3C INTEL® PROCESSOR TRACE Table 35-9. IA32_RTIT_OUTPUT_MASK_PTRS MSR (Contd.) Position Bit Name 63:32 OutputOffset At Reset 0 Bit Description The use of this field depends on the value of IA32_RTIT_CTL.ToPA: 0: This is bits 31:0 of the offset pointer into the single, contiguous physical output region. This value will be added to the IA32_RTIT_OUTPUT_BASE value to form the physical address at which the next byte of packet output data will be written. This value must be less than or equal to the MaskOrTableOffset field, otherwise an operational error (Section 35.3.9) will be signaled when TraceEn is set. 1: This field holds bits 31:0 of the offset pointer into the current ToPA output region. This value will be added to the output region base field, found in the current ToPA table entry, to form the physical address at which the next byte of trace output data will be written. This value must be less than the ToPA entry size, otherwise an operational error (Section 35.3.9) will be signaled when TraceEn is set. 35.2.8 Interaction of Intel® Processor Trace and Other Processor Features 35.2.8.1 Intel® Transactional Synchronization Extensions (Intel® TSX) The operation of Intel TSX is described in Chapter 14 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. For tracing purpose, packet generation does not distinguish between hardware lock elision (HLE) and restricted transactional memory (RTM), but speculative execution does have impacts on the trace output. Specifically, packets are generated as instructions complete, even for instructions in a transactional region that is later aborted. For this reason, debugging software will need indication of the beginning and end of a transactional region; this will allow software to understand when instructions are part of a transactional region and whether that region has been committed. To enable this, TSX information is included in a MODE packet leaf. The mode bits in the leaf are: • • InTX: Set to 1 on an TSX transaction begin, and cleared on transaction commit or abort. TXAbort: Set to 1 only when InTX transitions from 1 to 0 on an abort. Cleared otherwise. If BranchEn=1, this MODE packet will be sent each time the transaction status changes. See Table 35-10 for details. Table 35-10. TSX Packet Scenarios TSX Event Transaction Begin Transaction Commit Transaction Abort Other Instruction Packets Either XBEGIN or XACQUIRE lock (the latter if executed transactionally) MODE(TXAbort=0, InTX=1), FUP(CurrentIP) Either XEND or XRELEASE lock, if transactional execution ends. This happens only on the outermost commit MODE(TXAbort=0, InTX=0), FUP(CurrentIP) XABORT or other transactional abort MODE(TXAbort=1, InTX=0), FUP(CurrentIP), TIP(TargetIP) One of the following: • Nested XBEGIN or XACQUIRE lock • An outer XACQUIRE lock that doesn’t begin a transaction (InTX not set) • Non-outermost XEND or XRELEASE lock None. No change to TSX mode bits for these cases. The CurrentIP listed above is the IP of the associated instruction. The TargetIP is the IP of the next instruction to be executed; for HLE, this is the XACQUIRE lock; for RTM, this is the fallback handler. Intel PT stores are non-transactional, and thus packet writes are not rolled back on TSX abort. Vol. 3C 35-23 INTEL® PROCESSOR TRACE 35.2.8.2 TSX and IP Filtering A complication with tracking transactions is handling transactions that start or end outside of the tracing region. Transactions can’t span across a change in ContextEn, because CPL changes and CR3 changes each cause aborts. But a transaction can start within the IP filter region and end outside it. To assist the decoder handling this situation, MODE.TSX packets can be sent even if FilterEn=0, though there will be no FUP attached. Instead, they will merely serve to indicate to the decoder when transactions are active and when they are not. When tracing resumes (due to PacketEn=1), the last MODE.TSX preceding the TIP.PGE will indicate the current transaction status. 35.2.8.3 System Management Mode (SMM) SMM code has special privileges that non-SMM code does not have. Intel Processor Trace can be used to trace SMM code, but special care is taken to ensure that SMM handler context is not exposed in any non-SMM trace collection. Additionally, packet output from tracing non-SMM code cannot be written into memory space that is either protected by SMRR or used by the SMM handler. SMM is entered via a system management interrupt (SMI). SMI delivery saves the value of IA32_RTIT_CTL.TraceEn into SMRAM and then clears it, thereby disabling packet generation. The saving and clearing of IA32_RTIT_CTL.TraceEn ensures two things: 1. All internally buffered packet data is flushed before entering SMM (see Section 35.2.7.2). 2. Packet generation ceases before entering SMM, so any tracing that was configured outside SMM does not continue into SMM. No SMM instruction pointers or other state will be exposed in the non-SMM trace. When the RSM instruction is executed to return from SMM, the TraceEn value that was saved by SMI delivery is restored, allowing tracing to be resumed. As is done any time packet generation is enabled, ContextEn is re-evaluated, based on the values of CPL, CR3, etc., established by RSM. Like other interrupts, delivery of an SMI produces a FUP containing the IP of the next instruction to execute. By toggling TraceEn, SMI and RSM can produce TIP.PGD and TIP.PGE packets, respectively, indicating that tracing was disabled or re-enabled. See Table 35.7 for more information about packets entering and leaving SMM. Although #SMI and RSM change CR3, PIP packets are not generated in these cases. With #SMI tracing is disabled before the CR3 change; with RSM TraceEn is restored after CR3 is written. TraceEn must be cleared before executing RSM, otherwise it will cause a shutdown. Further, on processors that restrict use of Intel PT with LBRs (see Section 35.3.1.2), any RSM that results in enabling of both will cause a shutdown. Intel PT can support tracing of System Transfer Monitor operating in SMM, see Section 35.6. 35.2.8.4 Virtual-Machine Extensions (VMX) Initial implementations of Intel Processor Trace do not support tracing in VMX operation. Such processors indicate this by returning 0 for IA32_VMX_MISC[bit 14]. On these processors, execution of the VMXON instruction clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL in VMX operation causes a general-protection exception (#GP). Processors that support Intel Processor Trace in VMX operation return 1 for IA32_VMX_MISC[bit 14]. Details of tracing in VMX operation are described in Section 35.4.2.26. 35.2.8.5 Intel® Software Guard Extensions (Intel® SGX) Intel SGX provides an application with the ability to instantiate a protective container (an enclave) with confidentiality and integrity (see the Intel® Software Guard Extensions Programming Reference). On a processor with both Intel PT and Intel SGX enabled, when executing code within a production enclave, no control flow packets are produced by Intel PT. An enclave entry will clear ContextEn, thereby blocking control flow packet generation. A TIP.PGD packet will be generated if PacketEn=1 at the time of the entry. Upon enclave exit, ContextEn will no longer be forced to 0. If other enables are set at the time, a TIP.PGE may be generated to indicate that tracing is resumed. 35-24 Vol. 3C INTEL® PROCESSOR TRACE During the enclave execution, Intel PT remains enabled, and periodic or timing packets such as PSB, TSC, MTC, or CBR can still be generated. No IPs or other architectural state will be exposed. For packet generation examples on enclave entry or exit, see Section 35.7. Debug Enclaves Intel SGX allows an enclave to be configured with relaxed protection of confidentiality for debug purposes, see the Intel® Software Guard Extensions Programming Reference. In a debug enclave, Intel PT continues to function normally. Specifically, ContextEn is not impacted by an enclave entry or exit. Hence, the generation of ContextEndependent packets within a debug enclave is allowed. 35.2.8.6 SENTER/ENTERACCS and ACM GETSEC[SENTER] and GETSEC[ENTERACCS] instructions clear TraceEn, and it is not restored when those instruction complete. SENTER also causes TraceEn to be cleared on other logical processors when they rendezvous and enter the SENTER sleep state. In these two cases, the disabling of packet generation is not guaranteed to flush internally buffered packets. Some packets may be dropped. When executing an authenticated code module (ACM), packet generation is silently disabled during ACRAM setup. TraceEn will be cleared, but no TIP.PGD packet is generated. After completion of the module, the TraceEn value will be restored. There will be no TIP.PGE packet, but timing packets, like TSC and CBR, may be produced. 35.2.8.7 Intel® Memory Protection Extensions (Intel® MPX) Bounds exceptions (#BR) caused by Intel MPX are treated like other exceptions, producing FUP and TIP packets that indicate the source and destination IPs. 35.3 CONFIGURATION AND PROGRAMMING GUIDELINE 35.3.1 Detection of Intel Processor Trace and Capability Enumeration Processor support for Intel Processor Trace is indicated by CPUID.(EAX=07H,ECX=0H):EBX[bit 25] = 1. CPUID function 14H is dedicated to enumerate the resource and capability of processors that report CPUID.(EAX=07H,ECX=0H):EBX[bit 25] = 1. Different processor generations may have architecturally-defined variation in capabilities. Table 35-11 describes details of the enumerable capabilities that software must use across generations of processors that support Intel Processor Trace. Vol. 3C 35-25 INTEL® PROCESSOR TRACE Table 35-11. CPUID Leaf 14H Enumeration of Intel Processor Trace Capabilities CPUID.(EAX=14H,ECX=0) Register EAX Name Description Behavior Bits 31:0 Maximum valid sub-leaf Index Specifies the index of the maximum valid sub-leaf for this CPUID leaf 0 CR3 Filtering Support 1: Indicates that IA32_RTIT_CTL.CR3Filter can be set to 1, and that IA32_RTIT_CR3_MATCH MSR can be accessed. See Section 35.2.7. EBX 0: Indicates that writes that set IA32_RTIT_CTL.CR3Filter to 1, or any access to IA32_RTIT_CR3_MATCH, will #GP fault. 1 Configurable PSB and CycleAccurate Mode Supported 1: (a) IA32_RTIT_CTL.PSBFreq can be set to a non-zero value, in order to select the preferred PSB frequency (see below for allowed values). (b) IA32_RTIT_STATUS.PacketByteCnt can be set to a non-zero value, and will be incremented by the processor when tracing to indicate progress towards the next PSB. If trace packet generation is enabled by setting TraceEn, a PSB will only be generated if PacketByteCnt=0. (c) IA32_RTIT_CTL.CYCEn can be set to 1 to enable Cycle-Accurate Mode. See Section 35.2.7. 0: (a) Any attempt to set IA32_RTIT_CTL.PSBFreq, to set IA32_RTIT_CTL.CYCEn, or write a non-zero value to IA32_RTIT_STATUS.PacketByteCnt any access to IA32_RTIT_CR3_MATCH, will #GP fault. (b) If trace packet generation is enabled by setting TraceEn, a PSB is always generated. (c) Any attempt to set IA32_RTIT_CTL.CYCEn will #GP fault. 2 IP Filtering and TraceStop supported, and Preserve Intel PT MSRs across warm reset 1: (a) IA32_RTIT_CTL provides at one or more ADDRn_CFG field to configure the corresponding address range MSRs for IP Filtering or IP TraceStop. Each ADDRn_CFG field accepts a value in the range of 0:2 inclusive. The number of ADDRn_CFG fields is reported by CPUID.(EAX=14H, ECX=1):EAX.RANGECNT[2:0]. (b) At least one register pair IA32_RTIT_ADDRn_A and IA32_RTIT_ADDRn_B are provided to configure address ranges for IP filtering or IP TraceStop. (c) On warm reset, all Intel PT MSRs will retain their pre-reset values, though IA32_RTIT_CTL.TraceEn will be cleared. The Intel PT MSRs are listed in Section 35.2.7. 0: (a) An Attempt to write IA32_RTIT_CTL.ADDRn_CFG with non-zero encoding values will cause #GP. (b) Any access to IA32_RTIT_ADDRn_A and IA32_RTIT_ADDRn_B, will #GP fault. (c) On warm reset, all Intel PT MSRs will be cleared. 3 MTC Supported 1: IA32_RTIT_CTL.MTCEn can be set to 1, and MTC packets will be generated. See Section 35.2.7. 0: An attempt to set IA32_RTIT_CTL.MTCEn or IA32_RTIT_CTL.MTCFreq to a non-zero value will #GP fault. 4 PTWRITE Supported 1: Writes can set IA32_RTIT_CTL[12] (PTWEn) and IA32_RTIT_CTL[5] (FUPonPTW), and PTWRITE can generate packets. 0: Writes that set IA32_RTIT_CTL[12] or IA32_RTIT_CTL[5] will #GP, and PTWRITE will #UD fault. 5 Power Event Trace Supported 1: Writes can set IA32_RTIT_CTL[4] (PwrEvtEn), enabling Power Event Trace packet generation. 0: Writes that set IA32_RTIT_CTL[4] will #GP. 31:6 35-26 Vol. 3C Reserved INTEL® PROCESSOR TRACE Table 35-11. CPUID Leaf 14H Enumeration of Intel Processor Trace Capabilities (Contd.) CPUID.(EAX=14H,ECX=0) Register Name Description Behavior Bits 0 ToPA Output Supported ECX 1: Tracing can be enabled with IA32_RTIT_CTL.ToPA = 1, hence utilizing the ToPA output scheme (Section 35.2.6.2) IA32_RTIT_OUTPUT_BASE and IA32_RTIT_OUTPUT_MASK_PTRS MSRs can be accessed. 0: Unless CPUID.(EAX=14H, ECX=0):ECX.SNGLRNGOUT[bit 2] = 1. writes to IA32_RTIT_OUTPUT_BASE or IA32_RTIT_OUTPUT_MASK_PTRS. MSRs will #GP fault. 1 ToPA Tables Allow Multiple Output Entries 1: ToPA tables can hold any number of output entries, up to the maximum allowed by the MaskOrTableOffset field of IA32_RTIT_OUTPUT_MASK_PTRS. 0: ToPA tables can hold only one output entry, which must be followed by an END=1 entry which points back to the base of the table. Further, ToPA PMIs will be delivered before the region is filled. See ToPA PMI in Section 35.2.6.2. If there is more than one output entry before the END entry, or if the END entry has the wrong base address, an operational error will be signaled (see “ToPA Errors” in Section 35.2.6.2). 2 Single-Range Output Supported 1: Enabling tracing (TraceEn=1) with IA32_RTIT_CTL.ToPA=0 is supported. 0: Unless CPUID.(EAX=14H, ECX=0):ECX.TOPAOUT[bit 0] = 1. writes to IA32_RTIT_OUTPUT_BASE or IA32_RTIT_OUTPUT_MASK_PTRS. MSRs will #GP fault. 3 Output to Trace Transport Subsystem Supported 30:4 Reserved 31 IP Payloads are LIP 1: Setting IA32_RTIT_CTL.FabricEn to 1 is supported. 0: IA32_RTIT_CTL.FabricEn is reserved. Write 1 to IA32_RTIT_CTL.FabricEn will #GP fault. 1: Generated packets which contain IP payloads have LIP values, which include the CS base component. 0: Generated packets which contain IP payloads have RIP values, which are the offset from CS base. EDX 31:0 Reserved If CPUID.(EAX=14H, ECX=0):EAX reports a non-zero value, additional capabilities of Intel Processor Trace are described in the sub-leaves of CPUID leaf 14H. Vol. 3C 35-27 INTEL® PROCESSOR TRACE Table 35-12. CPUID Leaf 14H, sub-leaf 1H Enumeration of Intel Processor Trace Capabilities CPUID.(EAX=14H,ECX=1) Register EAX Name Description Behavior Bits 2:0 Number of Address Ranges A non-zero value specifies the number ADDRn_CFG field supported in IA32_RTIT_CTL and the number of register pair IA32_RTIT_ADDRn_A/IA32_RTIT_ADDRn_B supported for IP filtering and IP TraceStop. NOTE: Currently, no processors support more than 4 address ranges. 15:3 Reserved 31:16 Bitmap of supported MTC Period Encodings The non-zero bits indicate the map of supported encoding values for the IA32_RTIT_CTL.MTCFreq field. This applies only if CPUID.(EAX=14H, ECX=0):EBX.MTC[bit 3] = 1 (MTC Packet generation is supported), otherwise the MTCFreq field is reserved to 0. Each bit position in this field represents 1 encoding value in the 4-bit MTCFreq field (ie, bit 0 is associated with encoding value 0). For each bit: 1: MTCFreq can be assigned the associated encoding value. 0: MTCFreq cannot be assigned to the associated encoding value. A write to IA32_RTIT_CTLMTCFreq with unsupported encoding will cause #GP fault. EBX 15:0 Bitmap of supported Cycle Threshold values The non-zero bits indicate the map of supported encoding values for the IA32_RTIT_CTL.CycThresh field. This applies only if CPUID.(EAX=14H, ECX=0):EBX.CPSB_CAM[bit 1] = 1 (Cycle-Accurate Mode is Supported), otherwise the CycThresh field is reserved to 0. See Section 35.2.7. Each bit position in this field represents 1 encoding value in the 4-bit CycThresh field (ie, bit 0 is associated with encoding value 0). For each bit: 1: CycThresh can be assigned the associated encoding value. 0: CycThresh cannot be assigned to the associated encoding value. A write to CycThresh with unsupported encoding will cause #GP fault. 31:16 Bitmap of supported Configurable PSB Frequency encoding The non-zero bits indicate the map of supported encoding values for the IA32_RTIT_CTL.PSBFreq field. This applies only if CPUID.(EAX=14H, ECX=0):EBX.CPSB_CAM[bit 1] = 1 (Configurable PSB is supported), otherwise the PSBFreq field is reserved to 0. See Section 35.2.7. Each bit position in this field represents 1 encoding value in the 4-bit PSBFreq field (ie, bit 0 is associated with encoding value 0). For each bit: 1: PSBFreq can be assigned the associated encoding value. 0: PSBFreq cannot be assigned to the associated encoding value. A write to PSBFreq with unsupported encoding will cause #GP fault. ECX 31:0 Reserved EDX 31:0 Reserved 35-28 Vol. 3C INTEL® PROCESSOR TRACE 35.3.1.1 Packet Decoding of RIP versus LIP FUP, TIP, TIP.PGE, and TIP.PGE packets can contain an instruction pointer (IP) payload. On some processor generations, this payload will be an effective address (RIP), while on others this will be a linear address (LIP). In the former case, the payload is the offset from the current CS base address, while in the latter it is the sum of the offset and the CS base address (Note that in real mode, the CS base address is the value of CS<<4, while in protected mode the CS base address is the base linear address of the segment indicated by the CS register.). Which IP type is in use is indicated by enumeration (see CPUID.(EAX=14H, ECX=0):ECX.LIP[bit 31] in Table 35-11). For software that executes while the CS base address is 0 (including all software executing in 64-bit mode), the difference is indistinguishable. A trace decoder must account for cases where the CS base address is not 0 and the resolved LIP will not be evident in a trace generated on a CPU that enumerates use of RIP. This is likely to cause problems when attempting to link the trace with the associated binaries. Note that IP comparison logic, for IP filtering and TraceStop range calculation, is based on the same IP type as these IP packets. For processors that output RIP, the IP comparison mechanism is also based on RIP, and hence on those processors RIP values should be written to IA32_RTIT_ADDRn_[AB] MSRs. This can produce differing behavior if the same trace configuration setting is run on processors reporting different IP types, i.e. CPUID.(EAX=14H, ECX=0):ECX.LIP[bit 31]. Care should be taken to check CPUID when configuring IP filters. 35.3.1.2 Model Specific Capability Restrictions Some processor generations impose restrictions that prevent use of LBRs/BTS/BTM/LERs when software has enabled tracing with Intel Processor Trace. On these processors, when TraceEn is set, updates of LBR, BTS, BTM, LERs are suspended but the states of the corresponding IA32_DEBUGCTL control fields remained unchanged as if it were still enabled. When TraceEn is cleared, the LBR array is reset, and LBR/BTS/BTM/LERs updates will resume. Further, reads of these registers will return 0, and writes will be dropped. The list of MSRs whose updates/accesses are restricted follows. • • • MSR_LASTBRANCH_x_TO_IP, MSR_LASTBRANCH_x_FROM_IP, MSR_LBR_INFO_x, MSR_LASTBRANCH_TOS MSR_LER_FROM_LIP, MSR_LER_TO_LIP MSR_LBR_SELECT For processor with CPUID DisplayFamily_DisplayModel signature of 06_3DH, 06_47H, 06_4EH, 06_4FH, 06_56H and 06_5EH, the use of Intel PT and LBRs are mutually exclusive. 35.3.2 Enabling and Configuration of Trace Packet Generation To configure trace packets, enable packet generation, and capture packets, software starts with using CPUID instruction to detect its feature flag, CPUID.(EAX=07H,ECX=0H):EBX[bit 25] = 1; followed by enumerating the capabilities described in Section 35.3.1. Based on the capability queried from Section 35.3.1, software must configure a number of model-specific registers. This section describes programming considerations related to those MSRs. 35.3.2.1 Enabling Packet Generation When configuring and enabling packet generation, the IA32_RTIT_CTL MSR should be written after any other Intel PT MSRs have been written, since writes to the other configuration MSRs cause a general-protection fault (#GP) if TraceEn = 1. If a prior trace collection context is not being restored, then software should first clear IA32_RTIT_STATUS. This is important since the Stopped, and Error fields are writable; clearing the MSR clears any values that may have persisted from prior trace packet collection contexts. See Section 35.2.7.2 for details of packets generated by setting TraceEn to 1. If setting TraceEn to 1 causes an operational error (see Section 35.3.9), there may be a delay after the WRMSR completes before the error is signaled in the IA32_RTIT_STATUS MSR. While packet generation is enabled, the values of some configuration MSRs (e.g., IA32_RTIT_STATUS and IA32_RTIT_OUTPUT_*) are transient, and reads may return values that are out of date. Only after packet generation is disabled (by clearing TraceEn) do reads of these MSRs return reliable values. Vol. 3C 35-29 INTEL® PROCESSOR TRACE 35.3.2.2 Disabling Packet Generation After disabling packet generation by clearing IA32_RTIT_CTL, it is advisable to read the IA32_RTIT_STATUS MSR (Section 35.2.7.4): • If the Error bit is set, an operational error was encountered, and the trace is most likely compromised. Software should check the source of the error (by examining the output MSR values), correct the source of the problem, and then attempt to gather the trace again. For details on operational errors, see Section 35.3.9. Software should clear IA32_RTIT_STATUS.Error before re-enabling packet generation. • If the Stopped bit is set, software execution encountered an IP TraceStop (see Section 35.2.4.3) or the ToPA Stop condition (see “ToPA STOP” in Section 35.2.6.2) before packet generation was disabled. 35.3.3 Flushing Trace Output Packets are first buffered internally and then written out asynchronously. To collect packet output for postprocessing, a collector needs first to ensure that all packet data has been flushed from internal buffers. Software can ensure this by stopping packet generation by clearing IA32_RTIT_CTL.TraceEn (see “Disabling Packet Generation” in Section 35.2.7.2). When software clears IA32_RTIT_CTL.TraceEn to flush out internally buffered packets, the logical processor issues an SFENCE operation which ensures that WC trace output stores will be ordered with respect to the next store, or serializing operation. A subsequent read from the same logical processor will see the flushed trace data, while a read from another logical processor should be preceded by a store, fence, or architecturally serializing operation on the tracing logical processor. When the flush operations complete, the IA32_RTIT_OUTPUT_* MSR values indicate where the trace ended. While TraceEn is set, these MSRs may hold stale values. Further, if a ToPA region with INT=1 is filled, meaning a ToPA PMI has been triggered, IA32_PERF_GLOBAL_STATUS.Trace_ToPA_PMI[55] will be set by the time the flush completes. 35.3.4 Warm Reset The MSRs software uses to program Intel Processor Trace are cleared after a power-on RESET (or cold RESET). On a warm RESET, the contents of those MSRs can retain their values from before the warm RESET with the exception that IA32_RTIT_CTL.TraceEn will be cleared (which may have the side effect of clearing some bits in IA32_RTIT_STATUS). 35.3.5 Context Switch Consideration To facilitate construction of instruction execution traces at the granularity of a software process or thread context, software can save and restore the states of the trace configuration MSRs across the process or thread context switch boundary. The principle is the same as saving and restoring the typical architectural processor states across context switches. 35.3.5.1 Manual Trace Configuration Context Switch The configuration can be saved and restored through a sequence of instructions of RDMSR, management of MSR content and WRMSR. To stop tracing and to ensure that all configuration MSRs contain stable values, software must clear IA32_RTIT_CTL.TraceEn before reading any other trace configuration MSRs. The recommended method for saving trace configuration context manually follows: 1. RDMSR IA32_RTIT_CTL, save value to memory 2. WRMSR IA32_RTIT_CTL with saved value from RDMSR above and TraceEn cleared 3. RDMSR all other configuration MSRs whose values had changed from previous saved value, save changed values to memory 35-30 Vol. 3C INTEL® PROCESSOR TRACE When restoring the trace configuration context, IA32_RTIT_CTL should be restored last: 1. Read saved configuration MSR values, aside from IA32_RTIT_CTL, from memory, and restore them with WRMSR 2. Read saved IA32_RTIT_CTL value from memory, and restore with WRMSR. 35.3.5.2 Trace Configuration Context Switch Using XSAVES/XRSTORS On processors whose XSAVE feature set supports XSAVES and XRSTORS, the Trace configuration state can be saved using XSAVES and restored by XRSTORS, in conjunction with the bit field associated with supervisory state component in IA32_XSS. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. The layout of the trace configuration component state in the XSAVE area is shown in Table 35-13.1 Table 35-13. Memory Layout of the Trace Configuration State Component Offset within Component Area Field Offset within Component Area Field 0H IA32_RTIT_CTL 08H IA32_RTIT_OUTPUT_BASE 10H IA32_RTIT_OUTPUT_MASK_PTRS 18H IA32_RTIT_STATUS 20H IA32_RTIT_CR3_MATCH 28H IA32_RTIT_ADDR0_A 30H IA32_RTIT_ADDR0_B 38H IA32_RTIT_ADDR1_A 40H IA32_RTIT_ADDR1_B 48H–End Reserved The IA32_XSS MSR is zero coming out of RESET. Once IA32_XSS[bit 8] is set, system software operating at CPL= 0 can use XSAVES/XRSTORS with the appropriate requested-feature bitmap (RFBM) to manage supervisor state components in the XSAVE map. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1. 35.3.6 Cycle-Accurate Mode Intel PT can be run in a cycle-accurate mode which enables CYC packets (see Section 35.4.2.14) that provide lowlevel information in the processor core clock domain. This cycle counter data in CYC packets can be used to compute IPC (Instructions Per Cycle), or to track wall-clock time on a fine-grain level. To enable cycle-accurate mode packet generation, software should set IA32_RTIT_CTL.CYCEn=1. It is recommended that software also set TSCEn=1 anytime cycle-accurate mode is in use. With this, all CYC-eligible packets will be preceded by a CYC packet, the payload of which indicates the number of core clock cycles since the last CYC packet. In cases where multiple CYC-eligible packets are generated in a single cycle, only a single CYC will be generated before the CYC-eligible packets, otherwise each CYC-eligible packet will be preceded by its own CYC. The CYC-eligible packets are: • TNT, TIP, TIP.PGE, TIP.PGD, MODE.EXEC, MODE.TSX, PIP, VMCS, OVF, MTC, TSC, PTWRITE, EXSTOP TSC packets are generated when there is insufficient information to reconstruct wall-clock time, due to tracing being disabled (TriggerEn=0), or power down scenarios like a transition to a deep-sleep MWAIT C-state. In this case, the CYC that is generated along with the TSC will indicate the number of cycles actively tracing (those powered up, with TriggerEn=1) executed between the last CYC packet and the TSC packet. And hence the amount of time spent while tracing is inactive can be inferred from the difference in time between that expected based on the CYC value, and the actual time indicated by the TSC. Additional CYC packets may be sent stand-alone, so that the processor can ensure that the decoder is aware of the number of cycles that have passed before the internal hardware counter wraps, or is reset due to other microarchitectural condition. There is no guarantee at what intervals these standalone CYC packets will be sent, except that they will be sent before the wrap occurs. An illustration is given below. 1. Table 35-13 documents support for the MSRs defining address ranges 0 and 1. Processors that provide XSAVE support for Intel Processor Trace support only those address ranges. Vol. 3C 35-31 INTEL® PROCESSOR TRACE Example 35-1. An Illustrative CYC Packet Example Time (cycles) Instruction Snapshot Generated Packets Comment x call %eax CYC(?), TIP ?Elapsed cycles from the previous CYC unknown x+2 call %ebx CYC(2), TIP 1 byte CYC packet; 2 cycles elapsed from the previous CYC x+8 jnz Foo (not taken) CYC(6) 1 byte CYC packet x+9 ret (compressed) x + 12 jnz Bar (taken) x + 16 ret (uncompressed) TNT, CYC(8), TIP 1 byte CYC packet x + 4111 CYC(4095) 2 byte CYC packet x + 12305 CYC(8194) 3 byte CYC packet CYC(4027), PIP 2 byte CYC packet x + 16332 35.3.6.1 mov cr3, %ebx Cycle Counter The cycle counter is implemented in hardware (independent of the time stamp counter or performance monitoring counters), and is a simple incrementing counter that does not saturate, but rather wraps. The size of the counter is implementation specific. The cycle counter is reset to zero any time that TriggerEn is cleared, and when a CYC packet is sent. The cycle counter will continue to count when ContextEn or FilterEn are cleared, and cycle packets will still be generated. It will not count during sleep states that result in Intel PT logic being powered-down, but will count up to the point where clocks are disabled, and resume counting once they are re-enabled. 35.3.6.2 Cycle Packet Semantics Cycle-accurate mode adheres to the following protocol: • • All packets that precede a CYC packet represent instructions or events that took place before the CYC time. • The CYC-eligible packet that immediately follows a CYC packet represents an instruction or event that took place at the same time as the CYC time. All packets that follow a CYC packet represent instructions or events that took place at the same time as, or after, the CYC time. These items above give the decoder a means to apply CYC packets to a specific instruction in the assembly stream. Most packets represent a single instruction or event, and hence the CYC packet that precedes each of those packets represents the retirement time of that instruction or event. In the case of TNT packets, up to 6 conditional branches and/or compressed RETs may be contained in the packet. In this case, the preceding CYC packet provides the retirement time of the first branch in the packet. It is possible that multiple branches retired in the same cycle as that first branch in the TNT, but the protocol will not make that obvious. Also note that a MTC packet could be generated in the same cycle as the first JCC in the TNT packet. In this case, the CYC would precede both the MTC and the TNT, and apply to both. Note that there are times when the cycle counter will stop counting, though cycle-accurate mode is enabled. After any such scenario, a CYC packet followed by TSC packet will be sent. See Section 35.8.3.2 to understand how to interpret the payload values Multi-packet Instructions or Events Some operations, such as interrupts or task switches, generate multiple packets. In these cases, multiple CYC packets may be sent for the operation, preceding each CYC-eligible packet in the operation. An example, using a task switch on a software interrupt, is shown below. 35-32 Vol. 3C INTEL® PROCESSOR TRACE Example 35-2. An Example of CYC in the Presence of Multi-Packet Operations Time (cycles) Instruction Snapshot Generated Packets x jnz Foo (not taken) x+2 ret (compressed) x+8 jnz Bar (taken) x+9 jmp %eax TNT, CYC(9), TIP x + 12 jnz Bar (not taken) CYC(3) x + 32 int3 (task gate) TNT, FUP, CYC(10), PIP, CYC(20), MODE.Exec, TIP 35.3.6.3 CYC(?), Cycle Thresholds Software can opt to reduce the frequency of cycle packets, a trade-off to save bandwidth and intrusion at the expense of precision. This is done by utilizing a cycle threshold (see Section 35.2.7.2). IA32_RTIT_CTL.CycThresh indicates to the processor the minimum number of cycles that must pass before the next CYC packet should be sent. If this value is 0, no threshold is used, and CYC packets can be sent every cycle in which a CYC-eligible packet is generated. If this value is greater than 0, the hardware will wait until the associated number of cycles have passed since the last CYC packet before sending another. CPUID provides the threshold options for CycThresh, see Section 35.3.1. Note that the cycle threshold does not dictate how frequently a CYC packet will be posted, it merely assigns the maximum frequency. If the cycle threshold is 16, a CYC packet can be posted no more frequently than every 16 cycles. However, once that threshold of 16 cycles has passed, it still requires a new CYC-eligible packet to be generated before a CYC will be inserted. Table 35-14 illustrates the threshold behavior. Table 35-14. An Illustrative CYC Packet Example Time (cycles) Instruction Snapshot Threshold 0 16 32 64 x jmp %eax CYC, TIP CYC, TIP CYC, TIP CYC, TIP x+9 call %ebx CYC, TIP TIP TIP TIP x + 15 call %ecx CYC, TIP TIP TIP TIP x + 30 jmp %edx CYC, TIP CYC, TIP TIP TIP x + 38 mov cr3, %eax CYC, PIP PIP CYC, PIP PIP x + 46 jmp [%eax] CYC, TIP CYC, TIP TIP TIP x + 64 call %edx CYC, TIP CYC, TIP TIP CYC,TIP x + 71 jmp %edx CYC, TIP TIP CYC,TIP TIP 35.3.7 Decoder Synchronization (PSB+) The PSB packet (Section 35.4.2.17) serves as a synchronization point for a trace-packet decoder. It is a pattern in the trace log for which the decoder can quickly scan to align packet boundaries. No legal packet combination can result in such a byte sequence. As such, it serves as the starting point for packet decode. To decode a trace log properly, the decoder needs more than simply to be aligned: it needs to know some state and potentially some timing information as well. The decoder should never need to retain any information (e.g., LastIP, call stack, compound packet event) across a PSB; all compound packet events will be completed before a PSB, and any compression state will be reset. When a PSB packet is generated, it is followed by a PSBEND packet (Section 35.4.2.18). One or more packets may be generated in between those two packets, and these inform the decoder of the current state of the processor. These packets, known collectively as PSB+, should be interpreted as “status only”, since they do not imply any change of state at the time of the PSB, nor are they associated directly with any instruction or event. Thus, the Vol. 3C 35-33 INTEL® PROCESSOR TRACE normal binding and ordering rules that apply to these packets outside of PSB+ can be ignored when these packets are between a PSB and PSBEND. They inform the decoder of the state of the processor at the time of the PSB. PSB+ can include: • • • Timestamp (TSC), if IA32_RTIT_CTL.TSCEn=1. • VMCS packet, if either the logical is in VMX root operation or the logical processor is in VMX non-root operation and the “conceal VMX from PT” VM-execution control is 0. • • • • Core Bus Ratio (CBR). Timestamp-MTC Align (TMA), if IA32_RTIT_CTL.TSCEn=1 && IA32_RTIT_CTL.MTCEn=1. Paging Information Packet (PIP), if ContextEn=1 and IA32_RTIT_CTL.OS=1. The non-root bit (NR) is set if the logical processor is in VMX non-root operation and the “conceal VMX from PT” VM-execution control is 0. MODE.TSX, if ContextEn=1 and BranchEn = 1. MODE.Exec, if PacketEn=1. Flow Update Packet (FUP), if PacketEn=1. PSB is generated only when TriggerEn=1; hence PSB+ has the same dependencies. The ordering of packets within PSB+ is not fixed. Timing packets such as CYC and MTC may be generated between PSB and PSBEND, and their meanings are the same as outside PSB+. A PSB+ can be lost in some scenarios. If IA32_RTIT_STATUS.TriggerEn is cleared just as the PSB threshold is reached, the PSB+ may not be generated. TriggerEn can be cleared by a WRMSR that clears IA32_RTIT_CTL.TraceEn, a VM-exit that clears IA32_RTIT_CTL.TraceEn, an #SMI, or any time that either IA32_RTIT_STATUS.Stopped is set (e.g., by a TraceStop or ToPA stop condition) or IA32_RTIT_STATUS.Error is set (e.g., by an Intel PT output error). On processors that support PSB preservation (CPUID.(EAX=14H, ECX=0):EBX.INJECTPSBPMI[6] = 1), setting IA32_RTIT_CTL.InjectPsbPmiOnEnable[56] = 1 will ensure that a PSB+ that is pending at the time PT is disabled will be recorded by setting IA32_RTIT_STATUS.PendPSB[6] = 1. A PSB will be inserted, and PendPSB cleared, when PT is later re-enabled while PendPSB = 1. Note that an overflow can occur during PSB+, and this could cause the PSBEND packet to be lost. For this reason, the OVF packet should also be viewed as terminating PSB+. If IA32_RTIT_STATUS.TriggerEn is cleared just as the PSB threshold is reached, the PSB+ may not be generated. TriggerEn can be cleared by a WRMSR that clears IA32_RTIT_CTL.TraceEn, a VM-exit that clears IA32_RTIT_CTL.TraceEn, an #SMI, or any time that either IA32_RTIT_STATUS.Stopped is set (e.g., by a TraceStop or ToPA stop condition) or IA32_RTIT_STATUS.Error is set (e.g., by an Intel PT output error). 35.3.8 Internal Buffer Overflow In the rare circumstances when new packets need to be generated but the processor’s dedicated internal buffers are all full, an “internal buffer overflow” occurs. On such an overflow packet generation ceases (as packets would need to enter the processor’s internal buffer) until the overflow resolves. Once resolved, packet generation resumes. When the buffer overflow is cleared, an OVF packet (Section 35.4.2.16) is generated, and the processor ensures that packets which follow the OVF are not compressed (IP compression or RET compression) against packets that were lost. If IA32_RTIT_CTL.BranchEn = 1, the OVF packet will be followed by a FUP if the overflow resolves while PacketEn=1. If the overflow resolves while PacketEn = 0 no packet is generated, but a TIP.PGE will naturally be generated later, once PacketEn = 1. The payload of the FUP or TIP.PGE will be the Current IP of the first instruction upon which tracing resumes after the overflow is cleared. If the overflow resolves while PacketEn=1, only timing packets may come between the OVF and the FUP. If the overflow resolves while PacketEn=0, any other packets that are not dependent on PacketEn may come between the OVF and the TIP.PGE. 35.3.8.1 Overflow Impact on Enables The address comparisons to ADDRn ranges, for IP filtering and TraceStop (Section 35.2.4.3), continue during a buffer overflow, and TriggerEn, ContextEn, and FilterEn may change during a buffer overflow. Like other packets, however, any TIP.PGE or TIP.PGD packets that would have been generated will be lost. Further, IA32_RTIT_STATUS.PacketByteCnt will not increment, since it is only incremented when packets are generated. 35-34 Vol. 3C INTEL® PROCESSOR TRACE If a TraceStop event occurs during the buffer overflow, IA32_RTIT_STATUS.Stopped will still be set, tracing will cease as a result. However, the TraceStop packet, and any TIP.PGD that result from the TraceStop, may be dropped. 35.3.8.2 Overflow Impact on Timing Packets Any timing packets that are generated during a buffer overflow will be dropped. If only a few MTC packets are dropped, a decoder should be able to detect this by noticing that the time value in the first MTC packet after the buffer overflow incremented by more than one. If the buffer overflow lasted long enough that 256 MTC packets are lost (and thus the MTC packet ‘wraps’ its 8-bit CTC value), then the decoder may be unable to properly understand the trace. This is not an expected scenario. No CYC packets are generated during overflow, even if the cycle counter wraps. Note that, if cycle-accurate mode is enabled, the OVF packet will generate a CYC packet. Because the cycle counter counts during overflows, this CYC packet can provide the duration of the overflow. However, there is a risk that the cycle counter wrapped during the overflow, which could render this CYC misleading. 35.3.9 Operational Errors Errors are detected as a result of packet output configuration problems, which can include output alignment issues, ToPA reserved bit violations, or overlapping packet output with restricted memory. See “ToPA Errors” in Section 35.2.6.2 for details on ToPA errors, and Section 35.2.6.4 for details on restricted memory errors. Operational errors are only detected and signaled when TraceEn=1. When an operational error is detected, tracing is disabled and the error is logged. Specifically, IA32_RTIT_STATUS.Error is set, which will cause IA32_RTIT_STATUS.TriggerEn to be 0. This will disable generation of all packets. Some causes of operational errors may lead to packet bytes being dropped. It should be noted that the timing of error detection may not be predictable. Errors are signaled when the processor encounters the problematic configuration. This could be as soon as packet generation is enabled but could also be later when the problematic entry or field needs to be used. Once an error is signaled, software should disable packet generation by clearing TraceEn, diagnose and fix the error condition, and clear IA32_RTIT_STATUS.Error. At this point, packet generation can be re-enabled. 35.4 TRACE PACKETS AND DATA TYPES This section details the data packets generated by Intel Processor Trace. It is useful for developers writing the interpretation code that will decode the data packets and apply it to the traced source code. 35.4.1 Packet Relationships and Ordering This section introduces the concept of packet “binding”, which involves determining the IP in a binary disassembly at which the change indicated by a given packet applies. Some packets have the associated IP as the payload (FUP, TIP), while for others the decoder need only search for the next instance of a particular instruction (or instructions) to bind the packet (TNT). However, in many cases, the decoder will need to consider the relationship between packets, and to use this packet context to determine how to bind the packet. Section 35.4.1.1 below provides detailed descriptions of the packets, including how packets bind to IPs in the disassembly, to other packets, or to nothing at all. Many packets listed are simple to bind, because they are generated in only a few scenarios. Those that require more consideration are typically part of “compound packet events”, such as interrupts, exceptions, and some instructions, where multiple packets are generated by a single operation (instruction or event). These compound packet events frequently begin with a FUP to indicate the source address (if it is not clear from the disassembly), and are concluded by a TIP or TIP.PGD packet that indicates the destination address (if one is provided). In this scenario, the FUP is said to be “coupled” with the TIP packet. Other packets could be in between the coupled FUP and TIP packet. Timing packets, such as TSC, MTC, CYC, or CBR, could arrive at any time, and hence could intercede in a compound packet event. If an operation changes CR3 or the processor’s mode of execution, a state update packet (i.e., PIP or MODE) is generated. The state changes Vol. 3C 35-35 INTEL® PROCESSOR TRACE indicated by these intermediate packets should be applied at the IP of the TIP* packet. A summary of compound packet events is provided in Table 35-15; see Section 35.4.1.1 for more per-packet details and Section 35.7 for more detailed packet generation examples. Table 35-15. Compound Packet Event Summary Event Type Unconditional, uncompressed control-flow transfer TSX Update Overflow 35.4.1.1 Beginning End Comment Any combination of PIP, VMCS, MODE.Exec, or none TIP or TIP.PGD FUP only for asynchronous events. Order of middle packets may vary. MODE.TSX, and (FUP or none) None TIP, TIP.PGD, or none FUP OVF PSB, PSBEND, or none FUP or TIP.PGE FUP if overflow resolves while ContextEn=1, else TIP.PGE. FUP or none Middle PIP/VMCS/MODE only if the operation modifies the state tracked by these respective packets. TIP/TIP.PGD only for TSX abort cases. Packet Blocks Packet blocks are a means to dump one or more groups of state values. Packet blocks begin with a Block Begin Packet (BBP), which indicates what type of state is held within the block. Following each BBP there may be one or more Block Item Packets (BIPs), which contain the state values. The block is terminated by either a Block End Packet (BEP) or another BBP indicating the start of a new block. The BIP packet includes an ID value that, when combined with the Type field from the BBP that preceded it, uniquely identifies the state value held in the BIP payload. The size of each BIP packet payload is provided by the Size field in the preceding BBP packet. Each block type can have up to 32 items defined for it. There is no guarantee, however, that each block of that type will hold all 32 items. For more details on which items to expect, see documentation on the specific block type of interest. See the BBP packet description (Section 35.4.2.26) for details on packet block generation scenarios. Packet blocks are entirely generated within an instruction or between instructions, which dictates the types of packets (aside from BIPs) that may be seen within a packet block. Packets that indicate control flow changes, or other indication of instruction completion, cannot be generated within a block. These are listed in the following table. Other packets, including timing packets, may occur between BBP and BEP. Table 35-16. Packets Forbidden Between BBP and BEP TNT TIP, TIP.PGE, TIP.PGD MODE.Exec, MODE.TSX PIP, VMCS TraceStop PSB, PSBEND PTW MWAIT It is possible to encounter an internal buffer overflow in the middle of a block. In such a case, it is guaranteed that packet generation will not resume in the middle of a block, and hence the OVF packet terminates the current block. Depending on the duration of the overflow, subsequent blocks may also be lost. Decoder Implications When a Block Begin Packet (BBP) is encountered, the decoder will need to decode some packets within the block differently from those outside a block. The Block Item Packet (BIP) header byte has the same encoding as a TNT packet outside of a block, but must be treated as a BIP header (with following payload) within one. 35-36 Vol. 3C INTEL® PROCESSOR TRACE When an OVF packet is encountered, the decoder should treat that as a block ending condition. Packet generation will not resume within a block. 35.4.2 Packet Definitions The following description of packet definitions are in tabular format. Figure 35-3 explains how to interpret them. Packet bits listed as “RSVD” are not guaranteed to be 0. Byte Number Name Header bits in Green Bit Number Payload in White Packet name Packet Format 7 6 5 4 3 2 1 0 0 0 1 0 1 0 1 0 1 1 1 1 0 0 0 1 1 0 2 0 1 0 0 0 1 1 0 Description of fields Dependencies Depends on packet generation configuration enable controls or other bits (Section 35.2.5). Generation Scenario Which instructions, events, or other scenarios can cause this packet to be generated. Description Description of the packet, including the purpose it serves, meaning of the information or payload, etc Application How a decoder should apply this packet. It may bind to a specific instruction from the binary, or to another packet in the stream, or have other implications on decode Figure 35-3. Interpreting Tabular Definition of Packet Format Vol. 3C 35-37 INTEL® PROCESSOR TRACE 35.4.2.1 Taken/Not-taken (TNT) Packet Table 35-17. TNT Packet Definition Name Taken/Not-taken (TNT) Packet Packet Format 0 7 6 5 4 3 2 1 0 1 B1 B2 B3 B4 B5 B6 0 Short TNT B1…BN represent the last N conditional branch or compressed RET (Section 35.4.2.2) results, such that B1 is oldest and BN is youngest. The short TNT packet can contain from 1 to 6 TNT bits. The long TNT packet can contain from 1 to 47 TNT bits. 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 1 1 2 B40 B41 B42 B43 B44 B45 B46 B47 3 B32 B33 B34 B35 B36 B37 B38 B39 4 B24 B25 B26 B27 B28 B29 B30 B31 5 B16 B17 B18 B19 B20 B21 B22 B23 6 B8 B9 B10 B11 B12 B13 B14 B15 7 1 B1 B2 B3 B4 B5 B6 B7 Long TNT Irrespective of how many TNT bits is in a packet, the last valid TNT bit is followed by a trailing 1, or Stop bit, as shown above. If the TNT packet is not full (fewer than 6 TNT bits for the Short TNT, or fewer than 47 TNT bits for the Long TNT), the Stop bit moves up, and the trailing bits of the packet are filled with 0s. Examples of these “partial TNTs” are shown below. 7 6 5 4 3 2 1 0 0 0 1 B1 B2 B3 B4 0 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 1 1 2 B24 B25 B26 B27 B28 B29 B30 B31 3 B16 B17 B18 B19 B20 B21 B22 B23 4 B8 B9 B10 B11 B12 B13 B14 B15 5 1 B1 B2 B3 B4 B5 B6 B7 6 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 Dependencies 35-38 Vol. 3C PacketEn Generation Scenario Short TNT Long TNT On a conditional branch or compressed RET, if it fills the TNT. Also, partial TNTs may be generated at any time, as a result of other packets being generated, or certain micro-architectural conditions occurring, before the TNT is full. INTEL® PROCESSOR TRACE Table 35-17. TNT Packet Definition (Contd.) Description Provides the taken/not-taken results for the last 1–N conditional branches (Jcc, J*CXZ, or LOOP) or compressed RETs (Section 35.4.2.2). The TNT payload bits should be interpreted as follows: • 1 indicates a taken conditional branch, or a compressed RET • 0 indicates a not-taken conditional branch TNT payload bits are stored internal to the processor in a TNT buffer, until either the buffer is filled or another packet is to be generated. In either case a TNT packet holding the buffered bits will be emitted, and the TNT buffer will be marked as empty. Application Each valid payload bit (that is, bits between the header bits and the trailing Stop bit) applies to an upcoming conditional branch or RET instruction. Once a decoder consumes a TNT packet with N valid payload bits, these bits should be applied to (and hence provide the destination for) the next N conditional branches or RETs 35.4.2.2 Target IP (TIP) Packet Table 35-18. IP Packet Definition Name Target IP (TIP) Packet Packet Format 7 6 0 IPBytes 1 TargetIP[7:0] 2 TargetIP[15:8] 3 TargetIP[23:16] 4 TargetIP[31:24] 5 TargetIP[39:32] 6 TargetIP[47:40] 7 TargetIP[55:48] 8 TargetIP[63:56] 5 Generation Scenario 4 3 2 1 0 0 1 1 0 1 Dependencies PacketEn Indirect branch (including un-compressed RET), far branch, interrupt, exception, INIT, SIPI, VM exit, VM entry, TSX abort, EENTER, EEXIT, ERESUME, AEX1. Description Provides the target for some control flow transfers Application Anytime a TIP is encountered, it indicates that control was transferred to the IP provided in the payload. The source of this control flow change, and hence the IP or instruction to which it binds, depends on the packets that precede the TIP. If a TIP is encountered and all preceding packets have already been bound, then the TIP will apply to the upcoming indirect branch, far branch, or VMRESUME. However, if there was a preceding FUP that remains unbound, it will bind to the TIP. Here, the TIP provides the target of an asynchronous event or TSX abort that occurred at the IP given in the FUP payload. Note that there may be other packets, in addition to the FUP, which will bind to the TIP packet. See the packet application descriptions for other packets for details. NOTES: 1. EENTER, EEXIT, ERESUME, AEX would be possible only for a debug enclave. IP Compression The IP payload in a TIP. FUP, TIP.PGE, or TIP.PGD packet can vary in size, based on the mode of execution, and the use of IP compression. IP compression is an optional compression technique the processor may choose to employ to reduce bandwidth. With IP compression, the IP to be represented in the payload is compared with the last IP sent out, via any of FUP, TIP, TIP.PGE, or TIP.PGD. If that previous IP had the same upper (most significant) address bytes, those matching bytes may be suppressed in the current packet. The processor maintains an internal state of the “Last IP” that was encoded in trace packets, thus the decoder will need to keep track of the “Last IP” state in Vol. 3C 35-39 INTEL® PROCESSOR TRACE software, to match fidelity with packets generated by hardware. “Last IP” is initialized to zero, hence if the first IP in the trace may be compressed if the upper bytes are zeroes. The “IPBytes” field of the IP packets (FUP, TIP, TIP.PGE, TIP.PGD) serves to indicate how many bytes of payload are provided, and how the decoder should fill in any suppressed bytes. The algorithm for reconstructing the IP for a TIP/FUP packet is shown in the table below. Table 35-19. FUP/TIP IP Reconstruction IPBytes Uncompressed IP Value 63:56 55:48 47:40 39:32 000b None, IP is out of context 001b Last IP[63:16] 010b Last IP[63:32] 011b IP Payload[47] extended IP Payload[47:0] 100b Last IP [63:48] IP Payload[47:0] 101b Reserved 110b IP Payload[63:0] 111b Reserved 31:24 23:16 15:8 7:0 IP Payload[15:0] IP Payload[31:0] The processor-internal Last IP state is guaranteed to be reset to zero when a PSB is sent out. This means that the IP that follows the PSB with either be un-compressed (011b or 110b, see Table 35-19), or compressed against zero. At times, “IPbytes” will have a value of 0. As shown above, this does not mean that the IP payload matches the full address of the last IP, but rather that the IP for this packet was suppressed. This is used for cases where the IP that applies to the packet is out of context. An example is the TIP.PGD sent on a SYSCALL, when tracing only USR code. In that case, no TargetIP will be included in the packet, since that would expose an instruction point at CPL = 0. When the IP payload is suppressed in this manner, Last IP is not cleared, and instead refers to the last IP packet with a non-zero IPBytes field. On processors that support a maximum linear address size of 32 bits, IP payloads may never exceed 32 bits (IPBytes <= 010b). Indirect Transfer Compression for Returns (RET) In addition to IP compression, TIP packets for near return (RET) instructions can also be compressed. If the RET target matches the next IP of the corresponding CALL, then the TIP packet is unneeded, since the decoder can deduce the target IP by maintaining a CALL/RET stack of its own. A CALL/RET stack can be maintained by the decoder by doing the following: 1. Allocate space to store 64 RET targets. 2. For near CALLs, push the Next IP onto the stack. Once the stack is full, new CALLs will force the oldest entry off the end of the stack, such that only the youngest 64 entries are stored. Note that this excludes zero-length CALLs, which are direct near CALLs with displacement zero (to the next IP). These CALLs typically don’t have matching RETs. 3. For near RETs, pop the top (youngest) entry off the stack. This will be the target of the RET. In cases where the RET is compressed, the target is guaranteed to match the value produced in 2) above. If the target is not compressed, a TIP packet will be generated with the RET target, which may differ from 2). The hardware ensure that packets read by the decoder will always have seen the CALL that corresponds to any compressed RET. The processor will never compress a RET across a PSB, a buffer overflow, or scenario where PacketEn=0. This means that a RET whose corresponding CALL executed while PacketEn=0, or before the last PSB, etc., will not be compressed. If the CALL/RET stack is manipulated or corrupted by software, and thereby causes a RET to transfer control to a target that is inconsistent with the CALL/RET stack, then the RET will not be compressed, and will produce a TIP 35-40 Vol. 3C INTEL® PROCESSOR TRACE packet. This can happen, for example, if software executes a PUSH instruction to push a target onto the stack, and a later RET uses this target. When a RET is compressed, a Taken indication is added to the TNT buffer. Because it sends no TIP packet, it also does not update the internal Last IP value, and thus the decoder should treat it the same way. If the RET is not compressed, it will generate a TIP packet (just like when RET compression is disabled, via IA32_RTIT_CTL.DisRETC). For processors that employ deferred TIPs (Section 35.4.2.3), an uncompressed RET will not be deferred, and hence will force out any accumulated TNTs or TIPs. This serves to avoid ambiguity, and make clear to the decoder whether the near RET was compressed, and hence a bit in the in-progress TNT should be consumed, or uncompressed, in which case there will be no in-progress TNT and thus a TIP should be consumed. Note that in the unlikely case that a RET executes in a different execution mode than the associated CALL, the decoder will need to model the same behavior with its CALL stack. For instance, if a CALL executes in 64-bit mode, a 64-bit IP value will be pushed onto the software stack. If the corresponding RET executes in 32-bit mode, then only the lower 32 target bits will be popped off of the stack, which may mean that the RET does not go to the CALL’s Next IP. This is architecturally correct behavior, and this RET could be compressed, thus the decoder should match this behavior 35.4.2.3 Deferred TIPs The processor may opt to defer sending out the TNT when TIPs are generated. Thus, rather than sending a partial TNT followed by a TIP, both packets will be deferred while the TNT accumulates more Jcc/RET results. Any number of TIP packets may be accumulated this way, such that only once the TNT is filled, or once another packet (e.g., FUP) is generated, the TNT will be sent, followed by all the deferred TIP packets, and finally terminated by the other packet(s) that forced out the TNT and TIP packets. Generation of many other packets (see list below) will force out the TNT and any accumulated TIP packets. This is an optional optimization in hardware to reduce the bandwidth consumption, and hence the performance impact, incurred by tracing. Table 35-20. TNT Examples with Deferred TIPs Code Flow Packets, Non-Deferred TIPS Packets, Deferred TIPS 0x1000 cmp %rcx, 0 0x1004 jnz Foo // not-taken 0x1008 jmp %rdx TNT(0b0), TIP(0x1308) 0x1308 cmp %rcx, 1 0x130c jnz Bar // not-taken 0x1310 cmp %rcx, 2 0x1314 jnz Baz // taken 0x1500 cmp %eax, 7 0x1504 jg Exit // not-taken TNT(0b010), TIP(0x1100) 0x1508 jmp %r15 0x1100 cmp %rbx, 1 0x1104 jg Start // not-taken 0x1108 add %rcx, %eax 0x110c … // an asynchronous Interrupt arrives INThandler: 0xcc00 pop %rdx TNT(0b0), FUP(0x110c), TIP(0xcc00) TNT(0b00100), TIP(0x1308), TIP(0x1100), FUP(0x110c), TIP(0xcc00) Vol. 3C 35-41 INTEL® PROCESSOR TRACE 35.4.2.4 Packet Generation Enable (TIP.PGE) Packet Table 35-21. TIP.PGE Packet Definition Name Target IP - Packet Generation Enable (TIP.PGE) Packet Packet Format 7 6 0 IPBytes 1 TargetIP[7:0] 2 TargetIP[15:8] 3 TargetIP[23:16] 4 TargetIP[31:24] 5 TargetIP[39:32] 6 TargetIP[47:40] 7 TargetIP[55:48] 8 TargetIP[63:56] 5 4 3 2 1 0 1 0 0 0 1 Dependencies PacketEn transitions to 1 Description Indicates that PacketEn has transitioned to 1. It provides the IP at which the tracing begins. This can occur due to any of the enables that comprise PacketEn transitioning from 0 to 1, as long as all the others are asserted. Examples: • TriggerEn: This is set on software write to set IA32_RTIT_CTL.TraceEn as long as the Stopped and Error bits in IA32_RTIT_STATUS are clear. The IP payload will be the Next IP of the WRMSR. • FilterEn: This is set when software jumps into the tracing region. This region is defined by enabling IP filtering in IA32_RTIT_CTL.ADDRn_CFG, and defining the range in IA32_RTIT_ADDRn_[AB], see. Section 35.2.4.3. The IP payload will be the target of the branch. • ContextEn: This is set on a CPL change, a CR3 write or any other means of changing ContextEn. The IP payload will be the Next IP of the instruction that changes context if it is not a branch, otherwise it will be the target of the branch. Application TIP.PGE packets bind to the instruction at the IP given in the payload. 35-42 Vol. 3C Generation Scenario Any branch instruction, control flow transfer, or MOV CR3 that sets PacketEn, a WRMSR that enables packet generation and sets PacketEn INTEL® PROCESSOR TRACE 35.4.2.5 Packet Generation Disable (TIP.PGD) Packet Table 35-22. TIP.PGD Packet Definition Name Target IP - Packet Generation Disable (TIP.PGD) Packet Packet Format 7 6 5 0 IPBytes 1 TargetIP[7:0] 2 TargetIP[15:8] 3 TargetIP[23:16] 4 TargetIP[31:24] 5 TargetIP[39:32] 6 TargetIP[47:40] 7 TargetIP[55:48] 8 TargetIP[63:56] Generation Scenario 4 3 2 1 0 0 0 0 0 1 Dependencies PacketEn transitions to 0 Any branch instruction, control flow transfer, or MOV CR3 that clears PacketEn, a WRMSR that disables packet generation and clears PacketEn Description Indicates that PacketEn has transitioned to 0. It will include the IP at which the tracing ends, unless ContextEn= 0 or TraceEn=0 at the conclusion of the instruction or event that cleared PacketEn. PacketEn can be cleared due to any of the enables that comprise PacketEn transitioning from 1 to 0. Examples: • TriggerEn: This is cleared on software write to clear IA32_RTIT_CTL.TraceEn, or when IA32_RTIT_STATUS.Stopped is set, or on operational error. The IP payload will be suppressed in this case, and the “IPBytes” field will have the value 0. • FilterEn: This is cleared when software jumps out of the tracing region. This region is defined by enabling IP filtering in IA32_RTIT_CTL.ADDRn_CFG, and defining the range in IA32_RTIT_ADDRn_[AB], see. Section 35.2.4.3. The IP payload will depend on the type of the branch. For conditional branches, the payload is suppressed (IPBytes = 0), and in this case the destination can be inferred from the disassembly. For any other type of branch, the IP payload will be the target of the branch. • ContextEn: This can happen on a CPL change, a CR3 write or any other means of changing ContextEn. See Section 35.2.4.3 for details. In this case, when ContextEn is cleared, there will be no IP payload. The “IPBytes” field will have value 0. Note that, in cases where a branch that would normally produce a TIP packet (i.e., far transfer, indirect branch, interrupt, etc) or TNT update (conditional branch or compressed RT) causes PacketEn to transition from 1 to 0, the TIP or TNT bit will be replaced with TIP.PGD. The payload of the TIP.PGD will be the target of the branch, unless the result of the instruction causes TraceEn or ContextEn to be cleared (ie, SYSCALL when IA32_RTIT_CTL.OS=0, In the case where a conditional branch clears FilterEn and hence PacketEn, there will be no TNT bit for this branch, replaced instead by the TIP.PGD. Application TIP.PGD can be produced by any branch instructions, as well as some non-branch instructions, that clear PacketEn. When produced by a branch, it replaces any TIP or TNT update that the branch would normally produce. In cases where there is an unbound FUP preceding the TIP.PGD, then the TIP.PGD is part of compound operation (i.e., asynchronous event or TSX abort) which cleared PacketEn. For most such cases, the TIP.PGD is simply replacing a TIP, and should be treated the same way. The TIP.PGD may or may not have an IP payload, depending on whether the operation cleared ContextEn. If there is not an associated FUP, the binding will depend on whether there is an IP payload. If there is an IP payload, then the TIP.PGD should be applied to either the next direct branch whose target matches the TIP.PGD payload, or the next branch that would normally generate a TIP or TNT packet. If there is no IP payload, then the TIP.PGD should apply to the next branch or MOV CR3 instruction. Vol. 3C 35-43 INTEL® PROCESSOR TRACE 35.4.2.6 Flow Update (FUP) Packet Table 35-23. FUP Packet Definition Name Flow Update (FUP) Packet Packet Format 7 0 IPBytes 1 IP[7:0] 2 IP[15:8] 3 IP[23:16] 4 IP[31:24] 5 IP[39:32] 6 IP[47:40] 7 IP[55:48] 8 IP[63:56] 6 5 Generation Scenario 4 3 2 1 0 1 1 1 0 1 Dependencies TriggerEn & ContextEn. (Typically depends on BranchEn and FilterEn as well, see Section 35.2.4 for details.) Asynchronous Events (interrupts, exceptions, INIT, SIPI, SMI, VM exit, #MC), XBEGIN, XEND, XABORT, XACQUIRE, XRELEASE, EENTER, EEXIT, ERESUME, EEE, AEX,1, INTO, INT1, INT3, INT n, a WRMSR that disables packet generation. Description Provides the source address for asynchronous events, and some other instructions. Is never sent alone, always sent with an associated TIP or MODE packet, and potentially others. Application FUP packets provide the IP to which they bind. However, they are never standalone, but are coupled with other packets. In TSX cases, the FUP is immediately preceded by a MODE.TSX, which binds to the same IP. A TIP will follow only in the case of TSX aborts, see Section 35.4.2.8 for details. Otherwise, FUPs are part of compound packet events (see Section 35.4.1). In these compound cases, the FUP provides the source IP for an instruction or event, while a following TIP (or TIP.PGD) packet will provide the destination IP. Other packets may be included in the compound event between the FUP and TIP. NOTES: 1. EENTER, EEXIT, ERESUME, EEE, AEX apply only if Intel Software Guard Extensions is supported. 35-44 Vol. 3C INTEL® PROCESSOR TRACE FUP IP Payload Flow Update Packet gives the source address of an instruction when it is needed. In general, branch instructions do not need a FUP, because the source address is clear from the disassembly. For asynchronous events, however, the source address cannot be inferred from the source, and hence a FUP will be sent. Table 35-24 illustrates cases where FUPs are sent, and which IP can be expected in those cases. Table 35-24. FUP Cases and IP Payload Event Flow Update IP Comment External Interrupt, NMI/SMI, Traps, Machine Check (trap-like), INIT/SIPI Address of next instruction (Next IP) that would have been executed Functionally, this matches the LBR FROM field value and also the EIP value which is saved onto the stack. Exceptions/Faults, Machine check (fault-like) Address of the instruction which took the This matches the similar functionality of LBR FROM field value and also the EIP value which is saved onto the stack. Software Interrupt Address of the software interrupt instruction (Current IP) This matches the similar functionality of LBR FROM field value, but does not match the EIP value which is saved onto the stack (Next Linear Instruction Pointer - NLIP). EENTER, EEXIT, ERESUME, Enclave Exiting Event (EEE), AEX1 Current IP of the instruction This matches the LBR FROM field value and also the EIP value which is saved onto the stack. XACQUIRE Address of the X* instruction XRELEASE, XBEGIN, XEND, XABORT, other transactional abort Current IP #SMI IP that is saved into SMRAM WRMSR that clears TraceEn Current IP exception/fault (Current IP) NOTES: 1. Information on EENTER, EEXIT, ERESUME, EEE, Asynchronous Enclave eXit (AEX) can be found in Intel® Software Guard Extensions Programming Reference. On a canonical fault due to sequentially fetching an instruction in non-canonical space (as opposed to jumping to non-canonical space), the IP of the fault (and thus the payload of the FUP) will be a non-canonical address. This is consistent with what is pushed on the stack for such faulting cases. If there are post-commit task switch faults, the IP value of the FUP will be the original IP when the task switch started. This is the same value as would be seen in the LBR_FROM field. But it is a different value as is saved on the stack or VMCS. Vol. 3C 35-45 INTEL® PROCESSOR TRACE 35.4.2.7 Paging Information (PIP) Packet Table 35-25. PIP Packet Definition Name Paging Information (PIP) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 2 CR3[11:5] or 0 3 CR3[19:12] 4 CR3[27:20] 5 CR3[35:28] 6 CR3[43:36] 7 CR3[51:44] 0 1 RSVD/NR Dependencies TriggerEn && ContextEn && IA32_RTIT_CTL.OS Generation Scenario MOV CR3, Task switch, INIT, SIPI, PSB+, VM exit, VM entry Description The CR3 payload shown includes only the address portion of the CR3 value. For PAE paging, CR3[11:5] are thus included. For other paging modes (32-bit and 4-level paging1), these bits are 0. This packet holds the CR3 address value. It will be generated on operations that modify CR3: • MOV CR3 operation • Task Switch • INIT and SIPI • VM exit, if “conceal VMX from PT” VM-exit control is 0 (see Section 35.5.1) • VM entry, if “conceal VMX from PT” VM-entry control is 0 PIPs are not generated, despite changes to CR3, on SMI and RSM. This is due to the special behavior on these operations, see Section 35.2.8.3 for details. Note that, for some cases of task switch where CR3 is not modified, no PIP will be produced. The purpose of the PIP is to indicate to the decoder which application is running, so that it can apply the proper binaries to the linear addresses that are being traced. The PIP packet contains the new CR3 value when CR3 is written. PIPs generated by VM entries set the NR bit. PIPs generated in VMX non-root operation set the NR bit if the “conceal VMX from PT” VM-execution control is 0 (see Section 35.5.1). All other PIPs clear the NR bit. Application The purpose of the PIP packet is to help the decoder uniquely identify what software is running at any given time. When a PIP is encountered, a decoder should do the following: 1) If there was a prior unbound FUP (that is, a FUP not preceded by a packet such as MODE.TSX that consumes it, and it hence pairs with a TIP that has not yet been seen), then this PIP is part of a compound packet event (Section 35.4.1). Find the ending TIP and apply the new CR3/NR values to the TIP payload IP. 2) Otherwise, look for the next MOV CR3, far branch, or VMRESUME/VMLAUNCH in the disassembly, and apply the new CR3 to the next (or target) IP. For examples of the packets generated by these flows, see Section 35.7. NOTES: 1. Earlier versions of this manual used the term “IA-32e paging” to identify 4-level paging. 35-46 Vol. 3C INTEL® PROCESSOR TRACE 35.4.2.8 MODE Packets MODE packets keep the decoder informed of various processor modes about which it needs to know in order to properly manage the packet output, or to properly disassemble the associated binaries. MODE packets include a header and a mode byte, as shown below. Table 35-26. General Form of MODE Packets 7 6 5 4 3 2 1 0 0 1 0 0 1 1 0 0 1 1 Leaf ID Mode The MODE Leaf ID indicates which set of mode bits are held in the lower bits. MODE.Exec Packet Table 35-27. MODE.Exec Packet Definition Name MODE.Exec Packet Packet Format 7 6 5 4 3 2 1 0 0 1 0 0 1 1 0 0 1 1 0 0 0 0 0 0 CS.D (CS.L & LMA) Dependencies PacketEn Generation Scenario Far branch, interrupt, exception, VM exit, and VM entry, if the mode changes. PSB+, and any scenario that can generate a TIP.PGE, such that the mode may have changed since the last MODE.Exec. Description Indicates whether software is in 16, 32, or 64-bit mode, by providing the CS.D and (CS.L & IA32_EFER.LMA) values. Essential for the decoder to properly disassemble the associated binary. CS.D (CS.L & IA32_EFER.LMA) Addressing Mode 1 1 N/A 0 1 64-bit mode 1 0 32-bit mode 0 0 16-bit mode MODE.Exec is sent at the time of a mode change, if PacketEn=1 at the time, or when tracing resumes, if necessary. In the former case, the MODE.Exec packet is generated along with other packets that result from the far transfer operation that changes the mode. In cases where the mode changes while PacketEn=0, the processor will send out a MODE.Exec along with the TIP.PGE when tracing resumes. The processor may opt to suppress the MODE.Exec when tracing resumes if the mode matches that from the last MODE.Exec packet, if there was no PSB in between. Application MODE.Exec always immediately precedes a TIP or TIP.PGE. The mode change applies to the IP address in the payload of the next TIP or TIP.PGE. Vol. 3C 35-47 INTEL® PROCESSOR TRACE MODE.TSX Packet Table 35-28. MODE.TSX Packet Definition Name MODE.TSX Packet Packet Format 7 6 5 4 3 2 0 1 0 0 1 1 1 0 0 1 0 0 0 0 0 1 0 TXAbort InTX Dependencies TriggerEn and ContextEn Description Indicates when a TSX transaction (either HLE or RTM) begins, commits, or aborts. Instructions executed transactionally will be “rolled back” if the transaction is aborted. Application 35-48 Vol. 3C Generation Scenario 1 XBEGIN, XEND, XABORT, XACQUIRE, XRELEASE, if InTX changes, Asynchronous TSX Abort, PSB+ TXAbort InTX Implication 1 1 N/A 0 1 Transaction begins, or executing transactionally 1 0 Transaction aborted 0 0 Transaction committed, or not executing transactionally If PacketEn=1, MODE.TSX always immediately precedes a FUP. If the TXAbort bit is zero, then the mode change applies to the IP address in the payload of the FUP. If TXAbort=1, then the FUP will be followed by a TIP, and the mode change will apply to the IP address in the payload of the TIP. MODE.TSX packets may be generated when PacketEn=0, due to FilterEn=0. In this case, only the last MODE.TSX generated before TIP.PGE need be applied. INTEL® PROCESSOR TRACE 35.4.2.9 TraceStop Packet Table 35-29. TraceStop Packet Definition Name TraceStop Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 1 1 Dependencies TriggerEn && ContextEn Generation Scenario Taken branch with target in TraceStop IP region, MOV CR3 in TraceStop IP region, or WRMSR that sets TraceEn in TraceStop IP region. Description Indicates when software has entered a user-configured TraceStop region. When the IP matches a TraceStop range while ContextEn and TriggerEn are set, a TraceStop action occurs. This disables tracing by setting IA32_RTIT_STATUS.Stopped, thereby clearing TriggerEn, and causes a TraceStop packet to be generated. The TraceStop action also forces FilterEn to 0. Note that TraceStop may not force a flush of internally buffered packets, and thus trace packet generation should still be manually disabled by clearing IA32_RTIT_CTL.TraceEn before examining output. See Section 35.2.4.3 for more details. Application If TraceStop follows a TIP.PGD (before the next TIP.PGE), then it was triggered either by the instruction that cleared PacketEn, or it was triggered by some later instruction that executed while FilterEn=0. In either case, the TraceStop can be applied at the IP of the TIP.PGD (if any). If TraceStop follows a TIP.PGE (before the next TIP.PGD), it should be applied at the last known IP. 35.4.2.10 Core:Bus Ratio (CBR) Packet Table 35-30. CBR Packet Definition Name Core:Bus Ratio (CBR) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 2 Core:Bus Ratio 3 Reserved Dependencies TriggerEn Generation Scenario After any frequency change, on C-state wake up, PSB+, and after enabling trace packet generation. Description Indicates the core:bus ratio of the processor core. Useful for correlating wall-clock time and cycle time. Application The CBR packet indicates the point in the trace when a frequency transition has occurred. On some implementations, software execution will continue during transitions to a new frequency, while on others software execution ceases during frequency transitions. There is not a precise IP provided, to which to bind the CBR packet. Vol. 3C 35-49 INTEL® PROCESSOR TRACE 35.4.2.11 Timestamp Counter (TSC) Packet Table 35-31. TSC Packet Definition Name Timestamp Counter (TSC) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 1 1 0 0 1 1 SW TSC[7:0] 2 SW TSC[15:8] 3 SW TSC[23:16] 4 SW TSC[31:24] 5 SW TSC[39:32] 6 SW TSC[47:40] 7 SW TSC[55:48] Dependencies IA32_RTIT_CTL.TSCEn && TriggerEn Description When enabled by software, a TSC packet provides the lower 7 bytes of the current TSC value, as returned by the RDTSC instruction. This may be useful for tracking wall-clock time, and synchronizing the packets in the log with other timestamped logs. Application TSC packet provides a wall-clock proxy of the event which generated it (packet generation enable, sleep state wake, etc). In all cases, TSC does not precisely indicate the time of any control flow packets; however, all preceding packets represent instructions that executed before the indicated TSC time, and all subsequent packets represent instructions that executed after it. There is not a precise IP to which to bind the TSC packet. 35-50 Vol. 3C Generation Scenario Sent after any event that causes the processor clocks or Intel PT timing packets (such as MTC or CYC) to stop, This may include P-state changes, wake from C-state, or clock modulation. Also on transition of TraceEn from 0 to 1. INTEL® PROCESSOR TRACE 35.4.2.12 Mini Time Counter (MTC) Packet Table 35-32. MTC Packet Definition Name Mini time Counter (MTC) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 1 0 1 1 0 0 1 1 CTC[N+7:N] Dependencies IA32_RTIT_CTL.MTCEn && TriggerEn Generation Scenario Periodic, based on the core crystal clock, or Always Running Timer (ART). Description When enabled by software, an MTC packet provides a periodic indication of wall-clock time. The 8-bit CTC (Common Timestamp Copy) payload value is set to (ART >> N) & FFH. The frequency of the ART is related to the Maximum Non-Turbo frequency, and the ratio can be determined from CPUID leaf 15H, as described in Section 35.8.3. Software can select the threshold N, which determines the MTC frequency by setting the IA32_RTIT_CTL.MTCFreq field (see Section 35.2.7.2) to a supported value using the lookup enumerated by CPUID (see Section 35.3.1). See Section 35.8.3 for details on how to use the MTC payload to track TSC time. MTC provides 8 bits from the ART, starting with the bit selected by MTCFreq to dictate the frequency of the packet. Whenever that 8-bit range being watched changes, an MTC packet will be sent out with the new value of that 8-bit range. This allows the decoder to keep track of how much wall-clock time has elapsed since the last TSC packet was sent, by keeping track of how many MTC packets were sent and what their value was. The decoder can infer the truncated bits, CTC[N-1:0], are 0 at the time of the MTC packet. There are cases in which MTC packet can be dropped, due to overflow or other micro-architectural conditions. The decoder should be able to recover from such cases by checking the 8-bit payload of the next MTC packet, to determine how many MTC packets were dropped. It is not expected that >256 consecutive MTC packets should ever be dropped. Application MTC does not precisely indicate the time of any other packet, nor does it bind to any IP. However, all preceding packets represent instructions or events that executed before the indicated ART time, and all subsequent packets represent instructions that executed after, or at the same time as, the ART time. Vol. 3C 35-51 INTEL® PROCESSOR TRACE 35.4.2.13 TSC/MTC Alignment (TMA) Packet Table 35-33. TMA Packet Definition Name TSC/MTC Alignment (TMA) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 1 1 2 CTC[7:0] 3 CTC[15:8] 4 Reserved 5 FastCounter[7:0] 6 Reserved 0 FC[8] Dependencies IA32_RTIT_CTL.MTCEn && IA32_RTIT_CTL.TSCEn && TriggerEn Description The TMA packet serves to provide the information needed to allow the decoder to correlate MTC packets with TSC packets. With this packet, when a MTC packet is encountered, the decoder can determine how many timestamp counter ticks have passed since the last TSC or MTC packet. See Section 35.8.3.2 for details on how to make this calculation. Application TMA is always sent immediately following a TSC packet, and the payload values are consistent with the TSC payload value. Thus the application of TMA matches that of TSC. 35-52 Vol. 3C Generation Scenario Sent with any TSC packet. INTEL® PROCESSOR TRACE 35.4.2.14 Cycle Count (CYC) Packet Table 35-34. Cycle Count Packet Definition Name Cycle Count (CYC) Packet Packet Format 7 6 5 4 3 2 1 0 Exp 1 1 0 Cycle Counter[4:0] 1 Cycle Counter[11:5] Exp 2 Cycle Counter[18:12] Exp ... ... (if Exp = 1 in the previous byte) Dependencies IA32_RTIT_CTL.CYCEn && TriggerEn Generation Sce- Can be sent at any time, though a maximum of one CYC packet is nario sent per core clock cycle. See Section 35.3.6 for CYC-eligible packets. Description The Cycle Counter field increments at the same rate as the processor core clock ticks, but with a variable length format (using a trailing EXP bit field) and a range-capped byte length. If the CYC value is less than 32, a 1-byte CYC will be generated, with Exp=0. If the CYC value is between 32 and 4095 inclusive, a 2-byte CYC will be generated, with byte 0 Exp=1 and byte 1 Exp=0. And so on. CYC provides the number of core clocks that have passed since the last CYC packet. CYC can be configured to be sent in every cycle in which an eligible packet is generated, or software can opt to use a threshold to limit the number of CYC packets, at the expense of some precision. These settings are configured using the IA32_RTIT_CTL.CycThresh field (see Section 35.2.7.2). For details on Cycle-Accurate Mode, IPC calculation, etc, see Section 35.3.6. When CycThresh=0, and hence no threshold is in use, then a CYC packet will be generated in any cycle in which any CYC-eligible packet is generated. The CYC packet will precede the other packets generated in the cycle, and provides the precise cycle time of the packets that follow. In addition to these CYC packets generated with other packets, CYC packets can be sent stand-alone. These packets serve simply to update the decoder with the number of cycles passed, and are used to ensure that a wrap of the processor’s internal cycle counter doesn’t cause cycle information to be lost. These stand-alone CYC packets do not indicate the cycle time of any other packet or operation, and will be followed by another CYC packet before any other CYC-eligible packet is seen. When CycThresh>0, CYC packets are generated only after a minimum number of cycles have passed since the last CYC packet. Once this threshold has passed, the behavior above resumes, where CYC will either be sent in the next cycle that produces other CYC-eligible packets, or could be sent stand-alone. When using CYC thresholds, only the cycle time of the operation (instruction or event) that generates the CYC packet is truly known. Other operations simply have their execution time bounded: they completed at or after the last CYC time, and before the next CYC time. Application CYC provides the offset cycle time (since the last CYC packet) for the CYC-eligible packet that follows. If another CYC is encountered before the next CYC-eligible packet, the cycle values should be accumulated and applied to the next CYC-eligible packet. If a CYC packet is generated by a TNT, note that the cycle time provided by the CYC packet applies to the first branch in the TNT packet. Vol. 3C 35-53 INTEL® PROCESSOR TRACE 35.4.2.15 VMCS Packet Table 35-35. VMCS Packet Definition Name VMCS Packet Packet Format Dependencies Description 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 1 0 0 0 2 VMCS pointer [19:12] 3 VMCS pointer [27:20] 4 VMCS pointer [35:28] 5 VMCS pointer [43:36] 6 VMCS pointer [51:44] TriggerEn && ContextEn; Also in VMX operation. Generation Scenario Generated on successful VMPTRLD, and optionally on SMM VM exits and VM entries that return from SMM (see Section 35.4.2.26). The VMCS packet provides a VMCS pointer for a decoder to determine the transition of code contexts: • On a successful VMPTRLD (i.e., a VMPTRLD that doesn’t fault, fail, or VM exit), the VMCS packet contains the logical processor’s VMCS pointer established by VMPTRLD (for subsequent execution of a VM guest context). • An SMM VM exit loads the logical processor’s VMCS pointer with the SMM-transfer VMCS pointer. If the “conceal VMX from PT” VM-exit control is 0 (see Section 35.5.1), a VMCS packet provides this pointer. See Section 35.6 on tracing inside and outside STM. • A VM entry that returns from SMM loads the logical processor’s VMCS pointer from a field in the SMM-transfer VMCS. If the “conceal VMX from PT” VM-entry control is 0, a VMCS packet provides this pointer. Whether the VM entry is to VMX root operation or VMX non-root operation is indicated by the PIP.NR bit. A VMCS packet generated before a VMCS pointer has been loaded, or after the VMCS pointer has been cleared will set all 64 bits in the VMCS pointer field. VMCS packets will not be seen on processors with IA32_VMX_MISC[bit 14]=0, as these processors do not allow TraceEn to be set in VMX operation. Application 35-54 Vol. 3C The purpose of the VMCS packet is to help the decoder uniquely identify changes in the executing software context in situations that CR3 may not be unique. When a VMCS packet is encountered, a decoder should do the following: • If there was a prior unbound FUP (that is, a FUP not preceded by a packet such as MODE.TSX that consumes it, and it hence pairs with a TIP that has not yet been seen), then this VMCS is part of a compound packet event (Section 35.4.1). Find the ending TIP and apply the new VMCS base pointer value to the TIP payload IP. • Otherwise, look for the next VMPTRLD, VMRESUME, or VMLAUNCH in the disassembly, and apply the new VMCS base pointer on the next VM entry. For examples of the packets generated by these flows, see Section 35.7. INTEL® PROCESSOR TRACE 35.4.2.16 Overflow (OVF) Packet Table 35-36. OVF Packet Definition Name Overflow (OVF) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 1 Dependencies TriggerEn Generation Scenario On resolution of internal buffer overflow Description OVF simply indicates to the decoder that an internal buffer overflow occurred, and packets were likely lost. If BranchEN= 1, OVF is followed by a FUP or TIP.PGE which will provide the IP at which packet generation resumes. See Section 35.3.8. Application When an OVF packet is encountered, the decoder should skip to the IP given in the subsequent FUP or TIP.PGE. The cycle counter for the CYC packet will be reset at the time the OVF packet is sent. Software should reset its call stack depth on overflow, since no RET compression is allowed across an overflow. Similarly, any IP compression that follows the OVF is guaranteed to use as a reference LastIP the IP payload of an IP packet that preceded the overflow. 35.4.2.17 Packet Stream Boundary (PSB) Packet Table 35-37. PSB Packet Definition Name Packet Stream Boundary (PSB) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 1 0 2 0 0 0 0 0 0 1 0 3 1 0 0 0 0 0 1 0 4 0 0 0 0 0 0 1 0 5 1 0 0 0 0 0 1 0 6 0 0 0 0 0 0 1 0 7 1 0 0 0 0 0 1 0 8 0 0 0 0 0 0 1 0 9 1 0 0 0 0 0 1 0 10 0 0 0 0 0 0 1 0 11 1 0 0 0 0 0 1 0 12 0 0 0 0 0 0 1 0 13 1 0 0 0 0 0 1 0 14 0 0 0 0 0 0 1 0 15 1 0 0 0 0 0 1 0 Vol. 3C 35-55 INTEL® PROCESSOR TRACE Table 35-37. PSB Packet Definition (Contd.) Dependencies TriggerEn Generation Scenario Periodic, based on the number of output bytes generated while tracing. PSB is sent when IA32_RTIT_STATUS.PacketByteCnt=0, and each time it crosses the software selected threshold after that. May be sent for other micro-architectural conditions as well. Description PSB is a unique pattern in the packet output log, and hence serves as a sync point for the decoder. It is a pattern that the decoder can search for in order to get aligned on packet boundaries. This packet is periodic, based on the number of output bytes, as indicated by IA32_RTIT_STATUS.PacketByteCnt. The period is chosen by software, via IA32_RTIT_CTL.PSBFreq (see Section 35.2.7.2). Note, however, that the PSB period is not precise, it simply reflects the average number of output bytes that should pass between PSBs. The processor will make a best effort to insert PSB as quickly after the selected threshold is reached as possible. The processor also may send extra PSB packets for some micro-architectural conditions. PSB also serves as the leading packet for a set of “status-only” packets collectively known as PSB+ (Section 35.3.7). Application When a PSB is seen, the decoder should interpret all following packets as “status only”, until either a PSBEND or OVF packet is encountered. “Status only” implies that the binding and ordering rules to which these packets normally adhere are ignored, and the state they carry can instead be applied to the IP payload in the FUP packet that is included. 35.4.2.18 PSBEND Packet Table 35-38. PSBEND Packet Definition Name PSBEND Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1 Dependencies TriggerEn Description PSBEND is simply a terminator for the series of “status only” (PSB+) packets that follow PSB (Section 35.3.7). Application When a PSBEND packet is seen, the decoder should cease to treat packets as “status only”. 35-56 Vol. 3C Generation Scenario Always follows PSB packet, separated by PSB+ packets INTEL® PROCESSOR TRACE 35.4.2.19 Maintenance (MNT) Packet Table 35-39. MNT Packet Definition Name Maintenance (MNT) Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 2 1 0 0 0 1 0 0 0 3 Payload[7:0] 4 Payload[15:8] 5 Payload[23:16] 6 Payload[31:24] 7 Payload[39:32] 8 Payload[47:40] 9 Payload[55:48] 10 Payload[63:56] Dependencies TriggerEn Generation Scenario Implementation specific. Description This packet is generated by hardware, the payload meaning is model-specific. Application Unless a decoder has been extended for a particular family/model/stepping to interpret MNT packet payloads, this packet should simply be ignored. It does not bind to any IP. 35.4.2.20 PAD Packet Table 35-40. PAD Packet Definition Name PAD Packet Packet Format 0 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 0 Dependencies TriggerEn Generation Scenario Implementation specific Description PAD is simply a NOP packet. Processor implementations may choose to add pad packets to improve packet alignment or for implementation-specific reasons. Application Ignore PAD packets. Vol. 3C 35-57 INTEL® PROCESSOR TRACE 35.4.2.21 PTWRITE (PTW) Packet Table 35-41. PTW Packet Definition Name PTW Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 IP PayloadBytes 1 0 0 1 0 2 Payload[7:0] 3 Payload[15:8] 4 Payload[23:16] 5 Payload[31:24] 6 Payload[39:32] 7 Payload[47:40] 8 Payload[55:48] 9 Payload[63:56] The PayloadBytes field indicates the number of bytes of payload that follow the header bytes. Encodings are as follows: PayloadBytes Bytes of Payload ‘00 4 ‘01 8 ‘10 Reserved ‘11 Reserved IP bit indicates if a FUP, whose payload will be the IP of the PTWRITE instruction, will follow. Dependencies TriggerEn & ContextEn & FilterEn & PTWEn Description Contains the value held in the PTWRITE operand. This packet is CYC-eligible, and hence will generate a CYC packet if IA32_RTIT_CTL.CYCEn=1 and any CYC Threshold has been reached. Application Binds to the associated PTWRITE instruction. The IP of the PTWRITE will be provided by a following FUP, when PTW.IP=1. 35-58 Vol. 3C Generation Scenario PTWRITE Instruction INTEL® PROCESSOR TRACE 35.4.2.22 Execution Stop (EXSTOP) Packet Table 35-42. EXSTOP Packet Definition Name EXSTOP Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 IP 1 1 0 0 0 1 0 IP bit indicates if a FUP will follow. Dependencies TriggerEn & PwrEvtEn Generation Scenario C-state entry, P-state change, or other processor clock powerdown. Includes : • Entry to C-state deeper than C0.0 • TM1/2 • STPCLK# • Frequency change due to IA32_CLOCK_MODULATION, Turbo Description This packet indicates that software execution has stopped due to processor clock powerdown. Later packets will indicate when execution resumes. If EXSTOP is generated while ContextEn is set, the IP bit will be set, and EXSTOP will be followed by a FUP packet containing the IP at which execution stopped. More precisely, this will be the IP of the oldest instruction that has not yet completed. This packet is CYC-eligible, and hence will generate a CYC packet if IA32_RTIT_CTL.CYCEn=1 and any CYC Threshold has been reached. Application If a FUP follows EXSTOP (hence IP bit set), the EXSTOP can be bound to the FUP IP. Otherwise the IP is not known. Time of powerdown can be inferred from the preceding CYC, if CYCEn=1. Combined with the TSC at the time of wake (if TSCEn=1), this can be used to determine the duration of the powerdown. Vol. 3C 35-59 INTEL® PROCESSOR TRACE 35.4.2.23 MWAIT Packet Table 35-43. MWAIT Packet Definition Name MWAIT Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 1 0 2 MWAIT Hints[7:0] 3 Reserved 4 Reserved 5 Reserved 6 Reserved 7 Reserved 8 Reserved 9 Reserved EXT[1:0] Dependencies TriggerEn & PwrEvtEn & ContextEn Description Indicates that an MWAIT operation to C-state deeper than C0.0 completed. The MWAIT hints and extensions passed in by software are exposed in the payload. For UMWAIT and TPAUSE, the EXT field holds the input register value that determines the optimized state requested. For entry to some highly optimized C0 sub-C-states, such as C0.1, no MWAIT packet is generated. This packet is CYC-eligible, and hence will generate a CYC packet if IA32_RTIT_CTL.CYCEn=1 and any CYC Threshold has been reached. Application The MWAIT packet should bind to the IP of the next FUP, which will be the IP of the instruction that caused the MWAIT. This FUP will be shared with EXSTOP. 35-60 Vol. 3C Generation Scenario MWAIT, UMWAIT, or TPAUSE instructions, or I/O redirection to MWAIT, that complete without fault or VMexit. INTEL® PROCESSOR TRACE 35.4.2.24 Power Entry (PWRE) Packet Table 35-44. PWRE Packet Definition Name PWRE Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 2 HW Reserved 3 Resolved Thread C-State Generation Scenario Resolved Thread Sub C-State Dependencies TriggerEn & PwrEvtEn Transition to a C-state deeper than C0.0. Description Indicates processor entry to the resolved thread C-state and sub C-state indicated. The processor will remain in this C-state until either another PWRE indicates the processor has moved to a C-state deeper than C0.0, or a PWRX packet indicates a return to C0.0. For entry to some highly optimized C0 sub-C-states, such as C0.1, no PWRE packet is generated. Note that some CPUs may allow MWAIT to request a deeper C-state than is supported by the core. These deeper Cstates may have platform-level implications that differentiate them. However, the PWRE packet will provide only the resolved thread C-state, which will not exceed that supported by the core. If the C-state entry was initiated by hardware, rather than a direct software request (such as MWAIT, UMWAIT, TPAUSE, HLT, or shutdown), the HW bit will be set to indicate this. Hardware Duty Cycling (see Section 14.5, “Hardware Duty Cycling (HDC)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B) is an example of such a case. Application When transitioning from C0.0 to a deeper C-state, the PWRE packet will be followed by an EXSTOP. If that EXSTOP packet has the IP bit set, then the following FUP will provide the IP at which the C-state entry occurred. Subsequent PWRE packets generated before the next PWRX should bind to the same IP. Vol. 3C 35-61 INTEL® PROCESSOR TRACE 35.4.2.25 Power Exit (PWRX) Packet Table 35-45. PWRX Packet Definition Name PWRX Packet Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 1 0 2 Last Core C-State Deepest Core C-State 3 Reserved Wake Reason 4 Reserved 5 Reserved 6 Reserved Dependencies TriggerEn & PwrEvtEn Description Indicates processor return to thread C0 from a C-state deeper than C0.0. For return from some highly optimized C0 sub-C-states, such as C0.1, no PWRX packet is generated. The Last Core C-State field provides the MWAIT encoding for the core C-state at the time of the wake. The Deepest Core C-State provides the MWAIT encoding for the deepest core C-state achieved during the sleep session, or since leaving thread C0. MWAIT encodings for C-states can be found in Table 4-11 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2B. Note that these values reflect only the core C-state, and hence will not exceed the maximum supported core C-state, even if deeper C-states can be requested. The Wake Reason field is one-hot, encoded as follows: Application 35-62 Vol. 3C Generation Scenario Transition from a C-state deeper than C0.0 to C0. Bit Field Meaning 0 Interrupt Wake due to external interrupt received. 1 Timer Deadline Wake due to timer expiration, such as UMWAIT/TPAUSE TSC-quanta. 2 Store to Monitored Address Wake due to store to monitored address. 3 HW Wake Wake due to hardware autonomous condition, such as HDC. PWRX will always apply to the same IP as the PWRE. The time of wake can be discerned from (optional) timing packets that precede PWRX. INTEL® PROCESSOR TRACE 35.4.2.26 Block Begin Packet (BBP) Table 35-46. Block Begin Packet Definition Name BBP Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 1 1 2 SZ Reserved Type[4:0] Dependencies TriggerEn Description This packet indicates the beginning of a block of packets which are collectively tied to a single event or instruction. The size of the block item payloads within this block is provided by the Size (SZ) bit: SZ=0: 8-byte block items SZ=1: 4-byte block items The meaning of the BIP payloads is provided by the Type field: BBP.Type Application Generation Scenario PEBS event, if IA32_PEBS_ENABLE.OUTPUT=1. Block name 0x00 Reserved 0x01 General-Purpose Registers 0x02..0x03 Reserved 0x04 PEBS Basic 0x05 PEBS Memory 0x06..0x07 Reserved 0x08 LBR Block 0 0x09 LBR Block 1 0x0A LBR Block 2 0x0B..0x0F Reserved 0x10 XMM Registers 0x11..0x1F Reserved A BBP will always be followed by a Block End Packet (BEP), and when the block is generated while ContextEn=1 that BEP will have IP=1 and be followed by a FUP that provides the IP to which the block should be bound. Note that, in addition to BEP, a block can be terminated by a BBP (indicating the start of a new block) or an OVF packet. Vol. 3C 35-63 INTEL® PROCESSOR TRACE 35.4.2.27 Block Item Packet (BIP) Table 35-47. Block Item Packet Definition Name BIP Packet Format If the preceding BBP.SZ=0: 7 6 0 5 4 3 ID[5:0] 1 2 1 0 1 0 0 2 1 0 1 0 0 Payload[7:0] 2 Payload[15:8] 3 Payload[23:16] 4 Payload[31:24] 5 Payload[39:32] 6 Payload[47:40] 7 Payload[55:48] 8 Payload[63:56] If the preceding BBP.SZ=1: 7 6 0 5 4 3 ID[5:0] 1 Payload[7:0] 2 Payload[15:8] 3 Payload[23:16] 4 Payload[31:24] Dependencies TriggerEn Generation Scenario See BBP. Description The size of the BIP payload is determined by the Size field in the preceding BBP packet. The BIP header provides the ID value that, when combined with the Type field from the preceding BBP, uniquely identifies the state value held in the BIP payload. See Table 35-48 below for the complete list. Application See BBP. BIP State Value Encodings The table below provides the encoding values for all defined block items. State items that are larger than 8 bytes, such as XMM register values, are broken into multiple 8-byte components. BIP packets with Size=1 (4 byte payload) will provide only the lower 4 bytes of the associated state value. Table 35-48. BIP Encodings BBP.Type BIP.ID State Value General-Purpose Registers 0x01 0x00 R/EFLAGS 0x01 0x01 R/EIP 0x01 0x02 R/EAX 0x01 0x03 R/ECX 35-64 Vol. 3C INTEL® PROCESSOR TRACE Table 35-48. BIP Encodings BBP.Type BIP.ID State Value 0x01 0x04 R/EDX 0x01 0x05 R/EBX 0x01 0x06 R/ESP 0x01 0x07 R/EBP 0x01 0x08 R/ESI 0x01 0x09 R/EDI 0x01 0x0A R8 0x01 0x0B R9 0x01 0x0C R10 0x01 0x0D R11 0x01 0x0E R12 0x01 0x0F R13 0x01 0x10 R14 0x01 0x11 R15 0x04 0x00 Instruction Pointer 0x04 0x01 Applicable Counters 0x04 0x02 Timestamp PEBS Basic Info (Section 18.9.2.2.1) PEBS Memory Info (Section 18.9.2.2.2) 0x05 0x00 MemAccessAddress 0x05 0x01 MemAuxInfo 0x05 0x02 MemAccessLatency 0x05 0x03 TSXAuxInfo 0x08 0x00 LBR[TOS-0]_FROM_IP 0x08 0x01 LBR[TOS-0]_TO_IP 0x08 0x02 LBR[TOS-0]_INFO 0x08 0x03 LBR[TOS-1]_FROM_IP 0x08 0x04 LBR[TOS-1]_TO_IP 0x08 0x05 LBR[TOS-1]_INFO 0x08 0x06 LBR[TOS-2]_FROM_IP 0x08 0x07 LBR[TOS-2]_TO_IP 0x08 0x08 LBR[TOS-2]_INFO 0x08 0x09 LBR[TOS-3]_FROM_IP 0x08 0x0A LBR[TOS-3]_TO_IP 0x08 0x0B LBR[TOS-3]_INFO 0x08 0x0C LBR[TOS-4]_FROM_IP 0x08 0x0D LBR[TOS-4]_TO_IP LBR_0 Vol. 3C 35-65 INTEL® PROCESSOR TRACE Table 35-48. BIP Encodings BBP.Type BIP.ID State Value 0x08 0x0E LBR[TOS-4]_INFO 0x08 0x0F LBR[TOS-5]_FROM_IP 0x08 0x10 LBR[TOS-5]_TO_IP 0x08 0x11 LBR[TOS-5]_INFO 0x08 0x12 LBR[TOS-6]_FROM_IP 0x08 0x13 LBR[TOS-6]_TO_IP 0x08 0x14 LBR[TOS-6]_INFO 0x08 0x15 LBR[TOS-7]_FROM_IP 0x08 0x16 LBR[TOS-7]_TO_IP 0x08 0x17 LBR[TOS-7]_INFO 0x08 0x18 LBR[TOS-8]_FROM_IP 0x08 0x19 LBR[TOS-8]_TO_IP 0x08 0x1A LBR[TOS-8]_INFO 0x08 0x1B LBR[TOS-9]_FROM_IP 0x08 0x1C LBR[TOS-9]_TO_IP 0x08 0x1D LBR[TOS-9]_INFO 0x08 0x1E LBR[TOS-10]_FROM_IP 0x08 0x1F LBR[TOS-10]_TO_IP LBR_1 0x09 0x00 LBR[TOS-10]_INFO 0x09 0x01 LBR[TOS-11]_FROM_IP 0x09 0x02 LBR[TOS-11]_TO_IP 0x09 0x03 LBR[TOS-11]_INFO 0x09 0x04 LBR[TOS-12]_FROM_IP 0x09 0x05 LBR[TOS-12]_TO_IP 0x09 0x06 LBR[TOS-12]_INFO 0x09 0x07 LBR[TOS-13]_FROM_IP 0x09 0x08 LBR[TOS-13]_TO_IP 0x09 0x09 LBR[TOS-13]_INFO 0x09 0x0A LBR[TOS-14]_FROM_IP 0x09 0x0B LBR[TOS-14]_TO_IP 0x09 0x0C LBR[TOS-14]_INFO 0x09 0x0D LBR[TOS-15]_FROM_IP 0x09 0x0E LBR[TOS-15]_TO_IP 0x09 0x0F LBR[TOS-15]_INFO 0x09 0x10 LBR[TOS-16]_FROM_IP 0x09 0x11 LBR[TOS-16]_TO_IP 0x09 0x12 LBR[TOS-16]_INFO 35-66 Vol. 3C INTEL® PROCESSOR TRACE Table 35-48. BIP Encodings BBP.Type BIP.ID State Value 0x09 0x13 LBR[TOS-17]_FROM_IP 0x09 0x14 LBR[TOS-17]_TO_IP 0x09 0x15 LBR[TOS-17]_INFO 0x09 0x16 LBR[TOS-18]_FROM_IP 0x09 0x17 LBR[TOS-18]_TO_IP 0x09 0x18 LBR[TOS-18]_INFO 0x09 0x19 LBR[TOS-19]_FROM_IP 0x09 0x1A LBR[TOS-19]_TO_IP 0x09 0x1B LBR[TOS-19]_INFO 0x09 0x1C LBR[TOS-20]_FROM_IP 0x09 0x1D LBR[TOS-20]_TO_IP 0x09 0x1E LBR[TOS-20]_INFO 0x09 0x1F LBR[TOS-21]_FROM_IP LBR_2 0x0A 0x00 LBR[TOS-21]_TO_IP 0x0A 0x01 LBR[TOS-21]_INFO 0x0A 0x02 LBR[TOS-22]_FROM_IP 0x0A 0x03 LBR[TOS-22]_TO_IP 0x0A 0x04 LBR[TOS-22]_INFO 0x0A 0x05 LBR[TOS-23]_FROM_IP 0x0A 0x06 LBR[TOS-23]_TO_IP 0x0A 0x07 LBR[TOS-23]_INFO 0x0A 0x08 LBR[TOS-24]_FROM_IP 0x0A 0x09 LBR[TOS-24]_TO_IP 0x0A 0x0A LBR[TOS-24]_INFO 0x0A 0x0B LBR[TOS-25]_FROM_IP 0x0A 0x0C LBR[TOS-25]_TO_IP 0x0A 0x0D LBR[TOS-25]_INFO 0x0A 0x0E LBR[TOS-26]_FROM_IP 0x0A 0x0F LBR[TOS-26]_TO_IP 0x0A 0x10 LBR[TOS-26]_INFO 0x0A 0x11 LBR[TOS-27]_FROM_IP 0x0A 0x12 LBR[TOS-27]_TO_IP 0x0A 0x13 LBR[TOS-27]_INFO 0x0A 0x14 LBR[TOS-28]_FROM_IP 0x0A 0x15 LBR[TOS-28]_TO_IP 0x0A 0x16 LBR[TOS-28]_INFO 0x0A 0x17 LBR[TOS-29]_FROM_IP Vol. 3C 35-67 INTEL® PROCESSOR TRACE Table 35-48. BIP Encodings BBP.Type BIP.ID State Value 0x0A 0x18 LBR[TOS-29]_TO_IP 0x0A 0x19 LBR[TOS-29]_INFO 0x0A 0x1A LBR[TOS-30]_FROM_IP 0x0A 0x1B LBR[TOS-30]_TO_IP 0x0A 0x1C LBR[TOS-30]_INFO 0x0A 0x1D LBR[TOS-31]_FROM_IP 0x0A 0x1E LBR[TOS-31]_TO_IP 0x0A 0x1F LBR[TOS-31]_INFO XMM Registers 0x10 0x00 XMM0_Q0 0x10 0x01 XMM0_Q1 0x10 0x02 XMM1_Q0 0x10 0x03 XMM1_Q1 0x10 0x04 XMM2_Q0 0x10 0x05 XMM2_Q1 0x10 0x06 XMM3_Q0 0x10 0x07 XMM3_Q1 0x10 0x08 XMM4_Q0 0x10 0x09 XMM4_Q1 0x10 0x0A XMM5_Q0 0x10 0x0B XMM5_Q1 0x10 0x0C XMM6_Q0 0x10 0x0D XMM6_Q1 0x10 0x0E XMM7_Q0 0x10 0x0F XMM7_Q1 0x10 0x10 XMM8_Q0 0x10 0x11 XMM8_Q1 0x10 0x12 XMM9_Q0 0x10 0x13 XMM9_Q1 0x10 0x14 XMM10_Q0 0x10 0x15 XMM10_Q1 0x10 0x16 XMM11_Q0 0x10 0x17 XMM11_Q1 0x10 0x18 XMM12_Q0 0x10 0x19 XMM12_Q1 0x10 0x1A XMM13_Q0 0x10 0x1B XMM13_Q1 0x10 0x1C XMM14_Q0 35-68 Vol. 3C INTEL® PROCESSOR TRACE Table 35-48. BIP Encodings BBP.Type BIP.ID State Value 0x10 0x1D XMM14_Q1 0x10 0x1E XMM15_Q0 0x10 0x1F XMM15_Q1 35.4.2.28 Block End Packet (BEP) Table 35-49. Block End Packet Definition Name BEP Packet Format 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 1 0 1 IP 0 1 1 0 0 1 1 Dependencies TriggerEn Description Indicates the end of a packet block. The IP bit indicates if a FUP will follow, and will be set if ContextEn=1. Application The block, from initial BBP to the BEP, binds to the FUP IP, if IP=1, and consumes the FUP. 35.5 Generation Scenario See BBP. TRACING IN VMX OPERATION On processors that IA32_VMX_MISC[bit 14] reports 1, TraceEn can be set in VMX operation. A series of mechanisms exist to allow the VMM to configure tracing based on the desired trace domain, and on the consumer of the trace output. The VMM can configure specific VMX controls to control what virtualization-specific data are included within the trace packets (see Section 35.5.1 for details). The MSR-load areas used by VMX transitions can be employed by the VMM to restrict tracing to the desired context (see Section 35.5.2 for details). These configuration options are summarized in Table 35-50. Table 35-50 covers common Intel PT usages while SMIs are handled by the default SMM treatment. Tracing with SMM Transfer Monitor is described in Section 35.6. Vol. 3C 35-69 INTEL® PROCESSOR TRACE Table 35-50. Common Usages of Intel PT and VMX Target Domain Output Consumer Virtualize Output Configure VMX Controls TraceEN Configuration Save/Restore MSR states of Trace Configuration System-Wide (VMM + VMs) Host N/A Default setting (no suppression) WRMSR or XRSTORS by Host N/A VMM Only Intel PT Aware VMM N/A Enable suppression Use VMX MSR-load areas to disable tracing in VM, enable tracing on VM exits N/A VM Only Intel PT Aware VMM N/A Enable suppression Use VMX MSR-load areas to enable tracing in VM, disable tracing on VM exits N/A Intel PT Aware Guest(s) Per Guest VMM adds trace output virtualization Enable suppression Use VMX MSR-load areas to enable tracing in VM, disable tracing on VM exits VMM updates guest state on VM exits due to XRSTORS 35.5.1 VMX-Specific Packets and VMCS Controls In all of the usages of VMX and Intel PT, a decoder in the host or VMM context can identify the occurrences of VMX transitions with the aid of VMX-specific packets. There are two kinds of packets relevant to VMX: • VMCS packet. The VMX transitions of individual VMs can be distinguished by a decoder using the VMCSpointer field in a VMCS packet. A VMCS packet is sent on a successful execution of VMPTRLD, and its VMCSpointer field stores the VMCS pointer loaded by that execution. See Section 35.4.2.15 for details. • The NR (non-root) bit in a PIP packet. Normally, the NR bit is set in any PIP packet generated in VMX nonroot operation. In addition, PIP packets are generated with each VM entry and VM exit. Thus a transition of the NR bit from 0 to 1 indicates the occurrence of a VM entry, and a transition of 1 to 0 indicates the occurrence of a VM exit. There are VMX controls that a VMM can set to conceal some of this VMX-specific information (by suppressing its recording) and thereby prevent it from leaking across virtualization boundaries. There is one of these controls (each of which is called “conceal VMX from PT”) of each type of VMX control. Table 35-51. VMX Controls For Intel Processor Trace Type of VMX Control Secondary processor-based VM-execution control Bit Position1 19 Value 0 Behavior Each PIP generated in VM non-root operation will set the NR bit. PSB+ in VMX non-root operation will include the VMCS packet, to ensure that the decoder knows which guest is currently in use. 1 Each PIP generated in VMX non-root operation will clear the NR bit. PSB+ in VMX non-root operation will not include the VMCS packet. VM-exit control 24 0 Each VM exit generates a PIP in which the NR bit is clear. In addition, SMM VM exits generate VMCS packets. VM-entry control 17 1 VM exits do not generate PIPs, and no VMCS packets are generated on SMM VM exits. 0 Each VM entry generates a PIP in which the NR bit is set (except VM entries that return from SMM to VMX root operation). In addition, VM entries that return from SMM generate VMCS packets. 1 VM entries do not generate PIPs, and no VMCS packets are generated on VM entries that return from SMM. NOTES: 1. These are the positions of the control bits in the relevant VMX control fields. 35-70 Vol. 3C INTEL® PROCESSOR TRACE The 0-settings of these VMX controls enable all VMX-specific packet information. The scenarios that would use these default settings also do not require the VMM to use VMX MSR-load areas to enable and disable trace-packet generation across VMX transitions. If IA32_VMX_MISC[bit 14] reports 0, the 1-settings of the VMX controls in Table 35-51 are not supported, and VM entry will fail on any attempt to set them. 35.5.2 Managing Trace Packet Generation Across VMX Transitions In tracing scenarios that collect packets for both VMX root operation and VMX non-root operation, a host executive can manage the MSRs associated with trace packet generation directly. The states of these MSRs need not be modified using MSR load areas across VMX transitions. For tracing scenarios that collect packets only within VMX root operation or only within VMX non-root operation, the VMM can use the MSR load areas to toggle IA32_RTIT_CTL.TraceEn. 35.5.2.1 System-Wide Tracing When a host or VMM configures Intel PT to collect trace packets of the entire system, it can leave the relevant VMX controls clear to allow VMX-specific packets to provide information across VMX transitions. The VMX MSR-load areas need not be used to load Intel PT MSRs on VM exits or VM entries. The decoder will desire to identify the occurrence of VMX transitions. The packets of interests to a decoder are shown in Table 35-52. Table 35-52. Packets on VMX Transitions (System-Wide Tracing) Event VM exit Packets Description FUP(GuestIP) The FUP indicates at which point in the guest flow the VM exit occurred. This is important, since VM exit can be an asynchronous event. The IP will match that written into the VMCS. PIP(HostCR3, NR=0) The PIP packet provides the new host CR3 value, as well as indication that the logical processor is entering VMX root operation. This allows the decoder to identify the change of executing context from guest to host and load the appropriate set of binaries to continue decode. TIP(HostIP) The TIP indicates the destination IP, the IP of the first instruction to be executed in VMX root operation. Note, this packet could be preceded by a MODE.Exec packet (Section 35.4.2.8). This is generated only in cases where CS.D or (CS.L & EFER.LMA) change during the transition. VM entry PIP(GuestCR3, NR=1) The PIP packet provides the new guest CR3 value, as well as indication that the logical processor is entering VMX non-root operation. This allows the decoder to identify the change of executing context from host to guest and load the appropriate set of binaries to continue decode. TIP(GuestIP) The TIP indicates the destination IP, the IP of the first instruction to be executed in VMX nonroot operation. This should match the RIP loaded from the VMCS. Note, this packet could be preceded by a MODE.Exec packet (Section 35.4.2.8). This is generated only in cases where CS.D or (CS.L & EFER.LMA) change during the transition. Since the VMX controls that suppress packet generation are cleared, a VMCS packet will be included in all PSB+ for this usage scenario. Additionally, VMPTRLD will generate such a packet. Thus the decoder can distinguish the execution context of different VMs. When the host VMM configures a system to collect trace packets in this scenario, it should emulate CPUID to report CPUID.(EAX=07H, ECX=0):EBX[bit 26] as 0 to guests, indicating to guests that Intel PT is not available. VMX TSC Manipulation The TSC packets generated while in VMX non-root operation will include any changes resulting from the use of a VMM’s use of the TSC offsetting or TSC scaling VMX controls (see Chapter 25, “VMX Non-Root Operation”). In this system-wide usage model, the decoder may need to account for the effect of per-VM adjustments in the TSC Vol. 3C 35-71 INTEL® PROCESSOR TRACE packets generated in VMX non-root operation and the absence of TSC adjustments in TSC packets generated in VMX root operation. The VMM can supply this information to the decoder. 35.5.2.2 Host-Only Tracing When trace packets in VMX non-root operation are not desired, the VMM can use the VM-entry MSR-load area to load IA32_RTIT_CTL (clearing TraceEn) to disable trace-packet generation in guests, and use the VM-exit MSR-load area to load IA32_RTIT_CTL to set TraceEn. When tracing only the host, the decoder does not need information about the guests, and the VMX controls for suppressing VMX-specific packets can be set to reduce the packets generated. VMCS packets will still be generated on execution of VMPTRLD and in PSB+ generated in the host, but these will be unused by the decoder. The packets of interests to a decoder when trace packets are collected for host-only tracing are shown in Table 3553. Table 35-53. Packets on VMX Transitions (Host-Only Tracing) Event VM exit Packets TIP.PGE(HostIP) Description The TIP.PGE indicates that trace packet generation is enabled and gives the IP of the first instruction to be executed in VMX root operation. Note, this packet could be preceded by a MODE.Exec packet (Section 35.4.2.8). This is generated only in cases where CS.D or (CS.L & EFER.LMA) change during the transition. VM entry 35.5.2.3 TIP.PGD() The TIP indicates that trace packet generation was disabled. This ensure that all buffered packets are flushed out. Guest-Only Tracing A VMM can configure trace-packet generation while in VMX non-root operation for guests executing normally. This is accomplished by utilizing the VMX MSR-load areas on VM exits (see Section 24.7.2, “VM-Exit Controls for MSRs”) and VM entries (see Section 24.8.2, “VM-Entry Controls for MSRs”) to limit trace-packet generation to the guest environment. For this usage, the VM-entry MSR load area is programmed to enable trace packet generation; the VM-exit MSR load area is used to clear IA32_RTIT_CTL.TraceEn so as to disable trace-packet generation in the host. Further, if it is preferred that the guest packet stream contain no indication that execution was in VMX non-root operation, the VMM should set to 1 all the VMX controls enumerated in Table 35-51. 35.5.2.4 Virtualization of Guest Output Packet Streams Each Intel PT aware guest OS can produce one or more output packet streams to destination addresses specified as guest physical address using by context-switching IA32_RTIT_OUTPUT_BASE within the guest. The processor generates trace packets to the physical address specified in IA32_RTIT_OUTPUT_BASE, and those specified in the ToPA tables. Thus, a VMM that supports Intel PT aware guest OS may wish to virtualize the output configurations of IA32_RTIT_OUTPUT_BASE and ToPA for each trace configuration state of all the guests. 35.5.2.5 Emulation of Intel PT Traced State If a VMM emulates an element of processor state by taking a VM exit on reads and/or writes to that piece of state, and the state element impacts Intel PT packet generation or values, it may be incumbent upon the VMM to insert or modify the output trace data. If a VM exit is taken on a guest write to CR3 (including “MOV CR3” as well as task switches), the PIP packet normally generated on the CR3 write will be missing. To avoid decoder confusion when the guest trace is decoded, the VMM should emulate the missing PIP by writing it into the guest output buffer. If the guest CR3 value is manipulated, the VMM may also need to manipulate the IA32_RTIT_CR3_MATCH value, in order to ensure the trace behavior matches the guest's expectation. Similarly, if a VMM emulates the TSC value by taking a VM exit on RDTSC, the TSC packets generated in the trace may mismatch the TSC values returned by the VMM on RDTSC. To ensure that the trace can be properly aligned 35-72 Vol. 3C INTEL® PROCESSOR TRACE with software logs based on RDTSC, the VMM should either make corresponding modifications to the TSC packet values in the guest trace, or use mechanisms such as TSC offsetting or TSC scaling in place of exiting. 35.5.2.6 TSC Scaling When TSC scaling is enabled for a guest using Intel PT, the VMM should ensure that the value of Maximum NonTurbo Ratio[15:8] in MSR_PLATFORM_INFO (MSR 0CEH) and the TSC/”core crystal clock” ratio (EBX/EAX) in CPUID leaf 15H are set in a manner consistent with the resulting TSC rate that will be visible to the VM. This will allow the decoder to properly apply TSC packets, MTC packets (based on the core crystal clock or ART, whose frequency is indicated by CPUID leaf 15H), and CBR packets (which indicate the ratio of the processor frequency to the Max Non-Turbo frequency). Absent this, or separate indication of the scaling factor, the decoder will be unable to properly track time in the trace. See Section 35.8.3 for details on tracking time within an Intel PT trace. 35.5.2.7 Failed VM Entry The packets generated by a failed VM entry depend both on the VMCS configuration, as well as on the type of failure. The results to expect are summarized in the table below. Note that packets in italics may or may not be generated, depending on implementation choice, and the point of failure. Table 35-54. Packets on a Failed VM Entry Usage Model Entry Configuration Early Failure (fall through to next IP) Late Failure (VM-exit like) System-Wide No use of VM-entry MSR-load area TIP (NextIP) PIP(Guest CR3, NR=1), TraceEn 0->1 Packets (See Section 35.2.7.3), PIP(HostCR3, NR=0), TIP(HostIP) VMM Only VM-entry MSR-load area used to clear TraceEn TIP (NextIP) TraceEn 0->1 Packets (See Section 35.2.7.3), TIP(HostIP) VM Only VM-entry MSR-load area used to set TraceEn None None 35.5.2.8 VMX Abort VMX abort conditions take the processor into a shutdown state. On a VM exit that leads to VMX abort, some packets (FUP, PIP) may be generated, but any expected TIP, TIP.PGE, or TIP.PGD may be dropped. 35.6 TRACING AND SMM TRANSFER MONITOR (STM) The SMM-transfer monitor (STM) is a VMM that operates inside SMM while in VMX root operation. An STM operates in conjunction with an executive monitor. The latter operates outside SMM and in VMX root operation. Transitions from the executive monitor or its VMs to the STM are called SMM VM exits. The STM returns from SMM via a VM entry to the VM in VMX non-root operation or the executive monitor in VMX root operation. Intel PT supports tracing in an STM similar to tracing support for VMX operation as described above in Section 35.4.2.26. As a result, on a SMM VM exit resulting from #SMI, TraceEn is not saved and then cleared. Software can save the state of the trace configuration MSRs and clear TraceEn using the MSR load/save lists. 35.7 PACKET GENERATION SCENARIOS Table 35-55 and Table 35-56 illustrate the packets generated in various scenarios. In the heading row, PacketEn is abbreviated as PktEn, ContextEn as CntxEn. Note that this assumes that TraceEn=1 in IA32_RTIT_CTL, while TriggerEn=1 and Error=0 in IA32_RTIT_STATUS, unless otherwise specified. Entries that do not matter in packet generation are marked “D.C.” Packets followed by a “?” imply that these packets depend on additional factors, which are listed in the “Other Dependencies” column. Vol. 3C 35-73 INTEL® PROCESSOR TRACE The following acronyms are used in the packet examples below: • • • CLIP - Current LIP NLIP - Next Sequential LIP BLIP - Branch Target LIP In Table 35-55, PktEn is evaluated based on TiggerEn & ContextEn & FilterEn & BranchEn. Table 35-55. Packet Generation under Different Enable Conditions Case Operation PktEn Before PktEn CntxEn After After Other Dependencies Packets Output 1a Normal non-jump operation 0 0 D.C. None 1b Normal non-jump operation 1 1 1 None 2a WRMSR/XRSTORS/RSM that changes 0 TraceEn 0 -> 1, with PacketByteCnt >0 0 D.C. *TSC if TSCEn=1; *TMA if TSCEn=MTCEn=1 TSC?, TMA?, CBR 2b WRMSR/XRSTORS/RSM that changes 0 TraceEn 0 -> 1, with PacketByteCnt =0 0 D.C. *TSC if TSCEn=1; *TMA if TSCEn=MTCEn=1 PSB, PSBEND (see Section 35.4.2.17) 2d WRMSR/XRSTORS/RSM that changes 0 TraceEn 0 -> 1, with PacketByteCnt >0 1 1 TSC if TSCEn=1; TMA if TSCEn=MTCEn=1 TSC?, TMA?, CBR, MODE.Exec, TIP.PGE(NLIP) 2e 0 WRMSR/XRSTORS/RSM that changes TraceEn 0 -> 1, with PacketByteCnt =0 1 1 MODE.Exec, TIP.PGE(NLIP), PSB, PSBEND (see Section 35.4.2.8, 35.4.2.7, 35.4.2.13,35.4.2.15, 35.4.2.17) 3a WRMSR that changes TraceEn 1 -> 0 0 0 D.C. None 3b WRMSR that changes TraceEn 1 -> 0 1 0 D.C. FUP(CLIP), TIP.PGD() 5a MOV to CR3 0 0 0 None 5b MOV to CR3 0 1 1 5c MOV to CR3 1 0 0 5d MOV to CR3 1 1 1 *PIP.NR=1 if not in root operation and the “conceal VMX from PT” VM-execution control is 0 PIP(NewCR3, NR?) 5e MOV to CR3 1 0 1 *PIP.NR=1 if not in root operation and the “conceal VMX from PT” VM-execution control is 0 *TraceStop if executed in a TraceStop region PIP(NewCR3, NR?), TIP.PGD(NLIP), TraceStop? 5f MOV to CR3 0 0 1 TraceStop if executed in a TraceStop region PIP(NewCR3,NR?), TraceStop? 6a Unconditional direct near jump 0 0 D.C. 6b Unconditional direct near jump 1 0 1 35-74 Vol. 3C *PIP.NR=1 if not in root operation and the “conceal VMX from PT” VM-execution control is 0 *MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB PIP(NewCR3, NR?), MODE.Exec?, TIP.PGE(NLIP) TIP.PGD() None TraceStop if BLIP is in a TraceStop region TIP.PGD(BLIP), TraceStop? INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB Packets Output 6c Unconditional direct near jump 0 1 1 6d Unconditional direct near jump 1 1 1 None 7a Conditional taken jump or compressed RET that does not fill up the internal TNT buffer 0 0 D.C. None 7b Conditional taken jump or compressed RET 0 1 1 7d Conditional taken jump or compressed RET that fills up the internal TNT buffer 1 1 1 7e Conditional taken jump or compressed RET, with empty TNT buffer 1 0 1 TraceStop if BLIP is in a TraceStop region TIP.PGD(), TraceStop? 7f Conditional taken jump or compressed RET, with non-empty TNT buffer 1 0 1 TraceStop if BLIP is in a TraceStop region TNT, TIP.PGD(), TraceStop? 8a Conditional non-taken jump 0 0 D.C. None 8d Conditional not-taken jump that fills up the internal TNT buffer 1 1 1 TNT 9a Near indirect jump (JMP, CALL, or uncompressed RET) 0 0 D.C. None 9b Near indirect jump (JMP, CALL, or uncompressed RET) 0 1 1 MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB MODE.Exec?, TIP.PGE(BLIP) 9c Near indirect jump (JMP, CALL, or uncompressed RET) 1 0 1 TraceStop if BLIP is in a TraceStop region TIP.PGD(BLIP), TraceStop? 9d Near indirect jump (JMP, CALL, or uncompressed RET) 1 1 1 TIP(BLIP) 10a Far Branch (CALL/JMP/RET/SYS*/IRET) 0 0 0 None 10b Far Branch (CALL/JMP/RET/SYS*/IRET) 0 1 1 10c Far Branch (CALL/JMP/RET/SYS*/IRET) 1 0 0 MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB MODE.Exec?, TIP.PGE(BLIP) MODE.Exec?, TIP.PGE(BLIP) TNT *PIP if CR3 is updated (i.e., task switch), and OS=1; *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB PIP(new CR3, NR?), MODE.Exec?, TIP.PGE(BLIP) TIP.PGD() Vol. 3C 35-75 INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies Packets Output 10d Far Branch (CALL/JMP/RET/SYS*/IRET) 1 0 1 *PIP if CR3 is updated (i.e., task switch), and OS=1; *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region PIP(new CR3, NR?), TIP.PGD(BLIP), TraceStop? 10e Far Branch (CALL/JMP/RET/SYS*/IRET) 1 1 1 *PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; * MODE.Exec if the operation changes CS.L/D or IA32_EFER.LMA PIP(NewCR3, NR?)?, MODE.Exec?, TIP(BLIP) 10f Far Branch (CALL/JMP/RET/SYS*/IRET) 0 0 1 *PIP if CR3 is updated (i.e., task switch), and OS=1; *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region PIP(new CR3, NR?), TraceStop? 11a HW Interrupt 0 0 0 11b HW Interrupt 0 1 1 11c HW Interrupt 1 0 0 11d HW Interrupt 1 0 1 35-76 Vol. 3C None *PIP if CR3 is updated (i.e., task switch), and OS=1; *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; * MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB PIP(new CR3, NR?), MODE.Exec?, TIP.PGE(BLIP) FUP(NLIP), TIP.PGD() * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region FUP(NLIP), PIP(NewCR3, NR?)?, TIP.PGD(BLIP), TraceStop? INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies Packets Output 11e HW Interrupt 1 1 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; * MODE.Exec if the operation changes CS.L/D or IA32_EFER.LMA FUP(NLIP), PIP(NewCR3, NR?)?, MODE.Exec?, TIP(BLIP) 11f HW Interrupt 0 0 1 *PIP if CR3 is updated (i.e., task switch), and OS=1; *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region PIP(new CR3, NR?), TraceStop? 12a SW Interrupt 0 0 0 12b SW Interrupt 0 1 1 12c SW Interrupt 1 0 0 12d SW Interrupt 1 0 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region FUP(CLIP), PIP(NewCR3, NR?)?, TIP.PGD(BLIP), TraceStop? 12e SW Interrupt 1 1 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; * MODE.Exec if the operation changes CS.L/D or IA32_EFER.LMA FUP(CLIP), PIP(NewCR3, NR?)?, MODE.Exec?, TIP(BLIP) None * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB PIP(NewCR3, NR?)?, MODE.Exec?, TIP.PGE(BLIP) FUP(CLIP), TIP.PGD() Vol. 3C 35-77 INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies 12f SW Interrupt 0 0 1 13a Exception/Fault 0 0 0 13b Exception/Fault 0 1 1 13c Exception/Fault 1 0 0 13d Exception/Fault 1 0 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region FUP(CLIP), PIP(NewCR3, NR?)?, TIP.PGD(BLIP), TraceStop? 13e Exception/Fault 1 1 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; * MODE.Exec if the operation changes CS.L/D or IA32_EFER.LMA FUP(CLIP), PIP(NewCR3, NR?)?, MODE.Exec?, TIP(BLIP) 13f Exception/Fault 0 0 1 * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region PIP(NewCR3, NR?)?, TraceStop? 14a SMI (TraceEn cleared) 0 0 D.C. None 14b SMI (TraceEn cleared) 1 0 0 FUP(SMRAM.LIP), TIP.PGD() 14c SMI (TraceEn cleared) 1 1 1 NA 14f SMI (TraceEn cleared) 1 0 1 NA 35-78 Vol. 3C * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *TraceStop if BLIP is in a TraceStop region Packets Output PIP(NewCR3, NR?)?, TraceStop? None * PIP if CR3 is updated (i.e., task switch), and OS=1 *PIP.NR=1 if destination is not root operation and the “conceal VMX from PT” VMexecution control is 0; *MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB PIP(NewCR3, NR?)?, MODE.Exec?, TIP.PGE(BLIP) FUP(CLIP), TIP.PGD() INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies Packets Output 15a RSM, TraceEn restored to 0 0 0 0 None 15b RSM, TraceEn restored to 1 0 0 D.C. See WRMSR cases for packets on enable 15c RSM, TraceEn restored to 1 0 1 1 See WRMSR cases for packets on enable. FUP/TIP.PGE IP is SMRAM.LIP 15d RSM (TraceEn=1, goes to shutdown) 1 1 1 None 15e RSM (TraceEn=1, goes to shutdown) 1 0 0 None 15f RSM (TraceEn=1, goes to shutdown) 1 0 1 None 16a VM exit 0 0 1 16b VM exit, MSR list sets TraceEn=1 0 0 0 See WRMSR cases for packets on enable. FUP IP is VMCSh.LIP 16c VM exit, MSR list sets TraceEn=1 0 1 1 See WRMSR cases for packets on enable. FUP/TIP.PGE IP is VMCSh.LIP 16e VM exit 0 1 1 *PIP if OF=1 and the “conceal VMX from PT” VM-exit control is 0; *MODE.Exec if the value is different, since last TIP.PGD PIP(HostCR3, NR=0)?, MODE.Exec?, TIP.PGE(VMCSh.LIP) 16f VM exit, MSR list clears TraceEn=0 1 0 0 *PIP if OF=1 and the “conceal VMX from PT” VM-exit control is 0; FUP(VMCSg.LIP), PIP(HostCR3, NR=0)?, TIP.PGD 16g VM exit 1 0 1 *PIP if OF=1 and the “conceal VMX from PT” VM-exit control is 0; *TraceStop if VMCSh.LIP is in a TraceStop region FUP(VMCSg.LIP), PIP(HostCR3, NR=0)?, TIP.PGD(VMCSh.LIP), TraceStop? 16h VM exit 1 1 1 *PIP if OF=1 and the “conceal VMX from PT” VM-exit control is 0; *MODE.Exec if the value is different, since last TIP.PGD FUP(VMCSg.LIP), PIP(HostCR3, NR=0)?, MODE.Exec, TIP(VMCSh.LIP) 16i VM exit 0 0 0 None 16j VM exit, ContextEN 1->0 1 0 0 FUP(VMCSg.LIP), TIP.PGD 17a VM entry 0 0 0 None *PIP if OF=1 and the “conceal VMX from PT” VM-exit control is 0; *TraceStop if VMCSh.LIP is in a TraceStop region PIP(HostCR3, NR=0)?, TraceStop? Vol. 3C 35-79 INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies 17b VM entry 0 0 1 17c VM entry, MSR load list sets TraceEn=1 0 0 1 See WRMSR cases for packets on enable. FUP IP is VMCSg.LIP 17d VM entry, MSR load list sets TraceEn=1 0 1 1 See WRMSR cases for packets on enable. FUP/TIP.PGE IP is VMCSg.LIP 17f VM entry, FilterEN 0->1 0 1 1 *PIP if OF=1 and the “conceal VMX from PT” VMentry control is 0; *MODE.Exec if the value is different, since last TIP.PGD PIP(GuestCR3, NR=1)?, MODE.Exec?, TIP.PGE(VMCSg.LIP) 17g VM entry, MSR list clears TraceEn=0 1 0 0 *PIP if OF=1 and the “conceal VMX from PT” VMentry control is 0; PIP(GuestCR3, NR=1)?, TIP.PGD 17h VM entry 1 0 1 *PIP if OF=1 and the “conceal VMX from PT” VMentry control is 0; *TraceStop if VMCSg.LIP is in a TraceStop region PIP(GuestCR3, NR=1)?, TIP.PGD(VMCSg.LIP), TraceStop? 17i VM entry 1 1 1 *PIP if OF=1 and the “conceal VMX from PT” VMentry control is 0; *MODE.Exec if the value is different, since last TIP.PGD PIP(GuestCR3, NR=1)?, MODE.Exec, TIP(VMCSg.LIP) 17j VM entry, ContextEN 0->1 0 1 1 *MODE.Exec if the value is different, since last TIP.PGD MODE.Exec, TIP.PGE(VMCSg.LIP) 20a EENTER/ERESUME to non-debug enclave 0 0 0 None 20c EENTER/ERESUME to non-debug enclave 1 0 0 FUP(CLIP), TIP.PGD() 21a EEXIT from non-debug enclave 0 0 D.C. None 21b EEXIT from non-debug enclave 0 1 1 22a AEX/EEE from non-debug enclave 0 0 D.C. 22b AEX/EEE from non-debug enclave 0 1 1 23a EENTER/ERESUME to debug enclave 0 0 D.C. 23b EENTER/ERESUME to debug enclave 0 1 1 23c EENTER/ERESUME to debug enclave 1 0 0 35-80 Vol. 3C *PIP if OF=1 and the “conceal VMX from PT” VMentry control is 0; *TraceStop if VMCSg.LIP is in a TraceStop region Packets Output *MODE.Exec if the value is different, since last TIP.PGD PIP(GuestCR3, NR=1)?, TraceStop? MODE.Exec?, TIP.PGE(BLIP) None *MODE.Exec if the value is different, since last TIP.PGD MODE.Exec?, TIP.PGE(AEP.LIP) None *MODE.Exec if the value is different, since last TIP.PGD MODE.Exec?, TIP.PGE(BLIP) FUP(CLIP), TIP.PGD() INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies *TraceStop if BLIP is in a TraceStop region Packets Output 23d EENTER/ERESUME to debug enclave 0 0 1 FUP(CLIP), TIP.PGD(BLIP), TraceStop? 23e EENTER/ERESUME to debug enclave 1 1 1 24b EEXIT from debug enclave 0 1 1 *MODE.Exec if the value is different, since last TIP.PGD MODE.Exec?, TIP.PGE(BLIP) 24d EEXIT from debug enclave 1 0 1 *TraceStop if BLIP is in a TraceStop region FUP(CLIP), TIP.PGD(BLIP), TraceStop? 24e EEXIT from debug enclave 1 1 1 24f EEXIT from debug enclave 0 0 D.C. None 25a AEX/EEE from debug enclave 0 0 D.C. None 25b AEX/EEE from debug enclave 0 1 1 *MODE.Exec if the value is different, since last TIP.PGD MODE.Exec?, TIP.PGE(AEP.LIP) 25d AEX/EEE from debug enclave 1 0 1 *For AEX, FUP IP could be NLIP, for trap-like events FUP(CLIP), TIP.PGD(AEP.LIP) 25e AEX/EEE from debug enclave 1 1 1 *MODE.Exec if the value is different, since last TIP.PGD *For AEX, FUP IP could be NLIP, for trap-like events FUP(CLIP), MODE.Exec?, TIP(AEP.LIP) 26a XBEGIN/XACQUIRE 0 0 D.C. None 26d XBEGIN/XACQUIRE that does not set InTX 1 1 1 None 26e XBEGIN/XACQUIRE that sets InTX 1 1 1 MODE.TSX(InTX=1, TXAbort=0), FUP(CLIP) 27a XEND/XRELEASE 0 0 D.C. None 27d XEND/XRELEASE that does not clear InTX 1 1 1 None 27e XEND/XRELEASE that clears InTX 1 1 1 MODE.TSX(InTX=0, TXAbort=0), FUP(CLIP) 28a XABORT(Async XAbort, or other) 0 0 0 None 28b XABORT(Async XAbort, or other) 0 1 1 MODE.TSX(InTX=0, TXAbort=1), TIP.PGE(BLIP) 28c XABORT(Async XAbort, or other) 1 0 1 28d XABORT(Async XAbort, or other) 1 1 1 28e XABORT(Async XAbort, or other) 0 0 1 30a INIT (BSP) 0 0 0 30b INIT (BSP) 0 0 1 FUP(CLIP), TIP(BLIP) FUP(CLIP), TIP(BLIP) *TraceStop if BLIP is in a TraceStop region MODE.TSX(InTX=0, TXAbort=1), TIP.PGD (BLIP), TraceStop? MODE.TSX(InTX=0, TXAbort=1), FUP(CLIP), TIP(BLIP) *TraceStop if BLIP is in a TraceStop region MODE.TSX(InTX=0, TXAbort=1), TraceStop? None *TraceStop if RESET.LIP is in a TraceStop region PIP(0), TraceStop? Vol. 3C 35-81 INTEL® PROCESSOR TRACE Table 35-55. Packet Generation under Different Enable Conditions (Contd.) Case Operation PktEn Before PktEn CntxEn After After Other Dependencies * MODE.Exec if the value is different, since last TIP.PGD Packets Output 30c INIT (BSP) 0 1 1 MODE.Exec?, PIP(0), TIP.PGE(ResetLIP) 30d INIT (BSP) 1 0 0 30e INIT (BSP) 1 0 1 * PIP if OS=1 *TraceStop if RESET.LIP is in a TraceStop region FUP(NLIP), PIP(0), TIP.PGD, TraceStop? 30f INIT (BSP) 1 1 1 * MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB * PIP if OS=1 FUP(NLIP), PIP(0)?, MODE.Exec?, TIP(ResetLIP) 31a INIT (AP, goes to wait-for-SIPI) 0 D.C. D.C. 31b INIT (AP, goes to wait-for-SIPI) 1 D.C. D.C. 32a SIPI 0 0 0 32c SIPI 0 1 1 32d SIPI 1 0 0 32e SIPI 1 0 1 *TraceStop if SIPI LIP is in a TraceStop region TIP.PGD(SIPILIP); TraceStop? 32f SIPI 1 1 1 * MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB MODE.Exec?, TIP(SIPILIP) 33a MWAIT (to C0) D.C. D.C. D.C. 33b MWAIT (to higher-numbered C-State, packet sent on wake) D.C. D.C. D.C. FUP(NLIP), TIP.PGD() None * PIP if OS=1 FUP(NLIP), PIP(0) None * MODE.Exec if the mode has changed since the last MODE.Exec, or if no MODE.Exec since last PSB MODE.Exec?, TIP.PGE(SIPILIP) TIP.PGD None *TSC if TSCEn=1 *TMA if TSCEn=MTCEn=1 TSC?, TMA?, CBR In Table 35-56, PktEn is evaluated based on (TiggerEn & ContextEn & FilterEn & BranchEn & PTWEn). Table 35-56. PwrEvtEn and PTWEn Packet Generation under Different Enable Conditions Case Operation PktEn Before PktEn CntxEn After After Other Dependencies Packets Output 16.24a PTWRITE rm32/64, no fault D.C. D.C. D.C. None 16.24b PTWRITE rm32/64, no fault D.C. 0 0 None 16.24d PTWRITE rm32, no fault D.C. 1 1 * FUP, IP=1 if FUPonPTW=1 PTW(IP=1?, 4B, rm32_value), FUP(CLIP)? 16.24e PTWRITE rm64, no fault D.C. 1 1 * FUP, IP=1 if FUPonPTW=1 PTW(IP=1?, 8B, rm64_value), FUP(CLIP)? 16.25a PTWRITE mem32/64, fault D.C. D.C. D.C. 35-82 Vol. 3C See “Exception/fault” (cases 13[a-z] in Table 35-55) for BranchEn packets. INTEL® PROCESSOR TRACE 35.8 SOFTWARE CONSIDERATIONS 35.8.1 Tracing SMM Code Nothing prevents an SMM handler from configuring and enabling packet generation for its own use. As described in Section Section 35.2.8.3, SMI will always clear TraceEn, so the SMM handler would have to set TraceEn in order to enable tracing. There are some unique aspects and guidelines involved with tracing SMM code, which follow: 1. SMM should save away the existing values of any configuration MSRs that SMM intends to modify for tracing. This will allow the non-SMM tracing context to be restored before RSM. 2. It is recommended that SMM wait until it sets CSbase to 0 before enabling packet generation, to avoid possible LIP vs RIP confusion. 3. Packet output cannot be directed to SMRR memory, even while tracing in SMM. 4. Before performing RSM, SMM should take care to restore modified configuration MSRs to the values they had immediately after #SMI. This involves first disabling packet generation by clearing TraceEn, then restoring any other configuration MSRs that were modified. 5. RSM — Software must ensure that TraceEn=0 at the time of RSM. Tracing RSM is not a supported usage model, and the packets generated by RSM are undefined. — For processors on which Intel PT and LBR use are mutually exclusive (see Section 35.3.1.2), any RSM during which TraceEn is restored to 1 will suspend any LBR or BTS logging. 35.8.2 Cooperative Transition of Multiple Trace Collection Agents A third-party trace-collection tool should take into consideration the fact that it may be deployed on a processor that supports Intel PT but may run under any operating system. In such a deployment scenario, Intel recommends that tool agents follow similar principles of cooperative transition of single-use hardware resources, similar to how performance monitoring tools handle performance monitoring hardware: • Respect the “in-use” ownership of an agent who already configured the trace configuration MSRs, see architectural MSRs with the prefix “IA32_RTIT_” in Chapter 2, “Model-Specific Registers (MSRs)” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, where “in-use” can be determined by reading the “enable bits” in the configuration MSRs. • Relinquish ownership of the trace configuration MSRs by clearing the “enabled bits” of those configuration MSRs. 35.8.3 Tracking Time This section describes the relationships of several clock counters whose update frequencies reside in different domains that feed into the timing packets. To track time, the decoder also needs to know the regularity or irregularity of the occurrences of various timing packets that store those clock counters. Intel PT provides time information for three different but related domains: • Processor timestamp counter • This counter increments at the max non-turbo or P1 frequency, and its value is returned on a RDTSC. Its frequency is fixed. The TSC packet holds the lower 7 bytes of the timestamp counter value. The TSC packet occurs occasionally and are much less frequent than the frequency of the time stamp counter. The timestamp counter will continue to increment when the processor is in deep C-States, with the exception of processors reporting CPUID.80000007H:EDX.InvariantTSC[bit 8] =0. Core crystal clock The ratio of the core crystal clock to timestamp counter frequency is known as P, and can be calculated as CPUID.15H:EBX[31:0] / CPUID.15H:EAX[31:0]. The frequency of the core crystal clock is fixed and lower than Vol. 3C 35-83 INTEL® PROCESSOR TRACE that of the timestamp counter. The periodic MTC packet is generated based on software-selected multiples of the crystal clock frequency. The MTC packet is expected to occur more frequently than the TSC packet. • Processor core clock The processor core clock frequency can vary due to P-state and thermal conditions. The CYC packet provides elapsed time as measured in processor core clock cycles relative to the last CYC packet. A decoder can use all or some combination of these packets to track time at different resolutions throughout the trace packets. 35.8.3.1 Time Domain Relationships The three domains are related by the following formula: TimeStampValue = (CoreCrystalClockValue * P) + AdjustedProcessorCycles + Software_Offset; The CoreCrystalClockValue can provide the coarse-grained component of the TSC value. P, or the TSC/”core crystal clock” ratio, can be derived from CPUID leaf 15H, as described in Section 35.8.3. The AdjustedProcessorCycles component provides the fine-grained distance from the rising edge of the last core crystal clock. Specifically, it is a cycle count in the same frequency as the timestamp counter from the last crystal clock rising edge. The value is adjusted based on the ratio of the processor core clock frequency to the Maximum Non-Turbo (or P1) frequency. The Software_Offsets component includes software offsets that are factored into the timestamp value, such as IA32_TSC_ADJUST. 35.8.3.2 Estimating TSC within Intel PT For many usages, it may be useful to have an estimated timestamp value for all points in the trace. The formula provided in Section 35.8.3.1 above provides the framework for how such an estimate can be calculated from the various timing packets present in the trace. The TSC packet provides the precise timestamp value at the time it is generated; however, TSC packets are infrequent, and estimates of the current timestamp value based purely on TSC packets are likely to be very inaccurate for this reason. In order to get more precise timing information between TSC packets, CYC packets and/or MTC packets should be enabled. MTC packets provide incremental updates of the CoreCrystalClockValue. On processors that support CPUID leaf 15H, the frequency of the timestamp counter and the core crystal clock is fixed, thus MTC packets provide a means to update the running timestamp estimate. Between two MTC packets A and B, the number of crystal clock cycles passed is calculated from the 8-bit payloads of respective MTC packets: (CTCB - CTCA), where CTCi = MTCi[15:8] << IA32_RTIT_CTL.MTCFreq and i = A, B. The time from a TSC packet to the subsequent MTC packet can be calculated using the TMA packet that follows the TSC packet. The TMA packet provides both the crystal clock value (lower 16 bits, in the CTC field) and the AdjustedProcessorCycles value (in the FastCounter field) that can be used in the calculation of the corresponding core crystal clock value of the TSC packet. When the next MTC after a pair of TSC/TMA is seen, the number of crystal clocks passed since the TSC packet can be calculated by subtracting the TMA.CTC value from the time indicated by the MTCNext packet by CTCDelta[15:0] = (CTCNext[15:0] - TMA.CTC[15:0]), where CTCNext = MTCPayload << IA32_RTIT_CTL.MTCFreq. The TMA.FastCounter field provides the fractional component of the TSC packet into the next crystal clock cycle. CYC packets can provide further precision of an estimated timestamp value to many non-timing packets, by providing an indication of the time passed between other timing packets (MTCs or TSCs). When enabled, CYC packets are sent preceding each CYC-eligible packet, and provide the number of processor core clock cycles that have passed since the last CYC packet. Thus between MTCs and TSCs, the accumulated CYC values can be used to estimate the adjusted_processor_cycles component of the timestamp value. The accumulated CPU cycles will have to be adjusted to account for the difference in frequency between the processor core clock and the P1 frequency. The necessary adjustment can be estimated using the core:bus ratio value given in the CBR packet, by multiplying the accumulated cycle count value by P1/CBRpayload. 35-84 Vol. 3C INTEL® PROCESSOR TRACE Note that stand-alone TSC packets (that is, TSC packets that are not a part of a PSB+) are typically generated only when generation of other timing packets (MTCs and CYCs) has ceased for a period of time. Example scenarios include when Intel PT is re-enabled, or on wake after a sleep state. Thus any calculation of ART or cycle time leading up to a TSC packet will likely result in a discrepancy, which the TSC packet serves to correct. A greater level of precision may be achieved by calculating the CPU clock frequency, see Section 35.8.3.4 below for a method to do so using Intel PT packets. CYCs can be used to estimate time between TSCs even without MTCs, though this will likely result in a reduction in estimated TSC precision. 35.8.3.3 VMX TSC Manipulation When software executes in non-Root operation, additional offset and scaling factors may be applied to the TSC value. These are optional, but may be enabled via VMCS controls on a per-VM basis. See Chapter 25, “VMX NonRoot Operation” for details on VMX TSC offsetting and TSC scaling. Like the value returned by RDTSC, TSC packets will include these adjustments, but other timing packets (such as MTC, CYC, and CBR) are not impacted. In order to use the algorithm above to estimate the TSC value when TSC scaling is in use, it will be necessary for software to account for the scaling factor. See Section 35.5.2.6 for details. 35.8.3.4 Calculating Frequency with Intel PT Because Intel PT can provide both wall-clock time and processor clock cycle time, it can be used to measure the processor core clock frequency. Either TSC or MTC packets can be used to track the wall-clock time. By using CYC packets to count the number of processor core cycles that pass in between a pair of wall-clock time packets, the ratio between processor core clock frequency and TSC frequency can be derived. If the P1 frequency is known, it can be applied to determine the CPU frequency. See Section 35.8.3.1 above for details on the relationship between TSC, MTC, and CYC. Vol. 3C 35-85 INTEL® PROCESSOR TRACE 35-86 Vol. 3C 24.Updates to Chapter 36, Volume 3D Change bars show changes to Chapter 36 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3D: System Programming Guide, Part 4. -----------------------------------------------------------------------------------------Changes to chapter: Minor update to terminology used in Section 36.3 “Enclave Life Cycle”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 19 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS CHAPTER 36 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS 36.1 OVERVIEW ® Intel Software Guard Extensions (Intel® SGX) is a set of instructions and mechanisms for memory accesses added to Intel® Architecture processors. Intel SGX can encompass two collections of instruction extensions, referred to as SGX1 and SGX2, see Table 36-1 and Table 36-2. The SGX1 extensions allow an application to instantiate a protected container, referred to as an enclave. The enclave is a trusted area of memory, where critical aspects of the application functionality have hardware-enhanced confidentiality and integrity protections. New access controls to restrict access to software not resident in the enclave are also introduced. The SGX2 extensions allow additional flexibility in runtime management of enclave resources and thread execution within an enclave. Chapter 37 covers main concepts, objects and data structure formats that interact within the Intel SGX architecture. Chapter 38 covers operational aspects ranging from preparing an enclave, transferring control to enclave code, and programming considerations for the enclave code and system software providing support for enclave execution. Chapter 39 describes the behavior of Asynchronous Enclave Exit (AEX) caused by events while executing enclave code. Chapter 40 covers the syntax and operational details of the instruction and associated leaf functions available in Intel SGX. Chapter 41 describes interaction of various aspects of IA32 and Intel® 64 architectures with Intel SGX. Chapter 42 covers Intel SGX support for application debug, profiling and performance monitoring. OS Enclave Entry Table Enclave Heap App Code Enclave Stack Enclave Code App Code Figure 36-1. An Enclave Within the Application’s Virtual Address Space 36.2 ENCLAVE INTERACTION AND PROTECTION Intel SGX allows the protected portion of an application to be distributed in the clear. Before the enclave is built, the enclave code and data are free for inspection and analysis. The protected portion is loaded into an enclave where its code and data is measured. Once the application’s protected portion of the code and data are loaded into an enclave, memory access controls are in place to restrict access by external software. An enclave can prove its identity to a remote party and provide the necessary building-blocks for secure provisioning of keys and credentials. The application can also request an enclave-specific and platform-specific key that it can use to protect keys and data that it wishes to store outside the enclave.1 1. For additional information, see white papers on Intel SGX at http://software.intel.com/en-us/intel-isa-extensions. Vol. 3D 36-1 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS Intel SGX introduces two significant capabilities to the Intel Architecture. First is the change in enclave memory access semantics. The second is protection of the address mappings of the application. 36.3 ENCLAVE LIFE CYCLE Enclave memory management is divided into two parts: address space allocation and memory commitment. Address space allocation is the specification of the range of linear addresses that the enclave may use. This range is called the ELRANGE. No actual resources are committed to this region. Memory commitment is the assignment of actual memory resources (as pages) within the allocated address space. This two-phase technique allows flexibility for enclaves to control their memory usage and to adjust dynamically without overusing memory resources when enclave needs are low. Commitment adds physical pages to the enclave. An operating system may support separate allocate and commit operations. During enclave creation, code and data for an enclave are loaded from a clear-text source, i.e. from non-enclave memory. Untrusted application code starts using an initialized enclave typically by using the EENTER leaf function provided by Intel SGX to transfer control to the enclave code residing in the protected Enclave Page Cache (EPC). The enclave code returns to the caller via the EEXIT leaf function. Upon enclave entry, control is transferred by hardware to software inside the enclave. The software inside the enclave switches the stack pointer to one inside the enclave. When returning back from the enclave, the software swaps back the stack pointer then executes the EEXIT leaf function. On processors that support the SGX2 extensions, an enclave writer may add memory to an enclave using the SGX2 instruction set, after the enclave is built and running. These instructions allow adding additional memory resources to the enclave for use in such areas as the heap. In addition, SGX2 instructions allow the enclave to add new threads to the enclave. The SGX2 features provide additional capabilities to the software model without changing the security properties of the Intel SGX architecture. Calling an external procedure from an enclave could be done using the EEXIT leaf function. Software would use EEXIT and a software convention between the trusted section and the untrusted section. An active enclave consumes resources from the Enclave Page Cache (EPC, see Section 36.5). Intel SGX provides the EREMOVE instruction that an EPC manager can use to reclaim EPC pages committed to an enclave. The EPC manager uses EREMOVE on every enclave page when the enclave is torn down. After successful execution of EREMOVE the EPC page is available for allocation to another enclave. 36.4 DATA STRUCTURES AND ENCLAVE OPERATION There are 2 main data structures associated with operating an enclave, the SGX Enclave Control Structure (SECS, see Section 37.7) and the Thread Control Structure (TCS, see Section 37.8). There is one SECS for each enclave. The SECS contains meta-data about the enclave which is used by the hardware and cannot be directly accessed by software. Included in the SECS is a field that stores the enclave build measurement value. This field, MRENCLAVE, is initialized by the ECREATE instruction and updated by every EADD and EEXTEND. It is locked by EINIT. Every enclave contains one or more TCS structures. The TCS contains meta-data used by the hardware to save and restore thread specific information when entering/exiting the enclave. There is one field, FLAGS, that may be accessed by software. This field can only be accessed by debug enclaves. The flag bit, DBGOPTIN, allows to single step into the thread associated with the TCS. (see Section 37.8.1) The SECS is created when ECREATE (see Table 36-1) is executed. The TCS can be created using the EADD instruction or the SGX2 instructions (see Table 36-2). 36.5 ENCLAVE PAGE CACHE The Enclave Page Cache (EPC) is the secure storage used to store enclave pages when they are a part of an executing enclave. For an EPC page, hardware performs additional access control checks to restrict access to the page. After the current page access checks and translations are performed, the hardware checks that the EPC page 36-2 Vol. 3D INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS is accessible to the program currently executing. Generally an EPC page is only accessed by the owner of the executing enclave or an instruction which is setting up an EPC page The EPC is divided into EPC pages. An EPC page is 4KB in size and always aligned on a 4KB boundary. Pages in the EPC can either be valid or invalid. Every valid page in the EPC belongs to one enclave instance. Each enclave instance has an EPC page that holds its SECS. The security metadata for each EPC page is held in an internal micro-architectural structure called Enclave Page Cache Map (EPCM, see Section 36.5.1). The EPC is managed by privileged software. Intel SGX provides a set of instructions for adding and removing content to and from the EPC. The EPC may be configured by BIOS at boot time. On implementations in which EPC memory is part of system DRAM, the contents of the EPC are protected by an encryption engine. 36.5.1 Enclave Page Cache Map (EPCM) The EPCM is a secure structure used by the processor to track the contents of the EPC. The EPCM holds one entry for each page in the EPC. The format of the EPCM is micro-architectural, and consequently is implementation dependent. However, the EPCM contains the following architectural information: • • • • • The status of EPC page with respect to validity and accessibility. An SECS identifier (see Section 37.19) of the enclave to which the page belongs. The type of page: regular, SECS, TCS or VA. The linear address through which the enclave is allowed to access the page. The specified read/write/execute permissions on that page. The EPCM structure is used by the CPU in the address-translation flow to enforce access-control on the EPC pages. The EPCM structure is described in Table 37-27, and the conceptual access-control flow is described in Section 37.5. The EPCM entries are managed by the processor as part of various instruction flows. 36.6 ENCLAVE INSTRUCTIONS AND INTEL® SGX The enclave instructions available with Intel SGX are organized as leaf functions under three instruction mnemonics: ENCLS (ring 0), ENCLU (ring 3), and ENCLV (VT root mode). Each leaf function uses EAX to specify the leaf function index, and may require additional implicit input registers as parameters. The use of EAX is implied implicitly by the ENCLS, ENCLU, and ENCLV instructions; ModR/M byte encoding is not used with ENCLS, ENCLU, and ENCLV. The use of additional registers does not use ModR/M encoding and is implied implicitly by the respective leaf function index. Each leaf function index is also associated with a unique, leaf-specific mnemonic. A long-form expression of Intel SGX instruction takes the form of ENCLx[LEAF_MNEMONIC], where ‘x’ is either ‘S’, ‘U’, or ‘V’. The long-form expression provides clear association of the privilege-level requirement of a given “leaf mnemonic”. For simplicity, the unique “Leaf_Mnemonic” name is used (omitting the ENCLx for convenience) throughout in this document. Details of individual SGX leaf functions are described in Chapter 40. Table 36-1 provides a summary of the instruction leaves that are available in the initial implementation of Intel SGX, which is introduced in the 6th generation Intel Core processors. Table 36-2 summarizes enhancement of Intel SGX for future Intel processors. Table 36-1. Supervisor and User Mode Enclave Instruction Leaf Functions in Long-Form of SGX1 Supervisor Instruction Description User Instruction Description ENCLS[EADD] Add an EPC page to an enclave. ENCLU[EENTER] Enter an enclave. ENCLS[EBLOCK] Block an EPC page. ENCLU[EEXIT] Exit an enclave. ENCLS[ECREATE] Create an enclave. ENCLU[EGETKEY] Create a cryptographic key. ENCLS[EDBGRD] Read data from a debug enclave by debug- ENCLU[EREPORT] ger. Create a cryptographic report. Vol. 3D 36-3 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS Table 36-1. Supervisor and User Mode Enclave Instruction Leaf Functions in Long-Form of SGX1 Supervisor Instruction Description User Instruction ENCLS[EDBGWR] Write data into a debug enclave by debug- ENCLU[ERESUME] ger. ENCLS[EEXTEND] Extend EPC page measurement. ENCLS[EINIT] Initialize an enclave. ENCLS[ELDB] Load an EPC page in blocked state. ENCLS[ELDU] Load an EPC page in unblocked state. ENCLS[EPA] Add an EPC page to create a version array. ENCLS[EREMOVE] Remove an EPC page from an enclave. ENCLS[ETRACK] Activate EBLOCK checks. ENCLS[EWB] Write back/invalidate an EPC page. Description Re-enter an enclave. Table 36-2. Supervisor and User Mode Enclave Instruction Leaf Functions in Long-Form of SGX2 Supervisor Instruction Description ENCLS[EAUG] Allocate EPC page to an existing enclave. ENCLS[EMODPR] ENCLS[EMODT] User Instruction Description ENCLU[EACCEPT] Accept EPC page into the enclave. Restrict page permissions. ENCLU[EMODPE] Enhance page permissions. Modify EPC page type. ENCLU[EACCEPTCOPY] Copy contents to an augmented EPC page and accept the EPC page into the enclave. Table 36-3. VMX Operation and Supervisor Mode Enclave Instruction Leaf Functions in Long-Form of OVERSUB Supervisor Instruction Description User Instruction Description ENCLV[EDECVIRTCHILD] Decrement the virtual child page count. ENCLS[ERDINFO] Read information about EPC page. ENCLV[EINCVIRTCHILD] Increment the virtual child page count. ENCLS[TRACKC] Activate EBLOCK checks with conflict reporting. ENCLV[ESETCONTEXT] Set virtualization context. ENCLS[ELDBC/UC] Load an EPC page with conflict reporting. 36.7 DISCOVERING SUPPORT FOR INTEL® SGX AND ENABLING ENCLAVE INSTRUCTIONS Detection of support of Intel SGX and enumeration of available and enabled Intel SGX resources are queried using the CPUID instruction. The enumeration interface comprises the following: • • Processor support of Intel SGX is enumerated by a feature flag in CPUID leaf 07H: CPUID.(EAX=07H, ECX=0H):EBX.SGX[bit 2]. If CPUID.(EAX=07H, ECX=0H):EBX.SGX = 1, the processor has support for Intel SGX, and requires opt-in enabling by BIOS via IA32_FEATURE_CONTROL MSR. If CPUID.(EAX=07H, ECX=0H):EBX.SGX = 1, CPUID will report via the available sub-leaves of CPUID.(EAX=12H) on available and/or configured Intel SGX resources. The available and configured Intel SGX resources enumerated by the sub-leaves of CPUID.(EAX=12H) depend on the state of BIOS configuration. 36-4 Vol. 3D INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS 36.7.1 Intel® SGX Opt-In Configuration On processors that support Intel SGX, IA32_FEATURE_CONTROL provides the SGX_ENABLE field (bit 18). Before system software can configure and enable Intel SGX resources, BIOS is required to set IA32_FEATURE_CONTROL.SGX_ENABLE = 1 to opt-in the use of Intel SGX by system software. The semantics of setting SGX_ENABLE follows the rules of IA32_FEATURE_CONTROL.LOCK (bit 0). Software is considered to have opted into Intel SGX if and only if IA32_FEATURE_CONTROL.SGX_ENABLE and IA32_FEATURE_CONTROL.LOCK are set to 1. The setting of IA32_FEATURE_CONTROL.SGX_ENABLE (bit 18) is not reflected by CPUID. Table 36-4. Intel® SGX Opt-in and Enabling Behavior CPUID.(07H,0H):EBX. SGX CPUID.(12H) FEATURE_CONTROL. FEATURE_CONTROL. LOCK SGX_ENABLE Enclave Instruction 0 Invalid X X #UD 1 Valid* X X #UD** 1 Valid* 0 X #GP 1 Valid* 1 0 #GP 1 Valid* 1 1 Available (see Table 36-5 for details of SGX1 and SGX2). * Leaf 12H enumeration results are dependent on enablement. ** See list of conditions in the #UD section of the reference pages of ENCLS and ENCLU 36.7.2 Intel® SGX Resource Enumeration Leaves If CPUID.(EAX=07H, ECX=0H):EBX.SGX = 1, the processor also supports querying CPUID with EAX=12H on Intel SGX resource capability and configuration. The number of available sub-leaves in leaf 12H depends on the Opt-in and system software configuration. Information returned by CPUID.12H is thread specific; software should not assume that if Intel SGX instructions are supported on one hardware thread, they are also supported elsewhere. A properly configured processor exposes Intel SGX functionality with CPUID.EAX=12H reporting valid information (non-zero content) in three or more sub-leaves, see Table 36-5. • • CPUID.(EAX=12H, ECX=0H) enumerates Intel SGX capability, including enclave instruction opcode support. • CPUID.(EAX=12H, ECX >1) enumerates available EPC resources. CPUID.(EAX=12H, ECX=1H) enumerates Intel SGX capability of processor state configuration and enclave configuration in the SECS structure (see Table 37-3). Table 36-5. CPUID Leaf 12H, Sub-Leaf 0 Enumeration of Intel® SGX Capabilities CPUID.(EAX=12H,ECX=0) Register EAX EBX ECX Description Behavior Bits 0 SGX1: If 1, indicates leaf functions of SGX1 instruction listed in Table 36-1 are supported. 1 SGX2: If 1, indicates leaf functions of SGX2 instruction listed in Table 36-2 are supported. 4:2 Reserved (0) 5 OVERSUB: If 1, indicates Intel SGX supports instructions: EINCVIRTCHILD, EDECVIRTCHILD, and ESETCONTEXT. 6 OVERSUB: If 1, indicates Intel SGX supports instructions: ETRACKC, ERDINFO, ELDBC, and ELDUC. 31:7 Reserved (0) 31:0 MISCSELECT: Reports the bit vector of supported extended features that can be written to the MISC region of the SSA. 31:0 Reserved (0). Vol. 3D 36-5 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS Table 36-5. CPUID Leaf 12H, Sub-Leaf 0 Enumeration of Intel® SGX Capabilities CPUID.(EAX=12H,ECX=0) Register Description Behavior Bits EDX 7:0 MaxEnclaveSize_Not64: the maximum supported enclave size is 2^(EDX[7:0]) bytes when not in 64-bit mode. 15:8 MaxEnclaveSize_64: the maximum supported enclave size is 2^(EDX[15:8]) bytes when operating in 64bit mode. 31:16 Reserved (0). Table 36-6. CPUID Leaf 12H, Sub-Leaf 1 Enumeration of Intel® SGX Capabilities CPUID.(EAX=12H,ECX=1) Register Description Behavior Bits EAX 31:0 Report the valid bits of SECS.ATTRIBUTES[31:0] that software can set with ECREATE. SECS.ATTRIBUTES[n] can be set to 1 using ECREATE only if EAX[n] is 1, where n < 32. EBX 31:0 Report the valid bits of SECS.ATTRIBUTES[63:32] that software can set with ECREATE. SECS.ATTRIBUTES[n+32] can be set to 1 using ECREATE only if EBX[n] is 1, where n < 32. ECX 31:0 Report the valid bits of SECS.ATTRIBUTES[95:64] that software can set with ECREATE. SECS.ATTRIBUTES[n+64] can be set to 1 using ECREATE only if ECX[n] is 1, where n < 32. EDX 31:0 Report the valid bits of SECS.ATTRIBUTES[127:96] that software can set with ECREATE. SECS.ATTRIBUTES[n+96] can be set to 1 using ECREATE only if EDX[n] is 1, where n < 32. On processors that support Intel SGX1 and SGX2, CPUID leaf 12H sub-leaf 2 report physical memory resources available for use with Intel SGX. These physical memory sections are typically allocated by BIOS as Processor Reserved Memory, and available to the OS to manage as EPC. To enumerate how many EPC sections are available to the EPC manager, software can enumerate CPUID leaf 12H with sub-leaf index starting from 2, and decode the sub-leaf-type encoding (returned in EAX[3:0]) until the subleaf type is invalid. All invalid sub-leaves of CPUID leaf 12H return EAX/EBX/ECX/EDX with 0. Table 36-7. CPUID Leaf 12H, Sub-Leaf Index 2 or Higher Enumeration of Intel® SGX Resources CPUID.(EAX=12H,ECX > 1) Register EAX Description Behavior Bits 3:0 0000b: This sub-leaf is invalid; EDX:ECX:EBX:EAX return 0. 0001b: This sub-leaf enumerates an EPC section. EBX:EAX and EDX:ECX provide information on the Enclave Page Cache (EPC) section. All other encoding are reserved. EBX 11:4 Reserved (enumerate 0). 31:12 If EAX[3:0] = 0001b, these are bits 31:12 of the physical address of the base of the EPC section. 19:0 If EAX[3:0] = 0001b, these are bits 51:32 of the physical address of the base of the EPC section. 31:20 Reserved. 3: 0 If EAX[3:0] 0000b, then all bits of the EDX:ECX pair are enumerated as 0. If EAX[3:0] 0001b, then this section has confidentiality and integrity protection. ECX 36-6 Vol. 3D All other encoding are reserved. 11:4 Reserved (enumerate 0). 31:12 If EAX[3:0] = 0001b, these are bits 31:12 of the size of the corresponding EPC section within the Processor Reserved Memory. INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS Table 36-7. CPUID Leaf 12H, Sub-Leaf Index 2 or Higher Enumeration of Intel® SGX Resources CPUID.(EAX=12H,ECX > 1) Register EDX Description Behavior Bits 19: 0 If EAX[3:0] = 0001b, these are bits 51:32 of the size of the corresponding EPC section within the Processor Reserved Memory. 31:20 Reserved. Vol. 3D 36-7 INTRODUCTION TO INTEL® SOFTWARE GUARD EXTENSIONS 36-8 Vol. 3D 25.Updates to Appendix B, Volume 3D Change bars show changes to Appendix B of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3D: System Programming Guide, Part 4. -----------------------------------------------------------------------------------------Changes to this chapter: Updated Table B-6 “Encodings for 64-Bit Guest-State Fields (0010_10xx_xxxx_xxxAb)” with entries for Guest IA32_RTIT_CTL (full) and Guest IA32_RTIT_CTL (high). Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 27 APPENDIX B FIELD ENCODING IN VMCS Every component of the VMCS is encoded by a 32-bit field that can be used by VMREAD and VMWRITE. Section 24.11.2 describes the structure of the encoding space (the meanings of the bits in each 32-bit encoding). This appendix enumerates all fields in the VMCS and their encodings. Fields are grouped by width (16-bit, 32-bit, etc.) and type (guest-state, host-state, etc.) B.1 16-BIT FIELDS A value of 0 in bits 14:13 of an encoding indicates a 16-bit field. Only guest-state areas and the host-state area contain 16-bit fields. As noted in Section 24.11.2, each 16-bit field allows only full access, meaning that bit 0 of its encoding is 0. Each such encoding is thus an even number. B.1.1 16-Bit Control Fields A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-1 enumerates the 16-bit control fields. Table B-1. Encoding for 16-Bit Control Fields (0000_00xx_xxxx_xxx0B) Field Name Virtual-processor identifier (VPID)1 2 Posted-interrupt notification vector EPTP index3 Index Encoding 000000000B 00000000H 000000001B 00000002H 000000010B 00000004H NOTES: 1. This field exists only on processors that support the 1-setting of the “enable VPID” VM-execution control. 2. This field exists only on processors that support the 1-setting of the “process posted interrupts” VM-execution control. 3. This field exists only on processors that support the 1-setting of the “EPT-violation #VE” VM-execution control. B.1.2 16-Bit Guest-State Fields A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-2 enumerates 16-bit guest-state fields. Table B-2. Encodings for 16-Bit Guest-State Fields (0000_10xx_xxxx_xxx0B) Field Name Index Encoding Guest ES selector 000000000B 00000800H Guest CS selector 000000001B 00000802H Guest SS selector 000000010B 00000804H Guest DS selector 000000011B 00000806H Guest FS selector 000000100B 00000808H Guest GS selector 000000101B 0000080AH Guest LDTR selector 000000110B 0000080CH Guest TR selector 000000111B 0000080EH Vol. 3D B-1 FIELD ENCODING IN VMCS Table B-2. Encodings for 16-Bit Guest-State Fields (0000_10xx_xxxx_xxx0B) (Contd.) Field Name Index Encoding Guest interrupt status 000001000B 00000810H 2 000001001B 00000812H 1 PML index NOTES: 1. This field exists only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. 2. This field exists only on processors that support the 1-setting of the “enable PML” VM-execution control. B.1.3 16-Bit Host-State Fields A value of 3 in bits 11:10 of an encoding indicates a field in the host-state area. These fields are distinguished by their index value in bits 9:1. Table B-3 enumerates the 16-bit host-state fields. Table B-3. Encodings for 16-Bit Host-State Fields (0000_11xx_xxxx_xxx0B) Field Name Index Encoding Host ES selector 000000000B 00000C00H Host CS selector 000000001B 00000C02H Host SS selector 000000010B 00000C04H Host DS selector 000000011B 00000C06H Host FS selector 000000100B 00000C08H Host GS selector 000000101B 00000C0AH Host TR selector 000000110B 00000C0CH B.2 64-BIT FIELDS A value of 1 in bits 14:13 of an encoding indicates a 64-bit field. There are 64-bit fields only for controls and for guest state. As noted in Section 24.11.2, every 64-bit field has two encodings, which differ on bit 0, the access type. Thus, each such field has an even encoding for full access and an odd encoding for high access. B.2.1 64-Bit Control Fields A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-4 enumerates the 64-bit control fields. Table B-4. Encodings for 64-Bit Control Fields (0010_00xx_xxxx_xxxAb) Field Name Index Address of I/O bitmap A (full) 000000000B Address of I/O bitmap A (high) Address of I/O bitmap B (full) 000000001B Address of I/O bitmap B (high) Address of MSR bitmaps (full)1 1 Address of MSR bitmaps (high) VM-exit MSR-store address (full) VM-exit MSR-store address (high) B-2 Vol. 3D 000000010B 000000011B Encoding 00002000H 00002001H 00002002H 00002003H 00002004H 00002005H 00002006H 00002007H FIELD ENCODING IN VMCS Table B-4. Encodings for 64-Bit Control Fields (0010_00xx_xxxx_xxxAb) (Contd.) Field Name Index VM-exit MSR-load address (full) 000000100B VM-exit MSR-load address (high) VM-entry MSR-load address (full) 000000101B VM-entry MSR-load address (high) Executive-VMCS pointer (full) 000000110B Executive-VMCS pointer (high) PML address (full)2 000000111B 2 PML address (high) TSC offset (full) 000001000B TSC offset (high) Virtual-APIC address (full)3 000001001B 3 Virtual-APIC address (high) APIC-access address (full)4 APIC-access address (high)4 000001010B Posted-interrupt descriptor address (full)5 Posted-interrupt descriptor address (high)5 VM-function controls 000001011B (full)6 000001100B 6 VM-function controls (high) EPT pointer (EPTP; full)7 EPT pointer (EPTP; high)7 000001101B EOI-exit bitmap 0 (EOI_EXIT0; full)8 EOI-exit bitmap 0 (EOI_EXIT0; high)8 EOI-exit bitmap 1 (EOI_EXIT1; full)8 000001110B 000001111B 8 EOI-exit bitmap 1 (EOI_EXIT1; high) EOI-exit bitmap 2 (EOI_EXIT2; full)8 EOI-exit bitmap 2 (EOI_EXIT2; high)8 EOI-exit bitmap 3 (EOI_EXIT3; full)8 EOI-exit bitmap 3 (EOI_EXIT3; high)8 EPTP-list address 000010000B 000010001B (full)9 EPTP-list address (high) 000010010B 9 VMREAD-bitmap address (full)10 VMREAD-bitmap address (high)10 VMWRITE-bitmap address (full)10 VMWRITE-bitmap address (high)10 Virtualization-exception information address 000010011B 000010100B (full)11 11 Virtualization-exception information address (high) XSS-exiting bitmap (full)12 XSS-exiting bitmap (high)12 000010101B 000010110B Encoding 00002008H 00002009H 0000200AH 0000200BH 0000200CH 0000200DH 0000200EH 0000200FH 00002010H 00002011H 00002012H 00002013H 00002014H 00002015H 00002016H 00002017H 00002018H 00002019H 0000201AH 0000201BH 0000201CH 0000201DH 0000201EH 0000201FH 00002020H 00002021H 00002022H 00002023H 00002024H 00002025H 00002026H 00002027H 00002028H 00002029H 0000202AH 0000202BH 0000202CH 0000202DH Vol. 3D B-3 FIELD ENCODING IN VMCS Table B-4. Encodings for 64-Bit Control Fields (0010_00xx_xxxx_xxxAb) (Contd.) Field Name Index ENCLS-exiting bitmap (full) 13 ENCLS-exiting bitmap (high) 000010111B 13 14 Sub-page-permission-table pointer (full) Sub-page-permission-table pointer (high)14 000011000B 15 TSC multiplier (full) TSC multiplier 000011001B (high)15 Encoding 0000202EH 0000202FH 00002030H 00002031H 00002032H 00002033H NOTES: 1. This field exists only on processors that support the 1-setting of the “use MSR bitmaps” VM-execution control. 2. This field exists only on processors that support the 1-setting of the “enable PML” VM-execution control. 3. This field exists only on processors that support the 1-setting of the “use TPR shadow” VM-execution control. 4. This field exists only on processors that support the 1-setting of the “virtualize APIC accesses” VM-execution control. 5. This field exists only on processors that support the 1-setting of the “process posted interrupts” VM-execution control. 6. This field exists only on processors that support the 1-setting of the “enable VM functions” VM-execution control. 7. This field exists only on processors that support the 1-setting of the “enable EPT” VM-execution control. 8. This field exists only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. 9. This field exists only on processors that support the 1-setting of the “EPTP switching” VM-function control. 10. This field exists only on processors that support the 1-setting of the “VMCS shadowing” VM-execution control. 11. This field exists only on processors that support the 1-setting of the “EPT-violation #VE” VM-execution control. 12. This field exists only on processors that support the 1-setting of the “enable XSAVES/XRSTORS” VM-execution control. 13. This field exists only on processors that support the 1-setting of the “enable ENCLS exiting” VM-execution control. 14. This field exists only on processors that support the 1-setting of the “sub-page write permissions for EPT” VM-execution control. 15. This field exists only on processors that support the 1-setting of the “use TSC scaling” VM-execution control. B.2.2 64-Bit Read-Only Data Field A value of 1 in bits 11:10 of an encoding indicates a read-only data field. These fields are distinguished by their index value in bits 9:1. There is only one such 64-bit field as given in Table B-5.(As with other 64-bit fields, this one has two encodings.) Table B-5. Encodings for 64-Bit Read-Only Data Field (0010_01xx_xxxx_xxxAb) Field Name Index Guest-physical address (full)1 Guest-physical address (high)1 000000000B NOTES: 1. This field exists only on processors that support the 1-setting of the "enable EPT” VM-execution control. B-4 Vol. 3D Encoding 00002400H 00002401H FIELD ENCODING IN VMCS B.2.3 64-Bit Guest-State Fields A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-6 enumerates the 64-bit guest-state fields. Table B-6. Encodings for 64-Bit Guest-State Fields (0010_10xx_xxxx_xxxAb) Field Name Index VMCS link pointer (full) 000000000B VMCS link pointer (high) Guest IA32_DEBUGCTL (full) 000000001B Guest IA32_DEBUGCTL (high) Guest IA32_PAT (full)1 000000010B Guest IA32_PAT (high)1 Guest IA32_EFER (full)2 Guest IA32_EFER (high)2 000000011B Guest IA32_PERF_GLOBAL_CTRL (full)3 Guest IA32_PERF_GLOBAL_CTRL Guest PDPTE0 (full)4 Guest PDPTE1 Guest PDPTE1 (high)4 000000110B Guest PDPTE2 (full)4 000000111B 4 Guest PDPTE2 (high) Guest PDPTE3 000000100B 000000101B Guest PDPTE0 (high)4 (full)4 (high)3 (full)4 000001000B Guest PDPTE3 (high)4 Guest IA32_BNDCFGS (full)5 Guest IA32_BNDCFGS (high)5 Guest IA32_RTIT_CTL (full)6 6 Guest IA32_RTIT_CTL (high) 000001001B 000001010B Encoding 00002800H 00002801H 00002802H 00002803H 00002804H 00002805H 00002806H 00002807H 00002808H 00002809H 0000280AH 0000280BH 0000280CH 0000280DH 0000280EH 0000280FH 00002810H 00002811H 00002812H 00002813H 00002814H 00002815H NOTES: 1. This field exists only on processors that support either the 1-setting of the "load IA32_PAT" VM-entry control or that of the "save IA32_PAT" VM-exit control. 2. This field exists only on processors that support either the 1-setting of the "load IA32_EFER" VM-entry control or that of the "save IA32_EFER" VM-exit control. 3. This field exists only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-entry control. 4. This field exists only on processors that support the 1-setting of the "enable EPT" VM-execution control. 5. This field exists only on processors that support either the 1-setting of the “load IA32_BNDCFGS” VM-entry control or that of the “clear IA32_BNDCFGS” VM-exit control. 6. This field exists only on processors that support either the 1-setting of the “load IA32_RTIT_CTL” VM-entry control or that of the “clear IA32_RTIT_CTL” VM-exit control. Vol. 3D B-5 FIELD ENCODING IN VMCS B.2.4 64-Bit Host-State Fields A value of 3 in bits 11:10 of an encoding indicates a field in the host-state area. These fields are distinguished by their index value in bits 9:1. Table B-7 enumerates the 64-bit control fields. Table B-7. Encodings for 64-Bit Host-State Fields (0010_11xx_xxxx_xxxAb) Field Name Index 1 Host IA32_PAT (full) 000000000B Host IA32_PAT (high)1 Host IA32_EFER (full)2 Host IA32_EFER (high)2 Host IA32_PERF_GLOBAL_CTRL (full)3 Host IA32_PERF_GLOBAL_CTRL (high)3 000000001B 000000010B Encoding 00002C00H 00002C01H 00002C02H 00002C03H 00002C04H 00002C05H NOTES: 1. This field exists only on processors that support the 1-setting of the "load IA32_PAT" VM-exit control. 2. This field exists only on processors that support the 1-setting of the "load IA32_EFER" VM-exit control. 3. This field exists only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-exit control. B.3 32-BIT FIELDS A value of 2 in bits 14:13 of an encoding indicates a 32-bit field. As noted in Section 24.11.2, each 32-bit field allows only full access, meaning that bit 0 of its encoding is 0. Each such encoding is thus an even number. B.3.1 32-Bit Control Fields A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-8 enumerates the 32-bit control fields. Table B-8. Encodings for 32-Bit Control Fields (0100_00xx_xxxx_xxx0B) Field Name Index Encoding Pin-based VM-execution controls 000000000B 00004000H Primary processor-based VM-execution controls 000000001B 00004002H Exception bitmap 000000010B 00004004H Page-fault error-code mask 000000011B 00004006H Page-fault error-code match 000000100B 00004008H CR3-target count 000000101B 0000400AH VM-exit controls 000000110B 0000400CH VM-exit MSR-store count 000000111B 0000400EH VM-exit MSR-load count 000001000B 00004010H VM-entry controls 000001001B 00004012H VM-entry MSR-load count 000001010B 00004014H VM-entry interruption-information field 000001011B 00004016H VM-entry exception error code 000001100B 00004018H VM-entry instruction length 000001101B 0000401AH B-6 Vol. 3D FIELD ENCODING IN VMCS Table B-8. Encodings for 32-Bit Control Fields (0100_00xx_xxxx_xxx0B) (Contd.) Field Name 1 TPR threshold 2 Secondary processor-based VM-execution controls PLE_Gap 3 PLE_Window3 Index Encoding 000001110B 0000401CH 000001111b 0000401EH 000010000b 00004020H 000010001b 00004022H NOTES: 1. This field exists only on processors that support the 1-setting of the “use TPR shadow” VM-execution control. 2. This field exists only on processors that support the 1-setting of the “activate secondary controls” VM-execution control. 3. This field exists only on processors that support the 1-setting of the “PAUSE-loop exiting” VM-execution control. B.3.2 32-Bit Read-Only Data Fields A value of 1 in bits 11:10 of an encoding indicates a read-only data field. These fields are distinguished by their index value in bits 9:1. Table B-9 enumerates the 32-bit read-only data fields. Table B-9. Encodings for 32-Bit Read-Only Data Fields (0100_01xx_xxxx_xxx0B) Field Name Index Encoding VM-instruction error 000000000B 00004400H Exit reason 000000001B 00004402H VM-exit interruption information 000000010B 00004404H VM-exit interruption error code 000000011B 00004406H IDT-vectoring information field 000000100B 00004408H IDT-vectoring error code 000000101B 0000440AH VM-exit instruction length 000000110B 0000440CH VM-exit instruction information 000000111B 0000440EH B.3.3 32-Bit Guest-State Fields A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-10 enumerates the 32-bit guest-state fields. Table B-10. Encodings for 32-Bit Guest-State Fields (0100_10xx_xxxx_xxx0B) Field Name Index Encoding Guest ES limit 000000000B 00004800H Guest CS limit 000000001B 00004802H Guest SS limit 000000010B 00004804H Guest DS limit 000000011B 00004806H Guest FS limit 000000100B 00004808H Guest GS limit 000000101B 0000480AH Guest LDTR limit 000000110B 0000480CH Guest TR limit 000000111B 0000480EH Guest GDTR limit 000001000B 00004810H Vol. 3D B-7 FIELD ENCODING IN VMCS Table B-10. Encodings for 32-Bit Guest-State Fields (0100_10xx_xxxx_xxx0B) (Contd.) Field Name Index Encoding Guest IDTR limit 000001001B 00004812H Guest ES access rights 000001010B 00004814H Guest CS access rights 000001011B 00004816H Guest SS access rights 000001100B 00004818H Guest DS access rights 000001101B 0000481AH Guest FS access rights 000001110B 0000481CH Guest GS access rights 000001111B 0000481EH Guest LDTR access rights 000010000B 00004820H Guest TR access rights 000010001B 00004822H Guest interruptibility state 000010010B 00004824H Guest activity state 000010011B 00004826H Guest SMBASE 000010100B 00004828H 000010101B 0000482AH 000010111B 0000482EH Guest IA32_SYSENTER_CS VMX-preemption timer value1 NOTES: 1. This field exists only on processors that support the 1-setting of the "activate VMX-preemption timer" VM-execution control. The limit fields for GDTR and IDTR are defined to be 32 bits in width even though these fields are only 16-bits wide in the Intel 64 and IA-32 architectures. VM entry ensures that the high 16 bits of both these fields are cleared to 0. B.3.4 32-Bit Host-State Field A value of 3 in bits 11:10 of an encoding indicates a field in the host-state area. There is only one such 32-bit field as given in Table B-11. Table B-11. Encoding for 32-Bit Host-State Field (0100_11xx_xxxx_xxx0B) Field Name Index Encoding Host IA32_SYSENTER_CS 000000000B 00004C00H B.4 NATURAL-WIDTH FIELDS A value of 3 in bits 14:13 of an encoding indicates a natural-width field. As noted in Section 24.11.2, each of these fields allows only full access, meaning that bit 0 of its encoding is 0. Each such encoding is thus an even number. B.4.1 Natural-Width Control Fields A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-12 enumerates the natural-width control fields. Table B-12. Encodings for Natural-Width Control Fields (0110_00xx_xxxx_xxx0B) Field Name Index Encoding CR0 guest/host mask 000000000B 00006000H B-8 Vol. 3D FIELD ENCODING IN VMCS Table B-12. Encodings for Natural-Width Control Fields (0110_00xx_xxxx_xxx0B) (Contd.) Field Name Index Encoding CR4 guest/host mask 000000001B 00006002H CR0 read shadow 000000010B 00006004H CR4 read shadow 000000011B 00006006H CR3-target value 0 000000100B 00006008H CR3-target value 1 000000101B 0000600AH CR3-target value 2 000000110B 0000600CH 000000111B 0000600EH CR3-target value 31 NOTES: 1. If a future implementation supports more than 4 CR3-target values, they will be encoded consecutively following the 4 encodings given here. B.4.2 Natural-Width Read-Only Data Fields A value of 1 in bits 11:10 of an encoding indicates a read-only data field. These fields are distinguished by their index value in bits 9:1. Table B-13 enumerates the natural-width read-only data fields. Table B-13. Encodings for Natural-Width Read-Only Data Fields (0110_01xx_xxxx_xxx0B) Field Name Index Encoding Exit qualification 000000000B 00006400H I/O RCX 000000001B 00006402H I/O RSI 000000010B 00006404H I/O RDI 000000011B 00006406H I/O RIP 000000100B 00006408H Guest-linear address 000000101B 0000640AH B.4.3 Natural-Width Guest-State Fields A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-14 enumerates the natural-width guest-state fields. Table B-14. Encodings for Natural-Width Guest-State Fields (0110_10xx_xxxx_xxx0B) Field Name Index Encoding Guest CR0 000000000B 00006800H Guest CR3 000000001B 00006802H Guest CR4 000000010B 00006804H Guest ES base 000000011B 00006806H Guest CS base 000000100B 00006808H Guest SS base 000000101B 0000680AH Guest DS base 000000110B 0000680CH Guest FS base 000000111B 0000680EH Guest GS base 000001000B 00006810H Vol. 3D B-9 FIELD ENCODING IN VMCS Table B-14. Encodings for Natural-Width Guest-State Fields (0110_10xx_xxxx_xxx0B) (Contd.) Field Name Index Encoding Guest LDTR base 000001001B 00006812H Guest TR base 000001010B 00006814H Guest GDTR base 000001011B 00006816H Guest IDTR base 000001100B 00006818H Guest DR7 000001101B 0000681AH Guest RSP 000001110B 0000681CH Guest RIP 000001111B 0000681EH Guest RFLAGS 000010000B 00006820H Guest pending debug exceptions 000010001B 00006822H Guest IA32_SYSENTER_ESP 000010010B 00006824H Guest IA32_SYSENTER_EIP 000010011B 00006826H The base-address fields for ES, CS, SS, and DS in the guest-state area are defined to be natural-width (with 64 bits on processors supporting Intel 64 architecture) even though these fields are only 32-bits wide in the Intel 64 architecture. VM entry ensures that the high 32 bits of these fields are cleared to 0. B.4.4 Natural-Width Host-State Fields A value of 3 in bits 11:10 of an encoding indicates a field in the host-state area. These fields are distinguished by their index value in bits 9:1. Table B-15 enumerates the natural-width host-state fields. Table B-15. Encodings for Natural-Width Host-State Fields (0110_11xx_xxxx_xxx0B) Field Name Index Encoding Host CR0 000000000B 00006C00H Host CR3 000000001B 00006C02H Host CR4 000000010B 00006C04H Host FS base 000000011B 00006C06H Host GS base 000000100B 00006C08H Host TR base 000000101B 00006C0AH Host GDTR base 000000110B 00006C0CH Host IDTR base 000000111B 00006C0EH Host IA32_SYSENTER_ESP 000001000B 00006C10H Host IA32_SYSENTER_EIP 000001001B 00006C12H Host RSP 000001010B 00006C14H Host RIP 000001011B 00006C16H B-10 Vol. 3D 26.Updates to Appendix C, Volume 3D Change bars show changes to Appendix C of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3D: System Programming Guide, Part 4. -----------------------------------------------------------------------------------------Changes to this chapter: Updated Table C-1 to include basic exit reasons for the UMWAIT and TPAUSE instructions. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 27 APPENDIX C VMX BASIC EXIT REASONS Every VM exit writes a 32-bit exit reason to the VMCS (see Section 24.9.1). Certain VM-entry failures also do this (see Section 26.7). The low 16 bits of the exit-reason field form the basic exit reason which provides basic information about the cause of the VM exit or VM-entry failure. Table C-1 lists values for basic exit reasons and explains their meaning. Entries apply to VM exits, unless otherwise noted. Table C-1. Basic Exit Reasons Basic Exit Reason Description 0 Exception or non-maskable interrupt (NMI). Either: 1: Guest software caused an exception and the bit in the exception bitmap associated with exception’s vector was 1. This case includes executions of BOUND that cause #BR, executions of INT1 (they cause #DB), executions of INT3 (they cause #BP), executions of INTO that cause #OF, and executions of UD0, UD1, and UD2 (they cause #UD). 2: An NMI was delivered to the logical processor and the “NMI exiting” VM-execution control was 1. 1 External interrupt. An external interrupt arrived and the “external-interrupt exiting” VM-execution control was 1. 2 Triple fault. The logical processor encountered an exception while attempting to call the double-fault handler and that exception did not itself cause a VM exit due to the exception bitmap. 3 INIT signal. An INIT signal arrived 4 Start-up IPI (SIPI). A SIPI arrived while the logical processor was in the “wait-for-SIPI” state. 5 I/O system-management interrupt (SMI). An SMI arrived immediately after retirement of an I/O instruction and caused an SMM VM exit (see Section 34.15.2). 6 Other SMI. An SMI arrived and caused an SMM VM exit (see Section 34.15.2) but not immediately after retirement of an I/O instruction. 7 Interrupt window. At the beginning of an instruction, RFLAGS.IF was 1; events were not blocked by STI or by MOV SS; and the “interrupt-window exiting” VM-execution control was 1. 8 NMI window. At the beginning of an instruction, there was no virtual-NMI blocking; events were not blocked by MOV SS; and the “NMI-window exiting” VM-execution control was 1. 9 Task switch. Guest software attempted a task switch. 10 CPUID. Guest software attempted to execute CPUID. 11 GETSEC. Guest software attempted to execute GETSEC. 12 HLT. Guest software attempted to execute HLT and the “HLT exiting” VM-execution control was 1. 13 INVD. Guest software attempted to execute INVD. 14 INVLPG. Guest software attempted to execute INVLPG and the “INVLPG exiting” VM-execution control was 1. 15 RDPMC. Guest software attempted to execute RDPMC and the “RDPMC exiting” VM-execution control was 1. 16 RDTSC. Guest software attempted to execute RDTSC and the “RDTSC exiting” VM-execution control was 1. 17 RSM. Guest software attempted to execute RSM in SMM. 18 VMCALL. VMCALL was executed either by guest software (causing an ordinary VM exit) or by the executive monitor (causing an SMM VM exit; see Section 34.15.2). 19 VMCLEAR. Guest software attempted to execute VMCLEAR. 20 VMLAUNCH. Guest software attempted to execute VMLAUNCH. 21 VMPTRLD. Guest software attempted to execute VMPTRLD. 22 VMPTRST. Guest software attempted to execute VMPTRST. Vol. 3D C-1 VMX BASIC EXIT REASONS Table C-1. Basic Exit Reasons (Contd.) Basic Exit Reason Description 23 VMREAD. Guest software attempted to execute VMREAD. 24 VMRESUME. Guest software attempted to execute VMRESUME. 25 VMWRITE. Guest software attempted to execute VMWRITE. 26 VMXOFF. Guest software attempted to execute VMXOFF. 27 VMXON. Guest software attempted to execute VMXON. 28 Control-register accesses. Guest software attempted to access CR0, CR3, CR4, or CR8 using CLTS, LMSW, or MOV CR and the VM-execution control fields indicate that a VM exit should occur (see Section 25.1 for details). This basic exit reason is not used for trap-like VM exits following executions of the MOV to CR8 instruction when the “use TPR shadow” VM-execution control is 1. Such VM exits instead use basic exit reason 43. 29 MOV DR. Guest software attempted a MOV to or from a debug register and the “MOV-DR exiting” VM-execution control was 1. 30 I/O instruction. Guest software attempted to execute an I/O instruction and either: 1: The “use I/O bitmaps” VM-execution control was 0 and the “unconditional I/O exiting” VM-execution control was 1. 2: The “use I/O bitmaps” VM-execution control was 1 and a bit in the I/O bitmap associated with one of the ports accessed by the I/O instruction was 1. 31 RDMSR. Guest software attempted to execute RDMSR and either: 1: The “use MSR bitmaps” VM-execution control was 0. 2: The value of RCX is neither in the range 00000000H – 00001FFFH nor in the range C0000000H – C0001FFFH. 3: The value of RCX was in the range 00000000H – 00001FFFH and the nth bit in read bitmap for low MSRs is 1, where n was the value of RCX. 4: The value of RCX is in the range C0000000H – C0001FFFH and the nth bit in read bitmap for high MSRs is 1, where n is the value of RCX & 00001FFFH. 32 WRMSR. Guest software attempted to execute WRMSR and either: 1: The “use MSR bitmaps” VM-execution control was 0. 2: The value of RCX is neither in the range 00000000H – 00001FFFH nor in the range C0000000H – C0001FFFH. 3: The value of RCX was in the range 00000000H – 00001FFFH and the nth bit in write bitmap for low MSRs is 1, where n was the value of RCX. 4: The value of RCX is in the range C0000000H – C0001FFFH and the nth bit in write bitmap for high MSRs is 1, where n is the value of RCX & 00001FFFH. 33 VM-entry failure due to invalid guest state. A VM entry failed one of the checks identified in Section 26.3.1. 34 VM-entry failure due to MSR loading. A VM entry failed in an attempt to load MSRs. See Section 26.4. 36 MWAIT. Guest software attempted to execute MWAIT and the “MWAIT exiting” VM-execution control was 1. 37 Monitor trap flag. A VM exit occurred due to the 1-setting of the “monitor trap flag” VM-execution control (see Section 25.5.2) or VM entry injected a pending MTF VM exit as part of VM entry (see Section 26.5.2). 39 MONITOR. Guest software attempted to execute MONITOR and the “MONITOR exiting” VM-execution control was 1. 40 PAUSE. Either guest software attempted to execute PAUSE and the “PAUSE exiting” VM-execution control was 1 or the “PAUSE-loop exiting” VM-execution control was 1 and guest software executed a PAUSE loop with execution time exceeding PLE_Window (see Section 25.1.3). 41 VM-entry failure due to machine-check event. A machine-check event occurred during VM entry (see Section 26.8). 43 TPR below threshold. The logical processor determined that the value of bits 7:4 of the byte at offset 080H on the virtual-APIC page was below that of the TPR threshold VM-execution control field while the “use TPR shadow” VMexecution control was 1 either as part of TPR virtualization (Section 29.1.2) or VM entry (Section 26.6.7). 44 APIC access. Guest software attempted to access memory at a physical address on the APIC-access page and the “virtualize APIC accesses” VM-execution control was 1 (see Section 29.4). 45 Virtualized EOI. EOI virtualization was performed for a virtual interrupt whose vector indexed a bit set in the EOIexit bitmap. C-2 Vol. 3D VMX BASIC EXIT REASONS Table C-1. Basic Exit Reasons (Contd.) Basic Exit Reason Description 46 Access to GDTR or IDTR. Guest software attempted to execute LGDT, LIDT, SGDT, or SIDT and the “descriptor-table exiting” VM-execution control was 1. 47 Access to LDTR or TR. Guest software attempted to execute LLDT, LTR, SLDT, or STR and the “descriptor-table exiting” VM-execution control was 1. 48 EPT violation. An attempt to access memory with a guest-physical address was disallowed by the configuration of the EPT paging structures. 49 EPT misconfiguration. An attempt to access memory with a guest-physical address encountered a misconfigured EPT paging-structure entry. 50 INVEPT. Guest software attempted to execute INVEPT. 51 RDTSCP. Guest software attempted to execute RDTSCP and the “enable RDTSCP” and “RDTSC exiting” VM-execution controls were both 1. 52 VMX-preemption timer expired. The preemption timer counted down to zero. 53 INVVPID. Guest software attempted to execute INVVPID. 54 WBINVD. Guest software attempted to execute WBINVD and the “WBINVD exiting” VM-execution control was 1. 55 XSETBV. Guest software attempted to execute XSETBV. 56 APIC write. Guest software completed a write to the virtual-APIC page that must be virtualized by VMM software (see Section 29.4.3.3). 57 RDRAND. Guest software attempted to execute RDRAND and the “RDRAND exiting” VM-execution control was 1. 58 INVPCID. Guest software attempted to execute INVPCID and the “enable INVPCID” and “INVLPG exiting” VM-execution controls were both 1. 59 VMFUNC. Guest software invoked a VM function with the VMFUNC instruction and the VM function either was not enabled or generated a function-specific condition causing a VM exit. 60 ENCLS. Guest software attempted to execute ENCLS and “enable ENCLS exiting” VM-execution control was 1 and either (1) EAX < 63 and the corresponding bit in the ENCLS-exiting bitmap is 1; or (2) EAX ≥ 63 and bit 63 in the ENCLS-exiting bitmap is 1. 61 RDSEED. Guest software attempted to execute RDSEED and the “RDSEED exiting” VM-execution control was 1. 62 Page-modification log full. The processor attempted to create a page-modification log entry and the value of the PML index was not in the range 0–511. 63 XSAVES. Guest software attempted to execute XSAVES, the “enable XSAVES/XRSTORS” was 1, and a bit was set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSS-exiting bitmap. 64 XRSTORS. Guest software attempted to execute XRSTORS, the “enable XSAVES/XRSTORS” was 1, and a bit was set in the logical-AND of the following three values: EDX:EAX, the IA32_XSS MSR, and the XSS-exiting bitmap. 66 SPP-related event. The processor attempted to determine an access’s sub-page write permission and encountered an SPP miss or an SPP misconfiguration. See Section 28.2.4.2. 67 UMWAIT. Guest software attempted to execute UMWAIT and the “enable user wait and pause” and “RDTSC exiting” VM-execution controls were both 1. 68 TPAUSE. Guest software attempted to execute TPAUSE and the “enable user wait and pause” and “RDTSC exiting” VM-execution controls were both 1. Vol. 3D C-3 VMX BASIC EXIT REASONS C-4 Vol. 3D 27.Updates to Chapter 1, Volume 4 Change bars show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers. -----------------------------------------------------------------------------------------Changes to this chapter: Updates to Section 1.1 “Intel® 64 and IA-32 Processors Covered in this Manual”. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 9 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers (order number 335592) is part of a set that describes the architecture and programming environment of Intel® 64 and IA32 architecture processors. Other volumes in this set are: • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture (order number 253665). • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D: Instruction Set Reference (order numbers 253666, 253667, 326018 and 334569). • The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D: System Programming Guide (order numbers 253668, 253669, 326019 and 332831). The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A, 2B, 2C & 2D, describe the instruction set of the processor and the opcode structure. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 3A, 3B, 3C & 3D, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operatingsystem and BIOS designers. In addition, Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, and Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C address the programming environment for classes of software that host operating systems. The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4, describes the model-specific registers of Intel 64 and IA-32 processors. 1.1 INTEL® 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include: • • • • • • • • • • • • • • • • • • • Pentium® processors P6 family processors Pentium® 4 processors Pentium® M processors Intel® Xeon® processors Pentium® D processors Pentium® processor Extreme Editions 64-bit Intel® Xeon® processors Intel® Core™ Duo processor Intel® Core™ Solo processor Dual-Core Intel® Xeon® processor LV Intel® Core™2 Duo processor Intel® Core™2 Quad processor Q6000 series Intel® Xeon® processor 3000, 3200 series Intel® Xeon® processor 5000 series Intel® Xeon® processor 5100, 5300 series Intel® Core™2 Extreme processor X7000 and X6800 series Intel® Core™2 Extreme QX6000 series Intel® Xeon® processor 7100 series Vol. 4 1-1 ABOUT THIS MANUAL • • • • • • • • Intel® Pentium® Dual-Core processor • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • Intel® Core™ i7 processor Intel® Xeon® processor 7200, 7300 series Intel® Core™2 Extreme QX9000 series Intel® Xeon® processor 5200, 5400, 7400 series Intel® Core™2 Extreme processor QX9000 and X9000 series Intel® Core™2 Quad processor Q9000 series Intel® Core™2 Duo processor E8000, T9000 series Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are built from 45 nm and 32 nm processes. Intel® Core™ i5 processor Intel® Xeon® processor E7-8800/4800/2800 product families Intel® Core™ i7-3930K processor 2nd generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx processor series Intel® Xeon® processor E3-1200 product family Intel® Xeon® processor E5-2400/1400 product family Intel® Xeon® processor E5-4600/2600/1600 product family 3rd generation Intel® Core™ processors Intel® Xeon® processor E3-1200 v2 product family Intel® Xeon® processor E5-2400/1400 v2 product families Intel® Xeon® processor E5-4600/2600/1600 v2 product families Intel® Xeon® processor E7-8800/4800/2800 v2 product families 4th generation Intel® Core™ processors The Intel® Core™ M processor family Intel® Core™ i7-59xx Processor Extreme Edition Intel® Core™ i7-49xx Processor Extreme Edition Intel® Xeon® processor E3-1200 v3 product family Intel® Xeon® processor E5-2600/1600 v3 product families 5th generation Intel® Core™ processors Intel® Xeon® processor D-1500 product family Intel® Xeon® processor E5 v4 family Intel® Atom™ processor X7-Z8000 and X5-Z8000 series Intel® Atom™ processor Z3400 series Intel® Atom™ processor Z3500 series 6th generation Intel® Core™ processors Intel® Xeon® processor E3-1500m v5 product family 7th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series Intel® Xeon® Processor Scalable Family 8th generation Intel® Core™ processors Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series Intel® Xeon® E processors 9th generation Intel® Core™ processors 1-2 Vol. 4 ABOUT THIS MANUAL P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium® Pro, Pentium® II, Pentium® III, and Pentium® III Xeon® processors. The Pentium® 4, Pentium® D, and Pentium® processor Extreme Editions are based on the Intel NetBurst® microarchitecture. Most early Intel® Xeon® processors are based on the Intel NetBurst® microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst® microarchitecture. The Intel® Core™ Duo, Intel® Core™ Solo and dual-core Intel® Xeon® processor LV are based on an improved Pentium® M processor microarchitecture. The Intel® Xeon® processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel® Pentium® dual-core, Intel® Core™2 Duo, Intel® Core™2 Quad, and Intel® Core™2 Extreme processors are based on Intel® Core™ microarchitecture. The Intel® Xeon® processor 5200, 5400, 7400 series, Intel® Core™2 Quad processor Q9000 series, and Intel® Core™2 Extreme processors QX9000, X9000 series, Intel® Core™2 processor E8000 series are based on Enhanced Intel® Core™ microarchitecture. The Intel® Atom™ processors 200, 300, D400, D500, D2000, N200, N400, N2000, E2000, Z500, Z600, Z2000, C1000 series are based on the Intel® Atom™ microarchitecture and supports Intel 64 architecture. P6 family, Pentium® M, Intel® Core™ Solo, Intel® Core™ Duo processors, dual-core Intel® Xeon® processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel® AtomTM processor Z5xx series support IA-32 architecture. The Intel® Xeon® processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel® Core™2 Duo, Intel® Core™2 Extreme, Intel® Core™2 Quad processors, Pentium® D processors, Pentium® DualCore processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel® 64 architecture. The Intel® Core™ i7 processor and Intel® Xeon® processor 3400, 5500, 7500 series are based on 45 nm Nehalem microarchitecture. Westmere microarchitecture is a 32 nm version of the Nehalem microarchitecture. Intel® Xeon® processor 5600 series, Intel Xeon processor E7 and various Intel Core i7, i5, i3 processors are based on the Westmere microarchitecture. These processors support Intel 64 architecture. The Intel® Xeon® processor E5 family, Intel® Xeon® processor E3-1200 family, Intel® Xeon® processor E78800/4800/2800 product families, Intel® Core™ i7-3930K processor, and 2nd generation Intel® Core™ i7-2xxx, Intel® CoreTM i5-2xxx, Intel® Core™ i3-2xxx processor series are based on the Sandy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E7-8800/4800/2800 v2 product families, Intel® Xeon® processor E3-1200 v2 product family and 3rd generation Intel® Core™ processors are based on the Ivy Bridge microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-4600/2600/1600 v2 product families, Intel® Xeon® processor E5-2400/1400 v2 product families and Intel® Core™ i7-49xx Processor Extreme Edition are based on the Ivy Bridge-E microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E3-1200 v3 product family and 4th Generation Intel® Core™ processors are based on the Haswell microarchitecture and support Intel 64 architecture. The Intel® Xeon® processor E5-2600/1600 v3 product families and the Intel® Core™ i7-59xx Processor Extreme Edition are based on the Haswell-E microarchitecture and support Intel 64 architecture. The Intel® Atom™ processor Z8000 series is based on the Airmont microarchitecture. The Intel® Atom™ processor Z3400 series and the Intel® Atom™ processor Z3500 series are based on the Silvermont microarchitecture. The Intel® Core™ M processor family, 5th generation Intel® Core™ processors, Intel® Xeon® processor D-1500 product family and the Intel® Xeon® processor E5 v4 family are based on the Broadwell microarchitecture and support Intel 64 architecture. The Intel® Xeon® Processor Scalable Family, Intel® Xeon® processor E3-1500m v5 product family and 6th generation Intel® Core™ processors are based on the Skylake microarchitecture and support Intel 64 architecture. The 7th generation Intel® Core™ processors are based on the Kaby Lake microarchitecture and support Intel 64 architecture. Vol. 4 1-3 ABOUT THIS MANUAL The Intel® Atom™ processor C series, the Intel® Atom™ processor X series, the Intel® Pentium® processor J series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont microarchitecture. The Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series is based on the Knights Landing microarchitecture and supports Intel 64 architecture. The Intel® Pentium® Silver processor series, the Intel® Celeron® processor J series, and the Intel® Celeron® processor N series are based on the Goldmont Plus microarchitecture. The 8th generation Intel® Core™ processors, 9th generation Intel® Core™ processors, and Intel® Xeon® E processors are based on the Coffee Lake microarchitecture and support Intel 64 architecture. The Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series is based on the Knights Mill microarchitecture and supports Intel 64 architecture. IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel® 64 architecture is the instruction set architecture and programming environment which is the superset of Intel’s 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture. 1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE A description of this manual’s content follows: Chapter 1 — About This Manual. Gives an overview of all eight volumes of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers. Chapter 2 — Model-Specific Registers (MSRs). Lists the MSRs available in the Pentium processors, the P6 family processors, the Pentium 4, Intel Xeon, Intel Core Solo, Intel Core Duo processors, Intel Core 2 processor family, and Intel Atom processors, and describes their functions. 1.3 NOTATIONAL CONVENTIONS This manual uses specific notation for data-structure formats, for symbolic representation of instructions, and for hexadecimal and binary numbers. A review of this notation makes the manual easier to read. 1.3.1 Bit and Byte Order In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. Intel 64 and IA-32 processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits: • Do not depend on the states of any reserved bits when testing the values of registers which contain such bits. Mask out the reserved bits before testing. • • • Do not depend on the states of any reserved bits when storing to memory or to a register. Do not depend on the ability to retain information written into any reserved bits. When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register. 1-4 Vol. 4 ABOUT THIS MANUAL NOTE Avoid any software dependence upon the state of reserved bits in Intel 64 and IA-32 registers. Depending upon the values of reserved register bits will make software dependent upon the unspecified manner in which the processor handles these bits. Programs that depend upon reserved values risk incompatibility with future processors. Highest 31 Address Data Structure 8 7 24 23 16 15 Byte 3 Byte 2 Byte 1 Bit offset 0 Byte 0 28 24 20 16 12 8 4 0 Lowest Address Byte Offset Figure 1-1. Bit and Byte Order 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • • • A label is an identifier which is followed by a colon. A mnemonic is a reserved name for a class of instruction opcodes which have the same function. The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the opcode. When present, they take the form of either literals or identifiers for data items. Operand identifiers are either reserved names of registers or are assumed to be assigned to data items declared in another part of the program (which may not be shown in the example). When two operands are present in an arithmetic or logical instruction, the right operand is the source and the left operand is the destination. For example: LOADREG: MOV EAX, SUBTOTAL In this example LOADREG is a label, MOV is the mnemonic identifier of an opcode, EAX is the destination operand, and SUBTOTAL is the source operand. Some assembly languages put the source and destination in reverse order. 1.3.4 Hexadecimal and Binary Numbers Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Base 2 (binary) numbers are represented by a string of 1s and 0s, sometimes followed by the character B (for example, 1010B). The “B” designation is only used in situations where confusion as to the type of number might arise. Vol. 4 1-5 ABOUT THIS MANUAL 1.3.5 Segmented Addressing The processor uses byte addressing. This means memory is organized and accessed as a sequence of bytes. Whether one or more bytes are being accessed, a byte address is used to locate the byte or bytes memory. The range of memory that can be addressed is called an address space. The processor also supports segmented addressing. This is a form of addressing where a program may have many independent address spaces, called segments. For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment: Segment-register:Byte-address For example, the following segment address identifies the byte at address FF79H in the segment pointed by the DS register: DS:FF79H The following segment address identifies an instruction address in the code segment. The CS register points to the code segment and the EIP register contains the address of the instruction. CS:EIP 1.3.6 Syntax for CPUID, CR, and MSR Values Obtain feature flags, status, and system information by using the CPUID instruction, by checking control register bits, and by reading model-specific registers. We are moving toward a single syntax to represent this type of information. See Figure 1-2. 1-6 Vol. 4 ABOUT THIS MANUAL CPUID Input and Output CPUID.01H:EDX.SSE[bit 25] = 1 Input value for EAX register Output register and feature flag or field name with bit position(s) Value (or range) of output Control Register Values CR4.OSFXSR[bit 9] = 1 Example CR name Feature flag or field name with bit position(s) Value (or range) of output Model-Specific Register Values IA32_MISC_ENABLE.ENABLEFOPCODE[bit 2] = 1 Example MSR name Feature flag or field name with bit position(s) Value (or range) of output Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation 1.3.7 Exceptions An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code reports additional information about the error. An example of the notation used to show an exception and error code is shown below: #PF(fault code) This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception: #GP(0) 1.4 RELATED LITERATURE Literature related to Intel 64 and IA-32 processors is listed and viewable on-line at: https://software.intel.com/en-us/articles/intel-sdm Vol. 4 1-7 ABOUT THIS MANUAL See also: • The latest security information on Intel® products: https://www.intel.com/content/www/us/en/security-center/default.html • Software developer resources, guidance and insights for security advisories: https://software.intel.com/security-software-guidance/ • • • The data sheet for a particular Intel 64 or IA-32 processor • Intel® Fortran Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ • Intel® Software Development Tools: https://software.intel.com/en-us/intel-sdp-home • Intel® 64 and IA-32 Architectures Software Developer’s Manual (in one, four or ten volumes): https://software.intel.com/en-us/articles/intel-sdm • Intel® 64 and IA-32 Architectures Optimization Reference Manual: https://software.intel.com/en-us/articles/intel-sdm#optimization • Intel 64 Architecture x2APIC Specification: The specification update for a particular Intel 64 or IA-32 processor Intel® C++ Compiler documentation and online help: http://software.intel.com/en-us/articles/intel-compilers/ http://www.intel.com/content/www/us/en/architecture-and-technology/64-architecture-x2apic-specification.html • Intel® Trusted Execution Technology Measured Launched Environment Programming Guide: http://www.intel.com/content/www/us/en/software-developers/intel-txt-software-development-guide.html • Developing Multi-threaded Applications: A Platform Consistent Approach: https://software.intel.com/sites/default/files/article/147714/51534-developing-multithreaded-applications.pdf • Using Spin-Loops on Intel® Pentium® 4 Processor and Intel® Xeon® Processor: https://software.intel.com/sites/default/files/22/30/25602 • Performance Monitoring Unit Sharing Guide http://software.intel.com/file/30388 Literature related to selected features in future Intel processors are available at: • Intel® Architecture Instruction Set Extensions Programming Reference https://software.intel.com/en-us/isa-extensions • Intel® Software Guard Extensions (Intel® SGX) Programming Reference https://software.intel.com/en-us/isa-extensions/intel-sgx More relevant links are: • Intel® Developer Zone: https://software.intel.com/en-us • Developer centers: http://www.intel.com/content/www/us/en/hardware-developers/developer-centers.html • Processor support general link: http://www.intel.com/support/processors/ • Intel® Hyper-Threading Technology (Intel® HT Technology): http://www.intel.com/technology/platform-technology/hyper-threading/index.htm 1-8 Vol. 4 28.Updates to Chapter 2, Volume 4 Change bars show changes to Chapter 2 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 4: Model-Specific Registers. -----------------------------------------------------------------------------------------Changes to this chapter: Update to Table 2-1 “CPUID Signature Values of DisplayFamily_DisplayModel”. Replaced “MSR_PERF_FIXED_CTR” naming with “IA32_FIXED_CTR” in various places. Updated processor naming as necessary. Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes 27 CHAPTER 2 MODEL-SPECIFIC REGISTERS (MSRS) This chapter lists MSRs across Intel processor families. All MSRs listed can be read with the RDMSR and written with the WRMSR instructions. The scope of an MSR defines the set of processors that access the same MSR with RDMSR and WRMSR. Thread-scope MSRs are unique to every logical processor. Core-scope MSRs are shared by the threads in the same core; similarly for module-scope, die-scope, and package-scope. When a processor package contains a single die, die-scope and package-scope are synonymous. When a package contains multiple die, they are distinct. NOTE For information on hierarchical level types supported, refer to the CPUID Leaf 1FH definition for the actual level type numbers: “V2 Extended Topology Enumeration Leaf” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. Also see Section 8.9.1, “Hierarchical Mapping of Shared Resources” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Register addresses are given in both hexadecimal and decimal. The register name is the mnemonic register name and the bit description describes individual bits in registers. Model specific registers and its bit-fields may be supported for a finite range of processor families/models. To distinguish between different processor family and/or models, software must use CPUID.01H leaf function to query the combination of DisplayFamily and DisplayModel to determine model-specific availability of MSRs (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). Table 2-1 lists the signature values of DisplayFamily and DisplayModel for various processor families or processor number series. Table 2-1. CPUID Signature Values of DisplayFamily_DisplayModel DisplayFamily_DisplayModel Processor Families/Processor Number Series 06_85H Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series based on Knights Mill microarchitecture 06_57H Intel® Xeon Phi™ Processor 3200, 5200, 7200 Series based on Knights Landing microarchitecture 06_7DH, 06_7EH Future Intel® Core™ processors based on Ice Lake microarchitecture 06_66H Intel® Core™ processors based on Cannon Lake microarchitecture 06_8EH, 06_9EH 7th generation Intel® Core™ processors based on Kaby Lake microarchitecture, 8th and 9th generation Intel® Core™ processors based on Coffee Lake microarchitecture, Intel® Xeon® E processors based on Coffee Lake microarchitecture 06_6AH, 06_6CH Future Intel® Xeon® processors based on Ice Lake microarchitecture 06_55H Intel® Xeon® Processor Scalable Family based on Skylake microarchitecture, 2nd generation Intel® Xeon® Processor Scalable Family based on Cascade Lake product, and future Cooper Lake product 06_4EH, 06_5EH 6th generation Intel Core processors and Intel Xeon processor E3-1500m v5 product family and E31200 v5 product family based on Skylake microarchitecture 06_56H Intel Xeon processor D-1500 product family based on Broadwell microarchitecture 06_4FH Intel Xeon processor E5 v4 Family based on Broadwell microarchitecture, Intel Xeon processor E7 v4 Family, Intel Core i7-69xx Processor Extreme Edition 06_47H 5th generation Intel Core processors, Intel Xeon processor E3-1200 v4 product family based on Broadwell microarchitecture 06_3DH Intel Core M-5xxx Processor, 5th generation Intel Core processors based on Broadwell microarchitecture Vol. 4 2-1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-1. CPUID Signature (Contd.)Values of DisplayFamily_DisplayModel (Contd.) DisplayFamily_DisplayModel Processor Families/Processor Number Series 06_3FH Intel Xeon processor E5-4600/2600/1600 v3 product families, Intel Xeon processor E7 v3 product families based on Haswell-E microarchitecture, Intel Core i7-59xx Processor Extreme Edition 06_3CH, 06_45H, 06_46H 4th Generation Intel Core processor and Intel Xeon processor E3-1200 v3 product family based on Haswell microarchitecture 06_3EH Intel Xeon processor E7-8800/4800/2800 v2 product families based on Ivy Bridge-E microarchitecture 06_3EH Intel Xeon processor E5-2600/1600 v2 product families and Intel Xeon processor E5-2400 v2 product family based on Ivy Bridge-E microarchitecture, Intel Core i7-49xx Processor Extreme Edition 06_3AH 3rd Generation Intel Core Processor and Intel Xeon processor E3-1200 v2 product family based on Ivy Bridge microarchitecture 06_2DH Intel Xeon processor E5 Family based on Intel microarchitecture code name Sandy Bridge, Intel Core i7-39xx Processor Extreme Edition 06_2FH Intel Xeon Processor E7 Family 06_2AH Intel Xeon processor E3-1200 product family; 2nd Generation Intel Core i7, i5, i3 Processors 2xxx Series 06_2EH Intel Xeon processor 7500, 6500 series 06_25H, 06_2CH Intel Xeon processors 3600, 5600 series, Intel Core i7, i5 and i3 Processors 06_1EH, 06_1FH Intel Core i7 and i5 Processors 06_1AH Intel Core i7 Processor, Intel Xeon processor 3400, 3500, 5500 series 06_1DH Intel Xeon processor MP 7400 series 06_17H Intel Xeon processor 3100, 3300, 5200, 5400 series, Intel Core 2 Quad processors 8000, 9000 series 06_0FH Intel Xeon processor 3000, 3200, 5100, 5300, 7300 series, Intel Core 2 Quad processor 6000 series, Intel Core 2 Extreme 6000 series, Intel Core 2 Duo 4000, 5000, 6000, 7000 series processors, Intel Pentium dual-core processors 06_0EH Intel Core Duo, Intel Core Solo processors 06_0DH Intel Pentium M processor 06_86H Intel® Atom™ processors based on Tremont Microarchitecture 06_7AH Intel Atom processors based on Goldmont Plus Microarchitecture 06_5FH Intel Atom processors based on Goldmont Microarchitecture (code name Denverton) 06_5CH Intel Atom processors based on Goldmont Microarchitecture 06_4CH Intel Atom processor X7-Z8000 and X5-Z8000 series based on Airmont Microarchitecture 06_5DH Intel Atom processor X3-C3000 based on Silvermont Microarchitecture 06_5AH Intel Atom processor Z3500 series 06_4AH Intel Atom processor Z3400 series 06_37H Intel Atom processor E3000 series, Z3600 series, Z3700 series 06_4DH Intel Atom processor C2000 series 06_36H Intel Atom processor S1000 Series 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H Intel Atom processor family, Intel Atom processor D2000, N2000, E2000, Z2000, C1000 series 0F_06H Intel Xeon processor 7100, 5000 Series, Intel Xeon Processor MP, Intel Pentium 4, Pentium D processors 0F_03H, 0F_04H Intel Xeon processor, Intel Xeon processor MP, Intel Pentium 4, Pentium D processors 06_09H Intel Pentium M processor 2-2 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-1. CPUID Signature (Contd.)Values of DisplayFamily_DisplayModel (Contd.) DisplayFamily_DisplayModel Processor Families/Processor Number Series 0F_02H Intel Xeon Processor, Intel Xeon processor MP, Intel Pentium 4 processors 0F_0H, 0F_01H Intel Xeon Processor, Intel Xeon processor MP, Intel Pentium 4 processors 06_7H, 06_08H, 06_0AH, 06_0BH Intel Pentium III Xeon processor, Intel Pentium III processor 06_03H, 06_05H Intel Pentium II Xeon processor, Intel Pentium II processor 06_01H Intel Pentium Pro processor 05_01H, 05_02H, 05_04H Intel Pentium processor, Intel Pentium processor with MMX Technology The Intel® Quark™ SoC X1000 processor can be identified by the signature of DisplayFamily_DisplayModel = 05_09H and SteppingID = 0 2.1 ARCHITECTURAL MSRS Many MSRs have carried over from one generation of IA-32 processors to the next and to Intel 64 processors. A subset of MSRs and associated bit fields, which do not change on future processor generations, are now considered architectural MSRs. For historical reasons (beginning with the Pentium 4 processor), these “architectural MSRs” were given the prefix “IA32_”. Table 2-2 lists the architectural MSRs, their addresses, their current names, their names in previous IA-32 processors, and bit fields that are considered architectural. MSR addresses outside Table 2-2 and certain bit fields in an MSR address that may overlap with architectural MSR addresses are model-specific. Code that accesses a machine specified MSR and that is executed on a processor that does not support that MSR will generate an exception. Architectural MSR or individual bit fields in an architectural MSR may be introduced or transitioned at the granularity of certain processor family/model or the presence of certain CPUID feature flags. The right-most column of Table 2-2 provides information on the introduction of each architectural MSR or its individual fields. This information is expressed either as signature values of “DF_DM” (see Table 2-1) or via CPUID flags. Certain bit field position may be related to the maximum physical address width, the value of which is expressed as “MAXPHYADDR” in Table 2-2. “MAXPHYADDR” is reported by CPUID.8000_0008H leaf. MSR address range between 40000000H - 400000FFH is marked as a specially reserved range. All existing and future processors will not implement any features using any MSR in this range. Table 2-2. IA-32 Architectural MSRs Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 0H 0 IA32_P5_MC_ADDR (P5_MC_ADDR) See Section 2.23, “MSRs in Pentium Processors.” Pentium Processor (05_01H) 1H 1 IA32_P5_MC_TYPE (P5_MC_TYPE) See Section 2.23, “MSRs in Pentium Processors.” DF_DM = 05_01H 6H 6 IA32_MONITOR_FILTER_SIZE See Section 8.10.5, “Monitor/Mwait Address Range Determination.” 0F_03H 10H 16 IA32_TIME_STAMP_COUNTER See Section 17.17, “Time-Stamp Counter.” 05_01H IA32_PLATFORM_ID (MSR_PLATFORM_ID ) Platform ID (RO) The operating system can use this MSR to determine “slot” information for the processor and the proper microcode update to load. 06_01H 49:0 Reserved (TSC) 17H 23 Vol. 4 2-3 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 52:50 Platform Id (RO) Contains information concerning the intended platform for the processor. 52 0 0 0 0 1 1 1 1 1BH 27 58 50 0 1 0 1 0 1 0 1 Processor Flag 0 Processor Flag 1 Processor Flag 2 Processor Flag 3 Processor Flag 4 Processor Flag 5 Processor Flag 6 Processor Flag 7 63:53 Reserved IA32_APIC_BASE This register holds the APIC base address, permitting the relocation of the APIC memory map. See Section 10.4.4, “Local APIC Status and Location” and Section 10.4.5, “Relocating the Local APIC Registers”. (APIC_BASE) 3AH 51 0 0 1 1 0 0 1 1 7:0 Reserved 8 BSP flag (R/W) 9 Reserved 10 Enable x2APIC mode. 11 APIC Global Enable (R/W) (MAXPHYADDR - 1):12 APIC Base (R/W) 63: MAXPHYADDR Reserved IA32_FEATURE_CONTROL Control Features in Intel 64 Processor (R/W) If any one enumeration condition for defined bit field holds. 0 Lock bit (R/WO): (1 = locked). When set, locks this MSR from being written; writes to this bit will result in GP(0). If any one enumeration condition for defined bit field position greater than bit 0 holds. Note: Once the Lock bit is set, the contents of this register cannot be modified. Therefore the lock bit must be set after configuring support for Intel Virtualization Technology and prior to transferring control to an option ROM or the OS. Hence, once the Lock bit is set, the entire IA32_FEATURE_CONTROL contents are preserved across RESET when PWRGOOD is not deasserted. 2-4 Vol. 4 06_01H 06_1AH MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Enable VMX inside SMX operation (R/WL): This bit enables a system executive to use VMX in conjunction with SMX to support Intel® Trusted Execution Technology. If CPUID.01H:ECX[5] = 1 && CPUID.01H:ECX[6] = 1 Decimal 1 BIOS must set this bit only when the CPUID function 1 returns VMX feature flag and SMX feature flag set (ECX bits 5 and 6 respectively). 2 Enable VMX outside SMX operation (R/WL): This bit enables VMX for a system executive that does not require SMX. If CPUID.01H:ECX[5] = 1 BIOS must set this bit only when the CPUID function 1 returns the VMX feature flag set (ECX bit 5). 3BH 59 7:3 Reserved 14:8 SENTER Local Function Enables (R/WL): When set, each bit in the field represents an enable control for a corresponding SENTER function. This field is supported only if CPUID.1:ECX.[bit 6] is set. If CPUID.01H:ECX[6] = 1 15 SENTER Global Enable (R/WL): This bit must be set to enable SENTER leaf functions. This bit is supported only if CPUID.1:ECX.[bit 6] is set. If CPUID.01H:ECX[6] = 1 16 Reserved 17 SGX Launch Control Enable (R/WL): This bit must be set to enable runtime reconfiguration of SGX Launch Control via the IA32_SGXLEPUBKEYHASHn MSR. If CPUID.(EAX=07H, ECX=0H): ECX[30] = 1 18 SGX Global Enable (R/WL): This bit must be set to enable SGX leaf functions. If CPUID.(EAX=07H, ECX=0H): EBX[2] = 1 19 Reserved 20 LMCE On (R/WL): When set, system software can program the MSRs associated with LMCE to configure delivery of some machine check exceptions to a single logical processor. 63:21 Reserved IA32_TSC_ADJUST Per Logical Processor TSC Adjust (R/Write to clear) 63:0 THREAD_ADJUST: If IA32_MCG_CAP[27] = 1 If CPUID.(EAX=07H, ECX=0H): EBX[1] = 1 Local offset value of the IA32_TSC for a logical processor. Reset value is zero. A write to IA32_TSC will modify the local offset in IA32_TSC_ADJUST and the content of IA32_TSC, but does not affect the internal invariant TSC hardware. Vol. 4 2-5 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 48H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 72 IA32_SPEC_CTRL Speculation Control (R/W) The MSR bits are defined as logical processor scope. On some core implementations, the bits may impact sibling logical processors on the same core. If any one of the enumeration conditions for defined bit field positions holds. This MSR has a value of 0 after reset and is unaffected by INIT# or SIPI#. 49H 73 0 Indirect Branch Restricted Speculation (IBRS). Restricts speculation of indirect branch. If CPUID.(EAX=07H, ECX=0):EDX[26]=1 1 Single Thread Indirect Branch Predictors (STIBP). Prevents indirect branch predictions on all logical processors on the core from being controlled by any sibling logical processor in the same core. If CPUID.(EAX=07H, ECX=0):EDX[27]=1 2 Speculative Store Bypass Disable (SSBD) delays speculative execution of a load until the addresses for all older stores are known. If CPUID.(EAX=07H, ECX=0):EDX[31]=1 63:3 Reserved IA32_PRED_CMD Prediction Command (WO) Gives software a way to issue commands that affect the state of predictors. 79H 121 0 Indirect Branch Prediction Barrier (IBPB). 63:1 Reserved IA32_BIOS_UPDT_TRIG (BIOS_UPDT_TRIG) BIOS Update Trigger (W) If any one of the enumeration conditions for defined bit field positions holds. If CPUID.(EAX=07H, ECX=0):EDX[26]=1 06_01H Executing a WRMSR instruction to this MSR causes a microcode update to be loaded into the processor. See Section 9.11.6, “Microcode Update Loader.” A processor may prevent writing to this MSR when loading guest states on VM entries or saving guest states on VM exits. 8BH 139 IA32_BIOS_SIGN_ID (BIOS_SIGN/BBL_CR_D3) BIOS Update Signature (RO) Returns the microcode update signature following the execution of CPUID.01H. A processor may prevent writing to this MSR when loading guest states on VM entries or saving guest states on VM exits. 31:0 2-6 Vol. 4 Reserved 06_01H MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 63:32 It is recommended that this field be preloaded with zero prior to executing CPUID. If the field remains zero following the execution of CPUID, this indicates that no microcode update is loaded. Any non-zero value is the microcode update signature. 8CH 140 IA32_SGXLEPUBKEYHASH0 IA32_SGXLEPUBKEYHASH[63:0] (R/W) Bits 63:0 of the SHA256 digest of the SIGSTRUCT.MODULUS for SGX Launch Enclave. On reset, the default value is the digest of Intel’s signing key. 8DH 141 IA32_SGXLEPUBKEYHASH1 IA32_SGXLEPUBKEYHASH[127:64] (R/W) Bits 127:64 of the SHA256 digest of the SIGSTRUCT.MODULUS for SGX Launch Enclave. On reset, the default value is the digest of Intel’s signing key. 8EH 142 IA32_SGXLEPUBKEYHASH2 IA32_SGXLEPUBKEYHASH[191:128] (R/W) Bits 191:128 of the SHA256 digest of the SIGSTRUCT.MODULUS for SGX Launch Enclave. On reset, the default value is the digest of Intel’s signing key. 8FH 143 IA32_SGXLEPUBKEYHASH3 IA32_SGXLEPUBKEYHASH[255:192] (R/W) Read permitted If CPUID.(EAX=12H,ECX=0H): EAX[0]=1 && CPUID.(EAX=07H, ECX=0H):ECX[30]=1. Write permitted if CPUID.(EAX=12H,ECX=0H): EAX[0]=1 && IA32_FEATURE_CONTROL[ 17] = 1 && IA32_FEATURE_CONTROL[ 0] = 1. Bits 255:192 of the SHA256 digest of the SIGSTRUCT.MODULUS for SGX Launch Enclave. On reset, the default value is the digest of Intel’s signing key. 9BH 155 IA32_SMM_MONITOR_CTL SMM Monitor Configuration (R/W) 0 Valid (R/W) 1 Reserved 2 Controls SMI unblocking by VMXOFF (see Section 34.14.4). 11:3 Reserved 31:12 MSEG Base (R/W) 63:32 Reserved If CPUID.01H: ECX[5]=1 || CPUID.01H: ECX[6] = 1 If IA32_VMX_MISC[28] 9EH 158 IA32_SMBASE Base address of the logical processor’s SMRAM image (RO, SMM only). If IA32_VMX_MISC[15] C1H 193 IA32_PMC0 General Performance Counter 0 (R/W) If CPUID.0AH: EAX[15:8] > 0 General Performance Counter 1 (R/W) If CPUID.0AH: EAX[15:8] > 1 General Performance Counter 2 (R/W) If CPUID.0AH: EAX[15:8] > 2 (PERFCTR0) C2H 194 IA32_PMC1 (PERFCTR1) C3H 195 IA32_PMC2 Vol. 4 2-7 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal C4H 196 IA32_PMC3 General Performance Counter 3 (R/W) If CPUID.0AH: EAX[15:8] > 3 C5H 197 IA32_PMC4 General Performance Counter 4 (R/W) If CPUID.0AH: EAX[15:8] > 4 C6H 198 IA32_PMC5 General Performance Counter 5 (R/W) If CPUID.0AH: EAX[15:8] > 5 C7H 199 IA32_PMC6 General Performance Counter 6 (R/W) If CPUID.0AH: EAX[15:8] > 6 C8H 200 IA32_PMC7 General Performance Counter 7 (R/W) If CPUID.0AH: EAX[15:8] > 7 CFH 207 IA32_CORE_CAPABILITY IA32 Core Capability Register 06_86H 4:0 Reserved. 5 #AC(0) exception for split locked accesses supported. 31:6 Reserved. IA32_UMWAIT_CONTROL UMWAIT Control (R/W) 0 C0.2 is not allowed by the OS. Value of “1” means all C0.2 requests revert to C0.1. 1 Reserved 31:2 Determines the maximum time in TSCquanta that the processor can reside in either C0.1 or C0.2. A zero value indicates no maximum time. The maximum time value is a 32-bit value where the upper 30 bits come from this field and the lower two bits are zero. IA32_MPERF TSC Frequency Clock Counter (R/Write to clear) E1H E7H 225 231 63:0 If CPUID.06H: ECX[0] = 1 C0_MCNT: C0 TSC Frequency Clock Count Increments at fixed interval (relative to TSC freq.) when the logical processor is in C0. Cleared upon overflow / wrap-around of IA32_APERF. E8H 232 IA32_APERF Actual Performance Clock Counter (R/Write to clear) 63:0 C0_ACNT: C0 Actual Frequency Clock Count If CPUID.06H: ECX[0] = 1 Accumulates core clock counts at the coordinated clock frequency, when the logical processor is in C0. Cleared upon overflow / wrap-around of IA32_MPERF. FEH 2-8 Vol. 4 254 IA32_MTRRCAP MTRR Capability (RO) (MTRRcap) See Section 11.11.2.1, “IA32_MTRR_DEF_TYPE MSR.” 06_01H MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 10AH 10BH 174H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 266 267 372 7:0 VCNT: The number of variable memory type ranges in the processor. 8 Fixed range MTRRs are supported when set. 9 Reserved 10 WC Supported when set. 11 SMRR Supported when set. 12 PRMRR supported when set. 63:13 Reserved IA32_ARCH_CAPABILITIES Enumeration of Architectural Features (RO) 0 RDCL_NO: The processor is not susceptible to Rogue Data Cache Load (RDCL). 1 IBRS_ALL: The processor supports enhanced IBRS. 2 RSBA: The processor supports RSB Alternate. Alternative branch predictors may be used by RET instructions when the RSB is empty. SW using retpoline may be affected by this behavior. 3 SKIP_L1DFL_VMENTRY: A value of 1 indicates the hypervisor need not flush the L1D on VM entry. 4 SSB_NO: Processor is not susceptible to Speculative Store Bypass. 63:5 Reserved IA32_FLUSH_CMD Flush Command (WO) If CPUID.(EAX=07H, ECX=0):EDX[29]=1 Gives software a way to invalidate structures with finer granularity than other architectural methods. If any one of the enumeration conditions for defined bit field positions holds. 0 L1D_FLUSH: Writeback and invalidate the L1 data cache. If CPUID.(EAX=07H, ECX=0):EDX[28]=1 63:1 Reserved IA32_SYSENTER_CS SYSENTER_CS_MSR (R/W) 15:0 CS Selector. 31:16 Not used. Can be read and written. 63:32 Not used. Writes ignored; reads 06_01H return zero. 175H 373 IA32_SYSENTER_ESP SYSENTER_ESP_MSR (R/W) 06_01H 176H 374 IA32_SYSENTER_EIP SYSENTER_EIP_MSR (R/W) 06_01H 179H 377 IA32_MCG_CAP (MCG_CAP) Global Machine Check Capability (RO) 06_01H 7:0 Count: Number of reporting banks. Vol. 4 2-9 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 17AH Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 378 8 MCG_CTL_P: IA32_MCG_CTL is present if this bit is set. 9 MCG_EXT_P: Extended machine check state registers are present if this bit is set. 10 MCP_CMCI_P: Support for corrected MC error event is present. 11 MCG_TES_P: Threshold-based error status register are present if this bit is set. 15:12 Reserved 23:16 MCG_EXT_CNT: Number of extended machine check state registers present. 24 MCG_SER_P: The processor supports software error recovery if this bit is set. 25 Reserved 26 MCG_ELOG_P: Indicates that the processor allows platform firmware to be invoked when an error is detected so that it may provide additional platform specific information in an ACPI format “Generic Error Data Entry” that augments the data included in machine check bank registers. 06_3EH 27 MCG_LMCE_P: Indicates that the processor supports extended state in IA32_MCG_STATUS and associated MSR necessary to configure Local Machine Check Exception (LMCE). 06_3EH 63:28 Reserved IA32_MCG_STATUS Global Machine Check Status (R/W0) 06_01H 0 RIPV. Restart IP valid. 06_01H 1 EIPV. Error IP valid. 06_01H 2 MCIP. Machine check in progress. 06_01H 3 LMCE_S If IA32_MCG_CAP.LMCE_P[2 7] =1 63:4 Reserved Global Machine Check Control (R/W) 06_01H (MCG_STATUS) 17BH 379 IA32_MCG_CTL (MCG_CTL) 180H185H 384389 Reserved 186H 390 IA32_PERFEVTSEL0 (PERFEVTSEL0) 2-10 Vol. 4 If IA32_MCG_CAP.CTL_P[8] =1 06_0EH1 Performance Event Select Register 0 (R/W) If CPUID.0AH: EAX[15:8] > 0 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 187H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 391 7:0 Event Select: Selects a performance event logic unit. 15:8 UMask: Qualifies the microarchitectural condition to detect on the selected event logic. 16 USR: Counts while in privilege level is not ring 0. 17 OS: Counts while in privilege level is ring 0. 18 Edge: Enables edge detection if set. 19 PC: Enables pin control. 20 INT: Enables interrupt on counter overflow. 21 AnyThread: When set to 1, it enables counting the associated event conditions occurring across all logical processors sharing a processor core. When set to 0, the counter only increments the associated event conditions occurring in the logical processor which programmed the MSR. 22 EN: Enables the corresponding performance counter to commence counting when this bit is set. 23 INV: Invert the CMASK. 31:24 CMASK: When CMASK is not zero, the corresponding performance counter increments each cycle if the event count is greater than or equal to the CMASK. 63:32 Reserved IA32_PERFEVTSEL1 Performance Event Select Register 1 (R/W) If CPUID.0AH: EAX[15:8] > 1 (PERFEVTSEL1) 188H 392 IA32_PERFEVTSEL2 Performance Event Select Register 2 (R/W) If CPUID.0AH: EAX[15:8] > 2 189H 393 IA32_PERFEVTSEL3 Performance Event Select Register 3 (R/W) If CPUID.0AH: EAX[15:8] > 3 18AH197H 394407 Reserved 198H 408 IA32_PERF_STATUS 06_0EH2 Current Performance Status (RO) 0F_03H See Section 14.1.1, “Software Interface For Initiating Performance State Transitions”. 15:0 Current performance State Value. 63:16 Reserved Vol. 4 2-11 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 199H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 409 IA32_PERF_CTL Performance Control MSR (R/W) 0F_03H Software makes a request for a new Performance state (P-State) by writing this MSR. See Section 14.1.1, “Software Interface For Initiating Performance State Transitions”. 15:0 Target performance State Value. 31:16 Reserved 32 IDA Engage (R/W) 06_0FH (Mobile only) When set to 1: disengages IDA. 19AH 410 63:33 Reserved IA32_CLOCK_MODULATION Clock Modulation Control (R/W) If CPUID.01H:EDX[22] = 1 See Section 14.7.3, “Software Controlled Clock Modulation.” 19BH 411 0 Extended On-Demand Clock Modulation Duty Cycle. If CPUID.06H:EAX[5] = 1 3:1 On-Demand Clock Modulation Duty Cycle: Specific encoded values for target duty cycle modulation. If CPUID.01H:EDX[22] = 1 4 On-Demand Clock Modulation Enable: Set 1 to enable modulation. If CPUID.01H:EDX[22] = 1 63:5 Reserved IA32_THERM_INTERRUPT Thermal Interrupt Control (R/W) If CPUID.01H:EDX[22] = 1 Enables and disables the generation of an interrupt on temperature transitions detected with the processor’s thermal sensors and thermal monitor. See Section 14.7.2, “Thermal Monitor.” 2-12 Vol. 4 0 High-Temperature Interrupt Enable If CPUID.01H:EDX[22] = 1 1 Low-Temperature Interrupt Enable If CPUID.01H:EDX[22] = 1 2 PROCHOT# Interrupt Enable If CPUID.01H:EDX[22] = 1 3 FORCEPR# Interrupt Enable If CPUID.01H:EDX[22] = 1 4 Critical Temperature Interrupt Enable If CPUID.01H:EDX[22] = 1 7:5 Reserved 14:8 Threshold #1 Value If CPUID.01H:EDX[22] = 1 15 Threshold #1 Interrupt Enable If CPUID.01H:EDX[22] = 1 22:16 Threshold #2 Value If CPUID.01H:EDX[22] = 1 23 Threshold #2 Interrupt Enable If CPUID.01H:EDX[22] = 1 24 Power Limit Notification Enable If CPUID.06H:EAX[4] = 1 63:25 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 19CH Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 412 IA32_THERM_STATUS Thermal Status Information (RO) If CPUID.01H:EDX[22] = 1 Contains status information about the processor’s thermal sensor and automatic thermal monitoring facilities. See Section 14.7.2, “Thermal Monitor”. 1A0H 416 0 Thermal Status (RO) If CPUID.01H:EDX[22] = 1 1 Thermal Status Log (R/W) If CPUID.01H:EDX[22] = 1 2 PROCHOT # or FORCEPR# event (RO) If CPUID.01H:EDX[22] = 1 3 PROCHOT # or FORCEPR# log (R/WC0) If CPUID.01H:EDX[22] = 1 4 Critical Temperature Status (RO) If CPUID.01H:EDX[22] = 1 5 Critical Temperature Status log (R/WC0) If CPUID.01H:EDX[22] = 1 6 Thermal Threshold #1 Status (RO) If CPUID.01H:ECX[8] = 1 7 Thermal Threshold #1 log (R/WC0) If CPUID.01H:ECX[8] = 1 8 Thermal Threshold #2 Status (RO) If CPUID.01H:ECX[8] = 1 9 Thermal Threshold #2 log (R/WC0) If CPUID.01H:ECX[8] = 1 10 Power Limitation Status (RO) If CPUID.06H:EAX[4] = 1 11 Power Limitation log (R/WC0) If CPUID.06H:EAX[4] = 1 12 Current Limit Status (RO) If CPUID.06H:EAX[7] = 1 13 Current Limit log (R/WC0) If CPUID.06H:EAX[7] = 1 14 Cross Domain Limit Status (RO) If CPUID.06H:EAX[7] = 1 15 Cross Domain Limit log (R/WC0) If CPUID.06H:EAX[7] = 1 22:16 Digital Readout (RO) If CPUID.06H:EAX[0] = 1 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) If CPUID.06H:EAX[0] = 1 31 Reading Valid (RO) If CPUID.06H:EAX[0] = 1 63:32 Reserved IA32_MISC_ENABLE Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Fast-Strings Enable 0F_0H When set, the fast-strings feature (for REP MOVS and REP STORS) is enabled (default). When clear, fast-strings are disabled. 2:1 Reserved Vol. 4 2-13 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 3 Automatic Thermal Control Circuit Enable (R/W) 0F_0H 1 = Setting this bit enables the thermal control circuit (TCC) portion of the Intel Thermal Monitor feature. This allows the processor to automatically reduce power consumption in response to TCC activation. 0 = Disabled. Note: In some products clearing this bit might be ignored in critical thermal conditions, and TM1, TM2 and adaptive thermal throttling will still be activated. The default value of this field varies with product . See respective tables where default value is listed. 6:4 Reserved 7 Performance Monitoring Available (R) 0F_0H 1 = Performance monitoring enabled. 0 = Performance monitoring disabled. 10:8 11 Reserved Branch Trace Storage Unavailable (RO) 0F_0H 1 = Processor doesn’t support branch trace storage (BTS). 0 = BTS is supported. 12 Processor Event Based Sampling (PEBS) Unavailable (RO) 06_0FH 1 = PEBS is not supported. 0 = PEBS is supported. 15:13 Reserved 16 Enhanced Intel SpeedStep Technology Enable (R/W) 0= Enhanced Intel SpeedStep Technology disabled. 1 = Enhanced Intel SpeedStep Technology enabled. 17 2-14 Vol. 4 Reserved If CPUID.01H: ECX[7] =1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 18 ENABLE MONITOR FSM (R/W) 0F_03H When this bit is set to 0, the MONITOR feature flag is not set (CPUID.01H:ECX[bit 3] = 0). This indicates that MONITOR/MWAIT are not supported. Software attempts to execute MONITOR/MWAIT will cause #UD when this bit is 0. When this bit is set to 1 (default), MONITOR/MWAIT are supported (CPUID.01H:ECX[bit 3] = 1). If the SSE3 feature flag ECX[0] is not set (CPUID.01H:ECX[bit 0] = 0), the OS must not attempt to alter this bit. BIOS must leave it in the default state. Writing this bit when the SSE3 feature flag is set to 0 may generate a #GP exception. 21:19 Reserved 22 Limit CPUID Maxval (R/W) 0F_03H When this bit is set to 1, CPUID.00H returns a maximum value in EAX[7:0] of 2. BIOS should contain a setup question that allows users to specify when the installed OS does not support CPUID functions greater than 2. Before setting this bit, BIOS must execute the CPUID.0H and examine the maximum value returned in EAX[7:0]. If the maximum value is greater than 2, this bit is supported. Otherwise, this bit is not supported. Setting this bit when the maximum value is not greater than 2 may generate a #GP exception. Setting this bit may cause unexpected behavior in software that depends on the availability of CPUID leaves greater than 2. 23 xTPR Message Disable (R/W) If CPUID.01H:ECX[14] = 1 When set to 1, xTPR messages are disabled. xTPR messages are optional messages that allow the processor to inform the chipset of its priority. 33:24 Reserved Vol. 4 2-15 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 34 XD Bit Disable (R/W) When set to 1, the Execute Disable Bit feature (XD Bit) is disabled and the XD Bit extended feature flag will be clear (CPUID.80000001H: EDX[20]=0). If CPUID.80000001H:EDX[2 0] = 1 When set to a 0 (default), the Execute Disable Bit feature (if available) allows the OS to enable PAE paging and take advantage of data only pages. BIOS must not alter the contents of this bit location, if XD bit is not supported. Writing this bit to 1 when the XD Bit extended feature flag is set to 0 may generate a #GP exception. 1B0H 432 63:35 Reserved IA32_ENERGY_PERF_BIAS Performance Energy Bias Hint (R/W) 3:0 Power Policy Preference: If CPUID.6H:ECX[3] = 1 0 indicates preference to highest performance. 15 indicates preference to maximize energy saving. 63:4 1B1H 433 IA32_PACKAGE_THERM_STATUS Reserved Package Thermal Status Information (RO) Contains status information about the package’s thermal sensor. See Section 14.8, “Package Level Thermal Management.” 2-16 Vol. 4 0 Pkg Thermal Status (RO) 1 Pkg Thermal Status Log (R/W) 2 Pkg PROCHOT # event (RO) 3 Pkg PROCHOT # log (R/WC0) 4 Pkg Critical Temperature Status (RO) 5 Pkg Critical Temperature Status Log (R/WC0) 6 Pkg Thermal Threshold #1 Status (RO) 7 Pkg Thermal Threshold #1 log (R/WC0) 8 Pkg Thermal Threshold #2 Status (RO) 9 Pkg Thermal Threshold #1 log (R/WC0) 10 Pkg Power Limitation Status (RO) 11 Pkg Power Limitation log (R/WC0) 15:12 Reserved 22:16 Pkg Digital Readout (RO) If CPUID.06H: EAX[6] = 1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 1B2H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 434 63:23 Reserved IA32_PACKAGE_THERM_INTERRUPT Pkg Thermal Interrupt Control (R/W) If CPUID.06H: EAX[6] = 1 Enables and disables the generation of an interrupt on temperature transitions detected with the package’s thermal sensor. See Section 14.8, “Package Level Thermal Management.” 1D9H 473 0 Pkg High-Temperature Interrupt Enable 1 Pkg Low-Temperature Interrupt Enable 2 Pkg PROCHOT# Interrupt Enable 3 Reserved 4 Pkg Overheat Interrupt Enable 7:5 Reserved 14:8 Pkg Threshold #1 Value 15 Pkg Threshold #1 Interrupt Enable 22:16 Pkg Threshold #2 Value 23 Pkg Threshold #2 Interrupt Enable 24 Pkg Power Limit Notification Enable 63:25 Reserved IA32_DEBUGCTL Trace/Profile Resource Control (R/W) 06_0EH 0 LBR: Setting this bit to 1 enables the processor to record a running trace of the most recent branches taken by the processor in the LBR stack. 06_01H 1 BTF: Setting this bit to 1 enables the processor to treat EFLAGS.TF as single-step on branches instead of single-step on instructions. 06_01H 5:2 Reserved 6 TR: Setting this bit to 1 enables branch trace messages to be sent. 06_0EH 7 BTS: Setting this bit enables branch trace messages (BTMs) to be logged in a BTS buffer. 06_0EH 8 BTINT: When clear, BTMs are logged in a BTS buffer in circular fashion. When this bit is set, an interrupt is generated by the BTS facility when the BTS buffer is full. 06_0EH 9 1: BTS_OFF_OS: When set, BTS or BTM is skipped if CPL = 0. 06_0FH (MSR_DEBUGCTLA, MSR_DEBUGCTLB) Vol. 4 2-17 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 1F2H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 498 10 BTS_OFF_USR: When set, BTS or BTM is skipped if CPL > 0. 06_0FH 11 FREEZE_LBRS_ON_PMI: When set, the LBR stack is frozen on a PMI request. If CPUID.01H: ECX[15] = 1 && CPUID.0AH: EAX[7:0] > 1 12 FREEZE_PERFMON_ON_PMI: When set, each ENABLE bit of the global counter control MSR are frozen (address 38FH) on a PMI request. If CPUID.01H: ECX[15] = 1 && CPUID.0AH: EAX[7:0] > 1 13 ENABLE_UNCORE_PMI: When set, enables the logical processor to receive and generate PMI on behalf of the uncore. 06_1AH 14 FREEZE_WHILE_SMM: When set, freezes perfmon and trace messages while in SMM. If IA32_PERF_CAPABILITIES[ 12] = 1 15 RTM_DEBUG: When set, enables DR7 debug bit on XBEGIN. If (CPUID.(EAX=07H, ECX=0):EBX[11] = 1) 63:16 Reserved IA32_SMRR_PHYSBASE SMRR Base Address (Writeable only in SMM) Base address of SMM memory range. 7:0 Type. Specifies memory type of the range. 11:8 Reserved 31:12 PhysBase If IA32_MTRRCAP.SMRR[11] =1 SMRR physical Base Address. 1F3H 499 63:32 Reserved IA32_SMRR_PHYSMASK SMRR Range Mask (Writeable only in SMM) Range Mask of SMM memory range. 10:0 11 If IA32_MTRRCAP[SMRR] =1 Reserved Valid Enable range mask. 31:12 PhysMask SMRR address range mask. 63:32 Reserved 1F8H 504 IA32_PLATFORM_DCA_CAP DCA Capability (R) If CPUID.01H: ECX[18] = 1 1F9H 505 IA32_CPU_DCA_CAP If set, CPU supports Prefetch-Hint type. If CPUID.01H: ECX[18] = 1 1FAH 506 IA32_DCA_0_CAP DCA type 0 Status and Control register. If CPUID.01H: ECX[18] = 1 0 DCA_ACTIVE: Set by HW when DCA is fuseenabled and no defeatures are set. 2:1 TRANSACTION 6:3 DCA_TYPE 2-18 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 10:7 DCA_QUEUE_SIZE 12:11 Reserved 16:13 DCA_DELAY: Writes will update the register but have no HW side-effect. 23:17 Reserved 24 SW_BLOCK: SW can request DCA block by setting this bit. 25 Reserved 26 HW_BLOCK: Set when DCA is blocked by HW (e.g. CR0.CD = 1). 31:27 Reserved 200H 512 IA32_MTRR_PHYSBASE0 (MTRRphysBase0) See Section 11.11.2.3, “Variable Range MTRRs.” If CPUID.01H: EDX.MTRR[12] =1 201H 513 IA32_MTRR_PHYSMASK0 MTRRphysMask0 If CPUID.01H: EDX.MTRR[12] =1 202H 514 IA32_MTRR_PHYSBASE1 MTRRphysBase1 If CPUID.01H: EDX.MTRR[12] =1 203H 515 IA32_MTRR_PHYSMASK1 MTRRphysMask1 If CPUID.01H: EDX.MTRR[12] =1 204H 516 IA32_MTRR_PHYSBASE2 MTRRphysBase2 If CPUID.01H: EDX.MTRR[12] =1 205H 517 IA32_MTRR_PHYSMASK2 MTRRphysMask2 If CPUID.01H: EDX.MTRR[12] =1 206H 518 IA32_MTRR_PHYSBASE3 MTRRphysBase3 If CPUID.01H: EDX.MTRR[12] =1 207H 519 IA32_MTRR_PHYSMASK3 MTRRphysMask3 If CPUID.01H: EDX.MTRR[12] =1 208H 520 IA32_MTRR_PHYSBASE4 MTRRphysBase4 If CPUID.01H: EDX.MTRR[12] =1 209H 521 IA32_MTRR_PHYSMASK4 MTRRphysMask4 If CPUID.01H: EDX.MTRR[12] =1 20AH 522 IA32_MTRR_PHYSBASE5 MTRRphysBase5 If CPUID.01H: EDX.MTRR[12] =1 20BH 523 IA32_MTRR_PHYSMASK5 MTRRphysMask5 If CPUID.01H: EDX.MTRR[12] =1 20CH 524 IA32_MTRR_PHYSBASE6 MTRRphysBase6 If CPUID.01H: EDX.MTRR[12] =1 20DH 525 IA32_MTRR_PHYSMASK6 MTRRphysMask6 If CPUID.01H: EDX.MTRR[12] =1 20EH 526 IA32_MTRR_PHYSBASE7 MTRRphysBase7 If CPUID.01H: EDX.MTRR[12] =1 20FH 527 IA32_MTRR_PHYSMASK7 MTRRphysMask7 If CPUID.01H: EDX.MTRR[12] =1 Vol. 4 2-19 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 210H 528 IA32_MTRR_PHYSBASE8 MTRRphysBase8 if IA32_MTRRCAP[7:0] > 8 211H 529 IA32_MTRR_PHYSMASK8 MTRRphysMask8 if IA32_MTRRCAP[7:0] > 8 212H 530 IA32_MTRR_PHYSBASE9 MTRRphysBase9 if IA32_MTRRCAP[7:0] > 9 213H 531 IA32_MTRR_PHYSMASK9 MTRRphysMask9 if IA32_MTRRCAP[7:0] > 9 250H 592 IA32_MTRR_FIX64K_00000 MTRRfix64K_00000 If CPUID.01H: EDX.MTRR[12] =1 258H 600 IA32_MTRR_FIX16K_80000 MTRRfix16K_80000 If CPUID.01H: EDX.MTRR[12] =1 259H 601 IA32_MTRR_FIX16K_A0000 MTRRfix16K_A0000 If CPUID.01H: EDX.MTRR[12] =1 268H 616 IA32_MTRR_FIX4K_C0000 (MTRRfix4K_C0000 ) See Section 11.11.2.2, “Fixed Range MTRRs.” If CPUID.01H: EDX.MTRR[12] =1 269H 617 IA32_MTRR_FIX4K_C8000 MTRRfix4K_C8000 If CPUID.01H: EDX.MTRR[12] =1 26AH 618 IA32_MTRR_FIX4K_D0000 MTRRfix4K_D0000 If CPUID.01H: EDX.MTRR[12] =1 26BH 619 IA32_MTRR_FIX4K_D8000 MTRRfix4K_D8000 If CPUID.01H: EDX.MTRR[12] =1 26CH 620 IA32_MTRR_FIX4K_E0000 MTRRfix4K_E0000 If CPUID.01H: EDX.MTRR[12] =1 26DH 621 IA32_MTRR_FIX4K_E8000 MTRRfix4K_E8000 If CPUID.01H: EDX.MTRR[12] =1 26EH 622 IA32_MTRR_FIX4K_F0000 MTRRfix4K_F0000 If CPUID.01H: EDX.MTRR[12] =1 26FH 623 IA32_MTRR_FIX4K_F8000 MTRRfix4K_F8000 If CPUID.01H: EDX.MTRR[12] =1 277H 631 IA32_PAT IA32_PAT (R/W) If CPUID.01H: EDX.MTRR[16] =1 2:0 PA0 7:3 Reserved 10:8 PA1 15:11 Reserved 18:16 PA2 23:19 Reserved 26:24 PA3 31:27 Reserved 34:32 PA4 39:35 Reserved 42:40 PA5 47:43 Reserved 50:48 PA6 2-20 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 280H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 640 55:51 Reserved 58:56 PA7 63:59 Reserved IA32_MC0_CTL2 MSR to enable/disable CMCI capability for bank 0. (R/W) See Section 15.3.2.5, “IA32_MCi_CTL2 MSRs”. 14:0 Corrected error count threshold. 29:15 Reserved 30 CMCI_EN 63:31 Reserved If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 0 281H 641 IA32_MC1_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 1 282H 642 IA32_MC2_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 2 283H 643 IA32_MC3_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 3 284H 644 IA32_MC4_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 4 285H 645 IA32_MC5_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 5 286H 646 IA32_MC6_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 6 287H 647 IA32_MC7_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 7 288H 648 IA32_MC8_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 8 289H 649 IA32_MC9_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 9 28AH 650 IA32_MC10_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 10 28BH 651 IA32_MC11_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 11 Vol. 4 2-21 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 28CH 652 IA32_MC12_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 12 28DH 653 IA32_MC13_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 13 28EH 654 IA32_MC14_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 14 28FH 655 IA32_MC15_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 15 290H 656 IA32_MC16_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 16 291H 657 IA32_MC17_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 17 292H 658 IA32_MC18_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 18 293H 659 IA32_MC19_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 19 294H 660 IA32_MC20_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 20 295H 661 IA32_MC21_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 21 296H 662 IA32_MC22_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 22 297H 663 IA32_MC23_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 23 298H 664 IA32_MC24_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 24 299H 665 IA32_MC25_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 25 29AH 666 IA32_MC26_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 26 2-22 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 29BH 667 IA32_MC27_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 27 29CH 668 IA32_MC28_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 28 29DH 669 IA32_MC29_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 29 29EH 670 IA32_MC30_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 30 29FH 671 IA32_MC31_CTL2 (R/W) Same fields as IA32_MC0_CTL2. If IA32_MCG_CAP[10] = 1 && IA32_MCG_CAP[7:0] > 31 2FFH 767 IA32_MTRR_DEF_TYPE MTRRdefType (R/W) If CPUID.01H: EDX.MTRR[12] =1 2:0 Default Memory Type 9:3 Reserved 10 Fixed Range MTRR Enable 11 MTRR Enable 63:12 Reserved 309H 777 IA32_FIXED_CTR0 Fixed-Function Performance Counter 0 (R/W): Counts Instr_Retired.Any. If CPUID.0AH: EDX[4:0] > 0 30AH 778 IA32_FIXED_CTR1 Fixed-Function Performance Counter 1 (R/W): Counts CPU_CLK_Unhalted.Core. If CPUID.0AH: EDX[4:0] > 1 30BH 779 IA32_FIXED_CTR2 Fixed-Function Performance Counter 2 (R/W): Counts CPU_CLK_Unhalted.Ref. If CPUID.0AH: EDX[4:0] > 2 345H 837 IA32_PERF_CAPABILITIES Read Only MSR that enumerates the existence of performance monitoring features. (RO) If CPUID.01H: ECX[15] = 1 5:0 LBR format 6 PEBS Trap 7 PEBSSaveArchRegs Vol. 4 2-23 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 38DH Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 909 11:8 PEBS Record Format 12 1: Freeze while SMM is supported. 13 1: Full width of counter writable via IA32_A_PMCx. 14 Reserved 15 1: Performance metrics available. 16 1: PEBS output will be written into the Intel PT trace stream. 63:17 Reserved IA32_FIXED_CTR_CTRL Fixed-Function Performance Counter Control (R/W) If CPUID.0x7.0.EBX[25]=1 If CPUID.0AH: EAX[7:0] > 1 Counter increments while the results of ANDing respective enable bit in IA32_PERF_GLOBAL_CTRL with the corresponding OS or USR bits in this MSR is true. 2-24 Vol. 4 0 EN0_OS: Enable Fixed Counter 0 to count while CPL = 0. 1 EN0_Usr: Enable Fixed Counter 0 to count while CPL > 0. 2 AnyThread: When set to 1, it enables counting the associated event conditions occurring across all logical processors sharing a processor core. When set to 0, the counter only increments the associated event conditions occurring in the logical processor which programmed the MSR. 3 EN0_PMI: Enable PMI when fixed counter 0 overflows. 4 EN1_OS: Enable Fixed Counter 1to count while CPL = 0. 5 EN1_Usr: Enable Fixed Counter 1to count while CPL > 0. 6 AnyThread: When set to 1, it enables counting the associated event conditions occurring across all logical processors sharing a processor core. When set to 0, the counter only increments the associated event conditions occurring in the logical processor which programmed the MSR. 7 EN1_PMI: Enable PMI when fixed counter 1 overflows. 8 EN2_OS: Enable Fixed Counter 2 to count while CPL = 0. If CPUID.0AH: EAX[7:0] > 2 If CPUID.0AH: EAX[7:0] > 2 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 38EH Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 910 9 EN2_Usr: Enable Fixed Counter 2 to count while CPL > 0. 10 AnyThread: When set to 1, it enables counting the associated event conditions occurring across all logical processors sharing a processor core. When set to 0, the counter only increments the associated event conditions occurring in the logical processor which programmed the MSR. If CPUID.0AH: EAX[7:0] > 2 11 EN2_PMI: Enable PMI when fixed counter 2 overflows. 63:12 Reserved IA32_PERF_GLOBAL_STATUS Global Performance Counter Status (RO) If CPUID.0AH: EAX[7:0] > 0 0 Ovf_PMC0: Overflow status of IA32_PMC0. If CPUID.0AH: EAX[15:8] > 0 1 Ovf_PMC1: Overflow status of IA32_PMC1. If CPUID.0AH: EAX[15:8] > 1 2 Ovf_PMC2: Overflow status of IA32_PMC2. If CPUID.0AH: EAX[15:8] > 2 3 Ovf_PMC3: Overflow status of IA32_PMC3. If CPUID.0AH: EAX[15:8] > 3 31:4 Reserved 32 Ovf_FixedCtr0: Overflow status of IA32_FIXED_CTR0. If CPUID.0AH: EAX[7:0] > 1 33 Ovf_FixedCtr1: Overflow status of IA32_FIXED_CTR1. If CPUID.0AH: EAX[7:0] > 1 34 Ovf_FixedCtr2: Overflow status of IA32_FIXED_CTR2. If CPUID.0AH: EAX[7:0] > 1 47:35 Reserved 48 OVF_PERF_METRICS: If this bit is set, it indicates that PERF_METRIC counter has overflowed and a PMI is triggered; however, an overflow of fixed counter 3 should normally happen first. If this bit is clear no overflow occurred. 54:49 Reserved 55 Trace_ToPA_PMI: A PMI occurred due to a ToPA entry memory buffer that was completely filled. 57:56 Reserved 58 LBR_Frz. LBRs are frozen due to: If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) && IA32_RTIT_CTL.ToPA = 1 If CPUID.0AH: EAX[7:0] > 3 • IA32_DEBUGCTL.FREEZE_LBR_ON_PMI=1. • The LBR stack overflowed. Vol. 4 2-25 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment CTR_Frz. Performance counters in the core PMU are frozen due to: If CPUID.0AH: EAX[7:0] > 3 Decimal 59 • IA32_DEBUGCTL.FREEZE_PERFMON_ON_ PMI=1. • One or more core PMU counters overflowed. 38FH 911 60 ASCI: Data in the performance counters in the core PMU may include contributions from the direct or indirect operation Intel SGX to protect an enclave. If CPUID.(EAX=07H, ECX=0):EBX[2] = 1 61 Ovf_Uncore: Uncore counter overflow status. If CPUID.0AH: EAX[7:0] > 2 62 OvfBuf: DS SAVE area Buffer overflow status. If CPUID.0AH: EAX[7:0] > 0 63 CondChgd: Status bits of this register have changed. If CPUID.0AH: EAX[7:0] > 0 Global Performance Counter Control (R/W) If CPUID.0AH: EAX[7:0] > 0 IA32_PERF_GLOBAL_CTRL Counter increments while the result of ANDing the respective enable bit in this MSR with the corresponding OS or USR bits in the general-purpose or fixed counter control MSR is true. 390H 2-26 Vol. 4 912 0 EN_PMC0 If CPUID.0AH: EAX[15:8] > 0 1 EN_PMC1 If CPUID.0AH: EAX[15:8] > 1 2 EN_PMC2 If CPUID.0AH: EAX[15:8] > 2 n EN_PMCn If CPUID.0AH: EAX[15:8] > n 31:n+1 Reserved 32 EN_FIXED_CTR0 If CPUID.0AH: EDX[4:0] > 0 33 EN_FIXED_CTR1 If CPUID.0AH: EDX[4:0] > 1 34 EN_FIXED_CTR2 If CPUID.0AH: EDX[4:0] > 2 47:35 Reserved 48 EN_PERF_METRICS: If this bit is set and fixed counter 3 is effectively enabled, builtin performance metrics are enabled. 63:49 Reserved IA32_PERF_GLOBAL_OVF_CTRL Global Performance Counter Overflow Control (R/W) If CPUID.0AH: EAX[7:0] > 0 && CPUID.0AH: EAX[7:0] <= 3 0 Set 1 to Clear Ovf_PMC0 bit. If CPUID.0AH: EAX[15:8] > 0 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 390H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 912 1 Set 1 to Clear Ovf_PMC1 bit. If CPUID.0AH: EAX[15:8] > 1 2 Set 1 to Clear Ovf_PMC2 bit. If CPUID.0AH: EAX[15:8] > 2 n Set 1 to Clear Ovf_PMCn bit. If CPUID.0AH: EAX[15:8] > n 31:n Reserved 32 Set 1 to Clear Ovf_FIXED_CTR0 bit. If CPUID.0AH: EDX[4:0] > 0 33 Set 1 to Clear Ovf_FIXED_CTR1 bit. If CPUID.0AH: EDX[4:0] > 1 34 Set 1 to Clear Ovf_FIXED_CTR2 bit. If CPUID.0AH: EDX[4:0] > 2 54:35 Reserved 55 Set 1 to Clear Trace_ToPA_PMI bit. 60:56 Reserved 61 Set 1 to Clear Ovf_Uncore bit. 06_2EH 62 Set 1 to Clear OvfBuf bit. If CPUID.0AH: EAX[7:0] > 0 63 Set 1 to clear CondChgd bit. If CPUID.0AH: EAX[7:0] > 0 IA32_PERF_GLOBAL_STATUS_RESET Global Performance Counter Overflow Reset Control (R/W) If CPUID.0AH: EAX[7:0] > 3 0 Set 1 to Clear Ovf_PMC0 bit. If CPUID.0AH: EAX[15:8] > 0 1 Set 1 to Clear Ovf_PMC1 bit. If CPUID.0AH: EAX[15:8] > 1 2 Set 1 to Clear Ovf_PMC2 bit. If CPUID.0AH: EAX[15:8] > 2 n Set 1 to Clear Ovf_PMCn bit. If CPUID.0AH: EAX[15:8] > n 31:n Reserved 32 Set 1 to Clear Ovf_FIXED_CTR0 bit. If CPUID.0AH: EDX[4:0] > 0 33 Set 1 to Clear Ovf_FIXED_CTR1 bit. If CPUID.0AH: EDX[4:0] > 1 34 Set 1 to Clear Ovf_FIXED_CTR2 bit. If CPUID.0AH: EDX[4:0] > 2 47:35 Reserved 48 RESET_OVF_PERF_METRICS: If this bit is set, it will clear the status bit in the IA32_PERF_GLOBAL_STATUS register for the PERF_METRICS counters. 54:49 Reserved 55 Set 1 to Clear Trace_ToPA_PMI bit. If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) && IA32_RTIT_CTL.ToPA = 1 If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) && IA32_RTIT_CTL.ToPA[8] = 1 Vol. 4 2-27 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 391H 392H 2-28 Vol. 4 Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 913 914 57:56 Reserved 58 Set 1 to Clear LBR_Frz bit. If CPUID.0AH: EAX[7:0] > 3 59 Set 1 to Clear CTR_Frz bit. If CPUID.0AH: EAX[7:0] > 3 58 Set 1 to Clear ASCI bit. If CPUID.0AH: EAX[7:0] > 3 61 Set 1 to Clear Ovf_Uncore bit. 06_2EH 62 Set 1 to Clear OvfBuf bit. If CPUID.0AH: EAX[7:0] > 0 63 Set 1 to clear CondChgd bit. If CPUID.0AH: EAX[7:0] > 0 IA32_PERF_GLOBAL_STATUS_SET Global Performance Counter Overflow Set Control (R/W) If CPUID.0AH: EAX[7:0] > 3 0 Set 1 to cause Ovf_PMC0 = 1. If CPUID.0AH: EAX[7:0] > 3 1 Set 1 to cause Ovf_PMC1 = 1. If CPUID.0AH: EAX[15:8] > 1 2 Set 1 to cause Ovf_PMC2 = 1. If CPUID.0AH: EAX[15:8] > 2 n Set 1 to cause Ovf_PMCn = 1. If CPUID.0AH: EAX[15:8] > n 31:n Reserved 32 Set 1 to cause Ovf_FIXED_CTR0 = 1. If CPUID.0AH: EAX[7:0] > 3 33 Set 1 to cause Ovf_FIXED_CTR1 = 1. If CPUID.0AH: EAX[7:0] > 3 34 Set 1 to cause Ovf_FIXED_CTR2 = 1. If CPUID.0AH: EAX[7:0] > 3 47:35 Reserved 48 SET_OVF_PERF_METRICS: If this bit is set, it will set the status bit in the IA32_PERF_GLOBAL_STATUS register for the PERF_METRICS counters. 54:49 Reserved 55 Set 1 to cause Trace_ToPA_PMI = 1. 57:56 Reserved If CPUID.0AH: EAX[7:0] > 3 58 Set 1 to cause LBR_Frz = 1. If CPUID.0AH: EAX[7:0] > 3 59 Set 1 to cause CTR_Frz = 1. If CPUID.0AH: EAX[7:0] > 3 58 Set 1 to cause ASCI = 1. If CPUID.0AH: EAX[7:0] > 3 61 Set 1 to cause Ovf_Uncore = 1. If CPUID.0AH: EAX[7:0] > 3 62 Set 1 to cause OvfBuf = 1. If CPUID.0AH: EAX[7:0] > 3 63 Reserved IA32_PERF_GLOBAL_INUSE Indicator that core perfmon interface is in use. (RO) 0 IA32_PERFEVTSEL0 in use. 1 IA32_PERFEVTSEL1 in use. If CPUID.0AH: EAX[7:0] > 3 If CPUID.0AH: EAX[15:8] > 1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 3F1H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 1009 2 IA32_PERFEVTSEL2 in use. If CPUID.0AH: EAX[15:8] > 2 n IA32_PERFEVTSELn in use. If CPUID.0AH: EAX[15:8] > n 31:n+1 Reserved 32 IA32_FIXED_CTR0 in use. 33 IA32_FIXED_CTR1 in use. 34 IA32_FIXED_CTR2 in use. 62:35 Reserved or model specific. 63 PMI in use. IA32_PEBS_ENABLE PEBS Control (R/W) 0 Enable PEBS on IA32_PMC0. 3:1 Reserved or model specific. 31:4 Reserved 35:32 Reserved or model specific. 63:36 Reserved 06_0FH 400H 1024 IA32_MC0_CTL MC0_CTL If IA32_MCG_CAP.CNT >0 401H 1025 IA32_MC0_STATUS MC0_STATUS If IA32_MCG_CAP.CNT >0 402H 1026 IA32_MC0_ADDR1 MC0_ADDR If IA32_MCG_CAP.CNT >0 403H 1027 IA32_MC0_MISC MC0_MISC If IA32_MCG_CAP.CNT >0 404H 1028 IA32_MC1_CTL MC1_CTL If IA32_MCG_CAP.CNT >1 405H 1029 IA32_MC1_STATUS MC1_STATUS If IA32_MCG_CAP.CNT >1 406H 1030 IA32_MC1_ADDR2 MC1_ADDR If IA32_MCG_CAP.CNT >1 407H 1031 IA32_MC1_MISC MC1_MISC If IA32_MCG_CAP.CNT >1 408H 1032 IA32_MC2_CTL MC2_CTL If IA32_MCG_CAP.CNT >2 409H 1033 IA32_MC2_STATUS MC2_STATUS If IA32_MCG_CAP.CNT >2 40AH 1034 IA32_MC2_ADDR1 MC2_ADDR If IA32_MCG_CAP.CNT >2 40BH 1035 IA32_MC2_MISC MC2_MISC If IA32_MCG_CAP.CNT >2 40CH 1036 IA32_MC3_CTL MC3_CTL If IA32_MCG_CAP.CNT >3 40DH 1037 IA32_MC3_STATUS MC3_STATUS If IA32_MCG_CAP.CNT >3 40EH 1038 IA32_MC3_ADDR1 MC3_ADDR If IA32_MCG_CAP.CNT >3 40FH 1039 IA32_MC3_MISC MC3_MISC If IA32_MCG_CAP.CNT >3 410H 1040 IA32_MC4_CTL MC4_CTL If IA32_MCG_CAP.CNT >4 411H 1041 IA32_MC4_STATUS MC4_STATUS If IA32_MCG_CAP.CNT >4 412H 1042 IA32_MC4_ADDR1 MC4_ADDR If IA32_MCG_CAP.CNT >4 413H 1043 IA32_MC4_MISC MC4_MISC If IA32_MCG_CAP.CNT >4 414H 1044 IA32_MC5_CTL MC5_CTL If IA32_MCG_CAP.CNT >5 Vol. 4 2-29 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 415H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 1045 IA32_MC5_STATUS 1 MC5_STATUS If IA32_MCG_CAP.CNT >5 416H 1046 IA32_MC5_ADDR MC5_ADDR If IA32_MCG_CAP.CNT >5 417H 1047 IA32_MC5_MISC MC5_MISC If IA32_MCG_CAP.CNT >5 418H 1048 IA32_MC6_CTL MC6_CTL If IA32_MCG_CAP.CNT >6 419H 1049 IA32_MC6_STATUS MC6_STATUS If IA32_MCG_CAP.CNT >6 41AH 1050 IA32_MC6_ADDR1 MC6_ADDR If IA32_MCG_CAP.CNT >6 41BH 1051 IA32_MC6_MISC MC6_MISC If IA32_MCG_CAP.CNT >6 41CH 1052 IA32_MC7_CTL MC7_CTL If IA32_MCG_CAP.CNT >7 41DH 1053 IA32_MC7_STATUS MC7_STATUS If IA32_MCG_CAP.CNT >7 41EH 1054 IA32_MC7_ADDR1 MC7_ADDR If IA32_MCG_CAP.CNT >7 41FH 1055 IA32_MC7_MISC MC7_MISC If IA32_MCG_CAP.CNT >7 420H 1056 IA32_MC8_CTL MC8_CTL If IA32_MCG_CAP.CNT >8 421H 1057 IA32_MC8_STATUS MC8_STATUS If IA32_MCG_CAP.CNT >8 422H 1058 IA32_MC8_ADDR1 MC8_ADDR If IA32_MCG_CAP.CNT >8 423H 1059 IA32_MC8_MISC MC8_MISC If IA32_MCG_CAP.CNT >8 424H 1060 IA32_MC9_CTL MC9_CTL If IA32_MCG_CAP.CNT >9 425H 1061 IA32_MC9_STATUS MC9_STATUS If IA32_MCG_CAP.CNT >9 426H 1062 IA32_MC9_ADDR1 MC9_ADDR If IA32_MCG_CAP.CNT >9 427H 1063 IA32_MC9_MISC MC9_MISC If IA32_MCG_CAP.CNT >9 428H 1064 IA32_MC10_CTL MC10_CTL If IA32_MCG_CAP.CNT >10 429H 1065 IA32_MC10_STATUS MC10_STATUS If IA32_MCG_CAP.CNT >10 42AH 1066 IA32_MC10_ADDR1 MC10_ADDR If IA32_MCG_CAP.CNT >10 42BH 1067 IA32_MC10_MISC MC10_MISC If IA32_MCG_CAP.CNT >10 42CH 1068 IA32_MC11_CTL MC11_CTL If IA32_MCG_CAP.CNT >11 42DH 1069 IA32_MC11_STATUS MC11_STATUS If IA32_MCG_CAP.CNT >11 42EH 1070 IA32_MC11_ADDR1 MC11_ADDR If IA32_MCG_CAP.CNT >11 42FH 1071 IA32_MC11_MISC MC11_MISC If IA32_MCG_CAP.CNT >11 430H 1072 IA32_MC12_CTL MC12_CTL If IA32_MCG_CAP.CNT >12 431H 1073 IA32_MC12_STATUS MC12_STATUS If IA32_MCG_CAP.CNT >12 432H 1074 IA32_MC12_ADDR1 MC12_ADDR If IA32_MCG_CAP.CNT >12 433H 1075 IA32_MC12_MISC MC12_MISC If IA32_MCG_CAP.CNT >12 434H 1076 IA32_MC13_CTL MC13_CTL If IA32_MCG_CAP.CNT >13 435H 1077 IA32_MC13_STATUS MC13_STATUS If IA32_MCG_CAP.CNT >13 436H 1078 IA32_MC13_ADDR1 MC13_ADDR If IA32_MCG_CAP.CNT >13 437H 1079 IA32_MC13_MISC MC13_MISC If IA32_MCG_CAP.CNT >13 438H 1080 IA32_MC14_CTL MC14_CTL If IA32_MCG_CAP.CNT >14 439H 1081 IA32_MC14_STATUS MC14_STATUS If IA32_MCG_CAP.CNT >14 2-30 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 43AH 1082 IA32_MC14_ADDR1 MC14_ADDR If IA32_MCG_CAP.CNT >14 43BH 1083 IA32_MC14_MISC MC14_MISC If IA32_MCG_CAP.CNT >14 43CH 1084 IA32_MC15_CTL MC15_CTL If IA32_MCG_CAP.CNT >15 43DH 1085 IA32_MC15_STATUS MC15_STATUS If IA32_MCG_CAP.CNT >15 43EH 1086 IA32_MC15_ADDR1 MC15_ADDR If IA32_MCG_CAP.CNT >15 43FH 1087 IA32_MC15_MISC MC15_MISC If IA32_MCG_CAP.CNT >15 440H 1088 IA32_MC16_CTL MC16_CTL If IA32_MCG_CAP.CNT >16 441H 1089 IA32_MC16_STATUS MC16_STATUS If IA32_MCG_CAP.CNT >16 442H 1090 IA32_MC16_ADDR1 MC16_ADDR If IA32_MCG_CAP.CNT >16 443H 1091 IA32_MC16_MISC MC16_MISC If IA32_MCG_CAP.CNT >16 444H 1092 IA32_MC17_CTL MC17_CTL If IA32_MCG_CAP.CNT >17 445H 1093 IA32_MC17_STATUS MC17_STATUS If IA32_MCG_CAP.CNT >17 446H 1094 IA32_MC17_ADDR1 MC17_ADDR If IA32_MCG_CAP.CNT >17 447H 1095 IA32_MC17_MISC MC17_MISC If IA32_MCG_CAP.CNT >17 448H 1096 IA32_MC18_CTL MC18_CTL If IA32_MCG_CAP.CNT >18 449H 1097 IA32_MC18_STATUS MC18_STATUS If IA32_MCG_CAP.CNT >18 44AH 1098 IA32_MC18_ADDR1 MC18_ADDR If IA32_MCG_CAP.CNT >18 44BH 1099 IA32_MC18_MISC MC18_MISC If IA32_MCG_CAP.CNT >18 44CH 1100 IA32_MC19_CTL MC19_CTL If IA32_MCG_CAP.CNT >19 44DH 1101 IA32_MC19_STATUS MC19_STATUS If IA32_MCG_CAP.CNT >19 44EH 1102 IA32_MC19_ADDR1 MC19_ADDR If IA32_MCG_CAP.CNT >19 44FH 1103 IA32_MC19_MISC MC19_MISC If IA32_MCG_CAP.CNT >19 450H 1104 IA32_MC20_CTL MC20_CTL If IA32_MCG_CAP.CNT >20 451H 1105 IA32_MC20_STATUS MC20_STATUS If IA32_MCG_CAP.CNT >20 452H 1106 IA32_MC20_ADDR1 MC20_ADDR If IA32_MCG_CAP.CNT >20 453H 1107 IA32_MC20_MISC MC20_MISC If IA32_MCG_CAP.CNT >20 454H 1108 IA32_MC21_CTL MC21_CTL If IA32_MCG_CAP.CNT >21 455H 1109 IA32_MC21_STATUS MC21_STATUS If IA32_MCG_CAP.CNT >21 456H 1110 IA32_MC21_ADDR1 MC21_ADDR If IA32_MCG_CAP.CNT >21 457H 1111 IA32_MC21_MISC MC21_MISC If IA32_MCG_CAP.CNT >21 458H 1112 IA32_MC22_CTL MC22_CTL If IA32_MCG_CAP.CNT >22 459H 1113 IA32_MC22_STATUS MC22_STATUS If IA32_MCG_CAP.CNT >22 45AH 1114 IA32_MC22_ADDR1 MC22_ADDR If IA32_MCG_CAP.CNT >22 45BH 1115 IA32_MC22_MISC MC22_MISC If IA32_MCG_CAP.CNT >22 45CH 1116 IA32_MC23_CTL MC23_CTL If IA32_MCG_CAP.CNT >23 45DH 1117 IA32_MC23_STATUS MC23_STATUS If IA32_MCG_CAP.CNT >23 1118 IA32_MC23_ADDR1 MC23_ADDR If IA32_MCG_CAP.CNT >23 45EH Vol. 4 2-31 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 45FH 1119 IA32_MC23_MISC MC23_MISC If IA32_MCG_CAP.CNT >23 460H 1120 IA32_MC24_CTL MC24_CTL If IA32_MCG_CAP.CNT >24 461H 1121 IA32_MC24_STATUS MC24_STATUS If IA32_MCG_CAP.CNT >24 1 462H 1122 IA32_MC24_ADDR MC24_ADDR If IA32_MCG_CAP.CNT >24 463H 1123 IA32_MC24_MISC MC24_MISC If IA32_MCG_CAP.CNT >24 464H 1124 IA32_MC25_CTL MC25_CTL If IA32_MCG_CAP.CNT >25 465H 1125 IA32_MC25_STATUS MC25_STATUS If IA32_MCG_CAP.CNT >25 466H 1126 IA32_MC25_ADDR1 MC25_ADDR If IA32_MCG_CAP.CNT >25 467H 1127 IA32_MC25_MISC MC25_MISC If IA32_MCG_CAP.CNT >25 468H 1128 IA32_MC26_CTL MC26_CTL If IA32_MCG_CAP.CNT >26 469H 1129 IA32_MC26_STATUS MC26_STATUS If IA32_MCG_CAP.CNT >26 46AH 1130 IA32_MC26_ADDR1 MC26_ADDR If IA32_MCG_CAP.CNT >26 46BH 1131 IA32_MC26_MISC MC26_MISC If IA32_MCG_CAP.CNT >26 46CH 1132 IA32_MC27_CTL MC27_CTL If IA32_MCG_CAP.CNT >27 46DH 1133 IA32_MC27_STATUS MC27_STATUS If IA32_MCG_CAP.CNT >27 46EH 1134 IA32_MC27_ADDR1 MC27_ADDR If IA32_MCG_CAP.CNT >27 46FH 1135 IA32_MC27_MISC MC27_MISC If IA32_MCG_CAP.CNT >27 470H 1136 IA32_MC28_CTL MC28_CTL If IA32_MCG_CAP.CNT >28 471H 1137 IA32_MC28_STATUS MC28_STATUS If IA32_MCG_CAP.CNT >28 472H 1138 IA32_MC28_ADDR1 MC28_ADDR If IA32_MCG_CAP.CNT >28 473H 1139 IA32_MC28_MISC MC28_MISC If IA32_MCG_CAP.CNT >28 480H 1152 IA32_VMX_BASIC Reporting Register of Basic VMX Capabilities (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.3.1, “Pin-Based VMExecution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS Capability Reporting Register of Primary Processor-Based VM-Execution Controls (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.3.2, “Primary ProcessorBased VM-Execution Controls.” 483H 1155 IA32_VMX_EXIT_CTLS Capability Reporting Register of VM-Exit Controls (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.4, “VM-Exit Controls.” 484H 1156 IA32_VMX_ENTRY_CTLS Capability Reporting Register of VM-Entry Controls (R/O) See Appendix A.5, “VM-Entry Controls.” 2-32 Vol. 4 If CPUID.01H:ECX.[5] = 1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 485H 1157 IA32_VMX_MISC 486H 1158 IA32_VMX_CR0_FIXED0 Reporting Register of Miscellaneous VMX Capabilities (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.6, “Miscellaneous Data.” Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.8, “VMX-Fixed Bits in CR4.” 489H 1161 IA32_VMX_CR4_FIXED1 Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Capability Reporting Register of VMCS Field Enumeration (R/O) If CPUID.01H:ECX.[5] = 1 See Appendix A.9, “VMCS Enumeration.” 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Capability Reporting Register of Secondary Processor-Based VM-Execution Controls (R/O) If ( CPUID.01H:ECX.[5] && IA32_VMX_PROCBASED_C TLS[63]) See Appendix A.3.3, “Secondary ProcessorBased VM-Execution Controls.” 48CH 48DH 1164 1165 IA32_VMX_EPT_VPID_CAP IA32_VMX_TRUE_PINBASED_CTLS Capability Reporting Register of EPT and VPID (R/O) See Appendix A.10, “VPID and EPT Capabilities.” If ( CPUID.01H:ECX.[5] && IA32_VMX_PROCBASED_C TLS[63] && ( IA32_VMX_PROCBASED_C TLS2[33] || IA32_VMX_PROCBASED_C TLS2[37]) ) Capability Reporting Register of Pin-Based VM-Execution Flex Controls (R/O) If ( CPUID.01H:ECX.[5] = 1 && IA32_VMX_BASIC[55] ) See Appendix A.3.1, “Pin-Based VMExecution Controls.” 48EH 1166 IA32_VMX_TRUE_PROCBASED_CTLS Capability Reporting Register of Primary Processor-Based VM-Execution Flex Controls (R/O) If( CPUID.01H:ECX.[5] = 1 && IA32_VMX_BASIC[55] ) See Appendix A.3.2, “Primary ProcessorBased VM-Execution Controls.” 48FH 1167 IA32_VMX_TRUE_EXIT_CTLS Capability Reporting Register of VM-Exit Flex Controls (R/O) If( CPUID.01H:ECX.[5] = 1 && IA32_VMX_BASIC[55] ) See Appendix A.4, “VM-Exit Controls.” 490H 1168 IA32_VMX_TRUE_ENTRY_CTLS Capability Reporting Register of VM-Entry Flex Controls (R/O) If( CPUID.01H:ECX.[5] = 1 && IA32_VMX_BASIC[55] ) See Appendix A.5, “VM-Entry Controls.” Vol. 4 2-33 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 491H 1169 IA32_VMX_VMFUNC Capability Reporting Register of VMFunction Controls (R/O) If( CPUID.01H:ECX.[5] = 1 && IA32_VMX_BASIC[55] ) 4C1H 1217 IA32_A_PMC0 Full Width Writable IA32_PMC0 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 0) && IA32_PERF_CAPABILITIES[ 13] = 1 4C2H 1218 IA32_A_PMC1 Full Width Writable IA32_PMC1 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 1) && IA32_PERF_CAPABILITIES[ 13] = 1 4C3H 1219 IA32_A_PMC2 Full Width Writable IA32_PMC2 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 2) && IA32_PERF_CAPABILITIES[ 13] = 1 4C4H 1220 IA32_A_PMC3 Full Width Writable IA32_PMC3 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 3) && IA32_PERF_CAPABILITIES[ 13] = 1 4C5H 1221 IA32_A_PMC4 Full Width Writable IA32_PMC4 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 4) && IA32_PERF_CAPABILITIES[ 13] = 1 4C6H 1222 IA32_A_PMC5 Full Width Writable IA32_PMC5 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 5) && IA32_PERF_CAPABILITIES[ 13] = 1 4C7H 1223 IA32_A_PMC6 Full Width Writable IA32_PMC6 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 6) && IA32_PERF_CAPABILITIES[ 13] = 1 4C8H 1224 IA32_A_PMC7 Full Width Writable IA32_PMC7 Alias (R/W) (If CPUID.0AH: EAX[15:8] > 7) && IA32_PERF_CAPABILITIES[ 13] = 1 4D0H 1232 IA32_MCG_EXT_CTL Allows software to signal some MCEs to only a single logical processor in the system. (R/W) If IA32_MCG_CAP.LMCE_P =1 See Section 15.3.1.4, “IA32_MCG_EXT_CTL MSR”. 500H 2-34 Vol. 4 1280 0 LMCE_EN 63:1 Reserved IA32_SGX_SVN_STATUS Status and SVN Threshold of SGX Support for ACM (RO). If CPUID.(EAX=07H, ECX=0H): EBX[2] = 1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 560H Architectural MSR Name / Bit Fields (Former MSR Name) 570H Comment Decimal 1376 0 Lock 15:1 Reserved 23:16 SGX_SVN_SINIT 63:24 Reserved IA32_RTIT_OUTPUT_BASE Trace Output Base Register (R/W) 6:0 Reserved 3-1:7 561H MSR/Bit Description 1377 1392 MAXPHYADDR Base physical address. 63:MAXPHYADDR Reserved IA32_RTIT_OUTPUT_MASK_PTRS Trace Output Mask Pointers Register (R/W) 6:0 Reserved 31:7 MaskOrTableOffset 63:32 Output Offset IA32_RTIT_CTL Trace Control Register (R/W) 0 TraceEn 1 CYCEn 2 OS 3 User 4 PwrEvtEn 5 FUPonPTW 6 FabricEn 7 CR3 filter 8 ToPA 9 MTCEn See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. If ((CPUID.(EAX=07H, ECX=0):EBX[25] = 1) && ( (CPUID.(EAX=14H,ECX=0): ECX[0] = 1) || (CPUID.(EAX=14H,ECX=0): ECX[2] = 1) ) ) If ((CPUID.(EAX=07H, ECX=0):EBX[25] = 1) && ( (CPUID.(EAX=14H,ECX=0): ECX[0] = 1) || (CPUID.(EAX=14H,ECX=0): ECX[2] = 1) ) ) If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) If (CPUID.(EAX=07H, ECX=0):EBX[1] = 1) If (CPUID.(EAX=07H, ECX=0):ECX[3] = 1) If (CPUID.(EAX=07H, ECX=0):EBX[3] = 1) Vol. 4 2-35 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 571H 572H 580H 2-36 Vol. 4 Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 1393 1394 1408 10 TSCEn 11 DisRETC 12 PTWEn 13 BranchEn 17:14 MTCFreq 18 Reserved, must be zero. 22:19 CYCThresh 23 Reserved, must be zero. 27:24 PSBFreq 31:28 Reserved, must be zero. 35:32 ADDR0_CFG If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 0) 39:36 ADDR1_CFG If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 1) 43:40 ADDR2_CFG If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 2) 47:44 ADDR3_CFG If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 3) 63:48 Reserved, must be zero. IA32_RTIT_STATUS Tracing Status Register (R/W) If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) 0 FilterEn (writes ignored) If (CPUID.(EAX=07H, ECX=0):EBX[2] = 1) 1 ContexEn (writes ignored) 2 TriggerEn (writes ignored) 3 Reserved 4 Error 5 Stopped 31:6 Reserved, must be zero. 48:32 PacketByteCnt 63:49 Reserved IA32_RTIT_CR3_MATCH Trace Filter CR3 Match Register (R/W) 4:0 Reserved 63:5 CR3[63:5] value to match. IA32_RTIT_ADDR0_A Region 0 Start Address (R/W) If (CPUID.(EAX=07H, ECX=0):EBX[3] = 1) If (CPUID.(EAX=07H, ECX=0):EBX[1] = 1) If (CPUID.(EAX=07H, ECX=0):EBX[1] = 1) If (CPUID.(EAX=07H, ECX=0):EBX[1] > 3) If (CPUID.(EAX=07H, ECX=0):EBX[25] = 1) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 0) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex 581H 582H 583H 584H 585H 586H 587H 600H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 1409 1410 1411 1412 1413 1414 1415 1536 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR0_B Region 0 End Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR1_A Region 1 Start Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR1_B Region 1 End Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR2_A Region 2 Start Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR2_B Region 2 End Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR3_A Region 3 Start Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_RTIT_ADDR3_B Region 3 End Address (R/W) 47:0 Virtual Address 63:48 SignExt_VA IA32_DS_AREA DS Save Area (R/W) Points to the linear address of the first byte of the DS buffer management area, which is used to manage the BTS and PEBS buffers. If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 0) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 1) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 1) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 2) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 2) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 3) If (CPUID.(EAX=07H, ECX=1):EAX[2:0] > 3) If( CPUID.01H:EDX.DS[21] =1 See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 63:0 The linear address of the first byte of the DS buffer management area, if IA-32e mode is active. Vol. 4 2-37 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 31:0 The linear address of the first byte of the DS buffer management area, if not in IA32e mode. 63:32 Reserved if not in IA-32e mode. 6E0H 1760 IA32_TSC_DEADLINE TSC Target of Local APIC’s TSC Deadline Mode (R/W) If CPUID.01H:ECX.[24] = 1 770H 1904 IA32_PM_ENABLE Enable/disable HWP (R/W) If CPUID.06H:EAX.[7] = 1 0 HWP_ENABLE (R/W1-Once) If CPUID.06H:EAX.[7] = 1 See Section 14.4.2, “Enabling HWP”. 771H 1905 63:1 Reserved IA32_HWP_CAPABILITIES HWP Performance Range Enumeration (RO) If CPUID.06H:EAX.[7] = 1 7:0 Highest_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 15:8 Guaranteed_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 23:16 Most_Efficient_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 31:24 Lowest_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 772H 1906 63:32 Reserved IA32_HWP_REQUEST_PKG Power Management Control Hints for All Logical Processors in a Package (R/W) If CPUID.06H:EAX.[11] = 1 Minimum_Performance If CPUID.06H:EAX.[11] = 1 7:0 See Section 14.4.4, “Managing HWP”. 15:8 Maximum_Performance If CPUID.06H:EAX.[11] = 1 See Section 14.4.4, “Managing HWP”. 23:16 Desired_Performance If CPUID.06H:EAX.[11] = 1 See Section 14.4.4, “Managing HWP”. 31:24 Energy_Performance_Preference See Section 14.4.4, “Managing HWP”. 41:32 Activity_Window See Section 14.4.4, “Managing HWP”. 773H 2-38 Vol. 4 1907 63:42 Reserved IA32_HWP_INTERRUPT Control HWP Native Interrupts (R/W) If CPUID.06H:EAX.[11] = 1 && CPUID.06H:EAX.[10] = 1 If CPUID.06H:EAX.[11] = 1 && CPUID.06H:EAX.[9] = 1 If CPUID.06H:EAX.[8] = 1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 0 EN_Guaranteed_Performance_Change If CPUID.06H:EAX.[8] = 1 See Section 14.4.6, “HWP Notifications”. 1 EN_Excursion_Minimum If CPUID.06H:EAX.[8] = 1 See Section 14.4.6, “HWP Notifications”. 774H 1908 63:2 Reserved IA32_HWP_REQUEST Power Management Control Hints to a Logical Processor (R/W) If CPUID.06H:EAX.[7] = 1 Minimum_Performance If CPUID.06H:EAX.[7] = 1 7:0 See Section 14.4.4, “Managing HWP”. 15:8 Maximum_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.4, “Managing HWP”. 23:16 Desired_Performance If CPUID.06H:EAX.[7] = 1 See Section 14.4.4, “Managing HWP”. 31:24 Energy_Performance_Preference See Section 14.4.4, “Managing HWP”. 41:32 Activity_Window See Section 14.4.4, “Managing HWP”. 42 Package_Control See Section 14.4.4, “Managing HWP”. 777H 1911 If CPUID.06H:EAX.[7] = 1 && CPUID.06H:EAX.[10] = 1 If CPUID.06H:EAX.[7] = 1 && CPUID.06H:EAX.[9] = 1 If CPUID.06H:EAX.[7] = 1 && CPUID.06H:EAX.[11] = 1 63:43 Reserved IA32_HWP_STATUS Log bits indicating changes to Guaranteed & excursions to Minimum (R/W) If CPUID.06H:EAX.[7] = 1 0 Guaranteed_Performance_Change (R/WC0) If CPUID.06H:EAX.[7] = 1 See Section 14.4.5, “HWP Feedback”. 1 Reserved 2 Excursion_To_Minimum (R/WC0) If CPUID.06H:EAX.[7] = 1 See Section 14.4.5, “HWP Feedback”. 802H 2050 63:3 Reserved IA32_X2APIC_APICID x2APIC ID Register (R/O) See x2APIC Specification. If CPUID.01H:ECX[21] = 1 && IA32_APIC_BASE.[10] =1 803H 2051 IA32_X2APIC_VERSION x2APIC Version Register (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 808H 2056 IA32_X2APIC_TPR x2APIC Task Priority Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 80AH 2058 IA32_X2APIC_PPR x2APIC Processor Priority Register (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 Vol. 4 2-39 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 80BH 2059 IA32_X2APIC_EOI x2APIC EOI Register (W/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 80DH 2061 IA32_X2APIC_LDR x2APIC Logical Destination Register (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 80FH 2063 IA32_X2APIC_SIVR x2APIC Spurious Interrupt Vector Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 810H 2064 IA32_X2APIC_ISR0 x2APIC In-Service Register Bits 31:0 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 811H 2065 IA32_X2APIC_ISR1 x2APIC In-Service Register Bits 63:32 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 812H 2066 IA32_X2APIC_ISR2 x2APIC In-Service Register Bits 95:64 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 813H 2067 IA32_X2APIC_ISR3 x2APIC In-Service Register Bits 127:96 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 814H 2068 IA32_X2APIC_ISR4 x2APIC In-Service Register Bits 159:128 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 815H 2069 IA32_X2APIC_ISR5 x2APIC In-Service Register Bits 191:160 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 816H 2070 IA32_X2APIC_ISR6 x2APIC In-Service Register Bits 223:192 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 817H 2071 IA32_X2APIC_ISR7 x2APIC In-Service Register Bits 255:224 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 818H 2072 IA32_X2APIC_TMR0 x2APIC Trigger Mode Register Bits 31:0 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 819H 2073 IA32_X2APIC_TMR1 x2APIC Trigger Mode Register Bits 63:32 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 81AH 2074 IA32_X2APIC_TMR2 x2APIC Trigger Mode Register Bits 95:64 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 81BH 2075 IA32_X2APIC_TMR3 x2APIC Trigger Mode Register Bits 127:96 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 2-40 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 81CH 2076 IA32_X2APIC_TMR4 x2APIC Trigger Mode Register Bits 159:128 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 81DH 2077 IA32_X2APIC_TMR5 x2APIC Trigger Mode Register Bits 191:160 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 81EH 2078 IA32_X2APIC_TMR6 x2APIC Trigger Mode Register Bits 223:192 (R/O) If ( CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] = 1) 81FH 2079 IA32_X2APIC_TMR7 x2APIC Trigger Mode Register Bits 255:224 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 820H 2080 IA32_X2APIC_IRR0 x2APIC Interrupt Request Register Bits 31:0 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 821H 2081 IA32_X2APIC_IRR1 x2APIC Interrupt Request Register Bits 63:32 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 822H 2082 IA32_X2APIC_IRR2 x2APIC Interrupt Request Register Bits 95:64 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 823H 2083 IA32_X2APIC_IRR3 x2APIC Interrupt Request Register Bits 127:96 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 824H 2084 IA32_X2APIC_IRR4 x2APIC Interrupt Request Register Bits 159:128 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 825H 2085 IA32_X2APIC_IRR5 x2APIC Interrupt Request Register Bits 191:160 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 826H 2086 IA32_X2APIC_IRR6 x2APIC Interrupt Request Register Bits 223:192 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 827H 2087 IA32_X2APIC_IRR7 x2APIC Interrupt Request Register Bits 255:224 (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 828H 2088 IA32_X2APIC_ESR x2APIC Error Status Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 82FH 2095 IA32_X2APIC_LVT_CMCI x2APIC LVT Corrected Machine Check Interrupt Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 830H 2096 IA32_X2APIC_ICR x2APIC Interrupt Command Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 Vol. 4 2-41 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 832H 2098 IA32_X2APIC_LVT_TIMER x2APIC LVT Timer Interrupt Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 833H 2099 IA32_X2APIC_LVT_THERMAL x2APIC LVT Thermal Sensor Interrupt Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 834H 2100 IA32_X2APIC_LVT_PMI x2APIC LVT Performance Monitor Interrupt Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 835H 2101 IA32_X2APIC_LVT_LINT0 x2APIC LVT LINT0 Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 836H 2102 IA32_X2APIC_LVT_LINT1 x2APIC LVT LINT1 Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 837H 2103 IA32_X2APIC_LVT_ERROR x2APIC LVT Error Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 838H 2104 IA32_X2APIC_INIT_COUNT x2APIC Initial Count Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 839H 2105 IA32_X2APIC_CUR_COUNT x2APIC Current Count Register (R/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 83EH 2110 IA32_X2APIC_DIV_CONF x2APIC Divide Configuration Register (R/W) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 83FH 2111 IA32_X2APIC_SELF_IPI x2APIC Self IPI Register (W/O) If CPUID.01H:ECX.[21] = 1 && IA32_APIC_BASE.[10] =1 C80H 3200 IA32_DEBUG_INTERFACE Silicon Debug Feature Control (R/W) If CPUID.01H:ECX.[11] = 1 0 Enable (R/W) If CPUID.01H:ECX.[11] = 1 BIOS set 1 to enable Silicon debug features. Default is 0. C81H 2-42 Vol. 4 3201 29:1 Reserved 30 Lock (R/W): If 1, locks any further change to the MSR. The lock bit is set automatically on the first SMI assertion even if not explicitly set by BIOS. Default is 0. If CPUID.01H:ECX.[11] = 1 31 Debug Occurred (R/O): This “sticky bit” is set by hardware to indicate the status of bit 0. Default is 0. If CPUID.01H:ECX.[11] = 1 63:32 Reserved IA32_L3_QOS_CFG L3 QOS Configuration (R/W) If ( CPUID.(EAX=10H, ECX=1):ECX.[2] = 1 ) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 0 Enable (R/W) Set 1 to enable L3 CAT masks and COS to operate in Code and Data Prioritization (CDP) mode. C82H 3202 63:1 Reserved. Attempts to write to reserved bits result in a #GP(0). IA32_L2_QOS_CFG L2 QOS Configuration (R/W) 0 Enable (R/W) If ( CPUID.(EAX=10H, ECX=2):ECX.[2] = 1 ) Set 1 to enable L2 CAT masks and COS to operate in Code and Data Prioritization (CDP) mode. C8DH C8EH C8FH 3213 3214 3215 63:1 Reserved. Attempts to write to reserved bits result in a #GP(0). IA32_QM_EVTSEL Monitoring Event Select Register (R/W) 7:0 Event ID: ID of a supported monitoring event to report via IA32_QM_CTR. 31: 8 Reserved N+31:32 Resource Monitoring ID: ID for monitoring hardware to report monitored data via IA32_QM_CTR. If ( CPUID.(EAX=07H, ECX=0):EBX.[12] = 1 ) N = Ceil (Log2 ( CPUID.(EAX= 0FH, ECX=0H).EBX[31:0] +1)) 63:N+32 Reserved IA32_QM_CTR Monitoring Counter Register (R/O) 61:0 Resource Monitored Data 62 Unavailable: If 1, indicates data for this RMID is not available or not monitored for this resource or RMID. 63 Error: If 1, indicates an unsupported RMID or event type was written to IA32_PQR_QM_EVTSEL. IA32_PQR_ASSOC Resource Association Register (R/W) If ( (CPUID.(EAX=07H, ECX=0):EBX[12] =1) or (CPUID.(EAX=07H, ECX=0):EBX[15] =1 ) ) N-1:0 Resource Monitoring ID (R/W): ID for monitoring hardware to track internal operation, e.g., memory access. N = Ceil (Log2 ( CPUID.(EAX= 0FH, ECX=0H).EBX[31:0] +1)) 31:N Reserved 63:32 COS (R/W): The class of service (COS) to enforce (on writes); returns the current COS when read. If ( CPUID.(EAX=07H, ECX=0):EBX.[12] = 1 ) If ( CPUID.(EAX=07H, ECX=0):EBX.[15] = 1 ) Vol. 4 2-43 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex C90H+ n 3216 Reserved MSR Address Space for CAT Mask Registers See Section 17.19.4.1, “Enumeration and Detection Support of Cache Allocation Technology”. IA32_L3_MASK_0 L3 CAT Mask for COS0 (R/W) 31:0 Capacity Bit Mask (R/W) 63:32 Reserved 3216+n IA32_L3_MASK_n D10H D4FH D10H 3344 DA0H 2-44 Vol. 4 3472 3488 L3 CAT Mask for COSn (R/W) 31:0 Capacity Bit Mask (R/W) 63:32 Reserved Reserved MSR Address Space for L2 CAT Mask Registers See Section 17.19.4.1, “Enumeration and Detection Support of Cache Allocation Technology”. IA32_L2_MASK_0 L2 CAT Mask for COS0 (R/W) 31:0 Capacity Bit Mask (R/W) 63:32 Reserved D10H+ 3344+n IA32_L2_MASK_n n D90H MSR/Bit Description Comment Decimal C90H D8FH C90H Architectural MSR Name / Bit Fields (Former MSR Name) L2 CAT Mask for COSn (R/W) 31:0 Capacity Bit Mask (R/W) 63:32 Reserved IA32_BNDCFGS Supervisor State of MPX Configuration (R/W) 0 EN: Enable Intel MPX in supervisor mode. 1 BNDPRESERVE: Preserve the bounds registers for near branch instructions in the absence of the BND prefix. 11:2 Reserved, must be zero. 63:12 Base Address of Bound Directory. IA32_XSS Extended Supervisor State Mask (R/W) 7:0 Reserved 8 Trace Packet Configuration State (R/W) 63:9 Reserved. If (CPUID.(EAX=10H, ECX=0H):EBX[1] != 0) n = CPUID.(EAX=10H, ECX=1H):EDX[15:0] If (CPUID.(EAX=10H, ECX=0H):EBX[2] != 0) n = CPUID.(EAX=10H, ECX=2H):EDX[15:0] If (CPUID.(EAX=07H, ECX=0H):EBX[14] = 1) If( CPUID.(0DH, 1):EAX.[3] =1 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex DB0H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Comment Decimal 3504 IA32_PKG_HDC_CTL Package Level Enable/disable HDC (R/W) If CPUID.06H:EAX.[13] = 1 0 HDC_Pkg_Enable (R/W) If CPUID.06H:EAX.[13] = 1 Force HDC idling or wake up HDC-idled logical processors in the package. See Section 14.5.2, “Package level Enabling HDC”. DB1H 3505 63:1 Reserved IA32_PM_CTL1 Enable/disable HWP (R/W) If CPUID.06H:EAX.[13] = 1 0 HDC_Allow_Block (R/W) If CPUID.06H:EAX.[13] = 1 Allow/Block this logical processor for package level HDC control. See Section 14.5.3. DB2H 3506 63:1 Reserved IA32_THREAD_STALL Per-Logical_Processor HDC Idle Residency (R/0) If CPUID.06H:EAX.[13] = 1 Stall_Cycle_Cnt (R/W) If CPUID.06H:EAX.[13] = 1 63:0 Stalled cycles due to HDC forced idle on this logical processor. See Section 14.5.4.1. 4000_ 0000H 4000_ 00FFH Reserved MSR Address Space All existing and future processors will not implement MSRs in this range. C000_ 0080H IA32_EFER Extended Feature Enables 0 SYSCALL Enable: IA32_EFER.SCE (R/W) If ( CPUID.80000001H:EDX.[2 0] || CPUID.80000001H:EDX.[2 9]) Enables SYSCALL/SYSRET instructions in 64-bit mode. 7:1 Reserved 8 IA-32e Mode Enable: IA32_EFER.LME (R/W) Enables IA-32e mode operation. 9 Reserved 10 IA-32e Mode Active: IA32_EFER.LMA (R) Indicates IA-32e mode is active when set. C000_ 0081H 11 Execute Disable Bit Enable: IA32_EFER.NXE (R/W) 63:12 Reserved IA32_STAR System Call Target Address (R/W) If CPUID.80000001:EDX.[29] =1 Vol. 4 2-45 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-2. IA-32 Architectural MSRs (Contd.) Register Address Hex C000_ 0082H Architectural MSR Name / Bit Fields (Former MSR Name) MSR/Bit Description Decimal IA32_LSTAR IA-32e Mode System Call Target Address (R/W) Target RIP for the called procedure when SYSCALL is executed in 64-bit mode. C000_ 0083H Comment IA32_CSTAR IA-32e Mode System Call Target Address (R/W) Not used, as the SYSCALL instruction is not recognized in compatibility mode. If CPUID.80000001:EDX.[29] =1 If CPUID.80000001:EDX.[29] =1 C000_ 0084H IA32_FMASK System Call Flag Mask (R/W) If CPUID.80000001:EDX.[29] =1 C000_ 0100H IA32_FS_BASE Map of BASE Address of FS (R/W) If CPUID.80000001:EDX.[29] =1 C000_ 0101H IA32_GS_BASE Map of BASE Address of GS (R/W) If CPUID.80000001:EDX.[29] =1 C000_ 0102H IA32_KERNEL_GS_BASE Swap Target of BASE Address of GS (R/W) If CPUID.80000001:EDX.[29] =1 C000_ 0103H IA32_TSC_AUX Auxiliary TSC (RW) If CPUID.80000001H: EDX[27] = 1 or CPUID.(EAX=7,ECX=0):ECX [bit 22] = 1 31:0 AUX: Auxiliary signature of TSC. 63:32 Reserved NOTES: 1. In processors based on Intel NetBurst® microarchitecture, MSR addresses 180H-197H are supported, software must treat them as model-specific. Starting with Intel Core Duo processors, MSR addresses 180H-185H, 188H-197H are reserved. 2. The *_ADDR MSRs may or may not be present; this depends on flag settings in IA32_MCi_STATUS. See Section 15.3.2.3 and Section 15.3.2.4 for more information. 3. MAXPHYADDR is reported by CPUID.80000008H:EAX[7:0]. 2.2 MSRS IN THE INTEL® CORE™ 2 PROCESSOR FAMILY Table 2-3 lists model-specific registers (MSRs) for Intel Core 2 processor family and for Intel Xeon processors based on Intel Core microarchitecture, architectural MSR addresses are also included in Table 2-3. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_0FH, see Table 2-1. MSRs listed in Table 2-2 and Table 2-3 are also supported by processors based on the Enhanced Intel Core microarchitecture. Processors based on the Enhanced Intel Core microarchitecture have the CPUID signature DisplayFamily_DisplayModel of 06_17H. 2-46 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) The column “Shared/Unique” applies to multi-core processors based on Intel Core microarchitecture. “Unique” means each processor core has a separate MSR, or a bit field in an MSR governs only a core independently. “Shared” means the MSR or the bit field in an MSR address governs the operation of both processor cores. Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Unique See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Unique See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_SIZE Unique See Section 8.10.5, “Monitor/Mwait Address Range Determination.” and Table 2-2. 10H 16 IA32_TIME_STAMP_COUNTER Unique See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Shared Platform ID (R) See Table 2-2. 17H 23 MSR_PLATFORM_ID Shared Model Specific Platform ID (R) 7:0 Reserved 12:8 Maximum Qualified Ratio (R) The maximum allowed bus ratio. 49:13 Reserved 52:50 See Table 2-2. 63:53 Reserved 1BH 27 IA32_APIC_BASE Unique See Section 10.4.4, “Local APIC Status and Location” and Table 2-2. 2AH 42 MSR_EBL_CR_POWERON Shared Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration. 0 Reserved 1 Data Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processors implement R/W. 2 Response Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 3 MCERR# Drive Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processors implement R/W. 4 Address Parity Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processors implement R/W. 5 Reserved 6 Reserved Vol. 4 2-47 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 7 BINIT# Driver Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processors implement R/W. 8 Output Tri-state Enabled (R/O) 1 = Enabled; 0 = Disabled 9 Execute BIST (R/O) 1 = Enabled; 0 = Disabled 10 MCERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled 11 Intel TXT Capable Chipset. (R/O) 1 = Present; 0 = Not Present 12 BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled 13 Reserved 14 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes 15 Reserved 17:16 APIC Cluster ID (R/O) 18 N/2 Non-Integer Bus Ratio (R/O) 0 = Integer ratio; 1 = Non-integer ratio 3AH 58 19 Reserved 21: 20 Symmetric Arbitration ID (R/O) 26:22 Integer Bus Frequency Ratio (R/O) MSR_FEATURE_CONTROL Unique Control Features in Intel 64 Processor (R/W) See Table 2-2. 3 Unique SMRR Enable (R/WL) When this bit is set and the lock bit is set, this makes the SMRR_PHYS_BASE and SMRR_PHYS_MASK registers read visible and writeable while in SMM. 40H 64 MSR_LASTBRANCH_0_FROM_IP Unique Last Branch Record 0 From IP (R/W) One of four pairs of last branch record registers on the last branch record stack. The From_IP part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.5. 41H 65 MSR_LASTBRANCH_1_FROM_IP Unique Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 42H 66 MSR_LASTBRANCH_2_FROM_IP Unique Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 2-48 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Dec 43H 67 Register Name / Bit Fields MSR_LASTBRANCH_3_FROM_IP Shared/ Unique Unique Bit Description Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 60H 96 MSR_LASTBRANCH_0_TO_IP Unique Last Branch Record 0 To IP (R/W) One of four pairs of last branch record registers on the last branch record stack. This To_IP part of the stack contains pointers to the destination instruction. 61H 97 MSR_LASTBRANCH_1_TO_IP Unique Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 62H 98 MSR_LASTBRANCH_2_TO_IP Unique Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 63H 99 MSR_LASTBRANCH_3_TO_IP Unique Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 79H 121 IA32_BIOS_UPDT_TRIG Unique BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID Unique BIOS Update Signature ID (RO) See Table 2-2. A0H 160 MSR_SMRR_PHYSBASE Unique System Management Mode Base Address register (WO in SMM) Model-specific implementation of SMRR-like interface, read visible and write only in SMM. 11:0 Reserved 31:12 PhysBase: SMRR physical Base Address. 63:32 A1H 161 MSR_SMRR_PHYSMASK Reserved Unique System Management Mode Physical Address Mask register (WO in SMM) Model-specific implementation of SMRR-like interface, read visible and write only in SMM. C1H 193 10:0 Reserved 11 Valid: Physical address base and range mask are valid. 31:12 PhysMask: SMRR physical address range mask. 63:32 Reserved IA32_PMC0 Unique Performance Counter Register See Table 2-2. C2H 194 IA32_PMC1 Unique Performance Counter Register See Table 2-2. CDH 205 MSR_FSB_FREQ Shared Scaleable Bus Speed(RO) This field indicates the intended scaleable bus clock speed for processors based on Intel Core microarchitecture. Vol. 4 2-49 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 2:0 • • • • • • 101B: 100 MHz (FSB 400) 001B: 133 MHz (FSB 533) 011B: 167 MHz (FSB 667) 010B: 200 MHz (FSB 800) 000B: 267 MHz (FSB 1067) 100B: 333 MHz (FSB 1333) 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B. 266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 000B. 333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 100B. 63:3 CDH 205 MSR_FSB_FREQ Reserved Shared Scaleable Bus Speed(RO) This field indicates the intended scaleable bus clock speed for processors based on Enhanced Intel Core microarchitecture. 2:0 • • • • • • • 101B: 100 MHz (FSB 400) 001B: 133 MHz (FSB 533) 011B: 167 MHz (FSB 667) 010B: 200 MHz (FSB 800) 000B: 267 MHz (FSB 1067) 100B: 333 MHz (FSB 1333) 110B: 400 MHz (FSB 1600) 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B. 266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 110B. 333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 111B. 63:3 Reserved E7H 231 IA32_MPERF Unique E8H 232 IA32_APERF Unique Maximum Performance Frequency Clock Count (RW) See Table 2-2. Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Unique See Table 2-2. 11 Unique SMRR Capability Using MSR 0A0H and 0A1H (R) 174H 372 IA32_SYSENTER_CS Unique See Table 2-2. 175H 373 IA32_SYSENTER_ESP Unique See Table 2-2. 2-50 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 176H 374 IA32_SYSENTER_EIP Unique See Table 2-2. 179H 377 IA32_MCG_CAP Unique See Table 2-2. 17AH 378 IA32_MCG_STATUS Unique Global Machine Check Status 0 RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. 1 EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. 2 MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 Reserved 186H 390 IA32_PERFEVTSEL0 Unique See Table 2-2. 187H 391 IA32_PERFEVTSEL1 Unique See Table 2-2. 198H 408 IA32_PERF_STATUS Shared See Table 2-2. 198H 408 MSR_PERF_STATUS Shared Current performance status. See Section 14.1.1, “Software Interface For Initiating Performance State Transitions”. 15:0 Current Performance State Value 30:16 Reserved 31 XE Operation (R/O). If set, XE operation is enabled. Default is cleared. 39:32 Reserved 44:40 Maximum Bus Ratio (R/O) Indicates maximum bus ratio configured for the processor. 45 Reserved 46 Non-Integer Bus Ratio (R/O) Indicates non-integer bus ratio is enabled. Applies processors based on Enhanced Intel Core microarchitecture. 63:47 199H 409 IA32_PERF_CTL Reserved Unique See Table 2-2. Vol. 4 2-51 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Dec 19AH 410 Register Name / Bit Fields IA32_CLOCK_MODULATION Shared/ Unique Unique Bit Description Clock Modulation (R/W) See Table 2-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 19BH 411 IA32_THERM_INTERRUPT Unique Thermal Interrupt Control (R/W) See Table 2-2. 19CH 412 IA32_THERM_STATUS Unique Thermal Monitor Status (R/W) See Table 2-2. 19DH 413 MSR_THERM2_CTL Unique Thermal Monitor 2 Control 15:0 Reserved 16 TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle). 1 = Thermal Monitor 2 (thermally-initiated frequency transitions). If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 are enabled. 63:16 1A0H 416 Reserved IA32_MISC_ENABLE Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Fast-Strings Enable See Table 2-2. 2:1 3 Reserved Unique Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. 6:4 7 Reserved Shared Performance Monitoring Available (R) See Table 2-2. 8 9 Reserved Hardware Prefetcher Disable (R/W) When set, disables the hardware prefetcher operation on streams of data. When clear (default), enables the prefetch queue. Disabling of the hardware prefetcher may impact processor performance. 2-52 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 10 Shared FERR# Multiplexing Enable (R/W) 1 = FERR# asserted by the processor to indicate a pending break event within the processor. 0 = Indicates compatible FERR# signaling behavior. This bit must be set to 1 to support XAPIC interrupt model usage. 11 Shared Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Shared Processor Event Based Sampling Unavailable (RO) See Table 2-2. 13 Shared TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the pre-determined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermally managed state. The BIOS must enable this feature if the TM2 feature flag (CPUID.1:ECX[8]) is set; if the TM2 feature flag is not set, this feature is not supported and BIOS must not alter the contents of the TM2 bit location. The processor is operating out of specification if both this bit and the TM1 bit are set to 0. 15:14 16 Reserved Shared Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. 18 Shared ENABLE MONITOR FSM (R/W) See Table 2-2. 19 Shared Adjacent Cache Line Prefetch Disable (R/W) When set to 1, the processor fetches the cache line that contains data currently required by the processor. When set to 0, the processor fetches cache lines that comprise a cache line pair (128 bytes). Single processor platforms should not set this bit. Server platforms should set or clear this bit based on platform performance observed in validation and testing. BIOS may contain a setup option that controls the setting of this bit. Vol. 4 2-53 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 20 Shared Enhanced Intel SpeedStep Technology Select Lock (R/WO) When set, this bit causes the following bits to become read-only: • Enhanced Intel SpeedStep Technology Select Lock (this bit). • Enhanced Intel SpeedStep Technology Enable bit. The bit must be set before an Enhanced Intel SpeedStep Technology transition is requested. This bit is cleared on reset. 21 22 Reserved Shared Limit CPUID Maxval (R/W) See Table 2-2. 23 Shared xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Unique XD Bit Disable (R/W) See Table 2-2. 36:35 37 Reserved Unique DCU Prefetcher Disable (R/W) When set to 1, the DCU L1 data cache prefetcher is disabled. The default value after reset is 0. BIOS may write ‘1’ to disable this feature. The DCU prefetcher is an L1 data cache prefetcher. When the DCU prefetcher detects multiple loads from the same line done within a time limit, the DCU prefetcher assumes the next line will be required. The next line is prefetched in to the L1 data cache from memory or L2. 38 Shared IDA Disable (R/W) When set to 1 on processors that support IDA, the Intel Dynamic Acceleration feature (IDA) is disabled and the IDA_Enable feature flag will be cleared (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processor’s support of IDA is enabled. Note: The power-on default value is used by BIOS to detect hardware support of IDA. If the power-on default value is 1, IDA is available in the processor. If the poweron default value is 0, IDA is not available. 2-54 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 39 Unique IP Prefetcher Disable (R/W) When set to 1, the IP prefetcher is disabled. The default value after reset is 0. BIOS may write ‘1’ to disable this feature. The IP prefetcher is an L1 data cache prefetcher. The IP prefetcher looks for sequential load history to determine whether to prefetch the next expected data into the L1 cache from memory or L2. 63:40 1C9H 457 MSR_LASTBRANCH_TOS Reserved Unique Last Branch Record Stack TOS (R/W) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 40H). 1D9H 473 IA32_DEBUGCTL Unique 1DDH 477 MSR_LER_FROM_LIP Unique Debug Control (R/W) See Table 2-2. Last Exception Record From Linear IP (R/W) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Unique Last Exception Record To Linear IP (R/W) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 200H 512 IA32_MTRR_PHYSBASE0 Unique See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Unique See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Unique See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Unique See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Unique See Table 2-2. 205H 517 IA32_MTRR_PHYSMASK2 Unique See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Unique See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Unique See Table 2-2. 208H 520 IA32_MTRR_PHYSBASE4 Unique See Table 2-2. 209H 521 IA32_MTRR_PHYSMASK4 Unique See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Unique See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Unique See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Unique See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Unique See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Unique See Table 2-2. 20FH 527 IA32_MTRR_PHYSMASK7 Unique See Table 2-2. Vol. 4 2-55 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 250H 592 IA32_MTRR_FIX64K_00000 Unique See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Unique See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Unique See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Unique See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Unique See Table 2-2. 26AH 618 IA32_MTRR_FIX4K_D0000 Unique See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Unique See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Unique See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Unique See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Unique See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Unique See Table 2-2. 277H 631 IA32_PAT Unique See Table 2-2. 2FFH 767 IA32_MTRR_DEF_TYPE Unique Default Memory Types (R/W) See Table 2-2. 309H 777 IA32_FIXED_CTR0 Unique Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Unique 30BH 779 IA32_FIXED_CTR2 Unique Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES Unique See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” 345H 837 MSR_PERF_CAPABILITIES Unique RO. This applies to processors that do not support architectural perfmon version 2. 38DH 909 5:0 LBR Format. See Table 2-2. 6 PEBS Record Format 7 PEBSSaveArchRegs. See Table 2-2. 63:8 Reserved IA32_FIXED_CTR_CTRL Unique Fixed-Function-Counter Control Register (R/W) See Table 2-2. 38EH 910 IA32_PERF_GLOBAL_STATUS Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 38EH 910 MSR_PERF_GLOBAL_STATUS Unique See Section 18.6.2.2, “Global Counter Control Facilities.” 38FH 911 IA32_PERF_GLOBAL_CTRL Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 38FH 911 MSR_PERF_GLOBAL_CTRL Unique See Section 18.6.2.2, “Global Counter Control Facilities.” 2-56 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 390H 912 IA32_PERF_GLOBAL_OVF_CTRL Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 390H 912 MSR_PERF_GLOBAL_OVF_CTRL Unique See Section 18.6.2.2, “Global Counter Control Facilities.” 3F1H 1009 MSR_PEBS_ENABLE Unique See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS).” 0 Enable PEBS on IA32_PMC0. (R/W) 400H 1024 IA32_MC0_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 406H 1030 IA32_MC1_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 408H 1032 IA32_MC2_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 IA32_MC4_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC4_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC4_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. Vol. 4 2-57 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Dec Register Name / Bit Fields Shared/ Unique Bit Description 410H 1040 IA32_MC3_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC3_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC3_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 413H 1043 IA32_MC3_MISC Unique Machine Check Error Reporting Register: Contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. 414H 1044 IA32_MC5_CTL Unique Machine Check Error Reporting Register: Controls signaling of #MC for errors produced by a particular hardware unit (or group of hardware units). 415H 1045 IA32_MC5_STATUS Unique Machine Check Error Reporting Register: Contains information related to a machine-check error if its VAL (valid) flag is set. Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to them; writing 1s to them causes a general-protection exception. 416H 1046 IA32_MC5_ADDR Unique Machine Check Error Reporting Register: Contains the address of the code or data memory location that produced the machine-check error if the ADDRV flag in the IA32_MCi_STATUS register is set. 417H 1047 IA32_MC5_MISC Unique Machine Check Error Reporting Register: Contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. 419H 1045 IA32_MC6_STATUS Unique Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 23. 480H 1152 IA32_VMX_BASIC Unique Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Unique Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS Unique Capability Reporting Register of Primary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 483H 1155 IA32_VMX_EXIT_CTLS Unique Capability Reporting Register of VM-Exit Controls (R/O) See Table 2-2. See Appendix A.4, “VM-Exit Controls.” 2-58 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Dec 484H 1156 Register Name / Bit Fields IA32_VMX_ENTRY_CTLS Shared/ Unique Unique Bit Description Capability Reporting Register of VM-Entry Controls (R/O) See Table 2-2. See Appendix A.5, “VM-Entry Controls.” 485H 1157 IA32_VMX_MISC Unique Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. See Appendix A.6, “Miscellaneous Data.” 486H 1158 IA32_VMX_CR0_FIXED0 Unique Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Unique Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Unique Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 489H 1161 IA32_VMX_CR4_FIXED1 Unique Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Unique Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. See Appendix A.9, “VMCS Enumeration.” 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Unique Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 600H 1536 IA32_DS_AREA Unique DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 107CC H MSR_EMON_L3_CTR_CTL0 107CD H MSR_EMON_L3_CTR_CTL1 107CE H MSR_EMON_L3_CTR_CTL2 Unique GBUSQ Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique GBUSQ Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique GSNPQ Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Vol. 4 2-59 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-3. MSRs in Processors Based on Intel® Core™ Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec 107CF H MSR_EMON_L3_CTR_CTL3 107D0 H MSR_EMON_L3_CTR_CTL4 107D1 H MSR_EMON_L3_CTR_CTL5 107D2 H MSR_EMON_L3_CTR_CTL6 107D3 H MSR_EMON_L3_CTR_CTL7 107D8 H MSR_EMON_L3_GL_CTL C000_ 0080H IA32_EFER C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE 2.3 Shared/ Unique Unique GSNPQ Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique FSB Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique FSB Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique FSB Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique FSB Event Control/Counter Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique L3/FSB Common Control Register (R/W) Applies to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2 Unique Extended Feature Enables See Table 2-2. Unique System Call Target Address (R/W) See Table 2-2. Unique IA-32e Mode System Call Target Address (R/W) See Table 2-2. Unique System Call Flag Mask (R/W) See Table 2-2. Unique Map of BASE Address of FS (R/W) See Table 2-2. Unique Map of BASE Address of GS (R/W) See Table 2-2. Unique Swap Target of BASE Address of GS (R/W) See Table 2-2. MSRS IN THE 45 NM AND 32 NM INTEL® ATOM™ PROCESSOR FAMILY Table 2-4 lists model-specific registers (MSRs) for 45 nm and 32 nm Intel Atom processors, architectural MSR addresses are also included in Table 2-4. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_1CH, 06_26H, 06_27H, 06_35H and 06_36H; see Table 2-1. The column “Shared/Unique” applies to logical processors sharing the same core in processors based on the Intel Atom microarchitecture. “Unique” means each logical processor has a separate MSR, or a bit field in an MSR 2-60 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) governs only a logical processor. “Shared” means the MSR or the bit field in an MSR address governs the operation of both logical processors in the same core. Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Shared See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Shared See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_ SIZE Unique See Section 8.10.5, “Monitor/Mwait Address Range Determination.” and Table 2-2. 10H 16 IA32_TIME_STAMP_ COUNTER Unique See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Shared Platform ID (R) See Table 2-2. 17H 23 MSR_PLATFORM_ID Shared Model Specific Platform ID (R) 7:0 Reserved 12:8 Maximum Qualified Ratio (R) The maximum allowed bus ratio. 63:13 Reserved 1BH 27 IA32_APIC_BASE Unique See Section 10.4.4, “Local APIC Status and Location” and Table 2-2. 2AH 42 MSR_EBL_CR_POWERON Shared Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration. 0 Reserved 1 Data Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Always 0. 2 Response Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Always 0. 3 AERR# Drive Enable (R/W) 1 = Enabled; 0 = Disabled Always 0. 4 BERR# Enable for initiator bus requests (R/W) 1 = Enabled; 0 = Disabled Always 0. 5 Reserved 6 Reserved 7 BINIT# Driver Enable (R/W) 1 = Enabled; 0 = Disabled Always 0. 8 Reserved Vol. 4 2-61 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 9 Execute BIST (R/O) 1 = Enabled; 0 = Disabled 10 AERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled Always 0. 11 Reserved 12 BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled Always 0. 13 Reserved 14 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes 15 Reserved 17:16 APIC Cluster ID (R/O) Always 00B. 19: 18 Reserved 21: 20 Symmetric Arbitration ID (R/O) Always 00B. 26:22 3AH 58 IA32_FEATURE_CONTROL Integer Bus Frequency Ratio (R/O) Unique Control Features in Intel 64Processor (R/W) See Table 2-2. 40H 64 MSR_LASTBRANCH_0_FROM_IP Unique Last Branch Record 0 From IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. The From_IP part of the stack contains pointers to the source instruction . See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.5. 41H 65 MSR_LASTBRANCH_1_FROM_IP Unique Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 42H 66 MSR_LASTBRANCH_2_FROM_IP Unique Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 43H 67 MSR_LASTBRANCH_3_FROM_IP Unique Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 44H 68 MSR_LASTBRANCH_4_FROM_IP Unique Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 45H 69 MSR_LASTBRANCH_5_FROM_IP Unique Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 46H 70 MSR_LASTBRANCH_6_FROM_IP Unique Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 2-62 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Dec 47H 71 Register Name / Bit Fields MSR_LASTBRANCH_7_FROM_IP Shared/ Unique Unique Bit Description Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 60H 96 MSR_LASTBRANCH_0_TO_IP Unique Last Branch Record 0 To IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. The To_IP part of the stack contains pointers to the destination instruction. 61H 97 MSR_LASTBRANCH_1_TO_IP Unique Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 62H 98 MSR_LASTBRANCH_2_TO_IP Unique Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 63H 99 MSR_LASTBRANCH_3_TO_IP Unique Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 64H 100 MSR_LASTBRANCH_4_TO_IP Unique Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 65H 101 MSR_LASTBRANCH_5_TO_IP Unique Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 66H 102 MSR_LASTBRANCH_6_TO_IP Unique Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 67H 103 MSR_LASTBRANCH_7_TO_IP Unique Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 79H 121 IA32_BIOS_UPDT_TRIG Shared BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID Unique BIOS Update Signature ID (RO) See Table 2-2. C1H 193 IA32_PMC0 Unique Performance counter register See Table 2-2. C2H 194 IA32_PMC1 Unique Performance Counter Register See Table 2-2. CDH 205 MSR_FSB_FREQ Shared Scaleable Bus Speed(RO) This field indicates the intended scaleable bus clock speed for processors based on Intel Atom microarchitecture. 2:0 • • • • 111B: 083 MHz (FSB 333) 101B: 100 MHz (FSB 400) 001B: 133 MHz (FSB 533) 011B: 167 MHz (FSB 667) 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B. 63:3 Reserved Vol. 4 2-63 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Dec E7H 231 Register Name / Bit Fields IA32_MPERF Shared/ Unique Unique Bit Description Maximum Performance Frequency Clock Count (RW) See Table 2-2. E8H 232 IA32_APERF Unique Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Shared Memory Type Range Register (R) See Table 2-2. 11EH 281 MSR_BBL_CR_CTL3 Shared Control Register 3 Used to configure the L2 Cache. 0 L2 Hardware Enabled (RO) 1 = Indicates the L2 is hardware-enabled. 0 = Indicates the L2 is hardware-disabled. 7:1 Reserved 8 L2 Enabled (R/W) 1 = L2 cache has been initialized. 0 = Disabled (default). Until this bit is set, the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input. 22:9 Reserved 23 L2 Not Present (RO) 0 = L2 Present 1 = L2 Not Present 63:24 Reserved 174H 372 IA32_SYSENTER_CS Unique See Table 2-2. 175H 373 IA32_SYSENTER_ESP Unique See Table 2-2. 176H 374 IA32_SYSENTER_EIP Unique See Table 2-2. 179H 377 IA32_MCG_CAP Unique See Table 2-2. 17AH 378 IA32_MCG_STATUS Unique Global Machine Check Status 0 RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. 1 EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. 2-64 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 2 MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 Reserved 186H 390 IA32_PERFEVTSEL0 Unique See Table 2-2. 187H 391 IA32_PERFEVTSEL1 Unique See Table 2-2. 198H 408 IA32_PERF_STATUS Shared See Table 2-2. 198H 408 MSR_PERF_STATUS Shared Performance Status 15:0 Current Performance State Value 39:16 Reserved 44:40 Maximum Bus Ratio (R/O) Indicates maximum bus ratio configured for the processor. 63:45 Reserved 199H 409 IA32_PERF_CTL Unique See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Unique Clock Modulation (R/W) See Table 2-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 19BH 411 IA32_THERM_INTERRUPT Unique 19CH 412 IA32_THERM_STATUS Unique Thermal Interrupt Control (R/W) See Table 2-2. Thermal Monitor Status (R/W) See Table 2-2. 19DH 413 MSR_THERM2_CTL Shared Thermal Monitor 2 Control 15:0 Reserved 16 TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle). 1 = Thermal Monitor 2 (thermally-initiated frequency transitions). If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 are enabled. 63:17 1A0H 416 IA32_MISC_ENABLE Reserved Unique Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Fast-Strings Enable See Table 2-2. Vol. 4 2-65 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 2:1 3 Reserved Unique Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. Default value is 0. 6:4 7 Reserved Shared Performance Monitoring Available (R) See Table 2-2. 8 Reserved 9 Reserved 10 Shared FERR# Multiplexing Enable (R/W) 1= FERR# asserted by the processor to indicate a pending break event within the processor. 0 = Indicates compatible FERR# signaling behavior. This bit must be set to 1 to support XAPIC interrupt model usage. 11 Shared Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Shared Processor Event Based Sampling Unavailable (RO) See Table 2-2. 13 Shared TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the pre-determined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is cleared (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermally managed state. The BIOS must enable this feature if the TM2 feature flag (CPUID.1:ECX[8]) is set; if the TM2 feature flag is not set, this feature is not supported and BIOS must not alter the contents of the TM2 bit location. The processor is operating out of specification if both this bit and the TM1 bit are set to 0. 15:14 16 Reserved Shared Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. 18 Shared ENABLE MONITOR FSM (R/W) See Table 2-2. 19 2-66 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec 20 Shared Enhanced Intel SpeedStep Technology Select Lock (R/WO) When set, this bit causes the following bits to become read-only: • Enhanced Intel SpeedStep Technology Select Lock (this bit). • Enhanced Intel SpeedStep Technology Enable bit. The bit must be set before an Enhanced Intel SpeedStep Technology transition is requested. This bit is cleared on reset. 21 22 Reserved Unique Limit CPUID Maxval (R/W) See Table 2-2. 23 Shared xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Unique XD Bit Disable (R/W) See Table 2-2. 63:35 1C9H 457 MSR_LASTBRANCH_TOS Reserved Unique Last Branch Record Stack TOS (R/W) Contains an index (bits 0-2) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 40H). 1D9H 473 IA32_DEBUGCTL Unique Debug Control (R/W) See Table 2-2. 1DDH 477 MSR_LER_FROM_LIP Unique Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Unique Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 200H 512 IA32_MTRR_PHYSBASE0 Shared See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Shared See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Shared See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Shared See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Shared See Table 2-2. 205H 517 IA32_MTRR_PHYSMASK2 Shared See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Shared See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Shared See Table 2-2. Vol. 4 2-67 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 208H 520 IA32_MTRR_PHYSBASE4 Shared See Table 2-2. 209H 521 IA32_MTRR_PHYSMASK4 Shared See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Shared See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Shared See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Shared See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Shared See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Shared See Table 2-2. 20FH 527 IA32_MTRR_PHYSMASK7 Shared See Table 2-2. 250H 592 IA32_MTRR_FIX64K_00000 Shared See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Shared See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Shared See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Shared See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Shared See Table 2-2. 26AH 618 IA32_MTRR_FIX4K_D0000 Shared See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Shared See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Shared See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Shared See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Shared See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Shared See Table 2-2. 277H 631 IA32_PAT Unique See Table 2-2. 309H 777 IA32_FIXED_CTR0 Unique Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Unique Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. 30BH 779 IA32_FIXED_CTR2 Unique Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES Shared 38DH 909 IA32_FIXED_CTR_CTRL Unique See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” Fixed-Function-Counter Control Register (R/W) See Table 2-2. 38EH 910 IA32_PERF_GLOBAL_STATUS Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 38FH 911 IA32_PERF_GLOBAL_CTRL Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 390H 912 IA32_PERF_GLOBAL_OVF_CTRL Unique See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 3F1H 1009 MSR_PEBS_ENABLE Unique See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS).” 0 400H 2-68 Vol. 4 1024 IA32_MC0_CTL Enable PEBS on IA32_PMC0 (R/W) Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Register Name / Bit Fields Shared/ Unique Bit Description Hex Dec 401H 1025 IA32_MC0_STATUS Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 408H 1032 IA32_MC2_CTL Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 IA32_MC3_CTL Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC3_STATUS Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC3_ADDR Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 410H 1040 IA32_MC4_CTL Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC4_STATUS Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC4_ADDR Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 480H 1152 IA32_VMX_BASIC Unique Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Unique Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” Vol. 4 2-69 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Register Name / Bit Fields Shared/ Unique Hex Dec 482H 1154 IA32_VMX_PROCBASED_CTLS Unique 483H 1155 IA32_VMX_EXIT_CTLS Unique Bit Description Capability Reporting Register of Primary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” Capability Reporting Register of VM-Exit Controls (R/O) See Table 2-2. See Appendix A.4, “VM-Exit Controls.” 484H 1156 IA32_VMX_ENTRY_CTLS Unique Capability Reporting Register of VM-Entry Controls (R/O) See Table 2-2. See Appendix A.5, “VM-Entry Controls.” 485H 1157 IA32_VMX_MISC Unique Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. See Appendix A.6, “Miscellaneous Data.” 486H 1158 IA32_VMX_CR0_FIXED0 Unique Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Unique Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Unique Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 489H 1161 IA32_VMX_CR4_FIXED1 Unique Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Unique Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. See Appendix A.9, “VMCS Enumeration.” 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Unique Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 600H 1536 IA32_DS_AREA Unique DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” C000_ 0080H IA32_EFER C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR 2-70 Vol. 4 Unique Extended Feature Enables See Table 2-2. Unique System Call Target Address (R/W) See Table 2-2. Unique IA-32e Mode System Call Target Address (R/W) See Table 2-2. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-4. MSRs in 45 nm and 32 nm Intel® Atom™ Processor Family (Contd.) Register Address Hex Register Name / Bit Fields Shared/ Unique Bit Description Dec C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE Unique System Call Flag Mask (R/W) See Table 2-2. Unique Map of BASE Address of FS (R/W) See Table 2-2. Unique Map of BASE Address of GS (R/W) See Table 2-2. Unique Swap Target of BASE Address of GS (R/W) See Table 2-2. Table 2-5 lists model-specific registers (MSRs) that are specific to Intel® Atom™ processor with the CPUID signature with DisplayFamily_DisplayModel of 06_27H. Table 2-5. MSRs Supported by Intel® Atom™ Processors with CPUID Signature 06_27H Register Address Hex Dec 3F8H 1016 Register Name / Bit Fields MSR_PKG_C2_RESIDENCY Scope Package Bit Description Package C2 Residency Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 63:0 Package Package C2 Residency Counter (R/O) Time that this package is in processor-specific C2 states since last reset. Counts at 1 Mhz frequency. 3F9H 1017 MSR_PKG_C4_RESIDENCY Package Package C4 Residency Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 63:0 Package Package C4 Residency Counter. (R/O) Time that this package is in processor-specific C4 states since last reset. Counts at 1 Mhz frequency. 3FAH 1018 MSR_PKG_C6_RESIDENCY Package Package C6 Residency Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 63:0 Package Package C6 Residency Counter. (R/O) Time that this package is in processor-specific C6 states since last reset. Counts at 1 Mhz frequency. 2.4 MSRS IN INTEL PROCESSORS BASED ON SILVERMONT MICROARCHITECTURE Table 2-6 lists model-specific registers (MSRs) common to Intel processors based on the Silvermont microarchitecture. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_37H, 06_4AH, 06_4DH, Vol. 4 2-71 MODEL-SPECIFIC REGISTERS (MSRS) 06_5AH, and 06_5DH; see Table 2-1. The MSRs listed in Table 2-6 are also common to processors based on the Airmont microarchitecture and newer microarchitectures for next generation Intel Atom processors. Table 2-7 lists MSRs common to processors based on the Silvermont and Airmont microarchitectures, but not newer microarchitectures. Table 2-8, Table 2-9, and Table 2-10 lists MSRs that are model-specific across processors based on the Silvermont microarchitecture. In the Silvermont microarchitecture, the scope column indicates the following: “Core” means each processor core has a separate MSR, or a bit field not shared with another processor core. “Module” means the MSR or the bit field is shared by a pair of processor cores in the physical package. “Package” means all processor cores in the physical package share the same MSR or bit interface. Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Module See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Module See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_SIZE Core See Section 8.10.5, “Monitor/Mwait Address Range Determination.” and Table 2-2. 10H 16 IA32_TIME_STAMP_COUNTER Core See Section 17.17, “Time-Stamp Counter,” and Table 2-2. 1BH 27 IA32_APIC_BASE Core See Section 10.4.4, “Local APIC Status and Location,” and Table 2-2. 2AH 42 MSR_EBL_CR_POWERON Module Processor Hard Power-On Configuration (R/W) Writes ignored. 63:0 34H 52 MSR_SMI_COUNT Reserved Core 31:0 SMI Counter (R/O) SMI Count (R/O) Running count of SMI events since last RESET. 63:32 79H 121 IA32_BIOS_UPDT_TRIG Reserved Core BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID Core BIOS Update Signature ID (RO) See Table 2-2. C1H 193 IA32_PMC0 Core Performance counter register See Table 2-2. C2H 194 IA32_PMC1 Core Performance Counter Register See Table 2-2. E4H 228 MSR_PMG_IO_CAPTURE_BASE Module Power Management IO Redirection in C-state (R/W) See http://biosbits.org. 2-72 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 15:0 LVL_2 Base Address (R/W) Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. 18:16 C-state Range (R/W) Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit10]: 100b - C4 is the max C-State to include 110b - C6 is the max C-State to include 111b - C7 is the max C-State to include 63:19 E7H 231 IA32_MPERF Reserved Core Maximum Performance Frequency Clock Count (RW) See Table 2-2. E8H 232 IA32_APERF Core Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Core Memory Type Range Register (R) See Table 2-2. 13CH 52 MSR_FEATURE_CONFIG Core AES Configuration (RW-L) Privileged post-BIOS agent must provide a #GP handler to handle unsuccessful read of this MSR. 1:0 AES Configuration (RW-L) Upon a successful read of this MSR, the configuration of AES instruction sets availability is as follows: 11b: AES instructions are not available until next RESET. Otherwise, AES instructions are available. Note: AES instruction set is not available if read is unsuccessful. If the configuration is not 01b, AES instructions can be mis-configured if a privileged agent unintentionally writes 11b. 63:2 Reserved 174H 372 IA32_SYSENTER_CS Core See Table 2-2. 175H 373 IA32_SYSENTER_ESP Core See Table 2-2. 176H 374 IA32_SYSENTER_EIP Core See Table 2-2. 179H 377 IA32_MCG_CAP Core See Table 2-2. 17AH 378 IA32_MCG_STATUS Core Global Machine Check Status Vol. 4 2-73 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 0 RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. 1 EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. 2 MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 186H 390 IA32_PERFEVTSEL0 Reserved Core See Table 2-2. 7:0 Event Select 15:8 UMask 16 USR 17 OS 18 Edge 19 PC 20 INT 21 Reserved 22 EN 23 INV 31:24 CMASK 63:32 Reserved 187H 391 IA32_PERFEVTSEL1 Core See Table 2-2. 198H 408 IA32_PERF_STATUS Module See Table 2-2. 199H 409 IA32_PERF_CTL Core See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Core Clock Modulation (R/W) See Table 2-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 19BH 411 IA32_THERM_INTERRUPT Core Thermal Interrupt Control (R/W) See Table 2-2. 2-74 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Dec 19CH 412 Register Name / Bit Fields IA32_THERM_STATUS Scope Core Bit Description Thermal Monitor Status (R/W) See Table 2-2. 1A2H 418 MSR_TEMPERATURE_TARGET Package 15:0 Temperature Target Reserved 23:16 Temperature Target (R) The default thermal throttling or PROCHOT# activation temperature in degrees C. The effective temperature for thermal throttling or PROCHOT# activation is “Temperature Target” + “Target Offset”. 29:24 Target Offset (R/W) Specifies an offset in degrees C to adjust the throttling and PROCHOT# activation temperature from the default target specified in TEMPERATURE_TARGET (bits 23:16). 63:30 Reserved 1A6H 422 MSR_OFFCORE_RSP_0 Module Offcore Response Event Select Register (R/W) 1A7H 423 MSR_OFFCORE_RSP_1 Module Offcore Response Event Select Register (R/W) 1B0H 432 IA32_ENERGY_PERF_BIAS Core See Table 2-2. 1D9H 473 IA32_DEBUGCTL Core Debug Control (R/W) See Table 2-2. 1DDH 477 MSR_LER_FROM_LIP Core Last Exception Record From Linear IP (R/W) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Core Last Exception Record To Linear IP (R/W) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1F2H 498 IA32_SMRR_PHYSBASE Core See Table 2-2. 1F3H 499 IA32_SMRR_PHYSMASK Core See Table 2-2. 200H 512 IA32_MTRR_PHYSBASE0 Core See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Core See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Core See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Core See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Core See Table 2-2. 205H 517 IA32_MTRR_PHYSMASK2 Core See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Core See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Core See Table 2-2. 208H 520 IA32_MTRR_PHYSBASE4 Core See Table 2-2. Vol. 4 2-75 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 209H 521 IA32_MTRR_PHYSMASK4 Core See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Core See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Core See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Core See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Core See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Core See Table 2-2. 20FH 527 IA32_MTRR_PHYSMASK7 Core See Table 2-2. 250H 592 IA32_MTRR_FIX64K_00000 Core See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Core See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Core See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Core See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Core See Table 2-2. 26AH 618 IA32_MTRR_FIX4K_D0000 Core See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Core See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Core See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Core See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Core See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Core See Table 2-2. 277H 631 IA32_PAT Core See Table 2-2. 2FFH 767 IA32_MTRR_DEF_TYPE Core Default Memory Types (R/W) See Table 2-2. 309H 777 IA32_FIXED_CTR0 Core Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Core Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. 30BH 779 IA32_FIXED_CTR2 Core Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES Core 38DH 909 IA32_FIXED_CTR_CTRL Core See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” Fixed-Function-Counter Control Register (R/W) See Table 2-2. 38FH 911 IA32_PERF_GLOBAL_CTRL Core See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 3FDH 1021 MSR_CORE_C6_RESIDENCY Core Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 2-76 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 63:0 CORE C6 Residency Counter (R/O) Value since last reset that this core is in processorspecific C6 states. Counts at the TSC Frequency. 400H 1024 IA32_MC0_CTL Module See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Module See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Module See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL Module See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Module See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 408H 1032 IA32_MC2_CTL Module See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Module See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Module See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 IA32_MC3_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC3_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC3_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 410H 1040 IA32_MC4_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC4_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC4_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 414H 1044 IA32_MC5_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” Vol. 4 2-77 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 415H 1045 IA32_MC5_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 416H 1046 IA32_MC5_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 480H 1152 IA32_VMX_BASIC Core Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Core Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS Core Capability Reporting Register of Primary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 483H 1155 IA32_VMX_EXIT_CTLS Core Capability Reporting Register of VM-Exit Controls (R/O) See Table 2-2. See Appendix A.4, “VM-Exit Controls.” 484H 1156 IA32_VMX_ENTRY_CTLS Core Capability Reporting Register of VM-Entry Controls (R/O) See Table 2-2. See Appendix A.5, “VM-Entry Controls.” 485H 1157 IA32_VMX_MISC Core Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. See Appendix A.6, “Miscellaneous Data.” 486H 1158 IA32_VMX_CR0_FIXED0 Core Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Core Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Core Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 2-78 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Dec 489H 1161 Register Name / Bit Fields IA32_VMX_CR4_FIXED1 Scope Core Bit Description Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Core Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. See Appendix A.9, “VMCS Enumeration.” 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Core Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 48CH 1164 IA32_VMX_EPT_VPID_ENUM Core Capability Reporting Register of EPT and VPID (R/O) See Table 2-2 48DH 1165 IA32_VMX_TRUE_PINBASED_CTLS Core Capability Reporting Register of Pin-Based VM-Execution Flex Controls (R/O) See Table 2-2 48EH 1166 IA32_VMX_TRUE_PROCBASED_CTLS Core Capability Reporting Register of Primary Processorbased VM-Execution Flex Controls (R/O) See Table 2-2 48FH 1167 IA32_VMX_TRUE_EXIT_CTLS Core Capability Reporting Register of VM-Exit Flex Controls (R/O) See Table 2-2 490H 1168 IA32_VMX_TRUE_ENTRY_CTLS Core 491H 1169 IA32_VMX_FMFUNC Core Capability Reporting Register of VM-Entry Flex Controls (R/O) See Table 2-2 Capability Reporting Register of VM-Function Controls (R/O) See Table 2-2 4C1H 1217 IA32_A_PMC0 Core See Table 2-2. 4C2H 1218 IA32_A_PMC1 Core See Table 2-2. 600H 1536 IA32_DS_AREA Core DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 660H 1632 MSR_CORE_C1_RESIDENCY Core 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C1 Residency Counter. (R/O) Value since last reset that this core is in processorspecific C1 states. Counts at the TSC frequency. 6E0H 1760 IA32_TSC_DEADLINE Core TSC Target of Local APIC’s TSC Deadline Mode (R/W) See Table 2-2. Vol. 4 2-79 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-6. MSRs Common to the Silvermont Microarchitecture and Newer Microarchitectures for Intel Atom Processors Register Address Hex Register Name / Bit Fields Scope Bit Description Dec C000_ 0080H IA32_EFER C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE C000_ 0103H IA32_TSC_AUX Core Extended Feature Enables See Table 2-2. Core System Call Target Address (R/W) See Table 2-2. Core IA-32e Mode System Call Target Address (R/W) See Table 2-2. Core System Call Flag Mask (R/W) See Table 2-2. Core Map of BASE Address of FS (R/W) See Table 2-2. Core Map of BASE Address of GS (R/W) See Table 2-2. Core Swap Target of BASE Address of GS (R/W) See Table 2-2. Core AUXILIARY TSC Signature (R/W) See Table 2-2 Table 2-7 lists model-specific registers (MSRs) that are common to Intel® Atom™ processors based on the Silvermont and Airmont microarchitectures but not newer microarchitectures. Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Hex Dec 17H 23 Register Name / Bit Fields MSR_PLATFORM_ID Scope Module Bit Description Model Specific Platform ID (R) 7:0 Reserved 13:8 Maximum Qualified Ratio (R) The maximum allowed bus ratio. 3AH 58 49:13 Reserved 52:50 See Table 2-2. 63:33 Reserved IA32_FEATURE_CONTROL Core Control Features in Intel 64Processor (R/W) See Table 2-2. 2-80 Vol. 4 0 Lock (R/WL) 1 Reserved 2 Enable VMX outside SMX operation (R/WL) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Hex Dec 40H 64 Register Name / Bit Fields MSR_LASTBRANCH_0_FROM_IP Scope Core Bit Description Last Branch Record 0 From IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. The From_IP part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.5 and record format in Section 17.4.8.1. 41H 65 MSR_LASTBRANCH_1_FROM_IP Core Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 42H 66 MSR_LASTBRANCH_2_FROM_IP Core Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 43H 67 MSR_LASTBRANCH_3_FROM_IP Core Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 44H 68 MSR_LASTBRANCH_4_FROM_IP Core Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 45H 69 MSR_LASTBRANCH_5_FROM_IP Core Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 46H 70 MSR_LASTBRANCH_6_FROM_IP Core Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 47H 71 MSR_LASTBRANCH_7_FROM_IP Core Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 60H 96 MSR_LASTBRANCH_0_TO_IP Core Last Branch Record 0 To IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. The To_IP part of the stack contains pointers to the destination instruction. 61H 97 MSR_LASTBRANCH_1_TO_IP Core Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 62H 98 MSR_LASTBRANCH_2_TO_IP Core Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 63H 99 MSR_LASTBRANCH_3_TO_IP Core Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 64H 100 MSR_LASTBRANCH_4_TO_IP Core Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 65H 101 MSR_LASTBRANCH_5_TO_IP Core Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 66H 102 MSR_LASTBRANCH_6_TO_IP Core Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 67H 103 MSR_LASTBRANCH_7_TO_IP Core Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. Vol. 4 2-81 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Hex Dec E2H 226 Register Name / Bit Fields MSR_PKG_CST_CONFIG_CONTROL Scope Module Bit Description C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0 (no package C-sate support) 001b: C1 (Behavior is the same as 000b) 100b: C4 110b: C6 111b: C7 (Silvermont only). 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 63:16 11EH 281 MSR_BBL_CR_CTL3 Reserved Module Control Register 3 Used to configure the L2 Cache. 0 L2 Hardware Enabled (RO) 1 = If the L2 is hardware-enabled. 0 = Indicates if the L2 is hardware-disabled. 7:1 Reserved 8 L2 Enabled (R/W) 1 = L2 cache has been initialized. 0 = Disabled (default). Until this bit is set the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input. 22:9 Reserved 23 L2 Not Present (RO) 0 = L2 Present. 1 = L2 Not Present. 63:24 2-82 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Hex Dec 1A0H 416 Register Name / Bit Fields Scope IA32_MISC_ENABLE Bit Description Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Core Fast-Strings Enable See Table 2-2. 2:1 3 Reserved Module Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. Default value is 0. 6:4 7 Reserved Core Performance Monitoring Available (R) See Table 2-2. 10:8 11 Reserved Core Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Core Processor Event Based Sampling Unavailable (RO) See Table 2-2. 15:13 Reserved 16 Module 18 Core Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. ENABLE MONITOR FSM (R/W) See Table 2-2. 21:19 22 Reserved Core Limit CPUID Maxval (R/W) See Table 2-2. 23 Module xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Core XD Bit Disable (R/W) See Table 2-2. 37:35 Reserved Vol. 4 2-83 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 38 Module Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be cleared (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processor’s support of turbo mode is enabled. Note: The power-on default value is used by BIOS to detect hardware support of turbo mode. If the power-on default value is 1, turbo mode is available in the processor. If the power-on default value is 0, turbo mode is not available. 63:39 1C8H 456 MSR_LBR_SELECT Reserved Core Last Branch Record Filtering Select Register (R/W) See Section 17.9.2, “Filtering of Last Branch Records.” 1C9H 457 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC 3 NEAR_REL_CALL 4 NEAR_IND_CALL 5 NEAR_RET 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 63:9 Reserved MSR_LASTBRANCH_TOS Core Last Branch Record Stack TOS (R/W) Contains an index (bits 0-2) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP. 38EH 910 IA32_PERF_GLOBAL_STATUS Core See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 390H 912 IA32_PERF_GLOBAL_OVF_CTRL Core See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 3F1H 1009 MSR_PEBS_ENABLE Core See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS).” 0 3FAH 1018 MSR_PKG_C6_RESIDENCY 63:0 Enable PEBS for precise event on IA32_PMC0 (R/W) Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C6 Residency Counter (R/O) Value since last reset that this package is in processorspecific C6 states. Counts at the TSC Frequency. 2-84 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-7. MSRs Common to the Silvermont and Airmont Microarchitectures Register Address Register Name / Bit Fields Hex Dec 664H 1636 MSR_MC6_RESIDENCY_COUNTER Scope Module Bit Description Module C6 Residency Counter (R/0) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 63:0 Time that this module is in module-specific C6 states since last reset. Counts at 1 Mhz frequency. CEH 206 MSR_PLATFORM_INFO Package 7:0 Platform Information: Contains power management and other model specific features enumeration. See http://biosbits.org. Reserved 15:8 Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the maximum frequency that does not require turbo. Frequency = ratio * Scalable Bus Frequency. 63:16 2.4.1 Reserved MSRs with Model-Specific Behavior in the Silvermont Microarchitecture Table 2-8 lists model-specific registers (MSRs) that are specific to Intel® Atom™ processor E3000 Series (CPUID signature with DisplayFamily_DisplayModel of 06_37H) and Intel Atom processors (CPUID signatures with DisplayFamily_DisplayModel of 06_4AH, 06_5AH, 06_5DH). Table 2-8. Specific MSRs Supported by Intel® Atom™ Processors with CPUID Signatures 06_37H, 06_4AH, 06_5AH, 06_5DH Register Address Hex Dec 606H 1542 Register Name / Bit Fields MSR_RAPL_POWER_UNIT Scope Package Bit Description Unit Multipliers used in RAPL Interfaces (R/O) See Section 14.9.1, “RAPL Interfaces.” 3:0 Power Units Power related information (in milliWatts) is based on the multiplier, 2^PU; where PU is an unsigned integer represented by bits 3:0. Default value is 0101b, indicating power unit is in 32 milliWatts increment. 7:4 12:8 Reserved Energy Status Units Energy related information (in microJoules) is based on the multiplier, 2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 00101b, indicating energy unit is in 32 microJoules increment. 15:13 Reserved Vol. 4 2-85 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-8. Specific MSRs Supported by Intel® Atom™ Processors with CPUID Signatures 06_37H, 06_4AH, 06_5AH, 06_5DH Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 19:16 Time Unit The value is 0000b, indicating time unit is in one second. 63:20 610H 1552 Reserved MSR_PKG_POWER_LIMIT Package 14:0 PKG RAPL Power Limit Control (R/W) Package Power Limit #1 (R/W) See Section 14.9.3, “Package RAPL Domain.” and MSR_RAPL_POWER_UNIT in Table 2-8. 15 Enable Power Limit #1 (R/W) See Section 14.9.3, “Package RAPL Domain.” 16 Package Clamping Limitation #1 (R/W) See Section 14.9.3, “Package RAPL Domain.” 23:17 Time Window for Power Limit #1 (R/W) In unit of second. If 0 is specified in bits [23:17], defaults to 1 second window. 63:24 611H 1553 Reserved MSR_PKG_ENERGY_STATUS Package PKG Energy Status (R/O) See Section 14.9.3, “Package RAPL Domain.” and MSR_RAPL_POWER_UNIT in Table 2-8. 639H 1593 MSR_PP0_ENERGY_STATUS Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” and MSR_RAPL_POWER_UNIT in Table 2-8. CDH 205 MSR_FSB_FREQ Module Scaleable Bus Speed(RO) This field indicates the intended scaleable bus clock speed for processors based on Silvermont microarchitecture. 2:0 • • • • • 100B: 080.0 MHz 000B: 083.3 MHz 001B: 100.0 MHz 010B: 133.3 MHz 011B: 116.7 MHz 63:3 Reserved Table 2-9 lists model-specific registers (MSRs) that are specific to Intel® Atom™ processor E3000 Series (CPUID signature with DisplayFamily_DisplayModel of 06_37H). Table 2-9. Specific MSRs Supported by Intel® Atom™ Processor E3000 Series with CPUID Signature 06_37H Register Address Hex Dec 668H 1640 Register Name / Bit Fields MSR_CC6_DEMOTION_POLICY_CONFIG 63:0 2-86 Vol. 4 Scope Package Bit Description Core C6 Demotion Policy Config MSR Controls per-core C6 demotion policy. Writing a value of 0 disables core level HW demotion policy. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-9. Specific MSRs Supported by Intel® Atom™ Processor E3000 Series (Contd.)with CPUID Signature 06_37H Register Address Hex Dec 669H 1641 Register Name / Bit Fields MSR_MC6_DEMOTION_POLICY_CONFIG Scope Package 63:0 664H 1636 Bit Description Module C6 Demotion Policy Config MSR Controls module (i.e., two cores sharing the second-level cache) C6 demotion policy. Writing a value of 0 disables module level HW demotion policy. MSR_MC6_RESIDENCY_COUNTER Module Module C6 Residency Counter (R/0) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 63:0 Time that this module is in module-specific C6 states since last reset. Counts at 1 Mhz frequency. Table 2-10 lists model-specific registers (MSRs) that are specific to Intel® Atom™ processor C2000 Series (CPUID signature with DisplayFamily_DisplayModel of 06_4DH). Table 2-10. Specific MSRs Supported by Intel® Atom™ Processor C2000 Series with CPUID Signature 06_4DH Register Address Hex Dec 1A4H 420 Register Name / Bit Fields Scope MSR_MISC_FEATURE_CONTROL 0 Bit Description Miscellaneous Feature Control (R/W) Core L2 Hardware Prefetcher Disable (R/W) If 1, disables the L2 hardware prefetcher, which fetches additional lines of code or data into the L2 cache. 1 2 Reserved Core DCU Hardware Prefetcher Disable (R/W) If 1, disables the L1 data cache prefetcher, which fetches the next cache line into L1 data cache. 63:3 1ADH 429 Reserved MSR_TURBO_RATIO_LIMIT Package Maximum Ratio Limit of Turbo Mode (RW) 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 39:32 Package Maximum Ratio Limit for 5C Maximum turbo ratio limit of 5 core active. 47:40 Package Maximum Ratio Limit for 6C Maximum turbo ratio limit of 6 core active. Vol. 4 2-87 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-10. Specific MSRs Supported by Intel® Atom™ Processor C2000 Series (Contd.)with CPUID Signature Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 55:48 Package Maximum Ratio Limit for 7C Maximum turbo ratio limit of 7 core active. 63:56 Package Maximum Ratio Limit for 8C Maximum turbo ratio limit of 8 core active. 606H 1542 MSR_RAPL_POWER_UNIT Package Unit Multipliers used in RAPL Interfaces (R/O) See Section 14.9.1, “RAPL Interfaces.” 3:0 Power Units Power related information (in milliWatts) is based on the multiplier, 2^PU; where PU is an unsigned integer represented by bits 3:0. Default value is 0101b, indicating power unit is in 32 milliWatts increment. 7:4 Reserved 12:8 Energy Status Units. Energy related information (in microJoules) is based on the multiplier, 2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 00101b, indicating energy unit is in 32 microJoules increment. 15:13 Reserved 19:16 Time Unit The value is 0000b, indicating time unit is in one second. 63:20 610H 1552 MSR_PKG_POWER_LIMIT Reserved Package PKG RAPL Power Limit Control (R/W) See Section 14.9.3, “Package RAPL Domain.” 66EH 1646 MSR_PKG_POWER_INFO 14:0 Package PKG RAPL Parameter (R/0) Thermal Spec Power (R/0) The unsigned integer value is the equivalent of the thermal specification power of the package domain. The unit of this field is specified by the “Power Units” field of MSR_RAPL_POWER_UNIT. 63:15 2.4.2 Reserved MSRs In Intel Atom Processors Based on Airmont Microarchitecture Intel Atom processor X7-Z8000 and X5-Z8000 series are based on the Airmont microarchitecture. These processors support MSRs listed in Table 2-6, Table 2-7, Table 2-8, and Table 2-11. These processors have a CPUID signature with DisplayFamily_DisplayModel including 06_4CH; see Table 2-1. 2-88 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-11. MSRs in Intel Atom Processors Based on the Airmont Microarchitecture Register Address Hex Dec CDH 205 Register Name / Bit Fields MSR_FSB_FREQ Scope Module Bit Description Scaleable Bus Speed(RO) This field indicates the intended scaleable bus clock speed for processors based on Airmont microarchitecture. E2H 226 3:0 • • • • • • • • • 63:5 Reserved MSR_PKG_CST_CONFIG_CONTROL Module 0000B: 083.3 MHz 0001B: 100.0 MHz 0010B: 133.3 MHz 0011B: 116.7 MHz 0100B: 080.0 MHz 0101B: 093.3 MHz 0110B: 090.0 MHz 0111B: 088.9 MHz 1000B: 087.5 MHz C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 000b: No limit 001b: C1 010b: C2 110b: C6 111b: C7 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 63:16 E4H 228 MSR_PMG_IO_CAPTURE_BASE Reserved Module Power Management IO Redirection in C-state (R/W) See http://biosbits.org. Vol. 4 2-89 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-11. MSRs in Intel Atom Processors Based on the Airmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 15:0 LVL_2 Base Address (R/W) Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. 18:16 C-state Range (R/W) Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit10]: 000b - C3 is the max C-State to include. 001b - Deep Power Down Technology is the max CState. 010b - C7 is the max C-State to include. 63:19 638H 1592 MSR_PP0_POWER_LIMIT 14:0 Reserved Package PP0 RAPL Power Limit Control (R/W) PP0 Power Limit #1 (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains” and MSR_RAPL_POWER_UNIT in Table 2-8. 15 Enable Power Limit #1 (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 16 Reserved 23:17 Time Window for Power Limit #1 (R/W) Specifies the time duration over which the average power must remain below PP0_POWER_LIMIT #1(14:0). Supported Encodings: 0x0: 1 second time duration. 0x1: 5 second time duration (Default). 0x2: 10 second time duration. 0x3: 15 second time duration. 0x4: 20 second time duration. 0x5: 25 second time duration. 0x6: 30 second time duration. 0x7: 35 second time duration. 0x8: 40 second time duration. 0x9: 45 second time duration. 0xA: 50 second time duration. 0xB-0x7F - reserved. 63:24 2-90 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) 2.5 MSRS IN INTEL ATOM PROCESSORS BASED ON GOLDMONT MICROARCHITECTURE Intel Atom processors based on the Goldmont microarchitecture support MSRs listed in Table 2-6 and Table 2-12. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_5CH; see Table 2-1. In the Goldmont microarchitecture, the scope column indicates the following: “Core” means each processor core has a separate MSR, or a bit field not shared with another processor core. “Module” means the MSR or the bit field is shared by a pair of processor cores in the physical package. “Package” means all processor cores in the physical package share the same MSR or bit interface. Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture Register Address Hex Dec 17H 23 3AH 58 Register Name / Bit Fields MSR_PLATFORM_ID Scope Module Bit Description Model Specific Platform ID (R) 49:0 Reserved 52:50 See Table 2-2. 63:33 Reserved IA32_FEATURE_CONTROL Core Control Features in Intel 64 Processor (R/W) See Table 2-2. 3BH 59 0 Lock (R/WL) 1 Enable VMX inside SMX operation (R/WL) 2 Enable VMX outside SMX operation (R/WL) 14:8 SENTER local functions enables (R/WL) 15 SENTER global functions enable (R/WL) 18 SGX global functions enable (R/WL) 63:19 Reserved IA32_TSC_ADJUST Core Per-Core TSC ADJUST (R/W) See Table 2-2. C3H 195 IA32_PMC2 Core Performance Counter Register See Table 2-2. C4H 196 IA32_PMC3 Core Performance Counter Register See Table 2-2. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the maximum frequency that does not require turbo. Frequency = ratio * 100 MHz. 27:16 Reserved Vol. 4 2-91 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 28 Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates TDP Limit for Turbo mode is not programmable. 30 Package Programmable TJ OFFSET (R/O) When set to 1, indicates that MSR_TEMPERATURE_TARGET.[27:24] is valid and writable to specify a temperature offset. 39:31 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. 3:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 0000b: No limit 0001b: C1 0010b: C3 0011b: C6 0100b: C7 0101b: C7S 0110b: C8 0111b: C9 1000b: C10 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 2-92 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 63:16 17DH 381 MSR_SMM_MCA_CAP Reserved Core Enhanced SMM Capabilities (SMM-RO) Reports SMM capability enhancement. Accessible only while in SMM. 57:0 Reserved 58 SMM_Code_Access_Chk (SMM-RO) If set to 1 indicates that the SMM code access restriction is supported and the MSR_SMM_FEATURE_CONTROL is supported. 59 Long_Flow_Indication (SMM-RO) If set to 1 indicates that the SMM long flow indicator is supported and the MSR_SMM_DELAYED is supported. 63:60 Reserved 188H 392 IA32_PERFEVTSEL2 Core See Table 2-2. 189H 393 IA32_PERFEVTSEL3 Core See Table 2-2. 1A0H 416 IA32_MISC_ENABLE Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Core Fast-Strings Enable See Table 2-2. 2:1 3 Reserved Package Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. Default value is 1. 6:4 7 Reserved Core Performance Monitoring Available (R) See Table 2-2. 10:8 11 Reserved Core Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Core Processor Event Based Sampling Unavailable (RO) See Table 2-2. 15:13 16 Reserved Package Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. 18 Core ENABLE MONITOR FSM (R/W) See Table 2-2. Vol. 4 2-93 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 21:19 22 Reserved Core Limit CPUID Maxval (R/W) See Table 2-2. 23 Package xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Core XD Bit Disable (R/W) See Table 2-2. 37:35 38 Reserved Package Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processor’s support of turbo mode is enabled. Note: The power-on default value is used by BIOS to detect hardware support of turbo mode. If the poweron default value is 1, turbo mode is available in the processor. If the power-on default value is 0, turbo mode is not available. 1A4H 420 63:39 Reserved MSR_MISC_FEATURE_CONTROL Miscellaneous Feature Control (R/W) 0 Core L2 Hardware Prefetcher Disable (R/W) If 1, disables the L2 hardware prefetcher, which fetches additional lines of code or data into the L2 cache. 1 2 Reserved Core DCU Hardware Prefetcher Disable (R/W) If 1, disables the L1 data cache prefetcher, which fetches the next cache line into L1 data cache. 63:3 1AAH 426 MSR_MISC_PWR_MGMT Reserved Package Miscellaneous Power Management Control Various model specific features enumeration. See http://biosbits.org. 0 EIST Hardware Coordination Disable (R/W) When 0, enables hardware coordination of Enhanced Intel Speedstep Technology request from processor cores. When 1, disables hardware coordination of Enhanced Intel Speedstep Technology requests. 21:1 2-94 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 22 Thermal Interrupt Coordination Enable (R/W) If set, then thermal interrupt on one core is routed to all cores. 63:23 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode by Core Groups (RW) Specifies Maximum Ratio Limit for each Core Group. Max ratio for groups with more cores must decrease monotonically. For groups with less than 4 cores, the max ratio must be 32 or less. For groups with 4-5 cores, the max ratio must be 22 or less. For groups with more than 5 cores, the max ratio must be 16 or less. 7:0 Package Maximum Ratio Limit for Active Cores in Group 0 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 0 threshold. 15:8 Package Maximum Ratio Limit for Active Cores in Group 1 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 1 threshold, and greater than the Group 0 threshold. 23:16 Package Maximum Ratio Limit for Active Cores in Group 2 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 2 threshold, and greater than the Group 1 threshold. 31:24 Package Maximum Ratio Limit for Active Cores in Group 3 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 3 threshold, and greater than the Group 2 threshold. 39:32 Package Maximum Ratio Limit for Active Cores in Group 4 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 4 threshold, and greater than the Group 3 threshold. 47:40 Package Maximum Ratio Limit for Active Cores in Group 5 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 5 threshold, and greater than the Group 4 threshold. 55:48 Package Maximum Ratio Limit for Active Cores in Group 6 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 6 threshold, and greater than the Group 5 threshold. 63:56 Package Maximum Ratio Limit for Active Cores in Group 7 Maximum turbo ratio limit when the number of active cores is less than or equal to the Group 7 threshold, and greater than the Group 6 threshold. Vol. 4 2-95 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 1AEH 430 Register Name / Bit Fields Scope MSR_TURBO_GROUP_CORECNT Package 7:0 Package Bit Description Group Size of Active Cores for Turbo Mode Operation (RW) Writes of 0 threshold is ignored. Group 0 Core Count Threshold Maximum number of active cores to operate under the Group 0 Max Turbo Ratio limit. 15:8 Package Group 1 Core Count Threshold Maximum number of active cores to operate under the Group 1 Max Turbo Ratio limit. Must be greater than the Group 0 Core Count. 23:16 Package Group 2 Core Count Threshold Maximum number of active cores to operate under the Group 2 Max Turbo Ratio limit. Must be greater than the Group 1 Core Count. 31:24 Package Group 3 Core Count Threshold Maximum number of active cores to operate under the Group 3 Max Turbo Ratio limit. Must be greater than the Group 2 Core Count. 39:32 Package Group 4 Core Count Threshold Maximum number of active cores to operate under the Group 4 Max Turbo Ratio limit. Must be greater than the Group 3 Core Count. 47:40 Package Group 5 Core Count Threshold Maximum number of active cores to operate under the Group 5 Max Turbo Ratio limit. Must be greater than the Group 4 Core Count. 55:48 Package Group 6 Core Count Threshold Maximum number of active cores to operate under the Group 6 Max Turbo Ratio limit. Must be greater than the Group 5 Core Count. 63:56 Package Group 7 Core Count Threshold Maximum number of active cores to operate under the Group 7 Max Turbo Ratio limit. Must be greater than the Group 6 Core Count, and not less than the total number of processor cores in the package. E.g., specify 255. 1C8H 456 MSR_LBR_SELECT Core Last Branch Record Filtering Select Register (R/W) See Section 17.9.2, “Filtering of Last Branch Records.” 2-96 Vol. 4 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC 3 NEAR_REL_CALL 4 NEAR_IND_CALL 5 NEAR_RET MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex 1C9H Register Name / Bit Fields Scope Bit Description Dec 457 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 9 EN_CALL_STACK 63:10 Reserved MSR_LASTBRANCH_TOS Core Last Branch Record Stack TOS (R/W) Contains an index (bits 0-4) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP. 1FCH 508 MSR_POWER_CTL Core 0 1 Power Control Register. See http://biosbits.org. Reserved Package C1E Enable (R/W) When set to ‘1’, will enable the CPU to switch to the Minimum Enhanced Intel SpeedStep Technology operating point when all execution cores enter MWAIT (C1). 63:2 Reserved 210H 528 IA32_MTRR_PHYSBASE8 Core See Table 2-2. 211H 529 IA32_MTRR_PHYSMASK8 Core See Table 2-2. 212H 530 IA32_MTRR_PHYSBASE9 Core See Table 2-2. 213H 531 IA32_MTRR_PHYSMASK9 Core See Table 2-2. 280H 640 IA32_MC0_CTL2 Module See Table 2-2. 281H 641 IA32_MC1_CTL2 Module See Table 2-2. 282H 642 IA32_MC2_CTL2 Core See Table 2-2. 283H 643 IA32_MC3_CTL2 Module See Table 2-2. 284H 644 IA32_MC4_CTL2 Package See Table 2-2. 285H 645 IA32_MC5_CTL2 Package See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 300H 768 MSR_SGXOWNEREPOCH0 Package Lower 64 Bit CR_SGXOWNEREPOCH (W) Writes do not update CR_SGXOWNEREPOCH if CPUID.(EAX=12H, ECX=0):EAX.SGX1 is 1 on any thread in the package. 63:0 301H 769 MSR_SGXOWNEREPOCH1 Lower 64 bits of an 128-bit external entropy value for key derivation of an enclave. Package Upper 64 Bit CR_SGXOWNEREPOCH (W) Writes do not update CR_SGXOWNEREPOCH if CPUID.(EAX=12H, ECX=0):EAX.SGX1 is 1 on any thread in the package. 63:0 Upper 64 bits of an 128-bit external entropy value for key derivation of an enclave. Vol. 4 2-97 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 38EH 910 390H 2-98 Vol. 4 912 Register Name / Bit Fields IA32_PERF_GLOBAL_STATUS Scope Core Bit Description See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 Ovf_PMC0 1 Ovf_PMC1 2 Ovf_PMC2 3 Ovf_PMC3 31:4 Reserved 32 Ovf_FixedCtr0 33 Ovf_FixedCtr1 34 Ovf_FixedCtr2 54:35 Reserved 55 Trace_ToPA_PMI 57:56 Reserved 58 LBR_Frz. 59 CTR_Frz. 60 ASCI 61 Ovf_Uncore 62 Ovf_BufDSSAVE 63 CondChgd IA32_PERF_GLOBAL_STATUS_RESET Core See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 Set 1 to clear Ovf_PMC0. 1 Set 1 to clear Ovf_PMC1. 2 Set 1 to clear Ovf_PMC2. 3 Set 1 to clear Ovf_PMC3. 31:4 Reserved 32 Set 1 to clear Ovf_FixedCtr0. 33 Set 1 to clear Ovf_FixedCtr1. 34 Set 1 to clear Ovf_FixedCtr2. 54:35 Reserved 55 Set 1 to clear Trace_ToPA_PMI. 57:56 Reserved 58 Set 1 to clear LBR_Frz. 59 Set 1 to clear CTR_Frz. 60 Set 1 to clear ASCI. 61 Set 1 to clear Ovf_Uncore. 62 Set 1 to clear Ovf_BufDSSAVE. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Dec 63 391H 913 IA32_PERF_GLOBAL_STATUS_SET Set 1 to clear CondChgd. Core Set 1 to cause Ovf_PMC0 = 1. 1 Set 1 to cause Ovf_PMC1 = 1. 2 Set 1 to cause Ovf_PMC2 = 1. 3 Set 1 to cause Ovf_PMC3 = 1. 31:4 Reserved 32 Set 1 to cause Ovf_FixedCtr0 = 1. 33 Set 1 to cause Ovf_FixedCtr1 = 1. 34 Set 1 to cause Ovf_FixedCtr2 = 1. 54:35 Reserved 55 Set 1 to cause Trace_ToPA_PMI = 1. 57:56 Reserved 58 Set 1 to cause LBR_Frz = 1. 59 Set 1 to cause CTR_Frz = 1. 60 Set 1 to cause ASCI = 1. 61 Set 1 to cause Ovf_Uncore. 62 Set 1 to cause Ovf_BufDSSAVE. 63 Reserved See Table 2-2. 914 IA32_PERF_GLOBAL_INUSE 3F1H 1009 MSR_PEBS_ENABLE Core 0 1016 See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 392H 3F8H Bit Description MSR_PKG_C3_RESIDENCY See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS).” Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMC0. (R/W) Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C3 Residency Counter (R/O) Value since last reset that this package is in processor-specific C3 states. Count at the same frequency as the TSC. 3F9H 1017 MSR_PKG_C6_RESIDENCY 63:0 Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C6 Residency Counter (R/O) Value since last reset that this package is in processor-specific C6 states. Count at the same frequency as the TSC. Vol. 4 2-99 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 3FCH 1020 Register Name / Bit Fields MSR_CORE_C3_RESIDENCY Scope Core 63:0 Bit Description Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C3 Residency Counter (R/O) Value since last reset that this core is in processorspecific C3 states. Count at the same frequency as the TSC. 406H 1030 IA32_MC1_ADDR Module See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 418H 1048 IA32_MC6_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 419H 1049 IA32_MC6_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 41AH 1050 IA32_MC6_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 4C3H 1219 IA32_A_PMC2 Core See Table 2-2. 4C4H 1220 IA32_A_PMC3 Core See Table 2-2. 4E0H 1248 MSR_SMM_FEATURE_CONTROL Package Enhanced SMM Feature Control (SMM-RW) Reports SMM capability Enhancement. Accessible only while in SMM. 0 Lock (SMM-RWO) When set to ‘1’ locks this register from further changes. 1 Reserved 2 SMM_Code_Chk_En (SMM-RW) This control bit is available only if MSR_SMM_MCA_CAP[58] == 1. When set to ‘0’ (default) none of the logical processors are prevented from executing SMM code outside the ranges defined by the SMRR. When set to ‘1’ any logical processor in the package that attempts to execute SMM code not within the ranges defined by the SMRR will assert an unrecoverable MCE. 63:3 4E2H 1250 MSR_SMM_DELAYED Reserved Package SMM Delayed (SMM-RO) Reports the interruptible state of all logical processors in the package. Available only while in SMM and MSR_SMM_MCA_CAP[LONG_FLOW_INDICATION] == 1. 2-100 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec N-1:0 LOG_PROC_STATE (SMM-RO) Each bit represents a processor core of its state in a long flow of internal operation which delays servicing an interrupt. The corresponding bit will be set at the start of long events such as: Microcode Update Load, C6, WBINVD, Ratio Change, Throttle. The bit is automatically cleared at the end of each long event. The reset value of this field is 0. Only bit positions below N = CPUID.(EAX=0BH, ECX=PKG_LVL):EBX[15:0] can be updated. 63:N 4E3H 1251 MSR_SMM_BLOCKED Reserved Package SMM Blocked (SMM-RO) Reports the blocked state of all logical processors in the package. Available only while in SMM. N-1:0 LOG_PROC_STATE (SMM-RO) Each bit represents a processor core of its blocked state to service an SMI. The corresponding bit will be set if the logical processor is in one of the following states: Wait For SIPI or SENTER Sleep. The reset value of this field is 0FFFH. Only bit positions below N = CPUID.(EAX=0BH, ECX=PKG_LVL):EBX[15:0] can be updated. 63:N 500H 1280 IA32_SGX_SVN_STATUS Reserved Core 0 Status and SVN Threshold of SGX Support for ACM (RO) Lock See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. 15:1 Reserved 23:16 SGX_SVN_SINIT See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. 63:24 560H 1376 IA32_RTIT_OUTPUT_BASE Reserved Core Trace Output Base Register (R/W) See Table 2-2. 561H 1377 IA32_RTIT_OUTPUT_MASK_PTRS Core Trace Output Mask Pointers Register (R/W) See Table 2-2. 570H 1392 IA32_RTIT_CTL Core Trace Control Register (R/W) 0 TraceEn 1 CYCEn 2 OS 3 User 6:4 Reserved, must be zero. Vol. 4 2-101 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7 CR3 filter 8 ToPA Writing 0 will #GP if also setting TraceEn. 571H 1393 9 MTCEn 10 TSCEn 11 DisRETC 12 Reserved, must be zero. 13 BranchEn 17:14 MTCFreq 18 Reserved, must be zero. 22:19 CYCThresh 23 Reserved, must be zero. 27:24 PSBFreq 31:28 Reserved, must be zero. 35:32 ADDR0_CFG 39:36 ADDR1_CFG 63:40 Reserved, must be zero. IA32_RTIT_STATUS Core 0 Tracing Status Register (R/W) FilterEn Writes ignored. 1 ContexEn Writes ignored. 2 TriggerEn Writes ignored. 572H 580H 1394 1408 3 Reserved 4 Error (R/W) 5 Stopped 31:6 Reserved, must be zero. 48:32 PacketByteCnt 63:49 Reserved, must be zero. IA32_RTIT_CR3_MATCH Core 4:0 Reserved 63:5 CR3[63:5] value to match. IA32_RTIT_ADDR0_A Core 63:0 581H 1409 IA32_RTIT_ADDR0_B 63:0 2-102 Vol. 4 Trace Filter CR3 Match Register (R/W) Region 0 Start Address (R/W) See Table 2-2. Core Region 0 End Address (R/W) See Table 2-2. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 582H 1410 Register Name / Bit Fields IA32_RTIT_ADDR1_A Scope Core 63:0 583H 1411 IA32_RTIT_ADDR1_B 1542 MSR_RAPL_POWER_UNIT Region 1 Start Address (R/W) See Table 2-2. Core 63:0 606H Bit Description Region 1 End Address (R/W) See Table 2-2. Package Unit Multipliers used in RAPL Interfaces (R/O) See Section 14.9.1, “RAPL Interfaces.” 3:0 Power Units Power related information (in Watts) is in unit of 1W/2^PU; where PU is an unsigned integer represented by bits 3:0. Default value is 1000b, indicating power unit is in 3.9 milliWatts increment. 7:4 Reserved 12:8 Energy Status Units Energy related information (in Joules) is in unit of 1Joule/ (2^ESU); where ESU is an unsigned integer represented by bits 12:8. Default value is 01110b, indicating energy unit is in 61 microJoules. 15:13 Reserved 19:16 Time Unit Time related information (in seconds) is in unit of 1S/2^TU; where TU is an unsigned integer represented by bits 19:16. Default value is 1010b, indicating power unit is in 0.977 millisecond. 63:20 60AH 1546 MSR_PKGC3_IRTL Reserved Package Package C3 Interrupt Response Limit (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C3 state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. See Table 2-20 for supported time unit encodings. 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 Reserved Vol. 4 2-103 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 60BH 1547 Register Name / Bit Fields MSR_PKGC_IRTL1 Scope Package Bit Description Package C6/C7S Interrupt Response Limit 1 (R/W) This MSR defines the interrupt response time limit used by the processor to manage a transition to a package C6 or C7S state. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C6 or C7S state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. See Table 2-20 for supported time unit encodings. 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 60CH 1548 MSR_PKGC_IRTL2 Reserved Package Package C7 Interrupt Response Limit 2 (R/W) This MSR defines the interrupt response time limit used by the processor to manage a transition to a package C7 state. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C7 state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. See Table 2-20 for supported time unit encodings. 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 60DH 2-104 Vol. 4 1549 MSR_PKG_C2_RESIDENCY Reserved Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 63:0 Package C2 Residency Counter (R/O) Value since last reset that this package is in processor-specific C2 states. Count at the same frequency as the TSC. 610H 1552 MSR_PKG_POWER_LIMIT Package PKG RAPL Power Limit Control (R/W) See Section 14.9.3, “Package RAPL Domain.” 611H 1553 MSR_PKG_ENERGY_STATUS Package PKG Energy Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 613H 1555 MSR_PKG_PERF_STATUS Package PKG Perf Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 614H 1556 MSR_PKG_POWER_INFO Package 14:0 PKG RAPL Parameters (R/W) Thermal Spec Power (R/W) See Section 14.9.3, “Package RAPL Domain.” 15 Reserved 30:16 Minimum Power (R/W) See Section 14.9.3, “Package RAPL Domain.” 31 Reserved 46:32 Maximum Power (R/W) See Section 14.9.3, “Package RAPL Domain.” 47 Reserved 54:48 Maximum Time Window (R/W) Specified by 2^Y * (1.0 + Z/4.0) * Time_Unit, where “Y” is the unsigned integer value represented by bits 52:48, “Z” is an unsigned integer represented by bits 54:53. “Time_Unit” is specified by the “Time Units” field of MSR_RAPL_POWER_UNIT. 63:55 618H 1560 MSR_DRAM_POWER_LIMIT Reserved Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61BH 1563 MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 632H 1586 MSR_PKG_C10_RESIDENCY 63:0 Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. Package C10 Residency Counter (R/O) Value since last reset that the entire SOC is in an S0i3 state. Count at the same frequency as the TSC. Vol. 4 2-105 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 639H 1593 Register Name / Bit Fields MSR_PP0_ENERGY_STATUS Scope Package Bit Description PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 641H 1601 MSR_PP1_ENERGY_STATUS Package PP1 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 64CH 1612 MSR_TURBO_ACTIVATION_RATIO Package 7:0 ConfigTDP Control (R/W) MAX_NON_TURBO_RATIO (RW/L) System BIOS can program this field. 30:8 Reserved 31 TURBO_ACTIVATION_RATIO_Lock (RW/L) When this bit is set, the content of this register is locked until a reset. 63:32 64FH 1615 MSR_CORE_PERF_LIMIT_REASONS Reserved Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 0 PROCHOT Status (R0) When set, processor core frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 2 Package-Level Power Limiting PL1 Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL1. 3 Package-Level PL2 Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL2. 8:4 Reserved 9 Core Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to domain-level power limiting. 10 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 11 Max Turbo Limit Status (R0) When set, frequency is reduced below the operating system request due to multi-core turbo limits. 2-106 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 12 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 13 Turbo Transition Attenuation Status (R0) When set, frequency is reduced below the operating system request due to Turbo transition attenuation. This prevents performance degradation due to frequent operating ratio changes. 14 Maximum Efficiency Frequency Status (R0) When set, frequency is reduced below the maximum efficiency frequency. 15 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 18 Package-Level PL1 Power Limiting Log When set, indicates that the Package Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19 Package-Level PL2 Power Limiting Log When set, indicates that the Package Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 24:20 25 Reserved Core Power Limiting Log When set, indicates that the Core Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-107 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 26 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 27 Max Turbo Limit Log When set, indicates that the Max Turbo Limit Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 29 Turbo Transition Attenuation Log When set, indicates that the Turbo Transition Attenuation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 30 Maximum Efficiency Frequency Log When set, indicates that the Maximum Efficiency Frequency Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:31 680H 1664 MSR_LASTBRANCH_0_FROM_IP Reserved Core Last Branch Record 0 From IP (R/W) One of 32 pairs of last branch record registers on the last branch record stack. The From_IP part of the stack contains pointers to the source instruction . See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.6 and record format in Section 17.4.8.1. 681H 1665 0:47 From Linear Address (R/W) 62:48 Signed extension of bits 47:0. 63 Mispred MSR_LASTBRANCH_1_FROM_IP Core Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 682H 1666 MSR_LASTBRANCH_2_FROM_IP Core Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 683H 1667 MSR_LASTBRANCH_3_FROM_IP Core Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 2-108 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 684H 1668 Register Name / Bit Fields MSR_LASTBRANCH_4_FROM_IP Scope Core Bit Description Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 685H 1669 MSR_LASTBRANCH_5_FROM_IP Core Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 686H 1670 MSR_LASTBRANCH_6_FROM_IP Core Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 687H 1671 MSR_LASTBRANCH_7_FROM_IP Core Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 688H 1672 MSR_LASTBRANCH_8_FROM_IP Core Last Branch Record 8 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 689H 1673 MSR_LASTBRANCH_9_FROM_IP Core Last Branch Record 9 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68AH 1674 MSR_LASTBRANCH_10_FROM_IP Core Last Branch Record 10 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68BH 1675 MSR_LASTBRANCH_11_FROM_IP Core Last Branch Record 11 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68CH 1676 MSR_LASTBRANCH_12_FROM_IP Core Last Branch Record 12 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68DH 1677 MSR_LASTBRANCH_13_FROM_IP Core Last Branch Record 13 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68EH 1678 MSR_LASTBRANCH_14_FROM_IP Core Last Branch Record 14 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68FH 1679 MSR_LASTBRANCH_15_FROM_IP Core Last Branch Record 15 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 690H 1680 MSR_LASTBRANCH_16_FROM_IP Core Last Branch Record 16 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 691H 1681 MSR_LASTBRANCH_17_FROM_IP Core Last Branch Record 17 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 692H 1682 MSR_LASTBRANCH_18_FROM_IP Core Last Branch Record 18 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 693H 1683 MSR_LASTBRANCH_19_FROM_IP Core Last Branch Record 19From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 694H 1684 MSR_LASTBRANCH_20_FROM_IP Core Last Branch Record 20 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 695H 1685 MSR_LASTBRANCH_21_FROM_IP Core Last Branch Record 21 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 696H 1686 MSR_LASTBRANCH_22_FROM_IP Core Last Branch Record 22 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. Vol. 4 2-109 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 697H 1687 Register Name / Bit Fields MSR_LASTBRANCH_23_FROM_IP Scope Core Bit Description Last Branch Record 23 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 698H 1688 MSR_LASTBRANCH_24_FROM_IP Core Last Branch Record 24 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 699H 1689 MSR_LASTBRANCH_25_FROM_IP Core Last Branch Record 25 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69AH 1690 MSR_LASTBRANCH_26_FROM_IP Core Last Branch Record 26 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69BH 1691 MSR_LASTBRANCH_27_FROM_IP Core Last Branch Record 27 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69CH 1692 MSR_LASTBRANCH_28_FROM_IP Core Last Branch Record 28 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69DH 1693 MSR_LASTBRANCH_29_FROM_IP Core Last Branch Record 29 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69EH 1694 MSR_LASTBRANCH_30_FROM_IP Core Last Branch Record 30 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69FH 1695 MSR_LASTBRANCH_31_FROM_IP Core Last Branch Record 31 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 6C0H 1728 MSR_LASTBRANCH_0_TO_IP Core Last Branch Record 0 To IP (R/W) One of 32 pairs of last branch record registers on the last branch record stack. The To_IP part of the stack contains pointers to the Destination instruction and elapsed cycles from last LBR update. See Section 17.6. 6C1H 1729 0:47 Target Linear Address (R/W) 63:48 Elapsed cycles from last update to the LBR. MSR_LASTBRANCH_1_TO_IP Core Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C2H 1730 MSR_LASTBRANCH_2_TO_IP Core Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C3H 1731 MSR_LASTBRANCH_3_TO_IP Core Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C4H 1732 MSR_LASTBRANCH_4_TO_IP Core Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C5H 1733 MSR_LASTBRANCH_5_TO_IP Core Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C6H 1734 MSR_LASTBRANCH_6_TO_IP Core Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C7H 1735 MSR_LASTBRANCH_7_TO_IP Core Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 2-110 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 6C8H 1736 Register Name / Bit Fields MSR_LASTBRANCH_8_TO_IP Scope Core Bit Description Last Branch Record 8 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C9H 1737 MSR_LASTBRANCH_9_TO_IP Core Last Branch Record 9 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CAH 1738 MSR_LASTBRANCH_10_TO_IP Core Last Branch Record 10 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CBH 1739 MSR_LASTBRANCH_11_TO_IP Core Last Branch Record 11 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CCH 1740 MSR_LASTBRANCH_12_TO_IP Core Last Branch Record 12 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CDH 1741 MSR_LASTBRANCH_13_TO_IP Core Last Branch Record 13 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CEH 1742 MSR_LASTBRANCH_14_TO_IP Core Last Branch Record 14 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CFH 1743 MSR_LASTBRANCH_15_TO_IP Core Last Branch Record 15 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D0H 1744 MSR_LASTBRANCH_16_TO_IP Core Last Branch Record 16 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D1H 1745 MSR_LASTBRANCH_17_TO_IP Core Last Branch Record 17 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D2H 1746 MSR_LASTBRANCH_18_TO_IP Core Last Branch Record 18 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D3H 1747 MSR_LASTBRANCH_19_TO_IP Core Last Branch Record 19To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D4H 1748 MSR_LASTBRANCH_20_TO_IP Core Last Branch Record 20 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D5H 1749 MSR_LASTBRANCH_21_TO_IP Core Last Branch Record 21 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D6H 1750 MSR_LASTBRANCH_22_TO_IP Core Last Branch Record 22 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D7H 1751 MSR_LASTBRANCH_23_TO_IP Core Last Branch Record 23 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D8H 1752 MSR_LASTBRANCH_24_TO_IP Core Last Branch Record 24 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D9H 1753 MSR_LASTBRANCH_25_TO_IP Core Last Branch Record 25 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DAH 1754 MSR_LASTBRANCH_26_TO_IP Core Last Branch Record 26 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. Vol. 4 2-111 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Dec 6DBH 1755 Register Name / Bit Fields MSR_LASTBRANCH_27_TO_IP Scope Core Bit Description Last Branch Record 27 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DCH 1756 MSR_LASTBRANCH_28_TO_IP Core Last Branch Record 28 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DDH 1757 MSR_LASTBRANCH_29_TO_IP Core Last Branch Record 29 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DEH 1758 MSR_LASTBRANCH_30_TO_IP Core Last Branch Record 30 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DFH 1759 MSR_LASTBRANCH_31_TO_IP Core Last Branch Record 31 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 802H 2050 IA32_X2APIC_APICID Core x2APIC ID register (R/O) 803H 2051 IA32_X2APIC_VERSION Core x2APIC Version register (R/O) 808H 2056 IA32_X2APIC_TPR Core x2APIC Task Priority register (R/W) 80AH 2058 IA32_X2APIC_PPR Core x2APIC Processor Priority register (R/O) 80BH 2059 IA32_X2APIC_EOI Core x2APIC EOI register (W/O) 80DH 2061 IA32_X2APIC_LDR Core x2APIC Logical Destination register (R/O) 80FH 2063 IA32_X2APIC_SIVR Core x2APIC Spurious Interrupt Vector register (R/W) 810H 2064 IA32_X2APIC_ISR0 Core x2APIC In-Service register bits [31:0] (R/O) 811H 2065 IA32_X2APIC_ISR1 Core x2APIC In-Service register bits [63:32] (R/O) 812H 2066 IA32_X2APIC_ISR2 Core x2APIC In-Service register bits [95:64] (R/O) 813H 2067 IA32_X2APIC_ISR3 Core x2APIC In-Service register bits [127:96] (R/O) 814H 2068 IA32_X2APIC_ISR4 Core x2APIC In-Service register bits [159:128] (R/O) 815H 2069 IA32_X2APIC_ISR5 Core x2APIC In-Service register bits [191:160] (R/O) 816H 2070 IA32_X2APIC_ISR6 Core x2APIC In-Service register bits [223:192] (R/O) 817H 2071 IA32_X2APIC_ISR7 Core x2APIC In-Service register bits [255:224] (R/O) 818H 2072 IA32_X2APIC_TMR0 Core x2APIC Trigger Mode register bits [31:0] (R/O) 819H 2073 IA32_X2APIC_TMR1 Core x2APIC Trigger Mode register bits [63:32] (R/O) 81AH 2074 IA32_X2APIC_TMR2 Core x2APIC Trigger Mode register bits [95:64] (R/O) 81BH 2075 IA32_X2APIC_TMR3 Core x2APIC Trigger Mode register bits [127:96] (R/O) 81CH 2076 IA32_X2APIC_TMR4 Core x2APIC Trigger Mode register bits [159:128] (R/O) 81DH 2077 IA32_X2APIC_TMR5 Core x2APIC Trigger Mode register bits [191:160] (R/O) 81EH 2078 IA32_X2APIC_TMR6 Core x2APIC Trigger Mode register bits [223:192] (R/O) 81FH 2079 IA32_X2APIC_TMR7 Core x2APIC Trigger Mode register bits [255:224] (R/O) 820H 2080 IA32_X2APIC_IRR0 Core x2APIC Interrupt Request register bits [31:0] (R/O) 821H 2081 IA32_X2APIC_IRR1 Core x2APIC Interrupt Request register bits [63:32] (R/O) 822H 2082 IA32_X2APIC_IRR2 Core x2APIC Interrupt Request register bits [95:64] (R/O) 823H 2083 IA32_X2APIC_IRR3 Core x2APIC Interrupt Request register bits [127:96] (R/O) 2-112 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 824H 2084 IA32_X2APIC_IRR4 Core x2APIC Interrupt Request register bits [159:128] (R/O) 825H 2085 IA32_X2APIC_IRR5 Core x2APIC Interrupt Request register bits [191:160] (R/O) 826H 2086 IA32_X2APIC_IRR6 Core x2APIC Interrupt Request register bits [223:192] (R/O) 827H 2087 IA32_X2APIC_IRR7 Core x2APIC Interrupt Request register bits [255:224] (R/O) 828H 2088 IA32_X2APIC_ESR Core x2APIC Error Status register (R/W) 82FH 2095 IA32_X2APIC_LVT_CMCI Core x2APIC LVT Corrected Machine Check Interrupt register (R/W) 830H 2096 IA32_X2APIC_ICR Core x2APIC Interrupt Command register (R/W) 832H 2098 IA32_X2APIC_LVT_TIMER Core x2APIC LVT Timer Interrupt register (R/W) 833H 2099 IA32_X2APIC_LVT_THERMAL Core x2APIC LVT Thermal Sensor Interrupt register (R/W) 834H 2100 IA32_X2APIC_LVT_PMI Core x2APIC LVT Performance Monitor register (R/W) 835H 2101 IA32_X2APIC_LVT_LINT0 Core x2APIC LVT LINT0 register (R/W) 836H 2102 IA32_X2APIC_LVT_LINT1 Core x2APIC LVT LINT1 register (R/W) 837H 2103 IA32_X2APIC_LVT_ERROR Core x2APIC LVT Error register (R/W) 838H 2104 IA32_X2APIC_INIT_COUNT Core x2APIC Initial Count register (R/W) 839H 2105 IA32_X2APIC_CUR_COUNT Core x2APIC Current Count register (R/O) 83EH 2110 IA32_X2APIC_DIV_CONF Core x2APIC Divide Configuration register (R/W) 83FH 2111 IA32_X2APIC_SELF_IPI Core x2APIC Self IPI register (W/O) C8FH 3215 IA32_PQR_ASSOC Core Resource Association Register (R/W) D10H 3344 31:0 Reserved 33:32 COS (R/W) 63: 34 Reserved IA32_L2_QOS_MASK_0 Module L2 Class Of Service Mask - COS 0 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=0. D11H 3345 0:7 CBM: Bit vector of available L2 ways for COS 0 enforcement. 63:8 Reserved IA32_L2_QOS_MASK_1 Module L2 Class Of Service Mask - COS 1 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=1. 0:7 CBM: Bit vector of available L2 ways for COS 0 enforcement. 63:8 D12H 3346 IA32_L2_QOS_MASK_2 Reserved Module L2 Class Of Service Mask - COS 2 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=2. 0:7 CBM: Bit vector of available L2 ways for COS 0 enforcement. Vol. 4 2-113 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-12. MSRs in Intel Atom Processors Based on the Goldmont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Dec 63:8 D13H Bit Description 3347 Reserved IA32_L2_QOS_MASK_3 Package L2 Class Of Service Mask - COS 3 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=3. 0:19 CBM: Bit vector of available L2 ways for COS 3 enforcement. 63:20 Reserved D90H 3472 IA32_BNDCFGS Core See Table 2-2. DA0H 3488 IA32_XSS Core See Table 2-2. See Table 2-6, and Table 2-12 for MSR definitions applicable to processors with CPUID signature 06_5CH. 2.6 MSRS IN INTEL ATOM PROCESSORS BASED ON GOLDMONT PLUS MICROARCHITECTURE Intel Atom processors based on the Goldmont Plus microarchitecture support MSRs listed in Table 2-6, Table 2-12 and Table 2-13. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_7AH; see Table 2-1. For an MSR listed in Table 2-13 that also appears in the model-specific tables of prior generations, Table 2-13 supercede prior generation tables. In the Goldmont Plus microarchitecture, the scope column indicates the following: “Core” means each processor core has a separate MSR, or a bit field not shared with another processor core. “Module” means the MSR or the bit field is shared by a pair of processor cores in the physical package. “Package” means all processor cores in the physical package share the same MSR or bit interface. Table 2-13. MSRs in Intel Atom Processors Based on the Goldmont Plus Microarchitecture Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Core Bit Description Control Features in Intel 64Processor (R/W) See Table 2-2. 0 Lock (R/WL) 1 Enable VMX inside SMX operation (R/WL) 2 Enable VMX outside SMX operation (R/WL) 14:8 SENTER local functions enables (R/WL) 15 SENTER global functions enable (R/WL) 17 SGX Launch Control Enable (R/WL) This bit must be set to enable runtime reconfiguration of SGX Launch Control via IA32_SGXLEPUBKEYHASHn MSR. Valid if CPUID.(EAX=07H, ECX=0H): ECX[30] = 1. 8CH 2-114 Vol. 4 140 18 SGX global functions enable (R/WL) 63:19 Reserved IA32_SGXLEPUBKEYHASH0 Core See Table 2-2. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-13. MSRs in Intel Atom Processors Based on the Goldmont Plus Microarchitecture (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 8DH 141 IA32_SGXLEPUBKEYHASH1 Core See Table 2-2. 8EH 142 IA32_SGXLEPUBKEYHASH2 Core See Table 2-2. 8FH 143 IA32_SGXLEPUBKEYHASH3 Core See Table 2-2. 3F1H 1009 MSR_PEBS_ENABLE Core (R/W) See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS).” 570H 1392 0 Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMC0. 1 Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMC1. 2 Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMC2. 3 Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMC3. 31:4 Reserved 32 Enable PEBS trigger and recording for IA32_FIXED_CTR0. 33 Enable PEBS trigger and recording for IA32_FIXED_CTR1. 34 Enable PEBS trigger and recording for IA32_FIXED_CTR2. 63:35 Reserved IA32_RTIT_CTL Core Trace Control Register (R/W) 0 TraceEn 1 CYCEn 2 OS 3 User 4 PwrEvtEn 5 FUPonPTW 6 FabricEn 7 CR3 filter 8 ToPA Writing 0 will #GP if also setting TraceEn. 9 MTCEn 10 TSCEn 11 DisRETC 12 PTWEn 13 BranchEn Vol. 4 2-115 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-13. MSRs in Intel Atom Processors Based on the Goldmont Plus Microarchitecture (Contd.) Register Address Hex 680H Register Name / Bit Fields Scope Bit Description Dec 1664 17:14 MTCFreq 18 Reserved, must be zero. 22:19 CYCThresh 23 Reserved, must be zero. 27:24 PSBFreq 31:28 Reserved, must be zero. 35:32 ADDR0_CFG 39:36 ADDR1_CFG 63:40 Reserved, must be zero. MSR_LASTBRANCH_0_FROM_IP Core Last Branch Record 0 From IP (R/W) One of the three MSRs that make up the first entry of the 32-entry LBR stack. The From_IP part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.7, “Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Goldmont Plus Microarchitecture.” 681H 69FH 1665 1695 MSR_LASTBRANCH_i_FROM_IP 6C0H 1728 MSR_LASTBRANCH_0_TO_IP Core Last Branch Record i From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP; i = 1-31. Core Last Branch Record 0 To IP (R/W) One of the three MSRs that make up the first entry of the 32-entry LBR stack. The To_IP part of the stack contains pointers to the Destination instruction. See also: • Section 17.7, “Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Goldmont Plus Microarchitecture.” 6C1H 6DFH 1729 1759 MSR_LASTBRANCH_i_TO_IP DC0H 3520 MSR_LASTBRANCH_INFO_0 Core Last Branch Record i To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP; i = 131. Core Last Branch Record 0 Additional Information (R/W) One of the three MSRs that make up the first entry of the 32-entry LBR stack. This part of the stack contains flag and elapsed cycle information. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.9.1, “LBR Stack.” DC1H 3521 MSR_LASTBRANCH_INFO_1 Core DC2H 3522 MSR_LASTBRANCH_INFO_2 Core Last Branch Record 1 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. Last Branch Record 2 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC3H 3523 MSR_LASTBRANCH_INFO_3 Core Last Branch Record 3 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. 2-116 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-13. MSRs in Intel Atom Processors Based on the Goldmont Plus Microarchitecture (Contd.) Register Address Hex Dec DC4H 3524 Register Name / Bit Fields MSR_LASTBRANCH_INFO_4 Scope Core Bit Description Last Branch Record 4 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC5H 3525 MSR_LASTBRANCH_INFO_5 Core Last Branch Record 5 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC6H 3526 MSR_LASTBRANCH_INFO_6 Core Last Branch Record 6 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC7H 3527 MSR_LASTBRANCH_INFO_7 Core Last Branch Record 7 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC8H 3528 MSR_LASTBRANCH_INFO_8 Core Last Branch Record 8 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DC9H 3529 MSR_LASTBRANCH_INFO_9 Core Last Branch Record 9 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCAH 3530 MSR_LASTBRANCH_INFO_10 Core Last Branch Record 10 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCBH 3531 MSR_LASTBRANCH_INFO_11 Core Last Branch Record 11 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCCH 3532 MSR_LASTBRANCH_INFO_12 Core Last Branch Record 12 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCDH 3533 MSR_LASTBRANCH_INFO_13 Core Last Branch Record 13 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCEH 3534 MSR_LASTBRANCH_INFO_14 Core Last Branch Record 14 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DCFH 3535 MSR_LASTBRANCH_INFO_15 Core Last Branch Record 15 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD0H 3536 MSR_LASTBRANCH_INFO_16 Core Last Branch Record 16 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD1H 3537 MSR_LASTBRANCH_INFO_17 Core Last Branch Record 17 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD2H 3538 MSR_LASTBRANCH_INFO_18 Core Last Branch Record 18 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD3H 3539 MSR_LASTBRANCH_INFO_19 Core Last Branch Record 19 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD4H 3520 MSR_LASTBRANCH_INFO_20 Core Last Branch Record 20 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD5H 3521 MSR_LASTBRANCH_INFO_21 Core Last Branch Record 21 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD6H 3522 MSR_LASTBRANCH_INFO_22 Core Last Branch Record 22 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. Vol. 4 2-117 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-13. MSRs in Intel Atom Processors Based on the Goldmont Plus Microarchitecture (Contd.) Register Address Hex Dec DD7H 3523 Register Name / Bit Fields MSR_LASTBRANCH_INFO_23 Scope Core Bit Description Last Branch Record 23 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD8H 3524 MSR_LASTBRANCH_INFO_24 Core Last Branch Record 24 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DD9H 3525 MSR_LASTBRANCH_INFO_25 Core Last Branch Record 25 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDAH 3526 MSR_LASTBRANCH_INFO_26 Core Last Branch Record 26 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDBH 3527 MSR_LASTBRANCH_INFO_27 Core Last Branch Record 27 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDCH 3528 MSR_LASTBRANCH_INFO_28 Core Last Branch Record 28 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDDH 3529 MSR_LASTBRANCH_INFO_29 Core Last Branch Record 29 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDEH 3530 MSR_LASTBRANCH_INFO_30 Core Last Branch Record 30 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. DDFH 3531 MSR_LASTBRANCH_INFO_31 Core Last Branch Record 31 Additional Information (R/W) See description of MSR_LASTBRANCH_INFO_0. See Table 2-6, Table 2-12 and Table 2-13 for MSR definitions applicable to processors with CPUID signature 06_7AH. 2.7 MSRS IN INTEL ATOM PROCESSORS BASED ON TREMONT MICROARCHITECTURE Intel Atom processors based on the Tremont microarchitecture support MSRs listed in Table 2-6, Table 2-12, Table 2-13 and Table 2-14. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_86H; see Table 2-1. For an MSR listed in Table 2-14 that also appears in the model-specific tables of prior generations, Table 2-14 supercede prior generation tables. In the Tremont microarchitecture, the scope column indicates the following: “Core” means each processor core has a separate MSR, or a bit field not shared with another processor core. “Module” means the MSR or the bit field is shared by a pair of processor cores in the physical package. “Package” means all processor cores in the physical package share the same MSR or bit interface. 2-118 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-14. MSRs in Intel Atom Processors Based on the Tremont Microarchitecture Register Address Hex Dec 33H 51 Register Name / Bit Fields TEST_CTRL Scope Core Bit Description Test Control Register 28:0 Reserved. 29 Enable #AC(0) exception for split locked accesses: Cause #AC(0) exception for split locked access at all CPL irrespective of CR0.AM or EFLAGS.AC. If bits 29 and 31 are both set, bit 29 takes precedence. CFH 207 30 Reserved. 31 Disable LOCK# assertion for split locked access. IA32_CORE_CAPABILITY Core IA32 Core Capability Register See Table 2-2. 3F1H 1009 MSR_PEBS_ENABLE Core (R/W) See Table 2-2. See Section 18.6.2.4, “Processor Event Based Sampling (PEBS)”. n:0 Enable PEBS trigger and recording for the programmed event (precise or otherwise) on IA32_PMCx. The maximum value n can be determined from CPUID.0AH:EAX[15:8]. 31:n+1 Reserved. 32+m:32 Enable PEBS trigger and recording for IA32_FIXED_CTRx. The maximum value m can be determined from CPUID.0AH:EDX[4:0]. 59:33+m Reserved. 60 Pend a PerfMon Interrupt (PMI) after each PEBS event. 62:61 Specifies PEBS output destination. Encodings: 00B: DS Save Area 01B: Intel PT trace output. Supported if IA32_PERF_CAPABILITIES.PEBS_OUTPUT_PT_AVAIL[ 16] and CPUID.07H.0.EBX[25] are set. 10B: Reserved 11B: Reserved 1309H 130BH 14C1H 14C8H 4873 4875 5313 5320 63 Reserved. MSR_RELOAD_FIXED_CTRx Reload value for IA32_FIXED_CTRx (R/W) 47:0 Value loaded into IA32_FIXED_CTRx when a PEBS record is generated while PEBS_EN_FIXEDx = 1 and PEBS_OUTPUT = 01B in IA32_PEBS_ENABLE, and FIXED_CTRx is overflowed. 63:48 Reserved. MSR_RELOAD_PMCx Reload value for IA32_PMCx (R/W) 47:0 Value loaded into IA32_PMCx when a PEBS record is generated while PEBS_EN_PMCx = 1 and PEBS_OUTPUT = 01B in IA32_PEBS_ENABLE, and PMCx is overflowed. 63:48 Reserved. Vol. 4 2-119 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-14. MSRs in Intel Atom Processors Based on the Tremont Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec See Table 2-6, Table 2-12, Table 2-13 and Table 2-14 for MSR definitions applicable to processors with CPUID signature 06_86H. 2.8 MSRS IN THE INTEL® MICROARCHITECTURE CODE NAME NEHALEM Table 2-15 lists model-specific registers (MSRs) that are common for Intel® microarchitecture code name Nehalem. These include Intel Core i7 and i5 processor family. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_1AH, 06_1EH, 06_1FH, 06_2EH, see Table 2-1. Additional MSRs specific to 06_1AH, 06_1EH, 06_1FH are listed in Table 2-16. Some MSRs listed in these tables are used by BIOS. More information about these MSR can be found at http://biosbits.org. The column “Scope” represents the package/core/thread scope of individual bit field of an MSR. “Thread” means this bit field must be programmed on each logical processor independently. “Core” means the bit field must be programmed on each processor core independently, logical processors in the same core will be affected by change of this bit on the other logical processor in the same core. “Package” means the bit field must be programmed once for each physical package. Change of a bit filed with a package scope will affect all logical processors in that physical package. Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Thread See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Thread See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_SIZE Thread See Section 8.10.5, “Monitor/Mwait Address Range Determination” and Table 2-2. 10H 16 IA32_TIME_STAMP_COUNTER Thread See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Package Platform ID (R) See Table 2-2. 17H 23 MSR_PLATFORM_ID Package Model Specific Platform ID (R) 49:0 Reserved 52:50 See Table 2-2. 63:53 Reserved 1BH 27 IA32_APIC_BASE Thread See Section 10.4.4, “Local APIC Status and Location,” and Table 2-2. 34H 52 MSR_SMI_COUNT Thread SMI Counter (R/O) 31:0 SMI Count (R/O) Running count of SMI events since last RESET. 63:32 2-120 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Thread Bit Description Control Features in Intel 64Processor (R/W) See Table 2-2. 79H 121 IA32_BIOS_UPDT_TRIG Core BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID Thread BIOS Update Signature ID (RO) See Table 2-2. C1H 193 IA32_PMC0 Thread Performance Counter Register See Table 2-2. C2H 194 IA32_PMC1 Thread Performance Counter Register See Table 2-2. C3H 195 IA32_PMC2 Thread Performance Counter Register See Table 2-2. C4H 196 IA32_PMC3 Thread Performance Counter Register See Table 2-2. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. The invariant TSC frequency can be computed by multiplying this ratio by 133.33 MHz. 27:16 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDC-TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDC and TDP Limits for Turbo mode are programmable. When set to 0, indicates TDC and TDP Limits for Turbo mode are not programmable. 39:30 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 133.33MHz. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. Vol. 4 2-121 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0 (no package C-sate support) 001b: C1 (Behavior is the same as 000b) 010b: C3 011b: C6 100b: C7 101b and 110b: Reserved 111: No package C-state limit. Note: This field cannot be used to limit package C-state to C3. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 23:16 Reserved 24 Interrupt filtering enable (R/W) When set, processor cores in a deep C-State will wake only when the event message is destined for that core. When 0, all processor cores in a deep C-State will wake for an event message. 25 C3 state auto demotion enable (R/W) When set, the processor will conditionally demote C6/C7 requests to C3 based on uncore auto-demote information. 26 C1 state auto demotion enable (R/W) When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore auto-demote information. 2-122 Vol. 4 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 29 Package C State Demotion Enable (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex E4H Register Name / Bit Fields Scope Bit Description Dec 228 30 Package C State UnDemotion Enable (R/W) 63:31 Reserved MSR_PMG_IO_CAPTURE_BASE Core Power Management IO Redirection in C-state (R/W) See http://biosbits.org. 15:0 LVL_2 Base Address (R/W) Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. 18:16 C-state Range (R/W) Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit10]: 000b - C3 is the max C-State to include. 001b - C6 is the max C-State to include. 010b - C7 is the max C-State to include. 63:19 E7H 231 IA32_MPERF Reserved Thread Maximum Performance Frequency Clock Count (RW) See Table 2-2. E8H 232 IA32_APERF Thread Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Thread See Table 2-2. 174H 372 IA32_SYSENTER_CS Thread See Table 2-2. 175H 373 IA32_SYSENTER_ESP Thread See Table 2-2. 176H 374 IA32_SYSENTER_EIP Thread See Table 2-2. 179H 377 IA32_MCG_CAP Thread See Table 2-2. 17AH 378 IA32_MCG_STATUS Thread Global Machine Check Status 0 RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. 1 EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. Vol. 4 2-123 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2 MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 186H 390 IA32_PERFEVTSEL0 Reserved Thread See Table 2-2. 7:0 Event Select 15:8 UMask 16 USR 17 OS 18 Edge 19 PC 20 INT 21 AnyThread 22 EN 23 INV 31:24 CMASK 63:32 Reserved 187H 391 IA32_PERFEVTSEL1 Thread See Table 2-2. 188H 392 IA32_PERFEVTSEL2 Thread See Table 2-2. 189H 393 IA32_PERFEVTSEL3 Thread See Table 2-2. 198H 408 IA32_PERF_STATUS Core See Table 2-2. 15:0 Current Performance State Value. 63:16 Reserved 199H 409 IA32_PERF_CTL Thread 19AH 410 IA32_CLOCK_MODULATION Thread See Table 2-2. Clock Modulation (R/W) See Table 2-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 0 19BH 411 Reserved 3:1 On demand Clock Modulation Duty Cycle (R/W) 4 On demand Clock Modulation Enable (R/W) 63:5 Reserved IA32_THERM_INTERRUPT Core Thermal Interrupt Control (R/W) See Table 2-2. 2-124 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 19CH 412 Register Name / Bit Fields IA32_THERM_STATUS Scope Core Bit Description Thermal Monitor Status (R/W) See Table 2-2. 1A0H 416 IA32_MISC_ENABLE Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Thread Fast-Strings Enable See Table 2-2. 2:1 3 Reserved Thread Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. Default value is 1. 6:4 7 Reserved Thread Performance Monitoring Available (R) See Table 2-2. 10:8 11 Reserved Thread Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Thread Processor Event Based Sampling Unavailable (RO) See Table 2-2. 15:13 16 Reserved Package Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. 18 Thread 21:19 ENABLE MONITOR FSM. (R/W) See Table 2-2. Reserved 22 Thread 23 Thread Limit CPUID Maxval (R/W) See Table 2-2. xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Thread XD Bit Disable (R/W) See Table 2-2. 37:35 Reserved Vol. 4 2-125 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 38 Package Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processor’s support of turbo mode is enabled. Note: The power-on default value is used by BIOS to detect hardware support of turbo mode. If the poweron default value is 1, turbo mode is available in the processor. If the power-on default value is 0, turbo mode is not available. 63:39 1A2H 418 MSR_TEMPERATURE_TARGET Reserved Thread 15:0 Temperature Target Reserved 23:16 Temperature Target (R) The minimum temperature at which PROCHOT# will be asserted. The value is degrees C. 1A4H 420 63:24 Reserved MSR_MISC_FEATURE_CONTROL Miscellaneous Feature Control (R/W) 0 Core L2 Hardware Prefetcher Disable (R/W) If 1, disables the L2 hardware prefetcher, which fetches additional lines of code or data into the L2 cache. 1 Core L2 Adjacent Cache Line Prefetcher Disable (R/W) If 1, disables the adjacent cache line prefetcher, which fetches the cache line that comprises a cache line pair (128 bytes). 2 Core DCU Hardware Prefetcher Disable (R/W) If 1, disables the L1 data cache prefetcher, which fetches the next cache line into L1 data cache. 3 Core DCU IP Prefetcher Disable (R/W) If 1, disables the L1 data cache IP prefetcher, which uses sequential load history (based on instruction pointer of previous loads) to determine whether to prefetch additional lines. 63:4 1A6H 422 MSR_OFFCORE_RSP_0 1AAH 426 MSR_MISC_PWR_MGMT Reserved Thread Offcore Response Event Select Register (R/W) Miscellaneous Power Management Control Various model specific features enumeration. See http://biosbits.org. 2-126 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 0 Package EIST Hardware Coordination Disable (R/W) When 0, enables hardware coordination of Enhanced Intel Speedstep Technology request from processor cores. When 1, disables hardware coordination of Enhanced Intel Speedstep Technology requests. 1 Thread Energy/Performance Bias Enable (R/W) This bit makes the IA32_ENERGY_PERF_BIAS register (MSR 1B0h) visible to software with Ring 0 privileges. This bit’s status (1 or 0) is also reflected by CPUID.(EAX=06h):ECX[3]. 1ACH 428 63:2 Reserved MSR_TURBO_POWER_CURRENT_LIMIT See http://biosbits.org. 14:0 Package TDP Limit (R/W) TDP limit in 1/8 Watt granularity. 15 Package TDP Limit Override Enable (R/W) A value = 0 indicates override is not active; a value = 1 indicates override is active. 30:16 Package TDC Limit (R/W) TDC limit in 1/8 Amp granularity. 31 Package TDC Limit Override Enable (R/W) A value = 0 indicates override is not active; a value = 1 indicates override is active. 63:32 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 63:32 1C8H 456 MSR_LBR_SELECT Reserved Core Last Branch Record Filtering Select Register (R/W) See Section 17.9.2, “Filtering of Last Branch Records.” 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC Vol. 4 2-127 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex 1C9H Register Name / Bit Fields Scope Bit Description Dec 457 3 NEAR_REL_CALL 4 NEAR_IND_CALL 5 NEAR_RET 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 63:9 Reserved MSR_LASTBRANCH_TOS Thread Last Branch Record Stack TOS (R/W) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 680H). 1D9H 473 IA32_DEBUGCTL Thread 1DDH 477 MSR_LER_FROM_LIP Thread Debug Control (R/W) See Table 2-2. Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Thread Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1F2H 498 IA32_SMRR_PHYSBASE Core See Table 2-2. 1F3H 499 IA32_SMRR_PHYSMASK Core See Table 2-2. 1FCH 508 MSR_POWER_CTL Core Power Control Register See http://biosbits.org. 0 1 Reserved Package C1E Enable (R/W) When set to ‘1’, will enable the CPU to switch to the Minimum Enhanced Intel SpeedStep Technology operating point when all execution cores enter MWAIT (C1). 63:2 Reserved 200H 512 IA32_MTRR_PHYSBASE0 Thread See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Thread See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Thread See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Thread See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Thread See Table 2-2. 2-128 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 205H 517 IA32_MTRR_PHYSMASK2 Thread See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Thread See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Thread See Table 2-2. 208H 520 IA32_MTRR_PHYSBASE4 Thread See Table 2-2. 209H 521 IA32_MTRR_PHYSMASK4 Thread See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Thread See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Thread See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Thread See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Thread See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Thread See Table 2-2. 20FH 527 IA32_MTRR_PHYSMASK7 Thread See Table 2-2. 210H 528 IA32_MTRR_PHYSBASE8 Thread See Table 2-2. 211H 529 IA32_MTRR_PHYSMASK8 Thread See Table 2-2. 212H 530 IA32_MTRR_PHYSBASE9 Thread See Table 2-2. 213H 531 IA32_MTRR_PHYSMASK9 Thread See Table 2-2. 250H 592 IA32_MTRR_FIX64K_00000 Thread See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Thread See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Thread See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Thread See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Thread See Table 2-2. 26AH 618 IA32_MTRR_FIX4K_D0000 Thread See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Thread See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Thread See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Thread See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Thread See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Thread See Table 2-2. 277H 631 IA32_PAT Thread See Table 2-2. 280H 640 IA32_MC0_CTL2 Package See Table 2-2. 281H 641 IA32_MC1_CTL2 Package See Table 2-2. 282H 642 IA32_MC2_CTL2 Core See Table 2-2. 283H 643 IA32_MC3_CTL2 Core See Table 2-2. 284H 644 IA32_MC4_CTL2 Core See Table 2-2. 285H 645 IA32_MC5_CTL2 Core See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 288H 648 IA32_MC8_CTL2 Package See Table 2-2. Vol. 4 2-129 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 2FFH 767 Register Name / Bit Fields IA32_MTRR_DEF_TYPE Scope Thread Bit Description Default Memory Types (R/W) See Table 2-2. 309H 777 IA32_FIXED_CTR0 Thread Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Thread Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. 30BH 779 IA32_FIXED_CTR2 Thread Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES Thread 5:0 See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” LBR Format See Table 2-2. 6 PEBS Record Format 7 PEBSSaveArchRegs See Table 2-2. 11:8 PEBS_REC_FORMAT See Table 2-2. 12 SMM_FREEZE See Table 2-2. 63:13 Reserved 38DH 909 IA32_FIXED_CTR_CTRL Thread Fixed-Function-Counter Control Register (R/W) 38EH 910 IA32_PERF_GLOBAL_STATUS Thread See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 38EH 910 MSR_PERF_GLOBAL_STATUS Thread Provides single-bit status used by software to query the overflow condition of each performance counter. (RO) See Table 2-2. 61 UNC_Ovf Uncore overflowed if 1. 38FH 911 IA32_PERF_GLOBAL_CTRL Thread See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 390H 912 IA32_PERF_GLOBAL_OVF_CTRL Thread See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” Allows software to clear counter overflow conditions on any combination of fixedfunction PMCs (IA32_FIXED_CTRx) or general-purpose PMCs via a single WRMSR. 390H 912 MSR_PERF_GLOBAL_OVF_CTRL Thread (R/W) 61 CLR_UNC_Ovf Set 1 to clear UNC_Ovf. 2-130 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 3F1H 1009 3F6H 1014 Register Name / Bit Fields MSR_PEBS_ENABLE Scope Thread Enable PEBS on IA32_PMC0 (R/W) 1 Enable PEBS on IA32_PMC1 (R/W) 2 Enable PEBS on IA32_PMC2 (R/W) 3 Enable PEBS on IA32_PMC3 (R/W) 31:4 Reserved 32 Enable Load Latency on IA32_PMC0 (R/W) 33 Enable Load Latency on IA32_PMC1 (R/W) 34 Enable Load Latency on IA32_PMC2 (R/W) 35 Enable Load Latency on IA32_PMC3 (R/W) 63:36 Reserved MSR_PEBS_LD_LAT Thread MSR_PKG_C3_RESIDENCY See Section 18.3.1.1.2, “Load Latency Performance Monitoring Facility.” Minimum threshold latency value of tagged load operation that will be counted. (R/W) 63:36 1016 See Section 18.3.1.1.1, “Processor Event Based Sampling (PEBS).” 0 15:0 3F8H Bit Description Reserved Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C3 Residency Counter (R/O) Value since last reset that this package is in processorspecific C3 states. Count at the same frequency as the TSC. 3F9H 1017 MSR_PKG_C6_RESIDENCY Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C6 Residency Counter (R/O) Value since last reset that this package is in processorspecific C6 states. Count at the same frequency as the TSC. 3FAH 1018 MSR_PKG_C7_RESIDENCY Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C7 Residency Counter (R/O) Value since last reset that this package is in processorspecific C7 states. Count at the same frequency as the TSC. 3FCH 1020 MSR_CORE_C3_RESIDENCY Core Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Vol. 4 2-131 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 63:0 CORE C3 Residency Counter (R/O) Value since last reset that this core is in processorspecific C3 states. Count at the same frequency as the TSC. 3FDH 1021 MSR_CORE_C6_RESIDENCY Core 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C6 Residency Counter (R/O) Value since last reset that this core is in processorspecific C6 states. Count at the same frequency as the TSC. 400H 1024 IA32_MC0_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 403H 1027 IA32_MC0_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 404H 1028 IA32_MC1_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 406H 1030 IA32_MC1_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 407H 1031 IA32_MC1_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 408H 1032 IA32_MC2_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40BH 1035 IA32_MC2_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 40CH 1036 IA32_MC3_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 2-132 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 40DH 1037 IA32_MC3_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC3_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40FH 1039 IA32_MC3_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 410H 1040 IA32_MC4_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC4_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC4_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 413H 1043 IA32_MC4_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 414H 1044 IA32_MC5_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 415H 1045 IA32_MC5_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 416H 1046 IA32_MC5_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 417H 1047 IA32_MC5_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 418H 1048 IA32_MC6_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 419H 1049 IA32_MC6_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 41AH 1050 IA32_MC6_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 41BH 1051 IA32_MC6_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 41CH 1052 IA32_MC7_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 41DH 1053 IA32_MC7_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 41EH 1054 IA32_MC7_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 41FH 1055 IA32_MC7_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 420H 1056 IA32_MC8_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 421H 1057 IA32_MC8_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 422H 1058 IA32_MC8_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 423H 1059 IA32_MC8_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” Vol. 4 2-133 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 480H 1152 Register Name / Bit Fields IA32_VMX_BASIC Scope Thread Bit Description Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Thread Capability Reporting Register of Pin-based VM-execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS Thread Capability Reporting Register of Primary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 483H 1155 IA32_VMX_EXIT_CTLS Thread Capability Reporting Register of VM-Exit Controls (R/O) See Table 2-2. See Appendix A.4, “VM-Exit Controls.” 484H 1156 IA32_VMX_ENTRY_CTLS Thread Capability Reporting Register of VM-Entry Controls (R/O) See Table 2-2. See Appendix A.5, “VM-Entry Controls.” 485H 1157 IA32_VMX_MISC Thread Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. See Appendix A.6, “Miscellaneous Data.” 486H 1158 IA32_VMX_CR0_FIXED0 Thread Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Thread Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Thread Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 489H 1161 IA32_VMX_CR4_FIXED1 Thread Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Thread Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. See Appendix A.9, “VMCS Enumeration.” 2-134 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 48BH 1163 Register Name / Bit Fields IA32_VMX_PROCBASED_CTLS2 Scope Thread Bit Description Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 600H 1536 IA32_DS_AREA Thread DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 680H 1664 MSR_LASTBRANCH_0_FROM_IP Thread Last Branch Record 0 From IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. The From_IP part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.9.1 and record format in Section 17.4.8.1. 681H 1665 MSR_LASTBRANCH_1_FROM_IP Thread Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 682H 1666 MSR_LASTBRANCH_2_FROM_IP Thread Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 683H 1667 MSR_LASTBRANCH_3_FROM_IP Thread Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 684H 1668 MSR_LASTBRANCH_4_FROM_IP Thread Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 685H 1669 MSR_LASTBRANCH_5_FROM_IP Thread Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 686H 1670 MSR_LASTBRANCH_6_FROM_IP Thread Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 687H 1671 MSR_LASTBRANCH_7_FROM_IP Thread Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 688H 1672 MSR_LASTBRANCH_8_FROM_IP Thread Last Branch Record 8 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 689H 1673 MSR_LASTBRANCH_9_FROM_IP Thread Last Branch Record 9 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68AH 1674 MSR_LASTBRANCH_10_FROM_IP Thread Last Branch Record 10 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68BH 1675 MSR_LASTBRANCH_11_FROM_IP Thread Last Branch Record 11 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68CH 1676 MSR_LASTBRANCH_12_FROM_IP Thread Last Branch Record 12 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68DH 1677 MSR_LASTBRANCH_13_FROM_IP Thread Last Branch Record 13 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. Vol. 4 2-135 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Hex Dec 68EH 1678 Register Name / Bit Fields MSR_LASTBRANCH_14_FROM_IP Scope Thread Bit Description Last Branch Record 14 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68FH 1679 MSR_LASTBRANCH_15_FROM_IP Thread Last Branch Record 15 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 6C0H 1728 MSR_LASTBRANCH_0_TO_IP Thread Last Branch Record 0 To IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction. 6C1H 1729 MSR_LASTBRANCH_1_TO_IP Thread Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C2H 1730 MSR_LASTBRANCH_2_TO_IP Thread Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C3H 1731 MSR_LASTBRANCH_3_TO_IP Thread Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C4H 1732 MSR_LASTBRANCH_4_TO_IP Thread Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C5H 1733 MSR_LASTBRANCH_5_TO_IP Thread Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C6H 1734 MSR_LASTBRANCH_6_TO_IP Thread Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C7H 1735 MSR_LASTBRANCH_7_TO_IP Thread Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C8H 1736 MSR_LASTBRANCH_8_TO_IP Thread Last Branch Record 8 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C9H 1737 MSR_LASTBRANCH_9_TO_IP Thread Last Branch Record 9 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CAH 1738 MSR_LASTBRANCH_10_TO_IP Thread Last Branch Record 10 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CBH 1739 MSR_LASTBRANCH_11_TO_IP Thread Last Branch Record 11 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CCH 1740 MSR_LASTBRANCH_12_TO_IP Thread Last Branch Record 12 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CDH 1741 MSR_LASTBRANCH_13_TO_IP Thread Last Branch Record 13 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CEH 1742 MSR_LASTBRANCH_14_TO_IP Thread Last Branch Record 14 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CFH 1743 MSR_LASTBRANCH_15_TO_IP Thread Last Branch Record 15 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 802H 2-136 Vol. 4 2050 IA32_X2APIC_APICID Thread x2APIC ID Register (R/O) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 803H 2051 IA32_X2APIC_VERSION Thread x2APIC Version Register (R/O) 808H 2056 IA32_X2APIC_TPR Thread x2APIC Task Priority Register (R/W) 80AH 2058 IA32_X2APIC_PPR Thread x2APIC Processor Priority Register (R/O) 80BH 2059 IA32_X2APIC_EOI Thread x2APIC EOI Register (W/O) 80DH 2061 IA32_X2APIC_LDR Thread x2APIC Logical Destination Register (R/O) 80FH 2063 IA32_X2APIC_SIVR Thread x2APIC Spurious Interrupt Vector Register (R/W) 810H 2064 IA32_X2APIC_ISR0 Thread x2APIC In-Service Register Bits [31:0] (R/O) 811H 2065 IA32_X2APIC_ISR1 Thread x2APIC In-Service Register Bits [63:32] (R/O) 812H 2066 IA32_X2APIC_ISR2 Thread x2APIC In-Service Register Bits [95:64] (R/O) 813H 2067 IA32_X2APIC_ISR3 Thread x2APIC In-Service Register Bits [127:96] (R/O) 814H 2068 IA32_X2APIC_ISR4 Thread x2APIC In-Service Register Bits [159:128] (R/O) 815H 2069 IA32_X2APIC_ISR5 Thread x2APIC In-Service Register Bits [191:160] (R/O) 816H 2070 IA32_X2APIC_ISR6 Thread x2APIC In-Service Register Bits [223:192] (R/O) 817H 2071 IA32_X2APIC_ISR7 Thread x2APIC In-Service Register Bits [255:224] (R/O) 818H 2072 IA32_X2APIC_TMR0 Thread x2APIC Trigger Mode Register Bits [31:0] (R/O) 819H 2073 IA32_X2APIC_TMR1 Thread x2APIC Trigger Mode Register Bits [63:32] (R/O) 81AH 2074 IA32_X2APIC_TMR2 Thread x2APIC Trigger Mode Register Bits [95:64] (R/O) 81BH 2075 IA32_X2APIC_TMR3 Thread x2APIC Trigger Mode Register Bits [127:96] (R/O) 81CH 2076 IA32_X2APIC_TMR4 Thread x2APIC Trigger Mode Register Bits [159:128] (R/O) 81DH 2077 IA32_X2APIC_TMR5 Thread x2APIC Trigger Mode Register Bits [191:160] (R/O) 81EH 2078 IA32_X2APIC_TMR6 Thread x2APIC Trigger Mode Register Bits [223:192] (R/O) 81FH 2079 IA32_X2APIC_TMR7 Thread x2APIC Trigger Mode Register Bits [255:224] (R/O) 820H 2080 IA32_X2APIC_IRR0 Thread x2APIC Interrupt Request Register Bits [31:0] (R/O) 821H 2081 IA32_X2APIC_IRR1 Thread x2APIC Interrupt Request Register Bits [63:32] (R/O) 822H 2082 IA32_X2APIC_IRR2 Thread x2APIC Interrupt Request Register Bits [95:64] (R/O) 823H 2083 IA32_X2APIC_IRR3 Thread x2APIC Interrupt Request Register Bits [127:96] (R/O) 824H 2084 IA32_X2APIC_IRR4 Thread x2APIC Interrupt Request Register Bits [159:128] (R/O) 825H 2085 IA32_X2APIC_IRR5 Thread x2APIC Interrupt Request Register Bits [191:160] (R/O) 826H 2086 IA32_X2APIC_IRR6 Thread x2APIC Interrupt Request Register Bits [223:192] (R/O) 827H 2087 IA32_X2APIC_IRR7 Thread x2APIC Interrupt Request Register Bits [255:224] (R/O) 828H 2088 IA32_X2APIC_ESR Thread x2APIC Error Status Register (R/W) 82FH 2095 IA32_X2APIC_LVT_CMCI Thread x2APIC LVT Corrected Machine Check Interrupt Register (R/W) 830H 2096 IA32_X2APIC_ICR Thread x2APIC Interrupt Command Register (R/W) 832H 2098 IA32_X2APIC_LVT_TIMER Thread x2APIC LVT Timer Interrupt Register (R/W) 833H 2099 IA32_X2APIC_LVT_THERMAL Thread x2APIC LVT Thermal Sensor Interrupt Register (R/W) Vol. 4 2-137 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-15. MSRs in Processors Based on Intel® Microarchitecture Code Name Nehalem (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 834H 2100 IA32_X2APIC_LVT_PMI Thread x2APIC LVT Performance Monitor Register (R/W) 835H 2101 IA32_X2APIC_LVT_LINT0 Thread x2APIC LVT LINT0 Register (R/W) 836H 2102 IA32_X2APIC_LVT_LINT1 Thread x2APIC LVT LINT1 Register (R/W) 837H 2103 IA32_X2APIC_LVT_ERROR Thread x2APIC LVT Error Register (R/W) 838H 2104 IA32_X2APIC_INIT_COUNT Thread x2APIC Initial Count Register (R/W) 839H 2105 IA32_X2APIC_CUR_COUNT Thread x2APIC Current Count Register (R/O) 83EH 2110 IA32_X2APIC_DIV_CONF Thread x2APIC Divide Configuration Register (R/W) 83FH 2111 IA32_X2APIC_SELF_IPI Thread x2APIC Self IPI Register (W/O) C000_ 0080H IA32_EFER Thread Extended Feature Enables C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE C000_ 0103H IA32_TSC_AUX 2.8.1 See Table 2-2. Thread System Call Target Address (R/W) See Table 2-2. Thread IA-32e Mode System Call Target Address (R/W) See Table 2-2. Thread System Call Flag Mask (R/W) See Table 2-2. Thread Map of BASE Address of FS (R/W) See Table 2-2. Thread Map of BASE Address of GS (R/W) See Table 2-2. Thread Swap Target of BASE Address of GS (R/W) See Table 2-2. Thread AUXILIARY TSC Signature (R/W) See Table 2-2 and Section 17.17.2, “IA32_TSC_AUX Register and RDTSCP Support.” Additional MSRs in the Intel® Xeon® Processor 5500 and 3400 Series Intel Xeon Processor 5500 and 3400 series support additional model-specific registers listed in Table 2-16. These MSRs also apply to Intel Core i7 and i5 processor family CPUID signature with DisplayFamily_DisplayModel of 06_1AH, 06_1EH and 06_1FH, see Table 2-1. Table 2-16. Additional MSRs in Intel® Xeon® Processor 5500 and 3400 Series Register Address Hex Dec 1ADH 429 Register Name / Bit Fields MSR_TURBO_RATIO_LIMIT Scope Package Bit Description Actual maximum turbo frequency is multiplied by 133.33MHz. (Not available in model 06_2EH.) 2-138 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-16. Additional MSRs in Intel® Xeon® Processor 5500 and 3400 Series (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7:0 Maximum Turbo Ratio Limit 1C (R/O) Maximum Turbo mode ratio limit with 1 core active. 15:8 Maximum Turbo Ratio Limit 2C (R/O) Maximum Turbo mode ratio limit with 2 cores active. 23:16 Maximum Turbo Ratio Limit 3C (R/O) Maximum Turbo mode ratio limit with 3 cores active. 31:24 Maximum Turbo Ratio Limit 4C (R/O) Maximum Turbo mode ratio limit with 4 cores active. 63:32 301H 769 MSR_GQ_SNOOP_MESF Reserved Package 0 From M to S (R/W) 1 From E to S (R/W) 2 From S to S (R/W) 3 From F to S (R/W) 4 From M to I (R/W) 5 From E to I (R/W) 6 From S to I (R/W) 7 From F to I (R/W) 63:8 Reserved 391H 913 MSR_UNCORE_PERF_GLOBAL_CTRL Package See Section 18.3.1.2.1, “Uncore Performance Monitoring Management Facility.” 392H 914 MSR_UNCORE_PERF_GLOBAL_STATUS Package See Section 18.3.1.2.1, “Uncore Performance Monitoring Management Facility.” 393H 915 MSR_UNCORE_PERF_GLOBAL_OVF_CTRL Package See Section 18.3.1.2.1, “Uncore Performance Monitoring Management Facility.” 394H 916 MSR_UNCORE_FIXED_CTR0 Package See Section 18.3.1.2.1, “Uncore Performance Monitoring Management Facility.” 395H 917 MSR_UNCORE_FIXED_CTR_CTRL Package See Section 18.3.1.2.1, “Uncore Performance Monitoring Management Facility.” 396H 918 MSR_UNCORE_ADDR_OPCODE_MATCH Package See Section 18.3.1.2.3, “Uncore Address/Opcode Match MSR.” 3B0H 960 MSR_UNCORE_PMC0 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B1H 961 MSR_UNCORE_PMC1 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B2H 962 MSR_UNCORE_PMC2 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B3H 963 MSR_UNCORE_PMC3 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B4H 964 MSR_UNCORE_PMC4 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” Vol. 4 2-139 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-16. Additional MSRs in Intel® Xeon® Processor 5500 and 3400 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 3B5H 965 MSR_UNCORE_PMC5 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B6H 966 MSR_UNCORE_PMC6 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3B7H 967 MSR_UNCORE_PMC7 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C0H 944 MSR_UNCORE_PERFEVTSEL0 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C1H 945 MSR_UNCORE_PERFEVTSEL1 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C2H 946 MSR_UNCORE_PERFEVTSEL2 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C3H 947 MSR_UNCORE_PERFEVTSEL3 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C4H 948 MSR_UNCORE_PERFEVTSEL4 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C5H 949 MSR_UNCORE_PERFEVTSEL5 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C6H 950 MSR_UNCORE_PERFEVTSEL6 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 3C7H 951 MSR_UNCORE_PERFEVTSEL7 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 2.8.2 Additional MSRs in the Intel® Xeon® Processor 7500 Series Intel Xeon Processor 7500 series support MSRs listed in Table 2-15 (except MSR address 1ADH) and additional model-specific registers listed in Table 2-17. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2EH. Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series Register Address Hex Dec 1ADH 429 Register Name / Bit Fields MSR_TURBO_RATIO_LIMIT Scope Package Bit Description Reserved Attempt to read/write will cause #UD. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 28BH 651 IA32_MC11_CTL2 Package See Table 2-2. 28CH 652 IA32_MC12_CTL2 Package See Table 2-2. 28DH 653 IA32_MC13_CTL2 Package See Table 2-2. 28EH 654 IA32_MC14_CTL2 Package See Table 2-2. 28FH 655 IA32_MC15_CTL2 Package See Table 2-2. 2-140 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 290H 656 IA32_MC16_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 294H 660 IA32_MC20_CTL2 Package See Table 2-2. 295H 661 IA32_MC21_CTL2 Package See Table 2-2. 394H 816 MSR_W_PMON_FIXED_CTR Package Uncore W-box perfmon fixed counter. 395H 817 MSR_W_PMON_FIXED_CTR_CTL Package Uncore U-box perfmon fixed counter control MSR. 424H 1060 IA32_MC9_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 425H 1061 IA32_MC9_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 426H 1062 IA32_MC9_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 427H 1063 IA32_MC9_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 428H 1064 IA32_MC10_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 429H 1065 IA32_MC10_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 42AH 1066 IA32_MC10_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 42BH 1067 IA32_MC10_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 42CH 1068 IA32_MC11_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 42DH 1069 IA32_MC11_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 42EH 1070 IA32_MC11_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 42FH 1071 IA32_MC11_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 430H 1072 IA32_MC12_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 431H 1073 IA32_MC12_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 432H 1074 IA32_MC12_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 433H 1075 IA32_MC12_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 434H 1076 IA32_MC13_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 435H 1077 IA32_MC13_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 436H 1078 IA32_MC13_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 437H 1079 IA32_MC13_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 438H 1080 IA32_MC14_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 439H 1081 IA32_MC14_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 43AH 1082 IA32_MC14_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 43BH 1083 IA32_MC14_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 43CH 1084 IA32_MC15_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” Vol. 4 2-141 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 43DH 1085 IA32_MC15_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 43EH 1086 IA32_MC15_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 43FH 1087 IA32_MC15_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 440H 1088 IA32_MC16_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 441H 1089 IA32_MC16_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 442H 1090 IA32_MC16_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 443H 1091 IA32_MC16_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 444H 1092 IA32_MC17_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 445H 1093 IA32_MC17_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 446H 1094 IA32_MC17_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 447H 1095 IA32_MC17_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 448H 1096 IA32_MC18_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 449H 1097 IA32_MC18_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 44AH 1098 IA32_MC18_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 44BH 1099 IA32_MC18_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 44CH 1100 IA32_MC19_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 44DH 1101 IA32_MC19_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 44EH 1102 IA32_MC19_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 44FH 1103 IA32_MC19_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 450H 1104 IA32_MC20_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 451H 1105 IA32_MC20_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 452H 1106 IA32_MC20_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 453H 1107 IA32_MC20_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 454H 1108 IA32_MC21_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 455H 1109 IA32_MC21_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 456H 1110 IA32_MC21_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 457H 1111 IA32_MC21_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” C00H 3072 MSR_U_PMON_GLOBAL_CTRL Package Uncore U-box perfmon global control MSR. C01H 3073 MSR_U_PMON_GLOBAL_STATUS Package Uncore U-box perfmon global status MSR. C02H 3074 MSR_U_PMON_GLOBAL_OVF_CTRL Package Uncore U-box perfmon global overflow control MSR. 2-142 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec C10H 3088 MSR_U_PMON_EVNT_SEL Package Uncore U-box perfmon event select MSR. C11H 3089 MSR_U_PMON_CTR Package Uncore U-box perfmon counter MSR. C20H 3104 MSR_B0_PMON_BOX_CTRL Package Uncore B-box 0 perfmon local box control MSR. C21H 3105 MSR_B0_PMON_BOX_STATUS Package Uncore B-box 0 perfmon local box status MSR. C22H 3106 MSR_B0_PMON_BOX_OVF_CTRL Package Uncore B-box 0 perfmon local box overflow control MSR. C30H 3120 MSR_B0_PMON_EVNT_SEL0 Package Uncore B-box 0 perfmon event select MSR. C31H 3121 MSR_B0_PMON_CTR0 Package Uncore B-box 0 perfmon counter MSR. C32H 3122 MSR_B0_PMON_EVNT_SEL1 Package Uncore B-box 0 perfmon event select MSR. C33H 3123 MSR_B0_PMON_CTR1 Package Uncore B-box 0 perfmon counter MSR. C34H 3124 MSR_B0_PMON_EVNT_SEL2 Package Uncore B-box 0 perfmon event select MSR. C35H 3125 MSR_B0_PMON_CTR2 Package Uncore B-box 0 perfmon counter MSR. C36H 3126 MSR_B0_PMON_EVNT_SEL3 Package Uncore B-box 0 perfmon event select MSR. C37H 3127 MSR_B0_PMON_CTR3 Package Uncore B-box 0 perfmon counter MSR. C40H 3136 MSR_S0_PMON_BOX_CTRL Package Uncore S-box 0 perfmon local box control MSR. C41H 3137 MSR_S0_PMON_BOX_STATUS Package Uncore S-box 0 perfmon local box status MSR. C42H 3138 MSR_S0_PMON_BOX_OVF_CTRL Package Uncore S-box 0 perfmon local box overflow control MSR. C50H 3152 MSR_S0_PMON_EVNT_SEL0 Package Uncore S-box 0 perfmon event select MSR. C51H 3153 MSR_S0_PMON_CTR0 Package Uncore S-box 0 perfmon counter MSR. C52H 3154 MSR_S0_PMON_EVNT_SEL1 Package Uncore S-box 0 perfmon event select MSR. C53H 3155 MSR_S0_PMON_CTR1 Package Uncore S-box 0 perfmon counter MSR. C54H 3156 MSR_S0_PMON_EVNT_SEL2 Package Uncore S-box 0 perfmon event select MSR. C55H 3157 MSR_S0_PMON_CTR2 Package Uncore S-box 0 perfmon counter MSR. C56H 3158 MSR_S0_PMON_EVNT_SEL3 Package Uncore S-box 0 perfmon event select MSR. C57H 3159 MSR_S0_PMON_CTR3 Package Uncore S-box 0 perfmon counter MSR. C60H 3168 MSR_B1_PMON_BOX_CTRL Package Uncore B-box 1 perfmon local box control MSR. Vol. 4 2-143 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec C61H 3169 MSR_B1_PMON_BOX_STATUS Package Uncore B-box 1 perfmon local box status MSR. C62H 3170 MSR_B1_PMON_BOX_OVF_CTRL Package Uncore B-box 1 perfmon local box overflow control MSR. C70H 3184 MSR_B1_PMON_EVNT_SEL0 Package Uncore B-box 1 perfmon event select MSR. C71H 3185 MSR_B1_PMON_CTR0 Package Uncore B-box 1 perfmon counter MSR. C72H 3186 MSR_B1_PMON_EVNT_SEL1 Package Uncore B-box 1 perfmon event select MSR. C73H 3187 MSR_B1_PMON_CTR1 Package Uncore B-box 1 perfmon counter MSR. C74H 3188 MSR_B1_PMON_EVNT_SEL2 Package Uncore B-box 1 perfmon event select MSR. C75H 3189 MSR_B1_PMON_CTR2 Package Uncore B-box 1 perfmon counter MSR. C76H 3190 MSR_B1_PMON_EVNT_SEL3 Package Uncore B-box 1vperfmon event select MSR. C77H 3191 MSR_B1_PMON_CTR3 Package Uncore B-box 1 perfmon counter MSR. C80H 3120 MSR_W_PMON_BOX_CTRL Package Uncore W-box perfmon local box control MSR. C81H 3121 MSR_W_PMON_BOX_STATUS Package Uncore W-box perfmon local box status MSR. C82H 3122 MSR_W_PMON_BOX_OVF_CTRL Package Uncore W-box perfmon local box overflow control MSR. C90H 3136 MSR_W_PMON_EVNT_SEL0 Package Uncore W-box perfmon event select MSR. C91H 3137 MSR_W_PMON_CTR0 Package Uncore W-box perfmon counter MSR. C92H 3138 MSR_W_PMON_EVNT_SEL1 Package Uncore W-box perfmon event select MSR. C93H 3139 MSR_W_PMON_CTR1 Package Uncore W-box perfmon counter MSR. C94H 3140 MSR_W_PMON_EVNT_SEL2 Package Uncore W-box perfmon event select MSR. C95H 3141 MSR_W_PMON_CTR2 Package Uncore W-box perfmon counter MSR. C96H 3142 MSR_W_PMON_EVNT_SEL3 Package Uncore W-box perfmon event select MSR. C97H 3143 MSR_W_PMON_CTR3 Package Uncore W-box perfmon counter MSR. CA0H 3232 MSR_M0_PMON_BOX_CTRL Package Uncore M-box 0 perfmon local box control MSR. CA1H 3233 MSR_M0_PMON_BOX_STATUS Package Uncore M-box 0 perfmon local box status MSR. CA2H 3234 MSR_M0_PMON_BOX_OVF_CTRL Package Uncore M-box 0 perfmon local box overflow control MSR. CA4H 3236 MSR_M0_PMON_TIMESTAMP Package Uncore M-box 0 perfmon time stamp unit select MSR. CA5H 3237 MSR_M0_PMON_DSP Package Uncore M-box 0 perfmon DSP unit select MSR. 2-144 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec CA6H 3238 MSR_M0_PMON_ISS Package Uncore M-box 0 perfmon ISS unit select MSR. CA7H 3239 MSR_M0_PMON_MAP Package Uncore M-box 0 perfmon MAP unit select MSR. CA8H 3240 MSR_M0_PMON_MSC_THR Package Uncore M-box 0 perfmon MIC THR select MSR. CA9H 3241 MSR_M0_PMON_PGT Package Uncore M-box 0 perfmon PGT unit select MSR. CAAH 3242 MSR_M0_PMON_PLD Package Uncore M-box 0 perfmon PLD unit select MSR. CABH 3243 MSR_M0_PMON_ZDP Package Uncore M-box 0 perfmon ZDP unit select MSR. CB0H 3248 MSR_M0_PMON_EVNT_SEL0 Package Uncore M-box 0 perfmon event select MSR. CB1H 3249 MSR_M0_PMON_CTR0 Package Uncore M-box 0 perfmon counter MSR. CB2H 3250 MSR_M0_PMON_EVNT_SEL1 Package Uncore M-box 0 perfmon event select MSR. CB3H 3251 MSR_M0_PMON_CTR1 Package Uncore M-box 0 perfmon counter MSR. CB4H 3252 MSR_M0_PMON_EVNT_SEL2 Package Uncore M-box 0 perfmon event select MSR. CB5H 3253 MSR_M0_PMON_CTR2 Package Uncore M-box 0 perfmon counter MSR. CB6H 3254 MSR_M0_PMON_EVNT_SEL3 Package Uncore M-box 0 perfmon event select MSR. CB7H 3255 MSR_M0_PMON_CTR3 Package Uncore M-box 0 perfmon counter MSR. CB8H 3256 MSR_M0_PMON_EVNT_SEL4 Package Uncore M-box 0 perfmon event select MSR. CB9H 3257 MSR_M0_PMON_CTR4 Package Uncore M-box 0 perfmon counter MSR. CBAH 3258 MSR_M0_PMON_EVNT_SEL5 Package Uncore M-box 0 perfmon event select MSR. CBBH 3259 MSR_M0_PMON_CTR5 Package Uncore M-box 0 perfmon counter MSR. CC0H 3264 MSR_S1_PMON_BOX_CTRL Package Uncore S-box 1 perfmon local box control MSR. CC1H 3265 MSR_S1_PMON_BOX_STATUS Package Uncore S-box 1 perfmon local box status MSR. CC2H 3266 MSR_S1_PMON_BOX_OVF_CTRL Package Uncore S-box 1 perfmon local box overflow control MSR. CD0H 3280 MSR_S1_PMON_EVNT_SEL0 Package Uncore S-box 1 perfmon event select MSR. CD1H 3281 MSR_S1_PMON_CTR0 Package Uncore S-box 1 perfmon counter MSR. CD2H 3282 MSR_S1_PMON_EVNT_SEL1 Package Uncore S-box 1 perfmon event select MSR. CD3H 3283 MSR_S1_PMON_CTR1 Package Uncore S-box 1 perfmon counter MSR. CD4H 3284 MSR_S1_PMON_EVNT_SEL2 Package Uncore S-box 1 perfmon event select MSR. CD5H 3285 MSR_S1_PMON_CTR2 Package Uncore S-box 1 perfmon counter MSR. Vol. 4 2-145 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec CD6H 3286 MSR_S1_PMON_EVNT_SEL3 Package Uncore S-box 1 perfmon event select MSR. CD7H 3287 MSR_S1_PMON_CTR3 Package Uncore S-box 1 perfmon counter MSR. CE0H 3296 MSR_M1_PMON_BOX_CTRL Package Uncore M-box 1 perfmon local box control MSR. CE1H 3297 MSR_M1_PMON_BOX_STATUS Package Uncore M-box 1 perfmon local box status MSR. CE2H 3298 MSR_M1_PMON_BOX_OVF_CTRL Package Uncore M-box 1 perfmon local box overflow control MSR. CE4H 3300 MSR_M1_PMON_TIMESTAMP Package Uncore M-box 1 perfmon time stamp unit select MSR. CE5H 3301 MSR_M1_PMON_DSP Package Uncore M-box 1 perfmon DSP unit select MSR. CE6H 3302 MSR_M1_PMON_ISS Package Uncore M-box 1 perfmon ISS unit select MSR. CE7H 3303 MSR_M1_PMON_MAP Package Uncore M-box 1 perfmon MAP unit select MSR. CE8H 3304 MSR_M1_PMON_MSC_THR Package Uncore M-box 1 perfmon MIC THR select MSR. CE9H 3305 MSR_M1_PMON_PGT Package Uncore M-box 1 perfmon PGT unit select MSR. CEAH 3306 MSR_M1_PMON_PLD Package Uncore M-box 1 perfmon PLD unit select MSR. CEBH 3307 MSR_M1_PMON_ZDP Package Uncore M-box 1 perfmon ZDP unit select MSR. CF0H 3312 MSR_M1_PMON_EVNT_SEL0 Package Uncore M-box 1 perfmon event select MSR. CF1H 3313 MSR_M1_PMON_CTR0 Package Uncore M-box 1 perfmon counter MSR. CF2H 3314 MSR_M1_PMON_EVNT_SEL1 Package Uncore M-box 1 perfmon event select MSR. CF3H 3315 MSR_M1_PMON_CTR1 Package Uncore M-box 1 perfmon counter MSR. CF4H 3316 MSR_M1_PMON_EVNT_SEL2 Package Uncore M-box 1 perfmon event select MSR. CF5H 3317 MSR_M1_PMON_CTR2 Package Uncore M-box 1 perfmon counter MSR. CF6H 3318 MSR_M1_PMON_EVNT_SEL3 Package Uncore M-box 1 perfmon event select MSR. CF7H 3319 MSR_M1_PMON_CTR3 Package Uncore M-box 1 perfmon counter MSR. CF8H 3320 MSR_M1_PMON_EVNT_SEL4 Package Uncore M-box 1 perfmon event select MSR. CF9H 3321 MSR_M1_PMON_CTR4 Package Uncore M-box 1 perfmon counter MSR. CFAH 3322 MSR_M1_PMON_EVNT_SEL5 Package Uncore M-box 1 perfmon event select MSR. CFBH 3323 MSR_M1_PMON_CTR5 Package Uncore M-box 1 perfmon counter MSR. D00H 3328 MSR_C0_PMON_BOX_CTRL Package Uncore C-box 0 perfmon local box control MSR. D01H 3329 MSR_C0_PMON_BOX_STATUS Package Uncore C-box 0 perfmon local box status MSR. 2-146 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec D02H 3330 MSR_C0_PMON_BOX_OVF_CTRL Package Uncore C-box 0 perfmon local box overflow control MSR. D10H 3344 MSR_C0_PMON_EVNT_SEL0 Package Uncore C-box 0 perfmon event select MSR. D11H 3345 MSR_C0_PMON_CTR0 Package Uncore C-box 0 perfmon counter MSR. D12H 3346 MSR_C0_PMON_EVNT_SEL1 Package Uncore C-box 0 perfmon event select MSR. D13H 3347 MSR_C0_PMON_CTR1 Package Uncore C-box 0 perfmon counter MSR. D14H 3348 MSR_C0_PMON_EVNT_SEL2 Package Uncore C-box 0 perfmon event select MSR. D15H 3349 MSR_C0_PMON_CTR2 Package Uncore C-box 0 perfmon counter MSR. D16H 3350 MSR_C0_PMON_EVNT_SEL3 Package Uncore C-box 0 perfmon event select MSR. D17H 3351 MSR_C0_PMON_CTR3 Package Uncore C-box 0 perfmon counter MSR. D18H 3352 MSR_C0_PMON_EVNT_SEL4 Package Uncore C-box 0 perfmon event select MSR. D19H 3353 MSR_C0_PMON_CTR4 Package Uncore C-box 0 perfmon counter MSR. D1AH 3354 MSR_C0_PMON_EVNT_SEL5 Package Uncore C-box 0 perfmon event select MSR. D1BH 3355 MSR_C0_PMON_CTR5 Package Uncore C-box 0 perfmon counter MSR. D20H 3360 MSR_C4_PMON_BOX_CTRL Package Uncore C-box 4 perfmon local box control MSR. D21H 3361 MSR_C4_PMON_BOX_STATUS Package Uncore C-box 4 perfmon local box status MSR. D22H 3362 MSR_C4_PMON_BOX_OVF_CTRL Package Uncore C-box 4 perfmon local box overflow control MSR. D30H 3376 MSR_C4_PMON_EVNT_SEL0 Package Uncore C-box 4 perfmon event select MSR. D31H 3377 MSR_C4_PMON_CTR0 Package Uncore C-box 4 perfmon counter MSR. D32H 3378 MSR_C4_PMON_EVNT_SEL1 Package Uncore C-box 4 perfmon event select MSR. D33H 3379 MSR_C4_PMON_CTR1 Package Uncore C-box 4 perfmon counter MSR. D34H 3380 MSR_C4_PMON_EVNT_SEL2 Package Uncore C-box 4 perfmon event select MSR. D35H 3381 MSR_C4_PMON_CTR2 Package Uncore C-box 4 perfmon counter MSR. D36H 3382 MSR_C4_PMON_EVNT_SEL3 Package Uncore C-box 4 perfmon event select MSR. D37H 3383 MSR_C4_PMON_CTR3 Package Uncore C-box 4 perfmon counter MSR. D38H 3384 MSR_C4_PMON_EVNT_SEL4 Package Uncore C-box 4 perfmon event select MSR. D39H 3385 MSR_C4_PMON_CTR4 Package Uncore C-box 4 perfmon counter MSR. Vol. 4 2-147 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec D3AH 3386 MSR_C4_PMON_EVNT_SEL5 Package Uncore C-box 4 perfmon event select MSR. D3BH 3387 MSR_C4_PMON_CTR5 Package Uncore C-box 4 perfmon counter MSR. D40H 3392 MSR_C2_PMON_BOX_CTRL Package Uncore C-box 2 perfmon local box control MSR. D41H 3393 MSR_C2_PMON_BOX_STATUS Package Uncore C-box 2 perfmon local box status MSR. D42H 3394 MSR_C2_PMON_BOX_OVF_CTRL Package Uncore C-box 2 perfmon local box overflow control MSR. D50H 3408 MSR_C2_PMON_EVNT_SEL0 Package Uncore C-box 2 perfmon event select MSR. D51H 3409 MSR_C2_PMON_CTR0 Package Uncore C-box 2 perfmon counter MSR. D52H 3410 MSR_C2_PMON_EVNT_SEL1 Package Uncore C-box 2 perfmon event select MSR. D53H 3411 MSR_C2_PMON_CTR1 Package Uncore C-box 2 perfmon counter MSR. D54H 3412 MSR_C2_PMON_EVNT_SEL2 Package Uncore C-box 2 perfmon event select MSR. D55H 3413 MSR_C2_PMON_CTR2 Package Uncore C-box 2 perfmon counter MSR. D56H 3414 MSR_C2_PMON_EVNT_SEL3 Package Uncore C-box 2 perfmon event select MSR. D57H 3415 MSR_C2_PMON_CTR3 Package Uncore C-box 2 perfmon counter MSR. D58H 3416 MSR_C2_PMON_EVNT_SEL4 Package Uncore C-box 2 perfmon event select MSR. D59H 3417 MSR_C2_PMON_CTR4 Package Uncore C-box 2 perfmon counter MSR. D5AH 3418 MSR_C2_PMON_EVNT_SEL5 Package Uncore C-box 2 perfmon event select MSR. D5BH 3419 MSR_C2_PMON_CTR5 Package Uncore C-box 2 perfmon counter MSR. D60H 3424 MSR_C6_PMON_BOX_CTRL Package Uncore C-box 6 perfmon local box control MSR. D61H 3425 MSR_C6_PMON_BOX_STATUS Package Uncore C-box 6 perfmon local box status MSR. D62H 3426 MSR_C6_PMON_BOX_OVF_CTRL Package Uncore C-box 6 perfmon local box overflow control MSR. D70H 3440 MSR_C6_PMON_EVNT_SEL0 Package Uncore C-box 6 perfmon event select MSR. D71H 3441 MSR_C6_PMON_CTR0 Package Uncore C-box 6 perfmon counter MSR. D72H 3442 MSR_C6_PMON_EVNT_SEL1 Package Uncore C-box 6 perfmon event select MSR. D73H 3443 MSR_C6_PMON_CTR1 Package Uncore C-box 6 perfmon counter MSR. D74H 3444 MSR_C6_PMON_EVNT_SEL2 Package Uncore C-box 6 perfmon event select MSR. 2-148 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec D75H 3445 MSR_C6_PMON_CTR2 Package Uncore C-box 6 perfmon counter MSR. D76H 3446 MSR_C6_PMON_EVNT_SEL3 Package Uncore C-box 6 perfmon event select MSR. D77H 3447 MSR_C6_PMON_CTR3 Package Uncore C-box 6 perfmon counter MSR. D78H 3448 MSR_C6_PMON_EVNT_SEL4 Package Uncore C-box 6 perfmon event select MSR. D79H 3449 MSR_C6_PMON_CTR4 Package Uncore C-box 6 perfmon counter MSR. D7AH 3450 MSR_C6_PMON_EVNT_SEL5 Package Uncore C-box 6 perfmon event select MSR. D7BH 3451 MSR_C6_PMON_CTR5 Package Uncore C-box 6 perfmon counter MSR. D80H 3456 MSR_C1_PMON_BOX_CTRL Package Uncore C-box 1 perfmon local box control MSR. D81H 3457 MSR_C1_PMON_BOX_STATUS Package Uncore C-box 1 perfmon local box status MSR. D82H 3458 MSR_C1_PMON_BOX_OVF_CTRL Package Uncore C-box 1 perfmon local box overflow control MSR. D90H 3472 MSR_C1_PMON_EVNT_SEL0 Package Uncore C-box 1 perfmon event select MSR. D91H 3473 MSR_C1_PMON_CTR0 Package Uncore C-box 1 perfmon counter MSR. D92H 3474 MSR_C1_PMON_EVNT_SEL1 Package Uncore C-box 1 perfmon event select MSR. D93H 3475 MSR_C1_PMON_CTR1 Package Uncore C-box 1 perfmon counter MSR. D94H 3476 MSR_C1_PMON_EVNT_SEL2 Package Uncore C-box 1 perfmon event select MSR. D95H 3477 MSR_C1_PMON_CTR2 Package Uncore C-box 1 perfmon counter MSR. D96H 3478 MSR_C1_PMON_EVNT_SEL3 Package Uncore C-box 1 perfmon event select MSR. D97H 3479 MSR_C1_PMON_CTR3 Package Uncore C-box 1 perfmon counter MSR. D98H 3480 MSR_C1_PMON_EVNT_SEL4 Package Uncore C-box 1 perfmon event select MSR. D99H 3481 MSR_C1_PMON_CTR4 Package Uncore C-box 1 perfmon counter MSR. D9AH 3482 MSR_C1_PMON_EVNT_SEL5 Package Uncore C-box 1 perfmon event select MSR. D9BH 3483 MSR_C1_PMON_CTR5 Package Uncore C-box 1 perfmon counter MSR. DA0H 3488 MSR_C5_PMON_BOX_CTRL Package Uncore C-box 5 perfmon local box control MSR. DA1H 3489 MSR_C5_PMON_BOX_STATUS Package Uncore C-box 5 perfmon local box status MSR. DA2H 3490 MSR_C5_PMON_BOX_OVF_CTRL Package Uncore C-box 5 perfmon local box overflow control MSR. DB0H 3504 MSR_C5_PMON_EVNT_SEL0 Package Uncore C-box 5 perfmon event select MSR. Vol. 4 2-149 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec DB1H 3505 MSR_C5_PMON_CTR0 Package Uncore C-box 5 perfmon counter MSR. DB2H 3506 MSR_C5_PMON_EVNT_SEL1 Package Uncore C-box 5 perfmon event select MSR. DB3H 3507 MSR_C5_PMON_CTR1 Package Uncore C-box 5 perfmon counter MSR. DB4H 3508 MSR_C5_PMON_EVNT_SEL2 Package Uncore C-box 5 perfmon event select MSR. DB5H 3509 MSR_C5_PMON_CTR2 Package Uncore C-box 5 perfmon counter MSR. DB6H 3510 MSR_C5_PMON_EVNT_SEL3 Package Uncore C-box 5 perfmon event select MSR. DB7H 3511 MSR_C5_PMON_CTR3 Package Uncore C-box 5 perfmon counter MSR. DB8H 3512 MSR_C5_PMON_EVNT_SEL4 Package Uncore C-box 5 perfmon event select MSR. DB9H 3513 MSR_C5_PMON_CTR4 Package Uncore C-box 5 perfmon counter MSR. DBAH 3514 MSR_C5_PMON_EVNT_SEL5 Package Uncore C-box 5 perfmon event select MSR. DBBH 3515 MSR_C5_PMON_CTR5 Package Uncore C-box 5 perfmon counter MSR. DC0H 3520 MSR_C3_PMON_BOX_CTRL Package Uncore C-box 3 perfmon local box control MSR. DC1H 3521 MSR_C3_PMON_BOX_STATUS Package Uncore C-box 3 perfmon local box status MSR. DC2H 3522 MSR_C3_PMON_BOX_OVF_CTRL Package Uncore C-box 3 perfmon local box overflow control MSR. DD0H 3536 MSR_C3_PMON_EVNT_SEL0 Package Uncore C-box 3 perfmon event select MSR. DD1H 3537 MSR_C3_PMON_CTR0 Package Uncore C-box 3 perfmon counter MSR. DD2H 3538 MSR_C3_PMON_EVNT_SEL1 Package Uncore C-box 3 perfmon event select MSR. DD3H 3539 MSR_C3_PMON_CTR1 Package Uncore C-box 3 perfmon counter MSR. DD4H 3540 MSR_C3_PMON_EVNT_SEL2 Package Uncore C-box 3 perfmon event select MSR. DD5H 3541 MSR_C3_PMON_CTR2 Package Uncore C-box 3 perfmon counter MSR. DD6H 3542 MSR_C3_PMON_EVNT_SEL3 Package Uncore C-box 3 perfmon event select MSR. DD7H 3543 MSR_C3_PMON_CTR3 Package Uncore C-box 3 perfmon counter MSR. DD8H 3544 MSR_C3_PMON_EVNT_SEL4 Package Uncore C-box 3 perfmon event select MSR. DD9H 3545 MSR_C3_PMON_CTR4 Package Uncore C-box 3 perfmon counter MSR. DDAH 3546 MSR_C3_PMON_EVNT_SEL5 Package Uncore C-box 3 perfmon event select MSR. DDBH 3547 MSR_C3_PMON_CTR5 Package Uncore C-box 3 perfmon counter MSR. 2-150 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec DE0H 3552 MSR_C7_PMON_BOX_CTRL Package Uncore C-box 7 perfmon local box control MSR. DE1H 3553 MSR_C7_PMON_BOX_STATUS Package Uncore C-box 7 perfmon local box status MSR. DE2H 3554 MSR_C7_PMON_BOX_OVF_CTRL Package Uncore C-box 7 perfmon local box overflow control MSR. DF0H 3568 MSR_C7_PMON_EVNT_SEL0 Package Uncore C-box 7 perfmon event select MSR. DF1H 3569 MSR_C7_PMON_CTR0 Package Uncore C-box 7 perfmon counter MSR. DF2H 3570 MSR_C7_PMON_EVNT_SEL1 Package Uncore C-box 7 perfmon event select MSR. DF3H 3571 MSR_C7_PMON_CTR1 Package Uncore C-box 7 perfmon counter MSR. DF4H 3572 MSR_C7_PMON_EVNT_SEL2 Package Uncore C-box 7 perfmon event select MSR. DF5H 3573 MSR_C7_PMON_CTR2 Package Uncore C-box 7 perfmon counter MSR. DF6H 3574 MSR_C7_PMON_EVNT_SEL3 Package Uncore C-box 7 perfmon event select MSR. DF7H 3575 MSR_C7_PMON_CTR3 Package Uncore C-box 7 perfmon counter MSR. DF8H 3576 MSR_C7_PMON_EVNT_SEL4 Package Uncore C-box 7 perfmon event select MSR. DF9H 3577 MSR_C7_PMON_CTR4 Package Uncore C-box 7 perfmon counter MSR. DFAH 3578 MSR_C7_PMON_EVNT_SEL5 Package Uncore C-box 7 perfmon event select MSR. DFBH 3579 MSR_C7_PMON_CTR5 Package Uncore C-box 7 perfmon counter MSR. E00H 3584 MSR_R0_PMON_BOX_CTRL Package Uncore R-box 0 perfmon local box control MSR. E01H 3585 MSR_R0_PMON_BOX_STATUS Package Uncore R-box 0 perfmon local box status MSR. E02H 3586 MSR_R0_PMON_BOX_OVF_CTRL Package Uncore R-box 0 perfmon local box overflow control MSR. E04H 3588 MSR_R0_PMON_IPERF0_P0 Package Uncore R-box 0 perfmon IPERF0 unit Port 0 select MSR. E05H 3589 MSR_R0_PMON_IPERF0_P1 Package Uncore R-box 0 perfmon IPERF0 unit Port 1 select MSR. E06H 3590 MSR_R0_PMON_IPERF0_P2 Package Uncore R-box 0 perfmon IPERF0 unit Port 2 select MSR. E07H 3591 MSR_R0_PMON_IPERF0_P3 Package Uncore R-box 0 perfmon IPERF0 unit Port 3 select MSR. E08H 3592 MSR_R0_PMON_IPERF0_P4 Package Uncore R-box 0 perfmon IPERF0 unit Port 4 select MSR. E09H 3593 MSR_R0_PMON_IPERF0_P5 Package Uncore R-box 0 perfmon IPERF0 unit Port 5 select MSR. Vol. 4 2-151 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec E0AH 3594 MSR_R0_PMON_IPERF0_P6 Package Uncore R-box 0 perfmon IPERF0 unit Port 6 select MSR. E0BH 3595 MSR_R0_PMON_IPERF0_P7 Package Uncore R-box 0 perfmon IPERF0 unit Port 7 select MSR. E0CH 3596 MSR_R0_PMON_QLX_P0 Package Uncore R-box 0 perfmon QLX unit Port 0 select MSR. E0DH 3597 MSR_R0_PMON_QLX_P1 Package Uncore R-box 0 perfmon QLX unit Port 1 select MSR. E0EH 3598 MSR_R0_PMON_QLX_P2 Package Uncore R-box 0 perfmon QLX unit Port 2 select MSR. E0FH 3599 MSR_R0_PMON_QLX_P3 Package Uncore R-box 0 perfmon QLX unit Port 3 select MSR. E10H 3600 MSR_R0_PMON_EVNT_SEL0 Package Uncore R-box 0 perfmon event select MSR. E11H 3601 MSR_R0_PMON_CTR0 Package Uncore R-box 0 perfmon counter MSR. E12H 3602 MSR_R0_PMON_EVNT_SEL1 Package Uncore R-box 0 perfmon event select MSR. E13H 3603 MSR_R0_PMON_CTR1 Package Uncore R-box 0 perfmon counter MSR. E14H 3604 MSR_R0_PMON_EVNT_SEL2 Package Uncore R-box 0 perfmon event select MSR. E15H 3605 MSR_R0_PMON_CTR2 Package Uncore R-box 0 perfmon counter MSR. E16H 3606 MSR_R0_PMON_EVNT_SEL3 Package Uncore R-box 0 perfmon event select MSR. E17H 3607 MSR_R0_PMON_CTR3 Package Uncore R-box 0 perfmon counter MSR. E18H 3608 MSR_R0_PMON_EVNT_SEL4 Package Uncore R-box 0 perfmon event select MSR. E19H 3609 MSR_R0_PMON_CTR4 Package Uncore R-box 0 perfmon counter MSR. E1AH 3610 MSR_R0_PMON_EVNT_SEL5 Package Uncore R-box 0 perfmon event select MSR. E1BH 3611 MSR_R0_PMON_CTR5 Package Uncore R-box 0 perfmon counter MSR. E1CH 3612 MSR_R0_PMON_EVNT_SEL6 Package Uncore R-box 0 perfmon event select MSR. E1DH 3613 MSR_R0_PMON_CTR6 Package Uncore R-box 0 perfmon counter MSR. E1EH 3614 MSR_R0_PMON_EVNT_SEL7 Package Uncore R-box 0 perfmon event select MSR. E1FH 3615 MSR_R0_PMON_CTR7 Package Uncore R-box 0 perfmon counter MSR. E20H 3616 MSR_R1_PMON_BOX_CTRL Package Uncore R-box 1 perfmon local box control MSR. E21H 3617 MSR_R1_PMON_BOX_STATUS Package Uncore R-box 1 perfmon local box status MSR. E22H 3618 MSR_R1_PMON_BOX_OVF_CTRL Package Uncore R-box 1 perfmon local box overflow control MSR. E24H 3620 MSR_R1_PMON_IPERF1_P8 Package Uncore R-box 1 perfmon IPERF1 unit Port 8 select MSR. 2-152 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec E25H 3621 MSR_R1_PMON_IPERF1_P9 Package Uncore R-box 1 perfmon IPERF1 unit Port 9 select MSR. E26H 3622 MSR_R1_PMON_IPERF1_P10 Package Uncore R-box 1 perfmon IPERF1 unit Port 10 select MSR. E27H 3623 MSR_R1_PMON_IPERF1_P11 Package Uncore R-box 1 perfmon IPERF1 unit Port 11 select MSR. E28H 3624 MSR_R1_PMON_IPERF1_P12 Package Uncore R-box 1 perfmon IPERF1 unit Port 12 select MSR. E29H 3625 MSR_R1_PMON_IPERF1_P13 Package Uncore R-box 1 perfmon IPERF1 unit Port 13 select MSR. E2AH 3626 MSR_R1_PMON_IPERF1_P14 Package Uncore R-box 1 perfmon IPERF1 unit Port 14 select MSR. E2BH 3627 MSR_R1_PMON_IPERF1_P15 Package Uncore R-box 1 perfmon IPERF1 unit Port 15 select MSR. E2CH 3628 MSR_R1_PMON_QLX_P4 Package Uncore R-box 1 perfmon QLX unit Port 4 select MSR. E2DH 3629 MSR_R1_PMON_QLX_P5 Package Uncore R-box 1 perfmon QLX unit Port 5 select MSR. E2EH 3630 MSR_R1_PMON_QLX_P6 Package Uncore R-box 1 perfmon QLX unit Port 6 select MSR. E2FH 3631 MSR_R1_PMON_QLX_P7 Package Uncore R-box 1 perfmon QLX unit Port 7 select MSR. E30H 3632 MSR_R1_PMON_EVNT_SEL8 Package Uncore R-box 1 perfmon event select MSR. E31H 3633 MSR_R1_PMON_CTR8 Package Uncore R-box 1 perfmon counter MSR. E32H 3634 MSR_R1_PMON_EVNT_SEL9 Package Uncore R-box 1 perfmon event select MSR. E33H 3635 MSR_R1_PMON_CTR9 Package Uncore R-box 1 perfmon counter MSR. E34H 3636 MSR_R1_PMON_EVNT_SEL10 Package Uncore R-box 1 perfmon event select MSR. E35H 3637 MSR_R1_PMON_CTR10 Package Uncore R-box 1 perfmon counter MSR. E36H 3638 MSR_R1_PMON_EVNT_SEL11 Package Uncore R-box 1 perfmon event select MSR. E37H 3639 MSR_R1_PMON_CTR11 Package Uncore R-box 1 perfmon counter MSR. E38H 3640 MSR_R1_PMON_EVNT_SEL12 Package Uncore R-box 1 perfmon event select MSR. E39H 3641 MSR_R1_PMON_CTR12 Package Uncore R-box 1 perfmon counter MSR. E3AH 3642 MSR_R1_PMON_EVNT_SEL13 Package Uncore R-box 1 perfmon event select MSR. E3BH 3643 MSR_R1_PMON_CTR13 Package Uncore R-box 1perfmon counter MSR. E3CH 3644 MSR_R1_PMON_EVNT_SEL14 Package Uncore R-box 1 perfmon event select MSR. E3DH 3645 MSR_R1_PMON_CTR14 Package Uncore R-box 1 perfmon counter MSR. Vol. 4 2-153 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-17. Additional MSRs in Intel® Xeon® Processor 7500 Series (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec E3EH 3646 MSR_R1_PMON_EVNT_SEL15 Package Uncore R-box 1 perfmon event select MSR. E3FH 3647 MSR_R1_PMON_CTR15 Package Uncore R-box 1 perfmon counter MSR. E45H 3653 MSR_B0_PMON_MATCH Package Uncore B-box 0 perfmon local box match MSR. E46H 3654 MSR_B0_PMON_MASK Package Uncore B-box 0 perfmon local box mask MSR. E49H 3657 MSR_S0_PMON_MATCH Package Uncore S-box 0 perfmon local box match MSR. E4AH 3658 MSR_S0_PMON_MASK Package Uncore S-box 0 perfmon local box mask MSR. E4DH 3661 MSR_B1_PMON_MATCH Package Uncore B-box 1 perfmon local box match MSR. E4EH 3662 MSR_B1_PMON_MASK Package Uncore B-box 1 perfmon local box mask MSR. E54H 3668 MSR_M0_PMON_MM_CONFIG Package Uncore M-box 0 perfmon local box address match/mask config MSR. E55H 3669 MSR_M0_PMON_ADDR_MATCH Package Uncore M-box 0 perfmon local box address match MSR. E56H 3670 MSR_M0_PMON_ADDR_MASK Package Uncore M-box 0 perfmon local box address mask MSR. E59H 3673 MSR_S1_PMON_MATCH Package Uncore S-box 1 perfmon local box match MSR. E5AH 3674 MSR_S1_PMON_MASK Package Uncore S-box 1 perfmon local box mask MSR. E5CH 3676 MSR_M1_PMON_MM_CONFIG Package Uncore M-box 1 perfmon local box address match/mask config MSR. E5DH 3677 MSR_M1_PMON_ADDR_MATCH Package Uncore M-box 1 perfmon local box address match MSR. E5EH 3678 MSR_M1_PMON_ADDR_MASK Package Uncore M-box 1 perfmon local box address mask MSR. 3B5H 965 MSR_UNCORE_PMC5 Package See Section 18.3.1.2.2, “Uncore Performance Event Configuration Facility.” 2.9 MSRS IN THE INTEL® XEON® PROCESSOR 5600 SERIES (BASED ON INTEL® MICROARCHITECTURE CODE NAME WESTMERE) Intel® Xeon® Processor 5600 Series (based on Intel® microarchitecture code name Westmere) supports the MSR interfaces listed in Table 2-15, Table 2-16, plus additional MSR listed in Table 2-18. These MSRs apply to Intel Core i7, i5 and i3 processor family with CPUID signature DisplayFamily_DisplayModel of 06_25H and 06_2CH, see Table 2-1. 2-154 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-18. Additional MSRs Supported by Intel Processors (Based on Intel® Microarchitecture Code Name Westmere) Register Address Hex Dec 13CH 52 Register Name / Bit Fields MSR_FEATURE_CONFIG Scope Core Bit Description AES Configuration (RW-L) Privileged post-BIOS agent must provide a #GP handler to handle unsuccessful read of this MSR. 1:0 AES Configuration (RW-L) Upon a successful read of this MSR, the configuration of AES instruction set availability is as follows: 11b: AES instructions are not available until next RESET. Otherwise, AES instructions are available. Note, AES instruction set is not available if read is unsuccessful. If the configuration is not 01b, AES instructions can be mis-configured if a privileged agent unintentionally writes 11b. 63:2 Reserved 1A7H 423 MSR_OFFCORE_RSP_1 Thread Offcore Response Event Select Register (R/W) 1ADH 429 MSR_TURBO_RATIO_LIMIT Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 39:32 Package Maximum Ratio Limit for 5C Maximum turbo ratio limit of 5 core active. 47:40 Package Maximum Ratio Limit for 6C Maximum turbo ratio limit of 6 core active. 63:48 1B0H 2.10 432 IA32_ENERGY_PERF_BIAS Reserved Package See Table 2-2. MSRS IN THE INTEL® XEON® PROCESSOR E7 FAMILY (BASED ON INTEL® MICROARCHITECTURE CODE NAME WESTMERE) Intel® Xeon® Processor E7 Family (based on Intel® microarchitecture code name Westmere) supports the MSR interfaces listed in Table 2-15 (except MSR address 1ADH), Table 2-16, plus additional MSR listed in Table 2-19. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2FH. Vol. 4 2-155 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-19. Additional MSRs Supported by Intel® Xeon® Processor E7 Family Register Address Hex Dec 13CH 52 Register Name / Bit Fields MSR_FEATURE_CONFIG Scope Core Bit Description AES Configuration (RW-L) Privileged post-BIOS agent must provide a #GP handler to handle unsuccessful read of this MSR. 1:0 AES Configuration (RW-L) Upon a successful read of this MSR, the configuration of AES instruction set availability is as follows: 11b: AES instructions are not available until next RESET. Otherwise, AES instructions are available. Note, AES instruction set is not available if read is unsuccessful. If the configuration is not 01b, AES instructions can be mis-configured if a privileged agent unintentionally writes 11b. 63:2 Reserved 1A7H 423 MSR_OFFCORE_RSP_1 Thread Offcore Response Event Select Register (R/W) 1ADH 429 MSR_TURBO_RATIO_LIMIT Package Reserved Attempt to read/write will cause #UD. 1B0H 432 IA32_ENERGY_PERF_BIAS Package See Table 2-2. F40H 3904 MSR_C8_PMON_BOX_CTRL Package Uncore C-box 8 perfmon local box control MSR. F41H 3905 MSR_C8_PMON_BOX_STATUS Package Uncore C-box 8 perfmon local box status MSR. F42H 3906 MSR_C8_PMON_BOX_OVF_CTRL Package Uncore C-box 8 perfmon local box overflow control MSR. F50H 3920 MSR_C8_PMON_EVNT_SEL0 Package Uncore C-box 8 perfmon event select MSR. F51H 3921 MSR_C8_PMON_CTR0 Package Uncore C-box 8 perfmon counter MSR. F52H 3922 MSR_C8_PMON_EVNT_SEL1 Package Uncore C-box 8 perfmon event select MSR. F53H 3923 MSR_C8_PMON_CTR1 Package Uncore C-box 8 perfmon counter MSR. F54H 3924 MSR_C8_PMON_EVNT_SEL2 Package Uncore C-box 8 perfmon event select MSR. F55H 3925 MSR_C8_PMON_CTR2 Package Uncore C-box 8 perfmon counter MSR. F56H 3926 MSR_C8_PMON_EVNT_SEL3 Package Uncore C-box 8 perfmon event select MSR. F57H 3927 MSR_C8_PMON_CTR3 Package Uncore C-box 8 perfmon counter MSR. F58H 3928 MSR_C8_PMON_EVNT_SEL4 Package Uncore C-box 8 perfmon event select MSR. F59H 3929 MSR_C8_PMON_CTR4 Package Uncore C-box 8 perfmon counter MSR. F5AH 3930 MSR_C8_PMON_EVNT_SEL5 Package Uncore C-box 8 perfmon event select MSR. F5BH 3931 MSR_C8_PMON_CTR5 Package Uncore C-box 8 perfmon counter MSR. 2-156 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-19. Additional MSRs Supported by Intel® Xeon® Processor E7 Family (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec FC0H 4032 MSR_C9_PMON_BOX_CTRL Package Uncore C-box 9 perfmon local box control MSR. FC1H 4033 MSR_C9_PMON_BOX_STATUS Package Uncore C-box 9 perfmon local box status MSR. FC2H 4034 MSR_C9_PMON_BOX_OVF_CTRL Package Uncore C-box 9 perfmon local box overflow control MSR. FD0H 4048 MSR_C9_PMON_EVNT_SEL0 Package Uncore C-box 9 perfmon event select MSR. FD1H 4049 MSR_C9_PMON_CTR0 Package Uncore C-box 9 perfmon counter MSR. FD2H 4050 MSR_C9_PMON_EVNT_SEL1 Package Uncore C-box 9 perfmon event select MSR. FD3H 4051 MSR_C9_PMON_CTR1 Package Uncore C-box 9 perfmon counter MSR. FD4H 4052 MSR_C9_PMON_EVNT_SEL2 Package Uncore C-box 9 perfmon event select MSR. FD5H 4053 MSR_C9_PMON_CTR2 Package Uncore C-box 9 perfmon counter MSR. FD6H 4054 MSR_C9_PMON_EVNT_SEL3 Package Uncore C-box 9 perfmon event select MSR. FD7H 4055 MSR_C9_PMON_CTR3 Package Uncore C-box 9 perfmon counter MSR. FD8H 4056 MSR_C9_PMON_EVNT_SEL4 Package Uncore C-box 9 perfmon event select MSR. FD9H 4057 MSR_C9_PMON_CTR4 Package Uncore C-box 9 perfmon counter MSR. FDAH 4058 MSR_C9_PMON_EVNT_SEL5 Package Uncore C-box 9 perfmon event select MSR. FDBH 4059 MSR_C9_PMON_CTR5 Package Uncore C-box 9 perfmon counter MSR. 2.11 MSRS IN INTEL® PROCESSOR FAMILY BASED ON INTEL® MICROARCHITECTURE CODE NAME SANDY BRIDGE Table 2-20 lists model-specific registers (MSRs) that are common to Intel® processor family based on Intel microarchitecture code name Sandy Bridge. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2AH, 06_2DH, see Table 2-1. Additional MSRs specific to 06_2AH are listed in Table 2-21. Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Thread See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Thread See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_ SIZE Thread See Section 8.10.5, “Monitor/Mwait Address Range Determination” and Table 2-2. Vol. 4 2-157 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 10H 16 IA32_TIME_STAMP_COUNTER Thread See Section 17.17, “Time-Stamp Counter” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Package Platform ID (R) See Table 2-2. 1BH 27 IA32_APIC_BASE Thread See Section 10.4.4, “Local APIC Status and Location” and Table 2-2. 34H 52 MSR_SMI_COUNT Thread SMI Counter (R/O) 31:0 SMI Count (R/O) Count SMIs. 63:32 3AH 58 IA32_FEATURE_CONTROL Reserved. Thread Control Features in Intel 64 Processor (R/W) See Table 2-2. 0 Lock (R/WL) 1 Enable VMX Inside SMX Operation (R/WL) 2 Enable VMX Outside SMX Operation (R/WL) 14:8 SENTER Local Functions Enables (R/WL) 15 SENTER Global Functions Enable (R/WL) 79H 121 IA32_BIOS_UPDT_TRIG Core BIOS Update Trigger Register (W) 8BH 139 IA32_BIOS_SIGN_ID Thread BIOS Update Signature ID (RO) See Table 2-2. See Table 2-2. C1H 193 IA32_PMC0 Thread C2H 194 IA32_PMC1 Thread Performance Counter Register See Table 2-2. Performance Counter Register See Table 2-2. C3H 195 IA32_PMC2 Thread C4H 196 IA32_PMC3 Thread Performance Counter Register See Table 2-2. Performance Counter Register See Table 2-2. C5H 197 IA32_PMC4 Core Performance Counter Register (if core not shared by threads) C6H 198 IA32_PMC5 Core Performance Counter Register (if core not shared by threads) C7H 199 IA32_PMC6 Core Performance Counter Register (if core not shared by threads) C8H 200 IA32_PMC7 Core Performance Counter Register (if core not shared by threads) 2-158 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec CEH 206 Register Name / Bit Fields MSR_PLATFORM_INFO Scope Package Bit Description Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. Frequency = ratio * 100 MHz. 27:16 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates TDP Limit for Turbo mode is not programmable. 39:30 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-sate support) 001b: C2 010b: C6 no retention 011b: C6 retention 100b: C7 101b: C7s 111: No package C-state limit Note: This field cannot be used to limit package C-state to C3. 9:3 Reserved Vol. 4 2-159 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) When set, the processor will conditionally demote C6/C7 requests to C3 based on uncore auto-demote information. 26 C1 State Auto Demotion Enable (R/W) When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore auto-demote information. 27 Enable C3 Undemotion (R/W) When set, enables undemotion from demoted C3. 28 Enable C1 Undemotion (R/W) When set, enables undemotion from demoted C1. 63:29 E4H 228 MSR_PMG_IO_CAPTURE_BASE Reserved Core Power Management IO Redirection in C-state (R/W) See http://biosbits.org. 15:0 LVL_2 Base Address (R/W) Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. 18:16 C-State Range (R/W) Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit10]: 000b - C3 is the max C-State to include. 001b - C6 is the max C-State to include. 010b - C7 is the max C-State to include. 63:19 E7H 231 IA32_MPERF Reserved Thread Maximum Performance Frequency Clock Count (RW) See Table 2-2. 2-160 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec E8H 232 Register Name / Bit Fields IA32_APERF Scope Thread Bit Description Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Thread 13CH 52 MSR_FEATURE_CONFIG Core See Table 2-2. AES Configuration (RW-L) Privileged post-BIOS agent must provide a #GP handler to handle unsuccessful read of this MSR. 1:0 AES Configuration (RW-L) Upon a successful read of this MSR, the configuration of AES instruction set availability is as follows: 11b: AES instructions are not available until next RESET. Otherwise, AES instructions are available. Note, AES instruction set is not available if read is unsuccessful. If the configuration is not 01b, AES instructions can be mis-configured if a privileged agent unintentionally writes 11b. 63:2 Reserved 174H 372 IA32_SYSENTER_CS Thread See Table 2-2. 175H 373 IA32_SYSENTER_ESP Thread See Table 2-2. 176H 374 IA32_SYSENTER_EIP Thread See Table 2-2. 179H 377 IA32_MCG_CAP Thread See Table 2-2. 17AH 378 IA32_MCG_STATUS Thread Global Machine Check Status 0 RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. 1 EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. 2 MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 Reserved 186H 390 IA32_PERFEVTSEL0 Thread See Table 2-2. 187H 391 IA32_PERFEVTSEL1 Thread See Table 2-2. Vol. 4 2-161 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 188H 392 IA32_PERFEVTSEL2 Thread See Table 2-2. 189H 393 IA32_PERFEVTSEL3 Thread See Table 2-2. 18AH 394 IA32_PERFEVTSEL4 Core See Table 2-2. If CPUID.0AH:EAX[15:8] = 8. 18BH 395 IA32_PERFEVTSEL5 Core See Table 2-2. If CPUID.0AH:EAX[15:8] = 8. 18CH 396 IA32_PERFEVTSEL6 Core See Table 2-2. If CPUID.0AH:EAX[15:8] = 8. 18DH 397 IA32_PERFEVTSEL7 Core See Table 2-2. If CPUID.0AH:EAX[15:8] = 8. 198H 408 IA32_PERF_STATUS Package See Table 2-2. 198H 408 15:0 Current Performance State Value 63:16 Reserved MSR_PERF_STATUS Package 47:32 Performance Status Core Voltage (R/O) P-state core voltage can be computed by MSR_PERF_STATUS[37:32] * (float) 1/(2^13). 199H 409 IA32_PERF_CTL Thread See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Thread Clock Modulation (R/W) See Table 2-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 3:0 On demand Clock Modulation Duty Cycle (R/W) In 6.25% increment. 19BH 411 4 On demand Clock Modulation Enable (R/W) 63:5 Reserved IA32_THERM_INTERRUPT Core Thermal Interrupt Control (R/W) See Table 2-2. 19CH 412 IA32_THERM_STATUS Core Thermal Monitor Status (R/W) See Table 2-2. 0 Thermal Status (RO) See Table 2-2. 1 Thermal Status Log (R/WC0) See Table 2-2. 2 PROTCHOT # or FORCEPR# Status (RO) See Table 2-2. 3 PROTCHOT # or FORCEPR# Log (R/WC0) See Table 2-2. 4 Critical Temperature Status (RO) See Table 2-2. 5 Critical Temperature Status Log (R/WC0) See Table 2-2. 2-162 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 6 Thermal Threshold #1 Status (RO) See Table 2-2. 7 Thermal Threshold #1 Log (R/WC0) See Table 2-2. 8 Thermal Threshold #2 Status (RO) See Table 2-2. 9 Thermal Threshold #2 Log (R/WC0) See Table 2-2. 10 Power Limitation Status (RO) See Table 2-2. 11 Power Limitation Log (R/WC0) See Table 2-2. 15:12 Reserved 22:16 Digital Readout (RO) See Table 2-2. 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) See Table 2-2. 31 Reading Valid (RO) See Table 2-2. 63:32 1A0H 416 Reserved IA32_MISC_ENABLE Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 0 Thread Fast-Strings Enable See Table 2-2 6:1 7 Reserved Thread Performance Monitoring Available (R) See Table 2-2. 10:8 11 Reserved Thread Branch Trace Storage Unavailable (RO) See Table 2-2. 12 Thread Processor Event Based Sampling Unavailable (RO) See Table 2-2. 15:13 16 Reserved Package Enhanced Intel SpeedStep Technology Enable (R/W) See Table 2-2. Vol. 4 2-163 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 18 Thread ENABLE MONITOR FSM (R/W) See Table 2-2. 21:19 22 Reserved Thread Limit CPUID Maxval (R/W) See Table 2-2. 23 Thread xTPR Message Disable (R/W) See Table 2-2. 33:24 34 Reserved Thread XD Bit Disable (R/W) See Table 2-2. 37:35 38 Reserved Package Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processor’s support of turbo mode is enabled. Note: The power-on default value is used by BIOS to detect hardware support of turbo mode. If the poweron default value is 1, turbo mode is available in the processor. If the power-on default value is 0, turbo mode is not available. 63:39 1A2H 418 MSR_TEMPERATURE_TARGET Reserved Unique 15:0 Temperature Target Reserved 23:16 Temperature Target (R) The minimum temperature at which PROCHOT# will be asserted. The value is degrees C. 1A4H 420 63:24 Reserved MSR_MISC_FEATURE_CONTROL Miscellaneous Feature Control (R/W) 0 Core L2 Hardware Prefetcher Disable (R/W) If 1, disables the L2 hardware prefetcher, which fetches additional lines of code or data into the L2 cache. 1 Core L2 Adjacent Cache Line Prefetcher Disable (R/W) If 1, disables the adjacent cache line prefetcher, which fetches the cache line that comprises a cache line pair (128 bytes). 2-164 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2 Core DCU Hardware Prefetcher Disable (R/W) If 1, disables the L1 data cache prefetcher, which fetches the next cache line into L1 data cache. 3 Core DCU IP Prefetcher Disable (R/W) If 1, disables the L1 data cache IP prefetcher, which uses sequential load history (based on instruction pointer of previous loads) to determine whether to prefetch additional lines. 63:4 Reserved 1A6H 422 MSR_OFFCORE_RSP_0 Thread Offcore Response Event Select Register (R/W) 1A7H 422 MSR_OFFCORE_RSP_1 Thread Offcore Response Event Select Register (R/W) 1AAH 426 MSR_MISC_PWR_MGMT Miscellaneous Power Management Control Various model specific features enumeration. See http://biosbits.org. 1B0H 432 IA32_ENERGY_PERF_BIAS Package See Table 2-2. 1B1H 433 IA32_PACKAGE_THERM_STATUS Package See Table 2-2. 1B2H 434 IA32_PACKAGE_THERM_INTERRUPT Package See Table 2-2. 1C8H 456 MSR_LBR_SELECT Thread Last Branch Record Filtering Select Register (R/W) See Section 17.9.2, “Filtering of Last Branch Records.” 1C9H 457 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC 3 NEAR_REL_CALL 4 NEAR_IND_CALL 5 NEAR_RET 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 63:9 Reserved MSR_LASTBRANCH_TOS Thread Last Branch Record Stack TOS (R/W) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 680H). 1D9H 473 IA32_DEBUGCTL Thread Debug Control (R/W) See Table 2-2. 0 LBR: Last Branch Record 1 BTF 5:2 Reserved 6 TR: Branch Trace Vol. 4 2-165 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex 1DDH Register Name / Bit Fields Scope Bit Description Dec 477 7 BTS: Log Branch Trace Message to BTS buffer 8 BTINT 9 BTS_OFF_OS 10 BTS_OFF_USER 11 FREEZE_LBR_ON_PMI 12 FREEZE_PERFMON_ON_PMI 13 ENABLE_UNCORE_PMI 14 FREEZE_WHILE_SMM 63:15 Reserved MSR_LER_FROM_LIP Thread Last Exception Record From Linear IP (R/W) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Thread Last Exception Record To Linear IP (R/W) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1F2H 498 IA32_SMRR_PHYSBASE Core See Table 2-2. 1F3H 499 IA32_SMRR_PHYSMASK Core See Table 2-2. 1FCH 508 MSR_POWER_CTL Core See http://biosbits.org. 200H 512 IA32_MTRR_PHYSBASE0 Thread See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Thread See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Thread See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Thread See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Thread See Table 2-2. 205H 517 IA32_MTRR_PHYSMASK2 Thread See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Thread See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Thread See Table 2-2. 208H 520 IA32_MTRR_PHYSBASE4 Thread See Table 2-2. 209H 521 IA32_MTRR_PHYSMASK4 Thread See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Thread See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Thread See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Thread See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Thread See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Thread See Table 2-2. 2-166 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 20FH 527 IA32_MTRR_PHYSMASK7 Thread See Table 2-2. 210H 528 IA32_MTRR_PHYSBASE8 Thread See Table 2-2. 211H 529 IA32_MTRR_PHYSMASK8 Thread See Table 2-2. 212H 530 IA32_MTRR_PHYSBASE9 Thread See Table 2-2. 213H 531 IA32_MTRR_PHYSMASK9 Thread See Table 2-2. 250H 592 IA32_MTRR_FIX64K_00000 Thread See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Thread See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Thread See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Thread See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Thread See Table 2-2. 26AH 618 IA32_MTRR_FIX4K_D0000 Thread See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Thread See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Thread See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Thread See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Thread See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Thread See Table 2-2. 277H 631 IA32_PAT Thread See Table 2-2. 280H 640 IA32_MC0_CTL2 Core See Table 2-2. 281H 641 IA32_MC1_CTL2 Core See Table 2-2. 282H 642 IA32_MC2_CTL2 Core See Table 2-2. 283H 643 IA32_MC3_CTL2 Core See Table 2-2. 284H 644 IA32_MC4_CTL2 Package Always 0 (CMCI not supported). 2FFH 767 IA32_MTRR_DEF_TYPE Thread Default Memory Types (R/W) See Table 2-2. 309H 777 IA32_FIXED_CTR0 Thread Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Thread Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. 30BH 779 IA32_FIXED_CTR2 Thread Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES 5:0 Thread See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” LBR Format See Table 2-2. 6 PEBS Record Format. Vol. 4 2-167 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7 PEBSSaveArchRegs See Table 2-2. 11:8 PEBS_REC_FORMAT See Table 2-2. 12 SMM_FREEZE See Table 2-2. 63:13 38DH 909 IA32_FIXED_CTR_CTRL Reserved Thread Fixed-Function-Counter Control Register (R/W) See Table 2-2. 38EH 910 IA32_PERF_GLOBAL_STATUS See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 0 Thread Ovf_PMC0 1 Thread Ovf_PMC1 2 Thread Ovf_PMC2 3 Thread Ovf_PMC3 4 Core Ovf_PMC4 (if CPUID.0AH:EAX[15:8] > 4) 5 Core Ovf_PMC5 (if CPUID.0AH:EAX[15:8] > 5) 6 Core Ovf_PMC6 (if CPUID.0AH:EAX[15:8] > 6) 7 Core Ovf_PMC7 (if CPUID.0AH:EAX[15:8] > 7) 31:8 Reserved 32 Thread Ovf_FixedCtr0 33 Thread Ovf_FixedCtr1 34 Thread Ovf_FixedCtr2 60:35 38FH 2-168 Vol. 4 911 Reserved 61 Thread Ovf_Uncore 62 Thread Ovf_BufDSSAVE 63 Thread CondChgd IA32_PERF_GLOBAL_CTRL Thread See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 0 Thread Set 1 to enable PMC0 to count. 1 Thread Set 1 to enable PMC1 to count. 2 Thread Set 1 to enable PMC2 to count. 3 Thread Set 1 to enable PMC3 to count. 4 Core Set 1 to enable PMC4 to count (if CPUID.0AH:EAX[15:8] > 4). 5 Core Set 1 to enable PMC5 to count (if CPUID.0AH:EAX[15:8] > 5). MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Dec 6 Core Set 1 to enable PMC6 to count (if CPUID.0AH:EAX[15:8] > 6). 7 Core Set 1 to enable PMC7 to count (if CPUID.0AH:EAX[15:8] > 7). 31:8 390H 912 Reserved 32 Thread Set 1 to enable FixedCtr0 to count. 33 Thread Set 1 to enable FixedCtr1 to count. 34 Thread Set 1 to enable FixedCtr2 to count. 63:35 Reserved IA32_PERF_GLOBAL_OVF_CTRL See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 0 Thread Set 1 to clear Ovf_PMC0. 1 Thread Set 1 to clear Ovf_PMC1. 2 Thread Set 1 to clear Ovf_PMC2. 3 Thread Set 1 to clear Ovf_PMC3. 4 Core Set 1 to clear Ovf_PMC4 (if CPUID.0AH:EAX[15:8] > 4). 5 Core Set 1 to clear Ovf_PMC5 (if CPUID.0AH:EAX[15:8] > 5). 6 Core Set 1 to clear Ovf_PMC6 (if CPUID.0AH:EAX[15:8] > 6). 7 Core Set 1 to clear Ovf_PMC7 (if CPUID.0AH:EAX[15:8] > 7). 31:8 Reserved 32 Thread Set 1 to clear Ovf_FixedCtr0. 33 Thread Set 1 to clear Ovf_FixedCtr1. 34 Thread Set 1 to clear Ovf_FixedCtr2. 60:35 3F1H Bit Description 1009 Reserved 61 Thread Set 1 to clear Ovf_Uncore. 62 Thread Set 1 to clear Ovf_BufDSSAVE. 63 Thread Set 1 to clear CondChgd. MSR_PEBS_ENABLE Thread See Section 18.3.1.1.1, “Processor Event Based Sampling (PEBS).” 0 Enable PEBS on IA32_PMC0. (R/W) 1 Enable PEBS on IA32_PMC1. (R/W) 2 Enable PEBS on IA32_PMC2. (R/W) 3 Enable PEBS on IA32_PMC3. (R/W) 31:4 Reserved 32 Enable Load Latency on IA32_PMC0. (R/W) 33 Enable Load Latency on IA32_PMC1. (R/W) 34 Enable Load Latency on IA32_PMC2. (R/W) Vol. 4 2-169 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex 3F6H 3F8H Register Name / Bit Fields Scope Bit Description Dec 1014 1016 35 Enable Load Latency on IA32_PMC3. (R/W) 62:36 Reserved 63 Enable Precise Store (R/W) MSR_PEBS_LD_LAT Thread See Section 18.3.1.1.2, “Load Latency Performance Monitoring Facility.” 15:0 Minimum threshold latency value of tagged load operation that will be counted. (R/W) 63:36 Reserved MSR_PKG_C3_RESIDENCY Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C3 Residency Counter (R/O) Value since last reset that this package is in processorspecific C3 states. Count at the same frequency as the TSC. 3F9H 1017 MSR_PKG_C6_RESIDENCY Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C6 Residency Counter. (R/O) Value since last reset that this package is in processorspecific C6 states. Count at the same frequency as the TSC. 3FAH 1018 MSR_PKG_C7_RESIDENCY Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C7 Residency Counter (R/O) Value since last reset that this package is in processorspecific C7 states. Count at the same frequency as the TSC. 3FCH 1020 MSR_CORE_C3_RESIDENCY Core 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C3 Residency Counter (R/O) Value since last reset that this core is in processorspecific C3 states. Count at the same frequency as the TSC. 3FDH 1021 MSR_CORE_C6_RESIDENCY 63:0 Core Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C6 Residency Counter (R/O) Value since last reset that this core is in processorspecific C6 states. Count at the same frequency as the TSC. 2-170 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec 3FEH 1022 Register Name / Bit Fields MSR_CORE_C7_RESIDENCY Scope Core 63:0 Bit Description Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. CORE C7 Residency Counter (R/O) Value since last reset that this core is in processorspecific C7 states. Count at the same frequency as the TSC. 400H 1024 IA32_MC0_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 402H 1026 IA32_MC0_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 403H 1027 IA32_MC0_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 404H 1028 IA32_MC1_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 406H 1030 IA32_MC1_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 407H 1031 IA32_MC1_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 408H 1032 IA32_MC2_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 40AH 1034 IA32_MC2_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 40BH 1035 IA32_MC2_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 40CH 1036 IA32_MC3_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC3_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 40EH 1038 IA32_MC3_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 40FH 1039 IA32_MC3_MISC Core See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 410H 1040 IA32_MC4_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 0 PCU Hardware Error (R/W) When set, enables signaling of PCU hardware detected errors. 1 PCU Controller Error (R/W) When set, enables signaling of PCU controller detected errors. 2 PCU Firmware Error (R/W) When set, enables signaling of PCU firmware detected errors. 63:2 411H 1041 IA32_MC4_STATUS Reserved Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. Vol. 4 2-171 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec 480H 1152 Register Name / Bit Fields IA32_VMX_BASIC Scope Thread Bit Description Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS Thread Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS Thread Capability Reporting Register of Primary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” 483H 1155 IA32_VMX_EXIT_CTLS Thread Capability Reporting Register of VM-Exit Controls (R/O) See Table 2-2. See Appendix A.4, “VM-Exit Controls.” 484H 1156 IA32_VMX_ENTRY_CTLS Thread Capability Reporting Register of VM-Entry Controls (R/O) See Table 2-2. See Appendix A.5, “VM-Entry Controls.” 485H 1157 IA32_VMX_MISC Thread Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. See Appendix A.6, “Miscellaneous Data.” 486H 1158 IA32_VMX_CR0_FIXED0 Thread Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 487H 1159 IA32_VMX_CR0_FIXED1 Thread Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.7, “VMX-Fixed Bits in CR0.” 488H 1160 IA32_VMX_CR4_FIXED0 Thread Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 489H 1161 IA32_VMX_CR4_FIXED1 Thread Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. See Appendix A.8, “VMX-Fixed Bits in CR4.” 48AH 1162 IA32_VMX_VMCS_ENUM Thread Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. See Appendix A.9, “VMCS Enumeration.” 2-172 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Register Name / Bit Fields Scope Hex Dec 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Thread 48CH 1164 IA32_VMX_EPT_VPID_ENUM Thread Bit Description Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls.” Capability Reporting Register of EPT and VPID (R/O) See Table 2-2 48DH 1165 IA32_VMX_TRUE_PINBASED_CTLS Thread Capability Reporting Register of Pin-Based VM-Execution Flex Controls (R/O) See Table 2-2 48EH 1166 IA32_VMX_TRUE_PROCBASED_CTLS Thread 48FH 1167 IA32_VMX_TRUE_EXIT_CTLS Thread Capability Reporting Register of Primary ProcessorBased VM-Execution Flex Controls (R/O) See Table 2-2 Capability Reporting Register of VM-Exit Flex Controls (R/O) See Table 2-2 490H 1168 IA32_VMX_TRUE_ENTRY_CTLS Thread Capability Reporting Register of VM-Entry Flex Controls (R/O) See Table 2-2 4C1H 1217 IA32_A_PMC0 Thread See Table 2-2. 4C2H 1218 IA32_A_PMC1 Thread See Table 2-2. 4C3H 1219 IA32_A_PMC2 Thread See Table 2-2. 4C4H 1220 IA32_A_PMC3 Thread See Table 2-2. 4C5H 1221 IA32_A_PMC4 Core See Table 2-2. 4C6H 1222 IA32_A_PMC5 Core See Table 2-2. 4C7H 1223 IA32_A_PMC6 Core See Table 2-2. 4C8H 1224 IA32_A_PMC7 Core See Table 2-2. 600H 1536 IA32_DS_AREA Thread DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 606H 1542 MSR_RAPL_POWER_UNIT Package Unit Multipliers used in RAPL Interfaces (R/O) See Section 14.9.1, “RAPL Interfaces.” 60AH 1546 MSR_PKGC3_IRTL Package Package C3 Interrupt Response Limit (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C3 state. Vol. 4 2-173 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns 001b: 32 ns 010b: 1024 ns 011b: 32768 ns 100b: 1048576 ns 101b: 33554432 ns 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 60BH 1547 MSR_PKGC6_IRTL Reserved Package Package C6 Interrupt Response Limit (R/W) This MSR defines the budget allocated for the package to exit from a C6 to a C0 state, where an interrupt request can be delivered to the core and serviced. Additional core-exit latency may be applicable depending on the actual C-state the core is in. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C6 state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns 001b: 32 ns 010b: 1024 ns 011b: 32768 ns 100b: 1048576 ns 101b: 33554432 ns 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 2-174 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Register Name / Bit Fields Scope Dec 63:16 60DH Bit Description 1549 MSR_PKG_C2_RESIDENCY Reserved Package 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C2 Residency Counter (R/O) Value since last reset that this package is in processorspecific C2 states. Count at the same frequency as the TSC. 610H 1552 MSR_PKG_POWER_LIMIT Package PKG RAPL Power Limit Control (R/W) See Section 14.9.3, “Package RAPL Domain.” 611H 1553 MSR_PKG_ENERGY_STATUS Package PKG Energy Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 614H 1556 MSR_PKG_POWER_INFO Package 638H 1592 MSR_PP0_POWER_LIMIT Package PKG RAPL Parameters (R/W) See Section 14.9.3, “Package RAPL Domain.” PP0 RAPL Power Limit Control (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 680H 1664 MSR_LASTBRANCH_0_FROM_IP Thread Last Branch Record 0 From IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.9.1 and record format in Section 17.4.8.1. 681H 1665 MSR_LASTBRANCH_1_FROM_IP Thread Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 682H 1666 MSR_LASTBRANCH_2_FROM_IP Thread Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 683H 1667 MSR_LASTBRANCH_3_FROM_IP Thread Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 684H 1668 MSR_LASTBRANCH_4_FROM_IP Thread Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 685H 1669 MSR_LASTBRANCH_5_FROM_IP Thread Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 686H 1670 MSR_LASTBRANCH_6_FROM_IP Thread Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 687H 1671 MSR_LASTBRANCH_7_FROM_IP Thread Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 688H 1672 MSR_LASTBRANCH_8_FROM_IP Thread Last Branch Record 8 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 689H 1673 MSR_LASTBRANCH_9_FROM_IP Thread Last Branch Record 9 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. Vol. 4 2-175 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec 68AH 1674 Register Name / Bit Fields MSR_LASTBRANCH_10_FROM_IP Scope Thread Bit Description Last Branch Record 10 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68BH 1675 MSR_LASTBRANCH_11_FROM_IP Thread Last Branch Record 11 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68CH 1676 MSR_LASTBRANCH_12_FROM_IP Thread Last Branch Record 12 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68DH 1677 MSR_LASTBRANCH_13_FROM_IP Thread Last Branch Record 13 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68EH 1678 MSR_LASTBRANCH_14_FROM_IP Thread Last Branch Record 14 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 68FH 1679 MSR_LASTBRANCH_15_FROM_IP Thread Last Branch Record 15 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 6C0H 1728 MSR_LASTBRANCH_0_TO_IP Thread Last Branch Record 0 To IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction. 6C1H 1729 MSR_LASTBRANCH_1_TO_IP Thread Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C2H 1730 MSR_LASTBRANCH_2_TO_IP Thread Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C3H 1731 MSR_LASTBRANCH_3_TO_IP Thread Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C4H 1732 MSR_LASTBRANCH_4_TO_IP Thread Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C5H 1733 MSR_LASTBRANCH_5_TO_IP Thread Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C6H 1734 MSR_LASTBRANCH_6_TO_IP Thread Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C7H 1735 MSR_LASTBRANCH_7_TO_IP Thread Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C8H 1736 MSR_LASTBRANCH_8_TO_IP Thread Last Branch Record 8 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6C9H 1737 MSR_LASTBRANCH_9_TO_IP Thread Last Branch Record 9 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CAH 1738 MSR_LASTBRANCH_10_TO_IP Thread Last Branch Record 10 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CBH 1739 MSR_LASTBRANCH_11_TO_IP Thread Last Branch Record 11 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 2-176 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-20. MSRs Supported by Intel® Processors based on Intel® microarchitecture code name Sandy Bridge (Contd.) Register Address Hex Dec 6CCH 1740 Register Name / Bit Fields MSR_LASTBRANCH_12_TO_IP Scope Thread Bit Description Last Branch Record 12 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CDH 1741 MSR_LASTBRANCH_13_TO_IP Thread Last Branch Record 13 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CEH 1742 MSR_LASTBRANCH_14_TO_IP Thread Last Branch Record 14 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6CFH 1743 MSR_LASTBRANCH_15_TO_IP Thread Last Branch Record 15 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6E0H IA32_TSC_DEADLINE Thread See Table 2-2. 802H83FH X2APIC MSRs Thread See Table 2-2. C000_ 0080H IA32_EFER Thread Extended Feature Enables C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE C000_ 0103H IA32_TSC_AUX 2.11.1 1760 See Table 2-2. Thread System Call Target Address (R/W) See Table 2-2. Thread IA-32e Mode System Call Target Address (R/W) See Table 2-2. Thread System Call Flag Mask (R/W) See Table 2-2. Thread Map of BASE Address of FS (R/W) See Table 2-2. Thread Map of BASE Address of GS (R/W) See Table 2-2. Thread Swap Target of BASE Address of GS (R/W) See Table 2-2. Thread AUXILIARY TSC Signature (R/W) See Table 2-2 and Section 17.17.2, “IA32_TSC_AUX Register and RDTSCP Support.” MSRs In 2nd Generation Intel® Core™ Processor Family (Based on Intel® Microarchitecture Code Name Sandy Bridge) Table 2-21 and Table 2-22 list model-specific registers (MSRs) that are specific to the 2nd generation Intel® Core™ processor family (based on Intel microarchitecture code name Sandy Bridge). These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2AH; see Table 2-1. Vol. 4 2-177 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-21. MSRs Supported by 2nd Generation Intel® Core™ Processors (Intel® microarchitecture code name Sandy Bridge) Register Address Hex Dec 1ADH 429 Register Name / Bit Fields MSR_TURBO_RATIO_LIMIT Scope Package Bit Description Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 63:32 60CH 1548 MSR_PKGC7_IRTL Reserved Package Package C7 Interrupt Response Limit (R/W) This MSR defines the budget allocated for the package to exit from a C7 to a C0 state, where interrupt request can be delivered to the core and serviced. Additional core-exit latency may be applicable depending on the actual C-state the core is in. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C7 state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns 001b: 32 ns 010b: 1024 ns 011b: 32768 ns 100b: 1048576 ns 101b: 33554432 ns 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 2-178 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-21. MSRs Supported by 2nd Generation Intel® Core™ Processors (Intel® microarchitecture code name Sandy Bridge) (Contd.) Register Address Hex Dec 639H 1593 Register Name / Bit Fields MSR_PP0_ENERGY_STATUS Scope Package Bit Description PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 63AH 1594 MSR_PP0_POLICY Package PP0 Balance Policy (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 640H 1600 MSR_PP1_POWER_LIMIT Package PP1 RAPL Power Limit Control (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 641H 1601 MSR_PP1_ENERGY_STATUS Package PP1 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 642H 1602 MSR_PP1_POLICY Package PP1 Balance Policy (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” See Table 2-20, Table 2-21, and Table 2-22 for MSR definitions applicable to processors with CPUID signature 06_2AH. Table 2-22 lists the MSRs of uncore PMU for Intel processors with CPUID signature 06_2AH. Table 2-22. Uncore PMU MSRs Supported by 2nd Generation Intel® Core™ Processors Register Address Hex Dec 391H 913 392H 394H 914 916 Register Name / Bit Fields MSR_UNC_PERF_GLOBAL_CTRL Scope Package Bit Description Uncore PMU Global Control 0 Slice 0 select. 1 Slice 1 select. 2 Slice 2 select. 3 Slice 3 select. 4 Slice 4 select. 18:5 Reserved 29 Enable all uncore counters. 30 Enable wake on PMI. 31 Enable Freezing counter when overflow. 63:32 Reserved MSR_UNC_PERF_GLOBAL_STATUS Package Uncore PMU Main Status 0 Fixed counter overflowed. 1 An ARB counter overflowed. 2 Reserved 3 A CBox counter overflowed (on any slice). 63:4 Reserved MSR_UNC_PERF_FIXED_CTRL Package Uncore Fixed Counter Control (R/W) 19:0 Reserved 20 Enable overflow propagation. Vol. 4 2-179 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-22. Uncore PMU MSRs Supported by 2nd Generation Intel® Core™ Processors Register Address Hex 395H 396H Register Name / Bit Fields Scope Bit Description Dec 917 918 21 Reserved 22 Enable counting. 63:23 Reserved MSR_UNC_PERF_FIXED_CTR Package Uncore Fixed Counter 47:0 Current count. 63:48 Reserved MSR_UNC_CBO_CONFIG Package Uncore C-Box Configuration Information (R/O) 3:0 Report the number of C-Box units with performance counters, including processor cores and processor graphics. 63:4 Reserved 3B0H 946 MSR_UNC_ARB_PERFCTR0 Package Uncore Arb Unit, Performance Counter 0 3B1H 947 MSR_UNC_ARB_PERFCTR1 Package Uncore Arb Unit, Performance Counter 1 3B2H 944 MSR_UNC_ARB_PERFEVTSEL0 Package Uncore Arb Unit, Counter 0 Event Select MSR 3B3H 945 MSR_UNC_ARB_PERFEVTSEL1 Package Uncore Arb unit, Counter 1 Event Select MSR 700H 1792 MSR_UNC_CBO_0_PERFEVTSEL0 Package Uncore C-Box 0, Counter 0 Event Select MSR 701H 1793 MSR_UNC_CBO_0_PERFEVTSEL1 Package Uncore C-Box 0, Counter 1 Event Select MSR 702H 1794 MSR_UNC_CBO_0_PERFEVTSEL2 Package Uncore C-Box 0, Counter 2 Event Select MSR 703H 1795 MSR_UNC_CBO_0_PERFEVTSEL3 Package Uncore C-Box 0, Counter 3 Event Select MSR 705H 1797 MSR_UNC_CBO_0_UNIT_STATUS Package Uncore C-Box 0, Unit Status for Counter 0-3 706H 1798 MSR_UNC_CBO_0_PERFCTR0 Package Uncore C-Box 0, Performance Counter 0 707H 1799 MSR_UNC_CBO_0_PERFCTR1 Package Uncore C-Box 0, Performance Counter 1 708H 1800 MSR_UNC_CBO_0_PERFCTR2 Package Uncore C-Box 0, Performance Counter 2 709H 1801 MSR_UNC_CBO_0_PERFCTR3 Package Uncore C-Box 0, Performance Counter 3 710H 1808 MSR_UNC_CBO_1_PERFEVTSEL0 Package Uncore C-Box 1, Counter 0 Event Select MSR 711H 1809 MSR_UNC_CBO_1_PERFEVTSEL1 Package Uncore C-Box 1, Counter 1 Event Select MSR 712H 1810 MSR_UNC_CBO_1_PERFEVTSEL2 Package Uncore C-Box 1, Counter 2 Event Select MSR 713H 1811 MSR_UNC_CBO_1_PERFEVTSEL3 Package Uncore C-Box 1, Counter 3 Event Select MSR 715H 1813 MSR_UNC_CBO_1_UNIT_STATUS Package Uncore C-Box 1, Unit Status for Counter 0-3 716H 1814 MSR_UNC_CBO_1_PERFCTR0 Package Uncore C-Box 1, Performance Counter 0 717H 1815 MSR_UNC_CBO_1_PERFCTR1 Package Uncore C-Box 1, Performance Counter 1 718H 1816 MSR_UNC_CBO_1_PERFCTR2 Package Uncore C-Box 1, Performance Counter 2 719H 1817 MSR_UNC_CBO_1_PERFCTR3 Package Uncore C-Box 1, Performance Counter 3 720H 1824 MSR_UNC_CBO_2_PERFEVTSEL0 Package Uncore C-Box 2, Counter 0 Event Select MSR 721H 1825 MSR_UNC_CBO_2_PERFEVTSEL1 Package Uncore C-Box 2, Counter 1 Event Select MSR 722H 1826 MSR_UNC_CBO_2_PERFEVTSEL2 Package Uncore C-Box 2, Counter 2 Event Select MSR 2-180 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-22. Uncore PMU MSRs Supported by 2nd Generation Intel® Core™ Processors Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 723H 1827 MSR_UNC_CBO_2_PERFEVTSEL3 Package Uncore C-Box 2, Counter 3 Event Select MSR 725H 1829 MSR_UNC_CBO_2_UNIT_STATUS Package Uncore C-Box 2, Unit Status for Counter 0-3 726H 1830 MSR_UNC_CBO_2_PERFCTR0 Package Uncore C-Box 2, Performance Counter 0 727H 1831 MSR_UNC_CBO_2_PERFCTR1 Package Uncore C-Box 2, Performance Counter 1 728H 1832 MSR_UNC_CBO_3_PERFCTR2 Package Uncore C-Box 3, Performance Counter 2 729H 1833 MSR_UNC_CBO_3_PERFCTR3 Package Uncore C-Box 3, Performance Counter 3 730H 1840 MSR_UNC_CBO_3_PERFEVTSEL0 Package Uncore C-Box 3, Counter 0 Event Select MSR 731H 1841 MSR_UNC_CBO_3_PERFEVTSEL1 Package Uncore C-Box 3, Counter 1 Event Select MSR 732H 1842 MSR_UNC_CBO_3_PERFEVTSEL2 Package Uncore C-Box 3, Counter 2 Event Select MSR 733H 1843 MSR_UNC_CBO_3_PERFEVTSEL3 Package Uncore C-Box 3, counter 3 Event Select MSR 735H 1845 MSR_UNC_CBO_3_UNIT_STATUS Package Uncore C-Box 3, Unit Status for Counter 0-3 736H 1846 MSR_UNC_CBO_3_PERFCTR0 Package Uncore C-Box 3, Performance Counter 0 737H 1847 MSR_UNC_CBO_3_PERFCTR1 Package Uncore C-Box 3, Performance Counter 1 738H 1848 MSR_UNC_CBO_3_PERFCTR2 Package Uncore C-Box 3, Performance Counter 2 739H 1849 MSR_UNC_CBO_3_PERFCTR3 Package Uncore C-Box 3, Performance Counter 3 740H 1856 MSR_UNC_CBO_4_PERFEVTSEL0 Package Uncore C-Box 4, Counter 0 Event Select MSR 741H 1857 MSR_UNC_CBO_4_PERFEVTSEL1 Package Uncore C-Box 4, Counter 1 Event Select MSR 742H 1858 MSR_UNC_CBO_4_PERFEVTSEL2 Package Uncore C-Box 4, Counter 2 Event Select MSR 743H 1859 MSR_UNC_CBO_4_PERFEVTSEL3 Package Uncore C-Box 4, Counter 3 Event Select MSR 745H 1861 MSR_UNC_CBO_4_UNIT_STATUS Package Uncore C-Box 4, Unit status for Counter 0-3 746H 1862 MSR_UNC_CBO_4_PERFCTR0 Package Uncore C-Box 4, Performance Counter 0 747H 1863 MSR_UNC_CBO_4_PERFCTR1 Package Uncore C-Box 4, Performance Counter 1 748H 1864 MSR_UNC_CBO_4_PERFCTR2 Package Uncore C-Box 4, Performance Counter 2 749H 1865 MSR_UNC_CBO_4_PERFCTR3 Package Uncore C-Box 4, Performance Counter 3 2.11.2 MSRs In Intel® Xeon® Processor E5 Family (Based on Intel® Microarchitecture Code Name Sandy Bridge) Table 2-23 lists additional model-specific registers (MSRs) that are specific to the Intel® Xeon® Processor E5 Family (based on Intel® microarchitecture code name Sandy Bridge). These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2DH, and also supports MSRs listed in Table 2-20 and Table 2-24. Table 2-23. Selected MSRs Supported by Intel® Xeon® Processors E5 Family (based on Sandy Bridge microarchitecture) Register Address Hex Dec 17FH 383 Register Name / Bit Fields MSR_ERROR_CONTROL 0 Scope Package Bit Description MC Bank Error Configuration (R/W) Reserved Vol. 4 2-181 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-23. Selected MSRs Supported by Intel® Xeon® Processors E5 Family (based on Sandy Bridge microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 1 MemError Log Enable (R/W) When set, enables IMC status bank to log additional info in bits 36:32. 63:2 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0, RW if MSR_PLATFORM_INFO.[28] = 1 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 cores active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 cores active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 cores active. 39:32 Package Maximum Ratio Limit for 5C Maximum turbo ratio limit of 5 cores active. 47:40 Package Maximum Ratio Limit for 6C Maximum turbo ratio limit of 6 cores active. 55:48 Package Maximum Ratio Limit for 7C Maximum turbo ratio limit of 7 cores active. 63:56 Package Maximum Ratio Limit for 8C Maximum turbo ratio limit of 8 cores active. 285H 645 IA32_MC5_CTL2 Package See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 288H 648 IA32_MC8_CTL2 Package See Table 2-2. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 28BH 651 IA32_MC11_CTL2 Package See Table 2-2. 28CH 652 IA32_MC12_CTL2 Package See Table 2-2. 28DH 653 IA32_MC13_CTL2 Package See Table 2-2. 28EH 654 IA32_MC14_CTL2 Package See Table 2-2. 28FH 655 IA32_MC15_CTL2 Package See Table 2-2. 290H 656 IA32_MC16_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 2-182 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-23. Selected MSRs Supported by Intel® Xeon® Processors E5 Family (based on Sandy Bridge microarchitecture) (Contd.) Register Address Hex Dec 39CH 924 Register Name / Bit Fields MSR_PEBS_NUM_ALT Scope Package 0 Bit Description ENABLE_PEBS_NUM_ALT (RW) ENABLE_PEBS_NUM_ALT (RW) Write 1 to enable alternate PEBS counting logic for specific events requiring additional configuration, see Table 19-19. 63:1 Reserved, must be zero. 414H 1044 IA32_MC5_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 415H 1045 IA32_MC5_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 416H 1046 IA32_MC5_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 417H 1047 IA32_MC5_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 418H 1048 IA32_MC6_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 419H 1049 IA32_MC6_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 41AH 1050 IA32_MC6_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 41BH 1051 IA32_MC6_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 41CH 1052 IA32_MC7_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 41DH 1053 IA32_MC7_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 41EH 1054 IA32_MC7_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 41FH 1055 IA32_MC7_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 420H 1056 IA32_MC8_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 421H 1057 IA32_MC8_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 422H 1058 IA32_MC8_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 423H 1059 IA32_MC8_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 424H 1060 IA32_MC9_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 425H 1061 IA32_MC9_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 426H 1062 IA32_MC9_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 427H 1063 IA32_MC9_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 428H 1064 IA32_MC10_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 429H 1065 IA32_MC10_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 42AH 1066 IA32_MC10_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 42BH 1067 IA32_MC10_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 42CH 1068 IA32_MC11_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 42DH 1069 IA32_MC11_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. Vol. 4 2-183 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-23. Selected MSRs Supported by Intel® Xeon® Processors E5 Family (based on Sandy Bridge microarchitecture) (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 42EH 1070 IA32_MC11_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 42FH 1071 IA32_MC11_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 430H 1072 IA32_MC12_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 431H 1073 IA32_MC12_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 432H 1074 IA32_MC12_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 433H 1075 IA32_MC12_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 434H 1076 IA32_MC13_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 435H 1077 IA32_MC13_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 436H 1078 IA32_MC13_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 437H 1079 IA32_MC13_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 438H 1080 IA32_MC14_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 439H 1081 IA32_MC14_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 43AH 1082 IA32_MC14_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 43BH 1083 IA32_MC14_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 43CH 1084 IA32_MC15_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 43DH 1085 IA32_MC15_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 43EH 1086 IA32_MC15_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 43FH 1087 IA32_MC15_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 440H 1088 IA32_MC16_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 441H 1089 IA32_MC16_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 442H 1090 IA32_MC16_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 443H 1091 IA32_MC16_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 444H 1092 IA32_MC17_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 445H 1093 IA32_MC17_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 446H 1094 IA32_MC17_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 447H 1095 IA32_MC17_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 448H 1096 IA32_MC18_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 449H 1097 IA32_MC18_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 44AH 1098 IA32_MC18_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 44BH 1099 IA32_MC18_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 44CH 1100 IA32_MC19_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 2-184 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-23. Selected MSRs Supported by Intel® Xeon® Processors E5 Family (based on Sandy Bridge microarchitecture) (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 44DH 1101 IA32_MC19_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS” and Chapter 16. 44EH 1102 IA32_MC19_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 44FH 1103 IA32_MC19_MISC Package See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” 613H 1555 MSR_PKG_PERF_STATUS Package Package RAPL Perf Status (R/O) 618H 1560 MSR_DRAM_POWER_LIMIT Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61BH 1563 MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” See Section 14.9.5, “DRAM RAPL Domain.” 639H 1593 MSR_PP0_ENERGY_STATUS Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” See Table 2-20, Table 2-23, and Table 2-24 for MSR definitions applicable to processors with CPUID signature 06_2DH. 2.11.3 Additional Uncore PMU MSRs in the Intel® Xeon® Processor E5 Family Intel Xeon Processor E5 family is based on the Sandy Bridge microarchitecture. The MSR-based uncore PMU interfaces are listed in Table 2-24. For complete detail of the uncore PMU, refer to Intel Xeon Processor E5 Product Family Uncore Performance Monitoring Guide. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2DH Table 2-24. Uncore PMU MSRs in Intel® Xeon® Processor E5 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec C08H MSR_U_PMON_UCLK_FIXED_CTL Package Uncore U-box UCLK Fixed Counter Control C09H MSR_U_PMON_UCLK_FIXED_CTR Package Uncore U-box UCLK Fixed Counter C10H MSR_U_PMON_EVNTSEL0 Package Uncore U-box Perfmon Event Select for U-box Counter 0 C11H MSR_U_PMON_EVNTSEL1 Package Uncore U-box Perfmon Event Select for U-box Counter 1 C16H MSR_U_PMON_CTR0 Package Uncore U-box Perfmon Counter 0 C17H MSR_U_PMON_CTR1 Package Uncore U-box Perfmon Counter 1 C24H MSR_PCU_PMON_BOX_CTL Package Uncore PCU Perfmon for PCU-box-wide Control C30H MSR_PCU_PMON_EVNTSEL0 Package Uncore PCU Perfmon Event Select for PCU Counter 0 C31H MSR_PCU_PMON_EVNTSEL1 Package Uncore PCU Perfmon Event Select for PCU Counter 1 C32H MSR_PCU_PMON_EVNTSEL2 Package Uncore PCU Perfmon Event Select for PCU Counter 2 Vol. 4 2-185 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-24. Uncore PMU MSRs in Intel® Xeon® Processor E5 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec C33H MSR_PCU_PMON_EVNTSEL3 Package Uncore PCU Perfmon Event Select for PCU Counter 3 C34H MSR_PCU_PMON_BOX_FILTER Package Uncore PCU Perfmon box-wide Filter C36H MSR_PCU_PMON_CTR0 Package Uncore PCU Perfmon Counter 0 C37H MSR_PCU_PMON_CTR1 Package Uncore PCU Perfmon Counter 1 C38H MSR_PCU_PMON_CTR2 Package Uncore PCU Perfmon Counter 2 C39H MSR_PCU_PMON_CTR3 Package Uncore PCU Perfmon Counter 3 D04H MSR_C0_PMON_BOX_CTL Package Uncore C-box 0 Perfmon Local Box Wide Control D10H MSR_C0_PMON_EVNTSEL0 Package Uncore C-box 0 Perfmon Event Select for C-box 0 Counter 0 D11H MSR_C0_PMON_EVNTSEL1 Package Uncore C-box 0 Perfmon Event Select for C-box 0 Counter 1 D12H MSR_C0_PMON_EVNTSEL2 Package Uncore C-box 0 Perfmon Event Select for C-box 0 Counter 2 D13H MSR_C0_PMON_EVNTSEL3 Package Uncore C-box 0 Perfmon Event Select for C-box 0 Counter 3 D14H MSR_C0_PMON_BOX_FILTER Package Uncore C-box 0 Perfmon Box Wide Filter D16H MSR_C0_PMON_CTR0 Package Uncore C-box 0 Perfmon Counter 0 D17H MSR_C0_PMON_CTR1 Package Uncore C-box 0 Perfmon Counter 1 D18H MSR_C0_PMON_CTR2 Package Uncore C-box 0 Perfmon Counter 2 D19H MSR_C0_PMON_CTR3 Package Uncore C-box 0 Perfmon Counter 3 D24H MSR_C1_PMON_BOX_CTL Package Uncore C-box 1 Perfmon Local Box Wide Control D30H MSR_C1_PMON_EVNTSEL0 Package Uncore C-box 1 Perfmon Event Select for C-box 1 Counter 0 D31H MSR_C1_PMON_EVNTSEL1 Package Uncore C-box 1 Perfmon Event Select for C-box 1 Counter 1 D32H MSR_C1_PMON_EVNTSEL2 Package Uncore C-box 1 Perfmon Event Select for C-box 1 Counter 2 D33H MSR_C1_PMON_EVNTSEL3 Package Uncore C-box 1 Perfmon Event Select for C-box 1 Counter 3 D34H MSR_C1_PMON_BOX_FILTER Package Uncore C-box 1 Perfmon Box Wide Filter D36H MSR_C1_PMON_CTR0 Package Uncore C-box 1 Perfmon Counter 0 D37H MSR_C1_PMON_CTR1 Package Uncore C-box 1 Perfmon Counter 1 D38H MSR_C1_PMON_CTR2 Package Uncore C-box 1 Perfmon Counter 2 D39H MSR_C1_PMON_CTR3 Package Uncore C-box 1 Perfmon Counter 3 D44H MSR_C2_PMON_BOX_CTL Package Uncore C-box 2 Perfmon Local Box Wide Control D50H MSR_C2_PMON_EVNTSEL0 Package Uncore C-box 2 Perfmon Event Select for C-box 2 Counter 0 D51H MSR_C2_PMON_EVNTSEL1 Package Uncore C-box 2 Perfmon Event Select for C-box 2 Counter 1 2-186 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-24. Uncore PMU MSRs in Intel® Xeon® Processor E5 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec D52H MSR_C2_PMON_EVNTSEL2 Package Uncore C-box 2 Perfmon Event Select for C-box 2 Counter 2 D53H MSR_C2_PMON_EVNTSEL3 Package Uncore C-box 2 Perfmon Event Select for C-box 2 Counter 3 D54H MSR_C2_PMON_BOX_FILTER Package Uncore C-box 2 Perfmon Box Wide Filter D56H MSR_C2_PMON_CTR0 Package Uncore C-box 2 Perfmon Counter 0 D57H MSR_C2_PMON_CTR1 Package Uncore C-box 2 Perfmon Counter 1 D58H MSR_C2_PMON_CTR2 Package Uncore C-box 2 Perfmon Counter 2 D59H MSR_C2_PMON_CTR3 Package Uncore C-box 2 Perfmon Counter 3 D64H MSR_C3_PMON_BOX_CTL Package Uncore C-box 3 Perfmon Local Box Wide Control D70H MSR_C3_PMON_EVNTSEL0 Package Uncore C-box 3 Perfmon Event Select for C-box 3 Counter 0 D71H MSR_C3_PMON_EVNTSEL1 Package Uncore C-box 3 Perfmon Event Select for C-box 3 Counter 1 D72H MSR_C3_PMON_EVNTSEL2 Package Uncore C-box 3 Perfmon Event Select for C-box 3 Counter 2 D73H MSR_C3_PMON_EVNTSEL3 Package Uncore C-box 3 Perfmon Event Select for C-box 3 Counter 3 D74H MSR_C3_PMON_BOX_FILTER Package Uncore C-box 3 Perfmon Box Wide Filter D76H MSR_C3_PMON_CTR0 Package Uncore C-box 3 Perfmon Counter 0 D77H MSR_C3_PMON_CTR1 Package Uncore C-box 3 Perfmon Counter 1 D78H MSR_C3_PMON_CTR2 Package Uncore C-box 3 Perfmon Counter 2 D79H MSR_C3_PMON_CTR3 Package Uncore C-box 3 Perfmon Counter 3 D84H MSR_C4_PMON_BOX_CTL Package Uncore C-box 4 Perfmon Local Box Wide Control D90H MSR_C4_PMON_EVNTSEL0 Package Uncore C-box 4 Perfmon Event Select for C-box 4 Counter 0 D91H MSR_C4_PMON_EVNTSEL1 Package Uncore C-box 4 Perfmon Event Select for C-box 4 Counter 1 D92H MSR_C4_PMON_EVNTSEL2 Package Uncore C-box 4 Perfmon Event Select for C-box 4 Counter 2 D93H MSR_C4_PMON_EVNTSEL3 Package Uncore C-box 4 Perfmon Event Select for C-box 4 Counter 3 D94H MSR_C4_PMON_BOX_FILTER Package Uncore C-box 4 Perfmon Box Wide Filter D96H MSR_C4_PMON_CTR0 Package Uncore C-box 4 Perfmon Counter 0 D97H MSR_C4_PMON_CTR1 Package Uncore C-box 4 Perfmon Counter 1 D98H MSR_C4_PMON_CTR2 Package Uncore C-box 4 Perfmon Counter 2 D99H MSR_C4_PMON_CTR3 Package Uncore C-box 4 Perfmon Counter 3 DA4H MSR_C5_PMON_BOX_CTL Package Uncore C-box 5 Perfmon Local Box Wide Control DB0H MSR_C5_PMON_EVNTSEL0 Package Uncore C-box 5 Perfmon Event Select for C-box 5 Counter 0 Vol. 4 2-187 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-24. Uncore PMU MSRs in Intel® Xeon® Processor E5 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec DB1H MSR_C5_PMON_EVNTSEL1 Package Uncore C-box 5 Perfmon Event Select for C-box 5 Counter 1 DB2H MSR_C5_PMON_EVNTSEL2 Package Uncore C-box 5 Perfmon Event Select for C-box 5 Counter 2 DB3H MSR_C5_PMON_EVNTSEL3 Package Uncore C-box 5 Perfmon Event Select for C-box 5 Counter 3 DB4H MSR_C5_PMON_BOX_FILTER Package Uncore C-box 5 Perfmon Box Wide Filter DB6H MSR_C5_PMON_CTR0 Package Uncore C-box 5 Perfmon Counter 0 DB7H MSR_C5_PMON_CTR1 Package Uncore C-box 5 Perfmon Counter 1 DB8H MSR_C5_PMON_CTR2 Package Uncore C-box 5 Perfmon Counter 2 DB9H MSR_C5_PMON_CTR3 Package Uncore C-box 5 Perfmon Counter 3 DC4H MSR_C6_PMON_BOX_CTL Package Uncore C-box 6 Perfmon Local Box Wide Control DD0H MSR_C6_PMON_EVNTSEL0 Package Uncore C-box 6 Perfmon Event Select for C-box 6 Counter 0 DD1H MSR_C6_PMON_EVNTSEL1 Package Uncore C-box 6 Perfmon Event Select for C-box 6 Counter 1 DD2H MSR_C6_PMON_EVNTSEL2 Package Uncore C-box 6 Perfmon Event Select for C-box 6 Counter 2 DD3H MSR_C6_PMON_EVNTSEL3 Package Uncore C-box 6 Perfmon Event Select for C-box 6 Counter 3 DD4H MSR_C6_PMON_BOX_FILTER Package Uncore C-box 6 Perfmon Box Wide Filter DD6H MSR_C6_PMON_CTR0 Package Uncore C-box 6 Perfmon Counter 0 DD7H MSR_C6_PMON_CTR1 Package Uncore C-box 6 Perfmon Counter 1 DD8H MSR_C6_PMON_CTR2 Package Uncore C-box 6 Perfmon Counter 2 DD9H MSR_C6_PMON_CTR3 Package Uncore C-box 6 Perfmon Counter 3 DE4H MSR_C7_PMON_BOX_CTL Package Uncore C-box 7 Perfmon Local Box Wide Control DF0H MSR_C7_PMON_EVNTSEL0 Package Uncore C-box 7 Perfmon Event Select for C-box 7 Counter 0 DF1H MSR_C7_PMON_EVNTSEL1 Package Uncore C-box 7 Perfmon Event Select for C-box 7 Counter 1 DF2H MSR_C7_PMON_EVNTSEL2 Package Uncore C-box 7 Perfmon Event Select for C-box 7 Counter 2 DF3H MSR_C7_PMON_EVNTSEL3 Package Uncore C-box 7 Perfmon Event Select for C-box 7 Counter 3 DF4H MSR_C7_PMON_BOX_FILTER Package Uncore C-box 7 Perfmon Box Wide Filter DF6H MSR_C7_PMON_CTR0 Package Uncore C-box 7 Perfmon Counter 0 DF7H MSR_C7_PMON_CTR1 Package Uncore C-box 7 Perfmon Counter 1 DF8H MSR_C7_PMON_CTR2 Package Uncore C-box 7 Perfmon Counter 2 DF9H MSR_C7_PMON_CTR3 Package Uncore C-box 7 Perfmon Counter 3 2-188 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) 2.12 MSRS IN THE 3RD GENERATION INTEL® CORE™ PROCESSOR FAMILY (BASED ON INTEL® MICROARCHITECTURE CODE NAME IVY BRIDGE) The 3rd generation Intel® Core™ processor family and the Intel® Xeon® processor E3-1200v2 product family (based on Intel microarchitecture code name Ivy Bridge) support the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-22, and Table 2-25. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_3AH. Table 2-25. Additional MSRs Supported by 3rd Generation Intel® Core™ Processors (based on Intel® microarchitecture code name Ivy Bridge) Register Address Hex Dec CEH 206 Register Name / Bit Fields MSR_PLATFORM_INFO Scope Package Bit Description Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. Frequency = ratio * 100 MHz. 27:16 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates that TDP Limit for Turbo mode is not programmable. 31:30 32 Reserved Package Low Power Mode Support (LPM) (R/O) When set to 1, indicates that LPM is supported. When set to 0, indicates LPM is not supported. 34:33 Package Number of ConfigTDP Levels (R/O) 00: Only Base TDP level available. 01: One additional TDP level available. 02: Two additional TDP level available. 11: Reserved 39:35 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 55:48 Package Minimum Operating Ratio (R/O) Contains the minimum supported operating ratio in units of 100 MHz. 63:56 Reserved Vol. 4 2-189 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-25. Additional MSRs Supported by 3rd Generation Intel® Core™ Processors (based on Intel® microarchitecture code name Ivy Bridge) (Contd.) Register Address Hex Dec E2H 226 Register Name / Bit Fields MSR_PKG_CST_CONFIG_CONTROL Scope Core Bit Description C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-sate support) 001b: C2 010b: C6 no retention 011b: C6 retention 100b: C7 101b: C7s 111: No package C-state limit. Note: This field cannot be used to limit package Cstate to C3. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) When set, the processor will conditionally demote C6/C7 requests to C3 based on uncore auto-demote information. 26 C1 State Auto Demotion Enable (R/W) When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore autodemote information. 27 Enable C3 Undemotion (R/W) When set, enables undemotion from demoted C3. 28 Enable C1 Undemotion (R/W) When set, enables undemotion from demoted C1. 2-190 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-25. Additional MSRs Supported by 3rd Generation Intel® Core™ Processors (based on Intel® microarchitecture code name Ivy Bridge) (Contd.) Register Address Hex Register Name / Bit Fields Scope Dec 63:29 639H Bit Description 1593 MSR_PP0_ENERGY_STATUS Reserved Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 648H 1608 MSR_CONFIG_TDP_NOMINAL Package 7:0 Base TDP Ratio (R/O) Config_TDP_Base Base TDP level ratio to be used for this specific processor (in units of 100 MHz). 63:8 649H 1609 MSR_CONFIG_TDP_LEVEL1 Reserved Package 14:0 ConfigTDP Level 1 ratio and power level (R/O) PKG_TDP_LVL1 Power setting for ConfigTDP Level 1. 15 Reserved 23:16 Config_TDP_LVL1_Ratio ConfigTDP level 1 ratio to be used for this specific processor. 31:24 Reserved 46:32 PKG_MAX_PWR_LVL1 Max Power setting allowed for ConfigTDP Level 1. 47 Reserved 62:48 PKG_MIN_PWR_LVL1 MIN Power setting allowed for ConfigTDP Level 1. 63 64AH 1610 MSR_CONFIG_TDP_LEVEL2 Reserved Package 14:0 ConfigTDP Level 2 ratio and power level (R/O) PKG_TDP_LVL2 Power setting for ConfigTDP Level 2. 15 Reserved 23:16 Config_TDP_LVL2_Ratio ConfigTDP level 2 ratio to be used for this specific processor. 31:24 Reserved 46:32 PKG_MAX_PWR_LVL2 Max Power setting allowed for ConfigTDP Level 2. 47 Reserved 62:48 PKG_MIN_PWR_LVL2 MIN Power setting allowed for ConfigTDP Level 2. 63 64BH 1611 MSR_CONFIG_TDP_CONTROL Reserved Package ConfigTDP Control (R/W) Vol. 4 2-191 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-25. Additional MSRs Supported by 3rd Generation Intel® Core™ Processors (based on Intel® microarchitecture code name Ivy Bridge) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 1:0 TDP_LEVEL (RW/L) System BIOS can program this field. 30:2 Reserved. 31 Config_TDP_Lock (RW/L) When this bit is set, the content of this register is locked until a reset. 63:32 64CH 1612 Reserved MSR_TURBO_ACTIVATION_RATIO Package 7:0 ConfigTDP Control (R/W) MAX_NON_TURBO_RATIO (RW/L) System BIOS can program this field. 30:8 Reserved 31 TURBO_ACTIVATION_RATIO_Lock (RW/L) When this bit is set, the content of this register is locked until a reset. 63:32 Reserved See Table 2-20, Table 2-21 and Table 2-22 for other MSR definitions applicable to processors with CPUID signature 06_3AH. 2.12.1 MSRs In Intel® Xeon® Processor E5 v2 Product Family (Based on Ivy Bridge-E Microarchitecture) Table 2-26 lists model-specific registers (MSRs) that are specific to the Intel® Xeon® Processor E5 v2 Product Family (based on Ivy Bridge-E microarchitecture). These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_3EH, see Table 2-1. These processors supports the MSR interfaces listed in Table 2-20, and Table 2-26. Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) Register Address Hex Dec 4EH 78 2-192 Vol. 4 Register Name / Bit Fields MSR_PPIN_CTL Scope Package Bit Description Protected Processor Inventory Number Enable Control (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 0 LockOut (R/WO) Set 1 to prevent further writes to MSR_PPIN_CTL. Writing 1 to MSR_PPINCTL[bit 0] is permitted only if MSR_PPIN_CTL[bit 1] is clear. Default is 0. BIOS should provide an opt-in menu to enable the user to turn on MSR_PPIN_CTL[bit 1] for privileged inventory initialization agent to access MSR_PPIN. After reading MSR_PPIN, the privileged inventory initialization agent should write ‘01b’ to MSR_PPIN_CTL to disable further access to MSR_PPIN and prevent unauthorized modification to MSR_PPIN_CTL. 1 Enable_PPIN (R/W) If 1, enables MSR_PPIN to be accessible using RDMSR. Once set, attempt to write 1 to MSR_PPIN_CTL[bit 0] will cause #GP. If 0, an attempt to read MSR_PPIN will cause #GP. Default is 0. 63:2 4FH 79 MSR_PPIN Reserved Package 63:0 Protected Processor Inventory Number (R/O) Protected Processor Inventory Number (R/O) A unique value within a given CPUID family/model/stepping signature that a privileged inventory initialization agent can access to identify each physical processor, when access to MSR_PPIN is enabled. Access to MSR_PPIN is permitted only if MSR_PPIN_CTL[bits 1:0] = ‘10b’. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. Frequency = ratio * 100 MHz. 22:16 23 Reserved Package PPIN_CAP (R/O) When set to 1, indicates that Protected Processor Inventory Number (PPIN) capability can be enabled for a privileged system inventory agent to read PPIN from MSR_PPIN. When set to 0, PPIN capability is not supported. An attempt to access MSR_PPIN_CTL or MSR_PPIN will cause #GP. 27:24 Reserved Vol. 4 2-193 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 28 Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates TDP Limit for Turbo mode is not programmable. 30 Package Programmable TJ OFFSET (R/O) When set to 1, indicates that MSR_TEMPERATURE_TARGET.[27:24] is valid and writable to specify a temperature offset. 39:31 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-sate support) 001b: C2 010b: C6 no retention 011b: C6 retention 100b: C7 101b: C7s 111: No package C-state limit. Note: This field cannot be used to limit package C-state to C3. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. 2-194 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 14:11 Reserved 15 CFG Lock (R/WO) When set, locks bits 15:0 of this register until next reset. 63:16 179H 17FH 377 383 IA32_MCG_CAP Reserved Thread Global Machine Check Capability (R/O) 7:0 Count 8 MCG_CTL_P 9 MCG_EXT_P 10 MCP_CMCI_P 11 MCG_TES_P 15:12 Reserved 23:16 MCG_EXT_CNT 24 MCG_SER_P 25 Reserved 26 MCG_ELOG_P 63:27 Reserved MSR_ERROR_CONTROL Package MC Bank Error Configuration (R/W) 0 Reserved 1 MemError Log Enable (R/W) When set, enables IMC status bank to log additional info in bits 36:32. 63:2 1A2H 418 MSR_TEMPERATURE_TARGET Reserved Package 15:0 Temperature Target Reserved 23:16 Temperature Target (RO) The minimum temperature at which PROCHOT# will be asserted. The value is degrees C. 27:24 TCC Activation Offset (R/W) Specifies a temperature offset in degrees C from the temperature target (bits 23:16). PROCHOT# will assert at the offset target temperature. Write is permitted only if MSR_PLATFORM_INFO.[30] is set. 63:28 1AEH 430 MSR_TURBO_RATIO_LIMIT1 Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 9C Maximum turbo ratio limit of 9 core active. Vol. 4 2-195 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 15:8 Package Maximum Ratio Limit for 10C Maximum turbo ratio limit of 10 core active. 23:16 Package Maximum Ratio Limit for 11C Maximum turbo ratio limit of 11 core active. 31:24 Package Maximum Ratio Limit for 12C Maximum turbo ratio limit of 12 core active. 63:32 Reserved 285H 645 IA32_MC5_CTL2 Package See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 288H 648 IA32_MC8_CTL2 Package See Table 2-2. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 28BH 651 IA32_MC11_CTL2 Package See Table 2-2. 28CH 652 IA32_MC12_CTL2 Package See Table 2-2. 28DH 653 IA32_MC13_CTL2 Package See Table 2-2. 28EH 654 IA32_MC14_CTL2 Package See Table 2-2. 28FH 655 IA32_MC15_CTL2 Package See Table 2-2. 290H 656 IA32_MC16_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 294H 660 IA32_MC20_CTL2 Package See Table 2-2. 295H 661 IA32_MC21_CTL2 Package See Table 2-2. 296H 662 IA32_MC22_CTL2 Package See Table 2-2. 297H 663 IA32_MC23_CTL2 Package See Table 2-2. 298H 664 IA32_MC24_CTL2 Package See Table 2-2. 299H 665 IA32_MC25_CTL2 Package See Table 2-2. 29AH 666 IA32_MC26_CTL2 Package See Table 2-2. 29BH 667 IA32_MC27_CTL2 Package See Table 2-2. 29CH 668 IA32_MC28_CTL2 Package See Table 2-2. 414H 1044 IA32_MC5_CTL Package 415H 1045 IA32_MC5_STATUS Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. 416H 1046 IA32_MC5_ADDR Package 417H 1047 IA32_MC5_MISC Package 2-196 Vol. 4 Bank MC5 reports MC errors from the Intel QPI module. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 418H 1048 IA32_MC6_CTL Package 419H 1049 IA32_MC6_STATUS Package 41AH 1050 IA32_MC6_ADDR Package 41BH 1051 IA32_MC6_MISC Package 41CH 1052 IA32_MC7_CTL Package 41DH 1053 IA32_MC7_STATUS Package 41EH 1054 IA32_MC7_ADDR Package 41FH 1055 IA32_MC7_MISC Package 420H 1056 IA32_MC8_CTL Package 421H 1057 IA32_MC8_STATUS Package 422H 1058 IA32_MC8_ADDR Package 423H 1059 IA32_MC8_MISC Package 424H 1060 IA32_MC9_CTL Package 425H 1061 IA32_MC9_STATUS Package 426H 1062 IA32_MC9_ADDR Package 427H 1063 IA32_MC9_MISC Package 428H 1064 IA32_MC10_CTL Package 429H 1065 IA32_MC10_STATUS Package 42AH 1066 IA32_MC10_ADDR Package 42BH 1067 IA32_MC10_MISC Package 42CH 1068 IA32_MC11_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 42DH 1069 IA32_MC11_STATUS Package 42EH 1070 IA32_MC11_ADDR Package Bank MC11 reports MC errors from a specific channel of the integrated memory controller. 42FH 1071 IA32_MC11_MISC Package 430H 1072 IA32_MC12_CTL Package 431H 1073 IA32_MC12_STATUS Package 432H 1074 IA32_MC12_ADDR Package 433H 1075 IA32_MC12_MISC Package 434H 1076 IA32_MC13_CTL Package 435H 1077 IA32_MC13_STATUS Package 436H 1078 IA32_MC13_ADDR Package 437H 1079 IA32_MC13_MISC Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC6 reports MC errors from the integrated I/O module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC7 and MC 8 report MC errors from the two home agents. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC7 and MC 8 report MC errors from the two home agents. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. Vol. 4 2-197 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 438H 1080 IA32_MC14_CTL Package 439H 1081 IA32_MC14_STATUS Package 43AH 1082 IA32_MC14_ADDR Package 43BH 1083 IA32_MC14_MISC Package 43CH 1084 IA32_MC15_CTL Package 43DH 1085 IA32_MC15_STATUS Package 43EH 1086 IA32_MC15_ADDR Package 43FH 1087 IA32_MC15_MISC Package 440H 1088 IA32_MC16_CTL Package 441H 1089 IA32_MC16_STATUS Package 442H 1090 IA32_MC16_ADDR Package 443H 1091 IA32_MC16_MISC Package 444H 1092 IA32_MC17_CTL Package 445H 1093 IA32_MC17_STATUS Package 446H 1094 IA32_MC17_ADDR Package 447H 1095 IA32_MC17_MISC Package 448H 1096 IA32_MC18_CTL Package 449H 1097 IA32_MC18_STATUS Package 44AH 1098 IA32_MC18_ADDR Package 44BH 1099 IA32_MC18_MISC Package 44CH 1100 IA32_MC19_CTL Package 44DH 1101 IA32_MC19_STATUS Package 44EH 1102 IA32_MC19_ADDR Package 44FH 1103 IA32_MC19_MISC Package 450H 1104 IA32_MC20_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 451H 1105 IA32_MC20_STATUS Package 452H 1106 IA32_MC20_ADDR Package Bank MC20 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. 453H 1107 IA32_MC20_MISC Package 454H 1108 IA32_MC21_CTL Package 455H 1109 IA32_MC21_STATUS Package 456H 1110 IA32_MC21_ADDR Package 457H 1111 IA32_MC21_MISC Package 2-198 Vol. 4 See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC17 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC18 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC19 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC21 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Register Name / Bit Fields Scope Hex Dec 458H 1112 IA32_MC22_CTL Package 459H 1113 IA32_MC22_STATUS Package 45AH 1114 IA32_MC22_ADDR Package 45BH 1115 IA32_MC22_MISC Package 45CH 1116 IA32_MC23_CTL Package 45DH 1117 IA32_MC23_STATUS Package 45EH 1118 IA32_MC23_ADDR Package 45FH 1119 IA32_MC23_MISC Package 460H 1120 IA32_MC24_CTL Package 461H 1121 IA32_MC24_STATUS Package 462H 1122 IA32_MC24_ADDR Package 463H 1123 IA32_MC24_MISC Package 464H 1124 IA32_MC25_CTL Package 465H 1125 IA32_MC25_STATUS Package 466H 1126 IA32_MC25_ADDR Package 467H 1127 IA32_MC2MISC Package 468H 1128 IA32_MC26_CTL Package 469H 1129 IA32_MC26_STATUS Package 46AH 1130 IA32_MC26_ADDR Package 46BH 1131 IA32_MC26_MISC Package 46CH 1132 IA32_MC27_CTL Package 46DH 1133 IA32_MC27_STATUS Package 46EH 1134 IA32_MC27_ADDR Package 46FH 1135 IA32_MC27_MISC Package 470H 1136 IA32_MC28_CTL Package 471H 1137 IA32_MC28_STATUS Package 472H 1138 IA32_MC28_ADDR Package 473H 1139 IA32_MC28_MISC Package 613H 1555 MSR_PKG_PERF_STATUS Package 618H 1560 MSR_DRAM_POWER_LIMIT Package Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC22 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC23 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC24 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC25 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC26 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC27 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC28 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. Package RAPL Perf Status (R/O) DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61BH 1563 MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” Vol. 4 2-199 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-26. MSRs Supported by Intel® Xeon® Processors E5 v2 Product Family (based on Ivy Bridge-E microarchitecture) (Contd.) Register Address Hex Dec 61CH 1564 Register Name / Bit Fields MSR_DRAM_POWER_INFO Scope Package Bit Description DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 639H 1593 MSR_PP0_ENERGY_STATUS Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” See Table 2-20, for other MSR definitions applicable to Intel Xeon processor E5 v2 with CPUID signature 06_3EH. 2.12.2 Additional MSRs Supported by Intel® Xeon® Processor E7 v2 Family Intel® Xeon® processor E7 v2 family (based on Ivy Bridge-E microarchitecture) with CPUID DisplayFamily_DisplayModel signature 06_3EH supports the MSR interfaces listed in Table 2-20, Table 2-26, and Table 2-27. Table 2-27. Additional MSRs Supported by Intel® Xeon® Processor E7 v2 Family with DisplayFamily_DisplayModel Signature 06_3EH Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Thread Bit Description Control Features in Intel 64 Processor (R/W) See Table 2-2. 179H 17AH 2-200 Vol. 4 377 378 0 Lock (R/WL) 1 Enable VMX Inside SMX Operation (R/WL) 2 Enable VMX Outside SMX Operation (R/WL) 14:8 SENTER Local Functions Enables (R/WL) 15 SENTER Global Functions Enable (R/WL) 63:16 Reserved IA32_MCG_CAP Thread Global Machine Check Capability (R/O) 7:0 Count 8 MCG_CTL_P 9 MCG_EXT_P 10 MCP_CMCI_P 11 MCG_TES_P 15:12 Reserved 23:16 MCG_EXT_CNT 24 MCG_SER_P 63:25 Reserved IA32_MCG_STATUS Thread Global Machine Check Status (R/WO) 0 RIPV 1 EIPV MODEL-SPECIFIC REGISTERS (MSRS) Table 2-27. Additional MSRs Supported by Intel® Xeon® Processor E7 v2 Family with DisplayFamily_DisplayModel Signature 06_3EH Register Address Hex 1AEH Register Name / Bit Fields Scope Bit Description Dec 430 2 MCIP 3 LMCE Signaled 63:4 Reserved MSR_TURBO_RATIO_LIMIT1 Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 9C Maximum turbo ratio limit of 9 core active. 15:8 Package Maximum Ratio Limit for 10C Maximum turbo ratio limit of 10core active. 23:16 Package Maximum Ratio Limit for 11C Maximum turbo ratio limit of 11 core active. 31:24 Package Maximum Ratio Limit for 12C Maximum turbo ratio limit of 12 core active. 39:32 Package Maximum Ratio Limit for 13C Maximum turbo ratio limit of 13 core active. 47:40 Package Maximum Ratio Limit for 14C Maximum turbo ratio limit of 14 core active. 55:48 Package Maximum Ratio Limit for 15C Maximum turbo ratio limit of 15 core active. 62:56 63 Reserved Package Semaphore for Turbo Ratio Limit Configuration If 1, the processor uses override configuration1 specified in MSR_TURBO_RATIO_LIMIT and MSR_TURBO_RATIO_LIMIT1. If 0, the processor uses factory-set configuration (Default). 29DH 669 IA32_MC29_CTL2 Package See Table 2-2. 29EH 670 IA32_MC30_CTL2 Package See Table 2-2. 29FH 671 IA32_MC31_CTL2 Package See Table 2-2. 3F1H 1009 MSR_PEBS_ENABLE Thread See Section 18.3.1.1.1, “Processor Event Based Sampling (PEBS).” 0 Enable PEBS on IA32_PMC0 (R/W) 1 Enable PEBS on IA32_PMC1 (R/W) 2 Enable PEBS on IA32_PMC2 (R/W) 3 Enable PEBS on IA32_PMC3 (R/W) 31:4 Reserved 32 Enable Load Latency on IA32_PMC0 (R/W) 33 Enable Load Latency on IA32_PMC1 (R/W) Vol. 4 2-201 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-27. Additional MSRs Supported by Intel® Xeon® Processor E7 v2 Family with DisplayFamily_DisplayModel Signature 06_3EH Register Address Hex 41BH Register Name / Bit Fields Scope Bit Description Dec 1051 34 Enable Load Latency on IA32_PMC2 (R/W) 35 Enable Load Latency on IA32_PMC3 (R/W) 63:36 Reserved IA32_MC6_MISC Package Misc MAC Information of Integrated I/O (R/O) See Section 15.3.2.4. 5:0 Recoverable Address LSB 8:6 Address Mode 15:9 Reserved 31:16 PCI Express Requestor ID 39:32 PCI Express Segment Number 63:32 Reserved 474H 1140 IA32_MC29_CTL Package 475H 1141 IA32_MC29_STATUS Package 476H 1142 IA32_MC29_ADDR Package 477H 1143 IA32_MC29_MISC Package 478H 1144 IA32_MC30_CTL Package 479H 1145 IA32_MC30_STATUS Package 47AH 1146 IA32_MC30_ADDR Package 47BH 1147 IA32_MC30_MISC Package 47CH 1148 IA32_MC31_CTL Package 47DH 1149 IA32_MC31_STATUS Package 47EH 1150 IA32_MC31_ADDR Package 47FH 1147 IA32_MC31_MISC Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC29 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC30 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC31 reports MC errors from a specific CBo (core broadcast) and its corresponding slice of L3. See Table 2-20, Table 2-26 for other MSR definitions applicable to Intel Xeon processor E7 v2 with CPUID signature 06_3AH. NOTES: 1. An override configuration lower than the factory-set configuration is always supported. An override configuration higher than the factory-set configuration is dependent on features specific to the processor and the platform. 2.12.3 Additional Uncore PMU MSRs in the Intel® Xeon® Processor E5 v2 and E7 v2 Families Intel Xeon Processor E5 v2 and E7 v2 families are based on the Ivy Bridge-E microarchitecture. The MSR-based uncore PMU interfaces are listed in Table 2-24 and Table 2-28. For complete detail of the uncore PMU, refer to Intel Xeon Processor E5 v2 Product Family Uncore Performance Monitoring Guide. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_3EH. 2-202 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-28. Uncore PMU MSRs in Intel® Xeon® Processor E5 v2 and E7 v2 Families Register Address Register Name / Bit Fields Scope Bit Description Hex Dec C00H 3072 MSR_PMON_GLOBAL_CTL Package Uncore Perfmon Per-Socket Global Control C01H 3073 MSR_PMON_GLOBAL_STATUS Package Uncore Perfmon Per-Socket Global Status C06H 3078 MSR_PMON_GLOBAL_CONFIG Package Uncore Perfmon Per-Socket Global Configuration C15H 3093 MSR_U_PMON_BOX_STATUS Package Uncore U-box Perfmon U-Box Wide Status C35H 3125 MSR_PCU_PMON_BOX_STATUS Package Uncore PCU Perfmon Box Wide Status D1AH 3354 MSR_C0_PMON_BOX_FILTER1 Package Uncore C-Box 0 Perfmon Box Wide Filter1 D3AH 3386 MSR_C1_PMON_BOX_FILTER1 Package Uncore C-Box 1 Perfmon Box Wide Filter1 D5AH 3418 MSR_C2_PMON_BOX_FILTER1 Package Uncore C-Box 2 Perfmon Box Wide Filter1 D7AH 3450 MSR_C3_PMON_BOX_FILTER1 Package Uncore C-Box 3 Perfmon Box Wide Filter1 D9AH 3482 MSR_C4_PMON_BOX_FILTER1 Package Uncore C-Box 4 Perfmon Box Wide Filter1 DBAH 3514 MSR_C5_PMON_BOX_FILTER1 Package Uncore C-Box 5 Perfmon Box Wide Filter1 DDAH 3546 MSR_C6_PMON_BOX_FILTER1 Package Uncore C-Box 6 Perfmon Box Wide Filter1 DFAH 3578 MSR_C7_PMON_BOX_FILTER1 Package Uncore C-Box 7 Perfmon Box Wide Filter1 E04H 3588 MSR_C8_PMON_BOX_CTL Package Uncore C-Box 8 Perfmon Local Box Wide Control E10H 3600 MSR_C8_PMON_EVNTSEL0 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 0 E11H 3601 MSR_C8_PMON_EVNTSEL1 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 1 E12H 3602 MSR_C8_PMON_EVNTSEL2 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 2 E13H 3603 MSR_C8_PMON_EVNTSEL3 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 3 E14H 3604 MSR_C8_PMON_BOX_FILTER Package Uncore C-Box 8 Perfmon Box Wide Filter E16H 3606 MSR_C8_PMON_CTR0 Package Uncore C-Box 8 Perfmon Counter 0 E17H 3607 MSR_C8_PMON_CTR1 Package Uncore C-Box 8 Perfmon Counter 1 E18H 3608 MSR_C8_PMON_CTR2 Package Uncore C-Box 8 Perfmon Counter 2 E19H 3609 MSR_C8_PMON_CTR3 Package Uncore C-Box 8 Perfmon Counter 3 E1AH 3610 MSR_C8_PMON_BOX_FILTER1 Package Uncore C-Box 8 Perfmon Box Wide Filter1 E24H 3620 MSR_C9_PMON_BOX_CTL Package Uncore C-Box 9 Perfmon Local Box Wide Control E30H 3632 MSR_C9_PMON_EVNTSEL0 Package Uncore C-Box 9 Perfmon Event Select for C-box 9 Counter 0 E31H 3633 MSR_C9_PMON_EVNTSEL1 Package Uncore C-Box 9 Perfmon Event Select for C-box 9 Counter 1 E32H 3634 MSR_C9_PMON_EVNTSEL2 Package Uncore C-Box 9 Perfmon Event Select for C-box 9 Counter 2 E33H 3635 MSR_C9_PMON_EVNTSEL3 Package Uncore C-Box 9 Perfmon Event Select for C-box 9 Counter 3 E34H 3636 MSR_C9_PMON_BOX_FILTER Package Uncore C-Box 9 Perfmon Box Wide Filter E36H 3638 MSR_C9_PMON_CTR0 Package Uncore C-Box 9 Perfmon Counter 0 Vol. 4 2-203 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-28. Uncore PMU MSRs in Intel® Xeon® Processor E5 v2 and E7 v2 Families (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec E37H 3639 MSR_C9_PMON_CTR1 Package Uncore C-Box 9 Perfmon Counter 1 E38H 3640 MSR_C9_PMON_CTR2 Package Uncore C-Box 9 Perfmon Counter 2 E39H 3641 MSR_C9_PMON_CTR3 Package Uncore C-Box 9 Perfmon Counter 3 E3AH 3642 MSR_C9_PMON_BOX_FILTER1 Package Uncore C-Box 9 Perfmon Box Wide Filter1 E44H 3652 MSR_C10_PMON_BOX_CTL Package Uncore C-Box 10 Perfmon Local Box Wide Control E50H 3664 MSR_C10_PMON_EVNTSEL0 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 0 E51H 3665 MSR_C10_PMON_EVNTSEL1 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 1 E52H 3666 MSR_C10_PMON_EVNTSEL2 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 2 E53H 3667 MSR_C10_PMON_EVNTSEL3 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 3 E54H 3668 MSR_C10_PMON_BOX_FILTER Package Uncore C-Box 10 Perfmon Box Wide Filter E56H 3670 MSR_C10_PMON_CTR0 Package Uncore C-Box 10 Perfmon Counter 0 E57H 3671 MSR_C10_PMON_CTR1 Package Uncore C-Box 10 Perfmon Counter 1 E58H 3672 MSR_C10_PMON_CTR2 Package Uncore C-Box 10 Perfmon Counter 2 E59H 3673 MSR_C10_PMON_CTR3 Package Uncore C-Box 10 Perfmon Counter 3 E5AH 3674 MSR_C10_PMON_BOX_FILTER1 Package Uncore C-Box 10 Perfmon Box Wide Filter1 E64H 3684 MSR_C11_PMON_BOX_CTL Package Uncore C-Box 11 Perfmon Local Box Wide Control E70H 3696 MSR_C11_PMON_EVNTSEL0 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 0 E71H 3697 MSR_C11_PMON_EVNTSEL1 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 1 E72H 3698 MSR_C11_PMON_EVNTSEL2 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 2 E73H 3699 MSR_C11_PMON_EVNTSEL3 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 3 E74H 3700 MSR_C11_PMON_BOX_FILTER Package Uncore C-Box 11 Perfmon Box Wide Filter E76H 3702 MSR_C11_PMON_CTR0 Package Uncore C-Box 11 Perfmon Counter 0 E77H 3703 MSR_C11_PMON_CTR1 Package Uncore C-Box 11 Perfmon Counter 1 E78H 3704 MSR_C11_PMON_CTR2 Package Uncore C-Box 11 Perfmon Counter 2 E79H 3705 MSR_C11_PMON_CTR3 Package Uncore C-Box 11 Perfmon Counter 3 E7AH 3706 MSR_C11_PMON_BOX_FILTER1 Package Uncore C-Box 11 Perfmon Box Wide Filter1 E84H 3716 MSR_C12_PMON_BOX_CTL Package Uncore C-Box 12 Perfmon Local Box Wide Control E90H 3728 MSR_C12_PMON_EVNTSEL0 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 0 E91H 3729 MSR_C12_PMON_EVNTSEL1 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 1 2-204 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-28. Uncore PMU MSRs in Intel® Xeon® Processor E5 v2 and E7 v2 Families (Contd.) Register Address Register Name / Bit Fields Scope Bit Description Hex Dec E92H 3730 MSR_C12_PMON_EVNTSEL2 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 2 E93H 3731 MSR_C12_PMON_EVNTSEL3 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 3 E94H 3732 MSR_C12_PMON_BOX_FILTER Package Uncore C-Box 12 Perfmon Box Wide Filter E96H 3734 MSR_C12_PMON_CTR0 Package Uncore C-Box 12 Perfmon Counter 0 E97H 3735 MSR_C12_PMON_CTR1 Package Uncore C-Box 12 Perfmon Counter 1 E98H 3736 MSR_C12_PMON_CTR2 Package Uncore C-Box 12 Perfmon Counter 2 E99H 3737 MSR_C12_PMON_CTR3 Package Uncore C-Box 12 Perfmon Counter 3 E9AH 3738 MSR_C12_PMON_BOX_FILTER1 Package Uncore C-Box 12 Perfmon Box Wide Filter1 EA4H 3748 MSR_C13_PMON_BOX_CTL Package Uncore C-Box 13 Perfmon Local Box Wide Control EB0H 3760 MSR_C13_PMON_EVNTSEL0 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 0 EB1H 3761 MSR_C13_PMON_EVNTSEL1 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 1 EB2H 3762 MSR_C13_PMON_EVNTSEL2 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 2 EB3H 3763 MSR_C13_PMON_EVNTSEL3 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 3 EB4H 3764 MSR_C13_PMON_BOX_FILTER Package Uncore C-Box 13 Perfmon Box Wide Filter EB6H 3766 MSR_C13_PMON_CTR0 Package Uncore C-Box 13 Perfmon Counter 0 EB7H 3767 MSR_C13_PMON_CTR1 Package Uncore C-Box 13 Perfmon Counter 1 EB8H 3768 MSR_C13_PMON_CTR2 Package Uncore C-Box 13 Perfmon Counter 2 EB9H 3769 MSR_C13_PMON_CTR3 Package Uncore C-Box 13 Perfmon Counter 3 EBAH 3770 MSR_C13_PMON_BOX_FILTER1 Package Uncore C-Box 13 Perfmon Box Wide Filter1 EC4H 3780 MSR_C14_PMON_BOX_CTL Package Uncore C-Box 14 Perfmon Local Box Wide Control ED0H 3792 MSR_C14_PMON_EVNTSEL0 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 0 ED1H 3793 MSR_C14_PMON_EVNTSEL1 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 1 ED2H 3794 MSR_C14_PMON_EVNTSEL2 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 2 ED3H 3795 MSR_C14_PMON_EVNTSEL3 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 3 ED4H 3796 MSR_C14_PMON_BOX_FILTER Package Uncore C-Box 14 Perfmon Box Wide Filter ED6H 3798 MSR_C14_PMON_CTR0 Package Uncore C-Box 14 Perfmon Counter 0 ED7H 3799 MSR_C14_PMON_CTR1 Package Uncore C-Box 14 Perfmon Counter 1 ED8H 3800 MSR_C14_PMON_CTR2 Package Uncore C-Box 14 Perfmon Counter 2 ED9H 3801 MSR_C14_PMON_CTR3 Package Uncore C-Box 14 Perfmon Counter 3 EDAH 3802 MSR_C14_PMON_BOX_FILTER1 Package Uncore C-Box 14 Perfmon Box Wide Filter1 Vol. 4 2-205 MODEL-SPECIFIC REGISTERS (MSRS) 2.13 MSRS IN THE 4TH GENERATION INTEL® CORE™ PROCESSORS (BASED ON HASWELL MICROARCHITECTURE) The 4th generation Intel® Core™ processor family and Intel® Xeon® processor E3-1200v3 product family (based on Haswell microarchitecture), with CPUID DisplayFamily_DisplayModel signature 06_3CH/06_45H/06_46H, support the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-22, and Table 2-29. For an MSR listed in Table 2-20 that also appears in Table 2-29, Table 2-29 supercede Table 2-20. The MSRs listed in Table 2-29 also apply to processors based on Haswell-E microarchitecture (see Section 2.14). Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex Dec 3BH 59 Register Name / Bit Fields IA32_TSC_ADJUST Scope Thread Bit Description Per-Logical-Processor TSC ADJUST (R/W) See Table 2-2. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. Frequency = ratio * 100 MHz. 27:16 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates TDP Limit for Turbo mode is not programmable. 31:30 32 Reserved Package Low Power Mode Support (LPM) (R/O) When set to 1, indicates that LPM is supported. When set to 0, indicates LPM is not supported. 34:33 Package Number of ConfigTDP Levels (R/O) 00: Only Base TDP level available. 01: One additional TDP level available. 02: Two additional TDP level available. 11: Reserved 39:35 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 2-206 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 55:48 Package Minimum Operating Ratio (R/O) Contains the minimum supported operating ratio in units of 100 MHz. 63:56 186H 390 IA32_PERFEVTSEL0 Reserved Thread Performance Event Select for Counter 0 (R/W) Supports all fields described inTable 2-2 and the fields below. 32 IN_TX: See Section 18.3.6.5.1. When IN_TX (bit 32) is set, AnyThread (bit 21) should be cleared to prevent incorrect results. 187H 391 IA32_PERFEVTSEL1 Thread Performance Event Select for Counter 1 (R/W) Supports all fields described inTable 2-2 and the fields below. 32 IN_TX: See Section 18.3.6.5.1. When IN_TX (bit 32) is set, AnyThread (bit 21) should be cleared to prevent incorrect results. 188H 392 IA32_PERFEVTSEL2 Thread Performance Event Select for Counter 2 (R/W) Supports all fields described inTable 2-2 and the fields below. 32 IN_TX: See Section 18.3.6.5.1. When IN_TX (bit 32) is set, AnyThread (bit 21) should be cleared to prevent incorrect results. 33 IN_TXCP: See Section 18.3.6.5.1. When IN_TXCP=1 & IN_TX=1 and in sampling, a spurious PMI may occur and transactions may continuously abort near overflow conditions. Software should favor using IN_TXCP for counting over sampling. If sampling, software should use large “sample-after” value after clearing the counter configured to use IN_TXCP and also always reset the counter even when no overflow condition was reported. 189H 393 IA32_PERFEVTSEL3 Thread Performance Event Select for Counter 3 (R/W) Supports all fields described inTable 2-2 and the fields below. 32 IN_TX: See Section 18.3.6.5.1 When IN_TX (bit 32) is set, AnyThread (bit 21) should be cleared to prevent incorrect results. 1C8H 456 MSR_LBR_SELECT Thread Last Branch Record Filtering Select Register (R/W) 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC 3 NEAR_REL_CALL Vol. 4 2-207 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex 1D9H Register Name / Bit Fields Scope Bit Description Dec 473 4 NEAR_IND_CALL 5 NEAR_RET 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 9 EN_CALL_STACK 63:9 Reserved IA32_DEBUGCTL Thread Debug Control (R/W) See Table 2-2. 491H 1169 0 LBR: Last Branch Record 1 BTF 5:2 Reserved 6 TR: Branch Trace 7 BTS: Log Branch Trace Message to BTS Buffer 8 BTINT 9 BTS_OFF_OS 10 BTS_OFF_USER 11 FREEZE_LBR_ON_PMI 12 FREEZE_PERFMON_ON_PMI 13 ENABLE_UNCORE_PMI 14 FREEZE_WHILE_SMM 15 RTM_DEBUG 63:15 Reserved IA32_VMX_VMFUNC Thread Capability Reporting Register of VM-Function Controls (R/O) See Table 2-2. 60BH 1548 MSR_PKGC_IRTL1 Package Package C6/C7 Interrupt Response Limit 1 (R/W) This MSR defines the interrupt response time limit used by the processor to manage a transition to a package C6 or C7 state. The latency programmed in this register is for the shorter-latency sub C-states used by an MWAIT hint to a C6 or C7 state. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 9:0 Interrupt Response Time Limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C6 or C7 state. 2-208 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. See Table 2-20 for supported time unit encodings. 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 60CH 1548 MSR_PKGC_IRTL2 Reserved Package Package C6/C7 Interrupt Response Limit 2 (R/W) This MSR defines the interrupt response time limit used by the processor to manage a transition to a package C6 or C7 state. The latency programmed in this register is for the longer-latency sub C-states used by an MWAIT hint to a C6 or C7 state. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. 9:0 Interrupt response time limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C6 or C7 state. 12:10 Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. See Table 2-20 for supported time unit encodings. 14:13 Reserved 15 Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-sate management. 63:16 613H 1555 MSR_PKG_PERF_STATUS Reserved Package PKG Perf Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61BH 1563 MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 648H 1608 MSR_CONFIG_TDP_NOMINAL 7:0 Package Base TDP Ratio (R/O) Config_TDP_Base Base TDP level ratio to be used for this specific processor (in units of 100 MHz). Vol. 4 2-209 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex Register Name / Bit Fields Scope Dec 63:8 649H Bit Description 1609 MSR_CONFIG_TDP_LEVEL1 Reserved Package 14:0 ConfigTDP Level 1 Ratio and Power Level (R/O) PKG_TDP_LVL1 Power setting for ConfigTDP Level 1. 15 Reserved 23:16 Config_TDP_LVL1_Ratio ConfigTDP level 1 ratio to be used for this specific processor. 31:24 Reserved 46:32 PKG_MAX_PWR_LVL1 62:47 PKG_MIN_PWR_LVL1 Max Power setting allowed for ConfigTDP Level 1. MIN Power setting allowed for ConfigTDP Level 1. 63 64AH 1610 MSR_CONFIG_TDP_LEVEL2 Reserved Package 14:0 ConfigTDP Level 2 Ratio and Power Level (R/O) PKG_TDP_LVL2 Power setting for ConfigTDP Level 2. 15 Reserved 23:16 Config_TDP_LVL2_Ratio ConfigTDP level 2 ratio to be used for this specific processor. 31:24 Reserved 46:32 PKG_MAX_PWR_LVL2 Max Power setting allowed for ConfigTDP Level 2. 62:47 PKG_MIN_PWR_LVL2 MIN Power setting allowed for ConfigTDP Level 2. 63 64BH 1611 MSR_CONFIG_TDP_CONTROL Reserved Package 1:0 ConfigTDP Control (R/W) TDP_LEVEL (RW/L) System BIOS can program this field. 30:2 Reserved 31 Config_TDP_Lock (RW/L) When this bit is set, the content of this register is locked until a reset. 63:32 64CH 2-210 Vol. 4 1612 MSR_TURBO_ACTIVATION_RATIO Reserved Package ConfigTDP Control (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-29. Additional MSRs Supported by Processors based on the Haswell or Haswell-E microarchitectures Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7:0 MAX_NON_TURBO_RATIO (RW/L) System BIOS can program this field. 30:8 Reserved 31 TURBO_ACTIVATION_RATIO_Lock (RW/L) When this bit is set, the content of this register is locked until a reset. 63:32 C80H 3200 Reserved IA32_DEBUG_INTERFACE Package Silicon Debug Feature Control (R/W) See Table 2-2. 2.13.1 MSRs in 4th Generation Intel® Core™ Processor Family (based on Haswell Microarchitecture) Table 2-30 lists model-specific registers (MSRs) that are specific to 4th generation Intel® Core™ processor family and Intel® Xeon® processor E3-1200 v3 product family (based on Haswell microarchitecture). These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_3CH/06_45H/06_46H, see Table 2-1. Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) Register Address Hex Dec E2H 226 Register Name / Bit Fields MSR_PKG_CST_CONFIG_CONTROL Scope Core Bit Description C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 3:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 0000b: C0/C1 (no package C-state support) 0001b: C2 0010b: C3 0011b: C6 0100b: C7 0101b: C7s Package C states C7 are not available to processors with a signature of 06_3CH. 9:4 Reserved 10 I/O MWAIT Redirection Enable (R/W) Vol. 4 2-211 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex 17DH Register Name / Bit Fields Scope Bit Description Dec 390 14:11 Reserved 15 CFG Lock (R/WO) 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 63:29 Reserved MSR_SMM_MCA_CAP THREAD Enhanced SMM Capabilities (SMM-RO) Reports SMM capability Enhancement. Accessible only while in SMM. 57:0 Reserved 58 SMM_Code_Access_Chk (SMM-RO) If set to 1, indicates that the SMM code access restriction is supported and the MSR_SMM_FEATURE_CONTROL is supported. 59 Long_Flow_Indication (SMM-RO) If set to 1, indicates that the SMM long flow indicator is supported and the MSR_SMM_DELAYED is supported. 63:60 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 63:32 391H 2-212 Vol. 4 913 MSR_UNC_PERF_GLOBAL_CTRL Reserved Package Uncore PMU Global Control 0 Core 0 select. 1 Core 1 select. 2 Core 2 select. 3 Core 3 select. 18:4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex 392H 394H 395H 396H Register Name / Bit Fields Scope Bit Description Dec 914 916 917 918 29 Enable all uncore counters. 30 Enable wake on PMI. 31 Enable Freezing counter when overflow. 63:32 Reserved MSR_UNC_PERF_GLOBAL_STATUS Package Uncore PMU Main Status 0 Fixed counter overflowed. 1 An ARB counter overflowed. 2 Reserved 3 A CBox counter overflowed (on any slice). 63:4 Reserved MSR_UNC_PERF_FIXED_CTRL Package Uncore Fixed Counter Control (R/W) 19:0 Reserved 20 Enable overflow propagation. 21 Reserved 22 Enable counting. 63:23 Reserved MSR_UNC_PERF_FIXED_CTR Package Uncore Fixed Counter 47:0 Current count. 63:48 Reserved MSR_UNC_CBO_CONFIG Package Uncore C-Box Configuration Information (R/O) 3:0 Encoded number of C-Box, derive value by “-1“. 63:4 Reserved 3B0H 946 MSR_UNC_ARB_PERFCTR0 Package Uncore Arb Unit, Performance Counter 0 3B1H 947 MSR_UNC_ARB_PERFCTR1 Package Uncore Arb Unit, Performance Counter 1 3B2H 944 MSR_UNC_ARB_PERFEVTSEL0 Package Uncore Arb Unit, Counter 0 Event Select MSR 3B3H 945 MSR_UNC_ARB_PERFEVTSEL1 Package Uncore Arb Unit, Counter 1 Event Select MSR 391H 913 MSR_UNC_PERF_GLOBAL_CTRL Package Uncore PMU Global Control 0 Core 0 select. 1 Core 1 select. 2 Core 2 select. 3 Core 3 select. 18:4 Reserved 29 Enable all uncore counters. 30 Enable wake on PMI. 31 Enable Freezing counter when overflow. 63:32 Reserved Vol. 4 2-213 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Dec 395H 917 Register Name / Bit Fields MSR_UNC_PERF_FIXED_CTR Scope Package Bit Description Uncore Fixed Counter 47:0 Current count. 63:48 Reserved 3B3H 945 MSR_UNC_ARB_PERFEVTSEL1 Package Uncore Arb Unit, Counter 1 Event Select MSR 4E0H 1248 MSR_SMM_FEATURE_CONTROL Package Enhanced SMM Feature Control (SMM-RW) Reports SMM capability Enhancement. Accessible only while in SMM. 0 Lock (SMM-RWO) When set to ‘1’ locks this register from further changes. 1 Reserved 2 SMM_Code_Chk_En (SMM-RW) This control bit is available only if MSR_SMM_MCA_CAP[58] == 1. When set to ‘0’ (default) none of the logical processors are prevented from executing SMM code outside the ranges defined by the SMRR. When set to ‘1’ any logical processor in the package that attempts to execute SMM code not within the ranges defined by the SMRR will assert an unrecoverable MCE. 63:3 4E2H 1250 MSR_SMM_DELAYED Reserved Package SMM Delayed (SMM-RO) Reports the interruptible state of all logical processors in the package. Available only while in SMM and MSR_SMM_MCA_CAP[LONG_FLOW_INDICATION] == 1. N-1:0 LOG_PROC_STATE (SMM-RO) Each bit represents a logical processor of its state in a long flow of internal operation which delays servicing an interrupt. The corresponding bit will be set at the start of long events such as: Microcode Update Load, C6, WBINVD, Ratio Change, Throttle. The bit is automatically cleared at the end of each long event. The reset value of this field is 0. Only bit positions below N = CPUID.(EAX=0BH, ECX=PKG_LVL):EBX[15:0] can be updated. 63:N 4E3H 1251 MSR_SMM_BLOCKED Reserved Package SMM Blocked (SMM-RO) Reports the blocked state of all logical processors in the package. Available only while in SMM. 2-214 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec N-1:0 LOG_PROC_STATE (SMM-RO) Each bit represents a logical processor of its blocked state to service an SMI. The corresponding bit will be set if the logical processor is in one of the following states: Wait For SIPI or SENTER Sleep. The reset value of this field is 0FFFH. Only bit positions below N = CPUID.(EAX=0BH, ECX=PKG_LVL):EBX[15:0] can be updated. 63:N 606H 1542 Reserved MSR_RAPL_POWER_UNIT Package 3:0 Package Unit Multipliers Used in RAPL Interfaces (R/O) Power Units See Section 14.9.1, “RAPL Interfaces.” 7:4 Package Reserved 12:8 Package Energy Status Units Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 0EH (or 61 micro-joules). 15:13 Package Reserved 19:16 Package Time Units See Section 14.9.1, “RAPL Interfaces.” 63:20 639H 1593 MSR_PP0_ENERGY_STATUS Reserved Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 640H 1600 MSR_PP1_POWER_LIMIT Package PP1 RAPL Power Limit Control (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 641H 1601 MSR_PP1_ENERGY_STATUS Package PP1 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 642H 1602 MSR_PP1_POLICY Package PP1 Balance Policy (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 690H 1680 MSR_CORE_PERF_LIMIT_REASONS Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 0 PROCHOT Status (R0) When set, processor core frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 3:2 Reserved Vol. 4 2-215 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 4 Graphics Driver Status (R0) When set, frequency is reduced below the operating system request due to Processor Graphics driver override. 5 Autonomous Utilization-Based Frequency Control Status (R0) When set, frequency is reduced below the operating system request because the processor has detected that utilization is low. 6 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 7 Reserved 8 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 9 Core Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to domain-level power limiting. 10 Package-Level Power Limiting PL1 Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL1. 11 Package-Level PL2 Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL2. 12 Max Turbo Limit Status (R0) When set, frequency is reduced below the operating system request due to multi-core turbo limits. 13 Turbo Transition Attenuation Status (R0) When set, frequency is reduced below the operating system request due to Turbo transition attenuation. This prevents performance degradation due to frequent operating ratio changes. 15:14 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 2-216 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19:18 20 Reserved Graphics Driver Log When set, indicates that the Graphics Driver Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 21 Autonomous Utilization-Based Frequency Control Log When set, indicates that the Autonomous UtilizationBased Frequency Control Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 24 Reserved Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Core Power Limiting Log When set, indicates that the Core Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 26 Package-Level PL1 Power Limiting Log When set, indicates that the Package Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-217 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 27 Package-Level PL2 Power Limiting Log When set, indicates that the Package Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Max Turbo Limit Log When set, indicates that the Max Turbo Limit Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 29 Turbo Transition Attenuation Log When set, indicates that the Turbo Transition Attenuation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:30 6B0H 1712 MSR_GRAPHICS_PERF_LIMIT_REASONS Reserved Package Indicator of Frequency Clipping in the Processor Graphics (R/W) (Frequency refers to processor graphics frequency.) 0 PROCHOT Status (R0) When set, frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 3:2 Reserved 4 Graphics Driver Status (R0) When set, frequency is reduced below the operating system request due to Processor Graphics driver override. 5 Autonomous Utilization-Based Frequency Control Status (R0) When set, frequency is reduced below the operating system request because the processor has detected that utilization is low. 6 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 7 2-218 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 8 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 9 Graphics Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to domain-level power limiting. 10 Package-Level Power Limiting PL1 Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL1. 11 Package-Level PL2 Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL2. 15:12 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19:18 Reserved 20 Graphics Driver Log When set, indicates that the Graphics Driver Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 21 Autonomous Utilization-Based Frequency Control Log When set, indicates that the Autonomous UtilizationBased Frequency Control Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-219 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23 Reserved 24 Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Core Power Limiting Log When set, indicates that the Core Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 26 Package-Level PL1 Power Limiting Log When set, indicates that the Package Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 27 Package-Level PL2 Power Limiting Log When set, indicates that the Package Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Max Turbo Limit Log When set, indicates that the Max Turbo Limit Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 29 Turbo Transition Attenuation Log When set, indicates that the Turbo Transition Attenuation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:30 6B1H 1713 MSR_RING_PERF_LIMIT_REASONS Reserved Package Indicator of Frequency Clipping in the Ring Interconnect (R/W) (Frequency refers to ring interconnect in the uncore.) 0 PROCHOT Status (R0) When set, frequency is reduced below the operating system request due to assertion of external PROCHOT. 2-220 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 5:2 6 Reserved VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 7 Reserved 8 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 9 Reserved 10 Package-Level Power Limiting PL1 Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL1. 11 Package-Level PL2 Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to package-level power limiting PL2. 15:12 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19:18 Reserved. 20 Graphics Driver Log When set, indicates that the Graphics Driver Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-221 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 21 Autonomous Utilization-Based Frequency Control Log When set, indicates that the Autonomous UtilizationBased Frequency Control Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 Reserved 24 Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Core Power Limiting Log When set, indicates that the Core Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 26 Package-Level PL1 Power Limiting Log When set, indicates that the Package Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 27 Package-Level PL2 Power Limiting Log When set, indicates that the Package Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Max Turbo Limit Log When set, indicates that the Max Turbo Limit Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 2-222 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-30. MSRs Supported by 4th Generation Intel® Core™ Processors (Haswell Microarchitecture) (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 29 Turbo Transition Attenuation Log When set, indicates that the Turbo Transition Attenuation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:30 Reserved 700H 1792 MSR_UNC_CBO_0_PERFEVTSEL0 Package Uncore C-Box 0, Counter 0 Event Select MSR 701H 1793 MSR_UNC_CBO_0_PERFEVTSEL1 Package Uncore C-Box 0, Counter 1 Event Select MSR 706H 1798 MSR_UNC_CBO_0_PERFCTR0 Package Uncore C-Box 0, Performance Counter 0 707H 1799 MSR_UNC_CBO_0_PERFCTR1 Package Uncore C-Box 0, Performance Counter 1 710H 1808 MSR_UNC_CBO_1_PERFEVTSEL0 Package Uncore C-Box 1, Counter 0 Event Select MSR 711H 1809 MSR_UNC_CBO_1_PERFEVTSEL1 Package Uncore C-Box 1, Counter 1 Event Select MSR 716H 1814 MSR_UNC_CBO_1_PERFCTR0 Package Uncore C-Box 1, Performance Counter 0 717H 1815 MSR_UNC_CBO_1_PERFCTR1 Package Uncore C-Box 1, Performance Counter 1 720H 1824 MSR_UNC_CBO_2_PERFEVTSEL0 Package Uncore C-Box 2, Counter 0 Event Select MSR 721H 1824 MSR_UNC_CBO_2_PERFEVTSEL1 Package Uncore C-Box 2, Counter 1 Event Select MSR 726H 1830 MSR_UNC_CBO_2_PERFCTR0 Package Uncore C-Box 2, Performance Counter 0 727H 1831 MSR_UNC_CBO_2_PERFCTR1 Package Uncore C-Box 2, Performance Counter 1 730H 1840 MSR_UNC_CBO_3_PERFEVTSEL0 Package Uncore C-Box 3, Counter 0 Event Select MSR 731H 1841 MSR_UNC_CBO_3_PERFEVTSEL1 Package Uncore C-Box 3, Counter 1 Event Select MSR 736H 1846 MSR_UNC_CBO_3_PERFCTR0 Package Uncore C-Box 3, Performance Counter 0 737H 1847 MSR_UNC_CBO_3_PERFCTR1 Package Uncore C-Box 3, Performance Counter 1 See Table 2-20, Table 2-21, Table 2-22, Table 2-25, Table 2-29 for other MSR definitions applicable to processors with CPUID signatures 063CH, 06_46H. 2.13.2 Additional Residency MSRs Supported in 4th Generation Intel® Core™ Processors The 4th generation Intel® Core™ processor family (based on Haswell microarchitecture) with CPUID DisplayFamily_DisplayModel signature 06_45H supports the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-29, Table 2-30, and Table 2-31. Vol. 4 2-223 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-31. Additional Residency MSRs Supported by 4th Generation Intel® Core™ Processors with DisplayFamily_DisplayModel Signature 06_45H Register Address Hex Dec E2H 226 Register Name / Bit Fields MSR_PKG_CST_CONFIG_CONTROL Scope Core Bit Description C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 3:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 0000b: C0/C1 (no package C-state support) 0001b: C2 0010b: C3 0011b: C6 0100b: C7 0101b: C7s 0110b: C8 0111b: C9 1000b: C10 630H 1584 9:4 Reserved 10 I/O MWAIT Redirection Enable (R/W) 14:11 Reserved 15 CFG Lock (R/WO) 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 63:29 Reserved MSR_PKG_C8_RESIDENCY 59:0 Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C8 Residency Counter (R/O) Value since last reset that this package is in processorspecific C8 states. Count at the same frequency as the TSC. 63:60 2-224 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-31. Additional Residency MSRs Supported by 4th Generation Intel® Core™ Processors with DisplayFamily_DisplayModel Signature 06_45H Register Address Hex Dec 631H 1585 Register Name / Bit Fields MSR_PKG_C9_RESIDENCY Scope Package 59:0 Bit Description Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C9 Residency Counter (R/O) Value since last reset that this package is in processorspecific C9 states. Count at the same frequency as the TSC. 63:60 632H 1586 MSR_PKG_C10_RESIDENCY Reserved Package 59:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. Package C10 Residency Counter (R/O) Value since last reset that this package is in processorspecific C10 states. Count at the same frequency as the TSC. 63:60 Reserved See Table 2-20, Table 2-21, Table 2-22, Table 2-29, Table 2-30 for other MSR definitions applicable to processors with CPUID signature 06_45H. 2.14 MSRS IN INTEL® XEON® PROCESSOR E5 V3 AND E7 V3 PRODUCT FAMILY Intel® Xeon® processor E5 v3 family and Intel® Xeon® processor E7 v3 family are based on Haswell-E microarchitecture (CPUID DisplayFamily_DisplayModel = 06_3F). These processors supports the MSR interfaces listed in Table 2-20, Table 2-29, and Table 2-32. Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Dec 35H 53 Register Name / Bit Fields MSR_CORE_THREAD_COUNT Scope Package Bit Description Configured State of Enabled Processor Core Count and Logical Processor Count (RO) • After a Power-On RESET, enumerates factory configuration of the number of processor cores and logical processors in the physical package. • Following the sequence of (i) BIOS modified a Configuration Mask which selects a subset of processor cores to be active post RESET and (ii) a RESET event after the modification, enumerates the current configuration of enabled processor core count and logical processor count in the physical package. 15:0 Core_COUNT (RO) The number of processor cores that are currently enabled (by either factory configuration or BIOS configuration) in the physical package. Vol. 4 2-225 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 31:16 THREAD_COUNT (RO) The number of logical processors that are currently enabled (by either factory configuration or BIOS configuration) in the physical package. 63:32 53H 83 MSR_THREAD_ID_INFO Reserved Thread 7:0 A Hardware Assigned ID for the Logical Processor (RO) Logical_Processor_ID (RO) An implementation-specific numerical value physically assigned to each logical processor. This ID is not related to Initial APIC ID or x2APIC ID, it is unique within a physical package. 63:8 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-state support) 001b: C2 010b: C6 (non-retention) 011b: C6 (retention) 111b: No Package C state limits. All C states supported by the processor are available. 2-226 Vol. 4 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) 14:11 Reserved 15 CFG Lock (R/WO) 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 29 Package C State Demotion Enable (R/W) 30 Package C State UnDemotion Enable (R/W) 63:31 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Dec 179H 377 17DH 390 Register Name / Bit Fields IA32_MCG_CAP Scope Thread Bit Description Global Machine Check Capability (R/O) 7:0 Count 8 MCG_CTL_P 9 MCG_EXT_P 10 MCP_CMCI_P 11 MCG_TES_P 15:12 Reserved 23:16 MCG_EXT_CNT 24 MCG_SER_P 25 MCG_EM_P 26 MCG_ELOG_P 63:27 Reserved MSR_SMM_MCA_CAP Thread Enhanced SMM Capabilities (SMM-RO) Reports SMM capability Enhancement. Accessible only while in SMM. 57:0 Reserved 58 SMM_Code_Access_Chk (SMM-RO) If set to 1, indicates that the SMM code access restriction is supported and a host-space interface available to SMM handler. 59 Long_Flow_Indication (SMM-RO) If set to 1, indicates that the SMM long flow indicator is supported and a host-space interface available to SMM handler. 63:60 17FH 383 MSR_ERROR_CONTROL Reserved Package 0 MC Bank Error Configuration (R/W) Reserved 1 MemError Log Enable (R/W) When set, enables IMC status bank to log additional info in bits 36:32. 63:2 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. Vol. 4 2-227 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 39:32 Package Maximum Ratio Limit for 5C Maximum turbo ratio limit of 5 core active. 47:40 Package Maximum Ratio Limit for 6C Maximum turbo ratio limit of 6 core active. 55:48 Package Maximum Ratio Limit for 7C Maximum turbo ratio limit of 7 core active. 63:56 Package Maximum Ratio Limit for 8C Maximum turbo ratio limit of 8 core active. 1AEH 430 MSR_TURBO_RATIO_LIMIT1 Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 9C Maximum turbo ratio limit of 9 core active. 15:8 Package Maximum Ratio Limit for 10C Maximum turbo ratio limit of 10 core active. 23:16 Package Maximum Ratio Limit for 11C Maximum turbo ratio limit of 11 core active. 31:24 Package Maximum Ratio Limit for 12C Maximum turbo ratio limit of 12 core active. 39:32 Package Maximum Ratio Limit for 13C Maximum turbo ratio limit of 13 core active. 47:40 Package Maximum Ratio Limit for 14C Maximum turbo ratio limit of 14 core active. 55:48 Package Maximum Ratio Limit for 15C Maximum turbo ratio limit of 15 core active. 63:56 Package Maximum Ratio Limit for16C Maximum turbo ratio limit of 16 core active. 1AFH 431 MSR_TURBO_RATIO_LIMIT2 Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 17C Maximum turbo ratio limit of 17 core active. 15:8 Package Maximum Ratio Limit for 18C Maximum turbo ratio limit of 18 core active. 62:16 2-228 Vol. 4 Package Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 63 Package Semaphore for Turbo Ratio Limit Configuration If 1, the processor uses override configuration1 specified in MSR_TURBO_RATIO_LIMIT, MSR_TURBO_RATIO_LIMIT1 and MSR_TURBO_RATIO_LIMIT2. If 0, the processor uses factory-set configuration (Default). 414H 1044 IA32_MC5_CTL Package 415H 1045 IA32_MC5_STATUS Package 416H 1046 IA32_MC5_ADDR Package 417H 1047 IA32_MC5_MISC Package 418H 1048 IA32_MC6_CTL Package 419H 1049 IA32_MC6_STATUS Package 41AH 1050 IA32_MC6_ADDR Package 41BH 1051 IA32_MC6_MISC Package 41CH 1052 IA32_MC7_CTL Package 41DH 1053 IA32_MC7_STATUS Package 41EH 1054 IA32_MC7_ADDR Package 41FH 1055 IA32_MC7_MISC Package 420H 1056 IA32_MC8_CTL Package 421H 1057 IA32_MC8_STATUS Package 422H 1058 IA32_MC8_ADDR Package 423H 1059 IA32_MC8_MISC Package 424H 1060 IA32_MC9_CTL Package 425H 1061 IA32_MC9_STATUS Package 426H 1062 IA32_MC9_ADDR Package 427H 1063 IA32_MC9_MISC Package 428H 1064 IA32_MC10_CTL Package 429H 1065 IA32_MC10_STATUS Package 42AH 1066 IA32_MC10_ADDR Package 42BH 1067 IA32_MC10_MISC Package 42CH 1068 IA32_MC11_CTL Package 42DH 1069 IA32_MC11_STATUS Package 42EH 1070 IA32_MC11_ADDR Package 42FH 1071 IA32_MC11_MISC Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC5 reports MC errors from the Intel QPI 0 module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC6 reports MC errors from the integrated I/O module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC7 reports MC errors from the home agent HA 0. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC8 reports MC errors from the home agent HA 1. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. Vol. 4 2-229 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Register Name / Bit Fields Scope Hex Dec 430H 1072 IA32_MC12_CTL Package 431H 1073 IA32_MC12_STATUS Package 432H 1074 IA32_MC12_ADDR Package 433H 1075 IA32_MC12_MISC Package 434H 1076 IA32_MC13_CTL Package 435H 1077 IA32_MC13_STATUS Package 436H 1078 IA32_MC13_ADDR Package 437H 1079 IA32_MC13_MISC Package 438H 1080 IA32_MC14_CTL Package 439H 1081 IA32_MC14_STATUS Package 43AH 1082 IA32_MC14_ADDR Package 43BH 1083 IA32_MC14_MISC Package 43CH 1084 IA32_MC15_CTL Package 43DH 1085 IA32_MC15_STATUS Package 43EH 1086 IA32_MC15_ADDR Package 43FH 1087 IA32_MC15_MISC Package 440H 1088 IA32_MC16_CTL Package 441H 1089 IA32_MC16_STATUS Package 442H 1090 IA32_MC16_ADDR Package 443H 1091 IA32_MC16_MISC Package 444H 1092 IA32_MC17_CTL Package 445H 1093 IA32_MC17_STATUS Package 446H 1094 IA32_MC17_ADDR Package 447H 1095 IA32_MC17_MISC Package 448H 1096 IA32_MC18_CTL Package 449H 1097 IA32_MC18_STATUS Package 44AH 1098 IA32_MC18_ADDR Package 44BH 1099 IA32_MC18_MISC Package 44CH 1100 IA32_MC19_CTL Package 44DH 1101 IA32_MC19_STATUS Package 44EH 1102 IA32_MC19_ADDR Package 44FH 1103 IA32_MC19_MISC Package 450H 1104 IA32_MC20_CTL Package 451H 1105 IA32_MC20_STATUS Package 452H 1106 IA32_MC20_ADDR Package 453H 1107 IA32_MC20_MISC Package 2-230 Vol. 4 Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC17 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo0, CBo3, CBo6, CBo9, CBo12, CBo15. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC18 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo1, CBo4, CBo7, CBo10, CBo13, CBo16. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC19 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo2, CBo5, CBo8, CBo11, CBo14, CBo17. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC20 reports MC errors from the Intel QPI 1 module. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 454H 1108 IA32_MC21_CTL Package 455H 1109 IA32_MC21_STATUS Package 456H 1110 IA32_MC21_ADDR Package 457H 1111 IA32_MC21_MISC Package 606H 1542 MSR_RAPL_POWER_UNIT Package Unit Multipliers Used in RAPL Interfaces (R/O) 3:0 Package Power Units See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC21 reports MC errors from the Intel QPI 2 module. See Section 14.9.1, “RAPL Interfaces.” 7:4 Package Reserved 12:8 Package Energy Status Units Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 0EH (or 61 micro-joules). 15:13 Package Reserved 19:16 Package Time Units See Section 14.9.1, “RAPL Interfaces.” 63:20 618H 1560 MSR_DRAM_POWER_LIMIT Reserved Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) Energy Consumed by DRAM devices. 61BH 1563 31:0 Energy in 15.3 micro-joules. Requires BIOS configuration to enable DRAM RAPL mode 0 (Direct VR). 63:32 Reserved MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 61EH 1566 MSR_PCIE_PLL_RATIO Package 1:0 Package Configuration of PCIE PLL Relative to BCLK(R/W) PCIE Ratio (R/W) 00b: Use 5:5 mapping for100MHz operation (default). 01b: Use 5:4 mapping for125MHz operation. 10b: Use 5:3 mapping for166MHz operation. 11b: Use 5:2 mapping for250MHz operation. 2 Package LPLL Select (R/W) If 1, use configured setting of PCIE Ratio. 3 Package LONG RESET (R/W) If 1, wait an additional time-out before re-locking Gen2/Gen3 PLLs. 63:4 Reserved Vol. 4 2-231 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Dec 620H 1568 Register Name / Bit Fields MSR UNCORE_RATIO_LIMIT Scope Package Bit Description Uncore Ratio Limit (R/W) Out of reset, the min_ratio and max_ratio fields represent the widest possible range of uncore frequencies. Writing to these fields allows software to control the minimum and the maximum frequency that hardware will select. 63:15 Reserved 14:8 MIN_RATIO Writing to this field controls the minimum possible ratio of the LLC/Ring. 7 Reserved 6:0 MAX_RATIO This field is used to limit the max ratio of the LLC/Ring. 639H 1593 MSR_PP0_ENERGY_STATUS Package Reserved (R/O) Reads return 0. 690H 1680 MSR_CORE_PERF_LIMIT_REASONS Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 0 PROCHOT Status (R0) When set, processor core frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 2 Power Budget Management Status (R0) When set, frequency is reduced below the operating system request due to PBM limit 3 Platform Configuration Services Status (R0) When set, frequency is reduced below the operating system request due to PCS limit 4 Reserved 5 Autonomous Utilization-Based Frequency Control Status (R0) When set, frequency is reduced below the operating system request because the processor has detected that utilization is low. 6 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 7 2-232 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 8 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 9 Reserved 10 Multi-Core Turbo Status (R0) When set, frequency is reduced below the operating system request due to Multi-Core Turbo limits. 12:11 Reserved 13 Core Frequency P1 Status (R0) When set, frequency is reduced below max non-turbo P1. 14 Core Max N-Core Turbo Frequency Limiting Status (R0) When set, frequency is reduced below max n-core turbo frequency. 15 Core Frequency Limiting Status (R0) When set, frequency is reduced below the operating system request. 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 18 Power Budget Management Log When set, indicates that the PBM Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19 Platform Configuration Services Log When set, indicates that the PCS Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 20 Reserved Vol. 4 2-233 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 21 Autonomous Utilization-Based Frequency Control Log When set, indicates that the AUBFC Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 Reserved 24 Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Reserved 26 Multi-Core Turbo Log When set, indicates that the Multi-Core Turbo Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28:27 Reserved 29 Core Frequency P1 Log When set, indicates that the Core Frequency P1 Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 30 Core Max N-Core Turbo Frequency Limiting Log When set, indicates that the Core Max n-core Turbo Frequency Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 31 Core Frequency Limiting Log When set, indicates that the Core Frequency Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:32 C8DH 3213 IA32_QM_EVTSEL Reserved THREAD Monitoring Event Select Register (R/W) If CPUID.(EAX=07H, ECX=0):EBX.RDT-M[bit 12] = 1. 2-234 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-32. Additional MSRs Supported by Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7:0 EventID (RW) Event encoding: 0x0: No monitoring. 0x1: L3 occupancy monitoring. All other encoding reserved. C8EH 3214 31:8 Reserved 41:32 RMID (RW) 63:42 Reserved IA32_QM_CTR THREAD Monitoring Counter Register (R/O) If CPUID.(EAX=07H, ECX=0):EBX.RDT-M[bit 12] = 1. C8FH 3215 61:0 Resource Monitored Data 62 Unavailable: If 1, indicates data for this RMID is not available or not monitored for this resource or RMID. 63 Error: If 1, indicates and unsupported RMID or event type was written to IA32_PQR_QM_EVTSEL. IA32_PQR_ASSOC THREAD Resource Association Register (R/W) 9:0 RMID 63: 10 Reserved See Table 2-20, Table 2-29 for other MSR definitions applicable to processors with CPUID signature 06_3FH. NOTES: 1. An override configuration lower than the factory-set configuration is always supported. An override configuration higher than the factory-set configuration is dependent on features specific to the processor and the platform. Additional Uncore PMU MSRs in the Intel® Xeon® Processor E5 v3 Family 2.14.1 Intel Xeon Processor E5 v3 and E7 v3 family are based on the Haswell-E microarchitecture. The MSR-based uncore PMU interfaces are listed in Table 2-33. For complete detail of the uncore PMU, refer to Intel Xeon Processor E5 v3 Product Family Uncore Performance Monitoring Guide. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_3FH. Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 700H MSR_PMON_GLOBAL_CTL Package Uncore Perfmon Per-Socket Global Control 701H MSR_PMON_GLOBAL_STATUS Package Uncore Perfmon Per-Socket Global Status 702H MSR_PMON_GLOBAL_CONFIG Package Uncore Perfmon Per-Socket Global Configuration 703H MSR_U_PMON_UCLK_FIXED_CTL Package Uncore U-Box UCLK Fixed Counter Control 704H MSR_U_PMON_UCLK_FIXED_CTR Package Uncore U-Box UCLK Fixed Counter 705H MSR_U_PMON_EVNTSEL0 Package Uncore U-Box Perfmon Event Select for U-Box Counter 0 Vol. 4 2-235 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 706H MSR_U_PMON_EVNTSEL1 Package Uncore U-Box Perfmon Event Select for U-Box Counter 1 708H MSR_U_PMON_BOX_STATUS Package Uncore U-Box Perfmon U-Box Wide Status 709H MSR_U_PMON_CTR0 Package Uncore U-Box Perfmon Counter 0 70AH MSR_U_PMON_CTR1 Package Uncore U-Box Perfmon Counter 1 710H MSR_PCU_PMON_BOX_CTL Package Uncore PCU Perfmon for PCU-Box-Wide Control 711H MSR_PCU_PMON_EVNTSEL0 Package Uncore PCU Perfmon Event Select for PCU Counter 0 712H MSR_PCU_PMON_EVNTSEL1 Package Uncore PCU Perfmon Event Select for PCU Counter 1 713H MSR_PCU_PMON_EVNTSEL2 Package Uncore PCU Perfmon Event Select for PCU Counter 2 714H MSR_PCU_PMON_EVNTSEL3 Package Uncore PCU Perfmon Event Select for PCU Counter 3 715H MSR_PCU_PMON_BOX_FILTER Package Uncore PCU Perfmon Box-Wide Filter 716H MSR_PCU_PMON_BOX_STATUS Package Uncore PCU Perfmon Box Wide Status 717H MSR_PCU_PMON_CTR0 Package Uncore PCU Perfmon Counter 0 718H MSR_PCU_PMON_CTR1 Package Uncore PCU Perfmon Counter 1 719H MSR_PCU_PMON_CTR2 Package Uncore PCU Perfmon Counter 2 71AH MSR_PCU_PMON_CTR3 Package Uncore PCU Perfmon Counter 3 720H MSR_S0_PMON_BOX_CTL Package Uncore SBo 0 Perfmon for SBo 0 Box-Wide Control 721H MSR_S0_PMON_EVNTSEL0 Package Uncore SBo 0 Perfmon Event Select for SBo 0 Counter 0 722H MSR_S0_PMON_EVNTSEL1 Package Uncore SBo 0 Perfmon Event Select for SBo 0 Counter 1 723H MSR_S0_PMON_EVNTSEL2 Package Uncore SBo 0 Perfmon Event Select for SBo 0 Counter 2 724H MSR_S0_PMON_EVNTSEL3 Package Uncore SBo 0 Perfmon Event Select for SBo 0 Counter 3 725H MSR_S0_PMON_BOX_FILTER Package Uncore SBo 0 Perfmon Box-Wide Filter 726H MSR_S0_PMON_CTR0 Package Uncore SBo 0 Perfmon Counter 0 727H MSR_S0_PMON_CTR1 Package Uncore SBo 0 Perfmon Counter 1 728H MSR_S0_PMON_CTR2 Package Uncore SBo 0 Perfmon Counter 2 729H MSR_S0_PMON_CTR3 Package Uncore SBo 0 Perfmon Counter 3 72AH MSR_S1_PMON_BOX_CTL Package Uncore SBo 1 Perfmon for SBo 1 Box-Wide Control 72BH MSR_S1_PMON_EVNTSEL0 Package Uncore SBo 1 Perfmon Event Select for SBo 1 Counter 0 72CH MSR_S1_PMON_EVNTSEL1 Package Uncore SBo 1 Perfmon Event Select for SBo 1 Counter 1 72DH MSR_S1_PMON_EVNTSEL2 Package Uncore SBo 1 Perfmon Event Select for SBo 1 Counter 2 72EH MSR_S1_PMON_EVNTSEL3 Package Uncore SBo 1 Perfmon Event Select for SBo 1 Counter 3 72FH MSR_S1_PMON_BOX_FILTER Package Uncore SBo 1 Perfmon Box-Wide Filter 2-236 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 730H MSR_S1_PMON_CTR0 Package Uncore SBo 1 Perfmon Counter 0 731H MSR_S1_PMON_CTR1 Package Uncore SBo 1 Perfmon Counter 1 732H MSR_S1_PMON_CTR2 Package Uncore SBo 1 Perfmon Counter 2 733H MSR_S1_PMON_CTR3 Package Uncore SBo 1 Perfmon Counter 3 734H MSR_S2_PMON_BOX_CTL Package Uncore SBo 2 Perfmon for SBo 2 Box-Wide Control 735H MSR_S2_PMON_EVNTSEL0 Package Uncore SBo 2 Perfmon Event Select for SBo 2 Counter 0 736H MSR_S2_PMON_EVNTSEL1 Package Uncore SBo 2 Perfmon Event Select for SBo 2 Counter 1 737H MSR_S2_PMON_EVNTSEL2 Package Uncore SBo 2 Perfmon Event Select for SBo 2 Counter 2 738H MSR_S2_PMON_EVNTSEL3 Package Uncore SBo 2 Perfmon Event Select for SBo 2 Counter 3 739H MSR_S2_PMON_BOX_FILTER Package Uncore SBo 2 Perfmon Box-Wide Filter 73AH MSR_S2_PMON_CTR0 Package Uncore SBo 2 Perfmon Counter 0 73BH MSR_S2_PMON_CTR1 Package Uncore SBo 2 Perfmon Counter 1 73CH MSR_S2_PMON_CTR2 Package Uncore SBo 2 Perfmon Counter 2 73DH MSR_S2_PMON_CTR3 Package Uncore SBo 2 Perfmon Counter 3 73EH MSR_S3_PMON_BOX_CTL Package Uncore SBo 3 Perfmon for SBo 3 Box-Wide Control 73FH MSR_S3_PMON_EVNTSEL0 Package Uncore SBo 3 Perfmon Event Select for SBo 3 Counter 0 740H MSR_S3_PMON_EVNTSEL1 Package Uncore SBo 3 Perfmon Event Select for SBo 3 Counter 1 741H MSR_S3_PMON_EVNTSEL2 Package Uncore SBo 3 Perfmon Event Select for SBo 3 Counter 2 742H MSR_S3_PMON_EVNTSEL3 Package Uncore SBo 3 Perfmon Event Select for SBo 3 Counter 3 743H MSR_S3_PMON_BOX_FILTER Package Uncore SBo 3 Perfmon Box-Wide Filter 744H MSR_S3_PMON_CTR0 Package Uncore SBo 3 Perfmon Counter 0 745H MSR_S3_PMON_CTR1 Package Uncore SBo 3 Perfmon Counter 1 746H MSR_S3_PMON_CTR2 Package Uncore SBo 3 Perfmon Counter 2 747H MSR_S3_PMON_CTR3 Package Uncore SBo 3 Perfmon Counter 3 E00H MSR_C0_PMON_BOX_CTL Package Uncore C-Box 0 Perfmon for Box-Wide Control E01H MSR_C0_PMON_EVNTSEL0 Package Uncore C-Box 0 Perfmon Event Select for C-Box 0 Counter 0 E02H MSR_C0_PMON_EVNTSEL1 Package Uncore C-Box 0 Perfmon Event Select for C-Box 0 Counter 1 E03H MSR_C0_PMON_EVNTSEL2 Package Uncore C-Box 0 Perfmon Event Select for C-Box 0 Counter 2 E04H MSR_C0_PMON_EVNTSEL3 Package Uncore C-Box 0 Perfmon Event Select for C-Box 0 Counter 3 Vol. 4 2-237 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec E05H MSR_C0_PMON_BOX_FILTER0 Package Uncore C-Box 0 Perfmon Box Wide Filter 0 E06H MSR_C0_PMON_BOX_FILTER1 Package Uncore C-Box 0 Perfmon Box Wide Filter 1 E07H MSR_C0_PMON_BOX_STATUS Package Uncore C-Box 0 Perfmon Box Wide Status E08H MSR_C0_PMON_CTR0 Package Uncore C-Box 0 Perfmon Counter 0 E09H MSR_C0_PMON_CTR1 Package Uncore C-Box 0 Perfmon Counter 1 E0AH MSR_C0_PMON_CTR2 Package Uncore C-Box 0 Perfmon Counter 2 E0BH MSR_C0_PMON_CTR3 Package Uncore C-Box 0 Perfmon Counter 3 E10H MSR_C1_PMON_BOX_CTL Package Uncore C-Box 1 Perfmon for Box-Wide Control E11H MSR_C1_PMON_EVNTSEL0 Package Uncore C-Box 1 Perfmon Event Select for C-Box 1 Counter 0 E12H MSR_C1_PMON_EVNTSEL1 Package Uncore C-Box 1 Perfmon Event Select for C-Box 1 Counter 1 E13H MSR_C1_PMON_EVNTSEL2 Package Uncore C-Box 1 Perfmon Event Select for C-Box 1 Counter 2 E14H MSR_C1_PMON_EVNTSEL3 Package Uncore C-Box 1 Perfmon Event Select for C-Box 1 Counter 3 E15H MSR_C1_PMON_BOX_FILTER0 Package Uncore C-Box 1 Perfmon Box Wide Filter 0 E16H MSR_C1_PMON_BOX_FILTER1 Package Uncore C-Box 1 Perfmon Box Wide Filter1 E17H MSR_C1_PMON_BOX_STATUS Package Uncore C-Box 1 Perfmon Box Wide Status E18H MSR_C1_PMON_CTR0 Package Uncore C-Box 1 Perfmon Counter 0 E19H MSR_C1_PMON_CTR1 Package Uncore C-Box 1 Perfmon Counter 1 E1AH MSR_C1_PMON_CTR2 Package Uncore C-Box 1 Perfmon Counter 2 E1BH MSR_C1_PMON_CTR3 Package Uncore C-Box 1 Perfmon Counter 3 E20H MSR_C2_PMON_BOX_CTL Package Uncore C-Box 2 Perfmon for Box-Wide Control E21H MSR_C2_PMON_EVNTSEL0 Package Uncore C-Box 2 Perfmon Event Select for C-Box 2 Counter 0 E22H MSR_C2_PMON_EVNTSEL1 Package Uncore C-Box 2 Perfmon Event Select for C-Box 2 Counter 1 E23H MSR_C2_PMON_EVNTSEL2 Package Uncore C-Box 2 Perfmon Event Select for C-Box 2 Counter 2 E24H MSR_C2_PMON_EVNTSEL3 Package Uncore C-Box 2 Perfmon Event select for C-Box 2 Counter 3 E25H MSR_C2_PMON_BOX_FILTER0 Package Uncore C-Box 2 Perfmon Box Wide Filter 0 E26H MSR_C2_PMON_BOX_FILTER1 Package Uncore C-Box 2 Perfmon Box Wide Filter1 E27H MSR_C2_PMON_BOX_STATUS Package Uncore C-Box 2 Perfmon Box Wide Status E28H MSR_C2_PMON_CTR0 Package Uncore C-Box 2 Perfmon Counter 0 E29H MSR_C2_PMON_CTR1 Package Uncore C-Box 2 Perfmon Counter 1 E2AH MSR_C2_PMON_CTR2 Package Uncore C-Box 2 Perfmon Counter 2 E2BH MSR_C2_PMON_CTR3 Package Uncore C-Box 2 Perfmon Counter 3 2-238 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec E30H MSR_C3_PMON_BOX_CTL Package Uncore C-Box 3 Perfmon for Box-Wide Control E31H MSR_C3_PMON_EVNTSEL0 Package Uncore C-Box 3 Perfmon Event Select for C-Box 3 Counter 0 E32H MSR_C3_PMON_EVNTSEL1 Package Uncore C-Box 3 Perfmon Event Select for C-Box 3 Counter 1 E33H MSR_C3_PMON_EVNTSEL2 Package Uncore C-Box 3 Perfmon Event Select for C-Box 3 Counter 2 E34H MSR_C3_PMON_EVNTSEL3 Package Uncore C-Box 3 Perfmon Event Select for C-Box 3 Counter 3 E35H MSR_C3_PMON_BOX_FILTER0 Package Uncore C-Box 3 Perfmon Box Wide Filter 0 E36H MSR_C3_PMON_BOX_FILTER1 Package Uncore C-Box 3 Perfmon Box Wide Filter1 E37H MSR_C3_PMON_BOX_STATUS Package Uncore C-Box 3 Perfmon Box Wide Status E38H MSR_C3_PMON_CTR0 Package Uncore C-Box 3 Perfmon Counter 0 E39H MSR_C3_PMON_CTR1 Package Uncore C-Box 3 Perfmon Counter 1 E3AH MSR_C3_PMON_CTR2 Package Uncore C-Box 3 Perfmon Counter 2 E3BH MSR_C3_PMON_CTR3 Package Uncore C-Box 3 Perfmon Counter 3 E40H MSR_C4_PMON_BOX_CTL Package Uncore C-Box 4 Perfmon for Box-Wide Control E41H MSR_C4_PMON_EVNTSEL0 Package Uncore C-Box 4 Perfmon Event Select for C-Box 4 Counter 0 E42H MSR_C4_PMON_EVNTSEL1 Package Uncore C-Box 4 Perfmon Event Select for C-Box 4 Counter 1 E43H MSR_C4_PMON_EVNTSEL2 Package Uncore C-Box 4 Perfmon Event Select for C-Box 4 Counter 2 E44H MSR_C4_PMON_EVNTSEL3 Package Uncore C-Box 4 Perfmon Event Select for C-Box 4 Counter 3 E45H MSR_C4_PMON_BOX_FILTER0 Package Uncore C-Box 4 Perfmon Box Wide Filter 0 E46H MSR_C4_PMON_BOX_FILTER1 Package Uncore C-Box 4 Perfmon Box Wide Filter1 E47H MSR_C4_PMON_BOX_STATUS Package Uncore C-Box 4 Perfmon Box Wide Status E48H MSR_C4_PMON_CTR0 Package Uncore C-Box 4 Perfmon Counter 0 E49H MSR_C4_PMON_CTR1 Package Uncore C-Box 4 Perfmon Counter 1 E4AH MSR_C4_PMON_CTR2 Package Uncore C-Box 4 Perfmon Counter 2 E4BH MSR_C4_PMON_CTR3 Package Uncore C-Box 4 Perfmon Counter 3 E50H MSR_C5_PMON_BOX_CTL Package Uncore C-Box 5 Perfmon for Box-Wide Control E51H MSR_C5_PMON_EVNTSEL0 Package Uncore C-Box 5 Perfmon Event Select for C-Box 5 Counter 0 E52H MSR_C5_PMON_EVNTSEL1 Package Uncore C-Box 5 Perfmon Event Select for C-Box 5 Counter 1 E53H MSR_C5_PMON_EVNTSEL2 Package Uncore C-Box 5 Perfmon Event Select for C-Box 5 Counter 2 E54H MSR_C5_PMON_EVNTSEL3 Package Uncore C-Box 5 Perfmon Event Select for C-Box 5 Counter 3 Vol. 4 2-239 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec E55H MSR_C5_PMON_BOX_FILTER0 Package Uncore C-Box 5 Perfmon Box Wide Filter 0 E56H MSR_C5_PMON_BOX_FILTER1 Package Uncore C-Box 5 Perfmon Box Wide Filter 1 E57H MSR_C5_PMON_BOX_STATUS Package Uncore C-Box 5 Perfmon Box Wide Status E58H MSR_C5_PMON_CTR0 Package Uncore C-Box 5 Perfmon Counter 0 E59H MSR_C5_PMON_CTR1 Package Uncore C-Box 5 Perfmon Counter 1 E5AH MSR_C5_PMON_CTR2 Package Uncore C-Box 5 Perfmon Counter 2 E5BH MSR_C5_PMON_CTR3 Package Uncore C-Box 5 Perfmon Counter 3 E60H MSR_C6_PMON_BOX_CTL Package Uncore C-Box 6 Perfmon for Box-Wide Control E61H MSR_C6_PMON_EVNTSEL0 Package Uncore C-Box 6 Perfmon Event Select for C-Box 6 Counter 0 E62H MSR_C6_PMON_EVNTSEL1 Package Uncore C-Box 6 Perfmon Event Select for C-Box 6 Counter 1 E63H MSR_C6_PMON_EVNTSEL2 Package Uncore C-Box 6 Perfmon Event Select for C-Box 6 Counter 2 E64H MSR_C6_PMON_EVNTSEL3 Package Uncore C-Box 6 Perfmon Event Select for C-Box 6 Counter 3 E65H MSR_C6_PMON_BOX_FILTER0 Package Uncore C-Box 6 Perfmon Box Wide Filter 0 E66H MSR_C6_PMON_BOX_FILTER1 Package Uncore C-Box 6 Perfmon Box Wide Filter 1 E67H MSR_C6_PMON_BOX_STATUS Package Uncore C-Box 6 Perfmon Box Wide Status E68H MSR_C6_PMON_CTR0 Package Uncore C-Box 6 Perfmon Counter 0 E69H MSR_C6_PMON_CTR1 Package Uncore C-Box 6 Perfmon Counter 1 E6AH MSR_C6_PMON_CTR2 Package Uncore C-Box 6 Perfmon Counter 2 E6BH MSR_C6_PMON_CTR3 Package Uncore C-Box 6 Perfmon Counter 3 E70H MSR_C7_PMON_BOX_CTL Package Uncore C-Box 7 Perfmon for Box-Wide Control E71H MSR_C7_PMON_EVNTSEL0 Package Uncore C-Box 7 Perfmon Event Select for C-Box 7 Counter 0 E72H MSR_C7_PMON_EVNTSEL1 Package Uncore C-Box 7 Perfmon Event Select for C-Box 7 Counter 1 E73H MSR_C7_PMON_EVNTSEL2 Package Uncore C-Box 7 Perfmon Event Select for C-Box 7 Counter 2 E74H MSR_C7_PMON_EVNTSEL3 Package Uncore C-Box 7 Perfmon Event Select for C-Box 7 Counter 3 E75H MSR_C7_PMON_BOX_FILTER0 Package Uncore C-Box 7 Perfmon Box Wide Filter 0 E76H MSR_C7_PMON_BOX_FILTER1 Package Uncore C-Box 7 Perfmon Box Wide Filter 1 E77H MSR_C7_PMON_BOX_STATUS Package Uncore C-Box 7 Perfmon Box Wide Status E78H MSR_C7_PMON_CTR0 Package Uncore C-Box 7 Perfmon Counter 0 E79H MSR_C7_PMON_CTR1 Package Uncore C-Box 7 Perfmon Counter 1 E7AH MSR_C7_PMON_CTR2 Package Uncore C-Box 7 Perfmon Counter 2 E7BH MSR_C7_PMON_CTR3 Package Uncore C-Box 7 Perfmon Counter 3 2-240 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec E80H MSR_C8_PMON_BOX_CTL Package Uncore C-Box 8 Perfmon Local Box Wide Control E81H MSR_C8_PMON_EVNTSEL0 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 0 E82H MSR_C8_PMON_EVNTSEL1 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 1 E83H MSR_C8_PMON_EVNTSEL2 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 2 E84H MSR_C8_PMON_EVNTSEL3 Package Uncore C-Box 8 Perfmon Event Select for C-Box 8 Counter 3 E85H MSR_C8_PMON_BOX_FILTER0 Package Uncore C-Box 8 Perfmon Box Wide Filter 0 E86H MSR_C8_PMON_BOX_FILTER1 Package Uncore C-Box 8 Perfmon Box Wide Filter 1 E87H MSR_C8_PMON_BOX_STATUS Package Uncore C-Box 8 Perfmon Box Wide Status E88H MSR_C8_PMON_CTR0 Package Uncore C-Box 8 Perfmon Counter 0 E89H MSR_C8_PMON_CTR1 Package Uncore C-Box 8 Perfmon Counter 1 E8AH MSR_C8_PMON_CTR2 Package Uncore C-Box 8 Perfmon Counter 2 E8BH MSR_C8_PMON_CTR3 Package Uncore C-Box 8 Perfmon Counter 3 E90H MSR_C9_PMON_BOX_CTL Package Uncore C-Box 9 Perfmon Local Box Wide Control E91H MSR_C9_PMON_EVNTSEL0 Package Uncore C-Box 9 Perfmon Event Select for C-Box 9 Counter 0 E92H MSR_C9_PMON_EVNTSEL1 Package Uncore C-Box 9 Perfmon Event Select for C-Box 9 Counter 1 E93H MSR_C9_PMON_EVNTSEL2 Package Uncore C-Box 9 Perfmon Event Select for C-Box 9 Counter 2 E94H MSR_C9_PMON_EVNTSEL3 Package Uncore C-Box 9 Perfmon Event Select for C-Box 9 Counter 3 E95H MSR_C9_PMON_BOX_FILTER0 Package Uncore C-Box 9 Perfmon Box Wide Filter 0 E96H MSR_C9_PMON_BOX_FILTER1 Package Uncore C-Box 9 Perfmon Box Wide Filter 1 E97H MSR_C9_PMON_BOX_STATUS Package Uncore C-Box 9 Perfmon Box Wide Status E98H MSR_C9_PMON_CTR0 Package Uncore C-Box 9 Perfmon Counter 0 E99H MSR_C9_PMON_CTR1 Package Uncore C-Box 9 Perfmon Counter 1 E9AH MSR_C9_PMON_CTR2 Package Uncore C-Box 9 Perfmon Counter 2 E9BH MSR_C9_PMON_CTR3 Package Uncore C-Box 9 Perfmon Counter 3 EA0H MSR_C10_PMON_BOX_CTL Package Uncore C-Box 10 Perfmon Local Box Wide Control EA1H MSR_C10_PMON_EVNTSEL0 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 0 EA2H MSR_C10_PMON_EVNTSEL1 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 1 EA3H MSR_C10_PMON_EVNTSEL2 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 2 EA4H MSR_C10_PMON_EVNTSEL3 Package Uncore C-Box 10 Perfmon Event Select for C-Box 10 Counter 3 Vol. 4 2-241 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec EA5H MSR_C10_PMON_BOX_FILTER0 Package Uncore C-Box 10 Perfmon Box Wide Filter 0 EA6H MSR_C10_PMON_BOX_FILTER1 Package Uncore C-Box 10 Perfmon Box Wide Filter 1 EA7H MSR_C10_PMON_BOX_STATUS Package Uncore C-Box 10 Perfmon Box Wide Status EA8H MSR_C10_PMON_CTR0 Package Uncore C-Box 10 Perfmon Counter 0 EA9H MSR_C10_PMON_CTR1 Package Uncore C-Box 10 perfmon Counter 1 EAAH MSR_C10_PMON_CTR2 Package Uncore C-Box 10 Perfmon Counter 2 EABH MSR_C10_PMON_CTR3 Package Uncore C-Box 10 Perfmon Counter 3 EB0H MSR_C11_PMON_BOX_CTL Package Uncore C-Box 11 Perfmon Local Box Wide Control EB1H MSR_C11_PMON_EVNTSEL0 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 0 EB2H MSR_C11_PMON_EVNTSEL1 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 1 EB3H MSR_C11_PMON_EVNTSEL2 Package Uncore C-Box 11 Perfmon Event Select for C-Box 11 Counter 2 EB4H MSR_C11_PMON_EVNTSEL3 Package Uncore C-box 11 Perfmon Event Select for C-Box 11 Counter 3 EB5H MSR_C11_PMON_BOX_FILTER0 Package Uncore C-Box 11 Perfmon Box Wide Filter 0 EB6H MSR_C11_PMON_BOX_FILTER1 Package Uncore C-Box 11 Perfmon Box Wide Filter 1 EB7H MSR_C11_PMON_BOX_STATUS Package Uncore C-Box 11 Perfmon Box Wide Status EB8H MSR_C11_PMON_CTR0 Package Uncore C-Box 11 Perfmon Counter 0 EB9H MSR_C11_PMON_CTR1 Package Uncore C-Box 11 Perfmon Counter 1 EBAH MSR_C11_PMON_CTR2 Package Uncore C-Box 11 Perfmon Counter 2 EBBH MSR_C11_PMON_CTR3 Package Uncore C-Box 11 Perfmon Counter 3 EC0H MSR_C12_PMON_BOX_CTL Package Uncore C-Box 12 Perfmon Local Box Wide Control EC1H MSR_C12_PMON_EVNTSEL0 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 0 EC2H MSR_C12_PMON_EVNTSEL1 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 1 EC3H MSR_C12_PMON_EVNTSEL2 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 2 EC4H MSR_C12_PMON_EVNTSEL3 Package Uncore C-Box 12 Perfmon Event Select for C-Box 12 Counter 3 EC5H MSR_C12_PMON_BOX_FILTER0 Package Uncore C-Box 12 Perfmon Box Wide Filter 0 EC6H MSR_C12_PMON_BOX_FILTER1 Package Uncore C-Box 12 Perfmon Box Wide Filter 1 EC7H MSR_C12_PMON_BOX_STATUS Package Uncore C-Box 12 Perfmon Box Wide Status EC8H MSR_C12_PMON_CTR0 Package Uncore C-Box 12 Perfmon Counter 0 EC9H MSR_C12_PMON_CTR1 Package Uncore C-Box 12 Perfmon Counter 1 ECAH MSR_C12_PMON_CTR2 Package Uncore C-Box 12 Perfmon Counter 2 ECBH MSR_C12_PMON_CTR3 Package Uncore C-Box 12 Perfmon Counter 3 2-242 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec ED0H MSR_C13_PMON_BOX_CTL Package Uncore C-Box 13 Perfmon local box wide control. ED1H MSR_C13_PMON_EVNTSEL0 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 0 ED2H MSR_C13_PMON_EVNTSEL1 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 1 ED3H MSR_C13_PMON_EVNTSEL2 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 2 ED4H MSR_C13_PMON_EVNTSEL3 Package Uncore C-Box 13 Perfmon Event Select for C-Box 13 Counter 3 ED5H MSR_C13_PMON_BOX_FILTER0 Package Uncore C-Box 13 Perfmon Box Wide Filter 0 ED6H MSR_C13_PMON_BOX_FILTER1 Package Uncore C-Box 13 Perfmon Box Wide Filter 1 ED7H MSR_C13_PMON_BOX_STATUS Package Uncore C-Box 13 Perfmon Box Wide Status ED8H MSR_C13_PMON_CTR0 Package Uncore C-Box 13 Perfmon Counter 0 ED9H MSR_C13_PMON_CTR1 Package Uncore C-Box 13 Perfmon Counter 1 EDAH MSR_C13_PMON_CTR2 Package Uncore C-Box 13 Perfmon Counter 2 EDBH MSR_C13_PMON_CTR3 Package Uncore C-Box 13 Perfmon Counter 3 EE0H MSR_C14_PMON_BOX_CTL Package Uncore C-Box 14 Perfmon Local Box Wide Control EE1H MSR_C14_PMON_EVNTSEL0 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 0 EE2H MSR_C14_PMON_EVNTSEL1 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 1 EE3H MSR_C14_PMON_EVNTSEL2 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 2 EE4H MSR_C14_PMON_EVNTSEL3 Package Uncore C-Box 14 Perfmon Event Select for C-Box 14 Counter 3 EE5H MSR_C14_PMON_BOX_FILTER Package Uncore C-Box 14 Perfmon Box Wide Filter 0 EE6H MSR_C14_PMON_BOX_FILTER1 Package Uncore C-Box 14 Perfmon Box Wide Filter 1 EE7H MSR_C14_PMON_BOX_STATUS Package Uncore C-Box 14 Perfmon Box Wide Status EE8H MSR_C14_PMON_CTR0 Package Uncore C-Box 14 Perfmon Counter 0 EE9H MSR_C14_PMON_CTR1 Package Uncore C-Box 14 Perfmon Counter 1 EEAH MSR_C14_PMON_CTR2 Package Uncore C-Box 14 Perfmon Counter 2 EEBH MSR_C14_PMON_CTR3 Package Uncore C-Box 14 Perfmon Counter 3 EF0H MSR_C15_PMON_BOX_CTL Package Uncore C-Box 15 Perfmon Local Box Wide Control EF1H MSR_C15_PMON_EVNTSEL0 Package Uncore C-Box 15 Perfmon Event Select for C-Box 15 Counter 0 EF2H MSR_C15_PMON_EVNTSEL1 Package Uncore C-Box 15 Perfmon Event Select for C-Box 15 Counter 1 EF3H MSR_C15_PMON_EVNTSEL2 Package Uncore C-Box 15 Perfmon Event Select for C-Box 15 Counter 2 EF4H MSR_C15_PMON_EVNTSEL3 Package Uncore C-Box 15 Perfmon Event Select for C-Box 15 Counter 3 Vol. 4 2-243 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-33. Uncore PMU MSRs in Intel® Xeon® Processor E5 v3 Family (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec EF5H MSR_C15_PMON_BOX_FILTER0 Package Uncore C-Box 15 Perfmon Box Wide Filter 0 EF6H MSR_C15_PMON_BOX_FILTER1 Package Uncore C-Box 15 Perfmon Box Wide Filter 1 EF7H MSR_C15_PMON_BOX_STATUS Package Uncore C-Box 15 Perfmon Box Wide Status EF8H MSR_C15_PMON_CTR0 Package Uncore C-Box 15 Perfmon Counter 0 EF9H MSR_C15_PMON_CTR1 Package Uncore C-Box 15 Perfmon Counter 1 EFAH MSR_C15_PMON_CTR2 Package Uncore C-Box 15 Perfmon Counter 2 EFBH MSR_C15_PMON_CTR3 Package Uncore C-Box 15 Perfmon Counter 3 F00H MSR_C16_PMON_BOX_CTL Package Uncore C-Box 16 Perfmon for Box-Wide Control F01H MSR_C16_PMON_EVNTSEL0 Package Uncore C-Box 16 Perfmon Event Select for C-Box 16 Counter 0 F02H MSR_C16_PMON_EVNTSEL1 Package Uncore C-Box 16 Perfmon Event Select for C-Box 16 Counter 1 F03H MSR_C16_PMON_EVNTSEL2 Package Uncore C-Box 16 Perfmon Event Select for C-Box 16 Counter 2 F04H MSR_C16_PMON_EVNTSEL3 Package Uncore C-Box 16 Perfmon Event Select for C-Box 16 Counter 3 F05H MSR_C16_PMON_BOX_FILTER0 Package Uncore C-Box 16 Perfmon Box Wide Filter 0 F06H MSR_C16_PMON_BOX_FILTER1 Package Uncore C-Box 16 Perfmon Box Wide Filter 1 F07H MSR_C16_PMON_BOX_STATUS Package Uncore C-Box 16 Perfmon Box Wide Status F08H MSR_C16_PMON_CTR0 Package Uncore C-Box 16 Perfmon Counter 0 F09H MSR_C16_PMON_CTR1 Package Uncore C-Box 16 Perfmon Counter 1 F0AH MSR_C16_PMON_CTR2 Package Uncore C-Box 16 Perfmon Counter 2 E0BH MSR_C16_PMON_CTR3 Package Uncore C-Box 16 Perfmon Counter 3 F10H MSR_C17_PMON_BOX_CTL Package Uncore C-Box 17 Perfmon for Box-Wide Control F11H MSR_C17_PMON_EVNTSEL0 Package Uncore C-Box 17 Perfmon Event Select for C-Box 17 Counter 0 F12H MSR_C17_PMON_EVNTSEL1 Package Uncore C-Box 17 Perfmon Event Select for C-Box 17 Counter 1 F13H MSR_C17_PMON_EVNTSEL2 Package Uncore C-Box 17 Perfmon Event Select for C-Box 17 Counter 2 F14H MSR_C17_PMON_EVNTSEL3 Package Uncore C-Box 17 Perfmon Event Select for C-Box 17 Counter 3 F15H MSR_C17_PMON_BOX_FILTER0 Package Uncore C-Box 17 Perfmon Box Wide Filter 0 F16H MSR_C17_PMON_BOX_FILTER1 Package Uncore C-Box 17 Perfmon Box Wide Filter1 F17H MSR_C17_PMON_BOX_STATUS Package Uncore C-Box 17 Perfmon Box Wide Status F18H MSR_C17_PMON_CTR0 Package Uncore C-Box 17 Perfmon Counter 0 F19H MSR_C17_PMON_CTR1 Package Uncore C-Box 17 Perfmon Counter 1 F1AH MSR_C17_PMON_CTR2 Package Uncore C-Box 17 Perfmon Counter 2 F1BH MSR_C17_PMON_CTR3 Package Uncore C-Box 17 Perfmon Counter 3 2-244 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) 2.15 MSRS IN INTEL® CORE™ M PROCESSORS AND 5TH GENERATION INTEL CORE PROCESSORS The Intel® Core™ M-5xxx processors and 5th generation Intel® Core™ Processors, and Intel® Xeon® Processor E3-1200 v4 family are based on the Broadwell microarchitecture. The Intel® Core™ M-5xxx processors and 5th generation Intel® Core™ Processors have CPUID DisplayFamily_DisplayModel signature 06_3DH. Intel® Xeon® Processor E3-1200 v4 family and the 5th generation Intel® Core™ Processors have CPUID DisplayFamily_DisplayModel signature 06_47H. Processors with signatures 06_3DH and 06_47H support the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-22, Table 2-25, Table 2-29, Table 2-30, Table 2-34, and Table 2-35. For an MSR listed in Table 2-35 that also appears in the model-specific tables of prior generations, Table 2-35 supercede prior generation tables. Table 2-34 lists MSRs that are common to processors based on the Broadwell microarchitectures (including CPUID signatures 06_3DH, 06_47H, 06_4FH, and 06_56H). Table 2-34. Additional MSRs Common to Processors Based the Broadwell Microarchitectures Register Address Hex Dec 38EH 910 Register Name / Bit Fields IA32_PERF_GLOBAL_STATUS Scope Thread Bit Description See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 0 Ovf_PMC0 1 Ovf_PMC1 2 Ovf_PMC2 3 Ovf_PMC3 31:4 Reserved 32 Ovf_FixedCtr0 33 Ovf_FixedCtr1 34 Ovf_FixedCtr2 54:35 Reserved 55 Trace_ToPA_PMI See Section 35.2.6.2, “Table of Physical Addresses (ToPA).” 390H 912 60:56 Reserved 61 Ovf_Uncore 62 Ovf_BufDSSAVE 63 CondChgd IA32_PERF_GLOBAL_OVF_CTRL Thread See Table 2-2. See Section 18.6.2.2, “Global Counter Control Facilities.” 0 Set 1 to clear Ovf_PMC0 1 Set 1 to clear Ovf_PMC1 2 Set 1 to clear Ovf_PMC2 3 Set 1 to clear Ovf_PMC3 31:4 Reserved 32 Set 1 to clear Ovf_FixedCtr0 33 Set 1 to clear Ovf_FixedCtr1 Vol. 4 2-245 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-34. Additional MSRs Common to Processors Based the Broadwell Microarchitectures Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 34 Set 1 to clear Ovf_FixedCtr2 54:35 Reserved. 55 Set 1 to clear Trace_ToPA_PMI. See Section 35.2.6.2, “Table of Physical Addresses (ToPA).” 560H 1376 60:56 Reserved 61 Set 1 to clear Ovf_Uncore 62 Set 1 to clear Ovf_BufDSSAVE 63 Set 1 to clear CondChgd IA32_RTIT_OUTPUT_BASE THREAD Reserved MAXPHYADDR1-1:7 Base physical address. 63:MAXPHYADDR 561H 570H 1377 1392 Trace Output Base Register (R/W) 6:0 IA32_RTIT_OUTPUT_MASK_PTRS Reserved THREAD Trace Output Mask Pointers Register (R/W) 6:0 Reserved 31:7 MaskOrTableOffset 63:32 Output Offset. IA32_RTIT_CTL Thread Trace Control Register (R/W) 0 TraceEn 1 Reserved, must be zero. 2 OS 3 User 6:4 Reserved, must be zero. 7 CR3 filter 8 ToPA Writing 0 will #GP if also setting TraceEn. 571H 1393 2-246 Vol. 4 9 Reserved, must be zero. 10 TSCEn 11 DisRETC 12 Reserved, must be zero. 13 Reserved; writing 0 will #GP if also setting TraceEn. 63:14 Reserved, must be zero. IA32_RTIT_STATUS Thread Tracing Status Register (R/W) 0 Reserved, writes ignored. 1 ContexEn, writes ignored. 2 TriggerEn, writes ignored. 3 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-34. Additional MSRs Common to Processors Based the Broadwell Microarchitectures Register Address Hex 572H Register Name / Bit Fields Scope Bit Description Dec 1394 620H 4 Error (R/W) 5 Stopped 63:6 Reserved, must be zero. IA32_RTIT_CR3_MATCH THREAD Trace Filter CR3 Match Register (R/W) 4:0 Reserved 63:5 CR3[63:5] value to match. MSR UNCORE_RATIO_LIMIT Package Uncore Ratio Limit (R/W) Out of reset, the min_ratio and max_ratio fields represent the widest possible range of uncore frequencies. Writing to these fields allows software to control the minimum and the maximum frequency that hardware will select. 63:15 Reserved 14:8 MIN_RATIO Writing to this field controls the minimum possible ratio of the LLC/Ring. 7 Reserved 6:0 MAX_RATIO This field is used to limit the max ratio of the LLC/Ring. NOTES: 1. MAXPHYADDR is reported by CPUID.80000008H:EAX[7:0]. Table 2-35 lists MSRs that are specific to Intel Core M processors and 5th Generation Intel Core Processors. Table 2-35. Additional MSRs Supported by Intel® Core™ M Processors and 5th Generation Intel® Core™ Processors Register Address Hex Dec E2H 226 Register Name MSR_PKG_CST_CONFIG_CONTROL Scope Core Bit Description C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. Vol. 4 2-247 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-35. Additional MSRs Supported by Intel® Core™ M Processors and 5th Generation Intel® Core™ Processors Register Address Hex Register Name Scope Bit Description Dec 3:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 0000b: C0/C1 (no package C-state support) 0001b: C2 0010b: C3 0011b: C6 0100b: C7 0101b: C7s 0110b: C8 0111b: C9 1000b: C10 1ADH 429 9:4 Reserved 10 I/O MWAIT Redirection Enable (R/W) 14:11 Reserved 15 CFG Lock (R/WO) 24:16 Reserved 25 C3 State Auto Demotion Enable (R/W) 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 29 Enable Package C-State Auto-Demotion (R/W) 30 Enable Package C-State Undemotion (R/W) 63:31 Reserved MSR_TURBO_RATIO_LIMIT Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 2-248 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-35. Additional MSRs Supported by Intel® Core™ M Processors and 5th Generation Intel® Core™ Processors Register Address Hex Register Name Scope Bit Description Dec 39:32 Package Maximum Ratio Limit for 5C Maximum turbo ratio limit of 5core active. 47:40 Package Maximum Ratio Limit for 6C Maximum turbo ratio limit of 6core active. 63:48 639H 1593 Reserved MSR_PP0_ENERGY_STATUS Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” See Table 2-20, Table 2-21, Table 2-22, Table 2-25, Table 2-29, Table 2-30, Table 2-34 for other MSR definitions applicable to processors with CPUID signature 06_3DH. 2.16 MSRS IN INTEL® XEON® PROCESSORS E5 V4 FAMILY The MSRs listed in Table 2-36 are available and common to Intel® Xeon® Processor D product Family (CPUID DisplayFamily_DisplayModel = 06_56H) and to Intel Xeon processors E5 v4, E7 v4 families (CPUID DisplayFamily_DisplayModel = 06_4FH). They are based on the Broadwell microarchitecture. See Section 2.16.1 for lists of tables of MSRs that are supported by Intel® Xeon® Processor D Family. Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Dec 4EH 78 Register Name / Bit Fields MSR_PPIN_CTL Scope Package 0 Bit Description Protected Processor Inventory Number Enable Control (R/W) LockOut (R/WO) See Table 2-26. 1 Enable_PPIN (R/W) See Table 2-26. 63:2 4FH 79 MSR_PPIN Reserved Package 63:0 Protected Processor Inventory Number (R/O) Protected Processor Inventory Number (R/O) See Table 2-26. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) See Table 2-26. 22:16 Reserved. Vol. 4 2-249 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23 Package PPIN_CAP (R/O) See Table 2-26. 27:24 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) See Table 2-26. 29 Package Programmable TDP Limit for Turbo Mode (R/O) See Table 2-26. 30 Package Programmable TJ OFFSET (R/O) See Table 2-26. 39:31 47:40 Reserved Package Maximum Efficiency Ratio (R/O) See Table 2-26. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-state support) 001b: C2 010b: C6 (non-retention) 011b: C6 (retention) 111b: No Package C state limits. All C states supported by the processor are available. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) 14:11 Reserved 15 CFG Lock (R/WO) 16 Automatic C-State Conversion Enable (R/W) If 1, the processor will convert HALT or MWAT(C1) to MWAIT(C6). 2-250 Vol. 4 24:17 Reserved 25 C3 State Auto Demotion Enable (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex 179H 17DH Register Name / Bit Fields Scope Bit Description Dec 377 390 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 29 Package C State Demotion Enable (R/W) 30 Package C State UnDemotion Enable (R/W) 63:31 Reserved IA32_MCG_CAP Thread Global Machine Check Capability (R/O) 7:0 Count 8 MCG_CTL_P 9 MCG_EXT_P 10 MCP_CMCI_P 11 MCG_TES_P 15:12 Reserved 23:16 MCG_EXT_CNT 24 MCG_SER_P 25 MCG_EM_P 26 MCG_ELOG_P 63:27 Reserved MSR_SMM_MCA_CAP Thread Enhanced SMM Capabilities (SMM-RO) Reports SMM capability Enhancement. Accessible only while in SMM. 57:0 Reserved 58 SMM_Code_Access_Chk (SMM-RO) If set to 1, indicates that the SMM code access restriction is supported and a host-space interface available to SMM handler. 59 Long_Flow_Indication (SMM-RO) If set to 1, indicates that the SMM long flow indicator is supported and a host-space interface available to SMM handler. 63:60 19CH 412 IA32_THERM_STATUS Reserved Core Thermal Monitor Status (R/W) See Table 2-2. 0 Thermal Status (RO) See Table 2-2. 1 Thermal Status Log (R/WC0) See Table 2-2. Vol. 4 2-251 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2 PROTCHOT # or FORCEPR# Status (RO) See Table 2-2. 3 PROTCHOT # or FORCEPR# Log (R/WC0) See Table 2-2. 4 Critical Temperature Status (RO) See Table 2-2. 5 Critical Temperature Status Log (R/WC0) See Table 2-2. 6 Thermal Threshold #1 Status (RO) See Table 2-2. 7 Thermal Threshold #1 Log (R/WC0) See Table 2-2. 8 Thermal Threshold #2 Status (RO) See Table 2-2. 9 Thermal Threshold #2 Log (R/WC0) See Table 2-2. 10 Power Limitation Status (RO) See Table 2-2. 11 Power Limitation Log (R/WC0) See Table 2-2. 12 Current Limit Status (RO) See Table 2-2. 13 Current Limit Log (R/WC0) See Table 2-2. 14 Cross Domain Limit Status (RO) See Table 2-2. 15 Cross Domain Limit Log (R/WC0) See Table 2-2. 22:16 Digital Readout (RO) See Table 2-2. 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) See Table 2-2. 31 Reading Valid (RO) See Table 2-2. 63:32 1A2H 418 MSR_TEMPERATURE_TARGET 15:0 2-252 Vol. 4 Reserved Package Temperature Target Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23:16 Temperature Target (RO) See Table 2-26. 27:24 TCC Activation Offset (R/W) See Table 2-26. 63:28 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved. Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 1AEH 430 7:0 Package Maximum Ratio Limit for 1C 15:8 Package Maximum Ratio Limit for 2C 23:16 Package Maximum Ratio Limit for 3C 31:24 Package Maximum Ratio Limit for 4C 39:32 Package Maximum Ratio Limit for 5C 47:40 Package Maximum Ratio Limit for 6C 55:48 Package Maximum Ratio Limit for 7C 63:56 Package Maximum Ratio Limit for 8C MSR_TURBO_RATIO_LIMIT1 Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 606H 1542 7:0 Package Maximum Ratio Limit for 9C 15:8 Package Maximum Ratio Limit for 10C 23:16 Package Maximum Ratio Limit for 11C 31:24 Package Maximum Ratio Limit for 12C 39:32 Package Maximum Ratio Limit for 13C 47:40 Package Maximum Ratio Limit for 14C 55:48 Package Maximum Ratio Limit for 15C 63:56 Package Maximum Ratio Limit for 16C MSR_RAPL_POWER_UNIT Package Unit Multipliers Used in RAPL Interfaces (R/O) 3:0 Package Power Units See Section 14.9.1, “RAPL Interfaces.” 7:4 Package Reserved 12:8 Package Energy Status Units Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 0EH (or 61 micro-joules). 15:13 Package Reserved Vol. 4 2-253 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 19:16 Package Time Units See Section 14.9.1, “RAPL Interfaces.” 63:20 618H 1560 MSR_DRAM_POWER_LIMIT Reserved Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) Energy consumed by DRAM devices. 61BH 1563 31:0 Energy in 15.3 micro-joules. Requires BIOS configuration to enable DRAM RAPL mode 0 (Direct VR). 63:32 Reserved MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 620H 1568 MSR UNCORE_RATIO_LIMIT Package Uncore Ratio Limit (R/W) Out of reset, the min_ratio and max_ratio fields represent the widest possible range of uncore frequencies. Writing to these fields allows software to control the minimum and the maximum frequency that hardware will select. 63:15 Reserved 14:8 MIN_RATIO Writing to this field controls the minimum possible ratio of the LLC/Ring. 7 Reserved 6:0 MAX_RATIO This field is used to limit the max ratio of the LLC/Ring. 639H 1593 MSR_PP0_ENERGY_STATUS Package Reserved (R/O) Reads return 0. 690H 1680 MSR_CORE_PERF_LIMIT_REASONS Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 0 PROCHOT Status (R0) When set, processor core frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 2-254 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2 Power Budget Management Status (R0) When set, frequency is reduced below the operating system request due to PBM limit. 3 Platform Configuration Services Status (R0) When set, frequency is reduced below the operating system request due to PCS limit. 4 Reserved 5 Autonomous Utilization-Based Frequency Control Status (R0) When set, frequency is reduced below the operating system request because the processor has detected that utilization is low. 6 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from the Voltage Regulator. 7 Reserved 8 Electrical Design Point Status (R0) When set, frequency is reduced below the operating system request due to electrical design point constraints (e.g., maximum electrical current consumption). 9 Reserved 10 Multi-Core Turbo Status (R0) When set, frequency is reduced below the operating system request due to Multi-Core Turbo limits. 12:11 Reserved 13 Core Frequency P1 Status (R0) When set, frequency is reduced below max non-turbo P1. 14 Core Max N-Core Turbo Frequency Limiting Status (R0) When set, frequency is reduced below max n-core turbo frequency. 15 Core Frequency Limiting Status (R0) When set, frequency is reduced below the operating system request. 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-255 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 18 Power Budget Management Log When set, indicates that the PBM Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19 Platform Configuration Services Log When set, indicates that the PCS Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 20 Reserved 21 Autonomous Utilization-Based Frequency Control Log When set, indicates that the AUBFC Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 Reserved 24 Electrical Design Point Log When set, indicates that the EDP Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Reserved 26 Multi-Core Turbo Log When set, indicates that the Multi-Core Turbo Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28:27 2-256 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 29 Core Frequency P1 Log When set, indicates that the Core Frequency P1 Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 30 Core Max N-Core Turbo Frequency Limiting Log When set, indicates that the Core Max n-core Turbo Frequency Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 31 Core Frequency Limiting Log When set, indicates that the Core Frequency Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:32 Reserved 770H 1904 IA32_PM_ENABLE Package See Section 14.4.2, “Enabling HWP”. 771H 1905 IA32_HWP_CAPABILITIES Thread See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 774H 1908 IA32_HWP_REQUEST Thread See Section 14.4.4, “Managing HWP”. 777H C8DH 1911 3213 7:0 Minimum Performance (R/W) 15:8 Maximum Performance (R/W) 23:16 Desired Performance (R/W) 63:24 Reserved IA32_HWP_STATUS Thread See Section 14.4.5, “HWP Feedback”. 1:0 Reserved 2 Excursion to Minimum (RO) 63:3 Reserved IA32_QM_EVTSEL THREAD Monitoring Event Select Register (R/W) If CPUID.(EAX=07H, ECX=0):EBX.RDT-M[bit 12] = 1. 7:0 EventID (RW) Event encoding: 0x00: No monitoring. 0x01: L3 occupancy monitoring. 0x02: Total memory bandwidth monitoring. 0x03: Local memory bandwidth monitoring. All other encoding reserved. 31:8 Reserved Vol. 4 2-257 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex C8FH C90H Register Name / Bit Fields Scope Bit Description Dec 3215 3216 41:32 RMID (RW) 63:42 Reserved IA32_PQR_ASSOC THREAD Resource Association Register (R/W) 9:0 RMID 31:10 Reserved 51:32 COS (R/W) 63: 52 Reserved IA32_L3_QOS_MASK_0 Package L3 Class Of Service Mask - COS 0 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=0. C91H 3217 0:19 CBM: Bit vector of available L3 ways for COS 0 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_1 Package L3 Class Of Service Mask - COS 1 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=1. C92H 3218 0:19 CBM: Bit vector of available L3 ways for COS 1 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_2 Package L3 Class Of Service Mask - COS 2 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=2. C93H 3219 0:19 CBM: Bit vector of available L3 ways for COS 2 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_3 Package L3 Class Of Service Mask - COS 3 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=3. C94H 3220 0:19 CBM: Bit vector of available L3 ways for COS 3 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_4 Package L3 Class Of Service Mask - COS 4 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=4. 0:19 CBM: Bit vector of available L3 ways for COS 4 enforcement. 63:20 C95H 3221 IA32_L3_QOS_MASK_5 Reserved Package L3 Class Of Service Mask - COS 5 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=5. C96H 3222 0:19 CBM: Bit vector of available L3 ways for COS 5 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_6 Package L3 Class Of Service Mask - COS 6 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=6. 2-258 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex C97H Register Name / Bit Fields Scope Bit Description Dec 3223 0:19 CBM: Bit vector of available L3 ways for COS 6 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_7 Package L3 Class Of Service Mask - COS 7 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=7. C98H 3224 0:19 CBM: Bit vector of available L3 ways for COS 7 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_8 Package L3 Class Of Service Mask - COS 8 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=8. 0:19 CBM: Bit vector of available L3 ways for COS 8 enforcement. 63:20 C99H 3225 IA32_L3_QOS_MASK_9 Reserved Package L3 Class Of Service Mask - COS 9 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=9. C9AH 3226 0:19 CBM: Bit vector of available L3 ways for COS 9 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_10 Package L3 Class Of Service Mask - COS 10 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=10. C9BH 3227 0:19 CBM: Bit vector of available L3 ways for COS 10 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_11 Package L3 Class Of Service Mask - COS 11 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=11. C9CH 3228 0:19 CBM: Bit vector of available L3 ways for COS 11 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_12 Package L3 Class Of Service Mask - COS 12 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=12. C9DH 3229 0:19 CBM: Bit vector of available L3 ways for COS 12 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_13 Package L3 Class Of Service Mask - COS 13 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=13. C9EH 3230 0:19 CBM: Bit vector of available L3 ways for COS 13 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_14 Package L3 Class Of Service Mask - COS 14 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=14. Vol. 4 2-259 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-36. Additional MSRs Common to Intel® Xeon® Processor D and Intel Xeon Processors E5 v4 Family Based on the Broadwell Microarchitecture Register Address Hex C9FH Register Name / Bit Fields Scope Bit Description Dec 3231 0:19 CBM: Bit vector of available L3 ways for COS 14 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_15 Package L3 Class Of Service Mask - COS 15 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=15. 2.16.1 0:19 CBM: Bit vector of available L3 ways for COS 15 enforcement. 63:20 Reserved Additional MSRs Supported in the Intel® Xeon® Processor D Product Family The MSRs listed in Table 2-37 are available to Intel® Xeon® Processor D Product Family (CPUID DisplayFamily_DisplayModel = 06_56H). The Intel® Xeon® processor D product family is based on the Broadwell microarchitecture and supports the MSR interfaces listed in Table 2-20, Table 2-29, Table 2-34, Table 2-36, and Table 2-37. Table 2-37. Additional MSRs Supported by Intel® Xeon® Processor D with DisplayFamily_DisplayModel 06_56H Register Address Hex Dec 1ACH 428 Register Name / Bit Fields MSR_TURBO_RATIO_LIMIT3 Scope Package Bit Description Config Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 62:0 Package Reserved 63 Package Semaphore for Turbo Ratio Limit Configuration If 1, the processor uses override configuration1 specified in MSR_TURBO_RATIO_LIMIT, MSR_TURBO_RATIO_LIMIT1. If 0, the processor uses factory-set configuration (Default). 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 2-260 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-37. Additional MSRs Supported by Intel® Xeon® Processor D with DisplayFamily_DisplayModel 06_56H Register Address Register Name / Bit Fields Scope Hex Dec 418H 1048 IA32_MC6_CTL Package 419H 1049 IA32_MC6_STATUS Package 41AH 1050 IA32_MC6_ADDR Package 41BH 1051 IA32_MC6_MISC Package 41CH 1052 IA32_MC7_CTL Package 41DH 1053 IA32_MC7_STATUS Package 41EH 1054 IA32_MC7_ADDR Package 41FH 1055 IA32_MC7_MISC Package 424H 1060 IA32_MC9_CTL Package 425H 1061 IA32_MC9_STATUS Package 426H 1062 IA32_MC9_ADDR Package 427H 1063 IA32_MC9_MISC Package 428H 1064 IA32_MC10_CTL Package 429H 1065 IA32_MC10_STATUS Package 42AH 1066 IA32_MC10_ADDR Package 42BH 1067 IA32_MC10_MISC Package 444H 1092 IA32_MC17_CTL Package 445H 1093 IA32_MC17_STATUS Package 446H 1094 IA32_MC17_ADDR Package 447H 1095 IA32_MC17_MISC Package 448H 1096 IA32_MC18_CTL Package 449H 1097 IA32_MC18_STATUS Package 44AH 1098 IA32_MC18_ADDR Package 44BH 1099 IA32_MC18_MISC Package 44CH 1100 IA32_MC19_CTL Package 44DH 1101 IA32_MC19_STATUS Package 44EH 1102 IA32_MC19_ADDR Package 44FH 1103 IA32_MC19_MISC Package Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC6 reports MC errors from the integrated I/O module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC7 reports MC errors from the home agent HA 0. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 10 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 10 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC17 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo0, CBo3, CBo6, CBo9, CBo12, CBo15. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC18 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo1, CBo4, CBo7, CBo10, CBo13, CBo16. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC19 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo2, CBo5, CBo8, CBo11, CBo14, CBo17. See Table 2-20, Table 2-29, Table 2-34, and Table 2-36 for other MSR definitions applicable to processors with CPUID signature 06_56H. NOTES: 1. An override configuration lower than the factory-set configuration is always supported. An override configuration higher than the factory-set configuration is dependent on features specific to the processor and the platform. 2.16.2 Additional MSRs Supported in Intel® Xeon® Processors E5 v4 and E7 v4 Families The MSRs listed in Table 2-37 are available to Intel® Xeon® Processor E5 v4 and E7 v4 Families (CPUID DisplayFamily_DisplayModel = 06_4FH). The Intel® Xeon® processor E5 v4 family is based on the Broadwell Vol. 4 2-261 MODEL-SPECIFIC REGISTERS (MSRS) microarchitecture and supports the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-29, Table 2-34, Table 2-36, and Table 2-38. Table 2-38. Additional MSRs Supported by Intel® Xeon® Processors with DisplayFamily_DisplayModel 06_4FH Register Address Hex Dec 1ACH 428 Register Name / Bit Fields MSR_TURBO_RATIO_LIMIT3 Scope Package Bit Description Config Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0. RW if MSR_PLATFORM_INFO.[28] = 1. 62:0 Package Reserved 63 Package Semaphore for Turbo Ratio Limit Configuration If 1, the processor uses override configuration1 specified in MSR_TURBO_RATIO_LIMIT, MSR_TURBO_RATIO_LIMIT1 and MSR_TURBO_RATIO_LIMIT2. If 0, the processor uses factory-set configuration (Default). 285H 645 IA32_MC5_CTL2 Package See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 288H 648 IA32_MC8_CTL2 Package See Table 2-2. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 28BH 651 IA32_MC11_CTL2 Package See Table 2-2. 28CH 652 IA32_MC12_CTL2 Package See Table 2-2. 28DH 653 IA32_MC13_CTL2 Package See Table 2-2. 28EH 654 IA32_MC14_CTL2 Package See Table 2-2. 28FH 655 IA32_MC15_CTL2 Package See Table 2-2. 290H 656 IA32_MC16_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 294H 660 IA32_MC20_CTL2 Package See Table 2-2. 295H 661 IA32_MC21_CTL2 Package See Table 2-2. 414H 1044 IA32_MC5_CTL Package 415H 1045 IA32_MC5_STATUS Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. 416H 1046 IA32_MC5_ADDR Package 417H 1047 IA32_MC5_MISC Package 418H 1048 IA32_MC6_CTL Package 419H 1049 IA32_MC6_STATUS Package 41AH 1050 IA32_MC6_ADDR Package 41BH 1051 IA32_MC6_MISC Package 2-262 Vol. 4 Bank MC5 reports MC errors from the Intel QPI 0 module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC6 reports MC errors from the integrated I/O module. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-38. Additional MSRs Supported by Intel® Xeon® Processors with DisplayFamily_DisplayModel 06_4FH Register Address Register Name / Bit Fields Scope Hex Dec 41CH 1052 IA32_MC7_CTL Package 41DH 1053 IA32_MC7_STATUS Package 41EH 1054 IA32_MC7_ADDR Package 41FH 1055 IA32_MC7_MISC Package 420H 1056 IA32_MC8_CTL Package 421H 1057 IA32_MC8_STATUS Package 422H 1058 IA32_MC8_ADDR Package 423H 1059 IA32_MC8_MISC Package 424H 1060 IA32_MC9_CTL Package 425H 1061 IA32_MC9_STATUS Package 426H 1062 IA32_MC9_ADDR Package 427H 1063 IA32_MC9_MISC Package 428H 1064 IA32_MC10_CTL Package 429H 1065 IA32_MC10_STATUS Package 42AH 1066 IA32_MC10_ADDR Package 42BH 1067 IA32_MC10_MISC Package 42CH 1068 IA32_MC11_CTL Package 42DH 1069 IA32_MC11_STATUS Package 42EH 1070 IA32_MC11_ADDR Package 42FH 1071 IA32_MC11_MISC Package 430H 1072 IA32_MC12_CTL Package 431H 1073 IA32_MC12_STATUS Package 432H 1074 IA32_MC12_ADDR Package 433H 1075 IA32_MC12_MISC Package 434H 1076 IA32_MC13_CTL Package 435H 1077 IA32_MC13_STATUS Package 436H 1078 IA32_MC13_ADDR Package 437H 1079 IA32_MC13_MISC Package 438H 1080 IA32_MC14_CTL Package 439H 1081 IA32_MC14_STATUS Package 43AH 1082 IA32_MC14_ADDR Package 43BH 1083 IA32_MC14_MISC Package 43CH 1084 IA32_MC15_CTL Package 43DH 1085 IA32_MC15_STATUS Package 43EH 1086 IA32_MC15_ADDR Package 43FH 1087 IA32_MC15_MISC Package Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC7 reports MC errors from the home agent HA 0. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC8 reports MC errors from the home agent HA 1. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. Vol. 4 2-263 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-38. Additional MSRs Supported by Intel® Xeon® Processors with DisplayFamily_DisplayModel 06_4FH Register Address Register Name / Bit Fields Scope Hex Dec 440H 1088 IA32_MC16_CTL Package 441H 1089 IA32_MC16_STATUS Package 442H 1090 IA32_MC16_ADDR Package 443H 1091 IA32_MC16_MISC Package 444H 1092 IA32_MC17_CTL Package 445H 1093 IA32_MC17_STATUS Package 446H 1094 IA32_MC17_ADDR Package 447H 1095 IA32_MC17_MISC Package 448H 1096 IA32_MC18_CTL Package 449H 1097 IA32_MC18_STATUS Package 44AH 1098 IA32_MC18_ADDR Package 44BH 1099 IA32_MC18_MISC Package 44CH 1100 IA32_MC19_CTL Package 44DH 1101 IA32_MC19_STATUS Package 44EH 1102 IA32_MC19_ADDR Package 44FH 1103 IA32_MC19_MISC Package 450H 1104 IA32_MC20_CTL Package 451H 1105 IA32_MC20_STATUS Package 452H 1106 IA32_MC20_ADDR Package 453H 1107 IA32_MC20_MISC Package 454H 1108 IA32_MC21_CTL Package 455H 1109 IA32_MC21_STATUS Package 456H 1110 IA32_MC21_ADDR Package 457H 1111 IA32_MC21_MISC Package C81H 3201 IA32_L3_QOS_CFG Package Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 through MC 16 report MC errors from each channel of the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC17 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo0, CBo3, CBo6, CBo9, CBo12, CBo15. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC18 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo1, CBo4, CBo7, CBo10, CBo13, CBo16. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC19 reports MC errors from the following pair of CBo/L3 Slices (if the pair is present): CBo2, CBo5, CBo8, CBo11, CBo14, CBo17. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC20 reports MC errors from the Intel QPI 1 module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC21 reports MC errors from the Intel QPI 2 module. Cache Allocation Technology Configuration (R/W) 0 CAT Enable. Set 1 to enable Cache Allocation Technology. 63:1 Reserved See Table 2-20, Table 2-21, Table 2-29, and Table 2-30 for other MSR definitions applicable to processors with CPUID signature 06_45H. NOTES: 1. An override configuration lower than the factory-set configuration is always supported. An override configuration higher than the factory-set configuration is dependent on features specific to the processor and the platform. 2-264 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) 2.17 MSRS IN THE 6TH GENERATION, 7TH GENERATION, 8TH GENERATION, AND 9TH GENERATION INTEL® CORE™ PROCESSORS, INTEL® XEON® PROCESSOR SCALABLE FAMILY, 8TH GENERATION INTEL® CORE™ I3 PROCESSORS, AND INTEL® XEON® E PROCESSORS 6th generation Intel® Core™ processors and the Intel® Xeon® Processor Scalable Family are based on the Skylake microarchitecture and have CPUID DisplayFamily_DisplayModel signatures of 06_4EH, 06_5EH, and 06_55H. 7th generation Intel® Core™ processors are based on the Kaby Lake microarchitecture, 8th generation and 9th generation Intel® Core™ processors and Intel® Xeon® E processors are based on the Coffee Lake microarchitecture; these processors have CPUID DisplayFamily_DisplayModel signatures of 06_8EH and 06_9EH. 8th generation Intel® Core™ i3 processors are based on Cannon Lake microarchitecture and have a CPUID DisplayFamily_DisplayModel signature of 06_66H. These processors support the MSR interfaces listed in Table 2-20, Table 2-21, Table 2-25, Table 2-29, Table 2-35, Table 2-39, and Table 2-40. For an MSR listed in Table 2-39 that also appears in the model-specific tables of prior generations, Table 2-39 supercede prior generation tables. The notation of “Platform” in the Scope column (with respect to MSR_PLATFORM_ENERGY_COUNTER and MSR_PLATFORM_POWER_LIMIT) is limited to the power-delivery domain and the specifics of the power delivery integration may vary by platform vendor’s implementation. Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Thread Bit Description Control Features in Intel 64 Processor (R/W) See Table 2-2. FEH 254 IA32_MTRRCAP Thread MTRR Capability (RO, Architectural) See Table 2-2 19CH 412 IA32_THERM_STATUS Core Thermal Monitor Status (R/W) See Table 2-2. 0 Thermal Status (RO) See Table 2-2. 1 Thermal Status Log (R/WC0) See Table 2-2. 2 PROTCHOT # or FORCEPR# Status (RO) See Table 2-2. 3 PROTCHOT # or FORCEPR# Log (R/WC0) See Table 2-2. 4 Critical Temperature Status (RO) See Table 2-2. 5 Critical Temperature Status Log (R/WC0) See Table 2-2. 6 Thermal threshold #1 Status (RO) See Table 2-2. Vol. 4 2-265 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7 Thermal threshold #1 Log (R/WC0) See Table 2-2. 8 Thermal Threshold #2 Status (RO) See Table 2-2. 9 Thermal Threshold #2 Log (R/WC0) See Table 2-2. 10 Power Limitation Status (RO) See Table 2-2. 11 Power Limitation Log (R/WC0) See Table 2-2. 12 Current Limit Status (RO) See Table 2-2. 13 Current Limit Log (R/WC0) See Table 2-2. 14 Cross Domain Limit Status (RO) See Table 2-2. 15 Cross Domain Limit Log (R/WC0) See Table 2-2. 22:16 Digital Readout (RO) See Table 2-2. 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) See Table 2-2. 31 Reading Valid (RO) See Table 2-2. 63:32 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package Maximum Ratio Limit of Turbo Mode RO if MSR_PLATFORM_INFO.[28] = 0, RW if MSR_PLATFORM_INFO.[28] = 1 7:0 Package Maximum Ratio Limit for 1C Maximum turbo ratio limit of 1 core active. 15:8 Package Maximum Ratio Limit for 2C Maximum turbo ratio limit of 2 core active. 23:16 Package Maximum Ratio Limit for 3C Maximum turbo ratio limit of 3 core active. 2-266 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 31:24 Package Maximum Ratio Limit for 4C Maximum turbo ratio limit of 4 core active. 63:32 1C9H 457 MSR_LASTBRANCH_TOS Reserved Thread Last Branch Record Stack TOS (R/W) Contains an index (bits 0-4) that points to the MSR containing the most recent branch record. 1FCH 508 MSR_POWER_CTL Core Power Control Register See http://biosbits.org. 0 1 Reserved Package C1E Enable (R/W) When set to ‘1’, will enable the CPU to switch to the Minimum Enhanced Intel SpeedStep Technology operating point when all execution cores enter MWAIT (C1). 18:2 Reserved 19 Disable Race to Halt Optimization (R/W) Setting this bit disables the Race to Halt optimization and avoids this optimization limitation to execute below the most efficient frequency ratio. Default value is 0 for processors that support Race to Halt optimization. Default value is 1 for processors that do not support Race to Halt optimization. 20 Disable Energy Efficiency Optimization (R/W) Setting this bit disables the P-States energy efficiency optimization. Default value is 0. Disable/enable the energy efficiency optimization in P-State legacy mode (when IA32_PM_ENABLE[HWP_ENABLE] = 0), has an effect only in the turbo range or into PERF_MIN_CTL value if it is not zero set. In HWP mode (IA32_PM_ENABLE[HWP_ENABLE] == 1), has an effect between the OS desired or OS maximize to the OS minimize performance setting. 63:21 300H 768 MSR_SGXOWNEREPOCH0 Reserved Package Lower 64 Bit CR_SGXOWNEREPOCH (W) Writes do not update CR_SGXOWNEREPOCH if CPUID.(EAX=12H, ECX=0):EAX.SGX1 is 1 on any thread in the package. 63:0 Lower 64 bits of an 128-bit external entropy value for key derivation of an enclave. Vol. 4 2-267 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 301H 768 Register Name / Bit Fields MSR_SGXOWNEREPOCH1 Scope Package Bit Description Upper 64 Bit CR_SGXOWNEREPOCH (W) Writes do not update CR_SGXOWNEREPOCH if CPUID.(EAX=12H, ECX=0):EAX.SGX1 is 1 on any thread in the package. 38EH 910 63:0 Upper 64 bits of an 128-bit external entropy value for key derivation of an enclave. IA32_PERF_GLOBAL_STATUS See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 Thread Ovf_PMC0 1 Thread Ovf_PMC1 2 Thread Ovf_PMC2 3 Thread Ovf_PMC3 4 Thread Ovf_PMC4 (if CPUID.0AH:EAX[15:8] > 4) 5 Thread Ovf_PMC5 (if CPUID.0AH:EAX[15:8] > 5) 6 Thread Ovf_PMC6 (if CPUID.0AH:EAX[15:8] > 6) 7 Thread Ovf_PMC7 (if CPUID.0AH:EAX[15:8] > 7) 31:8 Reserved 32 Thread Ovf_FixedCtr0 33 Thread Ovf_FixedCtr1 34 Thread Ovf_FixedCtr2 54:35 55 Reserved Thread 57:56 390H 2-268 Vol. 4 912 Trace_ToPA_PMI Reserved 58 Thread LBR_Frz 59 Thread CTR_Frz 60 Thread ASCI 61 Thread Ovf_Uncore 62 Thread Ovf_BufDSSAVE 63 Thread CondChgd IA32_PERF_GLOBAL_STATUS_RESET See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 Thread Set 1 to clear Ovf_PMC0. 1 Thread Set 1 to clear Ovf_PMC1. 2 Thread Set 1 to clear Ovf_PMC2. 3 Thread Set 1 to clear Ovf_PMC3. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Dec 4 Thread Set 1 to clear Ovf_PMC4 (if CPUID.0AH:EAX[15:8] > 4). 5 Thread Set 1 to clear Ovf_PMC5 (if CPUID.0AH:EAX[15:8] > 5). 6 Thread Set 1 to clear Ovf_PMC6 (if CPUID.0AH:EAX[15:8] > 6). 7 Thread Set 1 to clear Ovf_PMC7 (if CPUID.0AH:EAX[15:8] > 7). 31:8 Reserved 32 Thread Set 1 to clear Ovf_FixedCtr0. 33 Thread Set 1 to clear Ovf_FixedCtr1. 34 Thread Set 1 to clear Ovf_FixedCtr2. 54:35 55 Reserved Thread 57:56 391H Bit Description 913 Set 1 to clear Trace_ToPA_PMI. Reserved 58 Thread Set 1 to clear LBR_Frz. 59 Thread Set 1 to clear CTR_Frz. 60 Thread Set 1 to clear ASCI. 61 Thread Set 1 to clear Ovf_Uncore. 62 Thread Set 1 to clear Ovf_BufDSSAVE. 63 Thread Set 1 to clear CondChgd. IA32_PERF_GLOBAL_STATUS_SET See Table 2-2. See Section 18.2.4, “Architectural Performance Monitoring Version 4.” 0 Thread Set 1 to cause Ovf_PMC0 = 1. 1 Thread Set 1 to cause Ovf_PMC1 = 1. 2 Thread Set 1 to cause Ovf_PMC2 = 1. 3 Thread Set 1 to cause Ovf_PMC3 = 1. 4 Thread Set 1 to cause Ovf_PMC4=1 (if CPUID.0AH:EAX[15:8] > 4). 5 Thread Set 1 to cause Ovf_PMC5=1 (if CPUID.0AH:EAX[15:8] > 5). 6 Thread Set 1 to cause Ovf_PMC6=1 (if CPUID.0AH:EAX[15:8] > 6). 7 Thread Set 1 to cause Ovf_PMC7=1 (if CPUID.0AH:EAX[15:8] > 7). 31:8 Reserved 32 Thread Set 1 to cause Ovf_FixedCtr0 = 1. 33 Thread Set 1 to cause Ovf_FixedCtr1 = 1. 34 Thread Set 1 to cause Ovf_FixedCtr2 = 1. Vol. 4 2-269 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Dec 54:35 55 Reserved Thread 57:56 Reserved Thread Set 1 to cause LBR_Frz = 1. 59 Thread Set 1 to cause CTR_Frz = 1. 60 Thread Set 1 to cause ASCI = 1. 61 Thread Set 1 to cause Ovf_Uncore. 62 Thread Set 1 to cause Ovf_BufDSSAVE. 63 Reserved See Table 2-2. 913 IA32_PERF_GLOBAL_INUSE 3F7H 1015 MSR_PEBS_FRONTEND 1280 Set 1 to cause Trace_ToPA_PMI = 1. 58 392H 500H Bit Description Thread FrontEnd Precise Event Condition Select (R/W) 2:0 Event Code Select 3 Reserved 4 Event Code Select High 7:5 Reserved 19:8 IDQ_Bubble_Length Specifier 22:20 IDQ_Bubble_Width Specifier 63:23 Reserved IA32_SGX_SVN_STATUS Thread 0 Status and SVN Threshold of SGX Support for ACM (RO) Lock See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. 15:1 Reserved 23:16 SGX_SVN_SINIT See Section 41.11.3, “Interactions with Authenticated Code Modules (ACMs)”. 63:24 560H 1376 IA32_RTIT_OUTPUT_BASE Reserved Thread Trace Output Base Register (R/W) See Table 2-2. 561H 1377 IA32_RTIT_OUTPUT_MASK_PTRS Thread Trace Output Mask Pointers Register (R/W) See Table 2-2. 570H 2-270 Vol. 4 1392 IA32_RTIT_CTL Thread Trace Control Register (R/W) 0 TraceEn 1 CYCEn 2 OS MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 3 User 6:4 Reserved, must be zero. 7 CR3 filter 8 ToPA Writing 0 will #GP if also setting TraceEn. 571H 572H 580H 1393 1394 1408 9 MTCEn 10 TSCEn 11 DisRETC 12 Reserved, must be zero. 13 BranchEn 17:14 MTCFreq 18 Reserved, must be zero. 22:19 CYCThresh 23 Reserved, must be zero. 27:24 PSBFreq 31:28 Reserved, must be zero. 35:32 ADDR0_CFG 39:36 ADDR1_CFG 63:40 Reserved, must be zero. IA32_RTIT_STATUS Thread Tracing Status Register (R/W) 0 FilterEn, writes ignored. 1 ContexEn, writes ignored. 2 TriggerEn, writes ignored. 3 Reserved 4 Error (R/W) 5 Stopped 31:6 Reserved, must be zero. 48:32 PacketByteCnt 63:49 Reserved, must be zero. IA32_RTIT_CR3_MATCH Thread Trace Filter CR3 Match Register (R/W) 4:0 Reserved 63:5 CR3[63:5] value to match IA32_RTIT_ADDR0_A 63:0 Thread Region 0 Start Address (R/W) See Table 2-2. Vol. 4 2-271 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 581H 1409 Register Name / Bit Fields IA32_RTIT_ADDR0_B Scope Thread 63:0 582H 1410 IA32_RTIT_ADDR1_A 1411 IA32_RTIT_ADDR1_B Thread 1593 MSR_PP0_ENERGY_STATUS Region 1 Start Address (R/W) See Table 2-2. Thread 63:0 639H Region 0 End Address (R/W) See Table 2-2. 63:0 583H Bit Description Region 1 End Address (R/W) See Table 2-2. Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 64DH 1613 MSR_PLATFORM_ENERGY_COUNTER Platform* Platform Energy Counter (R/O) This MSR is valid only if both platform vendor hardware implementation and BIOS enablement support it. This MSR will read 0 if not valid. 64EH 1614 31:0 Total energy consumed by all devices in the platform that receive power from integrated power delivery mechanism, included platform devices are processor cores, SOC, memory, add-on or peripheral devices that get powered directly from the platform power delivery means. The energy units are specified in the MSR_RAPL_POWER_UNIT.Enery_Status_Unit. 63:32 Reserved MSR_PPERF Thread 63:0 64FH 1615 MSR_CORE_PERF_LIMIT_REASONS Productive Performance Count (R/O) Hardware’s view of workload scalability. See Section 14.4.5.1. Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 0 PROCHOT Status (R0) When set, frequency is reduced below the operating system request due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced below the operating system request due to a thermal event. 3:2 Reserved 4 Residency State Regulation Status (R0) When set, frequency is reduced below the operating system request due to residency state regulation limit. 2-272 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 5 Running Average Thermal Limit Status (R0) When set, frequency is reduced below the operating system request due to Running Average Thermal Limit (RATL). 6 VR Therm Alert Status (R0) When set, frequency is reduced below the operating system request due to a thermal alert from a processor Voltage Regulator (VR). 7 VR Therm Design Current Status (R0) When set, frequency is reduced below the operating system request due to VR thermal design current limit. 8 Other Status (R0) When set, frequency is reduced below the operating system request due to electrical or other constraints. 9 Reserved 10 Package/Platform-Level Power Limiting PL1 Status (R0) When set, frequency is reduced below the operating system request due to package/platform-level power limiting PL1. 11 Package/Platform-Level PL2 Power Limiting Status (R0) When set, frequency is reduced below the operating system request due to package/platform-level power limiting PL2/PL3. 12 Max Turbo Limit Status (R0) When set, frequency is reduced below the operating system request due to multi-core turbo limits. 13 Turbo Transition Attenuation Status (R0) When set, frequency is reduced below the operating system request due to Turbo transition attenuation. This prevents performance degradation due to frequent operating ratio changes. 15:14 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-273 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 19:18 20 Reserved. Residency State Regulation Log When set, indicates that the Residency State Regulation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 21 Running Average Thermal Limit Log When set, indicates that the RATL Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 VR Thermal Design Current Log When set, indicates that the VR TDC Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 24 Other Log When set, indicates that the Other Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Reserved 26 Package/Platform-Level PL1 Power Limiting Log When set, indicates that the Package or Platform Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 2-274 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 27 Package/Platform-Level PL2 Power Limiting Log When set, indicates that the Package or Platform Level PL2/PL3 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Max Turbo Limit Log When set, indicates that the Max Turbo Limit Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 29 Turbo Transition Attenuation Log When set, indicates that the Turbo Transition Attenuation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:30 652H 1618 MSR_PKG_HDC_CONFIG Reserved Package 2:0 HDC Configuration (R/W) PKG_Cx_Monitor Configures Package Cx state threshold for MSR_PKG_HDC_DEEP_RESIDENCY. 63: 3 653H 1619 MSR_CORE_HDC_RESIDENCY Reserved Core 63:0 655H 1621 MSR_PKG_HDC_SHALLOW_RESIDENCY Core_Cx_Duty_Cycle_Cnt Package 63:0 656H 1622 MSR_PKG_HDC_DEEP_RESIDENCY 1624 MSR_WEIGHTED_CORE_C0 Package 1625 MSR_ANY_CORE_C0 63:0 Package Cx HDC Idle Residency (R/O) Pkg_Cx_Duty_Cycle_Cnt Package 63:0 659H Accumulate the cycles the package was in C2 state and at least one logical processor was in forced idle (R/O) Pkg_C2_Duty_Cycle_Cnt 63:0 658H Core HDC Idle Residency (R/O) Core-count Weighted C0 Residency (R/O) Increment at the same rate as the TSC. The increment each cycle is weighted by the number of processor cores in the package that reside in C0. If N cores are simultaneously in C0, then each cycle the counter increments by N. Package Any Core C0 Residency (R/O) Increment at the same rate as the TSC. The increment each cycle is one if any processor core in the package is in C0. Vol. 4 2-275 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 65AH 1626 Register Name / Bit Fields MSR_ANY_GFXE_C0 Scope Package 63:0 65BH 1627 MSR_CORE_GFXE_OVERLAP_C0 1628 MSR_PLATFORM_POWER_LIMIT Any Graphics Engine C0 Residency (R/O) Increment at the same rate as the TSC. The increment each cycle is one if any processor graphic device’s compute engines are in C0. Package 63:0 65CH Bit Description Core and Graphics Engine Overlapped C0 Residency (R/O) Increment at the same rate as the TSC. The increment each cycle is one if at least one compute engine of the processor graphics is in C0 and at least one processor core in the package is also in C0. Platform* Platform Power Limit Control (R/W-L) Allows platform BIOS to limit power consumption of the platform devices to the specified values. The Long Duration power consumption is specified via Platform_Power_Limit_1 and Platform_Power_Limit_1_Time. The Short Duration power consumption limit is specified via the Platform_Power_Limit_2 with duration chosen by the processor. The processor implements an exponential-weighted algorithm in the placement of the time windows. 14:0 Platform Power Limit #1 Average Power limit value which the platform must not exceed over a time window as specified by Power_Limit_1_TIME field. The default value is the Thermal Design Power (TDP) and varies with product skus. The unit is specified in MSR_RAPLPOWER_UNIT. 15 Enable Platform Power Limit #1 When set, enables the processor to apply control policy such that the platform power does not exceed Platform Power limit #1 over the time window specified by Power Limit #1 Time Window. 16 Platform Clamping Limitation #1 When set, allows the processor to go below the OS requested P states in order to maintain the power below specified Platform Power Limit #1 value. This bit is writeable only when CPUID (EAX=6):EAX[4] is set. 2-276 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23:17 Time Window for Platform Power Limit #1 Specifies the duration of the time window over which Platform Power Limit 1 value should be maintained for sustained long duration. This field is made up of two numbers from the following equation: Time Window = (float) ((1+(X/4))*(2^Y)), where: X = POWER_LIMIT_1_TIME[23:22] Y = POWER_LIMIT_1_TIME[21:17] The maximum allowed value in this field is defined in MSR_PKG_POWER_INFO[PKG_MAX_WIN]. The default value is 0DH, The unit is specified in MSR_RAPLPOWER_UNIT[Time Unit]. 31:24 Reserved 46:32 Platform Power Limit #2 Average Power limit value which the platform must not exceed over the Short Duration time window chosen by the processor. The recommended default value is 1.25 times the Long Duration Power Limit (i.e., Platform Power Limit # 1). 47 Enable Platform Power Limit #2 When set, enables the processor to apply control policy such that the platform power does not exceed Platform Power limit #2 over the Short Duration time window. 48 Platform Clamping Limitation #2 When set, allows the processor to go below the OS requested P states in order to maintain the power below specified Platform Power Limit #2 value. 690H 1680 62:49 Reserved 63 Lock. Setting this bit will lock all other bits of this MSR until system RESET. MSR_LASTBRANCH_16_FROM_IP Thread Last Branch Record 16 From IP (R/W) One of 32 triplets of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.12. 691H 1681 MSR_LASTBRANCH_17_FROM_IP Thread Last Branch Record 17 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 692H 1682 MSR_LASTBRANCH_18_FROM_IP Thread Last Branch Record 18 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. Vol. 4 2-277 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 693H 1683 Register Name / Bit Fields MSR_LASTBRANCH_19_FROM_IP Scope Thread Bit Description Last Branch Record 19From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 694H 1684 MSR_LASTBRANCH_20_FROM_IP Thread Last Branch Record 20 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 695H 1685 MSR_LASTBRANCH_21_FROM_IP Thread Last Branch Record 21 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 696H 1686 MSR_LASTBRANCH_22_FROM_IP Thread Last Branch Record 22 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 697H 1687 MSR_LASTBRANCH_23_FROM_IP Thread Last Branch Record 23 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 698H 1688 MSR_LASTBRANCH_24_FROM_IP Thread Last Branch Record 24 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 699H 1689 MSR_LASTBRANCH_25_FROM_IP Thread Last Branch Record 25 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69AH 1690 MSR_LASTBRANCH_26_FROM_IP Thread Last Branch Record 26 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69BH 1691 MSR_LASTBRANCH_27_FROM_IP Thread Last Branch Record 27 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69CH 1692 MSR_LASTBRANCH_28_FROM_IP Thread Last Branch Record 28 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69DH 1693 MSR_LASTBRANCH_29_FROM_IP Thread Last Branch Record 29 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69EH 1694 MSR_LASTBRANCH_30_FROM_IP Thread Last Branch Record 30 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 69FH 1695 MSR_LASTBRANCH_31_FROM_IP Thread Last Branch Record 31 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. 6B0H 1712 MSR_GRAPHICS_PERF_LIMIT_REASONS Package Indicator of Frequency Clipping in the Processor Graphics (R/W) (Frequency refers to processor graphics frequency.) 0 PROCHOT Status (R0) When set, frequency is reduced due to assertion of external PROCHOT. 1 Thermal Status (R0) When set, frequency is reduced due to a thermal event. 4:2 2-278 Vol. 4 Reserved. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 5 Running Average Thermal Limit Status (R0) When set, frequency is reduced due to running average thermal limit. 6 VR Therm Alert Status (R0) When set, frequency is reduced due to a thermal alert from a processor Voltage Regulator. 7 VR Thermal Design Current Status (R0) When set, frequency is reduced due to VR TDC limit. 8 Other Status (R0) When set, frequency is reduced due to electrical or other constraints. 9 Reserved 10 Package/Platform-Level Power Limiting PL1 Status (R0) When set, frequency is reduced due to package/platform-level power limiting PL1. 11 Package/Platform-Level PL2 Power Limiting Status (R0) When set, frequency is reduced due to package/platform-level power limiting PL2/PL3. 12 Inefficient Operation Status (R0) When set, processor graphics frequency is operating below target frequency. 15:13 16 Reserved PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 20:18 Reserved. 21 Running Average Thermal Limit Log When set, indicates that the RATL Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. Vol. 4 2-279 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 VR Thermal Design Current Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 24 Other Log When set, indicates that the OTHER Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Reserved 26 Package/Platform-Level PL1 Power Limiting Log When set, indicates that the Package/Platform Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 27 Package/Platform-Level PL2 Power Limiting Log When set, indicates that the Package/Platform Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 28 Inefficient Operation Log When set, indicates that the Inefficient Operation Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:29 6B1H 1713 MSR_RING_PERF_LIMIT_REASONS Reserved Package Indicator of Frequency Clipping in the Ring Interconnect (R/W) (Frequency refers to ring interconnect in the uncore.) 0 PROCHOT Status (R0) When set, frequency is reduced due to assertion of external PROCHOT. 2-280 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 1 Thermal Status (R0) When set, frequency is reduced due to a thermal event. 4:2 Reserved 5 Running Average Thermal Limit Status (R0) When set, frequency is reduced due to running average thermal limit. 6 VR Therm Alert Status (R0) When set, frequency is reduced due to a thermal alert from a processor Voltage Regulator. 7 VR Thermal Design Current Status (R0) When set, frequency is reduced due to VR TDC limit. 8 Other Status (R0) When set, frequency is reduced due to electrical or other constraints. 9 Reserved 10 Package/Platform-Level Power Limiting PL1 Status (R0) When set, frequency is reduced due to package/Platform-level power limiting PL1. 11 Package/Platform-Level PL2 Power Limiting Status (R0) When set, frequency is reduced due to package/Platform-level power limiting PL2/PL3. 15:12 Reserved 16 PROCHOT Log When set, indicates that the PROCHOT Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 17 Thermal Log When set, indicates that the Thermal Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 20:18 Reserved Vol. 4 2-281 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 21 Running Average Thermal Limit Log When set, indicates that the RATL Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 22 VR Therm Alert Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 23 VR Thermal Design Current Log When set, indicates that the VR Therm Alert Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 24 Other Log When set, indicates that the OTHER Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 25 Reserved 26 Package/Platform-Level PL1 Power Limiting Log When set, indicates that the Package/Platform Level PL1 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 27 Package/Platform-Level PL2 Power Limiting Log When set, indicates that the Package/Platform Level PL2 Power Limiting Status bit has asserted since the log bit was last cleared. This log bit will remain set until cleared by software writing 0. 63:28 6D0H 1744 MSR_LASTBRANCH_16_TO_IP Reserved Thread Last Branch Record 16 To IP (R/W) One of 32 triplets of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.12. 2-282 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 6D1H 1745 Register Name / Bit Fields MSR_LASTBRANCH_17_TO_IP Scope Thread Bit Description Last Branch Record 17 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D2H 1746 MSR_LASTBRANCH_18_TO_IP Thread Last Branch Record 18 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D3H 1747 MSR_LASTBRANCH_19_TO_IP Thread Last Branch Record 19To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D4H 1748 MSR_LASTBRANCH_20_TO_IP Thread Last Branch Record 20 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D5H 1749 MSR_LASTBRANCH_21_TO_IP Thread Last Branch Record 21 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D6H 1750 MSR_LASTBRANCH_22_TO_IP Thread Last Branch Record 22 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D7H 1751 MSR_LASTBRANCH_23_TO_IP Thread Last Branch Record 23 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D8H 1752 MSR_LASTBRANCH_24_TO_IP Thread Last Branch Record 24 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6D9H 1753 MSR_LASTBRANCH_25_TO_IP Thread Last Branch Record 25 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DAH 1754 MSR_LASTBRANCH_26_TO_IP Thread Last Branch Record 26 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DBH 1755 MSR_LASTBRANCH_27_TO_IP Thread Last Branch Record 27 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DCH 1756 MSR_LASTBRANCH_28_TO_IP Thread Last Branch Record 28 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DDH 1757 MSR_LASTBRANCH_29_TO_IP Thread Last Branch Record 29 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DEH 1758 MSR_LASTBRANCH_30_TO_IP Thread Last Branch Record 30 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 6DFH 1759 MSR_LASTBRANCH_31_TO_IP Thread Last Branch Record 31 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. 770H 1904 IA32_PM_ENABLE Package See Section 14.4.2, “Enabling HWP”. 771H 1905 IA32_HWP_CAPABILITIES Thread See Section 14.4.3, “HWP Performance Range and Dynamic Capabilities”. 772H 1906 IA32_HWP_REQUEST_PKG Package See Section 14.4.4, “Managing HWP”. 773H 1907 IA32_HWP_INTERRUPT Thread See Section 14.4.6, “HWP Notifications”. 774H 1908 IA32_HWP_REQUEST Thread See Section 14.4.4, “Managing HWP”. Vol. 4 2-283 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 7:0 Minimum Performance (R/W) 15:8 Maximum Performance (R/W) 23:16 Desired Performance (R/W) 31:24 Energy/Performance Preference (R/W) 41:32 Activity Window (R/W) 42 Package Control (R/W) 63:43 Reserved 777H 1911 IA32_HWP_STATUS Thread See Section 14.4.5, “HWP Feedback”. D90H 3472 IA32_BNDCFGS Thread See Table 2-2. DA0H 3488 IA32_XSS Thread See Table 2-2. DB0H 3504 IA32_PKG_HDC_CTL Package See Section 14.5.2, “Package level Enabling HDC”. DB1H 3505 IA32_PM_CTL1 Thread See Section 14.5.3, “Logical-Processor Level HDC Control”. DB2H 3506 IA32_THREAD_STALL Thread See Section 14.5.4.1, “IA32_THREAD_STALL”. DC0H 3520 MSR_LBR_INFO_0 Thread Last Branch Record 0 Additional Information (R/W) One of 32 triplet of last branch record registers on the last branch record stack. This part of the stack contains flag, TSX-related and elapsed cycle information. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.9.1, “LBR Stack.” DC1H 3521 MSR_LBR_INFO_1 Thread Last Branch Record 1 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC2H 3522 MSR_LBR_INFO_2 Thread Last Branch Record 2 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC3H 3523 MSR_LBR_INFO_3 Thread Last Branch Record 3 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC4H 3524 MSR_LBR_INFO_4 Thread Last Branch Record 4 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC5H 3525 MSR_LBR_INFO_5 Thread Last Branch Record 5 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC6H 3526 MSR_LBR_INFO_6 Thread Last Branch Record 6 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC7H 3527 MSR_LBR_INFO_7 Thread Last Branch Record 7 Additional Information (R/W) See description of MSR_LBR_INFO_0. DC8H 3528 MSR_LBR_INFO_8 Thread Last Branch Record 8 Additional Information (R/W) See description of MSR_LBR_INFO_0. 2-284 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec DC9H 3529 Register Name / Bit Fields MSR_LBR_INFO_9 Scope Thread Bit Description Last Branch Record 9 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCAH 3530 MSR_LBR_INFO_10 Thread Last Branch Record 10 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCBH 3531 MSR_LBR_INFO_11 Thread Last Branch Record 11 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCCH 3532 MSR_LBR_INFO_12 Thread Last Branch Record 12 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCDH 3533 MSR_LBR_INFO_13 Thread Last Branch Record 13 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCEH 3534 MSR_LBR_INFO_14 Thread Last Branch Record 14 Additional Information (R/W) See description of MSR_LBR_INFO_0. DCFH 3535 MSR_LBR_INFO_15 Thread Last Branch Record 15 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD0H 3536 MSR_LBR_INFO_16 Thread Last Branch Record 16 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD1H 3537 MSR_LBR_INFO_17 Thread Last Branch Record 17 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD2H 3538 MSR_LBR_INFO_18 Thread Last Branch Record 18 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD3H 3539 MSR_LBR_INFO_19 Thread Last Branch Record 19 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD4H 3520 MSR_LBR_INFO_20 Thread Last Branch Record 20 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD5H 3521 MSR_LBR_INFO_21 Thread Last Branch Record 21 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD6H 3522 MSR_LBR_INFO_22 Thread Last Branch Record 22 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD7H 3523 MSR_LBR_INFO_23 Thread Last Branch Record 23 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD8H 3524 MSR_LBR_INFO_24 Thread Last Branch Record 24 Additional Information (R/W) See description of MSR_LBR_INFO_0. DD9H 3525 MSR_LBR_INFO_25 Thread Last Branch Record 25 Additional Information (R/W) See description of MSR_LBR_INFO_0. DDAH 3526 MSR_LBR_INFO_26 Thread Last Branch Record 26 Additional Information (R/W) See description of MSR_LBR_INFO_0. Vol. 4 2-285 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-39. Additional MSRs Supported by 6th Generation Intel® Core™ Processors and the Intel® Xeon® Processor Scalable Family Based on Skylake Microarchitecture, 7th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture, 8th Generation and 9th Generation Intel® Core™ Processors and Intel® Xeon® E Processors Based on Coffee Lake Microarchitecture, and 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec DDBH 3527 Register Name / Bit Fields MSR_LBR_INFO_27 Scope Thread Bit Description Last Branch Record 27 Additional Information (R/W) See description of MSR_LBR_INFO_0. DDCH 3528 MSR_LBR_INFO_28 Thread Last Branch Record 28 Additional Information (R/W) See description of MSR_LBR_INFO_0. DDDH 3529 MSR_LBR_INFO_29 Thread Last Branch Record 29 Additional Information (R/W) See description of MSR_LBR_INFO_0. DDEH 3530 MSR_LBR_INFO_30 Thread Last Branch Record 30 Additional Information (R/W) See description of MSR_LBR_INFO_0. DDFH 3531 MSR_LBR_INFO_31 Thread Last Branch Record 31 Additional Information (R/W) See description of MSR_LBR_INFO_0. Table 2-40 lists the MSRs of uncore PMU for Intel processors with CPUID DisplayFamily_DisplayModel signatures of 06_4EH, 06_5EH, 06_8EH, 06_9EH, and 06_66H. Table 2-40. Uncore PMU MSRs Supported by 6th Generation, 7th Generation, and 8th Generation Intel® Core™ Processors, and Future Intel® Core™ Processors Register Address Hex Dec 394H 916 395H 396H 917 918 Register Name / Bit Fields MSR_UNC_PERF_FIXED_CTRL Scope Package Bit Description Uncore Fixed Counter Control (R/W) 19:0 Reserved 20 Enable overflow propagation. 21 Reserved 22 Enable counting. 63:23 Reserved MSR_UNC_PERF_FIXED_CTR Package Uncore Fixed Counter 43:0 Current count. 63:44 Reserved MSR_UNC_CBO_CONFIG Package Uncore C-Box Configuration Information (R/O) 3:0 Specifies the number of C-Box units with programmable counters (including processor cores and processor graphics). 63:4 Reserved 3B0H 946 MSR_UNC_ARB_PERFCTR0 Package Uncore Arb Unit, Performance Counter 0 3B1H 947 MSR_UNC_ARB_PERFCTR1 Package Uncore Arb Unit, Performance Counter 1 3B2H 944 MSR_UNC_ARB_PERFEVTSEL0 Package Uncore Arb Unit, Counter 0 Event Select MSR 3B3H 945 MSR_UNC_ARB_PERFEVTSEL1 Package Uncore Arb Unit, Counter 1 Event Select MSR 2-286 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-40. Uncore PMU MSRs Supported by 6th Generation, 7th Generation, and 8th Generation Intel® Core™ Processors, and Future Intel® Core™ Processors Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 700H 1792 MSR_UNC_CBO_0_PERFEVTSEL0 Package Uncore C-Box 0, Counter 0 Event Select MSR 701H 1793 MSR_UNC_CBO_0_PERFEVTSEL1 Package Uncore C-Box 0, Counter 1 Event Select MSR 706H 1798 MSR_UNC_CBO_0_PERFCTR0 Package Uncore C-Box 0, Performance Counter 0 707H 1799 MSR_UNC_CBO_0_PERFCTR1 Package Uncore C-Box 0, Performance Counter 1 710H 1808 MSR_UNC_CBO_1_PERFEVTSEL0 Package Uncore C-Box 1, Counter 0 Event Select MSR 711H 1809 MSR_UNC_CBO_1_PERFEVTSEL1 Package Uncore C-Box 1, Counter 1 Event Select MSR 716H 1814 MSR_UNC_CBO_1_PERFCTR0 Package Uncore C-Box 1, Performance Counter 0 717H 1815 MSR_UNC_CBO_1_PERFCTR1 Package Uncore C-Box 1, Performance Counter 1 720H 1824 MSR_UNC_CBO_2_PERFEVTSEL0 Package Uncore C-Box 2, Counter 0 Event Select MSR 721H 1825 MSR_UNC_CBO_2_PERFEVTSEL1 Package Uncore C-Box 2, Counter 1 Event Select MSR 726H 1830 MSR_UNC_CBO_2_PERFCTR0 Package Uncore C-Box 2, Performance Counter 0 727H 1831 MSR_UNC_CBO_2_PERFCTR1 Package Uncore C-Box 2, Performance Counter 1 730H 1840 MSR_UNC_CBO_3_PERFEVTSEL0 Package Uncore C-Box 3, Counter 0 Event Select MSR 731H 1841 MSR_UNC_CBO_3_PERFEVTSEL1 Package Uncore C-Box 3, Counter 1 Event Select MSR 736H 1846 MSR_UNC_CBO_3_PERFCTR0 Package Uncore C-Box 3, Performance Counter 0 737H 1847 MSR_UNC_CBO_3_PERFCTR1 Package Uncore C-Box 3, Performance Counter 1 E01H 3585 MSR_UNC_PERF_GLOBAL_CTRL Package Uncore PMU Global Control E02H 3586 0 Slice 0 select. 1 Slice 1 select. 2 Slice 2 select. 3 Slice 3 select. 4 Slice 4select. 18:5 Reserved 29 Enable all uncore counters. 30 Enable wake on PMI. 31 Enable Freezing counter when overflow. 63:32 Reserved MSR_UNC_PERF_GLOBAL_STATUS Package Uncore PMU Main Status 0 Fixed counter overflowed. 1 An ARB counter overflowed. 2 Reserved 3 A CBox counter overflowed (on any slice). 63:4 Reserved Vol. 4 2-287 MODEL-SPECIFIC REGISTERS (MSRS) 2.17.1 MSRs Specific to 7th Generation and 8th Generation Intel® Core™ Processors based on Kaby Lake Microarchitecture and Coffee Lake Microarchitecture Table 2-42 lists additional MSRs for 7th generation and 8th generation Intel Core processors with a CPUID DisplayFamily_DisplayModel signatures of 06_8EH and 06_9EH. For an MSR listed in Table 2-42 that also appears in the model-specific tables of prior generations, Table 2-42 supersedes prior generation tables. Table 2-41. Additional MSRs Supported by 7th Generation and 8th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture and Coffee Lake Microarchitecture Register Address Hex Dec 80H 128 Register Name / Bit Fields MSR_TRACE_HUB_STH_ACPIBAR_BASE Scope Package 0 Bit Description NPK Address Used by AET Messages (R/W) Lock Bit If set, this MSR cannot be re-written anymore. Lock bit has to be set in order for the AET packets to be directed to NPK MMIO. 17:1 Reserved 63:18 ACPIBAR_BASE_ADDRESS AET target address in NPK MMIO space. 1F4H 500 MSR_PRMRR_PHYS_BASE Core 2:0 Processor Reserved Memory Range Register Physical Base Control Register (R/W) MemType PRMRR BASE MemType. 11:3 Reserved 45:12 Base PRMRR Base Address. 63:46 1F5H 501 MSR_PRMRR_PHYS_MASK Reserved Core Processor Reserved Memory Range Register Physical Mask Control Register (R/W) 9:0 Reserved 10 Lock Lock bit for the PRMRR. 11 VLD Enable bit for the PRMRR. 45:12 Mask PRMRR MASK bits. 63:46 1FBH 2-288 Vol. 4 507 MSR_PRMRR_VALID_CONFIG Reserved Core Valid PRMRR Configurations (R/W) 0 1M supported MEE size. 4:1 Reserved 5 32M supported MEE size. 6 64M supported MEE size. 7 128M supported MEE size. 31:8 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-41. Additional MSRs Supported by 7th Generation and 8th Generation Intel® Core™ Processors Based on Kaby Lake Microarchitecture and Coffee Lake Microarchitecture Register Address Hex Dec 2F4H 756 Register Name / Bit Fields MSR_UNCORE_PRMRR_PHYS_BASE Scope Package Bit Description (R/W) The PRMRR range is used to protect Xucode memory from unauthorized reads and writes. Any IO access to this range is aborted. This register controls the location of the PRMRR range by indicating its starting address. It functions in tandem with the PRMRR mask register. 11:0 Reserved 38:12 Range Base This field corresponds to bits 38:12 of the base address memory range which is allocated to PRMRR memory. 63:39 2F5H 757 MSR_UNCORE_PRMRR_PHYS_MASK Reserved Package (R/W) This register controls the size of the PRMRR range by indicating which address bits must match the PRMRR base register value. 9:0 Reserved 10 Lock Setting this bit locks all writeable settings in this register, including itself. 11 Range_En Indicates whether the PRMRR range is enabled and valid. 38:12 Range_Mask This field indicates which address bits must match PRMRR base in order to qualify as an PRMRR access. 63:39 620H 1568 MSR_RING_RATIO_LIMIT Reserved Package Ring Ratio Limit (R/W) This register provides Min/Max Ratio Limits for the LLC and Ring. 6:0 MAX_Ratio This field is used to limit the max ratio of the LLC/Ring. 7 14:8 Reserved MIN_Ratio Writing to this field controls the minimum possible ratio of the LLC/Ring. 63:15 Reserved Vol. 4 2-289 MODEL-SPECIFIC REGISTERS (MSRS) 2.17.2 MSRs Specific to 8th Generation Intel® Core™ i3 Processors Table 2-42 lists additional MSRs for 8th generation Intel Core i3 processors with a CPUID DisplayFamily_DisplayModel signature of 06_66H. For an MSR listed in Table 2-42 that also appears in the modelspecific tables of prior generations, Table 2-42 supersede prior generation tables. Table 2-42. Additional MSRs Supported by 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Thread Bit Description Control Features in Intel 64 Processor (R/W) See Table 2-2. 0 Lock (R/WL) 1 Enable VMX Inside SMX Operation (R/WL) 2 Enable VMX Outside SMX Operation (R/WL) 14:8 SENTER Local Functions Enables (R/WL) 15 SENTER Global Functions Enable (R/WL) 17 SGX Launch Control Enable (R/WL) This bit must be set to enable runtime reconfiguration of SGX Launch Control via IA32_SGXLEPUBKEYHASHn MSR. Available only if CPUID.(EAX=07H, ECX=0H): ECX[30] = 1. 350H 848 18 SGX Global Functions Enable (R/WL) 63:21 Reserved MSR_BR_DETECT_CTRL Branch Monitoring Global Control (R/W) 0 EnMonitoring Global enable for branch monitoring. 1 EnExcept Enable branch monitoring event signaling on threshold trip. The branch monitoring event handler is signaled via the existing PMI signaling mechanism as programmed from the corresponding local APIC LVT entry. 2 EnLBRFrz Enable LBR freeze on threshold trip. This will cause the LBR frozen bit 58 to be set in IA32_PERF_GLOBAL_STATUS when a triggering condition occurs and this bit is enabled. 3 DisableInGuest When set to ‘1’, branch monitoring, event triggering and LBR freeze actions are disabled when operating at VMX non-root operation. 7:4 2-290 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-42. Additional MSRs Supported by 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 17:8 WindowSize Window size defined by WindowCntSel. Values 0 – 1023 are supported. Once the Window counter reaches the WindowSize count both the Window Counter and all Branch Monitoring Counters are cleared. 23:18 Reserved 25:24 WindowCntSel Window event count select: ‘00 = Instructions retired. ‘01 = Branch instructions retired ‘10 = Return instructions retired. ‘11 = Indirect branch instructions retired. 26 CntAndMode When set to ‘1’, the overall branch monitoring event triggering condition is true only if all enabled counters’ threshold conditions are true. When ‘0’, the threshold tripping condition is true if any enabled counters’ threshold is true. 351H 849 63:27 Reserved MSR_BR_DETECT_STATUS Branch Monitoring Global Status (R/W) 0 Branch Monitoring Event Signaled When set to '1', Branch Monitoring event signaling is blocked until this bit is cleared by software. 1 LBRsValid This status bit is set to ‘1’ if the LBR state is considered valid for sampling by branch monitoring software. 7:2 8 Reserved CntrHit0 Branch monitoring counter #0 threshold hit. This status bit is sticky and once set requires clearing by software. Counter operation continues independent of the state of the bit. 9 CntrHit1 Branch monitoring counter #1 threshold hit. This status bit is sticky and once set requires clearing by software. Counter operation continues independent of the state of the bit. 15:10 Reserved Reserved for additional branch monitoring counters threshold hit status. Vol. 4 2-291 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-42. Additional MSRs Supported by 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 25:16 CountWindow The current value of the window counter. The count value is frozen on a valid branch monitoring triggering condition. This is a 10-bit unsigned value. 31:26 Reserved Reserved for future extension of CountWindow. 39:32 Count0 The current value of counter 0 updated after each occurrence of the event being counted. The count value is frozen on a valid branch monitoring triggering condition (in which case CntrHit0 will also be set). This is an 8-bit signed value (2’s complement). Heuristic events which only increment will saturate and freeze at maximum value 0xFF (256). RET-CALL event counter saturate at maximum value 0x7F (+127) and minimum value 0x80 (-128). 47:40 Count1 The current value of counter 1 updated after each occurrence of the event being counted. The count value is frozen on a valid branch monitoring triggering condition (in which case CntrHit1 will also be set). This is an 8-bit signed value (2’s complement). Heuristic events which only increment will saturate and freeze at maximum value 0xFF (256). RET-CALL event counter saturate at maximum value 0x7F (+127) and minimum value 0x80 (-128). 354H 355H 852 853 63:48 Reserved MSR_BR_DETECT_COUNTER_CONFIG_i Branch Monitoring Detect Counter Configuration (R/W) 0 CntrEn Enable counter. 7:1 CntrEvSel Event select (other values #GP) ‘0000000 = RETs. ‘0000001 = RET-CALL bias. ‘0000010 = RET mispredicts. ‘0000011 = Branch (all) mispredicts. ‘0000100 = Indirect branch mispredicts. ‘0000101 = Far branch instructions. 2-292 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-42. Additional MSRs Supported by 8th Generation Intel® Core™ i3 Processors Based on Cannon Lake Microarchitecture (Contd.) Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 14:8 CntrThreshold Threshold (an unsigned value of 0 to 127 supported). The value 0 of counter threshold will result in event signaled after every instruction. #GP if threshold is < 2. 15 MispredEventCnt Mispredict events counting behavior: ‘0 = Mispredict events are counted in a window. ‘1 = Mispredict events are counted based on a consecutive occurrence. CntrThreshold is treated as # of consecutive mispredicts. This control bit only applies to events specified by CntrEvSel that involve a prediction (0000010, 0000011, 0000100). Setting this bit for other events is ignored. 63:16 3F8H 1016 MSR_PKG_C3_RESIDENCY Reserved Package 63:0 620H 1568 MSR_RING_RATIO_LIMIT Package C3 Residency Counter (R/O) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. Package Ring Ratio Limit (R/W) This register provides Min/Max Ratio Limits for the LLC and Ring. 6:0 MAX_Ratio This field is used to limit the max ratio of the LLC/Ring. 7 Reserved 14:8 MIN_Ratio Writing to this field controls the minimum possible ratio of the LLC/Ring. 63:15 660H 1632 MSR_CORE_C1_RESIDENCY Reserved Core 63:0 662H 1634 MSR_CORE_C3_RESIDENCY 63:0 Core C1 Residency Counter (R/O) Value since last reset for the Core C1 residency. Counter rate is the Max Non-Turbo frequency (same as TSC). This counter counts in case both of the core's threads are in an idle state and at least one of the core's thread residency is in a C1 state or in one of its sub states. The counter is updated only after a core C state exit. Note: Always reads 0 if core C1 is unsupported. A value of zero indicates that this processor does not support core C1 or never entered core C1 level state. Core Core C3 Residency Counter (R/O) Will always return 0. Vol. 4 2-293 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-43 lists the MSRs of uncore PMU for Intel processors with CPUID signature 06_66H. Table 2-43. Uncore PMU MSRs Supported by Intel® Core™ Processors Based on Cannon Lake Microarchitecture Register Address Hex Dec 394H 916 395H 396H 917 918 Register Name / Bit Fields MSR_UNC_PERF_FIXED_CTRL Scope Package Bit Description Uncore Fixed Counter Control (R/W) 19:0 Reserved 20 Enable overflow propagation. 21 Reserved 22 Enable counting. 63:23 Reserved MSR_UNC_PERF_FIXED_CTR Package Uncore Fixed Counter 47:0 Current count. 63:48 Reserved MSR_UNC_CBO_CONFIG Package Uncore C-Box Configuration Information (R/O) 3:0 Report the number of C-Box units with performance counters, including processor cores and processor graphics. 63:4 Reserved 3B0H 946 MSR_UNC_ARB_PERFCTR0 Package Uncore Arb Unit, Performance Counter 0 3B1H 947 MSR_UNC_ARB_PERFCTR1 Package Uncore Arb Unit, Performance Counter 1 3B2H 944 MSR_UNC_ARB_PERFEVTSEL0 Package Uncore Arb Unit, Counter 0 Event Select MSR 3B3H 945 MSR_UNC_ARB_PERFEVTSEL1 Package Uncore Arb unit, Counter 1 Event Select MSR 700H 1792 MSR_UNC_CBO_0_PERFEVTSEL0 Package Uncore C-Box 0, Counter 0 Event Select MSR 701H 1793 MSR_UNC_CBO_0_PERFEVTSEL1 Package Uncore C-Box 0, Counter 1 Event Select MSR 702H 1794 MSR_UNC_CBO_0_PERFCTR0 Package Uncore C-Box 0, Performance Counter 0 703H 1795 MSR_UNC_CBO_0_PERFCTR1 Package Uncore C-Box 0, Performance Counter 1 708H 1800 MSR_UNC_CBO_1_PERFEVTSEL0 Package Uncore C-Box 1, Counter 0 Event Select MSR 709H 1801 MSR_UNC_CBO_1_PERFEVTSEL1 Package Uncore C-Box 1, Counter 1 Event Select MSR 70AH 1802 MSR_UNC_CBO_1_PERFCTR0 Package Uncore C-Box 1, Performance Counter 0 70BH 1803 MSR_UNC_CBO_1_PERFCTR1 Package Uncore C-Box 1, Performance Counter 1 710H 1808 MSR_UNC_CBO_2_PERFEVTSEL0 Package Uncore C-Box 2, Counter 0 Event Select MSR 711H 1809 MSR_UNC_CBO_2_PERFEVTSEL1 Package Uncore C-Box 2, Counter 1 Event Select MSR 712H 1810 MSR_UNC_CBO_2_PERFCTR0 Package Uncore C-Box 2, Performance Counter 0 713H 1811 MSR_UNC_CBO_2_PERFCTR1 Package Uncore C-Box 2, Performance Counter 1 718H 1816 MSR_UNC_CBO_3_PERFEVTSEL0 Package Uncore C-Box 3, Counter 0 Event Select MSR 719H 1817 MSR_UNC_CBO_3_PERFEVTSEL1 Package Uncore C-Box 3, Counter 1 Event Select MSR 71AH 1818 MSR_UNC_CBO_3_PERFCTR0 Package Uncore C-Box 3, Performance Counter 0 71BH 1819 MSR_UNC_CBO_3_PERFCTR1 Package Uncore C-Box 3, Performance Counter 1 2-294 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-43. Uncore PMU MSRs Supported by Intel® Core™ Processors Based on Cannon Lake Microarchitecture Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 720H 1824 MSR_UNC_CBO_4_PERFEVTSEL0 Package Uncore C-Box 4, Counter 0 Event Select MSR 721H 1825 MSR_UNC_CBO_4_PERFEVTSEL1 Package Uncore C-Box 4, Counter 1 Event Select MSR 722H 1826 MSR_UNC_CBO_4_PERFCTR0 Package Uncore C-Box 4, Performance Counter 0 723H 1827 MSR_UNC_CBO_4_PERFCTR1 Package Uncore C-Box 4, Performance Counter 1 728H 1832 MSR_UNC_CBO_5_PERFEVTSEL0 Package Uncore C-Box 5, Counter 0 Event Select MSR 729H 1833 MSR_UNC_CBO_5_PERFEVTSEL1 Package Uncore C-Box 5, Counter 1 Event Select MSR 72AH 1834 MSR_UNC_CBO_5_PERFCTR0 Package Uncore C-Box 5, Performance Counter 0 72BH 1835 MSR_UNC_CBO_5_PERFCTR1 Package Uncore C-Box 5, Performance Counter 1 730H 1840 MSR_UNC_CBO_6_PERFEVTSEL0 Package Uncore C-Box 6, Counter 0 Event Select MSR 731H 1841 MSR_UNC_CBO_6_PERFEVTSEL1 Package Uncore C-Box 6, Counter 1 Event Select MSR 732H 1842 MSR_UNC_CBO_6_PERFCTR0 Package Uncore C-Box 6, Performance Counter 0 733H 1843 MSR_UNC_CBO_6_PERFCTR1 Package Uncore C-Box 6, Performance Counter 1 738H 1848 MSR_UNC_CBO_7_PERFEVTSEL0 Package Uncore C-Box 7, Counter 0 Event Select MSR 739H 1849 MSR_UNC_CBO_7_PERFEVTSEL1 Package Uncore C-Box 7, Counter 1 Event Select MSR 73AH 1850 MSR_UNC_CBO_7_PERFCTR0 Package Uncore C-Box 7, Performance Counter 0 73BH 1851 MSR_UNC_CBO_7_PERFCTR1 Package Uncore C-Box 7, Performance Counter 1 E01H 3585 MSR_UNC_PERF_GLOBAL_CTRL Package Uncore PMU Global Control E02H 3586 0 Slice 0 select. 1 Slice 1 select. 2 Slice 2 select. 3 Slice 3 select. 4 Slice 4select. 18:5 Reserved 29 Enable all uncore counters. 30 Enable wake on PMI. 31 Enable Freezing counter when overflow. 63:32 Reserved MSR_UNC_PERF_GLOBAL_STATUS Package Uncore PMU Main Status 0 Fixed counter overflowed. 1 An ARB counter overflowed. 2 Reserved 3 A CBox counter overflowed (on any slice). 63:4 Reserved Table 2-44 lists MSRs for Future Intel Core processors based on Ice Lake microarchitecture. Vol. 4 2-295 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-44. MSRs Supported by Future Intel® Core™ Processors Based on Ice Lake Microarchitecture Register Address Hex Dec 33H 51 Register Name / Bit Fields TEST_CTRL Scope Core Bit Description Test Control Register 28:0 Reserved. 29 Enable #AC(0) exception for split locked accesses: Cause #AC(0) exception for split locked access at all CPL irrespective of CR0.AM or EFLAGS.AC. If bits 29 and 31 are both set, bit 29 takes precedence. 329H 809 30 Reserved. 31 Disable LOCK# assertion for split locked access. MSR_PERF_METRICS Thread Performance Metrics (R/W) Reports metrics directly. Software can check (and/or expose to its guests) the availability of PERF_METRICS feature using IA32_PERF_CAPABILITIES.PERF_METRICS_AVAILABL E (bit 15). 3F2H 1010 7:0 Retiring. 15:8 Bad Speculation. 23:16 Frontend Bound. 31:24 Backend Bound. 63:25 Reserved. MSR_PEBS_DATA_CFG Thread PEBS Data Configuration (R/W) Provides software the capability to select data groups of interest and thus reduce the record size in memory and record generation latency. Hence, a PEBS record's size and layout vary based on the selected groups. The MSR also allows software to select LBR depth for branch data records. 0 Memory Info. Setting this bit will capture memory information such as the linear address, data source and latency of the memory access in the PEBS record. 1 GPRs. Setting this bit will capture the contents of the General Purpose registers in the PEBS record. 2 XMMs. Setting this bit will capture the contents of the XMM registers in the PEBS record. 3 LBRs. Setting this bit will capture LBR TO, FROM and INFO in the PEBS record. 23:4 2-296 Vol. 4 Reserved. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-44. MSRs Supported by Future Intel® Core™ Processors Based on Ice Lake Microarchitecture Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 31:24 LBR Entries. Set the field to the desired number of entries - 1. For example, if the LBR_entries field is 0, a single entry will be included in the record. To include 32 LBR entries, set the LBR_entries field to 31 (0x1F). To ensure all PEBS records are 16-byte aligned, software can use LBR_entries that is multiple of 3. 2.17.3 Intel® MSRs Specific to Intel® Xeon® Processor Scalable Family Xeon® Processor Scalable Family (CPUID DisplayFamily_DisplayModel = 06_55H) support the MSRs listed in Table 2-45. Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Dec 3AH 58 Register Name / Bit Fields IA32_FEATURE_CONTROL Scope Thread Bit Description Control Features in Intel 64 Processor (R/W) See Table 2-2. 4EH 78 0 Lock (R/WL) 1 Enable VMX Inside SMX Operation (R/WL) 2 Enable VMX Outside SMX Operation (R/WL) 14:8 SENTER Local Functions Enables (R/WL) 15 SENTER Global Functions Enable (R/WL) 18 SGX Global Functions Enable (R/WL) 20 LMCE_ON (R/WL) 63:21 Reserved MSR_PPIN_CTL Package 0 Protected Processor Inventory Number Enable Control (R/W) LockOut (R/WO) See Table 2-26. 1 Enable_PPIN (R/W) See Table 2-26. 63:2 4FH 79 MSR_PPIN Reserved Package 63:0 Protected Processor Inventory Number (R/O) Protected Processor Inventory Number (R/O) See Table 2-26. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 Reserved Vol. 4 2-297 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 15:8 Package Maximum Non-Turbo Ratio (R/O) See Table 2-26. 22:16 23 Reserved. Package PPIN_CAP (R/O) See Table 2-26. 27:24 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) See Table 2-26. 29 Package Programmable TDP Limit for Turbo Mode (R/O) See Table 2-26. 30 Package Programmable TJ OFFSET (R/O) See Table 2-26. 39:31 47:40 Reserved Package Maximum Efficiency Ratio (R/O) See Table 2-26. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Core C-State Configuration Control (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. See http://biosbits.org. 2:0 Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package Cstate limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-state support) 001b: C2 010b: C6 (non-retention) 011b: C6 (retention) 111b: No Package C state limits. All C states supported by the processor are available. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) 14:11 Reserved 15 CFG Lock (R/WO) 16 Automatic C-State Conversion Enable (R/W) If 1, the processor will convert HALT or MWAT(C1) to MWAIT(C6). 2-298 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex 179H 17DH Register Name / Bit Fields Scope Bit Description Dec 377 390 24:17 Reserved 25 C3 State Auto Demotion Enable (R/W) 26 C1 State Auto Demotion Enable (R/W) 27 Enable C3 Undemotion (R/W) 28 Enable C1 Undemotion (R/W) 29 Package C State Demotion Enable (R/W) 30 Package C State UnDemotion Enable (R/W) 63:31 Reserved IA32_MCG_CAP Thread Global Machine Check Capability (R/O) 7:0 Count 8 MCG_CTL_P 9 MCG_EXT_P 10 MCP_CMCI_P 11 MCG_TES_P 15:12 Reserved 23:16 MCG_EXT_CNT 24 MCG_SER_P 25 MCG_EM_P 26 MCG_ELOG_P 63:27 Reserved MSR_SMM_MCA_CAP THREAD Enhanced SMM Capabilities (SMM-RO) Reports SMM capability Enhancement. Accessible only while in SMM. 57:0 Reserved 58 SMM_Code_Access_Chk (SMM-RO) If set to 1 indicates that the SMM code access restriction is supported and a host-space interface is available to SMM handler. 59 Long_Flow_Indication (SMM-RO) If set to 1 indicates that the SMM long flow indicator is supported and a host-space interface is available to SMM handler. 63:60 19CH 412 IA32_THERM_STATUS Reserved Core Thermal Monitor Status (R/W) See Table 2-2. 0 Thermal Status (RO) See Table 2-2. 1 Thermal Status Log (R/WC0) See Table 2-2. Vol. 4 2-299 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2 PROTCHOT # or FORCEPR# Status (RO) See Table 2-2. 3 PROTCHOT # or FORCEPR# Log (R/WC0) See Table 2-2. 4 Critical Temperature Status (RO) See Table 2-2. 5 Critical Temperature Status Log (R/WC0) See Table 2-2. 6 Thermal Threshold #1 Status (RO) See Table 2-2. 7 Thermal Threshold #1 Log (R/WC0) See Table 2-2. 8 Thermal Threshold #2 Status (RO) See Table 2-2. 9 Thermal Threshold #2 Log (R/WC0) See Table 2-2. 10 Power Limitation Status (RO) See Table 2-2. 11 Power Limitation Log (R/WC0) See Table 2-2. 12 Current Limit Status (RO) See Table 2-2. 13 Current Limit Log (R/WC0) See Table 2-2. 14 Cross Domain Limit Status (RO) See Table 2-2. 15 Cross Domain Limit Log (R/WC0) See Table 2-2. 22:16 Digital Readout (RO) See Table 2-2. 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) See Table 2-2. 31 Reading Valid (RO) See Table 2-2. 63:32 1A2H 418 MSR_TEMPERATURE_TARGET 15:0 2-300 Vol. 4 Reserved Package Temperature Target Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23:16 Temperature Target (RO) See Table 2-26. 27:24 TCC Activation Offset (R/W) See Table 2-26. 63:28 1ADH 429 MSR_TURBO_RATIO_LIMIT Reserved Package This register defines the ratio limits. RATIO[0:7] must be populated in ascending order. RATIO[i+1] must be less than or equal to RATIO[i]. Entries with RATIO[i] will be ignored. If any of the rules above are broken, the configuration is silently rejected. If the programmed ratio is: • Above the fused ratio for that core count, it will be clipped to the fuse limits (assuming !OC). • Below the min supported ratio, it will be clipped. 7:0 RATIO_0 Defines ratio limits. 15:8 RATIO_1 Defines ratio limits. 23:16 RATIO_2 Defines ratio limits. 31:24 RATIO_3 Defines ratio limits. 39:32 RATIO_4 Defines ratio limits. 47:40 RATIO_5 Defines ratio limits. 55:48 RATIO_6 Defines ratio limits. 63:56 RATIO_7 Defines ratio limits. 1AEH 430 MSR_TURBO_RATIO_LIMIT_CORES 7:0 Package This register defines the active core ranges for each frequency point. NUMCORE[0:7] must be populated in ascending order. NUMCORE[i+1] must be greater than NUMCORE[i]. Entries with NUMCORE[i] == 0 will be ignored. The last valid entry must have NUMCORE >= the number of cores in the SKU. If any of the rules above are broken, the configuration is silently rejected. NUMCORE_0 Defines the active core ranges for each frequency point. 15:8 NUMCORE_1 Defines the active core ranges for each frequency point. Vol. 4 2-301 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23:16 NUMCORE_2 Defines the active core ranges for each frequency point. 31:24 NUMCORE_3 Defines the active core ranges for each frequency point. 39:32 NUMCORE_4 Defines the active core ranges for each frequency point. 47:40 NUMCORE_5 Defines the active core ranges for each frequency point. 55:48 NUMCORE_6 Defines the active core ranges for each frequency point. 63:56 NUMCORE_7 Defines the active core ranges for each frequency point. 280H 640 IA32_MC0_CTL2 Core See Table 2-2. 281H 641 IA32_MC1_CTL2 Core See Table 2-2. 282H 642 IA32_MC2_CTL2 Core See Table 2-2. 283H 643 IA32_MC3_CTL2 Core See Table 2-2. 284H 644 IA32_MC4_CTL2 Package See Table 2-2. 285H 645 IA32_MC5_CTL2 Package See Table 2-2. 286H 646 IA32_MC6_CTL2 Package See Table 2-2. 287H 647 IA32_MC7_CTL2 Package See Table 2-2. 288H 648 IA32_MC8_CTL2 Package See Table 2-2. 289H 649 IA32_MC9_CTL2 Package See Table 2-2. 28AH 650 IA32_MC10_CTL2 Package See Table 2-2. 28BH 651 IA32_MC11_CTL2 Package See Table 2-2. 28CH 652 IA32_MC12_CTL2 Package See Table 2-2. 28DH 653 IA32_MC13_CTL2 Package See Table 2-2. 28EH 654 IA32_MC14_CTL2 Package See Table 2-2. 28FH 655 IA32_MC15_CTL2 Package See Table 2-2. 290H 656 IA32_MC16_CTL2 Package See Table 2-2. 291H 657 IA32_MC17_CTL2 Package See Table 2-2. 292H 658 IA32_MC18_CTL2 Package See Table 2-2. 293H 659 IA32_MC19_CTL2 Package See Table 2-2. 2-302 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Register Name / Bit Fields Scope Hex Dec 400H 1024 IA32_MC0_CTL Core 401H 1025 IA32_MC0_STATUS Core 402H 1026 IA32_MC0_ADDR Core 403H 1027 IA32_MC0_MISC Core 404H 1028 IA32_MC1_CTL Core 405H 1029 IA32_MC1_STATUS Core 406H 1030 IA32_MC1_ADDR Core 407H 1031 IA32_MC1_MISC Core 408H 1032 IA32_MC2_CTL Core 409H 1033 IA32_MC2_STATUS Core 40AH 1034 IA32_MC2_ADDR Core 40BH 1035 IA32_MC2_MISC Core 40CH 1036 IA32_MC3_CTL Core 40DH 1037 IA32_MC3_STATUS Core 40EH 1038 IA32_MC3_ADDR Core 40FH 1039 IA32_MC3_MISC Core 410H 1040 IA32_MC4_CTL Package 411H 1041 IA32_MC4_STATUS Package 412H 1042 IA32_MC4_ADDR Package 413H 1043 IA32_MC4_MISC Package 414H 1044 IA32_MC5_CTL Package 415H 1045 IA32_MC5_STATUS Package 416H 1046 IA32_MC5_ADDR Package 417H 1047 IA32_MC5_MISC Package 418H 1048 IA32_MC6_CTL Package 419H 1049 IA32_MC6_STATUS Package 41AH 1050 IA32_MC6_ADDR Package 41BH 1051 IA32_MC6_MISC Package 41CH 1052 IA32_MC7_CTL Package 41DH 1053 IA32_MC7_STATUS Package 41EH 1054 IA32_MC7_ADDR Package 41FH 1055 IA32_MC7_MISC Package 420H 1056 IA32_MC8_CTL Package 421H 1057 IA32_MC8_STATUS Package 422H 1058 IA32_MC8_ADDR Package 423H 1059 IA32_MC8_MISC Package Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC0 reports MC errors from the IFU module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC1 reports MC errors from the DCU module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC2 reports MC errors from the DTLB module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC3 reports MC errors from the MLC module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC4 reports MC errors from the PCU module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC5 reports MC errors from a link interconnect module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC6 reports MC errors from the integrated I/O module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC7 reports MC errors from the M2M 0. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC8 reports MC errors from the M2M 1. Vol. 4 2-303 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Register Name / Bit Fields Scope Hex Dec 424H 1060 IA32_MC9_CTL Package 425H 1061 IA32_MC9_STATUS Package 426H 1062 IA32_MC9_ADDR Package 427H 1063 IA32_MC9_MISC Package 428H 1064 IA32_MC10_CTL Package 429H 1065 IA32_MC10_STATUS Package 42AH 1066 IA32_MC10_ADDR Package 42BH 1067 IA32_MC10_MISC Package 42CH 1068 IA32_MC11_CTL Package 42DH 1069 IA32_MC11_STATUS Package 42EH 1070 IA32_MC11_ADDR Package 42FH 1071 IA32_MC11_MISC Package 430H 1072 IA32_MC12_CTL Package 431H 1073 IA32_MC12_STATUS Package 432H 1074 IA32_MC12_ADDR Package 433H 1075 IA32_MC12_MISC Package 434H 1076 IA32_MC13_CTL Package 435H 1077 IA32_MC13_STATUS Package 436H 1078 IA32_MC13_ADDR Package 437H 1079 IA32_MC13_MISC Package 438H 1080 IA32_MC14_CTL Package 439H 1081 IA32_MC14_STATUS Package 43AH 1082 IA32_MC14_ADDR Package 43BH 1083 IA32_MC14_MISC Package 43CH 1084 IA32_MC15_CTL Package 43DH 1085 IA32_MC15_STATUS Package 43EH 1086 IA32_MC15_ADDR Package 43FH 1087 IA32_MC15_MISC Package 440H 1088 IA32_MC16_CTL Package 441H 1089 IA32_MC16_STATUS Package 442H 1090 IA32_MC16_ADDR Package 443H 1091 IA32_MC16_MISC Package 444H 1092 IA32_MC17_CTL Package 445H 1093 IA32_MC17_STATUS Package 446H 1094 IA32_MC17_ADDR Package 447H 1095 IA32_MC17_MISC Package 2-304 Vol. 4 Bit Description See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 - MC11 report MC errors from the CHA See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 - MC11 report MC errors from the CHA. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC9 - MC11 report MC errors from the CHA. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC12 report MC errors from each channel of a link interconnect module. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 448H 1096 IA32_MC18_CTL Package 449H 1097 IA32_MC18_STATUS Package 44AH 1098 IA32_MC18_ADDR Package 44BH 1099 IA32_MC18_MISC Package 44CH 1100 IA32_MC19_CTL Package 44DH 1101 IA32_MC19_STATUS Package 44EH 1102 IA32_MC19_ADDR Package 44FH 1103 IA32_MC19_MISC Package 606H 1542 MSR_RAPL_POWER_UNIT Package Unit Multipliers Used in RAPL Interfaces (R/O) 3:0 Package Power Units See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Banks MC13 through MC 18 report MC errors from the integrated memory controllers. See Section 15.3.2.1, “IA32_MCi_CTL MSRs” through Section 15.3.2.4, “IA32_MCi_MISC MSRs.”. Bank MC19 reports MC errors from a link interconnect module. See Section 14.9.1, “RAPL Interfaces.” 7:4 Package Reserved 12:8 Package Energy Status Units Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 0EH (or 61 micro-joules). 15:13 Package Reserved 19:16 Package Time Units See Section 14.9.1, “RAPL Interfaces.” 63:20 Reserved 618H 1560 MSR_DRAM_POWER_LIMIT Package 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” DRAM Energy Status (R/O) Energy consumed by DRAM devices. 31:0 Energy in 15.3 micro-joules. Requires BIOS configuration to enable DRAM RAPL mode 0 (Direct VR). 63:32 61BH 1563 MSR_DRAM_PERF_STATUS Reserved Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) 620H 1568 MSR UNCORE_RATIO_LIMIT Package Uncore Ratio Limit (R/W) See Section 14.9.5, “DRAM RAPL Domain.” Out of reset, the min_ratio and max_ratio fields represent the widest possible range of uncore frequencies. Writing to these fields allows software to control the minimum and the maximum frequency that hardware will select. 63:15 Reserved Vol. 4 2-305 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 14:8 MIN_RATIO Writing to this field controls the minimum possible ratio of the LLC/Ring. 7 Reserved 6:0 MAX_RATIO This field is used to limit the max ratio of the LLC/Ring. 639H 1593 MSR_PP0_ENERGY_STATUS Package Reserved (R/O) Reads return 0. C8DH 3213 IA32_QM_EVTSEL THREAD Monitoring Event Select Register (R/W) If CPUID.(EAX=07H, ECX=0):EBX.RDT-M[bit 12] = 1. 7:0 EventID (RW) Event encoding: 0x00: No monitoring. 0x01: L3 occupancy monitoring. 0x02: Total memory bandwidth monitoring. 0x03: Local memory bandwidth monitoring. All other encoding reserved. C8FH C90H 3215 3216 31:8 Reserved 41:32 RMID (RW) 63:42 Reserved IA32_PQR_ASSOC THREAD Resource Association Register (R/W) 9:0 RMID 31:10 Reserved 51:32 COS (R/W) 63: 52 Reserved IA32_L3_QOS_MASK_0 Package L3 Class Of Service Mask - COS 0 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=0. C91H 3217 0:19 CBM: Bit vector of available L3 ways for COS 0 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_1 Package L3 Class Of Service Mask - COS 1 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=1. C92H 3218 0:19 CBM: Bit vector of available L3 ways for COS 1 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_2 Package L3 Class Of Service Mask - COS 2 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=2. 2-306 Vol. 4 0:19 CBM: Bit vector of available L3 ways for COS 2 enforcement. 63:20 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Dec C93H 3219 Register Name / Bit Fields IA32_L3_QOS_MASK_3 Scope Package Bit Description L3 Class Of Service Mask - COS 3 (R/W). If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=3. C94H 3220 0:19 CBM: Bit vector of available L3 ways for COS 3 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_4 Package L3 Class Of Service Mask - COS 4 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=4. C95H 3221 0:19 CBM: Bit vector of available L3 ways for COS 4 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_5 Package L3 Class Of Service Mask - COS 5 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=5. C96H 3222 0:19 CBM: Bit vector of available L3 ways for COS 5 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_6 Package L3 Class Of Service Mask - COS 6 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=6. C97H 3223 0:19 CBM: Bit vector of available L3 ways for COS 6 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_7 Package L3 Class Of Service Mask - COS 7 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=7. C98H 3224 0:19 CBM: Bit vector of available L3 ways for COS 7 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_8 Package L3 Class Of Service Mask - COS 8 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=8. C99H 3225 0:19 CBM: Bit vector of available L3 ways for COS 8 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_9 Package L3 Class Of Service Mask - COS 9 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=9. C9AH 3226 0:19 CBM: Bit vector of available L3 ways for COS 9 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_10 Package L3 Class Of Service Mask - COS 10 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=10. 0:19 CBM: Bit vector of available L3 ways for COS 10 enforcement. 63:20 Reserved Vol. 4 2-307 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-45. MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H Register Address Hex Dec C9BH 3227 Register Name / Bit Fields IA32_L3_QOS_MASK_11 Scope Package Bit Description L3 Class Of Service Mask - COS 11 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=11. C9CH 3228 0:19 CBM: Bit vector of available L3 ways for COS 11 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_12 Package L3 Class Of Service Mask - COS 12 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=12. C9DH 3229 0:19 CBM: Bit vector of available L3 ways for COS 12 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_13 Package L3 Class Of Service Mask - COS 13 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=13. C9EH 3230 0:19 CBM: Bit vector of available L3 ways for COS 13 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_14 Package L3 Class Of Service Mask - COS 14 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=14. C9FH 3231 0:19 CBM: Bit vector of available L3 ways for COS 14 enforcement. 63:20 Reserved IA32_L3_QOS_MASK_15 Package L3 Class Of Service Mask - COS 15 (R/W) If CPUID.(EAX=10H, ECX=1):EDX.COS_MAX[15:0] >=15. 2.18 0:19 CBM: Bit vector of available L3 ways for COS 15 enforcement. 63:20 Reserved MSRS IN INTEL® XEON PHI™ PROCESSOR 3200/5200/7200 SERIES AND INTEL® XEON PHI™ PROCESSOR 7215/7285/7295 SERIES Intel® Xeon Phi™ processor 3200, 5200, 7200 series, with CPUID DisplayFamily_DisplayModel signature 06_57H, supports the MSR interfaces listed in Table 2-46. These processors are based on the Knights Landing microarchitecture. Intel® Xeon Phi™ processor 7215, 7285, 7295 series, with CPUID DisplayFamily_DisplayModel signature 06_85H, supports the MSR interfaces listed in Table 2-46 and Table 2-47. These processors are based on the Knights Mill microarchitecture. Some MSRs are shared between a pair of processor cores, the scope is marked as module. 2-308 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR Module See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE Module See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_SIZE Thread See Section 8.10.5, “Monitor/Mwait Address Range Determination.” See Table 2-2. 10H 16 IA32_TIME_STAMP_COUNTER Thread See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Package Platform ID (R) See Table 2-2. 1BH 27 IA32_APIC_BASE Thread See Section 10.4.4, “Local APIC Status and Location,” and Table 2-2. 34H 52 MSR_SMI_COUNT Thread SMI Counter (R/O) 3AH 58 31:0 SMI Count (R/O) 63:32 Reserved IA32_FEATURE_CONTROL Thread Control Features in Intel 64Processor (R/W) See Table 2-2. 3BH 59 0 Lock (R/WL) 1 Reserved 2 Enable VMX outside SMX operation (R/WL) IA32_TSC_ADJUST THREAD Per-Logical-Processor TSC ADJUST (R/W) See Table 2-2. 4EH 78 MSR_PPIN_CTL Package 0 Protected Processor Inventory Number Enable Control (R/W) LockOut (R/WO) Set 1 to prevent further writes to MSR_PPIN_CTL. Writing 1 to MSR_PPINCTL[bit 0] is permitted only if MSR_PPIN_CTL[bit 1] is clear. Default is 0. BIOS should provide an opt-in menu to enable the user to turn on MSR_PPIN_CTL[bit 1] for a privileged inventory initialization agent to access MSR_PPIN. After reading MSR_PPIN, the privileged inventory initialization agent should write ‘01b’ to MSR_PPIN_CTL to disable further access to MSR_PPIN and prevent unauthorized modification to MSR_PPIN_CTL. 1 Enable_PPIN (R/W) If 1, enables MSR_PPIN to be accessible using RDMSR. Once set, an attempt to write 1 to MSR_PPIN_CTL[bit 0] will cause #GP. If 0, an attempt to read MSR_PPIN will cause #GP. Default is 0. 63:2 4FH 79 MSR_PPIN Reserved Package Protected Processor Inventory Number (R/O) Vol. 4 2-309 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 63:0 Protected Processor Inventory Number (R/O) A unique value within a given CPUID family/model/stepping signature that a privileged inventory initialization agent can access to identify each physical processor, when access to MSR_PPIN is enabled. Access to MSR_PPIN is permitted only if MSR_PPIN_CTL[bits 1:0] = ‘10b’. 79H 121 IA32_BIOS_UPDT_TRIG Core BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID THREAD BIOS Update Signature ID (RO) See Table 2-2. C1H 193 IA32_PMC0 THREAD Performance Counter Register See Table 2-2. C2H 194 IA32_PMC1 THREAD Performance Counter Register See Table 2-2. CEH 206 MSR_PLATFORM_INFO Package Platform Information Contains power management and other model specific features enumeration. See http://biosbits.org. 7:0 15:8 Reserved Package Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency that invariant TSC runs at. Frequency = ratio * 100 MHz. 27:16 28 Reserved Package Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limit for Turbo mode is enabled. When set to 0, indicates Programmable Ratio Limit for Turbo mode is disabled. 29 Package Programmable TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDP Limit for Turbo mode is programmable. When set to 0, indicates TDP Limit for Turbo mode is not programmable. 39:30 47:40 Reserved Package Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) that the processor can operate, in units of 100MHz. 63:48 E2H 2-310 Vol. 4 226 MSR_PKG_CST_CONFIG_CONTROL Reserved Package C-State Configuration Control (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 2:0 Package C-State Limit (R/W) Specifies the lowest C-state for the package. This feature does not limit the processor core C-state. The power-on default value from bit[2:0] of this register reports the deepest package C-state the processor is capable to support when manufactured. It is recommended that BIOS always read the power-on default value reported from this bit field to determine the supported deepest C-state on the processor and leave it as default without changing it. 000b - C0/C1 (No package C-state support) 001b - C2 010b - C6 (non retention)* 011b - C6 (Retention)* 100b - Reserved 101b - Reserved 110b - Reserved 111b - No package C-state limit. All C-States supported by the processor are available. Note: C6 retention mode provides more power saving than C6 non-retention mode. Limiting the package to C6 non retention mode does prevent the MSR_PKG_C6_RESIDENCY counter (MSR 3F9h) from being incremented. 9:3 Reserved 10 I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO registers at MSR_PMG_IO_CAPTURE_BASE[15:0] to MWAIT instructions. 14:11 Reserved 15 CFG Lock (RO) When set, locks bits [15:0] of this register for further writes until the next reset occurs. 25 Reserved 26 C1 State Auto Demotion Enable (R/W) When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore auto-demote information. 27 Reserved 28 C1 State Auto Undemotion Enable (R/W) When set, enables Undemotion from Demoted C1. 29 PKG C-State Auto Demotion Enable (R/W) When set, enables Package C state demotion. 63:30 Reserved Vol. 4 2-311 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Dec E4H 228 Register Name / Bit Fields MSR_PMG_IO_CAPTURE_BASE Scope Tile 15:0 Bit Description Power Management IO Capture Base (R/W) LVL_2 Base Address (R/W) Microcode will compare IO-read zone to this base address to determine if an MWAIT(C2/3/4) needs to be issued instead of the IO-read. Should be programmed to the chipset Plevel_2 IO address. 22:16 C-State Range (R/W) The IO-port block size in which IO-redirection will be executed (0-127). Should be programmed based on the number of LVLx registers existing in the chipset. 63:23 Reserved E7H 231 IA32_MPERF Thread E8H 232 IA32_APERF Thread Maximum Performance Frequency Clock Count (RW) See Table 2-2. Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Core 13CH 52 MSR_FEATURE_CONFIG Core Memory Type Range Register (R) See Table 2-2. AES Configuration (RW-L) Privileged post-BIOS agent must provide a #GP handler to handle unsuccessful read of this MSR. 1:0 AES Configuration (RW-L) Upon a successful read of this MSR, the configuration of AES instruction set availability is as follows: 11b: AES instructions are not available until next RESET. Otherwise, AES instructions are available. Note, the AES instruction set is not available if read is unsuccessful. If the configuration is not 01b, AES instructions can be mis-configured if a privileged agent unintentionally writes 11b. 63:2 140H 320 MISC_FEATURE_ENABLES Reserved Thread MISC_FEATURE_ENABLES 0 Reserved 1 User Mode MONITOR and MWAIT (R/W) If set to 1, the MONITOR and MWAIT instructions do not cause invalid-opcode exceptions when executed with CPL > 0 or in virtual-8086 mode. If MWAIT is executed when CPL > 0 or in virtual-8086 mode, and if EAX indicates a C-state other than C0 or C1, the instruction operates as if EAX indicated the C-state C1. 63:2 Reserved 174H 372 IA32_SYSENTER_CS Thread See Table 2-2. 175H 373 IA32_SYSENTER_ESP Thread See Table 2-2. 2-312 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 176H 374 IA32_SYSENTER_EIP Thread See Table 2-2. 179H 377 IA32_MCG_CAP Thread See Table 2-2. 17AH 378 IA32_MCG_STATUS Thread See Table 2-2. 17DH 390 MSR_SMM_MCA_CAP Thread Enhanced SMM Capabilities (SMM-RO) Reports SMM capability Enhancement. Accessible only while in SMM. 31:0 Bank Support (SMM-RO) One bit per MCA bank. If the bit is set, that bank supports Enhanced MCA (Default all 0; does not support EMCA). 55:32 Reserved 56 Targeted SMI (SMM-RO) Set if targeted SMI is supported. 57 SMM_CPU_SVRSTR (SMM-RO) Set if SMM SRAM save/restore feature is supported. 58 SMM_CODE_ACCESS_CHK (SMM-RO) Set if SMM code access check feature is supported. 59 Long_Flow_Indication (SMM-RO) If set to 1, indicates that the SMM long flow indicator is supported and a host-space interface available to SMM handler. 63:60 186H 390 IA32_PERFEVTSEL0 Reserved Thread Performance Monitoring Event Select Register (R/W) See Table 2-2. 7:0 Event Select 15:8 UMask 16 USR 17 OS 18 Edge 19 PC 20 INT 21 AnyThread 22 EN 23 INV 31:24 CMASK 63:32 Reserved 187H 391 IA32_PERFEVTSEL1 Thread See Table 2-2. 198H 408 IA32_PERF_STATUS Package See Table 2-2. Vol. 4 2-313 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 199H 409 IA32_PERF_CTL Thread See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Thread Clock Modulation (R/W) See Table 2-2. 19BH 411 IA32_THERM_INTERRUPT Module Thermal Interrupt Control (R/W) See Table 2-2. 19CH 412 IA32_THERM_STATUS Module Thermal Monitor Status (R/W) See Table 2-2. 1A0H 416 0 Thermal Status (RO) 1 Thermal Status Log (R/WC0) 2 PROTCHOT # or FORCEPR# Status (RO) 3 PROTCHOT # or FORCEPR# Log (R/WC0) 4 Critical Temperature Status (RO) 5 Critical Temperature Status Log (R/WC0) 6 Thermal Threshold #1 Status (RO) 7 Thermal Threshold #1 Log (R/WC0) 8 Thermal Threshold #2 Status (RO) 9 Thermal Threshold #2 Log (R/WC0) 10 Power Limitation Status (RO) 11 Power Limitation Log (RWC0) 15:12 Reserved 22:16 Digital Readout (RO) 26:23 Reserved 30:27 Resolution in Degrees Celsius (RO) 31 Reading Valid (RO) 63:32 Reserved IA32_MISC_ENABLE Thread Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 2-314 Vol. 4 0 Fast-Strings Enable 2:1 Reserved 3 Automatic Thermal Control Circuit Enable (R/W) 6:4 Reserved 7 Performance Monitoring Available (R) 10:8 Reserved 11 Branch Trace Storage Unavailable (RO) 12 Processor Event Based Sampling Unavailable (RO) 15:13 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex 1A2H 1A4H Register Name / Bit Fields Scope Bit Description Dec 418 420 16 Enhanced Intel SpeedStep Technology Enable (R/W) 18 ENABLE MONITOR FSM (R/W) 21:19 Reserved 22 Limit CPUID Maxval (R/W) 23 xTPR Message Disable (R/W) 33:24 Reserved 34 XD Bit Disable (R/W) 37:35 Reserved 38 Turbo Mode Disable (R/W) 63:39 Reserved MSR_TEMPERATURE_TARGET Package Temperature Target 15:0 Reserved 23:16 Temperature Target (R) 29:24 Target Offset (R/W) 63:30 Reserved MSR_MISC_FEATURE_CONTROL Miscellaneous Feature Control (R/W) 0 Core DCU Hardware Prefetcher Disable (R/W) If 1, disables the L1 data cache prefetcher. 1 Core L2 Hardware Prefetcher Disable (R/W) If 1, disables the L2 hardware prefetcher. 63:2 Reserved 1A6H 422 MSR_OFFCORE_RSP_0 Shared Offcore Response Event Select Register (R/W) 1A7H 423 MSR_OFFCORE_RSP_1 Shared Offcore Response Event Select Register (R/W) 1ADH 429 MSR_TURBO_RATIO_LIMIT Package Maximum Ratio Limit of Turbo Mode for Groups of Cores (RW) 0 7:1 Reserved Package Maximum Number of Cores in Group 0 Number active processor cores which operates under the maximum ratio limit for group 0. 15:8 Package Maximum Ratio Limit for Group 0 Maximum turbo ratio limit when the number of active cores are not more than the group 0 maximum core count. 20:16 Package Number of Incremental Cores Added to Group 1 Group 1, which includes the specified number of additional cores plus the cores in group 0, operates under the group 1 turbo max ratio limit = “group 0 Max ratio limit” - “group ratio delta for group 1”. Vol. 4 2-315 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 23:21 Package Group Ratio Delta for Group 1 An unsigned integer specifying the ratio decrement relative to the Max ratio limit to Group 0. 28:24 Package Number of Incremental Cores Added to Group 2 Group 2, which includes the specified number of additional cores plus all the cores in group 1, operates under the group 2 turbo max ratio limit = “group 1 Max ratio limit” - “group ratio delta for group 2”. 31:29 Package Group Ratio Delta for Group 2 An unsigned integer specifying the ratio decrement relative to the Max ratio limit for Group 1. 36:32 Package Number of Incremental Cores Added to Group 3 Group 3, which includes the specified number of additional cores plus all the cores in group 2, operates under the group 3 turbo max ratio limit = “group 2 Max ratio limit” - “group ratio delta for group 3”. 39:37 Package Group Ratio Delta for Group 3 An unsigned integer specifying the ratio decrement relative to the Max ratio limit for Group 2. 44:40 Package Number of Incremental Cores Added to Group 4 Group 4, which includes the specified number of additional cores plus all the cores in group 3, operates under the group 4 turbo max ratio limit = “group 3 Max ratio limit” - “group ratio delta for group 4”. 47:45 Package Group Ratio Delta for Group 4 An unsigned integer specifying the ratio decrement relative to the Max ratio limit for Group 3. 52:48 Package Number of Incremental Cores Added to Group 5 Group 5, which includes the specified number of additional cores plus all the cores in group 4, operates under the group 5 turbo max ratio limit = “group 4 Max ratio limit” - “group ratio delta for group 5”. 55:53 Package Group Ratio Delta for Group 5 An unsigned integer specifying the ratio decrement relative to the Max ratio limit for Group 4. 60:56 Package Number of Incremental Cores Added to Group 6 Group 6, which includes the specified number of additional cores plus all the cores in group 5, operates under the group 6 turbo max ratio limit = “group 5 Max ratio limit” - “group ratio delta for group 6”. 63:61 Package Group Ratio Delta for Group 6 An unsigned integer specifying the ratio decrement relative to the Max ratio limit for Group 5. 1B0H 2-316 Vol. 4 432 IA32_ENERGY_PERF_BIAS Thread See Table 2-2. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 1B1H 433 IA32_PACKAGE_THERM_STATUS Package See Table 2-2. 1B2H 434 IA32_PACKAGE_THERM_INTERRUPT Package See Table 2-2. 1C8H 456 MSR_LBR_SELECT Thread Last Branch Record Filtering Select Register (R/W) See Section 17.9.2, “Filtering of Last Branch Records.” 1C9H 457 0 CPL_EQ_0 1 CPL_NEQ_0 2 JCC 3 NEAR_REL_CALL 4 NEAR_IND_CALL 5 NEAR_RET 6 NEAR_IND_JMP 7 NEAR_REL_JMP 8 FAR_BRANCH 63:9 Reserved MSR_LASTBRANCH_TOS Thread Last Branch Record Stack TOS (R/W) Contains an index (bits 0-2) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP. 1D9H 473 IA32_DEBUGCTL 0 Thread Debug Control (R/W) LBR Setting this bit to 1 enables the processor to record a running trace of the most recent branches taken by the processor in the LBR stack. 1 BTF Setting this bit to 1 enables the processor to treat EFLAGS.TF as single-step on branches instead of singlestep on instructions. 5:2 Reserved 6 TR Setting this bit to 1 enables branch trace messages to be sent. 7 BTS Setting this bit enables branch trace messages (BTMs) to be logged in a BTS buffer. 8 BTINT When clear, BTMs are logged in a BTS buffer in circular fashion. When this bit is set, an interrupt is generated by the BTS facility when the BTS buffer is full. 9 BTS_OFF_OS When set, BTS or BTM is skipped if CPL = 0. Vol. 4 2-317 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 10 BTS_OFF_USR When set, BTS or BTM is skipped if CPL > 0. 11 FREEZE_LBRS_ON_PMI When set, the LBR stack is frozen on a PMI request. 12 FREEZE_PERFMON_ON_PMI When set, each ENABLE bit of the global counter control MSR are frozen (address 3BFH) on a PMI request. 13 Reserved 14 FREEZE_WHILE_SMM When set, freezes perfmon and trace messages while in SMM. 31:15 Reserved 1DDH 477 MSR_LER_FROM_LIP Thread Last Exception Record from Linear IP (R) 1DEH 478 MSR_LER_TO_LIP Thread Last Exception Record to Linear IP (R) 1F2H 498 IA32_SMRR_PHYSBASE Core See Table 2-2. 1F3H 499 IA32_SMRR_PHYSMASK Core See Table 2-2. 200H 512 IA32_MTRR_PHYSBASE0 Core See Table 2-2. 201H 513 IA32_MTRR_PHYSMASK0 Core See Table 2-2. 202H 514 IA32_MTRR_PHYSBASE1 Core See Table 2-2. 203H 515 IA32_MTRR_PHYSMASK1 Core See Table 2-2. 204H 516 IA32_MTRR_PHYSBASE2 Core See Table 2-2. 205H 517 IA32_MTRR_PHYSMASK2 Core See Table 2-2. 206H 518 IA32_MTRR_PHYSBASE3 Core See Table 2-2. 207H 519 IA32_MTRR_PHYSMASK3 Core See Table 2-2. 208H 520 IA32_MTRR_PHYSBASE4 Core See Table 2-2. 209H 521 IA32_MTRR_PHYSMASK4 Core See Table 2-2. 20AH 522 IA32_MTRR_PHYSBASE5 Core See Table 2-2. 20BH 523 IA32_MTRR_PHYSMASK5 Core See Table 2-2. 20CH 524 IA32_MTRR_PHYSBASE6 Core See Table 2-2. 20DH 525 IA32_MTRR_PHYSMASK6 Core See Table 2-2. 20EH 526 IA32_MTRR_PHYSBASE7 Core See Table 2-2. 20FH 527 IA32_MTRR_PHYSMASK7 Core See Table 2-2. 250H 592 IA32_MTRR_FIX64K_00000 Core See Table 2-2. 258H 600 IA32_MTRR_FIX16K_80000 Core See Table 2-2. 259H 601 IA32_MTRR_FIX16K_A0000 Core See Table 2-2. 268H 616 IA32_MTRR_FIX4K_C0000 Core See Table 2-2. 269H 617 IA32_MTRR_FIX4K_C8000 Core See Table 2-2. 2-318 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 26AH 618 IA32_MTRR_FIX4K_D0000 Core See Table 2-2. 26BH 619 IA32_MTRR_FIX4K_D8000 Core See Table 2-2. 26CH 620 IA32_MTRR_FIX4K_E0000 Core See Table 2-2. 26DH 621 IA32_MTRR_FIX4K_E8000 Core See Table 2-2. 26EH 622 IA32_MTRR_FIX4K_F0000 Core See Table 2-2. 26FH 623 IA32_MTRR_FIX4K_F8000 Core See Table 2-2. 277H 631 IA32_PAT Core See Table 2-2. 2FFH 767 IA32_MTRR_DEF_TYPE Core Default Memory Types (R/W) See Table 2-2. 309H 777 IA32_FIXED_CTR0 Thread Fixed-Function Performance Counter Register 0 (R/W) See Table 2-2. 30AH 778 IA32_FIXED_CTR1 Thread Fixed-Function Performance Counter Register 1 (R/W) See Table 2-2. 30BH 779 IA32_FIXED_CTR2 Thread Fixed-Function Performance Counter Register 2 (R/W) See Table 2-2. 345H 837 IA32_PERF_CAPABILITIES Package See Table 2-2. See Section 17.4.1, “IA32_DEBUGCTL MSR.” 38DH 909 IA32_FIXED_CTR_CTRL Thread Fixed-Function-Counter Control Register (R/W) See Table 2-2. 38EH 910 IA32_PERF_GLOBAL_STATUS Thread See Table 2-2. 38FH 911 IA32_PERF_GLOBAL_CTRL Thread See Table 2-2. 390H 912 IA32_PERF_GLOBAL_OVF_CTRL Thread See Table 2-2. 3F1H 1009 MSR_PEBS_ENABLE Thread See Table 2-2. 3F8H 1016 MSR_PKG_C3_RESIDENCY Package Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. 63:0 3F9H 1017 MSR_PKG_C6_RESIDENCY Package C3 Residency Counter (R/O) Package 63:0 3FAH 1018 MSR_PKG_C7_RESIDENCY Package C6 Residency Counter (R/O) Package 63:0 3FCH 1020 MSR_MC0_RESIDENCY 3FDH 1021 MSR_MC6_RESIDENCY Package C7 Residency Counter (R/O) Module 63:0 63:0 Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. Module C0 Residency Counter (R/O) Module Module C6 Residency Counter (R/O) Vol. 4 2-319 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Dec 3FFH 1023 Register Name / Bit Fields MSR_CORE_C6_RESIDENCY Scope Core 63:0 Bit Description Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. CORE C6 Residency Counter (R/O) 400H 1024 IA32_MC0_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 404H 1028 IA32_MC1_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 408H 1032 IA32_MC2_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 40CH 1036 IA32_MC3_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC3_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC3_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 410H 1040 IA32_MC4_CTL Core See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC4_STATUS Core See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC4_ADDR Core See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 414H 1044 IA32_MC5_CTL Package See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 415H 1045 IA32_MC5_STATUS Package See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 416H 1046 IA32_MC5_ADDR Package See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” 4C1H 1217 IA32_A_PMC0 Thread See Table 2-2. 4C2H 1218 IA32_A_PMC1 Thread See Table 2-2. 600H 1536 IA32_DS_AREA Thread DS Save Area (R/W) See Table 2-2. 606H 1542 MSR_RAPL_POWER_UNIT Package Unit Multipliers Used in RAPL Interfaces (R/O) 3:0 Package Power Units See Section 14.9.1, “RAPL Interfaces.” 7:4 2-320 Vol. 4 Package Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 12:8 Package Energy Status Units Energy related information (in Joules) is based on the multiplier, 1/2^ESU; where ESU is an unsigned integer represented by bits 12:8. Default value is 0EH (or 61 micro-joules). 15:13 Package 19:16 Package Reserved Time Units See Section 14.9.1, “RAPL Interfaces.” 63:20 60DH 1549 MSR_PKG_C2_RESIDENCY Reserved Package 63:0 610H 1552 MSR_PKG_POWER_LIMIT Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-states. Package C2 Residency Counter (R/O) Package PKG RAPL Power Limit Control (R/W) See Section 14.9.3, “Package RAPL Domain.” 611H 1553 MSR_PKG_ENERGY_STATUS Package PKG Energy Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 613H 1555 MSR_PKG_PERF_STATUS Package PKG Perf Status (R/O) See Section 14.9.3, “Package RAPL Domain.” 614H 1556 MSR_PKG_POWER_INFO Package PKG RAPL Parameters (R/W) See Section 14.9.3, “Package RAPL Domain.” 618H 1560 MSR_DRAM_POWER_LIMIT Package DRAM RAPL Power Limit Control (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 619H 1561 MSR_DRAM_ENERGY_STATUS Package DRAM Energy Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61BH 1563 MSR_DRAM_PERF_STATUS Package DRAM Performance Throttling Status (R/O) See Section 14.9.5, “DRAM RAPL Domain.” 61CH 1564 MSR_DRAM_POWER_INFO Package DRAM RAPL Parameters (R/W) See Section 14.9.5, “DRAM RAPL Domain.” 620H 1568 MSR UNCORE_RATIO_LIMIT Package Uncore Ratio Limit (R/W) Out of reset, the min_ratio and max_ratio fields represent the widest possible range of uncore frequencies. Writing to these fields allows software to control the minimum and the maximum frequency that hardware will select. 63:15 14:8 Reserved MIN_RATIO Writing to this field controls the minimum possible ratio of the LLC/Ring. 7 Reserved Vol. 4 2-321 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec 6:0 MAX_RATIO This field is used to limit the max ratio of the LLC/Ring. 638H 1592 MSR_PP0_POWER_LIMIT Package PP0 RAPL Power Limit Control (R/W) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 639H 1593 MSR_PP0_ENERGY_STATUS Package PP0 Energy Status (R/O) See Section 14.9.4, “PP0/PP1 RAPL Domains.” 648H 1608 MSR_CONFIG_TDP_NOMINAL Package Base TDP Ratio (R/O) See Table 2-25. 649H 1609 MSR_CONFIG_TDP_LEVEL1 Package ConfigTDP Level 1 ratio and power level (R/O) See Table 2-25. 64AH 1610 MSR_CONFIG_TDP_LEVEL2 Package ConfigTDP Level 2 ratio and power level (R/O) See Table 2-25. 64BH 1611 MSR_CONFIG_TDP_CONTROL Package ConfigTDP Control (R/W) See Table 2-25. 64CH 1612 MSR_TURBO_ACTIVATION_RATIO Package ConfigTDP Control (R/W) See Table 2-25. 690H 1680 MSR_CORE_PERF_LIMIT_REASONS Package Indicator of Frequency Clipping in Processor Cores (R/W) (Frequency refers to processor core frequency.) 6E0H 1760 0 PROCHOT Status (R0) 1 Thermal Status (R0) 5:2 Reserved 6 VR Therm Alert Status (R0) 7 Reserved 8 Electrical Design Point Status (R0) 63:9 Reserved IA32_TSC_DEADLINE Core TSC Target of Local APIC’s TSC Deadline Mode (R/W) See Table 2-2. 802H 2050 IA32_X2APIC_APICID Thread x2APIC ID Register (R/O) 803H 2051 IA32_X2APIC_VERSION Thread x2APIC Version Register (R/O) 808H 2056 IA32_X2APIC_TPR Thread x2APIC Task Priority Register (R/W) 80AH 2058 IA32_X2APIC_PPR Thread x2APIC Processor Priority Register (R/O) 80BH 2059 IA32_X2APIC_EOI Thread x2APIC EOI Register (W/O) 80DH 2061 IA32_X2APIC_LDR Thread x2APIC Logical Destination Register (R/O) 80FH 2063 IA32_X2APIC_SIVR Thread x2APIC Spurious Interrupt Vector Register (R/W) 810H 2064 IA32_X2APIC_ISR0 Thread x2APIC In-Service Register Bits [31:0] (R/O) 811H 2065 IA32_X2APIC_ISR1 Thread x2APIC In-Service Register Bits [63:32] (R/O) 812H 2066 IA32_X2APIC_ISR2 Thread x2APIC In-Service Register Bits [95:64] (R/O) 2-322 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Register Name / Bit Fields Scope Bit Description Hex Dec 813H 2067 IA32_X2APIC_ISR3 Thread x2APIC In-Service Register Bits [127:96] (R/O) 814H 2068 IA32_X2APIC_ISR4 Thread x2APIC In-Service Register Bits [159:128] (R/O) 815H 2069 IA32_X2APIC_ISR5 Thread x2APIC In-Service Register Bits [191:160] (R/O) 816H 2070 IA32_X2APIC_ISR6 Thread x2APIC In-Service Register Bits [223:192] (R/O) 817H 2071 IA32_X2APIC_ISR7 Thread x2APIC In-Service Register Bits [255:224] (R/O) 818H 2072 IA32_X2APIC_TMR0 Thread x2APIC Trigger Mode Register Bits [31:0] (R/O) 819H 2073 IA32_X2APIC_TMR1 Thread x2APIC Trigger Mode Register Bits [63:32] (R/O) 81AH 2074 IA32_X2APIC_TMR2 Thread x2APIC Trigger Mode Register Bits [95:64] (R/O) 81BH 2075 IA32_X2APIC_TMR3 Thread x2APIC Trigger Mode Register Bits [127:96] (R/O) 81CH 2076 IA32_X2APIC_TMR4 Thread x2APIC Trigger Mode Register Bits [159:128] (R/O) 81DH 2077 IA32_X2APIC_TMR5 Thread x2APIC Trigger Mode Register Bits [191:160] (R/O) 81EH 2078 IA32_X2APIC_TMR6 Thread x2APIC Trigger Mode Register Bits [223:192] (R/O) 81FH 2079 IA32_X2APIC_TMR7 Thread x2APIC Trigger Mode Register Bits [255:224] (R/O) 820H 2080 IA32_X2APIC_IRR0 Thread x2APIC Interrupt Request Register Bits [31:0] (R/O) 821H 2081 IA32_X2APIC_IRR1 Thread x2APIC Interrupt Request Register Bits [63:32] (R/O) 822H 2082 IA32_X2APIC_IRR2 Thread x2APIC Interrupt Request Register Bits [95:64] (R/O) 823H 2083 IA32_X2APIC_IRR3 Thread x2APIC Interrupt Request Register Bits [127:96] (R/O) 824H 2084 IA32_X2APIC_IRR4 Thread x2APIC Interrupt Request Register Bits [159:128] (R/O) 825H 2085 IA32_X2APIC_IRR5 Thread x2APIC Interrupt Request Register Bits [191:160] (R/O) 826H 2086 IA32_X2APIC_IRR6 Thread x2APIC Interrupt Request Register Bits [223:192] (R/O) 827H 2087 IA32_X2APIC_IRR7 Thread x2APIC Interrupt Request Register Bits [255:224] (R/O) 828H 2088 IA32_X2APIC_ESR Thread x2APIC Error Status Register (R/W) 82FH 2095 IA32_X2APIC_LVT_CMCI Thread x2APIC LVT Corrected Machine Check Interrupt Register (R/W) 830H 2096 IA32_X2APIC_ICR Thread x2APIC Interrupt Command Register (R/W) 832H 2098 IA32_X2APIC_LVT_TIMER Thread x2APIC LVT Timer Interrupt Register (R/W) 833H 2099 IA32_X2APIC_LVT_THERMAL Thread x2APIC LVT Thermal Sensor Interrupt Register (R/W) 834H 2100 IA32_X2APIC_LVT_PMI Thread x2APIC LVT Performance Monitor Register (R/W) 835H 2101 IA32_X2APIC_LVT_LINT0 Thread x2APIC LVT LINT0 Register (R/W) 836H 2102 IA32_X2APIC_LVT_LINT1 Thread x2APIC LVT LINT1 Register (R/W) 837H 2103 IA32_X2APIC_LVT_ERROR Thread x2APIC LVT Error Register (R/W) 838H 2104 IA32_X2APIC_INIT_COUNT Thread x2APIC Initial Count Register (R/W) 839H 2105 IA32_X2APIC_CUR_COUNT Thread x2APIC Current Count Register (R/O) 83EH 2110 IA32_X2APIC_DIV_CONF Thread x2APIC Divide Configuration Register (R/W) 83FH 2111 IA32_X2APIC_SELF_IPI Thread x2APIC Self IPI Register (W/O) Vol. 4 2-323 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-46. Selected MSRs Supported by Intel® Xeon Phi™ Processors with DisplayFamily_DisplayModel Signatures 06_57H and 06_85H Register Address Hex Register Name / Bit Fields Scope Bit Description Dec C000_ 0080H IA32_EFER C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE C000_ 0103H IA32_TSC_AUX Thread Extended Feature Enables See Table 2-2. Thread System Call Target Address (R/W) See Table 2-2. Thread IA-32e Mode System Call Target Address (R/W) See Table 2-2. Thread System Call Flag Mask (R/W) See Table 2-2. Thread Map of BASE Address of FS (R/W) See Table 2-2. Thread Map of BASE Address of GS (R/W) See Table 2-2. Thread Swap Target of BASE Address of GS (R/W) See Table 2-2. Thread AUXILIARY TSC Signature (R/W) See Table 2-2 Table 2-47 lists model-specific registers that are supported by Intel® Xeon Phi™ processor 7215, 7285, 7295 series based on the Knights Mill microarchitecture. Table 2-47. Additional MSRs Supported by Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series with DisplayFamily_DisplayModel Signature 06_85H Register Address Hex Dec 9BH 155 Register Name / Bit Fields IA32_SMM_MONITOR_CTL Scope Core Bit Description SMM Monitor Configuration (R/W) This MSR is readable only if VMX is enabled, and writeable only if VMX is enabled and in SMM mode, and is used to configure the VMX MSEG base address. See Table 2-2. 480H 1152 IA32_VMX_BASIC Core Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. 481H 1153 IA32_VMX_PINBASED_ CTLS Core Capability Reporting Register of Pin-based VM-execution Controls (R/O) See Table 2-2. 482H 1154 IA32_VMX_PROCBASED_ CTLS Core 483H 1155 IA32_VMX_EXIT_CTLS Core Capability Reporting Register of Primary Processorbased VM-execution Controls (R/O) Capability Reporting Register of VM-exit Controls (R/O) See Table 2-2. 484H 1156 IA32_VMX_ENTRY_CTLS Core Capability Reporting Register of VM-entry Controls (R/O) See Table 2-2. 2-324 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-47. Additional MSRs Supported by Intel® Xeon Phi™ Processor 7215, 7285, 7295 Series with DisplayFamily_DisplayModel Signature 06_85H Register Address Register Name / Bit Fields Scope Hex Dec 485H 1157 IA32_VMX_MISC Core 486H 1158 IA32_VMX_CR0_FIXED0 Core Bit Description Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 2-2. Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 2-2. 487H 1159 IA32_VMX_CR0_FIXED1 Core Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 2-2. 488H 1160 IA32_VMX_CR4_FIXED0 Core Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 2-2. 489H 1161 IA32_VMX_CR4_FIXED1 Core Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 2-2. 48AH 1162 IA32_VMX_VMCS_ENUM Core Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 2-2. 48BH 1163 IA32_VMX_PROCBASED_ CTLS2 Core Capability Reporting Register of Secondary ProcessorBased VM-Execution Controls (R/O) See Table 2-2. 48CH 1164 IA32_VMX_EPT_VPID_ENUM Core Capability Reporting Register of EPT and VPID (R/O) See Table 2-2. 48DH 1165 IA32_VMX_TRUE_PINBASE D_CTLS Core Capability Reporting Register of Pin-Based VM-Execution Flex Controls (R/O) See Table 2-2. 48EH 1166 IA32_VMX_TRUE_PROCBASED_CTLS Core Capability Reporting Register of Primary ProcessorBased VM-Execution Flex Controls (R/O) See Table 2-2. 48FH 1167 IA32_VMX_TRUE_EXIT_CTLS Core Capability Reporting Register of VM-Exit Flex Controls (R/O) See Table 2-2. 490H 1168 IA32_VMX_TRUE_ENTRY_CTLS Core Capability Reporting Register of VM-Entry Flex Controls (R/O) See Table 2-2. 491H 1169 IA32_VMX_FMFUNC Core Capability Reporting Register of VM-Function Controls (R/O) See Table 2-2. Vol. 4 2-325 MODEL-SPECIFIC REGISTERS (MSRS) 2.19 MSRS IN THE PENTIUM® 4 AND INTEL® XEON® PROCESSORS Table 2-48 lists MSRs (architectural and model-specific) that are defined across processor generations based on Intel NetBurst microarchitecture. The processor can be identified by its CPUID signatures of DisplayFamily encoding of 0FH, see Table 2-1. • MSRs with an “IA32_” prefix are designated as “architectural.” This means that the functions of these MSRs and their addresses remain the same for succeeding families of IA-32 processors. • MSRs with an “MSR_” prefix are model specific with respect to address functionalities. The column “Model Availability” lists the model encoding value(s) within the Pentium 4 and Intel Xeon processor family at the specified register address. The model encoding value of a processor can be queried using CPUID. See “CPUID—CPU Identification” in Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A. Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors Register Address Register Name Fields and Flags Model Availability Shared/ Unique1 Bit Description Hex Dec 0H 0 IA32_P5_MC_ADDR 0, 1, 2, 3, 4, 6 Shared See Section 2.23, “MSRs in Pentium Processors.” 1H 1 IA32_P5_MC_TYPE 0, 1, 2, 3, 4, 6 Shared See Section 2.23, “MSRs in Pentium Processors.” 6H 6 IA32_MONITOR_FILTER_LINE_SIZE 3, 4, 6 Shared See Section 8.10.5, “Monitor/Mwait Address Range Determination.” 10H 16 IA32_TIME_STAMP_COUNTER 0, 1, 2, 3, 4, 6 Unique Time Stamp Counter See Table 2-2. On earlier processors, only the lower 32 bits are writable. On any write to the lower 32 bits, the upper 32 bits are cleared. For processor family 0FH, models 3 and 4: all 64 bits are writable. 17H 23 IA32_PLATFORM_ID 0, 1, 2, 3, 4, 6 Shared Platform ID (R) See Table 2-2. The operating system can use this MSR to determine “slot” information for the processor and the proper microcode update to load. 1BH 2AH 27 42 IA32_APIC_BASE MSR_EBC_HARD_POWERON 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Shared APIC Location and Status (R/W) See Table 2-2. See Section 10.4.4, “Local APIC Status and Location.” Processor Hard Power-On Configuration (R/W) Enables and disables processor features. (R) Indicates current processor configuration. 0 Output Tri-state Enabled (R) Indicates whether tri-state output is enabled (1) or disabled (0) as set by the strapping of SMI#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 2-326 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec 1 Model Availability Shared/ Unique1 Bit Description Execute BIST (R) Indicates whether the execution of the BIST is enabled (1) or disabled (0) as set by the strapping of INIT#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 2 In Order Queue Depth (R) Indicates whether the in order queue depth for the system bus is 1 (1) or up to 12 (0) as set by the strapping of A7#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 3 MCERR# Observation Disabled (R) Indicates whether MCERR# observation is enabled (0) or disabled (1) as determined by the strapping of A9#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 4 BINIT# Observation Enabled (R) Indicates whether BINIT# observation is enabled (0) or disabled (1) as determined by the strapping of A10#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 6:5 APIC Cluster ID (R) Contains the logical APIC cluster ID value as set by the strapping of A12# and A11#. The logical cluster ID value is written into the field on the deassertion of RESET#; the field is set to 1 when the address bus signal is asserted. 7 Bus Park Disable (R) Indicates whether bus park is enabled (0) or disabled (1) as set by the strapping of A15#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted. 11:8 Reserved 13:12 Agent ID (R) Contains the logical agent ID value as set by the strapping of BR[3:0]. The logical ID value is written into the field on the deassertion of RESET#; the field is set to 1 when the address bus signal is asserted. 63:14 Reserved Vol. 4 2-327 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 2BH 43 Register Name Fields and Flags MSR_EBC_SOFT_POWERON Model Availability 0, 1, 2, 3, 4, 6 Shared/ Unique1 Shared Bit Description Processor Soft Power-On Configuration (R/W) Enables and disables processor features. 0 RCNT/SCNT On Request Encoding Enable (R/W) Controls the driving of RCNT/SCNT on the request encoding. Set to enable (1); clear to disabled (0, default). 1 Data Error Checking Disable (R/W) Set to disable system data bus parity checking; clear to enable parity checking. 2 Response Error Checking Disable (R/W) 3 Address/Request Error Checking Disable (R/W) Set to disable (default); clear to enable. Set to disable (default); clear to enable. 4 Initiator MCERR# Disable (R/W) Set to disable MCERR# driving for initiator bus requests (default); clear to enable. 5 Internal MCERR# Disable (R/W) Set to disable MCERR# driving for initiator internal errors (default); clear to enable. 6 BINIT# Driver Disable (R/W) Set to disable BINIT# driver (default); clear to enable driver. 63:7 2CH 44 MSR_EBC_FREQUENCY_ID Reserved 2,3, 4, 6 Shared Processor Frequency Configuration The bit field layout of this MSR varies according to the MODEL value in the CPUID version information. The following bit field layout applies to Pentium 4 and Xeon Processors with MODEL encoding equal or greater than 2. (R) The field Indicates the current processor frequency configuration. 15:0 Reserved 18:16 Scalable Bus Speed (R/W) Indicates the intended scalable bus speed: Encoding Scalable Bus Speed 000B 100 MHz (Model 2) 000B 266 MHz (Model 3 or 4) 001B 133 MHz 010B 200 MHz 011B 166 MHz 100B 333 MHz (Model 6) 2-328 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 Bit Description 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B. 266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 000B and model encoding = 3 or 4. 333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 100B and model encoding = 6. All other values are reserved. 23:19 Reserved 31:24 Core Clock Frequency to System Bus Frequency Ratio (R) The processor core clock frequency to system bus frequency ratio observed at the deassertion of the reset pin. 63:25 2CH 44 MSR_EBC_FREQUENCY_ID Reserved 0, 1 Shared Processor Frequency Configuration (R) The bit field layout of this MSR varies according to the MODEL value of the CPUID version information. This bit field layout applies to Pentium 4 and Xeon Processors with MODEL encoding less than 2. Indicates current processor frequency configuration. 20:0 Reserved 23:21 Scalable Bus Speed (R/W) Indicates the intended scalable bus speed: Encoding Scalable Bus Speed 000B 100 MHz All others values reserved. 63:24 3AH 58 IA32_FEATURE_CONTROL Reserved 3, 4, 6 Unique Control Features in IA-32 Processor (R/W) See Table 2-2. (If CPUID.01H:ECX.[bit 5]) 79H 8BH 121 139 IA32_BIOS_UPDT_TRIG IA32_BIOS_SIGN_ID 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Unique BIOS Update Trigger Register (W) See Table 2-2. BIOS Update Signature ID (R/W) See Table 2-2. Vol. 4 2-329 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex 9BH Register Name Fields and Flags Dec 155 IA32_SMM_MONITOR_CTL Model Availability 3, 4, 6 Shared/ Unique1 Unique Bit Description SMM Monitor Configuration (R/W) See Table 2-2. FEH 174H 254 372 IA32_MTRRCAP IA32_SYSENTER_CS 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Unique MTRR Information See Section 11.11.1, “MTRR Feature Identification.”. CS Register Target for CPL 0 Code (R/W) See Table 2-2. See Section 5.8.7, “Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.” 175H 373 IA32_SYSENTER_ESP 0, 1, 2, 3, 4, 6 Unique Stack Pointer for CPL 0 Stack (R/W) See Table 2-2. See Section 5.8.7, “Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.” 176H 179H 17AH 17BH 374 377 378 379 IA32_SYSENTER_EIP IA32_MCG_CAP IA32_MCG_STATUS 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Unique CPL 0 Code Entry Point (R/W) See Table 2-2. See Section 5.8.7, “Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.” Machine Check Capabilities (R) See Table 2-2. See Section 15.3.1.1, “IA32_MCG_CAP MSR.” Machine Check Status (R) See Table 2-2. See Section 15.3.1.2, “IA32_MCG_STATUS MSR.” IA32_MCG_CTL Machine Check Feature Enable (R/W) See Table 2-2. See Section 15.3.1.3, “IA32_MCG_CTL MSR.” 180H 384 MSR_MCG_RAX 0, 1, 2, 3, 4, 6 Unique See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 181H 385 MSR_MCG_RBX Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 2-330 Vol. 4 386 MSR_MCG_RCX Machine Check EBX/RBX Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 182H Machine Check EAX/RAX Save State Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique Machine Check ECX/RCX Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 63:0 183H 387 MSR_MCG_RDX Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 388 MSR_MCG_RSI Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 389 MSR_MCG_RDI Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 390 MSR_MCG_RBP Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 391 MSR_MCG_RSP Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 392 MSR_MCG_RFLAGS Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique 393 MSR_MCG_RIP 63:0 Machine Check EFLAGS/RFLAG Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 189H Machine Check ESP/RSP Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 188H Machine Check EBP/RBP Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 187H Machine Check EDI/RDI Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 186H Machine Check ESI/RSI Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 185H Machine Check EDX/RDX Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 184H Bit Description Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. 0, 1, 2, 3, 4, 6 Unique Machine Check EIP/RIP Save State See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data. Vol. 4 2-331 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 18AH 394 Register Name Fields and Flags MSR_MCG_MISC Model Availability 0, 1, 2, 3, 4, 6 Shared/ Unique1 Unique Bit Description Machine Check Miscellaneous See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 0 DS When set, the bit indicates that a page assist or page fault occurred during DS normal operation. The processors response is to shut down. The bit is used as an aid for debugging DS handling code. It is the responsibility of the user (BIOS or operating system) to clear this bit for normal operation. 63:1 18BH 18FH 395 MSR_MCG_RESERVED1 MSR_MCG_RESERVED5 190H 400 MSR_MCG_R8 Reserved Reserved 0, 1, 2, 3, 4, 6 Unique See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 191H 401 MSR_MCG_R9 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique 402 MSR_MCG_R10 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique 403 MSR_MCG_R11 63:0 2-332 Vol. 4 Machine Check R10 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 193H Machine Check R9D/R9 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 192H Machine Check R8 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique Machine Check R11 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 194H 404 Register Name Fields and Flags MSR_MCG_R12 Model Availability 0, 1, 2, 3, 4, 6 Shared/ Unique1 Unique 405 MSR_MCG_R13 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique 406 MSR_MCG_R14 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique 407 MSR_MCG_R15 Machine Check R14 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 197H Machine Check R13 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 196H Machine Check R12 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 195H Bit Description Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 0, 1, 2, 3, 4, 6 Unique Machine Check R15 See Section 15.3.2.6, “IA32_MCG Extended Machine Check State MSRs.” 63:0 Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error. 198H 408 IA32_PERF_STATUS 3, 4, 6 Unique See Table 2-2. See Section 14.1, “Enhanced Intel Speedstep® Technology.” 199H 409 IA32_PERF_CTL 3, 4, 6 Unique See Table 2-2. See Section 14.1, “Enhanced Intel Speedstep® Technology.” 19AH 410 IA32_CLOCK_MODULATION 0, 1, 2, 3, 4, 6 Unique Thermal Monitor Control (R/W) See Table 2-2. See Section 14.7.3, “Software Controlled Clock Modulation.” 19BH 19CH 411 412 IA32_THERM_INTERRUPT IA32_THERM_STATUS 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Shared Thermal Interrupt Control (R/W) See Section 14.7.2, “Thermal Monitor,” and see Table 2-2. Thermal Monitor Status (R/W) See Section 14.7.2, “Thermal Monitor,” and see Table 2-2. Vol. 4 2-333 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 19DH 413 1A0H 416 Register Name Fields and Flags Model Availability Shared/ Unique1 MSR_THERM2_CTL IA32_MISC_ENABLE Bit Description Thermal Monitor 2 Control 3, Shared For Family F, Model 3 processors: When read, specifies the value of the target TM2 transition last written. When set, it sets the next target value for TM2 transition. 4, 6 Shared For Family F, Model 4 and Model 6 processors: When read, specifies the value of the target TM2 transition last written. Writes may cause #GP exceptions. 0, 1, 2, 3, 4, 6 Shared Enable Miscellaneous Processor Features (R/W) 0 Fast-Strings Enable. See Table 2-2. 1 Reserved 2 x87 FPU Fopcode Compatibility Mode Enable 3 Thermal Monitor 1 Enable See Section 14.7.2, “Thermal Monitor,” and see Table 2-2. 4 Split-Lock Disable When set, the bit causes an #AC exception to be issued instead of a split-lock cycle. Operating systems that set this bit must align system structures to avoid split-lock scenarios. When the bit is clear (default), normal split-locks are issued to the bus. This debug feature is specific to the Pentium 4 processor. 5 Reserved 6 Third-Level Cache Disable (R/W) When set, the third-level cache is disabled; when clear (default) the third-level cache is enabled. This flag is reserved for processors that do not have a third-level cache. Note that the bit controls only the third-level cache; and only if overall caching is enabled through the CD flag of control register CR0, the page-level cache controls, and/or the MTRRs. See Section 11.5.4, “Disabling and Enabling the L3 Cache.” 7 Performance Monitoring Available (R) See Table 2-2. 8 Suppress Lock Enable When set, assertion of LOCK on the bus is suppressed during a Split Lock access. When clear (default), LOCK is not suppressed. 2-334 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Model Availability Dec 9 Shared/ Unique1 Bit Description Prefetch Queue Disable When set, disables the prefetch queue. When clear (default), enables the prefetch queue. 10 FERR# Interrupt Reporting Enable (R/W) When set, interrupt reporting through the FERR# pin is enabled; when clear, this interrupt reporting function is disabled. When this flag is set and the processor is in the stop-clock state (STPCLK# is asserted), asserting the FERR# pin signals to the processor that an interrupt (such as, INIT#, BINIT#, INTR, NMI, SMI#, or RESET#) is pending and that the processor should return to normal operation to handle the interrupt. This flag does not affect the normal operation of the FERR# pin (to indicate an unmasked floating-point error) when the STPCLK# pin is not asserted. 11 Branch Trace Storage Unavailable (BTS_UNAVILABLE) (R) See Table 2-2. When set, the processor does not support branch trace storage (BTS); when clear, BTS is supported. 12 PEBS_UNAVILABLE: Processor Event Based Sampling Unavailable (R) See Table 2-2. When set, the processor does not support processor event-based sampling (PEBS); when clear, PEBS is supported. 13 3 TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the predetermined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermal managed state. If the TM2 feature flag (ECX[8]) is not set to 1 after executing CPUID with EAX = 1, then this feature is not supported and BIOS must not alter the contents of this bit location. The processor is operating out of spec if both this bit and the TM1 bit are set to disabled states. Vol. 4 2-335 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 17:14 18 Bit Description Reserved 3, 4, 6 ENABLE MONITOR FSM (R/W) See Table 2-2. 19 Adjacent Cache Line Prefetch Disable (R/W) When set to 1, the processor fetches the cache line of the 128-byte sector containing currently required data. When set to 0, the processor fetches both cache lines in the sector. Single processor platforms should not set this bit. Server platforms should set or clear this bit based on platform performance observed in validation and testing. BIOS may contain a setup option that controls the setting of this bit. 21:20 22 Reserved 3, 4, 6 Limit CPUID MAXVAL (R/W) See Table 2-2. Setting this can cause unexpected behavior to software that depends on the availability of CPUID leaves greater than 3. 23 Shared xTPR Message Disable (R/W) See Table 2-2. 24 L1 Data Cache Context Mode (R/W) When set, the L1 data cache is placed in shared mode; when clear (default), the cache is placed in adaptive mode. This bit is only enabled for IA32 processors that support Intel HyperThreading Technology. See Section 11.5.6, “L1 Data Cache Context Mode.” When L1 is running in adaptive mode and CR3s are identical, data in L1 is shared across logical processors. Otherwise, L1 is not shared and cache use is competitive. If the Context ID feature flag (ECX[10]) is set to 0 after executing CPUID with EAX = 1, the ability to switch modes is not supported. BIOS must not alter the contents of IA32_MISC_ENABLE[24]. 33:25 Reserved 34 Unique XD Bit Disable (R/W) See Table 2-2. 63:35 1A1H 417 MSR_PLATFORM_BRV 17:0 2-336 Vol. 4 Reserved 3, 4, 6 Shared Platform Feature Requirements (R) Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 18 Bit Description PLATFORM Requirements When set to 1, indicates the processor has specific platform requirements. The details of the platform requirements are listed in the respective data sheets of the processor. 63:19 1D7H 471 MSR_LER_FROM_LIP Reserved 0, 1, 2, 3, 4, 6 Unique Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.13.3, “Last Exception Records.” 31:0 From Linear IP Linear address of the last branch instruction. 63:32 1D7H 471 Reserved 63:0 Unique From Linear IP Linear address of the last branch instruction (If IA-32e mode is active). 1D8H 472 MSR_LER_TO_LIP 0, 1, 2, 3, 4, 6 Unique Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.13.3, “Last Exception Records.” 31:0 From Linear IP Linear address of the target of the last branch instruction. 63:32 1D8H 472 Reserved 63:0 Unique From Linear IP Linear address of the target of the last branch instruction (If IA-32e mode is active). 1D9H 473 MSR_DEBUGCTLA 0, 1, 2, 3, 4, 6 Unique Debug Control (R/W) Controls how several debug features are used. Bit definitions are discussed in the referenced section. See Section 17.13.1, “MSR_DEBUGCTLA MSR.” 1DAH 474 MSR_LASTBRANCH_TOS 0, 1, 2, 3, 4, 6 Unique Last Branch Record Stack TOS (R/O) Contains an index (0-3 or 0-15) that points to the top of the last branch record stack (that is, that points the index of the MSR containing the most recent branch record). See Section 17.13.2, “LBR Stack for Processors Based on Intel NetBurst® Microarchitecture”; and addresses 1DBH-1DEH and 680H-68FH. Vol. 4 2-337 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 1DBH 475 Register Name Fields and Flags MSR_LASTBRANCH_0 Model Availability 0, 1, 2 Shared/ Unique1 Unique Bit Description Last Branch Record 0 (R/O) One of four last branch record registers on the last branch record stack. It contains pointers to the source and destination instruction for one of the last four branches, exceptions, or interrupts that the processor took. MSR_LASTBRANCH_0 through MSR_LASTBRANCH_3 at 1DBH-1DEH are available only on family 0FH, models 0H-02H. They have been replaced by the MSRs at 680H68FH and 6C0H-6CFH. See Section 17.12, “Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Skylake Microarchitecture.” 1DCH 477 MSR_LASTBRANCH_1 0, 1, 2 Unique Last Branch Record 1 See description of the MSR_LASTBRANCH_0 MSR at 1DBH. 1DDH 477 MSR_LASTBRANCH_2 0, 1, 2 Unique Last Branch Record 2 See description of the MSR_LASTBRANCH_0 MSR at 1DBH. 1DEH 478 MSR_LASTBRANCH_3 0, 1, 2 Unique Last Branch Record 3 See description of the MSR_LASTBRANCH_0 MSR at 1DBH. 200H 201H 202H 203H 204H 205H 206H 207H 208H 209H 2-338 Vol. 4 512 513 514 515 516 517 518 519 520 521 IA32_MTRR_PHYSBASE0 IA32_MTRR_PHYSMASK0 IA32_MTRR_PHYSBASE1 IA32_MTRR_PHYSMASK1 IA32_MTRR_PHYSBASE2 IA32_MTRR_PHYSMASK2 IA32_MTRR_PHYSBASE3 IA32_MTRR_PHYSMASK3 IA32_MTRR_PHYSBASE4 IA32_MTRR_PHYSMASK4 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared Variable Range Base MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs”. Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 20AH 522 20BH 20CH 20DH 20EH 20FH 250H 258H 259H 268H 269H 26AH 26BH 26CH 26DH 26EH 26FH 277H 2FFH 523 524 525 526 527 592 600 601 616 617 618 619 620 621 622 623 631 767 Register Name Fields and Flags IA32_MTRR_PHYSBASE5 IA32_MTRR_PHYSMASK5 IA32_MTRR_PHYSBASE6 IA32_MTRR_PHYSMASK6 IA32_MTRR_PHYSBASE7 IA32_MTRR_PHYSMASK7 IA32_MTRR_FIX64K_00000 IA32_MTRR_FIX16K_80000 IA32_MTRR_FIX16K_A0000 IA32_MTRR_FIX4K_C0000 IA32_MTRR_FIX4K_C8000 IA32_MTRR_FIX4K_D0000 IA32_MTRR_FIX4K_D8000 IA32_MTRR_FIX4K_E0000 IA32_MTRR_FIX4K_E8000 IA32_MTRR_FIX4K_F0000 IA32_MTRR_FIX4K_F8000 IA32_PAT IA32_MTRR_DEF_TYPE Model Availability Shared/ Unique1 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Shared 0, 1, 2, 3, 4, 6 Unique 0, 1, 2, 3, 4, 6 Shared Bit Description Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Variable Range Mask MTRR See Section 11.11.2.3, “Variable Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs”. Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs”. Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Fixed Range MTRR See Section 11.11.2.2, “Fixed Range MTRRs.” Page Attribute Table See Section 11.11.2.2, “Fixed Range MTRRs.” Default Memory Types (R/W) See Table 2-2. See Section 11.11.2.1, “IA32_MTRR_DEF_TYPE MSR.” Vol. 4 2-339 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Register Name Fields and Flags Model Availability Shared/ Unique1 Bit Description Hex Dec 300H 768 MSR_BPU_COUNTER0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 301H 769 MSR_BPU_COUNTER1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 302H 770 MSR_BPU_COUNTER2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 303H 771 MSR_BPU_COUNTER3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 304H 772 MSR_MS_COUNTER0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 305H 773 MSR_MS_COUNTER1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 306H 774 MSR_MS_COUNTER2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 307H 775 MSR_MS_COUNTER3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 308H 776 MSR_FLAME_COUNTER0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 309H 777 MSR_FLAME_COUNTER1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30AH 778 MSR_FLAME_COUNTER2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30BH 779 MSR_FLAME_COUNTER3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30CH 780 MSR_IQ_COUNTER0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30DH 781 MSR_IQ_COUNTER1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30EH 782 MSR_IQ_COUNTER2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 30FH 783 MSR_IQ_COUNTER3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 310H 784 MSR_IQ_COUNTER4 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 311H 785 MSR_IQ_COUNTER5 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.2, “Performance Counters.” 360H 864 MSR_BPU_CCCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 361H 865 MSR_BPU_CCCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 362H 866 MSR_BPU_CCCR2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 363H 867 MSR_BPU_CCCR3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 2-340 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Register Name Fields and Flags Model Availability Shared/ Unique1 Bit Description Hex Dec 364H 868 MSR_MS_CCCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 365H 869 MSR_MS_CCCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 366H 870 MSR_MS_CCCR2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 367H 871 MSR_MS_CCCR3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 368H 872 MSR_FLAME_CCCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 369H 873 MSR_FLAME_CCCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36AH 874 MSR_FLAME_CCCR2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36BH 875 MSR_FLAME_CCCR3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36CH 876 MSR_IQ_CCCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36DH 877 MSR_IQ_CCCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36EH 878 MSR_IQ_CCCR2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 36FH 879 MSR_IQ_CCCR3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 370H 880 MSR_IQ_CCCR4 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 371H 881 MSR_IQ_CCCR5 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.3, “CCCR MSRs.” 3A0H 928 MSR_BSU_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A1H 929 MSR_BSU_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A2H 930 MSR_FSB_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A3H 931 MSR_FSB_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A4H 932 MSR_FIRM_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A5H 933 MSR_FIRM_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A6H 934 MSR_FLAME_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A7H 935 MSR_FLAME_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” Vol. 4 2-341 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Register Name Fields and Flags Model Availability Shared/ Unique1 Bit Description Hex Dec 3A8H 936 MSR_DAC_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3A9H 937 MSR_DAC_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3AAH 938 MSR_MOB_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3ABH 939 MSR_MOB_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3ACH 940 MSR_PMH_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3ADH 941 MSR_PMH_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3AEH 942 MSR_SAAT_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3AFH 943 MSR_SAAT_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B0H 944 MSR_U2L_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B1H 945 MSR_U2L_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B2H 946 MSR_BPU_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B3H 947 MSR_BPU_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B4H 948 MSR_IS_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B5H 949 MSR_IS_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B6H 950 MSR_ITLB_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B7H 951 MSR_ITLB_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B8H 952 MSR_CRU_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3B9H 953 MSR_CRU_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3BAH 954 MSR_IQ_ESCR0 0, 1, 2 Shared See Section 18.6.3.1, “ESCR MSRs.” This MSR is not available on later processors. It is only available on processor family 0FH, models 01H-02H. 3BBH 955 MSR_IQ_ESCR1 0, 1, 2 Shared See Section 18.6.3.1, “ESCR MSRs.” This MSR is not available on later processors. It is only available on processor family 0FH, models 01H-02H. 2-342 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Register Name Fields and Flags Model Availability Shared/ Unique1 Bit Description Hex Dec 3BCH 956 MSR_RAT_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3BDH 957 MSR_RAT_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3BEH 958 MSR_SSU_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C0H 960 MSR_MS_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C1H 961 MSR_MS_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C2H 962 MSR_TBPU_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C3H 963 MSR_TBPU_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C4H 964 MSR_TC_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C5H 965 MSR_TC_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C8H 968 MSR_IX_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3C9H 969 MSR_IX_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3CAH 970 MSR_ALF_ESCR0 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3CBH 971 MSR_ALF_ESCR1 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3CCH 972 MSR_CRU_ESCR2 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3CDH 973 MSR_CRU_ESCR3 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3E0H 992 MSR_CRU_ESCR4 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3E1H 993 MSR_CRU_ESCR5 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3F0H 1008 MSR_TC_PRECISE_EVENT 0, 1, 2, 3, 4, 6 Shared See Section 18.6.3.1, “ESCR MSRs.” 3F1H 1009 MSR_PEBS_ENABLE 0, 1, 2, 3, 4, 6 Shared Processor Event Based Sampling (PEBS) (R/W) Controls the enabling of processor event sampling and replay tagging. 12:0 See Table 19-38. 23:13 Reserved 24 UOP Tag Enables replay tagging when set. Vol. 4 2-343 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 25 Bit Description ENABLE_PEBS_MY_THR (R/W) Enables PEBS for the target logical processor when set; disables PEBS when clear (default). See Section 18.6.4.3, “IA32_PEBS_ENABLE MSR,” for an explanation of the target logical processor. This bit is called ENABLE_PEBS in IA-32 processors that do not support Intel HyperThreading Technology. 26 ENABLE_PEBS_OTH_THR (R/W) Enables PEBS for the target logical processor when set; disables PEBS when clear (default). See Section 18.6.4.3, “IA32_PEBS_ENABLE MSR,” for an explanation of the target logical processor. This bit is reserved for IA-32 processors that do not support Intel Hyper-Threading Technology. 63:27 Reserved 3F2H 1010 MSR_PEBS_MATRIX_VERT 0, 1, 2, 3, 4, 6 Shared See Table 19-38. 400H 1024 IA32_MC0_CTL 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 403H 1027 IA32_MC0_MISC 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” The IA32_MC0_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 2-344 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 406H 1030 Register Name Fields and Flags IA32_MC1_ADDR Model Availability 0, 1, 2, 3, 4, 6 Shared/ Unique1 Shared Bit Description See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 407H 1031 IA32_MC1_MISC Shared See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” The IA32_MC1_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 408H 1032 IA32_MC2_CTL 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40BH 1035 IA32_MC2_MISC See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” The IA32_MC2_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 IA32_MC3_CTL 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 IA32_MC3_STATUS 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 IA32_MC3_ADDR 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. Vol. 4 2-345 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 40FH 1039 Register Name Fields and Flags IA32_MC3_MISC Model Availability 0, 1, 2, 3, 4, 6 Shared/ Unique1 Shared Bit Description See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” The IA32_MC3_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 410H 1040 IA32_MC4_CTL 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC4_STATUS 0, 1, 2, 3, 4, 6 Shared See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 IA32_MC4_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 413H 1043 IA32_MC4_MISC See Section 15.3.2.4, “IA32_MCi_MISC MSRs.” The IA32_MC2_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 480H 1152 IA32_VMX_BASIC 3, 4, 6 Unique Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information.” 481H 1153 IA32_VMX_PINBASED_CTLS 3, 4, 6 Unique Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Table 2-2. See Appendix A.3, “VM-Execution Controls.” 482H 1154 IA32_VMX_PROCBASED_CTLS 3, 4, 6 Unique Capability Reporting Register of Primary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls,” and see Table 2-2. 483H 1155 IA32_VMX_EXIT_CTLS 3, 4, 6 Unique Capability Reporting Register of VM-Exit Controls (R/O) See Appendix A.4, “VM-Exit Controls,” and see Table 2-2. 2-346 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 484H 1156 Register Name Fields and Flags IA32_VMX_ENTRY_CTLS Model Availability 3, 4, 6 Shared/ Unique1 Unique Bit Description Capability Reporting Register of VM-Entry Controls (R/O) See Appendix A.5, “VM-Entry Controls,” and see Table 2-2. 485H 1157 IA32_VMX_MISC 3, 4, 6 Unique Reporting Register of Miscellaneous VMX Capabilities (R/O) See Appendix A.6, “Miscellaneous Data,” and see Table 2-2. 486H 1158 IA32_VMX_CR0_FIXED0 3, 4, 6 Unique Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Appendix A.7, “VMX-Fixed Bits in CR0,” and see Table 2-2. 487H 1159 IA32_VMX_CR0_FIXED1 3, 4, 6 Unique Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Appendix A.7, “VMX-Fixed Bits in CR0,” and see Table 2-2. 488H 1160 IA32_VMX_CR4_FIXED0 3, 4, 6 Unique Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Appendix A.8, “VMX-Fixed Bits in CR4,” and see Table 2-2. 489H 1161 IA32_VMX_CR4_FIXED1 3, 4, 6 Unique Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Appendix A.8, “VMX-Fixed Bits in CR4,” and see Table 2-2. 48AH 1162 IA32_VMX_VMCS_ENUM 3, 4, 6 Unique Capability Reporting Register of VMCS Field Enumeration (R/O) See Appendix A.9, “VMCS Enumeration,” and see Table 2-2. 48BH 1163 IA32_VMX_PROCBASED_CTLS2 3, 4, 6 Unique Capability Reporting Register of Secondary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls,” and see Table 2-2. 600H 1536 IA32_DS_AREA 0, 1, 2, 3, 4, 6 Unique DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 680H 1664 MSR_LASTBRANCH_0_FROM_IP 3, 4, 6 Unique Last Branch Record 0 (R/W) One of 16 pairs of last branch record registers on the last branch record stack (680H-68FH). This part of the stack contains pointers to the source instruction for one of the last 16 branches, exceptions, or interrupts taken by the processor. Vol. 4 2-347 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Register Name Fields and Flags Dec Model Availability Shared/ Unique1 Bit Description The MSRs at 680H-68FH, 6C0H-6CfH are not available in processor releases before family 0FH, model 03H. These MSRs replace MSRs previously located at 1DBH-1DEH.which performed the same function for early releases. See Section 17.12, “Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Skylake Microarchitecture.” 681H 1665 MSR_LASTBRANCH_1_FROM_IP 3, 4, 6 Unique Last Branch Record 1 See description of MSR_LASTBRANCH_0 at 680H. 682H 1666 MSR_LASTBRANCH_2_FROM_IP 3, 4, 6 Unique Last Branch Record 2 See description of MSR_LASTBRANCH_0 at 680H. 683H 1667 MSR_LASTBRANCH_3_FROM_IP 3, 4, 6 Unique Last Branch Record 3 See description of MSR_LASTBRANCH_0 at 680H. 684H 1668 MSR_LASTBRANCH_4_FROM_IP 3, 4, 6 Unique Last Branch Record 4 See description of MSR_LASTBRANCH_0 at 680H. 685H 1669 MSR_LASTBRANCH_5_FROM_IP 3, 4, 6 Unique Last Branch Record 5 See description of MSR_LASTBRANCH_0 at 680H. 686H 1670 MSR_LASTBRANCH_6_FROM_IP 3, 4, 6 Unique Last Branch Record 6 See description of MSR_LASTBRANCH_0 at 680H. 687H 1671 MSR_LASTBRANCH_7_FROM_IP 3, 4, 6 Unique Last Branch Record 7 See description of MSR_LASTBRANCH_0 at 680H. 688H 1672 MSR_LASTBRANCH_8_FROM_IP 3, 4, 6 Unique Last Branch Record 8 See description of MSR_LASTBRANCH_0 at 680H. 689H 1673 MSR_LASTBRANCH_9_FROM_IP 3, 4, 6 Unique Last Branch Record 9 See description of MSR_LASTBRANCH_0 at 680H. 68AH 1674 MSR_LASTBRANCH_10_FROM_IP 3, 4, 6 Unique Last Branch Record 10 See description of MSR_LASTBRANCH_0 at 680H. 68BH 1675 MSR_LASTBRANCH_11_FROM_IP 3, 4, 6 Unique Last Branch Record 11 See description of MSR_LASTBRANCH_0 at 680H. 68CH 1676 MSR_LASTBRANCH_12_FROM_IP 3, 4, 6 Unique Last Branch Record 12 See description of MSR_LASTBRANCH_0 at 680H. 2-348 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 68DH 1677 Register Name Fields and Flags MSR_LASTBRANCH_13_FROM_IP Model Availability 3, 4, 6 Shared/ Unique1 Unique Bit Description Last Branch Record 13 See description of MSR_LASTBRANCH_0 at 680H. 68EH 1678 MSR_LASTBRANCH_14_FROM_IP 3, 4, 6 Unique Last Branch Record 14 See description of MSR_LASTBRANCH_0 at 680H. 68FH 1679 MSR_LASTBRANCH_15_FROM_IP 3, 4, 6 Unique Last Branch Record 15 See description of MSR_LASTBRANCH_0 at 680H. 6C0H 1728 MSR_LASTBRANCH_0_TO_IP 3, 4, 6 Unique Last Branch Record 0 (R/W) One of 16 pairs of last branch record registers on the last branch record stack (6C0H-6CFH). This part of the stack contains pointers to the destination instruction for one of the last 16 branches, exceptions, or interrupts that the processor took. See Section 17.12, “Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Skylake Microarchitecture.” 6C1H 1729 MSR_LASTBRANCH_1_TO_IP 3, 4, 6 Unique Last Branch Record 1 See description of MSR_LASTBRANCH_0 at 6C0H. 6C2H 1730 MSR_LASTBRANCH_2_TO_IP 3, 4, 6 Unique Last Branch Record 2 See description of MSR_LASTBRANCH_0 at 6C0H. 6C3H 1731 MSR_LASTBRANCH_3_TO_IP 3, 4, 6 Unique Last Branch Record 3 See description of MSR_LASTBRANCH_0 at 6C0H. 6C4H 1732 MSR_LASTBRANCH_4_TO_IP 3, 4, 6 Unique Last Branch Record 4 See description of MSR_LASTBRANCH_0 at 6C0H. 6C5H 1733 MSR_LASTBRANCH_5_TO_IP 3, 4, 6 Unique Last Branch Record 5 See description of MSR_LASTBRANCH_0 at 6C0H. 6C6H 1734 MSR_LASTBRANCH_6_TO_IP 3, 4, 6 Unique Last Branch Record 6 See description of MSR_LASTBRANCH_0 at 6C0H. 6C7H 1735 MSR_LASTBRANCH_7_TO_IP 3, 4, 6 Unique Last Branch Record 7 See description of MSR_LASTBRANCH_0 at 6C0H. 6C8H 1736 MSR_LASTBRANCH_8_TO_IP 3, 4, 6 Unique Last Branch Record 8 See description of MSR_LASTBRANCH_0 at 6C0H. Vol. 4 2-349 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-48. MSRs in the Pentium® 4 and Intel® Xeon® Processors (Contd.) Register Address Hex Dec 6C9H 1737 Register Name Fields and Flags MSR_LASTBRANCH_9_TO_IP Model Availability 3, 4, 6 Shared/ Unique1 Unique Bit Description Last Branch Record 9 See description of MSR_LASTBRANCH_0 at 6C0H. 6CAH 1738 MSR_LASTBRANCH_10_TO_IP 3, 4, 6 Unique Last Branch Record 10 See description of MSR_LASTBRANCH_0 at 6C0H. 6CBH 1739 MSR_LASTBRANCH_11_TO_IP 3, 4, 6 Unique Last Branch Record 11 See description of MSR_LASTBRANCH_0 at 6C0H. 6CCH 1740 MSR_LASTBRANCH_12_TO_IP 3, 4, 6 Unique Last Branch Record 12 See description of MSR_LASTBRANCH_0 at 6C0H. 6CDH 1741 MSR_LASTBRANCH_13_TO_IP 3, 4, 6 Unique Last Branch Record 13 See description of MSR_LASTBRANCH_0 at 6C0H. 6CEH 1742 MSR_LASTBRANCH_14_TO_IP 3, 4, 6 Unique Last Branch Record 14 See description of MSR_LASTBRANCH_0 at 6C0H. 6CFH 1743 MSR_LASTBRANCH_15_TO_IP 3, 4, 6 Unique Last Branch Record 15 See description of MSR_LASTBRANCH_0 at 6C0H. C000_ 0080H IA32_EFER C000_ 0081H IA32_STAR C000_ 0082H IA32_LSTAR C000_ 0084H IA32_FMASK C000_ 0100H IA32_FS_BASE C000_ 0101H IA32_GS_BASE C000_ 0102H IA32_KERNEL_GS_BASE 3, 4, 6 Unique Extended Feature Enables See Table 2-2. 3, 4, 6 Unique System Call Target Address (R/W) See Table 2-2. 3, 4, 6 Unique IA-32e Mode System Call Target Address (R/W) See Table 2-2. 3, 4, 6 Unique System Call Flag Mask (R/W) See Table 2-2. 3, 4, 6 Unique Map of BASE Address of FS (R/W) See Table 2-2. 3, 4, 6 Unique Map of BASE Address of GS (R/W) See Table 2-2. 3, 4, 6 Unique Swap Target of BASE Address of GS (R/W) See Table 2-2. NOTES 1. For HT-enabled processors, there may be more than one logical processors per physical unit. If an MSR is Shared, this means that one MSR is shared between logical processors. If an MSR is unique, this means that each logical processor has its own MSR. 2-350 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) 2.19.1 MSRs Unique to Intel® Xeon® Processor MP with L3 Cache The MSRs listed in Table 2-49 apply to Intel® Xeon® Processor MP with up to 8MB level three cache. These processors can be detected by enumerating the deterministic cache parameter leaf of CPUID instruction (with EAX = 4 as input) to detect the presence of the third level cache, and with CPUID reporting family encoding 0FH, model encoding 3 or 4 (see CPUID instruction for more details). Table 2-49. MSRs Unique to 64-bit Intel® Xeon® Processor MP with Up to an 8 MB L3 Cache Register Address 107CCH Register Name Fields and Flags MSR_IFSB_BUSQ0 Model Availability 3, 4 Shared/ Unique Shared Bit Description IFSB BUSQ Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache.” 107CDH MSR_IFSB_BUSQ1 3, 4 Shared IFSB BUSQ Event Control and Counter Register (R/W) 107CEH MSR_IFSB_SNPQ0 3, 4 Shared IFSB SNPQ Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache.” 107CFH MSR_IFSB_SNPQ1 3, 4 Shared IFSB SNPQ Event Control and Counter Register (R/W) 107D0H MSR_EFSB_DRDY0 3, 4 Shared EFSB DRDY Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache.” 107D1H MSR_EFSB_DRDY1 3, 4 Shared 107D2H MSR_IFSB_CTL6 3, 4 Shared EFSB DRDY Event Control and Counter Register (R/W) IFSB Latency Event Control Register (R/W) See Section 18.6.6, “Performance Monitoring on 64bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache.” 107D3H MSR_IFSB_CNTR7 3, 4 Shared IFSB Latency Event Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache.” The MSRs listed in Table 2-50 apply to Intel® Xeon® Processor 7100 series. These processors can be detected by enumerating the deterministic cache parameter leaf of CPUID instruction (with EAX = 4 as input) to detect the presence of the third level cache, and with CPUID reporting family encoding 0FH, model encoding 6 (See CPUID instruction for more details.). The performance monitoring MSRs listed in Table 2-50 are shared between logical processors in the same core, but are replicated for each core. Table 2-50. MSRs Unique to Intel® Xeon® Processor 7100 Series Register Address 107CCH Register Name Fields and Flags MSR_EMON_L3_CTR_CTL0 Model Availability 6 Shared/ Unique Shared Bit Description GBUSQ Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8MByte L3 Cache.” Vol. 4 2-351 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-50. MSRs Unique to Intel® Xeon® Processor 7100 Series (Contd.) Register Address Register Name Fields and Flags Model Availability Shared/ Unique 107CDH MSR_EMON_L3_CTR_CTL1 6 Shared 107CEH MSR_EMON_L3_CTR_CTL2 6 Shared Bit Description GBUSQ Event Control and Counter Register (R/W) GSNPQ Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8MByte L3 Cache.” 107CFH MSR_EMON_L3_CTR_CTL3 6 Shared GSNPQ Event Control and Counter Register (R/W) 107D0H MSR_EMON_L3_CTR_CTL4 6 Shared FSB Event Control and Counter Register (R/W) See Section 18.6.6, “Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8MByte L3 Cache.” 107D1H MSR_EMON_L3_CTR_CTL5 6 Shared FSB Event Control and Counter Register (R/W) 107D2H MSR_EMON_L3_CTR_CTL6 6 Shared FSB Event Control and Counter Register (R/W) 107D3H MSR_EMON_L3_CTR_CTL7 6 Shared FSB Event Control and Counter Register (R/W) 2.20 MSRS IN INTEL® CORE™ SOLO AND INTEL® CORE™ DUO PROCESSORS Model-specific registers (MSRs) for Intel Core Solo, Intel Core Duo processors, and Dual-core Intel Xeon processor LV are listed in Table 2-51. The column “Shared/Unique” applies to Intel Core Duo processor. “Unique” means each processor core has a separate MSR, or a bit field in an MSR governs only a core independently. “Shared” means the MSR or the bit field in an MSR address governs the operation of both processor cores. Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Register Name Shared/ Unique Bit Description Hex Dec 0H 0 P5_MC_ADDR Unique See Section 2.23, “MSRs in Pentium Processors,” and see Table 2-2. 1H 1 P5_MC_TYPE Unique See Section 2.23, “MSRs in Pentium Processors,” and see Table 2-2. 6H 6 IA32_MONITOR_FILTER_SIZE Unique See Section 8.10.5, “Monitor/Mwait Address Range Determination,” and see Table 2-2. 10H 16 IA32_TIME_STAMP_COUNTER Unique See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Shared Platform ID (R) See Table 2-2. The operating system can use this MSR to determine “slot” information for the processor and the proper microcode update to load. 1BH 2-352 Vol. 4 27 IA32_APIC_BASE Unique See Section 10.4.4, “Local APIC Status and Location,” and see Table 2-2. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec 2AH 42 Register Name MSR_EBL_CR_POWERON Shared/ Unique Shared Bit Description Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration. 0 Reserved 1 Data Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 2 Response Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 3 MCERR# Drive Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 4 Address Parity Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 6: 5 7 Reserved BINIT# Driver Enable (R/W) 1 = Enabled; 0 = Disabled Note: Not all processor implements R/W. 8 Output Tri-state Enabled (R/O) 1 = Enabled; 0 = Disabled 9 Execute BIST (R/O) 1 = Enabled; 0 = Disabled 10 MCERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled 11 12 Reserved BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled 13 Reserved 14 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes 15 Reserved 17:16 APIC Cluster ID (R/O) 18 System Bus Frequency (R/O) 0 = 100 MHz 1 = Reserved 19 Reserved 21: 20 Symmetric Arbitration ID (R/O) 26:22 Clock Frequency Ratio (R/O) Vol. 4 2-353 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec 3AH 58 Register Name IA32_FEATURE_CONTROL Shared/ Unique Unique Bit Description Control Features in IA-32 Processor (R/W) See Table 2-2. 40H 64 MSR_LASTBRANCH_0 Unique Last Branch Record 0 (R/W) One of 8 last branch record registers on the last branch record stack: bits 31-0 hold the ‘from’ address and bits 63-32 hold the ‘to’ address. See also: • Last Branch Record Stack TOS at 1C9H • Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors).” 41H 65 MSR_LASTBRANCH_1 Unique Last Branch Record 1 (R/W) See description of MSR_LASTBRANCH_0. 42H 66 MSR_LASTBRANCH_2 Unique Last Branch Record 2 (R/W) See description of MSR_LASTBRANCH_0. 43H 67 MSR_LASTBRANCH_3 Unique Last Branch Record 3 (R/W) See description of MSR_LASTBRANCH_0. 44H 68 MSR_LASTBRANCH_4 Unique Last Branch Record 4 (R/W) See description of MSR_LASTBRANCH_0. 45H 69 MSR_LASTBRANCH_5 Unique Last Branch Record 5 (R/W) See description of MSR_LASTBRANCH_0. 46H 70 MSR_LASTBRANCH_6 Unique Last Branch Record 6 (R/W) See description of MSR_LASTBRANCH_0. 47H 71 MSR_LASTBRANCH_7 Unique Last Branch Record 7 (R/W) See description of MSR_LASTBRANCH_0. 79H 121 IA32_BIOS_UPDT_TRIG Unique BIOS Update Trigger Register (W) See Table 2-2. 8BH 139 IA32_BIOS_SIGN_ID Unique BIOS Update Signature ID (RO) See Table 2-2. C1H 193 IA32_PMC0 Unique Performance Counter Register See Table 2-2. C2H 194 IA32_PMC1 Unique Performance Counter Register See Table 2-2. CDH 205 MSR_FSB_FREQ Shared Scaleable Bus Speed (RO) This field indicates the scaleable bus clock speed: 2:0 • 101B: 100 MHz (FSB 400) • 001B: 133 MHz (FSB 533) • 011B: 167 MHz (FSB 667) 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 101B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 63:3 2-354 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec E7H 231 Register Name IA32_MPERF Shared/ Unique Unique Bit Description Maximum Performance Frequency Clock Count (RW) See Table 2-2. E8H 232 IA32_APERF Unique Actual Performance Frequency Clock Count (RW) See Table 2-2. FEH 254 IA32_MTRRCAP Unique See Table 2-2. 11EH 281 MSR_BBL_CR_CTL3 Shared Control Register 3 Used to configure the L2 Cache. 0 L2 Hardware Enabled (RO) 1= 0= If the L2 is hardware-enabled Indicates if the L2 is hardware-disabled 7:1 Reserved 8 L2 Enabled (R/W) 1 = L2 cache has been initialized 0 = Disabled (default) Until this bit is set the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input. 22:9 Reserved 23 L2 Not Present (RO) 0= 1= 63:24 L2 Present L2 Not Present Reserved 174H 372 IA32_SYSENTER_CS Unique See Table 2-2. 175H 373 IA32_SYSENTER_ESP Unique See Table 2-2. 176H 374 IA32_SYSENTER_EIP Unique See Table 2-2. 179H 377 IA32_MCG_CAP Unique See Table 2-2. 17AH 378 IA32_MCG_STATUS Unique Global Machine Check Status 0 RIPV When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If this bit is cleared, the program cannot be reliably restarted. 1 EIPV When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. Vol. 4 2-355 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Register Name Shared/ Unique Bit Description Dec 2 MCIP When set, this bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 Reserved 186H 390 IA32_PERFEVTSEL0 Unique See Table 2-2. 187H 391 IA32_PERFEVTSEL1 Unique See Table 2-2. 198H 408 IA32_PERF_STATUS Shared See Table 2-2. 199H 409 IA32_PERF_CTL Unique See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Unique Clock Modulation (R/W) See Table 2-2. 19BH 411 IA32_THERM_INTERRUPT Unique Thermal Interrupt Control (R/W) See Table 2-2. See Section 14.7.2, “Thermal Monitor.” 19CH 412 IA32_THERM_STATUS Unique Thermal Monitor Status (R/W) See Table 2-2. See Section 14.7.2, “Thermal Monitor”. 19DH 413 MSR_THERM2_CTL Unique Thermal Monitor 2 Control 15:0 Reserved 16 TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle) 1 = Thermal Monitor 2 (thermally-initiated frequency transitions) If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 will be enabled. 63:16 1A0H 416 Reserved IA32_MISC_ENABLE Enable Miscellaneous Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 2:0 3 Reserved Unique Automatic Thermal Control Circuit Enable (R/W) See Table 2-2. 6:4 7 Reserved Shared Performance Monitoring Available (R) See Table 2-2. 2-356 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Register Name Shared/ Unique Bit Description Dec 9:8 10 Reserved Shared FERR# Multiplexing Enable (R/W) 1= FERR# asserted by the processor to indicate a pending break event within the processor 0 = Indicates compatible FERR# signaling behavior This bit must be set to 1 to support XAPIC interrupt model usage. 11 Shared Branch Trace Storage Unavailable (RO) See Table 2-2. 12 13 Reserved Shared TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the pre-determined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermal managed state. If the TM2 feature flag (ECX[8]) is not set to 1 after executing CPUID with EAX = 1, then this feature is not supported and BIOS must not alter the contents of this bit location. The processor is operating out of spec if both this bit and the TM1 bit are set to disabled states. 15:14 16 Reserved Shared Enhanced Intel SpeedStep Technology Enable (R/W) 1= 18 Shared Enhanced Intel SpeedStep Technology enabled ENABLE MONITOR FSM (R/W) See Table 2-2. 19 22 Reserved Shared Limit CPUID Maxval (R/W) See Table 2-2. Setting this bit may cause behavior in software that depends on the availability of CPUID leaves greater than 2. 33:23 34 Reserved Shared XD Bit Disable (R/W) See Table 2-2. 63:35 1C9H 457 MSR_LASTBRANCH_TOS Reserved Unique Last Branch Record Stack TOS (R/W) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 40H). Vol. 4 2-357 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec 1D9H 473 Register Name IA32_DEBUGCTL Shared/ Unique Unique Bit Description Debug Control (R/W) Controls how several debug features are used. Bit definitions are discussed in Table 2-2. 1DDH 477 MSR_LER_FROM_LIP Unique Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Unique Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 200H 512 MTRRphysBase0 Unique Memory Type Range Registers 201H 513 MTRRphysMask0 Unique Memory Type Range Registers 202H 514 MTRRphysBase1 Unique Memory Type Range Registers 203H 515 MTRRphysMask1 Unique Memory Type Range Registers 204H 516 MTRRphysBase2 Unique Memory Type Range Registers 205H 517 MTRRphysMask2 Unique Memory Type Range Registers 206H 518 MTRRphysBase3 Unique Memory Type Range Registers 207H 519 MTRRphysMask3 Unique Memory Type Range Registers 208H 520 MTRRphysBase4 Unique Memory Type Range Registers 209H 521 MTRRphysMask4 Unique Memory Type Range Registers 20AH 522 MTRRphysBase5 Unique Memory Type Range Registers 20BH 523 MTRRphysMask5 Unique Memory Type Range Registers 20CH 524 MTRRphysBase6 Unique Memory Type Range Registers 20DH 525 MTRRphysMask6 Unique Memory Type Range Registers 20EH 526 MTRRphysBase7 Unique Memory Type Range Registers 20FH 527 MTRRphysMask7 Unique Memory Type Range Registers 250H 592 MTRRfix64K_00000 Unique Memory Type Range Registers 258H 600 MTRRfix16K_80000 Unique Memory Type Range Registers 259H 601 MTRRfix16K_A0000 Unique Memory Type Range Registers 268H 616 MTRRfix4K_C0000 Unique Memory Type Range Registers 269H 617 MTRRfix4K_C8000 Unique Memory Type Range Registers 26AH 618 MTRRfix4K_D0000 Unique Memory Type Range Registers 26BH 619 MTRRfix4K_D8000 Unique Memory Type Range Registers 26CH 620 MTRRfix4K_E0000 Unique Memory Type Range Registers 26DH 621 MTRRfix4K_E8000 Unique Memory Type Range Registers 26EH 622 MTRRfix4K_F0000 Unique Memory Type Range Registers 26FH 623 MTRRfix4K_F8000 Unique Memory Type Range Registers 2-358 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec 2FFH 767 Register Name IA32_MTRR_DEF_TYPE Shared/ Unique Unique Bit Description Default Memory Types (R/W) See Table 2-2. See Section 11.11.2.1, “IA32_MTRR_DEF_TYPE MSR.” 400H 1024 IA32_MC0_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 406H 1030 IA32_MC1_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 408H 1032 IA32_MC2_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40AH 1034 IA32_MC2_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 MSR_MC4_CTL Unique See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 MSR_MC4_STATUS Unique See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 MSR_MC4_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 410H 1040 IA32_MC3_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 IA32_MC3_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 MSR_MC3_ADDR Unique See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. Vol. 4 2-359 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Register Name Shared/ Unique Bit Description Hex Dec 413H 1043 MSR_MC3_MISC Unique Machine Check Error Reporting Register - contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. 414H 1044 MSR_MC5_CTL Unique Machine Check Error Reporting Register - controls signaling of #MC for errors produced by a particular hardware unit (or group of hardware units). 415H 1045 MSR_MC5_STATUS Unique Machine Check Error Reporting Register - contains information related to a machine-check error if its VAL (valid) flag is set. Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to them; writing 1s to them causes a general-protection exception. 416H 1046 MSR_MC5_ADDR Unique Machine Check Error Reporting Register - contains the address of the code or data memory location that produced the machine-check error if the ADDRV flag in the IA32_MCi_STATUS register is set. 417H 1047 MSR_MC5_MISC Unique Machine Check Error Reporting Register - contains additional information describing the machine-check error if the MISCV flag in the IA32_MCi_STATUS register is set. 480H 1152 IA32_VMX_BASIC Unique Reporting Register of Basic VMX Capabilities (R/O) See Table 2-2. See Appendix A.1, “Basic VMX Information”. (If CPUID.01H:ECX.[bit 5]) 481H 1153 IA32_VMX_PINBASED_CTLS Unique Capability Reporting Register of Pin-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls”. (If CPUID.01H:ECX.[bit 5]) 482H 1154 IA32_VMX_PROCBASED_CTLS Unique Capability Reporting Register of Primary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls”. (If CPUID.01H:ECX.[bit 5]) 483H 1155 IA32_VMX_EXIT_CTLS Unique Capability Reporting Register of VM-Exit Controls (R/O) See Appendix A.4, “VM-Exit Controls”. (If CPUID.01H:ECX.[bit 5]) 484H 1156 IA32_VMX_ENTRY_CTLS Unique Capability Reporting Register of VM-Entry Controls (R/O) See Appendix A.5, “VM-Entry Controls”. (If CPUID.01H:ECX.[bit 5]) 485H 1157 IA32_VMX_MISC Unique Reporting Register of Miscellaneous VMX Capabilities (R/O) See Appendix A.6, “Miscellaneous Data”. (If CPUID.01H:ECX.[bit 5]) 486H 1158 IA32_VMX_CR0_FIXED0 Unique Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Appendix A.7, “VMX-Fixed Bits in CR0”. (If CPUID.01H:ECX.[bit 5]) 2-360 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-51. MSRs in Intel® Core™ Solo, Intel® Core™ Duo Processors, and Dual-Core Intel® Xeon® Processor LV Register Address Hex Dec 487H 1159 Register Name IA32_VMX_CR0_FIXED1 Shared/ Unique Unique Bit Description Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Appendix A.7, “VMX-Fixed Bits in CR0”. (If CPUID.01H:ECX.[bit 5]) 488H 1160 IA32_VMX_CR4_FIXED0 Unique Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Appendix A.8, “VMX-Fixed Bits in CR4”. (If CPUID.01H:ECX.[bit 5]) 489H 1161 IA32_VMX_CR4_FIXED1 Unique Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Appendix A.8, “VMX-Fixed Bits in CR4”. (If CPUID.01H:ECX.[bit 5]) 48AH 1162 IA32_VMX_VMCS_ENUM Unique Capability Reporting Register of VMCS Field Enumeration (R/O) See Appendix A.9, “VMCS Enumeration”. (If CPUID.01H:ECX.[bit 5]) 48BH 1163 IA32_VMX_PROCBASED_CTLS2 Unique Capability Reporting Register of Secondary Processor-Based VM-Execution Controls (R/O) See Appendix A.3, “VM-Execution Controls”. (If CPUID.01H:ECX.[bit 5] and IA32_VMX_PROCBASED_CTLS[bit 63]) 600H 1536 IA32_DS_AREA Unique DS Save Area (R/W) See Table 2-2. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 31:0 DS Buffer Management Area Linear address of the first byte of the DS buffer management area. 63:32 C000_ 0080H 2.21 Reserved IA32_EFER Unique See Table 2-2. 10:0 Reserved 11 Execute Disable Bit Enable 63:12 Reserved MSRS IN THE PENTIUM M PROCESSOR Model-specific registers (MSRs) for the Pentium M processor are similar to those described in Section 2.22 for P6 family processors. The following table describes new MSRs and MSRs whose behavior has changed on the Pentium M processor. Table 2-52. MSRs in Pentium M Processors Register Address Hex Dec 0H 0 Register Name / Bit Fields P5_MC_ADDR Bit Description See Section 2.23, “MSRs in Pentium Processors.” Vol. 4 2-361 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Bit Description Register Name / Bit Fields Hex Dec 1H 1 P5_MC_TYPE See Section 2.23, “MSRs in Pentium Processors.” 10H 16 IA32_TIME_STAMP_COUNTER See Section 17.17, “Time-Stamp Counter,” and see Table 2-2. 17H 23 IA32_PLATFORM_ID Platform ID (R) See Table 2-2. The operating system can use this MSR to determine “slot” information for the processor and the proper microcode update to load. 2AH 42 MSR_EBL_CR_POWERON Processor Hard Power-On Configuration (R/W) Enables and disables processor features. (R) Indicates current processor configuration. 0 Reserved 1 Data Error Checking Enable (R) 0 = Disabled Always 0 on the Pentium M processor. 2 Response Error Checking Enable (R) 0 = Disabled Always 0 on the Pentium M processor. 3 MCERR# Drive Enable (R) 0 = Disabled Always 0 on the Pentium M processor. 4 Address Parity Enable (R) 0 = Disabled Always 0 on the Pentium M processor. 6:5 Reserved 7 BINIT# Driver Enable (R) 1 = Enabled; 0 = Disabled Always 0 on the Pentium M processor. 8 Output Tri-state Enabled (R/O) 1 = Enabled; 0 = Disabled 9 Execute BIST (R/O) 1 = Enabled; 0 = Disabled 10 MCERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled Always 0 on the Pentium M processor. 11 Reserved 12 BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled Always 0 on the Pentium M processor. 13 2-362 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Bit Description Register Name / Bit Fields Dec 14 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes Always 0 on the Pentium M processor. 15 Reserved 17:16 APIC Cluster ID (R/O) Always 00B on the Pentium M processor. 18 System Bus Frequency (R/O) 0 = 100 MHz 1 = Reserved Always 0 on the Pentium M processor. 19 Reserved 21: 20 Symmetric Arbitration ID (R/O) Always 00B on the Pentium M processor. 40H 64 26:22 Clock Frequency Ratio (R/O) MSR_LASTBRANCH_0 Last Branch Record 0 (R/W) One of 8 last branch record registers on the last branch record stack: bits 31-0 hold the ‘from’ address and bits 63-32 hold the to address. See also: • Last Branch Record Stack TOS at 1C9H. • Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors)”. 41H 65 MSR_LASTBRANCH_1 Last Branch Record 1 (R/W) See description of MSR_LASTBRANCH_0. 42H 66 MSR_LASTBRANCH_2 Last Branch Record 2 (R/W) See description of MSR_LASTBRANCH_0. 43H 67 MSR_LASTBRANCH_3 Last Branch Record 3 (R/W) See description of MSR_LASTBRANCH_0. 44H 68 MSR_LASTBRANCH_4 Last Branch Record 4 (R/W) See description of MSR_LASTBRANCH_0. 45H 69 MSR_LASTBRANCH_5 Last Branch Record 5 (R/W) See description of MSR_LASTBRANCH_0. 46H 70 MSR_LASTBRANCH_6 Last Branch Record 6 (R/W) See description of MSR_LASTBRANCH_0. 47H 71 MSR_LASTBRANCH_7 Last Branch Record 7 (R/W) See description of MSR_LASTBRANCH_0. 119H 281 MSR_BBL_CR_CTL Control Register Used to program L2 commands to be issued via cache configuration accesses mechanism. Also receives L2 lookup response. 11EH 281 63:0 Reserved MSR_BBL_CR_CTL3 Control register 3 Used to configure the L2 Cache. Vol. 4 2-363 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Bit Description Register Name / Bit Fields Dec 0 L2 Hardware Enabled (RO) 1 = If the L2 is hardware-enabled. 0 = Indicates if the L2 is hardware-disabled. 4:1 Reserved 5 ECC Check Enable (RO) This bit enables ECC checking on the cache data bus. ECC is always generated on write cycles. 0 = Disabled (default) 1 = Enabled For the Pentium M processor, ECC checking on the cache data bus is always enabled. 7:6 Reserved 8 L2 Enabled (R/W) 1 = L2 cache has been initialized 0 = Disabled (default) Until this bit is set the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input. 22:9 23 Reserved L2 Not Present (RO) 0 = L2 Present 1 = L2 Not Present 179H 377 63:24 Reserved IA32_MCG_CAP Read-only register that provides information about the machine-check architecture of the processor. 7:0 Count (RO) Indicates the number of hardware unit error reporting banks available in the processor. 8 IA32_MCG_CTL Present (RO) 1 = Indicates that the processor implements the MSR_MCG_CTL register found at MSR 17BH. 0 = Not supported. 17AH 378 63:9 Reserved IA32_MCG_STATUS Global Machine Check Status 0 RIPV When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If this bit is cleared, the program cannot be reliably restarted. 1 EIPV When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. 2-364 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Bit Description Register Name / Bit Fields Dec 2 MCIP When set, this bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. 63:3 Reserved 198H 408 IA32_PERF_STATUS See Table 2-2. 199H 409 IA32_PERF_CTL See Table 2-2. 19AH 410 IA32_CLOCK_MODULATION Clock Modulation (R/W). See Table 2-2. See Section 14.7.3, “Software Controlled Clock Modulation.” 19BH 411 IA32_THERM_INTERRUPT Thermal Interrupt Control (R/W) See Table 2-2. See Section 14.7.2, “Thermal Monitor.” 19CH 412 IA32_THERM_STATUS Thermal Monitor Status (R/W) See Table 2-2. See Section 14.7.2, “Thermal Monitor.” 19DH 413 MSR_THERM2_CTL Thermal Monitor 2 Control 15:0 Reserved 16 TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle) 1 = Thermal Monitor 2 (thermally-initiated frequency transitions) If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 will be enabled. 1A0H 416 63:16 Reserved IA32_MISC_ENABLE Enable Miscellaneous Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. 2:0 Reserved Vol. 4 2-365 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Bit Description Register Name / Bit Fields Dec 3 Automatic Thermal Control Circuit Enable (R/W) 1 = Setting this bit enables the thermal control circuit (TCC) portion of the Intel Thermal Monitor feature. This allows processor clocks to be automatically modulated based on the processor's thermal sensor operation. 0 = Disabled (default). The automatic thermal control circuit enable bit determines if the thermal control circuit (TCC) will be activated when the processor's internal thermal sensor determines the processor is about to exceed its maximum operating temperature. When the TCC is activated and TM1 is enabled, the processors clocks will be forced to a 50% duty cycle. BIOS must enable this feature. The bit should not be confused with the on-demand thermal control circuit enable bit. 6:4 Reserved 7 Performance Monitoring Available (R) 1= 0= Performance monitoring enabled. Performance monitoring disabled. 9:8 Reserved 10 FERR# Multiplexing Enable (R/W) 1= FERR# asserted by the processor to indicate a pending break event within the processor. 0 = Indicates compatible FERR# signaling behavior. This bit must be set to 1 to support XAPIC interrupt model usage. Branch Trace Storage Unavailable (RO) 1 = Processor doesn’t support branch trace storage (BTS) 0 = BTS is supported 12 Processor Event Based Sampling Unavailable (RO) 1= Processor does not support processor event based sampling (PEBS); 0 = PEBS is supported. The Pentium M processor does not support PEBS. 15:13 Reserved 16 Enhanced Intel SpeedStep Technology Enable (R/W) 1 = Enhanced Intel SpeedStep Technology enabled. On the Pentium M processor, this bit may be configured to be read-only. 22:17 Reserved 23 xTPR Message Disable (R/W) When set to 1, xTPR messages are disabled. xTPR messages are optional messages that allow the processor to inform the chipset of its priority. The default is processor specific. 63:24 2-366 Vol. 4 Reserved MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Dec 1C9H 457 Bit Description Register Name / Bit Fields MSR_LASTBRANCH_TOS Last Branch Record Stack TOS (R/W) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See also: • MSR_LASTBRANCH_0_FROM_IP (at 40H). • Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors)”. 1D9H 473 MSR_DEBUGCTLB Debug Control (R/W) Controls how several debug features are used. Bit definitions are discussed in the referenced section. See Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors).” 1DDH 477 MSR_LER_TO_LIP Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors)” and Section 17.16.2, “Last Branch and Last Exception MSRs.” 1DEH 478 MSR_LER_FROM_LIP Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.15, “Last Branch, Interrupt, and Exception Recording (Pentium M Processors)” and Section 17.16.2, “Last Branch and Last Exception MSRs.” 2FFH 767 IA32_MTRR_DEF_TYPE Default Memory Types (R/W) Sets the memory type for the regions of physical memory that are not mapped by the MTRRs. See Section 11.11.2.1, “IA32_MTRR_DEF_TYPE MSR.” 400H 1024 IA32_MC0_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 401H 1025 IA32_MC0_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 402H 1026 IA32_MC0_ADDR See Section 14.3.2.3., “IA32_MCi_ADDR MSRs”. The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 404H 1028 IA32_MC1_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 405H 1029 IA32_MC1_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 406H 1030 IA32_MC1_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 408H 1032 IA32_MC2_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 409H 1033 IA32_MC2_STATUS See Chapter 15.3.2.2, “IA32_MCi_STATUS MSRS.” Vol. 4 2-367 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-52. MSRs in Pentium M Processors (Contd.) Register Address Hex Dec 40AH 1034 Bit Description Register Name / Bit Fields IA32_MC2_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40CH 1036 MSR_MC4_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 40DH 1037 MSR_MC4_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 40EH 1038 MSR_MC4_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 410H 1040 MSR_MC3_CTL See Section 15.3.2.1, “IA32_MCi_CTL MSRs.” 411H 1041 MSR_MC3_STATUS See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” 412H 1042 MSR_MC3_ADDR See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.” The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 600H 1536 IA32_DS_AREA DS Save Area (R/W) See Table 2-2. Points to the DS buffer management area, which is used to manage the BTS and PEBS buffers. See Section 18.6.3.4, “Debug Store (DS) Mechanism.” 31:0 DS Buffer Management Area Linear address of the first byte of the DS buffer management area. 63:32 2.22 Reserved MSRS IN THE P6 FAMILY PROCESSORS The following MSRs are defined for the P6 family processors. The MSRs in this table that are shaded are available only in the Pentium II and Pentium III processors. Beginning with the Pentium 4 processor, some of the MSRs in this list have been designated as “architectural” and have had their names changed. See Table 2-2 for a list of the architectural MSRs. Table 2-53. MSRs in the P6 Family Processors Register Address Register Name / Bit Fields Bit Description Hex Dec 0H 0 P5_MC_ADDR See Section 2.23, “MSRs in Pentium Processors.” 1H 1 P5_MC_TYPE See Section 2.23, “MSRs in Pentium Processors.” 10H 16 TSC See Section 17.17, “Time-Stamp Counter.” 2-368 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Dec 17H 23 Register Name / Bit Fields IA32_PLATFORM_ID Bit Description Platform ID (R) The operating system can use this MSR to determine “slot” information for the processor and the proper microcode update to load. 49:0 52:50 Reserved Platform Id (R) Contains information concerning the intended platform for the processor. 52 0 0 0 0 1 1 1 1 1BH 27 51 0 0 1 1 0 0 1 1 50 0 1 0 1 0 1 0 1 Processor Flag 0 Processor Flag 1 Processor Flag 2 Processor Flag 3 Processor Flag 4 Processor Flag 5 Processor Flag 6 Processor Flag 7 56:53 L2 Cache Latency Read. 59:57 Reserved 60 Clock Frequency Ratio Read. 63:61 Reserved APIC_BASE Section 10.4.4, “Local APIC Status and Location.” 7:0 Reserved 8 Boot Strap Processor Indicator Bit 1 = BSP 10:9 11 Reserved APIC Global Enable Bit - Permanent till reset 1 = Enabled 0 = Disabled 2AH 42 31:12 APIC Base Address. 63:32 Reserved EBL_CR_POWERON Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration. 0 Reserved1 1 Data Error Checking Enable (R/W) 1 = Enabled 0 = Disabled 2 Response Error Checking Enable FRCERR Observation Enable (R/W) 1 = Enabled 0 = Disabled 3 AERR# Drive Enable (R/W) 1 = Enabled 0 = Disabled Vol. 4 2-369 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec 4 BERR# Enable for Initiator Bus Requests (R/W) 1 = Enabled 0 = Disabled 5 Reserved 6 BERR# Driver Enable for Initiator Internal Errors (R/W) 1 = Enabled 0 = Disabled 7 BINIT# Driver Enable (R/W) 1 = Enabled 0 = Disabled 8 Output Tri-state Enabled (R) 1 = Enabled 0 = Disabled 9 Execute BIST (R) 1 = Enabled 0 = Disabled 10 AERR# Observation Enabled (R) 1 = Enabled 0 = Disabled 11 Reserved 12 BINIT# Observation Enabled (R) 1 = Enabled 0 = Disabled 13 In Order Queue Depth (R) 1=1 0=8 14 1-MByte Power on Reset Vector (R) 1 = 1MByte 0 = 4GBytes 15 FRC Mode Enable (R) 1 = Enabled 0 = Disabled 17:16 19:18 APIC Cluster ID (R) System Bus Frequency (R) 00 = 66MHz 10 = 100Mhz 01 = 133MHz 11 = Reserved 2-370 Vol. 4 21: 20 Symmetric Arbitration ID (R) 25:22 Clock Frequency Ratio (R) 26 Low Power Mode Enable (R/W) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex 33H Register Name / Bit Fields Bit Description Dec 51 27 Clock Frequency Ratio 63:28 Reserved1 TEST_CTL Test Control Register 29:0 Reserved 30 Streaming Buffer Disable 31 Disable LOCK# Assertion for split locked access. 79H 121 BIOS_UPDT_TRIG BIOS Update Trigger Register. 88H 136 BBL_CR_D0[63:0] Chunk 0 data register D[63:0]: used to write to and read from the L2 89H 137 BBL_CR_D1[63:0] Chunk 1 data register D[63:0]: used to write to and read from the L2 8AH 138 BBL_CR_D2[63:0] Chunk 2 data register D[63:0]: used to write to and read from the L2 8BH 139 BIOS_SIGN/BBL_CR_D3[63:0] BIOS Update Signature Register or Chunk 3 data register D[63:0] Used to write to and read from the L2 depending on the usage model. C1H 193 PerfCtr0 (PERFCTR0) Performance Counter Register See Table 2-2. C2H 194 PerfCtr1 (PERFCTR1) Performance Counter Register See Table 2-2. FEH 254 116H 278 MTRRcap Memory Type Range Registers BBL_CR_ADDR [63:0] Address register: used to send specified address (A31-A3) to L2 during cache initialization accesses. BBL_CR_ADDR [63:32] Reserved, BBL_CR_ADDR [31:3] Address bits [35:3] BBL_CR_ADDR [2:0] Reserved Set to 0. 118H 280 BBL_CR_DECC[63:0] Data ECC register D[7:0]: used to write ECC and read ECC to/from L2 119H 281 BBL_CR_CTL Control register: used to program L2 commands to be issued via cache configuration accesses mechanism. Also receives L2 lookup response BL_CR_CTL[63:22] BBL_CR_CTL[21] Reserved Processor number2 Disable = 1 Enable = 0 Reserved Vol. 4 2-371 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec BBL_CR_CTL[20:19] User supplied ECC BBL_CR_CTL[18] Reserved BBL_CR_CTL[17] L2 Hit BBL_CR_CTL[16] Reserved BBL_CR_CTL[15:14] State from L2 BBL_CR_CTL[13:12] Modified - 11,Exclusive - 10, Shared - 01, Invalid - 00 BBL_CR_CTL[11:10] BBL_CR_CTL[9:8] BBL_CR_CTL[7] BBL_CR_CTL[6:5] BBL_CR_CTL[4:0] 01100 01110 01111 00010 00011 010 + MESI encode 111 + MESI encode 100 + MESI encode Way from L2 Way 0 - 00, Way 1 - 01, Way 2 - 10, Way 3 - 11 Way to L2 Reserved State to L2 L2 Command Data Read w/ LRU update (RLU) Tag Read w/ Data Read (TRR) Tag Inquire (TI) L2 Control Register Read (CR) L2 Control Register Write (CW) Tag Write w/ Data Read (TWR) Tag Write w/ Data Write (TWW) Tag Write (TW) 11AH 282 BBL_CR_TRIG Trigger register: used to initiate a cache configuration accesses access, Write only with Data = 0. 11BH 283 BBL_CR_BUSY Busy register: indicates when a cache configuration accesses L2 command is in progress. D[0] = 1 = BUSY 286 BBL_CR_CTL3 Control register 3: used to configure the L2 Cache BBL_CR_CTL3[63:26] Reserved BBL_CR_CTL3[25] Cache bus fraction (read only) BBL_CR_CTL3[24] Reserved BBL_CR_CTL3[23] L2 Hardware Disable (read only) BBL_CR_CTL3[22:20] L2 Physical Address Range support 11EH 111 110 101 100 011 010 001 000 2-372 Vol. 4 64GBytes 32GBytes 16GBytes 8GBytes 4GBytes 2GBytes 1GBytes 512MBytes BBL_CR_CTL3[19] Reserved BBL_CR_CTL3[18] Cache State error checking enable (read/write) MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec BBL_CR_CTL3[17:13 Cache size per bank (read/write) 00001 00010 00100 01000 10000 256KBytes 512KBytes 1MByte 2MByte 4MBytes BBL_CR_CTL3[12:11] Number of L2 banks (read only) BBL_CR_CTL3[10:9] L2 Associativity (read only) Direct Mapped 2 Way 4 Way Reserved 00 01 10 11 BBL_CR_CTL3[8] L2 Enabled (read/write) BBL_CR_CTL3[7] CRTN Parity Check Enable (read/write) BBL_CR_CTL3[6] Address Parity Check Enable (read/write) BBL_CR_CTL3[5] ECC Check Enable (read/write) BBL_CR_CTL3[4:1] L2 Cache Latency (read/write) BBL_CR_CTL3[0] L2 Configured (read/write ) 174H 372 SYSENTER_CS_MSR CS register target for CPL 0 code 175H 373 SYSENTER_ESP_MSR Stack pointer for CPL 0 stack 176H 374 SYSENTER_EIP_MSR CPL 0 code entry point 179H 377 MCG_CAP Machine Check Global Control Register 17AH 378 MCG_STATUS Machine Check Error Reporting Register - contains information related to a machine-check error if its VAL (valid) flag is set. Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to them; writing 1s to them causes a general-protection exception. 17BH 379 MCG_CTL Machine Check Error Reporting Register - controls signaling of #MC for errors produced by a particular hardware unit (or group of hardware units). 186H 390 PerfEvtSel0 (EVNTSEL0) Performance Event Select Register 0 (R/W) 7:0 Event Select Refer to Performance Counter section for a list of event encodings. 15:8 UMASK (Unit Mask) Unit mask register set to 0 to enable all count options. 16 USER Controls the counting of events at Privilege levels of 1, 2, and 3. 17 OS Controls the counting of events at Privilege level of 0. Vol. 4 2-373 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec 18 E Occurrence/Duration Mode Select 1 = Occurrence 0 = Duration 19 PC Enabled the signaling of performance counter overflow via BP0 pin 20 INT Enables the signaling of counter overflow via input to APIC 1 = Enable 0 = Disable 22 ENABLE Enables the counting of performance events in both counters 1 = Enable 0 = Disable 23 INV Inverts the result of the CMASK condition 1 = Inverted 0 = Non-Inverted 187H 391 31:24 CMASK (Counter Mask) PerfEvtSel1 (EVNTSEL1) Performance Event Select for Counter 1 (R/W) 7:0 Event Select Refer to Performance Counter section for a list of event encodings. 15:8 UMASK (Unit Mask) Unit mask register set to 0 to enable all count options. 16 USER Controls the counting of events at Privilege levels of 1, 2, and 3. 17 OS Controls the counting of events at Privilege level of 0. 18 E Occurrence/Duration Mode Select. 1 = Occurrence 0 = Duration 19 PC Enabled the signaling of performance counter overflow via BP0 pin. 2-374 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Hex Register Name / Bit Fields Bit Description Dec 20 INT Enables the signaling of counter overflow via input to APIC. 1 = Enable 0 = Disable 23 INV Inverts the result of the CMASK condition. 1 = Inverted 0 = Non-Inverted 1D9H 473 31:24 CMASK (Counter Mask) DEBUGCTLMSR Enables last branch, interrupt, and exception recording; taken branch breakpoints; the breakpoint reporting pins; and trace messages. This register can be written to using the WRMSR instruction, when operating at privilege level 0 or when in real-address mode. 0 Enable/Disable Last Branch Records 1 Branch Trap Flag 2 Performance Monitoring/Break Point Pins 3 Performance Monitoring/Break Point Pins 4 Performance Monitoring/Break Point Pins 5 Performance Monitoring/Break Point Pins 6 Enable/Disable Execution Trace Messages 31:7 Reserved 1DBH 475 LASTBRANCHFROMIP 32-bit register for recording the instruction pointers for the last branch, interrupt, or exception that the processor took prior to a debug exception being generated. 1DCH 476 LASTBRANCHTOIP 32-bit register for recording the instruction pointers for the last branch, interrupt, or exception that the processor took prior to a debug exception being generated. 1DDH 477 LASTINTFROMIP Last INT from IP 1DEH 478 LASTINTTOIP Last INT to IP 200H 512 MTRRphysBase0 Memory Type Range Registers 201H 513 MTRRphysMask0 Memory Type Range Registers 202H 514 MTRRphysBase1 Memory Type Range Registers 203H 515 MTRRphysMask1 Memory Type Range Registers 204H 516 MTRRphysBase2 Memory Type Range Registers 205H 517 MTRRphysMask2 Memory Type Range Registers 206H 518 MTRRphysBase3 Memory Type Range Registers 207H 519 MTRRphysMask3 Memory Type Range Registers 208H 520 MTRRphysBase4 Memory Type Range Registers 209H 521 MTRRphysMask4 Memory Type Range Registers 20AH 522 MTRRphysBase5 Memory Type Range Registers Vol. 4 2-375 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Register Name / Bit Fields Bit Description Hex Dec 20BH 523 MTRRphysMask5 Memory Type Range Registers 20CH 524 MTRRphysBase6 Memory Type Range Registers 20DH 525 MTRRphysMask6 Memory Type Range Registers 20EH 526 MTRRphysBase7 Memory Type Range Registers 20FH 527 MTRRphysMask7 Memory Type Range Registers 250H 592 MTRRfix64K_00000 Memory Type Range Registers 258H 600 MTRRfix16K_80000 Memory Type Range Registers 259H 601 MTRRfix16K_A0000 Memory Type Range Registers 268H 616 MTRRfix4K_C0000 Memory Type Range Registers 269H 617 MTRRfix4K_C8000 Memory Type Range Registers 26AH 618 MTRRfix4K_D0000 Memory Type Range Registers 26BH 619 MTRRfix4K_D8000 Memory Type Range Registers 26CH 620 MTRRfix4K_E0000 Memory Type Range Registers 26DH 621 MTRRfix4K_E8000 Memory Type Range Registers 26EH 622 MTRRfix4K_F0000 Memory Type Range Registers 26FH 623 MTRRfix4K_F8000 Memory Type Range Registers 2FFH 767 MTRRdefType Memory Type Range Registers 2:0 Default memory type 10 Fixed MTRR enable 11 MTRR Enable 400H 1024 MC0_CTL Machine Check Error Reporting Register - controls signaling of #MC for errors produced by a particular hardware unit (or group of hardware units). 401H 1025 MC0_STATUS Machine Check Error Reporting Register - contains information related to a machine-check error if its VAL (valid) flag is set. Software is responsible for clearing IA32_MCi_STATUS MSRs by explicitly writing 0s to them; writing 1s to them causes a general-protection exception. 15:0 MC_STATUS_MCACOD 31:16 MC_STATUS_MSCOD 57 MC_STATUS_DAM 58 MC_STATUS_ADDRV 59 MC_STATUS_MISCV 60 MC_STATUS_EN. (Note: For MC0_STATUS only, this bit is hardcoded to 1.) 61 MC_STATUS_UC 62 MC_STATUS_O 63 MC_STATUS_V 402H 1026 MC0_ADDR 403H 1027 MC0_MISC 2-376 Vol. 4 Defined in MCA architecture but not implemented in the P6 family processors. MODEL-SPECIFIC REGISTERS (MSRS) Table 2-53. MSRs in the P6 Family Processors (Contd.) Register Address Register Name / Bit Fields Bit Description Hex Dec 404H 1028 MC1_CTL 405H 1029 MC1_STATUS 406H 1030 MC1_ADDR 407H 1031 MC1_MISC 408H 1032 MC2_CTL 409H 1033 MC2_STATUS 40AH 1034 MC2_ADDR 40BH 1035 MC2_MISC 40CH 1036 MC4_CTL 40DH 1037 MC4_STATUS Bit definitions same as MC0_STATUS, except bits 0, 4, 57, and 61 are hardcoded to 1. 40EH 1038 MC4_ADDR Defined in MCA architecture but not implemented in P6 Family processors. 40FH 1039 MC4_MISC Defined in MCA architecture but not implemented in the P6 family processors. 410H 1040 MC3_CTL 411H 1041 MC3_STATUS 412H 1042 MC3_ADDR 413H 1043 MC3_MISC Bit definitions same as MC0_STATUS. Defined in MCA architecture but not implemented in the P6 family processors. Bit definitions same as MC0_STATUS. Defined in MCA architecture but not implemented in the P6 family processors. Bit definitions same as MC0_STATUS. Defined in MCA architecture but not implemented in the P6 family processors. NOTES 1. Bit 0 of this register has been redefined several times, and is no longer used in P6 family processors. 2. The processor number feature may be disabled by setting bit 21 of the BBL_CR_CTL MSR (model-specific register address 119h) to “1”. Once set, bit 21 of the BBL_CR_CTL may not be cleared. This bit is write-once. The processor number feature will be disabled until the processor is reset. 3. The Pentium III processor will prevent FSB frequency overclocking with a new shutdown mechanism. If the FSB frequency selected is greater than the internal FSB frequency the processor will shutdown. If the FSB selected is less than the internal FSB frequency the BIOS may choose to use bit 11 to implement its own shutdown policy. 2.23 MSRS IN PENTIUM PROCESSORS The following MSRs are defined for the Pentium processors. The P5_MC_ADDR, P5_MC_TYPE, and TSC MSRs (named IA32_P5_MC_ADDR, IA32_P5_MC_TYPE, and IA32_TIME_STAMP_COUNTER in the Pentium 4 processor) are architectural; that is, code that accesses these registers will run on Pentium 4 and P6 family processors without generating exceptions (see Section 2.1, “Architectural MSRs”). The CESR, CTR0, and CTR1 MSRs are unique to Pentium processors; code that accesses these registers will generate exceptions on Pentium 4 and P6 family processors. Vol. 4 2-377 MODEL-SPECIFIC REGISTERS (MSRS) Table 2-54. MSRs in the Pentium Processor Register Address Hex Dec 0H 0 P5_MC_ADDR See Section 15.10.2, “Pentium Processor Machine-Check Exception Handling.” 1H 1 P5_MC_TYPE See Section 15.10.2, “Pentium Processor Machine-Check Exception Handling.” 10H 16 TSC See Section 17.17, “Time-Stamp Counter.” 11H 17 CESR See Section 18.6.9.1, “Control and Event Select Register (CESR).” 12H 18 CTR0 Section 18.6.9.3, “Events Counted.” 13H 19 CTR1 Section 18.6.9.3, “Events Counted.” 2.24 Register Name Bit Description MSR INDEX MSRs of recent processors are indexed here for convenience. IA32 MSRs are excluded from this index. MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_ALF_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_ALF_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_ANY_CORE_C0 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_ANY_GFXE_C0 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_B0_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 2-378 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_B0_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_MASK 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B0_PMON_MATCH 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_MASK 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_B1_PMON_MATCH 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_BBL_CR_CTL 06_09H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_BBL_CR_CTL3 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 Vol. 4 2-379 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_BPU_CCCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_CCCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_CCCR2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_CCCR3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_COUNTER0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_COUNTER1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_COUNTER2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_COUNTER3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BPU_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BR_DETECT_COUNTER_CONFIG_i 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-42 MSR_BR_DETECT_CTRL 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-42 MSR_BR_DETECT_STATUS 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-42 MSR_BSU_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_BSU_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_C0_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C0_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_BOX_OVF_CTRL 2-380 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C0_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C0_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C0_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 Vol. 4 2-381 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C0_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C0_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C1_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C1_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C1_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 2-382 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_C1_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C1_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C1_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C1_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C1_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C10_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C10_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C10_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C11_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C11_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C11_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C12_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C12_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 Vol. 4 2-383 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_C12_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C13_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C13_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C13_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C14_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C14_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C14_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_BOX_CTL 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_BOX_FILTER1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_BOX_STATUS 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_CTR0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_CTR1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_CTR2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_CTR3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_EVNTSEL0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_EVNTSEL1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_EVNTSEL2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C15_PMON_EVNTSEL3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_BOX_CTL 2-384 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_BOX_FILTER1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_BOX_STATUS 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_CTR0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_CTR3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_CTR2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_CTR3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_EVNTSEL0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_EVNTSEL1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_EVNTSEL2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C16_PMON_EVNTSEL3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_BOX_CTL 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_BOX_FILTER1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_BOX_STATUS 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_CTR0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_CTR1 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_CTR2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_CTR3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_EVNTSEL0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_EVNTSEL1 Vol. 4 2-385 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_EVNTSEL2 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C17_PMON_EVNTSEL3 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C2_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C2_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C2_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C2_PMON_EVNT_SEL0 2-386 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C2_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C2_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C3_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C3_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C3_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 Vol. 4 2-387 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C3_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C3_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C3_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C3_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C4_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C4_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_BOX_FILTER1 2-388 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C4_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C4_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C4_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 Vol. 4 2-389 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C4_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C4_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C5_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C5_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C5_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C5_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C5_PMON_EVNT_SEL0 2-390 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C5_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C5_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C6_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C6_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C6_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 Vol. 4 2-391 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C6_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C6_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C6_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C6_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C7_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_BOX_FILTER 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 MSR_C7_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_BOX_FILTER1 2-392 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C7_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_CTR1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_CTR4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C7_PMON_CTR5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C7_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 Vol. 4 2-393 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C7_PMON_EVNT_SEL4 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C7_PMON_EVNT_SEL5 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_C8_PMON_BOX_CTRL 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C8_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_BOX_OVF_CTRL 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C8_PMON_BOX_STATUS 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_CTR0 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_CTR1 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_CTR2 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_CTR3 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_CTR4 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C8_PMON_CTR5 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C8_PMON_EVNT_SEL0 2-394 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_EVNT_SEL1 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_EVNT_SEL2 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_EVNT_SEL3 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C8_PMON_EVNT_SEL4 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C8_PMON_EVNT_SEL5 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C9_PMON_BOX_CTRL 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_BOX_FILTER 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 MSR_C9_PMON_BOX_FILTER0 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_BOX_FILTER1 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_BOX_OVF_CTRL 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C9_PMON_BOX_STATUS 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_CTR0 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_CTR1 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 Vol. 4 2-395 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_CTR2 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_CTR3 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_CTR4 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C9_PMON_CTR5 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C9_PMON_EVNT_SEL0 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_EVNT_SEL1 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_EVNT_SEL2 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_EVNT_SEL3 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_C9_PMON_EVNT_SEL4 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_C9_PMON_EVNT_SEL5 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 MSR_CC6_DEMOTION_POLICY_CONFIG 06_37H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-9 MSR_CONFIG_TDP_CONTROL 06_3AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CONFIG_TDP_LEVEL1 06_3AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 2-396 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CONFIG_TDP_LEVEL2 06_3AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CONFIG_TDP_NOMINAL 06_3AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CORE_C1_RESIDENCY 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_66H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-42 MSR_CORE_C3_RESIDENCY 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH, 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_CORE_C6_RESIDENCY 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH, 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CORE_C7_RESIDENCY 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_CORE_GFXE_OVERLAP_C0 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_CORE_HDC_RESIDENCY 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_CORE_PERF_LIMIT_REASONS 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_CORE_THREAD_COUNT 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 MSR_CRU_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_CRU_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_CRU_ESCR2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_CRU_ESCR3 Vol. 4 2-397 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_CRU_ESCR4 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_CRU_ESCR5 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_DAC_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_DAC_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_DRAM_ENERGY_ STATUS 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH, 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_DRAM_PERF_STATUS 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH, 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_DRAM_POWER_INFO 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH, 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_DRAM_POWER_LIMIT 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH, 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_EBC_FREQUENCY_ID 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_EBC_HARD_POWERON 2-398 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_EBC_SOFT_POWERON 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_EBL_CR_POWERON 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_EFSB_DRDY0 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_EFSB_DRDY1 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_EMON_L3_CTR_CTL0 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL1 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL2 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL3 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL4 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL5 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL6 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_CTR_CTL7 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 0F_06H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-50 MSR_EMON_L3_GL_CTL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 MSR_ERROR_CONTROL 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 Vol. 4 2-399 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 MSR_FEATURE_CONFIG 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_25H, 06_2CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-18 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_2AH, 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_FIRM_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FIRM_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_CCCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_CCCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_CCCR2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_CCCR3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_COUNTER0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_COUNTER1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_COUNTER2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_COUNTER3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FLAME_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FSB_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FSB_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_FSB_FREQ 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_4CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-11 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 MSR_GQ_SNOOP_MESF 2-400 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_GRAPHICS_PERF_LIMIT_REASONS 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_IFSB_BUSQ0 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IFSB_BUSQ1 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IFSB_CNTR7 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IFSB_CTL6 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IFSB_SNPQ0 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IFSB_SNPQ1 0F_03H, 0F_04H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-49 MSR_IQ_CCCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_CCCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_CCCR2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_CCCR3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_CCCR4 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_CCCR5 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER2 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER3 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER4 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_COUNTER5 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IQ_ESCR1 Vol. 4 2-401 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IS_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IS_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_ITLB_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_ITLB_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IX_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_IX_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_0_FROM_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_0_TO_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_1_FROM_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 2-402 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_1_TO_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_10_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_10_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_11_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_11_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_12_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_12_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_13_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 Vol. 4 2-403 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_13_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_14_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_14_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_15_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_15_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_16_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_16_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_17_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_17_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_18_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 2-404 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_LASTBRANCH_18_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_19_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_19_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_2 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_2_FROM_IP 06_0FH, 06_17H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_2_TO_IP 06_0FH, 06_17H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_20_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_20_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_21_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_21_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_22_FROM_IP Vol. 4 2-405 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_22_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_23_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_23_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_24_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_24_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_25_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_25_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_26_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_26_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_27_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_27_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_28_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_28_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_29_FROM_IP 2-406 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_29_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_3 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_3_FROM_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_3_TO_IP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_30_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_30_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_31_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_31_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_LASTBRANCH_4 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_4_FROM_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 Vol. 4 2-407 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_4_TO_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_5 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_5_FROM_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_5_TO_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_6 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_6_FROM_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_6_TO_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 2-408 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_7 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_7_FROM_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_7_TO_IP 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_8_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_8_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_9_FROM_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_9_TO_IP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_LASTBRANCH_TOS Vol. 4 2-409 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LASTBRANCH_INFO_0 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_1 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_10 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_11 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_13 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_14 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_15 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_16 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_17 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_18 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 2-410 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_LBR_INFO_19 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_2 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_20 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_21 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_22 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_23 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_24 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_25 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_26 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_27 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_28 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_29 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_3 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_30 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 Vol. 4 2-411 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_LBR_INFO_31 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_4 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_5 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_6 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_7 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_8 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_INFO_9 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 MSR_LBR_SELECT 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_57H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_LER_FROM_LIP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_LER_TO_LIP 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 2-412 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_M0_PMON_ADDR_MASK 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_ADDR_MATCH 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_CTR5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_DSP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_EVNT_SEL5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 Vol. 4 2-413 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_M0_PMON_ISS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_MAP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_MM_CONFIG 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_MSC_THR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_PGT 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_PLD 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_TIMESTAMP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M0_PMON_ZDP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_ADDR_MASK 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_ADDR_MATCH 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_CTR5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_DSP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_EVNT_SEL0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 2-414 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_M1_PMON_EVNT_SEL1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_EVNT_SEL2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_EVNT_SEL3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_EVNT_SEL4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_EVNT_SEL5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_ISS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_MAP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_MM_CONFIG 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_MSC_THR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_PGT 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_PLD 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_TIMESTAMP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_M1_PMON_ZDP 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 IA32_MC0_MISC / MSR_MC0_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 MSR_MC0_RESIDENCY 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 IA32_MC1_MISC / MSR_MC1_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 IA32_MC10_ADDR / MSR_MC10_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC10_CTL / MSR_MC10_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 Vol. 4 2-415 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC10_MISC / MSR_MC10_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC10_STATUS / MSR_MC10_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC11_ADDR / MSR_MC11_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC11_CTL / MSR_MC11_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC11_MISC / MSR_MC11_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC11_STATUS / MSR_MC11_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 2-416 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location IA32_MC12_ADDR / MSR_MC12_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC12_CTL / MSR_MC12_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC12_MISC / MSR_MC12_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC12_STATUS / MSR_MC12_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC13_ADDR / MSR_MC13_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC13_CTL / MSR_MC13_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC13_MISC / MSR_MC13_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 Vol. 4 2-417 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location IA32_MC13_STATUS / MSR_MC13_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC14_ADDR / MSR_MC14_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC14_CTL / MSR_MC14_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC14_MISC / MSR_MC14_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC14_STATUS / MSR_MC14_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC15_ADDR / MSR_MC15_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC15_CTL / MSR_MC15_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 2-418 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location IA32_MC15_MISC / MSR_MC15_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC15_STATUS / MSR_MC15_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC16_ADDR / MSR_MC16_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC16_CTL / MSR_MC16_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC16_MISC / MSR_MC16_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC16_STATUS / MSR_MC16_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC17_ADDR / MSR_MC17_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 Vol. 4 2-419 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC17_CTL / MSR_MC17_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC17_MISC / MSR_MC17_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC17_STATUS / MSR_MC17_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC18_ADDR / MSR_MC18_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC18_CTL / MSR_MC18_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC18_MISC / MSR_MC18_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 2-420 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC18_STATUS / MSR_MC18_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC19_ADDR / MSR_MC19_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC19_CTL / MSR_MC19_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC19_MISC / MSR_MC19_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC19_STATUS / MSR_MC19_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC2_MISC / MSR_MC2_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 IA32_MC20_ADDR / MSR_MC20_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 Vol. 4 2-421 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC20_CTL / MSR_MC20_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC20_MISC / MSR_MC20_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC20_STATUS / MSR_MC20_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC21_ADDR / MSR_MC21_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC21_CTL / MSR_MC21_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC21_MISC / MSR_MC21_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC21_STATUS / MSR_MC21_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC22_ADDR / MSR_MC22_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC22_CTL / MSR_MC22_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC22_MISC / MSR_MC22_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 2-422 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location IA32_MC22_STATUS / MSR_MC22_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC23_ADDR / MSR_MC23_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC23_CTL / MSR_MC23_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC23_MISC / MSR_MC23_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC23_STATUS / MSR_MC23_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC24_ADDR / MSR_MC24_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC24_CTL / MSR_MC24_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC24_MISC / MSR_MC24_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC24_STATUS / MSR_MC24_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC25_ADDR / MSR_MC25_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC25_CTL / MSR_MC25_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC25_MISC / MSR_MC25_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC25_STATUS / MSR_MC25_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC26_ADDR / MSR_MC26_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC26_CTL / MSR_MC26_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC26_MISC / MSR_MC26_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC26_STATUS / MSR_MC26_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC27_ADDR / MSR_MC27_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC27_CTL / MSR_MC27_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC27_MISC / MSR_MC27_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC27_STATUS / MSR_MC27_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 Vol. 4 2-423 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location IA32_MC28_ADDR / MSR_MC28_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC28_CTL / MSR_MC28_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC28_MISC / MSR_MC28_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC28_STATUS / MSR_MC28_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 IA32_MC29_ADDR / MSR_MC29_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC29_CTL / MSR_MC29_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC29_MISC / MSR_MC29_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC29_STATUS / MSR_MC29_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC3_ADDR / MSR_MC3_ADDR 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 IA32_MC3_CTL / MSR_MC3_CTL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 IA32_MC3_MISC / MSR_MC3_MISC 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 IA32_MC3_STATUS / MSR_MC3_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 2-424 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 IA32_MC30_ADDR / MSR_MC30_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC30_CTL / MSR_MC30_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC30_MISC / MSR_MC30_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC30_STATUS / MSR_MC30_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC31_ADDR / MSR_MC31_ADDR 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC31_CTL / MSR_MC31_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC31_MISC / MSR_MC31_MISC 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC31_STATUS / MSR_MC31_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 IA32_MC4_ADDR / MSR_MC4_ADDR 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 IA32_MC4_CTL / MSR_MC4_CTL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 IA32_MC4_CTL2 / MSR_MC4_CTL2 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 IA32_MC4_STATUS / MSR_MC4_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 Vol. 4 2-425 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_MC5_ADDR / MSR_MC5_ADDR 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 IA32_MC5_CTL / MSR_MC5_CTL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 IA32_MC5_MISC / MSR_MC5_MISC 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 IA32_MC5_STATUS / MSR_MC5_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 IA32_MC6_ADDR / MSR_MC6_ADDR 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 2-426 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC6_CTL / MSR_MC6_CTL 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_MC6_DEMOTION_POLICY_CONFIG 06_37H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-9 IA32_MC6_MISC / MSR_MC6_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_MC6_RESIDENCY_COUNTER 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_37H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-9 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 IA32_MC6_STATUS / MSR_MC6_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC7_ADDR / MSR_MC7_ADDR 06_1AH, 06_1EH, 06_1FH, 06_2EH See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC7_CTL / MSR_MC7_CTL 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 Vol. 4 2-427 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC7_MISC / MSR_MC7_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC7_STATUS / MSR_MC7_STATUS 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC8_ADDR / MSR_MC8_ADDR 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC8_CTL / MSR_MC8_CTL 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC8_MISC / MSR_MC8_MISC 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC8_STATUS / MSR_MC8_STATUS 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 2-428 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC9_ADDR / MSR_MC9_ADDR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC9_CTL / MSR_MC9_CTL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC9_MISC / MSR_MC9_MISC 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 IA32_MC9_STATUS / MSR_MC9_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_MCG_MISC 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R10 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R11 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R12 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R13 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R14 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R15 Vol. 4 2-429 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R8 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_R9 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RAX 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RBP 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RBX 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RCX 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RDI 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RDX 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RESERVED1 - MSR_MCG_RESERVED5 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RFLAGS 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RIP 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RSI 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MCG_RSP 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MISC_FEATURE_CONTROL 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_MISC_PWR_MGMT 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_MOB_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MOB_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_CCCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_CCCR1 2-430 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_CCCR2 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_CCCR3 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_COUNTER0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_COUNTER1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_COUNTER2 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_COUNTER3 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MS_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_MTRRCAP 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_OFFCORE_RSP_0 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_OFFCORE_RSP_1 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-18 06_2FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PCIE_PLL_RATIO 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 MSR_PCU_PMON_BOX_CTL 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_BOX_FILTER 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_BOX_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_CTR0 Vol. 4 2-431 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_CTR1 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_CTR2 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_CTR3 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_EVNTSEL0 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_EVNTSEL1 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_EVNTSEL2 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PCU_PMON_EVNTSEL3 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PEBS_ENABLE 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-13 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-27 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_PEBS_FRONTEND 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PEBS_LD_LAT 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PEBS_MATRIX_VERT 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_PEBS_NUM_ALT 2-432 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 MSR_PERF_CAPABILITIES 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 MSR_PERF_GLOBAL_CTRL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 MSR_PERF_GLOBAL_OVF_CTRL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 MSR_PERF_GLOBAL_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 MSR_PERF_STATUS 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PKG_C10_RESIDENCY 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_45H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 and Table 2-31 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_PKG_C2_RESIDENCY 06_27H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-5 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_C3_RESIDENCY 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH, 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-42 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_C4_RESIDENCY 06_27H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-5 MSR_PKG_C6_RESIDENCY 06_27H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-5 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH, 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_C7_RESIDENCY 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH, 06_2FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 Vol. 4 2-433 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_C8_RESIDENCY 06_45H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-31 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_PKG_C9_RESIDENCY 06_45H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-31 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_PKG_CST_CONFIG_CONTROL 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_4CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-11 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_45H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-31 06_3F. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_3DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-35 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_ENERGY_STATUS 06_37H, 06_4AH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-8 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PKG_HDC_CONFIG 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PKG_HDC_DEEP_RESIDENCY 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PKG_HDC_SHALLOW_RESIDENCY 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PKG_PERF_STATUS 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH, 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_POWER_INFO 06_4DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-10 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 2-434 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKG_POWER_LIMIT 06_37H, 06_4AH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-8 06_4DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-10 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PKGC_IRTL1 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 MSR_PKGC_IRTL2 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 MSR_PKGC3_IRTL 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PKGC6_IRTL 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PKGC7_IRTL 06_2AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 MSR_PLATFORM_BRV 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_PLATFORM_ENERGY_COUNTER 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PLATFORM_ID 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-7 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 MSR_PLATFORM_INFO 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 and Table 2-30 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PLATFORM_POWER_LIMIT 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 Vol. 4 2-435 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_PMG_IO_CAPTURE_BASE 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_4CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-11 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PMH_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_PMH_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_PMON_GLOBAL_CONFIG 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PMON_GLOBAL_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_PMON_GLOBAL_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_POWER_CTL 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 MSR_PP0_ENERGY_STATUS 06_37H, 06_4AH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-8 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PP0_POLICY 06_2AH, 06_45H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 MSR_PP0_POWER_LIMIT 06_4CH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-11 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_PP1_ENERGY_STATUS 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_45H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_PP1_POLICY 06_2AH, 06_45H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 2-436 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_PP1_POWER_LIMIT 06_2AH, 06_45H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 06_3CH, 06_45H, 06_46H See Table 2-30 MSR_PPERF 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_PPIN 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 MSR_PPIN_CTL 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 MSR_PRMRR_PHYS_BASE 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_PRMRR_PHYS_MASK 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_PRMRR_VALID_CONFIG 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_RING_RATIO_LIMIT 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_R0_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR6 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_CTR7 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 Vol. 4 2-437 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_R0_PMON_EVNT_SEL0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL6 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_EVNT_SEL7 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P6 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_IPERF0_P7 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_QLX_P0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_QLX_P1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_QLX_P2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R0_PMON_QLX_P3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 2-438 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_R1_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR10 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR11 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR12 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR13 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR14 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR15 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR8 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_CTR9 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL10 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL11 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL12 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL13 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL14 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL15 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL8 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_EVNT_SEL9 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P10 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P11 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P12 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 Vol. 4 2-439 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_R1_PMON_IPERF1_P13 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P14 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P15 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P8 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_IPERF1_P9 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_QLX_P4 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_QLX_P5 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_QLX_P6 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_R1_PMON_QLX_P7 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_RAPL_POWER_UNIT 06_37H, 06_4AH, 06_5AH, 06_5DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-8 06_4DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-10 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_2AH, 06_2DH, 06_3AH, 06_3CH, 06_3EH, 06_3FH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_RAT_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_RAT_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_RING_PERF_LIMIT_REASONS 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_S0_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_BOX_FILTER 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S0_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S0_PMON_CTR0 2-440 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_CTR1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_CTR2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_CTR3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_EVNT_SEL0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_EVNT_SEL1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_EVNT_SEL2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_EVNT_SEL3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S0_PMON_MASK 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S0_PMON_MATCH 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S1_PMON_BOX_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_BOX_FILTER 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_BOX_OVF_CTRL 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S1_PMON_BOX_STATUS 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S1_PMON_CTR0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_CTR1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 Vol. 4 2-441 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_S1_PMON_CTR2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_CTR3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_EVNT_SEL0 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_EVNT_SEL1 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_EVNT_SEL2 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_EVNT_SEL3 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S1_PMON_MASK 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S1_PMON_MATCH 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_S2_PMON_BOX_CTL 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_BOX_FILTER 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_CTR0 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_CTR1 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_CTR2 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_CTR3 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_EVNTSEL0 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_EVNTSEL1 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_EVNTSEL2 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S2_PMON_EVNTSEL3 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 2-442 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_S3_PMON_BOX_CTL 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_BOX_FILTER 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_CTR0 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_CTR1 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_CTR2 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_CTR3 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_EVNTSEL0 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_EVNTSEL1 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_EVNTSEL2 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_S3_PMON_EVNTSEL3 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_SAAT_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_SAAT_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_SGXOWNEREPOCH0 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_SGXOWNEREPOCH1 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_SMI_COUNT 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_SMM_BLOCKED 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_SMM_DELAYED 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_SMM_FEATURE_CONTROL Vol. 4 2-443 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 MSR_SMM_MCA_CAP 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_SMRR_PHYSBASE 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 MSR_SMRR_PHYSMASK 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 MSR_SSU_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TBPU_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TBPU_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TC_ESCR0 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TC_ESCR1 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TC_PRECISE_EVENT 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_TEMPERATURE_TARGET 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_2AH, 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-20 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 06_56H, 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_THERM2_CTL 06_0FH, 06_17H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-3 06_1CH, 06_26H, 06_27H, 06_35H, 06_36H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-4 0FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 06_0EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 06_09H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-52 MSR_THREAD_ID_INFO 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 MSR_TRACE_HUB_STH_ACPIBAR_BASE 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_TURBO_ACTIVATION_RATIO 2-444 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_3AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-25 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-29 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_TURBO_GROUP_CORECNT 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 MSR_TURBO_POWER_CURRENT_LIMIT 06_1AH, 06_1EH, 06_1FH, 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 MSR_TURBO_RATIO_LIMIT 06_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_5CH, 06_7AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-6 06_4DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-10 06_5CH, 06_7AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-12 06_1AH, 06_1EH, 06_1FH, 06_2EH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-15 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-18 06_2FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-19 06_2AH, 06_45H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-21 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-23 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 and Table 2-27 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_3DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-35 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 06_55H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-45 06_57H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-46 MSR_TURBO_RATIO_LIMIT1 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-26 and Table 2-27 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 06_56H, 06_4FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-36 MSR_TURBO_RATIO_LIMIT2 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-32 MSR_TURBO_RATIO_LIMIT3 06_56H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-37 06_4FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-38 MSR_TURBO_RATIO_LIMIT_CORES 06_55H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-45 MSR_U_PMON_BOX_STATUS 06_3EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-28 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 Vol. 4 2-445 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_U_PMON_CTR 06_2EH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_U_PMON_CTR0 06_2DH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U_PMON_CTR1 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U_PMON_EVNT_SEL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_U_PMON_EVNTSEL0 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U_PMON_EVNTSEL1 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U_PMON_GLOBAL_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_U_PMON_GLOBAL_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_U_PMON_GLOBAL_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_U_PMON_UCLK_FIXED_CTL 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U_PMON_UCLK_FIXED_CTR 06_2DH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-24 06_3FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-33 MSR_U2L_ESCR0 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_U2L_ESCR1 0FH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-48 MSR_UNC_ARB_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_ARB_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_ARB_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 2-446 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_ARB_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_0_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_0_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_0_PERFCTR2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_0_PERFCTR3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_0_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_0_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_0_PERFEVTSEL2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_0_PERFEVTSEL3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_0_UNIT_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_1_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_1_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_1_PERFCTR2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 Vol. 4 2-447 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_UNC_CBO_1_PERFCTR3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_1_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_1_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_1_PERFEVTSEL2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_1_PERFEVTSEL3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_1_UNIT_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_2_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_2_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_2_PERFCTR2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_2_PERFCTR3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_2_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_2_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_2_PERFEVTSEL2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_2_PERFEVTSEL3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_2_UNIT_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 2-448 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_UNC_CBO_3_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_3_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_3_PERFCTR2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_3_PERFCTR3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_3_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_3_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_CBO_3_PERFEVTSEL2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_3_PERFEVTSEL3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_3_UNIT_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFCTR0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFCTR1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFCTR2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFCTR3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFEVTSEL0 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFEVTSEL1 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFEVTSEL2 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_4_PERFEVTSEL3 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 Vol. 4 2-449 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_UNC_CBO_4_UNIT_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 MSR_UNC_CBO_CONFIG 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_PERF_FIXED_CTR 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_PERF_FIXED_CTRL 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_PERF_GLOBAL_CTRL 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNC_PERF_GLOBAL_STATUS 06_2AH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-22 06_3CH, 06_45H, 06_46H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-30 06_4EH, 06_5EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-40 MSR_UNCORE_ADDR_OPCODE_MATCH 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_FIXED_CTR_CTRL 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_FIXED_CTR0 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERF_GLOBAL_CTRL 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERF_GLOBAL_OVF_CTRL 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERF_GLOBAL_STATUS 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL0 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL1 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL2 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL3 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 2-450 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location MSR_UNCORE_PERFEVTSEL4 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL5 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL6 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PERFEVTSEL7 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC0 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC1 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC2 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC3 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC4 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC5 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_UNCORE_PMC6 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PMC7 06_1AH, 06_1EH, 06_1FH, 06_25H, 06_2CH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-16 MSR_UNCORE_PRMRR_BASE 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_UNCORE_PRMRR_MASK 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MSR_UNCORE_PRMRR_PHYS_BASE 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_UNCORE_PRMRR_PHYS_MASK 06_8EH, 06_9EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-41 MSR_W_PMON_BOX_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_BOX_OVF_CTRL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_BOX_STATUS 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_CTR0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_CTR1 Vol. 4 2-451 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_CTR2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_CTR3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_EVNT_SEL0 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_EVNT_SEL1 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_EVNT_SEL2 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_EVNT_SEL3 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_FIXED_CTR 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_W_PMON_FIXED_CTR_CTL 06_2EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-17 MSR_WEIGHTED_CORE_C0 06_4EH, 06_5EH, 06_55H, 06_8EH, 06_9EH, 06_66H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-39 MTRRfix16K_80000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix16K_A0000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_C0000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_C8000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_D0000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_D8000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_E0000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_E8000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 2-452 Vol. 4 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_F0000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix4K_F8000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRfix64K_00000 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase0 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase1 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase2 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase3 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase4 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase5 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase6 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysBase7 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask0 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask1 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask2 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 Vol. 4 2-453 MODEL-SPECIFIC REGISTERS (MSRS) MSR Name and CPUID DisplayFamily_DisplayModel Location P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask3 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask4 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask5 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask6 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 MTRRphysMask7 06_0EH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-51 P6 Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . See Table 2-53 2-454 Vol. 4
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.6 Linearized : Yes Author : Intel Corporation Create Date : 2019:05:13 12:50:27Z Modify Date : 2019:05:16 10:11:41-07:00 XMP Toolkit : Adobe XMP Core 5.6-c016 91.163616, 2018/10/29-16:58:49 Producer : Acrobat Distiller 19.0 (Windows) Creator Tool : FrameMaker 10.0.2 Metadata Date : 2019:05:16 10:11:41-07:00 Format : application/pdf Title : Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes Creator : Intel Corporation Document ID : uuid:4e2a01ca-b09d-4caf-acd8-c1d265dd7760 Instance ID : uuid:ddb0aa42-8b52-40aa-b881-8bf39f49fb49 Page Mode : UseOutlines Page Count : 1601EXIF Metadata provided by EXIF.tools