ARM® Architecture Reference Manual
ARM®v7-A and ARM®v7-R edition
Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved.
ARM DDI 0406B
Release Information
The following changes have been made to this document.
Change History
Date           Issue  Confidentiality   Change
05 April 2007  A      Non-Confidential  New edition for ARMv7-A and ARMv7-R architecture profiles. Document number changed from ARM DDI 0100 to ARM DDI 0406 and contents restructured.
29 April 2008  B      Non-Confidential  Addition of the VFP Half-precision and Multiprocessing Extensions, and many clarifications and enhancements.
From ARMv7, the ARM® architecture defines different architectural profiles, and this edition of this manual describes
only the A and R profiles. For details of the documentation of the ARMv7-M profile, see Further reading.
Before ARMv7 there was only a single ARM Architecture Reference Manual, with document number DDI 0100. The first
issue of this was in February 1996, and the final issue, Issue I, was in July 2005. For more information see Further reading.
Proprietary Notice
Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM Limited in the EU and other
countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be
the trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted
or reproduced in any material form except with the prior written permission of the copyright holder.
The product described in this document is subject to continuous developments and improvements. All particulars of the
product and its use contained in this document are given by ARM in good faith. However, all warranties implied or
expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.
1. Subject to the provisions set out below, ARM hereby grants to you a perpetual, non-exclusive, nontransferable, royalty
free, worldwide licence to use this ARM Architecture Reference Manual for the purposes of developing; (i) software
applications or operating systems which are targeted to run on microprocessor cores distributed under licence from ARM;
(ii) tools which are designed to develop software programs which are targeted to run on microprocessor cores distributed
under licence from ARM; (iii) or having developed integrated circuits which incorporate a microprocessor core
manufactured under licence from ARM.
2. Except as expressly licensed in Clause 1 you acquire no right, title or interest in the ARM Architecture Reference
Manual, or any Intellectual Property therein. In no event shall the licences granted in Clause 1, be construed as granting
you expressly or by implication, estoppel or otherwise, licences to any ARM technology other than the ARM Architecture
Reference Manual. The licence grant in Clause 1 expressly excludes any rights for you to use or take into use any ARM
patents. No right is granted to you under the provisions of Clause 1 to; (i) use the ARM Architecture Reference Manual
for the purposes of developing or having developed microprocessor cores or models thereof which are compatible in
whole or part with either or both the instructions or programmers’ models described in this ARM Architecture Reference
Manual; or (ii) develop or have developed models of any microprocessor cores designed by or for ARM; or (iii) distribute
in whole or in part this ARM Architecture Reference Manual to third parties, other than to your subcontractors for the
purposes of having developed products in accordance with the licence grant in Clause 1 without the express written
permission of ARM; or (iv) translate or have translated this ARM Architecture Reference Manual into any other
languages.
3. THE ARM ARCHITECTURE REFERENCE MANUAL IS PROVIDED "AS IS" WITH NO WARRANTIES
EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF
SATISFACTORY QUALITY, NONINFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE.
4. No licence, express, implied or otherwise, is granted to LICENSEE, under the provisions of Clause 1, to use the ARM
tradename, in connection with the use of the ARM Architecture Reference Manual or any products based thereon.
Nothing in Clause 1 shall be construed as authority for you to make any representations on behalf of ARM in respect of
the ARM Architecture Reference Manual or any products based thereon.
Where the term ARM is used to refer to the company it means “ARM or any of its subsidiaries as appropriate”.
Note
The term ARM is also used to refer to versions of the ARM architecture, for example ARMv6 refers to version 6 of the
ARM architecture. The context makes it clear when the term is used in this way.
Copyright © 1996-1998, 2000, 2004-2008 ARM Limited
110 Fulbourn Road Cambridge, England CB1 9NJ
Restricted Rights Legend: Use, duplication or disclosure by the United States Government is subject to the restrictions
set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19.
This document is Non-Confidential. The right to use, copy and disclose this document is subject to the licence set out
above.
Contents

Preface
    About this manual
    Using this manual
    Conventions
    Further reading
    Feedback

Part A  Application Level Architecture

Chapter A1  Introduction to the ARM Architecture
    A1.1  About the ARM architecture
    A1.2  The ARM and Thumb instruction sets
    A1.3  Architecture versions, profiles, and variants
    A1.4  Architecture extensions
    A1.5  The ARM memory model
    A1.6  Debug

Chapter A2  Application Level Programmers’ Model
    A2.1  About the Application level programmers’ model
    A2.2  ARM core data types and arithmetic
    A2.3  ARM core registers
    A2.4  The Application Program Status Register (APSR)
    A2.5  Execution state registers
    A2.6  Advanced SIMD and VFP extensions
    A2.7  Floating-point data types and arithmetic
    A2.8  Polynomial arithmetic over {0,1}
    A2.9  Coprocessor support
    A2.10  Execution environment support
    A2.11  Exceptions, debug events and checks

Chapter A3  Application Level Memory Model
    A3.1  Address space
    A3.2  Alignment support
    A3.3  Endian support
    A3.4  Synchronization and semaphores
    A3.5  Memory types and attributes and the memory order model
    A3.6  Access rights
    A3.7  Virtual and physical addressing
    A3.8  Memory access order
    A3.9  Caches and memory hierarchy

Chapter A4  The Instruction Sets
    A4.1  About the instruction sets
    A4.2  Unified Assembler Language
    A4.3  Branch instructions
    A4.4  Data-processing instructions
    A4.5  Status register access instructions
    A4.6  Load/store instructions
    A4.7  Load/store multiple instructions
    A4.8  Miscellaneous instructions
    A4.9  Exception-generating and exception-handling instructions
    A4.10  Coprocessor instructions
    A4.11  Advanced SIMD and VFP load/store instructions
    A4.12  Advanced SIMD and VFP register transfer instructions
    A4.13  Advanced SIMD data-processing operations
    A4.14  VFP data-processing instructions

Chapter A5  ARM Instruction Set Encoding
    A5.1  ARM instruction set encoding
    A5.2  Data-processing and miscellaneous instructions
    A5.3  Load/store word and unsigned byte
    A5.4  Media instructions
    A5.5  Branch, branch with link, and block data transfer
    A5.6  Supervisor Call, and coprocessor instructions
    A5.7  Unconditional instructions

Chapter A6  Thumb Instruction Set Encoding
    A6.1  Thumb instruction set encoding
    A6.2  16-bit Thumb instruction encoding
    A6.3  32-bit Thumb instruction encoding

Chapter A7  Advanced SIMD and VFP Instruction Encoding
    A7.1  Overview
    A7.2  Advanced SIMD and VFP instruction syntax
    A7.3  Register encoding
    A7.4  Advanced SIMD data-processing instructions
    A7.5  VFP data-processing instructions
    A7.6  Extension register load/store instructions
    A7.7  Advanced SIMD element or structure load/store instructions
    A7.8  8, 16, and 32-bit transfer between ARM core and extension registers
    A7.9  64-bit transfers between ARM core and extension registers

Chapter A8  Instruction Details
    A8.1  Format of instruction descriptions
    A8.2  Standard assembler syntax fields
    A8.3  Conditional execution
    A8.4  Shifts applied to a register
    A8.5  Memory accesses
    A8.6  Alphabetical list of instructions

Chapter A9  ThumbEE
    A9.1  The ThumbEE instruction set
    A9.2  ThumbEE instruction set encoding
    A9.3  Additional instructions in Thumb and ThumbEE instruction sets
    A9.4  ThumbEE instructions with modified behavior
    A9.5  Additional ThumbEE instructions

Part B  System Level Architecture

Chapter B1  The System Level Programmers’ Model
    B1.1  About the system level programmers’ model
    B1.2  System level concepts and terminology
    B1.3  ARM processor modes and core registers
    B1.4  Instruction set states
    B1.5  The Security Extensions
    B1.6  Exceptions
    B1.7  Coprocessors and system control
    B1.8  Advanced SIMD and floating-point support
    B1.9  Execution environment support

Chapter B2  Common Memory System Architecture Features
    B2.1  About the memory system architecture
    B2.2  Caches
    B2.3  Implementation defined memory system features
    B2.4  Pseudocode details of general memory system operations

Chapter B3  Virtual Memory System Architecture (VMSA)
    B3.1  About the VMSA
    B3.2  Memory access sequence
    B3.3  Translation tables
    B3.4  Address mapping restrictions
    B3.5  Secure and Non-secure address spaces
    B3.6  Memory access control
    B3.7  Memory region attributes
    B3.8  VMSA memory aborts
    B3.9  Fault Status and Fault Address registers in a VMSA implementation
    B3.10  Translation Lookaside Buffers (TLBs)
    B3.11  Virtual Address to Physical Address translation operations
    B3.12  CP15 registers for a VMSA implementation
    B3.13  Pseudocode details of VMSA memory system operations

Chapter B4  Protected Memory System Architecture (PMSA)
    B4.1  About the PMSA
    B4.2  Memory access control
    B4.3  Memory region attributes
    B4.4  PMSA memory aborts
    B4.5  Fault Status and Fault Address registers in a PMSA implementation
    B4.6  CP15 registers for a PMSA implementation
    B4.7  Pseudocode details of PMSA memory system operations

Chapter B5  The CPUID Identification Scheme
    B5.1  Introduction to the CPUID scheme
    B5.2  The CPUID registers
    B5.3  Advanced SIMD and VFP feature identification registers

Chapter B6  System Instructions
    B6.1  Alphabetical list of instructions

Part C  Debug Architecture

Chapter C1  Introduction to the ARM Debug Architecture
    C1.1  Scope of part C of this manual
    C1.2  About the ARM Debug architecture
    C1.3  Security Extensions and debug
    C1.4  Register interfaces

Chapter C2  Invasive Debug Authentication
    C2.1  About invasive debug authentication

Chapter C3  Debug Events
    C3.1  About debug events
    C3.2  Software debug events
    C3.3  Halting debug events
    C3.4  Generation of debug events
    C3.5  Debug event prioritization

Chapter C4  Debug Exceptions
    C4.1  About debug exceptions
    C4.2  Effects of debug exceptions on CP15 registers and the DBGWFAR

Chapter C5  Debug State
    C5.1  About Debug state
    C5.2  Entering Debug state
    C5.3  Behavior of the PC and CPSR in Debug state
    C5.4  Executing instructions in Debug state
    C5.5  Privilege in Debug state
    C5.6  Behavior of non-invasive debug in Debug state
    C5.7  Exceptions in Debug state
    C5.8  Memory system behavior in Debug state
    C5.9  Leaving Debug state

Chapter C6  Debug Register Interfaces
    C6.1  About the debug register interfaces
    C6.2  Reset and power-down support
    C6.3  Debug register map
    C6.4  Synchronization of debug register updates
    C6.5  Access permissions
    C6.6  The CP14 debug register interfaces
    C6.7  The memory-mapped and recommended external debug interfaces

Chapter C7  Non-invasive Debug Authentication
    C7.1  About non-invasive debug authentication
    C7.2  v7 Debug non-invasive debug authentication
    C7.3  Effects of non-invasive debug authentication
    C7.4  ARMv6 non-invasive debug authentication

Chapter C8  Sample-based Profiling
    C8.1  Program Counter sampling

Chapter C9  Performance Monitors
    C9.1  About the performance monitors
    C9.2  Status in the ARM architecture
    C9.3  Accuracy of the performance monitors
    C9.4  Behavior on overflow
    C9.5  Interaction with Security Extensions
    C9.6  Interaction with trace
    C9.7  Interaction with power saving operations
    C9.8  CP15 c9 register map
    C9.9  Access permissions
    C9.10  Event numbers

Chapter C10  Debug Registers Reference
    C10.1  Accessing the debug registers
    C10.2  Debug identification registers
    C10.3  Control and status registers
    C10.4  Instruction and data transfer registers
    C10.5  Software debug event registers
    C10.6  OS Save and Restore registers, v7 Debug only
    C10.7  Memory system control registers
    C10.8  Management registers, ARMv7 only
    C10.9  Performance monitor registers

Appendix A  Recommended External Debug Interface
    A.1  System integration signals
    A.2  Recommended debug slave port

Appendix B  Common VFP Subarchitecture Specification
    B.1  Scope of this appendix
    B.2  Introduction to the Common VFP subarchitecture
    B.3  Exception processing
    B.4  Support code requirements
    B.5  Context switching
    B.6  Subarchitecture additions to the VFP system registers
    B.7  Version 1 of the Common VFP subarchitecture
    B.8  Version 2 of the Common VFP subarchitecture

Appendix C  Legacy Instruction Mnemonics
    C.1  Thumb instruction mnemonics
    C.2  Pre-UAL pseudo-instruction NOP

Appendix D  Deprecated and Obsolete Features
    D.1  Deprecated features
    D.2  Deprecated terminology
    D.3  Obsolete features
    D.4  Semaphore instructions
    D.5  Use of the SP as a general-purpose register
    D.6  Explicit use of the PC in ARM instructions
    D.7  Deprecated Thumb instructions

Appendix E  Fast Context Switch Extension (FCSE)
    E.1  About the FCSE
    E.2  Modified virtual addresses
    E.3  Debug and trace

Appendix F  VFP Vector Operation Support
    F.1  About VFP vector mode
    F.2  Vector length and stride control
    F.3  VFP register banks
    F.4  VFP instruction type selection

Appendix G  ARMv6 Differences
    G.1  Introduction to ARMv6
    G.2  Application level register support
    G.3  Application level memory support
    G.4  Instruction set support
    G.5  System level register support
    G.6  System level memory model
    G.7  System Control coprocessor (CP15) support

Appendix H  ARMv4 and ARMv5 Differences
    H.1  Introduction to ARMv4 and ARMv5
    H.2  Application level register support
    H.3  Application level memory support
    H.4  Instruction set support
    H.5  System level register support
    H.6  System level memory model
    H.7  System Control coprocessor (CP15) support

Appendix I  Pseudocode Definition
    I.1  Instruction encoding diagrams and pseudocode
    I.2  Limitations of pseudocode
    I.3  Data types
    I.4  Expressions
    I.5  Operators and built-in functions
    I.6  Statements and program structure
    I.7  Miscellaneous helper procedures and functions

Appendix J  Pseudocode Index
    J.1  Pseudocode operators and keywords
    J.2  Pseudocode functions and procedures

Appendix K  Register Index
    K.1  Register index

Glossary
Preface
This preface summarizes the contents of this manual and lists the conventions it uses. It contains the
following sections:
• About this manual
• Using this manual
• Conventions
• Further reading
• Feedback.
About this manual
This manual describes the ARM®v7 instruction set architecture, including its high code density Thumb®
instruction encoding and the following extensions to it:
• The System Control coprocessor, coprocessor 15 (CP15), used to control memory system
  components such as caches, write buffers, Memory Management Units, and Protection Units.
• The optional Advanced SIMD extension, which provides high-performance integer and
  single-precision floating-point vector operations.
• The optional VFP extension, which provides high-performance floating-point operations. It can
  optionally support double-precision operations.
• The Debug architecture, which provides software access to debug features in ARM processors.
Part A describes the application level view of the architecture, including the application level view of the
programmers’ model and the memory model. It also describes the precise effects of each instruction in User
mode (the normal operating mode), including any restrictions on its use. This information is of primary
importance to authors and users of compilers, assemblers, and other programs that generate ARM machine
code.
Part B describes the system level view of the architecture. It gives details of system registers that are not
accessible from User mode, and the system level view of the memory model. It also gives full details of the
effects of instructions in privileged modes (any mode other than User mode), where these are different from
their effects in User mode.
Part C describes the Debug architecture. This is an extension to the ARM architecture that provides
configuration, breakpoint and watchpoint support, and a Debug Communications Channel (DCC) to a debug
host.
Assembler syntax is given for the instructions described in this manual, permitting instructions to be
specified in textual form. However, this manual is not intended as tutorial material for ARM assembler
language, nor does it describe ARM assembler language at anything other than a very basic level. To make
effective use of ARM assembler language, consult the documentation supplied with the assembler being
used.
Using this manual
The information in this manual is organized into four parts, as described below.
Part A, Application Level Architecture
Part A describes the application level view of the architecture. It contains the following chapters:
Chapter A1
Gives a brief overview of the ARM architecture, and the ARM and Thumb instruction sets.
Chapter A2
Describes the application level view of the ARM programmers’ model, including the
application level view of the Advanced SIMD and VFP extensions. It describes the types of
value that ARM instructions operate on, the general-purpose registers that contain those
values, and the Application Program Status Register.
Chapter A3
Describes the application level view of the memory model, including the ARM memory
types and attributes, and memory access control.
Chapter A4
Describes the range of instructions available in the ARM, Thumb, Advanced SIMD, and
VFP instruction sets. It also contains some details of instruction operation, where these are
common to several instructions.
Chapter A5
Gives details of the encoding of the ARM instruction set.
Chapter A6
Gives details of the encoding of the Thumb instruction set.
Chapter A7
Gives details of the encoding of the Advanced SIMD and VFP instruction sets.
Chapter A8
Provides detailed reference information about every instruction available in the Thumb,
ARM, Advanced SIMD, and VFP instruction sets, with the exception of information only
relevant in privileged modes.
Chapter A9
Provides detailed reference information about the ThumbEE (Execution Environment)
variant of the Thumb instruction set.
Part B, System Level Architecture
Part B describes the system level view of the architecture. It contains the following chapters:
Chapter B1
Describes the system level view of the programmers’ model.
Chapter B2
Describes the system level view of the memory model features that are common to all
memory systems.
Chapter B3
Describes the system level view of the Virtual Memory System Architecture (VMSA) that
is part of all ARMv7-A implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a VMSA implementation.
Chapter B4
Describes the system level view of the Protected Memory System Architecture (PMSA) that
is part of all ARMv7-R implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a PMSA implementation.
Chapter B5
Describes the CPUID scheme.
Chapter B6
Provides detailed reference information about system instructions, and more information
about instructions where they behave differently in privileged modes.
Part C, Debug Architecture
Part C describes the Debug architecture. It contains the following chapters:
Chapter C1
Gives a brief introduction to the Debug architecture.
Chapter C2
Describes the authentication of invasive debug.
Chapter C3
Describes the debug events.
Chapter C4
Describes the debug exceptions.
Chapter C5
Describes Debug state.
Chapter C6
Describes the permitted debug register interfaces.
Chapter C7
Describes the authentication of non-invasive debug.
Chapter C8
Describes sample-based profiling.
Chapter C9
Describes the ARM performance monitors.
Chapter C10
Describes the debug registers.
Part D, Appendices
This manual contains the following appendices:
Appendix A
Describes the recommended external Debug interfaces.
Note
This description is not part of the ARM architecture specification. It is included here only
as supplementary information, for the convenience of developers and users who might
require this information.
Appendix B
The Common VFP subarchitecture specification.
Note
This specification is not part of the ARM architecture specification. This sub-architectural
information is included here only as supplementary information, for the convenience of
developers and users who might require this information.
Appendix C
Describes the legacy mnemonics.
Appendix D
Identifies the deprecated architectural features.
Appendix E
Describes the Fast Context Switch Extension (FCSE). From ARMv6, the use of this feature
is deprecated, and in ARMv7 the FCSE is optional.
Appendix F
Describes the VFP vector operations. Use of these operations is deprecated in ARMv7.
Appendix G
Describes the differences in the ARMv6 architecture.
Appendix H
Describes the differences in the ARMv4 and ARMv5 architectures.
Appendix I
The formal definition of the pseudocode.
Appendix J
Index to definitions of pseudocode operators, keywords, functions, and procedures.
Appendix K
Index to register descriptions in the manual.
Conventions
This manual employs typographic and other conventions intended to improve its ease of use.
General typographic conventions
typewriter
Is used for assembler syntax descriptions, pseudocode descriptions of instructions,
and source code examples. In the cases of assembler syntax descriptions and
pseudocode descriptions, see the additional conventions below.
The typewriter style is also used in the main text for instruction mnemonics and for
references to other items appearing in assembler syntax descriptions, pseudocode
descriptions of instructions and source code examples.
italic
Highlights important notes, introduces special terminology, and denotes internal
cross-references and citations.
bold
Is used for emphasis in descriptive lists and elsewhere, where appropriate.
SMALL CAPITALS
Are used for a few terms that have specific technical meanings. Their meanings can
be found in the Glossary.
Signals
In general this specification does not define processor signals, but it does include some signal examples and
recommendations. It uses the following signal conventions:
Signal level
The level of an asserted signal depends on whether the signal is active-HIGH or
active-LOW. Asserted means:
• HIGH for active-HIGH signals
• LOW for active-LOW signals.
Lower-case n
At the start or end of a signal name denotes an active-LOW signal.
Numbers
Numbers are normally written in decimal. Binary numbers are preceded by 0b, and hexadecimal numbers
by 0x and written in a typewriter font.
Bit values
Values of bits and bitfields are normally given in binary, in single quotes. The quotes are normally omitted
in encoding diagrams and tables.
Pseudocode descriptions
This manual uses a form of pseudocode to provide precise descriptions of the specified functionality. This
pseudocode is written in a typewriter font, and is described in Appendix I Pseudocode Definition.
Assembler syntax descriptions
This manual contains numerous syntax descriptions for assembler instructions and for components of
assembler instructions. These are shown in a typewriter font, and use the conventions described in
Assembler syntax on page A8-4.
Further reading
This section lists publications from both ARM and third parties that provide more information on the ARM
family of processors.
ARM periodically provides updates and corrections to its documentation. See http://www.arm.com for
current errata sheets and addenda, and the ARM Frequently Asked Questions.
ARM publications
• ARM Debug Interface v5 Architecture Specification (ARM IHI 0031)
• ARMv7-M Architecture Reference Manual (ARM DDI 0403)
• CoreSight Architecture Specification (ARM IHI 0029)
• ARM Architecture Reference Manual (ARM DDI 0100I)
  Note
  - Issue I of the ARM Architecture Reference Manual (DDI 0100I) was issued in July 2005 and
    describes the first version of the ARMv6 architecture, and all previous architecture versions.
  - Addison-Wesley Professional publish ARM Architecture Reference Manual, Second Edition
    (December 27, 2000). The contents of this are identical to Issue E of the ARM Architecture
    Reference Manual (DDI 0100E). It describes ARMv5TE and earlier versions of the ARM
    architecture, and is superseded by DDI 0100I.
• Embedded Trace Macrocell Architecture Specification (ARM IHI 0014)
• CoreSight Program Flow Trace Architecture Specification (ARM IHI 0035).
External publications
The following books are referred to in this manual, or provide more information:
• IEEE Std 1596.5-1993, IEEE Standard for Shared-Data Formats Optimized for Scalable Coherent
  Interface (SCI) Processors, ISBN 1-55937-354-7
• IEEE Std 1149.1-2001, IEEE Standard Test Access Port and Boundary Scan Architecture (JTAG)
• ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic
• JEP106, Standard Manufacturers Identification Code, JEDEC Solid State Technology Association
• The Java Virtual Machine Specification Second Edition, Tim Lindholm and Frank Yellin, published
  by Addison Wesley (ISBN: 0-201-43294-3)
• Memory Consistency Models for Shared Memory-Multiprocessors, Kourosh Gharachorloo, Stanford
  University Technical Report CSL-TR-95-685
Feedback
ARM welcomes feedback on its documentation.
Feedback on this manual
If you notice any errors or omissions in this manual, send e-mail to errata@arm.com giving:
• the document title
• the document number
• the page number(s) to which your comments apply
• a concise explanation of the problem.
General suggestions for additions and improvements are also welcome.
Part A
Application Level Architecture
Chapter A1
Introduction to the ARM Architecture
This chapter introduces the ARM architecture and contains the following sections:
• About the ARM architecture
• The ARM and Thumb instruction sets
• Architecture versions, profiles, and variants
• Architecture extensions
• The ARM memory model
• Debug.
A1.1
About the ARM architecture
The ARM architecture supports implementations across a wide range of performance points. It is
established as the dominant architecture in many market segments. The architectural simplicity of ARM
processors leads to very small implementations, and small implementations mean devices can have very low
power consumption. Implementation size, performance, and very low power consumption are key attributes
of the ARM architecture.
The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture, as it incorporates these
typical RISC architecture features:
• a large uniform register file
• a load/store architecture, where data-processing operations only operate on register contents, not
  directly on memory contents
• simple addressing modes, with all load/store addresses being determined from register contents and
  instruction fields only.
In addition, the ARM architecture provides:
• instructions that combine a shift with an arithmetic or logical operation
• auto-increment and auto-decrement addressing modes to optimize program loops
• Load and Store Multiple instructions to maximize data throughput
• conditional execution of almost all instructions to maximize execution throughput.
These enhancements to a basic RISC architecture enable ARM processors to achieve a good balance of high
performance, small code size, low power consumption, and small silicon area.
Except where the architecture specifies differently, the programmer-visible behavior of an implementation
must be the same as a simple sequential execution of the program. This programmer-visible behavior does
not include the execution time of the program.
A1.2
The ARM and Thumb instruction sets
The ARM instruction set is a set of 32-bit instructions providing comprehensive data-processing and control
functions.
The Thumb instruction set was developed as a 16-bit instruction set with a subset of the functionality of the
ARM instruction set. It provides significantly improved code density, at a cost of some reduction in
performance. A processor executing Thumb instructions can change to executing ARM instructions for
performance critical segments, in particular for handling interrupts.
In ARMv6T2, Thumb-2 technology is introduced. This technology makes it possible to extend the original
Thumb instruction set with many 32-bit instructions. The range of 32-bit Thumb instructions included in
ARMv6T2 permits Thumb code to achieve performance similar to ARM code, with code density better than
that of earlier Thumb code.
From ARMv6T2, the ARM and Thumb instruction sets provide almost identical functionality. For more
information, see Chapter A4 The Instruction Sets.
A1.3
Architecture versions, profiles, and variants
The ARM and Thumb instruction set architectures have evolved significantly since they were first
developed. They will continue to be developed in the future. Seven major versions of the instruction set have
been defined to date, denoted by the version numbers 1 to 7. Of these, the first three versions are now
obsolete.
ARMv7 provides three profiles:
ARMv7-A
Application profile, described in this manual. Implements a traditional ARM architecture
with multiple modes and supporting a Virtual Memory System Architecture (VMSA) based
on an MMU. Supports the ARM and Thumb instruction sets.
ARMv7-R
Real-time profile, described in this manual. Implements a traditional ARM architecture with
multiple modes and supporting a Protected Memory System Architecture (PMSA) based on
an MPU. Supports the ARM and Thumb instruction sets.
ARMv7-M
Microcontroller profile, described in the ARMv7-M Architecture Reference Manual.
Implements a programmers' model designed for fast interrupt processing, with hardware
stacking of registers and support for writing interrupt handlers in high-level languages.
Implements a variant of the ARMv7 PMSA and supports a variant of the Thumb instruction
set.
Versions can be qualified with variant letters to specify additional instructions and other functionality that
are included as an architecture extension. Extensions are typically included in the base architecture of the
next version number. Provision is also made to exclude variants by prefixing the variant letter with x.
Some extensions are described separately instead of using a variant letter. For details of these extensions see
Architecture extensions on page A1-6.
The valid variants of ARMv4, ARMv5, and ARMv6 are as follows:
ARMv4
The earliest architecture variant covered by this manual. It includes only the ARM
instruction set.
ARMv4T
Adds the Thumb instruction set.
ARMv5T
Improves interworking of ARM and Thumb instructions. Adds count leading zeros (CLZ)
and software breakpoint (BKPT) instructions.
ARMv5TE
Enhances arithmetic support for digital signal processing (DSP) algorithms. Adds preload
data (PLD), dual word load (LDRD), store (STRD), and 64-bit coprocessor register transfers
(MCRR, MRRC).
ARMv5TEJ
Adds the BXJ instruction and other support for the Jazelle® architecture extension.
ARMv6
Adds many new instructions to the ARM instruction set. Formalizes and revises the memory
model and the Debug architecture.
ARMv6K
Adds instructions to support multi-processing to the ARM instruction set, and some extra
memory model features.
ARMv6T2
Introduces Thumb-2 technology, giving a major development of the Thumb instruction set
to provide a similar level of functionality to the ARM instruction set.
Note
ARMv6KZ or ARMv6Z are sometimes used to describe the ARMv6K architecture with the optional
Security Extensions.
For detailed information about versions of the ARM architecture, see Appendix G ARMv6 Differences and
Appendix H ARMv4 and ARMv5 Differences.
The following architecture variants are now obsolete:
ARMv1, ARMv2, ARMv2a, ARMv3, ARMv3G, ARMv3M, ARMv4xM, ARMv4TxM, ARMv5,
ARMv5xM, ARMv5TxM, and ARMv5TExP.
Contact ARM if you require details of obsolete variants.
Instruction descriptions in this manual specify the architecture versions that support them.
A1.4
Architecture extensions
This manual describes the following extensions to the ARM and Thumb instruction set architectures:
ThumbEE
Is a variant of the Thumb instruction set that is designed as a target for dynamically
generated code. It is:
• a required extension to the ARMv7-A profile
• an optional extension to the ARMv7-R profile.
VFP
Is a floating-point coprocessor extension to the instruction set architectures. There
have been three main versions of VFP to date:
• VFPv1 is obsolete. Details are available on request from ARM.
• VFPv2 is an optional extension to:
  - the ARM instruction set in the ARMv5TE, ARMv5TEJ, ARMv6, and
    ARMv6K architectures
  - the ARM and Thumb instruction sets in the ARMv6T2 architecture.
• VFPv3 is an optional extension to the ARM, Thumb and ThumbEE
  instruction sets in the ARMv7-A and ARMv7-R profiles.
  VFPv3 can be implemented with either thirty-two or sixteen doubleword
  registers, as described in Advanced SIMD and VFP extension registers on
  page A2-21. Where necessary, the terms VFPv3-D32 and VFPv3-D16 are
  used to distinguish between these two implementation options. Where the
  term VFPv3 is used it covers both options.
  VFPv3 can be extended by the half-precision extensions that provide
  conversion functions in both directions between half-precision floating-point
  and single-precision floating-point.
Advanced SIMD
Is an instruction set extension that provides Single Instruction Multiple Data
(SIMD) functionality. It is an optional extension to the ARMv7-A and ARMv7-R
profiles. When VFPv3 and Advanced SIMD are both implemented, they use a
shared register bank and have some shared instructions.
Advanced SIMD can be extended by the half-precision extensions that provide
conversion functions in both directions between half-precision floating-point and
single-precision floating-point.
Security Extensions
Are a set of security features that facilitate the development of secure applications.
They are an optional extension to the ARMv6K architecture and the ARMv7-A
profile.
Jazelle
Is the Java bytecode execution extension that extended ARMv5TE to ARMv5TEJ.
From ARMv6 Jazelle is a required part of the architecture, but is still often
described as the Jazelle extension.
Multiprocessing Extensions
Are a set of features that enhance multiprocessing functionality. They are an
optional extension to the ARMv7-A and ARMv7-R profiles.
A1.5
The ARM memory model
The ARM architecture uses a single, flat address space of 2^32 8-bit bytes. The address space is also regarded
as 2^30 32-bit words or 2^31 16-bit halfwords.
The architecture provides facilities for:
• faulting unaligned memory accesses
• restricting access by applications to specified areas of memory
• translating virtual addresses provided by executing instructions into physical addresses
• altering the interpretation of word and halfword data between big-endian and little-endian
• optionally preventing out-of-order access to memory
• controlling caches
• synchronizing access to shared memory by multiple processors.
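As an illustration of the endianness facility listed above, the following Python sketch shows how the same
bytes at ascending addresses yield different word and halfword values under little-endian and big-endian
interpretation. The byte values and variable names are invented for this example. The sketch uses only the
standard struct module and does not model how an implementation selects its endianness.

```python
import struct

# Four bytes as they might appear at ascending memory addresses (illustrative values).
memory = bytes([0x11, 0x22, 0x33, 0x44])

# Interpret the same four bytes as one 32-bit word under each byte order.
(word_le,) = struct.unpack("<I", memory)      # little-endian
(word_be,) = struct.unpack(">I", memory)      # big-endian

# Interpret the first two bytes as one 16-bit halfword under each byte order.
(half_le,) = struct.unpack("<H", memory[:2])  # little-endian
(half_be,) = struct.unpack(">H", memory[:2])  # big-endian

print(hex(word_le), hex(word_be), hex(half_le), hex(half_be))
```

Byte accesses are unaffected by the choice, which is why the list item mentions only word and halfword data.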
For more information, see:
• Chapter A3 Application Level Memory Model
• Chapter B2 Common Memory System Architecture Features
• Chapter B3 Virtual Memory System Architecture (VMSA)
• Chapter B4 Protected Memory System Architecture (PMSA).
A1.6
Debug
ARMv7 processors implement two types of debug support:
Invasive debug
Debug permitting modification of the state of the processor. This is intended
primarily for run-control debugging.
Non-invasive debug
Debug permitting data and program flow observation, without modifying the state
of the processor or interrupting the flow of execution.
This provides for:
• instruction and data tracing
• program counter sampling
• performance monitors.
For more information, see Chapter C1 Introduction to the ARM Debug Architecture.
Chapter A2
Application Level Programmers’ Model
This chapter gives an application level view of the ARM programmers’ model. It contains the following
sections:
• About the Application level programmers’ model
• ARM core data types and arithmetic
• ARM core registers
• The Application Program Status Register (APSR)
• Execution state registers
• Advanced SIMD and VFP extensions
• Floating-point data types and arithmetic
• Polynomial arithmetic over {0,1}
• Coprocessor support
• Execution environment support
• Exceptions, debug events and checks.
A2.1
About the Application level programmers’ model
This chapter contains the programmers’ model information required for application development.
The information in this chapter is distinct from the system information required to service and support
application execution under an operating system. However, some knowledge of that system information is
needed to put the Application level programmers' model into context.
System level support requires access to all features and facilities of the architecture, a mode of operation
referred to as privileged operation. System code determines whether an application runs in a privileged or
unprivileged manner. When an operating system supports both privileged and unprivileged operation, an
application usually runs unprivileged. This:
• permits the operating system to allocate system resources to it in a unique or shared manner
• provides a degree of protection from other processes and tasks, and so helps protect the operating
  system from malfunctioning applications.
This chapter indicates where some system level understanding is helpful, and where appropriate it:
• gives an overview of the system level information
• gives references to the system level descriptions in Chapter B1 The System Level Programmers’
  Model and elsewhere.
The Security Extensions extend the architecture to provide hardware security features that support the
development of secure applications. For more information, see The Security Extensions on page B1-25.
A2.2
ARM core data types and arithmetic
All ARMv7-A and ARMv7-R processors support the following data types in memory:
Byte          8 bits
Halfword      16 bits
Word          32 bits
Doubleword    64 bits.
Processor registers are 32 bits in size. The instruction set contains instructions supporting the following data
types held in registers:
• 32-bit pointers
• unsigned or signed 32-bit integers
• unsigned 16-bit or 8-bit integers, held in zero-extended form
• signed 16-bit or 8-bit integers, held in sign-extended form
• two 16-bit integers packed into a register
• four 8-bit integers packed into a register
• unsigned or signed 64-bit integers held in two registers.
Load and store operations can transfer bytes, halfwords, or words to and from memory. Loads of bytes or
halfwords zero-extend or sign-extend the data as it is loaded, as specified in the appropriate load instruction.
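The effect of zero-extension and sign-extension on a loaded byte or halfword can be sketched in Python as
follows. The function names zero_extend and sign_extend, and the default 32-bit register width, are
assumptions made for this illustration. They are not the pseudocode functions defined by this manual.

```python
def zero_extend(value, from_bits, to_bits=32):
    """Return the low from_bits of value, with all higher bits clear."""
    assert 0 < from_bits <= to_bits
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits=32):
    """Return a to_bits pattern whose upper bits copy bit from_bits-1 of value."""
    assert 0 < from_bits <= to_bits
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):           # sign bit of the narrow value is set
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper                           # fill the upper bits with ones
    return value


# A byte 0x80 loaded with zero-extension and with sign-extension.
print(hex(zero_extend(0x80, 8)), hex(sign_extend(0x80, 8)))
```

The two results differ only when the most significant bit of the loaded byte or halfword is 1.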
The instruction sets include load and store operations that transfer two or more words to and from memory.
You can load and store doublewords using these instructions. The exclusive doubleword load/store
instructions LDREXD and STREXD specify single-copy atomic doubleword accesses to memory.
When any of the data types is described as unsigned, the N-bit data value represents a non-negative integer
in the range 0 to 2^N - 1, using normal binary format.
When any of these types is described as signed, the N-bit data value represents an integer in the range
-2^(N-1) to +2^(N-1) - 1, using two's complement format.
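The two interpretations of an N-bit value can be sketched in Python as follows. The helper names uint and
sint are chosen for this illustration only. They mirror the intent of the UInt() and SInt() pseudocode
functions but are not their formal definitions.

```python
def uint(bits, n):
    """Unsigned value of an n-bit pattern: an integer in the range 0 to 2^n - 1."""
    return bits & ((1 << n) - 1)


def sint(bits, n):
    """Two's complement value of an n-bit pattern: -2^(n-1) to 2^(n-1) - 1."""
    value = bits & ((1 << n) - 1)
    # If the top bit is set, the pattern represents value - 2^n.
    return value - (1 << n) if value >> (n - 1) else value


# The same 8-bit pattern read as unsigned and as signed.
print(uint(0xFF, 8), sint(0xFF, 8))
```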
The instructions that operate on packed halfwords or bytes include some multiply instructions that use just
one of two halfwords, and Single Instruction Multiple Data (SIMD) instructions that operate on all of the
halfwords or bytes in parallel.
Direct instruction support for 64-bit integers is limited, and most 64-bit operations require sequences of two
or more instructions to synthesize them.
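As an example of such a sequence, a 64-bit addition can be synthesized from two 32-bit additions, where the
second addition consumes the carry produced by the first. The Python sketch below models this with each
64-bit value held as a (low word, high word) pair. The function name add64 and the pair representation are
assumptions made for illustration and do not correspond to any specific instruction encoding.

```python
MASK32 = 0xFFFFFFFF


def add64(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as 32-bit (low, high) halves; returns (low, high)."""
    lo_sum = a_lo + b_lo
    result_lo = lo_sum & MASK32
    carry = lo_sum >> 32                        # 1 if the low-word addition carried out
    result_hi = (a_hi + b_hi + carry) & MASK32  # high-word addition consumes the carry
    return result_lo, result_hi


# 0x00000000_FFFFFFFF + 1 carries into the high word.
print([hex(w) for w in add64(0xFFFFFFFF, 0x00000000, 0x00000001, 0x00000000)])
```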
A2.2.1
Integer arithmetic
The instruction set provides a wide variety of operations on the values in registers, including bitwise logical
operations, shifts, additions, subtractions, multiplications, and many others. These operations are defined
using the pseudocode described in Appendix I Pseudocode Definition, usually in one of three ways:
• By direct use of the pseudocode operators and built-in functions defined in Operators and built-in
  functions on page AppxI-11.
• By use of pseudocode helper functions defined in the main text. These can be located using the table
  in Appendix J Pseudocode Index.
• By a sequence of the form:
  1. Use of the SInt(), UInt(), and Int() built-in functions defined in Converting bitstrings to
     integers on page AppxI-14 to convert the bitstring contents of the instruction operands to the
     unbounded integers that they represent as two's complement or unsigned integers.
  2. Use of mathematical operators, built-in functions and helper functions on those unbounded
     integers to calculate other such integers.
  3. Use of either the bitstring extraction operator defined in Bitstring extraction on page AppxI-12
     or of the saturation helper functions described in Pseudocode details of saturation on
     page A2-9 to convert an unbounded integer result into a bitstring result that can be written to
     a register.
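The three-step sequence above can be sketched in Python, whose integers are unbounded. The sketch performs
a signed addition twice: once with bitstring extraction in the final step, so the result wraps, and once with
saturation in the final step. All function names here are hypothetical and chosen for illustration. They are
not the helper functions defined by this manual.

```python
def sint(bits, n):
    """Step 1: convert an n-bit pattern to the two's complement integer it represents."""
    value = bits & ((1 << n) - 1)
    return value - (1 << n) if value >> (n - 1) else value


def add_wrapping(x, y, n=32):
    total = sint(x, n) + sint(y, n)              # step 2: unbounded integer arithmetic
    return total & ((1 << n) - 1)                # step 3: keep bits <n-1:0>


def add_saturating(x, y, n=32):
    total = sint(x, n) + sint(y, n)              # step 2: unbounded integer arithmetic
    highest = (1 << (n - 1)) - 1
    lowest = -(1 << (n - 1))
    clamped = max(lowest, min(highest, total))   # step 3: saturate to the signed range
    return clamped & ((1 << n) - 1)


print(hex(add_wrapping(0x7FFFFFFF, 1)), hex(add_saturating(0x7FFFFFFF, 1)))
```

The two functions differ only in the final conversion step. The first two steps are identical.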
Shift and rotate operations
The following types of shift and rotate operations are used in instructions:
Logical Shift Left
(LSL) moves each bit of a bitstring left by a specified number of bits. Zeros are shifted in at
the right end of the bitstring. Bits that are shifted off the left end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Logical Shift Right
(LSR) moves each bit of a bitstring right by a specified number of bits. Zeros are shifted in
at the left end of the bitstring. Bits that are shifted off the right end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Arithmetic Shift Right
(ASR) moves each bit of a bitstring right by a specified number of bits. Copies of the leftmost
bit are shifted in at the left end of the bitstring. Bits that are shifted off the right end of the
bitstring are discarded, except that the last such bit can be produced as a carry output.
Rotate Right
(ROR) moves each bit of a bitstring right by a specified number of bits. Each bit that is shifted
off the right end of the bitstring is re-introduced at the left end. The last bit shifted off the
right end of the bitstring can be produced as a carry output.
Rotate Right with Extend
(RRX) moves each bit of a bitstring right by one bit. The carry input is shifted in at the left
end of the bitstring. The bit shifted off the right end of the bitstring can be produced as a
carry output.
Pseudocode details of shift and rotate operations
These shift and rotate operations are supported in pseudocode by the following functions:
// LSL_C()
// =======
(bits(N), bit) LSL_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = x : Zeros(shift);
    result = extended_x<N-1:0>;
    carry_out = extended_x<N>;
    return (result, carry_out);

// LSL()
// =====
bits(N) LSL(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSL_C(x, shift);
    return result;

// LSR_C()
// =======
(bits(N), bit) LSR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = ZeroExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);

// LSR()
// =====
bits(N) LSR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSR_C(x, shift);
    return result;

// ASR_C()
// =======
(bits(N), bit) ASR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = SignExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);

// ASR()
// =====
bits(N) ASR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = ASR_C(x, shift);
    return result;

// ROR_C()
// =======
(bits(N), bit) ROR_C(bits(N) x, integer shift)
    assert shift != 0;
    m = shift MOD N;
    result = LSR(x,m) OR LSL(x,N-m);
    carry_out = result<N-1>;
    return (result, carry_out);
// ROR()
// =====
bits(N) ROR(bits(N) x, integer shift)
    if shift == 0 then
        result = x;
    else
        (result, -) = ROR_C(x, shift);
    return result;

// RRX_C()
// =======
(bits(N), bit) RRX_C(bits(N) x, bit carry_in)
    result = carry_in : x<N-1:1>;
    carry_out = x<0>;
    return (result, carry_out);

// RRX()
// =====
bits(N) RRX(bits(N) x, bit carry_in)
    (result, -) = RRX_C(x, carry_in);
    return result;
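For readers who want to experiment with these operations, the carry-producing functions can be modelled in
Python as shown below. This is an illustrative sketch, not a normative definition. The lower-case function
names, the explicit width parameter n, and the use of Python integers in place of bitstrings are all
assumptions of the sketch, and x is assumed to be already in the range 0 to 2^n - 1.

```python
def _mask(n):
    return (1 << n) - 1


def lsl_c(x, n, shift):
    """Logical shift left; returns (result, carry_out)."""
    assert shift > 0
    extended = x << shift                        # models x : Zeros(shift)
    return extended & _mask(n), (extended >> n) & 1


def lsr_c(x, n, shift):
    """Logical shift right; the carry is the last bit shifted off the right end."""
    assert shift > 0
    return (x >> shift) & _mask(n), (x >> (shift - 1)) & 1


def asr_c(x, n, shift):
    """Arithmetic shift right; copies of the leftmost bit are shifted in."""
    assert shift > 0
    signed = x - (1 << n) if x >> (n - 1) else x  # two's complement value of x
    return (signed >> shift) & _mask(n), (signed >> (shift - 1)) & 1


def ror_c(x, n, shift):
    """Rotate right; the carry is the top bit of the rotated result."""
    assert shift != 0
    m = shift % n
    result = ((x >> m) | (x << (n - m))) & _mask(n)
    return result, result >> (n - 1)


def rrx_c(x, n, carry_in):
    """Rotate right by one bit through the carry."""
    return (carry_in << (n - 1)) | (x >> 1), x & 1


print(lsl_c(0x80000001, 32, 1), asr_c(0x80000000, 32, 4))
```

Python's right shift on a negative integer is arithmetic, which is what allows asr_c to rely on the >>
operator after converting x to its signed value.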
Pseudocode details of addition and subtraction
In pseudocode, addition and subtraction can be performed on any combination of unbounded integers and
bitstrings, provided that if they are performed on two bitstrings, the bitstrings must be identical in length.
The result is another unbounded integer if both operands are unbounded integers, and a bitstring of the same
length as the bitstring operand(s) otherwise. For the precise definition of these operations, see Addition and
subtraction on page AppxI-15.
The main addition and subtraction instructions can produce status information about both unsigned carry
and signed overflow conditions. This status information can be used to synthesize multi-word additions and
subtractions. In pseudocode the AddWithCarry() function provides an addition with a carry input and carry
and overflow outputs:
// AddWithCarry()
// ==============
(bits(N), bit, bit) AddWithCarry(bits(N) x, bits(N) y, bit carry_in)
    unsigned_sum = UInt(x) + UInt(y) + UInt(carry_in);
    signed_sum   = SInt(x) + SInt(y) + UInt(carry_in);
    result       = unsigned_sum<N-1:0>; // == signed_sum<N-1:0>
    carry_out    = if UInt(result) == unsigned_sum then '0' else '1';
    overflow     = if SInt(result) == signed_sum then '0' else '1';
    return (result, carry_out, overflow);
An important property of the AddWithCarry() function is that if:
(result, carry_out, overflow) = AddWithCarry(x, NOT(y), carry_in)
then:
• if carry_in == '1', then result == x-y with:
  - overflow == '1' if signed overflow occurred during the subtraction
  - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x >= y
• if carry_in == '0', then result == x-y-1 with:
  - overflow == '1' if signed overflow occurred during the subtraction
  - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x > y.
Together, these mean that the carry_in and carry_out bits in AddWithCarry() calls can act as NOT borrow
flags for subtractions as well as carry flags for additions.
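The NOT borrow behavior can be checked with a small Python model of the function. The names add_with_carry
and subtract, the explicit width parameter n, and the use of the integers 0 and 1 for single bits are
assumptions of this sketch rather than part of the pseudocode definition.

```python
def add_with_carry(x, y, carry_in, n=32):
    """Model of an n-bit addition with carry input; returns (result, carry_out, overflow)."""
    mask = (1 << n) - 1

    def sint(value):
        # Two's complement value of an n-bit pattern.
        return value - (1 << n) if value >> (n - 1) else value

    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & mask
    carry_out = 0 if result == unsigned_sum else 1
    overflow = 0 if sint(result) == signed_sum else 1
    return result, carry_out, overflow


def subtract(x, y, n=32):
    """x - y computed as x + NOT(y) + 1, as in the property described above."""
    mask = (1 << n) - 1
    return add_with_carry(x, ~y & mask, 1, n)


print(subtract(5, 3), subtract(3, 5))
```

In this model, subtract(5, 3) produces a carry_out of 1 because no borrow occurs, and subtract(3, 5)
produces a carry_out of 0 because a borrow occurs.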
Pseudocode details of saturation
Some instructions perform saturating arithmetic, that is, if the result of the arithmetic overflows the
destination signed or unsigned N-bit integer range, the result produced is the largest or smallest value in that
range, rather than wrapping around modulo 2^N. This is supported in pseudocode by the SignedSatQ() and
UnsignedSatQ() functions when a boolean result is wanted saying whether saturation occurred, and by the
SignedSat() and UnsignedSat() functions when only the saturated result is wanted:
// SignedSatQ()
// ============
(bits(N), boolean) SignedSatQ(integer i, integer N)
    if i > 2^(N-1) - 1 then
        result = 2^(N-1) - 1; saturated = TRUE;
    elsif i < -(2^(N-1)) then
        result = -(2^(N-1)); saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);

// UnsignedSatQ()
// ==============
(bits(N), boolean) UnsignedSatQ(integer i, integer N)
    if i > 2^N - 1 then
        result = 2^N - 1; saturated = TRUE;
    elsif i < 0 then
        result = 0; saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);

// SignedSat()
// ===========
bits(N) SignedSat(integer i, integer N)
    (result, -) = SignedSatQ(i, N);
    return result;

// UnsignedSat()
// =============
bits(N) UnsignedSat(integer i, integer N)
    (result, -) = UnsignedSatQ(i, N);
    return result;
SatQ(i, N, unsigned) returns either UnsignedSatQ(i,N) or SignedSatQ(i, N) depending on the value of its
third argument, and Sat(i, N, unsigned) returns either UnsignedSat(i, N) or SignedSat(i, N) depending on
the value of its third argument:
// SatQ()
// ======
(bits(N), boolean) SatQ(integer i, integer N, boolean unsigned)
    (result, sat) = if unsigned then UnsignedSatQ(i, N) else SignedSatQ(i, N);
    return (result, sat);

// Sat()
// =====
bits(N) Sat(integer i, integer N, boolean unsigned)
    result = if unsigned then UnsignedSat(i, N) else SignedSat(i, N);
    return result;
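As an informal illustration, the saturation functions can be modelled in Python as shown in the following sketch. It is not part of the architecture pseudocode. The names signed_sat_q and unsigned_sat_q are hypothetical, and the sketch assumes that the bits(N) result is represented as a non-negative integer holding the N-bit pattern.

```python
def signed_sat_q(i: int, n: int):
    """Saturate i to the signed N-bit range; returns (N-bit pattern, saturated)."""
    hi = (1 << (n - 1)) - 1
    lo = -(1 << (n - 1))
    mask = (1 << n) - 1
    if i > hi:
        return hi & mask, True
    if i < lo:
        return lo & mask, True  # two's complement pattern of the minimum value
    return i & mask, False


def unsigned_sat_q(i: int, n: int):
    """Saturate i to the unsigned N-bit range; returns (N-bit pattern, saturated)."""
    hi = (1 << n) - 1
    if i > hi:
        return hi, True
    if i < 0:
        return 0, True
    return i, False
```

For N == 8, a value of 200 saturates to 0x7F in the signed case, and a value of 300 saturates to 0xFF in the unsigned case.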
A2.3 ARM core registers
In the application level view, an ARM processor has:
• thirteen general-purpose 32-bit registers, R0 to R12
• three 32-bit registers, R13 to R15, that sometimes or always have a special use.
Registers R13 to R15 are usually referred to by names that indicate their special uses:
SP, the Stack Pointer
Register R13 is used as a pointer to the active stack.
In Thumb code, most instructions cannot access SP. The only instructions that can access
SP are those designed to use SP as a stack pointer.
The use of SP for any purpose other than as a stack pointer is deprecated.
Note
Using SP for any purpose other than as a stack pointer is likely to break the requirements of
operating systems, debuggers, and other software systems, causing them to malfunction.
LR, the Link Register
Register R14 is used to store the return address from a subroutine. At other times, LR can
be used for other purposes.
When a BL or BLX instruction performs a subroutine call, LR is set to the subroutine return
address. To perform a subroutine return, copy LR back to the program counter. This is
typically done in one of two ways, after entering the subroutine with a BL or BLX instruction:
• Return with a BX LR instruction.
• On subroutine entry, store LR to the stack with an instruction of the form:
  PUSH {<registers>,LR}
  and use a matching instruction to return:
  POP {<registers>,PC}
ThumbEE checks and handler calls use LR in a similar way. For details see Chapter A9
ThumbEE.
PC, the Program Counter
Register R15 is the program counter:
• When executing an ARM instruction, PC reads as the address of the current
  instruction plus 8.
• When executing a Thumb instruction, PC reads as the address of the current
  instruction plus 4.
• Writing an address to PC causes a branch to that address.
In Thumb code, most instructions cannot access PC.
See ARM core registers on page B1-9 for the system level view of SP, LR, and PC.
Note
The names SP, LR and PC are preferred to R13, R14 and R15. However, sometimes it is simpler to use the
R13-R15 names when referring to a group of registers. For example, it is simpler to refer to Registers R8 to
R15, rather than to Registers R8 to R12, the SP, LR and PC. However these two descriptions of the group of
registers have exactly the same meaning.
A2.3.1 Pseudocode details of operations on ARM core registers
In pseudocode, the R[] function is used to:
• Read or write R0-R12, SP, and LR, using n == 0-12, 13, and 14 respectively.
• Read the PC, using n == 15.
This function has prototypes:
bits(32) R[integer n]
    assert n >= 0 && n <= 15;

R[integer n] = bits(32) value
    assert n >= 0 && n <= 14;
The full operation of this function is explained in Pseudocode details of ARM core register operations on
page B1-12.
Descriptions of ARM store instructions that store the PC value use the PCStoreValue() pseudocode function
to specify the PC value stored by the instruction:
// PCStoreValue()
// ==============
bits(32) PCStoreValue()
    // This function returns the PC value. On architecture versions before ARMv7, it
    // is permitted to instead return PC+4, provided it does so consistently. It is
    // used only to describe ARM instructions, so it returns the address of the current
    // instruction plus 8 (normally) or 12 (when the alternative is permitted).
    return PC;
Writing an address to the PC causes either a simple branch to that address or an interworking branch that
also selects the instruction set to execute after the branch. A simple branch is performed by the
BranchWritePC() function:
// BranchWritePC()
// ===============
BranchWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ARM then
        if ArchVersion() < 6 && address<1:0> != '00' then UNPREDICTABLE;
        BranchTo(address<31:2>:'00');
    else
        BranchTo(address<31:1>:'0');
An interworking branch is performed by the BXWritePC() function:
// BXWritePC()
// ===========
BXWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ThumbEE then
        if address<0> == '1' then
            BranchTo(address<31:1>:'0'); // Remaining in ThumbEE state
        else
            UNPREDICTABLE;
    else
        if address<0> == '1' then
            SelectInstrSet(InstrSet_Thumb);
            BranchTo(address<31:1>:'0');
        elsif address<1> == '0' then
            SelectInstrSet(InstrSet_ARM);
            BranchTo(address);
        else // address<1:0> == '10'
            UNPREDICTABLE;
The LoadWritePC() and ALUWritePC() functions are used for two cases where the behavior was systematically
modified between architecture versions:
// LoadWritePC()
// =============
LoadWritePC(bits(32) address)
    if ArchVersion() >= 5 then
        BXWritePC(address);
    else
        BranchWritePC(address);

// ALUWritePC()
// ============
ALUWritePC(bits(32) address)
    if ArchVersion() >= 7 && CurrentInstrSet() == InstrSet_ARM then
        BXWritePC(address);
    else
        BranchWritePC(address);
Note
The behavior of the PC writes performed by the ALUWritePC() function is different in Debug state, where
there are more UNPREDICTABLE cases. The pseudocode in this section only handles the non-debug cases. For
more information, see Data-processing instructions with the PC as the target in Debug state on page C5-12.
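The interworking decision made by BXWritePC() can be illustrated with the following Python sketch, which is not part of the architecture pseudocode. The function name bx_write_pc is hypothetical, instruction set states are represented as strings, and UNPREDICTABLE behavior is modelled, as an assumption of this sketch only, by raising ValueError.

```python
def bx_write_pc(address: int, current: str):
    """Return (next_instruction_set, branch_target) for an interworking branch."""
    if current == "ThumbEE":
        if address & 1:
            return "ThumbEE", address & 0xFFFFFFFE  # remaining in ThumbEE state
        raise ValueError("UNPREDICTABLE")
    if address & 1:
        # Bit [0] set selects Thumb state; the bit is cleared in the target.
        return "Thumb", address & 0xFFFFFFFE
    if address & 2 == 0:
        # Bits [1:0] == '00' selects ARM state.
        return "ARM", address
    # Bits [1:0] == '10'
    raise ValueError("UNPREDICTABLE")
```

For example, in this model a branch from ARM state to address 0x8001 selects Thumb state with a target of 0x8000.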
A2.4 The Application Program Status Register (APSR)
Program status is reported in the 32-bit Application Program Status Register (APSR). The format of the
APSR is:
Bits      Field
[31]      N
[30]      Z
[29]      C
[28]      V
[27]      Q
[26:24]   RAZ/SBZP
[23:20]   Reserved
[19:16]   GE[3:0]
[15:0]    Reserved
In the APSR, the bits are in the following categories:
• Reserved bits are allocated to system features, or are available for future expansion. Unprivileged
execution ignores writes to privileged fields. However, application level software that writes to the
APSR must treat reserved bits as Do-Not-Modify (DNM) bits. For more information about the
reserved bits, see Format of the CPSR and SPSRs on page B1-16.
• Flags that can be set by many instructions:
N, bit [31] Negative condition code flag. Set to bit [31] of the result of the instruction. If the result
is regarded as a two's complement signed integer, then N == 1 if the result is negative and
N == 0 if it is positive or zero.
Z, bit [30] Zero condition code flag. Set to 1 if the result of the instruction is zero, and to 0 otherwise.
A result of zero often indicates an equal result from a comparison.
C, bit [29] Carry condition code flag. Set to 1 if the instruction results in a carry condition, for
example an unsigned overflow on an addition.
V, bit [28] Overflow condition code flag. Set to 1 if the instruction results in an overflow condition,
for example a signed overflow on an addition.
Q, bit [27] Set to 1 to indicate overflow or saturation occurred in some instructions, normally related
to Digital Signal Processing (DSP). For more information, see Pseudocode details of
saturation on page A2-9.
GE[3:0], bits [19:16]
Greater than or Equal flags. SIMD instructions update these flags to indicate the results
from individual bytes or halfwords of the operation. These flags can control a later SEL
instruction. For more information, see SEL on page A8-312.
• Bits [26:24] are RAZ/SBZP. Therefore, software can use MSR instructions that write the top byte of
the APSR without using a read, modify, write sequence. If it does this, it must write zeros to
bits [26:24].
Instructions can test the N, Z, C, and V condition code flags to determine whether the instruction is to be
executed. In this way, execution of the instruction can be made conditional on the result of a previous
operation. For more information about conditional execution see Conditional execution on page A4-3 and
Conditional execution on page A8-8.
In ARMv7-A and ARMv7-R, the APSR is the same register as the CPSR, but the APSR must be used only
to access the N, Z, C, V, Q, and GE[3:0] bits. For more information, see Program Status Registers (PSRs)
on page B1-14.
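As an illustration of the APSR layout described in this section, the following Python sketch extracts the application level fields from a 32-bit value. It is not part of the architecture pseudocode, and the function name decode_apsr is hypothetical.

```python
def decode_apsr(apsr: int) -> dict:
    """Extract the N, Z, C, V, Q and GE[3:0] fields from a 32-bit APSR value."""
    return {
        "N": (apsr >> 31) & 1,
        "Z": (apsr >> 30) & 1,
        "C": (apsr >> 29) & 1,
        "V": (apsr >> 28) & 1,
        "Q": (apsr >> 27) & 1,
        "GE": (apsr >> 16) & 0xF,
    }
```

For example, decode_apsr(0x600F0000) reports Z and C as 1, N, V and Q as 0, and GE as 0b1111.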
A2.5 Execution state registers
The execution state registers modify the execution of instructions. They control:
• Whether instructions are interpreted as Thumb instructions, ARM instructions, ThumbEE
  instructions, or Java bytecodes. For more information, see ISETSTATE.
• In Thumb state and ThumbEE state only, what conditions apply to the next four instructions. For
  more information, see ITSTATE on page A2-17.
• Whether data is interpreted as big-endian or little-endian. For more information, see ENDIANSTATE
  on page A2-19.
In ARMv7-A and ARMv7-R, the execution state registers are part of the Current Program Status Register.
For more information, see Program Status Registers (PSRs) on page B1-14.
There is no direct access to the execution state registers from application level instructions, but they can be
changed by side effects of application level instructions.
A2.5.1 ISETSTATE
Bit     [1]   [0]
Field   J     T
The J bit and the T bit determine the instruction set used by the processor. Table A2-1 shows the encoding
of these bits.
Table A2-1 J and T bit encoding in ISETSTATE
J   T   Instruction set state
0   0   ARM
0   1   Thumb
1   0   Jazelle
1   1   ThumbEE
ARM state
The processor executes the ARM instruction set described in Chapter A5 ARM
Instruction Set Encoding.
Thumb state
The processor executes the Thumb instruction set as described in Chapter A6
Thumb Instruction Set Encoding.
Jazelle state
The processor executes Java bytecodes as part of a Java Virtual Machine (JVM). For
more information, see Jazelle direct bytecode execution support on page A2-73.
ThumbEE state
The processor executes a variation of the Thumb instruction set specifically targeted
for use with dynamic compilation techniques associated with an execution
environment. This can be Java or other execution environments. This feature is
required in ARMv7-A, and optional in ARMv7-R. For more information, see
Thumb Execution Environment on page A2-69.
Pseudocode details of ISETSTATE operations
The following pseudocode functions return the current instruction set and select a new instruction set:
enumeration InstrSet {InstrSet_ARM, InstrSet_Thumb, InstrSet_Jazelle, InstrSet_ThumbEE};

// CurrentInstrSet()
// =================
InstrSet CurrentInstrSet()
    case ISETSTATE of
        when '00' result = InstrSet_ARM;
        when '01' result = InstrSet_Thumb;
        when '10' result = InstrSet_Jazelle;
        when '11' result = InstrSet_ThumbEE;
    return result;

// SelectInstrSet()
// ================
SelectInstrSet(InstrSet iset)
    case iset of
        when InstrSet_ARM
            if CurrentInstrSet() == InstrSet_ThumbEE then
                UNPREDICTABLE;
            else
                ISETSTATE = '00';
        when InstrSet_Thumb
            ISETSTATE = '01';
        when InstrSet_Jazelle
            ISETSTATE = '10';
        when InstrSet_ThumbEE
            ISETSTATE = '11';
    return;
A2.5.2 ITSTATE
Bits    [7:0]
Field   IT[7:0]
This field holds the If-Then execution state bits for the Thumb IT instruction. See IT on page A8-104 for a
description of the IT instruction and the associated IT block.
ITSTATE divides into two subfields:
IT[7:5]
Holds the base condition for the current IT block. The base condition is the top 3 bits of the
condition specified by the IT instruction.
This subfield is 0b000 when no IT block is active.
IT[4:0]
Encodes:
• The size of the IT block. This is the number of instructions that are to be conditionally
  executed. The size of the block is implied by the position of the least significant 1 in
  this field, as shown in Table A2-2 on page A2-18.
• The value of the least significant bit of the condition code for each instruction in the
  block.
Note
Changing the value of the least significant bit of a condition code from 0 to 1 has the
effect of inverting the condition code.
This subfield is 0b00000 when no IT block is active.
When an IT instruction is executed, these bits are set according to the condition in the instruction, and the
Then and Else (T and E) parameters in the instruction. For more information, see IT on page A8-104.
An instruction in an IT block is conditional, see Conditional instructions on page A4-4 and Conditional
execution on page A8-8. The condition used is the current value of IT[7:4]. When an instruction in an IT
block completes its execution normally, ITSTATE is advanced to the next line of Table A2-2 on page A2-18.
For details of what happens if such an instruction takes an exception see Exception entry on page B1-34.
Note
Instructions that can complete their normal execution by branching are only permitted in an IT block as its
last instruction, and so always result in ITSTATE advancing to normal execution.
Note
ITSTATE affects instruction execution only in Thumb and ThumbEE states. In ARM and Jazelle states,
ITSTATE must be '00000000', otherwise behavior is UNPREDICTABLE.
Table A2-2 Effect of IT execution state bits
IT bits (a)
[7:5]       [4]   [3]   [2]   [1]   [0]   Note
cond_base   P1    P2    P3    P4    1     Entry point for 4-instruction IT block
cond_base   P1    P2    P3    1     0     Entry point for 3-instruction IT block
cond_base   P1    P2    1     0     0     Entry point for 2-instruction IT block
cond_base   P1    1     0     0     0     Entry point for 1-instruction IT block
000         0     0     0     0     0     Normal execution, not in an IT block
a. Combinations of the IT bits not shown in this table are reserved.
Pseudocode details of ITSTATE operations
ITSTATE advances after normal execution of an IT block instruction. This is described by the ITAdvance()
pseudocode function:
// ITAdvance()
// ===========
ITAdvance()
    if ITSTATE<2:0> == '000' then
        ITSTATE.IT = '00000000';
    else
        ITSTATE.IT<4:0> = LSL(ITSTATE.IT<4:0>, 1);
The following functions test whether the current instruction is in an IT block, and whether it is the last
instruction of an IT block:
// InITBlock()
// ===========
boolean InITBlock()
    return (ITSTATE.IT<3:0> != '0000');

// LastInITBlock()
// ===============
boolean LastInITBlock()
    return (ITSTATE.IT<3:0> == '1000');
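The way ITSTATE steps through an IT block can be illustrated with the following Python sketch, which is not part of the architecture pseudocode. ITSTATE is held as an 8-bit integer, and the names it_advance, in_it_block, last_in_it_block and walk_it_block are hypothetical.

```python
def it_advance(itstate: int) -> int:
    """Model of ITAdvance() on an 8-bit ITSTATE value."""
    if itstate & 0b111 == 0:
        return 0
    # Keep IT[7:5] and shift IT[4:0] left by one bit.
    return (itstate & 0b11100000) | ((itstate << 1) & 0b00011111)


def in_it_block(itstate: int) -> bool:
    return itstate & 0b1111 != 0


def last_in_it_block(itstate: int) -> bool:
    return itstate & 0b1111 == 0b1000


def walk_it_block(itstate: int) -> list:
    """Collect the condition, IT[7:4], applied to each instruction in the block."""
    conditions = []
    while in_it_block(itstate):
        conditions.append(itstate >> 4)
        itstate = it_advance(itstate)
    return conditions
```

Starting from the 3-instruction entry point 0b10100110, where cond_base is 0b101 and P1, P2, P3 are 0, 0, 1, walk_it_block() yields the conditions 0b1010, 0b1010 and 0b1011. The third condition is the inverse of the first two, because its least significant bit is 1.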
A2.5.3 ENDIANSTATE
ARMv7-A and ARMv7-R support configuration between little-endian and big-endian interpretations of
data memory, as shown in Table A2-3. The endianness is controlled by ENDIANSTATE.
Table A2-3 APSR configuration of endianness
ENDIANSTATE   Endian mapping
0             Little-endian
1             Big-endian
The ARM and Thumb instruction sets both include an instruction to manipulate ENDIANSTATE:
SETEND BE   Sets ENDIANSTATE to 1, for big-endian operation
SETEND LE   Sets ENDIANSTATE to 0, for little-endian operation.
The SETEND instruction is unconditional. For more information, see SETEND on page A8-314.
Pseudocode details of ENDIANSTATE operations
The BigEndian() pseudocode function tests whether big-endian memory accesses are currently selected.
// BigEndian()
// ===========
boolean BigEndian()
    return (ENDIANSTATE == '1');
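As an informal illustration of the two data interpretations, the following Python sketch reads a 32-bit word from a byte sequence under either setting. It is not part of the architecture pseudocode. The function name load_word is hypothetical, and the big_endian parameter stands in for the value returned by BigEndian().

```python
def load_word(memory: bytes, big_endian: bool) -> int:
    """Interpret the first four bytes of memory as a 32-bit word."""
    byteorder = "big" if big_endian else "little"
    return int.from_bytes(memory[:4], byteorder)
```

The bytes 0x11, 0x22, 0x33, 0x44 at ascending addresses read as 0x44332211 when little-endian is selected and as 0x11223344 when big-endian is selected.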
A2.6 Advanced SIMD and VFP extensions
Advanced SIMD and VFP are two optional extensions to ARMv7.
Advanced SIMD performs packed Single Instruction Multiple Data (SIMD) operations, either integer or
single-precision floating-point. VFP performs single-precision or double-precision floating-point
operations.
Both extensions permit floating-point exceptions, such as overflow or division by zero, to be handled in an
untrapped fashion. When handled in this way, a floating-point exception causes a cumulative status register
bit to be set to 1 and a default result to be produced by the operation.
The ARMv7 VFP implementation is VFPv3. ARMv7 also permits a variant of VFPv3, VFPv3U, that
supports the trapping of floating-point exceptions, see VFPv3U on page A2-31. VFPv2 also supports the
trapping of floating-point exceptions.
For more information about floating-point exceptions see Floating-point exceptions on page A2-42.
Each extension can be implemented at a number of levels. Table A2-4 shows the permitted combinations of
implementations of the two extensions.
Table A2-4 Permitted combinations of Advanced SIMD and VFP extensions
Advanced SIMD                                 VFP
Not implemented                               Not implemented
Integer only                                  Not implemented
Integer and single-precision floating-point   Single-precision floating-point only (a)
Integer and single-precision floating-point   Single-precision and double-precision floating-point
Not implemented                               Single-precision floating-point only (a)
Not implemented                               Single-precision and double-precision floating-point
a. Must be able to load and store double-precision data.
The optional half-precision extensions provide conversion functions in both directions between
half-precision floating-point and single-precision floating-point. These extensions can be implemented with
any Advanced SIMD and VFP implementation that supports single-precision floating-point. The
half-precision extensions apply to both VFP and Advanced SIMD if they are both implemented.
For system-level information about the Advanced SIMD and VFP extensions see:
• Advanced SIMD and VFP extension system registers on page B1-66
• Advanced SIMD and floating-point support on page B1-64.
Note
Before ARMv7, the VFP extension was called the Vector Floating-point Architecture, and was used for
vector operations. For details of these deprecated operations see Appendix F VFP Vector Operation
Support. From ARMv7:
• ARM recommends that the Advanced SIMD extension is used for single-precision vector
  floating-point operations
• an implementation that requires support for vector operations must implement the Advanced SIMD
  extension.
A2.6.1 Advanced SIMD and VFP extension registers
Advanced SIMD and VFPv3 use the same register set. This is distinct from the ARM core register set. These
registers are generally referred to as the extension registers.
The extension register set consists of either thirty-two or sixteen doubleword registers, as follows:
• If VFPv2 is implemented, it consists of sixteen doubleword registers.
• If VFPv3 is implemented, it consists of either thirty-two or sixteen doubleword registers. Where
  necessary the terms VFPv3-D32 and VFPv3-D16 are used to distinguish between these two
  implementation options.
• If Advanced SIMD is implemented, it consists of thirty-two doubleword registers. If both Advanced
  SIMD and VFPv3 are implemented, VFPv3 must be implemented in its VFPv3-D32 form.
The Advanced SIMD and VFP views of the extension register set are not identical. They are described in
the following sections.
Figure A2-1 on page A2-22 shows the views of the extension register set, and the way the word,
doubleword, and quadword registers overlap.
Advanced SIMD views of the extension register set
Advanced SIMD can view this register set as:
• Sixteen 128-bit quadword registers, Q0-Q15.
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in VFPv3.
These views can be used simultaneously. For example, a program might hold 64-bit vectors in D0 and D1
and a 128-bit vector in Q1.
VFP views of the extension register set
In VFPv3-D32, the extension register set consists of thirty-two doubleword registers, that VFP can view as:
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in Advanced SIMD.
• Thirty-two 32-bit single word registers, S0-S31. Only half of the set is accessible in this view.
In VFPv3-D16 and VFPv2, the extension register set consists of sixteen doubleword registers, that VFP can
view as:
• Sixteen 64-bit doubleword registers, D0-D15.
• Thirty-two 32-bit single word registers, S0-S31.
In each case, the two views can be used simultaneously.
Advanced SIMD and VFP register mapping
[Figure: overlapping views of the extension register set: S0-S31 (VFP only), D0-D15 (VFPv2 or
VFPv3-D16), D0-D31 (VFPv3-D32 or Advanced SIMD), and Q0-Q15 (Advanced SIMD only).]
Figure A2-1 Advanced SIMD and VFP register set
The mapping between the registers is as follows:
• S<2n> maps to the least significant half of D<n>
• S<2n+1> maps to the most significant half of D<n>
• D<2n> maps to the least significant half of Q<n>
• D<2n+1> maps to the most significant half of Q<n>.
For example, you can access the least significant half of the elements of a vector in Q6 by referring to D12,
and the most significant half of the elements by referring to D13.
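The register overlap can be illustrated with the following Python sketch, which is not part of the architecture pseudocode. The class name ExtensionRegisters and its method names are hypothetical. The sketch assumes an implementation with all thirty-two doubleword registers and does not model the VFPSmallRegisterBank() check.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class ExtensionRegisters:
    """Thirty-two 64-bit registers with overlapping S, D and Q views."""

    def __init__(self):
        self.d = [0] * 32

    def read_s(self, n: int) -> int:
        assert 0 <= n <= 31
        value = self.d[n // 2]
        # Even S registers are the low word, odd S registers the high word.
        return (value >> 32) & MASK32 if n % 2 else value & MASK32

    def write_s(self, n: int, value: int) -> None:
        assert 0 <= n <= 31
        old = self.d[n // 2]
        if n % 2 == 0:
            self.d[n // 2] = (old & ~MASK32) | (value & MASK32)
        else:
            self.d[n // 2] = (old & MASK32) | ((value & MASK32) << 32)

    def read_q(self, n: int) -> int:
        assert 0 <= n <= 15
        return (self.d[2 * n + 1] << 64) | self.d[2 * n]

    def write_q(self, n: int, value: int) -> None:
        assert 0 <= n <= 15
        self.d[2 * n] = value & MASK64
        self.d[2 * n + 1] = (value >> 64) & MASK64
```

In this model, writing Q6 updates D12 with the least significant half of the value and D13 with the most significant half.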
Pseudocode details of Advanced SIMD and VFP extension registers
The pseudocode function VFPSmallRegisterBank() returns FALSE if all of the 32 registers D0-D31 can be
accessed, and TRUE if only the 16 registers D0-D15 can be accessed:
boolean VFPSmallRegisterBank()
In more detail, VFPSmallRegisterBank():
• returns TRUE for a VFPv2 or VFPv3-D16 implementation
• for a VFPv3-D32 implementation:
  — returns FALSE if CPACR.D32DIS == 0
  — returns TRUE if CPACR.D32DIS == 1 and CPACR.ASEDIS == 1
  — results in UNPREDICTABLE behavior if CPACR.D32DIS == 1 and CPACR.ASEDIS == 0.
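The cases listed above can be summarized in the following Python sketch, which is an illustration only and not part of the architecture pseudocode. The function name vfp_small_register_bank and its parameters are hypothetical, the implementation is identified by a string, and UNPREDICTABLE behavior is modelled, as an assumption of this sketch, by raising ValueError.

```python
def vfp_small_register_bank(implementation: str, d32dis: int = 0, asedis: int = 0) -> bool:
    """Return True if only D0-D15 can be accessed, False if D0-D31 can be accessed."""
    if implementation in ("VFPv2", "VFPv3-D16"):
        return True
    # Remaining case: a VFPv3-D32 implementation, controlled by the CPACR bits.
    if d32dis == 0:
        return False
    if asedis == 1:
        return True
    raise ValueError("UNPREDICTABLE")  # D32DIS == 1 and ASEDIS == 0
```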
For details of the CPACR register, see:
• c1, Coprocessor Access Control Register (CPACR) on page B3-104 for a VMSA implementation
• c1, Coprocessor Access Control Register (CPACR) on page B4-51 for a PMSA implementation.
The S0-S31, D0-D31, and Q0-Q15 views of the registers are provided by the following functions:
// The 64-bit extension register bank for Advanced SIMD and VFP.
array bits(64) _D[0..31];

// S[] - non-assignment form
// =========================
bits(32) S[integer n]
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        result = D[n DIV 2]<31:0>;
    else
        result = D[n DIV 2]<63:32>;
    return result;

// S[] - assignment form
// =====================
S[integer n] = bits(32) value
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        D[n DIV 2]<31:0> = value;
    else
        D[n DIV 2]<63:32> = value;
    return;
// D[] - non-assignment form
// =========================
bits(64) D[integer n]
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    return _D[n];

// D[] - assignment form
// =====================
D[integer n] = bits(64) value
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    _D[n] = value;
    return;
// Q[] - non-assignment form
// =========================
bits(128) Q[integer n]
    assert n >= 0 && n <= 15;
    return D[2*n+1]:D[2*n];

// Q[] - assignment form
// =====================
Q[integer n] = bits(128) value
    assert n >= 0 && n <= 15;
    D[2*n] = value<63:0>;
    D[2*n+1] = value<127:64>;
    return;
A2.6.2 Data types supported by the Advanced SIMD extension
When the Advanced SIMD extension is implemented, it can operate on integer and floating-point data. It
defines a set of data types to represent the different data formats. Table A2-5 shows the available formats.
Each instruction description specifies the data types that the instruction supports.
Table A2-5 Advanced SIMD data types
Data type specifier   Meaning
.<size>               Any element of <size> bits
.F<size>              Floating-point number of <size> bits
.I<size>              Signed or unsigned integer of <size> bits
.P<size>              Polynomial over {0,1} of degree less than <size>
.S<size>              Signed integer of <size> bits
.U<size>              Unsigned integer of <size> bits
The polynomial data type is described in Polynomial arithmetic over {0,1} on page A2-67.
The .F16 data type is the half-precision data type currently selected by the FPSCR.AHP bit, see Advanced
SIMD and VFP system registers on page A2-28. It is supported only when the half-precision extensions are
implemented.
The .F32 data type is the ARM standard single-precision floating-point data type, see Advanced SIMD and
VFP single-precision format on page A2-34.
The instruction definitions use a data type specifier to define the data types appropriate to the operation.
Figure A2-2 on page A2-26 shows the hierarchy of Advanced SIMD data types.
Size     Any    Integer   Signed   Unsigned   Polynomial   Floating-point
8-bit    .8     .I8       .S8      .U8        .P8          -
16-bit   .16    .I16      .S16     .U16       .P16         .F16 (a)
32-bit   .32    .I32      .S32     .U32       -            .F32
64-bit   .64    .I64      .S64     .U64       -            -
a. Supported only if the half-precision extensions are implemented
Figure A2-2 Advanced SIMD data type hierarchy
For example, a multiply instruction must distinguish between integer and floating-point data types.
However, some multiply instructions use modulo arithmetic for integer instructions and therefore do not
need to distinguish between signed and unsigned inputs.
A multiply instruction that generates a double-width (long) result must specify the input data types as signed
or unsigned, because for this operation it does make a difference.
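The difference between the two kinds of multiply can be checked with the following Python sketch, which is an illustration only. The function names mul_low and mul_long are hypothetical, and elements are held as non-negative integers of size bits.

```python
def to_signed(value: int, size: int) -> int:
    """Two's complement interpretation of a size-bit element."""
    return value - (1 << size) if value & (1 << (size - 1)) else value


def mul_low(a: int, b: int, size: int, signed: bool) -> int:
    """Multiply two size-bit elements and keep only the low size bits (modulo arithmetic)."""
    if signed:
        a, b = to_signed(a, size), to_signed(b, size)
    return (a * b) & ((1 << size) - 1)


def mul_long(a: int, b: int, size: int, signed: bool) -> int:
    """Multiply two size-bit elements into a double-width (long) result."""
    if signed:
        a, b = to_signed(a, size), to_signed(b, size)
    return (a * b) & ((1 << (2 * size)) - 1)
```

For the 8-bit inputs 0xFF and 0x02, mul_low() returns 0xFE for both interpretations, whereas mul_long() returns 0x01FE for unsigned inputs and 0xFFFE for signed inputs.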
A2.6.3 Advanced SIMD vectors
When the Advanced SIMD extension is implemented, a register can hold one or more packed elements, all
of the same size and type. The combination of a register and a data type describes a vector of elements. The
vector is considered to be an array of elements of the data type specified in the instruction. The number of
elements in the vector is implied by the size of the data elements and the size of the register.
Vector indices are in the range 0 to (number of elements – 1). An index of 0 refers to the least significant
end of the vector. Figure A2-3 on page A2-27 shows examples of Advanced SIMD vectors:
[Figure: example vector layouts, with element [0] at the least significant end of the register]
Qn, bits [127:0]: four .F32 elements [3] to [0], a 128-bit vector of single-precision (32-bit) floating-point numbers
Qn, bits [127:0]: eight .S16 elements [7] to [0], a 128-bit vector of 16-bit signed integers
Dn, bits [63:0]: two .S32 elements [1] to [0], a 64-bit vector of 32-bit signed integers
Dn, bits [63:0]: four .U16 elements [3] to [0], a 64-bit vector of 16-bit unsigned integers
Figure A2-3 Examples of Advanced SIMD vectors
Pseudocode details of Advanced SIMD vectors
The pseudocode function Elem[] is used to access the element of a specified index and size in a vector:
// Elem[] - non-assignment form
// ============================
bits(size) Elem[bits(N) vector, integer e, integer size]
    assert e >= 0 && (e+1)*size <= N;
    return vector<(e+1)*size-1:e*size>;

// Elem[] - assignment form
// ========================
Elem[bits(N) vector, integer e, integer size] = bits(size) value
    assert e >= 0 && (e+1)*size <= N;
    vector<(e+1)*size-1:e*size> = value;
    return;
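As an informal illustration, element access can be modelled in Python with shifts and masks, as in the following sketch. It is not part of the architecture pseudocode. The names get_elem and set_elem are hypothetical, and because Python integers are immutable, set_elem returns the updated vector instead of modifying it in place.

```python
def get_elem(vector: int, e: int, size: int, n: int = 128) -> int:
    """Return element e of the given size from an N-bit vector."""
    assert e >= 0 and (e + 1) * size <= n
    return (vector >> (e * size)) & ((1 << size) - 1)


def set_elem(vector: int, e: int, size: int, value: int, n: int = 128) -> int:
    """Return a copy of the N-bit vector with element e replaced by value."""
    assert e >= 0 and (e + 1) * size <= n
    mask = ((1 << size) - 1) << (e * size)
    return (vector & ~mask) | ((value << (e * size)) & mask)
```

With a 64-bit vector of four 16-bit elements holding 0x0004000300020001, index 0 selects the element at the least significant end, which is 0x0001.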
A2.6.4 Advanced SIMD and VFP system registers
The Advanced SIMD and VFP extensions have a shared register space for system registers. Only one
register in this space is accessible at the application level, see Floating-point Status and Control Register
(FPSCR).
See Advanced SIMD and VFP extension system registers on page B1-66 for the system level description of
the registers.
Floating-point Status and Control Register (FPSCR)
The Floating-point Status and Control Register (FPSCR) is implemented in any system that implements one
or both of:
• the VFP extension
• the Advanced SIMD extension.
The FPSCR provides all necessary User level control of the floating-point system.
The FPSCR is a 32-bit read/write system register, accessible in unprivileged and privileged modes.
The format of the FPSCR is:
Bits      Field
[31]      N
[30]      Z
[29]      C
[28]      V
[27]      QC
[26]      AHP
[25]      DN
[24]      FZ
[23:22]   RMode
[21:20]   Stride
[19]      UNK/SBZP
[18:16]   Len
[15]      IDE
[14:13]   UNK/SBZP
[12]      IXE
[11]      UFE
[10]      OFE
[9]       DZE
[8]       IOE
[7]       IDC
[6:5]     UNK/SBZP
[4]       IXC
[3]       UFC
[2]       OFC
[1]       DZC
[0]       IOC
Bits [31:28]
Condition code bits. These are updated on floating-point comparison operations. They are
not updated on SIMD operations, and do not affect SIMD instructions.
N, bit [31] Negative condition code flag.
Z, bit [30] Zero condition code flag.
C, bit [29] Carry condition code flag.
V, bit [28] Overflow condition code flag.
QC, bit [27]
Cumulative saturation flag, Advanced SIMD only. This bit is set to 1 to indicate that an
Advanced SIMD integer operation has saturated since 0 was last written to this bit. For
details of saturation, see Pseudocode details of saturation on page A2-9.
The value of this bit is ignored by the VFP extension. If Advanced SIMD is not implemented
this bit is UNK/SBZP.
AHP, bit [26] Alternative half-precision control bit:
0   IEEE half-precision format selected.
1   Alternative half-precision format selected.
For more information see Advanced SIMD and VFP half-precision formats on page A2-38.
If the half-precision extensions are not implemented this bit is UNK/SBZP.
Bits [19,14:13,6:5]
Reserved. UNK/SBZP.
DN, bit [25] Default NaN mode control bit:
0   NaN operands propagate through to the output of a floating-point operation.
1   Any operation involving one or more NaNs returns the Default NaN.
For more information, see NaN handling and the Default NaN on page A2-41.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Default NaN setting, regardless of the value of the DN bit.
FZ, bit [24] Flush-to-zero mode control bit:
0   Flush-to-zero mode disabled. Behavior of the floating-point system is fully
    compliant with the IEEE 754 standard.
1   Flush-to-zero mode enabled.
For more information, see Flush-to-zero on page A2-39.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Flush-to-zero setting, regardless of the value of the FZ bit.
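The DN and FZ control bits can be set or cleared with a read-modify-write of the FPSCR value. The C sketch below shows only the modify step. The macro and function names are hypothetical, and the code assumes the value is read from and written back to the register separately. As described above, these bits affect VFP arithmetic only.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_DN (UINT32_C(1) << 25) /* DN, bit [25]: Default NaN mode */
#define FPSCR_FZ (UINT32_C(1) << 24) /* FZ, bit [24]: Flush-to-zero mode */

/* Return the FPSCR value with Default NaN mode enabled or disabled. */
static inline uint32_t fpscr_set_dn(uint32_t fpscr, bool enable)
{
    return enable ? (fpscr | FPSCR_DN) : (fpscr & ~FPSCR_DN);
}

/* Return the FPSCR value with Flush-to-zero mode enabled or disabled. */
static inline uint32_t fpscr_set_fz(uint32_t fpscr, bool enable)
{
    return enable ? (fpscr | FPSCR_FZ) : (fpscr & ~FPSCR_FZ);
}
```

Both helpers leave every other field of the value unchanged, so they can be applied one after the other to the same value.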
RMode, bits [23:22]
Rounding Mode control field. The encoding of this field is:
0b00
Round to Nearest (RN) mode
0b01
Round towards Plus Infinity (RP) mode
0b10
Round towards Minus Infinity (RM) mode
0b11
Round towards Zero (RZ) mode.
The specified rounding mode is used by almost all VFP floating-point instructions.
Advanced SIMD arithmetic always uses the Round to Nearest setting, regardless of the
value of the RMode bits.
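The RMode encoding above can be expressed as a small C sketch that reads and replaces the two-bit field. The enumeration fpscr_rmode and the helper names are assumptions made for this example. Only the field manipulation is shown, not the register access itself.

```c
#include <assert.h>
#include <stdint.h>

/* RMode encodings, FPSCR bits [23:22]. */
typedef enum {
    FPSCR_RN = 0, /* 0b00, Round to Nearest */
    FPSCR_RP = 1, /* 0b01, Round towards Plus Infinity */
    FPSCR_RM = 2, /* 0b10, Round towards Minus Infinity */
    FPSCR_RZ = 3  /* 0b11, Round towards Zero */
} fpscr_rmode;

#define FPSCR_RMODE_SHIFT 22
#define FPSCR_RMODE_MASK  (UINT32_C(3) << FPSCR_RMODE_SHIFT)

/* Read the rounding mode field from an FPSCR value. */
static inline fpscr_rmode fpscr_get_rmode(uint32_t fpscr)
{
    return (fpscr_rmode)((fpscr & FPSCR_RMODE_MASK) >> FPSCR_RMODE_SHIFT);
}

/* Return the FPSCR value with the rounding mode field replaced. */
static inline uint32_t fpscr_set_rmode(uint32_t fpscr, fpscr_rmode mode)
{
    assert((uint32_t)mode <= 3u); /* the field is only two bits wide */
    return (fpscr & ~FPSCR_RMODE_MASK)
         | ((uint32_t)mode << FPSCR_RMODE_SHIFT);
}
```

Clearing the field before inserting the new value ensures that a previously selected mode cannot leak into the result.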
Stride, bits [21:20] and Len, bits [18:16]
Use of nonzero values of these fields is deprecated in ARMv7. For details of their use in
previous versions of the ARM architecture see Appendix F VFP Vector Operation Support.
The values of these fields are ignored by the Advanced SIMD extension.
Bits [15,12:8] Floating-point exception trap enable bits. These bits are supported only in VFPv2 and
VFPv3U. They are reserved, RAZ/SBZP, on a system that implements VFPv3.
The possible values of each bit are:
0
Untrapped exception handling selected
1
Trapped exception handling selected.
The values of these bits control only VFP arithmetic. Advanced SIMD arithmetic always
uses untrapped exception handling, regardless of the values of these bits.
For more information, see Floating-point exceptions on page A2-42.
IDE, bit [15]
Input Denormal exception trap enable.
IXE, bit [12]
Inexact exception trap enable.
UFE, bit [11]
Underflow exception trap enable.
OFE, bit [10]
Overflow exception trap enable.
DZE, bit [9]
Division by Zero exception trap enable.
IOE, bit [8]
Invalid Operation exception trap enable.
Bits [7,4:0]
Cumulative exception flags for floating-point exceptions. Each of these bits is set to 1 to
indicate that the corresponding exception has occurred since 0 was last written to it. How
VFP instructions update these bits depends on the value of the corresponding exception trap
enable bits:
Trap enable bit = 0
If the floating-point exception occurs then the cumulative exception flag is set
to 1.
Trap enable bit = 1
If the floating-point exception occurs the trap handling software can decide
whether to set the cumulative exception flag to 1.
Advanced SIMD instructions set each cumulative exception flag if the corresponding
exception occurs in one or more of the floating-point calculations performed by the
instruction, regardless of the setting of the trap enable bits.
For more information, see Floating-point exceptions on page A2-42.
IDC, bit [7]
Input Denormal cumulative exception flag.
IXC, bit [4]
Inexact cumulative exception flag.
UFC, bit [3]
Underflow cumulative exception flag.
OFC, bit [2]
Overflow cumulative exception flag.
DZC, bit [1]
Division by Zero cumulative exception flag.
IOC, bit [0]
Invalid Operation cumulative exception flag.
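The cumulative exception flags can be tested and cleared as a group. The following C sketch defines masks for the six flags and also uses the fact, visible in the bit assignments above, that each trap enable bit sits eight positions above its cumulative flag. All macro and function names are hypothetical, and the sketch works on a copy of the FPSCR value.

```c
#include <assert.h>
#include <stdint.h>

/* Cumulative exception flags, FPSCR bits [7,4:0]. */
#define FPSCR_IOC (UINT32_C(1) << 0) /* Invalid Operation */
#define FPSCR_DZC (UINT32_C(1) << 1) /* Division by Zero */
#define FPSCR_OFC (UINT32_C(1) << 2) /* Overflow */
#define FPSCR_UFC (UINT32_C(1) << 3) /* Underflow */
#define FPSCR_IXC (UINT32_C(1) << 4) /* Inexact */
#define FPSCR_IDC (UINT32_C(1) << 7) /* Input Denormal */
#define FPSCR_CUMULATIVE_MASK \
    (FPSCR_IOC | FPSCR_DZC | FPSCR_OFC | FPSCR_UFC | FPSCR_IXC | FPSCR_IDC)

/* Return only the cumulative exception flags of an FPSCR value. */
static inline uint32_t fpscr_pending_exceptions(uint32_t fpscr)
{
    return fpscr & FPSCR_CUMULATIVE_MASK;
}

/* Return the FPSCR value with all cumulative flags cleared, so that
 * later exceptions can be detected after it is written back. */
static inline uint32_t fpscr_clear_exceptions(uint32_t fpscr)
{
    return fpscr & ~FPSCR_CUMULATIVE_MASK;
}

/* Map one cumulative flag to its trap enable bit (for example IDC, bit [7],
 * to IDE, bit [15]). The trap enable bits exist only in VFPv2 and VFPv3U. */
static inline uint32_t fpscr_trap_enable_for(uint32_t flag)
{
    assert(flag != 0 && (flag & ~FPSCR_CUMULATIVE_MASK) == 0);
    return flag << 8;
}
```

A caller might test an individual exception with an expression such as (fpscr_pending_exceptions(value) & FPSCR_DZC) != 0.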
If the processor implements the integer-only Advanced SIMD extension and does not implement the VFP
extension, all of these bits except QC are UNK/SBZP.
Writes to the FPSCR can have side-effects on various aspects of processor operation. All of these
side-effects are synchronous to the FPSCR write. This means they are guaranteed not to be visible to earlier
instructions in the execution stream, and they are guaranteed to be visible to later instructions in the
execution stream.
Accessing the FPSCR
You read or write the FPSCR using the VMRS and VMSR instructions. For more information, see VMRS on
page A8-658 and VMSR on page A8-660. For example:
VMRS