ARM Architecture Reference Manual ARMv7 A And R Edition V7

Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved.
ARM DDI 0406B
ARM® Architecture
Reference Manual
ARM®v7-A and ARM®v7-R edition
Release Information
The following changes have been made to this document.
Change History
Date            Issue  Confidentiality   Change
05 April 2007   A      Non-Confidential  New edition for ARMv7-A and ARMv7-R architecture profiles.
                                         Document number changed from ARM DDI 0100 to ARM DDI 0406 and
                                         contents restructured.
29 April 2008   B      Non-Confidential  Addition of the VFP Half-precision and Multiprocessing
                                         Extensions, and many clarifications and enhancements.
From ARMv7, the ARM® architecture defines different architectural profiles and this edition of this manual describes
only the A and R profiles. For details of the documentation of the ARMv7-M profile see Further reading on page xx.
Before ARMv7 there was only a single ARM Architecture Reference Manual, with document number DDI 0100. The first
issue of this was in February 1996, and the final issue, Issue I, was in July 2005. For more information see Further reading
on page xx.
Proprietary Notice
Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM Limited in the EU and other
countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be
the trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted
or reproduced in any material form except with the prior written permission of the copyright holder.
The product described in this document is subject to continuous developments and improvements. All particulars of the
product and its use contained in this document are given by ARM in good faith. However, all warranties implied or
expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.
1. Subject to the provisions set out below, ARM hereby grants to you a perpetual, non-exclusive, nontransferable, royalty
free, worldwide licence to use this ARM Architecture Reference Manual for the purposes of developing; (i) software
applications or operating systems which are targeted to run on microprocessor cores distributed under licence from ARM;
(ii) tools which are designed to develop software programs which are targeted to run on microprocessor cores distributed
under licence from ARM; (iii) or having developed integrated circuits which incorporate a microprocessor core
manufactured under licence from ARM.
2. Except as expressly licensed in Clause 1 you acquire no right, title or interest in the ARM Architecture Reference
Manual, or any Intellectual Property therein. In no event shall the licences granted in Clause 1, be construed as granting
you expressly or by implication, estoppel or otherwise, licences to any ARM technology other than the ARM Architecture
Reference Manual. The licence grant in Clause 1 expressly excludes any rights for you to use or take into use any ARM
patents. No right is granted to you under the provisions of Clause 1 to; (i) use the ARM Architecture Reference Manual
for the purposes of developing or having developed microprocessor cores or models thereof which are compatible in
whole or part with either or both the instructions or programmers’ models described in this ARM Architecture Reference
Manual; or (ii) develop or have developed models of any microprocessor cores designed by or for ARM; or (iii) distribute
in whole or in part this ARM Architecture Reference Manual to third parties, other than to your subcontractors for the
purposes of having developed products in accordance with the licence grant in Clause 1 without the express written
permission of ARM; or (iv) translate or have translated this ARM Architecture Reference Manual into any other
languages.
3. THE ARM ARCHITECTURE REFERENCE MANUAL IS PROVIDED "AS IS" WITH NO WARRANTIES
EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF
SATISFACTORY QUALITY, NONINFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE.
4. No licence, express, implied or otherwise, is granted to LICENSEE, under the provisions of Clause 1, to use the ARM
tradename, in connection with the use of the ARM Architecture Reference Manual or any products based thereon.
Nothing in Clause 1 shall be construed as authority for you to make any representations on behalf of ARM in respect of
the ARM Architecture Reference Manual or any products based thereon.
Where the term ARM is used to refer to the company it means “ARM or any of its subsidiaries as appropriate”.
Note
The term ARM is also used to refer to versions of the ARM architecture, for example ARMv6 refers to version 6 of the
ARM architecture. The context makes it clear when the term is used in this way.
Copyright © 1996-1998, 2000, 2004-2008 ARM Limited
110 Fulbourn Road Cambridge, England CB1 9NJ
Restricted Rights Legend: Use, duplication or disclosure by the United States Government is subject to the restrictions
set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19.
This document is Non-Confidential. The right to use, copy and disclose this document is subject to the licence set out
above.
Contents
ARM Architecture Reference Manual
ARMv7-A and ARMv7-R edition
Preface
About this manual ............................................................................... xiv
Using this manual ................................................................................ xv
Conventions ....................................................................................... xviii
Further reading .................................................................................... xx
Feedback ............................................................................................ xxi
Part A Application Level Architecture
Chapter A1 Introduction to the ARM Architecture
A1.1 About the ARM architecture ............................................................. A1-2
A1.2 The ARM and Thumb instruction sets .............................................. A1-3
A1.3 Architecture versions, profiles, and variants .................................... A1-4
A1.4 Architecture extensions .................................................................... A1-6
A1.5 The ARM memory model ................................................................. A1-7
A1.6 Debug .............................................................................................. A1-8
Chapter A2 Application Level Programmers’ Model
A2.1 About the Application level programmers’ model ............................. A2-2
A2.2 ARM core data types and arithmetic ................................................ A2-3
A2.3 ARM core registers ........................................................................ A2-11
A2.4 The Application Program Status Register (APSR) ......................... A2-14
A2.5 Execution state registers ................................................................ A2-15
A2.6 Advanced SIMD and VFP extensions ............................................ A2-20
A2.7 Floating-point data types and arithmetic ........................................ A2-32
A2.8 Polynomial arithmetic over {0,1} .................................................... A2-67
A2.9 Coprocessor support ...................................................................... A2-68
A2.10 Execution environment support ..................................................... A2-69
A2.11 Exceptions, debug events and checks ........................................... A2-81
Chapter A3 Application Level Memory Model
A3.1 Address space ................................................................................. A3-2
A3.2 Alignment support ............................................................................ A3-4
A3.3 Endian support ................................................................................. A3-7
A3.4 Synchronization and semaphores .................................................. A3-12
A3.5 Memory types and attributes and the memory order model .......... A3-24
A3.6 Access rights .................................................................................. A3-38
A3.7 Virtual and physical addressing ..................................................... A3-40
A3.8 Memory access order .................................................................... A3-41
A3.9 Caches and memory hierarchy ...................................................... A3-51
Chapter A4 The Instruction Sets
A4.1 About the instruction sets ................................................................. A4-2
A4.2 Unified Assembler Language ........................................................... A4-4
A4.3 Branch instructions .......................................................................... A4-7
A4.4 Data-processing instructions ............................................................ A4-8
A4.5 Status register access instructions ................................................ A4-18
A4.6 Load/store instructions ................................................................... A4-19
A4.7 Load/store multiple instructions ..................................................... A4-22
A4.8 Miscellaneous instructions ............................................................. A4-23
A4.9 Exception-generating and exception-handling instructions ............ A4-24
A4.10 Coprocessor instructions ............................................................... A4-25
A4.11 Advanced SIMD and VFP load/store instructions .......................... A4-26
A4.12 Advanced SIMD and VFP register transfer instructions ................. A4-29
A4.13 Advanced SIMD data-processing operations ................................. A4-30
A4.14 VFP data-processing instructions .................................................. A4-38
Chapter A5 ARM Instruction Set Encoding
A5.1 ARM instruction set encoding .......................................................... A5-2
A5.2 Data-processing and miscellaneous instructions ............................. A5-4
A5.3 Load/store word and unsigned byte ............................................... A5-19
A5.4 Media instructions .......................................................................... A5-21
A5.5 Branch, branch with link, and block data transfer .......................... A5-27
A5.6 Supervisor Call, and coprocessor instructions ............................... A5-28
A5.7 Unconditional instructions .............................................................. A5-30
Chapter A6 Thumb Instruction Set Encoding
A6.1 Thumb instruction set encoding ....................................................... A6-2
A6.2 16-bit Thumb instruction encoding ................................................... A6-6
A6.3 32-bit Thumb instruction encoding ................................................. A6-14
Chapter A7 Advanced SIMD and VFP Instruction Encoding
A7.1 Overview .......................................................................................... A7-2
A7.2 Advanced SIMD and VFP instruction syntax ................................... A7-3
A7.3 Register encoding ............................................................................ A7-8
A7.4 Advanced SIMD data-processing instructions ............................... A7-10
A7.5 VFP data-processing instructions .................................................. A7-24
A7.6 Extension register load/store instructions ...................................... A7-26
A7.7 Advanced SIMD element or structure load/store instructions ........ A7-27
A7.8 8, 16, and 32-bit transfer between ARM core and extension registers ..... A7-31
A7.9 64-bit transfers between ARM core and extension registers ......... A7-32
Chapter A8 Instruction Details
A8.1 Format of instruction descriptions .................................................... A8-2
A8.2 Standard assembler syntax fields .................................................... A8-7
A8.3 Conditional execution ....................................................................... A8-8
A8.4 Shifts applied to a register ............................................................. A8-10
A8.5 Memory accesses .......................................................................... A8-13
A8.6 Alphabetical list of instructions ....................................................... A8-14
Chapter A9 ThumbEE
A9.1 The ThumbEE instruction set ........................................................... A9-2
A9.2 ThumbEE instruction set encoding .................................................. A9-6
A9.3 Additional instructions in Thumb and ThumbEE instruction sets ..... A9-7
A9.4 ThumbEE instructions with modified behavior ................................. A9-8
A9.5 Additional ThumbEE instructions ................................................... A9-14
Part B System Level Architecture
Chapter B1 The System Level Programmers’ Model
B1.1 About the system level programmers’ model ................................... B1-2
B1.2 System level concepts and terminology ........................................... B1-3
B1.3 ARM processor modes and core registers ....................................... B1-6
B1.4 Instruction set states ...................................................................... B1-23
B1.5 The Security Extensions ................................................................ B1-25
B1.6 Exceptions ..................................................................................... B1-30
B1.7 Coprocessors and system control .................................................. B1-62
B1.8 Advanced SIMD and floating-point support .................................... B1-64
B1.9 Execution environment support ..................................................... B1-73
Chapter B2 Common Memory System Architecture Features
B2.1 About the memory system architecture ........................................... B2-2
B2.2 Caches ............................................................................................. B2-3
B2.3 Implementation defined memory system features ......................... B2-27
B2.4 Pseudocode details of general memory system operations .......... B2-29
Chapter B3 Virtual Memory System Architecture (VMSA)
B3.1 About the VMSA .............................................................................. B3-2
B3.2 Memory access sequence ............................................................... B3-4
B3.3 Translation tables ............................................................................. B3-7
B3.4 Address mapping restrictions ......................................................... B3-23
B3.5 Secure and Non-secure address spaces ....................................... B3-26
B3.6 Memory access control .................................................................. B3-28
B3.7 Memory region attributes ............................................................... B3-32
B3.8 VMSA memory aborts .................................................................... B3-40
B3.9 Fault Status and Fault Address registers in a VMSA implementation ...... B3-48
B3.10 Translation Lookaside Buffers (TLBs) ............................................ B3-54
B3.11 Virtual Address to Physical Address translation operations ........... B3-63
B3.12 CP15 registers for a VMSA implementation .................................. B3-64
B3.13 Pseudocode details of VMSA memory system operations .......... B3-156
Chapter B4 Protected Memory System Architecture (PMSA)
B4.1 About the PMSA .............................................................................. B4-2
B4.2 Memory access control .................................................................... B4-9
B4.3 Memory region attributes ............................................................... B4-11
B4.4 PMSA memory aborts .................................................................... B4-13
B4.5 Fault Status and Fault Address registers in a PMSA implementation ...... B4-18
B4.6 CP15 registers for a PMSA implementation .................................. B4-22
B4.7 Pseudocode details of PMSA memory system operations ............ B4-79
Chapter B5 The CPUID Identification Scheme
B5.1 Introduction to the CPUID scheme .................................................. B5-2
B5.2 The CPUID registers ........................................................................ B5-4
B5.3 Advanced SIMD and VFP feature identification registers .............. B5-34
Chapter B6 System Instructions
B6.1 Alphabetical list of instructions ......................................................... B6-2
Part C Debug Architecture
Chapter C1 Introduction to the ARM Debug Architecture
C1.1 Scope of part C of this manual ......................................................... C1-2
C1.2 About the ARM Debug architecture ................................................. C1-3
C1.3 Security Extensions and debug ....................................................... C1-8
C1.4 Register interfaces ........................................................................... C1-9
Chapter C2 Invasive Debug Authentication
C2.1 About invasive debug authentication ............................................... C2-2
Chapter C3 Debug Events
C3.1 About debug events ......................................................................... C3-2
C3.2 Software debug events .................................................................... C3-5
C3.3 Halting debug events ..................................................................... C3-38
C3.4 Generation of debug events ........................................................... C3-40
C3.5 Debug event prioritization .............................................................. C3-43
Chapter C4 Debug Exceptions
C4.1 About debug exceptions .................................................................. C4-2
C4.2 Effects of debug exceptions on CP15 registers and the DBGWFAR ........ C4-4
Chapter C5 Debug State
C5.1 About Debug state ........................................................................... C5-2
C5.2 Entering Debug state ....................................................................... C5-3
C5.3 Behavior of the PC and CPSR in Debug state ................................. C5-7
C5.4 Executing instructions in Debug state .............................................. C5-9
C5.5 Privilege in Debug state ................................................................. C5-13
C5.6 Behavior of non-invasive debug in Debug state ............................. C5-19
C5.7 Exceptions in Debug state ............................................................. C5-20
C5.8 Memory system behavior in Debug state ....................................... C5-24
C5.9 Leaving Debug state ...................................................................... C5-28
Chapter C6 Debug Register Interfaces
C6.1 About the debug register interfaces ................................................. C6-2
C6.2 Reset and power-down support ....................................................... C6-4
C6.3 Debug register map ....................................................................... C6-18
C6.4 Synchronization of debug register updates .................................... C6-24
C6.5 Access permissions ....................................................................... C6-26
C6.6 The CP14 debug register interfaces .............................................. C6-32
C6.7 The memory-mapped and recommended external debug interfaces ....... C6-43
Chapter C7 Non-invasive Debug Authentication
C7.1 About non-invasive debug authentication ........................................ C7-2
C7.2 v7 Debug non-invasive debug authentication .................................. C7-4
C7.3 Effects of non-invasive debug authentication .................................. C7-6
C7.4 ARMv6 non-invasive debug authentication ...................................... C7-8
Chapter C8 Sample-based Profiling
C8.1 Program Counter sampling .............................................................. C8-2
Chapter C9 Performance Monitors
C9.1 About the performance monitors ...................................................... C9-2
C9.2 Status in the ARM architecture ........................................................ C9-4
C9.3 Accuracy of the performance monitors ............................................ C9-5
C9.4 Behavior on overflow ....................................................................... C9-6
C9.5 Interaction with Security Extensions ................................................ C9-7
C9.6 Interaction with trace ........................................................................ C9-8
C9.7 Interaction with power saving operations ......................................... C9-9
C9.8 CP15 c9 register map .................................................................... C9-10
C9.9 Access permissions ....................................................................... C9-12
C9.10 Event numbers ............................................................................... C9-13
Chapter C10 Debug Registers Reference
C10.1 Accessing the debug registers ....................................................... C10-2
C10.2 Debug identification registers ......................................................... C10-3
C10.3 Control and status registers ......................................................... C10-10
C10.4 Instruction and data transfer registers ......................................... C10-40
C10.5 Software debug event registers ................................................... C10-48
C10.6 OS Save and Restore registers, v7 Debug only .......................... C10-75
C10.7 Memory system control registers ................................................. C10-80
C10.8 Management registers, ARMv7 only ............................................ C10-88
C10.9 Performance monitor registers ................................................... C10-105
Appendix A Recommended External Debug Interface
A.1 System integration signals ......................................................... AppxA-2
A.2 Recommended debug slave port ............................................. AppxA-13
Appendix B Common VFP Subarchitecture Specification
B.1 Scope of this appendix ............................................................... AppxB-2
B.2 Introduction to the Common VFP subarchitecture ..................... AppxB-3
B.3 Exception processing ................................................................. AppxB-6
B.4 Support code requirements ...................................................... AppxB-11
B.5 Context switching ..................................................................... AppxB-14
B.6 Subarchitecture additions to the VFP system registers ........... AppxB-15
B.7 Version 1 of the Common VFP subarchitecture ....................... AppxB-23
B.8 Version 2 of the Common VFP subarchitecture ....................... AppxB-24
Appendix C Legacy Instruction Mnemonics
C.1 Thumb instruction mnemonics ................................................... AppxC-2
C.2 Pre-UAL pseudo-instruction NOP .............................................. AppxC-3
Appendix D Deprecated and Obsolete Features
D.1 Deprecated features .................................................................. AppxD-2
D.2 Deprecated terminology ............................................................. AppxD-5
D.3 Obsolete features ....................................................................... AppxD-6
D.4 Semaphore instructions ............................................................. AppxD-7
D.5 Use of the SP as a general-purpose register ............................. AppxD-8
D.6 Explicit use of the PC in ARM instructions ................................. AppxD-9
D.7 Deprecated Thumb instructions ............................................... AppxD-10
Appendix E Fast Context Switch Extension (FCSE)
E.1 About the FCSE ......................................................................... AppxE-2
E.2 Modified virtual addresses ......................................................... AppxE-3
E.3 Debug and trace ........................................................................ AppxE-5
Appendix F VFP Vector Operation Support
F.1 About VFP vector mode ............................................................. AppxF-2
F.2 Vector length and stride control ................................................. AppxF-3
F.3 VFP register banks .................................................................... AppxF-5
F.4 VFP instruction type selection .................................................... AppxF-7
Appendix G ARMv6 Differences
G.1 Introduction to ARMv6 .............................................................. AppxG-2
G.2 Application level register support .............................................. AppxG-3
G.3 Application level memory support ............................................. AppxG-6
G.4 Instruction set support ............................................................. AppxG-10
G.5 System level register support .................................................. AppxG-16
G.6 System level memory model ................................................... AppxG-20
G.7 System Control coprocessor (CP15) support .......................... AppxG-29
Appendix H ARMv4 and ARMv5 Differences
H.1 Introduction to ARMv4 and ARMv5 ............................................ AppxH-2
H.2 Application level register support ............................................... AppxH-4
H.3 Application level memory support .............................................. AppxH-6
H.4 Instruction set support .............................................................. AppxH-11
H.5 System level register support ................................................... AppxH-18
H.6 System level memory model .................................................... AppxH-21
H.7 System Control coprocessor (CP15) support ........................... AppxH-31
Appendix I Pseudocode Definition
I.1 Instruction encoding diagrams and pseudocode ......................... AppxI-2
I.2 Limitations of pseudocode .......................................................... AppxI-4
I.3 Data types ................................................................................... AppxI-5
I.4 Expressions ................................................................................ AppxI-9
I.5 Operators and built-in functions ................................................ AppxI-11
I.6 Statements and program structure ............................................ AppxI-17
I.7 Miscellaneous helper procedures and functions ....................... AppxI-22
Appendix J Pseudocode Index
J.1 Pseudocode operators and keywords ........................................ AppxJ-2
J.2 Pseudocode functions and procedures ...................................... AppxJ-6
Appendix K Register Index
K.1 Register index ............................................................................ AppxK-2
Glossary
Preface
This preface summarizes the contents of this manual and lists the conventions it uses. It contains the
following sections:
• About this manual on page xiv
• Using this manual on page xv
• Conventions on page xviii
• Further reading on page xx
• Feedback on page xxi.
About this manual
This manual describes the ARM®v7 instruction set architecture, including its high code density Thumb®
instruction encoding and the following extensions to it:
• The System Control coprocessor, coprocessor 15 (CP15), used to control memory system
  components such as caches, write buffers, Memory Management Units, and Protection Units.
• The optional Advanced SIMD extension, which provides high-performance integer and
  single-precision floating-point vector operations.
• The optional VFP extension, which provides high-performance floating-point operations. It can
  optionally support double-precision operations.
• The Debug architecture, which provides software access to debug features in ARM processors.
Part A describes the application level view of the architecture. It describes the application level view of the
programmers’ model and the memory model. It also describes the precise effects of each instruction in User
mode (the normal operating mode), including any restrictions on its use. This information is of primary
importance to authors and users of compilers, assemblers, and other programs that generate ARM machine
code.
Part B describes the system level view of the architecture. It gives details of system registers that are not
accessible from User mode, and the system level view of the memory model. It also gives full details of the
effects of instructions in privileged modes (any mode other than User mode), where these are different from
their effects in User mode.
Part C describes the Debug architecture. This is an extension to the ARM architecture that provides
configuration, breakpoint and watchpoint support, and a Debug Communications Channel (DCC) to a debug
host.
Assembler syntax is given for the instructions described in this manual, permitting instructions to be
specified in textual form. However, this manual is not intended as tutorial material for ARM assembler
language, nor does it describe ARM assembler language at anything other than a very basic level. To make
effective use of ARM assembler language, consult the documentation supplied with the assembler being
used.
Using this manual
The information in this manual is organized into four parts, as described below.
Part A, Application Level Architecture
Part A describes the application level view of the architecture. It contains the following chapters:
Chapter A1 Gives a brief overview of the ARM architecture, and the ARM and Thumb instruction sets.
Chapter A2 Describes the application level view of the ARM programmers’ model, including the
application level view of the Advanced SIMD and VFP extensions. It describes the types of
value that ARM instructions operate on, the general-purpose registers that contain those
values, and the Application Program Status Register.
Chapter A3 Describes the application level view of the memory model, including the ARM memory
types and attributes, and memory access control.
Chapter A4 Describes the range of instructions available in the ARM, Thumb, Advanced SIMD, and
VFP instruction sets. It also contains some details of instruction operation, where these are
common to several instructions.
Chapter A5 Gives details of the encoding of the ARM instruction set.
Chapter A6 Gives details of the encoding of the Thumb instruction set.
Chapter A7 Gives details of the encoding of the Advanced SIMD and VFP instruction sets.
Chapter A8 Provides detailed reference information about every instruction available in the Thumb,
ARM, Advanced SIMD, and VFP instruction sets, with the exception of information only
relevant in privileged modes.
Chapter A9 Provides detailed reference information about the ThumbEE (Execution Environment)
variant of the Thumb instruction set.
Part B, System Level Architecture
Part B describes the system level view of the architecture. It contains the following chapters:
Chapter B1 Describes the system level view of the programmers’ model.
Chapter B2 Describes the system level view of the memory model features that are common to all
memory systems.
Chapter B3 Describes the system level view of the Virtual Memory System Architecture (VMSA) that
is part of all ARMv7-A implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a VMSA implementation.
Chapter B4 Describes the system level view of the Protected Memory System Architecture (PMSA) that
is part of all ARMv7-R implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a PMSA implementation.
Chapter B5 Describes the CPUID scheme.
Chapter B6 Provides detailed reference information about system instructions, and more information
about instructions where they behave differently in privileged modes.
Part C, Debug Architecture
Part C describes the Debug architecture. It contains the following chapters:
Chapter C1 Gives a brief introduction to the Debug architecture.
Chapter C2 Describes the authentication of invasive debug.
Chapter C3 Describes the debug events.
Chapter C4 Describes the debug exceptions.
Chapter C5 Describes Debug state.
Chapter C6 Describes the permitted debug register interfaces.
Chapter C7 Describes the authentication of non-invasive debug.
Chapter C8 Describes sample-based profiling.
Chapter C9 Describes the ARM performance monitors.
Chapter C10 Describes the debug registers.
Part D, Appendices
This manual contains the following appendices:
Appendix A Describes the recommended external Debug interfaces.
Note
This description is not part of the ARM architecture specification. It is included here only
as supplementary information, for the convenience of developers and users who might
require this information.
Appendix B The Common VFP subarchitecture specification.
Note
This specification is not part of the ARM architecture specification. This sub-architectural
information is included here only as supplementary information, for the convenience of
developers and users who might require this information.
Appendix C Describes the legacy mnemonics.
Appendix D Identifies the deprecated architectural features.
Appendix E Describes the Fast Context Switch Extension (FCSE). From ARMv6, the use of this feature
is deprecated, and in ARMv7 the FCSE is optional.
Appendix F Describes the VFP vector operations. Use of these operations is deprecated in ARMv7.
Appendix G Describes the differences in the ARMv6 architecture.
Appendix H Describes the differences in the ARMv4 and ARMv5 architectures.
Appendix I The formal definition of the pseudocode.
Appendix J Index to definitions of pseudocode operators, keywords, functions, and procedures.
Appendix K Index to register descriptions in the manual.
Conventions
This manual employs typographic and other conventions intended to improve its ease of use.
General typographic conventions
typewriter Is used for assembler syntax descriptions, pseudocode descriptions of instructions,
and source code examples. In the cases of assembler syntax descriptions and
pseudocode descriptions, see the additional conventions below.
The typewriter style is also used in the main text for instruction mnemonics and for
references to other items appearing in assembler syntax descriptions, pseudocode
descriptions of instructions and source code examples.
italic Highlights important notes, introduces special terminology, and denotes internal
cross-references and citations.
bold Is used for emphasis in descriptive lists and elsewhere, where appropriate.
SMALL CAPITALS Are used for a few terms that have specific technical meanings. Their meanings can
be found in the Glossary.
Signals
In general this specification does not define processor signals, but it does include some signal examples and
recommendations. It uses the following signal conventions:
Signal level The level of an asserted signal depends on whether the signal is active-HIGH or
active-LOW. Asserted means:
HIGH for active-HIGH signals
LOW for active-LOW signals.
Lower-case n At the start or end of a signal name denotes an active-LOW signal.
Numbers
Numbers are normally written in decimal. Binary numbers are preceded by 0b, and hexadecimal numbers
by 0x and written in a typewriter font.
Bit values
Values of bits and bitfields are normally given in binary, in single quotes. The quotes are normally omitted
in encoding diagrams and tables.
Pseudocode descriptions
This manual uses a form of pseudocode to provide precise descriptions of the specified functionality. This
pseudocode is written in a typewriter font, and is described in Appendix I Pseudocode Definition.
Assembler syntax descriptions
This manual contains numerous syntax descriptions for assembler instructions and for components of
assembler instructions. These are shown in a typewriter font, and use the conventions described in
Assembler syntax on page A8-4.
Further reading
This section lists publications from both ARM and third parties that provide more information on the ARM
family of processors.
ARM periodically provides updates and corrections to its documentation. See http://www.arm.com for
current errata sheets and addenda, and the ARM Frequently Asked Questions.
ARM publications
ARM Debug Interface v5 Architecture Specification (ARM IHI 0031)
ARMv7-M Architecture Reference Manual (ARM DDI 0403)
CoreSight Architecture Specification (ARM IHI 0029)
ARM Architecture Reference Manual (ARM DDI 0100I)
Note
Issue I of the ARM Architecture Reference Manual (DDI 0100I) was issued in July 2005 and
describes the first version of the ARMv6 architecture, and all previous architecture versions.
Addison-Wesley Professional publish ARM Architecture Reference Manual, Second Edition
(December 27, 2000). The contents of this are identical to Issue E of the ARM Architecture
Reference Manual (DDI 0100E). It describes ARMv5TE and earlier versions of the ARM
architecture, and is superseded by DDI 0100I.
Embedded Trace Macrocell Architecture Specification (ARM IHI 0014)
CoreSight Program Flow Trace Architecture Specification (ARM IHI 0035).
External publications
The following books are referred to in this manual, or provide more information:
IEEE Std 1596.5-1993, IEEE Standard for Shared-Data Formats Optimized for Scalable Coherent
Interface (SCI) Processors, ISBN 1-55937-354-7
IEEE Std 1149.1-2001, IEEE Standard Test Access Port and Boundary Scan Architecture (JTAG)
ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic
JEP106, Standard Manufacturers Identification Code, JEDEC Solid State Technology Association
The Java Virtual Machine Specification Second Edition, Tim Lindholm and Frank Yellin, published
by Addison Wesley (ISBN: 0-201-43294-3)
Memory Consistency Models for Shared Memory-Multiprocessors, Kourosh Gharachorloo, Stanford
University Technical Report CSL-TR-95-685
Feedback
ARM welcomes feedback on its documentation.
Feedback on this manual
If you notice any errors or omissions in this manual, send e-mail to errata@arm.com giving:
the document title
the document number
the page number(s) to which your comments apply
a concise explanation of the problem.
General suggestions for additions and improvements are also welcome.
Part A
Application Level Architecture
Chapter A1
Introduction to the ARM Architecture
This chapter introduces the ARM architecture and contains the following sections:
About the ARM architecture on page A1-2
The ARM and Thumb instruction sets on page A1-3
Architecture versions, profiles, and variants on page A1-4
Architecture extensions on page A1-6
The ARM memory model on page A1-7
Debug on page A1-8.
A1.1 About the ARM architecture
The ARM architecture supports implementations across a wide range of performance points. It is
established as the dominant architecture in many market segments. The architectural simplicity of ARM
processors leads to very small implementations, and small implementations mean devices can have very low
power consumption. Implementation size, performance, and very low power consumption are key attributes
of the ARM architecture.
The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture, as it incorporates these
typical RISC architecture features:
• a large uniform register file
• a load/store architecture, where data-processing operations only operate on register contents, not
directly on memory contents
• simple addressing modes, with all load/store addresses being determined from register contents and
instruction fields only.
In addition, the ARM architecture provides:
• instructions that combine a shift with an arithmetic or logical operation
• auto-increment and auto-decrement addressing modes to optimize program loops
• Load and Store Multiple instructions to maximize data throughput
• conditional execution of almost all instructions to maximize execution throughput.
These enhancements to a basic RISC architecture enable ARM processors to achieve a good balance of high
performance, small code size, low power consumption, and small silicon area.
Except where the architecture specifies differently, the programmer-visible behavior of an implementation
must be the same as a simple sequential execution of the program. This programmer-visible behavior does
not include the execution time of the program.
A1.2 The ARM and Thumb instruction sets
The ARM instruction set is a set of 32-bit instructions providing comprehensive data-processing and control
functions.
The Thumb instruction set was developed as a 16-bit instruction set with a subset of the functionality of the
ARM instruction set. It provides significantly improved code density, at a cost of some reduction in
performance. A processor executing Thumb instructions can change to executing ARM instructions for
performance critical segments, in particular for handling interrupts.
In ARMv6T2, Thumb-2 technology is introduced. This technology makes it possible to extend the original
Thumb instruction set with many 32-bit instructions. The range of 32-bit Thumb instructions included in
ARMv6T2 permits Thumb code to achieve performance similar to ARM code, with code density better than
that of earlier Thumb code.
From ARMv6T2, the ARM and Thumb instruction sets provide almost identical functionality. For more
information, see Chapter A4 The Instruction Sets.
A1.3 Architecture versions, profiles, and variants
The ARM and Thumb instruction set architectures have evolved significantly since they were first
developed. They will continue to be developed in the future. Seven major versions of the instruction set have
been defined to date, denoted by the version numbers 1 to 7. Of these, the first three versions are now
obsolete.
ARMv7 provides three profiles:
ARMv7-A Application profile, described in this manual. Implements a traditional ARM architecture
with multiple modes and supporting a Virtual Memory System Architecture (VMSA) based
on an MMU. Supports the ARM and Thumb instruction sets.
ARMv7-R Real-time profile, described in this manual. Implements a traditional ARM architecture with
multiple modes and supporting a Protected Memory System Architecture (PMSA) based on
an MPU. Supports the ARM and Thumb instruction sets.
ARMv7-M Microcontroller profile, described in the ARMv7-M Architecture Reference Manual.
Implements a programmers' model designed for fast interrupt processing, with hardware
stacking of registers and support for writing interrupt handlers in high-level languages.
Implements a variant of the ARMv7 PMSA and supports a variant of the Thumb instruction
set.
Versions can be qualified with variant letters to specify additional instructions and other functionality that
are included as an architecture extension. Extensions are typically included in the base architecture of the
next version number. Provision is also made to exclude variants by prefixing the variant letter with x.
Some extensions are described separately instead of using a variant letter. For details of these extensions see
Architecture extensions on page A1-6.
The valid variants of ARMv4, ARMv5, and ARMv6 are as follows:
ARMv4 The earliest architecture variant covered by this manual. It includes only the ARM
instruction set.
ARMv4T Adds the Thumb instruction set.
ARMv5T Improves interworking of ARM and Thumb instructions. Adds count leading zeros (CLZ)
and software breakpoint (BKPT) instructions.
ARMv5TE Enhances arithmetic support for digital signal processing (DSP) algorithms. Adds preload
data (PLD), dual word load (LDRD), store (STRD), and 64-bit coprocessor register transfers
(MCRR, MRRC).
ARMv5TEJ Adds the BXJ instruction and other support for the Jazelle® architecture extension.
ARMv6 Adds many new instructions to the ARM instruction set. Formalizes and revises the memory
model and the Debug architecture.
ARMv6K Adds instructions to support multi-processing to the ARM instruction set, and some extra
memory model features.
ARMv6T2 Introduces Thumb-2 technology, giving a major development of the Thumb instruction set
to provide a similar level of functionality to the ARM instruction set.
Note
ARMv6KZ or ARMv6Z are sometimes used to describe the ARMv6K architecture with the optional
Security Extensions.
For detailed information about versions of the ARM architecture, see Appendix G ARMv6 Differences and
Appendix H ARMv4 and ARMv5 Differences.
The following architecture variants are now obsolete:
ARMv1, ARMv2, ARMv2a, ARMv3, ARMv3G, ARMv3M, ARMv4xM, ARMv4TxM, ARMv5,
ARMv5xM, ARMv5TxM, and ARMv5TExP.
Contact ARM if you require details of obsolete variants.
Instruction descriptions in this manual specify the architecture versions that support them.
A1.4 Architecture extensions
This manual describes the following extensions to the ARM and Thumb instruction set architectures:
ThumbEE Is a variant of the Thumb instruction set that is designed as a target for dynamically
generated code. It is:
a required extension to the ARMv7-A profile
an optional extension to the ARMv7-R profile.
VFP Is a floating-point coprocessor extension to the instruction set architectures. There
have been three main versions of VFP to date:
VFPv1 is obsolete. Details are available on request from ARM.
VFPv2 is an optional extension to:
the ARM instruction set in the ARMv5TE, ARMv5TEJ, ARMv6, and
ARMv6K architectures
the ARM and Thumb instruction sets in the ARMv6T2 architecture.
VFPv3 is an optional extension to the ARM, Thumb and ThumbEE
instruction sets in the ARMv7-A and ARMv7-R profiles.
VFPv3 can be implemented with either thirty-two or sixteen doubleword
registers, as described in Advanced SIMD and VFP extension registers on
page A2-21. Where necessary, the terms VFPv3-D32 and VFPv3-D16 are
used to distinguish between these two implementation options. Where the
term VFPv3 is used it covers both options.
VFPv3 can be extended by the half-precision extensions that provide
conversion functions in both directions between half-precision floating-point
and single-precision floating-point.
Advanced SIMD Is an instruction set extension that provides Single Instruction Multiple Data
(SIMD) functionality. It is an optional extension to the ARMv7-A and ARMv7-R
profiles. When VFPv3 and Advanced SIMD are both implemented, they use a
shared register bank and have some shared instructions.
Advanced SIMD can be extended by the half-precision extensions that provide
conversion functions in both directions between half-precision floating-point and
single-precision floating-point.
Security Extensions Are a set of security features that facilitate the development of secure applications.
They are an optional extension to the ARMv6K architecture and the ARMv7-A
profile.
Jazelle Is the Java bytecode execution extension that extended ARMv5TE to ARMv5TEJ.
From ARMv6 Jazelle is a required part of the architecture, but is still often
described as the Jazelle extension.
Multiprocessing Extensions
Are a set of features that enhance multiprocessing functionality. They are an
optional extension to the ARMv7-A and ARMv7-R profiles.
A1.5 The ARM memory model
The ARM architecture uses a single, flat address space of 2^32 8-bit bytes. The address space is also regarded
as 2^30 32-bit words or 2^31 16-bit halfwords.
The architecture provides facilities for:
faulting unaligned memory accesses
restricting access by applications to specified areas of memory
translating virtual addresses provided by executing instructions into physical addresses
altering the interpretation of word and halfword data between big-endian and little-endian
optionally preventing out-of-order access to memory
controlling caches
synchronizing access to shared memory by multiple processors.
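As an illustration of the endianness item in the list above, the following Python sketch interprets the same bytes as little-endian and as big-endian data. It is a minimal sketch that uses only the standard struct module. The byte values are arbitrary, and the code models the interpretation of the data only, not any architectural control mechanism.

```python
import struct

# Four consecutive bytes in memory, lowest address first (arbitrary example values).
memory = bytes([0x78, 0x56, 0x34, 0x12])

# "<I" reads an unsigned 32-bit word as little-endian, ">I" as big-endian.
little_endian_word = struct.unpack("<I", memory)[0]
big_endian_word = struct.unpack(">I", memory)[0]

# The same choice applies to halfwords: interpret the first two bytes only.
little_endian_halfword = struct.unpack("<H", memory[:2])[0]
big_endian_halfword = struct.unpack(">H", memory[:2])[0]
```

Byte accesses are unaffected by this choice, because a single byte has no internal byte order.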
For more information, see:
Chapter A3 Application Level Memory Model
Chapter B2 Common Memory System Architecture Features
Chapter B3 Virtual Memory System Architecture (VMSA)
Chapter B4 Protected Memory System Architecture (PMSA).
A1.6 Debug
ARMv7 processors implement two types of debug support:
Invasive debug Debug permitting modification of the state of the processor. This is intended
primarily for run-control debugging.
Non-invasive debug Debug permitting data and program flow observation, without modifying the state
of the processor or interrupting the flow of execution.
This provides for:
instruction and data tracing
program counter sampling
performance monitors.
For more information, see Chapter C1 Introduction to the ARM Debug Architecture.
Chapter A2
Application Level Programmers’ Model
This chapter gives an application level view of the ARM programmers’ model. It contains the following
sections:
About the Application level programmers’ model on page A2-2
ARM core data types and arithmetic on page A2-3
ARM core registers on page A2-11
The Application Program Status Register (APSR) on page A2-14
Execution state registers on page A2-15
Advanced SIMD and VFP extensions on page A2-20
Floating-point data types and arithmetic on page A2-32
Polynomial arithmetic over {0,1} on page A2-67
Coprocessor support on page A2-68
Execution environment support on page A2-69
Exceptions, debug events and checks on page A2-81.
A2.1 About the Application level programmers’ model
This chapter contains the programmers’ model information required for application development.
The information in this chapter is distinct from the system information required to service and support
application execution under an operating system. However, some knowledge of that system information is
needed to put the Application level programmers' model into context.
System level support requires access to all features and facilities of the architecture, a mode of operation
referred to as privileged operation. System code determines whether an application runs in a privileged or
unprivileged manner. When an operating system supports both privileged and unprivileged operation, an
application usually runs unprivileged. This:
permits the operating system to allocate system resources to it in a unique or shared manner
provides a degree of protection from other processes and tasks, and so helps protect the operating
system from malfunctioning applications.
This chapter indicates where some system level understanding is helpful, and where appropriate it:
gives an overview of the system level information
gives references to the system level descriptions in Chapter B1 The System Level Programmers’
Model and elsewhere.
The Security Extensions extend the architecture to provide hardware security features that support the
development of secure applications. For more information, see The Security Extensions on page B1-25.
A2.2 ARM core data types and arithmetic
All ARMv7-A and ARMv7-R processors support the following data types in memory:
Byte 8 bits
Halfword 16 bits
Word 32 bits
Doubleword 64 bits.
Processor registers are 32 bits in size. The instruction set contains instructions supporting the following data
types held in registers:
32-bit pointers
unsigned or signed 32-bit integers
unsigned 16-bit or 8-bit integers, held in zero-extended form
signed 16-bit or 8-bit integers, held in sign-extended form
two 16-bit integers packed into a register
four 8-bit integers packed into a register
unsigned or signed 64-bit integers held in two registers.
Load and store operations can transfer bytes, halfwords, or words to and from memory. Loads of bytes or
halfwords zero-extend or sign-extend the data as it is loaded, as specified in the appropriate load instruction.
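The effect of zero-extension and sign-extension on a loaded byte or halfword can be sketched in Python as follows. This is an illustrative model only: the function names zero_extend and sign_extend are assumptions made for this example, and values are represented as non-negative Python integers holding a 32-bit register image.

```python
def zero_extend(value, from_bits, to_bits=32):
    """Keep the low from_bits bits; the upper bits of the result are zero."""
    assert 0 < from_bits <= to_bits
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits=32):
    """Copy bit from_bits-1 of value into bits from_bits..to_bits-1."""
    assert 0 < from_bits <= to_bits
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        # Sign bit is set: fill every bit above the loaded field with ones.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


# A byte 0x80 loaded into a 32-bit register, unsigned and then signed.
unsigned_byte = zero_extend(0x80, 8)
signed_byte = sign_extend(0x80, 8)
```

With these definitions the byte 0x80 becomes 0x00000080 when zero-extended and 0xFFFFFF80 when sign-extended.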
The instruction sets include load and store operations that transfer two or more words to and from memory.
You can load and store doublewords using these instructions. The exclusive doubleword load/store
instructions LDREXD and STREXD specify single-copy atomic doubleword accesses to memory.
When any of the data types is described as unsigned, the N-bit data value represents a non-negative integer
in the range 0 to 2^N - 1, using normal binary format.
When any of these types is described as signed, the N-bit data value represents an integer in the range -2^(N-1)
to +2^(N-1) - 1, using two's complement format.
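The two interpretations of an N-bit value can be sketched in Python as follows. This is a minimal model, assuming a bitstring is held as a Python integer together with an explicit width n. The names uint and sint are illustrative, chosen to echo the pseudocode functions UInt() and SInt().

```python
def uint(bits_value, n):
    """Interpret the low n bits as an unsigned integer in 0 .. 2^n - 1."""
    return bits_value & ((1 << n) - 1)


def sint(bits_value, n):
    """Interpret the low n bits as a two's complement integer in -2^(n-1) .. 2^(n-1) - 1."""
    u = uint(bits_value, n)
    # If the top bit is set, the value represents u - 2^n.
    return u - (1 << n) if u >> (n - 1) else u
```

For example, the 8-bit pattern 0xFF is 255 when read as unsigned and -1 when read as signed.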
The instructions that operate on packed halfwords or bytes include some multiply instructions that use just
one of two halfwords, and Single Instruction Multiple Data (SIMD) instructions that operate on all of the
halfwords or bytes in parallel.
Direct instruction support for 64-bit integers is limited, and most 64-bit operations require sequences of two
or more instructions to synthesize them.
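As an example of such a sequence, a 64-bit addition can be synthesized from two 32-bit additions, with the carry out of the low word feeding the addition of the high word. The Python sketch below models this. The function name add64_from_words and the representation of a 64-bit value as a (low word, high word) pair are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def add64_from_words(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values, each held as a pair of 32-bit words."""
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32                   # carry out of the low-word addition
    hi = (a_hi + b_hi + carry) & MASK32    # high-word addition takes the carry in
    return lo, hi
```

The result wraps around modulo 2^64, because the carry out of the high-word addition is discarded.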
A2.2.1 Integer arithmetic
The instruction set provides a wide variety of operations on the values in registers, including bitwise logical
operations, shifts, additions, subtractions, multiplications, and many others. These operations are defined
using the pseudocode described in Appendix I Pseudocode Definition, usually in one of three ways:
By direct use of the pseudocode operators and built-in functions defined in Operators and built-in
functions on page AppxI-11.
By use of pseudocode helper functions defined in the main text. These can be located using the table
in Appendix J Pseudocode Index.
By a sequence of the form:
1. Use of the SInt(), UInt(), and Int() built-in functions defined in Converting bitstrings to
integers on page AppxI-14 to convert the bitstring contents of the instruction operands to the
unbounded integers that they represent as two's complement or unsigned integers.
2. Use of mathematical operators, built-in functions and helper functions on those unbounded
integers to calculate other such integers.
3. Use of either the bitstring extraction operator defined in Bitstring extraction on page AppxI-12
or of the saturation helper functions described in Pseudocode details of saturation on
page A2-9 to convert an unbounded integer result into a bitstring result that can be written to
a register.
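The three-step sequence above can be sketched in Python, where integers are already unbounded. This is an illustrative model, not a transcription of any instruction in this manual: the function names are assumptions, and a register value is represented as a non-negative Python integer of width n.

```python
def sint(x, n):
    """Step 1: convert an n-bit bitstring to the signed integer it represents."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def add_wrapping(x, y, n=32):
    total = sint(x, n) + sint(y, n)    # step 2: unbounded integer arithmetic
    return total & ((1 << n) - 1)      # step 3: bitstring extraction <n-1:0>


def add_saturating(x, y, n=32):
    total = sint(x, n) + sint(y, n)    # step 2: unbounded integer arithmetic
    lowest, highest = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(lowest, min(total, highest))   # step 3: signed saturation
    return clamped & ((1 << n) - 1)
```

The two functions differ only in step 3. Adding 1 to 0x7FFFFFFF gives 0x80000000 with bitstring extraction and stays at 0x7FFFFFFF with saturation.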
Shift and rotate operations
The following types of shift and rotate operations are used in instructions:
Logical Shift Left
(LSL) moves each bit of a bitstring left by a specified number of bits. Zeros are shifted in at
the right end of the bitstring. Bits that are shifted off the left end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Logical Shift Right
(LSR) moves each bit of a bitstring right by a specified number of bits. Zeros are shifted in
at the left end of the bitstring. Bits that are shifted off the right end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Arithmetic Shift Right
(ASR) moves each bit of a bitstring right by a specified number of bits. Copies of the leftmost
bit are shifted in at the left end of the bitstring. Bits that are shifted off the right end of the
bitstring are discarded, except that the last such bit can be produced as a carry output.
Rotate Right
(ROR) moves each bit of a bitstring right by a specified number of bits. Each bit that is shifted
off the right end of the bitstring is re-introduced at the left end. The last bit shifted off the
right end of the bitstring can be produced as a carry output.
Rotate Right with Extend
(RRX) moves each bit of a bitstring right by one bit. The carry input is shifted in at the left
end of the bitstring. The bit shifted off the right end of the bitstring can be produced as a
carry output.
Pseudocode details of shift and rotate operations
These shift and rotate operations are supported in pseudocode by the following functions:
// LSL_C()
// =======
(bits(N), bit) LSL_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = x : Zeros(shift);
    result = extended_x<N-1:0>;
    carry_out = extended_x<N>;
    return (result, carry_out);
// LSL()
// =====
bits(N) LSL(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSL_C(x, shift);
    return result;
// LSR_C()
// =======
(bits(N), bit) LSR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = ZeroExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);
// LSR()
// =====
bits(N) LSR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSR_C(x, shift);
    return result;
// ASR_C()
// =======
(bits(N), bit) ASR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = SignExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);
// ASR()
// =====
bits(N) ASR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = ASR_C(x, shift);
    return result;
// ROR_C()
// =======
(bits(N), bit) ROR_C(bits(N) x, integer shift)
    assert shift != 0;
    m = shift MOD N;
    result = LSR(x,m) OR LSL(x,N-m);
    carry_out = result<N-1>;
    return (result, carry_out);
// ROR()
// =====
bits(N) ROR(bits(N) x, integer shift)
    if shift == 0 then
        result = x;
    else
        (result, -) = ROR_C(x, shift);
    return result;
// RRX_C()
// =======
(bits(N), bit) RRX_C(bits(N) x, bit carry_in)
    result = carry_in : x<N-1:1>;
    carry_out = x<0>;
    return (result, carry_out);
// RRX()
// =====
bits(N) RRX(bits(N) x, bit carry_in)
    (result, -) = RRX_C(x, carry_in);
    return result;
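The carry-producing shift and rotate functions can be modelled in Python roughly as follows. This is a sketch under stated assumptions: a bitstring is represented as a non-negative Python integer together with an explicit width n, carry bits are the integers 0 and 1, and the snake_case function names are illustrative and are not part of the pseudocode of this manual.

```python
def lsl_c(x, n, shift):
    assert shift > 0
    extended = x << shift                      # x : Zeros(shift)
    return extended & ((1 << n) - 1), (extended >> n) & 1


def lsr_c(x, n, shift):
    assert shift > 0
    # x is non-negative, so it is already zero-extended to any width.
    return (x >> shift) & ((1 << n) - 1), (x >> (shift - 1)) & 1


def asr_c(x, n, shift):
    assert shift > 0
    # Convert to a signed Python int; >> on a negative int is an arithmetic shift.
    signed = x - (1 << n) if x >> (n - 1) else x
    return (signed >> shift) & ((1 << n) - 1), (signed >> (shift - 1)) & 1


def ror_c(x, n, shift):
    assert shift != 0
    m = shift % n
    result = ((x >> m) | (x << (n - m))) & ((1 << n) - 1)
    return result, result >> (n - 1)           # carry out is the new top bit


def rrx_c(x, n, carry_in):
    # The carry input becomes the top bit; bit 0 of x becomes the carry output.
    return (carry_in << (n - 1)) | (x >> 1), x & 1
```

For example, with n = 32 the value 0x80000001 shifted by one place gives result 0x00000002 and carry 1 under lsl_c, and result 0xC0000000 and carry 1 under asr_c.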
Pseudocode details of addition and subtraction
In pseudocode, addition and subtraction can be performed on any combination of unbounded integers and
bitstrings, provided that if they are performed on two bitstrings, the bitstrings must be identical in length.
The result is another unbounded integer if both operands are unbounded integers, and a bitstring of the same
length as the bitstring operand(s) otherwise. For the precise definition of these operations, see Addition and
subtraction on page AppxI-15.
The main addition and subtraction instructions can produce status information about both unsigned carry
and signed overflow conditions. This status information can be used to synthesize multi-word additions and
subtractions. In pseudocode the AddWithCarry() function provides an addition with a carry input and carry
and overflow outputs:
// AddWithCarry()
// ==============
(bits(N), bit, bit) AddWithCarry(bits(N) x, bits(N) y, bit carry_in)
    unsigned_sum = UInt(x) + UInt(y) + UInt(carry_in);
    signed_sum = SInt(x) + SInt(y) + UInt(carry_in);
    result = unsigned_sum<N-1:0>; // == signed_sum<N-1:0>
    carry_out = if UInt(result) == unsigned_sum then '0' else '1';
    overflow = if SInt(result) == signed_sum then '0' else '1';
    return (result, carry_out, overflow);
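A Python model of this function can be written as follows. It is a minimal sketch that follows the pseudocode line by line, assuming that bitstrings are non-negative Python integers of an explicit width n and that carry and overflow bits are the integers 0 and 1. The name add_with_carry is illustrative.

```python
def add_with_carry(x, y, carry_in, n=32):
    """Return (result, carry_out, overflow) for an n-bit addition with carry input."""
    mask = (1 << n) - 1

    def sint(v):
        # Two's complement interpretation of an n-bit value.
        return v - (1 << n) if v >> (n - 1) else v

    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & mask
    carry_out = 0 if result == unsigned_sum else 1      # unsigned result did not fit
    overflow = 0 if sint(result) == signed_sum else 1   # signed result did not fit
    return result, carry_out, overflow
```

For example, adding 0xFFFFFFFF and 1 produces a carry but no signed overflow, while adding 0x7FFFFFFF and 1 produces a signed overflow but no carry.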
An important property of the AddWithCarry() function is that if:
    (result, carry_out, overflow) = AddWithCarry(x, NOT(y), carry_in)
then:
• if carry_in == '1', then result == x-y with:
    - overflow == '1' if signed overflow occurred during the subtraction
    - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x >= y
• if carry_in == '0', then result == x-y-1 with:
    - overflow == '1' if signed overflow occurred during the subtraction
    - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x > y.
Together, these mean that the carry_in and carry_out bits in AddWithCarry() calls can act as NOT borrow
flags for subtractions as well as carry flags for additions.
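The NOT borrow behavior can be checked with a small Python model. This is a sketch only: add_with_carry is an illustrative port of the AddWithCarry() pseudocode, sub_with_carry is a hypothetical helper name introduced for this example, and bitstrings are represented as non-negative integers of width n.

```python
def add_with_carry(x, y, carry_in, n=32):
    """Illustrative port of AddWithCarry(): returns (result, carry_out, overflow)."""
    mask = (1 << n) - 1

    def sint(v):
        return v - (1 << n) if v >> (n - 1) else v

    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & mask
    carry_out = 0 if result == unsigned_sum else 1
    overflow = 0 if sint(result) == signed_sum else 1
    return result, carry_out, overflow


def sub_with_carry(x, y, carry_in, n=32):
    """Subtract by adding the bitwise inverse of y; carry_in == 1 means no borrow in."""
    not_y = ~y & ((1 << n) - 1)
    return add_with_carry(x, not_y, carry_in, n)


# 5 - 3 with no borrow in: result 2, carry_out 1 because no borrow occurred.
no_borrow = sub_with_carry(5, 3, 1)
# 3 - 5 with no borrow in: result wraps, carry_out 0 because a borrow occurred.
borrow = sub_with_carry(3, 5, 1)
```

Chaining the carry_out of one call into the carry_in of the next extends the subtraction to multi-word operands, in the same way that carries are chained for multi-word additions.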
Pseudocode details of saturation
Some instructions perform saturating arithmetic, that is, if the result of the arithmetic overflows the
destination signed or unsigned N-bit integer range, the result produced is the largest or smallest value in that
range, rather than wrapping around modulo 2^N. This is supported in pseudocode by the SignedSatQ() and
UnsignedSatQ() functions when a boolean result is wanted saying whether saturation occurred, and by the
SignedSat() and UnsignedSat() functions when only the saturated result is wanted:
// SignedSatQ()
// ============
(bits(N), boolean) SignedSatQ(integer i, integer N)
    if i > 2^(N-1) - 1 then
        result = 2^(N-1) - 1; saturated = TRUE;
    elsif i < -(2^(N-1)) then
        result = -(2^(N-1)); saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);
// UnsignedSatQ()
// ==============
(bits(N), boolean) UnsignedSatQ(integer i, integer N)
    if i > 2^N - 1 then
        result = 2^N - 1; saturated = TRUE;
    elsif i < 0 then
        result = 0; saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);
// SignedSat()
// ===========
bits(N) SignedSat(integer i, integer N)
    (result, -) = SignedSatQ(i, N);
    return result;
// UnsignedSat()
// =============
bits(N) UnsignedSat(integer i, integer N)
    (result, -) = UnsignedSatQ(i, N);
    return result;
SatQ(i, N, unsigned) returns either UnsignedSatQ(i, N) or SignedSatQ(i, N) depending on the value of its
third argument, and Sat(i, N, unsigned) returns either UnsignedSat(i, N) or SignedSat(i, N) depending on
the value of its third argument:
// SatQ()
// ======
(bits(N), boolean) SatQ(integer i, integer N, boolean unsigned)
    (result, sat) = if unsigned then UnsignedSatQ(i, N) else SignedSatQ(i, N);
    return (result, sat);
// Sat()
// =====
bits(N) Sat(integer i, integer N, boolean unsigned)
    result = if unsigned then UnsignedSat(i, N) else SignedSat(i, N);
    return result;
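The saturation functions above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not a definitive implementation: the names signed_sat_q and unsigned_sat_q are hypothetical, and the bits(N) result is represented as a non-negative integer holding the N-bit pattern.

```python
def signed_sat_q(i: int, n: int):
    """Clamp i to the signed n-bit range; return (n-bit pattern, saturated)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(lo, min(hi, i))
    # Mask to n bits so that negative results appear as two's complement.
    return clamped & ((1 << n) - 1), clamped != i


def unsigned_sat_q(i: int, n: int):
    """Clamp i to the unsigned n-bit range; return (n-bit pattern, saturated)."""
    hi = (1 << n) - 1
    clamped = max(0, min(hi, i))
    return clamped, clamped != i
```

For example, saturating 200 to a signed 8-bit range gives the bit pattern for 127 with the saturation indication set, whereas -1 is in range and is returned as the pattern 0xFF without saturation.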
A2.3 ARM core registers
In the application level view, an ARM processor has:
• thirteen general-purpose 32-bit registers, R0 to R12
• three 32-bit registers, R13 to R15, that sometimes or always have a special use.
Registers R13 to R15 are usually referred to by names that indicate their special uses:
SP, the Stack Pointer
Register R13 is used as a pointer to the active stack.
In Thumb code, most instructions cannot access SP. The only instructions that can access
SP are those designed to use SP as a stack pointer.
The use of SP for any purpose other than as a stack pointer is deprecated.
Note
Using SP for any purpose other than as a stack pointer is likely to break the requirements of
operating systems, debuggers, and other software systems, causing them to malfunction.
LR, the Link Register
Register R14 is used to store the return address from a subroutine. At other times, LR can
be used for other purposes.
When a BL or BLX instruction performs a subroutine call, LR is set to the subroutine return
address. To perform a subroutine return, copy LR back to the program counter. This is
typically done in one of two ways, after entering the subroutine with a BL or BLX instruction:
• Return with a BX LR instruction.
• On subroutine entry, store LR to the stack with an instruction of the form:
  PUSH {<registers>,LR}
  and use a matching instruction to return:
  POP {<registers>,PC}
ThumbEE checks and handler calls use LR in a similar way. For details see Chapter A9
ThumbEE.
PC, the Program Counter
Register R15 is the program counter:
• When executing an ARM instruction, PC reads as the address of the current instruction plus 8.
• When executing a Thumb instruction, PC reads as the address of the current instruction plus 4.
Writing an address to PC causes a branch to that address.
In Thumb code, most instructions cannot access PC.
See ARM core registers on page B1-9 for the system level view of SP, LR, and PC.
Note
The names SP, LR and PC are preferred to R13, R14 and R15. However, sometimes it is simpler to use the
R13-R15 names when referring to a group of registers. For example, it is simpler to refer to Registers R8 to
R15, rather than to Registers R8 to R12, the SP, LR and PC. However these two descriptions of the group of
registers have exactly the same meaning.
A2.3.1 Pseudocode details of operations on ARM core registers
In pseudocode, the R[] function is used to:
• Read or write R0-R12, SP, and LR, using n == 0-12, 13, and 14 respectively.
• Read the PC, using n == 15.
This function has prototypes:
bits(32) R[integer n]
    assert n >= 0 && n <= 15;
R[integer n] = bits(32) value
    assert n >= 0 && n <= 14;
The full operation of this function is explained in Pseudocode details of ARM core register operations on
page B1-12.
Descriptions of ARM store instructions that store the PC value use the PCStoreValue() pseudocode function
to specify the PC value stored by the instruction:
// PCStoreValue()
// ==============
bits(32) PCStoreValue()
    // This function returns the PC value. On architecture versions before ARMv7, it
    // is permitted to instead return PC+4, provided it does so consistently. It is
    // used only to describe ARM instructions, so it returns the address of the current
    // instruction plus 8 (normally) or 12 (when the alternative is permitted).
    return PC;
Writing an address to the PC causes either a simple branch to that address or an interworking branch that
also selects the instruction set to execute after the branch. A simple branch is performed by the
BranchWritePC() function:
// BranchWritePC()
// ===============
BranchWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ARM then
        if ArchVersion() < 6 && address<1:0> != '00' then UNPREDICTABLE;
        BranchTo(address<31:2>:'00');
    else
        BranchTo(address<31:1>:'0');
An interworking branch is performed by the BXWritePC() function:
// BXWritePC()
// ===========
BXWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ThumbEE then
        if address<0> == '1' then
            BranchTo(address<31:1>:'0'); // Remaining in ThumbEE state
        else
            UNPREDICTABLE;
    else
        if address<0> == '1' then
            SelectInstrSet(InstrSet_Thumb);
            BranchTo(address<31:1>:'0');
        elsif address<1> == '0' then
            SelectInstrSet(InstrSet_ARM);
            BranchTo(address);
        else // address<1:0> == '10'
            UNPREDICTABLE;
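The interworking decision made by BXWritePC() can be sketched in Python. This is a simplified model for illustration: the function name bx_write_pc, the thumbee parameter, the string names for instruction sets, and the use of ValueError to stand in for UNPREDICTABLE behavior are assumptions of this example, not architectural definitions.

```python
def bx_write_pc(address: int, thumbee: bool = False):
    """Return (instruction set, branch target) for an interworking branch.

    Raises ValueError where the pseudocode specifies UNPREDICTABLE.
    """
    address &= 0xFFFFFFFF
    if thumbee:
        if address & 1:
            return "ThumbEE", address & 0xFFFFFFFE  # remain in ThumbEE state
        raise ValueError("UNPREDICTABLE")
    if address & 1:
        return "Thumb", address & 0xFFFFFFFE  # bit [0] set selects Thumb
    if (address & 2) == 0:
        return "ARM", address  # bits [1:0] == '00' selects ARM
    raise ValueError("UNPREDICTABLE")  # bits [1:0] == '10'
```

For example, a target of 0x8001 selects Thumb state and branches to 0x8000, whereas a target of 0x8000 selects ARM state.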
The LoadWritePC() and ALUWritePC() functions are used for two cases where the behavior was systematically
modified between architecture versions:
// LoadWritePC()
// =============
LoadWritePC(bits(32) address)
    if ArchVersion() >= 5 then
        BXWritePC(address);
    else
        BranchWritePC(address);
// ALUWritePC()
// ============
ALUWritePC(bits(32) address)
    if ArchVersion() >= 7 && CurrentInstrSet() == InstrSet_ARM then
        BXWritePC(address);
    else
        BranchWritePC(address);
Note
The behavior of the PC writes performed by the ALUWritePC() function is different in Debug state, where
there are more UNPREDICTABLE cases. The pseudocode in this section only handles the non-debug cases. For
more information, see Data-processing instructions with the PC as the target in Debug state on page C5-12.
A2.4 The Application Program Status Register (APSR)
Program status is reported in the 32-bit Application Program Status Register (APSR). The format of the
APSR is:
In the APSR, the bits are in the following categories:
Reserved bits are allocated to system features, or are available for future expansion. Unprivileged
execution ignores writes to privileged fields. However, application level software that writes to the
APSR must treat reserved bits as Do-Not-Modify (DNM) bits. For more information about the
reserved bits, see Format of the CPSR and SPSRs on page B1-16.
Flags that can be set by many instructions:
N, bit [31] Negative condition code flag. Set to bit [31] of the result of the instruction. If the result
is regarded as a two's complement signed integer, then N == 1 if the result is negative and
N == 0 if it is positive or zero.
Z, bit [30] Zero condition code flag. Set to 1 if the result of the instruction is zero, and to 0 otherwise.
A result of zero often indicates an equal result from a comparison.
C, bit [29] Carry condition code flag. Set to 1 if the instruction results in a carry condition, for
example an unsigned overflow on an addition.
V, bit [28] Overflow condition code flag. Set to 1 if the instruction results in an overflow condition,
for example a signed overflow on an addition.
Q, bit [27] Set to 1 to indicate overflow or saturation occurred in some instructions, normally related
to Digital Signal Processing (DSP). For more information, see Pseudocode details of
saturation on page A2-9.
GE[3:0], bits [19:16]
Greater than or Equal flags. SIMD instructions update these flags to indicate the results
from individual bytes or halfwords of the operation. These flags can control a later SEL
instruction. For more information, see SEL on page A8-312.
Bits [26:24] are RAZ/SBZP. Therefore, software can use MSR instructions that write the top byte of
the APSR without using a read, modify, write sequence. If it does this, it must write zeros to
bits [26:24].
Instructions can test the N, Z, C, and V condition code flags to determine whether the instruction is to be
executed. In this way, execution of the instruction can be made conditional on the result of a previous
operation. For more information about conditional execution see Conditional execution on page A4-3 and
Conditional execution on page A8-8.
In ARMv7-A and ARMv7-R, the APSR is the same register as the CPSR, but the APSR must be used only
to access the N, Z, C, V, Q, and GE[3:0] bits. For more information, see Program Status Registers (PSRs)
on page B1-14.
APSR bit assignments:
Bits     Field
[31]     N
[30]     Z
[29]     C
[28]     V
[27]     Q
[26:24]  RAZ/SBZP
[23:20]  Reserved
[19:16]  GE[3:0]
[15:0]   Reserved
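As a small illustration of the APSR bit positions, the following Python sketch extracts the application level flags from a 32-bit value. The function name decode_apsr and the dictionary result are assumptions for this example only.

```python
def decode_apsr(apsr: int) -> dict:
    """Extract the N, Z, C, V, Q and GE[3:0] fields from a 32-bit APSR value."""
    return {
        "N": (apsr >> 31) & 1,
        "Z": (apsr >> 30) & 1,
        "C": (apsr >> 29) & 1,
        "V": (apsr >> 28) & 1,
        "Q": (apsr >> 27) & 1,
        "GE": (apsr >> 16) & 0xF,  # bits [19:16]
    }
```

For instance, the value 0x600F0000 decodes to Z == 1, C == 1 and GE == 0b1111, with N, V and Q clear.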
A2.5 Execution state registers
The execution state registers modify the execution of instructions. They control:
Whether instructions are interpreted as Thumb instructions, ARM instructions, ThumbEE
instructions, or Java bytecodes. For more information, see ISETSTATE.
In Thumb state and ThumbEE state only, what conditions apply to the next four instructions. For
more information, see ITSTATE on page A2-17.
Whether data is interpreted as big-endian or little-endian. For more information, see ENDIANSTATE
on page A2-19.
In ARMv7-A and ARMv7-R, the execution state registers are part of the Current Program Status Register.
For more information, see Program Status Registers (PSRs) on page B1-14.
There is no direct access to the execution state registers from application level instructions, but they can be
changed by side effects of application level instructions.
A2.5.1 ISETSTATE
The J bit and the T bit determine the instruction set used by the processor. Table A2-1 shows the encoding
of these bits.
ARM state The processor executes the ARM instruction set described in Chapter A5 ARM
Instruction Set Encoding.
Thumb state The processor executes the Thumb instruction set as described in Chapter A6
Thumb Instruction Set Encoding.
Jazelle state The processor executes Java bytecodes as part of a Java Virtual Machine (JVM). For
more information, see Jazelle direct bytecode execution support on page A2-73.
ISETSTATE is a 2-bit field: bit [1] is J and bit [0] is T.
Table A2-1 J and T bit encoding in ISETSTATE
J  T  Instruction set state
0  0  ARM
0  1  Thumb
1  0  Jazelle
1  1  ThumbEE
ThumbEE state The processor executes a variation of the Thumb instruction set specifically targeted
for use with dynamic compilation techniques associated with an execution
environment. This can be Java or other execution environments. This feature is
required in ARMv7-A, and optional in ARMv7-R. For more information, see
Thumb Execution Environment on page A2-69.
Pseudocode details of ISETSTATE operations
The following pseudocode functions return the current instruction set and select a new instruction set:
enumeration InstrSet {InstrSet_ARM, InstrSet_Thumb, InstrSet_Jazelle, InstrSet_ThumbEE};
// CurrentInstrSet()
// =================
InstrSet CurrentInstrSet()
    case ISETSTATE of
        when '00' result = InstrSet_ARM;
        when '01' result = InstrSet_Thumb;
        when '10' result = InstrSet_Jazelle;
        when '11' result = InstrSet_ThumbEE;
    return result;
// SelectInstrSet()
// ================
SelectInstrSet(InstrSet iset)
    case iset of
        when InstrSet_ARM
            if CurrentInstrSet() == InstrSet_ThumbEE then
                UNPREDICTABLE;
            else
                ISETSTATE = '00';
        when InstrSet_Thumb
            ISETSTATE = '01';
        when InstrSet_Jazelle
            ISETSTATE = '10';
        when InstrSet_ThumbEE
            ISETSTATE = '11';
    return;
A2.5.2 ITSTATE
This field holds the If-Then execution state bits for the Thumb IT instruction. See IT on page A8-104 for a
description of the IT instruction and the associated IT block.
ITSTATE divides into two subfields:
IT[7:5] Holds the base condition for the current IT block. The base condition is the top 3 bits of the
condition specified by the IT instruction.
This subfield is 0b000 when no IT block is active.
IT[4:0] Encodes:
The size of the IT block. This is the number of instructions that are to be conditionally
executed. The size of the block is implied by the position of the least significant 1 in
this field, as shown in Table A2-2 on page A2-18.
The value of the least significant bit of the condition code for each instruction in the
block.
Note
Changing the value of the least significant bit of a condition code from 0 to 1 has the
effect of inverting the condition code.
This subfield is 0b00000 when no IT block is active.
When an IT instruction is executed, these bits are set according to the condition in the instruction, and the
Then and Else (T and E) parameters in the instruction. For more information, see IT on page A8-104.
An instruction in an IT block is conditional, see Conditional instructions on page A4-4 and Conditional
execution on page A8-8. The condition used is the current value of IT[7:4]. When an instruction in an IT
block completes its execution normally, ITSTATE is advanced to the next line of Table A2-2 on page A2-18.
For details of what happens if such an instruction takes an exception see Exception entry on page B1-34.
Note
Instructions that can complete their normal execution by branching are only permitted in an IT block as its
last instruction, and so always result in ITSTATE advancing to normal execution.
Note
ITSTATE affects instruction execution only in Thumb and ThumbEE states. In ARM and Jazelle states,
ITSTATE must be '00000000', otherwise behavior is UNPREDICTABLE.
ITSTATE field layout: bits [7:0] hold IT[7:0].
Pseudocode details of ITSTATE operations
ITSTATE advances after normal execution of an IT block instruction. This is described by the ITAdvance()
pseudocode function:
// ITAdvance()
// ===========
ITAdvance()
    if ITSTATE<2:0> == '000' then
        ITSTATE.IT = '00000000';
    else
        ITSTATE.IT<4:0> = LSL(ITSTATE.IT<4:0>, 1);
The following functions test whether the current instruction is in an IT block, and whether it is the last
instruction of an IT block:
// InITBlock()
// ===========
boolean InITBlock()
    return (ITSTATE.IT<3:0> != '0000');
// LastInITBlock()
// ===============
boolean LastInITBlock()
    return (ITSTATE.IT<3:0> == '1000');
Table A2-2 Effect of IT execution state bits
IT bits (a)
[7:5]      [4]  [3]  [2]  [1]  [0]  Note
cond_base  P1   P2   P3   P4   1    Entry point for 4-instruction IT block
cond_base  P1   P2   P3   1    0    Entry point for 3-instruction IT block
cond_base  P1   P2   1    0    0    Entry point for 2-instruction IT block
cond_base  P1   1    0    0    0    Entry point for 1-instruction IT block
000        0    0    0    0    0    Normal execution, not in an IT block
a. Combinations of the IT bits not shown in this table are reserved.
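To illustrate how ITSTATE steps through Table A2-2, the following Python sketch models ITAdvance(), InITBlock() and LastInITBlock() on an 8-bit integer. The Python function names, and the helper conditions() that lists the condition applied to each instruction of a block, are hypothetical and exist only for this example.

```python
def it_advance(itstate: int) -> int:
    """Model of ITAdvance(): returns the next 8-bit ITSTATE value."""
    if (itstate & 0b111) == 0:
        return 0  # the block has finished, return to normal execution
    # Keep IT[7:5] and shift IT[4:0] left by one, discarding the old bit [4].
    return (itstate & 0b11100000) | ((itstate << 1) & 0b11111)


def in_it_block(itstate: int) -> bool:
    return (itstate & 0b1111) != 0


def last_in_it_block(itstate: int) -> bool:
    return (itstate & 0b1111) == 0b1000


def conditions(itstate: int) -> list:
    """Condition code, IT[7:4], seen by each instruction in the block."""
    conds = []
    while in_it_block(itstate):
        conds.append(itstate >> 4)
        itstate = it_advance(itstate)
    return conds
```

As a worked case, consider a 3-instruction block whose base condition is 0b101 with P1 = 0, P2 = 0 and P3 = 1, giving an initial ITSTATE of 0b10100110. The sketch yields the conditions 0b1010, 0b1010 and 0b1011 in turn, that is, two instructions on the base condition followed by one on its inverse.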
A2.5.3 ENDIANSTATE
ARMv7-A and ARMv7-R support configuration between little-endian and big-endian interpretations of
data memory, as shown in Table A2-3. The endianness is controlled by ENDIANSTATE.
The ARM and Thumb instruction sets both include an instruction to manipulate ENDIANSTATE:
SETEND BE    Sets ENDIANSTATE to 1, for big-endian operation.
SETEND LE    Sets ENDIANSTATE to 0, for little-endian operation.
The SETEND instruction is unconditional. For more information, see SETEND on page A8-314.
Pseudocode details of ENDIANSTATE operations
The BigEndian() pseudocode function tests whether big-endian memory accesses are currently selected.
// BigEndian()
// ===========
boolean BigEndian()
    return (ENDIANSTATE == '1');
Table A2-3 APSR configuration of endianness
ENDIANSTATE  Endian mapping
0            Little-endian
1            Big-endian
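The effect of ENDIANSTATE on a data access can be illustrated with a short Python sketch. The function load_word and its parameters are assumptions made for this example, which models memory as a Python bytes object rather than any architectural interface.

```python
def load_word(memory: bytes, address: int, big_endian: bool) -> int:
    """Interpret four bytes at address as a 32-bit word in the given endianness."""
    data = memory[address:address + 4]
    return int.from_bytes(data, "big" if big_endian else "little")


# Bytes at increasing addresses: 0x11, 0x22, 0x33, 0x44.
memory = bytes([0x11, 0x22, 0x33, 0x44])
little = load_word(memory, 0, big_endian=False)  # 0x44332211
big = load_word(memory, 0, big_endian=True)      # 0x11223344
```

The same four bytes therefore yield different word values depending on the selected data endianness.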
A2.6 Advanced SIMD and VFP extensions
Advanced SIMD and VFP are two optional extensions to ARMv7.
Advanced SIMD performs packed Single Instruction Multiple Data (SIMD) operations, either integer or
single-precision floating-point. VFP performs single-precision or double-precision floating-point
operations.
Both extensions permit floating-point exceptions, such as overflow or division by zero, to be handled in an
untrapped fashion. When handled in this way, a floating-point exception causes a cumulative status register
bit to be set to 1 and a default result to be produced by the operation.
The ARMv7 VFP implementation is VFPv3. ARMv7 also permits a variant of VFPv3, VFPv3U, that
supports the trapping of floating-point exceptions, see VFPv3U on page A2-31. VFPv2 also supports the
trapping of floating-point exceptions.
For more information about floating-point exceptions see Floating-point exceptions on page A2-42.
Each extension can be implemented at a number of levels. Table A2-4 shows the permitted combinations of
implementations of the two extensions.
The optional half-precision extensions provide conversion functions in both directions between
half-precision floating-point and single-precision floating-point. These extensions can be implemented with
any Advanced SIMD and VFP implementation that supports single-precision floating-point. The
half-precision extensions apply to both VFP and Advanced SIMD if they are both implemented.
For system-level information about the Advanced SIMD and VFP extensions see:
Advanced SIMD and VFP extension system registers on page B1-66
Advanced SIMD and floating-point support on page B1-64.
Table A2-4 Permitted combinations of Advanced SIMD and VFP extensions
Advanced SIMD                                 VFP
Not implemented                               Not implemented
Integer only                                  Not implemented
Integer and single-precision floating-point   Single-precision floating-point only (a)
Integer and single-precision floating-point   Single-precision and double-precision floating-point
Not implemented                               Single-precision floating-point only (a)
Not implemented                               Single-precision and double-precision floating-point
a. Must be able to load and store double-precision data.
Note
Before ARMv7, the VFP extension was called the Vector Floating-point Architecture, and was used for
vector operations. For details of these deprecated operations see Appendix F VFP Vector Operation
Support. From ARMv7:
• ARM recommends that the Advanced SIMD extension is used for single-precision vector
  floating-point operations
• an implementation that requires support for vector operations must implement the Advanced SIMD
  extension.
A2.6.1 Advanced SIMD and VFP extension registers
Advanced SIMD and VFPv3 use the same register set. This is distinct from the ARM core register set. These
registers are generally referred to as the extension registers.
The extension register set consists of either thirty-two or sixteen doubleword registers, as follows:
• If VFPv2 is implemented, it consists of sixteen doubleword registers.
• If VFPv3 is implemented, it consists of either thirty-two or sixteen doubleword registers. Where
  necessary the terms VFPv3-D32 and VFPv3-D16 are used to distinguish between these two
  implementation options.
• If Advanced SIMD is implemented, it consists of thirty-two doubleword registers. If both Advanced
  SIMD and VFPv3 are implemented, VFPv3 must be implemented in its VFPv3-D32 form.
The Advanced SIMD and VFP views of the extension register set are not identical. They are described in
the following sections.
Figure A2-1 on page A2-22 shows the views of the extension register set, and the way the word,
doubleword, and quadword registers overlap.
Advanced SIMD views of the extension register set
Advanced SIMD can view this register set as:
• Sixteen 128-bit quadword registers, Q0-Q15.
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in VFPv3.
These views can be used simultaneously. For example, a program might hold 64-bit vectors in D0 and D1
and a 128-bit vector in Q1.
VFP views of the extension register set
In VFPv3-D32, the extension register set consists of thirty-two doubleword registers, that VFP can view as:
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in Advanced SIMD.
• Thirty-two 32-bit single word registers, S0-S31. Only half of the set is accessible in this view.
In VFPv3-D16 and VFPv2, the extension register set consists of sixteen doubleword registers, that VFP can
view as:
• Sixteen 64-bit doubleword registers, D0-D15.
• Thirty-two 32-bit single word registers, S0-S31.
In each case, the two views can be used simultaneously.
Advanced SIMD and VFP register mapping
Figure A2-1 Advanced SIMD and VFP register set
[Figure: four overlapping views of the extension register set. S0-S31 (VFP only) overlap D0-D15.
D0-D15 is the doubleword view for VFPv2 or VFPv3-D16. D0-D31 is the doubleword view for VFPv3-D32 or
Advanced SIMD. Q0-Q15 (Advanced SIMD only) overlap D0-D31.]
The mapping between the registers is as follows:
• S<2n> maps to the least significant half of D<n>
• S<2n+1> maps to the most significant half of D<n>
• D<2n> maps to the least significant half of Q<n>
• D<2n+1> maps to the most significant half of Q<n>.
For example, you can access the least significant half of the elements of a vector in Q6 by referring to D12,
and the most significant half of the elements by referring to D13.
Pseudocode details of Advanced SIMD and VFP extension registers
The pseudocode function VFPSmallRegisterBank() returns FALSE if all of the 32 registers D0-D31 can be
accessed, and TRUE if only the 16 registers D0-D15 can be accessed:
boolean VFPSmallRegisterBank()
In more detail, VFPSmallRegisterBank():
• returns TRUE for a VFPv2 or VFPv3-D16 implementation
• for a VFPv3-D32 implementation:
  - returns FALSE if CPACR.D32DIS == 0
  - returns TRUE if CPACR.D32DIS == 1 and CPACR.ASEDIS == 1
  - results in UNPREDICTABLE behavior if CPACR.D32DIS == 1 and CPACR.ASEDIS == 0.
For details of the CPACR register, see:
• c1, Coprocessor Access Control Register (CPACR) on page B3-104 for a VMSA implementation
• c1, Coprocessor Access Control Register (CPACR) on page B4-51 for a PMSA implementation.
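The decision described above can be sketched in Python. This is a hedged illustration only: the function name vfp_small_register_bank, the parameter d32_implemented, and the use of ValueError to represent UNPREDICTABLE behavior are assumptions introduced for this example.

```python
def vfp_small_register_bank(d32_implemented: bool, d32dis: int, asedis: int) -> bool:
    """Return True if only D0-D15 can be accessed, False if D0-D31 can be.

    d32_implemented is False for VFPv2 or VFPv3-D16, True for VFPv3-D32.
    d32dis and asedis model the CPACR.D32DIS and CPACR.ASEDIS bits.
    """
    if not d32_implemented:
        return True
    if d32dis == 0:
        return False
    if asedis == 1:
        return True
    # D32DIS == 1 with ASEDIS == 0 is UNPREDICTABLE in the description above.
    raise ValueError("UNPREDICTABLE")
```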
The S0-S31, D0-D31, and Q0-Q15 views of the registers are provided by the following functions:
// The 64-bit extension register bank for Advanced SIMD and VFP.
array bits(64) _D[0..31];
// S[] - non-assignment form
// =========================
bits(32) S[integer n]
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        result = D[n DIV 2]<31:0>;
    else
        result = D[n DIV 2]<63:32>;
    return result;
// S[] - assignment form
// =====================
S[integer n] = bits(32) value
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        D[n DIV 2]<31:0> = value;
    else
        D[n DIV 2]<63:32> = value;
    return;
// D[] - non-assignment form
// =========================
bits(64) D[integer n]
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    return _D[n];
// D[] - assignment form
// =====================
D[integer n] = bits(64) value
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    _D[n] = value;
    return;
// Q[] - non-assignment form
// =========================
bits(128) Q[integer n]
    assert n >= 0 && n <= 15;
    return D[2*n+1]:D[2*n];
// Q[] - assignment form
// =====================
Q[integer n] = bits(128) value
    assert n >= 0 && n <= 15;
    D[2*n] = value<63:0>;
    D[2*n+1] = value<127:64>;
    return;
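The overlapping S, D and Q views can be modelled with a small Python class. This is a simplified sketch: the class name ExtensionRegisters and its method names are hypothetical, registers are held as non-negative integers, and the model assumes the full thirty-two register bank is accessible, so it omits the VFPSmallRegisterBank() check.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class ExtensionRegisters:
    """Sketch of a 32 x 64-bit register bank with S, D and Q views."""

    def __init__(self):
        self._d = [0] * 32

    def read_d(self, n: int) -> int:
        assert 0 <= n <= 31
        return self._d[n]

    def write_d(self, n: int, value: int) -> None:
        assert 0 <= n <= 31
        self._d[n] = value & MASK64

    def read_s(self, n: int) -> int:
        assert 0 <= n <= 31
        shift = 32 if n % 2 else 0  # odd S registers are the upper half
        return (self._d[n // 2] >> shift) & MASK32

    def write_s(self, n: int, value: int) -> None:
        assert 0 <= n <= 31
        shift = 32 if n % 2 else 0
        kept = self._d[n // 2] & ~(MASK32 << shift) & MASK64
        self._d[n // 2] = kept | ((value & MASK32) << shift)

    def read_q(self, n: int) -> int:
        assert 0 <= n <= 15
        return (self.read_d(2 * n + 1) << 64) | self.read_d(2 * n)

    def write_q(self, n: int, value: int) -> None:
        assert 0 <= n <= 15
        self.write_d(2 * n, value & MASK64)
        self.write_d(2 * n + 1, (value >> 64) & MASK64)
```

In this model, writing Q6 makes its lower half visible as D12 and its upper half as D13, and writing S1 changes only the upper word of D0.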
A2.6.2 Data types supported by the Advanced SIMD extension
When the Advanced SIMD extension is implemented, it can operate on integer and floating-point data. It
defines a set of data types to represent the different data formats. Table A2-5 shows the available formats.
Each instruction description specifies the data types that the instruction supports.
The polynomial data type is described in Polynomial arithmetic over {0,1} on page A2-67.
The .F16 data type is the half-precision data type currently selected by the FPSCR.AHP bit, see Advanced
SIMD and VFP system registers on page A2-28. It is supported only when the half-precision extensions are
implemented.
The .F32 data type is the ARM standard single-precision floating-point data type, see Advanced SIMD and
VFP single-precision format on page A2-34.
The instruction definitions use a data type specifier to define the data types appropriate to the operation.
Figure A2-2 on page A2-26 shows the hierarchy of Advanced SIMD data types.
Table A2-5 Advanced SIMD data types
Data type specifier   Meaning
.<size>               Any element of <size> bits
.F<size>              Floating-point number of <size> bits
.I<size>              Signed or unsigned integer of <size> bits
.P<size>              Polynomial over {0,1} of degree less than <size>
.S<size>              Signed integer of <size> bits
.U<size>              Unsigned integer of <size> bits
Figure A2-2 Advanced SIMD data type hierarchy
For example, a multiply instruction must distinguish between integer and floating-point data types.
However, some multiply instructions use modulo arithmetic for integer instructions and therefore do not
need to distinguish between signed and unsigned inputs.
A multiply instruction that generates a double-width (long) result must specify the input data types as signed
or unsigned, because for this operation it does make a difference.
A2.6.3 Advanced SIMD vectors
When the Advanced SIMD extension is implemented, a register can hold one or more packed elements, all
of the same size and type. The combination of a register and a data type describes a vector of elements. The
vector is considered to be an array of elements of the data type specified in the instruction. The number of
elements in the vector is implied by the size of the data elements and the size of the register.
Vector indices are in the range 0 to (number of elements – 1). An index of 0 refers to the least significant
end of the vector. Figure A2-3 on page A2-27 shows examples of Advanced SIMD vectors:
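Figure A2-2 contents, from the most general to the most specific data type specifiers:
.8:   .I8 (.S8, .U8), .P8
.16:  .I16 (.S16, .U16), .P16, .F16 (a)
.32:  .I32 (.S32, .U32), .F32
.64:  .I64 (.S64, .U64)
a. Supported only if the half-precision extensions are implemented.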
Figure A2-3 Examples of Advanced SIMD vectors
Pseudocode details of Advanced SIMD vectors
The pseudocode function Elem[] is used to access the element of a specified index and size in a vector:
// Elem[] - non-assignment form
// ============================
bits(size) Elem[bits(N) vector, integer e, integer size]
    assert e >= 0 && (e+1)*size <= N;
    return vector<(e+1)*size-1:e*size>;
// Elem[] - assignment form
// ========================
Elem[bits(N) vector, integer e, integer size] = bits(size) value
    assert e >= 0 && (e+1)*size <= N;
    vector<(e+1)*size-1:e*size> = value;
    return;
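Element access of this kind can be sketched in Python with shifts and masks. The names elem_get and elem_set, and the explicit n_bits parameter standing in for the vector width N, are assumptions for this example. Because Python integers are immutable, the assignment form returns the updated vector rather than modifying it in place.

```python
def elem_get(vector: int, n_bits: int, e: int, size: int) -> int:
    """Read element e of the given size from an n_bits-wide vector."""
    assert e >= 0 and (e + 1) * size <= n_bits
    return (vector >> (e * size)) & ((1 << size) - 1)


def elem_set(vector: int, n_bits: int, e: int, size: int, value: int) -> int:
    """Return a copy of vector with element e replaced by value."""
    assert e >= 0 and (e + 1) * size <= n_bits
    mask = ((1 << size) - 1) << (e * size)
    return (vector & ~mask) | ((value << (e * size)) & mask)
```

With a 64-bit vector 0x0004000300020001 viewed as four 16-bit elements, element 0 is 1 and element 3 is 4, consistent with index 0 referring to the least significant end of the vector.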
Figure A2-3 contents (element [0] is at the least significant end of the register):
Qn, bits [127:0]  128-bit vector of single-precision (32-bit) floating-point numbers: four .F32 elements, [3] to [0]
Qn, bits [127:0]  128-bit vector of 16-bit signed integers: eight .S16 elements, [7] to [0]
Dn, bits [63:0]   64-bit vector of 32-bit signed integers: two .S32 elements, [1] to [0]
Dn, bits [63:0]   64-bit vector of 16-bit unsigned integers: four .U16 elements, [3] to [0]
A2.6.4 Advanced SIMD and VFP system registers
The Advanced SIMD and VFP extensions have a shared register space for system registers. Only one
register in this space is accessible at the application level, see Floating-point Status and Control Register
(FPSCR).
See Advanced SIMD and VFP extension system registers on page B1-66 for the system level description of
the registers.
Floating-point Status and Control Register (FPSCR)
The Floating-point Status and Control Register (FPSCR) is implemented in any system that implements one
or both of:
• the VFP extension
• the Advanced SIMD extension.
The FPSCR provides all necessary User level control of the floating-point system.
The FPSCR is a 32-bit read/write system register, accessible in unprivileged and privileged modes.
The format of the FPSCR is:
Bits [31:28] Condition code bits. These are updated on floating-point comparison operations. They are
not updated on SIMD operations, and do not affect SIMD instructions.
N, bit [31] Negative condition code flag.
Z, bit [30] Zero condition code flag.
C, bit [29] Carry condition code flag.
V, bit [28] Overflow condition code flag.
QC, bit [27] Cumulative saturation flag, Advanced SIMD only. This bit is set to 1 to indicate that an
Advanced SIMD integer operation has saturated since 0 was last written to this bit. For
details of saturation, see Pseudocode details of saturation on page A2-9.
The value of this bit is ignored by the VFP extension. If Advanced SIMD is not implemented
this bit is UNK/SBZP.
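FPSCR bit assignments:
Bits     Field
[31]     N
[30]     Z
[29]     C
[28]     V
[27]     QC
[26]     AHP
[25]     DN
[24]     FZ
[23:22]  RMode
[21:20]  Stride
[19]     UNK/SBZP
[18:16]  Len
[15]     IDE
[14:13]  UNK/SBZP
[12]     IXE
[11]     UFE
[10]     OFE
[9]      DZE
[8]      IOE
[7]      IDC
[6:5]    UNK/SBZP
[4]      IXC
[3]      UFC
[2]      OFC
[1]      DZC
[0]      IOC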
AHP, bit[26] Alternative half-precision control bit:
0 IEEE half-precision format selected.
1 Alternative half-precision format selected.
For more information see Advanced SIMD and VFP half-precision formats on page A2-38.
If the half-precision extensions are not implemented this bit is UNK/SBZP.
Bits [19,14:13,6:5]
Reserved. UNK/SBZP.
DN, bit [25] Default NaN mode control bit:
0 NaN operands propagate through to the output of a floating-point operation.
1 Any operation involving one or more NaNs returns the Default NaN.
For more information, see NaN handling and the Default NaN on page A2-41.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Default NaN setting, regardless of the value of the DN bit.
FZ, bit [24] Flush-to-zero mode control bit:
0 Flush-to-zero mode disabled. Behavior of the floating-point system is fully
compliant with the IEEE 754 standard.
1 Flush-to-zero mode enabled.
For more information, see Flush-to-zero on page A2-39.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Flush-to-zero setting, regardless of the value of the FZ bit.
RMode, bits [23:22]
Rounding Mode control field. The encoding of this field is:
0b00 Round to Nearest (RN) mode
0b01 Round towards Plus Infinity (RP) mode
0b10 Round towards Minus Infinity (RM) mode
0b11 Round towards Zero (RZ) mode.
The specified rounding mode is used by almost all VFP floating-point instructions.
Advanced SIMD arithmetic always uses the Round to Nearest setting, regardless of the
value of the RMode bits.
Stride, bits [21:20] and Len, bits [18:16]
Use of nonzero values of these fields is deprecated in ARMv7. For details of their use in
previous versions of the ARM architecture see Appendix F VFP Vector Operation Support.
The values of these fields are ignored by the Advanced SIMD extension.
Bits [15,12:8] Floating-point exception trap enable bits. These bits are supported only in VFPv2 and
VFPv3U. They are reserved, RAZ/SBZP, on a system that implements VFPv3.
The possible values of each bit are:
0 Untrapped exception handling selected
1 Trapped exception handling selected.
The values of these bits control only VFP arithmetic. Advanced SIMD arithmetic always
uses untrapped exception handling, regardless of the values of these bits.
For more information, see Floating-point exceptions on page A2-42.
IDE, bit [15] Input Denormal exception trap enable.
IXE, bit [12] Inexact exception trap enable.
UFE, bit [11] Underflow exception trap enable.
OFE, bit [10] Overflow exception trap enable.
DZE, bit [9] Division by Zero exception trap enable.
IOE, bit [8] Invalid Operation exception trap enable.
Bits [7,4:0] Cumulative exception flags for floating-point exceptions. Each of these bits is set to 1 to
indicate that the corresponding exception has occurred since 0 was last written to it. How
VFP instructions update these bits depends on the value of the corresponding exception trap
enable bits:
Trap enable bit = 0
If the floating-point exception occurs then the cumulative exception flag is set
to 1.
Trap enable bit = 1
If the floating-point exception occurs the trap handling software can decide
whether to set the cumulative exception flag to 1.
Advanced SIMD instructions set each cumulative exception flag if the corresponding
exception occurs in one or more of the floating-point calculations performed by the
instruction, regardless of the setting of the trap enable bits.
For more information, see Floating-point exceptions on page A2-42.
IDC, bit [7] Input Denormal cumulative exception flag.
IXC, bit [4] Inexact cumulative exception flag.
UFC, bit [3] Underflow cumulative exception flag.
OFC, bit [2] Overflow cumulative exception flag.
DZC, bit [1] Division by Zero cumulative exception flag.
IOC, bit [0] Invalid Operation cumulative exception flag.
If the processor implements the integer-only Advanced SIMD extension and does not implement the VFP
extension, all of these bits except QC are UNK/SBZP.
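The field layout described above can be sketched as a small decoder. This is an illustrative Python model only, not part of the architecture: the names `FPSCR_FIELDS`, `decode_fpscr` and `rounding_mode` are hypothetical, and the bit positions are taken directly from the field descriptions in this section.

```python
# Field positions from the FPSCR description: name -> (high bit, low bit).
# Reserved bits [19], [14:13] and [6:5] are deliberately left out.
FPSCR_FIELDS = {
    "N": (31, 31), "Z": (30, 30), "C": (29, 29), "V": (28, 28),
    "QC": (27, 27), "AHP": (26, 26), "DN": (25, 25), "FZ": (24, 24),
    "RMode": (23, 22), "Stride": (21, 20), "Len": (18, 16),
    "IDE": (15, 15), "IXE": (12, 12), "UFE": (11, 11), "OFE": (10, 10),
    "DZE": (9, 9), "IOE": (8, 8),
    "IDC": (7, 7), "IXC": (4, 4), "UFC": (3, 3), "OFC": (2, 2),
    "DZC": (1, 1), "IOC": (0, 0),
}

# RMode encodings as listed for bits [23:22].
ROUNDING_MODES = {0b00: "RN", 0b01: "RP", 0b10: "RM", 0b11: "RZ"}


def decode_fpscr(value):
    """Split a 32-bit FPSCR value into its named fields."""
    assert 0 <= value < 1 << 32
    fields = {}
    for name, (high, low) in FPSCR_FIELDS.items():
        width = high - low + 1
        fields[name] = (value >> low) & ((1 << width) - 1)
    return fields


def rounding_mode(value):
    """Return the mnemonic selected by the RMode field."""
    return ROUNDING_MODES[decode_fpscr(value)["RMode"]]
```

For example, a value with FZ set, RMode equal to 0b11 and IOC set decodes with `FZ == 1`, `IOC == 1`, and `rounding_mode()` returning `"RZ"`.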
Writes to the FPSCR can have side-effects on various aspects of processor operation. All of these
side-effects are synchronous to the FPSCR write. This means they are guaranteed not to be visible to earlier
instructions in the execution stream, and they are guaranteed to be visible to later instructions in the
execution stream.
Accessing the FPSCR
You read or write the FPSCR using the VMRS and VMSR instructions. For more information, see VMRS on
page A8-658 and VMSR on page A8-660. For example:
VMRS <Rt>, FPSCR ; Read Floating-point System Control Register
VMSR FPSCR, <Rt> ; Write Floating-point System Control Register
A2.6.5 VFPv3U
VFPv3 does not support the exception trap enable bits in the FPSCR, see Floating-point Status and Control
Register (FPSCR) on page A2-28. All floating-point exceptions are untrapped.
The VFPv3U variant of the VFPv3 architecture implements the exception trap enable bits in the FPSCR,
and provides exception handling as described in VFP support code on page B1-70. There is a separate trap
enable bit for each of the six floating-point exceptions described in Floating-point exceptions on
page A2-42. The VFPv3U architecture is otherwise identical to VFPv3.
Trapped exception handling never causes the corresponding cumulative exception bit of the FPSCR to be
set to 1. If this behavior is desired, the trap handler routine must use a read, modify, write sequence on the
FPSCR to set the cumulative exception bit.
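The modify step of that read, modify, write sequence can be sketched as follows. This is a Python model of the bit manipulation only: on a real system the read and the write would be VMRS and VMSR instructions, and the helper names here are hypothetical.

```python
# Bit positions of the cumulative exception flags, bits [7,4:0] of the FPSCR.
CUMULATIVE_FLAG_BITS = {"IOC": 0, "DZC": 1, "OFC": 2, "UFC": 3, "IXC": 4, "IDC": 7}


def set_cumulative_flag(fpscr, flag):
    """Return the FPSCR value with one cumulative exception flag set to 1."""
    return (fpscr | (1 << CUMULATIVE_FLAG_BITS[flag])) & 0xFFFFFFFF


def clear_cumulative_flags(fpscr):
    """Return the FPSCR value with every cumulative exception flag written as 0."""
    mask = 0
    for bit in CUMULATIVE_FLAG_BITS.values():
        mask |= 1 << bit
    return fpscr & ~mask & 0xFFFFFFFF
```

A trap handler for a trapped Underflow exception that wants the flag recorded would, under these assumptions, read the FPSCR, apply `set_cumulative_flag(value, "UFC")`, and write the result back.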
VFPv3U is backwards compatible with VFPv2.
A2.7 Floating-point data types and arithmetic
The VFP extension supports single-precision (32-bit) and double-precision (64-bit) floating-point data
types and arithmetic as defined by the IEEE 754 floating-point standard. It also supports the ARM Standard
modifications to that arithmetic described in Flush-to-zero on page A2-39 and NaN handling and the
Default NaN on page A2-41.
Trapped floating-point exception handling is supported in the VFPv3U variant only (see VFPv3U on
page A2-31).
ARM standard floating-point arithmetic means IEEE 754 floating-point arithmetic with the ARM standard
modifications and:
• the Round to Nearest rounding mode selected
• untrapped exception handling selected for all floating-point exceptions.
The Advanced SIMD extension only supports single-precision ARM standard floating-point arithmetic.
Note
Implementations of the VFP extension require support code to be installed in the system if trapped
floating-point exception handling is required. See VFP support code on page B1-70.
They might also require support code to be installed in the system to support other aspects of their
floating-point arithmetic. It is IMPLEMENTATION DEFINED which aspects of VFP floating-point arithmetic
are supported in a system without support code installed.
Aspects of floating-point arithmetic that are implemented in support code are likely to run much more
slowly than those that are executed in hardware.
ARM recommends that:
• To maximize the chance of getting high floating-point performance, software developers use ARM
  standard floating-point arithmetic.
• Software developers check whether their systems have support code installed, and if not, observe the
  IMPLEMENTATION DEFINED restrictions on what operations their VFP implementation can handle
  without support code.
• VFP implementation developers implement at least ARM standard floating-point arithmetic in
  hardware, so that it can be executed without any need for support code.
A2.7.1 ARM standard floating-point input and output values
ARM standard floating-point arithmetic supports the following input formats defined by the IEEE 754
floating-point standard:
• Zeros.
• Normalized numbers.
• Denormalized numbers are flushed to 0 before floating-point operations. For details, see
  Flush-to-zero on page A2-39.
• NaNs.
• Infinities.
ARM standard floating-point arithmetic supports the Round to Nearest rounding mode defined by the IEEE
754 standard.
ARM standard floating-point arithmetic supports the following output result formats defined by the IEEE
754 standard:
• Zeros.
• Normalized numbers.
• Results that are less than the minimum normalized number are flushed to zero, see Flush-to-zero on
  page A2-39.
• NaNs produced in floating-point operations are always the default NaN, see NaN handling and the
  Default NaN on page A2-41.
• Infinities.
A2.7.2 Advanced SIMD and VFP single-precision format
The single-precision floating-point format used by the Advanced SIMD and VFP extensions is as defined
by the IEEE 754 standard.
This description includes ARM-specific details that are left open by the standard. It is only intended as an
introduction to the formats and to the values they can contain. For full details, especially of the handling of
infinities, NaNs and signed zeros, see the IEEE 754 standard.
A single-precision value is a 32-bit word, and must be word-aligned when held in memory. It has the format:
    Bit [31]       S (sign)
    Bits [30:23]   exponent
    Bits [22:0]    fraction
The interpretation of the format depends on the value of the exponent field, bits [30:23]:
0 < exponent < 0xFF
    The value is a normalized number and is equal to:
    $(-1)^{S} \times 2^{\text{exponent} - 127} \times (1.\text{fraction})$
    The minimum positive normalized number is $2^{-126}$, or approximately $1.175 \times 10^{-38}$.
    The maximum positive normalized number is $(2 - 2^{-23}) \times 2^{127}$, or approximately
    $3.403 \times 10^{38}$.
exponent == 0
    The value is either a zero or a denormalized number, depending on the fraction bits:
    fraction == 0
        The value is a zero. There are two distinct zeros:
        +0 when S==0
        –0 when S==1.
        These usually behave identically. In particular, the result is equal if +0 and –0
        are compared as floating-point numbers. However, they yield different results in
        some circumstances. For example, the sign of the infinity produced as the result
        of dividing by zero depends on the sign of the zero. The two zeros can be
        distinguished from each other by performing an integer comparison of the two
        words.
    fraction != 0
        The value is a denormalized number and is equal to:
        $(-1)^{S} \times 2^{-126} \times (0.\text{fraction})$
        The minimum positive denormalized number is $2^{-149}$, or approximately $1.401 \times 10^{-45}$.
    Denormalized numbers are flushed to zero in the Advanced SIMD extension. They are
    optionally flushed to zero in the VFP extension. For details see Flush-to-zero on
    page A2-39.
exponent == 0xFF
    The value is either an infinity or a Not a Number (NaN), depending on the fraction bits:
    fraction == 0
        The value is an infinity. There are two distinct infinities:
        +∞ When S==0. This represents all positive numbers that are too big to
           be represented accurately as a normalized number.
        –∞ When S==1. This represents all negative numbers with an absolute
           value that is too big to be represented accurately as a normalized
           number.
    fraction != 0
        The value is a NaN, and is either a quiet NaN or a signaling NaN.
        In the VFP architecture, the two types of NaN are distinguished on the basis of
        their most significant fraction bit, bit [22]:
        bit [22] == 0
            The NaN is a signaling NaN. The sign bit can take any value, and
            the remaining fraction bits can take any value except all zeros.
        bit [22] == 1
            The NaN is a quiet NaN. The sign bit and remaining fraction bits
            can take any value.
For details of the default NaN see NaN handling and the Default NaN on page A2-41.
Note
NaNs with different sign or fraction bits are distinct NaNs, but this does not mean you can use floating-point
comparison instructions to distinguish them. This is because the IEEE 754 standard specifies that a NaN
compares as unordered with everything, including itself. However, you can use integer comparisons to
distinguish different NaNs.
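The single-precision interpretation rules above can be sketched as a classifier that works on the raw 32-bit word. This is an illustrative Python model, and `classify_single` is a hypothetical name. It uses integer field extraction, as the Note suggests, so that different NaNs and the two zeros stay distinguishable.

```python
def classify_single(word):
    """Classify a 32-bit single-precision bit pattern.

    Returns (kind, value), where value is a Python float for zeros,
    numbers and infinities, and None for NaNs.
    """
    assert 0 <= word < 1 << 32
    sign = word >> 31
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF

    if exponent == 0xFF:
        if fraction == 0:
            return ("infinity", float("-inf") if sign else float("inf"))
        # Bit [22] is the most significant fraction bit.
        kind = "quiet NaN" if fraction & (1 << 22) else "signaling NaN"
        return (kind, None)

    if exponent == 0:
        if fraction == 0:
            return ("zero", -0.0 if sign else 0.0)
        magnitude = 2.0 ** -126 * (fraction / 2 ** 23)    # 0.fraction
        return ("denormalized", -magnitude if sign else magnitude)

    magnitude = 2.0 ** (exponent - 127) * (1 + fraction / 2 ** 23)  # 1.fraction
    return ("normalized", -magnitude if sign else magnitude)
```

For instance, `classify_single(0x3F800000)` yields a normalized 1.0, and `classify_single(0x7F800001)` is reported as a signaling NaN because bit [22] is 0 and the remaining fraction bits are not all zero.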
A2.7.3 VFP double-precision format
The double-precision floating-point format used by the VFP extension is as defined by the IEEE 754
standard.
This description includes VFP-specific details that are left open by the standard. It is only intended as an
introduction to the formats and to the values they can contain. For full details, especially of the handling of
infinities, NaNs and signed zeros, see the IEEE 754 standard.
A double-precision value consists of two 32-bit words, with the formats:
Most significant word:
    Bit [31]       S (sign)
    Bits [30:20]   exponent
    Bits [19:0]    fraction[51:32]
Least significant word:
    Bits [31:0]    fraction[31:0]
When held in memory, the two words must appear consecutively and must both be word-aligned. The order
of the two words depends on the endianness of the memory system:
• In a little-endian memory system, the least significant word appears at the lower memory address and
  the most significant word at the higher memory address.
• In a big-endian memory system, the most significant word appears at the lower memory address and
  the least significant word at the higher memory address.
Double-precision values represent numbers, infinities and NaNs in a similar way to single-precision values,
with the interpretation of the format depending on the value of the exponent:
0 < exponent < 0x7FF
    The value is a normalized number and is equal to:
    $(-1)^{S} \times 2^{\text{exponent} - 1023} \times (1.\text{fraction})$
    The minimum positive normalized number is $2^{-1022}$, or approximately $2.225 \times 10^{-308}$.
    The maximum positive normalized number is $(2 - 2^{-52}) \times 2^{1023}$, or approximately
    $1.798 \times 10^{308}$.
exponent == 0
    The value is either a zero or a denormalized number, depending on the fraction bits:
    fraction == 0
        The value is a zero. There are two distinct zeros that behave analogously to the
        two single-precision zeros:
        +0 when S==0
        –0 when S==1.
    fraction != 0
        The value is a denormalized number and is equal to:
        $(-1)^{S} \times 2^{-1022} \times (0.\text{fraction})$
        The minimum positive denormalized number is $2^{-1074}$, or approximately $4.941 \times 10^{-324}$.
    Optionally, denormalized numbers are flushed to zero in the VFP extension. For details see
    Flush-to-zero on page A2-39.
exponent == 0x7FF
    The value is either an infinity or a NaN, depending on the fraction bits:
    fraction == 0
        The value is an infinity. As for single-precision, there are two infinities:
        +∞ Plus infinity, when S==0
        –∞ Minus infinity, when S==1.
    fraction != 0
        The value is a NaN, and is either a quiet NaN or a signaling NaN.
        In the VFP architecture, the two types of NaN are distinguished on the basis of
        their most significant fraction bit, bit [19] of the most significant word:
        bit [19] == 0
            The NaN is a signaling NaN. The sign bit can take any value, and
            the remaining fraction bits can take any value except all zeros.
        bit [19] == 1
            The NaN is a quiet NaN. The sign bit and the remaining fraction bits
            can take any value.
For details of the default NaN see NaN handling and the Default NaN on page A2-41.
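The word order rule for double-precision values in memory can be illustrated with Python's `struct` module. This is a sketch with hypothetical helper names. It assumes that the bytes within each word follow the same endianness as the memory system.

```python
import struct


def double_words(value):
    """Return (most significant word, least significant word) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    return bits >> 32, bits & 0xFFFFFFFF


def memory_words(value, little_endian):
    """Return the two 32-bit words in order of increasing memory address."""
    raw = struct.pack("<d" if little_endian else ">d", value)
    return struct.unpack("<II" if little_endian else ">II", raw)


def decode_msw(msw):
    """Split the most significant word into (S, exponent, fraction[51:32])."""
    return msw >> 31, (msw >> 20) & 0x7FF, msw & 0xFFFFF
```

With the value 0.1, `double_words()` gives the pair `(0x3FB99999, 0x9999999A)`. `memory_words(0.1, True)` returns the least significant word first, and `memory_words(0.1, False)` returns the most significant word first.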
A2.7.4 Advanced SIMD and VFP half-precision formats
Two half-precision floating-point formats are used by the half-precision extensions to Advanced SIMD and
VFP:
• IEEE half-precision, as described in the revised IEEE 754 standard
• Alternative half-precision.
The description of IEEE half-precision includes ARM-specific details that are left open by the standard, and
is only an introduction to the formats and to the values they can contain. For more information, especially
on the handling of infinities, NaNs and signed zeros, see the IEEE 754 standard.
For both half-precision floating-point formats, the layout of the 16-bit number is the same. The format is:
    Bit [15]       S (sign)
    Bits [14:10]   exponent
    Bits [9:0]     fraction
The interpretation of the format depends on the value of the exponent field, bits [14:10], and on which
half-precision format is being used.
0 < exponent < 0x1F
    The value is a normalized number and is equal to:
    $(-1)^{S} \times 2^{\text{exponent} - 15} \times (1.\text{fraction})$
    The minimum positive normalized number is $2^{-14}$, or approximately $6.104 \times 10^{-5}$.
    The maximum positive normalized number is $(2 - 2^{-10}) \times 2^{15}$, or 65504.
    Larger normalized numbers can be expressed using the alternative format when the
    exponent == 0x1F.
exponent == 0
    The value is either a zero or a denormalized number, depending on the fraction bits:
    fraction == 0
        The value is a zero. There are two distinct zeros:
        +0 when S==0
        –0 when S==1.
    fraction != 0
        The value is a denormalized number and is equal to:
        $(-1)^{S} \times 2^{-14} \times (0.\text{fraction})$
        The minimum positive denormalized number is $2^{-24}$, or approximately $5.960 \times 10^{-8}$.
exponent == 0x1F
    The value depends on which half-precision format is being used:
    IEEE Half-precision
        The value is either an infinity or a Not a Number (NaN), depending on the
        fraction bits:
        fraction == 0
            The value is an infinity. There are two distinct infinities:
            +∞ When S==0. This represents all positive numbers that are too big to be
               represented accurately as a normalized number.
            –∞ When S==1. This represents all negative numbers with an absolute value
               that is too big to be represented accurately as a normalized number.
        fraction != 0
            The value is a NaN, and is either a quiet NaN or a signaling NaN.
            The two types of NaN are distinguished by their most significant
            fraction bit, bit [9]:
            bit [9] == 0
                The NaN is a signaling NaN. The sign bit can take any value, and the
                remaining fraction bits can take any value except all zeros.
            bit [9] == 1
                The NaN is a quiet NaN. The sign bit and remaining fraction bits can
                take any value.
    Alternative Half-precision
        The value is a normalized number and is equal to:
        $(-1)^{S} \times 2^{16} \times (1.\text{fraction})$
        The maximum positive normalized number is $(2 - 2^{-10}) \times 2^{16}$, or 131008.
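The difference between the two half-precision formats shows up only when the exponent field is 0x1F. The following Python sketch decodes a 16-bit pattern under either interpretation. The function name `decode_half` and its `alternative` parameter are hypothetical, and the sketch returns a single Python NaN for all NaN encodings rather than preserving the quiet or signaling distinction.

```python
def decode_half(bits, alternative=False):
    """Decode a 16-bit half-precision pattern to a Python float.

    alternative=False selects IEEE half-precision, alternative=True selects
    the alternative half-precision format.
    """
    assert 0 <= bits < 1 << 16
    sign = -1.0 if bits >> 15 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF

    if exponent == 0:
        # Zero or denormalized number: 2^-14 x (0.fraction).
        return sign * 2.0 ** -14 * (fraction / 1024)

    if exponent == 0x1F and not alternative:
        if fraction == 0:
            return sign * float("inf")
        return float("nan")  # quiet if bit [9] is 1, otherwise signaling

    # Normalized number. In the alternative format this includes
    # exponent == 0x1F, giving 2^16 x (1.fraction).
    return sign * 2.0 ** (exponent - 15) * (1 + fraction / 1024)
```

Under these assumptions `decode_half(0x7BFF)` is 65504.0 in both formats, while `decode_half(0x7FFF, alternative=True)` is 131008.0 and the same pattern in IEEE format is a NaN.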
A2.7.5 Flush-to-zero
The performance of floating-point implementations can be significantly reduced when performing
calculations involving denormalized numbers and Underflow exceptions. In particular this occurs for
implementations that only handle normalized numbers and zeros in hardware, and invoke support code to
handle any other types of value. For an algorithm where a significant number of the operands and
intermediate results are denormalized numbers, this can result in a considerable loss of performance.
In many of these algorithms, this performance can be recovered, without significantly affecting the accuracy
of the final result, by replacing the denormalized operands and intermediate results with zeros. To permit
this optimization, VFP implementations have a special processing mode called Flush-to-zero mode.
Advanced SIMD implementations always use Flush-to-zero mode.
Behavior in Flush-to-zero mode differs from normal IEEE 754 arithmetic in the following ways:
• All inputs to floating-point operations that are double-precision denormalized numbers or
  single-precision denormalized numbers are treated as though they were zero. This causes an Input
  Denormal exception, but does not cause an Inexact exception. The Input Denormal exception occurs
  only in Flush-to-zero mode.
  The FPSCR contains a cumulative exception bit FPSCR.IDC and trap enable bit FPSCR.IDE
  corresponding to the Input Denormal exception. For details of how these are used when processing
  the exception see Advanced SIMD and VFP system registers on page A2-28.
  The occurrence of all exceptions except Input Denormal is determined using the input values after
  flush-to-zero processing has occurred.
• The result of a floating-point operation is flushed to zero if the result of the operation before rounding
  satisfies the condition:
  0 < Abs(result) < MinNorm, where:
  - MinNorm == $2^{-126}$ for single-precision
  - MinNorm == $2^{-1022}$ for double-precision.
  This causes the FPSCR.UFC bit to be set to 1, and prevents any Inexact exception from occurring for
  the operation.
  Underflow exceptions occur only when a result is flushed to zero.
  In a VFPv2 or VFPv3U implementation Underflow exceptions that occur in Flush-to-zero mode are
  always treated as untrapped, even when the Underflow trap enable bit, FPSCR.UFE, is set to 1.
• An Inexact exception does not occur if the result is flushed to zero, even though the final result of
  zero is not equivalent to the value that would be produced if the operation were performed with
  unbounded precision and exponent range.
For information on the FPSCR bits see Floating-point Status and Control Register (FPSCR) on page A2-28.
When an input or a result is flushed to zero the value of the sign bit of the zero is determined as follows:
• In VFPv3 or VFPv3U, it is preserved. That is, the sign bit of the zero matches the sign bit of the input
  or result that is being flushed to zero.
• In VFPv2, it is IMPLEMENTATION DEFINED whether it is preserved or always positive. The same
  choice must be made for all cases of flushing an input or result to zero.
Flush-to-zero mode has no effect on half-precision numbers that are inputs to floating-point operations, or
results from floating-point operations.
Note
Flush-to-zero mode is incompatible with the IEEE 754 standard, and must not be used when IEEE 754
compatibility is a requirement. Flush-to-zero mode must be treated with care. Although it can lead to a major
performance increase on many algorithms, there are significant limitations on its use. These are application
dependent:
• On many algorithms, it has no noticeable effect, because the algorithm does not normally use
  denormalized numbers.
• On other algorithms, it can cause exceptions to occur or seriously reduce the accuracy of the results
  of the algorithm.
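The input and result flushing rules of this section can be sketched for single-precision as follows. This is an illustrative Python model with hypothetical function names. It assumes untrapped exception handling and the VFPv3 behavior of preserving the sign of the flushed value, and it represents the unrounded result as a Python float.

```python
import math

FZ_BIT = 1 << 24    # FPSCR.FZ
IDC_BIT = 1 << 7    # FPSCR.IDC
UFC_BIT = 1 << 3    # FPSCR.UFC
MIN_NORM_SINGLE = 2.0 ** -126


def flush_input_single(word, fpscr):
    """Flush a denormalized single-precision input word to zero.

    Returns (word, fpscr). Outside Flush-to-zero mode nothing changes.
    """
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if (fpscr & FZ_BIT) and exponent == 0 and fraction != 0:
        word &= 0x80000000   # keep only the sign bit (assumed VFPv3 behavior)
        fpscr |= IDC_BIT     # Input Denormal recorded, no Inexact
    return word, fpscr


def flush_result_single(unrounded, fpscr):
    """Flush an unrounded result to zero if 0 < Abs(result) < MinNorm.

    Returns (result, fpscr). A flushed result sets UFC and never IXC.
    """
    if (fpscr & FZ_BIT) and 0 < abs(unrounded) < MIN_NORM_SINGLE:
        return math.copysign(0.0, unrounded), fpscr | UFC_BIT
    return unrounded, fpscr
```

For example, with FZ set, the input word 0x80000001 becomes 0x80000000 and IDC is set, while an unrounded result of magnitude 2^-130 becomes a zero of the same sign and UFC is set.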
A2.7.6 NaN handling and the Default NaN
The IEEE 754 standard specifies that:
• an operation that produces an Invalid Operation floating-point exception generates a quiet NaN as its
  result if that exception is untrapped
• an operation involving a quiet NaN operand, but not a signaling NaN operand, returns an input NaN
  as its result.
The VFP behavior when Default NaN mode is disabled adheres to this with the following extra details,
where the first operand means the first argument to the pseudocode function call that describes the
operation:
• If an untrapped Invalid Operation floating-point exception is produced because one of the operands
  is a signaling NaN, the quiet NaN result is equal to the signaling NaN with its most significant
  fraction bit changed to 1. If both operands are signaling NaNs, the result is produced in this way from
  the first operand.
• If an untrapped Invalid Operation floating-point exception is produced for other reasons, the quiet
  NaN result is the Default NaN.
• If both operands are quiet NaNs, the result is the first operand.
The VFP behavior when Default NaN mode is enabled, and the Advanced SIMD behavior in all
circumstances, is that the Default NaN is the result of all floating-point operations that:
• generate untrapped Invalid Operation floating-point exceptions
• have one or more quiet NaN inputs.
Table A2-6 on page A2-42 shows the format of the default NaN for ARM floating-point processors.
Default NaN mode is selected for VFP by setting the FPSCR.DN bit to 1, see Floating-point Status and
Control Register (FPSCR) on page A2-28.
Other aspects of the functionality of the Invalid Operation exception are not affected by Default NaN mode.
These are that:
• If untrapped, it causes the FPSCR.IOC bit to be set to 1.
• If trapped, it causes a user trap handler to be invoked. This is only possible in VFPv2 and VFPv3U.
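The NaN propagation rules of this section can be sketched for a two-operand single-precision operation. This is an illustrative Python model, not the architecture's pseudocode: the names `nan_kind` and `process_nans` are hypothetical, exception handling is assumed to be untrapped, and the Default NaN sign bit is taken as 0 (in VFPv2 it is UNKNOWN).

```python
DEFAULT_NAN = 0x7FC00000   # sign 0, exponent 0xFF, bit [22] == 1, bits [21:0] == 0
QUIET_BIT = 1 << 22


def nan_kind(word):
    """Return 'snan', 'qnan' or None for a single-precision bit pattern."""
    if (word >> 23) & 0xFF != 0xFF or word & 0x7FFFFF == 0:
        return None
    return "qnan" if word & QUIET_BIT else "snan"


def process_nans(op1, op2, default_nan_mode):
    """Apply the NaN rules to two operands.

    Returns (handled, result, invalid_operation). handled is False when
    neither operand is a NaN, in which case the operation proceeds normally.
    """
    kind1, kind2 = nan_kind(op1), nan_kind(op2)
    if kind1 is None and kind2 is None:
        return False, None, False

    invalid = "snan" in (kind1, kind2)   # a signaling NaN raises Invalid Operation
    if default_nan_mode:
        return True, DEFAULT_NAN, invalid

    if kind1 == "snan":
        return True, op1 | QUIET_BIT, True   # quieted copy of the first operand
    if kind2 == "snan":
        return True, op2 | QUIET_BIT, True
    if kind1 == "qnan":
        return True, op1, False              # first quiet NaN operand wins
    return True, op2, False
```

With Default NaN mode disabled, two signaling NaN operands produce the first operand with bit [22] set. With it enabled, any NaN input produces 0x7FC00000.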
A2.7.7 Floating-point exceptions
The Advanced SIMD and VFP extensions record the following floating-point exceptions in the FPSCR
cumulative flags, see Floating-point Status and Control Register (FPSCR) on page A2-28:
IOC Invalid Operation. The flag is set to 1 if the result of an operation has no mathematical value
or cannot be represented. Examples include infinity × 0 and (+infinity) + (–infinity).
These tests are made after flush-to-zero processing. For example, if flush-to-zero mode is
selected, multiplying a denormalized number and an infinity is treated as 0 * infinity and
causes an Invalid Operation floating-point exception.
IOC is also set on any floating-point operation with one or more signaling NaNs as
operands, except for negation and absolute value, as described in Negation and absolute
value on page A2-47.
DZC Division by Zero. The flag is set to 1 if a divide operation has a zero divisor and a dividend
that is not zero, an infinity or a NaN. These tests are made after flush-to-zero processing, so
if flush-to-zero processing is selected, a denormalized dividend is treated as zero and
prevents Division by Zero from occurring, and a denormalized divisor is treated as zero and
causes Division by Zero to occur if the dividend is a normalized number.
For the reciprocal and reciprocal square root estimate functions the dividend is assumed to
be +1.0. This means that a zero or denormalized operand to these functions sets the DZC
flag.
OFC Overflow. The flag is set to 1 if the absolute value of the result of an operation, produced
after rounding, is greater than the maximum positive normalized number for the destination
precision.
UFC Underflow. The flag is set to 1 if the absolute value of the result of an operation, produced
before rounding, is less than the minimum positive normalized number for the destination
precision, and the rounded result is inexact.
The criteria for the Underflow exception to occur are different in Flush-to-zero mode. For
details, see Flush-to-zero on page A2-39.
Table A2-6 Default NaN encoding
            Half-precision, IEEE format       Single-precision                   Double-precision
Sign bit    0                                 0 (a)                              0 (a)
Exponent    0x1F                              0xFF                               0x7FF
Fraction    bit [9] == 1, bits [8:0] == 0     bit [22] == 1, bits [21:0] == 0    bit [51] == 1, bits [50:0] == 0
(a) In VFPv2, the sign bit of the Default NaN is UNKNOWN.
IXC Inexact. The flag is set to 1 if the result of an operation is not equivalent to the value that
would be produced if the operation were performed with unbounded precision and exponent
range.
The criteria for the Inexact exception to occur are different in Flush-to-zero mode. For
details, see Flush-to-zero on page A2-39.
IDC Input Denormal. The flag is set to 1 if a denormalized input operand is replaced in the
computation by a zero, as described in Flush-to-zero on page A2-39.
With the Advanced SIMD extension and the VFPv3 extension these are non-trapping exceptions and the
data-processing instructions do not generate any trapped exceptions.
With the VFPv2 and VFPv3U extensions:
• These exceptions can be trapped, by setting trap enable flags in the FPSCR, see VFPv3U on
  page A2-31. Trapped floating-point exceptions are delivered to user code in an IMPLEMENTATION
  DEFINED fashion.
• The definitions of the floating-point exceptions change as follows:
  - if the Underflow exception is trapped, it occurs if the absolute value of the result of an
    operation, produced before rounding, is less than the minimum positive normalized number
    for the destination precision, regardless of whether the rounded result is inexact
  - higher priority trapped exceptions can prevent lower priority exceptions from occurring, as
    described in Combinations of exceptions on page A2-44.
Table A2-7 shows the default results of the floating-point exceptions:
Table A2-7 Floating-point exception default results
Exception type            Default result for positive sign    Default result for negative sign
IOC, Invalid Operation    Quiet NaN                           Quiet NaN
DZC, Division by Zero     +∞ (plus infinity)                  –∞ (minus infinity)
OFC, Overflow             RN, RP: +∞ (plus infinity)          RN, RM: –∞ (minus infinity)
                          RM, RZ: +MaxNorm                    RP, RZ: –MaxNorm
UFC, Underflow            Normal rounded result               Normal rounded result
IXC, Inexact              Normal rounded result               Normal rounded result
IDC, Input Denormal       Normal rounded result               Normal rounded result
In Table A2-7 on page A2-43:
MaxNorm    The maximum normalized number of the destination precision
RM         Round towards Minus Infinity mode, as defined in the IEEE 754 standard
RN         Round to Nearest mode, as defined in the IEEE 754 standard
RP         Round towards Plus Infinity mode, as defined in the IEEE 754 standard
RZ         Round towards Zero mode, as defined in the IEEE 754 standard
• For Invalid Operation exceptions, for details of which quiet NaN is produced as the default result see
  NaN handling and the Default NaN on page A2-41.
• For Division by Zero exceptions, the sign bit of the default result is determined normally for a
  division. This means it is the exclusive OR of the sign bits of the two operands.
• For Overflow exceptions, the sign bit of the default result is determined normally for the overflowing
  operation.
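The Overflow and Division by Zero rows of Table A2-7 can be sketched as lookup functions. This is an illustrative Python model with hypothetical names. Rounding modes are passed as the mnemonics RN, RP, RM and RZ, and single-precision MaxNorm is used as the default destination precision.

```python
import math

MAX_NORM_SINGLE = (2 - 2.0 ** -23) * 2.0 ** 127


def overflow_default(rounding_mode, negative, max_norm=MAX_NORM_SINGLE):
    """Return the default Overflow result for a rounding mode and result sign."""
    if rounding_mode not in ("RN", "RP", "RM", "RZ"):
        raise ValueError("unknown rounding mode: " + rounding_mode)
    if negative:
        # RN and RM give minus infinity, RP and RZ give -MaxNorm.
        return -math.inf if rounding_mode in ("RN", "RM") else -max_norm
    # RN and RP give plus infinity, RM and RZ give +MaxNorm.
    return math.inf if rounding_mode in ("RN", "RP") else max_norm


def division_by_zero_default(dividend_sign, divisor_sign):
    """Return the default Division by Zero result.

    The sign is the exclusive OR of the two operand sign bits (0 or 1).
    """
    return -math.inf if dividend_sign ^ divisor_sign else math.inf
```

For example, an overflowing positive result in RZ mode yields +MaxNorm, whereas the same overflow in RN mode yields plus infinity.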
Combinations of exceptions
The following pseudocode functions perform floating-point operations:
FixedToFP()
FPAbs()
FPAdd()
FPCompare()
FPCompareGE()
FPCompareGT()
FPDiv()
FPDoubleToSingle()
FPMax()
FPMin()
FPMul()
FPNeg()
FPRecipEstimate()
FPRecipStep()
FPRSqrtEstimate()
FPRSqrtStep()
FPSingleToDouble()
FPSqrt()
FPSub()
FPToFixed()
All of these operations except FPAbs() and FPNeg() can generate floating-point exceptions.
More than one exception can occur on the same operation. The only combinations of exceptions that can
occur are:
• Overflow with Inexact
• Underflow with Inexact
• Input Denormal with other exceptions.
When none of the exceptions caused by an operation are trapped, any exception that occurs causes the
associated cumulative flag in the FPSCR to be set.
When one or more exceptions caused by an operation are trapped, the behavior of the instruction depends
on the priority of the exceptions. The Inexact exception is treated as lowest priority, and Input Denormal as
highest priority:
• If the higher priority exception is trapped, its trap handler is called. It is IMPLEMENTATION DEFINED
  whether the parameters to the trap handler include information about the lower priority exception.
  Apart from this, the lower priority exception is ignored in this case.
• If the higher priority exception is untrapped, its cumulative bit is set to 1 and its default result is
  evaluated. Then the lower priority exception is handled normally, using this default result.
Some floating-point instructions specify more than one floating-point operation, as indicated by the
pseudocode descriptions of the instruction. In such cases, an exception on one operation is treated as higher
priority than an exception on another operation if the occurrence of the second exception depends on the
result of the first operation. Otherwise, it is UNPREDICTABLE which exception is treated as higher priority.
For example, a VMLA.F32 instruction specifies a floating-point multiplication followed by a floating-point
addition. The addition can generate Overflow, Underflow and Inexact exceptions, all of which depend on
both operands to the addition and so are treated as lower priority than any exception on the multiplication.
The same applies to Invalid Operation exceptions on the addition caused by adding opposite-signed
infinities.
The addition can also generate an Input Denormal exception, caused by the addend being a denormalized
number while in Flush-to-zero mode. It is UNPREDICTABLE which of an Input Denormal exception on the
addition and an exception on the multiplication is treated as higher priority, because the occurrence of the
Input Denormal exception does not depend on the result of the multiplication. The same applies to an Invalid
Operation exception on the addition caused by the addend being a signaling NaN.
Note
Like other details of VFP instruction execution, these rules about exception handling apply to the overall
results produced by an instruction when the system uses a combination of hardware and support code to
implement it. See VFP support code on page B1-70 for more information.
These principles also apply to the multiple floating-point operations generated by VFP instructions in the
deprecated VFP vector mode of operation. For details of this mode of operation see Appendix F VFP Vector
Operation Support.
Application Level Programmers’ Model
A2-46 Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved. ARM DDI 0406B
A2.7.8 Pseudocode details of floating-point operations
This section contains pseudocode definitions of the floating-point operations used by the architecture.
Generation of specific floating-point values
The following pseudocode functions generate specific floating-point values. The sign argument of
FPInfinity(), FPMaxNormal(), and FPZero() is '0' for the positive version and '1' for the negative version.
// FPZero()
// ========
bits(N) FPZero(bit sign, integer N)
    assert N == 16 || N == 32 || N == 64;
    if N == 16 then
        return sign : '00000 0000000000';
    elsif N == 32 then
        return sign : '00000000 00000000000000000000000';
    else
        return sign : '00000000000 0000000000000000000000000000000000000000000000000000';
// FPTwo()
// =======
bits(N) FPTwo(integer N)
    assert N == 32 || N == 64;
    if N == 32 then
        return '0 10000000 00000000000000000000000';
    else
        return '0 10000000000 0000000000000000000000000000000000000000000000000000';
// FPThree()
// =========
bits(N) FPThree(integer N)
    assert N == 32 || N == 64;
    if N == 32 then
        return '0 10000000 10000000000000000000000';
    else
        return '0 10000000000 1000000000000000000000000000000000000000000000000000';
// FPMaxNormal()
// =============
bits(N) FPMaxNormal(bit sign, integer N)
    assert N == 16 || N == 32 || N == 64;
    if N == 16 then
        return sign : '11110 1111111111';
    elsif N == 32 then
        return sign : '11111110 11111111111111111111111';
    else
        return sign : '11111111110 1111111111111111111111111111111111111111111111111111';
// FPInfinity()
// ============
bits(N) FPInfinity(bit sign, integer N)
    assert N == 16 || N == 32 || N == 64;
    if N == 16 then
        return sign : '11111 0000000000';
    elsif N == 32 then
        return sign : '11111111 00000000000000000000000';
    else
        return sign : '11111111111 0000000000000000000000000000000000000000000000000000';
// FPDefaultNaN()
// ==============
bits(N) FPDefaultNaN(integer N)
    assert N == 16 || N == 32 || N == 64;
    if N == 16 then
        return '0 11111 1000000000';
    elsif N == 32 then
        return '0 11111111 10000000000000000000000';
    else
        return '0 11111111111 1000000000000000000000000000000000000000000000000000';
Note
This definition of FPDefaultNaN() applies to VFPv3 and VFPv3U. For VFPv2, the sign bit of the result is a
single-bit UNKNOWN value, instead of 0.
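As an illustration only, the single-precision (N == 32) cases of the value-generation functions can be
sketched in C as shown here. The fp32_* names and the fp32_pack() helper are hypothetical and are not
part of the architecture. The default NaN shown is the VFPv3 form with a zero sign bit.

```c
#include <stdint.h>

/* Single-precision layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
   All names here are illustrative and are not defined by the architecture. */
static uint32_t fp32_pack(uint32_t sign, uint32_t exp, uint32_t frac)
{
    return ((sign & 1u) << 31) | ((exp & 0xFFu) << 23) | (frac & 0x7FFFFFu);
}

static uint32_t fp32_zero(uint32_t sign)       { return fp32_pack(sign, 0x00u, 0u); }
static uint32_t fp32_two(void)                 { return fp32_pack(0u, 0x80u, 0u); }
static uint32_t fp32_three(void)               { return fp32_pack(0u, 0x80u, 0x400000u); }
static uint32_t fp32_max_normal(uint32_t sign) { return fp32_pack(sign, 0xFEu, 0x7FFFFFu); }
static uint32_t fp32_infinity(uint32_t sign)   { return fp32_pack(sign, 0xFFu, 0u); }

/* VFPv3 form: sign bit 0, all-ones exponent, top fraction bit set. */
static uint32_t fp32_default_nan(void)         { return fp32_pack(0u, 0xFFu, 0x400000u); }
```

Each helper returns the 32-bit encoding, so fp32_infinity(1) gives the bit pattern for minus infinity.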
Negation and absolute value
The floating-point negation and absolute value operations only affect the sign bit. They do not treat NaN
operands specially, nor denormalized number operands when flush-to-zero is selected.
// FPNeg()
// =======
bits(N) FPNeg(bits(N) operand)
    assert N == 32 || N == 64;
    return NOT(operand<N-1>) : operand<N-2:0>;
// FPAbs()
// =======
bits(N) FPAbs(bits(N) operand)
    assert N == 32 || N == 64;
    return '0' : operand<N-2:0>;
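Because negation and absolute value only touch the sign bit, a single-precision equivalent reduces to one
bitwise operation each. The following C sketch uses hypothetical function names and assumes the operand
is held as a raw 32-bit encoding.

```c
#include <stdint.h>

/* Illustrative helpers (names are not architectural). Both operate purely on
   the sign bit, so NaNs and denormalized inputs pass through otherwise unchanged. */
uint32_t fp32_neg(uint32_t operand) { return operand ^ 0x80000000u; }
uint32_t fp32_abs(uint32_t operand) { return operand & 0x7FFFFFFFu; }
```

Applying fp32_neg() to a NaN encoding changes only its sign bit and leaves the fraction untouched.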
Floating-point value unpacking
The FPUnpack() function determines the type and numerical value of a floating-point number. It also does
flush-to-zero processing on input operands.
enumeration FPType {FPType_Nonzero, FPType_Zero, FPType_Infinity, FPType_QNaN, FPType_SNaN};
// FPUnpack()
// ==========
//
// Unpack a floating-point number into its type, sign bit and the real number
// that it represents. The real number result has the correct sign for numbers
// and infinities, is very large in magnitude for infinities, and is 0.0 for
// NaNs. (These values are chosen to simplify the description of comparisons
// and conversions.)
//
// The 'fpscr_val' argument supplies FPSCR control bits. Status information is
// updated directly in the FPSCR where appropriate.
(FPType, bit, real) FPUnpack(bits(N) fpval, bits(32) fpscr_val)
    assert N == 16 || N == 32 || N == 64;
    if N == 16 then
        sign = fpval<15>;
        exp = fpval<14:10>;
        frac = fpval<9:0>;
        if IsZero(exp) then
            // Produce zero if value is zero
            if IsZero(frac) then
                type = FPType_Zero; value = 0.0;
            else
                type = FPType_Nonzero; value = 2^-14 * (UInt(frac) * 2^-10);
        elsif IsOnes(exp) && fpscr_val<26> == '0' then // Infinity or NaN in IEEE format
            if IsZero(frac) then
                type = FPType_Infinity; value = 2^1000000;
            else
                type = if frac<9> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero; value = 2^(UInt(exp)-15) * (1.0 + UInt(frac) * 2^-10);
    elsif N == 32 then
        sign = fpval<31>;
        exp = fpval<30:23>;
        frac = fpval<22:0>;
        if IsZero(exp) then
            // Produce zero if value is zero or flush-to-zero is selected.
            if IsZero(frac) || fpscr_val<24> == '1' then
                type = FPType_Zero; value = 0.0;
                if !IsZero(frac) then // Denormalized input flushed to zero
                    FPProcessException(FPExc_InputDenorm, fpscr_val);
            else
                type = FPType_Nonzero; value = 2^-126 * (UInt(frac) * 2^-23);
        elsif IsOnes(exp) then
            if IsZero(frac) then
                type = FPType_Infinity; value = 2^1000000;
            else
                type = if frac<22> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero; value = 2^(UInt(exp)-127) * (1.0 + UInt(frac) * 2^-23);
    else // N == 64
        sign = fpval<63>;
        exp = fpval<62:52>;
        frac = fpval<51:0>;
        if IsZero(exp) then
            // Produce zero if value is zero or flush-to-zero is selected.
            if IsZero(frac) || fpscr_val<24> == '1' then
                type = FPType_Zero; value = 0.0;
                if !IsZero(frac) then // Denormalized input flushed to zero
                    FPProcessException(FPExc_InputDenorm, fpscr_val);
            else
                type = FPType_Nonzero; value = 2^-1022 * (UInt(frac) * 2^-52);
        elsif IsOnes(exp) then
            if IsZero(frac) then
                type = FPType_Infinity; value = 2^1000000;
            else
                type = if frac<51> == '1' then FPType_QNaN else FPType_SNaN;
                value = 0.0;
        else
            type = FPType_Nonzero; value = 2^(UInt(exp)-1023) * (1.0 + UInt(frac) * 2^-52);
    if sign == '1' then value = -value;
    return (type, sign, value);
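The single-precision path of FPUnpack() might be modelled in C roughly as follows. This is a sketch under
stated assumptions, not a reference implementation: the type and function names are hypothetical, the
flush-to-zero control is passed as a boolean instead of being read from FPSCR bit 24, the Input Denormal
exception is reported through a struct field instead of FPProcessException(), and a large finite double
stands in for the 2^1000000 magnitude used for infinities.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum { FPT_NONZERO, FPT_ZERO, FPT_INFINITY, FPT_QNAN, FPT_SNAN } fp_type;

typedef struct {
    fp_type type;
    unsigned sign;        /* copy of bit 31 */
    double value;         /* signed real value; 0.0 for NaNs */
    bool input_denorm;    /* set when a denormalized input was flushed to zero */
} fp32_unpacked;

/* Exact power of two, computed without the math library. */
static double pow2i(int e)
{
    double r = 1.0;
    for (; e > 0; e--) r *= 2.0;
    for (; e < 0; e++) r /= 2.0;
    return r;
}

fp32_unpacked fp32_unpack(uint32_t fpval, bool flush_to_zero)
{
    fp32_unpacked u = { FPT_ZERO, fpval >> 31, 0.0, false };
    uint32_t exp = (fpval >> 23) & 0xFFu;
    uint32_t frac = fpval & 0x7FFFFFu;

    if (exp == 0) {
        if (frac == 0 || flush_to_zero) {
            u.input_denorm = (frac != 0);      /* type stays FPT_ZERO */
        } else {
            u.type = FPT_NONZERO;              /* denormalized number */
            u.value = pow2i(-126) * ((double)frac * pow2i(-23));
        }
    } else if (exp == 0xFFu) {
        if (frac == 0) {
            u.type = FPT_INFINITY;
            u.value = 1e300;                   /* stand-in for 2^1000000 */
        } else {
            u.type = (frac & 0x400000u) ? FPT_QNAN : FPT_SNAN;
        }
    } else {
        u.type = FPT_NONZERO;                  /* normalized number */
        u.value = pow2i((int)exp - 127) * (1.0 + (double)frac * pow2i(-23));
    }
    if (u.sign) u.value = -u.value;
    return u;
}
```

With flush_to_zero set, the smallest denormalized encoding 0x00000001 unpacks as a zero with input_denorm
set. With it clear, the same encoding unpacks as the nonzero value 2^-149.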
Floating-point exception and NaN handling
The FPProcessException() procedure checks whether a floating-point exception is trapped, and handles it
accordingly:
enumeration FPExc {FPExc_InvalidOp, FPExc_DivideByZero, FPExc_Overflow,
                   FPExc_Underflow, FPExc_Inexact, FPExc_InputDenorm};
// FPProcessException()
// ====================
//
// The 'fpscr_val' argument supplies FPSCR control bits. Status information is
// updated directly in the FPSCR where appropriate.
FPProcessException(FPExc exception, bits(32) fpscr_val)
    // Get appropriate FPSCR bit numbers
    case exception of
        when FPExc_InvalidOp     enable = 8;   cumul = 0;
        when FPExc_DivideByZero  enable = 9;   cumul = 1;
        when FPExc_Overflow      enable = 10;  cumul = 2;
        when FPExc_Underflow     enable = 11;  cumul = 3;
        when FPExc_Inexact       enable = 12;  cumul = 4;
        when FPExc_InputDenorm   enable = 15;  cumul = 7;
    if fpscr_val<enable> == '1' then
        IMPLEMENTATION_DEFINED floating-point trap handling;
    else
        FPSCR<cumul> = '1';
    return;
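The enable and cumulative bit numbers listed in FPProcessException() follow a regular pattern: each trap
enable bit sits eight positions above its cumulative bit. A minimal C sketch of the untrapped path is
shown here, assuming the FPSCR is modelled as a plain 32-bit variable. The names are hypothetical, and
the trapped case is only reported to the caller because trap handling is IMPLEMENTATION DEFINED.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum {
    EXC_INVALID_OP, EXC_DIVIDE_BY_ZERO, EXC_OVERFLOW,
    EXC_UNDERFLOW, EXC_INEXACT, EXC_INPUT_DENORM
} fp_exc;

/* Cumulative-flag bit positions, indexed by fp_exc. */
static const unsigned cumul_bit[] = { 0, 1, 2, 3, 4, 7 };

/* Returns true if the exception is trapped (the caller would then run its
   IMPLEMENTATION DEFINED handler); otherwise sets the cumulative flag in *fpscr. */
bool fp_process_exception(fp_exc exc, uint32_t fpscr_val, uint32_t *fpscr)
{
    unsigned cumul = cumul_bit[exc];
    unsigned enable = cumul + 8;          /* 8, 9, 10, 11, 12 or 15 */

    if ((fpscr_val >> enable) & 1u)
        return true;
    *fpscr |= 1u << cumul;
    return false;
}
```

The control value is passed separately from the status register, mirroring the way the pseudocode takes
fpscr_val as an argument but updates status directly in the FPSCR.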
The FPProcessNaN() function processes a NaN operand, producing the correct result value and generating an
Invalid Operation exception if necessary:
// FPProcessNaN()
// ==============
//
// The 'fpscr_val' argument supplies FPSCR control bits. Status information is
// updated directly in the FPSCR where appropriate.
bits(N) FPProcessNaN(FPType type, bits(N) operand, bits(32) fpscr_val)
    assert N == 32 || N == 64;
    topfrac = if N == 32 then 22 else 51;
    result = operand;
    if type == FPType_SNaN then
        result<topfrac> = '1';
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    if fpscr_val<25> == '1' then // DefaultNaN requested
        result = FPDefaultNaN(N);
    return result;
The FPProcessNaNs() function performs the standard NaN processing for a two-operand operation:
// FPProcessNaNs()
// ===============
//
// The boolean part of the return value says whether a NaN has been found and
// processed. The bits(N) part is only relevant if it has and supplies the
// result of the operation.
//
// The 'fpscr_val' argument supplies FPSCR control bits. Status information is
// updated directly in the FPSCR where appropriate.
(boolean, bits(N)) FPProcessNaNs(FPType type1, FPType type2,
                                 bits(N) op1, bits(N) op2,
                                 bits(32) fpscr_val)
    assert N == 32 || N == 64;
    if type1 == FPType_SNaN then
        done = TRUE; result = FPProcessNaN(type1, op1, fpscr_val);
    elsif type2 == FPType_SNaN then
        done = TRUE; result = FPProcessNaN(type2, op2, fpscr_val);
    elsif type1 == FPType_QNaN then
        done = TRUE; result = FPProcessNaN(type1, op1, fpscr_val);
    elsif type2 == FPType_QNaN then
        done = TRUE; result = FPProcessNaN(type2, op2, fpscr_val);
    else
        done = FALSE; result = Zeros(N); // 'Don't care' result
    return (done, result);
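The operand selection order in FPProcessNaNs() (signaling NaNs before quiet NaNs, and the first operand
before the second within each class) can be sketched for single precision as follows. The function names
are hypothetical, Default NaN mode is passed as a boolean instead of FPSCR bit 25, and the Invalid
Operation exception is reported through an output flag.

```c
#include <stdint.h>
#include <stdbool.h>

static bool is_nan32(uint32_t v)
{
    return (v & 0x7F800000u) == 0x7F800000u && (v & 0x7FFFFFu) != 0;
}

static bool is_snan32(uint32_t v)
{
    return is_nan32(v) && (v & 0x400000u) == 0;
}

/* Quiet a single NaN operand; *invalid is set when the operand was signaling. */
static uint32_t process_nan32(uint32_t operand, bool default_nan, bool *invalid)
{
    if (is_snan32(operand)) *invalid = true;
    return default_nan ? 0x7FC00000u : (operand | 0x400000u);
}

/* Selection order: SNaN op1, SNaN op2, QNaN op1, QNaN op2.
   Returns false (leaving *result untouched) when neither operand is a NaN. */
bool process_nans32(uint32_t op1, uint32_t op2, bool default_nan,
                    uint32_t *result, bool *invalid)
{
    uint32_t chosen;

    if (is_snan32(op1))      chosen = op1;
    else if (is_snan32(op2)) chosen = op2;
    else if (is_nan32(op1))  chosen = op1;
    else if (is_nan32(op2))  chosen = op2;
    else return false;

    *result = process_nan32(chosen, default_nan, invalid);
    return true;
}
```

A signaling NaN in the second operand therefore takes precedence over a quiet NaN in the first, and its
quieted form becomes the result unless Default NaN mode is selected.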
Floating-point rounding
The FPRound() function rounds and encodes a floating-point result value to a specified destination format.
This includes processing Overflow, Underflow and Inexact floating-point exceptions and performing
flush-to-zero processing on result values.
// FPRound()
// =========
//
// The 'fpscr_val' argument supplies FPSCR control bits. Status information is
// updated directly in the FPSCR where appropriate.
bits(N) FPRound(real result, integer N, bits(32) fpscr_val)
    assert N == 16 || N == 32 || N == 64;
    assert result != 0.0;
    // Obtain format parameters - minimum exponent, numbers of exponent and fraction bits.
    if N == 16 then
        minimum_exp = -14; E = 5; F = 10;
    elsif N == 32 then
        minimum_exp = -126; E = 8; F = 23;
    else // N == 64
        minimum_exp = -1022; E = 11; F = 52;
    // Split value into sign, unrounded mantissa and exponent.
    if result < 0.0 then
        sign = '1'; mantissa = -result;
    else
        sign = '0'; mantissa = result;
    exponent = 0;
    while mantissa < 1.0 do
        mantissa = mantissa * 2.0; exponent = exponent - 1;
    while mantissa >= 2.0 do
        mantissa = mantissa / 2.0; exponent = exponent + 1;
    // Deal with flush-to-zero.
    if fpscr_val<24> == '1' && N != 16 && exponent < minimum_exp then
        result = FPZero(sign, N);
        FPSCR.UFC = '1'; // Flush-to-zero never generates a trapped exception
    else
        // Start creating the exponent value for the result. Start by biasing the actual exponent
        // so that the minimum exponent becomes 1, lower values 0 (indicating possible underflow).
        biased_exp = Max(exponent - minimum_exp + 1, 0);
        if biased_exp == 0 then mantissa = mantissa / 2^(minimum_exp - exponent);
        // Get the unrounded mantissa as an integer, and the "units in last place" rounding error.
        int_mant = RoundDown(mantissa * 2^F); // < 2^F if biased_exp == 0, >= 2^F if not
        error = mantissa * 2^F - int_mant;
        // Underflow occurs if exponent is too small before rounding, and result is inexact or
        // the Underflow exception is trapped.
        if biased_exp == 0 && (error != 0.0 || fpscr_val<11> == '1') then
            FPProcessException(FPExc_Underflow, fpscr_val);
        // Round result according to rounding mode.
        case fpscr_val<23:22> of
            when '00' // Round to Nearest (rounding to even if exactly halfway)
                round_up = (error > 0.5 || (error == 0.5 && int_mant<0> == '1'));
                overflow_to_inf = TRUE;
            when '01' // Round towards Plus Infinity
                round_up = (error != 0.0 && sign == '0');
                overflow_to_inf = (sign == '0');
            when '10' // Round towards Minus Infinity
                round_up = (error != 0.0 && sign == '1');
                overflow_to_inf = (sign == '1');
            when '11' // Round towards Zero
                round_up = FALSE;
                overflow_to_inf = FALSE;
        if round_up then
            int_mant = int_mant + 1;
            if int_mant == 2^F then // Rounded up from denormalized to normalized
                biased_exp = 1;
            if int_mant == 2^(F+1) then // Rounded up to next exponent
                biased_exp = biased_exp + 1; int_mant = int_mant DIV 2;
        // Deal with overflow and generate result.
        if N != 16 || fpscr_val<26> == '0' then // Single, double or IEEE half precision
            if biased_exp >= 2^E - 1 then
                result = if overflow_to_inf then FPInfinity(sign, N) else FPMaxNormal(sign, N);
                FPProcessException(FPExc_Overflow, fpscr_val);
            else
                result = sign : biased_exp<E-1:0> : int_mant<F-1:0>;
        else // Alternative half precision
            if biased_exp >= 2^E then
                result = sign : Ones(15);
                FPProcessException(FPExc_InvalidOp, fpscr_val);
                error = 0.0; // avoid an Inexact exception
            else
                result = sign : biased_exp<E-1:0> : int_mant<F-1:0>;
        // Deal with Inexact exception.
        if error != 0.0 then
            FPProcessException(FPExc_Inexact, fpscr_val);
    return result;
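The rounding-mode case analysis in FPRound() can be isolated into two small decision functions. The C
sketch here is illustrative only: the function and constant names are hypothetical, int_mant is the
truncated mantissa, and error is the discarded fraction of one unit in the last place (0.0 <= error < 1.0),
as in the pseudocode.

```c
#include <stdint.h>
#include <stdbool.h>

/* FPSCR<23:22> rounding-mode encodings used by FPRound(). */
enum { RMODE_NEAREST = 0, RMODE_PLUS_INF = 1, RMODE_MINUS_INF = 2, RMODE_ZERO = 3 };

/* Decide whether the truncated mantissa must be incremented. */
bool fp_round_up(unsigned rmode, unsigned sign, uint64_t int_mant, double error)
{
    switch (rmode & 3u) {
    case RMODE_NEAREST:   /* ties go to the even mantissa */
        return error > 0.5 || (error == 0.5 && (int_mant & 1u) != 0);
    case RMODE_PLUS_INF:
        return error != 0.0 && sign == 0;
    case RMODE_MINUS_INF:
        return error != 0.0 && sign == 1;
    default:              /* RMODE_ZERO: always truncate */
        return false;
    }
}

/* On overflow, the result is an infinity when this returns true, and the
   largest normal number of the same sign otherwise. */
bool fp_overflow_to_inf(unsigned rmode, unsigned sign)
{
    switch (rmode & 3u) {
    case RMODE_NEAREST:   return true;
    case RMODE_PLUS_INF:  return sign == 0;
    case RMODE_MINUS_INF: return sign == 1;
    default:              return false;
    }
}
```

The directed modes depend on the sign because rounding up the magnitude of a negative number moves it
towards minus infinity.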
Selection of ARM standard floating-point arithmetic
StandardFPSCRValue is an FPSCR value that selects ARM standard floating-point arithmetic. Most of the
arithmetic functions have a boolean fpscr_controlled argument that is TRUE for VFP operations and FALSE
for Advanced SIMD operations, and that selects between using the real FPSCR value and this value.
// StandardFPSCRValue()
// ====================
bits(32) StandardFPSCRValue()
    return '00000' : FPSCR<26> : '11000000000000000000000000';
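Read as a bit mask, the StandardFPSCRValue() concatenation keeps bit 26 of the real FPSCR, sets bits 25
and 24 (tested elsewhere in this pseudocode as the Default NaN and flush-to-zero controls), and clears
everything else, which selects Round to Nearest with all traps disabled. A C sketch, with hypothetical
macro and function names:

```c
#include <stdint.h>

#define FPSCR_BIT26 (1u << 26)  /* half-precision format control, copied through */
#define FPSCR_BIT25 (1u << 25)  /* Default NaN mode */
#define FPSCR_BIT24 (1u << 24)  /* Flush-to-zero mode */

/* fpscr is the current value of the real FPSCR. */
uint32_t standard_fpscr_value(uint32_t fpscr)
{
    return (fpscr & FPSCR_BIT26) | FPSCR_BIT25 | FPSCR_BIT24;
}
```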
Comparisons
The FPCompare() function compares two floating-point numbers, producing an (N,Z,C,V) flags result as
shown in Table A2-8:
Table A2-8 VFP comparison flag values
Comparison result   N   Z   C   V
Equal               0   1   1   0
Less than           1   0   0   0
Greater than        0   0   1   0
Unordered           0   0   1   1
This result is used to define the VCMP instruction in the VFP extension. The VCMP instruction writes these
flag values in the FPSCR. After using a VMRS instruction to transfer them to the APSR, they can be used to
control conditional execution as shown in Table A8-1 on page A8-8.
// FPCompare()
// ===========
(bit, bit, bit, bit) FPCompare(bits(N) op1, bits(N) op2, boolean quiet_nan_exc,
                               boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = ('0','0','1','1');
        if type1==FPType_SNaN || type2==FPType_SNaN || quiet_nan_exc then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        if value1 == value2 then
            result = ('0','1','1','0');
        elsif value1 < value2 then
            result = ('1','0','0','0');
        else // value1 > value2
            result = ('0','0','1','0');
    return result;
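The flag values of Table A2-8 can be reproduced on a host machine with ordinary C comparisons. The sketch
here is only an illustration: it assumes the host float type is IEEE 754 single precision, packs the
flags into the low four bits of an integer (a layout chosen for this example), and does not model
flush-to-zero or the Invalid Operation exception.

```c
/* NZCV packed into the low four bits: N = bit 3, Z = bit 2, C = bit 1, V = bit 0. */
unsigned fp_compare_flags(float a, float b)
{
    if (a != a || b != b) return 0x3u;   /* unordered: at least one NaN  */
    if (a == b)           return 0x6u;   /* equal (includes +0 versus -0) */
    if (a < b)            return 0x8u;   /* less than                     */
    return 0x2u;                         /* greater than                  */
}
```

The NaN test relies on the property that a NaN compares unequal to itself.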
The FPCompareEQ(), FPCompareGE() and FPCompareGT() functions are used to describe Advanced SIMD
instructions that perform floating-point comparisons.
// FPCompareEQ()
// =============
boolean FPCompareEQ(bits(32) op1, bits(32) op2, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        if type1==FPType_SNaN || type2==FPType_SNaN then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 == value2);
    return result;
// FPCompareGE()
// =============
boolean FPCompareGE(bits(32) op1, bits(32) op2, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 >= value2);
    return result;
// FPCompareGT()
// =============
boolean FPCompareGT(bits(32) op1, bits(32) op2, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    if type1==FPType_SNaN || type1==FPType_QNaN || type2==FPType_SNaN || type2==FPType_QNaN then
        result = FALSE;
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    else
        // All non-NaN cases can be evaluated on the values produced by FPUnpack()
        result = (value1 > value2);
    return result;
Maximum and minimum
// FPMax()
// =======
bits(N) FPMax(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        if type1 == FPType_Zero && type2 == FPType_Zero && sign1 == NOT(sign2) then
            // Opposite-signed zeros produce +0.0
            result = FPZero('0', N);
        else
            // All other cases can be evaluated on the values produced by FPUnpack()
            result = if value1 > value2 then op1 else op2;
    return result;
// FPMin()
// =======
bits(N) FPMin(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        if type1 == FPType_Zero && type2 == FPType_Zero && sign1 == NOT(sign2) then
            // Opposite-signed zeros produce -0.0
            result = FPZero('1', N);
        else
            // All other cases can be evaluated on the values produced by FPUnpack()
            result = if value1 < value2 then op1 else op2;
    return result;
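The only case where FPMax() and FPMin() cannot simply compare real values is a pair of opposite-signed
zeros, which compare equal. A single-precision C sketch of that special case is shown here. It is a
simplified illustration with hypothetical names: it assumes the host float type is IEEE 754 single
precision, that NaN operands have already been dealt with, and that flush-to-zero is not selected.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit encoding as a host float. */
static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Non-NaN operands only. */
uint32_t fp32_max_bits(uint32_t op1, uint32_t op2)
{
    int zero1 = (uint32_t)(op1 << 1) == 0, zero2 = (uint32_t)(op2 << 1) == 0;

    if (zero1 && zero2 && ((op1 ^ op2) >> 31))
        return 0x00000000u;                 /* opposite-signed zeros give +0.0 */
    return bits_to_float(op1) > bits_to_float(op2) ? op1 : op2;
}

uint32_t fp32_min_bits(uint32_t op1, uint32_t op2)
{
    int zero1 = (uint32_t)(op1 << 1) == 0, zero2 = (uint32_t)(op2 << 1) == 0;

    if (zero1 && zero2 && ((op1 ^ op2) >> 31))
        return 0x80000000u;                 /* opposite-signed zeros give -0.0 */
    return bits_to_float(op1) < bits_to_float(op2) ? op1 : op2;
}
```

Without the explicit zero check, the result for (+0.0, -0.0) would depend on operand order, because the
comparison falls through to returning op2.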
Addition and subtraction
// FPAdd()
// =======
bits(N) FPAdd(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if inf1 && inf2 && sign1 == NOT(sign2) then
            result = FPDefaultNaN(N);
            FPProcessException(FPExc_InvalidOp, fpscr_val);
        elsif (inf1 && sign1 == '0') || (inf2 && sign2 == '0') then
            result = FPInfinity('0', N);
        elsif (inf1 && sign1 == '1') || (inf2 && sign2 == '1') then
            result = FPInfinity('1', N);
        elsif zero1 && zero2 && sign1 == sign2 then
            result = FPZero(sign1, N);
        else
            result_value = value1 + value2;
            if result_value == 0.0 then // Sign of exact zero result depends on rounding mode
                result_sign = if fpscr_val<23:22> == '10' then '1' else '0';
                result = FPZero(result_sign, N);
            else
                result = FPRound(result_value, N, fpscr_val);
    return result;
// FPSub()
// =======
bits(N) FPSub(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if inf1 && inf2 && sign1 == sign2 then
            result = FPDefaultNaN(N);
            FPProcessException(FPExc_InvalidOp, fpscr_val);
        elsif (inf1 && sign1 == '0') || (inf2 && sign2 == '1') then
            result = FPInfinity('0', N);
        elsif (inf1 && sign1 == '1') || (inf2 && sign2 == '0') then
            result = FPInfinity('1', N);
        elsif zero1 && zero2 && sign1 == NOT(sign2) then
            result = FPZero(sign1, N);
        else
            result_value = value1 - value2;
            if result_value == 0.0 then // Sign of exact zero result depends on rounding mode
                result_sign = if fpscr_val<23:22> == '10' then '1' else '0';
                result = FPZero(result_sign, N);
            else
                result = FPRound(result_value, N, fpscr_val);
    return result;
Multiplication and division
// FPMul()
// =======
bits(N) FPMul(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if (inf1 && zero2) || (zero1 && inf2) then
            result = FPDefaultNaN(N);
            FPProcessException(FPExc_InvalidOp, fpscr_val);
        elsif inf1 || inf2 then
            result_sign = if sign1 == sign2 then '0' else '1';
            result = FPInfinity(result_sign, N);
        elsif zero1 || zero2 then
            result_sign = if sign1 == sign2 then '0' else '1';
            result = FPZero(result_sign, N);
        else
            result = FPRound(value1*value2, N, fpscr_val);
    return result;
// FPDiv()
// =======
bits(N) FPDiv(bits(N) op1, bits(N) op2, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type1,sign1,value1) = FPUnpack(op1, fpscr_val);
    (type2,sign2,value2) = FPUnpack(op2, fpscr_val);
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, fpscr_val);
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if (inf1 && inf2) || (zero1 && zero2) then
            result = FPDefaultNaN(N);
            FPProcessException(FPExc_InvalidOp, fpscr_val);
        elsif inf1 || zero2 then
            result_sign = if sign1 == sign2 then '0' else '1';
            result = FPInfinity(result_sign, N);
            if !inf1 then FPProcessException(FPExc_DivideByZero, fpscr_val);
        elsif zero1 || inf2 then
            result_sign = if sign1 == sign2 then '0' else '1';
            result = FPZero(result_sign, N);
        else
            result = FPRound(value1/value2, N, fpscr_val);
    return result;
Reciprocal estimate and step
The Advanced SIMD extension includes instructions that support Newton-Raphson calculation of the
reciprocal of a number.
The VRECPE instruction produces the initial estimate of the reciprocal. It uses the following pseudocode
functions:
// FPRecipEstimate()
// =================
bits(32) FPRecipEstimate(bits(32) operand)
    (type,sign,value) = FPUnpack(operand, StandardFPSCRValue());
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, operand, StandardFPSCRValue());
    elsif type == FPType_Infinity then
        result = FPZero(sign, 32);
    elsif type == FPType_Zero then
        result = FPInfinity(sign, 32);
        FPProcessException(FPExc_DivideByZero, StandardFPSCRValue());
    elsif Abs(value) >= 2^126 then // Result underflows to zero of correct sign
        result = FPZero(sign, 32);
        FPProcessException(FPExc_Underflow, StandardFPSCRValue());
    else
        // Operand must be normalized, since denormalized numbers are flushed to zero. Scale to a
        // double-precision value in the range 0.5 <= x < 1.0, and calculate result exponent.
        // Scaled value has copied sign bit, exponent = 1022 = double-precision biased version of
        // -1, fraction = original fraction extended with zeros.
        scaled = operand<31> : '01111111110' : operand<22:0> : Zeros(29);
        result_exp = 253 - UInt(operand<30:23>); // In range 253-252 = 1 to 253-1 = 252
        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_estimate(scaled);
        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256. Convert
        // to scaled single-precision result with copied sign bit and high-order fraction bits,
        // and exponent calculated above.
        result = estimate<63> : result_exp<7:0> : estimate<51:29>;
    return result;
// UnsignedRecipEstimate()
// =======================
bits(32) UnsignedRecipEstimate(bits(32) operand)
    if operand<31> == '0' then // Operands <= 0x7FFFFFFF produce 0xFFFFFFFF
        result = Ones(32);
    else
        // Generate double-precision value = operand * 2^-32. This has zero sign bit,
        // exponent = 1022 = double-precision biased version of -1, fraction taken from
        // operand, excluding its most significant bit.
        dp_operand = '0 01111111110' : operand<30:0> : Zeros(21);
        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_estimate(dp_operand);
        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Multiply by 2^31 and convert to an unsigned integer - this just involves
        // concatenating the implicit units bit with the top 31 fraction bits.
        result = '1' : estimate<51:21>;
    return result;
where recip_estimate() is defined by the following C function:
double recip_estimate(double a)
{
    int q, s;
    double r;
    q = (int)(a * 512.0);                  /* a in units of 1/512 rounded down */
    r = 1.0 / (((double)q + 0.5) / 512.0); /* reciprocal r */
    s = (int)(256.0 * r + 0.5);            /* r in units of 1/256 rounded to nearest */
    return (double)s / 256.0;
}
Table A2-9 shows the results where input values are out of range.
Table A2-9 VRECPE results for out-of-range inputs
Number type      Input Vm[i]                       Result Vd[i]
Integer          <= 0x7FFFFFFF                     0xFFFFFFFF
Floating-point   NaN                               Default NaN
Floating-point   +/– 0 or denormalized number      +/– infinity (a)
Floating-point   +/– infinity                      +/– 0
Floating-point   Absolute value >= 2^126           +/– 0
a. The Division by Zero exception bit in the FPSCR (FPSCR[1]) is set.
The Newton-Raphson iteration:
x_{n+1} = x_n (2 - d x_n)
converges to (1/d) if x_0 is the result of VRECPE applied to d.
The VRECPS instruction performs a 2 - op1*op2 calculation and can be used with a multiplication to
perform a step of this iteration. The functionality of this instruction is defined by the following pseudocode
function:
// FPRecipStep()
// =============
bits(32) FPRecipStep(bits(32) op1, bits(32) op2)
    (type1,sign1,value1) = FPUnpack(op1, StandardFPSCRValue());
    (type2,sign2,value2) = FPUnpack(op2, StandardFPSCRValue());
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, StandardFPSCRValue());
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if (inf1 && zero2) || (zero1 && inf2) then
            product = FPZero('0', 32);
        else
            product = FPMul(op1, op2, FALSE);
        result = FPSub(FPTwo(32), product, FALSE);
    return result;
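The Newton-Raphson reciprocal iteration described above can be tried out on a host machine with the
following C sketch. It reuses the recip_estimate() function from the text for the starting value and
works in double precision throughout, so it illustrates the convergence of the iteration but not the
single-precision rounding of VRECPE and VRECPS. The newton_reciprocal() name and its steps parameter are
hypothetical, and the input is assumed to lie in the range 0.5 <= d < 1.0 that recip_estimate() expects.

```c
/* recip_estimate() as given in the text, reproduced so the sketch is self-contained. */
static double recip_estimate(double a)
{
    int q, s;
    double r;
    q = (int)(a * 512.0);                  /* a in units of 1/512 rounded down */
    r = 1.0 / (((double)q + 0.5) / 512.0); /* reciprocal r */
    s = (int)(256.0 * r + 0.5);            /* r in units of 1/256 rounded to nearest */
    return (double)s / 256.0;
}

/* Refine an estimate of 1/d. Each pass pairs one "2 - d*x" step (the
   calculation VRECPS performs) with one multiplication. */
double newton_reciprocal(double d, int steps)
{
    double x = recip_estimate(d);
    for (int i = 0; i < steps; i++) {
        double step = 2.0 - d * x;   /* VRECPS-style step */
        x = x * step;                /* multiplication    */
    }
    return x;
}
```

Because the relative error is roughly squared on every pass, a starting estimate that is good to about
eight bits needs only a few steps to reach the precision of the working format.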
Table A2-10 shows the results where input values are out of range.
Table A2-10 VRECPS results for out-of-range inputs
Input Vn[i]                       Input Vm[i]                       Result Vd[i]
Any NaN                           -                                 Default NaN
-                                 Any NaN                           Default NaN
+/– 0.0 or denormalized number    +/– infinity                      2.0
+/– infinity                      +/– 0.0 or denormalized number    2.0
Square root
// FPSqrt()
// ========
bits(N) FPSqrt(bits(N) operand, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, operand, fpscr_val);
    elsif type == FPType_Zero || (type == FPType_Infinity && sign == '0') then
        result = operand;
    elsif sign == '1' then
        result = FPDefaultNaN(N);
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    else
        result = FPRound(Sqrt(value), N, fpscr_val);
    return result;
Reciprocal square root
The Advanced SIMD extension includes instructions that support Newton-Raphson calculation of the
reciprocal of the square root of a number.
The VRSQRTE instruction produces the initial estimate of the reciprocal of the square root. It uses the
following pseudocode functions:
// FPRSqrtEstimate()
// =================
bits(32) FPRSqrtEstimate(bits(32) operand)
    (type,sign,value) = FPUnpack(operand, StandardFPSCRValue());
    if type == FPType_SNaN || type == FPType_QNaN then
        result = FPProcessNaN(type, operand, StandardFPSCRValue());
    elsif type == FPType_Zero then
        result = FPInfinity(sign, 32);
        FPProcessException(FPExc_DivideByZero, StandardFPSCRValue());
    elsif sign == '1' then
        result = FPDefaultNaN(32);
        FPProcessException(FPExc_InvalidOp, StandardFPSCRValue());
    elsif type == FPType_Infinity then
        result = FPZero('0', 32);
    else
        // Operand must be normalized, since denormalized numbers are flushed to zero. Scale to a
        // double-precision value in the range 0.25 <= x < 1.0, with the evenness or oddness of
        // the exponent unchanged, and calculate result exponent. Scaled value has copied sign
        // bit, exponent = 1022 or 1021 = double-precision biased version of -1 or -2, fraction
        // = original fraction extended with zeros.
        if operand<23> == '0' then
            scaled = operand<31> : '01111111110' : operand<22:0> : Zeros(29);
        else
            scaled = operand<31> : '01111111101' : operand<22:0> : Zeros(29);
        result_exp = (380 - UInt(operand<30:23>)) DIV 2;
        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_sqrt_estimate(scaled);
        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256. Convert
        // to scaled single-precision result with copied sign bit and high-order fraction bits,
        // and exponent calculated above.
        result = estimate<63> : result_exp<7:0> : estimate<51:29>;
    return result;
// UnsignedRSqrtEstimate()
// =======================
bits(32) UnsignedRSqrtEstimate(bits(32) operand)
    if operand<31:30> == '00' then // Operands <= 0x3FFFFFFF produce 0xFFFFFFFF
        result = Ones(32);
    else
        // Generate double-precision value = operand * 2^-32. This has zero sign bit,
        // exponent = 1022 or 1021 = double-precision biased version of -1 or -2,
        // fraction taken from operand, excluding its most significant one or two bits.
        if operand<31> == '1' then
            dp_operand = '0 01111111110' : operand<30:0> : Zeros(21);
        else // operand<31:30> == '01'
            dp_operand = '0 01111111101' : operand<29:0> : Zeros(22);
        // Call C function to get reciprocal estimate of scaled value.
        estimate = recip_sqrt_estimate(dp_operand);
        // Result is double-precision and a multiple of 1/256 in the range 1 to 511/256.
        // Multiply by 2^31 and convert to an unsigned integer - this just involves
        // concatenating the implicit units bit with the top 31 fraction bits.
        result = '1' : estimate<51:21>;
    return result;
where recip_sqrt_estimate() is defined by the following C function:
double recip_sqrt_estimate(double a)
{
    int q0, q1, s;
    double r;
    if (a < 0.5) /* range 0.25 <= a < 0.5 */
    {
        q0 = (int)(a * 512.0);                       /* a in units of 1/512 rounded down */
        r = 1.0 / sqrt(((double)q0 + 0.5) / 512.0);  /* reciprocal root r */
    }
    else /* range 0.5 <= a < 1.0 */
    {
        q1 = (int)(a * 256.0);                       /* a in units of 1/256 rounded down */
        r = 1.0 / sqrt(((double)q1 + 0.5) / 256.0);  /* reciprocal root r */
    }
    s = (int)(256.0 * r + 0.5); /* r in units of 1/256 rounded to nearest */
    return (double)s / 256.0;
}
Table A2-11 shows the results where input values are out of range.
Table A2-11 VRSQRTE results for out-of-range inputs
Number type       Input Vm[i]                                 Result Vd[i]
Integer           <= 0x3FFFFFFF                               0xFFFFFFFF
Floating-point    NaN, – normalized number, – infinity        Default NaN
Floating-point    – 0 or – denormalized number                – infinity a
Floating-point    + 0 or + denormalized number                + infinity a
Floating-point    + infinity                                  + 0
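The floating-point rows of Table A2-11 can be sketched in C as a check on the raw single-precision bit pattern. This is an illustrative model only: the function name vrsqrte_special_case and its out-parameter are hypothetical, the Default NaN is taken to be 0x7FC00000, and the FPSCR exception bits are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true and writes the Table A2-11 result to
 * *result when the single-precision input bit pattern is out of range. */
bool vrsqrte_special_case(uint32_t in, uint32_t *result)
{
    const uint32_t DEFAULT_NAN = 0x7FC00000u;  /* quiet NaN, sign bit 0 */
    uint32_t sign = in & 0x80000000u;
    uint32_t exp  = (in >> 23) & 0xFFu;
    uint32_t frac = in & 0x007FFFFFu;

    if (exp == 0xFFu && frac != 0) {           /* any NaN */
        *result = DEFAULT_NAN;
        return true;
    }
    if (exp == 0) {                            /* +/-0 or denormalized number */
        *result = sign | 0x7F800000u;          /* infinity with the same sign */
        return true;
    }
    if (sign != 0) {                           /* -normalized number or -infinity */
        *result = DEFAULT_NAN;
        return true;
    }
    if (exp == 0xFFu) {                        /* +infinity */
        *result = 0x00000000u;                 /* +0 */
        return true;
    }
    return false;                              /* +normalized: use the estimate */
}
```

A caller would fall through to the estimate calculation only when the function returns false.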
The Newton-Raphson iteration:
$x_{n+1} = x_n (3 - d x_n^2) / 2$
converges to $1/\sqrt{d}$ if $x_0$ is the result of VRSQRTE applied to $d$.
The VRSQRTS instruction performs a (3 – op1*op2)/2 calculation and can be used with two multiplications to
perform a step of this iteration. The functionality of this instruction is defined by the following pseudocode
function:
// FPRSqrtStep()
// =============
bits(32) FPRSqrtStep(bits(32) op1, bits(32) op2)
    (type1,sign1,value1) = FPUnpack(op1, StandardFPSCRValue());
    (type2,sign2,value2) = FPUnpack(op2, StandardFPSCRValue());
    (done,result) = FPProcessNaNs(type1, type2, op1, op2, StandardFPSCRValue());
    if !done then
        inf1 = (type1 == FPType_Infinity); inf2 = (type2 == FPType_Infinity);
        zero1 = (type1 == FPType_Zero); zero2 = (type2 == FPType_Zero);
        if (inf1 && zero2) || (zero1 && inf2) then
            product = FPZero(‘0’, 32);
        else
            product = FPMul(op1, op2, FALSE);
        result = FPDiv(FPSub(FPThree(32), product, FALSE), FPTwo(32), FALSE);
    return result;
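As a rough illustration of how the step calculation combines with two multiplications, the following C sketch models the (3 – op1*op2)/2 step on host floats and applies it repeatedly. The names rsqrt_step and rsqrt_refine are hypothetical, NaN processing and exception flags are not modelled, and the starting estimate is supplied by the caller rather than by VRSQRTE.

```c
#include <float.h>

/* Zeros and denormalized numbers are treated alike, as flush-to-zero would. */
static int is_zero_or_denormal(float x) { return x > -FLT_MIN && x < FLT_MIN; }
static int is_infinite(float x) { return x > FLT_MAX || x < -FLT_MAX; }

/* Model of the step: (3 - op1*op2) / 2, with infinity * zero taken as zero. */
float rsqrt_step(float op1, float op2)
{
    float product;
    if ((is_infinite(op1) && is_zero_or_denormal(op2)) ||
        (is_zero_or_denormal(op1) && is_infinite(op2)))
        product = 0.0f;
    else
        product = op1 * op2;
    return (3.0f - product) / 2.0f;
}

/* Hypothetical driver: refine x0, an estimate of 1/sqrt(d), `steps` times. */
float rsqrt_refine(float d, float x0, int steps)
{
    float x = x0;
    for (int i = 0; i < steps; i++) {
        float dx = d * x;              /* first multiplication: d * x_n */
        x = x * rsqrt_step(dx, x);     /* step, then second multiplication */
    }
    return x;
}
```

For d = 4 and a starting estimate of 0.49, a few iterations bring the value very close to 0.5.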
Footnote a to Table A2-11: The Division by Zero exception bit in the FPSCR (FPSCR[1]) is set.
Table A2-12 shows the results where input values are out of range.
Table A2-12 VRSQRTS results for out-of-range inputs
Input Vn[i]                        Input Vm[i]                        Result Vd[i]
Any NaN                            -                                  Default NaN
-                                  Any NaN                            Default NaN
+/– 0.0 or denormalized number     +/– infinity                       1.5
+/– infinity                       +/– 0.0 or denormalized number     1.5
Conversions
The following functions perform conversions between half-precision and single-precision floating-point
numbers.
// FPHalfToSingle()
// ================
bits(32) FPHalfToSingle(bits(16) operand, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    if type == FPType_SNaN || type == FPType_QNaN then
        if fpscr_val<25> == ‘1’ then // DN bit set
            result = FPDefaultNaN(32);
        else
            result = sign : ‘11111111 1’ : operand<8:0> : Zeros(13);
        if type == FPType_SNaN then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    elsif type == FPType_Infinity then
        result = FPInfinity(sign, 32);
    elsif type == FPType_Zero then
        result = FPZero(sign, 32);
    else
        result = FPRound(value, 32, fpscr_val); // Rounding will be exact
    return result;
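The widening conversion can be sketched in C on raw bit patterns. This is a minimal model under stated assumptions: the function name half_to_single_bits is hypothetical, the IEEE half-precision format is assumed (AH bit clear), the DN bit is assumed clear so NaN payloads propagate, and exception signaling is not modelled.

```c
#include <stdint.h>

/* Convert an IEEE half-precision bit pattern to a single-precision one. */
uint32_t half_to_single_bits(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t frac = h & 0x3FFu;

    if (exp == 0x1Fu) {
        if (frac == 0)
            return sign | 0x7F800000u;                 /* infinity */
        /* NaN: sign : '11111111 1' : operand<8:0> : Zeros(13) */
        return sign | 0x7FC00000u | ((frac & 0x1FFu) << 13);
    }
    if (exp == 0) {
        if (frac == 0)
            return sign;                               /* zero */
        /* Denormalized half: normalize, the result is exact in single. */
        int shift = 0;
        while ((frac & 0x400u) == 0) {
            frac <<= 1;
            shift++;
        }
        frac &= 0x3FFu;                                /* drop the units bit */
        return sign | ((uint32_t)(113 - shift) << 23) | (frac << 13);
    }
    /* Normalized: rebias the exponent from 15 to 127. */
    return sign | ((exp + 112u) << 23) | (frac << 13);
}
```

Every half-precision value is exactly representable in single precision, which is why no rounding step appears here.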
// FPSingleToHalf()
// ================
bits(16) FPSingleToHalf(bits(32) operand, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    if type == FPType_SNaN || type == FPType_QNaN then
        if fpscr_val<26> == ‘1’ then // AH bit set
            result = FPZero(sign, 16);
        elsif fpscr_val<25> == ‘1’ then // DN bit set
            result = FPDefaultNaN(16);
        else
            result = sign : ‘11111 1’ : operand<21:13>;
        if type == FPType_SNaN || fpscr_val<26> == ‘1’ then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    elsif type == FPType_Infinity then
        if fpscr_val<26> == ‘1’ then // AH bit set
            result = sign : Ones(15);
            FPProcessException(FPExc_InvalidOp, fpscr_val);
        else
            result = FPInfinity(sign, 16);
    elsif type == FPType_Zero then
        result = FPZero(sign, 16);
    else
        result = FPRound(value, 16, fpscr_val);
    return result;
The following functions perform conversions between single-precision and double-precision floating-point
numbers.
// FPSingleToDouble()
// ==================
bits(64) FPSingleToDouble(bits(32) operand, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    if type == FPType_SNaN || type == FPType_QNaN then
        if fpscr_val<25> == ‘1’ then // DN bit set
            result = FPDefaultNaN(64);
        else
            result = sign : ‘11111111111 1’ : operand<21:0> : Zeros(29);
        if type == FPType_SNaN then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    elsif type == FPType_Infinity then
        result = FPInfinity(sign, 64);
    elsif type == FPType_Zero then
        result = FPZero(sign, 64);
    else
        result = FPRound(value, 64, fpscr_val); // Rounding will be exact
    return result;
// FPDoubleToSingle()
// ==================
bits(32) FPDoubleToSingle(bits(64) operand, boolean fpscr_controlled)
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    if type == FPType_SNaN || type == FPType_QNaN then
        if fpscr_val<25> == ‘1’ then // DN bit set
            result = FPDefaultNaN(32);
        else
            result = sign : ‘11111111 1’ : operand<50:29>;
        if type == FPType_SNaN then
            FPProcessException(FPExc_InvalidOp, fpscr_val);
    elsif type == FPType_Infinity then
        result = FPInfinity(sign, 32);
    elsif type == FPType_Zero then
        result = FPZero(sign, 32);
    else
        result = FPRound(value, 32, fpscr_val);
    return result;
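The NaN handling in the single-precision and double-precision conversions is a pure bit rearrangement when the DN bit is clear. The following C sketch shows only that path; the function names are hypothetical, the caller is assumed to have already established that the operand is a NaN, and the Invalid Operation exception for signaling NaNs is not modelled.

```c
#include <stdint.h>

/* FPDoubleToSingle(), NaN path: sign : '11111111 1' : operand<50:29> */
uint32_t nan_double_to_single(uint64_t operand)
{
    uint32_t sign    = (uint32_t)(operand >> 63) << 31;
    uint32_t payload = (uint32_t)(operand >> 29) & 0x003FFFFFu;  /* operand<50:29> */
    return sign | 0x7FC00000u | payload;   /* exponent all ones, quiet bit set */
}

/* FPSingleToDouble(), NaN path:
 * sign : '11111111111 1' : operand<21:0> : Zeros(29) */
uint64_t nan_single_to_double(uint32_t operand)
{
    uint64_t sign    = (uint64_t)(operand >> 31) << 63;
    uint64_t payload = (uint64_t)(operand & 0x003FFFFFu) << 29;  /* operand<21:0> */
    return sign | 0x7FF8000000000000ull | payload;
}
```

Note that narrowing discards the low 29 payload bits, so a double-precision NaN whose payload lies entirely in those bits becomes a NaN with a zero payload.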
The following functions perform conversions between floating-point numbers and integers or fixed-point
numbers:
// FPToFixed()
// ===========
bits(M) FPToFixed(bits(N) operand, integer M, integer fraction_bits, boolean unsigned,
                  boolean round_towards_zero, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    if round_towards_zero then fpscr_val<23:22> = ‘11’;
    (type,sign,value) = FPUnpack(operand, fpscr_val);
    // For NaNs and infinities, FPUnpack() has produced a value that will round to the
    // required result of the conversion. Also, the value produced for infinities will
    // cause the conversion to overflow and signal an Invalid Operation floating-point
    // exception as required. NaNs must also generate such a floating-point exception.
    if type == FPType_SNaN || type == FPType_QNaN then
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    // Scale value by specified number of fraction bits, then start rounding to an integer
    // and determine the rounding error.
    value = value * 2^fraction_bits;
    int_result = RoundDown(value);
    error = value - int_result;
    // Apply the specified rounding mode.
    case fpscr_val<23:22> of
        when ‘00’ // Round to Nearest (rounding to even if exactly halfway)
            round_up = (error > 0.5 || (error == 0.5 && int_result<0> == ‘1’));
        when ‘01’ // Round towards Plus Infinity
            round_up = (error != 0.0);
        when ‘10’ // Round towards Minus Infinity
            round_up = FALSE;
        when ‘11’ // Round towards Zero
            round_up = (error != 0.0 && int_result < 0);
    if round_up then int_result = int_result + 1;
    // Bitstring result is the integer result saturated to the destination size, with
    // saturation indicating overflow of the conversion (signaled as an Invalid
    // Operation floating-point exception).
    (result, overflow) = SatQ(int_result, M, unsigned);
    if overflow then
        FPProcessException(FPExc_InvalidOp, fpscr_val);
    elsif error != 0 then
        FPProcessException(FPExc_Inexact, fpscr_val);
    return result;
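The rounding and saturation steps of FPToFixed() can be sketched in C for one concrete case, a signed 32-bit destination. This is an illustrative model only: the function name fp_to_fixed_s32, the enum rmode and the two flag out-parameters are hypothetical stand-ins for the FPSCR rounding mode and for FPProcessException(), and fraction_bits is assumed to lie in the range 0 to 32.

```c
#include <stdbool.h>
#include <stdint.h>

enum rmode { RN = 0, RP = 1, RM = 2, RZ = 3 };   /* FPSCR<23:22> encodings */

int32_t fp_to_fixed_s32(double value, int fraction_bits, enum rmode mode,
                        bool *invalid, bool *inexact)
{
    *invalid = false;
    *inexact = false;
    if (value != value) {                 /* NaN: Invalid Operation, result 0 */
        *invalid = true;
        return 0;
    }
    double scaled = value * (double)(1ull << fraction_bits);
    /* Saturate early so that the integer cast below is always defined. */
    if (scaled >= 2147483648.0)  { *invalid = true; return INT32_MAX; }
    if (scaled <= -2147483649.0) { *invalid = true; return INT32_MIN; }

    int64_t int_result = (int64_t)scaled;             /* truncates towards zero */
    if ((double)int_result > scaled) int_result--;    /* RoundDown() */
    double error = scaled - (double)int_result;

    bool round_up;
    switch (mode) {
    case RN: round_up = error > 0.5 || (error == 0.5 && (int_result & 1) != 0); break;
    case RP: round_up = error != 0.0; break;
    case RM: round_up = false; break;
    default: round_up = error != 0.0 && int_result < 0; break;   /* RZ */
    }
    if (round_up) int_result++;

    /* Saturation signals overflow as Invalid Operation, not Inexact. */
    if (int_result > INT32_MAX) { *invalid = true; return INT32_MAX; }
    if (int_result < INT32_MIN) { *invalid = true; return INT32_MIN; }
    *inexact = error != 0.0;
    return (int32_t)int_result;
}
```

Because the integer part is always rounded down first, Round towards Zero only needs to round up for negative inexact values, which matches the ‘11’ case in the pseudocode.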
// FixedToFP()
// ===========
bits(N) FixedToFP(bits(M) operand, integer N, integer fraction_bits, boolean unsigned,
                  boolean round_to_nearest, boolean fpscr_controlled)
    assert N == 32 || N == 64;
    fpscr_val = if fpscr_controlled then FPSCR else StandardFPSCRValue();
    if round_to_nearest then fpscr_val<23:22> = ‘00’;
    int_operand = if unsigned then UInt(operand) else SInt(operand);
    real_operand = int_operand / 2^fraction_bits;
    if real_operand == 0.0 then
        result = FPZero(‘0’, N);
    else
        result = FPRound(real_operand, N, fpscr_val);
    return result;
A2.8 Polynomial arithmetic over {0,1}
The polynomial data type represents a polynomial in x of the form $b_{n-1}x^{n-1} + \dots + b_1 x + b_0$ where $b_k$ is
bit [k] of the value.
The coefficients 0 and 1 are manipulated using the rules of Boolean arithmetic:
• 0 + 0 = 1 + 1 = 0
• 0 + 1 = 1 + 0 = 1
• 0 * 0 = 0 * 1 = 1 * 0 = 0
• 1 * 1 = 1.
That is:
• adding two polynomials over {0,1} is the same as a bitwise exclusive OR
• multiplying two polynomials over {0,1} is the same as integer multiplication except that partial
  products are exclusive-ORed instead of being added.
A2.8.1 Pseudocode details of polynomial multiplication
In pseudocode, polynomial addition is described by the EOR operation on bitstrings.
Polynomial multiplication is described by the PolynomialMult() function:
// PolynomialMult()
// ================
bits(M+N) PolynomialMult(bits(M) op1, bits(N) op2)
    result = Zeros(M+N);
    extended_op2 = Zeros(M) : op2;
    for i=0 to M-1
        if op1<i> == ‘1’ then
            result = result EOR LSL(extended_op2, i);
    return result;
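For a concrete instance, the following C sketch fixes M = N = 8 and follows the pseudocode step for step. The function name poly_mult8 and the choice of operand width are assumptions made for this example.

```c
#include <stdint.h>

/* 8-bit by 8-bit polynomial multiply over {0,1}, giving a 16-bit product. */
uint16_t poly_mult8(uint8_t op1, uint8_t op2)
{
    uint16_t result = 0;                    /* Zeros(M+N) */
    uint16_t extended_op2 = op2;            /* Zeros(M) : op2 */
    for (int i = 0; i < 8; i++) {
        if ((op1 >> i) & 1u)
            result ^= (uint16_t)(extended_op2 << i);   /* EOR in the partial product */
    }
    return result;
}
```

For example, 0x03 represents x + 1, and its square over {0,1} is x^2 + 1, so poly_mult8(0x03, 0x03) gives 0x0005 rather than the integer product 9.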
A2.9 Coprocessor support
Coprocessor space is used to extend the functionality of an ARM processor. There are sixteen coprocessors
defined in the coprocessor instruction space. These are commonly known as CP0 to CP15. The following
coprocessors are reserved by ARM for specific purposes:
• Coprocessor 15 (CP15) provides system control functionality. This includes architecture and feature
  identification, as well as control, status information and configuration support. The following
  sections describe CP15:
  - CP15 registers for a VMSA implementation on page B3-64
  - CP15 registers for a PMSA implementation on page B4-22.
  CP15 also provides performance monitor registers, see Chapter C9 Performance Monitors.
• Coprocessor 14 (CP14) supports:
  - debug, see Chapter C6 Debug Register Interfaces
  - the execution environment features defined by the architecture, see Execution environment
    support on page A2-69.
• Coprocessor 11 (CP11) supports double-precision floating-point operations.
• Coprocessor 10 (CP10) supports single-precision floating-point operations and the control and
  configuration of both the VFP and the Advanced SIMD architecture extensions.
• Coprocessors 8, 9, 12, and 13 are reserved for future use by ARM.
Note
Any implementation that includes either or both of the Advanced SIMD extension and the VFP extension
must enable access to both CP10 and CP11, see Enabling Advanced SIMD and floating-point support on
page B1-64.
In general, privileged access is required for:
• system control through CP15
• debug control and configuration
• access to the identification registers
• access to any register bits that enable or disable coprocessor features.
For details of the exact split between the privileged and unprivileged coprocessor operations see the relevant
sections of this manual.
All load, store, branch and data operation instructions associated with floating-point, Advanced SIMD and
execution environment support can execute unprivileged.
Coprocessors 0 to 7 can be used to provide vendor specific features.
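The allocation of the sixteen coprocessor numbers described in this section can be summarized as a small C lookup. The enum and function names here are hypothetical and only restate the text above.

```c
/* Broad use of each coprocessor number, as described in this section. */
enum cp_use {
    CP_VENDOR,          /* CP0 to CP7: vendor specific features */
    CP_RESERVED,        /* CP8, CP9, CP12, CP13: reserved by ARM */
    CP_FP_SIMD,         /* CP10, CP11: floating-point and Advanced SIMD */
    CP_DEBUG_EXEC_ENV,  /* CP14: debug and execution environment support */
    CP_SYSTEM_CONTROL,  /* CP15: system control */
    CP_INVALID          /* not a coprocessor number */
};

enum cp_use classify_coprocessor(int cp)
{
    if (cp < 0 || cp > 15)
        return CP_INVALID;
    if (cp <= 7)
        return CP_VENDOR;
    switch (cp) {
    case 10:
    case 11:
        return CP_FP_SIMD;
    case 14:
        return CP_DEBUG_EXEC_ENV;
    case 15:
        return CP_SYSTEM_CONTROL;
    default:
        return CP_RESERVED;   /* 8, 9, 12, 13 */
    }
}
```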
A2.10 Execution environment support
The Jazelle and ThumbEE states, introduced in ISETSTATE on page A2-15, support execution
environments:
• The ThumbEE state is more generic, supporting a variant of the Thumb instruction set that minimizes
  the code size overhead generated by a Just-In-Time (JIT) or Ahead-Of-Time (AOT) compiler. JIT and
  AOT compilers convert execution environment source code to a native executable. For more
  information, see Thumb Execution Environment.
• The Jazelle state is specific to hardware acceleration of Java bytecodes. For more information, see
  Jazelle direct bytecode execution support on page A2-73.
A2.10.1 Thumb Execution Environment
Thumb Execution Environment (ThumbEE) is a variant of the Thumb instruction set designed as a target for
dynamically generated code. This is code that is compiled on the device, from a portable bytecode or other
intermediate or native representation, either shortly before or during execution. ThumbEE provides support
for Just-In-Time (JIT), Dynamic Adaptive Compilation (DAC) and Ahead-Of-Time (AOT) compilers, but
cannot interwork freely with the ARM and Thumb instruction sets.
ThumbEE is particularly suited to languages that feature managed pointers and array types.
ThumbEE executes instructions in the ThumbEE instruction set state. For information about instruction set
states see ISETSTATE on page A2-15.
See Thumb Execution Environment on page B1-73 for system level information about ThumbEE.
ThumbEE instructions
In ThumbEE state, the processor executes almost the same instruction set as in Thumb state. However some
instructions behave differently, some are removed, and some ThumbEE instructions are added.
The key differences are:
• additional instructions to change instruction set in both Thumb state and ThumbEE state
• new ThumbEE instructions to branch to handlers
• null pointer checking on load/store instructions executed in ThumbEE state
• an additional instruction in ThumbEE state to check array bounds
• some other modifications to load, store, and control flow instructions.
For more information about the ThumbEE instructions see Chapter A9 ThumbEE.
ThumbEE configuration
ThumbEE introduces two new registers:
• ThumbEE Configuration Register, TEECR. This contains a single bit, the ThumbEE configuration
  control bit, XED.
• ThumbEE Handler Base Register. This contains the base address for ThumbEE handlers.
  A handler is a short, commonly executed, sequence of instructions. It is typically, but not always,
  associated directly with one or more bytecodes or other intermediate language elements.
Changes to these CP14 registers have the same synchronization requirements as changes to the CP15
registers. These are described in:
• Changes to CP15 registers and the memory order model on page B3-77 for a VMSA implementation
• Changes to CP15 registers and the memory order model on page B4-28 for a PMSA implementation.
ThumbEE is an unprivileged, user-level facility, and there are no special provisions for using it securely. For
more information, see ThumbEE and the Security Extensions on page B1-73.
ThumbEE Configuration Register (TEECR)
The ThumbEE Configuration Register (TEECR) controls unprivileged access to the ThumbEE Handler
Base Register.
The TEECR is:
• a CP14 register
• a 32-bit register, with access rights that depend on the current privilege:
  - the result of an unprivileged write to the register is UNDEFINED
  - unprivileged reads, and privileged reads and writes, are permitted.
• when the Security Extensions are implemented, a Common register.
The format of the TEECR is:
Bits [31:1]     UNK/SBZP.
XED, bit [0]    Execution Environment Disable bit. Controls unprivileged access to the ThumbEE Handler
                Base Register:
                0    Unprivileged access permitted.
                1    Unprivileged access disabled.
                The reset value of this bit is 0.
The effects of a write to this register on ThumbEE configuration are only guaranteed to be visible to
subsequent instructions after the execution of an ISB instruction, an exception entry or an exception return.
However, a read of this register always returns the value most recently written to the register.
To access the TEECR, read or write the CP14 registers with an MRC or MCR instruction with <opc1> set to 6,
<CRn> set to c0, <CRm> set to c0, and <opc2> set to 0. For example:
    MRC p14, 6, <Rt>, c0, c0, 0 ; Read ThumbEE Configuration Register
    MCR p14, 6, <Rt>, c0, c0, 0 ; Write ThumbEE Configuration Register
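To illustrate how the operands in these examples map onto an instruction word, the following C sketch assembles the ARM-state encoding of MRC and MCR from its fields. It follows the field layout given for those instructions in the instruction descriptions; the function name encode_cp_reg_access and the fixed choice of the always (AL) condition are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble the ARM-state encoding of MRC (read == true) or MCR (read == false)
 * with condition AL. All fields are masked to their architectural widths. */
uint32_t encode_cp_reg_access(bool read, unsigned coproc, unsigned opc1,
                              unsigned rt, unsigned crn, unsigned crm,
                              unsigned opc2)
{
    return (0xEu << 28)                 /* cond = AL */
         | (0xEu << 24)                 /* coprocessor register transfer */
         | ((opc1 & 0x7u) << 21)
         | ((read ? 1u : 0u) << 20)     /* 1 = MRC, 0 = MCR */
         | ((crn & 0xFu) << 16)
         | ((rt & 0xFu) << 12)
         | ((coproc & 0xFu) << 8)
         | ((opc2 & 0x7u) << 5)
         | (1u << 4)
         | (crm & 0xFu);
}
```

With <Rt> taken as R0, the TEECR read shown above corresponds to encode_cp_reg_access(true, 14, 6, 0, 0, 0, 0).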
[Register diagram: TEECR format. Bits [31:1] UNK/SBZP, bit [0] XED.]
ThumbEE Handler Base Register (TEEHBR)
The ThumbEE Handler Base Register (TEEHBR) holds the base address for ThumbEE handlers.
The TEEHBR is:
• a CP14 register
• a 32-bit read/write register, with access rights that depend on the current privilege and the value of
  the TEECR.XED bit:
  - privileged accesses are always permitted
  - when TEECR.XED == 0, unprivileged accesses are permitted
  - when TEECR.XED == 1, the result of an unprivileged access is UNDEFINED.
• when the Security Extensions are implemented, a Common register.
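The access rules for the two ThumbEE registers reduce to two small predicates. The following C sketch is an illustrative model only: the function names are hypothetical, and a return value of false stands for an access whose result is UNDEFINED.

```c
#include <stdbool.h>
#include <stdint.h>

/* TEECR: unprivileged writes are UNDEFINED, every other access is permitted. */
bool teecr_access_permitted(bool privileged, bool is_write)
{
    return privileged || !is_write;
}

/* TEEHBR: privileged accesses are always permitted; unprivileged accesses
 * are permitted only while TEECR.XED (bit [0]) is 0. */
bool teehbr_access_permitted(bool privileged, uint32_t teecr)
{
    bool xed = (teecr & 1u) != 0;
    return privileged || !xed;
}
```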
The format of the TEEHBR is:
HandlerBase, bits [31:2]
                The address of the ThumbEE Handler_00 implementation. This is the address of the first of
                the ThumbEE handlers.
                The reset value of this field is UNKNOWN.
Bits [1:0]      Reserved, SBZ.
The effects of a write to this register on ThumbEE handler entry are only guaranteed to be visible to
subsequent instructions after the execution of an ISB instruction, an exception entry or an exception return.
However, a read of this register always returns the value most recently written to the register.
To access the TEEHBR, read or write the CP14 registers with an MRC or MCR instruction with <opc1> set to 6,
<CRn> set to c1, <CRm> set to c0, and <opc2> set to 0. For example:
    MRC p14, 6, <Rt>, c1, c0, 0 ; Read ThumbEE Handler Base Register
    MCR p14, 6, <Rt>, c1, c0, 0 ; Write ThumbEE Handler Base Register
[Register diagram: TEEHBR format. Bits [31:2] HandlerBase, bits [1:0] SBZ.]
Use of HandlerBase
ThumbEE handlers are entered by reference to a HandlerBase address, defined by the TEEHBR. See
ThumbEE Handler Base Register (TEEHBR) on page A2-71. Table A2-13 shows how the handlers are
arranged in relation to the value of HandlerBase:
The IndexCheck occurs when a CHKA instruction detects an index out of range. For more information, see
CHKA on page A9-15.
The NullCheck occurs when any memory access instruction is executed with a value of 0 in the base register.
For more information, see Null checking on page A9-3.
Note
Checks are similar to conditional branches, with the added property that they clear the IT bits when taken.
Other handlers are called using explicit handler call instructions. For details see the following sections:
• HB, HBL on page A9-16
• HBLP on page A9-17
• HBP on page A9-18.
Table A2-13 Access to ThumbEE handlers
Offset from HandlerBase    Name           Value stored
-0x0008                    IndexCheck     Branch to IndexCheck handler
-0x0004                    NullCheck      Branch to NullCheck handler
+0x0000                    Handler_00     Implementation of Handler_00
+0x0020                    Handler_01     Implementation of Handler_01
...                        ...            ...
+(0x0000 + 32n)            Handler_<n>    Implementation of Handler_<n>
...                        ...            Implementation of additional handlers
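The layout in Table A2-13 amounts to simple address arithmetic on HandlerBase. The following C sketch shows that arithmetic; the function names are hypothetical, and addresses are modelled as plain 32-bit unsigned values.

```c
#include <stdint.h>

/* HandlerBase is TEEHBR bits [31:2]; bits [1:0] are SBZ and masked off here. */
uint32_t teehbr_handler_base(uint32_t teehbr)
{
    return teehbr & ~0x3u;
}

/* Handler_<n> starts 32*n bytes above HandlerBase. */
uint32_t thumbee_handler_address(uint32_t handler_base, uint32_t n)
{
    return handler_base + 32u * n;
}

/* The two check entries sit just below HandlerBase. */
uint32_t thumbee_index_check_address(uint32_t handler_base)
{
    return handler_base - 0x8u;
}

uint32_t thumbee_null_check_address(uint32_t handler_base)
{
    return handler_base - 0x4u;
}
```

The 4-byte spacing of the check entries is consistent with each holding a single branch to the real handler, while the 32-byte spacing leaves room for a short handler body in place.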
A2.10.2 Jazelle direct bytecode execution support
From ARMv5TEJ, the architecture requires every system to include an implementation of the Jazelle
extension. The Jazelle extension provides architectural support for hardware acceleration of bytecode
execution by a Java Virtual Machine (JVM).
In the simplest implementations of the Jazelle extension, the processor does not accelerate the execution of
any bytecodes, and the JVM uses software routines to execute all bytecodes. Such an implementation is
called a trivial implementation of the Jazelle extension, and has minimal additional cost compared with not
implementing the Jazelle extension at all. An implementation that provides hardware acceleration of
bytecode execution is a non-trivial Jazelle implementation.
These requirements for the Jazelle extension mean a JVM can be written to both:
• function correctly on all processors that include a Jazelle extension implementation
• automatically take advantage of the accelerated bytecode execution provided by a processor that
  includes a non-trivial implementation.
Typically, a non-trivial implementation of the Jazelle extension implements a subset of the bytecodes in
hardware, choosing bytecodes that:
• can have simple hardware implementations
• account for a large percentage of bytecode execution time.
The required features of a non-trivial implementation are:
• provision of the Jazelle state
• a new instruction, BXJ, to enter Jazelle state
• system support that enables an operating system to regulate the use of the Jazelle extension hardware
• system support that enables a JVM to configure the Jazelle extension hardware to its specific needs.
The required features of a trivial implementation are:
• Normally, the Jazelle instruction set state is never entered. If an incorrect exception return causes
  entry to the Jazelle instruction set state, the next instruction executed is treated as UNDEFINED.
• The BXJ instruction behaves as a BX instruction.
• Configuration support that maintains the interface to the Jazelle extension is permanently disabled.
For more information about trivial implementations see Trivial implementation of the Jazelle extension on
page B1-81.
A JVM that has been written to take advantage automatically of hardware-accelerated bytecode execution
is known as an Enabled JVM (EJVM).
Subarchitectures
A processor implementation that includes the Jazelle extension expects the general-purpose register values
and other resources of the ARM processor to conform to an interface standard defined by the Jazelle
implementation when Jazelle state is entered and exited. For example, a specific general-purpose register
might be reserved for use as the pointer to the current bytecode.
In order for an EJVM and associated debug support to function correctly, it must be written to comply with
the interface standard defined by the acceleration hardware at Jazelle state execution entry and exit points.
An implementation of the Jazelle extension might define other configuration registers in addition to the
architecturally defined ones.
The interface standard and any additional configuration registers used to communicate with the Jazelle
extension are known collectively as the subarchitecture of the implementation. They are not described in
this manual. Only EJVM implementations and debug or similar software can depend on the subarchitecture.
All other software must rely only on the architectural definition of the Jazelle extension given in this manual.
A particular subarchitecture is identified by reading the JIDR described in Jazelle ID Register (JIDR) on
page A2-76.
Jazelle state
While the processor is in Jazelle state, it executes bytecode programs. A bytecode program is defined as an
executable object that comprises one or more class files, or is derived from and functionally equivalent to
one or more class files. See Lindholm and Yellin, The Java Virtual Machine Specification 2nd Edition for
the definition of class files.
While the processor is in Jazelle state, the PC identifies the next JVM bytecode to be executed. A JVM
bytecode is a bytecode defined in Lindholm and Yellin, or a functionally equivalent transformed version of
a bytecode defined in Lindholm and Yellin.
For the Jazelle extension, the functionality of Native methods, as described in Lindholm and Yellin, must be
specified using only instructions from the ARM, Thumb, and ThumbEE instruction sets.
An implementation of the Jazelle extension must not be documented or promoted as performing any task
while it is in Jazelle state other than the acceleration of bytecode programs in accordance with this section
and The Java Virtual Machine Specification.
Jazelle state entry instruction, BXJ
ARMv7 includes an ARM instruction similar to BX. The BXJ instruction has a single register operand that
specifies a target instruction set state, ARM state or Thumb state, and branch target address for use if entry
to Jazelle state is not available. For more information, see BXJ on page A8-64.
Correct entry into Jazelle state involves the EJVM executing the BXJ instruction at a time when both:
• the Jazelle extension Control and Configuration registers are initialized correctly, see Application
  level configuration and control of the Jazelle extension on page A2-75
• application level registers and any additional configuration registers are initialized as required by the
  subarchitecture of the implementation.
Executing BXJ with Jazelle extension enabled
Executing a BXJ instruction when the JMCR.JE bit is 1, see Jazelle Main Configuration Register (JMCR) on
page A2-77, causes the Jazelle hardware to do one of the following:
• enter Jazelle state and start executing bytecodes directly from a SUBARCHITECTURE DEFINED address
• branch to a SUBARCHITECTURE DEFINED handler.
Which of these occurs is SUBARCHITECTURE DEFINED.
The Jazelle subarchitecture can use Application Level registers (but not System Level registers) to transfer
information between the Jazelle extension and the EJVM. There are SUBARCHITECTURE DEFINED
restrictions on what Application Level registers must contain when a BXJ instruction is executed, and
Application Level registers have SUBARCHITECTURE DEFINED values when Jazelle state execution ends and
ARM or Thumb state execution resumes.
Jazelle subarchitectures and implementations must not use any unallocated bits in Application Level
registers such as the CPSR or FPSCR. All such bits are reserved for future expansion of the ARM
architecture.
Executing BXJ with Jazelle extension disabled
If a BXJ instruction is executed when the JMCR.JE bit is 0, it is executed identically to a BX instruction with
the same register operand.
This means that BXJ instructions can be executed freely when the JMCR.JE bit is 0. In particular, if an EJVM
determines that it is executing on a processor whose Jazelle extension implementation is trivial or uses an
incompatible subarchitecture, it can set JE == 0 and execute correctly. In this case it executes without the
benefit of any Jazelle hardware acceleration that might be present.
Application level configuration and control of the Jazelle extension
All registers associated with the Jazelle extension are implemented in coprocessor space as part of
coprocessor 14 (CP14). The registers are accessed using the instructions:
• MCR, see MCR, MCR2 on page A8-186
• MRC, see MRC, MRC2 on page A8-202.
In a non-trivial implementation at least three registers are required. These are described in:
• Jazelle ID Register (JIDR) on page A2-76
• Jazelle Main Configuration Register (JMCR) on page A2-77
• Jazelle OS Control Register (JOSCR) on page B1-77.
Additional configuration registers might be provided and are SUBARCHITECTURE DEFINED.
The following rules apply to all Jazelle extension control and configuration registers:
• All configuration registers are accessed by CP14 MRC and MCR instructions with <opc1> set to 7.
• The values contained in configuration registers are changed only by the execution of MCR instructions.
  In particular, they are never changed by Jazelle state execution of bytecodes.
• The access policy for the required registers is fully defined in their descriptions. With unprivileged
  operation:
  - all MCR accesses to the JIDR are UNDEFINED
  - MRC and MCR accesses that are restricted to privileged modes are UNDEFINED.
  The access policy of other configuration registers is SUBARCHITECTURE DEFINED.
• When the Security Extensions are implemented, the registers are common to the Secure and
  Non-secure security states. For more information, see Effect of the Security Extensions on the CP15
  registers on page B3-71. This section applies to some CP14 registers as well as to the CP15 registers.
• When a configuration register is readable, reading the register returns the last value written to it.
  Reading a readable configuration register has no side effects.
  When a configuration register is not readable, attempting to read it returns an UNKNOWN value.
• When a configuration register can be written, the effect of writing to it must be idempotent. That is,
  the overall effect of writing the same value more than once must not differ from the effect of writing
  it once.
Changes to these CP14 registers have the same synchronization requirements as changes to the CP15
registers. These are described in:
• Changes to CP15 registers and the memory order model on page B3-77 for a VMSA implementation
• Changes to CP15 registers and the memory order model on page B4-28 for a PMSA implementation.
For more information, see Jazelle state configuration and control on page B1-77.
Jazelle ID Register (JIDR)
The Jazelle ID Register (JIDR) enables an EJVM to determine the architecture and subarchitecture under
which it is running.
The JIDR is:
• a CP14 register
• a 32-bit read-only register
• accessible during privileged and unprivileged execution
• when the Security Extensions are implemented, a Common register, see Common CP15 registers on
  page B3-74.
The format of the JIDR is:
Architecture, bits [31:28]
                Architecture code. This uses the same Architecture code that appears in the Main ID register
                in coprocessor 15, see c0, Main ID Register (MIDR) on page B3-81 (VMSA
                implementation) or c0, Main ID Register (MIDR) on page B4-32 (PMSA implementation).
Implementer, bits [27:20]
                Implementer code of the designer of the subarchitecture. This uses the same Implementer
                code that appears in the Main ID register in coprocessor 15, see c0, Main ID Register
                (MIDR) on page B3-81 (VMSA implementation) or c0, Main ID Register (MIDR) on
                page B4-32 (PMSA implementation).
                If the trivial implementation of the Jazelle extension is used, the Implementer code is 0x00.
Subarchitecture, bits [19:12]
                Contain the subarchitecture code. The following subarchitecture code is defined:
                0x00    Jazelle v1 subarchitecture, or trivial implementation of Jazelle extension if
                        Implementer code is 0x00.
Bits [11:0]     Contain additional SUBARCHITECTURE DEFINED information.
To access the JIDR, read the CP14 registers with an MRC instruction with <opc1> set to 7, <CRn> set to c0,
<CRm> set to c0, and <opc2> set to 0. For example:
    MRC p14, 7, <Rt>, c0, c0, 0 ; Read Jazelle ID register
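Once the JIDR value is in a general-purpose register, splitting it into its fields is a matter of shifts and masks. The following C sketch shows one way to do this; the struct and function names are hypothetical, and the trivial-implementation test assumes that an Implementer code of 0x00 together with a Subarchitecture code of 0x00 identifies a trivial implementation, as the field descriptions above suggest.

```c
#include <stdbool.h>
#include <stdint.h>

struct jidr_fields {
    unsigned architecture;     /* bits [31:28] */
    unsigned implementer;      /* bits [27:20] */
    unsigned subarchitecture;  /* bits [19:12] */
    unsigned subarch_defined;  /* bits [11:0]  */
};

struct jidr_fields jidr_decode(uint32_t jidr)
{
    struct jidr_fields f;
    f.architecture    = (jidr >> 28) & 0xFu;
    f.implementer     = (jidr >> 20) & 0xFFu;
    f.subarchitecture = (jidr >> 12) & 0xFFu;
    f.subarch_defined = jidr & 0xFFFu;
    return f;
}

/* Assumption: Implementer 0x00 with Subarchitecture 0x00 means trivial. */
bool jidr_is_trivial(uint32_t jidr)
{
    struct jidr_fields f = jidr_decode(jidr);
    return f.implementer == 0x00u && f.subarchitecture == 0x00u;
}
```

The register values used to exercise this sketch are arbitrary bit patterns, not values read from any particular processor.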
Jazelle Main Configuration Register (JMCR)
The Jazelle Main Configuration Register (JMCR) controls the Jazelle extension.
The JMCR is:
• a CP14 register
• a 32-bit register, with access rights that depend on the current privilege:
  - for privileged operations the register is read/write
  - for unprivileged operations, the register is normally write-only
• when the Security Extensions are implemented, a Common register, see Common CP15 registers on
  page B3-74.
For more information about unprivileged access restrictions see Access to Jazelle registers on page A2-78.
[Register diagram: JIDR format. Bits [31:28] Architecture, bits [27:20] Implementer, bits [19:12] Subarchitecture, bits [11:0] SUBARCHITECTURE DEFINED.]
The format of the JMCR is:
Bits [31:1]     SUBARCHITECTURE DEFINED information.
JE, bit [0]     Jazelle Enable bit:
                0    Jazelle extension disabled. The BXJ instruction does not cause Jazelle state
                     execution. BXJ behaves exactly as a BX instruction, see Jazelle state entry
                     instruction, BXJ on page A2-74.
                1    Jazelle extension enabled.
                The reset value of this bit is 0.
To access the JMCR, read or write the CP14 registers with an MRC or MCR instruction with <opc1> set to 7,
<CRn> set to c2, <CRm> set to c0, and <opc2> set to 0. For example:
    MRC p14, 7, <Rt>, c2, c0, 0 ; Read Jazelle Main Configuration register
    MCR p14, 7, <Rt>, c2, c0, 0 ; Write Jazelle Main Configuration register
Access to Jazelle registers
Table A2-14 shows the access permissions for the Jazelle registers, and how unprivileged access to the
registers depends on the value of the JOSCR.
JMCR bit assignments: SUBARCHITECTURE DEFINED, bits [31:1]; JE, bit [0].
Table A2-14 Access to Jazelle registers
Jazelle register    Unprivileged access, JOSCR.CD == 0 (a)    Unprivileged access, JOSCR.CD == 1 (a)    Privileged access
JIDR    Read access permitted. Write access ignored.    Read and write access UNDEFINED.    Read access permitted. Write access ignored.
JMCR    Read access UNDEFINED. Write access permitted.    Read and write access UNDEFINED.    Read and write access permitted.
SUBARCHITECTURE DEFINED configuration registers    Read access UNDEFINED. Write access permitted.    Read and write access UNDEFINED.    Read access SUBARCHITECTURE DEFINED. Write access permitted.
a. See Jazelle OS Control Register (JOSCR) on page B1-77.
EJVM operation
The following subsections summarize how an EJVM must operate, to meet the requirements of the
architecture:
• Initialization
• Bytecode execution
• Jazelle exception conditions
• Other considerations on page A2-80.
Initialization
During initialization, the EJVM must first check which subarchitecture is present, by checking the
Implementer and Subarchitecture codes in the value read from the JIDR.
If the EJVM is incompatible with the subarchitecture, it must do one of the following:
• write a value with JE == 0 to the JMCR
• if unaccelerated bytecode execution is unacceptable, generate an error.
If the EJVM is compatible with the subarchitecture, it must write its required configuration to the JMCR
and any SUBARCHITECTURE DEFINED configuration registers.
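The initialization decision described above can be sketched as follows. This is a Python model under stated assumptions, not an EJVM implementation: the function choose_jmcr_value, the exception type, the set of supported (Implementer, Subarchitecture) pairs and the required_config value are all hypothetical. Writes to any SUBARCHITECTURE DEFINED configuration registers are omitted.

```python
JE_BIT = 1 << 0  # JMCR.JE, bit [0]


class UnsupportedSubarchitecture(Exception):
    """Raised when acceleration is required but the hardware is incompatible."""


def choose_jmcr_value(jidr, supported, required_config, need_acceleration):
    """Return the value the EJVM should write to the JMCR during initialization.

    jidr              -- 32-bit value read from the JIDR
    supported         -- set of (implementer, subarchitecture) pairs the EJVM handles
    required_config   -- JMCR value wanted on compatible hardware (hypothetical)
    need_acceleration -- True if unaccelerated bytecode execution is unacceptable
    """
    implementer = (jidr >> 20) & 0xFF
    subarchitecture = (jidr >> 12) & 0xFF
    if (implementer, subarchitecture) in supported:
        return required_config
    if need_acceleration:
        raise UnsupportedSubarchitecture((implementer, subarchitecture))
    # Incompatible hardware: write a value with JE == 0. The remaining bits are
    # SUBARCHITECTURE DEFINED, so writing zeros is an arbitrary choice of this sketch.
    return 0
```

With JE == 0, BXJ behaves as BX, so the EJVM continues with unaccelerated bytecode execution.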
Bytecode execution
The EJVM must contain a handler for each bytecode.
The EJVM initiates bytecode execution by executing a BXJ instruction with:
• the register operand specifying the target address of the bytecode handler for the first bytecode of the program
• the Application Level registers set up in accordance with the SUBARCHITECTURE DEFINED interface standard.
The bytecode handler:
• performs the data-processing operations required by the bytecode indicated
• determines the address of the next bytecode to be executed
• determines the address of the handler for that bytecode
• performs a BXJ to that handler address with the registers again set up to the SUBARCHITECTURE DEFINED interface standard.
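The handler chain described above can be modelled in software. The following Python sketch covers only the software handler path, for example when BXJ behaves as BX. Each handler performs its operation, locates the next bytecode and its handler, and returns that handler to a loop that stands in for the BXJ branch. The opcode values, the state layout and every function name here are assumptions made for illustration.

```python
def run(program, handlers):
    """Loop standing in for repeated BXJ branches between handlers."""
    state = {"pc": 0, "stack": [], "program": program}
    handler = handlers[program[0]]          # handler for the first bytecode
    while handler is not None:
        handler = handler(state, handlers)  # each call models one BXJ
    return state["stack"]


def next_handler(state, handlers, size):
    """Find the next bytecode and the handler for it."""
    state["pc"] += size
    return handlers[state["program"][state["pc"]]]


def op_push(state, handlers):
    state["stack"].append(state["program"][state["pc"] + 1])  # operand byte
    return next_handler(state, handlers, 2)


def op_add(state, handlers):
    b, a = state["stack"].pop(), state["stack"].pop()
    state["stack"].append(a + b)
    return next_handler(state, handlers, 1)


def op_halt(state, handlers):
    return None  # stops the loop; not part of the architecture description


PUSH, ADD, HALT = 0x10, 0x60, 0xFF  # opcode values chosen for illustration only
HANDLERS = {PUSH: op_push, ADD: op_add, HALT: op_halt}

result = run(bytes([PUSH, 2, PUSH, 3, ADD, HALT]), HANDLERS)
```

On hardware with the Jazelle extension enabled, a bytecode might instead be executed directly without entering its handler. That behavior is SUBARCHITECTURE DEFINED and is not modelled here.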
Jazelle exception conditions
During bytecode execution, the EJVM might encounter SUBARCHITECTURE DEFINED Jazelle exception
conditions that must be resolved by a software handler. For example, in the case of a configuration invalid
handler, the handler rewrites the desired configuration to the JMCR and to any SUBARCHITECTURE DEFINED
configuration registers.
On entry to a Jazelle exception condition handler the contents of the Application Level registers are
SUBARCHITECTURE DEFINED. This interface to the Jazelle exception condition handler might differ from the
interface standard for the bytecode handler, in order to supply information about the Jazelle exception
condition.
The Jazelle exception condition handler:
• resolves the Jazelle exception condition
• determines the address of the next bytecode to be executed
• determines the address of the handler for that bytecode
• performs a BXJ to that handler address with the registers again set up to the SUBARCHITECTURE DEFINED interface standard.
Other considerations
To ensure application execution and correct interaction with an operating system, an EJVM must only perform operations that are permitted in unprivileged operation. In particular, for register accesses it must only:
• read the JIDR
• write to the JMCR and other configuration registers.
An EJVM must not attempt to access the JOSCR.
A2.11 Exceptions, debug events and checks
ARMv7 uses the following terms to describe various types of exceptional condition:
Exceptions In the ARM architecture, exceptions cause entry into a privileged mode and execution of a
software handler for the exception.
Note
The terms floating-point exception and Jazelle exception condition do not use this meaning
of exception. These terms are described later in this list.
Exceptions include:
• reset
• interrupts
• memory system aborts
• undefined instructions
• supervisor calls (SVCs).
Most details of exception handling are not visible to application-level code, and are described in Exceptions on page B1-30. Aspects that are visible to application-level code are:
• The SVC instruction causes an SVC exception. This provides a mechanism for unprivileged code to make a call to the operating system (or other privileged component of the software system).
• If the Security Extensions are implemented, the SMC instruction causes an SMC exception, but only if it is executed in a privileged mode. Unprivileged code can only cause SMC exceptions to occur by methods defined by the operating system (or other privileged component of the software system).
• The WFI instruction provides a hint that nothing needs to be done until an interrupt or similar exception is taken, see Wait For Interrupt on page B1-47. This permits the processor to enter a low-power state until that happens.
• The WFE instruction provides a hint that nothing needs to be done until either an event is generated by an SEV instruction or an interrupt or similar exception is taken, see Wait For Event and Send Event on page B1-44. This permits the processor to enter a low-power state until one of these happens.
• The YIELD instruction provides a hint that the current execution thread is of low importance, see The Yield instruction on page A2-82.
Floating-point exceptions
These relate to exceptional conditions encountered during floating-point arithmetic, such as
division by zero or overflow. For more information see:
Floating-point exceptions on page A2-42
Floating-point Status and Control Register (FPSCR) on page A2-28
ANSI/IEEE Std. 754-1985, IEEE Standard for Binary Floating-Point Arithmetic.
Jazelle exception conditions
These are conditions that cause Jazelle hardware acceleration to exit into a software handler,
as described in Jazelle exception conditions on page A2-79.
Debug events These are conditions that cause a debug system to take action. Most aspects of debug events
are not visible to application-level code, and are described in Chapter C3 Debug Events.
Aspects that are visible to application-level code include:
• The BKPT instruction causes a BKPT Instruction debug event to occur, see BKPT Instruction debug events on page C3-20.
• The DBG instruction provides a hint to the debug system.
Checks These are provided in the ThumbEE extension. A check causes an unconditional branch to
a specific handler entry point. The base address of the ThumbEE check handlers is held in
the TEEHBR, see ThumbEE Handler Base Register (TEEHBR) on page A2-71.
A2.11.1 The Yield instruction
In a Symmetric Multi-Threading (SMT) design, a thread can use a Yield instruction to give a hint to the
processor that it is running on. The Yield hint indicates that whatever the thread is currently doing is of low
importance, and so could yield. For example, the thread might be sitting in a spin-lock. Similar behavior
might be used to modify the arbitration priority of the snoop bus in a multiprocessor (MP) system. Defining
such an instruction permits binary compatibility between SMT and SMP systems.
ARMv7 defines a YIELD instruction as a specific NOP-hint instruction, see YIELD on page A8-812.
The YIELD instruction has no effect in a single-threaded system, but developers of such systems can use the instruction to flag its intended use on migration to a multiprocessor or multithreading system. Operating systems can use YIELD in places where a yield hint is wanted, knowing that it will be treated as a NOP if there is no implementation benefit.
Chapter A3
Application Level Memory Model
This chapter gives an application level view of the memory model. It contains the following sections:
Address space on page A3-2
Alignment support on page A3-4
Endian support on page A3-7
Synchronization and semaphores on page A3-12
Memory types and attributes and the memory order model on page A3-24
Access rights on page A3-38
Virtual and physical addressing on page A3-40
Memory access order on page A3-41
Caches and memory hierarchy on page A3-51.
A3.1 Address space
The ARM architecture uses a single, flat address space of 2^32 8-bit bytes. Byte addresses are treated as unsigned numbers, running from 0 to 2^32 - 1. The address space is also regarded as:
• 2^30 32-bit words:
  - the address of each word is word-aligned, meaning that the address is divisible by 4 and the last two bits of the address are 0b00
  - the word at word-aligned address A consists of the four bytes with addresses A, A+1, A+2 and A+3.
• 2^31 16-bit halfwords:
  - the address of each halfword is halfword-aligned, meaning that the address is divisible by 2 and the last bit of the address is 0
  - the halfword at halfword-aligned address A consists of the two bytes with addresses A and A+1.
In some situations the ARM architecture supports accesses to halfwords and words that are not aligned to
the appropriate access size, see Alignment support on page A3-4.
Normally, address calculations are performed using ordinary integer instructions. This means that the
address wraps around if the calculation overflows or underflows the address space. Another way of
describing this is that any address calculation is reduced modulo 2^32.
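The address arithmetic and alignment definitions above can be illustrated with a short Python sketch. The helper names are assumptions of this example. The mask models reduction modulo 2^32, and the alignment tests check the low address bits as described for words and halfwords.

```python
ADDRESS_MASK = 0xFFFFFFFF  # byte addresses run from 0 to 2^32 - 1


def add_address(base: int, offset: int) -> int:
    """Address arithmetic reduced modulo 2^32 (wraps on overflow or underflow)."""
    return (base + offset) & ADDRESS_MASK


def is_word_aligned(address: int) -> bool:
    """Word-aligned: divisible by 4, so the last two address bits are 0b00."""
    return address & 0b11 == 0


def is_halfword_aligned(address: int) -> bool:
    """Halfword-aligned: divisible by 2, so the last address bit is 0."""
    return address & 0b1 == 0


def word_bytes(address: int) -> list:
    """Addresses of the four bytes of the word at a word-aligned address."""
    if not is_word_aligned(address):
        raise ValueError("address is not word-aligned")
    return [address + i for i in range(4)]
```

For example, add_address(0xFFFFFFFC, 8) gives 0x00000004, and add_address(0, -4) gives 0xFFFFFFFC.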
A3.1.1 Address incrementing and address space overflow
When a processor performs normal sequential execution of instructions, it effectively calculates:
(address_of_current_instruction) + (size_of_executed_instruction)
after each instruction to determine which instruction to execute next.
Note
The size of the executed instruction depends on the current instruction set, and might depend on the
instruction executed.
If this address calculation overflows the top of the address space, the result is UNPREDICTABLE. In other words, a program must not rely on sequential execution of the instruction at address 0x00000000 after the instruction at address:
• 0xFFFFFFFC, when a 4-byte instruction is executed
• 0xFFFFFFFE, when a 2-byte instruction is executed
• 0xFFFFFFFF, when a single byte instruction is executed.
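A tool such as a simulator or static checker could flag the three cases listed above. The following Python sketch is one possible check, in which next_instruction_address is a hypothetical helper that returns None where the architecture defines the result as UNPREDICTABLE.

```python
TOP_OF_ADDRESS_SPACE = 1 << 32


def next_instruction_address(address: int, size: int):
    """Return the next sequential instruction address.

    Returns None when address + size overflows the top of the address space,
    because sequential execution is then UNPREDICTABLE.
    """
    following = address + size
    if following >= TOP_OF_ADDRESS_SPACE:
        return None
    return following
```

The check uses the unwrapped sum so that the overflow is detected before any reduction modulo 2^32.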
This UNPREDICTABLE behavior only applies to instructions that are executed, including those that fail their
condition code check. Most ARM implementations prefetch instructions ahead of the currently-executing
instruction. If this prefetching overflows the top of the address space, it does not cause UNPREDICTABLE
behavior unless a prefetched instruction with an overflowed address is actually executed.
LDC, LDM, LDRD, POP, PUSH, STC, STRD, and STM instructions access a sequence of words at increasing memory addresses, effectively incrementing the memory address by 4 for each load or store. If this calculation overflows the top of the address space, the result is UNPREDICTABLE. In other words, programs must not use these instructions in such a way that they attempt to access the word at address 0x00000000 sequentially after the word at address 0xFFFFFFFC.
Note
In some cases instructions that operate on multiple words can decrement the memory address by 4 after each word access. If this calculation underflows the address space, by decrementing the address 0x00000000, the result is UNPREDICTABLE.
The behavior of any unaligned load or store with a calculated address that would access the byte at 0xFFFFFFFF and the byte at address 0x00000000 as part of the instruction is UNPREDICTABLE.
A3.2 Alignment support
Instructions in the ARM architecture are aligned as follows:
ARM instructions are word-aligned
Thumb and ThumbEE instructions are halfword-aligned
Java bytecodes are byte-aligned.
The data alignment behavior supported by the ARM architecture has changed significantly between ARMv4
and ARMv7. This behavior is indicated by the SCTLR.U bit, see:
c1, System Control Register (SCTLR) on page B3-96 for a VMSAv7 implementation
c1, System Control Register (SCTLR) on page B4-45 for a PMSAv7 implementation
c1, System Control Register (SCTLR) on page AppxG-34 for architecture versions before ARMv7.
This bit defines the alignment behavior of the memory system for data accesses. Table A3-1 shows the
values of SCTLR.U for the different architecture versions.
On an ARMv6 processor, the SCTLR.U bit indicates which of two possible alignment models is selected:
U == 0 The processor implements the legacy alignment model. This is described in Alignment on
page AppxG-6.
Note
The use of U == 0 is deprecated in ARMv6T2, and is obsolete from ARMv7.
U == 1 The processor implements the alignment model described in this section. This model
supports unaligned data accesses.
ARMv7 requires the processor to implement the alignment model described in this section.
Table A3-1 SCTLR.U bit values for different architecture versions
Architecture version SCTLR.U value
Before ARMv6 0
ARMv6 0 or 1
ARMv7 1
A3.2.1 Unaligned data access
An ARMv7 implementation must support unaligned data accesses. The SCTLR.U bit is RAO to indicate
this support. The SCTLR.A bit, the strict alignment bit, controls whether strict alignment is required. The
checking of load and store alignment depends on the value of this bit. For more information, see c1, System
Control Register (SCTLR) on page B3-96 for a VMSA implementation, or c1, System Control Register
(SCTLR) on page B4-45 for a PMSA implementation.
Table A3-2 shows how the checking of load and store alignment depends on the instruction type and the
value of SCTLR.A.
Table A3-2 Alignment requirements of load/store instructions
Instructions    Alignment check    Result if check fails when SCTLR.A == 0    Result if check fails when SCTLR.A == 1
LDRB, LDREXB, LDRBT, LDRSB, LDRSBT, STRB, STREXB, STRBT, SWPB, TBB    None    -    -
LDRH, LDRHT, LDRSH, LDRSHT, STRH, STRHT, TBH    Halfword    Unaligned access    Alignment fault
LDREXH, STREXH    Halfword    Alignment fault    Alignment fault
LDR, LDRT, STR, STRT    Word    Unaligned access    Alignment fault
LDREX, STREX    Word    Alignment fault    Alignment fault
LDREXD, STREXD    Doubleword    Alignment fault    Alignment fault
All forms of LDM, LDRD, PUSH, POP, RFE, SRS, all forms of STM, STRD, SWP    Word    Alignment fault    Alignment fault
LDC, LDC2, STC, STC2    Word    Alignment fault    Alignment fault
VLDM, VLDR, VSTM, VSTR    Word    Alignment fault    Alignment fault
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4, all with standard alignment (a)    Element size    Unaligned access    Alignment fault
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4, all with @<align> specified (a)    As specified by @<align>    Alignment fault    Alignment fault
a. These element and structure load/store instructions are only in the Advanced SIMD extension to the ARMv7 ARM and
Thumb instruction sets. ARMv7 does not support the pre-ARMv6 alignment model, so you cannot use that model with
these instructions.
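The way Table A3-2 combines the instruction type with SCTLR.A can be expressed as a small lookup. The Python sketch below covers only a subset of the rows (the Advanced SIMD and coprocessor rows are left out), and the RULES table, the function name and the result strings are assumptions of this example rather than architectural definitions.

```python
# For each mnemonic: (alignment check size in bytes,
#                     True if a failed check is permitted when SCTLR.A == 0).
RULES = {
    "LDRB": (1, True), "STRB": (1, True),           # no alignment check
    "LDRH": (2, True), "STRH": (2, True),
    "LDREXH": (2, False), "STREXH": (2, False),
    "LDR": (4, True), "STR": (4, True),
    "LDREX": (4, False), "STREX": (4, False),
    "LDREXD": (8, False), "STREXD": (8, False),
    "LDM": (4, False), "STM": (4, False),
    "LDRD": (4, False), "STRD": (4, False),
}


def alignment_result(mnemonic: str, address: int, sctlr_a: int) -> str:
    """Classify an access as aligned, unaligned but permitted, or faulting."""
    size, unaligned_ok = RULES[mnemonic]
    if address % size == 0:
        return "aligned access"
    if sctlr_a == 0 and unaligned_ok:
        return "unaligned access"
    return "alignment fault"
```

For example, an LDR from address 0x1001 is classified as an unaligned access when SCTLR.A == 0 and as an alignment fault when SCTLR.A == 1, whereas an LDREX from a non-word-aligned address faults in both cases.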
A3.2.2 Cases where unaligned accesses are UNPREDICTABLE
The following cases cause the resulting unaligned accesses to be UNPREDICTABLE, and overrule any
successful load or store behavior described in Unaligned data access on page A3-5:
Any load instruction that is not faulted by the alignment restrictions and that loads the PC has
UNPREDICTABLE behavior if the address it loads from is not word-aligned.
Any unaligned access that is not faulted by the alignment restrictions and that accesses memory with
the Strongly-ordered or Device attribute has UNPREDICTABLE behavior.
Note
These memory attributes are described in Memory types and attributes and the memory order model
on page A3-24.
A3.2.3 Unaligned data access restrictions in ARMv7 and ARMv6
ARMv7 and ARMv6 have the following restrictions on unaligned data accesses:
Accesses are not guaranteed to be single-copy atomic, see Atomicity in the ARM architecture on
page A3-26. An access can be synthesized out of a series of aligned operations in a shared memory
system without guaranteeing locked transaction cycles.
Unaligned accesses typically take a number of additional cycles to complete compared to a naturally
aligned transfer. The real-time implications must be analyzed carefully and key data structures might
need to have their alignment adjusted for optimum performance.
If an unaligned access occurs across a page boundary, the operation can abort on either or both halves
of the access.
Shared memory schemes must not rely on seeing monotonic updates of non-aligned data of loads and stores
for data items larger than byte wide. For more information, see Atomicity in the ARM architecture on
page A3-26.
Unaligned access operations must not be used for accessing Device memory-mapped registers. They must
only be used with care in shared memory structures that are protected by aligned semaphores or
synchronization variables.
A3.3 Endian support
The rules in Address space on page A3-2 require that for a word-aligned address A:
• the word at address A consists of the bytes at addresses A, A+1, A+2 and A+3
• the halfword at address A consists of the bytes at addresses A and A+1
• the halfword at address A+2 consists of the bytes at addresses A+2 and A+3
• the word at address A therefore consists of the halfwords at addresses A and A+2.
However, this does not specify completely the mappings between words, halfwords, and bytes.
A memory system uses one of the two following mapping schemes. This choice is known as the endianness
of the memory system.
In a little-endian memory system:
the byte or halfword at a word-aligned address is the least significant byte or halfword in the word at
that address
the byte at a halfword-aligned address is the least significant byte in the halfword at that address.
In a big-endian memory system:
the byte or halfword at a word-aligned address is the most significant byte or halfword in the word at
that address
the byte at a halfword-aligned address is the most significant byte in the halfword at that address.
For a word-aligned address A, Table A3-3 and Table A3-4 on page A3-8 show the relationship between:
the word at address A
the halfwords at addresses A and A+2
the bytes at addresses A, A+1, A+2 and A+3.
Table A3-3 shows this relationship for a big-endian memory system, and Table A3-4 on page A3-8 shows
the relationship for a little-endian memory system.
Table A3-3 Big-endian memory system
MSByte                 MSByte - 1             LSByte + 1             LSByte
Word at Address A
Halfword at Address A                         Halfword at Address A+2
Byte at Address A      Byte at Address A+1    Byte at Address A+2    Byte at Address A+3
The big-endian and little-endian mapping schemes determine the order in which the bytes of a word or halfword are interpreted. For example, a load of a word (4 bytes) from address 0x1000 always results in an access of the bytes at memory locations 0x1000, 0x1001, 0x1002, and 0x1003. The endianness mapping scheme determines the significance of these four bytes.
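The word load from address 0x1000 described above can be modelled as follows. This Python sketch uses an invented memory image and a hypothetical load_word helper. The same four bytes are accessed in both cases, and only their significance changes.

```python
# Invented memory contents for the four bytes of the word at 0x1000.
memory = {0x1000: 0x11, 0x1001: 0x22, 0x1002: 0x33, 0x1003: 0x44}


def load_word(mem: dict, address: int, big_endian: bool) -> int:
    """Read four consecutive bytes and combine them under the chosen mapping."""
    data = bytes(mem[address + i] for i in range(4))
    return int.from_bytes(data, "big" if big_endian else "little")


little = load_word(memory, 0x1000, big_endian=False)  # byte at 0x1000 is least significant
big = load_word(memory, 0x1000, big_endian=True)      # byte at 0x1000 is most significant
```

With this memory image the little-endian mapping gives 0x44332211 and the big-endian mapping gives 0x11223344.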
A3.3.1 Control of the endianness mapping scheme in ARMv7
In ARMv7-A, the mapping of instruction memory is always little-endian. In ARMv7-R, instruction
endianness can be controlled at the system level, see Instruction endianness.
For information about data memory endianness control, see ENDIANSTATE on page A2-19.
Note
Versions of the ARM architecture before ARMv7 had a different mechanism to control the endianness, see
Endian configuration and control on page AppxG-20.
A3.3.2 Instruction endianness
Before ARMv7, the ARM architecture included legacy support for an alternative big-endian memory model,
described as BE-32 and controlled by the B bit, bit [7], of the SCTLR, see c1, System Control Register
(SCTLR) on page AppxG-34. ARMv7 does not support BE-32 operation, and bit [7] of the SCTLR is RAZ.
Where legacy object code for ARM processors contains instructions with a big-endian byte order, the
removal of support for BE-32 operation requires the instructions in the object files to have their bytes
reversed for the code to be executed on an ARMv7 processor. This means that:
each Thumb instruction, whether a 32-bit Thumb instruction or a 16-bit Thumb instruction, must
have the byte order of each halfword of instruction reversed
each ARM instruction must have the byte order of each word of instruction reversed.
For most situations, this can be handled in the link stage of a tool-flow, provided the object files include
sufficient information to permit this to happen. In practice, this is the situation for all applications with the
ARMv7-A profile.
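The byte reversal that a linker or conversion tool would apply can be sketched in Python. This is an illustration of the two rules above, not a description of any particular tool: reverse_units, convert_arm and convert_thumb are hypothetical names, and the input is assumed to be a raw sequence of instruction bytes with no interleaved data.

```python
def reverse_units(code: bytes, unit: int) -> bytes:
    """Reverse the byte order inside each unit-sized chunk of code."""
    if len(code) % unit:
        raise ValueError("code length is not a multiple of the unit size")
    return b"".join(code[i:i + unit][::-1] for i in range(0, len(code), unit))


def convert_arm(code: bytes) -> bytes:
    """ARM instructions: reverse the bytes of each word."""
    return reverse_units(code, 4)


def convert_thumb(code: bytes) -> bytes:
    """Thumb instructions, 16-bit or 32-bit: reverse the bytes of each halfword."""
    return reverse_units(code, 2)
```

Note that a 32-bit Thumb instruction is treated as two halfwords, so the order of the halfwords is preserved and only the bytes within each halfword are swapped.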
Table A3-4 Little-endian memory system
MSByte                 MSByte - 1             LSByte + 1             LSByte
Word at Address A
Halfword at Address A+2                       Halfword at Address A
Byte at Address A+3    Byte at Address A+2    Byte at Address A+1    Byte at Address A
For applications of the ARMv7-R profile, there are some legacy code situations where the arrangement of
the bytes in the object files cannot be adjusted by the linker. For these object files to be used by an ARMv7-R
processor the byte order of the instructions must be reversed by the processor at runtime. Therefore, the
ARMv7-R profile permits configuration of the instruction endianness.
Instruction endianness static configuration, ARMv7-R only
To provide support for legacy big-endian object code, the ARMv7-R profile supports optional byte order
reversal hardware as a static option from reset. The ARMv7-R profile includes a read-only bit in the CP15
Control Register, SCTLR.IE, bit [31]. For more information, see c1, System Control Register (SCTLR) on
page B4-45.
A3.3.3 Element size and endianness
The effect of the endianness mapping on data transfers depends on the size of the data element or elements
transferred by the load/store instructions. Table A3-5 lists the element sizes of all the load/store instructions,
for all instruction sets.
Table A3-5 Element size of load/store instructions
Instructions    Element size
LDRB, LDREXB, LDRBT, LDRSB, LDRSBT, STRB, STREXB, STRBT, SWPB, TBB    Byte
LDRH, LDREXH, LDRHT, LDRSH, LDRSHT, STRH, STREXH, STRHT, TBH    Halfword
LDR, LDRT, LDREX, STR, STRT, STREX    Word
LDRD, LDREXD, STRD, STREXD    Word
All forms of LDM, PUSH, POP, RFE, SRS, all forms of STM, SWP    Word
LDC, LDC2, STC, STC2, VLDM, VLDR, VSTM, VSTR    Word
VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4    Element size of the Advanced SIMD access
A3.3.4 Instructions to reverse bytes in a general-purpose register
An application or device driver might have to interface to memory-mapped peripheral registers or shared
memory structures that are not the same endianness as the internal data structures. Similarly, the endianness
of the operating system might not match that of the peripheral registers or shared memory. In these cases,
the processor requires an efficient method to transform explicitly the endianness of the data.
In ARMv7, the ARM and Thumb instruction sets provide this functionality. There are instructions to:
• Reverse word (four bytes) register, for transforming big and little-endian 32-bit representations. See REV on page A8-272.
• Reverse halfword and sign-extend, for transforming signed 16-bit representations. See REVSH on page A8-276.
• Reverse packed halfwords in a register for transforming big- and little-endian 16-bit representations. See REV16 on page A8-274.
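The effect of the three byte-reversal operations on a 32-bit register value can be modelled in Python. The functions below are a sketch written from the descriptions in this section, with results expressed as unsigned 32-bit patterns. They are not the instruction pseudocode, and the function names are chosen for this example.

```python
def rev(x: int) -> int:
    """Reverse the four bytes of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def rev16(x: int) -> int:
    """Reverse the bytes within each halfword of a 32-bit value."""
    x &= 0xFFFFFFFF
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)


def revsh(x: int) -> int:
    """Reverse the bytes of the low halfword and sign-extend to 32 bits."""
    half = ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)
    return half | 0xFFFF0000 if half & 0x8000 else half
```

For example, rev(0x11223344) gives 0x44332211 and rev16(0x11223344) gives 0x22114433. revsh(0x00001280) gives 0xFFFF8012, because the reversed halfword 0x8012 has its sign bit set.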
A3.3.5 Endianness in Advanced SIMD
Advanced SIMD element load/store instructions transfer vectors of elements between memory and the
Advanced SIMD register bank. An instruction specifies both the length of the transfer and the size of the
data elements being transferred. This information is used by the processor to load and store data correctly
in both big-endian and little-endian systems.
Consider, for example, the instruction:
VLD1.16 {D0}, [R1]
This loads a 64-bit register with four 16-bit values. The four elements appear in the register in array order,
with the lowest indexed element fetched from the lowest address. The order of bytes in the elements depends
on the endianness configuration, as shown in Figure A3-1. Therefore, the order of the elements in the
registers is the same regardless of the endianness configuration. This means that Advanced SIMD code is
usually independent of endianness.
Figure A3-1 Advanced SIMD byte order example
The Advanced SIMD extension supports Little-Endian (LE) and Big-Endian (BE) models.
For information about the alignment of Advanced SIMD instructions see Unaligned data access on
page A3-5.
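The endianness independence of the VLD1.16 example can be checked with a small model. In this Python sketch, vld1_16 and store_elements are hypothetical helpers and the element values are invented. The model treats the register as a list of four elements indexed in array order.

```python
def store_elements(elements, big_endian: bool) -> bytes:
    """Lay four 16-bit elements out in memory under the given byte order."""
    order = "big" if big_endian else "little"
    return b"".join(e.to_bytes(2, order) for e in elements)


def vld1_16(memory: bytes, big_endian: bool) -> list:
    """Load four 16-bit elements from 8 bytes; element 0 comes from the lowest address."""
    order = "big" if big_endian else "little"
    return [int.from_bytes(memory[i:i + 2], order) for i in range(0, 8, 2)]


elements = [0xA1A0, 0xB1B0, 0xC1C0, 0xD1D0]   # invented values for A, B, C and D
be_memory = store_elements(elements, big_endian=True)
le_memory = store_elements(elements, big_endian=False)
```

The two memory images differ byte by byte, but loading each with the matching endianness configuration returns the same list of elements in the same order.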
[Figure A3-1 shows VLD1.16 {D0}, [R1] loading from a memory system with big-endian addressing (BE) and from a memory system with little-endian addressing (LE). In both cases the 64-bit register contains four 16-bit elements, from most significant to least significant byte: D[15:8] D[7:0] C[15:8] C[7:0] B[15:8] B[7:0] A[15:8] A[7:0].]
Note
Advanced SIMD is an extension to the ARMv7 ARM and Thumb instruction sets. In ARMv7, the SCTLR.B
bit always has the value 0, indicating that ARMv7 does not support the legacy BE-32 endianness model, and
you cannot use this model with Advanced SIMD element and structure load/store instructions.
A3.4 Synchronization and semaphores
In architecture versions before ARMv6, support for the synchronization of shared memory depends on the SWP and SWPB instructions. These are read-locked-write operations that swap register contents with memory, and are described in SWP, SWPB on page A8-432. These instructions support basic busy/free semaphore mechanisms, but do not support mechanisms that require calculation to be performed on the semaphore between the read and write phases.
ARMv6 introduced a new mechanism to support more comprehensive non-blocking synchronization of shared memory, using synchronization primitives that scale for multiprocessor system designs. ARMv6 provided a pair of synchronization primitives, LDREX and STREX. ARMv7 extends the new model by:
• adding byte, halfword and doubleword versions of the synchronization primitives
• adding a Clear-Exclusive instruction, CLREX
• adding the synchronization primitives to the Thumb instruction set.
Note
From ARMv6, use of the SWP and SWPB instructions is deprecated. ARM strongly recommends that all software migrates to using the new synchronization primitives described in this section.
In ARMv7, the synchronization primitives provided in the ARM and Thumb instruction sets are:
• Load-Exclusives:
  - LDREX, see LDREX on page A8-142
  - LDREXB, see LDREXB on page A8-144
  - LDREXD, see LDREXD on page A8-146
  - LDREXH, see LDREXH on page A8-148
• Store-Exclusives:
  - STREX, see STREX on page A8-400
  - STREXB, see STREXB on page A8-402
  - STREXD, see STREXD on page A8-404
  - STREXH, see STREXH on page A8-406
• Clear-Exclusive, CLREX, see CLREX on page A8-70.
Note
This section describes the operation of a Load-Exclusive/Store-Exclusive pair of synchronization primitives using, as examples, the LDREX and STREX instructions. The same description applies to any other pair of synchronization primitives:
• LDREXB used with STREXB
• LDREXD used with STREXD
• LDREXH used with STREXH.
Each Load-Exclusive instruction must be used only with the corresponding Store-Exclusive instruction.
The model for the use of a Load-Exclusive/Store-Exclusive instruction pair, accessing a non-aborting
memory address x is:
The Load-Exclusive instruction reads a value from memory address x.
The corresponding Store-Exclusive instruction succeeds in writing back to memory address x only if
no other observer, process, or thread has performed a more recent store of address x. The
Store-Exclusive operation returns a status bit that indicates whether the memory write succeeded.
A Load-Exclusive instruction tags a small block of memory for exclusive access. The size of the tagged
block is IMPLEMENTATION DEFINED, see Tagging and the size of the tagged memory block on page A3-20.
A Store-Exclusive instruction to the same address clears the tag.
Note
In this section, the term processor includes any observer that can generate a Load-Exclusive or a
Store-Exclusive.
A3.4.1 Exclusive access instructions and Non-shareable memory regions
For memory regions that do not have the Shareable attribute, the exclusive access instructions rely on a local
monitor that tags any address from which the processor executes a Load-Exclusive. Any non-aborted
attempt by the same processor to use a Store-Exclusive to modify any address is guaranteed to clear the tag.
A Load-Exclusive performs a load from memory, and:
the executing processor tags the physical memory address for exclusive access
the local monitor of the executing processor transitions to its Exclusive Access state.
A Store-Exclusive performs a conditional store to memory, that depends on the state of the local monitor:
If the local monitor is in its Exclusive Access state
  • If the address of the Store-Exclusive is the same as the address that has been tagged in the monitor by an earlier Load-Exclusive, then the store takes place, otherwise it is IMPLEMENTATION DEFINED whether the store takes place.
  • A status value is returned to a register:
    - if the store took place the status value is 0
    - otherwise, the status value is 1.
  • The local monitor of the executing processor transitions to its Open Access state.
If the local monitor is in its Open Access state
  • no store takes place
  • a status value of 1 is returned to a register
  • the local monitor remains in its Open Access state.
The Store-Exclusive instruction defines the register to which the status value is returned.
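The local monitor behavior described in this section can be modelled as a two-state machine. The Python sketch below makes several simplifying assumptions that are not architectural requirements: the monitor tags exactly one address rather than an IMPLEMENTATION DEFINED block, a Store-Exclusive to a non-tagged address does not update memory (one of the permitted IMPLEMENTATION DEFINED choices), and other observers are ignored. The class, method and function names are invented for this example.

```python
class LocalMonitor:
    """Local monitor for one processor: Open Access or Exclusive Access."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.tagged = None          # None models the Open Access state

    def load_exclusive(self, address: int) -> int:
        self.tagged = address       # tag the address, enter Exclusive Access
        return self.memory[address]

    def store_exclusive(self, address: int, value: int) -> int:
        """Return the status value: 0 if the store took place, 1 otherwise."""
        if self.tagged is None:
            return 1                # Open Access: no store takes place
        matched = self.tagged == address
        self.tagged = None          # always transitions to Open Access
        if matched:
            self.memory[address] = value
            return 0
        return 1                    # choice made by this model: no store

    def clear_exclusive(self) -> None:
        self.tagged = None          # models CLREX


def atomic_add(monitor: LocalMonitor, address: int, amount: int) -> int:
    """Retry a Load-Exclusive/Store-Exclusive pair until the store succeeds."""
    while True:
        new_value = monitor.load_exclusive(address) + amount
        if monitor.store_exclusive(address, new_value) == 0:
            return new_value
```

The retry loop in atomic_add reflects the usual pattern for these primitives: the calculation between the load and the store is repeated whenever the status value reports that the store did not take place.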
When a processor writes using any instruction other than a Store-Exclusive:
• if the write is to a physical address that is not covered by its local monitor the write does not affect
  the state of the local monitor
• if the write is to a physical address that is covered by its local monitor it is IMPLEMENTATION DEFINED
  whether the write affects the state of the local monitor.
If the local monitor is in its Exclusive Access state and the processor performs a Store-Exclusive to any
address other than the last one from which it performed a Load-Exclusive, it is IMPLEMENTATION DEFINED
whether the store updates memory, but in all cases the local monitor is reset to its Open Access state. This
mechanism:
• is used on a context switch, see Context switch support on page A3-21
• must be treated as a software programming error in all other cases.
Note
It is UNPREDICTABLE whether a store to a tagged physical address causes a tag in the local monitor to be
cleared if that store is by an observer other than the one that caused the physical address to be tagged.
Figure A3-2 shows the state machine for the local monitor. Table A3-6 on page A3-15 shows the effect of
each of the operations shown in the figure.
Figure A3-2 Local monitor state machine diagram
[Figure not reproduced. The diagram has two states, Open Access and Exclusive Access, with these transitions:]
Open Access to Open Access:            CLREX, StoreExcl(x), Store(x)
Open Access to Exclusive Access:       LoadExcl(x)
Exclusive Access to Exclusive Access:  LoadExcl(x), Store(!Tagged_address), Store(Tagged_address) *
Exclusive Access to Open Access:       CLREX, StoreExcl(Tagged_address), StoreExcl(!Tagged_address), Store(Tagged_address) *
Operations marked * are possible alternative IMPLEMENTATION DEFINED options.
Any LoadExcl operation updates the tagged address to the most significant bits of the address x used
for the operation. For more information see the section Size of the tagged memory block.
In the diagram:
LoadExcl represents any Load-Exclusive instruction
StoreExcl represents any Store-Exclusive instruction
Store represents any other store instruction.
Note
For the local monitor state machine, as shown in Figure A3-2 on page A3-14:
• The IMPLEMENTATION DEFINED options for the local monitor are consistent with the local monitor
  being constructed so that it does not hold any physical address, but instead treats any access as
  matching the address of the previous LoadExcl.
• A local monitor implementation can be unaware of Load-Exclusive and Store-Exclusive operations
  from other processors.
• It is UNPREDICTABLE whether the transition from Exclusive Access to Open Access state occurs when
  the Store or StoreExcl is from another observer.
Table A3-6 shows the effect of the operations shown in Figure A3-2 on page A3-14.
Table A3-6 Effect of Exclusive instructions and write operations on local monitor

Initial state     Operation (a)  Effect                                             Final state
Open Access       CLREX          No effect                                          Open Access
Open Access       StoreExcl(x)   Does not update memory, returns status 1           Open Access
Open Access       LoadExcl(x)    Loads value from memory, tags address x            Exclusive Access
Open Access       Store(x)       Updates memory, no effect on monitor               Open Access
Exclusive Access  CLREX          Clears tagged address                              Open Access
Exclusive Access  StoreExcl(t)   Updates memory, returns status 0                   Open Access
Exclusive Access  StoreExcl(!t)  Updates memory, returns status 0 (b)               Open Access
                                 Does not update memory, returns status 1 (b)       Open Access
Exclusive Access  LoadExcl(x)    Loads value from memory, changes tag address to x  Exclusive Access
Exclusive Access  Store(!t)      Updates memory, no effect on monitor               Exclusive Access
Exclusive Access  Store(t)       Updates memory                                     Exclusive Access (b) or Open Access (b)

a. In the table:
   LoadExcl represents any Load-Exclusive instruction
   StoreExcl represents any Store-Exclusive instruction
   Store represents any store operation other than a Store-Exclusive operation.
   t is the tagged address, bits [31:a] of the address of the last Load-Exclusive instruction. For more information, see
   Tagging and the size of the tagged memory block on page A3-20.
b. IMPLEMENTATION DEFINED alternative actions.
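The transitions in Table A3-6 can be sketched as a small C model. This is an illustrative sketch only, not part of the architecture: the names local_monitor, monitor_init, load_excl, store_excl, store and clrex are hypothetical, and the two IMPLEMENTATION DEFINED alternatives are exposed as flags so that either permitted behavior can be selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the local monitor; all names are illustrative. */
typedef enum { OPEN_ACCESS, EXCLUSIVE_ACCESS } monitor_state;

typedef struct {
    monitor_state state;
    uint32_t tag;               /* Memory_address[31:a] of the last LoadExcl */
    unsigned a;                 /* IMPLEMENTATION DEFINED, 3 to 11 */
    bool storeexcl_any_updates; /* option: StoreExcl(!t) updates memory */
    bool store_t_clears;        /* option: Store(t) moves the monitor to Open Access */
} local_monitor;

void monitor_init(local_monitor *m, unsigned a,
                  bool storeexcl_any_updates, bool store_t_clears)
{
    assert(a >= 3u && a <= 11u);
    m->state = OPEN_ACCESS;
    m->tag = 0u;
    m->a = a;
    m->storeexcl_any_updates = storeexcl_any_updates;
    m->store_t_clears = store_t_clears;
}

/* LoadExcl(x): tag address x and enter Exclusive Access. */
void load_excl(local_monitor *m, uint32_t addr)
{
    m->tag = addr >> m->a;
    m->state = EXCLUSIVE_ACCESS;
}

/* StoreExcl: returns the status value, 0 if the store took place, 1 otherwise.
 * The monitor always ends in Open Access. */
int store_excl(local_monitor *m, uint32_t addr)
{
    int status = 1;
    if (m->state == EXCLUSIVE_ACCESS) {
        bool tagged = (addr >> m->a) == m->tag;
        if (tagged || m->storeexcl_any_updates)
            status = 0;
    }
    m->state = OPEN_ACCESS;
    return status;
}

/* Store: always updates memory; only Store(t) can affect the monitor. */
void store(local_monitor *m, uint32_t addr)
{
    if (m->state == EXCLUSIVE_ACCESS &&
        (addr >> m->a) == m->tag && m->store_t_clears)
        m->state = OPEN_ACCESS;
}

/* CLREX: clear the tag, leaving the monitor in Open Access. */
void clrex(local_monitor *m)
{
    m->state = OPEN_ACCESS;
}
```

In this model, with a == 4, a LoadExcl of 0x000341B4 followed by a StoreExcl to 0x000341B8 returns status 0, because both addresses have the same tag.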
A3.4.2 Exclusive access instructions and Shareable memory regions
For memory regions that have the Shareable attribute, exclusive access instructions rely on:
• A local monitor for each processor in the system, that tags any address from which the processor
  executes a Load-Exclusive. The local monitor operates as described in Exclusive access instructions
  and Non-shareable memory regions on page A3-13, except that for Shareable memory any
  Store-Exclusive is then subject to checking by the global monitor if it is described in that section as
  doing at least one of:
    - updating memory
    - returning a status value of 0.
  The local monitor can ignore exclusive accesses from other processors in the system.
• A global monitor that tags a physical address as exclusive access for a particular processor. This tag
  is used later to determine whether a Store-Exclusive to that address that has not been failed by the
  local monitor can occur. Any successful write to the tagged address by any other observer in the
  shareability domain of the memory location is guaranteed to clear the tag. For each processor in the
  system, the global monitor:
    - holds a single tagged address
    - maintains a state machine.
The global monitor can either reside in a processor block or exist as a secondary monitor at the memory
interfaces.
Note
An implementation can combine the functionality of the global and local monitors into a single unit.
Operation of the global monitor
Load-Exclusive from Shareable memory performs a load from memory, and causes the physical address of
the access to be tagged as exclusive access for the requesting processor. This access also causes the exclusive
access tag to be removed from any other physical address that has been tagged by the requesting processor.
The global monitor only supports a single outstanding exclusive access to Shareable memory per processor.
Store-Exclusive performs a conditional store to memory:
• The store is guaranteed to succeed only if the physical address accessed is tagged as exclusive access
  for the requesting processor and both the local monitor and the global monitor state machines for the
  requesting processor are in the Exclusive Access state. In this case:
    - a status value of 0 is returned to a register to acknowledge the successful store
    - the final state of the global monitor state machine for the requesting processor is
      IMPLEMENTATION DEFINED
    - if the address accessed is tagged for exclusive access in the global monitor state machine for
      any other processor then that state machine transitions to Open Access state.
• If no address is tagged as exclusive access for the requesting processor, the store does not succeed:
    - a status value of 1 is returned to a register to indicate that the store failed
    - the global monitor is not affected and remains in Open Access state for the requesting
      processor.
• If a different physical address is tagged as exclusive access for the requesting processor, it is
  IMPLEMENTATION DEFINED whether the store succeeds or not:
    - if the store succeeds a status value of 0 is returned to a register, otherwise a value of 1 is
      returned
    - if the global monitor state machine for the processor was in the Exclusive Access state before
      the Store-Exclusive it is IMPLEMENTATION DEFINED whether that state machine transitions to
      the Open Access state.
The Store-Exclusive instruction defines the register to which the status value is returned.
In a shared memory system, the global monitor implements a separate state machine for each processor in
the system. The state machine for accesses to Shareable memory by processor (n) can respond to all the
Shareable memory accesses visible to it. This means it responds to:
• accesses generated by the associated processor (n)
• accesses generated by the other observers in the shareability domain of the memory location (!n).
In a shared memory system, the global monitor implements a separate state machine for each observer that
can generate a Load-Exclusive or a Store-Exclusive in the system.
Figure A3-3 on page A3-18 shows the state machine for processor(n) in a global monitor. Table A3-7 on
page A3-19 shows the effect of each of the operations shown in the figure.
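The global monitor behavior described above can be sketched as a C model with one state machine per processor. This is a simplified illustration under stated assumptions, not an architectural definition: the names global_monitor, gm_reset, gm_load_excl, gm_store_excl and gm_store are hypothetical, the system size and granule size are arbitrary, the local monitor and CLREX are not modeled, and each IMPLEMENTATION DEFINED choice is fixed to one permitted option as noted in the comments.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PROCESSORS 2u /* hypothetical system size */
#define GRANULE_SHIFT  3u /* hypothetical value of a */

typedef enum { GM_OPEN, GM_EXCLUSIVE } gm_state;

/* One state machine and one tagged address per processor. */
typedef struct {
    gm_state state[NUM_PROCESSORS];
    uint32_t tag[NUM_PROCESSORS];
} global_monitor;

void gm_reset(global_monitor *g)
{
    for (unsigned p = 0; p < NUM_PROCESSORS; p++) {
        g->state[p] = GM_OPEN;
        g->tag[p] = 0u;
    }
}

/* A successful write by processor n clears matching tags held by others. */
static void clear_other_tags(global_monitor *g, uint32_t tag, unsigned n)
{
    for (unsigned p = 0; p < NUM_PROCESSORS; p++) {
        if (p != n && g->state[p] == GM_EXCLUSIVE && g->tag[p] == tag)
            g->state[p] = GM_OPEN;
    }
}

/* LoadExcl(x,n): replaces any earlier tag held for processor n. */
void gm_load_excl(global_monitor *g, uint32_t addr, unsigned n)
{
    assert(n < NUM_PROCESSORS);
    g->tag[n] = addr >> GRANULE_SHIFT;
    g->state[n] = GM_EXCLUSIVE;
}

/* StoreExcl(x,n): returns status 0 on success, 1 on failure. */
int gm_store_excl(global_monitor *g, uint32_t addr, unsigned n)
{
    assert(n < NUM_PROCESSORS);
    uint32_t tag = addr >> GRANULE_SHIFT;
    if (g->state[n] != GM_EXCLUSIVE)
        return 1;          /* nothing tagged: store fails, state stays Open */
    g->state[n] = GM_OPEN; /* model choice for the IMPLEMENTATION DEFINED final state */
    if (g->tag[n] != tag)
        return 1;          /* model choice: fail when a different address is tagged */
    clear_other_tags(g, tag, n);
    return 0;
}

/* Store(x,n): the model leaves processor n's own state unchanged (one permitted option). */
void gm_store(global_monitor *g, uint32_t addr, unsigned n)
{
    assert(n < NUM_PROCESSORS);
    clear_other_tags(g, addr >> GRANULE_SHIFT, n);
}
```

When two processors both tag the same address, only the first Store-Exclusive succeeds in this model; the second processor's state machine has already moved to Open Access, so it must repeat its Load-Exclusive.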
Figure A3-3 Global monitor state machine diagram for processor(n) in a multiprocessor system
Note
For the global monitor state machine, as shown in Figure A3-3:
• Whether a Store-Exclusive successfully updates memory or not depends on whether the address
  accessed matches the tagged Shareable memory address for the processor issuing the Store-Exclusive
  instruction. For this reason, Figure A3-3 and Table A3-7 on page A3-19 only show how the (!n)
  entries cause state transitions of the state machine for processor(n).
• A Load-Exclusive can only update the tagged Shareable memory address for the processor issuing
  the Load-Exclusive instruction.
• The effect of the CLREX instruction on the global monitor is IMPLEMENTATION DEFINED.
• It is IMPLEMENTATION DEFINED:
    - whether a modification to a non-shareable memory location can cause a global monitor to
      transition from Exclusive Access to Open Access state
    - whether a Load-Exclusive to a non-shareable memory location can cause a global monitor to
      transition from Open Access to Exclusive Access state.
[Figure A3-3 not reproduced. The diagram has two states, Open Access and Exclusive Access, with these transitions:]
Open Access to Open Access:            CLREX(n), CLREX(!n), LoadExcl(x,!n), StoreExcl(x,n), StoreExcl(x,!n),
                                       Store(x,n), Store(x,!n)
Open Access to Exclusive Access:       LoadExcl(x,n)
Exclusive Access to Open Access:       StoreExcl(Tagged_address,!n), Store(Tagged_address,!n),
                                       StoreExcl(Tagged_address,n) *, StoreExcl(!Tagged_address,n) *,
                                       Store(Tagged_address,n) *, CLREX(n) *
Exclusive Access to Exclusive Access:  StoreExcl(Tagged_address,!n), Store(!Tagged_address,n),
                                       StoreExcl(Tagged_address,n) *, StoreExcl(!Tagged_address,n) *,
                                       Store(Tagged_address,n) *, CLREX(n) *, StoreExcl(!Tagged_address,!n),
                                       Store(!Tagged_address,!n), CLREX(!n), LoadExcl(x,n)
StoreExcl(Tagged_Address,!n) clears the monitor only if the StoreExcl updates memory.
Any LoadExcl operation updates the tagged address to the most significant bits of the address x
used for the operation. For more information see the section Size of the tagged memory block.
Operations marked * are possible alternative IMPLEMENTATION DEFINED options.
In the diagram:
LoadExcl represents any Load-Exclusive instruction
StoreExcl represents any Store-Exclusive instruction
Store represents any other store instruction.
Table A3-7 shows the effect of the operations shown in Figure A3-3 on page A3-18.
Table A3-7 Effect of load/store operations on global monitor for processor(n)

Initial state (a)  Operation (b)            Effect                                                                    Final state (a)
Open               CLREX(n), CLREX(!n)      None                                                                      Open
Open               StoreExcl(x,n)           Does not update memory, returns status 1                                  Open
Open               LoadExcl(x,!n)           Loads value from memory, no effect on tag address for processor(n)        Open
Open               StoreExcl(x,!n)          Depends on state machine and tag address for processor issuing STREX (c)  Open
Open               Store(x,n), Store(x,!n)  Updates memory, no effect on monitor                                      Open
Open               LoadExcl(x,n)            Loads value from memory, tags address x                                   Exclusive
Exclusive          LoadExcl(x,n)            Loads value from memory, tags address x                                   Exclusive
Exclusive          CLREX(n)                 None. Effect on the final state is IMPLEMENTATION DEFINED.                Exclusive (e) or Open (e)
Exclusive          CLREX(!n)                None                                                                      Exclusive
Exclusive          StoreExcl(t,!n)          Updates memory, returns status 0 (c)                                      Open
                                            Does not update memory, returns status 1 (c)                              Exclusive
Exclusive          StoreExcl(t,n)           Updates memory, returns status 0 (d)                                      Open or Exclusive
Exclusive          StoreExcl(!t,n)          Updates memory, returns status 0 (e)                                      Open or Exclusive
                                            Does not update memory, returns status 1 (e)                              Open or Exclusive
Exclusive          StoreExcl(!t,!n)         Depends on state machine and tag address for processor issuing STREX      Exclusive
A3.4.3 Tagging and the size of the tagged memory block
As stated in the footnotes to Table A3-6 on page A3-15 and Table A3-7 on page A3-19, when a
Load-Exclusive instruction is executed, the resulting tag address ignores the least significant bits of the
memory address.
Tagged_address = Memory_address[31:a]
The value of a in this assignment is IMPLEMENTATION DEFINED, between a minimum value of 3 and a
maximum value of 11. For example, in an implementation where a == 4, a successful LDREX of address
0x000341B4 gives a tag value of bits [31:4] of the address, giving 0x000341B. This means that the four words
of memory from 0x000341B0 to 0x000341BF are tagged for exclusive access.
The size of the tagged memory block is called the Exclusives Reservation Granule. The Exclusives
Reservation Granule is IMPLEMENTATION DEFINED between:
• two words, in an implementation with a == 3
• 512 words, in an implementation with a == 11.
In some implementations the CTR identifies the Exclusives Reservation Granule, see:
• c0, Cache Type Register (CTR) on page B3-83 for a VMSA implementation
• c0, Cache Type Register (CTR) on page B4-34 for a PMSA implementation.
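The tag calculation in Tagging and the size of the tagged memory block can be checked with a few lines of C. This is a minimal sketch for illustration; the function names tagged_address, granule_bytes, granule_base and granule_last are hypothetical and are not defined by the architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Tagged_address = Memory_address[31:a], where a is 3 to 11. */
uint32_t tagged_address(uint32_t addr, unsigned a)
{
    assert(a >= 3u && a <= 11u);
    return addr >> a;
}

/* Size of the Exclusives Reservation Granule in bytes: 2^a. */
uint32_t granule_bytes(unsigned a)
{
    assert(a >= 3u && a <= 11u);
    return UINT32_C(1) << a;
}

/* First byte address covered by the tag for addr. */
uint32_t granule_base(uint32_t addr, unsigned a)
{
    return addr & ~(granule_bytes(a) - 1u);
}

/* Last byte address covered by the tag for addr. */
uint32_t granule_last(uint32_t addr, unsigned a)
{
    return granule_base(addr, a) + granule_bytes(a) - 1u;
}
```

With a == 4 and address 0x000341B4 these functions give the tag 0x000341B and the range 0x000341B0 to 0x000341BF. The two limits of a give granules of 8 bytes (two words) and 2048 bytes (512 words).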
Table A3-7 Effect of load/store operations on global monitor for processor(n) (continued)

Initial state (a)  Operation (b)              Effect                                Final state (a)
Exclusive          Store(t,n)                 Updates memory                        Exclusive (e) or Open (e)
Exclusive          Store(t,!n)                Updates memory                        Open
Exclusive          Store(!t,n), Store(!t,!n)  Updates memory, no effect on monitor  Exclusive

a. Open = Open Access state, Exclusive = Exclusive Access state.
b. In the table:
   LoadExcl represents any Load-Exclusive instruction
   StoreExcl represents any Store-Exclusive instruction
   Store represents any store operation other than a Store-Exclusive operation.
   t is the tagged address for processor(n), bits [31:a] of the address of the last Load-Exclusive instruction issued by
   processor(n), see Tagging and the size of the tagged memory block.
c. The result of a STREX(x,!n) or a STREX(t,!n) operation depends on the state machine and tagged address for the processor
   issuing the STREX instruction. This table shows how each possible outcome affects the state machine for processor(n).
d. After a successful STREX to the tagged address, the state of the state machine is IMPLEMENTATION DEFINED. However,
   this state has no effect on the subsequent operation of the global monitor.
e. Effect is IMPLEMENTATION DEFINED. The table shows all permitted implementations.
A3.4.4 Context switch support
After a context switch, software must ensure that the local monitor is in the Open Access state. This requires
it to either:
• execute a CLREX instruction
• execute a dummy STREX to a memory address allocated for this purpose.
Note
• Using a dummy STREX for this purpose is backwards-compatible with the ARMv6 implementation of
  the exclusive operations. The CLREX instruction is introduced in ARMv6K.
• Context switching is not an application level operation. However, this information is included here to
  complete the description of the exclusive operations.
The STREX or CLREX instruction following a context switch might cause a subsequent Store-Exclusive to fail,
requiring a load … store sequence to be replayed. To minimize the possibility of this happening, ARM
recommends that the Store-Exclusive instruction is kept as close as possible to the associated
Load-Exclusive instruction, see Load-Exclusive and Store-Exclusive usage restrictions.
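The requirement to clear the local monitor on a context switch might look as follows in C. This is a hedged sketch, not operating system code: the task type, current_task, monitor_clear_count, clear_local_monitor and context_switch are hypothetical names, register save and restore is omitted, and the counter exists only to make the behavior observable. The inline assembly uses the GCC and Clang extended asm syntax and is compiled only when the ACLE macro __ARM_ARCH indicates an ARMv7 or later 32-bit target; on any other host the function only updates the counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor; a real kernel would also hold saved registers. */
typedef struct task { int id; } task;

task *current_task;           /* hypothetical scheduler state */
unsigned monitor_clear_count; /* illustration only: counts monitor clears */

/* Put the local monitor into the Open Access state. */
void clear_local_monitor(void)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    __asm__ volatile("clrex" ::: "memory");
#endif
    monitor_clear_count++;
}

/* Switch to the next task, then clear the monitor so that a Store-Exclusive
 * in the new task cannot pair with a Load-Exclusive from the old one. */
void context_switch(task *next)
{
    assert(next != NULL);
    current_task = next;
    clear_local_monitor();
}
```

Code that must also run on ARMv6 implementations without CLREX would instead perform a dummy STREX to a dedicated location, as described above.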
A3.4.5 Load-Exclusive and Store-Exclusive usage restrictions
The Load-Exclusive and Store-Exclusive instructions are intended to work together, as a pair, for example
a LDREX/STREX pair or a LDREXB/STREXB pair. As mentioned in Context switch support, ARM recommends that
the Store-Exclusive instruction always follows within a few instructions of its associated Load-Exclusive
instruction. To support different implementations of these functions, software must follow the notes and
restrictions given here.
These notes describe use of an LDREX/STREX pair, but apply equally to any other
Load-Exclusive/Store-Exclusive pair:
• The exclusives support a single outstanding exclusive access for each processor thread that is
  executed. The architecture makes use of this by not requiring an address or size check as part of the
  IsExclusiveLocal() function. If the target address of an STREX is different from the preceding LDREX in
  the same execution thread, behavior can be UNPREDICTABLE. As a result, an LDREX/STREX pair can only
  be relied upon to eventually succeed if they are executed with the same address. Where a context
  switch or exception might result in a change of execution thread, a CLREX instruction or a dummy STREX
  instruction must be executed to avoid unwanted effects, as described in Context switch support.
  Using an STREX in this way is the only occasion where software can program an STREX with a different
  address from the previously executed LDREX.
• An explicit store to memory can cause the clearing of exclusive monitors associated with other
  processors, therefore, performing a store between the LDREX and the STREX can result in a livelock
  situation. As a result, code must avoid placing an explicit store between an LDREX and an STREX in a
  single code sequence.
• If two STREX instructions are executed without an intervening LDREX the second STREX returns a status
  value of 1. This means that:
    - every STREX must have a preceding LDREX associated with it in a given thread of execution
    - it is not necessary for every LDREX to have a subsequent STREX.
• An implementation of the Load-Exclusive and Store-Exclusive instructions can require that, in any
  thread of execution, the transaction size of a Store-Exclusive is the same as the transaction size of the
  preceding Load-Exclusive that was executed in that thread. If the transaction size of a
  Store-Exclusive is different from the preceding Load-Exclusive in the same execution thread,
  behavior can be UNPREDICTABLE. As a result, an LDREX/STREX pair can be relied upon to
  eventually succeed only if they have the same size. Where a context switch or exception might result
  in a change of execution thread, the software must execute a CLREX instruction or a dummy STREX
  instruction to avoid unwanted effects, as described in Context switch support on page A3-21. Using
  an STREX in this way is the only occasion where software can use a Store-Exclusive instruction with
  a different transaction size from the previously executed Load-Exclusive instruction.
• An implementation might clear an exclusive monitor between the LDREX and the STREX, without any
  application-related cause. For example, this might happen because of cache evictions. Code written
  for such an implementation must avoid having any explicit memory accesses or cache maintenance
  operations between the LDREX and STREX instructions.
• Implementations can benefit from keeping the LDREX and STREX operations close together in a single
  code sequence. This minimizes the likelihood of the exclusive monitor state being cleared between
  the LDREX instruction and the STREX instruction. Therefore, ARM strongly recommends a limit of 128
  bytes between LDREX and STREX instructions in a single code sequence, for best performance.
• Implementations that implement coherent protocols, or have only a single master, might combine the
  local and global monitors for a given processor. The IMPLEMENTATION DEFINED and UNPREDICTABLE
  parts of the definitions in Exclusive monitors operations on page B2-35 are provided to cover this
  behavior.
• The architecture sets an upper limit of 2048 bytes on the size of a region that can be marked as
  exclusive. Therefore, for performance reasons, ARM recommends that software separates objects
  that will be accessed by exclusive accesses by at least 2048 bytes. This is a performance guideline
  rather than a functional requirement.
• LDREX and STREX operations must be performed only on memory with the Normal memory attribute.
• The effect of Data Abort exceptions on the state of monitors is UNPREDICTABLE. ARM recommends
  that abort handling code performs a CLREX instruction or a dummy STREX instruction to clear the
  monitor state.
• If the memory attributes for the memory being accessed by an LDREX/STREX pair are changed between
  the LDREX and the STREX, behavior is UNPREDICTABLE.
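The retry pattern that these restrictions imply can be sketched in portable C11. This is an illustration under stated assumptions, not a prescribed sequence: the names exclusive_counter and counter_increment are hypothetical, and the sketch relies on the common behavior of GCC and Clang for ARMv7 targets, which typically lower a C11 compare-exchange to a short LDREX/STREX loop. The weak compare-exchange is permitted to fail spuriously, which corresponds to a Store-Exclusive returning status 1, so the operation is retried with the same address and size. The 2048-byte alignment follows the performance guideline for separating objects that are accessed by exclusive accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical counter type. The alignment pads the structure to 2048 bytes,
 * so that no other object shares the largest possible reservation granule. */
typedef struct {
    _Alignas(2048) atomic_uint value;
} exclusive_counter;

/* Atomically increment the counter and return the new value. */
unsigned counter_increment(exclusive_counter *c)
{
    assert(c != NULL);
    unsigned old = atomic_load_explicit(&c->value, memory_order_relaxed);
    /* On failure, old is reloaded with the current value and the update is
     * retried. There are no other memory accesses inside the loop. */
    while (!atomic_compare_exchange_weak_explicit(&c->value, &old, old + 1u,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
    }
    return old + 1u;
}
```

The relaxed memory order is used here because the sketch only shows the retry loop. Code that uses such an update to protect other data also needs the barriers described in Synchronization primitives and the memory order model.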
A3.4.6 Semaphores
The Swap (SWP) and Swap Byte (SWPB) instructions must be used with care to ensure that expected behavior
is observed. Two examples are as follows:
1. A system with multiple bus masters that uses Swap instructions to implement semaphores that control
   interactions between different bus masters.
   In this case, the semaphores must be placed in an uncached region of memory, where any buffering
   of writes occurs at a point common to all bus masters using the mechanism. The Swap instruction
   then causes a locked read-write bus transaction.
2. A system with multiple threads running on a uniprocessor that uses the Swap instructions to
   implement semaphores that control interaction of the threads.
   In this case, the semaphores can be placed in a cached region of memory, and a locked read-write bus
   transaction might or might not occur. The Swap and Swap Byte instructions are likely to have better
   performance on such a system than they do on a system with multiple bus masters such as that
   described in example 1.
Note
From ARMv6, use of the Swap and Swap Byte instructions is deprecated. All new software should use the
Load-Exclusive and Store-Exclusive synchronization primitives described in Synchronization and
semaphores on page A3-12, for example LDREX and STREX.
A3.4.7 Synchronization primitives and the memory order model
The synchronization primitives follow the memory order model of the memory type accessed by the
instructions. For this reason:
• Portable code for claiming a spin-lock must include a Data Memory Barrier (DMB) operation,
  performed by a DMB instruction, between claiming the spin-lock and making any access that makes
  use of the spin-lock.
• Portable code for releasing a spin-lock must include a DMB instruction before writing to clear the
  spin-lock.
This requirement applies to code using:
• the Load-Exclusive/Store-Exclusive instruction pairs, for example LDREX/STREX
• the deprecated synchronization primitives, SWP/SWPB.
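The placement of the two barriers can be sketched with C11 atomics. This is a minimal illustration, not a production lock: the names spinlock, spin_trylock, spin_lock and spin_unlock are hypothetical. The sketch assumes the usual compiler mapping for ARMv7 targets, where the compare-exchange becomes a Load-Exclusive/Store-Exclusive loop and each atomic_thread_fence call becomes a DMB instruction; other toolchains might generate different code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin-lock: 0 means free, 1 means held. */
typedef struct {
    atomic_uint locked;
} spinlock;

/* Try once to claim the lock. Returns true if the lock was claimed. */
bool spin_trylock(spinlock *l)
{
    assert(l != NULL);
    unsigned expected = 0u;
    if (!atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1u,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed))
        return false;
    /* Barrier between claiming the lock and any access that uses it. */
    atomic_thread_fence(memory_order_acquire);
    return true;
}

/* Spin until the lock is claimed. */
void spin_lock(spinlock *l)
{
    while (!spin_trylock(l)) {
    }
}

/* Release the lock. */
void spin_unlock(spinlock *l)
{
    assert(l != NULL);
    /* Barrier before the write that clears the lock. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&l->locked, 0u, memory_order_relaxed);
}
```

The atomic operations themselves use relaxed ordering so that the explicit fences show where the two barriers sit. Acquire and release orderings on the operations themselves would be an equivalent alternative design.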
A3.4.8 Use of WFE and SEV instructions by spin-locks
ARMv7 and ARMv6K provide Wait For Event and Send Event instructions, WFE and SEV, that can assist with
reducing power consumption and bus contention caused by processors repeatedly attempting to obtain a
spin-lock. These instructions can be used at application level, but a complete understanding of what they do
depends on system-level understanding of exceptions. They are described in Wait For Event and Send Event
on page B1-44.
A3.5 Memory types and attributes and the memory order model
ARMv6 defined a set of memory attributes with the characteristics required to support the memory and
devices in the system memory map. In ARMv7 this set of attributes is extended by the addition of the Outer
Shareable attribute for Normal memory.
Note
Whether an ARMv7 implementation supports the Outer Shareable memory attribute is IMPLEMENTATION
DEFINED.
The ordering of accesses for regions of memory, referred to as the memory order model, is defined by the
memory attributes. This model is described in the following sections:
• Memory types
• Summary of ARMv7 memory attributes
• Atomicity in the ARM architecture
• Normal memory
• Device memory
• Strongly-ordered memory
• Memory access restrictions
• Backwards compatibility
• The effect of the Security Extensions.
A3.5.1 Memory types
For each memory region, the most significant memory attribute specifies the memory type. There are three
mutually exclusive memory types:
• Normal
• Device
• Strongly-ordered.
Normal and Device memory regions have additional attributes.
Usually, memory used for program code and for data storage is Normal memory. Examples of Normal
memory technologies are:
• programmed Flash ROM
  Note
  During programming, Flash memory can be ordered more strictly than Normal memory.
• ROM
• SRAM
• DRAM and DDR memory.
System peripherals (I/O) generally conform to different access rules to Normal memory. Examples of I/O
accesses are:
• FIFOs where consecutive accesses
    - add queued values on write accesses
    - remove queued values on read accesses.
• interrupt controller registers where an access can be used as an interrupt acknowledge, changing the
  state of the controller itself
• memory controller configuration registers that are used to set up the timing and correctness of areas
  of Normal memory
• memory-mapped peripherals, where accessing a memory location can cause side effects in the
  system.
In ARMv7, regions of the memory map for these accesses are defined as Device or Strongly-ordered
memory. To ensure system correctness, access rules for Device and Strongly-ordered memory are more
restrictive than those for Normal memory:
• both read and write accesses can have side effects
• accesses must not be repeated, for example, on return from an exception
• the number, order and sizes of the accesses must be maintained.
In addition, for Strongly-ordered memory, all memory accesses are strictly ordered to correspond to the
program order of the memory access instructions.
A3.5.2 Summary of ARMv7 memory attributes
Table A3-8 summarizes the memory attributes. For more information about these attributes see:
• Normal memory on page A3-28 and Shareable attribute for Device memory regions on page A3-34,
  for the shareability attribute
• Write-Through Cacheable, Write-Back Cacheable and Non-cacheable Normal memory on
  page A3-32, for the cacheability attribute.
Table A3-8 Memory attribute summary

Memory type attribute  Shareability  Other attributes  Description
Strongly-ordered       -             -                 All memory accesses to Strongly-ordered memory occur in program order. All Strongly-ordered regions are assumed to be Shareable.
A3.5.3 Atomicity in the ARM architecture
Atomicity is a feature of memory accesses, described as atomic accesses. The ARM architecture description
refers to two types of atomicity, defined in:
Single-copy atomicity on page A3-27
Multi-copy atomicity on page A3-28.
Table A3-8 Memory attribute summary (continued)

Memory type attribute  Shareability     Other attributes  Description
Device                 Shareable        -                 Intended to handle memory-mapped peripherals that are shared by several processors.
Device                 Non-shareable    -                 Intended to handle memory-mapped peripherals that are used only by a single processor.
Normal                 Outer Shareable  Cacheability (a)  The Outer Shareable attribute qualifies the Shareable attribute for Normal memory regions and enables two levels of Normal memory sharing. (b)
Normal                 Inner Shareable  Cacheability (a)  Intended to handle Normal memory that is shared between several processors.
Normal                 Non-shareable    Cacheability (a)  Intended to handle Normal memory that is used by only a single processor.

For each Normal memory entry, Cacheability is one of:
• Non-cacheable
• Write-Through Cacheable
• Write-Back Write-Allocate Cacheable
• Write-Back no Write-Allocate Cacheable

a. The cacheability attribute is defined independently for inner and outer cache regions.
b. The significance of the Outer Shareable attribute is IMPLEMENTATION DEFINED.
Single-copy atomicity
A read or write operation is single-copy atomic if the following conditions are both true:
• After any number of write operations to an operand, the value of the operand is the value written by
  one of the write operations. It is impossible for part of the value of the operand to come from one
  write operation and another part of the value to come from a different write operation.
• When a read operation and a write operation are made to the same operand, the value obtained by the
  read operation is one of:
    - the value of the operand before the write operation
    - the value of the operand after the write operation.
  It is never the case that the value of the read operation is partly the value of the operand before the
  write operation and partly the value of the operand after the write operation.
In ARMv7, the single-copy atomic processor accesses are:
• all byte accesses
• all halfword accesses to halfword-aligned locations
• all word accesses to word-aligned locations
• memory accesses caused by LDREXD and STREXD instructions to doubleword-aligned locations.
LDM, LDC, LDC2, LDRD, STM, STC, STC2, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM, and VSTR instructions are
executed as a sequence of word-aligned word accesses. Each 32-bit word access is guaranteed to be
single-copy atomic. A subsequence of two or more word accesses from the sequence might not exhibit
single-copy atomicity.
Advanced SIMD element and structure loads and stores are executed as a sequence of accesses of the
element or structure size. The element accesses are single-copy atomic if and only if both:
• the element size is 32 bits, or smaller
• the elements are naturally aligned.
Accesses to 64-bit elements or structures that are at least word-aligned are executed as a sequence of 32-bit
accesses, each of which is single-copy atomic. Subsequences of two or more 32-bit accesses from the
sequence might not be single-copy atomic.
When an access is not single-copy atomic, it is executed as a sequence of smaller accesses, each of which
is single-copy atomic, at least at the byte level.
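The list of single-copy atomic processor accesses can be written as a small predicate. This is an illustrative sketch only: the names access_kind and is_single_copy_atomic are hypothetical, and the function describes a single access of the given size, not the multi-access instructions such as LDM or LDRD that are split into word accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the instruction that makes the access. */
typedef enum {
    ACCESS_PLAIN,               /* ordinary load or store */
    ACCESS_EXCLUSIVE_DOUBLEWORD /* LDREXD or STREXD */
} access_kind;

/* Returns true if one access of size_bytes at addr is single-copy atomic. */
bool is_single_copy_atomic(uint32_t addr, unsigned size_bytes, access_kind kind)
{
    assert(size_bytes == 1u || size_bytes == 2u ||
           size_bytes == 4u || size_bytes == 8u);
    switch (size_bytes) {
    case 1u:
        return true;              /* all byte accesses */
    case 2u:
        return (addr % 2u) == 0u; /* halfword-aligned halfwords */
    case 4u:
        return (addr % 4u) == 0u; /* word-aligned words */
    default:
        /* Doublewords only for LDREXD and STREXD, doubleword-aligned. */
        return kind == ACCESS_EXCLUSIVE_DOUBLEWORD && (addr % 8u) == 0u;
    }
}
```

For example, a word access to 0x1002 is not single-copy atomic in this model, and a doubleword access to 0x1008 is single-copy atomic only when it is made by LDREXD or STREXD.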
If an instruction is executed as a sequence of accesses according to these rules, some exceptions can be taken
in the sequence and cause execution of the instruction to be abandoned. These exceptions are:
• synchronous Data Abort exceptions
• if low interrupt latency configuration is selected and the accesses are to Normal memory, see Low
  interrupt latency configuration on page B1-43:
    - IRQ interrupts
    - FIQ interrupts
    - asynchronous aborts.
If any of these exceptions are returned from using their preferred exception return, the instruction that
generated the sequence of accesses is re-executed and so any accesses that had already been performed
before the exception was taken are repeated.
Note
The exception behavior for these multiple access instructions means they are not suitable for use for writes
to memory for the purpose of software synchronization.
For implicit accesses:
• Cache linefills and evictions have no effect on the single-copy atomicity of explicit transactions or
  instruction fetches.
• Instruction fetches are single-copy atomic for each instruction fetched.
  Note
  32-bit Thumb instructions are fetched as two 16-bit items.
• Translation table walks are performed as 32-bit accesses aligned to 32 bits, each of which is
  single-copy atomic.
Multi-copy atomicity
In a multiprocessing system, writes to a memory location are multi-copy atomic if the following conditions
are both true:
• All writes to the same location are serialized, meaning they are observed in the same order by all
  observers, although some observers might not observe all of the writes.
• A read of a location does not return the value of a write until all observers observe that write.
Writes to Normal memory are not multi-copy atomic.
All writes to Device and Strongly-ordered memory that are single-copy atomic are also multi-copy atomic.
All write accesses to the same location are serialized. Write accesses to Normal memory can be repeated up
to the point that another write to the same address is observed.
For Normal memory, serialization of writes does not prohibit the merging of writes.
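As an illustration of the serialization condition only, the following sketch checks whether a set of observations is consistent with a single order of writes to one location. It does not model the second condition for multi-copy atomicity. The function names and the representation of observations are assumptions made for this example, and each write is assumed to have a unique identifier.

```python
from itertools import permutations


def is_subsequence(short, full):
    """True if the items of short appear in full in the same relative order."""
    remaining = iter(full)
    return all(item in remaining for item in short)


def writes_serialized(observations):
    """True if some single order of all writes explains every observer's view.

    observations maps an observer name to the list of write identifiers it
    observed, in the order it observed them. An observer may miss writes.
    """
    all_writes = sorted({w for seen in observations.values() for w in seen})
    return any(
        all(is_subsequence(seen, order) for seen in observations.values())
        for order in permutations(all_writes)
    )


# P1 misses w2 but agrees with P0 on the order of w1 and w3.
consistent = writes_serialized({"P0": ["w1", "w2", "w3"], "P1": ["w1", "w3"]})
# P0 and P1 disagree on the order of w1 and w2.
inconsistent = writes_serialized({"P0": ["w1", "w2"], "P1": ["w2", "w1"]})
```

The brute-force search over permutations is only practical for a handful of writes, which is enough for reasoning about small examples.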
A3.5.4 Normal memory
Normal memory is idempotent, meaning that it exhibits the following properties:
read accesses can be repeated with no side effects
repeated read accesses return the last value written to the resource being read
read accesses can prefetch additional memory locations with no side effects
write accesses can be repeated with no side effects, provided that the contents of the location are
unchanged between the repeated writes
unaligned accesses can be supported
accesses can be merged before accessing the target memory system.
Normal memory can be read/write or read-only, and a Normal memory region is defined as being either
Shareable or Non-shareable. In a VMSA implementation, Shareable Normal memory can be either Inner
Shareable or Outer Shareable. In a PMSA implementation, no distinction is made between Inner Shareable
and Outer Shareable regions.
The Normal memory type attribute applies to most memory used in a system.
Accesses to Normal Memory have a weakly consistent model of memory ordering. See a standard text
describing memory ordering issues for a description of weakly consistent memory models, for example
chapter 2 of Memory Consistency Models for Shared Memory-Multiprocessors, Kourosh Gharachorloo,
Stanford University Technical Report CSL-TR-95-685. In general, for Normal memory, barrier operations
are required where the order of memory accesses observed by other observers must be controlled. This
requirement applies regardless of the cacheability and shareability attributes of the Normal memory region.
The ordering requirements of accesses described in Ordering requirements for memory accesses on
page A3-45 apply to all explicit accesses.
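The need for barriers can be illustrated with a deliberately simplified model of a message-passing pattern. This sketch is not the architecture's ordering model: it assumes that, without a barrier, the two accesses of each thread can be observed in either order, and that a barrier between them forces program order. All names are hypothetical.

```python
from itertools import permutations


def interleavings(a, b):
    """Yield every interleaving of a and b that keeps each sequence's own order."""
    if not a or not b:
        yield tuple(a) + tuple(b)
        return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest


def orders(ops, barrier):
    """Program order only if a barrier separates the accesses, else any order."""
    return [tuple(ops)] if barrier else list(permutations(ops))


def outcomes(barrier):
    """Return every (flag, data) pair the reader can see in this toy model."""
    writer = [("W", "data"), ("W", "flag")]  # publish data, then set flag
    reader = [("R", "flag"), ("R", "data")]  # read flag, then read data
    seen = set()
    for w in orders(writer, barrier):
        for r in orders(reader, barrier):
            for trace in interleavings(w, r):
                mem = {"data": 0, "flag": 0}
                regs = {}
                for kind, loc in trace:
                    if kind == "W":
                        mem[loc] = 1
                    else:
                        regs[loc] = mem[loc]
                seen.add((regs["flag"], regs["data"]))
    return seen


stale_possible = (1, 0) in outcomes(barrier=False)    # True in this model
stale_with_barrier = (1, 0) in outcomes(barrier=True)  # False in this model
```

In the model, the outcome where the reader sees the flag set but reads stale data disappears only when both threads order their accesses, which mirrors the requirement that software use barrier operations when other observers depend on the order.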
An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on
page A3-26 might be abandoned as a result of an exception being taken during the sequence of accesses. On
return from the exception the instruction is restarted, and therefore one or more of the memory locations
might be accessed multiple times. This can result in repeated write accesses to a location that has been
changed between the write accesses.
The architecture permits speculative accesses to memory locations marked as Normal if the access
permissions and domain permit an access to the locations.
A Normal memory region has shareability attributes that define the data coherency properties of the region.
These attributes do not affect the coherency requirements of:
instruction fetches, see Instruction coherency issues on page A3-53
translation table walks, if supported, in the base ARMv7 architecture and in versions of the
architecture before ARMv7, see TLB maintenance operations and the memory order model on
page B3-59.
Non-shareable Normal memory
For a Normal memory region, the Non-shareable attribute identifies Normal memory that is likely to be
accessed only by a single processor.
A region of Normal memory with the Non-shareable attribute does not have any requirement to make data
accesses by different observers coherent, unless the memory is non-cacheable. If other observers share the
memory system, software must use cache maintenance operations if the presence of caches might lead to
coherency issues when communicating between the observers. This cache maintenance requirement is in
addition to the barrier operations that are required to ensure memory ordering.
For Non-shareable Normal memory, the Load-Exclusive and Store-Exclusive synchronization primitives do
not take account of the possibility of accesses by more than one observer.
Shareable, Inner Shareable, and Outer Shareable Normal memory
For Normal memory, the Shareable and Outer Shareable memory attributes describe Normal memory that
is expected to be accessed by multiple processors or other system masters:
In a VMSA implementation, Normal memory that has the Shareable attribute but not the Outer
Shareable attribute assigned is described as having the Inner Shareable attribute.
In a PMSA implementation, no distinction is made between Inner Shareable and Outer Shareable
Normal memory, and you cannot assign the Outer Shareable attribute to Normal memory regions.
A region of Normal memory with the Shareable attribute is one for which data accesses to memory by
different observers within the same shareability domain are coherent.
The Outer Shareable attribute is introduced in ARMv7, and can be applied only to a Normal memory region
in a VMSA implementation that has the Shareable attribute assigned. It creates three levels of shareability
for a Normal memory region:
Non-shareable
A Normal memory region that does not have the Shareable attribute assigned.
Inner Shareable
A Normal memory region that has the Shareable attribute assigned, but not the Outer
Shareable attribute.
Outer Shareable
A Normal memory region that has both the Shareable and the Outer Shareable attributes
assigned.
These attributes can be used to define sets of observers for which the shareability attributes make the data
or unified caches transparent for data accesses. The sets of observers that are affected by the shareability
attributes are described as shareability domains. The details of the use of these attributes are
system-specific. Example A3-1 on page A3-31 shows how they might be used:
Example A3-1 Use of shareability attributes
In a VMSA implementation, a particular sub-system with two clusters of processors has the requirement
that:
in each cluster, the data or unified caches of the processors in the cluster are transparent for all data
accesses with the Inner Shareable attribute
however, between the two clusters, the caches:
are not transparent for data accesses that have only the Inner Shareable attribute
are transparent for data accesses that have the Outer Shareable attribute.
In this system, each cluster is in a different shareability domain for the Inner Shareable attribute, but all
components of the sub-system are in the same shareability domain for the Outer Shareable attribute.
A system might implement two such sub-systems. If the data or unified caches of one subsystem are not
transparent to the accesses from the other subsystem, this system has two Outer Shareable shareability
domains.
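The arrangement in Example A3-1 can be modelled in a few lines. This is a sketch of that one example system, not a general definition of shareability domains, which are system-specific. The Processor type and the caches_transparent function are hypothetical names, and the model assumes that the cluster is the Inner Shareable domain and the sub-system is the Outer Shareable domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    cluster: str     # Inner Shareable domain in this example system
    subsystem: str   # Outer Shareable domain in this example system


def caches_transparent(a, b, attribute):
    """True if data accesses with this attribute see transparent caches between a and b."""
    if attribute == "Inner Shareable":
        return a.subsystem == b.subsystem and a.cluster == b.cluster
    if attribute == "Outer Shareable":
        return a.subsystem == b.subsystem
    raise ValueError(f"attribute not modelled: {attribute}")


p0 = Processor("p0", cluster="c0", subsystem="s0")
p1 = Processor("p1", cluster="c0", subsystem="s0")
p2 = Processor("p2", cluster="c1", subsystem="s0")
p3 = Processor("p3", cluster="c2", subsystem="s1")

same_cluster = caches_transparent(p0, p1, "Inner Shareable")     # True
across_clusters = caches_transparent(p0, p2, "Inner Shareable")  # False
across_outer = caches_transparent(p0, p2, "Outer Shareable")     # True
across_subsystems = caches_transparent(p0, p3, "Outer Shareable")  # False
```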
Having two levels of shareability attribute means you can reduce the performance and power overhead for
shared memory regions that do not need to be part of the Outer Shareable shareability domain.
Whether an ARMv7 implementation supports the Outer Shareable attribute is IMPLEMENTATION DEFINED.
If the Outer Shareable attribute is supported, its significance in the implementation is IMPLEMENTATION
DEFINED.
For Shareable Normal memory, the Load-Exclusive and Store-Exclusive synchronization primitives take
account of the possibility of accesses by more than one observer in the same Shareability domain.
Note
The Shareable concept enables system designers to specify the locations in Normal memory that must have
coherency requirements. However, to facilitate porting of software, software developers must not assume
that specifying a memory region as Non-shareable permits software to make assumptions about the
incoherency of memory locations between different processors in a shared memory system. Such
assumptions are not portable between different multiprocessing implementations that make use of the
Shareable concept. Any multiprocessing implementation might implement caches that, inherently, are
shared between different processing elements.
Write-Through Cacheable, Write-Back Cacheable and Non-cacheable Normal memory
In addition to being Outer Shareable, Inner Shareable or Non-shareable, each region of Normal memory can
be marked as being one of:
Write-Through Cacheable
Write-Back Cacheable, with an additional qualifier that marks it as one of:
Write-Back, Write-Allocate
Write-Back, no Write-Allocate
Non-cacheable.
If the same memory locations are marked as having different cacheability attributes, for example by the use
of aliases in a virtual to physical address mapping, behavior is UNPREDICTABLE.
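A system integrator might want to detect such aliases before they cause trouble. The sketch below scans a list of mappings for physical addresses that are given more than one cacheability attribute. The tuple layout and the function name are assumptions for illustration, and the attribute strings are compared as opaque labels.

```python
from collections import defaultdict


def conflicting_aliases(mappings):
    """Return the physical addresses mapped with more than one cacheability attribute.

    mappings is an iterable of (virtual_address, physical_address, cacheability).
    """
    attributes = defaultdict(set)
    for _virtual, physical, cacheability in mappings:
        attributes[physical].add(cacheability)
    return sorted(pa for pa, attrs in attributes.items() if len(attrs) > 1)


mappings = [
    (0x1000, 0x8000, "Write-Back"),
    (0x2000, 0x8000, "Non-cacheable"),   # alias of 0x8000 with a different attribute
    (0x3000, 0x9000, "Write-Through"),
]
conflicts = conflicting_aliases(mappings)  # [0x8000]
```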
The cacheability attributes provide a mechanism of coherency control with observers that lie outside the
shareability domain of a region of memory. In some cases, the use of Write-Through Cacheable or
Non-cacheable regions of memory might provide a better mechanism for controlling coherency than the use
of hardware coherency mechanisms or the use of cache maintenance routines. To this end, the architecture
requires the following properties for Non-cacheable or Write-Through Cacheable memory:
a completed write to a memory location that is Non-cacheable or Write-Through Cacheable for a
level of cache made by an observer accessing the memory system inside the level of cache is visible
to all observers accessing the memory system outside the level of cache without the need of explicit
cache maintenance
a completed write to a memory location that is Non-cacheable for a level of cache made by an
observer accessing the memory system outside the level of cache is visible to all observers accessing
the memory system inside the level of cache without the need of explicit cache maintenance.
Note
Implementations can also use the cacheability attributes to provide a performance hint regarding the
performance benefit of caching. For example, it might be known to a programmer that a piece of memory
is not going to be accessed again and would be better treated as Non-cacheable. The distinction between
Write-Back Write-Allocate and Write-Back no Write-Allocate memory exists only as a hint for
performance.
The ARM architecture provides independent cacheability attributes for Normal memory for two conceptual
levels of cache, the inner and the outer cache. The relationship between these conceptual levels of cache and
the implemented physical levels of cache is IMPLEMENTATION DEFINED, and can differ from the boundaries
between the Inner and Outer Shareability domains. However:
inner refers to the innermost caches, and always includes the lowest level of cache
no cache controlled by the Inner cacheability attributes can lie outside a cache controlled by the Outer
cacheability attributes
an implementation might not have any outer cache.
Example A3-2 to Example A3-4 describe the three possible ways of implementing a system with three
levels of cache, L1 to L3. L1 is the level closest to the processor, see Memory hierarchy on page A3-52.
Example A3-2 Implementation with two inner and one outer cache levels
Implement the three levels of cache in the system, L1 to L3, with:
the Inner cacheability attribute applied to L1 and L2 cache
the Outer cacheability attribute applied to L3 cache.
Example A3-3 Implementation with three inner and no outer cache levels
Implement the three levels of cache in the system, L1 to L3, with the Inner cacheability attribute applied to
L1, L2, and L3 cache. Do not use the Outer cacheability attribute.
Example A3-4 Implementation with one inner and two outer cache levels
Implement the three levels of cache in the system, L1 to L3, with:
the Inner cacheability attribute applied to L1 cache
the Outer cacheability attribute applied to L2 and L3 cache.
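The reason there are exactly three possibilities follows from the two constraints on the inner and outer attributes: inner always includes the cache closest to the processor, and no inner cache can lie outside an outer cache. A short sketch, with a hypothetical function name, enumerates the assignments these constraints allow:

```python
def inner_outer_splits(levels):
    """List every valid assignment of Inner and Outer cacheability to a cache hierarchy.

    levels[0] is the cache closest to the processor. Inner always includes it,
    and no Inner-controlled cache may lie outside an Outer-controlled cache, so
    each valid assignment is a run of Inner levels followed by Outer levels.
    """
    return [
        {"inner": levels[:count], "outer": levels[count:]}
        for count in range(1, len(levels) + 1)
    ]


splits = inner_outer_splits(["L1", "L2", "L3"])
# Three assignments: one, two, or three inner levels, matching the three examples.
```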
A3.5.5 Device memory
The Device memory type attribute defines memory locations where an access to the location can cause side
effects, or where the value returned for a load can vary depending on the number of loads performed.
Memory-mapped peripherals and I/O locations are examples of memory regions normally marked as being
Device memory.
For explicit accesses from the processor to memory marked as Device:
all accesses occur at their program size
the number of accesses is the number specified by the program.
An implementation must not repeat an access to a Device memory location if the program has only one
access to that location. In other words, accesses to Device memory locations are not restartable.
The architecture does not permit speculative accesses to memory marked as Device.
The architecture permits an Advanced SIMD element or structure load instruction to access bytes in Device
memory that are not explicitly accessed by the instruction, provided the bytes accessed are within a 16-byte
window, aligned to 16-bytes, that contains at least one byte that is explicitly accessed by the instruction.
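The 16-byte window rule can be expressed as a small check. This is a sketch with hypothetical function names: it takes the byte addresses an instruction explicitly accesses and reports whether a set of bytes actually accessed stays inside the aligned windows that the rule permits.

```python
WINDOW = 16  # window size in bytes


def permitted_bytes(explicit_addresses):
    """Return every byte address in an aligned 16-byte window that holds an explicit byte."""
    permitted = set()
    for address in explicit_addresses:
        base = address & ~(WINDOW - 1)  # round down to a 16-byte boundary
        permitted.update(range(base, base + WINDOW))
    return permitted


def access_permitted(accessed, explicit_addresses):
    """True if every accessed byte lies in a window containing an explicit byte."""
    return set(accessed) <= permitted_bytes(explicit_addresses)


explicit = range(0x1004, 0x1008)  # the instruction explicitly loads 4 bytes
whole_window = access_permitted(range(0x1000, 0x1010), explicit)  # True
one_byte_over = access_permitted(range(0x1000, 0x1011), explicit)  # False
```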
Address locations marked as Device are never held in a cache.
All explicit accesses to Device memory must comply with the ordering requirements of accesses described
in Ordering requirements for memory accesses on page A3-45.
An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on
page A3-26 might be abandoned as a result of an exception being taken during the sequence of accesses. On
return from the exception the instruction is restarted, and therefore one or more of the memory locations
might be accessed multiple times. This can result in repeated write accesses to a location that has been
changed between the write accesses.
Note
Do not use an instruction that generates a sequence of accesses to access Device memory if the instruction
might generate an abort on any access other than the first one.
Any unaligned access that is not faulted by the alignment restrictions and accesses Device memory has
UNPREDICTABLE behavior.
Shareable attribute for Device memory regions
Device memory regions can be given the Shareable attribute. This means that a region of Device memory
can be described as either:
Shareable Device memory
Non-shareable Device memory.
Non-shareable Device memory is defined as only accessible by a single processor. An example of a system
supporting Shareable and Non-shareable Device memory is an implementation that supports both:
a local bus for its private peripherals
system peripherals implemented on the main shared system bus.
Such a system might have more predictable access times for local peripherals such as watchdog timers or
interrupt controllers. In particular, a specific address in a Non-shareable Device memory region might
access a different physical peripheral for each processor.
A3.5.6 Strongly-ordered memory
The Strongly-ordered memory type attribute defines memory locations where an access to the location can
cause side effects, or where the value returned for a load can vary depending on the number of loads
performed. Examples of memory regions normally marked as being Strongly-ordered are memory-mapped
peripherals and I/O locations.
For explicit accesses from the processor to memory marked as Strongly-ordered:
all accesses occur at their program size
the number of accesses is the number specified by the program.
An implementation must not repeat an access to a Strongly-ordered memory location if the program has
only one access to that location. In other words, accesses to Strongly-ordered memory locations are not
restartable.
The architecture does not permit speculative accesses to memory marked as Strongly-ordered.
The architecture permits an Advanced SIMD element or structure load instruction to access bytes in
Strongly-ordered memory that are not explicitly accessed by the instruction, provided the bytes accessed are
within a 16-byte window, aligned to 16-bytes, that contains at least one byte that is explicitly accessed by
the instruction.
Address locations in Strongly-ordered memory are not held in a cache, and are always treated as Shareable
memory locations.
All explicit accesses to Strongly-ordered memory must comply with the ordering requirements of accesses
described in Ordering requirements for memory accesses on page A3-45.
An instruction that generates a sequence of accesses as described in Atomicity in the ARM architecture on
page A3-26 might be abandoned as a result of an exception being taken during the sequence of accesses. On
return from the exception the instruction is restarted, and therefore one or more of the memory locations
might be accessed multiple times. This can result in repeated write accesses to a location that has been
changed between the write accesses.
Note
Do not use an instruction that generates a sequence of accesses to access Strongly-ordered memory if the
instruction might generate an abort on any access other than the first one.
Any unaligned access that is not faulted by the alignment restrictions and accesses Strongly-ordered
memory has UNPREDICTABLE behavior.
Note
See Ordering of instructions that change the CPSR interrupt masks on page AppxG-8 for additional
requirements that apply to accesses to Strongly-ordered memory in ARMv6.
A3.5.7 Memory access restrictions
The following restrictions apply to memory accesses:
For any access X, the bytes accessed by X must all have the same memory type attribute, otherwise
the behavior of the access is UNPREDICTABLE. That is, an unaligned access that spans a boundary
between different memory types is UNPREDICTABLE.
For any two memory accesses X and Y that are generated by the same instruction, the bytes accessed
by X and Y must all have the same memory type attribute, otherwise the results are UNPREDICTABLE.
For example, an LDC, LDM, LDRD, STC, STM, or STRD that spans a boundary between Normal and Device
memory is UNPREDICTABLE.
An instruction that generates an unaligned memory access to Device or Strongly-ordered memory is
UNPREDICTABLE.
To ensure access rules are maintained, an instruction that causes multiple accesses to Device or
Strongly-ordered memory must not cross a 4KB address boundary, otherwise the effect is
UNPREDICTABLE. For this reason, it is important that an access to a volatile memory device is not
made using a single instruction that crosses a 4KB address boundary.
ARM expects this restriction to impose constraints on the placing of volatile memory devices in the
memory map of a system, rather than expecting a compiler to be aware of the alignment of memory
accesses.
For instructions that generate accesses to Device or Strongly-ordered memory, implementations must
not change the sequence of accesses specified by the pseudocode of the instruction. This includes not
changing:
how many accesses there are
the time order of the accesses
the data sizes and other properties of each access.
In addition, processor implementations expect any attached memory system to be able to identify the
memory type of an access, and to obey similar restrictions with regard to the number, time order,
data sizes and other properties of the accesses.
Exceptions to this rule are:
An implementation of a processor can break this rule, provided that the information it supplies
to the memory system enables the original number, time order, and other details of the accesses
to be reconstructed. In addition, the implementation must place a requirement on attached
memory systems to do this reconstruction when the accesses are to Device or Strongly-ordered
memory.
For example, an implementation with a 64-bit bus might pair the word loads generated by an LDM
into 64-bit accesses. This is because the instruction semantics ensure that the 64-bit access
is always a word load from the lower address followed by a word load from the higher address.
However the implementation must permit the memory systems to unpack the two word loads
when the access is to Device or Strongly-ordered memory.
Any implementation technique that produces results that cannot be observed to be different
from those described above is legitimate.
An Advanced SIMD element or structure load instruction can access bytes in Device or
Strongly-ordered memory that are not explicitly accessed by the instruction, provided the
bytes accessed are within a 16-byte window, aligned to 16-bytes, that contains at least one byte
that is explicitly accessed by the instruction.
Any multi-access instruction that loads or stores the PC must access only Normal memory. If the
instruction accesses Device or Strongly-ordered memory the result is UNPREDICTABLE. There is one
exception to this restriction. In the VMSA architecture, when the MMU is disabled any multi-access
instruction that loads or stores the PC functions correctly, see Enabling and disabling the MMU on
page B3-5.
Any instruction fetch must access only Normal memory. If it accesses Device or Strongly-ordered
memory, the result is UNPREDICTABLE. For example, instruction fetches must not be performed to an
area of memory that contains read-sensitive devices, because there is no ordering requirement
between instruction fetches and explicit accesses.
Behavior is UNPREDICTABLE if the same memory location:
is marked as Shareable Normal and Non-shareable Normal
is marked as having different memory types (Normal, Device, or Strongly-ordered)
is marked as having different cacheability attributes
is marked as being Shareable Device and Non-shareable Device memory.
Such memory marking contradictions can occur, for example, by the use of aliases in a virtual to
physical address mapping.
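Several of the restrictions above lend themselves to a mechanical check. The following sketch covers three of them: mixed memory types, unaligned accesses to Device or Strongly-ordered memory, and multiple accesses that cross a 4KB boundary. The Region type, the function names and the (address, size) representation are assumptions, and alignment is simplified to mean that the address is a multiple of the access size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    end: int          # exclusive
    memory_type: str  # "Normal", "Device" or "Strongly-ordered"


def memory_type_at(regions, address):
    """Return the memory type of the region that covers address."""
    for region in regions:
        if region.start <= address < region.end:
            return region.memory_type
    raise LookupError(f"no region covers {address:#x}")


def check_instruction(regions, accesses):
    """Return the reasons an instruction's accesses would be UNPREDICTABLE.

    accesses is a list of (address, size_in_bytes), one per generated access.
    """
    problems = []
    touched = [address + i for address, size in accesses for i in range(size)]
    types = {memory_type_at(regions, byte) for byte in touched}
    if len(types) > 1:
        problems.append("bytes accessed have different memory types")
    if types & {"Device", "Strongly-ordered"}:
        if any(address % size for address, size in accesses):
            problems.append("unaligned access to Device or Strongly-ordered memory")
        pages = {byte >> 12 for byte in touched}  # 4KB page number of each byte
        if len(accesses) > 1 and len(pages) > 1:
            problems.append("multiple accesses cross a 4KB boundary")
    return problems


regions = [Region(0x0000, 0x2000, "Normal"), Region(0x2000, 0x4000, "Device")]
ok = check_instruction(regions, [(0x1000, 4)])                   # no problems
crossing = check_instruction(regions, [(0x2FFC, 4), (0x3000, 4)])  # 4KB crossing
```

A check like this is most useful when laying out a memory map, because the text expects placement of devices, rather than the compiler, to keep accesses within these limits.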
Before ARMv6, it is IMPLEMENTATION DEFINED whether a low interrupt latency mode is supported. From
ARMv6, low interrupt latency support is controlled by the SCTLR.FI bit. It is IMPLEMENTATION DEFINED
whether multi-access instructions behave correctly in low interrupt latency configurations.
A3.5.8 Backwards compatibility
From ARMv6, the memory attributes are significantly different from those in previous versions of the
architecture. Table A3-9 shows the interpretation of the earlier memory types in the light of this definition.
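For software that must handle memory descriptions written for earlier architectures, the mapping in Table A3-9 can be held as a simple lookup. The dictionary and function names below are hypothetical, and the keys are just the labels used in the table.

```python
# Interpretation of pre-ARMv6 memory types, taken from Table A3-9.
LEGACY_TO_ARMV6_ARMV7 = {
    "NCNB": "Strongly-ordered",
    "NCB": "Shareable Device",
    "Write-Through Cacheable, Bufferable": "Non-shareable Normal, Write-Through Cacheable",
    "Write-Back Cacheable, Bufferable": "Non-shareable Normal, Write-Back Cacheable",
}


def interpret_legacy(memory_type):
    """Return the ARMv6 and ARMv7 attribute for an earlier memory type label."""
    try:
        return LEGACY_TO_ARMV6_ARMV7[memory_type]
    except KeyError:
        raise ValueError(f"not a memory type from Table A3-9: {memory_type!r}") from None


attribute = interpret_legacy("NCB")  # "Shareable Device"
```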
A3.5.9 The effect of the Security Extensions
The Security Extensions can be included as part of an ARMv7-A implementation, with a VMSA. They
provide two distinct 4GByte virtual memory spaces:
a Secure virtual memory space
a Non-secure virtual memory space.
The Secure virtual memory space is accessed by memory accesses in the Secure state, and the Non-secure
virtual memory space is accessed by memory accesses in the Non-secure state.
By providing different virtual memory spaces, the Security Extensions permit memory accesses made from
the Non-secure state to be distinguished from those made from the Secure state.
Table A3-9 Backwards compatibility
Previous architectures                  ARMv6 and ARMv7 attribute
NCNB (Non-cacheable, Non-bufferable)    Strongly-ordered
NCB (Non-cacheable, Bufferable)         Shareable Device
Write-Through Cacheable, Bufferable     Non-shareable Normal, Write-Through Cacheable
Write-Back Cacheable, Bufferable        Non-shareable Normal, Write-Back Cacheable
A3.6 Access rights
ARMv7 includes additional attributes for memory regions, that enable:
Data accesses to be restricted, based on the privilege of the access. See Privilege level access controls
for data accesses.
Instruction fetches to be restricted, based on the privilege of the process or thread making the fetch.
See Privilege level access controls for instruction accesses.
On a system that implements the Security Extensions, accesses to be restricted to memory accesses
with the Secure memory attribute. See Memory region security status on page A3-39.
A3.6.1 Privilege level access controls for data accesses
The memory attributes can define that a memory region is:
not accessible to any accesses
accessible only to Privileged accesses
accessible to Privileged and Unprivileged accesses.
The access privilege level is defined separately for explicit read and explicit write accesses. However, a
system that defines the memory attributes is not required to support all combinations of memory attributes
for read and write accesses.
A Privileged access is an access made during privileged execution, as a result of a load or store operation
other than LDRT, STRT, LDRBT, STRBT, LDRHT, STRHT, LDRSHT, or LDRSBT.
An Unprivileged access is an access made as a result of a load or store operation performed in one of these
cases:
when the processor is in an unprivileged mode
when the processor is in any mode and the access is made as a result of an LDRT, STRT, LDRBT, STRBT,
LDRHT, STRHT, LDRSHT, or LDRSBT instruction.
A Data Abort exception is generated if the processor attempts a data access that the access rights do not
permit. For example, a Data Abort exception is generated if the processor is in unprivileged mode and
attempts to access a memory region that is marked as only accessible to Privileged accesses.
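The rules in this section can be summarized as a small decision function. This is a sketch under stated assumptions: the region is represented as a dictionary with separate "read" and "write" levels, each one of "none", "privileged" or "all", and the function names are hypothetical. A result of False corresponds to the case where a Data Abort exception is generated.

```python
# Loads and stores that always make Unprivileged accesses, as listed in the text.
UNPRIVILEGED_INSTRUCTIONS = {
    "LDRT", "STRT", "LDRBT", "STRBT", "LDRHT", "STRHT", "LDRSHT", "LDRSBT",
}


def is_privileged_access(privileged_mode, instruction):
    """True for a load or store in a privileged mode that is not one of the listed instructions."""
    return privileged_mode and instruction not in UNPRIVILEGED_INSTRUCTIONS


def data_access_permitted(region, privileged_mode, instruction, is_write):
    """Return True if the access rights permit the access, False if it would abort."""
    level = region["write" if is_write else "read"]
    if level == "all":
        return True
    if level == "privileged":
        return is_privileged_access(privileged_mode, instruction)
    return False  # level "none": not accessible to any access


region = {"read": "all", "write": "privileged"}
user_store = data_access_permitted(region, False, "STR", is_write=True)     # False
kernel_store = data_access_permitted(region, True, "STR", is_write=True)    # True
kernel_strt = data_access_permitted(region, True, "STRT", is_write=True)    # False
```

The last case shows why the listed instructions exist: privileged code can use them to access memory with the rights of unprivileged code.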
A3.6.2 Privilege level access controls for instruction accesses
Memory attributes can define that a memory region is:
Not accessible for execution
Accessible for execution by Privileged processes only
Accessible for execution by Privileged and Unprivileged processes.
To define the instruction access rights to a memory region, the memory attributes describe, separately, for
the region:
its read access rights, see Privilege level access controls for data accesses
whether it is suitable for execution.
For example, a region that is accessible for execution by Privileged processes only has the memory
attributes:
accessible only to Privileged read accesses
suitable for execution.
This means there is some linkage between the memory attributes that define the accessibility of a region to
explicit memory accesses, and those that define that a region can be executed.
A memory fault occurs if a processor attempts to execute code from a memory location with attributes that
do not permit code execution.
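The linkage between read access rights and suitability for execution can be sketched as follows. The function name and the string labels "none", "privileged" and "all" are assumptions made for this example. A result of False corresponds to the case where a memory fault occurs.

```python
def can_execute(read_access, executable, privileged):
    """Return True if code can be executed from a region.

    read_access is the region's read access level: "none", "privileged" or "all".
    executable says whether the region is marked as suitable for execution.
    privileged says whether the fetch is made by a Privileged process.
    """
    if not executable:
        return False
    if read_access == "all":
        return True
    return read_access == "privileged" and privileged


# A region accessible for execution by Privileged processes only:
kernel_fetch = can_execute("privileged", executable=True, privileged=True)   # True
user_fetch = can_execute("privileged", executable=True, privileged=False)    # False
```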
A3.6.3 Memory region security status
An additional memory attribute determines whether the memory region is Secure or Non-secure in an
ARMv7-A system that implements the Security Extensions. When the Security Extensions are
implemented, this attribute is checked by the system hardware to ensure that a region of memory that is
designated as Secure by the system hardware is not accessed by memory accesses with the Non-secure
memory attribute. For more information, see Memory region attributes on page B3-32.
A3.7 Virtual and physical addressing
ARMv7 provides three alternative architectural profiles, ARMv7-A, ARMv7-R and ARMv7-M. Each of the
profiles specifies a different memory system. This manual describes two of these profiles:
ARMv7-A profile
The ARMv7-A memory system incorporates a Memory Management Unit (MMU),
controlled by CP15 registers. The memory system supports virtual addressing, with the
MMU performing virtual to physical address translation, in hardware, as part of program
execution.
ARMv7-R profile
The ARMv7-R memory system incorporates a Memory Protection Unit (MPU), controlled
by CP15 registers. The MPU does not support virtual addressing.
At the application level, the difference between the ARMv7-A and ARMv7-R memory systems is
transparent. Regardless of which profile is implemented, an application accesses the memory map described
in Address space on page A3-2, and the implemented memory system makes the features described in this
chapter available to the application.
For a system-level description of the ARMv7-A and ARMv7-R memory models see:
Chapter B2 Common Memory System Architecture Features
Chapter B3 Virtual Memory System Architecture (VMSA)
Chapter B4 Protected Memory System Architecture (PMSA).
Note
This manual does not describe the ARMv7-M profile. For details of this profile see:
ARMv7-M Architecture Application Level Reference Manual, for an application-level description
ARMv7-M Architecture Reference Manual, for a full description.
A3.8 Memory access order
ARMv7 provides a set of three memory types, Normal, Device, and Strongly-ordered, with well-defined
memory access properties.
The ARMv7 application-level view of the memory attributes is described in:
Memory types and attributes and the memory order model on page A3-24
Access rights on page A3-38.
When considering memory access ordering, an important feature of the ARMv6 memory model is the
Shareable memory attribute, that indicates whether a region of memory can be shared between multiple
processors, and therefore requires an appearance of cache transparency in the ordering model.
The key issues with the memory order model depend on the target audience:
For software programmers, considering the model at the application level, the key factor is that for
accesses to Normal memory barriers are required in some situations where the order of accesses
observed by other observers must be controlled.
For silicon implementers, considering the model at the system level, the Strongly-ordered and Device
memory attributes place certain restrictions on the system designer in terms of what can be built and
when to indicate completion of an access.
Note
Implementations remain free to choose the mechanisms required to implement the functionality of
the memory model.
More information about the memory order model is given in the following subsections:
Reads and writes on page A3-42
Ordering requirements for memory accesses on page A3-45
Memory barriers on page A3-47.
Additional attributes and behaviors relate to the memory system architecture. These features are defined in
the system level section of this manual:
Virtual memory systems based on an MMU, described in Chapter B3 Virtual Memory System
Architecture (VMSA).
Protected memory systems based on an MPU, described in Chapter B4 Protected Memory System
Architecture (PMSA).
Caches, described in Caches on page B2-3.
Note
In these system level descriptions, some attributes are described in relation to an MMU. In general, these
descriptions can also be applied to an MPU based system.
A3.8.1 Reads and writes
Each memory access is either a read or a write. Explicit memory accesses are the memory accesses required
by the function of an instruction. The following can cause memory accesses that are not explicit:
instruction fetches
cache loads and writebacks
translation table walks.
Except where otherwise stated, the memory ordering requirements only apply to explicit memory accesses.
Reads
Reads are defined as memory operations that have the semantics of a load.
The memory accesses of the following instructions are reads:
LDR, LDRB, LDRH, LDRSB, and LDRSH
LDRT, LDRBT, LDRHT, LDRSBT, and LDRSHT
LDREX, LDREXB, LDREXD, and LDREXH
LDM, LDRD, POP, and RFE
LDC, LDC2, VLDM, VLDR, VLD1, VLD2, VLD3, and VLD4
the return of status values by STREX, STREXB, STREXD, and STREXH
in the ARM instruction set only, SWP and SWPB
in the Thumb instruction set only, TBB and TBH.
Hardware-accelerated opcode execution by the Jazelle extension can cause a number of reads to occur,
according to the state of the operand stack and the implementation of the Jazelle hardware acceleration.
Writes
Writes are defined as memory operations that have the semantics of a store.
The memory accesses of the following instructions are writes:
STR, STRB, and STRH
STRT, STRBT, and STRHT
STREX, STREXB, STREXD, and STREXH
STM, STRD, PUSH, and SRS
STC, STC2, VSTM, VSTR, VST1, VST2, VST3, and VST4
in the ARM instruction set only, SWP and SWPB.
Hardware-accelerated opcode execution by the Jazelle extension can cause a number of writes to occur,
according to the state of the operand stack and the implementation of the Jazelle hardware acceleration.
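The read and write lists above can be captured as a small lookup. The following Python sketch is only an illustration: the names READS, WRITES, and access_kinds are hypothetical, and Jazelle-generated accesses are not modelled. It shows that a Store-Exclusive instruction counts as both a write and, through its status value, a read.

```python
# Illustrative lookup built from the lists in this section. The table and
# function names are hypothetical and are not part of the architecture.
READS = {
    "LDR", "LDRB", "LDRH", "LDRSB", "LDRSH",
    "LDRT", "LDRBT", "LDRHT", "LDRSBT", "LDRSHT",
    "LDREX", "LDREXB", "LDREXD", "LDREXH",
    "LDM", "LDRD", "POP", "RFE",
    "LDC", "LDC2", "VLDM", "VLDR", "VLD1", "VLD2", "VLD3", "VLD4",
    # The status value returned by a Store-Exclusive is classed as a read.
    "STREX", "STREXB", "STREXD", "STREXH",
}
WRITES = {
    "STR", "STRB", "STRH",
    "STRT", "STRBT", "STRHT",
    "STREX", "STREXB", "STREXD", "STREXH",
    "STM", "STRD", "PUSH", "SRS",
    "STC", "STC2", "VSTM", "VSTR", "VST1", "VST2", "VST3", "VST4",
}
ARM_ONLY_READ_AND_WRITE = {"SWP", "SWPB"}
THUMB_ONLY_READS = {"TBB", "TBH"}


def access_kinds(mnemonic, instr_set):
    """Return the set of access kinds ("read", "write") for a mnemonic.

    instr_set is "ARM" or "Thumb". An empty set means the mnemonic is not in
    the lists of this section for that instruction set.
    """
    name = mnemonic.upper()
    kinds = set()
    if name in READS:
        kinds.add("read")
    if name in WRITES:
        kinds.add("write")
    if instr_set == "ARM" and name in ARM_ONLY_READ_AND_WRITE:
        kinds.update({"read", "write"})
    if instr_set == "Thumb" and name in THUMB_ONLY_READS:
        kinds.add("read")
    return kinds
```

For example, access_kinds("SWP", "ARM") reports both kinds, while the same mnemonic in the Thumb instruction set reports none.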
Synchronization primitives
Synchronization primitives must ensure correct operation of system semaphores in the memory order
model. The synchronization primitive instructions are defined as those instructions that are used to ensure
memory synchronization:
LDREX, STREX, LDREXB, STREXB, LDREXD, STREXD, LDREXH, STREXH.
SWP, SWPB. Use of these instructions is deprecated from ARMv6.
Before ARMv6, support consisted of the SWP and SWPB instructions. ARMv6 introduced new Load-Exclusive
and Store-Exclusive instructions LDREX and STREX, and deprecated using the SWP and SWPB instructions.
ARMv7 introduces:
additional Load-Exclusive and Store-Exclusive instructions, LDREXB, LDREXD, LDREXH, STREXB, STREXD,
and STREXH
the Clear-Exclusive instruction CLREX
the Load-Exclusive, Store-Exclusive and Clear-Exclusive instructions in the Thumb instruction set.
For details of the Load-Exclusive, Store-Exclusive and Clear-Exclusive instructions see Synchronization
and semaphores on page A3-12.
The Load-Exclusive and Store-Exclusive instructions are supported to Shareable and Non-shareable
memory. Non-shareable memory can be used to synchronize processes that are running on the same
processor. Shareable memory must be used to synchronize processes that might be running on different
processors.
Observability and completion
An observer is an agent in the system that can access memory. For a processor, the following mechanisms
must be treated as independent observers:
the mechanism that performs reads or writes to memory
a mechanism that causes an instruction cache to be filled from memory or that fetches instructions to
be executed directly from memory
a mechanism that performs translation table walks.
The set of observers that can observe a memory access is defined by the system.
For all memory:
a write to a location in memory is said to be observed by an observer when a subsequent read of the
location by the same observer will return the value written by the write
a write to a location in memory is said to be globally observed for a shareability domain when a
subsequent read of the location by any observer in that shareability domain will return the value
written by the write
a read of a location in memory is said to be observed by an observer when a subsequent write to the
location by the same observer will have no effect on the value returned by the read
a read of a location in memory is said to be globally observed for a shareability domain when a
subsequent write to the location by any observer in that shareability domain will have no effect on
the value returned by the read.
Additionally, for Strongly-ordered memory:
A read or write of a memory-mapped location in a peripheral that exhibits side-effects is said to be
observed, and globally observed, only when the read or write:
meets the general conditions listed
can begin to affect the state of the memory-mapped peripheral
can trigger all associated side effects, whether they affect other peripheral devices, processors
or memory.
For all memory, the completion rules are defined as:
A read or write is complete for a shareability domain when all of the following are true:
the read or write is globally observed for that shareability domain
any translation table walks associated with the read or write are complete for that shareability
domain.
A translation table walk is complete for a shareability domain when the memory accesses associated
with the translation table walk are globally observed for that shareability domain, and the TLB is
updated.
A cache, branch predictor or TLB maintenance operation is complete for a shareability domain when
the effects of the operation are globally observed for that shareability domain and any translation
table walks that arise from the operation are complete for that shareability domain.
The completion of any cache, branch predictor and TLB maintenance operation includes its
completion on all processors that are affected by both the operation and the DSB.
Side effect completion in Strongly-ordered and Device memory
The completion of a memory access in Strongly-ordered or Device memory is not guaranteed to be
sufficient to determine that the side effects of the memory access are visible to all observers. The mechanism
that ensures the visibility of the side effects of a memory access is IMPLEMENTATION DEFINED.
A3.8.2 Ordering requirements for memory accesses
ARMv7 and ARMv6 define access restrictions in the permitted ordering of memory accesses. These
restrictions depend on the memory attributes of the accesses involved.
Two terms used in describing the memory access ordering requirements are:
Address dependency
An address dependency exists when the value returned by a read access is used to compute
the virtual address of a subsequent read or write access. An address dependency exists even
if the value read by the first read access does not change the virtual address of the second
read or write access. This might be the case if the value returned is masked off before it is
used, or if it has no effect on the predicted address value for the second access.
Control dependency
A control dependency exists when the data value returned by a read access is used to
determine the condition code flags, and the values of the flags are used for condition code
checking to determine the address of a subsequent read access. This address determination
might be through conditional execution, or through the evaluation of a branch.
Figure A3-4 on page A3-46 shows the memory ordering between two explicit accesses A1 and A2, where
A1 occurs before A2 in program order. The symbols used in the figure are as follows:
< Accesses must be observed in program order, that is, A1 must be observed before A2.
- Accesses can be observed in any order, provided that the requirements of uniprocessor
semantics, for example respecting dependencies between instructions in a single processor,
are maintained.
The following additional restrictions apply to the ordering of memory accesses that have this
symbol:
If there is an address dependency then the two memory accesses are observed in
program order by any observer in the common shareability domain of the two
accesses.
This ordering restriction does not apply if there is only a control dependency between
the two read accesses.
If there is both an address dependency and a control dependency between two read
accesses the ordering requirements of the address dependency apply.
If the value returned by a read access is used as data written by a subsequent write
access, then the two memory accesses are observed in program order.
It is impossible for an observer in the shareability domain of a memory location to
observe a write access to that memory location if that location would not be written
to in a sequential execution of a program.
It is impossible for an observer in the shareability domain of a memory location to
observe a write value written to that memory location if that value would not be
written in a sequential execution of a program.
It is impossible for an observer in the shareability domain of a memory location to
observe two reads to the same memory location performed by the same observer in
an order that would not occur in a sequential execution of a program.
In Figure A3-4, an access refers to a read or a write access to the specified memory type.
For example, Device access, Non-shareable refers to a read or write access to Non-shareable
Device memory.
Figure A3-4 Memory ordering restrictions
There are no ordering requirements for implicit accesses to any type of memory.
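The ordering restrictions of Figure A3-4 can be expressed as a lookup table. The following Python sketch assumes that the figure reads as transcribed in the ORDERING table: Normal accesses have no ordering requirement, a Strongly-ordered access is ordered against every Device or Strongly-ordered access, and Device accesses are ordered only against Device accesses of the same shareability. The names used are hypothetical, and the additional dependency restrictions listed for the - symbol are not modelled.

```python
# Memory type names follow Figure A3-4. The table and function names are
# hypothetical and exist only for this illustration.
NORMAL = "Normal"
DEVICE_NS = "Device, Non-shareable"
DEVICE_S = "Device, Shareable"
STRONGLY_ORDERED = "Strongly-ordered"

# ORDERING[a1][a2] is "<" when A1 must be observed before A2, and "-" when the
# accesses can be observed in any order. This is an assumed transcription of
# the figure, not an authoritative copy.
ORDERING = {
    NORMAL: {
        NORMAL: "-", DEVICE_NS: "-", DEVICE_S: "-", STRONGLY_ORDERED: "-",
    },
    DEVICE_NS: {
        NORMAL: "-", DEVICE_NS: "<", DEVICE_S: "-", STRONGLY_ORDERED: "<",
    },
    DEVICE_S: {
        NORMAL: "-", DEVICE_NS: "-", DEVICE_S: "<", STRONGLY_ORDERED: "<",
    },
    STRONGLY_ORDERED: {
        NORMAL: "-", DEVICE_NS: "<", DEVICE_S: "<", STRONGLY_ORDERED: "<",
    },
}


def must_observe_in_program_order(a1_type, a2_type):
    """Return True if A1 must be observed before A2, where A1 occurs before
    A2 in program order. Raises KeyError for an unknown memory type."""
    return ORDERING[a1_type][a2_type] == "<"
```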
Program order for instruction execution
The program order of instruction execution is the order of the instructions in the control flow trace.
Explicit memory accesses in an execution can be either:
Strictly Ordered
Denoted by <. Must occur strictly in order.
Ordered Denoted by <=. Can occur either in order or simultaneously.
Load/store multiple instructions, such as LDM, LDRD, STM, and STRD, generate multiple word accesses, each of
which is a separate access for the purpose of determining ordering.
The rules for determining program order for two accesses A1 and A2 are:
If A1 and A2 are generated by two different instructions:
A1 < A2 if the instruction that generates A1 occurs before the instruction that generates A2 in
program order
A2 < A1 if the instruction that generates A2 occurs before the instruction that generates A1 in
program order.
If A1 and A2 are generated by the same instruction:
If A1 and A2 are the load and store generated by a SWP or SWPB instruction:
A1 < A2 if A1 is the load and A2 is the store
A2 < A1 if A2 is the load and A1 is the store.
Figure A3-4 entries (rows are A1, columns are A2):
A1 \ A2                        Normal    Device access,   Device access,   Strongly-ordered
                               access    Non-shareable    Shareable        access
Normal access                  -         -                -                -
Device access, Non-shareable   -         <                -                <
Device access, Shareable       -         -                <                <
Strongly-ordered access        -         <                <                <
In these descriptions:
an LDM-class instruction is any form of LDM, LDMDA, LDMDB, LDMIB, or POP instruction
an LDC-class instruction is an LDC, VLDM, or VLDR instruction
an STM-class instruction is any form of STM, STMDA, STMDB, STMIB, or PUSH instruction
an STC-class instruction is an STC, VSTM, or VSTR instruction.
If A1 and A2 are two word loads generated by an LDC-class or LDM-class instruction, or two word
stores generated by an STC-class or STM-class instruction, excluding LDM-class and STM-class
instructions with a register list that includes the PC:
A1 <= A2 if the address of A1 is less than the address of A2
A2 <= A1 if the address of A2 is less than the address of A1.
If A1 and A2 are two word loads generated by an LDM-class instruction with a register list that
includes the PC or two word stores generated by an STM-class instruction with a register list that
includes the PC, the program order of the memory accesses is not defined.
If A1 and A2 are two word loads generated by an LDRD instruction or two word stores generated by
an STRD instruction, the program order of the memory accesses is not defined.
If A1 and A2 are load or store accesses generated by Advanced SIMD element or structure load/store
instructions, the program order of the memory accesses is not defined.
For any instruction or operation not explicitly mentioned in this section, if the single-copy atomicity
rules described in Single-copy atomicity on page A3-27 mean the operation becomes a sequence of
accesses, then the time-ordering of those accesses is not defined.
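The program order rules in this section can be read as a decision procedure. The following Python sketch is one possible rendering under simplifying assumptions: the Access record and the program_order function are hypothetical names, each access is tagged with the position of its generating instruction, and any case that the rules leave undefined, or that this section does not mention, is reported as "not defined".

```python
from dataclasses import dataclass

# Instruction classes as defined in this section.
LDM_CLASS = {"LDM", "LDMDA", "LDMDB", "LDMIB", "POP"}
LDC_CLASS = {"LDC", "VLDM", "VLDR"}
STM_CLASS = {"STM", "STMDA", "STMDB", "STMIB", "PUSH"}
STC_CLASS = {"STC", "VSTM", "VSTR"}


@dataclass(frozen=True)
class Access:
    """One explicit memory access (hypothetical record for this sketch)."""
    instr_index: int          # position of the generating instruction
    mnemonic: str             # mnemonic of the generating instruction
    is_load: bool             # True for a load, False for a store
    address: int              # address accessed
    pc_in_list: bool = False  # register list of the instruction includes PC


def program_order(a1, a2):
    """Return "A1 < A2", "A2 < A1", "A1 <= A2", "A2 <= A1" or "not defined"."""
    # Accesses generated by two different instructions.
    if a1.instr_index != a2.instr_index:
        return "A1 < A2" if a1.instr_index < a2.instr_index else "A2 < A1"

    name = a1.mnemonic
    # SWP and SWPB: the load is ordered before the store.
    if name in {"SWP", "SWPB"} and a1.is_load != a2.is_load:
        return "A1 < A2" if a1.is_load else "A2 < A1"

    both_loads = a1.is_load and a2.is_load
    both_stores = not a1.is_load and not a2.is_load
    multi_load = both_loads and name in (LDM_CLASS | LDC_CLASS)
    multi_store = both_stores and name in (STM_CLASS | STC_CLASS)
    if multi_load or multi_store:
        # LDM-class or STM-class with the PC in the register list.
        if a1.pc_in_list and name in (LDM_CLASS | STM_CLASS):
            return "not defined"
        if a1.address != a2.address:
            return "A1 <= A2" if a1.address < a2.address else "A2 <= A1"

    # LDRD, STRD, Advanced SIMD load/store, and everything not covered.
    return "not defined"
```

With this sketch, two word loads of one LDM are ordered by address, while the two word loads of an LDRD are reported as not defined.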
A3.8.3 Memory barriers
Memory barrier is the general term applied to an instruction, or sequence of instructions, used to force
synchronization events by a processor with respect to retiring load/store instructions. The ARM architecture
defines a number of memory barriers that provide a range of functionality, including:
ordering of issued load/store instructions to the programmers’ model
completion of preceding load/store instructions to the programmers’ model
flushing of any instructions prefetched before the memory barrier operation.
ARMv7 and ARMv6 require three explicit memory barriers to support the memory order model described
in this chapter. In ARMv7 the memory barriers are provided as instructions that are available in the ARM
and Thumb instruction sets, and in ARMv6 the memory barriers are performed by CP15 register writes. The
three memory barriers are:
Data Memory Barrier, see Data Memory Barrier (DMB) on page A3-48
Data Synchronization Barrier, see Data Synchronization Barrier (DSB) on page A3-49
Instruction Synchronization Barrier, see Instruction Synchronization Barrier (ISB) on page A3-49.
Depending on the synchronization needed, a program might use memory barriers on their own, or it might
use them in conjunction with cache and memory management maintenance operations that are only
available in privileged modes.
The DMB and DSB memory barriers affect reads and writes to the memory system generated by load/store
instructions and data or unified cache maintenance operations being executed by the processor. Instruction
fetches or accesses caused by a hardware translation table access are not explicit accesses.
Data Memory Barrier (DMB)
The DMB instruction is a data memory barrier. The processor that executes the DMB instruction is referred to
as the executing processor, Pe. The DMB instruction takes the required shareability domain and required
access types as arguments. If the required shareability is Full system then the operation applies to all
observers within the system.
A DMB creates two groups of memory accesses, Group A and Group B:
Group A Contains:
All explicit memory accesses of the required access types from observers in the same
required shareability domain as Pe that are observed by Pe before the DMB instruction.
These accesses include any accesses of the required access types and required
shareability domain performed by Pe.
All loads of required access types from observers in the same required shareability
domain as Pe that have been observed by any given observer, Py, in the same required
shareability domain as Pe before Py has performed a memory access that is a member
of Group A.
Group B Contains:
All explicit memory accesses of the required access types by Pe that occur in program
order after the DMB instruction.
All explicit memory accesses of the required access types by any given observer Px
in the same required shareability domain as Pe that can only occur after Px has
observed a store that is a member of Group B.
Any observer with the same required shareability domain as Pe observes all members of Group A before it
observes any member of Group B to the extent that those group members are required to be observed, as
determined by the shareability and cacheability of the memory locations accessed by the group members.
Where members of Group A and Group B access the same memory-mapped peripheral, all members of
Group A will be visible at the memory-mapped peripheral before any members of Group B are visible at
that peripheral.
Note
A memory access might be in neither Group A nor Group B. The DMB does not affect the order of
observation of such a memory access.
The second part of the definition of Group A is recursive. Ultimately, membership of Group A derives
from the observation by Py of a load before Py performs an access that is a member of Group A as a
result of the first part of the definition of Group A.
The second part of the definition of Group B is recursive. Ultimately, membership of Group B derives
from the observation by any observer of an access by Pe that is a member of Group B as a result of
the first part of the definition of Group B.
DMB only affects memory accesses. It has no effect on the ordering of any other instructions executing on the
processor.
For details of the DMB instruction in the Thumb and ARM instruction sets see DMB on page A8-90.
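The first part of each group definition can be illustrated for the accesses of the executing processor alone. The following Python sketch is a deliberately reduced model: it considers only accesses performed by Pe, treats them all as being in the required shareability domain, and does not model the recursive second part of either definition. The function name split_dmb_groups and the trace format are hypothetical.

```python
def split_dmb_groups(pe_trace, required_types="All"):
    """Split the accesses of Pe around a single DMB.

    pe_trace is a list in program order. Each entry is either the string
    "DMB" or a (kind, label) tuple, where kind is "load" or "store".
    required_types is "All" or "Writes", mirroring MBReqTypes.
    Returns (group_a, group_b, neither).
    """
    if required_types not in ("All", "Writes"):
        raise ValueError("required_types must be 'All' or 'Writes'")
    if pe_trace.count("DMB") != 1:
        raise ValueError("this sketch expects exactly one DMB in the trace")

    barrier = pe_trace.index("DMB")
    before = pe_trace[:barrier]
    after = pe_trace[barrier + 1:]

    def is_required(access):
        kind, _label = access
        return required_types == "All" or kind == "store"

    group_a = [a for a in before if is_required(a)]
    group_b = [a for a in after if is_required(a)]
    # Accesses that are not of the required types are in neither group, so
    # the DMB does not affect the order in which they are observed.
    neither = [a for a in before + after if not is_required(a)]
    return group_a, group_b, neither
```

For a trace that stores data, executes a DMB, and then stores a flag, the data store falls in Group A and the flag store in Group B, so an observer in the required shareability domain that observes the flag store also observes the data store.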
Data Synchronization Barrier (DSB)
The DSB instruction is a special memory barrier that synchronizes the execution stream with memory
accesses. The DSB instruction takes the required shareability domain and required access types as arguments.
If the required shareability is Full system then the operation applies to all observers within the system.
A DSB behaves as a DMB with the same arguments, and also has the additional properties defined here.
A DSB completes when both:
all explicit memory accesses that are observed by Pe before the DSB is executed, are of the required
access types, and are from observers in the same required shareability domain as Pe, are complete for
the set of observers in the required shareability domain
all cache, branch predictor, and TLB maintenance operations issued by Pe before the DSB are complete
for the required shareability domain.
In addition, no instruction that appears in program order after the DSB instruction can execute until the
DSB completes.
For details of the DSB instruction in the Thumb and ARM instruction sets see DSB on page A8-92.
Note
Historically, this operation was referred to as Drain Write Buffer or Data Write Barrier (DWB). From
ARMv6, these names and the use of DWB were deprecated in favor of the new Data Synchronization Barrier
name and DSB abbreviation. DSB better reflects the functionality provided from ARMv6, because DSB is
architecturally defined to include all cache, TLB and branch prediction maintenance operations as well as
explicit memory operations.
Instruction Synchronization Barrier (ISB)
An ISB instruction flushes the pipeline in the processor, so that all instructions that come after the ISB
instruction in program order are fetched from cache or memory only after the ISB instruction has completed.
Using an ISB ensures that the effects of context altering operations executed before the ISB are visible to the
instructions fetched after the ISB instruction. Examples of context altering operations that require the
insertion of an ISB instruction to ensure the operations are complete are:
cache, TLB, and branch predictor maintenance operations
changes to the CP14 and CP15 registers.
In addition, any branches that appear in program order after the ISB instruction are written into the branch
prediction logic with the context that is visible after the ISB instruction. This is needed to ensure correct
execution of the instruction stream.
Any context altering operations appearing in program order after the ISB instruction only take effect after
the ISB has been executed.
For details of the ISB instruction in the Thumb and ARM instruction sets see ISB on page A8-102.
Pseudocode details of memory barriers
The following types define the required shareability domains and required access types used as arguments
for DMB and DSB instructions:
enumeration MBReqDomain {MBReqDomain_FullSystem,
MBReqDomain_OuterShareable,
MBReqDomain_InnerShareable,
MBReqDomain_Nonshareable};
enumeration MBReqTypes {MBReqTypes_All, MBReqTypes_Writes};
The following procedures perform the memory barriers:
DataMemoryBarrier(MBReqDomain domain, MBReqTypes types)
DataSynchronizationBarrier(MBReqDomain domain, MBReqTypes types)
InstructionSynchronizationBarrier()
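As an illustration of how these types are used, the following Python sketch mirrors the two enumerations and decodes the 4-bit option field of a DMB or DSB instruction into a domain and access type pair. The option names and encodings are not given in this section. They are taken here from the DMB and DSB instruction descriptions and should be checked against those pages. The Python names are hypothetical.

```python
from enum import Enum


class MBReqDomain(Enum):
    FullSystem = "MBReqDomain_FullSystem"
    OuterShareable = "MBReqDomain_OuterShareable"
    InnerShareable = "MBReqDomain_InnerShareable"
    Nonshareable = "MBReqDomain_Nonshareable"


class MBReqTypes(Enum):
    All = "MBReqTypes_All"
    Writes = "MBReqTypes_Writes"


# Assembler option names and their option field values (assumed from the
# DMB and DSB instruction descriptions, not from this section).
BARRIER_OPTIONS = {
    "OSHST": 0b0010, "OSH": 0b0011,
    "NSHST": 0b0110, "NSH": 0b0111,
    "ISHST": 0b1010, "ISH": 0b1011,
    "ST": 0b1110, "SY": 0b1111,
}

_DECODE = {
    0b0010: (MBReqDomain.OuterShareable, MBReqTypes.Writes),
    0b0011: (MBReqDomain.OuterShareable, MBReqTypes.All),
    0b0110: (MBReqDomain.Nonshareable, MBReqTypes.Writes),
    0b0111: (MBReqDomain.Nonshareable, MBReqTypes.All),
    0b1010: (MBReqDomain.InnerShareable, MBReqTypes.Writes),
    0b1011: (MBReqDomain.InnerShareable, MBReqTypes.All),
    0b1110: (MBReqDomain.FullSystem, MBReqTypes.Writes),
}


def decode_barrier_option(option):
    """Map a 4-bit option value to (domain, types).

    Values without an entry in the table, including 0b1111 (SY), are
    treated as a full system barrier for all access types.
    """
    if not 0 <= option <= 0b1111:
        raise ValueError("option must be a 4-bit value")
    return _DECODE.get(option, (MBReqDomain.FullSystem, MBReqTypes.All))
```

The decoded pair corresponds to the arguments of DataMemoryBarrier() or DataSynchronizationBarrier().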
A3.9 Caches and memory hierarchy
The implementation of a memory system depends heavily on the microarchitecture and therefore the details
of the system are IMPLEMENTATION DEFINED. ARMv7 defines the application level interface to the memory
system, and supports a hierarchical memory system with multiple levels of cache. This section provides an
application level view of this system. It contains the subsections:
Introduction to caches
Memory hierarchy on page A3-52
Implication of caches for the application programmer on page A3-52
Preloading caches on page A3-54.
A3.9.1 Introduction to caches
A cache is a block of high-speed memory that contains a number of entries, each consisting of:
main memory address information, commonly known as a tag
the associated data.
Caches are used to increase the average speed of a memory access. Cache operation takes account of two
principles of locality:
Spatial locality
An access to one location is likely to be followed by accesses to adjacent locations.
Examples of this principle are:
sequential instruction execution
accessing a data structure.
Temporal locality
An access to an area of memory is likely to be repeated in a short time period. An example
of this principle is the execution of a code loop.
To minimize the quantity of control information stored, the spatial locality property is used to group several
locations together under the same tag. This logical block is commonly known as a cache line. When data is
loaded into a cache, access times for subsequent loads and stores are reduced, resulting in overall
performance benefits. An access to information already in a cache is known as a cache hit, and other
accesses are called cache misses.
Normally, caches are self-managing, with the updates occurring automatically. Whenever the processor
wants to access a cacheable location, the cache is checked. If the access is a cache hit, the access occurs in
the cache, otherwise a location is allocated and the cache line loaded from memory. Different cache
topologies and access policies are possible, however, they must comply with the memory coherency model
of the underlying architecture.
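The hit and miss behavior described above can be illustrated with a toy model. The following Python sketch assumes a direct-mapped cache with a 32-byte line and four lines. These parameters, the class name, and the replacement behavior are illustrative choices only, because cache topology is IMPLEMENTATION DEFINED.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (no data)."""

    def __init__(self, line_bytes=32, num_lines=4):
        self.line_bytes = line_bytes
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up an address and return "hit" or "miss".

        On a miss the line is allocated, replacing whatever was held at
        that index.
        """
        line_number = address // self.line_bytes  # locations grouped by line
        index = line_number % self.num_lines      # which cache line to check
        tag = line_number // self.num_lines       # identifies the memory line
        if self.tags[index] == tag:
            self.hits += 1
            return "hit"
        self.tags[index] = tag
        self.misses += 1
        return "miss"
```

Eight sequential word accesses to one line give one miss followed by seven hits, which is the spatial locality benefit. Two addresses that map to the same index evict each other in this model.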
Caches introduce a number of potential problems, mainly because of:
memory accesses occurring at times other than when the programmer would normally expect them
there being multiple physical locations where a data item can be held.
A3.9.2 Memory hierarchy
Memory close to a processor has very low latency, but is limited in size and expensive to implement. Further
from the processor it is easier to implement larger blocks of memory but these have increased latency. To
optimize overall performance, an ARMv7 memory system can include multiple levels of cache in a
hierarchical memory system. Figure A3-5 shows such a system, in an ARMv7-A implementation of a
VMSA, supporting virtual addressing.
Figure A3-5 Multiple levels of cache in a memory hierarchy
Note
In this manual, in a hierarchical memory system, Level 1 refers to the level closest to the processor, as shown
in Figure A3-5.
A3.9.3 Implication of caches for the application programmer
In normal operation, the caches are largely invisible to the application programmer. However they can
become visible when there is a breakdown in the coherency of the caches. Such a breakdown can occur:
when memory locations are updated by other agents in the system
when memory updates made from the application code must be made visible to other agents in the
system.
For example:
In a system with a DMA controller that reads memory locations that are held in the data cache of a
processor, a breakdown of coherency occurs when the processor has written new data in the data
cache, but the DMA controller reads the old data held in memory.
In a Harvard architecture of caches, where there are separate instruction and data caches, a
breakdown of coherency occurs when new instruction data has been written into the data cache, but
the instruction cache still contains the old instruction data.
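The DMA example can be made concrete with a small model. The following Python sketch assumes a write-back data cache with no hardware coherency and a DMA controller that reads memory directly. The class and function names are hypothetical, and clean() stands in for a cache clean maintenance operation.

```python
class WriteBackCache:
    """Toy write-back data cache in front of a dictionary that models memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # address -> value held in the cache

    def write(self, address, value):
        # A write-back cache updates only the cached copy.
        self.lines[address] = value

    def read(self, address):
        # The processor sees the cached copy if there is one.
        if address in self.lines:
            return self.lines[address]
        return self.memory[address]

    def clean(self, address):
        # Write the cached copy back to memory, if the address is cached.
        if address in self.lines:
            self.memory[address] = self.lines[address]


def dma_read(memory, address):
    """A DMA controller in this model bypasses the cache entirely."""
    return memory[address]
```

After the processor writes through the cache, dma_read() still returns the old value from memory. It returns the new value only after clean() has been called for that address.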
[Figure A3-5 content: Processor (R0 to R15) issuing Load, Store, and Instruction Prefetch accesses with
virtual addresses; Address Translation to a physical address; Level 1 Cache; Level 2 Cache; Level 3 (DRAM,
SRAM, Flash, ROM); Level 4 (for example, CF card, disk); CP15 configuration and control.]
Data coherency issues
You can ensure the data coherency of caches in the following ways:
By not using the caches in situations where coherency issues can arise. You can achieve this by:
using Non-cacheable or, in some cases, Write-Through Cacheable memory for the caches
not enabling caches in the system.
By using cache maintenance operations to manage the coherency issues in software, see Cache
maintenance functionality on page B2-9. Many of these operations are only available to system
software.
By using hardware coherency mechanisms to ensure the coherency of data accesses to memory for
cacheable locations by observers within the different shareability domains, see Non-shareable
Normal memory on page A3-30 and Shareable, Inner Shareable, and Outer Shareable Normal
memory on page A3-30.
The performance of these hardware coherency mechanisms is highly implementation specific. In
some implementations the mechanism suppresses the ability to cache shareable locations. In other
implementations, cache coherency hardware can hold data in caches while managing coherency
between observers within the shareability domains.
Instruction coherency issues
How far ahead of the current point of execution instructions are prefetched from is IMPLEMENTATION
DEFINED. Such prefetching can be either a fixed or a dynamically varying number of instructions, and can
follow any or all possible future execution paths. For all types of memory:
the processor might have fetched the instructions from memory at any time since the last ISB,
exception entry or exception return executed by that processor
any instructions fetched in this way might be executed multiple times, if this is required by the
execution of the program, without being refetched from memory.
In addition, the ARM architecture does not require the hardware to ensure coherency between instruction
caches and memory, even for regions of memory with Shareable attributes. This means that for cacheable
regions of memory, an instruction cache can hold instructions that were fetched from memory before the
last ISB, exception entry or exception return.
If software requires coherency between instruction execution and memory, it must manage this coherency
using the ISB and DSB memory barriers and cache maintenance operations, see Ordering of cache and
branch predictor maintenance operations on page B2-21. Many of these operations are only available to
system software.
A3.9.4 Preloading caches
The ARM architecture provides memory system hints PLD (Preload Data) and PLI (Preload Instruction) to
permit software to communicate the expected use of memory locations to the hardware. The memory system
can respond by taking actions that are expected to speed up the memory accesses if and when they do occur.
The effect of these memory system hints is IMPLEMENTATION DEFINED. Typically, implementations will use
this information to bring the data or instruction locations into caches that have faster access times than
normal memory.
The Preload instructions are hints, and so implementations can treat them as NOPs without affecting the
functional behavior of the device. The instructions do not generate synchronous Data Abort exceptions, but
the memory system operations might, under exceptional circumstances, generate asynchronous aborts. For
more information, see Data Abort exception on page B1-55.
Hardware implementations can provide other implementation-specific mechanisms to prefetch memory
locations in the cache. These must comply with the general cache behavior described in Cache behavior on
page B2-5.
Chapter A4
The Instruction Sets
This chapter describes the ARM and Thumb instruction sets. It contains the following sections:
About the instruction sets on page A4-2
Unified Assembler Language on page A4-4
Branch instructions on page A4-7
Data-processing instructions on page A4-8
Status register access instructions on page A4-18
Load/store instructions on page A4-19
Load/store multiple instructions on page A4-22
Miscellaneous instructions on page A4-23
Exception-generating and exception-handling instructions on page A4-24
Coprocessor instructions on page A4-25
Advanced SIMD and VFP load/store instructions on page A4-26
Advanced SIMD and VFP register transfer instructions on page A4-29
Advanced SIMD data-processing operations on page A4-30
VFP data-processing instructions on page A4-38.
A4.1 About the instruction sets
ARMv7 contains two main instruction sets, the ARM and Thumb instruction sets. Much of the functionality
available is identical in the two instruction sets. This chapter describes the functionality available in the
instruction sets, and the Unified Assembler Language (UAL) that can be assembled to either instruction set.
The two instruction sets differ in how instructions are encoded:
Thumb instructions are either 16-bit or 32-bit, and are aligned on a two-byte boundary. 16-bit and
32-bit instructions can be intermixed freely. Many common operations are most efficiently executed
using 16-bit instructions. However:
Most 16-bit instructions can only access eight of the general-purpose registers, R0-R7. These
are known as the low registers. A small number of 16-bit instructions can access the high
registers, R8-R15.
Many operations that would require two or more 16-bit instructions can be more efficiently
executed with a single 32-bit instruction.
ARM instructions are always 32-bit, and are aligned on a four-byte boundary.
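The encoding differences can be illustrated with two small helpers. The following Python sketch assumes the rule from the Thumb instruction set encoding chapter that a halfword whose bits [15:11] are 0b11101, 0b11110, or 0b11111 is the first halfword of a 32-bit Thumb instruction. That rule is not stated in this section, and the function names are hypothetical.

```python
def thumb_instruction_size(first_halfword):
    """Return the size in bytes (2 or 4) of the Thumb instruction that
    starts with the given halfword."""
    top5 = (first_halfword >> 11) & 0x1F  # bits [15:11]
    if top5 in (0b11101, 0b11110, 0b11111):
        return 4
    return 2


def is_instruction_aligned(address, instr_set):
    """Check the alignment stated in this section: ARM instructions are
    aligned on a four-byte boundary, Thumb on a two-byte boundary."""
    if instr_set == "ARM":
        return address % 4 == 0
    if instr_set == "Thumb":
        return address % 2 == 0
    raise ValueError("instr_set must be 'ARM' or 'Thumb'")
```

For example, 0x4770, the 16-bit encoding of BX LR, is reported as a 2-byte instruction, while a first halfword of 0xF000 is reported as the start of a 4-byte instruction.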
The ARM and Thumb instruction sets can interwork freely, that is, different procedures can be compiled or
assembled to different instruction sets, and still be able to call each other efficiently.
ThumbEE is a variant of the Thumb instruction set that is designed as a target for dynamically generated
code. However, it cannot interwork freely with the ARM and Thumb instruction sets.
See:
Chapter A5 ARM Instruction Set Encoding for encoding details of the ARM instruction set
Chapter A6 Thumb Instruction Set Encoding for encoding details of the Thumb instruction set
Chapter A8 Instruction Details for detailed descriptions of the instructions
Chapter A9 ThumbEE for encoding details of the ThumbEE instruction set.
A4.1.1 Changing between Thumb state and ARM state
A processor in Thumb state (that is, executing Thumb instructions) can enter ARM state (and change to
executing ARM instructions) by executing any of the following instructions: BX, BLX, or an LDR or LDM that
loads the PC.
A processor in ARM state (that is, executing ARM instructions) can enter Thumb state (and change to
executing Thumb instructions) by executing any of the same instructions.
In ARMv7, a processor in ARM state can also enter Thumb state (and change to executing Thumb
instructions) by executing an ADC, ADD, AND, ASR, BIC, EOR, LSL, LSR, MOV, MVN, ORR, ROR, RRX, RSB, RSC,
SBC, or SUB instruction that has the PC as destination register and does not set the condition flags.
Note
This permits calls and returns between ARM code written for ARMv4 processors and Thumb code running
on ARMv7 processors to function correctly. In new code, ARM recommends that you use BX or BLX
instructions instead. In particular, use BX LR to return from a procedure, not MOV PC,LR.
The target instruction set is either encoded directly in the instruction (for the immediate offset version of
BLX), or is held as bit [0] of an interworking address. For details, see the description of the BXWritePC()
function in Pseudocode details of operations on ARM core registers on page A2-12.
Exception entries and returns can also change between ARM and Thumb states. For details see Exceptions
on page B1-30.
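As an illustration of how bit [0] of an interworking address selects the target instruction set, the
following Python sketch models the behavior attributed to BXWritePC(). The name bx_write_pc and the
returned tuple are hypothetical, chosen for this example only. The authoritative definition is the
pseudocode on page A2-12, and the ThumbEE case is not modelled here.

```python
def bx_write_pc(address):
    """Return (instruction_set, branch_target) for an interworking address.

    Hypothetical helper for illustration only; ThumbEE state is not modelled.
    """
    address &= 0xFFFFFFFF
    if address & 1:
        # Bit [0] == 1: select Thumb state, branch to the address with bit [0] cleared.
        return ("Thumb", address & 0xFFFFFFFE)
    if address & 2 == 0:
        # Bits [1:0] == 0b00: select ARM state, the target is already word-aligned.
        return ("ARM", address)
    # Bits [1:0] == 0b10: treated here as an error, because the result is UNPREDICTABLE.
    raise ValueError("UNPREDICTABLE: bits [1:0] of the address are 0b10")
```

For example, bx_write_pc(0x8001) selects Thumb state with a target of 0x8000, which is why a Thumb
procedure's address is normally held with bit [0] set when it is used with BX or BLX.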
A4.1.2 Conditional execution
Most ARM instructions can be conditionally executed. This means that they only have their normal effect
on the programmers’ model operation, memory and coprocessors if the N, Z, C and V flags in the APSR
satisfy a condition specified in the instruction. If the flags do not satisfy this condition, the instruction acts
as a NOP, that is, execution advances to the next instruction as normal, including any relevant checks for
exceptions being taken, but has no other effect.
Most Thumb instructions are unconditional. Conditional execution in Thumb code can be achieved using
any of the following instructions:
• A 16-bit conditional branch instruction, with a branch range of –256 to +254 bytes. For details see B
  on page A8-44. Before ARMv6T2, this was the only mechanism for conditional execution in Thumb
  code.
• A 32-bit conditional branch instruction, with a branch range of approximately ±1MB. For details see
  B on page A8-44.
• 16-bit Compare and Branch on Zero and Compare and Branch on Nonzero instructions, with a branch
  range of +4 to +130 bytes. For details see CBNZ, CBZ on page A8-66.
• A 16-bit If-Then instruction that makes up to four following instructions conditional. For details see
  IT on page A8-104. The instructions that are made conditional by an IT instruction are called its IT
  block. Instructions in an IT block must either all have the same condition, or some can have one
  condition, and others can have the inverse condition.
For more information about conditional execution see Conditional execution on page A8-8.
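The flag test that decides whether a conditional instruction has its normal effect can be sketched in
Python as follows. The mapping from condition mnemonics to flag tests is not given in this section. It
is taken here from the standard ARM condition code definitions (see Conditional execution on page A8-8),
and the function name condition_passed is an assumption made for this example.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate a condition mnemonic against the APSR N, Z, C and V flags (each 0 or 1)."""
    base = {
        "EQ": z,                    # equal
        "CS": c,                    # carry set
        "MI": n,                    # minus, negative
        "VS": v,                    # overflow
        "HI": c and not z,          # unsigned higher
        "GE": n == v,               # signed greater than or equal
        "GT": (n == v) and not z,   # signed greater than
    }
    inverse = {"NE": "EQ", "CC": "CS", "PL": "MI", "VC": "VS",
               "LS": "HI", "LT": "GE", "LE": "GT"}
    if cond == "AL":
        return True                 # always
    if cond in base:
        return bool(base[cond])
    return not base[inverse[cond]]  # each remaining condition is the inverse of a base test
```

When condition_passed() is false the instruction acts as a NOP, as described above.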
A4.2 Unified Assembler Language
This document uses the ARM Unified Assembler Language (UAL). This assembly language syntax
provides a canonical form for all ARM and Thumb instructions.
UAL describes the syntax for the mnemonic and the operands of each instruction. In addition, it assumes
that instructions and data items can be given labels. It does not specify the syntax to be used for labels, nor
what assembler directives and options are available. See your assembler documentation for these details.
Most earlier ARM assembly language mnemonics are still supported as synonyms, as described in the
instruction details.
Note
Most earlier Thumb assembly language mnemonics are not supported. For details see Appendix C Legacy
Instruction Mnemonics.
UAL includes instruction selection rules that specify which instruction encoding is selected when more than
one can provide the required functionality. For example, both 16-bit and 32-bit encodings exist for an
ADD R0,R1,R2 instruction. The most common instruction selection rule is that when both a 16-bit encoding
and a 32-bit encoding are available, the 16-bit encoding is selected, to optimize code density.
Syntax options exist to override the normal instruction selection rules and ensure that a particular encoding
is selected. These are useful when disassembling code, to ensure that subsequent assembly produces the
original code, and in some other situations.
A4.2.1 Conditional instructions
For maximum portability of UAL assembly language between the ARM and Thumb instruction sets, ARM
recommends that:
• IT instructions are written before conditional instructions in the correct way for the Thumb
  instruction set.
• When assembling to the ARM instruction set, assemblers check that any IT instructions are correct,
  but do not generate any code for them.
Although other Thumb instructions are unconditional, all instructions that are made conditional by an IT
instruction must be written with a condition. These conditions must match the conditions imposed by the IT
instruction. For example, an ITTEE EQ instruction imposes the EQ condition on the first two following
instructions, and the NE condition on the next two. Those four instructions must be written with EQ, EQ, NE
and NE conditions respectively.
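The matching rule can be illustrated with a short Python sketch that expands an IT mnemonic into the
conditions its block requires. The helper name it_block_conditions and the INVERSE table are assumptions
made for this example, and the AL condition is not handled.

```python
# Each condition paired with its inverse (AL has no inverse and is omitted).
INVERSE = {"EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS", "MI": "PL", "PL": "MI",
           "VS": "VC", "VC": "VS", "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
           "GT": "LE", "LE": "GT"}

def it_block_conditions(mnemonic, first_cond):
    """Return the condition each instruction in the IT block must be written with."""
    if not mnemonic.startswith("IT") or len(mnemonic) > 5:
        raise ValueError("expected IT followed by up to three T or E letters")
    conditions = [first_cond]                # the first instruction always uses first_cond
    for letter in mnemonic[2:]:
        if letter == "T":                    # Then: same condition
            conditions.append(first_cond)
        elif letter == "E":                  # Else: inverse condition
            conditions.append(INVERSE[first_cond])
        else:
            raise ValueError("unexpected letter " + letter)
    return conditions
```

With these assumptions, it_block_conditions("ITTEE", "EQ") returns ["EQ", "EQ", "NE", "NE"], matching
the example in the text.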
Some instructions cannot be made conditional by an IT instruction. Some instructions can be conditional if
they are the last instruction in the IT block, but not otherwise.
The branch instruction encodings that include a condition field cannot be made conditional by an IT
instruction. If the assembler syntax indicates a conditional branch that correctly matches a preceding IT
instruction, it is assembled using a branch instruction encoding that does not include a condition field.
A4.2.2 Use of labels in UAL instruction syntax
The UAL syntax for some instructions includes the label of an instruction or a literal data item that is at a
fixed offset from the instruction being specified. The assembler must:
1. Calculate the PC or Align(PC,4) value of the instruction. The PC value of an instruction is its address
   plus 4 for a Thumb instruction, or plus 8 for an ARM instruction. The Align(PC,4) value of an
   instruction is its PC value ANDed with 0xFFFFFFFC to force it to be word-aligned. There is no
   difference between the PC and Align(PC,4) values for an ARM instruction, but there can be for a
   Thumb instruction.
2. Calculate the offset from the PC or Align(PC,4) value of the instruction to the address of the labelled
   instruction or literal data item.
3. Assemble a PC-relative encoding of the instruction, that is, one that reads its PC or Align(PC,4) value
   and adds the calculated offset to form the required address.
Note
For instructions that can encode a subtraction operation, if the instruction cannot encode the
calculated offset but can encode minus the calculated offset, the instruction encoding specifies a
subtraction of minus the calculated offset.
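The first two steps can be sketched in Python as follows. The function names are hypothetical and exist
only for this illustration. The sketch computes values and does not check whether a particular
instruction can encode the resulting offset.

```python
def pc_value(address, thumb):
    """PC value read by an instruction: its address plus 4 (Thumb) or plus 8 (ARM)."""
    return (address + (4 if thumb else 8)) & 0xFFFFFFFF

def align_pc4(address, thumb):
    """Align(PC,4): the PC value ANDed with 0xFFFFFFFC."""
    return pc_value(address, thumb) & 0xFFFFFFFC

def label_offset(address, label_address, thumb, use_align=True):
    """Offset from the instruction's Align(PC,4) value (or PC value) to the label."""
    base = align_pc4(address, thumb) if use_align else pc_value(address, thumb)
    return label_address - base
```

For a Thumb instruction at 0x1002 the PC value is 0x1006 and the Align(PC,4) value is 0x1004, so a
literal at 0x1010 is at an offset of 12 from Align(PC,4). For an ARM instruction the two values are
always equal, because ARM instructions are word-aligned.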
The syntax of the following instructions includes a label:
• B, BL, and BLX (immediate). The assembler syntax for these instructions always specifies the label of
  the instruction that they branch to. Their encodings specify a sign-extended immediate offset that is
  added to the PC value of the instruction to form the target address of the branch.
• CBNZ and CBZ. The assembler syntax for these instructions always specifies the label of the instruction
  that they branch to. Their encodings specify a zero-extended immediate offset that is added to the
  PC value of the instruction to form the target address of the branch. They do not support backward
  branches.
• LDC, LDC2, LDR, LDRB, LDRD, LDRH, LDRSB, LDRSH, PLD, PLDW, PLI, and VLDR. The normal assembler syntax of
  these load instructions can specify the label of a literal data item that is to be loaded. The encodings
  of these instructions specify a zero-extended immediate offset that is either added to or subtracted
  from the Align(PC,4) value of the instruction to form the address of the data item. A few such
  encodings perform a fixed addition or a fixed subtraction and must only be used when that operation
  is required, but most contain a bit that specifies whether the offset is to be added or subtracted.
  When the assembler calculates an offset of 0 for the normal syntax of these instructions, it must
  assemble an encoding that adds 0 to the Align(PC,4) value of the instruction. Encodings that subtract
  0 from the Align(PC,4) value cannot be specified by the normal syntax.
  There is an alternative syntax for these instructions that specifies the addition or subtraction and the
  immediate offset explicitly. In this syntax, the label is replaced by [PC, #+/-<imm>], where:
  +/-      Is + or omitted to specify that the immediate offset is to be added to the Align(PC,4) value,
           or - if it is to be subtracted.
  <imm>    Is the immediate offset.
  This alternative syntax makes it possible to assemble the encodings that subtract 0 from the
  Align(PC,4) value, and to disassemble them to a syntax that can be re-assembled correctly.
• ADR. The normal assembler syntax for this instruction can specify the label of an instruction or literal
  data item whose address is to be calculated. Its encoding specifies a zero-extended immediate offset
  that is either added to or subtracted from the Align(PC,4) value of the instruction to form the address
  of the data item, and some opcode bits that determine whether it is an addition or subtraction.
  When the assembler calculates an offset of 0 for the normal syntax of this instruction, it must
  assemble the encoding that adds 0 to the Align(PC,4) value of the instruction. The encoding that
  subtracts 0 from the Align(PC,4) value cannot be specified by the normal syntax.
  There is an alternative syntax for this instruction that specifies the addition or subtraction and the
  immediate value explicitly, by writing them as additions ADD <Rd>,PC,#<imm> or subtractions
  SUB <Rd>,PC,#<imm>. This alternative syntax makes it possible to assemble the encoding that subtracts
  0 from the Align(PC,4) value, and to disassemble it to a syntax that can be re-assembled correctly.
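A minimal Python sketch of the add-or-subtract choice that the normal (label) syntax implies is shown
below. The function name pc_relative_operation is hypothetical. Immediate range checks are omitted
because they differ from instruction to instruction.

```python
def pc_relative_operation(offset):
    """Choose the operation and zero-extended immediate for a label-based encoding.

    offset is the label address minus the Align(PC,4) value of the instruction.
    """
    if offset >= 0:
        # An offset of 0 always selects the encoding that adds 0, never the one that subtracts 0.
        return ("add", offset)
    return ("sub", -offset)
```

Under this sketch the subtract-0 encoding can never be produced, which is why the text provides the
alternative syntax for that case.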
Note
ARM recommends that where possible, you avoid using:
• the alternative syntax for the ADR, LDC, LDC2, LDR, LDRB, LDRD, LDRH, LDRSB, LDRSH, PLD, PLI, PLDW, and VLDR
  instructions
• the encodings of these instructions that subtract 0 from the Align(PC,4) value.
A4.3 Branch instructions
Table A4-1 summarizes the branch instructions in the ARM and Thumb instruction sets. In addition to
providing for changes in the flow of execution, some branch instructions can change instruction set.
Branches to loaded and calculated addresses can be performed by LDR, LDM and data-processing instructions.
For details see Load/store instructions on page A4-19, Load/store multiple instructions on page A4-22,
Standard data-processing instructions on page A4-8, and Shift instructions on page A4-10.
Table A4-1 Branch instructions

Instruction                                                See                                Range (Thumb)  Range (ARM)
Branch to target address                                   B on page A8-44                    +/–16MB        +/–32MB
Compare and Branch on Nonzero, Compare and Branch on Zero  CBNZ, CBZ on page A8-66            0-126B         a
Call a subroutine                                          BL, BLX (immediate) on page A8-58  +/–16MB        +/–32MB
Call a subroutine, change instruction set b                BL, BLX (immediate) on page A8-58  +/–16MB        +/–32MB
Call a subroutine, optionally change instruction set       BLX (register) on page A8-60       Any            Any
Branch to target address, change instruction set           BX on page A8-62                   Any            Any
Change to Jazelle state                                    BXJ on page A8-64                  -              -
Table Branch (byte offsets)                                TBB, TBH on page A8-446            0-510B         a
Table Branch (halfword offsets)                            TBB, TBH on page A8-446            0-131070B      a

a. These instructions do not exist in the ARM instruction set.
b. The range is determined by the instruction set of the BLX instruction, not of the instruction it branches to.
A4.4 Data-processing instructions
Core data-processing instructions belong to one of the following groups:
• Standard data-processing instructions. These instructions perform basic data-processing operations,
  and share a common format with some variations.
• Shift instructions on page A4-10.
• Multiply instructions on page A4-11.
• Saturating instructions on page A4-13.
• Packing and unpacking instructions on page A4-14.
• Miscellaneous data-processing instructions on page A4-15.
• Parallel addition and subtraction instructions on page A4-16.
• Divide instructions on page A4-17.
For extension data-processing instructions, see Advanced SIMD data-processing operations on page A4-30
and VFP data-processing instructions on page A4-38.
A4.4.1 Standard data-processing instructions
These instructions generally have a destination register Rd, a first operand register Rn, and a second
operand. The second operand can be another register Rm, or an immediate constant.
If the second operand is an immediate constant, it can be:
• Encoded directly in the instruction.
• A modified immediate constant that uses 12 bits of the instruction to encode a range of constants.
  Thumb and ARM instructions have slightly different ranges of modified immediate constants. For
  details see Modified immediate constants in Thumb instructions on page A6-17 and Modified
  immediate constants in ARM instructions on page A5-9.
If the second operand is another register, it can optionally be shifted in any of the following ways:
LSL    Logical Shift Left by 1-31 bits.
LSR    Logical Shift Right by 1-32 bits.
ASR    Arithmetic Shift Right by 1-32 bits.
ROR    Rotate Right by 1-31 bits.
RRX    Rotate Right with Extend. For details see Shift and rotate operations on page A2-5.
In Thumb code, the amount to shift by is always a constant encoded in the instruction. In ARM code, the
amount to shift by is either a constant encoded in the instruction, or the value of a register Rs.
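The five shift types can be sketched on 32-bit values in Python as follows. This is an illustrative
model only: the function names are lower-case stand-ins for the mnemonics, the carry-out that each
shift can produce is not modelled, and the RRX behavior (carry flag shifted into bit [31]) is stated
from the general architecture definition rather than from this section.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical Shift Left by n bits (1-31)."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical Shift Right by n bits (1-32), filling with zeros."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic Shift Right by n bits (1-32), filling with copies of bit [31]."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32

def ror(x, n):
    """Rotate Right by n bits (1-31)."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry_in):
    """Rotate Right with Extend: shift right by one bit, carry flag enters bit [31]."""
    return ((x & MASK32) >> 1) | ((carry_in & 1) << 31)
```

A shift by 32 is meaningful only for LSR and ASR: lsr(x, 32) is always 0, and asr(x, 32) is 0 or
0xFFFFFFFF depending on the sign bit of x.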
For instructions other than CMN, CMP, TEQ, and TST, the result of the data-processing operation is placed in the
destination register. In the ARM instruction set, the destination register can be the PC, causing the result to
be treated as an address to branch to. In the Thumb instruction set, this is only permitted for some 16-bit
forms of the ADD and MOV instructions.
These instructions can optionally set the condition code flags, according to the result of the operation. If
they do not set the flags, existing flag settings from a previous instruction are preserved.
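As a sketch of what setting the flags according to the result means for an addition, the following
Python function returns the result together with the N, Z, C and V values. It follows the shape of the
AddWithCarry() pseudocode function described in Chapter A2, as understood here. The Python name and
the tuple layout are assumptions made for this example.

```python
def add_with_carry(x, y, carry_in):
    """Return (result, N, Z, C, V) for a 32-bit addition of x, y and carry_in."""
    def to_signed(value):
        return value - (1 << 32) if value & 0x80000000 else value

    x &= 0xFFFFFFFF
    y &= 0xFFFFFFFF
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & 0xFFFFFFFF
    n = result >> 31                          # N: bit [31] of the result
    z = int(result == 0)                      # Z: result is zero
    c = int(result != unsigned_sum)           # C: unsigned overflow (carry out)
    v = int(to_signed(result) != signed_sum)  # V: signed overflow
    return result, n, z, c, v
```

A subtraction such as CMP can be modelled as add_with_carry(x, ~y & 0xFFFFFFFF, 1), so comparing two
equal values gives Z == 1 and C == 1.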
Table A4-2 summarizes the main data-processing instructions in the Thumb and ARM instruction sets.
Generally, each of these instructions is described in three sections in Chapter A8 Instruction Details, one
section for each of the following:
• INSTRUCTION (immediate) where the second operand is a modified immediate constant.
• INSTRUCTION (register) where the second operand is a register, or a register shifted by a constant.
• INSTRUCTION (register-shifted register) where the second operand is a register shifted by a value
  obtained from another register. These are only available in the ARM instruction set.
Table A4-2 Standard data-processing instructions

Instruction                  Mnemonic  Notes
Add with Carry               ADC       -
Add                          ADD       Thumb instruction set permits use of a modified immediate constant or a
                                       zero-extended 12-bit immediate constant.
Form PC-relative Address     ADR       First operand is the PC. Second operand is an immediate constant. Thumb
                                       instruction set uses a zero-extended 12-bit immediate constant. Operation
                                       is an addition or a subtraction.
Bitwise AND                  AND       -
Bitwise Bit Clear            BIC       -
Compare Negative             CMN       Sets flags. Like ADD but with no destination register.
Compare                      CMP       Sets flags. Like SUB but with no destination register.
Bitwise Exclusive OR         EOR       -
Copy operand to destination  MOV       Has only one operand, with the same options as the second operand in most
                                       of these instructions. If the operand is a shifted register, the
                                       instruction is an LSL, LSR, ASR, or ROR instruction instead. For details see
                                       Shift instructions on page A4-10.
                                       The ARM and Thumb instruction sets permit use of a modified immediate
                                       constant or a zero-extended 16-bit immediate constant.
Bitwise NOT                  MVN       Has only one operand, with the same options as the second operand in most
                                       of these instructions.
Bitwise OR NOT               ORN       Not available in the ARM instruction set.
Bitwise OR                   ORR       -
Reverse Subtract             RSB       Subtracts first operand from second operand. This permits subtraction from
                                       constants and shifted registers.
Reverse Subtract with Carry  RSC       Not available in the Thumb instruction set.
Subtract with Carry          SBC       -
Subtract                     SUB       Thumb instruction set permits use of a modified immediate constant or a
                                       zero-extended 12-bit immediate constant.
Test Equivalence             TEQ       Sets flags. Like EOR but with no destination register.
Test                         TST       Sets flags. Like AND but with no destination register.

A4.4.2 Shift instructions
Table A4-3 lists the shift instructions in the ARM and Thumb instruction sets.
In the ARM instruction set only, the destination register of these instructions can be the PC, causing the
result to be treated as an address to branch to.
Table A4-3 Shift instructions
Instruction See
Arithmetic Shift Right ASR (immediate) on page A8-40
Arithmetic Shift Right ASR (register) on page A8-42
Logical Shift Left LSL (immediate) on page A8-178
Logical Shift Left LSL (register) on page A8-180
Logical Shift Right LSR (immediate) on page A8-182
Logical Shift Right LSR (register) on page A8-184
Rotate Right ROR (immediate) on page A8-278
Rotate Right ROR (register) on page A8-280
Rotate Right with Extend RRX on page A8-282
A4.4.3 Multiply instructions
These instructions can operate on signed or unsigned quantities. In some types of operation, the results are
the same whether the operands are signed or unsigned.
Table A4-4 summarizes the multiply instructions where there is no distinction between signed and
unsigned quantities.
The least significant 32 bits of the result are used. More significant bits are discarded.
Table A4-5 summarizes the signed multiply instructions.
Table A4-6 on page A4-12 summarizes the unsigned multiply instructions.
Table A4-4 General multiply instructions
Instruction See Operation (number of bits)
Multiply Accumulate MLA on page A8-190 32 = 32 + 32 x 32
Multiply and Subtract MLS on page A8-192 32 = 32 – 32 x 32
Multiply MUL on page A8-212 32 = 32 x 32
Table A4-5 Signed multiply instructions

Instruction                                       See                                                Operation (number of bits)
Signed Multiply Accumulate (halfwords)            SMLABB, SMLABT, SMLATB, SMLATT on page A8-330      32 = 32 + 16 x 16
Signed Multiply Accumulate Dual                   SMLAD on page A8-332                               32 = 32 + 16 x 16 + 16 x 16
Signed Multiply Accumulate Long                   SMLAL on page A8-334                               64 = 64 + 32 x 32
Signed Multiply Accumulate Long (halfwords)       SMLALBB, SMLALBT, SMLALTB, SMLALTT on page A8-336  64 = 64 + 16 x 16
Signed Multiply Accumulate Long Dual              SMLALD on page A8-338                              64 = 64 + 16 x 16 + 16 x 16
Signed Multiply Accumulate (word by halfword)     SMLAWB, SMLAWT on page A8-340                      32 = 32 + 32 x 16 a
Signed Multiply Subtract Dual                     SMLSD on page A8-342                               32 = 32 + 16 x 16 – 16 x 16
Signed Multiply Subtract Long Dual                SMLSLD on page A8-344                              64 = 64 + 16 x 16 – 16 x 16
Signed Most Significant Word Multiply Accumulate  SMMLA on page A8-346                               32 = 32 + 32 x 32 b
Signed Most Significant Word Multiply Subtract    SMMLS on page A8-348                               32 = 32 – 32 x 32 b
Signed Most Significant Word Multiply             SMMUL on page A8-350                               32 = 32 x 32 b
Signed Dual Multiply Add                          SMUAD on page A8-352                               32 = 16 x 16 + 16 x 16
Signed Multiply (halfwords)                       SMULBB, SMULBT, SMULTB, SMULTT on page A8-354      32 = 16 x 16
Signed Multiply Long                              SMULL on page A8-356                               64 = 32 x 32
Signed Multiply (word by halfword)                SMULWB, SMULWT on page A8-358                      32 = 32 x 16 a
Signed Dual Multiply Subtract                     SMUSD on page A8-360                               32 = 16 x 16 – 16 x 16

a. The most significant 32 bits of the 48-bit product are used. Less significant bits are discarded.
b. The most significant 32 bits of the 64-bit product are used. Less significant bits are discarded.
Table A4-6 Unsigned multiply instructions
Instruction See Operation (number of bits)
Unsigned Multiply Accumulate Accumulate Long UMAAL on page A8-482 64 = 32 + 32 + 32 x 32
Unsigned Multiply Accumulate Long UMLAL on page A8-484 64 = 64 + 32 x 32
Unsigned Multiply Long UMULL on page A8-486 64 = 32 x 32
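The bit-width notes in these tables can be illustrated with a Python sketch of a few representative
multiplies. The functions are lower-case stand-ins for the mnemonics and model only the arithmetic.
Operand selection fields, flag setting, and the accumulate forms are not modelled, and the 64-bit
results are returned as a hypothetical (high word, low word) pair.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mul(rn, rm):
    """MUL: 32 = 32 x 32, keeping only the least significant 32 bits."""
    return (rn * rm) & MASK32

def umull(rn, rm):
    """UMULL: 64 = 32 x 32 unsigned, returned as (high word, low word)."""
    product = (rn & MASK32) * (rm & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def smull(rn, rm):
    """SMULL: 64 = 32 x 32 signed, returned as (high word, low word)."""
    product = to_signed(rn, 32) * to_signed(rm, 32)
    return (product >> 32) & MASK32, product & MASK32

def smmul(rn, rm):
    """SMMUL: most significant 32 bits of the 64-bit signed product (footnote b)."""
    return smull(rn, rm)[0]

def smulwb(rn, rm):
    """SMULWB: 32 x 16 signed, most significant 32 bits of the 48-bit product (footnote a)."""
    product = to_signed(rn, 32) * to_signed(rm, 16)
    return (product >> 16) & MASK32
```

The low words of umull() and smull() always agree with mul(), which is why Table A4-4 makes no
distinction between signed and unsigned operands. Only the high words differ.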
A4.4.4 Saturating instructions
Table A4-7 lists the saturating instructions in the ARM and Thumb instruction sets. For more information,
see Pseudocode details of saturation on page A2-9.
Table A4-7 Saturating instructions
Instruction See Operation
Signed Saturate SSAT on page A8-362 Saturates optionally shifted 32-bit value to selected range
Signed Saturate 16 SSAT16 on page A8-364 Saturates two 16-bit values to selected range
Unsigned Saturate USAT on page A8-504 Saturates optionally shifted 32-bit value to selected range
Unsigned Saturate 16 USAT16 on page A8-506 Saturates two 16-bit values to selected range
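Saturation to a selected range can be sketched in Python as follows. The helper names are hypothetical
and model only the clamping step. The optional shift of the input, the two-halfword forms, and the
setting of the Q flag when saturation occurs are not modelled. See Pseudocode details of saturation on
page A2-9 for the authoritative definitions.

```python
def signed_sat(value, bits):
    """Saturate a signed integer to the range -2^(bits-1) to 2^(bits-1) - 1."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return max(low, min(high, value))

def unsigned_sat(value, bits):
    """Saturate a signed integer to the range 0 to 2^bits - 1."""
    return max(0, min((1 << bits) - 1, value))
```

For example, saturating 200 to a signed 8-bit range gives 127, and saturating -5 to an unsigned 8-bit
range gives 0.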
A4.4.5 Packing and unpacking instructions
Table A4-8 lists the packing and unpacking instructions in the ARM and Thumb instruction sets. These are
all available from ARMv6T2 in the Thumb instruction set, and from ARMv6 onwards in the ARM
instruction set.
Table A4-8 Packing and unpacking instructions
Instruction See Operation
Pack Halfword PKH on page A8-234 Combine halfwords
Signed Extend and Add Byte SXTAB on page A8-434 Extend 8 bits to 32 and add
Signed Extend and Add Byte 16 SXTAB16 on page A8-436 Dual extend 8 bits to 16 and add
Signed Extend and Add Halfword SXTAH on page A8-438 Extend 16 bits to 32 and add
Signed Extend Byte SXTB on page A8-440 Extend 8 bits to 32
Signed Extend Byte 16 SXTB16 on page A8-442 Dual extend 8 bits to 16
Signed Extend Halfword SXTH on page A8-444 Extend 16 bits to 32
Unsigned Extend and Add Byte UXTAB on page A8-514 Extend 8 bits to 32 and add
Unsigned Extend and Add Byte 16 UXTAB16 on page A8-516 Dual extend 8 bits to 16 and add
Unsigned Extend and Add Halfword UXTAH on page A8-518 Extend 16 bits to 32 and add
Unsigned Extend Byte UXTB on page A8-520 Extend 8 bits to 32
Unsigned Extend Byte 16 UXTB16 on page A8-522 Dual extend 8 bits to 16
Unsigned Extend Halfword UXTH on page A8-524 Extend 16 bits to 32
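The extend operations in Table A4-8 can be sketched in Python as follows. The functions are lower-case
stand-ins for the mnemonics. The optional rotation of the source register that these instructions
support is not modelled, and the helper sign_extend is an assumption made for this example.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 32-bit result."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & MASK32

def uxtb(rm):
    return rm & 0xFF                 # Extend 8 bits to 32, zero-filled

def sxtb(rm):
    return sign_extend(rm, 8)        # Extend 8 bits to 32, sign-filled

def uxth(rm):
    return rm & 0xFFFF               # Extend 16 bits to 32, zero-filled

def sxth(rm):
    return sign_extend(rm, 16)       # Extend 16 bits to 32, sign-filled

def sxtab(rn, rm):
    return (rn + sign_extend(rm, 8)) & MASK32   # Extend 8 bits to 32 and add

def uxtb16(rm):
    return rm & 0x00FF00FF           # Dual extend: bytes [7:0] and [23:16] to 16 bits each
```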
A4.4.6 Miscellaneous data-processing instructions
Table A4-9 lists the miscellaneous data-processing instructions in the ARM and Thumb instruction sets.
Immediate values in these instructions are simple binary numbers.
Table A4-9 Miscellaneous data-processing instructions
Instruction See Notes
Bit Field Clear BFC on page A8-46 -
Bit Field Insert BFI on page A8-48 -
Count Leading Zeros CLZ on page A8-72 -
Move Top MOVT on page A8-200 Moves 16-bit immediate value to top halfword. Bottom halfword unchanged.
Reverse Bits RBIT on page A8-270 -
Byte-Reverse Word REV on page A8-272 -
Byte-Reverse Packed Halfword REV16 on page A8-274 -
Byte-Reverse Signed Halfword REVSH on page A8-276 -
Signed Bit Field Extract SBFX on page A8-308 -
Select Bytes using GE flags SEL on page A8-312 -
Unsigned Bit Field Extract UBFX on page A8-466 -
Unsigned Sum of Absolute Differences USAD8 on page A8-500 -
Unsigned Sum of Absolute Differences and Accumulate USADA8 on page A8-502 -
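A few of the operations in Table A4-9 can be sketched in Python as follows. The functions are
lower-case stand-ins for the mnemonics and are illustrative only. The lsb and width parameters of
ubfx() are simple binary numbers, as the text notes for immediate values in these instructions.

```python
MASK32 = 0xFFFFFFFF

def clz(x):
    """Count Leading Zeros in a 32-bit value (32 when the value is 0)."""
    return 32 - (x & MASK32).bit_length()

def rbit(x):
    """Reverse the order of the 32 bits."""
    return int(format(x & MASK32, "032b")[::-1], 2)

def rev(x):
    """Reverse the order of the four bytes in a word."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")

def ubfx(x, lsb, width):
    """Extract `width` bits starting at bit `lsb`, zero-extended."""
    return ((x & MASK32) >> lsb) & ((1 << width) - 1)

def movt(rd, imm16):
    """Write imm16 to the top halfword of rd, leaving the bottom halfword unchanged."""
    return ((imm16 & 0xFFFF) << 16) | (rd & 0xFFFF)
```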
A4.4.7 Parallel addition and subtraction instructions
These instructions perform additions and subtractions on the values of two registers and write the result to
a destination register, treating the register values as sets of two halfwords or four bytes. They are available
in ARMv6 and above.
These instructions consist of a prefix followed by a main instruction mnemonic. The prefixes are as follows:
S     Signed arithmetic modulo 2^8 or 2^16.
Q     Signed saturating arithmetic.
SH    Signed arithmetic, halving the results.
U     Unsigned arithmetic modulo 2^8 or 2^16.
UQ    Unsigned saturating arithmetic.
UH    Unsigned arithmetic, halving the results.
The main instruction mnemonics are as follows:
ADD16    Adds the top halfwords of two operands to form the top halfword of the result, and the
         bottom halfwords of the same two operands to form the bottom halfword of the result.
ASX      Exchanges halfwords of the second operand, and then adds top halfwords and subtracts
         bottom halfwords.
SAX      Exchanges halfwords of the second operand, and then subtracts top halfwords and adds
         bottom halfwords.
SUB16    Subtracts each halfword of the second operand from the corresponding halfword of the first
         operand to form the corresponding halfword of the result.
ADD8     Adds each byte of the second operand to the corresponding byte of the first operand to form
         the corresponding byte of the result.
SUB8     Subtracts each byte of the second operand from the corresponding byte of the first operand
         to form the corresponding byte of the result.
The instruction set permits all 36 combinations of prefix and main instruction mnemonic.
See also Advanced SIMD parallel addition and subtraction on page A4-31.
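The way a prefix and a main mnemonic combine can be sketched in Python as follows. The sketch
enumerates the 36 instruction names and models three of them on 32-bit register values. The helper
names lanes, pack and to_signed are assumptions made for this example, and the GE flags that the S and
U forms set are not modelled.

```python
from itertools import product

PREFIXES = ["S", "Q", "SH", "U", "UQ", "UH"]
MNEMONICS = ["ADD16", "ASX", "SAX", "SUB16", "ADD8", "SUB8"]
ALL_INSTRUCTIONS = [p + m for p, m in product(PREFIXES, MNEMONICS)]  # 6 x 6 names

def lanes(value, width):
    """Split a 32-bit value into unsigned lanes of `width` bits, lowest lane first."""
    mask = (1 << width) - 1
    return [(value >> shift) & mask for shift in range(0, 32, width)]

def pack(values, width):
    """Reassemble lanes into a 32-bit value, truncating each lane to `width` bits."""
    mask = (1 << width) - 1
    result = 0
    for index, lane in enumerate(values):
        result |= (lane & mask) << (index * width)
    return result

def to_signed(value, width):
    return value - (1 << width) if value >> (width - 1) else value

def uadd8(rn, rm):
    """U + ADD8: unsigned arithmetic modulo 2^8 in each byte."""
    return pack([a + b for a, b in zip(lanes(rn, 8), lanes(rm, 8))], 8)

def uqadd8(rn, rm):
    """UQ + ADD8: unsigned saturating arithmetic, each byte clamped to 255."""
    return pack([min(a + b, 0xFF) for a, b in zip(lanes(rn, 8), lanes(rm, 8))], 8)

def shadd16(rn, rm):
    """SH + ADD16: signed arithmetic on each halfword, halving the results."""
    sums = [to_signed(a, 16) + to_signed(b, 16)
            for a, b in zip(lanes(rn, 16), lanes(rm, 16))]
    return pack([s >> 1 for s in sums], 16)
```

Comparing uadd8() and uqadd8() on the same operands shows the difference between the modulo and
saturating prefixes: a byte sum of 0x100 wraps to 0x00 in the first and clamps to 0xFF in the second.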
A4.4.8 Divide instructions
In the ARMv7-R profile, the Thumb instruction set includes signed and unsigned integer divide instructions
that are implemented in hardware. For details of the instructions see:
• SDIV on page A8-310
• UDIV on page A8-468.
Note
SDIV and UDIV are UNDEFINED in the ARMv7-A profile.
The ARMv7-M profile also includes the SDIV and UDIV instructions.
In the ARMv7-R profile, the SCTLR.DZ bit enables divide by zero fault detection, see c1, System Control
Register (SCTLR) on page B4-45:
DZ == 0    Divide-by-zero returns a zero result.
DZ == 1    SDIV and UDIV generate an Undefined Instruction exception on a divide-by-zero.
The SCTLR.DZ bit is cleared to zero on reset.
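The effect of the SCTLR.DZ bit can be sketched in Python as follows. The exception class and function
names are hypothetical. The rounding of the signed quotient towards zero is taken from the SDIV
instruction description and is not stated in this section.

```python
MASK32 = 0xFFFFFFFF

class UndefinedInstruction(Exception):
    """Stands in for the Undefined Instruction exception in this sketch."""

def to_signed32(value):
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def udiv(rn, rm, dz=0):
    """Unsigned divide; dz models the SCTLR.DZ bit (0 after reset)."""
    if rm & MASK32 == 0:
        if dz:
            raise UndefinedInstruction("divide-by-zero with SCTLR.DZ == 1")
        return 0                              # DZ == 0: divide-by-zero returns zero
    return (rn & MASK32) // (rm & MASK32)

def sdiv(rn, rm, dz=0):
    """Signed divide, rounding towards zero; dz models the SCTLR.DZ bit."""
    n, m = to_signed32(rn), to_signed32(rm)
    if m == 0:
        if dz:
            raise UndefinedInstruction("divide-by-zero with SCTLR.DZ == 1")
        return 0
    quotient = abs(n) // abs(m)               # magnitude first, so rounding is towards zero
    if (n < 0) != (m < 0):
        quotient = -quotient
    return quotient & MASK32
```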
A4.5 Status register access instructions
The MRS and MSR instructions move the contents of the Application Program Status Register (APSR) to or
from a general-purpose register.
The APSR is described in The Application Program Status Register (APSR) on page A2-14.
The condition flags in the APSR are normally set by executing data-processing instructions, and are
normally used to control the execution of conditional instructions. However, you can set the flags explicitly
using the MSR instruction, and you can read the current state of the flags explicitly using the MRS instruction.
For details of the system level use of status register access instructions CPS, MRS, and MSR, see Chapter B6
System Instructions.
A4.6 Load/store instructions
Table A4-10 summarizes the general-purpose register load/store instructions in the ARM and Thumb
instruction sets. See also:
• Load/store multiple instructions on page A4-22
• Advanced SIMD and VFP load/store instructions on page A4-26.
Load/store instructions have several options for addressing memory. For more information, see Addressing
modes on page A4-20.
A4.6.1 Loads to the PC
The LDR instruction can be used to load a value into the PC. The value loaded is treated as an interworking
address, as described by the LoadWritePC() pseudocode function in Pseudocode details of operations on
ARM core registers on page A2-12.
A4.6.2 Halfword and byte loads and stores
Halfword and byte stores store the least significant halfword or byte from the register, to 16 or 8 bits of
memory respectively. There is no distinction between signed and unsigned stores.
Halfword and byte loads load 16 or 8 bits from memory into the least significant halfword or byte of a
register. Unsigned loads zero-extend the loaded value to 32 bits, and signed loads sign-extend the value to
32 bits.
Table A4-10 Load/store instructions

Data type                 Load   Store   Load unprivileged  Store unprivileged  Load-Exclusive  Store-Exclusive
32-bit word               LDR    STR     LDRT               STRT                LDREX           STREX
16-bit halfword           -      STRH    -                  STRHT               -               STREXH
16-bit unsigned halfword  LDRH   -       LDRHT              -                   LDREXH          -
16-bit signed halfword    LDRSH  -       LDRSHT             -                   -               -
8-bit byte                -      STRB    -                  STRBT               -               STREXB
8-bit unsigned byte       LDRB   -       LDRBT              -                   LDREXB          -
8-bit signed byte         LDRSB  -       LDRSBT             -                   -               -
Two 32-bit words          LDRD   STRD    -                  -                   -               -
64-bit doubleword         -      -       -                  -                   LDREXD          STREXD
A4.6.3 Unprivileged loads and stores
In an unprivileged mode, unprivileged loads and stores operate in exactly the same way as the corresponding
ordinary operations. In a privileged mode, unprivileged loads and stores are treated as though they were
executed in an unprivileged mode. For more information, see Privilege level access controls for data
accesses on page A3-38.
A4.6.4 Exclusive loads and stores
Exclusive loads and stores provide for shared memory synchronization. For more information, see
Synchronization and semaphores on page A3-12.
A4.6.5 Addressing modes
The address for a load or store is formed from two parts: a value from a base register, and an offset.
The base register can be any one of the general-purpose registers.
For loads, the base register can be the PC. This permits PC-relative addressing for position-independent
code. Instructions marked (literal) in their title in Chapter A8 Instruction Details are PC-relative loads.
The offset takes one of three formats:
Immediate The offset is an unsigned number that can be added to or subtracted from the base
register value. Immediate offset addressing is useful for accessing data elements that
are a fixed distance from the start of the data object, such as structure fields, stack
offsets and input/output registers.
Register The offset is a value from a general-purpose register. This register cannot be the PC.
The value can be added to, or subtracted from, the base register value. Register
offsets are useful for accessing arrays or blocks of data.
Scaled register The offset is a general-purpose register, other than the PC, shifted by an immediate
value, then added to or subtracted from the base register. This means an array index
can be scaled by the size of each array element.
The offset and base register can be used in three different ways to form the memory address. The addressing
modes are described as follows:
Offset The offset is added to or subtracted from the base register to form the memory
address.
Pre-indexed The offset is added to or subtracted from the base register to form the memory
address. The base register is then updated with this new address, to permit automatic
indexing through an array or memory block.
Post-indexed The value of the base register alone is used as the memory address. The offset is then
added to or subtracted from the base register. The result is stored back in the base
register, to permit automatic indexing through an array or memory block.
Note
Not every variant is available for every instruction, and the range of permitted immediate values and the
options for scaled registers vary from instruction to instruction. See Chapter A8 Instruction Details for full
details for each instruction.
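The three addressing modes can be sketched in Python as follows. The function name access and its
parameters are hypothetical. The sketch returns the memory address used and the value left in the base
register, and performs no range or alignment checks.

```python
def access(base, offset, mode, add=True):
    """Return (memory_address, new_base) for a single load or store.

    mode is "offset", "pre-indexed" or "post-indexed"; add selects whether the
    offset is added to or subtracted from the base register value.
    """
    offset_address = (base + offset if add else base - offset) & 0xFFFFFFFF
    if mode == "offset":
        return offset_address, base              # base register unchanged
    if mode == "pre-indexed":
        return offset_address, offset_address    # access at the new address, then write back
    if mode == "post-indexed":
        return base, offset_address              # access at the old address, then write back
    raise ValueError("unknown addressing mode: " + mode)
```

A scaled register offset fits the same model: for an array of words, the offset passed in is the index
shifted left by 2, so access(0x2000, 3 << 2, "offset") addresses element 3 at 0x200C.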
A4.7 Load/store multiple instructions
Load Multiple instructions load a subset, or possibly all, of the general-purpose registers from memory.
Store Multiple instructions store a subset, or possibly all, of the general-purpose registers to memory.
The memory locations are consecutive word-aligned words. The addresses used are obtained from a base
register, and can be either above or below the value in the base register. The base register can optionally be
updated by the total size of the data transferred.
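As a rough illustration of how the addresses relate to the base register, the following Python sketch computes the word addresses for the four variants named in Table A4-11. The function name and parameters are hypothetical, and the start addresses are an assumption based on the Increment/Decrement and After/Before naming; see the individual instruction descriptions for the authoritative behavior.

```python
def multiple_addresses(base, reg_count, increment=True, before=False):
    """Return (addresses, updated_base) for a load/store multiple.

    addresses[i] is the word address used for the i-th lowest-numbered
    register in the register list.
    """
    size = 4 * reg_count                                # total bytes transferred
    if increment:
        start = base + (4 if before else 0)             # IB / IA
        updated = base + size
    else:
        start = base - size + (0 if before else 4)      # DB / DA
        updated = base - size
    addresses = [(start + 4 * i) & 0xFFFFFFFF for i in range(reg_count)]
    return addresses, updated & 0xFFFFFFFF
```

With a base of 0x1000 and three registers, Increment After uses 0x1000 to 0x1008 and Decrement Before uses 0xFF4 to 0xFFC. In both cases the optional base update moves the base by 12 bytes.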
Table A4-11 summarizes the load/store multiple instructions in the ARM and Thumb instruction sets.
System level variants of the LDM and STM instructions load and store User mode registers from a privileged
mode. Another system level variant of the LDM instruction performs an exception return. For details, see
Chapter B6 System Instructions.
A4.7.1 Loads to the PC
The LDM, LDMDA, LDMDB, LDMIB, and POP instructions can be used to load a value into the PC. The value loaded
is treated as an interworking address, as described by the LoadWritePC() pseudocode function in Pseudocode
details of operations on ARM core registers on page A2-12.
Table A4-11 Load/store multiple instructions
Instruction                                              See
Load Multiple, Increment After or Full Descending        LDM / LDMIA / LDMFD on page A8-110
Load Multiple, Decrement After or Full Ascending (a)     LDMDA / LDMFA on page A8-112
Load Multiple, Decrement Before or Empty Ascending       LDMDB / LDMEA on page A8-114
Load Multiple, Increment Before or Empty Descending (a)  LDMIB / LDMED on page A8-116
Pop multiple registers off the stack (b)                 POP on page A8-246
Push multiple registers onto the stack (c)               PUSH on page A8-248
Store Multiple, Increment After or Empty Ascending       STM / STMIA / STMEA on page A8-374
Store Multiple, Decrement After or Empty Descending (a)  STMDA / STMED on page A8-376
Store Multiple, Decrement Before or Full Descending      STMDB / STMFD on page A8-378
Store Multiple, Increment Before or Full Ascending (a)   STMIB / STMFA on page A8-380
a. Not available in the Thumb instruction set.
b. This instruction is equivalent to an LDM instruction with the SP as base register, and base register updating.
c. This instruction is equivalent to an STMDB instruction with the SP as base register, and base register updating.
A4.8 Miscellaneous instructions
Table A4-12 summarizes the miscellaneous instructions in the ARM and Thumb instruction sets.
Table A4-12 Miscellaneous instructions
Instruction See
Clear-Exclusive CLREX on page A8-70
Debug hint DBG on page A8-88
Data Memory Barrier DMB on page A8-90
Data Synchronization Barrier DSB on page A8-92
Instruction Synchronization Barrier ISB on page A8-102
If Then (makes following instructions conditional) IT on page A8-104
No Operation NOP on page A8-222
Preload Data PLD, PLDW (immediate) on page A8-236
PLD (literal) on page A8-238
PLD, PLDW (register) on page A8-240
Preload Instruction PLI (immediate, literal) on page A8-242
PLI (register) on page A8-244
Set Endianness SETEND on page A8-314
Send Event SEV on page A8-316
Supervisor Call SVC (previously SWI) on page A8-430
Swap, Swap Byte. Use deprecated. (a) SWP, SWPB on page A8-432
Wait For Event WFE on page A8-808
Wait For Interrupt WFI on page A8-810
Yield YIELD on page A8-812
a. Use Load/Store-Exclusive instructions instead, see Load/store instructions on page A4-19.
A4.9 Exception-generating and exception-handling instructions
The following instructions are intended specifically to cause a processor exception to occur:
The Supervisor Call (SVC, previously SWI) instruction is used to cause an SVC exception to occur. This
is the main mechanism for User mode code to make calls to privileged operating system code. For
more information, see Supervisor Call (SVC) exception on page B1-52.
The Breakpoint instruction BKPT provides for software breakpoints. For more information, see About
debug events on page C3-2.
In privileged system level code, the Secure Monitor Call (SMC, previously SMI) instruction is used to
cause a Secure Monitor Call exception to occur. For more information, see Secure Monitor Call (SMC)
exception on page B1-53.
System level variants of the SUBS and LDM instructions can be used to return from exceptions. From ARMv6,
the SRS instruction can be used near the start of an exception handler to store return information, and the
RFE instruction can be used to return from an exception using the stored return information. For details of
these instructions, see Chapter B6 System Instructions.
A4.10 Coprocessor instructions
There are three types of instruction for communicating with coprocessors. These permit the processor to:
Initiate a coprocessor data-processing operation. For details see CDP, CDP2 on page A8-68.
Transfer general-purpose registers to and from coprocessor registers. For details, see:
MCR, MCR2 on page A8-186
MCRR, MCRR2 on page A8-188
MRC, MRC2 on page A8-202
MRRC, MRRC2 on page A8-204.
Load or store the values of coprocessor registers. For details, see:
LDC, LDC2 (immediate) on page A8-106
LDC, LDC2 (literal) on page A8-108
STC, STC2 on page A8-372.
The instruction set distinguishes up to 16 coprocessors with a 4-bit field in each coprocessor instruction, so
each coprocessor is assigned a particular number.
Note
One coprocessor can use more than one of the 16 numbers if a large coprocessor instruction set is required.
Coprocessors 10 and 11 are used, together, for VFP and some Advanced SIMD functionality. There are
different instructions for accessing these coprocessors, of similar types to the instructions for the other
coprocessors, that is, to:
Initiate a coprocessor data-processing operation. For details see VFP data-processing instructions on
page A4-38.
Transfer general-purpose registers to and from coprocessor registers. For details, see Advanced SIMD
and VFP register transfer instructions on page A4-29.
Load or store the values of coprocessor registers. For details, see Advanced SIMD and VFP load/store
instructions on page A4-26.
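The coprocessor numbering described above can be sketched as a simple field extraction. The following Python is a minimal illustration that assumes the 4-bit coprocessor number occupies bits [11:8] of the instruction word, as in the coprocessor instruction encodings. The function names are hypothetical.

```python
def coprocessor_number(instruction):
    """Extract the 4-bit coprocessor number, assumed to be in bits [11:8]."""
    return (instruction >> 8) & 0xF


def targets_vfp_or_advanced_simd(instruction):
    """True if the coprocessor number is 10 or 11."""
    return coprocessor_number(instruction) in (10, 11)
```

For example, the word 0xEE110F10, believed to encode MRC p15, 0, r0, c1, c0, 0, yields coprocessor number 15.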
Coprocessors execute the same instruction stream as the processor, ignoring non-coprocessor instructions
and coprocessor instructions for other coprocessors. Coprocessor instructions that cannot be executed by
any coprocessor hardware cause an Undefined Instruction exception.
For more information about specific coprocessors see Coprocessor support on page A2-68.
A4.11 Advanced SIMD and VFP load/store instructions
Table A4-13 summarizes the extension register load/store instructions in the Advanced SIMD and VFP
instruction sets.
Advanced SIMD also provides instructions for loading and storing multiple elements, or structures of
elements, see Element and structure load/store instructions on page A4-27.
Table A4-13 Extension register load/store instructions
Instruction            See                   Operation
Vector Load Multiple   VLDM on page A8-626   Load 1-16 consecutive 64-bit registers (Adv. SIMD and VFP)
                                             Load 1-16 consecutive 32-bit registers (VFP only)
Vector Load Register   VLDR on page A8-628   Load one 64-bit register (Adv. SIMD and VFP)
                                             Load one 32-bit register (VFP only)
Vector Store Multiple  VSTM on page A8-784   Store 1-16 consecutive 64-bit registers (Adv. SIMD and VFP)
                                             Store 1-16 consecutive 32-bit registers (VFP only)
Vector Store Register  VSTR on page A8-786   Store one 64-bit register (Adv. SIMD and VFP)
                                             Store one 32-bit register (VFP only)
A4.11.1 Element and structure load/store instructions
Table A4-14 shows the element and structure load/store instructions available in the Advanced SIMD
instruction set. Loading and storing structures of more than one element automatically de-interleaves or
interleaves the elements, see Figure A4-1 on page A4-28 for an example of de-interleaving. Interleaving is
the inverse process.
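De-interleaving and interleaving can be illustrated with a small model. The following Python sketch treats memory as a flat list of elements and each register as a list of lanes. The function names are hypothetical and the sketch ignores element sizes and alignment.

```python
def deinterleave(memory, n):
    """Split a flat list of n-element structures into n per-field lists."""
    return [memory[field::n] for field in range(n)]


def interleave(registers):
    """Inverse of deinterleave: merge per-field lists back into memory order."""
    return [value for group in zip(*registers) for value in group]
```

For an array of four 3-element structures with fields x, y, and z, `deinterleave(memory, 3)` returns one list holding the four x values, one holding the four y values, and one holding the four z values.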
Table A4-14 Element and structure load/store instructions
Instruction See
Load single element
Multiple elements VLD1 (multiple single elements) on page A8-602
To one lane VLD1 (single element to one lane) on page A8-604
To all lanes VLD1 (single element to all lanes) on page A8-606
Load 2-element structure
Multiple structures VLD2 (multiple 2-element structures) on page A8-608
To one lane VLD2 (single 2-element structure to one lane) on page A8-610
To all lanes VLD2 (single 2-element structure to all lanes) on page A8-612
Load 3-element structure
Multiple structures VLD3 (multiple 3-element structures) on page A8-614
To one lane VLD3 (single 3-element structure to one lane) on page A8-616
To all lanes VLD3 (single 3-element structure to all lanes) on page A8-618
Load 4-element structure
Multiple structures VLD4 (multiple 4-element structures) on page A8-620
To one lane VLD4 (single 4-element structure to one lane) on page A8-622
To all lanes VLD4 (single 4-element structure to all lanes) on page A8-624
Store single element
Multiple elements VST1 (multiple single elements) on page A8-768
From one lane VST1 (single element from one lane) on page A8-770
Store 2-element structure
Multiple structures VST2 (multiple 2-element structures) on page A8-772
From one lane VST2 (single 2-element structure from one lane) on page A8-774
Store 3-element structure
Multiple structures VST3 (multiple 3-element structures) on page A8-776
From one lane VST3 (single 3-element structure from one lane) on page A8-778
Store 4-element structure
Multiple structures VST4 (multiple 4-element structures) on page A8-780
From one lane VST4 (single 4-element structure from one lane) on page A8-782
Figure A4-1 De-interleaving an array of 3-element structures
(Diagram not reproduced. Memory holds the structures A[0] to A[3] in order, each with fields x, y, and z.
After de-interleaving, register D0 holds X0 to X3, D1 holds Y0 to Y3, and D2 holds Z0 to Z3.)
A4.12 Advanced SIMD and VFP register transfer instructions
Table A4-15 summarizes the extension register transfer instructions in the Advanced SIMD and VFP
instruction sets. These instructions transfer data from ARM core registers to extension registers, or from
extension registers to ARM core registers.
Advanced SIMD vectors, and single-precision and double-precision VFP registers, are all views of the same
extension register set. For details see Advanced SIMD and VFP extension registers on page A2-21.
Table A4-15 Extension register transfer instructions
Instruction                                                    See
Copy element from ARM core register to every element of        VDUP (ARM core register) on page A8-594
Advanced SIMD vector
Copy byte, halfword, or word from ARM core register to         VMOV (ARM core register to scalar) on page A8-644
extension register
Copy byte, halfword, or word from extension register to ARM    VMOV (scalar to ARM core register) on page A8-646
core register
Copy from single-precision VFP register to ARM core register,  VMOV (between ARM core register and single-precision register) on page A8-648
or from ARM core register to single-precision VFP register
Copy two words from ARM core registers to consecutive          VMOV (between two ARM core registers and two single-precision registers) on page A8-650
single-precision VFP registers, or from consecutive
single-precision VFP registers to ARM core registers
Copy two words from ARM core registers to doubleword           VMOV (between two ARM core registers and a doubleword extension register) on page A8-652
extension register, or from doubleword extension register to
ARM core registers
Copy from Advanced SIMD and VFP extension System Register      VMRS on page A8-658
to ARM core register                                           VMRS on page B6-27 (system level view)
Copy from ARM core register to Advanced SIMD and VFP           VMSR on page A8-660
extension System Register                                      VMSR on page B6-29 (system level view)
A4.13 Advanced SIMD data-processing operations
Advanced SIMD data-processing operations process registers containing vectors of elements of the same
type packed together, enabling the same operation to be performed on multiple items in parallel.
Instructions operate on vectors held in 64-bit or 128-bit registers. Figure A4-2 shows an operation on two
64-bit operand vectors, generating a 64-bit vector result.
Note
Figure A4-2 and other similar figures show 64-bit vectors that consist of four 16-bit elements, and 128-bit
vectors that consist of four 32-bit elements. Other element sizes produce similar figures, but with one, two,
eight, or sixteen operations performed in parallel instead of four.
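The element-wise behavior described above can be modelled in a few lines. The following Python sketch is illustrative only: it treats a vector register as an integer, the function names are hypothetical, and it assumes element 0 occupies the least significant bits.

```python
def split_lanes(vector, vector_bits, element_bits):
    """Split a vector, held as an integer, into a list of elements."""
    mask = (1 << element_bits) - 1
    count = vector_bits // element_bits
    return [(vector >> (i * element_bits)) & mask for i in range(count)]


def join_lanes(lanes, element_bits):
    """Pack elements back into one integer, truncating each to its lane."""
    mask = (1 << element_bits) - 1
    result = 0
    for i, lane in enumerate(lanes):
        result |= (lane & mask) << (i * element_bits)
    return result


def lanewise(op, dn, dm, vector_bits=64, element_bits=16):
    """Apply op to corresponding elements of dn and dm, one result per lane."""
    a = split_lanes(dn, vector_bits, element_bits)
    b = split_lanes(dm, vector_bits, element_bits)
    return join_lanes([op(x, y) for x, y in zip(a, b)], element_bits)
```

Because each result is truncated to its own lane, a carry out of one element never affects its neighbor. A 64-bit vector of 8-bit elements gives eight lanes, and a 128-bit vector of 8-bit elements gives sixteen.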
Figure A4-2 Advanced SIMD instruction operating on 64-bit registers
(Diagram not reproduced. Each of four Op units combines one element of Dn with the corresponding element
of Dm and writes the corresponding element of Dd.)
Many Advanced SIMD instructions have variants that produce vectors of elements double the size of the
inputs. In this case, the number of elements in the result vector is the same as the number of elements in the
operand vectors, but each element, and the whole vector, is double the size.
Figure A4-3 shows an example of an Advanced SIMD instruction operating on 64-bit registers, and
generating a 128-bit result.
Figure A4-3 Advanced SIMD instruction producing wider result
(Diagram not reproduced. Each of four Op units combines one element of Dn with the corresponding element
of Dm and writes the corresponding double-width element of Qd.)
There are also Advanced SIMD instructions that have variants that produce vectors containing elements half
the size of the inputs. Figure A4-4 on page A4-31 shows an example of an Advanced SIMD instruction
operating on one 128-bit register, and generating a 64-bit result.
Figure A4-4 Advanced SIMD instruction producing narrower result
(Diagram not reproduced. Each of four Op units takes one element of Qn and writes the corresponding
half-width element of Dd.)
Some Advanced SIMD instructions do not conform to these standard patterns. Their operation patterns are
described in the individual instruction descriptions.
Advanced SIMD instructions that perform floating-point arithmetic use the ARM standard floating-point
arithmetic defined in Floating-point data types and arithmetic on page A2-32.
A4.13.1 Advanced SIMD parallel addition and subtraction
Table A4-16 shows the Advanced SIMD parallel add and subtract instructions.
Table A4-16 Advanced SIMD parallel add and subtract instructions
Instruction See
Vector Add VADD (integer) on page A8-536
VADD (floating-point) on page A8-538
Vector Add and Narrow, returning High Half VADDHN on page A8-540
Vector Add Long, Vector Add Wide VADDL, VADDW on page A8-542
Vector Halving Add, Vector Halving Subtract VHADD, VHSUB on page A8-600
Vector Pairwise Add and Accumulate Long VPADAL on page A8-682
Vector Pairwise Add VPADD (integer) on page A8-684
VPADD (floating-point) on page A8-686
Vector Pairwise Add Long VPADDL on page A8-688
Vector Rounding Add and Narrow, returning High Half VRADDHN on page A8-726
Vector Rounding Halving Add VRHADD on page A8-734
Vector Rounding Subtract and Narrow, returning High Half VRSUBHN on page A8-748
Vector Saturating Add VQADD on page A8-700
Vector Saturating Subtract VQSUB on page A8-724
Vector Subtract VSUB (integer) on page A8-788
VSUB (floating-point) on page A8-790
Vector Subtract and Narrow, returning High Half VSUBHN on page A8-792
Vector Subtract Long, Vector Subtract Wide VSUBL, VSUBW on page A8-794
A4.13.2 Bitwise Advanced SIMD data-processing instructions
Table A4-17 shows bitwise Advanced SIMD data-processing instructions. These operate on the doubleword
(64-bit) or quadword (128-bit) extension registers, and there is no division into vector elements.
Table A4-17 Bitwise Advanced SIMD data-processing instructions
Instruction See
Vector Bitwise AND VAND (register) on page A8-544
Vector Bitwise Bit Clear (AND complement) VBIC (immediate) on page A8-546
VBIC (register) on page A8-548
Vector Bitwise Exclusive OR VEOR on page A8-596
Vector Bitwise Insert if False VBIF, VBIT, VBSL on page A8-550
Vector Bitwise Insert if True VBIF, VBIT, VBSL on page A8-550
Vector Bitwise Move VMOV (immediate) on page A8-640
VMOV (register) on page A8-642
Vector Bitwise NOT VMVN (immediate) on page A8-668
VMVN (register) on page A8-670
Vector Bitwise OR VORR (immediate) on page A8-678
VORR (register) on page A8-680
Vector Bitwise OR NOT VORN (register) on page A8-676
Vector Bitwise Select VBIF, VBIT, VBSL on page A8-550
A4.13.3 Advanced SIMD comparison instructions
Table A4-18 shows Advanced SIMD comparison instructions.
Table A4-18 Advanced SIMD comparison instructions
Instruction See
Vector Absolute Compare VACGE, VACGT, VACLE, VACLT on page A8-534
Vector Compare Equal VCEQ (register) on page A8-552
Vector Compare Equal to Zero VCEQ (immediate #0) on page A8-554
Vector Compare Greater Than or Equal VCGE (register) on page A8-556
Vector Compare Greater Than or Equal to Zero VCGE (immediate #0) on page A8-558
Vector Compare Greater Than VCGT (register) on page A8-560
Vector Compare Greater Than Zero VCGT (immediate #0) on page A8-562
Vector Compare Less Than or Equal to Zero VCLE (immediate #0) on page A8-564
Vector Compare Less Than Zero VCLT (immediate #0) on page A8-568
Vector Test Bits VTST on page A8-802
A4.13.4 Advanced SIMD shift instructions
Table A4-19 lists the shift instructions in the Advanced SIMD instruction set.
Table A4-19 Advanced SIMD shift instructions
Instruction See
Vector Saturating Rounding Shift Left VQRSHL on page A8-714
Vector Saturating Rounding Shift Right and Narrow VQRSHRN, VQRSHRUN on page A8-716
Vector Saturating Shift Left VQSHL (register) on page A8-718
VQSHL, VQSHLU (immediate) on page A8-720
Vector Saturating Shift Right and Narrow VQSHRN, VQSHRUN on page A8-722
Vector Rounding Shift Left VRSHL on page A8-736
Vector Rounding Shift Right VRSHR on page A8-738
Vector Rounding Shift Right and Accumulate VRSRA on page A8-746
Vector Rounding Shift Right and Narrow VRSHRN on page A8-740
Vector Shift Left VSHL (immediate) on page A8-750
VSHL (register) on page A8-752
Vector Shift Left Long VSHLL on page A8-754
Vector Shift Right VSHR on page A8-756
Vector Shift Right and Narrow VSHRN on page A8-758
Vector Shift Left and Insert VSLI on page A8-760
Vector Shift Right and Accumulate VSRA on page A8-764
Vector Shift Right and Insert VSRI on page A8-766
A4.13.5 Advanced SIMD multiply instructions
Table A4-20 summarizes the Advanced SIMD multiply instructions.
Advanced SIMD multiply instructions can operate on vectors of:
8-bit, 16-bit, or 32-bit unsigned integers
8-bit, 16-bit, or 32-bit signed integers
8-bit or 16-bit polynomials over {0,1} (VMUL and VMULL only)
single-precision (32-bit) floating-point numbers.
They can also act on one vector and one scalar.
Long instructions have doubleword (64-bit) operands, and produce quadword (128-bit) results. Other
Advanced SIMD multiply instructions can have either doubleword or quadword operands, and produce
results of the same size.
VFP multiply instructions can operate on:
single-precision (32-bit) floating-point numbers
double-precision (64-bit) floating-point numbers.
Some VFP implementations do not support double-precision numbers.
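Multiplication of polynomials over {0,1} differs from integer multiplication in that partial products are combined with exclusive OR, so no carries propagate. The following Python sketch shows the idea for a single pair of elements. The function name is hypothetical, and the sketch does not model any particular instruction encoding.

```python
def poly_mul(a, b, bits=8):
    """Multiply two polynomials over {0,1}, each held in `bits` bits.

    Bit i of an operand is the coefficient of x**i. The product can need
    up to 2 * bits - 1 bits, which is why a long multiply produces
    elements double the size of its operands.
    """
    result = 0
    for i in range(bits):
        if (b >> i) & 1:
            result ^= a << i   # add (XOR) the shifted partial product
    return result
```

For example, `poly_mul(0b11, 0b11)` is 0b101, because (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2, whereas the integer product of 3 and 3 is 9.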
Table A4-20 Advanced SIMD multiply instructions
Instruction                                              See
Vector Multiply Accumulate, Vector Multiply Accumulate   VMLA, VMLAL, VMLS, VMLSL (integer) on page A8-634
Long, Vector Multiply Subtract, Vector Multiply          VMLA, VMLS (floating-point) on page A8-636
Subtract Long                                            VMLA, VMLAL, VMLS, VMLSL (by scalar) on page A8-638
Vector Multiply, Vector Multiply Long                    VMUL, VMULL (integer and polynomial) on page A8-662
                                                         VMUL (floating-point) on page A8-664
                                                         VMUL, VMULL (by scalar) on page A8-666
Vector Saturating Doubling Multiply Accumulate Long,     VQDMLAL, VQDMLSL on page A8-702
Vector Saturating Doubling Multiply Subtract Long
Vector Saturating Doubling Multiply Returning High Half  VQDMULH on page A8-704
Vector Saturating Rounding Doubling Multiply Returning   VQRDMULH on page A8-712
High Half
Vector Saturating Doubling Multiply Long                 VQDMULL on page A8-706
A4.13.6 Miscellaneous Advanced SIMD data-processing instructions
Table A4-21 shows miscellaneous Advanced SIMD data-processing instructions.
Table A4-21 Miscellaneous Advanced SIMD data-processing instructions
Instruction See
Vector Absolute Difference and Accumulate VABA, VABAL on page A8-526
Vector Absolute Difference VABD, VABDL (integer) on page A8-528
VABD (floating-point) on page A8-530
Vector Absolute VABS on page A8-532
Vector Convert between floating-point and fixed-point VCVT (between floating-point and fixed-point, Advanced SIMD) on page A8-580
Vector Convert between floating-point and integer VCVT (between floating-point and integer, Advanced SIMD) on page A8-576
Vector Convert between half-precision and single-precision VCVT (between half-precision and single-precision, Advanced SIMD) on page A8-586
Vector Count Leading Sign Bits VCLS on page A8-566
Vector Count Leading Zeros VCLZ on page A8-570
Vector Count Set Bits VCNT on page A8-574
Vector Duplicate scalar VDUP (scalar) on page A8-592
Vector Extract VEXT on page A8-598
Vector Move and Narrow VMOVN on page A8-656
Vector Move Long VMOVL on page A8-654
Vector Maximum, Minimum VMAX, VMIN (integer) on page A8-630
VMAX, VMIN (floating-point) on page A8-632
Vector Negate VNEG on page A8-672
Vector Pairwise Maximum, Minimum VPMAX, VPMIN (integer) on page A8-690
VPMAX, VPMIN (floating-point) on page A8-692
Vector Reciprocal Estimate VRECPE on page A8-728
Vector Reciprocal Step VRECPS on page A8-730
Vector Reciprocal Square Root Estimate VRSQRTE on page A8-742
Vector Reciprocal Square Root Step VRSQRTS on page A8-744
Vector Reverse VREV16, VREV32, VREV64 on page A8-732
Vector Saturating Absolute VQABS on page A8-698
Vector Saturating Move and Narrow VQMOVN, VQMOVUN on page A8-708
Vector Saturating Negate VQNEG on page A8-710
Vector Swap VSWP on page A8-796
Vector Table Lookup VTBL, VTBX on page A8-798
Vector Transpose VTRN on page A8-800
Vector Unzip VUZP on page A8-804
Vector Zip VZIP on page A8-806
A4.14 VFP data-processing instructions
Table A4-22 summarizes the data-processing instructions in the VFP instruction set.
For details of the floating-point arithmetic used by VFP instructions, see Floating-point data types and
arithmetic on page A2-32.
Table A4-22 VFP data-processing instructions
Instruction See
Absolute value VABS on page A8-532
Add VADD (floating-point) on page A8-538
Compare (optionally with exceptions enabled) VCMP, VCMPE on page A8-572
Convert between floating-point and integer VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
Convert between floating-point and fixed-point VCVT (between floating-point and fixed-point, VFP) on page A8-582
Convert between double-precision and single-precision VCVT (between double-precision and single-precision) on page A8-584
Convert between half-precision and single-precision VCVTB, VCVTT (between half-precision and single-precision, VFP) on page A8-588
Divide VDIV on page A8-590
Multiply Accumulate, Multiply Subtract VMLA, VMLS (floating-point) on page A8-636
Move immediate value to extension register VMOV (immediate) on page A8-640
Copy from one extension register to another VMOV (register) on page A8-642
Multiply VMUL (floating-point) on page A8-664
Negate (invert the sign bit) VNEG on page A8-672
Multiply Accumulate and Negate, Multiply Subtract and Negate, Multiply and Negate VNMLA, VNMLS, VNMUL on page A8-674
Square Root VSQRT on page A8-762
Subtract VSUB (floating-point) on page A8-790
Chapter A5
ARM Instruction Set Encoding
This chapter describes the encoding of the ARM instruction set. It contains the following sections:
ARM instruction set encoding on page A5-2
Data-processing and miscellaneous instructions on page A5-4
Load/store word and unsigned byte on page A5-19
Media instructions on page A5-21
Branch, branch with link, and block data transfer on page A5-27
Supervisor Call, and coprocessor instructions on page A5-28
Unconditional instructions on page A5-30.
Note
Architecture variant information in this chapter describes the architecture variant or extension in
which the instruction encoding was introduced into the ARM instruction set. All means that the
instruction encoding was introduced in ARMv4 or earlier, and so is in all variants of the ARM
instruction set covered by this manual.
In the decode tables in this chapter, an entry of - for a field value means the value of the field does
not affect the decoding.
A5.1 ARM instruction set encoding
The ARM instruction stream is a sequence of word-aligned words. Each ARM instruction is a single 32-bit
word in that stream.
Bits:   31-28  27-25  24-5  4   3-0
Field:  cond   op1          op
Table A5-1 shows the major subdivisions of the ARM instruction set, determined by bits [31:25,4].
Table A5-1 ARM instruction encoding
cond      op1  op  Instruction classes
not 1111  00x  -   Data-processing and miscellaneous instructions on page A5-4.
not 1111  010  -   Load/store word and unsigned byte on page A5-19.
not 1111  011  0   Load/store word and unsigned byte on page A5-19.
not 1111  011  1   Media instructions on page A5-21.
not 1111  10x  -   Branch, branch with link, and block data transfer on page A5-27.
not 1111  11x  -   Supervisor Call, and coprocessor instructions on page A5-28.
                   Includes VFP instructions and Advanced SIMD data transfers, see Chapter A7 Advanced
                   SIMD and VFP Instruction Encoding.
1111      -    -   If the cond field is 0b1111, the instruction can only be executed unconditionally, see
                   Unconditional instructions on page A5-30.
                   Includes Advanced SIMD instructions, see Chapter A7 Advanced SIMD and VFP
                   Instruction Encoding.
Most ARM instructions can be conditional, with a condition determined by bits [31:28] of the instruction,
the cond field. For details see The condition field. This applies to all instructions except those with the cond
field equal to 0b1111.
A5.1.1 The condition field
Every conditional instruction contains a 4-bit condition code field in bits 31 to 28:
Bits:   31-28  27-0
Field:  cond
This field contains one of the values 0b0000-0b1110 described in Table A8-1 on page A8-8. Most
instruction mnemonics can be extended with the letters defined in the mnemonic extension field.
If the always (AL) condition is specified, the instruction is executed irrespective of the value of the condition
code flags. The absence of a condition code on an instruction mnemonic implies the AL condition code.
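The top-level decode of Table A5-1 can be sketched as follows. This Python is a minimal illustration, not the architectural decode: the function name is hypothetical, and the returned strings are the class names from Table A5-1.

```python
def instruction_class(word):
    """Classify a 32-bit ARM instruction word using bits [31:25,4]."""
    cond = (word >> 28) & 0xF
    op1 = (word >> 25) & 0x7
    op = (word >> 4) & 0x1
    if cond == 0b1111:
        return "Unconditional instructions"
    if op1 in (0b000, 0b001):                        # 00x
        return "Data-processing and miscellaneous instructions"
    if op1 == 0b010 or (op1 == 0b011 and op == 0):
        return "Load/store word and unsigned byte"
    if op1 == 0b011:                                 # 011 with op == 1
        return "Media instructions"
    if op1 in (0b100, 0b101):                        # 10x
        return "Branch, branch with link, and block data transfer"
    return "Supervisor Call, and coprocessor instructions"   # 11x
```

For example, the word 0xE0810002, believed to encode ADD R0, R1, R2, has cond 0b1110 and op1 0b000, so it falls in the data-processing and miscellaneous class.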
A5.1.2 UNDEFINED and UNPREDICTABLE instruction set space
An attempt to execute an unallocated instruction results in either:
Unpredictable behavior. The instruction is described as UNPREDICTABLE.
An Undefined Instruction exception. The instruction is described as UNDEFINED.
An instruction is UNDEFINED if it is declared as UNDEFINED in an instruction description, or in this chapter.
An instruction is UNPREDICTABLE if:
it is declared as UNPREDICTABLE in an instruction description or in this chapter
the pseudocode for that encoding does not indicate that a different special case applies, and a bit
marked (0) or (1) in the encoding diagram of an instruction is not 0 or 1 respectively.
Unless otherwise specified:
ARM instructions introduced in an architecture variant are UNDEFINED in earlier architecture variants.
ARM instructions introduced in one or more architecture extensions are UNDEFINED if none of those
extensions are implemented.
A5.1.3 The PC and the use of 0b1111 as a register specifier
In ARM instructions, the use of 0b1111 as a register specifier specifies the PC.
Many instructions are UNPREDICTABLE if they use 0b1111 as a register specifier. This is specified by
pseudocode in the instruction description.
Note
Use of the PC as the base register in any store instruction is deprecated in ARMv7.
A5.1.4 The SP and the use of 0b1101 as a register specifier
In ARM instructions, the use of 0b1101 as a register specifier specifies the SP.
ARM deprecates:
using SP for any purpose other than as a stack pointer
using the SP in ARM instructions in ways other than those listed in 32-bit Thumb instruction support
for R13 on page A6-4, except that ARM does not deprecate the use of instructions of the following
form that write a word-aligned address to SP:
SUB SP, <Rd>, #<const>
A5.2 Data-processing and miscellaneous instructions
Table A5-2 shows the allocation of encodings in this space.
Bits:   31-28  27-26  25  24-20  19-8  7-4  3-0
Field:  cond   0 0    op  op1          op2
Table A5-2 Data-processing and miscellaneous instructions
op  op1        op2   Instruction or instruction class                            Variant
0   not 10xx0  xxx0  Data-processing (register) on page A5-5                     -
0   not 10xx0  0xx1  Data-processing (register-shifted register) on page A5-7    -
0   10xx0      0xxx  Miscellaneous instructions on page A5-18                    -
0   10xx0      1xx0  Halfword multiply and multiply-accumulate on page A5-13     -
0   0xxxx      1001  Multiply and multiply-accumulate on page A5-12              -
0   1xxxx      1001  Synchronization primitives on page A5-16                    -
0   not 0xx1x  1011  Extra load/store instructions on page A5-14                 -
0   not 0xx1x  11x1  Extra load/store instructions on page A5-14                 -
0   0xx1x      1011  Extra load/store instructions (unprivileged) on page A5-15  -
0   0xx1x      11x1  Extra load/store instructions (unprivileged) on page A5-15  -
1   not 10xx0  -     Data-processing (immediate) on page A5-8                    -
1   10000      -     16-bit immediate load (MOV (immediate) on page A8-194)      v6T2
1   10100      -     High halfword 16-bit immediate load (MOVT on page A8-200)   v6T2
1   10x10      -     MSR (immediate), and hints on page A5-17                    -
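Decode tables such as Table A5-2 use bit patterns in which x matches either bit value. The following Python sketch shows one way to apply such patterns. The helper and function names are hypothetical, and the field positions are assumptions: op in bit [25], op1 in bits [24:20], and op2 in bits [7:4].

```python
def matches(value, pattern):
    """True if the bits of value match a pattern such as "10xx0"."""
    bits = format(value, "0{}b".format(len(pattern)))
    return all(p in ("x", b) for p, b in zip(pattern, bits))


def data_processing_and_misc(word):
    """Classify a word already known to be in the Table A5-2 space."""
    op = (word >> 25) & 0x1
    op1 = (word >> 20) & 0x1F
    op2 = (word >> 4) & 0xF
    if op == 1:
        if op1 == 0b10000:
            return "16-bit immediate load"
        if op1 == 0b10100:
            return "High halfword 16-bit immediate load"
        if matches(op1, "10x10"):
            return "MSR (immediate), and hints"
        return "Data-processing (immediate)"
    if matches(op1, "10xx0"):
        if matches(op2, "0xxx"):
            return "Miscellaneous instructions"
        if matches(op2, "1xx0"):
            return "Halfword multiply and multiply-accumulate"
    else:
        if matches(op2, "xxx0"):
            return "Data-processing (register)"
        if matches(op2, "0xx1"):
            return "Data-processing (register-shifted register)"
    if op2 == 0b1001:
        if matches(op1, "0xxxx"):
            return "Multiply and multiply-accumulate"
        return "Synchronization primitives"
    # The remaining op2 values are 1011 and 11x1.
    if matches(op1, "0xx1x"):
        return "Extra load/store instructions (unprivileged)"
    return "Extra load/store instructions"
```

The rows of the table are checked from the most specific combination to the least specific, so each word reaches exactly one class.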
A5.2.1 Data-processing (register)
If op1 == 0b10xx0, see Data-processing and miscellaneous instructions on page A5-4.
Table A5-3 shows the allocation of encodings in this space. These encodings are in all architecture variants.
Bits:   31-28  27-25  24-20  19-12  11-7  6-5  4  3-0
Field:  cond   0 0 0  op1           op2   op3  0
Table A5-3 Data-processing (register) instructions
op1 op2 op3 Instruction See
0000x - - Bitwise AND AND (register) on page A8-36
0001x - - Bitwise Exclusive OR EOR (register) on page A8-96
0010x - - Subtract SUB (register) on page A8-422
0011x - - Reverse Subtract RSB (register) on page A8-286
0100x - - Add ADD (register) on page A8-24
0101x - - Add with Carry ADC (register) on page A8-16
0110x - - Subtract with Carry SBC (register) on page A8-304
0111x - - Reverse Subtract with Carry RSC (register) on page A8-292
10001 - - Test TST (register) on page A8-456
10011 - - Test Equivalence TEQ (register) on page A8-450
10101 - - Compare CMP (register) on page A8-82
10111 - - Compare Negative CMN (register) on page A8-76
1100x - - Bitwise OR ORR (register) on page A8-230
1101x 00000 00 Move MOV (register) on page A8-196
1101x not 00000 00 Logical Shift Left LSL (immediate) on page A8-178
1101x - 01 Logical Shift Right LSR (immediate) on page A8-182
1101x - 10 Arithmetic Shift Right ASR (immediate) on page A8-40
1101x 00000 11 Rotate Right with Extend RRX on page A8-282
1101x not 00000 11 Rotate Right ROR (immediate) on page A8-278
1110x - - Bitwise Bit Clear BIC (register) on page A8-52
1111x - - Bitwise NOT MVN (register) on page A8-216
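One way to read Table A5-3 is as a two-stage lookup: most rows depend only on op1, and the op1 == 0b1101x row is then split by op2 and op3. The Python sketch below illustrates this. The function name, the lookup table, and the returned strings are assumptions made for the example, and the sketch assumes the word has already been identified as belonging to this encoding space.

```python
# Hypothetical helper: the names and return values are illustrative only.
_BY_OPCODE = {
    0b0000: "AND", 0b0001: "EOR", 0b0010: "SUB", 0b0011: "RSB",
    0b0100: "ADD", 0b0101: "ADC", 0b0110: "SBC", 0b0111: "RSC",
    0b1000: "TST", 0b1001: "TEQ", 0b1010: "CMP", 0b1011: "CMN",
    0b1100: "ORR", 0b1110: "BIC", 0b1111: "MVN",
}

def decode_dp_register(instr: int) -> str:
    """Return the Table A5-3 mnemonic for a data-processing (register) word."""
    op1 = (instr >> 20) & 0x1F   # bits [24:20]
    op2 = (instr >> 7) & 0x1F    # bits [11:7]
    op3 = (instr >> 5) & 0x3     # bits [6:5]
    if (op1 & 0b11001) == 0b10000:
        raise ValueError("op1 == 0b10xx0 is decoded elsewhere (see page A5-4)")
    opcode = op1 >> 1            # op1 without its least significant bit
    if opcode != 0b1101:
        return _BY_OPCODE[opcode]
    # op1 == 0b1101x: MOV and the immediate shifts share this row.
    if op3 == 0b00:
        return "MOV" if op2 == 0 else "LSL"
    if op3 == 0b01:
        return "LSR"
    if op3 == 0b10:
        return "ASR"
    return "RRX" if op2 == 0 else "ROR"
```

For example, the word 0xE1A00101 has op1 == 0b11010, op2 == 0b00010, and op3 == 0b00, so the sketch reports LSL.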
A5.2.2 Data-processing (register-shifted register)
If op1 == 0b10xx0, see Data-processing and miscellaneous instructions on page A5-4.
Table A5-4 shows the allocation of encodings in this space. These encodings are in all architecture variants.
Encoding: cond[31:28], 000[27:25], op1[24:20], 0[7], op2[6:5], 1[4]
Table A5-4 Data-processing (register-shifted register) instructions
op1 op2 Instruction See
0000x - Bitwise AND AND (register-shifted register) on page A8-38
0001x - Bitwise Exclusive OR EOR (register-shifted register) on page A8-98
0010x - Subtract SUB (register-shifted register) on page A8-424
0011x - Reverse Subtract RSB (register-shifted register) on page A8-288
0100x - Add ADD (register-shifted register) on page A8-26
0101x - Add with Carry ADC (register-shifted register) on page A8-18
0110x - Subtract with Carry SBC (register-shifted register) on page A8-306
0111x - Reverse Subtract with Carry RSC (register-shifted register) on page A8-294
10001 - Test TST (register-shifted register) on page A8-458
10011 - Test Equivalence TEQ (register-shifted register) on page A8-452
10101 - Compare CMP (register-shifted register) on page A8-84
10111 - Compare Negative CMN (register-shifted register) on page A8-78
1100x - Bitwise OR ORR (register-shifted register) on page A8-232
1101x 00 Logical Shift Left LSL (register) on page A8-180
1101x 01 Logical Shift Right LSR (register) on page A8-184
1101x 10 Arithmetic Shift Right ASR (register) on page A8-42
1101x 11 Rotate Right ROR (register) on page A8-280
1110x - Bitwise Bit Clear BIC (register-shifted register) on page A8-54
1111x - Bitwise NOT MVN (register-shifted register) on page A8-218
A5.2.3 Data-processing (immediate)
If op == 0b10xx0, see Data-processing and miscellaneous instructions on page A5-4.
Table A5-5 shows the allocation of encodings in this space. These encodings are in all architecture variants.
These instructions all have modified immediate constants, rather than a simple 12-bit binary number. This
provides a more useful range of values. For details see Modified immediate constants in ARM instructions
on page A5-9.
Encoding: cond[31:28], 001[27:25], op[24:20], Rn[19:16]
Table A5-5 Data-processing (immediate) instructions
op Rn Instruction See
0000x - Bitwise AND AND (immediate) on page A8-34
0001x - Bitwise Exclusive OR EOR (immediate) on page A8-94
0010x not 1111 Subtract SUB (immediate, ARM) on page A8-420
0010x 1111 Form PC-relative address ADR on page A8-32
0011x - Reverse Subtract RSB (immediate) on page A8-284
0100x not 1111 Add ADD (immediate, ARM) on page A8-22
0100x 1111 Form PC-relative address ADR on page A8-32
0101x - Add with Carry ADC (immediate) on page A8-14
0110x - Subtract with Carry SBC (immediate) on page A8-302
0111x - Reverse Subtract with Carry RSC (immediate) on page A8-290
10001 - Test TST (immediate) on page A8-454
10011 - Test Equivalence TEQ (immediate) on page A8-448
10101 - Compare CMP (immediate) on page A8-80
10111 - Compare Negative CMN (immediate) on page A8-74
1100x - Bitwise OR ORR (immediate) on page A8-228
1101x - Move MOV (immediate) on page A8-194
1110x - Bitwise Bit Clear BIC (immediate) on page A8-50
1111x - Bitwise NOT MVN (immediate) on page A8-214
A5.2.4 Modified immediate constants in ARM instructions
Table A5-6 shows the range of modified immediate constants available in ARM data-processing
instructions, and how they are encoded in the a, b, c, d, e, f, g, h, and rotation fields in the instruction.
Note
The range of values available in ARM modified immediate constants is slightly different from the range of
values available in 32-bit Thumb instructions. See Modified immediate constants in Thumb instructions on
page A6-17.
Encoding: rotation[11:8], abcdefgh[7:0] (a is bit [7], h is bit [0])
Table A5-6 Encoding of modified immediates in ARM processing instructions
rotation <const> (a)
0000 00000000 00000000 00000000 abcdefgh
0001 gh000000 00000000 00000000 00abcdef
0010 efgh0000 00000000 00000000 0000abcd
0011 cdefgh00 00000000 00000000 000000ab
0100 abcdefgh 00000000 00000000 00000000
.. 8-bit values shifted to other even-numbered positions ..
1001 00000000 00abcdef gh000000 00000000
.. 8-bit values shifted to other even-numbered positions ..
1110 00000000 00000000 0000abcd efgh0000
1111 00000000 00000000 000000ab cdefgh00
a. In this table, the immediate constant value is shown in binary form, to relate abcdefgh to the encoding
diagram. In assembly syntax, the immediate value is specified in the usual way (a decimal number by default).
Carry out
A logical instruction with rotation == 0b0000 does not affect APSR.C. Otherwise, a logical instruction that
sets the flags sets APSR.C to the value of bit [31] of the modified immediate constant.
Constants with multiple encodings
Some constant values have multiple possible encodings. In this case, a UAL assembler must select the
encoding with the lowest unsigned value of the rotation field. This is the encoding that appears first in
Table A5-6 on page A5-9. For example, the constant #3 must be encoded with (rotation, abcdefgh) ==
(0b0000, 0b00000011), not (0b0001, 0b00001100), (0b0010, 0b00110000), or (0b0011, 0b11000000).
In particular, this means that all constants in the range 0-255 are encoded with rotation == 0b0000, and
permitted constants outside that range are encoded with rotation != 0b0000. A flag-setting logical instruction
with a modified immediate constant therefore leaves APSR.C unchanged if the constant is in the range 0-255
and sets it to the most significant bit of the constant otherwise. This matches the behavior of Thumb
modified immediate constants for all constants that are permitted in both the ARM and Thumb instruction
sets.
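The selection rule above can be sketched as a search over the rotation field in ascending order. The Python below is a minimal illustration, not an assembler implementation: the function names are hypothetical, and the function returns None for constants that have no modified immediate encoding.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def encode_modified_immediate(const: int):
    """Return (rotation, abcdefgh) using the lowest rotation, or None."""
    const &= 0xFFFFFFFF
    for rotation in range(16):
        # Rotating left by 2*rotation undoes the rotate right in the encoding.
        unrotated = ror32(const, (32 - 2 * rotation) % 32)
        if unrotated <= 0xFF:
            return rotation, unrotated
    return None
```

Because the loop tries rotation values from 0b0000 upwards and returns on the first match, a constant such as #3 yields (0b0000, 0b00000011) rather than any of its other encodings.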
An alternative syntax is available for a modified immediate constant that permits the programmer to specify
the encoding directly. In this syntax, #<const> is instead written as #<byte>,#<rot>, where:
<byte> is the numeric value of abcdefgh, in the range 0-255
<rot> is twice the numeric value of rotation, an even number in the range 0-30.
This syntax permits all ARM data-processing instructions with modified immediate constants to be
disassembled to assembler syntax that will assemble to the original instruction.
This syntax also makes it possible to write variants of some flag-setting logical instructions that have
different effects on APSR.C to those obtained with the normal #<const> syntax. For example,
ANDS R1,R2,#12,#2 has the same behavior as ANDS R1,R2,#3 except that it sets APSR.C to 0 instead of leaving
it unchanged. Such variants of flag-setting logical instructions do not have equivalents in the Thumb
instruction set, and their use is deprecated.
Operation
// ARMExpandImm()
// ==============
bits(32) ARMExpandImm(bits(12) imm12)
    // APSR.C argument to following function call does not affect the imm32 result.
    (imm32, -) = ARMExpandImm_C(imm12, APSR.C);
    return imm32;

// ARMExpandImm_C()
// ================
(bits(32), bit) ARMExpandImm_C(bits(12) imm12, bit carry_in)
    unrotated_value = ZeroExtend(imm12<7:0>, 32);
    (imm32, carry_out) = Shift_C(unrotated_value, SRType_ROR, 2*UInt(imm12<11:8>), carry_in);
    return (imm32, carry_out);
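The pseudocode above can be rendered in Python roughly as follows. This is a sketch, not a replacement for the pseudocode: the Python function names are chosen for the example, and it assumes that Shift_C() with a shift amount of zero returns the value and carry_in unchanged, which is what makes rotation == 0b0000 leave APSR.C unaffected.

```python
def arm_expand_imm_c(imm12: int, carry_in: int):
    """Sketch of ARMExpandImm_C(): returns (imm32, carry_out)."""
    unrotated = imm12 & 0xFF              # imm12<7:0>, zero-extended
    amount = 2 * ((imm12 >> 8) & 0xF)     # 2*UInt(imm12<11:8>)
    if amount == 0:
        # Assumed Shift_C behavior: a zero amount passes carry_in through.
        return unrotated, carry_in
    imm32 = ((unrotated >> amount) | (unrotated << (32 - amount))) & 0xFFFFFFFF
    return imm32, imm32 >> 31             # carry_out is bit [31] of the result

def arm_expand_imm(imm12: int) -> int:
    """Sketch of ARMExpandImm(): the carry input does not affect imm32."""
    imm32, _ = arm_expand_imm_c(imm12, 0)
    return imm32
```

With this sketch, imm12 == 0x003 (the #3 encoding) returns the incoming carry unchanged, while imm12 == 0x10C (the #12,#2 encoding) also expands to 3 but returns a carry of 0, matching the ANDS example earlier in this section.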
A5.2.5 Multiply and multiply-accumulate
Table A5-7 shows the allocation of encodings in this space.
Encoding: cond[31:28], 0000[27:24], op[23:20], 1001[7:4]
Table A5-7 Multiply and multiply-accumulate instructions
op Instruction See Variant
000x Multiply MUL on page A8-212 All
001x Multiply Accumulate MLA on page A8-190 All
0100 Unsigned Multiply Accumulate Accumulate Long UMAAL on page A8-482 v6
0101 UNDEFINED - -
0110 Multiply and Subtract MLS on page A8-192 v6T2
0111 UNDEFINED - -
100x Unsigned Multiply Long UMULL on page A8-486 All
101x Unsigned Multiply Accumulate Long UMLAL on page A8-484 All
110x Signed Multiply Long SMULL on page A8-356 All
111x Signed Multiply Accumulate Long SMLAL on page A8-334 All
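As a small illustration of Table A5-7, the Python sketch below maps the op field to a mnemonic. The names are hypothetical, the word is assumed to already match bits [27:24] == 0b0000 and bits [7:4] == 0b1001, and the architecture variant column is not checked.

```python
# Hypothetical lookup keyed on op<3:1> for the rows written as "...x".
_MULTIPLY = {
    0b000: "MUL", 0b001: "MLA",
    0b100: "UMULL", 0b101: "UMLAL", 0b110: "SMULL", 0b111: "SMLAL",
}

def decode_multiply(instr: int) -> str:
    """Return the Table A5-7 mnemonic, or 'UNDEFINED'."""
    op = (instr >> 20) & 0xF          # bits [23:20]
    if op == 0b0100:
        return "UMAAL"
    if op == 0b0110:
        return "MLS"
    if op in (0b0101, 0b0111):
        return "UNDEFINED"
    return _MULTIPLY[op >> 1]
```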
A5.2.6 Saturating addition and subtraction
Table A5-8 shows the allocation of encodings in this space. These encodings are all available in ARMv5TE
and above, and are UNDEFINED in earlier variants of the architecture.
Encoding: cond[31:28], 00010[27:23], op[22:21], 0[20], 0101[7:4]
Table A5-8 Saturating addition and subtraction instructions
op Instruction See
00 Saturating Add QADD on page A8-250
01 Saturating Subtract QSUB on page A8-264
10 Saturating Double and Add QDADD on page A8-258
11 Saturating Double and Subtract QDSUB on page A8-260
A5.2.7 Halfword multiply and multiply-accumulate
Table A5-9 shows the allocation of encodings in this space.
These encodings are signed multiply (SMUL) and signed multiply-accumulate (SMLA) instructions, operating
on 16-bit values, or mixed 16-bit and 32-bit values. The results and accumulators are 32-bit or 64-bit.
These encodings are all available in ARMv5TE and above, and are UNDEFINED in earlier variants of the
architecture.
Encoding: cond[31:28], 00010[27:23], op1[22:21], 0[20], 1[7], op[5], 0[4]
Table A5-9 Halfword multiply and multiply-accumulate instructions
op1 op Instruction See
00 - Signed 16-bit multiply, 32-bit accumulate SMLABB, SMLABT, SMLATB, SMLATT on page A8-330
01 0 Signed 16-bit x 32-bit multiply, 32-bit accumulate SMLAWB, SMLAWT on page A8-340
01 1 Signed 16-bit x 32-bit multiply, 32-bit result SMULWB, SMULWT on page A8-358
10 - Signed 16-bit multiply, 64-bit accumulate SMLALBB, SMLALBT, SMLALTB, SMLALTT on page A8-336
11 - Signed 16-bit multiply, 32-bit result SMULBB, SMULBT, SMULTB, SMULTT on page A8-354
A5.2.8 Extra load/store instructions
If op1 == 0b0xx1x or op2 == 0b00, see Data-processing and miscellaneous instructions on page A5-4.
Table A5-10 shows the allocation of encodings in this space.
Encoding: cond[31:28], 000[27:25], op1[24:20], Rn[19:16], 1[7], op2[6:5], 1[4]
Table A5-10 Extra load/store instructions
op2 op1 Rn Instruction See Variant
01 xx0x0 - Store Halfword STRH (register) on page A8-412 All
xx0x1 - Load Halfword LDRH (register) on page A8-156 All
xx1x0 - Store Halfword STRH (immediate, ARM) on page A8-410 All
xx1x1 not 1111 Load Halfword LDRH (immediate, ARM) on page A8-152 All
1111 Load Halfword LDRH (literal) on page A8-154 All
10 xx0x0 - Load Dual LDRD (register) on page A8-140 v5TE
xx0x1 - Load Signed Byte LDRSB (register) on page A8-164 All
xx1x0 not 1111 Load Dual LDRD (immediate) on page A8-136 v5TE
1111 Load Dual LDRD (literal) on page A8-138 v5TE
xx1x1 not 1111 Load Signed Byte LDRSB (immediate) on page A8-160 All
1111 Load Signed Byte LDRSB (literal) on page A8-162 All
11 xx0x0 - Store Dual STRD (register) on page A8-398 v5TE
xx0x1 - Load Signed Halfword LDRSH (register) on page A8-172 All
xx1x0 - Store Dual STRD (immediate) on page A8-396 v5TE
xx1x1 not 1111 Load Signed Halfword LDRSH (immediate) on page A8-168 All
1111 Load Signed Halfword LDRSH (literal) on page A8-170 All
A5.2.9 Extra load/store instructions (unprivileged)
If op2 == 0b00, see Data-processing and miscellaneous instructions on page A5-4.
Table A5-11 shows the allocation of encodings in this space. The instruction encodings are all available in
ARMv6T2 and above, and are UNDEFINED in earlier variants of the architecture.
Encoding: cond[31:28], 0000[27:24], 1[21], op[20], Rt[15:12], 1[7], op2[6:5], 1[4]
Table A5-11 Extra load/store instructions (unprivileged)
op2 op Rt Instruction See
01 0 - Store Halfword Unprivileged STRHT on page A8-414
1 - Load Halfword Unprivileged LDRHT on page A8-158
1x 0 xxx0 UNPREDICTABLE -
xxx1 UNDEFINED -
10 1 - Load Signed Byte Unprivileged LDRSBT on page A8-166
11 1 - Load Signed Halfword Unprivileged LDRSHT on page A8-174
A5.2.10 Synchronization primitives
Table A5-12 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 0001[27:24], op[23:20], 1001[7:4]
Table A5-12 Synchronization primitives
op Instruction See Variant
0x00 Swap Word, Swap Byte SWP, SWPB on page A8-432 (a) All
1000 Store Register Exclusive STREX on page A8-400 v6
1001 Load Register Exclusive LDREX on page A8-142 v6
1010 Store Register Exclusive Doubleword STREXD on page A8-404 v6K
1011 Load Register Exclusive Doubleword LDREXD on page A8-146 v6K
1100 Store Register Exclusive Byte STREXB on page A8-402 v6K
1101 Load Register Exclusive Byte LDREXB on page A8-144 v6K
1110 Store Register Exclusive Halfword STREXH on page A8-406 v6K
1111 Load Register Exclusive Halfword LDREXH on page A8-148 v6K
a. Use of these instructions is deprecated.
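The exclusive-access rows of Table A5-12 follow a regular pattern: op<0> distinguishes load from store, and op<2:1> selects the access size. The Python sketch below uses that regularity. The function name is hypothetical, and the use of bit [22] to tell SWPB from SWP is taken from the SWP, SWPB instruction description rather than from this table.

```python
def decode_sync_primitive(instr: int) -> str:
    """Return the Table A5-12 mnemonic, or 'UNDEFINED'."""
    op = (instr >> 20) & 0xF                      # bits [23:20]
    if op & 0b1011 == 0b0000:                     # op == 0b0x00
        return "SWPB" if op & 0b0100 else "SWP"   # deprecated instructions
    if op & 0b1000 == 0:
        return "UNDEFINED"                        # other 0xxx encodings
    size = {0b00: "", 0b01: "D", 0b10: "B", 0b11: "H"}[(op >> 1) & 0b11]
    base = "LDREX" if op & 1 else "STREX"
    return base + size
```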
A5.2.11 MSR (immediate), and hints
Table A5-13 shows the allocation of encodings in this space.
Other encodings in this space are unallocated hints. They execute as NOPs, but software must not use them.
Encoding: cond[31:28], 00110[27:23], op[22], 10[21:20], op1[19:16], op2[7:0]
Table A5-13 MSR (immediate), and hints
op op1 op2 Instruction See Variant
0 0000 00000000 No Operation hint NOP on page A8-222 v6K, v6T2
0 0000 00000001 Yield hint YIELD on page A8-812 v6K
0 0000 00000010 Wait For Event hint WFE on page A8-808 v6K
0 0000 00000011 Wait For Interrupt hint WFI on page A8-810 v6K
0 0000 00000100 Send Event hint SEV on page A8-316 v6K
0 0000 1111xxxx Debug hint DBG on page A8-88 v7
0 0100 or 1x00 - Move to Special Register, application level MSR (immediate) on page A8-208 All
0 xx01 or xx1x - Move to Special Register, system level MSR (immediate) on page B6-12 All
1 - - Move to Special Register, system level MSR (immediate) on page B6-12 All
A5.2.12 Miscellaneous instructions
Table A5-14 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 00010[27:23], op[22:21], 0[20], op1[19:16], 0[7], op2[6:4]
Table A5-14 Miscellaneous instructions
op2 op op1 Instruction or instruction class See Variant
000 x0 xxxx Move Special Register to Register MRS on page A8-206, MRS on page B6-10 All
000 01 xx00 Move to Special Register, application level MSR (register) on page A8-210 All
000 01 xx01 or xx1x Move to Special Register, system level MSR (register) on page B6-14 All
000 11 - Move to Special Register, system level MSR (register) on page B6-14 All
001 01 - Branch and Exchange BX on page A8-62 v4T
001 11 - Count Leading Zeros CLZ on page A8-72 v5T
010 01 - Branch and Exchange Jazelle BXJ on page A8-64 v5TEJ
011 01 - Branch with Link and Exchange BLX (register) on page A8-60 v5T
101 - - Saturating addition and subtraction Saturating addition and subtraction on page A5-13 -
111 01 - Breakpoint BKPT on page A8-56 v5T
111 11 - Secure Monitor Call SMC (previously SMI) on page B6-18 Security Extensions
A5.3 Load/store word and unsigned byte
These instructions have either A == 0 or B == 0. For instructions with A == 1 and B == 1, see Media
instructions on page A5-21.
Table A5-15 shows the allocation of encodings in this space. These encodings are in all architecture
variants.
Encoding: cond[31:28], 01[27:26], A[25], op1[24:20], Rn[19:16], B[4]
Table A5-15 Single data transfer instructions
A op1 B Rn Instruction See
0 xx0x0 not 0x010 - - Store Register STR (immediate, ARM) on page A8-384
1 xx0x0 not 0x010 0 - Store Register STR (register) on page A8-386
0 0x010 - - Store Register Unprivileged STRT on page A8-416
1 0x010 0 - Store Register Unprivileged STRT on page A8-416
0 xx0x1 not 0x011 - not 1111 Load Register (immediate) LDR (immediate, ARM) on page A8-120
0 xx0x1 not 0x011 - 1111 Load Register (literal) LDR (literal) on page A8-122
1 xx0x1 not 0x011 0 - Load Register LDR (register) on page A8-124
0 0x011 - - Load Register Unprivileged LDRT on page A8-176
1 0x011 0 - Load Register Unprivileged LDRT on page A8-176
0 xx1x0 not 0x110 - - Store Register Byte (immediate) STRB (immediate, ARM) on page A8-390
1 xx1x0 not 0x110 0 - Store Register Byte (register) STRB (register) on page A8-392
0 0x110 - - Store Register Byte Unprivileged STRBT on page A8-394
1 0x110 0 - Store Register Byte Unprivileged STRBT on page A8-394
0 xx1x1 not 0x111 - not 1111 Load Register Byte (immediate) LDRB (immediate, ARM) on page A8-128
0 xx1x1 not 0x111 - 1111 Load Register Byte (literal) LDRB (literal) on page A8-130
1 xx1x1 not 0x111 0 - Load Register Byte (register) LDRB (register) on page A8-132
0 0x111 - - Load Register Byte Unprivileged LDRBT on page A8-134
1 0x111 0 - Load Register Byte Unprivileged LDRBT on page A8-134
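Entries such as "xx0x1 not 0x011" combine a general bit pattern with an excluded one. The Python sketch below shows one way to apply such patterns to Table A5-15. The matches() helper, the row list, and the returned strings are all assumptions made for the example.

```python
def matches(value: int, pattern: str) -> bool:
    """True if value matches a bit pattern such as '0x010' (x is don't care)."""
    for i, ch in enumerate(reversed(pattern)):
        if ch != "x" and int(ch) != (value >> i) & 1:
            return False
    return True

# (general op1 pattern, unprivileged op1 pattern, mnemonic, has a literal form)
_ROWS = [
    ("xx0x0", "0x010", "STR", False),
    ("xx0x1", "0x011", "LDR", True),
    ("xx1x0", "0x110", "STRB", False),
    ("xx1x1", "0x111", "LDRB", True),
]

def decode_load_store(instr: int) -> str:
    """Classify a load/store word or unsigned byte instruction word."""
    a = (instr >> 25) & 1
    op1 = (instr >> 20) & 0x1F
    rn = (instr >> 16) & 0xF
    b = (instr >> 4) & 1
    if a == 1 and b == 1:
        raise ValueError("A == 1 and B == 1: see Media instructions")
    for general, unprivileged, name, has_literal in _ROWS:
        if not matches(op1, general):
            continue
        if matches(op1, unprivileged):
            return name + "T"                 # STRT, LDRT, STRBT, LDRBT
        if a == 1:
            return name + " (register)"
        if has_literal and rn == 0b1111:
            return name + " (literal)"
        return name + " (immediate)"
    raise AssertionError("unreachable: the four rows cover every op1 value")
```

Checking the unprivileged pattern before the general one implements the "not 0x01x" exclusions without listing them separately.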
A5.4 Media instructions
Table A5-16 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 011[27:25], op1[24:20], Rd[15:12], op2[7:5], 1[4], Rn[3:0]
Table A5-16 Media instructions
op1 op2 Rd Rn Instructions See Variant
000xx - - - - Parallel addition and subtraction, signed on page A5-22 -
001xx - - - - Parallel addition and subtraction, unsigned on page A5-23 -
01xxx - - - - Packing, unpacking, saturation, and reversal on page A5-24 -
10xxx - - - - Signed multiplies on page A5-26 -
11000 000 1111 - Unsigned Sum of Absolute Differences USAD8 on page A8-500 v6
11000 000 not 1111 - Unsigned Sum of Absolute Differences and Accumulate USADA8 on page A8-502 v6
1101x x10 - - Signed Bit Field Extract SBFX on page A8-308 v6T2
1110x x00 - 1111 Bit Field Clear BFC on page A8-46 v6T2
1110x x00 - not 1111 Bit Field Insert BFI on page A8-48 v6T2
1111x x10 - - Unsigned Bit Field Extract UBFX on page A8-466 v6T2
11111 111 - - Permanently UNDEFINED. This space will not be allocated in future.
A5.4.1 Parallel addition and subtraction, signed
Table A5-17 shows the allocation of encodings in this space. These encodings are all available in ARMv6
and above, and are UNDEFINED in earlier variants of the architecture.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 011000[27:22], op1[21:20], op2[7:5], 1[4]
Table A5-17 Signed parallel addition and subtraction instructions
op1 op2 Instruction See
01 000 Add 16-bit SADD16 on page A8-296
01 001 Add and Subtract with Exchange SASX on page A8-300
01 010 Subtract and Add with Exchange SSAX on page A8-366
01 011 Subtract 16-bit SSUB16 on page A8-368
01 100 Add 8-bit SADD8 on page A8-298
01 111 Subtract 8-bit SSUB8 on page A8-370
Saturating instructions
10 000 Saturating Add 16-bit QADD16 on page A8-252
10 001 Saturating Add and Subtract with Exchange QASX on page A8-256
10 010 Saturating Subtract and Add with Exchange QSAX on page A8-262
10 011 Saturating Subtract 16-bit QSUB16 on page A8-266
10 100 Saturating Add 8-bit QADD8 on page A8-254
10 111 Saturating Subtract 8-bit QSUB8 on page A8-268
Halving instructions
11 000 Halving Add 16-bit SHADD16 on page A8-318
11 001 Halving Add and Subtract with Exchange SHASX on page A8-322
11 010 Halving Subtract and Add with Exchange SHSAX on page A8-324
11 011 Halving Subtract 16-bit SHSUB16 on page A8-326
11 100 Halving Add 8-bit SHADD8 on page A8-320
11 111 Halving Subtract 8-bit SHSUB8 on page A8-328
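The mnemonics in Table A5-17 are built from two independent parts: op1 selects a prefix (S, Q, or SH) and op2 selects the operation. The Python sketch below uses that structure. The dictionary and function names are hypothetical, and the architecture variant is not checked.

```python
_PREFIX = {0b01: "S", 0b10: "Q", 0b11: "SH"}
_OPERATION = {
    0b000: "ADD16", 0b001: "ASX", 0b010: "SAX",
    0b011: "SUB16", 0b100: "ADD8", 0b111: "SUB8",
}

def decode_signed_parallel(instr: int) -> str:
    """Return the Table A5-17 mnemonic, or 'UNDEFINED'."""
    op1 = (instr >> 20) & 0x3    # bits [21:20]
    op2 = (instr >> 5) & 0x7     # bits [7:5]
    if op1 not in _PREFIX or op2 not in _OPERATION:
        return "UNDEFINED"
    return _PREFIX[op1] + _OPERATION[op2]
```

For example, op1 == 0b11 with op2 == 0b111 gives SHSUB8, and any combination missing from either dictionary falls into the UNDEFINED space.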
A5.4.2 Parallel addition and subtraction, unsigned
Table A5-18 shows the allocation of encodings in this space. These encodings are all available in ARMv6
and above, and are UNDEFINED in earlier variants of the architecture.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 011001[27:22], op1[21:20], op2[7:5], 1[4]
Table A5-18 Unsigned parallel addition and subtraction instructions
op1 op2 Instruction See
01 000 Add 16-bit UADD16 on page A8-460
01 001 Add and Subtract with Exchange UASX on page A8-464
01 010 Subtract and Add with Exchange USAX on page A8-508
01 011 Subtract 16-bit USUB16 on page A8-510
01 100 Add 8-bit UADD8 on page A8-462
01 111 Subtract 8-bit USUB8 on page A8-512
Saturating instructions
10 000 Saturating Add 16-bit UQADD16 on page A8-488
10 001 Saturating Add and Subtract with Exchange UQASX on page A8-492
10 010 Saturating Subtract and Add with Exchange UQSAX on page A8-494
10 011 Saturating Subtract 16-bit UQSUB16 on page A8-496
10 100 Saturating Add 8-bit UQADD8 on page A8-490
10 111 Saturating Subtract 8-bit UQSUB8 on page A8-498
Halving instructions
11 000 Halving Add 16-bit UHADD16 on page A8-470
11 001 Halving Add and Subtract with Exchange UHASX on page A8-474
11 010 Halving Subtract and Add with Exchange UHSAX on page A8-476
11 011 Halving Subtract 16-bit UHSUB16 on page A8-478
11 100 Halving Add 8-bit UHADD8 on page A8-472
11 111 Halving Subtract 8-bit UHSUB8 on page A8-480
A5.4.3 Packing, unpacking, saturation, and reversal
Table A5-19 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 01101[27:23], op1[22:20], A[19:16], op2[7:5], 1[4]
Table A5-19 Packing, unpacking, saturation, and reversal instructions
op1 op2 A Instructions See Variant
000 xx0 - Pack Halfword PKH on page A8-234 v6
01x xx0 - Signed Saturate SSAT on page A8-362 v6
11x xx0 - Unsigned Saturate USAT on page A8-504 v6
000 011 not 1111 Signed Extend and Add Byte 16 SXTAB16 on page A8-436 v6
000 011 1111 Signed Extend Byte 16 SXTB16 on page A8-442 v6
000 101 - Select Bytes SEL on page A8-312 v6
010 001 - Signed Saturate 16 SSAT16 on page A8-364 v6
010 011 not 1111 Signed Extend and Add Byte SXTAB on page A8-434 v6
010 011 1111 Signed Extend Byte SXTB on page A8-440 v6
011 001 - Byte-Reverse Word REV on page A8-272 v6
011 011 not 1111 Signed Extend and Add Halfword SXTAH on page A8-438 v6
011 011 1111 Signed Extend Halfword SXTH on page A8-444 v6
011 101 - Byte-Reverse Packed Halfword REV16 on page A8-274 v6
100 011 not 1111 Unsigned Extend and Add Byte 16 UXTAB16 on page A8-516 v6
100 011 1111 Unsigned Extend Byte 16 UXTB16 on page A8-522 v6
110 001 - Unsigned Saturate 16 USAT16 on page A8-506 v6
110 011 not 1111 Unsigned Extend and Add Byte UXTAB on page A8-514 v6
110 011 1111 Unsigned Extend Byte UXTB on page A8-520 v6
111 001 - Reverse Bits RBIT on page A8-270 v6T2
111 011 not 1111 Unsigned Extend and Add Halfword UXTAH on page A8-518 v6
111 011 1111 Unsigned Extend Halfword UXTH on page A8-524 v6
111 101 - Byte-Reverse Signed Halfword REVSH on page A8-276 v6
A5.4.4 Signed multiplies
Table A5-20 shows the allocation of encodings in this space. These encodings are all available in ARMv6
and above, and are UNDEFINED in earlier variants of the architecture.
Other encodings in this space are UNDEFINED.
Encoding: cond[31:28], 01110[27:23], op1[22:20], A[15:12], op2[7:5], 1[4]
Table A5-20 Signed multiply instructions
op1 op2 A Instruction See
000 00x not 1111 Signed Multiply Accumulate Dual SMLAD on page A8-332
000 00x 1111 Signed Dual Multiply Add SMUAD on page A8-352
000 01x not 1111 Signed Multiply Subtract Dual SMLSD on page A8-342
000 01x 1111 Signed Dual Multiply Subtract SMUSD on page A8-360
100 00x - Signed Multiply Accumulate Long Dual SMLALD on page A8-338
100 01x - Signed Multiply Subtract Long Dual SMLSLD on page A8-344
101 00x not 1111 Signed Most Significant Word Multiply Accumulate SMMLA on page A8-346
101 00x 1111 Signed Most Significant Word Multiply SMMUL on page A8-350
101 11x - Signed Most Significant Word Multiply Subtract SMMLS on page A8-348
A5.5 Branch, branch with link, and block data transfer
Table A5-21 shows the allocation of encodings in this space. These encodings are in all architecture
variants.
Encoding: cond[31:28], 10[27:26], op[25:20], R[15]
Table A5-21 Branch, branch with link, and block data transfer instructions
op R Instructions See
0000x0 - Store Multiple Decrement After STMDA / STMED on page A8-376
0000x1 - Load Multiple Decrement After LDMDA / LDMFA on page A8-112
0010x0 - Store Multiple (Increment After) STM / STMIA / STMEA on page A8-374
0010x1 - Load Multiple (Increment After) LDM / LDMIA / LDMFD on page A8-110
0100x0 - Store Multiple Decrement Before STMDB / STMFD on page A8-378
0100x1 - Load Multiple Decrement Before LDMDB / LDMEA on page A8-114
0110x0 - Store Multiple Increment Before STMIB / STMFA on page A8-380
0110x1 - Load Multiple Increment Before LDMIB / LDMED on page A8-116
0xx1x0 - Store Multiple (user registers) STM (user registers) on page B6-22
0xx1x1 0 Load Multiple (user registers) LDM (user registers) on page B6-7
0xx1x1 1 Load Multiple (exception return) LDM (exception return) on page B6-5
10xxxx - Branch B on page A8-44
11xxxx - Branch with Link BL, BLX (immediate) on page A8-58
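Table A5-21 can be decoded with a few bit tests on op. The Python sketch below is one illustrative reading of the table: the function name and returned strings are assumptions, and the addressing-mode suffix is taken from op<4:3> as the rows of the table imply.

```python
def decode_branch_block(instr: int) -> str:
    """Classify a branch or block data transfer instruction word."""
    op = (instr >> 20) & 0x3F        # bits [25:20]
    r = (instr >> 15) & 1            # bit [15]
    if op & 0b100000:                # 1xxxxx: branches
        return "BL" if op & 0b010000 else "B"
    load = op & 1
    if op & 0b000100:                # 0xx1x0 and 0xx1x1 rows
        if not load:
            return "STM (user registers)"
        return "LDM (exception return)" if r else "LDM (user registers)"
    # op<4:3>: 00 Decrement After, 01 Increment After,
    #          10 Decrement Before, 11 Increment Before
    mode = ("DA", "IA", "DB", "IB")[(op >> 3) & 0b11]
    return ("LDM" if load else "STM") + mode
```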
A5.6 Supervisor Call, and coprocessor instructions
Table A5-22 shows the allocation of encodings in this space.
Encoding: cond[31:28], 11[27:26], op1[25:20], Rn[19:16], coproc[11:8], op[4]
Table A5-22 Supervisor Call, and coprocessor instructions
op1 op coproc Rn Instructions See Variant
0xxxxx(a) - 101x - Advanced SIMD, VFP Extension register load/store instructions on page A7-26
0xxxx0(a) - not 101x - Store Coprocessor STC, STC2 on page A8-372 All
0xxxx1(a) - not 101x not 1111 Load Coprocessor LDC, LDC2 (immediate) on page A8-106 All
0xxxx1(a) - not 101x 1111 Load Coprocessor LDC, LDC2 (literal) on page A8-108 All
00000x - - - UNDEFINED - -
00010x - 101x - Advanced SIMD, VFP 64-bit transfers between ARM core and extension registers on page A7-32
000100 - not 101x - Move to Coprocessor from two ARM core registers MCRR, MCRR2 on page A8-188 v5TE
000101 - not 101x - Move to two ARM core registers from Coprocessor MRRC, MRRC2 on page A8-204 v5TE
10xxxx 0 101x - - VFP data-processing instructions on page A7-24
10xxxx 0 not 101x - Coprocessor data operations CDP, CDP2 on page A8-68 All
10xxxx 1 101x - Advanced SIMD, VFP 8, 16, and 32-bit transfer between ARM core and extension registers on page A7-31
10xxx0 1 not 101x - Move to Coprocessor from ARM core register MCR, MCR2 on page A8-186 All
10xxx1 1 not 101x - Move to ARM core register from Coprocessor MRC, MRC2 on page A8-202 All
11xxxx - - - Supervisor Call SVC (previously SWI) on page A8-430 All
a. But not 000x0x
For more information about specific coprocessors see Coprocessor support on page A2-68.
A5.7 Unconditional instructions
Table A5-23 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED in ARMv5 and above.
All encodings in this space are UNPREDICTABLE in ARMv4 and ARMv4T.
Encoding: 1111[31:28], op1[27:20], Rn[19:16], op[4]
Table A5-23 Unconditional instructions
op1 op Rn Instruction See Variant
0xxxxxxx - - - Miscellaneous instructions, memory hints, and Advanced SIMD instructions on page A5-31
100xx1x0 - - Store Return State SRS on page B6-20 v6
100xx0x1 - - Return From Exception RFE on page B6-16 v6
101xxxxx - - Branch with Link and Exchange BL, BLX (immediate) on page A8-58 v5
11000x11, 11001xx1, or 1101xxx1 - not 1111 Load Coprocessor (immediate) LDC, LDC2 (immediate) on page A8-106 v5
11000x11, 11001xx1, or 1101xxx1 - 1111 Load Coprocessor (literal) LDC, LDC2 (literal) on page A8-108 v5
11000x10, 11001xx0, or 1101xxx0 - - Store Coprocessor STC, STC2 on page A8-372 v5
11000100 - - Move to Coprocessor from two ARM core registers MCRR, MCRR2 on page A8-188 v6
11000101 - - Move to two ARM core registers from Coprocessor MRRC, MRRC2 on page A8-204 v6
1110xxxx 0 - Coprocessor data operations CDP, CDP2 on page A8-68 v5
1110xxx0 1 - Move to Coprocessor from ARM core register MCR, MCR2 on page A8-186 v5
1110xxx1 1 - Move to ARM core register from Coprocessor MRC, MRC2 on page A8-202 v5
A5.7.1 Miscellaneous instructions, memory hints, and Advanced SIMD instructions
Table A5-24 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED in ARMv5 and above. All these encodings are
UNPREDICTABLE in ARMv4 and ARMv4T.
Encoding: 11110[31:27], op1[26:20], Rn[19:16], op2[7:4]
Table A5-24 Hints, and Advanced SIMD instructions
op1 op2 Rn Instruction See Variant
0010000 xx0x xxx0 Change Processor State CPS on page B6-3 v6
0010000 0000 xxx1 Set Endianness SETEND on page A8-314 v6
01xxxxx - - See Advanced SIMD data-processing instructions on page A7-10 v7
100xxx0 - - See Advanced SIMD element or structure load/store instructions on page A7-27 v7
100x001 - - Unallocated memory hint (treat as NOP) MP Extensions a
100x101 - - Preload Instruction PLI (immediate, literal) on page A8-242 v7
101x001 - not 1111 Preload Data with intent to Write PLD, PLDW (immediate) on page A8-236 MP Extensions a
101x001 - 1111 UNPREDICTABLE - -
101x101 - not 1111 Preload Data PLD, PLDW (immediate) on page A8-236 v5TE
101x101 - 1111 Preload Data PLD (literal) on page A8-238 v5TE
1010111 0001 - Clear-Exclusive CLREX on page A8-70 v6K
1010111 0100 - Data Synchronization Barrier DSB on page A8-92 v6T2
1010111 0101 - Data Memory Barrier DMB on page A8-90 v7
1010111 0110 - Instruction Synchronization Barrier ISB on page A8-102 v6T2
10xxx11 - - UNPREDICTABLE except as shown above -
110x001 xxx0 - Unallocated memory hint (treat as NOP) MP Extensions a
110x101 xxx0 - Preload Instruction PLI (register) on page A8-244 v7
111x001 xxx0 - Preload Data with intent to Write PLD, PLDW (register) on page A8-240 MP Extensions a
111x101 xxx0 - Preload Data PLD, PLDW (register) on page A8-240 v5TE
11xxx11 xxx0 - UNPREDICTABLE - -
a. Multiprocessing Extensions.
Chapter A6
Thumb Instruction Set Encoding
This chapter introduces the Thumb instruction set and describes how it uses the ARM programmers’ model.
It contains the following sections:
Thumb instruction set encoding on page A6-2
16-bit Thumb instruction encoding on page A6-6
32-bit Thumb instruction encoding on page A6-14.
For details of the differences between the Thumb and ThumbEE instruction sets see Chapter A9 ThumbEE.
Note
Architecture variant information in this chapter describes the architecture variant or extension in
which the instruction encoding was introduced into the Thumb instruction set.
In the decode tables in this chapter, an entry of - for a field value means the value of the field does
not affect the decoding.
A6.1 Thumb instruction set encoding
The Thumb instruction stream is a sequence of halfword-aligned halfwords. Each Thumb instruction is
either a single 16-bit halfword in that stream, or a 32-bit instruction consisting of two consecutive halfwords
in that stream.
If bits [15:11] of the halfword being decoded take any of the following values, the halfword is the first
halfword of a 32-bit instruction:
• 0b11101
• 0b11110
• 0b11111.
Otherwise, the halfword is a 16-bit instruction.
For details of the encoding of 16-bit Thumb instructions see 16-bit Thumb instruction encoding on
page A6-6.
For details of the encoding of 32-bit Thumb instructions see 32-bit Thumb instruction encoding on
page A6-14.
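The length rule above can be sketched in a few lines of Python. This is only an illustrative model, not part of the architecture: the function names `is_32bit_first_halfword` and `split_stream` are hypothetical, and the handling of a truncated stream is an assumption made for the example.

```python
def is_32bit_first_halfword(halfword: int) -> bool:
    """Return True if bits [15:11] mark the first halfword of a 32-bit instruction."""
    top5 = (halfword >> 11) & 0x1F
    return top5 in (0b11101, 0b11110, 0b11111)


def split_stream(halfwords):
    """Group a sequence of halfwords into 1- or 2-halfword instruction tuples.

    Assumption for this sketch: a 32-bit prefix at the very end of the
    sequence, with no second halfword available, is returned on its own.
    """
    instructions = []
    i = 0
    while i < len(halfwords):
        if is_32bit_first_halfword(halfwords[i]) and i + 1 < len(halfwords):
            instructions.append((halfwords[i], halfwords[i + 1]))
            i += 2
        else:
            instructions.append((halfwords[i],))
            i += 1
    return instructions
```

For example, the 16-bit NOP encoding 0xBF00 has bits [15:11] == 0b10111 and is a complete instruction, while 0xF3AF has bits [15:11] == 0b11110 and must be paired with the halfword that follows it.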
A6.1.1 UNDEFINED and UNPREDICTABLE instruction set space
An attempt to execute an unallocated instruction results in either:
• Unpredictable behavior. The instruction is described as UNPREDICTABLE.
• An Undefined Instruction exception. The instruction is described as UNDEFINED.
An instruction is UNDEFINED if it is declared as UNDEFINED in an instruction description, or in this chapter.
An instruction is UNPREDICTABLE if:
• a bit marked (0) or (1) in the encoding diagram of an instruction is not 0 or 1 respectively, and the
  pseudocode for that encoding does not indicate that a different special case applies
• it is declared as UNPREDICTABLE in an instruction description or in this chapter.
Unless otherwise specified:
• Thumb instructions introduced in an architecture variant are either UNPREDICTABLE or UNDEFINED in
  earlier architecture variants.
• A Thumb instruction that is provided by one or more of the architecture extensions is either
  UNPREDICTABLE or UNDEFINED in an implementation that does not include any of those extensions.
In both cases, the instruction is UNPREDICTABLE if it is a 32-bit instruction in an architecture variant before
ARMv6T2, and UNDEFINED otherwise.
A6.1.2 Use of 0b1111 as a register specifier
The use of 0b1111 as a register specifier is not normally permitted in Thumb instructions. When a value of
0b1111 is permitted, a variety of meanings is possible. For register reads, these meanings are:
• Read the PC value, that is, the address of the current instruction + 4. The base register of the table
  branch instructions TBB and TBH can be the PC. This enables branch tables to be placed in memory
  immediately after the instruction.
  Note
  Use of the PC as the base register in the STC instruction is deprecated in ARMv7.
• Read the word-aligned PC value, that is, the address of the current instruction + 4, with bits [1:0]
  forced to zero. The base register of LDC, LDR, LDRB, LDRD (pre-indexed, no writeback), LDRH, LDRSB, and
  LDRSH instructions can be the word-aligned PC. This enables PC-relative data addressing. In addition,
  some encodings of the ADD and SUB instructions permit their source registers to be 0b1111 for the same
  purpose.
• Read zero. This is done in some cases when one instruction is a special case of another, more general
  instruction, but with one operand zero. In these cases, the instructions are listed on separate pages,
  with a special case in the pseudocode for the more general instruction cross-referencing the other
  page.
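The two PC read values described above differ only in bits [1:0]. A minimal Python sketch, assuming 32-bit addresses and using the hypothetical helper names `thumb_pc_read` and `thumb_pc_read_aligned`:

```python
MASK32 = 0xFFFFFFFF


def thumb_pc_read(instr_addr: int) -> int:
    """PC value read by a Thumb instruction: address of the instruction + 4."""
    return (instr_addr + 4) & MASK32


def thumb_pc_read_aligned(instr_addr: int) -> int:
    """Word-aligned PC value: the PC read value with bits [1:0] forced to zero."""
    return thumb_pc_read(instr_addr) & ~0x3 & MASK32
```

For an instruction at 0x8002 the PC reads as 0x8006, and the word-aligned PC used for PC-relative data addressing is 0x8004. For an instruction at a word-aligned address the two values are equal.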
For register writes, these meanings are:
• The PC can be specified as the destination register of an LDR instruction. This is done by encoding Rt
  as 0b1111. The loaded value is treated as an address, and the effect of execution is a branch to that
  address. Bit [0] of the loaded value selects whether to execute ARM or Thumb instructions after the
  branch.
  Some other instructions write the PC in similar ways, either implicitly (for example branch
  instructions) or by using a register mask rather than a register specifier (LDM). The address to branch
  to can be:
  - a loaded value, for example, RFE
  - a register value, for example, BX
  - the result of a calculation, for example, TBB or TBH.
  The method of choosing the instruction set used after the branch can be:
  - similar to the LDR case, for LDM or BX
  - a fixed instruction set other than the one currently being used, for example, the immediate form
    of BLX
  - unchanged, for example branch instructions
  - set from the (J,T) bits of the SPSR, for RFE and SUBS PC,LR,#imm8.
• Discard the result of a calculation. This is done in some cases when one instruction is a special case
  of another, more general instruction, but with the result discarded. In these cases, the instructions are
  listed on separate pages, with a special case in the pseudocode for the more general instruction
  cross-referencing the other page.
• If the destination register specifier of an LDRB, LDRH, LDRSB, or LDRSH instruction is 0b1111, the
  instruction is a memory hint instead of a load operation.
• If the destination register specifier of an MRC instruction is 0b1111, bits [31:28] of the value
  transferred from the coprocessor are written to the N, Z, C, and V flags in the APSR, and bits [27:0]
  are discarded.
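As an illustration of the MRC case, the following Python sketch extracts the four flag values from a transferred 32-bit word. The function name `mrc_to_apsr_flags` and the dictionary representation are hypothetical; the sketch only models the bit selection described above and ignores bits [27:0].

```python
def mrc_to_apsr_flags(value: int) -> dict:
    """Map bits [31:28] of a transferred value to the N, Z, C, and V flags."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
    }
```

With a transferred value of 0x60000000, bits [31:28] are 0b0110, so Z and C are set and N and V are clear.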
A6.1.3 Use of 0b1101 as a register specifier
R13 is defined in the Thumb instruction set so that its use is primarily as a stack pointer, and R13 is normally
identified as SP in Thumb instructions. In 32-bit Thumb instructions, if you use R13 as a general-purpose
register beyond the architecturally defined constraints described in this section, the results are
UNPREDICTABLE.
The restrictions applicable to R13 are described in:
R13[1:0] definition
32-bit Thumb instruction support for R13.
See also 16-bit Thumb instruction support for R13 on page A6-5.
R13[1:0] definition
Bits [1:0] of R13 are SBZP. Writing a nonzero value to bits [1:0] causes UNPREDICTABLE behavior.
32-bit Thumb instruction support for R13
R13 instruction support is restricted to the following:
• R13 as the source or destination register of a MOV instruction. Only register to register transfers without
  shifts are supported, with no flag setting:
  MOV SP,<Rm>
  MOV <Rn>,SP
• Using the following instructions to adjust R13 up or down by a multiple of 4:
  ADD{W} SP,SP,#<imm>
  SUB{W} SP,SP,#<imm>
  ADD SP,SP,<Rm>
  ADD SP,SP,<Rm>,LSL #<n> ; For <n> = 1,2,3
  SUB SP,SP,<Rm>
  SUB SP,SP,<Rm>,LSL #<n> ; For <n> = 1,2,3
• R13 as a base register <Rn> of any load/store instruction. This supports SP-based addressing for load,
  store, or memory hint instructions, with positive or negative offsets, with and without writeback.
• R13 as the first operand <Rn> in any ADD{S}, CMN, CMP, or SUB{S} instruction. The add and subtract
  instructions support SP-based address generation, with the address going into a general-purpose
  register. CMN and CMP are useful for stack checking in some circumstances.
• R13 as the transferred register <Rt> in any LDR or STR instruction.
16-bit Thumb instruction support for R13
For 16-bit data-processing instructions that affect high registers, R13 can only be used as described in 32-bit
Thumb instruction support for R13 on page A6-4. Any other use is deprecated. This affects the high register
forms of CMP and ADD, where the use of R13 as <Rm> is deprecated.
A6.2 16-bit Thumb instruction encoding
Table A6-1 shows the allocation of 16-bit instruction encodings.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Opcode
Table A6-1 16-bit Thumb instruction encoding
Opcode Instruction or instruction class Variant
00xxxx Shift (immediate), add, subtract, move, and compare on page A6-7 -
010000 Data-processing on page A6-8 -
010001 Special data instructions and branch and exchange on page A6-9 -
01001x Load from Literal Pool, see LDR (literal) on page A8-122 v4T
0101xx Load/store single data item on page A6-10 -
011xxx Load/store single data item on page A6-10 -
100xxx Load/store single data item on page A6-10 -
10100x Generate PC-relative address, see ADR on page A8-32 v4T
10101x Generate SP-relative address, see ADD (SP plus immediate) on page A8-28 v4T
1011xx Miscellaneous 16-bit instructions on page A6-11 -
11000x Store multiple registers, see STM / STMIA / STMEA on page A8-374 a v4T
11001x Load multiple registers, see LDM / LDMIA / LDMFD on page A8-110 a v4T
1101xx Conditional branch, and Supervisor Call on page A6-13 -
11100x Unconditional Branch, see B on page A8-44 v4T
a. In ThumbEE, 16-bit load/store multiple instructions are not available. This encoding is used for special
ThumbEE instructions. For details see Chapter A9 ThumbEE.
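The top-level split in Table A6-1 can be expressed as a mask-and-match lookup. The Python sketch below is illustrative only: it assumes that the Opcode field occupies bits [15:10] of the halfword, as the 6-bit patterns in the table suggest, and the names `THUMB16_CLASSES` and `classify_thumb16` are hypothetical.

```python
# (mask, value, class) entries over the 6-bit Opcode field, following Table A6-1.
THUMB16_CLASSES = [
    (0b110000, 0b000000, "Shift (immediate), add, subtract, move, and compare"),
    (0b111111, 0b010000, "Data-processing"),
    (0b111111, 0b010001, "Special data instructions and branch and exchange"),
    (0b111110, 0b010010, "Load from Literal Pool"),
    (0b111100, 0b010100, "Load/store single data item"),
    (0b111000, 0b011000, "Load/store single data item"),
    (0b111000, 0b100000, "Load/store single data item"),
    (0b111110, 0b101000, "Generate PC-relative address"),
    (0b111110, 0b101010, "Generate SP-relative address"),
    (0b111100, 0b101100, "Miscellaneous 16-bit instructions"),
    (0b111110, 0b110000, "Store multiple registers"),
    (0b111110, 0b110010, "Load multiple registers"),
    (0b111100, 0b110100, "Conditional branch, and Supervisor Call"),
    (0b111110, 0b111000, "Unconditional Branch"),
]


def classify_thumb16(halfword: int):
    """Return the Table A6-1 class of a halfword, or None if no row matches.

    None is returned for opcodes 11101x and 1111xx, which start 32-bit
    instructions and so are not 16-bit encodings.
    """
    opcode = (halfword >> 10) & 0x3F  # assumed position: bits [15:10]
    for mask, value, name in THUMB16_CLASSES:
        if opcode & mask == value:
            return name
    return None
```

For example, 0x4770 has Opcode 0b010001 and falls in the special data and branch and exchange class, while 0xF000 matches no row because it is the first halfword of a 32-bit instruction.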
A6.2.1 Shift (immediate), add, subtract, move, and compare
Table A6-2 shows the allocation of encodings in this space.
All these instructions are available since the Thumb instruction set was introduced in ARMv4T.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 Opcode
Table A6-2 16-bit Thumb shift (immediate), add, subtract, move, and compare instructions
Opcode Instruction See
000xx Logical Shift Left LSL (immediate) on page A8-178
001xx Logical Shift Right LSR (immediate) on page A8-182
010xx Arithmetic Shift Right ASR (immediate) on page A8-40
01100 Add register ADD (register) on page A8-24
01101 Subtract register SUB (register) on page A8-422
01110 Add 3-bit immediate ADD (immediate, Thumb) on page A8-20
01111 Subtract 3-bit immediate SUB (immediate, Thumb) on page A8-418
100xx Move MOV (immediate) on page A8-194
101xx Compare CMP (immediate) on page A8-80
110xx Add 8-bit immediate ADD (immediate, Thumb) on page A8-20
111xx Subtract 8-bit immediate SUB (immediate, Thumb) on page A8-418
A6.2.2 Data-processing
Table A6-3 shows the allocation of encodings in this space.
All these instructions are available since the Thumb instruction set was introduced in ARMv4T.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
010000 Opcode
Table A6-3 16-bit Thumb data-processing instructions
Opcode Instruction See
0000 Bitwise AND AND (register) on page A8-36
0001 Bitwise Exclusive OR EOR (register) on page A8-96
0010 Logical Shift Left LSL (register) on page A8-180
0011 Logical Shift Right LSR (register) on page A8-184
0100 Arithmetic Shift Right ASR (register) on page A8-42
0101 Add with Carry ADC (register) on page A8-16
0110 Subtract with Carry SBC (register) on page A8-304
0111 Rotate Right ROR (register) on page A8-280
1000 Test TST (register) on page A8-456
1001 Reverse Subtract from 0 RSB (immediate) on page A8-284
1010 Compare Registers CMP (register) on page A8-82
1011 Compare Negative CMN (register) on page A8-76
1100 Bitwise OR ORR (register) on page A8-230
1101 Multiply Two Registers MUL on page A8-212
1110 Bitwise Bit Clear BIC (register) on page A8-52
1111 Bitwise NOT MVN (register) on page A8-216
A6.2.3 Special data instructions and branch and exchange
Table A6-4 shows the allocation of encodings in this space.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
010001 Opcode
Table A6-4 16-bit Thumb special data instructions and branch and exchange
Opcode Instruction See Variant
0000 Add Low Registers ADD (register) on page A8-24 v6T2 a
0001 Add High Registers ADD (register) on page A8-24 v4T
001x Add High Registers ADD (register) on page A8-24 v4T
0100 UNPREDICTABLE - -
0101 Compare High Registers CMP (register) on page A8-82 v4T
011x Compare High Registers CMP (register) on page A8-82 v4T
1000 Move Low Registers MOV (register) on page A8-196 v6 a
1001 Move High Registers MOV (register) on page A8-196 v4T
101x Move High Registers MOV (register) on page A8-196 v4T
110x Branch and Exchange BX on page A8-62 v4T
111x Branch with Link and Exchange BLX (register) on page A8-60 v5T a
a. UNPREDICTABLE in earlier variants.
A6.2.4 Load/store single data item
These instructions have one of the following values in opA:
• 0b0101
• 0b011x
• 0b100x.
Table A6-5 shows the allocation of encodings in this space.
All these instructions are available since the Thumb instruction set was introduced in ARMv4T.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
opA opB
Table A6-5 16-bit Thumb Load/store instructions
opA opB Instruction See
0101 000 Store Register STR (register) on page A8-386
0101 001 Store Register Halfword STRH (register) on page A8-412
0101 010 Store Register Byte STRB (register) on page A8-392
0101 011 Load Register Signed Byte LDRSB (register) on page A8-164
0101 100 Load Register LDR (register) on page A8-124
0101 101 Load Register Halfword LDRH (register) on page A8-156
0101 110 Load Register Byte LDRB (register) on page A8-132
0101 111 Load Register Signed Halfword LDRSH (register) on page A8-172
0110 0xx Store Register STR (immediate, Thumb) on page A8-382
0110 1xx Load Register LDR (immediate, Thumb) on page A8-118
0111 0xx Store Register Byte STRB (immediate, Thumb) on page A8-388
0111 1xx Load Register Byte LDRB (immediate, Thumb) on page A8-126
1000 0xx Store Register Halfword STRH (immediate, Thumb) on page A8-408
1000 1xx Load Register Halfword LDRH (immediate, Thumb) on page A8-150
1001 0xx Store Register SP relative STR (immediate, Thumb) on page A8-382
1001 1xx Load Register SP relative LDR (immediate, Thumb) on page A8-118
A6.2.5 Miscellaneous 16-bit instructions
Table A6-6 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 0 1 1 Opcode
Table A6-6 Miscellaneous 16-bit instructions
Opcode Instruction See Variant
0110010 Set Endianness SETEND on page A8-314 v6
0110011 Change Processor State CPS on page B6-3 v6
00000xx Add Immediate to SP ADD (SP plus immediate) on page A8-28 v4T
00001xx Subtract Immediate from SP SUB (SP minus immediate) on page A8-426 v4T
0001xxx Compare and Branch on Zero CBNZ, CBZ on page A8-66 v6T2
001000x Signed Extend Halfword SXTH on page A8-444 v6
001001x Signed Extend Byte SXTB on page A8-440 v6
001010x Unsigned Extend Halfword UXTH on page A8-524 v6
001011x Unsigned Extend Byte UXTB on page A8-520 v6
0011xxx Compare and Branch on Zero CBNZ, CBZ on page A8-66 v6T2
010xxxx Push Multiple Registers PUSH on page A8-248 v4T
1001xxx Compare and Branch on Nonzero CBNZ, CBZ on page A8-66 v6T2
101000x Byte-Reverse Word REV on page A8-272 v6
101001x Byte-Reverse Packed Halfword REV16 on page A8-274 v6
101011x Byte-Reverse Signed Halfword REVSH on page A8-276 v6
1011xxx Compare and Branch on Nonzero CBNZ, CBZ on page A8-66 v6T2
110xxxx Pop Multiple Registers POP on page A8-246 v4T
1110xxx Breakpoint BKPT on page A8-56 v5
1111xxx If-Then, and hints If-Then, and hints on page A6-12 -
If-Then, and hints
Table A6-7 shows the allocation of encodings in this space.
Other encodings in this space are unallocated hints. They execute as NOPs, but software must not use them.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
10111111 opA opB
Table A6-7 Miscellaneous 16-bit instructions
opA opB Instruction See Variant
- not 0000 If-Then IT on page A8-104 v6T2
0000 0000 No Operation hint NOP on page A8-222 v6T2
0001 0000 Yield hint YIELD on page A8-812 v7
0010 0000 Wait For Event hint WFE on page A8-808 v7
0011 0000 Wait For Interrupt hint WFI on page A8-810 v7
0100 0000 Send Event hint SEV on page A8-316 v7
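Table A6-7 can be read as a two-step decode: a nonzero opB selects IT, otherwise opA selects the hint. The Python sketch below is a hedged illustration. It assumes opA is bits [7:4] and opB is bits [3:0], following the layout of the encoding diagram, and the names `HINTS_16BIT` and `decode_it_or_hint` are hypothetical.

```python
# Allocated hints from Table A6-7, keyed by opA when opB == 0b0000.
HINTS_16BIT = {
    0b0000: "NOP",
    0b0001: "YIELD",
    0b0010: "WFE",
    0b0011: "WFI",
    0b0100: "SEV",
}


def decode_it_or_hint(halfword: int) -> str:
    """Decode a halfword in the 10111111 opA opB space."""
    if (halfword >> 8) & 0xFF != 0b10111111:
        raise ValueError("not in the If-Then, and hints space")
    op_a = (halfword >> 4) & 0xF  # assumed position: bits [7:4]
    op_b = halfword & 0xF         # assumed position: bits [3:0]
    if op_b != 0:
        return "IT"
    # Unallocated hints execute as NOPs, but software must not use them.
    return HINTS_16BIT.get(op_a, "unallocated hint (executes as NOP)")
```

For instance, 0xBF30 decodes as WFI, and any halfword of the form 0xBFxy with y != 0 decodes as IT.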
A6.2.6 Conditional branch, and Supervisor Call
Table A6-8 shows the allocation of encodings in this space.
All these instructions are available since the Thumb instruction set was introduced in ARMv4T.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 0 1 Opcode
Table A6-8 Conditional branch and Supervisor Call instructions
Opcode Instruction See
not 111x Conditional branch B on page A8-44
1110 Permanently UNDEFINED. This space will not be allocated in future.
1111 Supervisor Call SVC (previously SWI) on page A8-430
A6.3 32-bit Thumb instruction encoding
If op1 == 0b00, a 16-bit instruction is encoded, see 16-bit Thumb instruction encoding on page A6-6.
Table A6-9 shows the allocation of encodings in this space.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 op1 op2 op
Table A6-9 32-bit Thumb instruction encoding
op1 op2 op Instruction class, see
01 00xx0xx - Load/store multiple on page A6-23
01 00xx1xx - Load/store dual, load/store exclusive, table branch on page A6-24
01 01xxxxx - Data-processing (shifted register) on page A6-31
01 1xxxxxx - Coprocessor instructions on page A6-40
10 x0xxxxx 0 Data-processing (modified immediate) on page A6-15
10 x1xxxxx 0 Data-processing (plain binary immediate) on page A6-19
10 - 1 Branches and miscellaneous control on page A6-20
11 000xxx0 - Store single data item on page A6-30
11 001xxx0 - Advanced SIMD element or structure load/store instructions on page A7-27
11 00xx001 - Load byte, memory hints on page A6-28
11 00xx011 - Load halfword, memory hints on page A6-26
11 00xx101 - Load word on page A6-25
11 00xx111 - UNDEFINED
11 010xxxx - Data-processing (register) on page A6-33
11 0110xxx - Multiply, multiply accumulate, and absolute difference on page A6-38
11 0111xxx - Long multiply, long multiply accumulate, and divide on page A6-39
11 1xxxxxx - Coprocessor instructions on page A6-40
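The first-level decode in Table A6-9 can be sketched as follows. This is an illustrative model rather than a definitive decoder: it assumes op1 is bits [12:11] and op2 is bits [10:4] of the first halfword, and op is bit [15] of the second halfword, as the encoding diagram suggests. The names `OP2_CLASSES` and `classify_thumb32` are hypothetical.

```python
# (mask, value, class) entries over the 7-bit op2 field, per op1 value.
OP2_CLASSES = {
    0b01: [
        (0b1100100, 0b0000000, "Load/store multiple"),
        (0b1100100, 0b0000100, "Load/store dual, load/store exclusive, table branch"),
        (0b1100000, 0b0100000, "Data-processing (shifted register)"),
        (0b1000000, 0b1000000, "Coprocessor instructions"),
    ],
    0b11: [
        (0b1110001, 0b0000000, "Store single data item"),
        (0b1110001, 0b0010000, "Advanced SIMD element or structure load/store instructions"),
        (0b1100111, 0b0000001, "Load byte, memory hints"),
        (0b1100111, 0b0000011, "Load halfword, memory hints"),
        (0b1100111, 0b0000101, "Load word"),
        (0b1100111, 0b0000111, "UNDEFINED"),
        (0b1110000, 0b0100000, "Data-processing (register)"),
        (0b1111000, 0b0110000, "Multiply, multiply accumulate, and absolute difference"),
        (0b1111000, 0b0111000, "Long multiply, long multiply accumulate, and divide"),
        (0b1000000, 0b1000000, "Coprocessor instructions"),
    ],
}


def classify_thumb32(hw1: int, hw2: int) -> str:
    """Return the Table A6-9 instruction class for a pair of halfwords."""
    if (hw1 >> 13) & 0x7 != 0b111:
        raise ValueError("first halfword does not start with 0b111")
    op1 = (hw1 >> 11) & 0x3   # assumed position: hw1 bits [12:11]
    op2 = (hw1 >> 4) & 0x7F   # assumed position: hw1 bits [10:4]
    op = (hw2 >> 15) & 0x1    # assumed position: hw2 bit [15]
    if op1 == 0b00:
        return "16-bit instruction"
    if op1 == 0b10:
        if op == 1:
            return "Branches and miscellaneous control"
        if op2 & 0b0100000:
            return "Data-processing (plain binary immediate)"
        return "Data-processing (modified immediate)"
    for mask, value, name in OP2_CLASSES[op1]:
        if op2 & mask == value:
            return name
    raise AssertionError("unreachable: the op2 patterns cover all values")
```

As a usage example, the halfword pair 0xF3AF, 0x8000 has op1 == 0b10 and op == 1, so it is classified under branches and miscellaneous control.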
A6.3.1 Data-processing (modified immediate)
Table A6-10 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
In the Rn, Rd and S columns, - indicates that the value of the field does not affect the decoding.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11110 0 op S Rn 0 Rd
Table A6-10 32-bit modified immediate data-processing instructions
op Rn Rd S Instruction See
0000 - not 1111 x Bitwise AND AND (immediate) on page A8-34
0000 - 1111 0 UNPREDICTABLE -
0000 - 1111 1 Test TST (immediate) on page A8-454
0001 - - - Bitwise Bit Clear BIC (immediate) on page A8-50
0010 not 1111 - - Bitwise OR ORR (immediate) on page A8-228
0010 1111 - - Move MOV (immediate) on page A8-194
0011 not 1111 - - Bitwise OR NOT ORN (immediate) on page A8-224
0011 1111 - - Bitwise NOT MVN (immediate) on page A8-214
0100 - not 1111 x Bitwise Exclusive OR EOR (immediate) on page A8-94
0100 - 1111 0 UNPREDICTABLE -
0100 - 1111 1 Test Equivalence TEQ (immediate) on page A8-448
1000 - not 1111 - Add ADD (immediate, Thumb) on page A8-20
1000 - 1111 0 UNPREDICTABLE -
1000 - 1111 1 Compare Negative CMN (immediate) on page A8-74
1010 - - - Add with Carry ADC (immediate) on page A8-14
1011 - - - Subtract with Carry SBC (immediate) on page A8-302
1101 - not 1111 - Subtract SUB (immediate, Thumb) on page A8-418
1101 - 1111 0 UNPREDICTABLE -
1101 - 1111 1 Compare CMP (immediate) on page A8-80
1110 - - - Reverse Subtract RSB (immediate) on page A8-284
These instructions all have modified immediate constants, rather than a simple 12-bit binary number. This
provides a more useful range of values. For details see Modified immediate constants in Thumb instructions
on page A6-17.
A6.3.2 Modified immediate constants in Thumb instructions
Table A6-11 shows the range of modified immediate constants available in Thumb data-processing
instructions, and how they are encoded in the a, b, c, d, e, f, g, h, i, and imm3 fields in the instruction.
Note
The range of values available in Thumb modified immediate constants is slightly different from the range
of values available in ARM instructions. See Modified immediate constants in ARM instructions on
page A5-9 for the ARM values.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
i imm3 a b c d e f g h
Table A6-11 Encoding of modified immediates in Thumb data-processing instructions
i:imm3:a <const> a
0000x 00000000 00000000 00000000 abcdefgh
0001x 00000000 abcdefgh 00000000 abcdefgh b
0010x abcdefgh 00000000 abcdefgh 00000000 b
0011x abcdefgh abcdefgh abcdefgh abcdefgh b
01000 1bcdefgh 00000000 00000000 00000000
01001 01bcdefg h0000000 00000000 00000000 c
01010 001bcdef gh000000 00000000 00000000
01011 0001bcde fgh00000 00000000 00000000 c
...   8-bit values shifted to other positions
11101 00000000 00000000 000001bc defgh000 c
11110 00000000 00000000 0000001b cdefgh00
11111 00000000 00000000 00000001 bcdefgh0 c
a. In this table, the immediate constant value is shown in binary form, to relate abcdefgh to the encoding
diagram. In assembly syntax, the immediate value is specified in the usual way (a decimal number by default).
b. Not available in ARM instructions. UNPREDICTABLE if abcdefgh == 00000000.
c. Not available in ARM instructions if h == 1.
Carry out
A logical instruction with i:imm3:a == ’00xxx’ does not affect the carry flag. Otherwise, a logical
instruction that sets the flags sets the Carry flag to the value of bit [31] of the modified immediate constant.
Operation
// ThumbExpandImm()
// ================

bits(32) ThumbExpandImm(bits(12) imm12)
    // APSR.C argument to following function call does not affect the imm32 result.
    (imm32, -) = ThumbExpandImm_C(imm12, APSR.C);
    return imm32;

// ThumbExpandImm_C()
// ==================

(bits(32), bit) ThumbExpandImm_C(bits(12) imm12, bit carry_in)
    if imm12<11:10> == '00' then
        case imm12<9:8> of
            when '00'
                imm32 = ZeroExtend(imm12<7:0>, 32);
            when '01'
                if imm12<7:0> == '00000000' then UNPREDICTABLE;
                imm32 = '00000000' : imm12<7:0> : '00000000' : imm12<7:0>;
            when '10'
                if imm12<7:0> == '00000000' then UNPREDICTABLE;
                imm32 = imm12<7:0> : '00000000' : imm12<7:0> : '00000000';
            when '11'
                if imm12<7:0> == '00000000' then UNPREDICTABLE;
                imm32 = imm12<7:0> : imm12<7:0> : imm12<7:0> : imm12<7:0>;
        carry_out = carry_in;
    else
        unrotated_value = ZeroExtend('1':imm12<6:0>, 32);
        (imm32, carry_out) = ROR_C(unrotated_value, UInt(imm12<11:7>));
    return (imm32, carry_out);
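The pseudocode above can be modelled in Python for experimentation. This is a sketch under stated assumptions, not a reference implementation: the names `ror_c` and `thumb_expand_imm_c` are hypothetical, and raising ValueError for the UNPREDICTABLE cases is a modelling choice of this example, since the architecture does not define the behavior in those cases.

```python
def ror_c(value: int, shift: int):
    """Rotate a 32-bit value right by shift (1..31); carry out is bit [31] of the result."""
    result = ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF
    return result, (result >> 31) & 1


def thumb_expand_imm_c(imm12: int, carry_in: int):
    """Expand the 12-bit i:imm3:a:bcdefgh field to (imm32, carry_out)."""
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("imm12 must be a 12-bit value")
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) & 0x3 == 0b00:
        mode = (imm12 >> 8) & 0x3
        if mode != 0b00 and imm8 == 0:
            # Modelling choice: the pseudocode marks this case UNPREDICTABLE.
            raise ValueError("UNPREDICTABLE modified immediate")
        if mode == 0b00:
            imm32 = imm8
        elif mode == 0b01:
            imm32 = (imm8 << 16) | imm8
        elif mode == 0b10:
            imm32 = (imm8 << 24) | (imm8 << 8)
        else:
            imm32 = imm8 * 0x01010101
        return imm32, carry_in
    # Rotated form: the rotation amount imm12<11:7> is always in the range 8..31 here.
    unrotated = 0x80 | (imm12 & 0x7F)
    return ror_c(unrotated, (imm12 >> 7) & 0x1F)
```

For example, imm12 == 0x1AB gives 0x00AB00AB with the carry unchanged, and imm12 == 0x400 gives 0x80000000 with a carry out of 1, which matches the 01000 row of Table A6-11.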
A6.3.3 Data-processing (plain binary immediate)
Table A6-12 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11110 1 op Rn 0
Table A6-12 32-bit unmodified immediate data-processing instructions
op Rn Instruction See
00000 not 1111 Add Wide (12-bit) ADD (immediate, Thumb) on page A8-20
00000 1111 Form PC-relative Address ADR on page A8-32
00100 - Move Wide (16-bit) MOV (immediate) on page A8-194
01010 not 1111 Subtract Wide (12-bit) SUB (immediate, Thumb) on page A8-418
01010 1111 Form PC-relative Address ADR on page A8-32
01100 - Move Top (16-bit) MOVT on page A8-200
100x0 a - Signed Saturate SSAT on page A8-362
10010 b - Signed Saturate (two 16-bit) SSAT16 on page A8-364
10100 - Signed Bit Field Extract SBFX on page A8-308
10110 not 1111 Bit Field Insert BFI on page A8-48
10110 1111 Bit Field Clear BFC on page A8-46
110x0 a - Unsigned Saturate USAT on page A8-504
11010 b - Unsigned Saturate 16 USAT16 on page A8-506
11100 - Unsigned Bit Field Extract UBFX on page A8-466
a. In the second halfword of the instruction, bits [14:12,7:6] != 0b00000.
b. In the second halfword of the instruction, bits [14:12,7:6] == 0b00000.
A6.3.4 Branches and miscellaneous control
Table A6-13 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11110 op 1 op1 op2
Table A6-13 Branches and miscellaneous control instructions
op1 op op2 Instruction See Variant
0x0 not x111xxx - Conditional branch B on page A8-44 v6T2
0x0 0111000 xx00 Move to Special Register, application level MSR (register) on page A8-210 All
0x0 0111000 xx01 Move to Special Register, system level MSR (register) on page B6-14 All
0x0 0111000 xx1x Move to Special Register, system level MSR (register) on page B6-14 All
0x0 0111001 - Move to Special Register, system level MSR (register) on page B6-14 All
0x0 0111010 - Change Processor State, and hints on page A6-21 -
0x0 0111011 - Miscellaneous control instructions on page A6-21 -
0x0 0111100 - Branch and Exchange Jazelle BXJ on page A8-64 v6T2
0x0 0111101 - Exception Return SUBS PC, LR and related instructions on page B6-25 v6T2
0x0 011111x - Move from Special Register MRS on page A8-206 v6T2
000 1111111 - Secure Monitor Call SMC (previously SMI) on page B6-18 Security Extensions
010 1111111 - Permanently UNDEFINED. This space will not be allocated in future.
0x1 - - Branch B on page A8-44 v6T2
1x0 - - Branch with Link and Exchange BL, BLX (immediate) on page A8-58 v5T a
1x1 - - Branch with Link BL, BLX (immediate) on page A8-58 v4T
a. UNDEFINED in ARMv4T.
Change Processor State, and hints
Table A6-14 shows the allocation of encodings in this space. Other encodings in this space are unallocated
hints that execute as NOPs. These unallocated hint encodings are reserved and software must not use them.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111100111010 10 0 op1 op2
Table A6-14 Change Processor State, and hint instructions
op1 op2 Instruction See Variant
not 000 - Change Processor State CPS on page B6-3 v6T2
000 00000000 No Operation hint NOP on page A8-222 v6T2
000 00000001 Yield hint YIELD on page A8-812 v7
000 00000010 Wait For Event hint WFE on page A8-808 v7
000 00000011 Wait For Interrupt hint WFI on page A8-810 v7
000 00000100 Send Event hint SEV on page A8-316 v7
000 1111xxxx Debug hint DBG on page A8-88 v7
Miscellaneous control instructions
Table A6-15 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED
in ARMv7. They are UNPREDICTABLE in ARMv6.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111100111011 10 0 op
Table A6-15 Miscellaneous control instructions
op Instruction See Variant
0000 Leave ThumbEE state a ENTERX, LEAVEX on page A9-7 ThumbEE
0001 Enter ThumbEE state ENTERX, LEAVEX on page A9-7 ThumbEE
0010 Clear-Exclusive CLREX on page A8-70 v7
0100 Data Synchronization Barrier DSB on page A8-92 v7
0101 Data Memory Barrier DMB on page A8-90 v7
0110 Instruction Synchronization Barrier ISB on page A8-102 v7
a. This instruction is a NOP in Thumb state.
A6.3.5 Load/store multiple
Table A6-16 shows the allocation of encodings in this space.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1110100 op 0 L Rn
Table A6-16 Load/store multiple instructions
op L Rn Instruction See
00 0 - Store Return State SRS on page B6-20
00 1 - Return From Exception RFE on page B6-16
01 0 - Store Multiple (Increment After, Empty Ascending) STM / STMIA / STMEA on page A8-374
01 1 not 1101 Load Multiple (Increment After, Full Descending) LDM / LDMIA / LDMFD on page A8-110
01 1 1101 Pop Multiple Registers from the stack POP on page A8-246
10 0 not 1101 Store Multiple (Decrement Before, Full Descending) STMDB / STMFD on page A8-378
10 0 1101 Push Multiple Registers to the stack PUSH on page A8-248
10 1 - Load Multiple (Decrement Before, Empty Ascending) LDMDB / LDMEA on page A8-114
11 0 - Store Return State SRS on page B6-20
11 1 - Return From Exception RFE on page B6-16
A6.3.6 Load/store dual, load/store exclusive, table branch
Table A6-17 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1110100 op1 1 op2 Rn op3
Table A6-17 Load/store double or exclusive, table branch
op1 op2 op3 Rn Instruction See Variant
00 00 - - Store Register Exclusive STREX on page A8-400 v6T2
00 01 - - Load Register Exclusive LDREX on page A8-142 v6T2
0x 10 - - Store Register Dual STRD (immediate) on page A8-396 v6T2
1x x0 - - Store Register Dual STRD (immediate) on page A8-396 v6T2
0x 11 - not 1111 Load Register Dual (immediate) LDRD (immediate) on page A8-136 v6T2
1x x1 - not 1111 Load Register Dual (immediate) LDRD (immediate) on page A8-136 v6T2
0x 11 - 1111 Load Register Dual (literal) LDRD (literal) on page A8-138 v6T2
1x x1 - 1111 Load Register Dual (literal) LDRD (literal) on page A8-138 v6T2
01 00 0100 - Store Register Exclusive Byte STREXB on page A8-402 v7
01 00 0101 - Store Register Exclusive Halfword STREXH on page A8-406 v7
01 00 0111 - Store Register Exclusive Doubleword STREXD on page A8-404 v7
01 01 0000 - Table Branch Byte TBB, TBH on page A8-446 v6T2
01 01 0001 - Table Branch Halfword TBB, TBH on page A8-446 v6T2
01 01 0100 - Load Register Exclusive Byte LDREXB on page A8-144 v7
01 01 0101 - Load Register Exclusive Halfword LDREXH on page A8-148 v7
01 01 0111 - Load Register Exclusive Doubleword LDREXD on page A8-146 v7
A6.3.7 Load word
Table A6-18 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111100 op1 101 Rn op2
Table A6-18 Load word
op1 op2 Rn Instruction See
01 - not 1111 Load Register LDR (immediate, Thumb) on page A8-118
00 1xx1xx not 1111 Load Register LDR (immediate, Thumb) on page A8-118
00 1100xx not 1111 Load Register LDR (immediate, Thumb) on page A8-118
00 1110xx not 1111 Load Register Unprivileged LDRT on page A8-176
00 000000 not 1111 Load Register LDR (register) on page A8-124
0x - 1111 Load Register LDR (literal) on page A8-122
A6.3.8 Load halfword, memory hints
Table A6-19 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
Except where otherwise noted, these encodings are available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111100 op1 011 Rn Rt op2
Table A6-19 Load halfword, preload
op1  op2     Rn        Rt        Instruction                                 See
0x   -       1111      not 1111  Load Register Halfword                      LDRH (literal) on page A8-154
01   -       not 1111  not 1111  Load Register Halfword                      LDRH (immediate, Thumb) on page A8-150
00   1xx1xx  not 1111  not 1111  Load Register Halfword                      LDRH (immediate, Thumb) on page A8-150
00   1100xx  not 1111  not 1111  Load Register Halfword                      LDRH (immediate, Thumb) on page A8-150
00   1110xx  not 1111  not 1111  Load Register Halfword Unprivileged         LDRHT on page A8-158
00   000000  not 1111  not 1111  Load Register Halfword                      LDRH (register) on page A8-156
1x   -       1111      not 1111  Load Register Signed Halfword               LDRSH (literal) on page A8-170
11   -       not 1111  not 1111  Load Register Signed Halfword               LDRSH (immediate) on page A8-168
10   1xx1xx  not 1111  not 1111  Load Register Signed Halfword               LDRSH (immediate) on page A8-168
10   1100xx  not 1111  not 1111  Load Register Signed Halfword               LDRSH (immediate) on page A8-168
10   1110xx  not 1111  not 1111  Load Register Signed Halfword Unprivileged  LDRSHT on page A8-174
10   000000  not 1111  not 1111  Load Register Signed Halfword               LDRSH (register) on page A8-172
0x   -       1111      1111      UNPREDICTABLE                               -
01   -       not 1111  1111      Preload Data with intent to Write (a)       PLD, PLDW (immediate) on page A8-236
00   1100xx  not 1111  1111      Preload Data with intent to Write (a)       PLD, PLDW (immediate) on page A8-236
00   000000  not 1111  1111      Preload Data with intent to Write (a)       PLD, PLDW (register) on page A8-240
00   1xx1xx  not 1111  1111      UNPREDICTABLE                               -
00   1110xx  not 1111  1111      UNPREDICTABLE                               -
1x   -       1111      1111      Unallocated memory hint (treat as NOP)      -
10   1100xx  not 1111  1111      Unallocated memory hint (treat as NOP)      -
10   000000  not 1111  1111      Unallocated memory hint (treat as NOP)      -
10   1xx1xx  not 1111  1111      UNPREDICTABLE                               -
10   1110xx  not 1111  1111      UNPREDICTABLE                               -
11   -       not 1111  1111      Unallocated memory hint (treat as NOP)      -
a. Available in ARMv7 with the Multiprocessing Extensions. In the ARMv7 base architecture and in ARMv6T2 these are
unallocated memory hints (treat as NOP).
A6.3.9 Load byte, memory hints
Table A6-20 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111100 op1 001 Rn Rt op2
Table A6-20 Load byte, preload
op1  op2     Rn        Rt        Instruction                             See
0x   -       1111      not 1111  Load Register Byte                      LDRB (literal) on page A8-130
01   -       not 1111  not 1111  Load Register Byte                      LDRB (immediate, Thumb) on page A8-126
00   1xx1xx  not 1111  not 1111  Load Register Byte                      LDRB (immediate, Thumb) on page A8-126
00   1100xx  not 1111  not 1111  Load Register Byte                      LDRB (immediate, Thumb) on page A8-126
00   1110xx  not 1111  not 1111  Load Register Byte Unprivileged         LDRBT on page A8-134
00   000000  not 1111  not 1111  Load Register Byte                      LDRB (register) on page A8-132
1x   -       1111      not 1111  Load Register Signed Byte               LDRSB (literal) on page A8-162
11   -       not 1111  not 1111  Load Register Signed Byte               LDRSB (immediate) on page A8-160
10   1xx1xx  not 1111  not 1111  Load Register Signed Byte               LDRSB (immediate) on page A8-160
10   1100xx  not 1111  not 1111  Load Register Signed Byte               LDRSB (immediate) on page A8-160
10   1110xx  not 1111  not 1111  Load Register Signed Byte Unprivileged  LDRSBT on page A8-166
10   000000  not 1111  not 1111  Load Register Signed Byte               LDRSB (register) on page A8-164
0x   -       1111      1111      Preload Data                            PLD (literal) on page A8-238
01   -       not 1111  1111      Preload Data                            PLD, PLDW (immediate) on page A8-236
00   1100xx  not 1111  1111      Preload Data                            PLD, PLDW (immediate) on page A8-236
00   000000  not 1111  1111      Preload Data                            PLD, PLDW (register) on page A8-240
00   1xx1xx  not 1111  1111      UNPREDICTABLE                           -
00   1110xx  not 1111  1111      UNPREDICTABLE                           -
1x   -       1111      1111      Preload Instruction                     PLI (immediate, literal) on page A8-242
11   -       not 1111  1111      Preload Instruction                     PLI (immediate, literal) on page A8-242
10   1100xx  not 1111  1111      Preload Instruction                     PLI (immediate, literal) on page A8-242
10   000000  not 1111  1111      Preload Instruction                     PLI (register) on page A8-244
10   1xx1xx  not 1111  1111      UNPREDICTABLE                           -
10   1110xx  not 1111  1111      UNPREDICTABLE                           -
A6.3.10 Store single data item
Table A6-21 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11111000 op1 0 op2
Table A6-21 Store single data item
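op1  op2     Instruction                           See
100  -       Store Register Byte                   STRB (immediate, Thumb) on page A8-388
000  1xx1xx  Store Register Byte                   STRB (immediate, Thumb) on page A8-388
000  1100xx  Store Register Byte                   STRB (immediate, Thumb) on page A8-388
000  1110xx  Store Register Byte Unprivileged      STRBT on page A8-394
000  0xxxxx  Store Register Byte                   STRB (register) on page A8-392
101  -       Store Register Halfword               STRH (immediate, Thumb) on page A8-408
001  1xx1xx  Store Register Halfword               STRH (immediate, Thumb) on page A8-408
001  1100xx  Store Register Halfword               STRH (immediate, Thumb) on page A8-408
001  1110xx  Store Register Halfword Unprivileged  STRHT on page A8-414
001  0xxxxx  Store Register Halfword               STRH (register) on page A8-412
110  -       Store Register (immediate)            STR (immediate, Thumb) on page A8-382
010  1xx1xx  Store Register (immediate)            STR (immediate, Thumb) on page A8-382
010  1100xx  Store Register (immediate)            STR (immediate, Thumb) on page A8-382
010  1110xx  Store Register Unprivileged           STRT on page A8-416
010  0xxxxx  Store Register (register)             STR (register) on page A8-386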
A6.3.11 Data-processing (shifted register)
Table A6-22 shows the allocation of encodings in this space.
Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1110101 op S Rn Rd
Table A6-22 Data-processing (shifted register)
op    Rn        Rd        S  Instruction           See
0000  -         not 1111  x  Bitwise AND           AND (register) on page A8-36
0000  -         1111      0  UNPREDICTABLE         -
0000  -         1111      1  Test                  TST (register) on page A8-456
0001  -         -         -  Bitwise Bit Clear     BIC (register) on page A8-52
0010  not 1111  -         -  Bitwise OR            ORR (register) on page A8-230
0010  1111      -         -  Move                  MOV (register) on page A8-196
0011  not 1111  -         -  Bitwise OR NOT        ORN (register) on page A8-226
0011  1111      -         -  Bitwise NOT           MVN (register) on page A8-216
0100  -         not 1111  -  Bitwise Exclusive OR  EOR (register) on page A8-96
0100  -         1111      0  UNPREDICTABLE         -
0100  -         1111      1  Test Equivalence      TEQ (register) on page A8-450
0110  -         -         -  Pack Halfword         PKH on page A8-234
1000  -         not 1111  -  Add                   ADD (register) on page A8-24
1000  -         1111      0  UNPREDICTABLE         -
1000  -         1111      1  Compare Negative      CMN (register) on page A8-76
1010  -         -         -  Add with Carry        ADC (register) on page A8-16
1011  -         -         -  Subtract with Carry   SBC (register) on page A8-304
1101  -         not 1111  -  Subtract              SUB (register) on page A8-422
1101  -         1111      0  UNPREDICTABLE         -
1101  -         1111      1  Compare               CMP (register) on page A8-82
1110  -         -         -  Reverse Subtract      RSB (register) on page A8-286
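The special cases in Table A6-22 follow two patterns: four operations become a flag-setting test when Rd is 1111, and two become a move when Rn is 1111. A minimal Python sketch of that lookup, taking the already-extracted field values as arguments, is shown here. The function name and the grouping into three dictionaries are illustrative choices, not part of the architecture.

```python
def decode_dp_shifted_register(op: int, rn: int, rd: int, s: int) -> str:
    """Return the Table A6-22 entry selected by the op, Rn, Rd and S fields."""
    # op -> (normal instruction, instruction when Rd == 1111 and S == 1)
    rd_special = {
        0b0000: ("AND", "TST"),
        0b0100: ("EOR", "TEQ"),
        0b1000: ("ADD", "CMN"),
        0b1101: ("SUB", "CMP"),
    }
    # op -> (instruction when Rn != 1111, instruction when Rn == 1111)
    rn_special = {
        0b0010: ("ORR", "MOV"),
        0b0011: ("ORN", "MVN"),
    }
    # Rows where Rn, Rd and S do not affect the decoding.
    plain = {
        0b0001: "BIC",
        0b0110: "PKH",
        0b1010: "ADC",
        0b1011: "SBC",
        0b1110: "RSB",
    }

    if op in rd_special:
        normal, test = rd_special[op]
        if rd != 0b1111:
            return normal
        # Rd == 1111: only the flag-setting form is allocated.
        return test if s == 1 else "UNPREDICTABLE"
    if op in rn_special:
        normal, move = rn_special[op]
        return move if rn == 0b1111 else normal
    # Any op value not listed in the table is UNDEFINED.
    return plain.get(op, "UNDEFINED")
```

For example, decode_dp_shifted_register(0b1101, 0, 0b1111, 1) selects the CMP row, while the same fields with S == 0 select the UNPREDICTABLE row.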
A6.3.12 Data-processing (register)
If, in the second halfword of the instruction, bits [15:12] != 0b1111, the instruction is UNDEFINED.
Table A6-23 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11111010 op1 Rn 1111 op2
Table A6-23 Data-processing (register)
op1   op2   Rn        Instruction                       See
000x  0000  -         Logical Shift Left                LSL (register) on page A8-180
001x  0000  -         Logical Shift Right               LSR (register) on page A8-184
010x  0000  -         Arithmetic Shift Right            ASR (register) on page A8-42
011x  0000  -         Rotate Right                      ROR (register) on page A8-280
0000  1xxx  not 1111  Signed Extend and Add Halfword    SXTAH on page A8-438
0000  1xxx  1111      Signed Extend Halfword            SXTH on page A8-444
0001  1xxx  not 1111  Unsigned Extend and Add Halfword  UXTAH on page A8-518
0001  1xxx  1111      Unsigned Extend Halfword          UXTH on page A8-524
0010  1xxx  not 1111  Signed Extend and Add Byte 16     SXTAB16 on page A8-436
0010  1xxx  1111      Signed Extend Byte 16             SXTB16 on page A8-442
0011  1xxx  not 1111  Unsigned Extend and Add Byte 16   UXTAB16 on page A8-516
0011  1xxx  1111      Unsigned Extend Byte 16           UXTB16 on page A8-522
0100  1xxx  not 1111  Signed Extend and Add Byte        SXTAB on page A8-434
0100  1xxx  1111      Signed Extend Byte                SXTB on page A8-440
0101  1xxx  not 1111  Unsigned Extend and Add Byte      UXTAB on page A8-514
0101  1xxx  1111      Unsigned Extend Byte              UXTB on page A8-520
1xxx  00xx  -         -                                 Parallel addition and subtraction, signed on page A6-35
1xxx  01xx  -         -                                 Parallel addition and subtraction, unsigned on page A6-36
10xx  10xx  -         -                                 Miscellaneous operations on page A6-37
A6.3.13 Parallel addition and subtraction, signed
If, in the second halfword of the instruction, bits [15:12] != 0b1111, the instruction is UNDEFINED.
Table A6-24 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110101 op1 1111 00 op2
Table A6-24 Signed parallel addition and subtraction instructions
op1 op2 Instruction See
001 00 Add 16-bit SADD16 on page A8-296
010 00 Add, Subtract SASX on page A8-300
110 00 Subtract, Add SSAX on page A8-366
101 00 Subtract 16-bit SSUB16 on page A8-368
000 00 Add 8-bit SADD8 on page A8-298
100 00 Subtract 8-bit SSUB8 on page A8-370
Saturating instructions
001 01 Saturating Add 16-bit QADD16 on page A8-252
010 01 Saturating Add, Subtract QASX on page A8-256
110 01 Saturating Subtract, Add QSAX on page A8-262
101 01 Saturating Subtract 16-bit QSUB16 on page A8-266
000 01 Saturating Add 8-bit QADD8 on page A8-254
100 01 Saturating Subtract 8-bit QSUB8 on page A8-268
Halving instructions
001 10 Halving Add 16-bit SHADD16 on page A8-318
010 10 Halving Add, Subtract SHASX on page A8-322
110 10 Halving Subtract, Add SHSAX on page A8-324
101 10 Halving Subtract 16-bit SHSUB16 on page A8-326
000 10 Halving Add 8-bit SHADD8 on page A8-320
100 10 Halving Subtract 8-bit SHSUB8 on page A8-328
A6.3.14 Parallel addition and subtraction, unsigned
If, in the second halfword of the instruction, bits [15:12] != 0b1111, the instruction is UNDEFINED.
Table A6-25 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110101 op1 1111 01 op2
Table A6-25 Unsigned parallel addition and subtraction instructions
op1 op2 Instruction See
001 00 Add 16-bit UADD16 on page A8-460
010 00 Add, Subtract UASX on page A8-464
110 00 Subtract, Add USAX on page A8-508
101 00 Subtract 16-bit USUB16 on page A8-510
000 00 Add 8-bit UADD8 on page A8-462
100 00 Subtract 8-bit USUB8 on page A8-512
Saturating instructions
001 01 Saturating Add 16-bit UQADD16 on page A8-488
010 01 Saturating Add, Subtract UQASX on page A8-492
110 01 Saturating Subtract, Add UQSAX on page A8-494
101 01 Saturating Subtract 16-bit UQSUB16 on page A8-496
000 01 Saturating Add 8-bit UQADD8 on page A8-490
100 01 Saturating Subtract 8-bit UQSUB8 on page A8-498
Halving instructions
001 10 Halving Add 16-bit UHADD16 on page A8-470
010 10 Halving Add, Subtract UHASX on page A8-474
110 10 Halving Subtract, Add UHSAX on page A8-476
101 10 Halving Subtract 16-bit UHSUB16 on page A8-478
000 10 Halving Add 8-bit UHADD8 on page A8-472
100 10 Halving Subtract 8-bit UHSUB8 on page A8-480
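Tables A6-24 and A6-25 are regular: op1 selects the operation, op2 selects the plain, saturating, or halving variant, and the signed and unsigned tables differ only in the mnemonic prefix. The following Python sketch rebuilds the mnemonic from those fields. The function name and the boolean unsigned parameter are hypothetical conveniences; the sketch takes field values rather than a raw instruction word.

```python
# (unsigned?, op2) -> mnemonic prefix, from the three groups in each table.
PREFIX = {
    (False, 0b00): "S",  (False, 0b01): "Q",  (False, 0b10): "SH",
    (True, 0b00): "U",   (True, 0b01): "UQ",  (True, 0b10): "UH",
}

# op1 -> operation suffix, identical in Tables A6-24 and A6-25.
OPERATION = {
    0b001: "ADD16",
    0b010: "ASX",
    0b110: "SAX",
    0b101: "SUB16",
    0b000: "ADD8",
    0b100: "SUB8",
}


def parallel_add_sub_mnemonic(op1: int, op2: int, unsigned: bool) -> str:
    """Return the mnemonic for a parallel addition or subtraction encoding."""
    try:
        return PREFIX[(unsigned, op2)] + OPERATION[op1]
    except KeyError:
        # Combinations not listed in the tables are UNDEFINED.
        return "UNDEFINED"
```

For instance, op1 = 110 with op2 = 10 gives SHSAX from the signed table and UHSAX from the unsigned table.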
A6.3.15 Miscellaneous operations
If, in the second halfword of the instruction, bits [15:12] != 0b1111, the instruction is UNDEFINED.
Table A6-26 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111101010 op1 1111 10 op2
Table A6-26 Miscellaneous operations
op1  op2  Instruction                     See
00   00   Saturating Add                  QADD on page A8-250
00   01   Saturating Double and Add       QDADD on page A8-258
00   10   Saturating Subtract             QSUB on page A8-264
00   11   Saturating Double and Subtract  QDSUB on page A8-260
01   00   Byte-Reverse Word               REV on page A8-272
01   01   Byte-Reverse Packed Halfword    REV16 on page A8-274
01   10   Reverse Bits                    RBIT on page A8-270
01   11   Byte-Reverse Signed Halfword    REVSH on page A8-276
10   00   Select Bytes                    SEL on page A8-312
11   00   Count Leading Zeros             CLZ on page A8-72
A6.3.16 Multiply, multiply accumulate, and absolute difference
If, in the second halfword of the instruction, bits [7:6] != 0b00, the instruction is UNDEFINED.
Table A6-27 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
These encodings are all available in ARMv6T2 and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110110 op1 Ra 00 op2
Table A6-27 Multiply, multiply accumulate, and absolute difference operations
op1  op2  Ra        Instruction                                       See
000  00   not 1111  Multiply Accumulate                               MLA on page A8-190
000  00   1111      Multiply                                          MUL on page A8-212
000  01   -         Multiply and Subtract                             MLS on page A8-192
001  -    not 1111  Signed Multiply Accumulate (Halfwords)            SMLABB, SMLABT, SMLATB, SMLATT on page A8-330
001  -    1111      Signed Multiply (Halfwords)                       SMULBB, SMULBT, SMULTB, SMULTT on page A8-354
010  0x   not 1111  Signed Multiply Accumulate Dual                   SMLAD on page A8-332
010  0x   1111      Signed Dual Multiply Add                          SMUAD on page A8-352
011  0x   not 1111  Signed Multiply Accumulate (Word by halfword)     SMLAWB, SMLAWT on page A8-340
011  0x   1111      Signed Multiply (Word by halfword)                SMULWB, SMULWT on page A8-358
100  0x   not 1111  Signed Multiply Subtract Dual                     SMLSD on page A8-342
100  0x   1111      Signed Dual Multiply Subtract                     SMUSD on page A8-360
101  0x   not 1111  Signed Most Significant Word Multiply Accumulate  SMMLA on page A8-346
101  0x   1111      Signed Most Significant Word Multiply             SMMUL on page A8-350
110  0x   -         Signed Most Significant Word Multiply Subtract    SMMLS on page A8-348
111  00   not 1111  Unsigned Sum of Absolute Differences, Accumulate  USADA8 on page A8-502
111  00   1111      Unsigned Sum of Absolute Differences              USAD8 on page A8-500
A6.3.17 Long multiply, long multiply accumulate, and divide
Table A6-28 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110111 op1 op2
Table A6-28 Long multiply, long multiply accumulate, and divide operations
op1  op2   Instruction                                  See                                              Variant
000  0000  Signed Multiply Long                         SMULL on page A8-356                             v6T2
001  1111  Signed Divide                                SDIV on page A8-310                              v7-R (a)
010  0000  Unsigned Multiply Long                       UMULL on page A8-486                             v6T2
011  1111  Unsigned Divide                              UDIV on page A8-468                              v7-R (a)
100  0000  Signed Multiply Accumulate Long              SMLAL on page A8-334                             v6T2
100  10xx  Signed Multiply Accumulate Long (Halfwords)  SMLALBB, SMLALBT, SMLALTB, SMLALTT on page A8-336  v6T2
100  110x  Signed Multiply Accumulate Long Dual         SMLALD on page A8-338                            v6T2
101  110x  Signed Multiply Subtract Long Dual           SMLSLD on page A8-344                            v6T2
110  0000  Unsigned Multiply Accumulate Long            UMLAL on page A8-484                             v6T2
110  0110  Unsigned Multiply Accumulate Accumulate Long UMAAL on page A8-482                             v6T2
a. UNDEFINED in ARMv7-A.
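The bit patterns in Table A6-28 use x for a bit whose value does not affect the decoding. One way to evaluate such a table in software is to compare each field against its pattern character by character, as in the following Python sketch. The helper names and the profile parameter are hypothetical; the profile check only models the footnote that SDIV and UDIV are UNDEFINED in ARMv7-A.

```python
def matches(pattern: str, value: int) -> bool:
    """True if value fits a bit pattern such as '10xx' ('x' is don't-care)."""
    width = len(pattern)
    if not 0 <= value < (1 << width):
        return False
    bits = format(value, f"0{width}b")
    return all(p == "x" or p == b for p, b in zip(pattern, bits))


# (op1 pattern, op2 pattern, instruction), in the order of Table A6-28.
TABLE_A6_28 = [
    ("000", "0000", "SMULL"),
    ("001", "1111", "SDIV"),
    ("010", "0000", "UMULL"),
    ("011", "1111", "UDIV"),
    ("100", "0000", "SMLAL"),
    ("100", "10xx", "SMLALBB, SMLALBT, SMLALTB, SMLALTT"),
    ("100", "110x", "SMLALD"),
    ("101", "110x", "SMLSLD"),
    ("110", "0000", "UMLAL"),
    ("110", "0110", "UMAAL"),
]


def decode_long_multiply(op1: int, op2: int, profile: str = "R") -> str:
    """Look up op1 and op2 in Table A6-28; profile is 'A' or 'R'."""
    for op1_pattern, op2_pattern, name in TABLE_A6_28:
        if matches(op1_pattern, op1) and matches(op2_pattern, op2):
            # Footnote a: the divide instructions are UNDEFINED in ARMv7-A.
            if name in ("SDIV", "UDIV") and profile == "A":
                return "UNDEFINED"
            return name
    return "UNDEFINED"
```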
A6.3.18 Coprocessor instructions
Table A6-29 shows the allocation of encodings in this space. These encodings are all available in ARMv6T2
and above.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 op1 Rn coproc op
Table A6-29 Coprocessor instructions
op1                     op  coproc    Rn        Instructions                                     See
000x1x, 001xxx, 01xxxx  -   101x      -         Advanced SIMD, VFP                               Extension register load/store instructions on page A7-26
000x10, 001xx0, 01xxx0  -   not 101x  -         Store Coprocessor                                STC, STC2 on page A8-372
000x11, 001xx1, 01xxx1  -   not 101x  not 1111  Load Coprocessor (immediate)                     LDC, LDC2 (immediate) on page A8-106
000x11, 001xx1, 01xxx1  -   not 101x  1111      Load Coprocessor (literal)                       LDC, LDC2 (literal) on page A8-108
00000x                  -   -         -         UNDEFINED                                        -
00010x                  -   101x      -         Advanced SIMD, VFP                               64-bit transfers between ARM core and extension registers on page A7-32
000100                  -   not 101x  -         Move to Coprocessor from two ARM core registers  MCRR, MCRR2 on page A8-188
000101                  -   not 101x  -         Move to two ARM core registers from Coprocessor  MRRC, MRRC2 on page A8-204
10xxxx                  0   101x      -         VFP                                              VFP data-processing instructions on page A7-24
10xxxx                  0   not 101x  -         Coprocessor data operations                      CDP, CDP2 on page A8-68
10xxxx                  1   101x      -         Advanced SIMD, VFP                               8, 16, and 32-bit transfer between ARM core and extension registers on page A7-31
10xxx0                  1   not 101x  -         Move to Coprocessor from ARM core register       MCR, MCR2 on page A8-186
10xxx1                  1   not 101x  -         Move to ARM core register from Coprocessor       MRC, MRC2 on page A8-202
11xxxx                  -   -         -         Advanced SIMD                                    Advanced SIMD data-processing instructions on page A7-10
For more information about specific coprocessors see Coprocessor support on page A2-68.
Chapter A7
Advanced SIMD and VFP Instruction Encoding
This chapter gives an overview of the Advanced SIMD and VFP instruction sets. It contains the following
sections:
Overview on page A7-2
Advanced SIMD and VFP instruction syntax on page A7-3
Register encoding on page A7-8
Advanced SIMD data-processing instructions on page A7-10
VFP data-processing instructions on page A7-24
Extension register load/store instructions on page A7-26
Advanced SIMD element or structure load/store instructions on page A7-27
8, 16, and 32-bit transfer between ARM core and extension registers on page A7-31
64-bit transfers between ARM core and extension registers on page A7-32.
Note
The Advanced SIMD architecture extension, its associated implementations, and supporting
software, are commonly referred to as NEON technology.
In the decode tables in this chapter, an entry of - for a field value means the value of the field does
not affect the decoding.
A7.1 Overview
All Advanced SIMD and VFP instructions are available in both ARM state and Thumb state.
A7.1.1 Advanced SIMD
The following sections describe the classes of instruction in the Advanced SIMD extension:
Advanced SIMD data-processing instructions on page A7-10
Advanced SIMD element or structure load/store instructions on page A7-27
Extension register load/store instructions on page A7-26
8, 16, and 32-bit transfer between ARM core and extension registers on page A7-31
64-bit transfers between ARM core and extension registers on page A7-32.
A7.1.2 VFP
The following sections describe the classes of instruction in the VFP extension:
Extension register load/store instructions on page A7-26
8, 16, and 32-bit transfer between ARM core and extension registers on page A7-31
64-bit transfers between ARM core and extension registers on page A7-32
VFP data-processing instructions on page A7-24.
A7.2 Advanced SIMD and VFP instruction syntax
Advanced SIMD and VFP instructions use the general conventions of the ARM instruction set.
Advanced SIMD and VFP data-processing instructions use the following general format:
V{<modifier>}<operation>{<shape>}<c><q>{.<dt>} {<dest>,} <src1>, <src2>
All Advanced SIMD and VFP instructions begin with a V. This distinguishes Advanced SIMD vector and
VFP instructions from ARM scalar instructions.
The main operation is specified in the <operation> field. It is usually a three letter mnemonic the same as or
similar to the corresponding scalar integer instruction.
The <c> and <q> fields are standard assembler syntax fields. For details see Standard assembler syntax fields
on page A8-7.
A7.2.1 Advanced SIMD Instruction modifiers
The <modifier> field provides additional variants of some instructions. Table A7-1 provides definitions of
the modifiers. Modifiers are not available for every instruction.
Table A7-1 Advanced SIMD instruction modifiers
<modifier> Meaning
Q The operation uses saturating arithmetic.
R The operation performs rounding.
D The operation doubles the result (before accumulation, if any).
H The operation halves the result.
A7.2.2 Advanced SIMD Operand shapes
The <shape> field provides additional variants of some instructions. Table A7-2 provides definitions of the
shapes. Operand shapes are not available for every instruction.
A7.2.3 Data type specifiers
The <dt> field normally contains one data type specifier. This indicates the data type contained in:
the second operand, if any
the operand, if there is no second operand
the result, if there are no operand registers.
The data types of the other operand and result are implied by the <dt> field combined with the instruction
shape. For information about data type formats see Data types supported by the Advanced SIMD extension
on page A2-25.
In the instruction syntax descriptions in Chapter A8 Instruction Details, the <dt> field is usually specified
as a single field. However, where more convenient, it is sometimes specified as a concatenation of two fields,
<type><size>.
Table A7-2 Advanced SIMD operand shapes
<shape>  Meaning                                                                               Typical register shape
(none)   The operands and result are all the same width.                                       Dd, Dn, Dm or Qd, Qn, Qm
L        Long operation - result is twice the width of both operands                           Qd, Dn, Dm
N        Narrow operation - result is half the width of both operands                          Dd, Qn, Qm
W        Wide operation - result and first operand are twice the width of the second operand   Qd, Qn, Dm
Syntax flexibility
There is some flexibility in the data type specifier syntax:
You can specify three data types, specifying the result and both operand data types. For example:
VSUBW.I16.I16.S8 Q3,Q5,D0
instead of:
VSUBW.S8 Q3,Q5,D0
You can specify two data types, specifying the data types of the two operands. The data type of the
result is implied by the instruction shape.
You can specify two data types, specifying the data types of the single operand and the result.
Where an instruction requires a less specific data type, you can instead specify a more specific type,
as shown in Table A7-3.
Where an instruction does not require a data type, you can provide one.
The F32 data type can be abbreviated to F.
The F64 data type can be abbreviated to D.
In all cases, if you provide additional information, the additional information must match the instruction
shape. Disassembly does not regenerate this additional information.
Table A7-3 Data type specification flexibility
Specified data type  Permitted more specific data types
None                 Any
.I<size>             -     .S<size>  .U<size>  -     -
.8                   .I8   .S8       .U8       .P8   -
.16                  .I16  .S16      .U16      .P16  .F16
.32                  .I32  .S32      .U32      -     .F32 or .F
.64                  .I64  .S64      .U64      -     .F64 or .D
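The substitution rule in Table A7-3 can be expressed as a small lookup. The Python sketch below is an illustration of the table only, not of any particular assembler; the function name is hypothetical, and a required type of None stands for the table row "None".

```python
import re

# Rows of Table A7-3 for the size-only specifiers.
PERMITTED = {
    ".8": {".I8", ".S8", ".U8", ".P8"},
    ".16": {".I16", ".S16", ".U16", ".P16", ".F16"},
    ".32": {".I32", ".S32", ".U32", ".F32", ".F"},
    ".64": {".I64", ".S64", ".U64", ".F64", ".D"},
}


def is_permitted(required, given: str) -> bool:
    """True if 'given' may be written where 'required' is the less specific type."""
    if required is None:
        # Row "None": any data type can be provided.
        return True
    if given == required:
        return True
    integer = re.fullmatch(r"\.I(\d+)", required)
    if integer:
        # Row ".I<size>": the signed or unsigned type of the same size.
        size = integer.group(1)
        return given in {".S" + size, ".U" + size}
    return given in PERMITTED.get(required, set())
```

For example, is_permitted(".I16", ".S16") holds, while is_permitted(".I16", ".F16") does not, because the .I<size> row lists only the signed and unsigned forms.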
A7.2.4 Register specifiers
The <dest>, <src1>, and <src2> fields contain register specifiers, or in some cases scalar specifiers or register
lists. Table A7-4 shows the register and scalar specifier formats that appear in the instruction descriptions.
If <dest> is omitted, it is the same as <src1>.
Table A7-4 Advanced SIMD and VFP register specifier formats
<specifier>  Usual meaning (a)
<Qd>         A quadword destination register for the result vector (Advanced SIMD only).
<Qn>         A quadword source register for the first operand vector (Advanced SIMD only).
<Qm>         A quadword source register for the second operand vector (Advanced SIMD only).
<Dd>         A doubleword destination register for the result vector.
<Dn>         A doubleword source register for the first operand vector.
<Dm>         A doubleword source register for the second operand vector.
<Sd>         A singleword destination register for the result vector (VFP only).
<Sn>         A singleword source register for the first operand vector (VFP only).
<Sm>         A singleword source register for the second operand vector (VFP only).
<Dd[x]>      A destination scalar for the result. Element x of vector <Dd> (Advanced SIMD only).
<Dn[x]>      A source scalar for the first operand. Element x of vector <Dn> (Advanced SIMD only).
<Dm[x]>      A source scalar for the second operand. Element x of vector <Dm> (Advanced SIMD only).
<Rd>         An ARM core register. Can be source or destination.
<Rm>         An ARM core register. Can be source or destination.
a. In some instructions the roles of registers are different.
A7.2.5 Register lists
A register list is a list of register specifiers separated by commas and enclosed in brackets { and }. There are
restrictions on what registers can appear in a register list. These restrictions are described in the individual
instruction descriptions. Table A7-5 shows some register list formats, with examples of actual register lists
corresponding to those formats.
Note
Register lists must not wrap around the end of the register bank.
Syntax flexibility
There is some flexibility in the register list syntax:
Where a register list contains consecutive registers, they can be specified as a range, instead of listing
every register, for example {D0-D3} instead of {D0,D1,D2,D3}.
Where a register list contains an even number of consecutive doubleword registers starting with an
even numbered register, it can be written as a list of quadword registers instead, for example
{Q1,Q2} instead of {D2-D5}.
Where a register list contains only one register, the enclosing braces can be omitted, for example
VLD1.8 D0,[R0] instead of VLD1.8 {D0},[R0].
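The equivalences above can be checked by expanding each form to an explicit list of doubleword registers. The Python sketch below does this for plain D and Q registers only; it does not handle scalar forms such as D0[3], and it does not enforce the per-instruction restrictions mentioned earlier. The function name is hypothetical, and the mapping of quadword register Qn to the pair D(2n), D(2n+1) is taken from the {Q1,Q2} and {D2-D5} example.

```python
import re


def expand_register_list(text: str) -> list:
    """Expand a register list such as '{D0-D3}' or '{Q1,Q2}' to D register names."""
    body = text.strip()
    # The enclosing braces can be omitted for a single register.
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]

    registers = []
    for item in body.split(","):
        m = re.fullmatch(r"\s*([DQ])(\d+)(?:\s*-\s*([DQ])(\d+))?\s*", item)
        if m is None:
            raise ValueError(f"unsupported register specifier: {item!r}")
        kind = m.group(1)
        first = int(m.group(2))
        last = int(m.group(4)) if m.group(4) is not None else first
        if m.group(3) not in (None, kind) or last < first:
            raise ValueError(f"malformed range: {item!r}")
        for n in range(first, last + 1):
            if kind == "Q":
                # A quadword register covers two consecutive doubleword registers.
                registers.extend([f"D{2 * n}", f"D{2 * n + 1}"])
            else:
                registers.append(f"D{n}")
    return registers
```

With this sketch, expand_register_list("{Q1,Q2}") and expand_register_list("{D2-D5}") produce the same list.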
Table A7-5 Example register lists
Format                 Example        Alternative
{<Dd>}                 {D3}           D3
{<Dd>,<Dd+1>,<Dd+2>}   {D3,D4,D5}     {D3-D5}
{<Dd[x]>,<Dd+2[x]>}    {D0[3],D2[3]}  -
{<Dd[]>}               {D7[]}         D7[]
A7.3 Register encoding
Advanced SIMD registers are either quadword (128 bits wide) or doubleword (64 bits wide). Some
instructions have options for either doubleword or quadword registers. This is normally encoded in Q
(bit [6]) as Q = 0 for doubleword operations, Q = 1 for quadword operations.
VFP registers are either double-precision (64 bits wide) or single-precision (32 bits wide). This is encoded
in the sz field (bit [8]) as sz = 1 for double-precision operations, or sz = 0 for single-precision operations.
Some instructions use only one or two registers, and use the unused register fields as additional opcode bits.
Table A7-6 shows the encodings for the registers.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
D Vn Vd sz N Q M Vm
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
D Vn Vd sz N Q M Vm
Table A7-6 Encoding of register numbers
Register mnemonic  Usual usage                        Register number encoded in  Notes (a)      Used in
<Qd>               Destination (quadword)             D, Vd (bits [22,15:13])     bit [12] == 0  Adv. SIMD
<Qn>               First operand (quadword)           N, Vn (bits [7,19:17])      bit [16] == 0  Adv. SIMD
<Qm>               Second operand (quadword)          M, Vm (bits [5,3:1])        bit [0] == 0   Adv. SIMD
<Dd>               Destination (doubleword)           D, Vd (bits [22,15:12])     -              Both
<Dn>               First operand (doubleword)         N, Vn (bits [7,19:16])      -              Both
<Dm>               Second operand (doubleword)        M, Vm (bits [5,3:0])        -              Both
<Sd>               Destination (single-precision)     Vd, D (bits [15:12,22])     -              VFP
<Sn>               First operand (single-precision)   Vn, N (bits [19:16,7])      -              VFP
<Sm>               Second operand (single-precision)  Vm, M (bits [3:0,5])        -              VFP
a. If one of these bits is 1, the instruction is UNDEFINED.
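The destination rows of Table A7-6 differ only in the order in which D and Vd are concatenated. The following Python sketch extracts the destination register number from a 32-bit instruction word, using the bit numbers given in the table. The helper names and the kind argument ("Q", "D" or "S") are hypothetical; which kind applies to a given instruction comes from its individual description.

```python
def bit(word: int, n: int) -> int:
    """Return bit n of word."""
    return (word >> n) & 1


def field(word: int, hi: int, lo: int) -> int:
    """Return bits [hi:lo] of word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def destination_register(word: int, kind: str) -> int:
    """Decode the destination register number as in Table A7-6."""
    d = bit(word, 22)
    vd = field(word, 15, 12)
    if kind == "D":
        # <Dd>: D is the most significant bit, followed by Vd.
        return (d << 4) | vd
    if kind == "S":
        # <Sd>: Vd first, with D as the least significant bit.
        return (vd << 1) | d
    if kind == "Q":
        # <Qd>: D followed by Vd[3:1]; bit [12] must be 0.
        if vd & 1:
            raise ValueError("UNDEFINED: bit [12] must be 0 for a quadword register")
        return (d << 3) | (vd >> 1)
    raise ValueError("kind must be 'Q', 'D' or 'S'")
```

With D == 1 and Vd == 0b0010, the same bits read as D18, S5, or Q9 depending on the register kind.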
A7.3.1 Advanced SIMD scalars
Advanced SIMD scalars can be 8-bit, 16-bit, 32-bit, or 64-bit. Instructions other than multiply instructions
can access any element in the register set. The instruction syntax refers to the scalars using an index into a
doubleword vector. The descriptions of the individual instructions contain details of the encodings.
Table A7-7 shows the form of encoding for scalars used in multiply instructions. These instructions cannot
access scalars in some registers. The descriptions of the individual instructions contain cross references to
this section where appropriate.
32-bit Advanced SIMD scalars, when used as single-precision floating-point numbers, are equivalent to
VFP single-precision registers. That is, Dm[x] in a 32-bit context (0 <= m <= 15, 0 <= x <= 1) is equivalent to
S[2m + x].
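The mapping between a 32-bit scalar and a single-precision register is simple arithmetic, shown here as a short Python sketch. The function name is hypothetical; the range checks restate the limits given in the text.

```python
def scalar_to_s_register(m: int, x: int) -> int:
    """Return n such that Dm[x], in a 32-bit context, is equivalent to Sn."""
    if not 0 <= m <= 15:
        raise ValueError("m must be in the range 0 to 15")
    if not 0 <= x <= 1:
        raise ValueError("x must be 0 or 1")
    return 2 * m + x
```

For example, D3[1] corresponds to S7, because 2 * 3 + 1 = 7.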
Table A7-7 Encoding of scalars in multiply instructions
Scalar mnemonic  Usual usage     Scalar size  Register specifier  Index specifier  Accessible registers
<Dm[x]>          Second operand  16-bit       Vm[2:0]             M, Vm[3]         D0-D7
<Dm[x]>          Second operand  32-bit       Vm[3:0]             M                D0-D15
A7.4 Advanced SIMD data-processing instructions
Table A7-8 shows the encoding for Advanced SIMD data-processing instructions. Other encodings in this
space are UNDEFINED.
In these instructions, the U bit is in a different location in ARM and Thumb instructions. This is bit [12] of
the first halfword in the Thumb encoding, and bit [24] in the ARM encoding. Other variable bits are in
identical locations in the two encodings, after adjusting for the fact that the ARM encoding is held in
memory as a single word and the Thumb encoding is held as two consecutive halfwords.
The ARM instructions can only be executed unconditionally. The Thumb instructions can be executed
conditionally by using the IT instruction. For details see IT on page A8-104.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111U1111 A B C
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001U A B C
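Because only the U bit moves between the two encodings, a Thumb Advanced SIMD data-processing instruction can be rewritten as the corresponding ARM instruction word with a few bit operations. The Python sketch below illustrates this under the assumption that the fixed bits are exactly those shown in the two diagrams (111U1111 for Thumb, 1111001U for ARM). The function name is hypothetical.

```python
def thumb_simd_dp_to_arm(hw1: int, hw2: int) -> int:
    """Convert a Thumb Advanced SIMD data-processing encoding to the ARM encoding.

    hw1 and hw2 are the first and second Thumb halfwords.
    """
    # Check the fixed Thumb bits 111x1111 in hw1[15:8], ignoring U (bit [12]).
    if (hw1 & 0xEF00) != 0xEF00:
        raise ValueError("not an Advanced SIMD data-processing encoding")

    u = (hw1 >> 12) & 1   # U is bit [12] of the first halfword in Thumb
    return (
        0xF2000000            # fixed ARM bits 1111001 in [31:25]
        | (u << 24)           # U is bit [24] in the ARM encoding
        | ((hw1 & 0x00FF) << 16)  # remaining first-halfword bits map to [23:16]
        | (hw2 & 0xFFFF)      # second halfword maps to [15:0] unchanged
    )
```

For example, the halfword pair 0xEF20, 0x0842 becomes the ARM word 0xF2200842, and 0xFF00, 0x0110 becomes 0xF3000110.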
Table A7-8 Data-processing instructions
U  A      B     C     See
-  0xxxx  -     -     Three registers of the same length on page A7-12
-  1x000  -     0xx1  One register and a modified immediate value on page A7-21
-  1x001  -     0xx1  Two registers and a shift amount on page A7-17
-  1x01x  -     0xx1  Two registers and a shift amount on page A7-17
-  1x1xx  -     0xx1  Two registers and a shift amount on page A7-17
-  1xxxx  -     1xx1  Two registers and a shift amount on page A7-17
-  1x0xx  -     x0x0  Three registers of different lengths on page A7-15
-  1x10x  -     x0x0  Three registers of different lengths on page A7-15
-  1x0xx  -     x1x0  Two registers and a scalar on page A7-16
-  1x10x  -     x1x0  Two registers and a scalar on page A7-16
0  1x11x  -     xxx0  Vector Extract, VEXT on page A8-598
1  1x11x  0xxx  xxx0  Two registers, miscellaneous on page A7-19
1  1x11x  10xx  xxx0  Vector Table Lookup, VTBL, VTBX on page A8-798
1  1x11x  1100  0xx0  Vector Duplicate, VDUP (scalar) on page A8-592
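A decoder can evaluate Table A7-8 by converting each bit pattern to a mask and a value, then testing every field with a single AND and compare. The Python sketch below shows this approach on already-extracted U, A, B and C field values. All names are hypothetical, and the row labels are shortened from the See column.

```python
def compile_pattern(pattern: str):
    """Turn a pattern such as '1x11x' (or '-') into a (mask, value) pair."""
    if pattern == "-":
        return 0, 0   # the field does not affect the decoding
    mask = int("".join("0" if ch == "x" else "1" for ch in pattern), 2)
    value = int(pattern.replace("x", "0"), 2)
    return mask, value


# (U, A, B, C, class), in the order of Table A7-8.
ROWS = [
    ("-", "0xxxx", "-", "-", "Three registers of the same length"),
    ("-", "1x000", "-", "0xx1", "One register and a modified immediate value"),
    ("-", "1x001", "-", "0xx1", "Two registers and a shift amount"),
    ("-", "1x01x", "-", "0xx1", "Two registers and a shift amount"),
    ("-", "1x1xx", "-", "0xx1", "Two registers and a shift amount"),
    ("-", "1xxxx", "-", "1xx1", "Two registers and a shift amount"),
    ("-", "1x0xx", "-", "x0x0", "Three registers of different lengths"),
    ("-", "1x10x", "-", "x0x0", "Three registers of different lengths"),
    ("-", "1x0xx", "-", "x1x0", "Two registers and a scalar"),
    ("-", "1x10x", "-", "x1x0", "Two registers and a scalar"),
    ("0", "1x11x", "-", "xxx0", "VEXT"),
    ("1", "1x11x", "0xxx", "xxx0", "Two registers, miscellaneous"),
    ("1", "1x11x", "10xx", "xxx0", "VTBL, VTBX"),
    ("1", "1x11x", "1100", "0xx0", "VDUP (scalar)"),
]

COMPILED = [(tuple(compile_pattern(p) for p in row[:4]), row[4]) for row in ROWS]


def classify_simd_dp(u: int, a: int, b: int, c: int) -> str:
    """Return the Table A7-8 class for the given field values."""
    fields = (u, a, b, c)
    for patterns, name in COMPILED:
        if all((f & mask) == value for f, (mask, value) in zip(fields, patterns)):
            return name
    # "Other encodings in this space are UNDEFINED."
    return "UNDEFINED"
```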
A7.4.1 Three registers of the same length
Table A7-9 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111U11110 C A B
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001U0 C A B
Table A7-9 Three registers of the same length
A     B  U  C   Instruction  See
0000  0  -  -   Vector Halving Add  VHADD, VHSUB on page A8-600
0000  1  -  -   Vector Saturating Add  VQADD on page A8-700
0001  0  -  -   Vector Rounding Halving Add  VRHADD on page A8-734
0001  1  0  00  Vector Bitwise AND  VAND (register) on page A8-544
0001  1  0  01  Vector Bitwise Bit Clear (AND complement)  VBIC (register) on page A8-548
0001  1  0  10  Vector Bitwise OR (if source registers differ)  VORR (register) on page A8-680
0001  1  0  10  Vector Move (if source registers identical)  VMOV (register) on page A8-642
0001  1  0  11  Vector Bitwise OR NOT  VORN (register) on page A8-676
0001  1  1  00  Vector Bitwise Exclusive OR  VEOR on page A8-596
0001  1  1  01  Vector Bitwise Select  VBIF, VBIT, VBSL on page A8-550
0001  1  1  10  Vector Bitwise Insert if True  VBIF, VBIT, VBSL on page A8-550
0001  1  1  11  Vector Bitwise Insert if False  VBIF, VBIT, VBSL on page A8-550
0010  0  -  -   Vector Halving Subtract  VHADD, VHSUB on page A8-600
0010  1  -  -   Vector Saturating Subtract  VQSUB on page A8-724
0011  0  -  -   Vector Compare Greater Than  VCGT (register) on page A8-560
0011  1  -  -   Vector Compare Greater Than or Equal  VCGE (register) on page A8-556
0100  0  -  -   Vector Shift Left  VSHL (register) on page A8-752
0100  1  -  -   Vector Saturating Shift Left  VQSHL (register) on page A8-718
0101  0  -  -   Vector Rounding Shift Left  VRSHL on page A8-736
0101  1  -  -   Vector Saturating Rounding Shift Left  VQRSHL on page A8-714
0110  -  -  -   Vector Maximum or Minimum  VMAX, VMIN (integer) on page A8-630
0111  0  -  -   Vector Absolute Difference  VABD, VABDL (integer) on page A8-528
0111  1  -  -   Vector Absolute Difference and Accumulate  VABA, VABAL on page A8-526
1000  0  0  -   Vector Add  VADD (integer) on page A8-536
1000  0  1  -   Vector Subtract  VSUB (integer) on page A8-788
1000  1  0  -   Vector Test Bits  VTST on page A8-802
1000  1  1  -   Vector Compare Equal  VCEQ (register) on page A8-552
1001  0  -  -   Vector Multiply Accumulate or Subtract  VMLA, VMLAL, VMLS, VMLSL (integer) on page A8-634
1001  1  -  -   Vector Multiply  VMUL, VMULL (integer and polynomial) on page A8-662
1010  -  -  -   Vector Pairwise Maximum or Minimum  VPMAX, VPMIN (integer) on page A8-690
1011  0  0  -   Vector Saturating Doubling Multiply Returning High Half  VQDMULH on page A8-704
1011  0  1  -   Vector Saturating Rounding Doubling Multiply Returning High Half  VQRDMULH on page A8-712
1011  1  0  -   Vector Pairwise Add  VPADD (integer) on page A8-684
1101  0  0  0x  Vector Add  VADD (floating-point) on page A8-538
1101  0  0  1x  Vector Subtract  VSUB (floating-point) on page A8-790
1101  0  1  0x  Vector Pairwise Add  VPADD (floating-point) on page A8-686
1101  0  1  1x  Vector Absolute Difference  VABD (floating-point) on page A8-530
1101  1  0  -   Vector Multiply Accumulate or Subtract  VMLA, VMLS (floating-point) on page A8-636
1101  1  1  0x  Vector Multiply  VMUL (floating-point) on page A8-664
1110  0  0  0x  Vector Compare Equal  VCEQ (register) on page A8-552
1110  0  1  0x  Vector Compare Greater Than or Equal  VCGE (register) on page A8-556
1110  0  1  1x  Vector Compare Greater Than  VCGT (register) on page A8-560
1110  1  1  -   Vector Absolute Compare Greater or Less Than (or Equal)  VACGE, VACGT, VACLE, VACLT on page A8-534
1111  0  0  -   Vector Maximum or Minimum  VMAX, VMIN (floating-point) on page A8-632
1111  0  1  -   Vector Pairwise Maximum or Minimum  VPMAX, VPMIN (floating-point) on page A8-692
1111  1  0  0x  Vector Reciprocal Step  VRECPS on page A8-730
1111  1  0  1x  Vector Reciprocal Square Root Step  VRSQRTS on page A8-744
A7.4.2 Three registers of different lengths
If B == 0b11, see Advanced SIMD data-processing instructions on page A7-10.
Table A7-10 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111U11111 B A 0 0
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001U1 B A 0 0
Table A7-10 Data-processing instructions with three registers of different lengths
A     U  Instruction  See
000x  -  Vector Add Long or Wide  VADDL, VADDW on page A8-542
001x  -  Vector Subtract Long or Wide  VSUBL, VSUBW on page A8-794
0100  0  Vector Add and Narrow, returning High Half  VADDHN on page A8-540
0100  1  Vector Rounding Add and Narrow, returning High Half  VRADDHN on page A8-726
0101  -  Vector Absolute Difference and Accumulate  VABA, VABAL on page A8-526
0110  0  Vector Subtract and Narrow, returning High Half  VSUBHN on page A8-792
0110  1  Vector Rounding Subtract and Narrow, returning High Half  VRSUBHN on page A8-748
0111  -  Vector Absolute Difference  VABD, VABDL (integer) on page A8-528
10x0  -  Vector Multiply Accumulate or Subtract  VMLA, VMLAL, VMLS, VMLSL (integer) on page A8-634
10x1  0  Vector Saturating Doubling Multiply Accumulate or Subtract Long  VQDMLAL, VQDMLSL on page A8-702
1100  -  Vector Multiply (integer)  VMUL, VMULL (integer and polynomial) on page A8-662
1101  0  Vector Saturating Doubling Multiply Long  VQDMULL on page A8-706
1110  -  Vector Multiply (polynomial)  VMUL, VMULL (integer and polynomial) on page A8-662
A7.4.3 Two registers and a scalar
If B == 0b11, see Advanced SIMD data-processing instructions on page A7-10.
Table A7-11 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111U11111 B A 1 0
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001U1 B A 1 0
Table A7-11 Data-processing instructions with two registers and a scalar
A     U  Instruction  See
0x0x  -  Vector Multiply Accumulate or Subtract  VMLA, VMLAL, VMLS, VMLSL (by scalar) on page A8-638
0x10  -  Vector Multiply Accumulate or Subtract Long  VMLA, VMLAL, VMLS, VMLSL (by scalar) on page A8-638
0x11  0  Vector Saturating Doubling Multiply Accumulate or Subtract Long  VQDMLAL, VQDMLSL on page A8-702
100x  -  Vector Multiply  VMUL, VMULL (by scalar) on page A8-666
1010  -  Vector Multiply Long  VMUL, VMULL (by scalar) on page A8-666
1011  0  Vector Saturating Doubling Multiply Long  VQDMULL on page A8-706
1100  -  Vector Saturating Doubling Multiply returning High Half  VQDMULH on page A8-704
1101  -  Vector Saturating Rounding Doubling Multiply returning High Half  VQRDMULH on page A8-712
A7.4.4 Two registers and a shift amount
If [L, imm3] == 0b0000, see One register and a modified immediate value on page A7-21.
Table A7-12 shows the allocation of encodings in this space. Other encodings in this space are UNDEFINED.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111U11111 imm3 A LB 1
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001U1 imm3 A LB 1
Table A7-12 Data-processing instructions with two registers and a shift amount
A     U  B  L  Instruction  See
0000  -  -  -  Vector Shift Right  VSHR on page A8-756
0001  -  -  -  Vector Shift Right and Accumulate  VSRA on page A8-764
0010  -  -  -  Vector Rounding Shift Right  VRSHR on page A8-738
0011  -  -  -  Vector Rounding Shift Right and Accumulate  VRSRA on page A8-746
0100  1  -  -  Vector Shift Right and Insert  VSRI on page A8-766
0101  0  -  -  Vector Shift Left  VSHL (immediate) on page A8-750
0101  1  -  -  Vector Shift Left and Insert  VSLI on page A8-760
011x  -  -  -  Vector Saturating Shift Left  VQSHL, VQSHLU (immediate) on page A8-720
1000  0  0  0  Vector Shift Right Narrow  VSHRN on page A8-758
1000  0  1  -  Vector Rounding Shift Right Narrow  VRSHRN on page A8-740
1000  1  0  -  Vector Saturating Shift Right, Unsigned Narrow  VQSHRN, VQSHRUN on page A8-722
1000  1  1  -  Vector Saturating Shift Right, Rounded Unsigned Narrow  VQRSHRN, VQRSHRUN on page A8-716
1001  -  0  -  Vector Saturating Shift Right, Narrow  VQSHRN, VQSHRUN on page A8-722
1001  -  1  -  Vector Saturating Shift Right, Rounded Narrow  VQRSHRN, VQRSHRUN on page A8-716
1010  -  0  -  Vector Shift Left Long  VSHLL on page A8-754
1010  -  0  -  Vector Move Long  VMOVL on page A8-654
111x  -  -  -  Vector Convert  VCVT (between floating-point and fixed-point, Advanced SIMD) on page A8-580
A7.4.5 Two registers, miscellaneous
The allocation of encodings in this space is shown in Table A7-13. Other encodings in this space are
UNDEFINED.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111111111 11 A 0 B 0
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111100111 11 A 0 B 0
Table A7-13 Instructions with two registers, miscellaneous
A   B      Instruction  See
00  0000x  Vector Reverse in doublewords  VREV16, VREV32, VREV64 on page A8-732
00  0001x  Vector Reverse in words  VREV16, VREV32, VREV64 on page A8-732
00  0010x  Vector Reverse in halfwords  VREV16, VREV32, VREV64 on page A8-732
00  010xx  Vector Pairwise Add Long  VPADDL on page A8-688
00  1000x  Vector Count Leading Sign Bits  VCLS on page A8-566
00  1001x  Vector Count Leading Zeros  VCLZ on page A8-570
00  1010x  Vector Count  VCNT on page A8-574
00  1011x  Vector Bitwise NOT  VMVN (register) on page A8-670
00  110xx  Vector Pairwise Add and Accumulate Long  VPADAL on page A8-682
00  1110x  Vector Saturating Absolute  VQABS on page A8-698
00  1111x  Vector Saturating Negate  VQNEG on page A8-710
01  x000x  Vector Compare Greater Than Zero  VCGT (immediate #0) on page A8-562
01  x001x  Vector Compare Greater Than or Equal to Zero  VCGE (immediate #0) on page A8-558
01  x010x  Vector Compare Equal to Zero  VCEQ (immediate #0) on page A8-554
01  x011x  Vector Compare Less Than or Equal to Zero  VCLE (immediate #0) on page A8-564
01  x100x  Vector Compare Less Than Zero  VCLT (immediate #0) on page A8-568
01  x110x  Vector Absolute  VABS on page A8-532
01  x111x  Vector Negate  VNEG on page A8-672
10  0000x  Vector Swap  VSWP on page A8-796
10  0001x  Vector Transpose  VTRN on page A8-800
10  0010x  Vector Unzip  VUZP on page A8-804
10  0011x  Vector Zip  VZIP on page A8-806
10  01000  Vector Move and Narrow  VMOVN on page A8-656
10  01001  Vector Saturating Move and Unsigned Narrow  VQMOVN, VQMOVUN on page A8-708
10  0101x  Vector Saturating Move and Narrow  VQMOVN, VQMOVUN on page A8-708
10  01100  Vector Shift Left Long (maximum shift)  VSHLL on page A8-754
10  11x00  Vector Convert  VCVT (between half-precision and single-precision, Advanced SIMD) on page A8-586
11  10x0x  Vector Reciprocal Estimate  VRECPE on page A8-728
11  10x1x  Vector Reciprocal Square Root Estimate  VRSQRTE on page A8-742
11  11xxx  Vector Convert  VCVT (between floating-point and integer, Advanced SIMD) on page A8-576
A7.4.6 One register and a modified immediate value
Table A7-14 shows the allocation of encodings in this space.
Table A7-15 on page A7-22 shows the modified immediate constants available with these instructions, and
how they are encoded.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111a11111 000bcd cmode 0 op1e fgh
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1111001a1 000bcd cmode 0 op1e fgh
Table A7-14 Data-processing instructions with one register and a modified immediate value
op  cmode  Instruction  See
0   0xx0   Vector Move  VMOV (immediate) on page A8-640
0   0xx1   Vector Bitwise OR  VORR (immediate) on page A8-678
0   10x0   Vector Move  VMOV (immediate) on page A8-640
0   10x1   Vector Bitwise OR  VORR (immediate) on page A8-678
0   11xx   Vector Move  VMOV (immediate) on page A8-640
1   0xx0   Vector Bitwise NOT  VMVN (immediate) on page A8-668
1   0xx1   Vector Bit Clear  VBIC (immediate) on page A8-546
1   10x0   Vector Bitwise NOT  VMVN (immediate) on page A8-668
1   10x1   Vector Bit Clear  VBIC (immediate) on page A8-546
1   110x   Vector Bitwise NOT  VMVN (immediate) on page A8-668
1   1110   Vector Move  VMOV (immediate) on page A8-640
1   1111   UNDEFINED  -
Table A7-15 Modified immediate values for Advanced SIMD instructions
op  cmode  Constant (a)                                                             <dt> (b)  Notes
-   000x   00000000 00000000 00000000 abcdefgh 00000000 00000000 00000000 abcdefgh  I32       c
-   001x   00000000 00000000 abcdefgh 00000000 00000000 00000000 abcdefgh 00000000  I32       c, d
-   010x   00000000 abcdefgh 00000000 00000000 00000000 abcdefgh 00000000 00000000  I32       c, d
-   011x   abcdefgh 00000000 00000000 00000000 abcdefgh 00000000 00000000 00000000  I32       c, d
-   100x   00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh  I16       c
-   101x   abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000 abcdefgh 00000000  I16       c, d
-   1100   00000000 00000000 abcdefgh 11111111 00000000 00000000 abcdefgh 11111111  I32       d, e
-   1101   00000000 abcdefgh 11111111 11111111 00000000 abcdefgh 11111111 11111111  I32       d, e
0   1110   abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh abcdefgh  I8        f
1   1110   aaaaaaaa bbbbbbbb cccccccc dddddddd eeeeeeee ffffffff gggggggg hhhhhhhh  I64       f
0   1111   aBbbbbbc defgh000 00000000 00000000 aBbbbbbc defgh000 00000000 00000000  F32       f, g
1   1111   UNDEFINED                                                                -         -
a. In this table, the immediate value is shown in binary form, to relate abcdefgh to the encoding diagram. In assembler
syntax, the constant is specified by a data type and a value of that type. That value is specified in the normal way (a
decimal number by default) and is replicated enough times to fill the 64-bit immediate. For example, a data type of
I32 and a value of 10 specify the 64-bit constant 0x0000000A0000000A.
b. This specifies the data type used when the instruction is disassembled. On assembly, the data type must be matched in
the table if possible. Other data types are permitted as pseudo-instructions when code is assembled, provided the 64-bit
constant specified by the data type and value is available for the instruction (if it is available in more than one way, the
first entry in this table that can produce it is used). For example, VMOV.I64 D0,#0x8000000080000000 does not specify a
64-bit constant that is available from the I64 line of the table, but does specify one that is available from the fourth I32
line or the F32 line. It is assembled to the former, and therefore is disassembled as VMOV.I32 D0,#0x80000000.
c. This constant is available for the VBIC, VMOV, VMVN, and VORR instructions.
d. UNPREDICTABLE if abcdefgh == 00000000.
e. This constant is available for the VMOV and VMVN instructions only.
f. This constant is available for the VMOV instruction only.
g. In this entry, B = NOT(b). The bit pattern represents the floating-point number (-1)^S * 2^exp * mantissa, where
S = UInt(a), exp = UInt(NOT(b):c:d)-3 and mantissa = (16+UInt(e:f:g:h))/16.
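The formula in note g can be evaluated directly. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes the eight bits a to h are packed into one integer with a as bit 7 and h as bit 0.

```python
def f32_immediate_value(abcdefgh):
    """Evaluate (-1)^S * 2^exp * mantissa for the 8-bit field abcdefgh."""
    a = (abcdefgh >> 7) & 1
    b = (abcdefgh >> 6) & 1
    cd = (abcdefgh >> 4) & 0b11
    efgh = abcdefgh & 0xF
    sign = -1.0 if a else 1.0           # S = UInt(a)
    exp = (((b ^ 1) << 2) | cd) - 3     # exp = UInt(NOT(b):c:d) - 3
    mantissa = (16 + efgh) / 16         # mantissa = (16 + UInt(e:f:g:h)) / 16
    return sign * (2.0 ** exp) * mantissa
```

With this packing, 0x70 evaluates to 1.0 and 0x00 evaluates to 2.0. The representable magnitudes run from 0.125 (0x40) to 31.0 (0x3F).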
Operation
// AdvSIMDExpandImm()
// ==================
bits(64) AdvSIMDExpandImm(bit op, bits(4) cmode, bits(8) imm8)
    case cmode<3:1> of
        when '000'
            testimm8 = FALSE; imm64 = Replicate(Zeros(24):imm8, 2);
        when '001'
            testimm8 = TRUE; imm64 = Replicate(Zeros(16):imm8:Zeros(8), 2);
        when '010'
            testimm8 = TRUE; imm64 = Replicate(Zeros(8):imm8:Zeros(16), 2);
        when '011'
            testimm8 = TRUE; imm64 = Replicate(imm8:Zeros(24), 2);
        when '100'
            testimm8 = FALSE; imm64 = Replicate(Zeros(8):imm8, 4);
        when '101'
            testimm8 = TRUE; imm64 = Replicate(imm8:Zeros(8), 4);
        when '110'
            testimm8 = TRUE;
            if cmode<0> == '0' then
                imm64 = Replicate(Zeros(16):imm8:Ones(8), 2);
            else
                imm64 = Replicate(Zeros(8):imm8:Ones(16), 2);
        when '111'
            testimm8 = FALSE;
            if cmode<0> == '0' && op == '0' then
                imm64 = Replicate(imm8, 8);
            if cmode<0> == '0' && op == '1' then
                imm8a = Replicate(imm8<7>, 8); imm8b = Replicate(imm8<6>, 8);
                imm8c = Replicate(imm8<5>, 8); imm8d = Replicate(imm8<4>, 8);
                imm8e = Replicate(imm8<3>, 8); imm8f = Replicate(imm8<2>, 8);
                imm8g = Replicate(imm8<1>, 8); imm8h = Replicate(imm8<0>, 8);
                imm64 = imm8a:imm8b:imm8c:imm8d:imm8e:imm8f:imm8g:imm8h;
            if cmode<0> == '1' && op == '0' then
                imm32 = imm8<7>:NOT(imm8<6>):Replicate(imm8<6>,5):imm8<5:0>:Zeros(19);
                imm64 = Replicate(imm32, 2);
            if cmode<0> == '1' && op == '1' then
                UNDEFINED;
    if testimm8 && imm8 == '00000000' then
        UNPREDICTABLE;
    return imm64;
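For experimentation outside the pseudocode language, the same expansion can be transcribed into Python. This is a sketch, not a normative definition: integers stand in for bitstrings, and the exception classes UndefinedInstruction and UnpredictableBehavior are hypothetical stand-ins for the UNDEFINED and UNPREDICTABLE outcomes.

```python
class UndefinedInstruction(Exception):
    """Hypothetical stand-in for the pseudocode UNDEFINED outcome."""


class UnpredictableBehavior(Exception):
    """Hypothetical stand-in for the pseudocode UNPREDICTABLE outcome."""


def replicate(value, width, count):
    """Concatenate count copies of a width-bit value."""
    result = 0
    for _ in range(count):
        result = (result << width) | value
    return result


def adv_simd_expand_imm(op, cmode, imm8):
    """Python transcription of AdvSIMDExpandImm(); returns a 64-bit integer."""
    high3, low = cmode >> 1, cmode & 1
    test_imm8 = high3 in (0b001, 0b010, 0b011, 0b101, 0b110)
    if high3 <= 0b011:
        # imm8 placed in byte 0, 1, 2 or 3 of each 32-bit word
        imm64 = replicate(imm8 << (8 * high3), 32, 2)
    elif high3 == 0b100:
        imm64 = replicate(imm8, 16, 4)
    elif high3 == 0b101:
        imm64 = replicate(imm8 << 8, 16, 4)
    elif high3 == 0b110:
        if low == 0:
            imm64 = replicate((imm8 << 8) | 0xFF, 32, 2)
        else:
            imm64 = replicate((imm8 << 16) | 0xFFFF, 32, 2)
    elif low == 0 and op == 0:
        imm64 = replicate(imm8, 8, 8)
    elif low == 0 and op == 1:
        # each bit of imm8 expands to a full byte, bit 7 first
        imm64 = 0
        for bit in range(7, -1, -1):
            imm64 = (imm64 << 8) | (0xFF if (imm8 >> bit) & 1 else 0x00)
    elif op == 0:
        b = (imm8 >> 6) & 1
        imm32 = (((imm8 >> 7) & 1) << 31) | ((b ^ 1) << 30) \
            | ((0x1F if b else 0) << 25) | ((imm8 & 0x3F) << 19)
        imm64 = replicate(imm32, 32, 2)
    else:
        raise UndefinedInstruction("cmode == 1111 with op == 1")
    if test_imm8 and imm8 == 0:
        raise UnpredictableBehavior("imm8 is zero for this cmode")
    return imm64
```

For example, adv_simd_expand_imm(0, 0b0000, 10) returns 0x0000000A0000000A, matching the I32 example in note a of Table A7-15.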
A7.5 VFP data-processing instructions
If T == 1 in the Thumb encoding or cond == 0b1111 in the ARM encoding, the instruction is UNDEFINED.
Otherwise:
Table A7-16 shows the encodings for three-register VFP data-processing instructions. Other
encodings in this space are UNDEFINED.
Table A7-17 on page A7-25 applies only if Table A7-16 indicates that it does. It shows the encodings
for VFP data-processing instructions with two registers or a register and an immediate. Other
encodings in this space are UNDEFINED.
Table A7-18 on page A7-25 shows the immediate constants available in the VMOV (immediate)
instruction.
These instructions are CDP instructions for coprocessors 10 and 11.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111T1110 opc1 opc2 101 opc3 0 opc4
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1110 opc1 opc2 101 opc3 0 opc4
Table A7-16 Three-register VFP data-processing instructions
opc1  opc3  Instruction  See
0x00  -     Vector Multiply Accumulate or Subtract  VMLA, VMLS (floating-point) on page A8-636
0x01  -     Vector Negate Multiply Accumulate or Subtract  VNMLA, VNMLS, VNMUL on page A8-674
0x10  x1    Vector Negate Multiply Accumulate or Subtract  VNMLA, VNMLS, VNMUL on page A8-674
0x10  x0    Vector Multiply  VMUL (floating-point) on page A8-664
0x11  x0    Vector Add  VADD (floating-point) on page A8-538
0x11  x1    Vector Subtract  VSUB (floating-point) on page A8-790
1x00  x0    Vector Divide  VDIV on page A8-590
1x11  -     Other VFP data-processing instructions  Table A7-17 on page A7-25
A7.5.1 Operation
// VFPExpandImm()
// ==============
bits(N) VFPExpandImm(bits(8) imm8, integer N)
    assert N == 32 || N == 64;
    if N == 32 then
        return imm8<7>:NOT(imm8<6>):Replicate(imm8<6>,5):imm8<5:0>:Zeros(19);
    else
        return imm8<7>:NOT(imm8<6>):Replicate(imm8<6>,8):imm8<5:0>:Zeros(48);
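A Python transcription of this function can help check bit patterns by hand. The sketch below is not normative: integers stand in for bitstrings, and as_float is a hypothetical helper that reinterprets the result as an IEEE 754 value using the standard struct module.

```python
import struct


def vfp_expand_imm(imm8, n):
    """Python transcription of VFPExpandImm(); returns an n-bit integer."""
    assert n in (32, 64)
    a = (imm8 >> 7) & 1
    b = (imm8 >> 6) & 1
    low6 = imm8 & 0x3F
    if n == 32:
        # a : NOT(b) : bbbbb : imm8<5:0> : 19 zero bits
        return (a << 31) | ((b ^ 1) << 30) | ((0x1F if b else 0) << 25) | (low6 << 19)
    # a : NOT(b) : bbbbbbbb : imm8<5:0> : 48 zero bits
    return (a << 63) | ((b ^ 1) << 62) | ((0xFF if b else 0) << 54) | (low6 << 48)


def as_float(bits, n):
    """Reinterpret an n-bit pattern as a single- or double-precision value."""
    int_fmt, float_fmt = ("<I", "<f") if n == 32 else ("<Q", "<d")
    return struct.unpack(float_fmt, struct.pack(int_fmt, bits))[0]
```

For example, vfp_expand_imm(0x70, 32) returns 0x3F800000 and vfp_expand_imm(0x70, 64) returns 0x3FF0000000000000, the single-precision and double-precision patterns for 1.0.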
Table A7-17 Other VFP data-processing instructions
opc2  opc3  Instruction  See
-     x0    Vector Move  VMOV (immediate) on page A8-640
0000  01    Vector Move  VMOV (register) on page A8-642
0000  11    Vector Absolute  VABS on page A8-532
0001  01    Vector Negate  VNEG on page A8-672
0001  11    Vector Square Root  VSQRT on page A8-762
001x  x1    Vector Convert  VCVTB, VCVTT (between half-precision and single-precision, VFP) on page A8-588
010x  x1    Vector Compare  VCMP, VCMPE on page A8-572
0111  11    Vector Convert  VCVT (between double-precision and single-precision) on page A8-584
1000  x1    Vector Convert  VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
101x  x1    Vector Convert  VCVT (between floating-point and fixed-point, VFP) on page A8-582
110x  x1    Vector Convert  VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
111x  x1    Vector Convert  VCVT (between floating-point and fixed-point, VFP) on page A8-582
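The two-stage lookup through Table A7-16 and Table A7-17 can be sketched in Python as follows. The function takes already-extracted opc1, opc2, and opc3 field values, so no bit positions are assumed. All names are hypothetical. The entry for opc1 == 0x10 with opc3 == x1 has no instruction text of its own in the table as extracted, and the sketch assumes it shares the VNMLA, VNMLS, VNMUL entry above it.

```python
TABLE_A7_16 = [  # (opc1, opc3, mnemonics)
    ("0x00", "-", "VMLA, VMLS"),
    ("0x01", "-", "VNMLA, VNMLS, VNMUL"),
    ("0x10", "x1", "VNMLA, VNMLS, VNMUL"),  # assumed to share the row above
    ("0x10", "x0", "VMUL"),
    ("0x11", "x0", "VADD"),
    ("0x11", "x1", "VSUB"),
    ("1x00", "x0", "VDIV"),
]

TABLE_A7_17 = [  # (opc2, opc3, mnemonics), used only when opc1 == 1x11
    ("-", "x0", "VMOV (immediate)"),
    ("0000", "01", "VMOV (register)"),
    ("0000", "11", "VABS"),
    ("0001", "01", "VNEG"),
    ("0001", "11", "VSQRT"),
    ("001x", "x1", "VCVTB, VCVTT"),
    ("010x", "x1", "VCMP, VCMPE"),
    ("0111", "11", "VCVT (double-precision and single-precision)"),
    ("1000", "x1", "VCVT, VCVTR (floating-point and integer)"),
    ("101x", "x1", "VCVT (floating-point and fixed-point)"),
    ("110x", "x1", "VCVT, VCVTR (floating-point and integer)"),
    ("111x", "x1", "VCVT (floating-point and fixed-point)"),
]


def matches(value, pattern):
    """Compare a field value with a table pattern; '-' and 'x' match anything."""
    if pattern == "-":
        return True
    bits = format(value, "0{}b".format(len(pattern)))
    return all(p == "x" or p == b for p, b in zip(pattern, bits))


def decode_vfp_data_processing(opc1, opc2, opc3):
    """Return the mnemonics selected by the two tables, or "UNDEFINED"."""
    for p1, p3, name in TABLE_A7_16:
        if matches(opc1, p1) and matches(opc3, p3):
            return name
    if matches(opc1, "1x11"):
        for p2, p3, name in TABLE_A7_17:
            if matches(opc2, p2) and matches(opc3, p3):
                return name
    return "UNDEFINED"
```

The T == 1 and cond == 0b1111 checks described in the text are left to the caller.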
Table A7-18 VFP modified immediate constants
Data type  opc2  opc4  Constant (a)
F32        abcd  efgh  aBbbbbbc defgh000 00000000 00000000
F64        abcd  efgh  aBbbbbbb bbcdefgh 00000000 00000000 00000000 00000000 00000000 00000000
a. In this column, B = NOT(b). The bit pattern represents the floating-point number (-1)^S * 2^exp * mantissa, where
S = UInt(a), exp = UInt(NOT(b):c:d)-3 and mantissa = (16+UInt(e:f:g:h))/16.
A7.6 Extension register load/store instructions
If T == 1 in the Thumb encoding or cond == 0b1111 in the ARM encoding, the instruction is UNDEFINED.
Otherwise, the allocation of encodings in this space is shown in Table A7-19. Other encodings in this space
are UNDEFINED.
These instructions are LDC and STC instructions for coprocessors 10 and 11.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111T110 Opcode Rn 101
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 110 Opcode Rn 101
Table A7-19 Extension register load/store instructions
Opcode  Rn        Instruction  See
0010x   -         -  64-bit transfers between ARM core and extension registers on page A7-32
01x00   -         Vector Store Multiple (Increment After, no writeback)  VSTM on page A8-784
01x10   -         Vector Store Multiple (Increment After, writeback)  VSTM on page A8-784
1xx00   -         Vector Store Register  VSTR on page A8-786
10x10   not 1101  Vector Store Multiple (Decrement Before, writeback)  VSTM on page A8-784
10x10   1101      Vector Push Registers  VPUSH on page A8-696
01x01   -         Vector Load Multiple (Increment After, no writeback)  VLDM on page A8-626
01x11   not 1101  Vector Load Multiple (Increment After, writeback)  VLDM on page A8-626
01x11   1101      Vector Pop Registers  VPOP on page A8-694
1xx01   -         Vector Load Register  VLDR on page A8-628
10x11   -         Vector Load Multiple (Decrement Before, writeback)  VLDM on page A8-626
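Table A7-19 can also be expressed as mask and value pairs. The Python sketch below is illustrative: it takes already-extracted Opcode and Rn field values, so it assumes nothing about their bit positions, and the names ROWS and decode_ext_load_store are hypothetical.

```python
SP = 0b1101  # Rn value that selects the VPUSH and VPOP rows

# (mask, value, Rn rule, name): a row matches when opcode & mask == value.
# Rn rule is None (any Rn), "sp" (Rn == 1101) or "not_sp" (Rn != 1101).
ROWS = [
    (0b11110, 0b00100, None, "64-bit transfers between ARM core and extension registers"),
    (0b11011, 0b01000, None, "VSTM (Increment After, no writeback)"),
    (0b11011, 0b01010, None, "VSTM (Increment After, writeback)"),
    (0b10011, 0b10000, None, "VSTR"),
    (0b11011, 0b10010, "not_sp", "VSTM (Decrement Before, writeback)"),
    (0b11011, 0b10010, "sp", "VPUSH"),
    (0b11011, 0b01001, None, "VLDM (Increment After, no writeback)"),
    (0b11011, 0b01011, "not_sp", "VLDM (Increment After, writeback)"),
    (0b11011, 0b01011, "sp", "VPOP"),
    (0b10011, 0b10001, None, "VLDR"),
    (0b11011, 0b10011, None, "VLDM (Decrement Before, writeback)"),
]


def decode_ext_load_store(opcode, rn):
    """Return the Table A7-19 entry for 5-bit Opcode and 4-bit Rn values."""
    for mask, value, rn_rule, name in ROWS:
        if (opcode & mask) != value:
            continue
        if rn_rule == "sp" and rn != SP:
            continue
        if rn_rule == "not_sp" and rn == SP:
            continue
        return name
    return "UNDEFINED"
```

For example, Opcode == 0b10010 decodes to VPUSH when Rn == 0b1101 and to VSTM (Decrement Before, writeback) for any other Rn.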
A7.7 Advanced SIMD element or structure load/store instructions
The allocation of encodings in this space is shown in:
Table A7-20 if L == 0, store instructions
Table A7-21 on page A7-28 if L == 1, load instructions.
Other encodings in this space are UNDEFINED.
The variable bits are in identical locations in the two encodings, after adjusting for the fact that the ARM
encoding is held in memory as a single word and the Thumb encoding is held as two consecutive halfwords.
The ARM instructions can only be executed unconditionally. The Thumb instructions can be executed
conditionally by using the IT instruction. For details see IT on page A8-104.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11111001A L0 B
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11110100A L0 B
Table A7-20 Element and structure store instructions (L == 0)
A  B                 Instruction   See
0  0010, 011x, 1010  Vector Store  VST1 (multiple single elements) on page A8-768
0  0011, 100x        Vector Store  VST2 (multiple 2-element structures) on page A8-772
0  010x              Vector Store  VST3 (multiple 3-element structures) on page A8-776
0  000x              Vector Store  VST4 (multiple 4-element structures) on page A8-780
1  0x00, 1000        Vector Store  VST1 (single element from one lane) on page A8-770
1  0x01, 1001        Vector Store  VST2 (single 2-element structure from one lane) on page A8-774
1  0x10, 1010        Vector Store  VST3 (single 3-element structure from one lane) on page A8-778
1  0x11, 1011        Vector Store  VST4 (single 4-element structure from one lane) on page A8-782
Table A7-21 Element and structure load instructions (L == 1)
A  B                 Instruction  See
0  0010, 011x, 1010  Vector Load  VLD1 (multiple single elements) on page A8-602
0  0011, 100x        Vector Load  VLD2 (multiple 2-element structures) on page A8-608
0  010x              Vector Load  VLD3 (multiple 3-element structures) on page A8-614
0  000x              Vector Load  VLD4 (multiple 4-element structures) on page A8-620
1  0x00, 1000        Vector Load  VLD1 (single element to one lane) on page A8-604
1  1100              Vector Load  VLD1 (single element to all lanes) on page A8-606
1  0x01, 1001        Vector Load  VLD2 (single 2-element structure to one lane) on page A8-610
1  1101              Vector Load  VLD2 (single 2-element structure to all lanes) on page A8-612
1  0x10, 1010        Vector Load  VLD3 (single 3-element structure to one lane) on page A8-616
1  1110              Vector Load  VLD3 (single 3-element structure to all lanes) on page A8-618
1  0x11, 1011        Vector Load  VLD4 (single 4-element structure to one lane) on page A8-622
1  1111              Vector Load  VLD4 (single 4-element structure to all lanes) on page A8-624
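The regularity in Tables A7-20 and A7-21 can be made explicit with a short Python sketch. It works on already-extracted L, A, and B field values and uses a hypothetical function name. For A == 1 it relies on a pattern visible in the table rows: B<1:0> plus one gives the structure size, and B<3:2> == 0b11 selects the all-lanes forms, which appear only in the load table.

```python
def decode_element_structure(l, a, b):
    """Return a short label for the L, A and B field values, or "UNDEFINED"."""
    base = "VLD" if l == 1 else "VST"
    if a == 0:
        # Multiple elements or structures: B selects the structure size.
        if b in (0b0010, 0b0110, 0b0111, 0b1010):
            n = 1
        elif b in (0b0011, 0b1000, 0b1001):
            n = 2
        elif b in (0b0100, 0b0101):
            n = 3
        elif b in (0b0000, 0b0001):
            n = 4
        else:
            return "UNDEFINED"
        return "{}{} (multiple)".format(base, n)
    # Single element or structure: B<1:0> + 1 is the structure size.
    n = (b & 0b11) + 1
    if (b >> 2) == 0b11:
        # B == 11xx is listed in the load table only.
        return "VLD{} (all lanes)".format(n) if l == 1 else "UNDEFINED"
    return "{}{} (one lane)".format(base, n)
```

For example, L == 1, A == 1, B == 0b1110 selects VLD3 (single 3-element structure to all lanes), while the same A and B values with L == 0 have no entry in the store table.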
A7.7.1 Advanced SIMD addressing mode
All the element and structure load/store instructions use this addressing mode. There is a choice of three
formats:
[<Rn>{@<align>}]
The address is contained in ARM core register Rn.
Rn is not updated by this instruction.
Encoded as Rm = 0b1111.
If Rn is encoded as 0b1111, the instruction is UNPREDICTABLE.
[<Rn>{@<align>}]!
The address is contained in ARM core register Rn.
Rn is updated by this instruction:
Rn = Rn + transfer_size
Encoded as Rm = 0b1101.
transfer_size is the number of bytes transferred by the instruction. This means that,
after the instruction is executed, Rn points to the address in memory immediately
following the last address loaded from or stored to.
If Rn is encoded as 0b1111, the instruction is UNPREDICTABLE.
This addressing mode can also be written as:
[<Rn>{@<align>}], #<transfer_size>
However, disassembly produces the [<Rn>{@<align>}]! form.
[<Rn>{@<align>}], <Rm>
The address is contained in ARM core register Rn.
Rn is updated by this instruction:
Rn = Rn + Rm
Encoded as Rm = Rm. Rm must not be encoded as 0b1111 or 0b1101 (the PC or
the SP).
If Rn is encoded as 0b1111, the instruction is UNPREDICTABLE.
In all cases, <align> specifies an optional alignment. Details are given in the individual instruction
descriptions.
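The three formats differ only in how the Rm field is interpreted. The following Python sketch illustrates this. The function name and the register-file list are hypothetical, alignment is ignored, and wrapping the updated value to 32 bits is an assumption of the sketch rather than something stated in this section.

```python
def adv_simd_address(rn, rm, regs, transfer_size):
    """Return (address, new Rn value) for an element or structure load/store.

    rn and rm are the 4-bit register fields, regs is a list of 16 register
    values, and transfer_size is the number of bytes the instruction transfers.
    """
    if rn == 0b1111:
        raise ValueError("UNPREDICTABLE: Rn is encoded as 0b1111")
    address = regs[rn]
    if rm == 0b1111:
        new_rn = address                  # [<Rn>{@<align>}]: Rn is not updated
    elif rm == 0b1101:
        new_rn = address + transfer_size  # [<Rn>{@<align>}]!
    else:
        new_rn = address + regs[rm]       # [<Rn>{@<align>}], <Rm>
    return address, new_rn & 0xFFFFFFFF   # 32-bit wrap is an assumption
```

For example, with Rn holding 0x1000 and a 16-byte transfer, the writeback form leaves 0x1010 in Rn.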
A7.8 8, 16, and 32-bit transfer between ARM core and extension registers
If T == 1 in the Thumb encoding or cond == 0b1111 in the ARM encoding, the instruction is UNDEFINED.
Otherwise, the allocation of encodings in this space is shown in Table A7-22. Other encodings in this space
are UNDEFINED.
These instructions are MRC and MCR instructions for coprocessors 10 and 11.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111T1110 A L 101C B 1
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1110 A L 101C B 1
Table A7-22 8-bit, 16-bit and 32-bit data transfer instructions
L  C  A    B   Instruction  See
0  0  000  -   Vector Move  VMOV (between ARM core register and single-precision register) on page A8-648
0  0  111  -   Move to VFP Special Register from ARM core register  VMSR on page A8-660; VMSR on page B6-29 (System level view)
0  1  0xx  -   Vector Move  VMOV (ARM core register to scalar) on page A8-644
0  1  1xx  0x  Vector Duplicate  VDUP (ARM core register) on page A8-594
1  0  000  -   Vector Move  VMOV (between ARM core register and single-precision register) on page A8-648
1  0  111  -   Move to ARM core register from VFP Special Register  VMRS on page A8-658; VMRS on page B6-27 (System level view)
1  1  xxx  -   Vector Move  VMOV (scalar to ARM core register) on page A8-646
A7.9 64-bit transfers between ARM core and extension registers
If T == 1 in the Thumb encoding or cond == 0b1111 in the ARM encoding, the instruction is UNDEFINED.
Otherwise, the allocation of encodings in this space is shown in Table A7-23. Other encodings in this space
are UNDEFINED.
These instructions are MRRC and MCRR instructions for coprocessors 10 and 11.
Thumb encoding
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111T1100010 101C op
ARM encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1100010 101C op
Table A7-23 64-bit data transfer instructions
C  op    Instruction
0  00x1  VMOV (between two ARM core registers and two single-precision registers) on page A8-650
1  00x1  VMOV (between two ARM core registers and a doubleword extension register) on page A8-652
Chapter A8
Instruction Details
This chapter describes each instruction. It contains the following sections:
Format of instruction descriptions on page A8-2
Standard assembler syntax fields on page A8-7
Conditional execution on page A8-8
Shifts applied to a register on page A8-10
Memory accesses on page A8-13
Alphabetical list of instructions on page A8-14.
A8.1 Format of instruction descriptions
The instruction descriptions in Alphabetical list of instructions on page A8-14 normally use the following
format:
instruction section title
introduction to the instruction
instruction encoding(s) with architecture information
assembler syntax
pseudocode describing how the instruction operates
exception information
notes (where applicable).
Each of these items is described in more detail in the following subsections.
A few instruction descriptions describe alternative mnemonics for other instructions and use an abbreviated
and modified version of this format.
A8.1.1 Instruction section title
The instruction section title gives the base mnemonic for the instructions described in the section. When one
mnemonic has multiple forms described in separate instruction sections, this is followed by a short
description of the form in parentheses. The most common use of this is to distinguish between forms of an
instruction in which one of the operands is an immediate value and forms in which it is a register.
Parenthesized text is also used to document the former mnemonic in some cases where a mnemonic has been
replaced entirely by another mnemonic in the new assembler syntax.
A8.1.2 Introduction to the instruction
The instruction section title is followed by text that briefly describes the main features of the instruction.
This description is not necessarily complete and is not definitive. If there is any conflict between it and the
more detailed information that follows, the latter takes priority.
A8.1.3 Instruction encodings
This is a list of one or more instruction encodings. Each instruction encoding is labelled as:
T1, T2, T3 … for the first, second, third and any additional Thumb encodings
A1, A2, A3 … for the first, second, third and any additional ARM encodings
E1, E2, E3 … for the first, second, third and any additional ThumbEE encodings that are not also
Thumb encodings.
Where Thumb and ARM encodings are very closely related, the two encodings are described together, for
example as encoding T1 / A1.
Each instruction encoding description consists of:
Information about which architecture variants include the particular encoding of the instruction. This
is presented in one of two ways:
For instruction encodings that are in the main instruction set architecture, as a list of the
architecture variants that include the encoding. See Architecture versions, profiles, and
variants on page A1-4 for a summary of these variants.
For instruction encodings that are in the architecture extensions, as a list of the architecture
extensions that include the encoding. See Architecture extensions on page A1-6 for a summary
of the architecture extensions and the architecture variants that they can extend.
In architecture variant lists:
ARMv7 means ARMv7-A and ARMv7-R profiles. The architecture variant information in this
manual does not cover the ARMv7-M profile.
* is used as a wildcard. For example, ARMv5T* means ARMv5T, ARMv5TE, and
ARMv5TEJ.
An assembly syntax that ensures that the assembler selects the encoding in preference to any other
encoding. In some cases, multiple syntaxes are given. The correct one to use is sometimes indicated
by annotations to the syntax, such as Inside IT block and Outside IT block. In other cases, the correct
one to use can be determined by looking at the assembler syntax description and using it to determine
which syntax corresponds to the instruction being disassembled.
There is usually more than one syntax that ensures re-assembly to any particular encoding, and the
exact set of syntaxes that do so usually depends on the register numbers, immediate constants and
other operands to the instruction. For example, when assembling to the Thumb instruction set, the
syntax
AND R0,R0,R8
ensures selection of a 32-bit encoding but
AND R0,R0,R1
selects a 16-bit encoding.
The assembly syntax documented for the encoding is chosen to be the simplest one that ensures
selection of that encoding for all operand combinations supported by that encoding. This often means
that it includes elements that are only necessary for a small subset of operand combinations. For
example, the assembler syntax documented for the 32-bit Thumb AND (register) encoding includes the .W
qualifier to ensure that the 32-bit encoding is selected even for the small proportion of operand
combinations for which the 16-bit encoding is also available.
The assembly syntax given for an encoding is therefore a suitable one for a disassembler to
disassemble that encoding to. However, disassemblers might wish to use simpler syntaxes when they
are suitable for the operand combination, in order to produce more readable disassembled code.
An encoding diagram, or a Thumb encoding diagram followed by an ARM encoding diagram when
they are being described together. This is half-width for 16-bit Thumb encodings and full-width for
32-bit Thumb and ARM encodings. The 32-bit Thumb encodings use a double vertical line between
the two halfwords of the instruction to distinguish them from ARM encodings and to act as a
reminder that 32-bit Thumb instructions consist of two consecutive halfwords rather than a word.
In particular, if instructions are stored using the standard little-endian instruction endianness, the
encoding diagram for an ARM instruction at address A shows the bytes at addresses A+3, A+2,
A+1, A from left to right, but the encoding diagram for a 32-bit Thumb instruction shows them in the
order A+1, A for the first halfword, followed by A+3, A+2 for the second halfword.
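The byte ordering described above can be illustrated with a short Python sketch. It assumes little-endian instruction endianness, as in the text, and the function names arm_word and thumb32_diagram_value are hypothetical helpers introduced only for this illustration.

```python
def arm_word(mem, addr):
    """Read a 32-bit ARM instruction stored little-endian at addr.

    The resulting value, printed most significant byte first, has the
    byte order A+3, A+2, A+1, A, matching an ARM encoding diagram.
    """
    return int.from_bytes(mem[addr:addr + 4], "little")


def thumb32_diagram_value(mem, addr):
    """Read a 32-bit Thumb instruction as two consecutive halfwords.

    Each halfword is little-endian, so the diagram order is
    A+1, A for the first halfword, then A+3, A+2 for the second.
    """
    hw1 = int.from_bytes(mem[addr:addr + 2], "little")
    hw2 = int.from_bytes(mem[addr + 2:addr + 4], "little")
    return (hw1 << 16) | hw2


# Bytes at addresses A, A+1, A+2, A+3 (illustrative values only).
mem = bytes([0x00, 0x11, 0x22, 0x33])
print(hex(arm_word(mem, 0)))               # 0x33221100
print(hex(thumb32_diagram_value(mem, 0)))  # 0x11003322
```

The same four bytes therefore produce different left-to-right bit patterns depending on whether they are read as one word or as two halfwords.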
Encoding-specific pseudocode. This is pseudocode that translates the encoding-specific instruction
fields into inputs to the encoding-independent pseudocode in the later Operation subsection, and that
picks out any special cases in the encoding. For a detailed description of the pseudocode used and of
the relationship between the encoding diagram, the encoding-specific pseudocode and the
encoding-independent pseudocode, see Appendix I Pseudocode Definition.
A8.1.4 Assembler syntax
The Assembly syntax subsection describes the standard UAL syntax for the instruction.
Each syntax description consists of the following elements:
One or more syntax prototype lines written in a typewriter font, using the conventions described in
Assembler syntax prototype line conventions on page A8-5. Each prototype line documents the
mnemonic and (where appropriate) operand parts of a full line of assembler code. When there is more
than one such line, each prototype line is annotated to indicate required results of the
encoding-specific pseudocode. For each instruction encoding, this information can be used to
determine whether any instructions matching that encoding are available when assembling that
syntax, and if so, which ones.
The line where: followed by descriptions of all of the variable or optional fields of the prototype
syntax line.
Some syntax fields are standardized across all or most instructions. Standard assembler syntax fields
on page A8-7 describes these fields.
By default, syntax fields that specify registers, such as <Rd>, <Rn>, or <Rt>, can be any of R0-R12 or
LR in Thumb instructions, and any of R0-R12, SP or LR in ARM instructions. These require that the
encoding-specific pseudocode set the corresponding integer variable (such as d, n, or t) to the
corresponding register number (0-12 for R0-R12, 13 for SP, 14 for LR). This can normally be done
by setting the corresponding bitfield in the instruction (named Rd, Rn, Rt…) to the binary encoding
of that number. In the case of 16-bit Thumb encodings, this bitfield is normally of length 3 and so the
encoding is only available when one of R0-R7 is specified in the assembler syntax. It is also common
for such encodings to use a bitfield name such as Rdn. This indicates that the encoding is only
available if <Rd> and <Rn> specify the same register, and that the register number of that register is
encoded in the bitfield if they do.
The description of a syntax field that specifies a register sometimes extends or restricts the permitted
range of registers or documents other differences from the default rules for such fields. Typical
extensions are to permit the use of the SP in Thumb instructions and to permit the use of the PC (using
register number 15).
Where appropriate, text that briefly describes changes from the pre-UAL ARM assembler syntax.
Where present, this usually consists of an alternative pre-UAL form of the assembler mnemonic. The
pre-UAL ARM assembler syntax does not conflict with UAL, and support for it is a recommended
optional extension to UAL, to enable the assembly of pre-UAL ARM assembler source files.
Note
The pre-UAL Thumb assembler syntax is incompatible with UAL and is not documented in the instruction
sections. For details see Appendix C Legacy Instruction Mnemonics.
Assembler syntax prototype line conventions
The following conventions are used in assembler syntax prototype lines and their subfields:
< > Any item bracketed by < and > is a short description of a type of value to be supplied by the
user in that position. A longer description of the item is normally supplied by subsequent
text. Such items often correspond to a similarly named field in an encoding diagram for an
instruction. When the correspondence simply requires the binary encoding of an integer
value or register number to be substituted into the instruction encoding, it is not described
explicitly. For example, if the assembler syntax for an ARM instruction contains an item
<Rn> and the instruction encoding diagram contains a 4-bit field named Rn, the number of
the register specified in the assembler syntax is encoded in binary in the instruction field.
If the correspondence between the assembler syntax item and the instruction encoding is
more complex than simple binary encoding of an integer or register number, the item
description indicates how it is encoded. This is often done by specifying a required output
from the encoding-specific pseudocode, such as add = TRUE. The assembler must only use
encodings that produce that output.
{} Any item bracketed by { and } is optional. A description of the item and of how its presence
or absence is encoded in the instruction is normally supplied by subsequent text.
Many instructions have an optional destination register. Unless otherwise stated, if such a
destination register is omitted, it is the same as the immediately following source register in
the instruction syntax.
spaces Single spaces are used for clarity, to separate items. When a space is obligatory in the
assembler syntax, two or more consecutive spaces are used.
+/- This indicates an optional + or - sign. If neither is coded, + is assumed.
All other characters must be encoded precisely as they appear in the assembler syntax. Apart from { and },
the special characters described above do not appear in the basic forms of assembler instructions
documented in this manual. The { and } characters need to be encoded in a few places as part of a variable
item. When this happens, the long description of the variable item indicates how they must be used.
A8.1.5 Pseudocode describing how the instruction operates
The Operation subsection contains encoding-independent pseudocode that describes the main operation of
the instruction. For a detailed description of the pseudocode used and of the relationship between the
encoding diagram, the encoding-specific pseudocode and the encoding-independent pseudocode, see
Appendix I Pseudocode Definition.
A8.1.6 Exception information
The Exceptions subsection contains a list of the exceptional conditions that can be caused by execution of
the instruction.
Processor exceptions are listed as follows:
Resets and interrupts (both IRQs and FIQs) are not listed. They can occur before or after the
execution of any instruction, and in some cases during the execution of an instruction, but they are
not in general caused by the instruction concerned.
Prefetch Abort exceptions are normally caused by a memory abort when an instruction is fetched,
followed by an attempt to execute that instruction. This can happen for any instruction, but is caused
by the aborted attempt to fetch the instruction rather than by the instruction itself, and so is not listed.
A special case is the BKPT instruction, which is defined as causing a Prefetch Abort exception in some
circumstances.
Data Abort exceptions are listed for all instructions that perform data memory accesses.
Undefined Instruction exceptions are listed when they are part of the effects of a defined instruction.
For example, all coprocessor instructions are defined to produce the Undefined Instruction exception
if not accepted by their coprocessor. Undefined Instruction exceptions caused by the execution of an
UNDEFINED instruction are not listed, even when the UNDEFINED instruction is a special case of one
or more of the encodings of the instruction. Such special cases are instead indicated in the
encoding-specific pseudocode for the encoding.
Supervisor Call and Secure Monitor Call exceptions are listed for the SVC and SMC instructions
respectively. Supervisor Call exceptions and the SVC instruction were previously called Software
Interrupt exceptions and the SWI instruction. Secure Monitor Call exceptions and the SMC instruction
were previously called Secure Monitor interrupts and the SMI instruction.
Floating-point exceptions are listed for instructions that can produce them. Floating-point exceptions on
page A2-42 describes these exceptions. They do not normally result in processor exceptions.
A8.1.7 Notes
Where appropriate, other notes about the instruction appear under additional subheadings.
Note
Information that was documented in notes in previous versions of the ARM Architecture Reference Manual
and its supplements has often been moved elsewhere. For example, operand restrictions on the values of
bitfields in an instruction encoding are now normally documented in the encoding-specific pseudocode for
that encoding.
A8.2 Standard assembler syntax fields
The following assembler syntax fields are standard across all or most instructions:
<c> Is an optional field. It specifies the condition under which the instruction is executed. See
Conditional execution on page A8-8 for the range of available conditions and their
encoding. If <c> is omitted, it defaults to always (AL).
<q> Specifies optional assembler qualifiers on the instruction. The following qualifiers are
defined:
.N Meaning narrow, specifies that the assembler must select a 16-bit encoding for
the instruction. If this is not possible, an assembler error is produced.
.W Meaning wide, specifies that the assembler must select a 32-bit encoding for the
instruction. If this is not possible, an assembler error is produced.
If neither .W nor .N is specified, the assembler can select either 16-bit or 32-bit encodings.
If both are available, it must select a 16-bit encoding. In a few cases, more than one encoding
of the same length can be available for an instruction. The rules for selecting between such
encodings are instruction-specific and are part of the instruction description.
Note
When assembling to the ARM instruction set, the .N qualifier produces an assembler error
and the .W qualifier has no effect.
Although the instruction descriptions throughout this manual show the <c> and <q> fields without { } around
them, these fields are optional as described in this section.
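The <q> selection rules for Thumb assembly can be sketched in Python as follows. This is only an illustration of the rules stated in this section: the function name select_encoding_width and the use of ValueError to stand in for an assembler error are assumptions made for the example, and the instruction-specific rules for choosing between encodings of the same length are not modelled.

```python
def select_encoding_width(qualifier, available_widths):
    """Pick a Thumb encoding width (16 or 32) according to the <q> field.

    qualifier is ".N", ".W", or None when <q> is omitted.
    available_widths is the set of widths that can encode the operands.
    A ValueError stands in for an assembler error.
    """
    if qualifier == ".N":
        candidates = [16]
    elif qualifier == ".W":
        candidates = [32]
    elif qualifier is None:
        # Either width is acceptable, but 16-bit must be chosen if available.
        candidates = [16, 32]
    else:
        raise ValueError("unknown qualifier: %r" % (qualifier,))

    for width in candidates:
        if width in available_widths:
            return width
    raise ValueError("no encoding of the requested width is available")


print(select_encoding_width(None, {16, 32}))  # 16
print(select_encoding_width(".W", {16, 32}))  # 32
print(select_encoding_width(None, {32}))      # 32
```

Requesting .N when only a 32-bit encoding exists raises the error, mirroring the assembler error described for that case.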
A8.3 Conditional execution
Most ARM instructions, and most Thumb instructions from ARMv6T2 onwards, can be executed
conditionally, based on the values of the APSR condition flags. Before ARMv6T2, the only conditional
Thumb instruction was the 16-bit conditional branch instruction. Table A8-1 lists the available conditions.
In Thumb instructions, the condition (if it is not AL) is normally encoded in a preceding IT instruction. For
details see Conditional instructions on page A4-4 and IT on page A8-104. Some conditional branch
instructions do not require a preceding IT instruction, and include a condition code in their encoding.
In ARM instructions, bits [31:28] of the instruction contain the condition, or contain 1111 for some ARM
instructions that can only be executed unconditionally.
Table A8-1 Condition codes

cond  Mnemonic       Meaning (integer)             Meaning (floating-point) [a]       Condition flags
      extension
0000  EQ             Equal                         Equal                              Z == 1
0001  NE             Not equal                     Not equal, or unordered            Z == 0
0010  CS [b]         Carry set                     Greater than, equal, or unordered  C == 1
0011  CC [c]         Carry clear                   Less than                          C == 0
0100  MI             Minus, negative               Less than                          N == 1
0101  PL             Plus, positive or zero        Greater than, equal, or unordered  N == 0
0110  VS             Overflow                      Unordered                          V == 1
0111  VC             No overflow                   Not unordered                      V == 0
1000  HI             Unsigned higher               Greater than, or unordered         C == 1 and Z == 0
1001  LS             Unsigned lower or same        Less than or equal                 C == 0 or Z == 1
1010  GE             Signed greater than or equal  Greater than or equal              N == V
1011  LT             Signed less than              Less than, or unordered            N != V
1100  GT             Signed greater than           Greater than                       Z == 0 and N == V
1101  LE             Signed less than or equal     Less than, equal, or unordered     Z == 1 or N != V
1110  None (AL) [d]  Always (unconditional)        Always (unconditional)             Any

a. Unordered means at least one NaN operand.
b. HS (unsigned higher or same) is a synonym for CS.
c. LO (unsigned lower) is a synonym for CC.
d. AL is an optional mnemonic extension for always, except in IT instructions. For details see IT on page A8-104.
A8.3.1 Pseudocode details of conditional execution
The CurrentCond() pseudocode function has prototype:
bits(4) CurrentCond()
and returns a 4-bit condition specifier as follows:
For ARM instructions, it returns bits[31:28] of the instruction.
For the T1 and T3 encodings of the Branch instruction (see B on page A8-44), it returns the 4-bit
'cond' field of the encoding.
For all other Thumb and ThumbEE instructions, it returns ITSTATE.IT<7:4>. See ITSTATE on
page A2-17.
The ConditionPassed() function uses this condition specifier and the APSR condition flags to determine
whether the instruction must be executed:
// ConditionPassed()
// =================
boolean ConditionPassed()
    cond = CurrentCond();
    // Evaluate base condition.
    case cond<3:1> of
        when ‘000’ result = (APSR.Z == ‘1’); // EQ or NE
        when ‘001’ result = (APSR.C == ‘1’); // CS or CC
        when ‘010’ result = (APSR.N == ‘1’); // MI or PL
        when ‘011’ result = (APSR.V == ‘1’); // VS or VC
        when ‘100’ result = (APSR.C == ‘1’) && (APSR.Z == ‘0’); // HI or LS
        when ‘101’ result = (APSR.N == APSR.V); // GE or LT
        when ‘110’ result = (APSR.N == APSR.V) && (APSR.Z == ‘0’); // GT or LE
        when ‘111’ result = TRUE; // AL
    // Condition bits ‘111x’ indicate the instruction is always executed. Otherwise,
    // invert condition if necessary.
    if cond<0> == ‘1’ && cond != ‘1111’ then
        result = !result;
    return result;
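For readers who want to experiment with the condition evaluation, the following Python sketch mirrors the ConditionPassed() pseudocode. It is an illustration only: the function name condition_passed and the choice to pass the condition code and the four flags as integer arguments, rather than reading them from processor state, are assumptions of the sketch.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate a 4-bit condition code against the N, Z, C and V flags.

    cond is an integer in the range 0-15; each flag is 0 or 1.
    """
    base = (cond >> 1) & 0b111          # cond<3:1> selects the base condition
    if base == 0b000:
        result = z == 1                 # EQ or NE
    elif base == 0b001:
        result = c == 1                 # CS or CC
    elif base == 0b010:
        result = n == 1                 # MI or PL
    elif base == 0b011:
        result = v == 1                 # VS or VC
    elif base == 0b100:
        result = c == 1 and z == 0      # HI or LS
    elif base == 0b101:
        result = n == v                 # GE or LT
    elif base == 0b110:
        result = n == v and z == 0      # GT or LE
    else:
        result = True                   # AL

    # cond<0> inverts the base condition, except for 0b1111.
    if (cond & 1) == 1 and cond != 0b1111:
        result = not result
    return result


print(condition_passed(0b0000, n=0, z=1, c=0, v=0))  # EQ with Z set: True
print(condition_passed(0b1011, n=1, z=0, c=0, v=0))  # LT with N != V: True
```

Checking a few rows of Table A8-1 against this function is a quick way to confirm that the inverted conditions, such as LS and LE, follow from their base conditions.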
A8.4 Shifts applied to a register
ARM register offset load/store word and unsigned byte instructions can apply a wide range of different
constant shifts to the offset register. Both Thumb and ARM data-processing instructions can apply the same
range of different constant shifts to the second operand register. For details see Constant shifts.
ARM data-processing instructions can apply a register-controlled shift to the second operand register.
A8.4.1 Constant shifts
These are the same in Thumb and ARM instructions, except that the input bits come from different
positions.
<shift> is an optional shift to be applied to <Rm>. It can be any one of:
(omitted) No shift.
LSL #<n> Logical shift left <n> bits. 1 <= <n> <= 31.
LSR #<n> Logical shift right <n> bits. 1 <= <n> <= 32.
ASR #<n> Arithmetic shift right <n> bits. 1 <= <n> <= 32.
ROR #<n> Rotate right <n> bits. 1 <= <n> <= 31.
RRX Rotate right one bit, with extend. Bit [0] is written to shifter_carry_out, bits [31:1] are
shifted right one bit, and the Carry Flag is shifted into bit [31].
Note
Assemblers can permit the use of some or all of ASR #0, LSL #0, LSR #0, and ROR #0 to specify that no shift is
to be performed. This is not standard UAL, and the encoding selected for Thumb instructions might vary
between UAL assemblers if it is used. To ensure disassembled code assembles to the original instructions,
disassemblers must omit the shift specifier when the instruction specifies no shift.
Similarly, assemblers can permit the use of #0 in the immediate forms of ASR, LSL, LSR, and ROR instructions
to specify that no shift is to be performed, that is, that a MOV (register) instruction is wanted. Again, this is
not standard UAL, and the encoding selected for Thumb instructions might vary between UAL assemblers
if it is used. To ensure disassembled code assembles to the original instructions, disassemblers must use the
MOV (register) syntax when the instruction specifies no shift.
Encoding
The assembler encodes <shift> into two type bits and five immediate bits, as follows:
(omitted) type = 0b00, immediate = 0.
LSL #<n> type = 0b00, immediate = <n>.
LSR #<n> type = 0b01.
If <n> < 32, immediate = <n>.
If <n> == 32, immediate = 0.
ASR #<n> type = 0b10.
If <n> < 32, immediate = <n>.
If <n> == 32, immediate = 0.
ROR #<n> type = 0b11, immediate = <n>.
RRX type = 0b11, immediate = 0.
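The encoding rules above can be written as a small Python sketch. The function name encode_imm_shift, the string names for the shift kinds, and the use of ValueError for an out-of-range amount are assumptions introduced for this illustration; they are not part of the architecture.

```python
def encode_imm_shift(shift=None, n=0):
    """Encode a constant shift as a (type, immediate) pair.

    shift is None (omitted), "LSL", "LSR", "ASR", "ROR" or "RRX".
    n is the shift amount <n>, ignored for an omitted shift and for RRX.
    """
    if shift is None:
        return (0b00, 0)
    if shift == "LSL":
        if not 1 <= n <= 31:
            raise ValueError("LSL amount must be in the range 1-31")
        return (0b00, n)
    if shift in ("LSR", "ASR"):
        if not 1 <= n <= 32:
            raise ValueError("%s amount must be in the range 1-32" % shift)
        type_bits = 0b01 if shift == "LSR" else 0b10
        # A shift by 32 is encoded with an immediate of 0.
        return (type_bits, 0 if n == 32 else n)
    if shift == "ROR":
        if not 1 <= n <= 31:
            raise ValueError("ROR amount must be in the range 1-31")
        return (0b11, n)
    if shift == "RRX":
        return (0b11, 0)
    raise ValueError("unknown shift: %r" % (shift,))


print(encode_imm_shift("LSR", 32))  # (1, 0)
print(encode_imm_shift("ROR", 5))   # (3, 5)
print(encode_imm_shift("RRX"))      # (3, 0)
```

Because an immediate of 0 is reused for LSR #32, ASR #32 and RRX, the encoding has no spare pattern for LSR #0, ASR #0 or ROR #0, which is consistent with those forms not being standard UAL.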
A8.4.2 Register controlled shifts
These are only available in ARM instructions.
<type> is the type of shift to apply to the value read from <Rm>. It must be one of:
ASR Arithmetic shift right, encoded as type = 0b10
LSL Logical shift left, encoded as type = 0b00
LSR Logical shift right, encoded as type = 0b01
ROR Rotate right, encoded as type = 0b11.
The bottom byte of <Rs> contains the shift amount.
A8.4.3 Pseudocode details of instruction-specified shifts and rotates
enumeration SRType (SRType_LSL, SRType_LSR, SRType_ASR, SRType_ROR, SRType_RRX);
// DecodeImmShift()
// ================
(SRType, integer) DecodeImmShift(bits(2) type, bits(5) imm5)
    case type of
        when ‘00’
            shift_t = SRType_LSL; shift_n = UInt(imm5);
        when ‘01’
            shift_t = SRType_LSR; shift_n = if imm5 == ‘00000’ then 32 else UInt(imm5);
        when ‘10’
            shift_t = SRType_ASR; shift_n = if imm5 == ‘00000’ then 32 else UInt(imm5);
        when ‘11’
            if imm5 == ‘00000’ then
                shift_t = SRType_RRX; shift_n = 1;
            else
                shift_t = SRType_ROR; shift_n = UInt(imm5);
    return (shift_t, shift_n);
// DecodeRegShift()
// ================
SRType DecodeRegShift(bits(2) type)
    case type of
        when ‘00’ shift_t = SRType_LSL;
        when ‘01’ shift_t = SRType_LSR;
        when ‘10’ shift_t = SRType_ASR;
        when ‘11’ shift_t = SRType_ROR;
    return shift_t;
// Shift()
// =======
bits(N) Shift(bits(N) value, SRType type, integer amount, bit carry_in)
    (result, -) = Shift_C(value, type, amount, carry_in);
    return result;
// Shift_C()
// =========
(bits(N), bit) Shift_C(bits(N) value, SRType type, integer amount, bit carry_in)
    assert !(type == SRType_RRX && amount != 1);
    if amount == 0 then
        (result, carry_out) = (value, carry_in);
    else
        case type of
            when SRType_LSL
                (result, carry_out) = LSL_C(value, amount);
            when SRType_LSR
                (result, carry_out) = LSR_C(value, amount);
            when SRType_ASR
                (result, carry_out) = ASR_C(value, amount);
            when SRType_ROR
                (result, carry_out) = ROR_C(value, amount);
            when SRType_RRX
                (result, carry_out) = RRX_C(value, carry_in);
    return (result, carry_out);
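The decode and shift functions above can be modelled in Python for 32-bit values. This is a sketch under stated assumptions: LSL_C(), LSR_C(), ASR_C(), ROR_C() and RRX_C() are defined elsewhere in this manual, and the sketch assumes their usual behaviour, where the carry out is the last bit shifted out and, for a rotate, the top bit of the result. The names decode_imm_shift and shift_c, and the use of strings for the shift types, are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def decode_imm_shift(type_bits, imm5):
    """Return (shift kind, amount) for a 2-bit type and 5-bit immediate."""
    if type_bits == 0b00:
        return ("LSL", imm5)
    if type_bits == 0b01:
        return ("LSR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b10:
        return ("ASR", 32 if imm5 == 0 else imm5)
    if type_bits == 0b11:
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type must be a 2-bit value")


def shift_c(value, kind, amount, carry_in):
    """Shift a 32-bit value, returning (result, carry_out)."""
    assert not (kind == "RRX" and amount != 1)
    if amount == 0:
        return (value, carry_in)
    if kind == "LSL":
        extended = value << amount
        return (extended & MASK32, (extended >> 32) & 1)
    if kind == "LSR":
        return ((value >> amount) & MASK32, (value >> (amount - 1)) & 1)
    if kind == "ASR":
        # Reinterpret as signed so that >> replicates the sign bit.
        signed = value - (1 << 32) if value & 0x80000000 else value
        return ((signed >> amount) & MASK32, (signed >> (amount - 1)) & 1)
    if kind == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return (result, result >> 31)
    if kind == "RRX":
        return ((carry_in << 31) | (value >> 1), value & 1)
    raise ValueError("unknown shift kind: %r" % (kind,))


kind, amount = decode_imm_shift(0b01, 0)      # LSR #32
print(kind, amount)                           # LSR 32
print(shift_c(0x80000001, "LSR", 1, 0))       # (1073741824, 1)
print(shift_c(0x00000003, "RRX", 1, 1))       # (2147483649, 1)
```

Note that an amount of 0 returns the value and the incoming carry unchanged, which is why a decoded LSL #0 leaves the carry flag alone.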
A8.5 Memory accesses
Commonly, the following addressing modes are permitted for memory access instructions:
Offset addressing
The offset value is applied to an address obtained from the base register. The result is used
as the address for the memory access. The value of the base register is unchanged.
The assembly language syntax for this mode is:
[<Rn>,<offset>]
Pre-indexed addressing
The offset value is applied to an address obtained from the base register. The result is used
as the address for the memory access, and written back into the base register.
The assembly language syntax for this mode is:
[<Rn>,<offset>]!
Post-indexed addressing
The address obtained from the base register is used, unchanged, as the address for the
memory access. The offset value is applied to the address, and written back into the base
register.
The assembly language syntax for this mode is:
[<Rn>],<offset>
In each case, <Rn> is the base register. <offset> can be:
an immediate constant, such as <imm8> or <imm12>
an index register, <Rm>
a shifted index register, such as <Rm>, LSL #<shift>.
For information about unaligned access, endianness, and exclusive access, see:
Alignment support on page A3-4
Endian support on page A3-7
Synchronization and semaphores on page A3-12.
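The difference between the three addressing modes can be summarized in a short Python sketch. The function name access_address, the mode strings, and the add parameter (which selects whether the offset is added or subtracted) are assumptions made for this illustration.

```python
def access_address(base, offset, mode, add=True):
    """Return (address used for the access, value left in the base register).

    mode is "offset", "pre" (pre-indexed) or "post" (post-indexed).
    Arithmetic wraps at 32 bits.
    """
    if add:
        offset_addr = (base + offset) & 0xFFFFFFFF
    else:
        offset_addr = (base - offset) & 0xFFFFFFFF

    if mode == "offset":
        return (offset_addr, base)         # base register unchanged
    if mode == "pre":
        return (offset_addr, offset_addr)  # access at offset address, then write back
    if mode == "post":
        return (base, offset_addr)         # access at base, then write back
    raise ValueError("unknown addressing mode: %r" % (mode,))


print(access_address(0x1000, 8, "offset"))  # (4104, 4096)
print(access_address(0x1000, 8, "pre"))     # (4104, 4104)
print(access_address(0x1000, 8, "post"))    # (4096, 4104)
```

Only the post-indexed form accesses memory at the unmodified base address, and only the offset form leaves the base register unchanged.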
A8.6 Alphabetical list of instructions
Every instruction is listed in this section. For details of the format used see Format of instruction
descriptions on page A8-2.
A8.6.1 ADC (immediate)
Add with Carry (immediate) adds an immediate value and the carry flag value to a register value, and writes
the result to the destination register. It can optionally update the condition flags based on the result.
Encoding T1 ARMv6T2, ARMv7
ADC{S}<c> <Rd>,<Rn>,#<const>
 15:11   10   9:5     4   3:0  ||  15   14:12   11:8   7:0
 11110   i    01010   S   Rn   ||  0    imm3    Rd     imm8
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’); imm32 = ThumbExpandImm(i:imm3:imm8);
if BadReg(d) || BadReg(n) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADC{S}<c> <Rd>,<Rn>,#<const>
 31:28   27:21     20   19:16   15:12   11:0
 cond    0010101   S    Rn      Rd      imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’); imm32 = ARMExpandImm(imm12);
Assembler syntax
ADC{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q> See Standard assembler syntax fields on page A8-7.
<Rd> The destination register.
<Rn> The first operand register.
<const> The immediate value to be added to the value obtained from <Rn>. See Modified immediate
constants in Thumb instructions on page A6-17 or Modified immediate constants in ARM
instructions on page A5-9 for the range of values.
The pre-UAL syntax ADC<c>S is equivalent to ADCS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(R[n], imm32, APSR.C);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
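The arithmetic in the Operation pseudocode above can be tried out with the following Python sketch. AddWithCarry() is defined elsewhere in this manual; the sketch assumes it performs a 32-bit addition of two operands and a carry in, returning the truncated result, the unsigned carry out, and the signed overflow. The helper names to_signed32, add_with_carry and adc_flags are hypothetical, and register write-back and the PC case are not modelled.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit unsigned value as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def add_with_carry(x, y, carry_in):
    """Return (result, carry_out, overflow) for a 32-bit add with carry."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed32(x) + to_signed32(y) + carry_in
    result = unsigned_sum & MASK32
    carry_out = 0 if result == unsigned_sum else 1
    overflow = 0 if to_signed32(result) == signed_sum else 1
    return (result, carry_out, overflow)


def adc_flags(rn_value, imm32, carry_in):
    """Compute the ADC result and the flag values written when setflags is TRUE."""
    result, carry, overflow = add_with_carry(rn_value, imm32, carry_in)
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry, "V": overflow}
    return (result, flags)


print(add_with_carry(0xFFFFFFFF, 0, 1))  # (0, 1, 0)
print(add_with_carry(0x7FFFFFFF, 0, 1))  # (2147483648, 0, 1)
```

The first case shows an unsigned carry without signed overflow, and the second shows signed overflow without an unsigned carry, which is why the C and V flags are tracked separately.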
Exceptions
None.
A8.6.2 ADC (register)
Add with Carry (register) adds a register value, the carry flag value, and an optionally-shifted register value,
and writes the result to the destination register. It can optionally update the condition flags based on the
result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADCS <Rdn>,<Rm>      Outside IT block.
ADC<c> <Rdn>,<Rm>    Inside IT block.
 15:6         5:3   2:0
 0100000101   Rm    Rdn
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
ADC{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
 15:5          4   3:0  ||  15    14:12   11:8   7:6    5:4    3:0
 11101011010   S   Rn   ||  (0)   imm3    Rd     imm2   type   Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADC{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
 31:28   27:21     20   19:16   15:12   11:7   6:5    4   3:0
 cond    0000101   S    Rn      Rd      imm5   type   0   Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
ADC{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q> See Standard assembler syntax fields on page A8-7.
<Rd> The destination register.
<Rn> The first operand register.
<Rm> The optionally shifted second operand register.
<shift> The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and any encoding is permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
In Thumb assembly:
outside an IT block, if ADCS <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled
using encoding T1 as though ADCS <Rd>,<Rn> had been written.
inside an IT block, if ADC<c> <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is
assembled using encoding T1 as though ADC<c> <Rd>,<Rn> had been written.
To prevent either of these happening, use the .W qualifier.
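One possible reading of these selection rules, for Thumb assembly only, is sketched below in Python. It is a simplified illustration and not an assembler implementation: the function name select_adc_register_encoding and its parameters are hypothetical, registers are given as numbers, and the restrictions on encoding T2 (such as the BadReg checks) are not modelled.

```python
def select_adc_register_encoding(d, n, m, setflags, in_it_block,
                                 shift=None, qualifier=None):
    """Choose between encodings T1 and T2 of ADC (register).

    d, n and m are the register numbers of <Rd>, <Rn> and <Rm>.
    A ValueError stands in for an assembler error.
    """
    low_registers = all(r <= 7 for r in (d, n, m))
    # T1 sets flags outside an IT block and does not set them inside one.
    flags_match = setflags == (not in_it_block)
    # T1 has a single Rdn field, so <Rd> must match one source operand.
    # When <Rd> matches <Rm>, the two operands are swapped, as addition commutes.
    shares_destination = d == n or d == m
    t1_available = (shift is None and low_registers
                    and flags_match and shares_destination)

    if qualifier == ".W":
        return "T2"
    if t1_available:
        return "T1"
    if qualifier == ".N":
        raise ValueError("no 16-bit encoding is available")
    return "T2"


# ADCS R0,R1,R0 outside an IT block is assembled as ADCS R0,R1 (encoding T1).
print(select_adc_register_encoding(0, 1, 0, setflags=True, in_it_block=False))
# The .W qualifier forces the 32-bit encoding.
print(select_adc_register_encoding(0, 1, 0, setflags=True, in_it_block=False,
                                   qualifier=".W"))
```

The two calls print T1 and T2 respectively, showing the effect of the .W qualifier described above.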
The pre-UAL syntax ADC<c>S is equivalent to ADCS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, APSR.C);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
Exceptions
None.
A8.6.3 ADC (register-shifted register)
Add with Carry (register-shifted register) adds a register value, the carry flag value, and a register-shifted
register value. It writes the result to the destination register, and can optionally update the condition flags
based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADC{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
 31:28   27:21     20   19:16   15:12   11:8   7   6:5    4   3:0
 cond    0000101   S    Rn      Rd      Rs     0   type   1   Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
ADC{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q> See Standard assembler syntax fields on page A8-7.
<Rd> The destination register.
<Rn> The first operand register.
<Rm> The register that is shifted and used as the second operand.
<type> The type of shift to apply to the value read from <Rm>. It must be one of:
ASR Arithmetic shift right, encoded as type = 0b10
LSL Logical shift left, encoded as type = 0b00
LSR Logical shift right, encoded as type = 0b01
ROR Rotate right, encoded as type = 0b11.
<Rs> The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax ADC<c>S is equivalent to ADCS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, APSR.C);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        APSR.V = overflow;
Exceptions
None.
A8.6.4 ADD (immediate, Thumb)
This instruction adds an immediate value to a register value, and writes the result to the destination register.
It can optionally update the condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADDS <Rd>,<Rn>,#<imm3>      Outside IT block.
ADD<c> <Rd>,<Rn>,#<imm3>    Inside IT block.
 15:9      8:6    5:3   2:0
 0001110   imm3   Rn    Rd
d = UInt(Rd); n = UInt(Rn); setflags = !InITBlock(); imm32 = ZeroExtend(imm3, 32);
Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADDS <Rdn>,#<imm8>      Outside IT block.
ADD<c> <Rdn>,#<imm8>    Inside IT block.
 15:11   10:8   7:0
 00110   Rdn    imm8
d = UInt(Rdn); n = UInt(Rdn); setflags = !InITBlock(); imm32 = ZeroExtend(imm8, 32);
Encoding T3 ARMv6T2, ARMv7
ADD{S}<c>.W <Rd>,<Rn>,#<const>
 15:11   10   9:5     4   3:0  ||  15   14:12   11:8   7:0
 11110   i    01000   S   Rn   ||  0    imm3    Rd     imm8
if Rd == ‘1111’ && S == ‘1’ then SEE CMN (immediate);
if Rn == ‘1101’ then SEE ADD (SP plus immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’); imm32 = ThumbExpandImm(i:imm3:imm8);
if BadReg(d) || n == 15 then UNPREDICTABLE;
Encoding T4 ARMv6T2, ARMv7
ADDW<c> <Rd>,<Rn>,#<imm12>
 15:11   10   9:4      3:0  ||  15   14:12   11:8   7:0
 11110   i    100000   Rn   ||  0    imm3    Rd     imm8
if Rn == ‘1111’ then SEE ADR;
if Rn == ‘1101’ then SEE ADD (SP plus immediate);
d = UInt(Rd); n = UInt(Rn); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32);
if BadReg(d) then UNPREDICTABLE;
Assembler syntax
ADD{S}<c><q> {<Rd>,} <Rn>, #<const>     All encodings permitted
ADDW<c><q> {<Rd>,} <Rn>, #<const>       Only encoding T4 permitted
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register. If <Rn> is SP, see ADD (SP plus immediate) on page A8-28. If <Rn> is PC, see ADR on page A8-32.
<const>  The immediate value to be added to the value obtained from <Rn>. The range of values is 0-7 for encoding T1, 0-255 for encoding T2 and 0-4095 for encoding T4. See Modified immediate constants in Thumb instructions on page A6-17 for the range of values for encoding T3.
When multiple encodings of the same length are available for an instruction, encoding T3 is preferred to encoding T4 (if encoding T4 is required, use the ADDW syntax). Encoding T1 is preferred to encoding T2 if <Rd> is specified and encoding T2 is preferred to encoding T1 if <Rd> is omitted.
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(R[n], imm32, ‘0’);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        APSR.V = overflow;
Exceptions
None.
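The Operation pseudocode above relies on AddWithCarry, which is defined elsewhere in this manual. The following Python sketch is an illustrative model only, assuming 32-bit register values held as non-negative integers; the names add_with_carry and to_signed are hypothetical and are not part of the architecture pseudocode.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def add_with_carry(x, y, carry_in):
    """Model of a 32-bit add returning (result, carry, overflow)."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK32
    # Carry is set when the unsigned sum does not fit in 32 bits.
    carry = 0 if result == unsigned_sum else 1
    # Overflow is set when the signed sum does not fit in 32 bits.
    overflow = 0 if to_signed(result) == signed_sum else 1
    return result, carry, overflow

def flags_for(result, carry, overflow):
    """Flag values written when setflags is TRUE (N, Z, C, V)."""
    return {"N": result >> 31, "Z": int(result == 0), "C": carry, "V": overflow}
```

For example, adding 1 to 0xFFFFFFFF wraps to zero with the carry set, while adding 1 to 0x7FFFFFFF sets the overflow flag instead.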
A8.6.5 ADD (immediate, ARM)
This instruction adds an immediate value to a register value, and writes the result to the destination register.
It can optionally update the condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADD{S}<c> <Rd>,<Rn>,#<const>
Bits: [31:28] cond | [27:21] 0010100 | [20] S | [19:16] Rn | [15:12] Rd | [11:0] imm12
if Rn == ‘1111’ && S == ‘0’ then SEE ADR;
if Rn == ‘1101’ then SEE ADD (SP plus immediate);
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’); imm32 = ARMExpandImm(imm12);
Assembler syntax
ADD{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register. If the SP is specified for <Rn>, see ADD (SP plus immediate) on page A8-28. If the PC is specified for <Rn>, see ADR on page A8-32.
<const>  The immediate value to be added to the value obtained from <Rn>. See Modified immediate constants in ARM instructions on page A5-9 for the range of values.
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(R[n], imm32, ‘0’);
    if d == 15 then
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
Exceptions
None.
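Encoding A1 forms imm32 with ARMExpandImm(imm12), which is described under Modified immediate constants in ARM instructions. As a hedged illustration, the Python sketch below assumes the usual layout of imm12 as a 4-bit rotation field above an 8-bit value, with the value rotated right by twice the rotation field. The function names are hypothetical.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def arm_expand_imm(imm12):
    """Expand a 12-bit ARM modified immediate into a 32-bit constant."""
    rotation = 2 * ((imm12 >> 8) & 0xF)   # imm12<11:8>, doubled
    unrotated = imm12 & 0xFF              # imm12<7:0>, zero-extended
    return ror32(unrotated, rotation)
```

Under these assumptions, an imm12 of 0x4FF expands to 0xFF000000, so not every 32-bit constant can be expressed as <const>.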
A8.6.6 ADD (register)
This instruction adds a register value and an optionally-shifted register value, and writes the result to the
destination register. It can optionally update the condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADDS <Rd>,<Rn>,<Rm>             Outside IT block.
ADD<c> <Rd>,<Rn>,<Rm>           Inside IT block.
Bits: [15:9] 0001100 | [8:6] Rm | [5:3] Rn | [2:0] Rd
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2 ARMv6T2, ARMv7 if <Rdn> and <Rm> are both from R0-R7
            ARMv4T, ARMv5T*, ARMv6*, ARMv7 otherwise
ADD<c> <Rdn>,<Rm>               If <Rdn> is the PC, must be outside or last in IT block.
Bits: [15:8] 01000100 | [7] DN | [6:3] Rm | [2:0] Rdn
if (DN:Rdn) == ‘1101’ || Rm == ‘1101’ then SEE ADD (SP plus register);
d = UInt(DN:Rdn); n = d; m = UInt(Rm); setflags = FALSE; (shift_t, shift_n) = (SRType_LSL, 0);
if n == 15 && m == 15 then UNPREDICTABLE;
if d == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;

Encoding T3 ARMv6T2, ARMv7
ADD{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
First halfword:  [15:5] 11101011000 | [4] S | [3:0] Rn
Second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type | [3:0] Rm
if Rd == ‘1111’ && S == ‘1’ then SEE CMN (register);
if Rn == ‘1101’ then SEE ADD (SP plus register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || n == 15 || BadReg(m) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADD{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
Bits: [31:28] cond | [27:21] 0000100 | [20] S | [19:16] Rn | [15:12] Rd | [11:7] imm5 | [6:5] type | [4] 0 | [3:0] Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
if Rn == ‘1101’ then SEE ADD (SP plus register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
ADD{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register. If omitted, <Rd> is the same as <Rn> and encoding T2 is preferred to encoding T1 inside an IT block. If <Rd> is present, encoding T1 is preferred to encoding T2.
<Rn>     The first operand register. If <Rn> is SP, see ADD (SP plus register) on page A8-30.
<Rm>     The register that is optionally shifted and used as the second operand.
<shift>  The shift to apply to the value read from <Rm>. If present, only encoding T3 or A1 is permitted. If omitted, no shift is applied and any encoding is permitted. Shifts applied to a register on page A8-10 describes the shifts and how they are encoded.
In Thumb assembly, inside an IT block, if ADD<c> <Rd>,<Rn>,<Rd> cannot be assembled using encoding T1, it is assembled using encoding T2 as though ADD<c> <Rd>,<Rn> had been written. To prevent this happening, use the .W qualifier.
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, ‘0’);
    if d == 15 then
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
Exceptions
None.
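As a small worked illustration of the optionally-shifted second operand, the Python sketch below models only the result of ADD <Rd>,<Rn>,<Rm>,LSL #n, without flag updates and without the PC-write case. The function name is hypothetical and the model assumes 32-bit values held as non-negative integers.

```python
MASK32 = 0xFFFFFFFF

def add_register_lsl(rn_value, rm_value, shift_n=0):
    """Result of adding Rn to (Rm shifted left by shift_n), truncated to 32 bits."""
    shifted = (rm_value << shift_n) & MASK32
    return (rn_value + shifted) & MASK32
```

A typical use of this form is address arithmetic: with a base address in <Rn> and an element index in <Rm>, LSL #2 scales the index by 4 for word-sized elements.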
A8.6.7 ADD (register-shifted register)
Add (register-shifted register) adds a register value and a register-shifted register value. It writes the result
to the destination register, and can optionally update the condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADD{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
Bits: [31:28] cond | [27:21] 0000100 | [20] S | [19:16] Rn | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type | [4] 1 | [3:0] Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
ADD{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register.
<Rm>     The register that is shifted and used as the second operand.
<type>   The type of shift to apply to the value read from <Rm>. It must be one of:
         ASR   Arithmetic shift right, encoded as type = 0b10
         LSL   Logical shift left, encoded as type = 0b00
         LSR   Logical shift right, encoded as type = 0b01
         ROR   Rotate right, encoded as type = 0b11.
<Rs>     The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, ‘0’);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        APSR.V = overflow;
Exceptions
None.
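Because the shift amount is taken from the bottom byte of <Rs>, it can be anywhere from 0 to 255. The Python sketch below is an illustrative model of the shifted operand only (no carry output), assuming 32-bit values held as non-negative integers. The function name and the string shift-type labels are hypothetical stand-ins for the SRType values.

```python
MASK32 = 0xFFFFFFFF

def shift_by_register(value, shift_type, rs_value):
    """Shift value by the amount held in the bottom byte of rs_value."""
    shift_n = rs_value & 0xFF            # only R[s]<7:0> is used
    if shift_n == 0:
        return value                     # a shift of zero leaves the value unchanged
    if shift_type == "LSL":
        return (value << shift_n) & MASK32
    if shift_type == "LSR":
        return value >> shift_n
    if shift_type == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> shift_n) & MASK32
    if shift_type == "ROR":
        n = shift_n % 32
        return ((value >> n) | (value << (32 - n))) & MASK32
    raise ValueError("unknown shift type: " + shift_type)
```

In this model, logical shifts of 32 or more produce zero, and an arithmetic right shift of 32 or more produces all zeros or all ones depending on the sign bit.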
A8.6.8 ADD (SP plus immediate)
This instruction adds an immediate value to the SP value, and writes the result to the destination register.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADD<c> <Rd>,SP,#<imm>
Bits: [15:11] 10101 | [10:8] Rd | [7:0] imm8
d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm8:‘00’, 32);

Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADD<c> SP,SP,#<imm>
Bits: [15:7] 101100000 | [6:0] imm7
d = 13; setflags = FALSE; imm32 = ZeroExtend(imm7:‘00’, 32);

Encoding T3 ARMv6T2, ARMv7
ADD{S}<c>.W <Rd>,SP,#<const>
First halfword:  [15:11] 11110 | [10] i | [9:5] 01000 | [4] S | [3:0] 1101
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
if Rd == ‘1111’ && S == ‘1’ then SEE CMN (immediate);
d = UInt(Rd); setflags = (S == ‘1’); imm32 = ThumbExpandImm(i:imm3:imm8);
if d == 15 then UNPREDICTABLE;

Encoding T4 ARMv6T2, ARMv7
ADDW<c> <Rd>,SP,#<imm12>
First halfword:  [15:11] 11110 | [10] i | [9:4] 100000 | [3:0] 1101
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(i:imm3:imm8, 32);
if d == 15 then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADD{S}<c> <Rd>,SP,#<const>
Bits: [31:28] cond | [27:21] 0010100 | [20] S | [19:16] 1101 | [15:12] Rd | [11:0] imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); setflags = (S == ‘1’); imm32 = ARMExpandImm(imm12);
Assembler syntax
ADD{S}<c><q> {<Rd>,} SP, #<const>       All encodings permitted
ADDW<c><q> {<Rd>,} SP, #<const>         Only encoding T4 is permitted
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register. If omitted, <Rd> is SP.
<const>  The immediate value to be added to the value obtained from SP. Values are multiples of 4 in the range 0-1020 for encoding T1, multiples of 4 in the range 0-508 for encoding T2 and any value in the range 0-4095 for encoding T4. See Modified immediate constants in Thumb instructions on page A6-17 or Modified immediate constants in ARM instructions on page A5-9 for the range of values for encodings T3 and A1.
When both 32-bit encodings are available for an instruction, encoding T3 is preferred to encoding T4 (if encoding T4 is required, use the ADDW syntax).
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(SP, imm32, ‘0’);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
Exceptions
None.
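The immediate ranges given for <const> can be turned into a simple encodability check. The Python sketch below is illustrative only: it covers encodings T1, T2 and T4, and leaves out T3 because the modified immediate rules are described elsewhere. It assumes that T1 can address only R0-R7 (its Rd field is three bits wide), that T2 requires SP as the destination, and that T4 excludes the PC. The function name is hypothetical, and a real assembler applies further preference rules.

```python
def thumb_add_sp_imm_candidates(rd, const):
    """List the Thumb encodings (T1, T2, T4) whose immediate field can hold const."""
    candidates = []
    if rd <= 7 and const % 4 == 0 and 0 <= const <= 1020:
        candidates.append("T1")   # imm8:'00', 3-bit Rd
    if rd == 13 and const % 4 == 0 and 0 <= const <= 508:
        candidates.append("T2")   # imm7:'00', destination is SP
    if rd != 15 and 0 <= const <= 4095:
        candidates.append("T4")   # plain 12-bit immediate
    return candidates
```

For instance, adjusting SP by 8 fits both T2 and T4, whereas adjusting SP by 512 is outside the T2 range.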
A8.6.9 ADD (SP plus register)
This instruction adds an optionally-shifted register value to the SP value, and writes the result to the
destination register.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADD<c> <Rdm>, SP, <Rdm>
Bits: [15:8] 01000100 | [7] DM | [6:3] 1101 | [2:0] Rdm
d = UInt(DM:Rdm); m = UInt(DM:Rdm); setflags = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADD<c> SP,<Rm>
Bits: [15:7] 010001001 | [6:3] Rm | [2:0] 101
if Rm == ‘1101’ then SEE encoding T1;
d = 13; m = UInt(Rm); setflags = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);

Encoding T3 ARMv6T2, ARMv7
ADD{S}<c>.W <Rd>,SP,<Rm>{,<shift>}
First halfword:  [15:5] 11101011000 | [4] S | [3:0] 1101
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type | [3:0] Rm
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if d == 13 && (shift_t != SRType_LSL || shift_n > 3) then UNPREDICTABLE;
if d == 15 || BadReg(m) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADD{S}<c> <Rd>,SP,<Rm>{,<shift>}
Bits: [31:28] cond | [27:21] 0000100 | [20] S | [19:16] 1101 | [15:12] Rd | [11:7] imm5 | [6:5] type | [4] 0 | [3:0] Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
ADD{S}<c><q> {<Rd>,} SP, <Rm>{, <shift>}
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register. This register can be SP. If omitted, <Rd> is SP. This register can be the PC, but if it is, encoding T3 is not permitted. Using the PC is deprecated.
<Rm>     The register that is optionally shifted and used as the second operand. This register can be the PC, but if it is, encoding T3 is not permitted. Using the PC is deprecated. This register can be SP in both ARM and Thumb instructions, but:
         - the use of SP is deprecated
         - when assembling for the Thumb instruction set, only encoding T1 is available and so the instruction can only be ADD SP,SP,SP.
<shift>  The shift to apply to the value read from <Rm>. If omitted, no shift is applied and any encoding is permitted. If present, only encoding T3 or A1 is permitted. Shifts applied to a register on page A8-10 describes the shifts and how they are encoded.
         In the Thumb instruction set, if <Rd> is SP or omitted, <shift> is only permitted to be omitted, LSL #1, LSL #2, or LSL #3.
The pre-UAL syntax ADD<c>S is equivalent to ADDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(SP, shifted, ‘0’);
    if d == 15 then
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            APSR.V = overflow;
Exceptions
None.
A8.6.10 ADR
This instruction adds an immediate value to the PC value to form a PC-relative address, and writes the result
to the destination register.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ADR<c> <Rd>,<label>
Bits: [15:11] 10100 | [10:8] Rd | [7:0] imm8
d = UInt(Rd); imm32 = ZeroExtend(imm8:‘00’, 32); add = TRUE;

Encoding T2 ARMv6T2, ARMv7
ADR<c>.W <Rd>,<label>           <label> before current instruction
SUB <Rd>,PC,#0                  Special case for subtraction of zero
First halfword:  [15:11] 11110 | [10] i | [9:0] 1010101111
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
d = UInt(Rd); imm32 = ZeroExtend(i:imm3:imm8, 32); add = FALSE;
if BadReg(d) then UNPREDICTABLE;

Encoding T3 ARMv6T2, ARMv7
ADR<c>.W <Rd>,<label>           <label> after current instruction
First halfword:  [15:11] 11110 | [10] i | [9:0] 1000001111
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
d = UInt(Rd); imm32 = ZeroExtend(i:imm3:imm8, 32); add = TRUE;
if BadReg(d) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADR<c> <Rd>,<label>             <label> after current instruction
Bits: [31:28] cond | [27:16] 001010001111 | [15:12] Rd | [11:0] imm12
d = UInt(Rd); imm32 = ARMExpandImm(imm12); add = TRUE;

Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ADR<c> <Rd>,<label>             <label> before current instruction
SUB <Rd>,PC,#0                  Special case for subtraction of zero
Bits: [31:28] cond | [27:16] 001001001111 | [15:12] Rd | [11:0] imm12
d = UInt(Rd); imm32 = ARMExpandImm(imm12); add = FALSE;
Assembler syntax
ADR<c><q> <Rd>, <label>                 Normal syntax
ADD<c><q> <Rd>, PC, #<const>            Alternative for encodings T1, T3, A1
SUB<c><q> <Rd>, PC, #<const>            Alternative for encodings T2, A2
where:
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<label>  The label of an instruction or literal data item whose address is to be loaded into <Rd>. The assembler calculates the required value of the offset from the Align(PC,4) value of the ADR instruction to this label. Permitted values of the offset are:
         Encoding T1           multiples of 4 in the range 0 to 1020
         Encodings T2 and T3   any value in the range -4095 to 4095
         Encodings A1 and A2   plus or minus any of the constants described in Modified immediate constants in ARM instructions on page A5-9.
         If the offset is zero or positive, encodings T1, T3, and A1 are permitted with imm32 equal to the offset.
         If the offset is negative, encodings T2 and A2 are permitted with imm32 equal to minus the offset.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    if d == 15 then // Can only occur for ARM encodings
        ALUWritePC(result);
    else
        R[d] = result;
Exceptions
None.
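The address calculation in the Operation pseudocode can be sketched in Python as follows. This is an illustrative model only. It assumes the usual rule that the PC value read by an instruction is the instruction address plus 4 in Thumb state and plus 8 in ARM state, which is described elsewhere in this manual. The function and parameter names are hypothetical.

```python
def adr_result(instr_address, imm32, add, thumb):
    """Model of the value ADR writes to Rd."""
    pc = instr_address + (4 if thumb else 8)   # PC value as read by the instruction
    base = pc & ~3                             # Align(PC, 4)
    result = base + imm32 if add else base - imm32
    return result & 0xFFFFFFFF
```

Note that for a Thumb instruction at a halfword-aligned but not word-aligned address, the alignment step means the base is not simply the instruction address plus 4.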
A8.6.11 AND (immediate)
This instruction performs a bitwise AND of a register value and an immediate value, and writes the result
to the destination register.
Encoding T1 ARMv6T2, ARMv7
AND{S}<c> <Rd>,<Rn>,#<const>
First halfword:  [15:11] 11110 | [10] i | [9:5] 00000 | [4] S | [3:0] Rn
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
if Rd == ‘1111’ && S == ‘1’ then SEE TST (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) || BadReg(n) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
AND{S}<c> <Rd>,<Rn>,#<const>
Bits: [31:28] cond | [27:21] 0010000 | [20] S | [19:16] Rn | [15:12] Rd | [11:0] imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
Assembler syntax
AND{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register.
<const>  The immediate value to be ANDed with the value obtained from <Rn>. See Modified immediate constants in Thumb instructions on page A6-17 or Modified immediate constants in ARM instructions on page A5-9 for the range of values.
The pre-UAL syntax AND<c>S is equivalent to ANDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] AND imm32;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
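Unlike ADD, the flag-setting form of AND takes its carry from the immediate expansion and leaves V unchanged. The Python sketch below illustrates this for encoding A1. It assumes ARMExpandImm_C rotates the low 8 bits of imm12 right by twice the top 4 bits, returning the incoming carry when the rotation is zero and bit 31 of the rotated value otherwise. The function names and the dictionary representation of the flags are hypothetical.

```python
def arm_expand_imm_c(imm12, carry_in):
    """Return (imm32, carry) for a 12-bit ARM modified immediate."""
    unrotated = imm12 & 0xFF
    amount = 2 * ((imm12 >> 8) & 0xF)
    if amount == 0:
        return unrotated, carry_in       # no rotation: carry passes through
    rotated = ((unrotated >> amount) | (unrotated << (32 - amount))) & 0xFFFFFFFF
    return rotated, rotated >> 31        # carry is bit 31 of the rotated value

def ands_immediate(rn_value, imm12, flags):
    """Model of ANDS with an ARM immediate; returns (result, new_flags)."""
    imm32, carry = arm_expand_imm_c(imm12, flags["C"])
    result = rn_value & imm32
    # N, Z and C are updated; V keeps its previous value.
    new_flags = dict(flags, N=result >> 31, Z=int(result == 0), C=carry)
    return result, new_flags
```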
A8.6.12 AND (register)
This instruction performs a bitwise AND of a register value and an optionally-shifted register value, and
writes the result to the destination register. It can optionally update the condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ANDS <Rdn>,<Rm>                 Outside IT block.
AND<c> <Rdn>,<Rm>               Inside IT block.
Bits: [15:6] 0100000000 | [5:3] Rm | [2:0] Rdn
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2 ARMv6T2, ARMv7
AND{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
First halfword:  [15:5] 11101010000 | [4] S | [3:0] Rn
Second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type | [3:0] Rm
if Rd == ‘1111’ && S == ‘1’ then SEE TST (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
AND{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
Bits: [31:28] cond | [27:21] 0000000 | [20] S | [19:16] Rn | [15:12] Rd | [11:7] imm5 | [6:5] type | [4] 0 | [3:0] Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
AND{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register.
<Rm>     The register that is optionally shifted and used as the second operand.
<shift>  The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If absent, no shift is applied and all encodings are permitted. Shifts applied to a register on page A8-10 describes the shifts and how they are encoded.
In Thumb assembly:
- outside an IT block, if ANDS <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled using encoding T1 as though ANDS <Rd>,<Rn> had been written
- inside an IT block, if AND<c> <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is assembled using encoding T1 as though AND<c> <Rd>,<Rn> had been written.
To prevent either of these happening, use the .W qualifier.
The pre-UAL syntax AND<c>S is equivalent to ANDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] AND shifted;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
A8.6.13 AND (register-shifted register)
This instruction performs a bitwise AND of a register value and a register-shifted register value. It writes
the result to the destination register, and can optionally update the condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
AND{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
Bits: [31:28] cond | [27:21] 0000000 | [20] S | [19:16] Rn | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type | [4] 1 | [3:0] Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
AND{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register.
<Rm>     The register that is shifted and used as the second operand.
<type>   The type of shift to apply to the value read from <Rm>. It must be one of:
         ASR   Arithmetic shift right, encoded as type = 0b10
         LSL   Logical shift left, encoded as type = 0b00
         LSR   Logical shift right, encoded as type = 0b01
         ROR   Rotate right, encoded as type = 0b11.
<Rs>     The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax AND<c>S is equivalent to ANDS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] AND shifted;
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
A8.6.14 ASR (immediate)
Arithmetic Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in
copies of its sign bit, and writes the result to the destination register. It can optionally update the condition
flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ASRS <Rd>,<Rm>,#<imm>           Outside IT block.
ASR<c> <Rd>,<Rm>,#<imm>         Inside IT block.
Bits: [15:11] 00010 | [10:6] imm5 | [5:3] Rm | [2:0] Rd
d = UInt(Rd); m = UInt(Rm); setflags = !InITBlock();
(-, shift_n) = DecodeImmShift(‘10’, imm5);

Encoding T2 ARMv6T2, ARMv7
ASR{S}<c>.W <Rd>,<Rm>,#<imm>
First halfword:  [15:5] 11101010010 | [4] S | [3:0] 1111
Second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] 10 | [3:0] Rm
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘10’, imm3:imm2);
if BadReg(d) || BadReg(m) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ASR{S}<c> <Rd>,<Rm>,#<imm>
Bits: [31:28] cond | [27:21] 0001101 | [20] S | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:7] imm5 | [6:4] 100 | [3:0] Rm
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘10’, imm5);
Assembler syntax
ASR{S}<c><q> {<Rd>,} <Rm>, #<imm>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rm>     The first operand register.
<imm>    The shift amount, in the range 1 to 32. See Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry) = Shift_C(R[m], SRType_ASR, shift_n, APSR.C);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
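The shift amount range of 1 to 32 follows from how the 5-bit immediate field is decoded for ASR. The Python sketch below is an illustrative model, assuming that a field value of 0 denotes a shift of 32 and that the carry output is the last bit shifted out. The function names are hypothetical stand-ins for the DecodeImmShift and Shift_C pseudocode functions.

```python
def decode_asr_amount(imm5):
    """Map the 5-bit immediate field to a shift amount in the range 1 to 32."""
    return 32 if imm5 == 0 else imm5

def asr_c(value, shift_n):
    """Arithmetic shift right of a 32-bit value; returns (result, carry)."""
    assert 1 <= shift_n <= 32
    signed = value - (1 << 32) if value & 0x80000000 else value
    result = (signed >> shift_n) & 0xFFFFFFFF
    carry = (signed >> (shift_n - 1)) & 1   # last bit shifted out
    return result, carry
```

With a shift of 32, every result bit is a copy of the original sign bit, and so is the carry.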
A8.6.15 ASR (register)
Arithmetic Shift Right (register) shifts a register value right by a variable number of bits, shifting in copies
of its sign bit, and writes the result to the destination register. The variable number of bits is read from the
bottom byte of a register. It can optionally update the condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ASRS <Rdn>,<Rm>                 Outside IT block.
ASR<c> <Rdn>,<Rm>               Inside IT block.
Bits: [15:6] 0100000100 | [5:3] Rm | [2:0] Rdn
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();

Encoding T2 ARMv6T2, ARMv7
ASR{S}<c>.W <Rd>,<Rn>,<Rm>
First halfword:  [15:5] 11111010010 | [4] S | [3:0] Rn
Second halfword: [15:12] 1111 | [11:8] Rd | [7:4] 0000 | [3:0] Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ASR{S}<c> <Rd>,<Rn>,<Rm>
Bits: [31:28] cond | [27:21] 0001101 | [20] S | [19:16] (0)(0)(0)(0) | [15:12] Rd | [11:8] Rm | [7:4] 0101 | [3:0] Rn
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
ASR{S}<c><q> {<Rd>,} <Rn>, <Rm>
where:
S        If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>   See Standard assembler syntax fields on page A8-7.
<Rd>     The destination register.
<Rn>     The first operand register.
<Rm>     The register whose bottom byte contains the amount to shift by.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[m]<7:0>);
    (result, carry) = Shift_C(R[n], SRType_ASR, shift_n, APSR.C);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
A8.6.16 B
Branch causes a branch to a target address.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
B<c> <label>                    Not permitted in IT block.
Bits: [15:12] 1101 | [11:8] cond | [7:0] imm8
if cond == ‘1110’ then UNDEFINED;
if cond == ‘1111’ then SEE SVC;
imm32 = SignExtend(imm8:‘0’, 32);
if InITBlock() then UNPREDICTABLE;

Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
B<c> <label>                    Outside or last in IT block
Bits: [15:11] 11100 | [10:0] imm11
imm32 = SignExtend(imm11:‘0’, 32);
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;

Encoding T3 ARMv6T2, ARMv7
B<c>.W <label>                  Not permitted in IT block.
First halfword:  [15:11] 11110 | [10] S | [9:6] cond | [5:0] imm6
Second halfword: [15:14] 10 | [13] J1 | [12] 0 | [11] J2 | [10:0] imm11
if cond<3:1> == ‘111’ then SEE “Related encodings”;
imm32 = SignExtend(S:J2:J1:imm6:imm11:‘0’, 32);
if InITBlock() then UNPREDICTABLE;

Encoding T4 ARMv6T2, ARMv7
B<c>.W <label>                  Outside or last in IT block
First halfword:  [15:11] 11110 | [10] S | [9:0] imm10
Second halfword: [15:14] 10 | [13] J1 | [12] 1 | [11] J2 | [10:0] imm11
I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10:imm11:‘0’, 32);
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;

Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
B<c> <label>
Bits: [31:28] cond | [27:24] 1010 | [23:0] imm24
imm32 = SignExtend(imm24:‘00’, 32);
Assembler syntax
B<c><q> <label>
where:
<c><q>   See Standard assembler syntax fields on page A8-7.
         Note
         Encodings T1 and T3 are conditional in their own right, and do not require an IT instruction to make them conditional.
         For encodings T1 and T3, <c> must not be AL or omitted. The 4-bit encoding of the condition is placed in the instruction and not in a preceding IT instruction, and the instruction must not be in an IT block. As a result, encodings T1 and T2 are never both available to the assembler, nor are encodings T3 and T4.
<label>  The label of the instruction that is to be branched to. The assembler calculates the required value of the offset from the PC value of the B instruction to this label, then selects an encoding that sets imm32 to that offset.
         Permitted offsets are:
         Encoding T1   Even numbers in the range –256 to 254
         Encoding T2   Even numbers in the range –2048 to 2046
         Encoding T3   Even numbers in the range –1048576 to 1048574
         Encoding T4   Even numbers in the range –16777216 to 16777214
         Encoding A1   Multiples of 4 in the range –33554432 to 33554428.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    BranchWritePC(PC + imm32);
Exceptions
None.
Related encodings   See Branches and miscellaneous control on page A6-20
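The encoding T4 offset is assembled from five fields, with J1 and J2 combined with the sign bit S before sign extension. The Python sketch below is an illustrative model of that decode step, following the I1 and I2 expressions in the T4 decode pseudocode. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_t4_offset(s, j1, j2, imm10, imm11):
    """Compute imm32 for encoding T4 as a signed byte offset."""
    i1 = 1 - (j1 ^ s)    # I1 = NOT(J1 EOR S)
    i2 = 1 - (j2 ^ s)    # I2 = NOT(J2 EOR S)
    # S:I1:I2:imm10:imm11:'0' is a 25-bit value.
    raw = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    return sign_extend(raw, 25)
```

The extreme field values reproduce the permitted T4 range of –16777216 to 16777214 listed above.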
A8.6.17 BFC
Bit Field Clear clears any number of adjacent bits at any position in a register, without affecting the other
bits in the register.
Encoding T1 ARMv6T2, ARMv7
BFC<c> <Rd>,#<lsb>,#<width>
First halfword:  [15:11] 11110 | [10] (0) | [9:4] 110110 | [3:0] 1111
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5] (0) | [4:0] msb
d = UInt(Rd); msbit = UInt(msb); lsbit = UInt(imm3:imm2);
if BadReg(d) then UNPREDICTABLE;

Encoding A1 ARMv6T2, ARMv7
BFC<c> <Rd>,#<lsb>,#<width>
Bits: [31:28] cond | [27:21] 0111110 | [20:16] msb | [15:12] Rd | [11:7] lsb | [6:4] 001 | [3:0] 1111
d = UInt(Rd); msbit = UInt(msb); lsbit = UInt(lsb);
if d == 15 then UNPREDICTABLE;
Assembler syntax
BFC<c><q> <Rd>, #<lsb>, #<width>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<lsb>
The least significant bit that is to be cleared, in the range 0 to 31. This determines the
required value of lsbit.
<width>
The number of bits to be cleared, in the range 1 to 32-<lsb>. The required value of msbit is
<lsb>+<width>-1.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if msbit >= lsbit then
        R[d]<msbit:lsbit> = Replicate(‘0’, msbit-lsbit+1);
        // Other bits of R[d] are unchanged
    else
        UNPREDICTABLE;
Exceptions
None.
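As an informal model of the BFC operation above, the following Python sketch clears <width> bits starting at <lsb> in a 32-bit value. The function name bfc and its argument checking are assumptions made for illustration; the architecture itself defines only the pseudocode above.

```python
def bfc(rd_value, lsb, width):
    """Model of BFC: clear `width` bits of a 32-bit value starting at bit `lsb`.

    Illustrative helper, not an architectural function.
    """
    if not (0 <= lsb <= 31 and 1 <= width <= 32 - lsb):
        raise ValueError("lsb must be 0 to 31 and width 1 to 32-lsb")
    # msbit = lsb + width - 1, so the mask covers bits <msbit:lsbit>.
    mask = ((1 << width) - 1) << lsb
    return rd_value & ~mask & 0xFFFFFFFF
```

For example, bfc(0xFFFFFFFF, 8, 8) clears bits<15:8> and leaves every other bit unchanged.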
A8.6.18 BFI
Bit Field Insert copies any number of low order bits from a register into the same number of adjacent bits at
any position in the destination register.
Encoding T1 ARMv6T2, ARMv7
BFI<c> <Rd>,<Rn>,#<lsb>,#<width>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 (0) 1 1 0 1 1 0 Rn 0 imm3 Rd imm2 (0) msb
if Rn == ‘1111’ then SEE BFC;
d = UInt(Rd); n = UInt(Rn); msbit = UInt(msb); lsbit = UInt(imm3:imm2);
if BadReg(d) || n == 13 then UNPREDICTABLE;
Encoding A1 ARMv6T2, ARMv7
BFI<c> <Rd>,<Rn>,#<lsb>,#<width>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 1 1 1 1 0 msb Rd lsb 0 0 1 Rn
if Rn == ‘1111’ then SEE BFC;
d = UInt(Rd); n = UInt(Rn); msbit = UInt(msb); lsbit = UInt(lsb);
if d == 15 then UNPREDICTABLE;
Assembler syntax
BFI<c><q> <Rd>, <Rn>, #<lsb>, #<width>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The source register.
<lsb>
The least significant destination bit, in the range 0 to 31. This determines the required value
of lsbit.
<width>
The number of bits to be copied, in the range 1 to 32-<lsb>. The required value of msbit is
<lsb>+<width>-1.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if msbit >= lsbit then
        R[d]<msbit:lsbit> = R[n]<(msbit-lsbit):0>;
        // Other bits of R[d] are unchanged
    else
        UNPREDICTABLE;
Exceptions
None.
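The BFI operation above can be modelled informally as a mask-and-merge on 32-bit values. In the following Python sketch the function name bfi is a hypothetical helper, not an architectural function, and the arguments are assumed to satisfy the <lsb> and <width> ranges given in the assembler syntax.

```python
def bfi(rd_value, rn_value, lsb, width):
    """Model of BFI: copy the low `width` bits of rn_value into rd_value at `lsb`.

    Illustrative helper; assumes 0 <= lsb <= 31 and 1 <= width <= 32 - lsb.
    """
    mask = ((1 << width) - 1) << lsb          # bits <msbit:lsbit> of R[d]
    field = (rn_value << lsb) & mask          # R[n]<(msbit-lsbit):0>, moved into place
    return ((rd_value & ~mask) | field) & 0xFFFFFFFF
```

For example, bfi(0, 0xABCD, 4, 8) inserts the low byte 0xCD of the source at bits<11:4> of the destination.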
A8.6.19 BIC (immediate)
Bitwise Bit Clear (immediate) performs a bitwise AND of a register value and the complement of an
immediate value, and writes the result to the destination register. It can optionally update the condition flags
based on the result.
Encoding T1 ARMv6T2, ARMv7
BIC{S}<c> <Rd>,<Rn>,#<const>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 i 0 0 0 0 1 S Rn 0 imm3 Rd imm8
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) || BadReg(n) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
BIC{S}<c> <Rd>,<Rn>,#<const>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 1 1 1 1 0 S Rn Rd imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
Assembler syntax
BIC{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The register that contains the operand.
<const>
The immediate value to be bitwise inverted and ANDed with the value obtained from <Rn>.
See Modified immediate constants in Thumb instructions on page A6-17 or Modified
immediate constants in ARM instructions on page A5-9 for the range of values.
The pre-UAL syntax BIC<c>S is equivalent to BICS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] AND NOT(imm32);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
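The data path of the BIC (immediate) operation above, ignoring the d == 15 case, can be sketched in Python as follows. The function name bic_immediate is hypothetical, and imm32 and carry are assumed to be the already-expanded outputs of ThumbExpandImm_C() or ARMExpandImm_C(), which are not modelled here.

```python
def bic_immediate(rn_value, imm32, carry):
    """Model of BIC{S} (immediate): return (result, flags) for 32-bit operands.

    `imm32` and `carry` are assumed to come from the modified-immediate
    expansion; that expansion is outside the scope of this sketch.
    """
    result = rn_value & ~imm32 & 0xFFFFFFFF   # R[n] AND NOT(imm32)
    flags = {
        "N": result >> 31,                    # result<31>
        "Z": int(result == 0),                # IsZeroBit(result)
        "C": carry,                           # carry from the immediate expansion
    }                                         # APSR.V is unchanged
    return result, flags
```

The flags dictionary is only meaningful when the S suffix is present; without it, a caller would discard the flags and keep only the result.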
A8.6.20 BIC (register)
Bitwise Bit Clear (register) performs a bitwise AND of a register value and the complement of an
optionally-shifted register value, and writes the result to the destination register. It can optionally update the
condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
BICS <Rdn>,<Rm>    Outside IT block.
BIC<c> <Rdn>,<Rm>    Inside IT block.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 0 1 1 1 0 Rm Rdn
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
BIC{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 0 1 0 0 0 1 S Rn (0) imm3 Rd imm2 type Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
BIC{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 1 1 0 S Rn Rd imm5 type 0 Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
BIC{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is optionally shifted and used as the second operand.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
The pre-UAL syntax BIC<c>S is equivalent to BICS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] AND NOT(shifted);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
A8.6.21 BIC (register-shifted register)
Bitwise Bit Clear (register-shifted register) performs a bitwise AND of a register value and the complement
of a register-shifted register value. It writes the result to the destination register, and can optionally update
the condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
BIC{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 1 1 0 S Rn Rd Rs 0 type 1 Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
BIC{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is shifted and used as the second operand.
<type>
The type of shift to apply to the value read from <Rm>. It must be one of:
ASR    Arithmetic shift right, encoded as type = 0b10
LSL    Logical shift left, encoded as type = 0b00
LSR    Logical shift right, encoded as type = 0b01
ROR    Rotate right, encoded as type = 0b11.
<Rs>
The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax BIC<c>S is equivalent to BICS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] AND NOT(shifted);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
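To illustrate how the bottom byte of <Rs> selects the shift amount, the following Python sketch models the operation above on 32-bit integers. The helper shift_c is an assumption: it is written to follow the behavior of the Shift_C() pseudocode function defined elsewhere in this manual (a shift amount of 0 returns the value and carry unchanged), and the names shift_c and bic_rsr are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shift_c(value, shift_type, amount, carry_in):
    """Shift a 32-bit value; return (result, carry_out).

    Assumed to mirror Shift_C() for the LSL, LSR, ASR and ROR shift types.
    """
    value &= MASK32
    if amount == 0:
        return value, carry_in
    if shift_type == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_type == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if shift_type == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_type == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    raise ValueError("unknown shift type")


def bic_rsr(rn_value, rm_value, shift_type, rs_value, carry_in):
    """Model of BIC{S} (register-shifted register): return (result, flags)."""
    shift_n = rs_value & 0xFF                 # UInt(R[s]<7:0>)
    shifted, carry = shift_c(rm_value, shift_type, shift_n, carry_in)
    result = rn_value & ~shifted & MASK32     # R[n] AND NOT(shifted)
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}
```

Only bits<7:0> of the <Rs> value matter: a register holding 0x104 requests a shift by 4, not by 260.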
A8.6.22 BKPT
Breakpoint causes a software breakpoint to occur.
Breakpoint is always unconditional, even when inside an IT block.
Encoding T1 ARMv5T*, ARMv6*, ARMv7
BKPT #<imm8>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 0 1 1 1 1 1 0 imm8
imm32 = ZeroExtend(imm8, 32);
// imm32 is for assembly/disassembly only and is ignored by hardware.
Encoding A1 ARMv5T*, ARMv6*, ARMv7
BKPT #<imm16>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 0 1 0 imm12 0 1 1 1 imm4
imm32 = ZeroExtend(imm12:imm4, 32);
// imm32 is for assembly/disassembly only and is ignored by hardware.
if cond != ‘1110’ then UNPREDICTABLE; // BKPT must be encoded with AL condition
Assembler syntax
BKPT<q> #<imm>
where:
<q>
See Standard assembler syntax fields on page A8-7. A BKPT instruction must be
unconditional.
<imm>
Specifies a value that is stored in the instruction, in the range 0-255 for a Thumb instruction
or 0-65535 for an ARM instruction. This value is ignored by the processor, but can be used
by a debugger to store more information about the breakpoint.
Operation
EncodingSpecificOperations();
BKPTInstrDebugEvent();
Exceptions
Prefetch Abort.
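Because a debugger typically reads the <imm> value back out of the instruction, it can help to see how the immediate is split across the encoding fields. The following Python sketch is illustrative only; the function names are hypothetical, and the constants are derived from the encoding diagrams for encodings A1 (with cond = 0b1110) and T1.

```python
def encode_bkpt_arm(imm16):
    """Build the ARM (A1) BKPT instruction word with the AL condition."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0-65535")
    imm12 = imm16 >> 4        # upper 12 bits go to instruction bits<19:8>
    imm4 = imm16 & 0xF        # lower 4 bits go to instruction bits<3:0>
    return 0xE1200070 | (imm12 << 8) | imm4


def decode_bkpt_arm(word):
    """Recover imm12:imm4 from an ARM BKPT instruction word."""
    return (((word >> 8) & 0xFFF) << 4) | (word & 0xF)


def encode_bkpt_thumb(imm8):
    """Build the Thumb (T1) BKPT halfword."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must be in the range 0-255")
    return 0xBE00 | imm8
```

Under these assumptions, BKPT #0 assembles to 0xE1200070 in the ARM instruction set and to 0xBE00 in the Thumb instruction set.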
A8.6.23 BL, BLX (immediate)
Branch with Link calls a subroutine at a PC-relative address.
Branch with Link and Exchange Instruction Sets (immediate) calls a subroutine at a PC-relative address,
and changes instruction set from ARM to Thumb, or from Thumb to ARM.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7 if J1 == J2 == 1
ARMv6T2, ARMv7 otherwise
BL<c> <label>    Outside or last in IT block
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 S imm10 1 1 J1 1 J2 imm11
I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10:imm11:‘0’, 32);
toARM = FALSE;
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding T2 ARMv5T*, ARMv6*, ARMv7 if J1 == J2 == 1
ARMv6T2, ARMv7 otherwise
BLX<c> <label>    Outside or last in IT block
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 S imm10H 1 1 J1 0 J2 imm10L 0
if CurrentInstrSet() == InstrSet_ThumbEE then UNDEFINED;
I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10H:imm10L:‘00’, 32);
toARM = TRUE;
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
BL<c> <label>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 0 1 1 imm24
imm32 = SignExtend(imm24:‘00’, 32); toARM = TRUE;
Encoding A2 ARMv5T*, ARMv6*, ARMv7
BLX <label>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 1 H imm24
imm32 = SignExtend(imm24:H:‘0’, 32); toARM = FALSE;
Assembler syntax
BL{X}<c><q> <label>
where:
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM BLX (immediate) instruction
must be unconditional.
X
If present, specifies a change of instruction set (from ARM to Thumb or from Thumb to
ARM). If X is omitted, the processor remains in the same state. For ThumbEE code,
specifying X is not permitted.
<label>
The label of the instruction that is to be branched to.
For BL (encodings T1, A1), the assembler calculates the required value of the offset from the
PC value of the BL instruction to this label, then selects an encoding that sets imm32 to that
offset. Permitted offsets are even numbers in the range –16777216 to 16777214 (Thumb) or
multiples of 4 in the range –33554432 to 33554428 (ARM).
For BLX (encodings T2, A2), the assembler calculates the required value of the offset from
the Align(PC,4) value of the BLX instruction to this label, then selects an encoding that sets
imm32 to that offset. Permitted offsets are multiples of 4 in the range –16777216 to 16777212
(Thumb) or even numbers in the range –33554432 to 33554430 (ARM).
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if CurrentInstrSet() == InstrSet_ARM then
        next_instr_addr = PC - 4;
        LR = next_instr_addr;
    else
        next_instr_addr = PC;
        LR = next_instr_addr<31:1> : ‘1’;
    if toARM then
        SelectInstrSet(InstrSet_ARM);
        BranchWritePC(Align(PC,4) + imm32);
    else
        SelectInstrSet(InstrSet_Thumb);
        BranchWritePC(PC + imm32);
Exceptions
None.
Branch range before ARMv6T2
Before ARMv6T2, J1 and J2 in encodings T1 and T2 were both 1, resulting in a smaller branch range. The
instructions could be executed as two separate 16-bit instructions, as described in BL and BLX (immediate)
instructions, before ARMv6T2 on page AppxG-4.
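The link-register and target-address arithmetic in the Operation pseudocode for BL and BLX (immediate) can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name bl_blx_immediate is hypothetical, instruction sets are represented as the strings "ARM" and "Thumb", and the PC is assumed to read as the address of the instruction plus 8 in ARM state and plus 4 in Thumb state.

```python
MASK32 = 0xFFFFFFFF


def bl_blx_immediate(instr_addr, imm32, current_set, to_arm):
    """Return (LR, branch target, target instruction set).

    `imm32` is the already sign-extended offset; `to_arm` corresponds to
    the toARM value produced by the encoding-specific decode.
    """
    if current_set == "ARM":
        pc = instr_addr + 8              # assumed ARM-state PC read value
        lr = pc - 4                      # address of the next ARM instruction
    else:
        pc = instr_addr + 4              # assumed Thumb-state PC read value
        lr = pc | 1                      # next_instr_addr<31:1> : '1'
    if to_arm:
        target = (pc & ~3) + imm32       # Align(PC,4) + imm32
        target_set = "ARM"
    else:
        target = pc + imm32
        target_set = "Thumb"
    return lr & MASK32, target & MASK32, target_set
```

For a Thumb BLX at address 0x8002 with imm32 = 0x10, this model gives LR = 0x8007 and a target of 0x8014, showing why the offset is measured from Align(PC,4) rather than from PC.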
A8.6.24 BLX (register)
Branch with Link and Exchange (register) calls a subroutine at an address and instruction set specified by a
register.
Encoding T1 ARMv5T*, ARMv6*, ARMv7
BLX<c> <Rm>    Outside or last in IT block
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 1 1 1 1 Rm (0) (0) (0)
m = UInt(Rm);
if m == 15 then UNPREDICTABLE;
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv5T*, ARMv6*, ARMv7
BLX<c> <Rm>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 0 1 0 (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) 0 0 1 1 Rm
m = UInt(Rm);
if m == 15 then UNPREDICTABLE;
Assembler syntax
BLX<c><q> <Rm>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rm>
The register that contains the branch target address and instruction set selection bit.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if CurrentInstrSet() == InstrSet_ARM then
        next_instr_addr = PC - 4;
        LR = next_instr_addr;
    else
        next_instr_addr = PC - 2;
        LR = next_instr_addr<31:1> : ‘1’;
    BXWritePC(R[m]);
Exceptions
None.
A8.6.25 BX
Branch and Exchange causes a branch to an address and instruction set specified by a register.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
BX<c> <Rm>    Outside or last in IT block
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 1 1 1 0 Rm (0) (0) (0)
m = UInt(Rm);
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
BX<c> <Rm>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 0 1 0 (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) 0 0 0 1 Rm
m = UInt(Rm);
Assembler syntax
BX<c><q> <Rm>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rm>
The register that contains the branch target address and instruction set selection bit. The PC
can be used.
Note
If <Rm> is the PC in a Thumb instruction at a non word-aligned address, it results in
UNPREDICTABLE behavior because the address passed to the BXWritePC() pseudocode
function has bits<1:0> = '10'.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    BXWritePC(R[m]);
Exceptions
None.
A8.6.26 BXJ
Branch and Exchange Jazelle attempts to change to Jazelle state. If the attempt fails, it branches to an
address and instruction set specified by a register as though it were a BX instruction.
Encoding T1 ARMv6T2, ARMv7
BXJ<c> <Rm>    Outside or last in IT block
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 0 1 1 1 1 0 0 Rm 1 0 (0) 0 (1) (1) (1) (1) (0) (0) (0) (0) (0) (0) (0) (0)
m = UInt(Rm);
if BadReg(m) then UNPREDICTABLE;
if InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv5TEJ, ARMv6*, ARMv7
BXJ<c> <Rm>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 0 1 0 (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) 0 0 1 0 Rm
m = UInt(Rm);
if m == 15 then UNPREDICTABLE;
Assembler syntax
BXJ<c><q> <Rm>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rm>
The register that specifies the branch target address and instruction set selection bit to be
used if the attempt to switch to Jazelle state fails.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if JMCR.JE == ‘0’ || CurrentInstrSet() == InstrSet_ThumbEE then
        BXWritePC(R[m]);
    else
        if JazelleAcceptsExecution() then
            SwitchToJazelleExecution();
        else
            SUBARCHITECTURE_DEFINED handler call;
Exceptions
None.
A8.6.27 CBNZ, CBZ
Compare and Branch on Nonzero and Compare and Branch on Zero compare the value in a register with
zero, and conditionally branch forward a constant value. They do not affect the condition flags.
Encoding T1 ARMv6T2, ARMv7
CB{N}Z <Rn>,<label>    Not permitted in IT block.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 0 1 1 op 0 i 1 imm5 Rn
n = UInt(Rn); imm32 = ZeroExtend(i:imm5:‘0’, 32); nonzero = (op == ‘1’);
if InITBlock() then UNPREDICTABLE;
Assembler syntax
CB{N}Z<q> <Rn>, <label>
where:
N
If specified, causes the branch to occur when the contents of <Rn> are nonzero (encoded as
op = 1). If omitted, causes the branch to occur when the contents of <Rn> are zero (encoded
as op = 0).
<q>
See Standard assembler syntax fields on page A8-7. A CBZ or CBNZ instruction must be
unconditional.
<Rn>
The operand register.
<label>
The label of the instruction that is to be branched to. The assembler calculates the required
value of the offset from the PC value of the CB{N}Z instruction to this label, then selects an
encoding that sets imm32 to that offset. Permitted offsets are even numbers in the range 0 to
126.
Operation
EncodingSpecificOperations();
if nonzero ^ IsZero(R[n]) then
    BranchWritePC(PC + imm32);
Exceptions
None.
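The decode and branch decision for CBZ and CBNZ can be sketched in Python as follows. The function names are hypothetical, the field positions are taken from the encoding T1 diagram, and the PC is assumed to read as the address of the instruction plus 4 in Thumb state. The sketch returns None when the branch is not taken.

```python
def decode_cbz(halfword):
    """Decode a CB{N}Z halfword into (n, imm32, nonzero)."""
    nonzero = bool((halfword >> 11) & 1)     # op bit: 1 selects CBNZ
    i = (halfword >> 9) & 1
    imm5 = (halfword >> 3) & 0x1F
    n = halfword & 0x7
    imm32 = (i << 6) | (imm5 << 1)           # ZeroExtend(i:imm5:'0', 32)
    return n, imm32, nonzero


def cbz_branch_target(instr_addr, rn_value, imm32, nonzero):
    """Return the branch target, or None if the branch is not taken."""
    pc = instr_addr + 4                      # assumed Thumb-state PC read value
    is_zero = (rn_value & 0xFFFFFFFF) == 0
    if nonzero != is_zero:                   # nonzero ^ IsZero(R[n])
        return pc + imm32
    return None
```

Because imm32 is zero-extended rather than sign-extended, the target is never below the PC value, which is why these instructions can only branch forward.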
A8.6.28 CDP, CDP2
Coprocessor Data Processing tells a coprocessor to perform an operation that is independent of ARM core
registers and memory. If no coprocessor can execute the instruction, an Undefined Instruction exception is
generated.
This is a generic coprocessor instruction. Some of the fields have no functionality defined by the architecture
and are free for use by the coprocessor instruction set designer. These fields are the opc1, opc2, CRd, CRn,
and CRm fields.
For more information about the coprocessors see Coprocessor support on page A2-68.
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv4*, ARMv5T*, ARMv6*, ARMv7 for encoding A1
CDP<c> <coproc>,<opc1>,<CRd>,<CRn>,<CRm>,<opc2>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 1 1 0 opc1 CRn CRd coproc opc2 0 CRm
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 1 1 0 opc1 CRn CRd coproc opc2 0 CRm
if coproc == ‘101x’ then SEE “VFP instructions”;
cp = UInt(coproc);
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv5T*, ARMv6*, ARMv7 for encoding A2
CDP2<c> <coproc>,<opc1>,<CRd>,<CRn>,<CRm>,<opc2>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 1 1 0 opc1 CRn CRd coproc opc2 0 CRm
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 1 1 0 opc1 CRn CRd coproc opc2 0 CRm
cp = UInt(coproc);
VFP instructions See VFP data-processing instructions on page A7-24
Assembler syntax
CDP{2}<c><q> <coproc>, #<opc1>, <CRd>, <CRn>, <CRm> {,#<opc2>}
where:
2
If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM CDP2 instruction must be
unconditional.
<coproc>
The name of the coprocessor, and causes the corresponding coprocessor number to be
placed in the cp_num field of the instruction. The standard generic coprocessor names are
p0, p1, …, p15.
<opc1>
Is a coprocessor-specific opcode, in the range 0 to 15.
<CRd>
The destination coprocessor register for the instruction.
<CRn>
The coprocessor register that contains the first operand.
<CRm>
The coprocessor register that contains the second operand.
<opc2>
Is a coprocessor-specific opcode in the range 0 to 7. If it is omitted, <opc2> is 0.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        Coproc_InternalOperation(cp, ThisInstr());
Exceptions
Undefined Instruction.
A8.6.29 CHKA
CHKA is a ThumbEE instruction. For details see CHKA on page A9-15.
A8.6.30 CLREX
Clear-Exclusive clears the local record of the executing processor that an address has had a request for an
exclusive access.
Encoding T1 ARMv7
CLREX<c>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 0 1 1 1 0 1 1 (1) (1) (1) (1) 1 0 (0) 0 (1) (1) (1) (1) 0 0 1 0 (1) (1) (1) (1)
// No additional decoding required
Encoding A1 ARMv6K, ARMv7
CLREX
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 1 0 1 0 1 1 1 (1) (1) (1) (1) (1) (1) (1) (1) (0) (0) (0) (0) 0 0 0 1 (1) (1) (1) (1)
// No additional decoding required
Assembler syntax
CLREX<c><q>
where:
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM CLREX instruction must be
unconditional.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    ClearExclusiveLocal(ProcessorID());
Exceptions
None.
A8.6.31 CLZ
Count Leading Zeros returns the number of binary zero bits before the first binary one bit in a value.
Encoding T1 ARMv6T2, ARMv7
CLZ<c> <Rd>,<Rm>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 1 0 1 0 1 1 Rm 1 1 1 1 Rd 1 0 0 0 Rm
if !Consistent(Rm) then UNPREDICTABLE;
d = UInt(Rd); m = UInt(Rm);
if BadReg(d) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv5T*, ARMv6*, ARMv7
CLZ<c> <Rd>,<Rm>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 1 1 0 (1) (1) (1) (1) Rd (1) (1) (1) (1) 0 0 0 1 Rm
d = UInt(Rd); m = UInt(Rm);
if d == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
CLZ<c><q> <Rd>, <Rm>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rm>
The register that contains the operand. Its number must be encoded twice in encoding T1.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = CountLeadingZeroBits(R[m]);
    R[d] = result<31:0>;
Exceptions
None.
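The result written by CLZ can be illustrated with a short Python sketch. The function name clz is a hypothetical helper standing in for CountLeadingZeroBits() applied to a 32-bit value; it relies on Python's int.bit_length() method.

```python
def clz(value):
    """Count the zero bits above the most significant one bit of a 32-bit value."""
    value &= 0xFFFFFFFF
    # bit_length() is the position of the highest set bit plus one, or 0 for zero.
    return 32 - value.bit_length()
```

An input of zero has no one bit, so the result is 32. Any input with bit<31> set gives a result of 0.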
A8.6.32 CMN (immediate)
Compare Negative (immediate) adds a register value and an immediate value. It updates the condition flags
based on the result, and discards the result.
Encoding T1 ARMv6T2, ARMv7
CMN<c> <Rn>,#<const>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 i 0 1 0 0 0 1 Rn 0 imm3 1 1 1 1 imm8
n = UInt(Rn); imm32 = ThumbExpandImm(i:imm3:imm8);
if n == 15 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMN<c> <Rn>,#<const>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 1 1 0 1 1 1 Rn (0) (0) (0) (0) imm12
n = UInt(Rn); imm32 = ARMExpandImm(imm12);
Assembler syntax
CMN<c><q> <Rn>, #<const>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The register that contains the operand. SP can be used in Thumb as well as in ARM.
<const>
The immediate value to be added to the value obtained from <Rn>. See Modified immediate
constants in Thumb instructions on page A6-17 or Modified immediate constants in ARM
instructions on page A5-9 for the range of values.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(R[n], imm32, ‘0’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
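The flag results of CMN can be worked through with a small Python model. The helper add_with_carry below is an assumption written to follow the AddWithCarry() pseudocode function defined elsewhere in this manual, restricted to 32-bit operands; the names to_signed and cmn_flags are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit value as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_carry(x, y, carry_in):
    """Return (result, carry_out, overflow) for 32-bit x + y + carry_in."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK32
    carry_out = int(result != unsigned_sum)            # unsigned overflow
    overflow = int(to_signed(result) != signed_sum)    # signed overflow
    return result, carry_out, overflow


def cmn_flags(rn_value, imm32):
    """Model of CMN (immediate): return the N, Z, C and V flag values."""
    result, carry, overflow = add_with_carry(rn_value, imm32, 0)
    return {"N": result >> 31, "Z": int(result == 0), "C": carry, "V": overflow}
```

For example, comparing 0xFFFFFFFF against 1 sets Z and C, because the 32-bit sum wraps to zero with a carry out.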
A8.6.33 CMN (register)
Compare Negative (register) adds a register value and an optionally-shifted register value. It updates the
condition flags based on the result, and discards the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
CMN<c> <Rn>,<Rm>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 0 1 0 1 1 Rm Rn
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
CMN<c>.W <Rn>,<Rm>{,<shift>}
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 0 1 1 0 0 0 1 Rn (0) imm3 1 1 1 1 imm2 type Rm
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if n == 15 || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMN<c> <Rn>,<Rm>{,<shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 1 1 1 Rn (0) (0) (0) (0) imm5 type 0 Rm
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
CMN<c><q> <Rn>, <Rm> {,<shift>}
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The first operand register. SP can be used in Thumb (encoding T2) as well as in ARM.
<Rm>
The register that is optionally shifted and used as the second operand.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, ‘0’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
A8.6.34 CMN (register-shifted register)
Compare Negative (register-shifted register) adds a register value and a register-shifted register value. It
updates the condition flags based on the result, and discards the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMN<c> <Rn>,<Rm>,<type> <Rs>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 1 1 1 Rn (0) (0) (0) (0) Rs 0 type 1 Rm
n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
CMN<c><q> <Rn>, <Rm>, <type> <Rs>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The first operand register.
<Rm>
The register that is shifted and used as the second operand.
<type>
The type of shift to apply to the value read from <Rm>. It must be one of:
ASR    Arithmetic shift right, encoded as type = 0b10
LSL    Logical shift left, encoded as type = 0b00
LSR    Logical shift right, encoded as type = 0b01
ROR    Rotate right, encoded as type = 0b11.
<Rs>
The register whose bottom byte contains the amount to shift by.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], shifted, ‘0’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
A8.6.35 CMP (immediate)
Compare (immediate) subtracts an immediate value from a register value. It updates the condition flags
based on the result, and discards the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,#<imm8>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 1 0 1 Rn imm8
n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
Encoding T2 ARMv6T2, ARMv7
CMP<c>.W <Rn>,#<const>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 0 i 0 1 1 0 1 1 Rn 0 imm3 1 1 1 1 imm8
n = UInt(Rn); imm32 = ThumbExpandImm(i:imm3:imm8);
if n == 15 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,#<const>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 1 1 0 1 0 1 Rn (0) (0) (0) (0) imm12
n = UInt(Rn); imm32 = ARMExpandImm(imm12);
Assembler syntax
CMP<c><q> <Rn>, #<const>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The first operand register. SP can be used in Thumb (encoding T2) as well as in ARM.
<const>
The immediate value to be compared with the value obtained from <Rn>. The range of values
is 0-255 for encoding T1. See Modified immediate constants in Thumb instructions on
page A6-17 or Modified immediate constants in ARM instructions on page A5-9 for the
range of values for encodings T2 and A1.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
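The Operation pseudocode above performs the subtraction as an addition of the inverted immediate with a carry-in of 1. The following Python sketch works through that arithmetic for 32-bit operands. The helper add_with_carry is an assumption modelled on the AddWithCarry() pseudocode function defined elsewhere in this manual, and the names to_signed and cmp_flags are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit value as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_carry(x, y, carry_in):
    """Return (result, carry_out, overflow) for 32-bit x + y + carry_in."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK32
    carry_out = int(result != unsigned_sum)
    overflow = int(to_signed(result) != signed_sum)
    return result, carry_out, overflow


def cmp_flags(rn_value, imm32):
    """Model of CMP (immediate): flags for R[n] - imm32."""
    # R[n] + NOT(imm32) + 1 is the two's complement form of R[n] - imm32.
    result, carry, overflow = add_with_carry(rn_value, ~imm32 & MASK32, 1)
    return {"N": result >> 31, "Z": int(result == 0), "C": carry, "V": overflow}
```

In this model the C flag is 1 when no borrow occurs, that is, when the register value is greater than or equal to the immediate as unsigned numbers.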
A8.6.36 CMP (register)
Compare (register) subtracts an optionally-shifted register value from a register value. It updates the
condition flags based on the result, and discards the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,<Rm>    <Rn> and <Rm> both from R0-R7
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 0 1 0 1 0 Rm Rn
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,<Rm>    <Rn> and <Rm> not both from R0-R7
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 0 1 0 1 N Rm Rn
n = UInt(N:Rn); m = UInt(Rm);
(shift_t, shift_n) = (SRType_LSL, 0);
if n < 8 && m < 8 then UNPREDICTABLE;
if n == 15 || m == 15 then UNPREDICTABLE;
Encoding T3 ARMv6T2, ARMv7
CMP<c>.W <Rn>, <Rm> {,<shift>}
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 0 1 1 1 0 1 1 Rn (0) imm3 1 1 1 1 imm2 type Rm
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if n == 15 || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,<Rm>{,<shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 0 0 1 0 1 0 1 Rn (0) (0) (0) (0) imm5 type 0 Rm
n = UInt(Rn); m = UInt(Rm);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
CMP<c><q> <Rn>, <Rm> {,<shift>}
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The first operand register. The SP can be used.
<Rm>
The register that is optionally shifted and used as the second operand. This register can be
SP in both ARM and Thumb instructions, but:
• the use of SP is deprecated
• when assembling for the Thumb instruction set, only encoding T2 is available.
<shift>
The shift to apply to the value read from <Rm>. If present, encodings T1 and T2 are not
permitted. If absent, no shift is applied and all encodings are permitted. Shifts applied to a
register on page A8-10 describes the shifts and how they are encoded.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], NOT(shifted), ‘1’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
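The flag computation in the Operation pseudocode above can be sketched in Python. This is an illustrative model only: the helper names add_with_carry and cmp_flags are hypothetical, and add_with_carry reflects our reading of AddWithCarry() as a 32-bit addition that reports the unsigned carry-out and the signed overflow.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit unsigned value as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_carry(x, y, carry_in):
    """Model of a 32-bit add: returns (result, carry_out, overflow)."""
    unsigned_sum = x + y + carry_in
    signed_sum = to_signed(x) + to_signed(y) + carry_in
    result = unsigned_sum & MASK32
    carry_out = 0 if result == unsigned_sum else 1
    overflow = 0 if to_signed(result) == signed_sum else 1
    return result, carry_out, overflow


def cmp_flags(rn_value, shifted):
    """Flags produced by CMP: Rn - shifted, computed as Rn + NOT(shifted) + 1."""
    result, carry, overflow = add_with_carry(rn_value, ~shifted & MASK32, 1)
    return {"N": result >> 31, "Z": int(result == 0), "C": carry, "V": overflow}
```

Note that C is set when no borrow occurs, so comparing equal values gives Z = 1 and C = 1.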
A8.6.37 CMP (register-shifted register)
Compare (register-shifted register) subtracts a register-shifted register value from a register value. It updates
the condition flags based on the result, and discards the result.
n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
shift_t = DecodeRegShift(type);
if n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
CMP<c> <Rn>,<Rm>,<type> <Rs>
[31:28] cond | [27:20] 00010101 | [19:16] Rn | [15:12] (0)(0)(0)(0) | [11:8] Rs | [7] 0 | [6:5] type | [4] 1 | [3:0] Rm
Assembler syntax
CMP<c><q> <Rn>, <Rm>, <type> <Rs>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The first operand register.
<Rm>
The register that is shifted and used as the second operand.
<type>
The type of shift to apply to the value read from <Rm>. It must be one of:
ASR
Arithmetic shift right, encoded as type = 0b10
LSL
Logical shift left, encoded as type = 0b00
LSR
Logical shift right, encoded as type = 0b01
ROR
Rotate right, encoded as type = 0b11.
<Rs>
The register whose bottom byte contains the amount to shift by.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    shifted = Shift(R[m], shift_t, shift_n, APSR.C);
    (result, carry, overflow) = AddWithCarry(R[n], NOT(shifted), ‘1’);
    APSR.N = result<31>;
    APSR.Z = IsZeroBit(result);
    APSR.C = carry;
    APSR.V = overflow;
Exceptions
None.
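Because only the bottom byte of <Rs> is used, the shift amount ranges from 0 to 255 and can exceed the register width. The following Python sketch illustrates how such amounts might behave for the four shift types. The names shift_amount and shift_by_register are hypothetical, and the behavior for amounts of 32 or more is our reading of the Shift() pseudocode function, not a quotation of it.

```python
MASK32 = 0xFFFFFFFF


def shift_amount(rs_value):
    """shift_n = UInt(R[s]<7:0>): only the bottom byte of Rs is used."""
    return rs_value & 0xFF


def shift_by_register(value, shift_t, amount):
    """Shift a 32-bit value by an amount in the range 0-255."""
    if amount == 0:
        return value
    if shift_t == "LSL":
        return (value << amount) & MASK32      # 0 once amount >= 32
    if shift_t == "LSR":
        return value >> amount                 # 0 once amount >= 32
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32     # all sign bits once amount >= 32
    if shift_t == "ROR":
        m = amount % 32                        # rotation wraps modulo 32
        return ((value >> m) | (value << (32 - m))) & MASK32
    raise ValueError("unknown shift type: " + shift_t)
```

For example, a value of 0x104 in Rs selects a shift amount of 4, because bits above the bottom byte are ignored.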
A8.6.38 CPS
Change Processor State is a system instruction. For details see CPS on page B6-3.
A8.6.39 CPY
Copy is a pre-UAL synonym for MOV (register).
Assembler syntax
CPY <Rd>, <Rn>
This is equivalent to:
MOV <Rd>, <Rn>
Exceptions
None.
A8.6.40 DBG
Debug Hint provides a hint to debug and related systems. See their documentation for what use (if any) they
make of this instruction.
// Encodings T1 and A1: any decoding of ‘option’ is specified by the debug system
Encoding T1 ARMv7 (executes as NOP in ARMv6T2)
DBG<c> #<option>
First halfword:  [15:4] 111100111010 | [3:0] (1)(1)(1)(1)
Second halfword: [15:14] 10 | [13] (0) | [12] 0 | [11] (0) | [10:8] 000 | [7:4] 1111 | [3:0] option
Encoding A1 ARMv7 (executes as NOP in ARMv6K and ARMv6T2)
DBG<c> #<option>
[31:28] cond | [27:16] 001100100000 | [15:12] (1)(1)(1)(1) | [11:8] (0)(0)(0)(0) | [7:4] 1111 | [3:0] option
Assembler syntax
DBG<c><q> #<option>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<option>
Provides extra information about the hint, and is in the range 0 to 15.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    Hint_Debug(option);
Exceptions
None.
A8.6.41 DMB
Data Memory Barrier is a memory barrier that ensures the ordering of observations of memory accesses, see
Data Memory Barrier (DMB) on page A3-48.
// Encodings T1 and A1: no additional decoding required
Encoding T1 ARMv7
DMB<c> #<option>
First halfword:  [15:4] 111100111011 | [3:0] (1)(1)(1)(1)
Second halfword: [15:14] 10 | [13] (0) | [12] 0 | [11:8] (1)(1)(1)(1) | [7:4] 0101 | [3:0] option
Encoding A1 ARMv7
DMB #<option>
[31:20] 111101010111 | [19:16] (1)(1)(1)(1) | [15:12] (1)(1)(1)(1) | [11:8] (0)(0)(0)(0) | [7:4] 0101 | [3:0] option
Assembler syntax
DMB<c><q> {<opt>}
where:
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM DMB instruction must be unconditional.
<opt>
Specifies an optional limitation on the DMB operation. Values are:
SY
Full system is the required shareability domain, reads and writes are the
required access types. Can be omitted.
This option is referred to as the full system DMB. Encoded as option == '1111'.
ST
Full system is the required shareability domain, writes are the required access
type. SYST is a synonym for ST. Encoded as option == '1110'.
ISH
Inner Shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '1011'.
ISHST
Inner Shareable is the required shareability domain, writes are the required
access type. Encoded as option == '1010'.
NSH
Non-shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '0111'.
NSHST
Non-shareable is the required shareability domain, writes are the required
access type. Encoded as option == '0110'.
OSH
Outer Shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '0011'.
OSHST
Outer Shareable is the required shareability domain, writes are the required
access type. Encoded as option == '0010'.
All other encodings of option are reserved. It is IMPLEMENTATION DEFINED whether options
other than SY are implemented. All unsupported and reserved options must execute as a full
system DMB operation, but software must not rely on this behavior.
Note
The following alternative <opt> values are supported, but ARM recommends that you do not
use these alternative values:
SH as an alias for ISH
SHST as an alias for ISHST
UN as an alias for NSH
UNST as an alias for NSHST.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    case option of
        when ‘0010’ domain = MBReqDomain_OuterShareable; types = MBReqTypes_Writes;
        when ‘0011’ domain = MBReqDomain_OuterShareable; types = MBReqTypes_All;
        when ‘0110’ domain = MBReqDomain_Nonshareable; types = MBReqTypes_Writes;
        when ‘0111’ domain = MBReqDomain_Nonshareable; types = MBReqTypes_All;
        when ‘1010’ domain = MBReqDomain_InnerShareable; types = MBReqTypes_Writes;
        when ‘1011’ domain = MBReqDomain_InnerShareable; types = MBReqTypes_All;
        when ‘1110’ domain = MBReqDomain_FullSystem; types = MBReqTypes_Writes;
        otherwise domain = MBReqDomain_FullSystem; types = MBReqTypes_All;
    DataMemoryBarrier(domain, types);
Exceptions
None.
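The mapping from <opt> names to option encodings, and from encodings to the required domain and access types, can be sketched as a pair of lookup tables. This Python sketch is illustrative only: the names OPT_ENCODING, OPTION_DECODE, encode_opt and decode_option are hypothetical, and the string values stand in for the MBReqDomain and MBReqTypes enumerations.

```python
# <opt> name -> 4-bit option field, using the encodings listed for DMB.
OPT_ENCODING = {
    "SY": 0b1111,
    "ST": 0b1110, "SYST": 0b1110,        # SYST is a synonym for ST
    "ISH": 0b1011, "ISHST": 0b1010,
    "NSH": 0b0111, "NSHST": 0b0110,
    "OSH": 0b0011, "OSHST": 0b0010,
    # Alternative names that ARM recommends against using
    "SH": 0b1011, "SHST": 0b1010, "UN": 0b0111, "UNST": 0b0110,
}

# option field -> (shareability domain, access types)
OPTION_DECODE = {
    0b0010: ("OuterShareable", "Writes"),
    0b0011: ("OuterShareable", "All"),
    0b0110: ("Nonshareable", "Writes"),
    0b0111: ("Nonshareable", "All"),
    0b1010: ("InnerShareable", "Writes"),
    0b1011: ("InnerShareable", "All"),
    0b1110: ("FullSystem", "Writes"),
}


def encode_opt(name="SY"):
    """Return the option field for an <opt> name; omitted <opt> means SY."""
    return OPT_ENCODING[name.upper()]


def decode_option(option):
    """Reserved encodings fall through to the full system, all-accesses case."""
    return OPTION_DECODE.get(option & 0xF, ("FullSystem", "All"))
```

The fallback in decode_option mirrors the otherwise branch of the pseudocode, which is why software gains nothing from using a reserved encoding.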
A8.6.42 DSB
Data Synchronization Barrier is a memory barrier that ensures the completion of memory accesses, see Data
Synchronization Barrier (DSB) on page A3-49.
// Encodings T1 and A1: no additional decoding required
Encoding T1 ARMv7
DSB<c> #<option>
First halfword:  [15:4] 111100111011 | [3:0] (1)(1)(1)(1)
Second halfword: [15:14] 10 | [13] (0) | [12] 0 | [11:8] (1)(1)(1)(1) | [7:4] 0100 | [3:0] option
Encoding A1 ARMv7
DSB #<option>
[31:20] 111101010111 | [19:16] (1)(1)(1)(1) | [15:12] (1)(1)(1)(1) | [11:8] (0)(0)(0)(0) | [7:4] 0100 | [3:0] option
Assembler syntax
DSB<c><q> {<opt>}
where:
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM DSB instruction must be unconditional.
<opt>
Specifies an optional limitation on the DSB operation. Values are:
SY
Full system is the required shareability domain, reads and writes are the
required access types. Can be omitted.
This option is referred to as the full system DSB. Encoded as option == '1111'.
ST
Full system is the required shareability domain, writes are the required access
type. SYST is a synonym for ST. Encoded as option == '1110'.
ISH
Inner Shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '1011'.
ISHST
Inner Shareable is the required shareability domain, writes are the required
access type. Encoded as option == '1010'.
NSH
Non-shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '0111'.
NSHST
Non-shareable is the required shareability domain, writes are the required
access type. Encoded as option == '0110'.
OSH
Outer Shareable is the required shareability domain, reads and writes are the
required access types. Encoded as option == '0011'.
OSHST
Outer Shareable is the required shareability domain, writes are the required
access type. Encoded as option == '0010'.
All other encodings of option are reserved. It is IMPLEMENTATION DEFINED whether options
other than SY are implemented. All unsupported and reserved options must execute as a full
system DSB operation, but software must not rely on this behavior.
Note
The following alternative <opt> values are supported, but ARM recommends that you do not
use these alternative values:
SH as an alias for ISH
SHST as an alias for ISHST
UN as an alias for NSH
UNST as an alias for NSHST.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    case option of
        when ‘0010’ domain = MBReqDomain_OuterShareable; types = MBReqTypes_Writes;
        when ‘0011’ domain = MBReqDomain_OuterShareable; types = MBReqTypes_All;
        when ‘0110’ domain = MBReqDomain_Nonshareable; types = MBReqTypes_Writes;
        when ‘0111’ domain = MBReqDomain_Nonshareable; types = MBReqTypes_All;
        when ‘1010’ domain = MBReqDomain_InnerShareable; types = MBReqTypes_Writes;
        when ‘1011’ domain = MBReqDomain_InnerShareable; types = MBReqTypes_All;
        when ‘1110’ domain = MBReqDomain_FullSystem; types = MBReqTypes_Writes;
        otherwise domain = MBReqDomain_FullSystem; types = MBReqTypes_All;
    DataSynchronizationBarrier(domain, types);
Exceptions
None.
A8.6.43 ENTERX
ENTERX causes a change from Thumb state to ThumbEE state, or has no effect in ThumbEE state. For details
see ENTERX, LEAVEX on page A9-7.
A8.6.44 EOR (immediate)
Bitwise Exclusive OR (immediate) performs a bitwise Exclusive OR of a register value and an immediate
value, and writes the result to the destination register. It can optionally update the condition flags based on
the result.
// Encoding T1
if Rd == ‘1111’ && S == ‘1’ then SEE TEQ (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) || BadReg(n) then UNPREDICTABLE;
// Encoding A1
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
Encoding T1 ARMv6T2, ARMv7
EOR{S}<c> <Rd>,<Rn>,#<const>
First halfword:  [15:11] 11110 | [10] i | [9] 0 | [8:5] 0100 | [4] S | [3:0] Rn
Second halfword: [15] 0 | [14:12] imm3 | [11:8] Rd | [7:0] imm8
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
EOR{S}<c> <Rd>,<Rn>,#<const>
[31:28] cond | [27:21] 0010001 | [20] S | [19:16] Rn | [15:12] Rd | [11:0] imm12
Assembler syntax
EOR{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The register that contains the operand.
<const>
The immediate value to be exclusive ORed with the value obtained from <Rn>. See Modified
immediate constants in Thumb instructions on page A6-17 or Modified immediate constants
in ARM instructions on page A5-9 for the range of values.
The pre-UAL syntax EOR<c>S is equivalent to EORS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] EOR imm32;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
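For the ARM encoding, the carry flag written by EORS comes from the immediate expansion rather than from the exclusive OR itself. The following Python sketch illustrates this under the assumption that ARMExpandImm_C() rotates the bottom 8 bits of imm12 right by twice the value of the top 4 bits, and leaves the carry unchanged when the rotation is zero. The function names are hypothetical, and the PC-destination case is not modelled.

```python
MASK32 = 0xFFFFFFFF


def arm_expand_imm_c(imm12, carry_in):
    """Expand an ARM modified immediate; returns (imm32, carry_out)."""
    unrotated = imm12 & 0xFF
    rotation = 2 * ((imm12 >> 8) & 0xF)
    if rotation == 0:
        return unrotated, carry_in          # no rotation: carry passes through
    imm32 = ((unrotated >> rotation) | (unrotated << (32 - rotation))) & MASK32
    return imm32, imm32 >> 31               # carry is bit 31 of the rotated value


def eors_immediate(rn_value, imm12, carry_in):
    """Model EORS <Rd>, <Rn>, #<const> for the ARM encoding (d != 15)."""
    imm32, carry = arm_expand_imm_c(imm12, carry_in)
    result = (rn_value ^ imm32) & MASK32
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags                    # V is left unchanged, so it is omitted
```

With imm12 = 0x4FF the constant expands to 0xFF000000, so exclusive ORing it with a register holding 0xFF000000 gives a zero result with Z and C both set.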
A8.6.45 EOR (register)
Bitwise Exclusive OR (register) performs a bitwise Exclusive OR of a register value and an
optionally-shifted register value, and writes the result to the destination register. It can optionally update the
condition flags based on the result.
// Encoding T1
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);
// Encoding T2
if Rd == ‘1111’ && S == ‘1’ then SEE TEQ (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
// Encoding A1
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
EORS <Rdn>,<Rm>      Outside IT block.
EOR<c> <Rdn>,<Rm>    Inside IT block.
[15:6] 0100000001 | [5:3] Rm | [2:0] Rdn
Encoding T2 ARMv6T2, ARMv7
EOR{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
First halfword:  [15:5] 11101010100 | [4] S | [3:0] Rn
Second halfword: [15] (0) | [14:12] imm3 | [11:8] Rd | [7:6] imm2 | [5:4] type | [3:0] Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
EOR{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
[31:28] cond | [27:21] 0000001 | [20] S | [19:16] Rn | [15:12] Rd | [11:7] imm5 | [6:5] type | [4] 0 | [3:0] Rm
Assembler syntax
EOR{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is optionally shifted and used as the second operand.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
In Thumb assembly:
outside an IT block, if EORS <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is
assembled using encoding T1 as though EORS <Rd>,<Rn> had been written
inside an IT block, if EOR<c> <Rd>,<Rn>,<Rd> has <Rd> and <Rn> both in the range R0-R7, it is
assembled using encoding T1 as though EOR<c> <Rd>,<Rn> had been written.
To prevent either of these happening, use the .W qualifier.
The pre-UAL syntax EOR<c>S is equivalent to EORS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] EOR shifted;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
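The decode and operation steps for the immediate-shifted form can be illustrated with a Python sketch. It assumes that DecodeImmShift() treats an imm5 of zero as a shift of 32 for LSR and ASR and as RRX for the rotate type, and that Shift_C() returns the last bit shifted out as the carry. These assumptions come from our reading of Shifts applied to a register, and all function names here are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def decode_imm_shift(type_bits, imm5):
    """Map the 2-bit type field and 5-bit immediate to (shift_t, shift_n)."""
    if type_bits == 0b00:
        return "LSL", imm5
    if type_bits == 0b01:
        return "LSR", imm5 or 32
    if type_bits == 0b10:
        return "ASR", imm5 or 32
    if type_bits == 0b11:
        return ("RRX", 1) if imm5 == 0 else ("ROR", imm5)
    raise ValueError("type must be a 2-bit value")


def shift_c(value, shift_t, amount, carry_in):
    """Shift a 32-bit value; returns (result, carry_out)."""
    if amount == 0:
        return value, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    if shift_t == "RRX":
        return (carry_in << 31) | (value >> 1), value & 1
    raise ValueError("unknown shift type: " + shift_t)


def eors_register(rn_value, rm_value, type_bits, imm5, carry_in):
    """Model EORS <Rd>, <Rn>, <Rm>, <shift> for d != 15."""
    shift_t, shift_n = decode_imm_shift(type_bits, imm5)
    shifted, carry = shift_c(rm_value, shift_t, shift_n, carry_in)
    result = rn_value ^ shifted
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}
```

As with the immediate form, the C flag reflects the shifter output, not the exclusive OR.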
A8.6.46 EOR (register-shifted register)
Bitwise Exclusive OR (register-shifted register) performs a bitwise Exclusive OR of a register value and a
register-shifted register value. It writes the result to the destination register, and can optionally update the
condition flags based on the result.
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
EOR{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
[31:28] cond | [27:21] 0000001 | [20] S | [19:16] Rn | [15:12] Rd | [11:8] Rs | [7] 0 | [6:5] type | [4] 1 | [3:0] Rm
Assembler syntax
EOR{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is shifted and used as the second operand.
<type>
The type of shift to apply to the value read from <Rm>. It must be one of:
ASR
Arithmetic shift right, encoded as type = 0b10
LSL
Logical shift left, encoded as type = 0b00
LSR
Logical shift right, encoded as type = 0b01
ROR
Rotate right, encoded as type = 0b11.
<Rs>
The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax EOR<c>S is equivalent to EORS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] EOR shifted;
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
A8.6.47 F* (former VFP instruction mnemonics)
Table A8-2 lists the UAL equivalents of pre-UAL VFP instruction mnemonics.
Table A8-2 VFP instruction mnemonics
Former ARM assembler mnemonic    UAL equivalent  See
FABSD, FABSS                     VABS            VABS on page A8-532
FADDD, FADDS                     VADD            VADD (floating-point) on page A8-538
FCMP, FCMPE, FCMPEZ, FCMPZ       VCMP{E}         VCMP, VCMPE on page A8-572
FCONSTD, FCONSTS                 VMOV            VMOV (immediate) on page A8-640
FCPYD, FCPYS                     VMOV            VMOV (register) on page A8-642
FCVTDS, FCVTSD                   VCVT            VCVT (between double-precision and single-precision) on page A8-584
FDIVD, FDIVS                     VDIV            VDIV on page A8-590
FLDD                             VLDR            VLDR on page A8-628
FLDMD, FLDMS                     VLDM, VPOP      VLDM on page A8-626. VPOP on page A8-694
FLDMX                            FLDMX           FLDMX, FSTMX on page A8-101
FLDS                             VLDR            VLDR on page A8-628
FMACD, FMACS                     VMLA            VMLA, VMLS (floating-point) on page A8-636
FMDHR, FMDLR                     VMOV            VMOV (ARM core register to scalar) on page A8-644
FMDRR                            VMOV            VMOV (between two ARM core registers and a doubleword extension register) on page A8-652
FMRDH, FMRDL                     VMOV            VMOV (scalar to ARM core register) on page A8-646
FMRRD                            VMOV            VMOV (between two ARM core registers and a doubleword extension register) on page A8-652
FMRRS                            VMOV            VMOV (between two ARM core registers and two single-precision registers) on page A8-650
FMRS                             VMOV            VMOV (between ARM core register and single-precision register) on page A8-648
FMRX                             VMRS            VMRS on page A8-658
FMSCD, FMSCS                     VNMLS           VNMLA, VNMLS, VNMUL on page A8-674
FMSR                             VMOV            VMOV (between ARM core register and single-precision register) on page A8-648
FMSRR                            VMOV            VMOV (between two ARM core registers and two single-precision registers) on page A8-650
FMSTAT                           VMRS            VMRS on page A8-658
FMULD, FMULS                     VMUL            VMUL (floating-point) on page A8-664
FMXR                             VMSR            VMSR on page A8-660
FNEGD, FNEGS                     VNEG            VNEG on page A8-672
FNMACD, FNMACS                   VMLS            VMLA, VMLS (floating-point) on page A8-636
FNMSCD, FNMSCS                   VNMLA           VNMLA, VNMLS, VNMUL on page A8-674
FNMULD, FNMULS                   VNMUL           VNMLA, VNMLS, VNMUL on page A8-674
FSHTOD, FSHTOS                   VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FSITOD, FSITOS                   VCVT            VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
FSLTOD, FSLTOS                   VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FSQRTD, FSQRTS                   VSQRT           VSQRT on page A8-762
FSTD                             VSTR            VSTR on page A8-786
FSTMD, FSTMS                     VSTM, VPUSH     VSTM on page A8-784, VPUSH on page A8-696
FSTMX                            FSTMX           FLDMX, FSTMX
FSTS                             VSTR            VSTR on page A8-786
FSUBD, FSUBS                     VSUB            VSUB (floating-point) on page A8-790
FTOSHD, FTOSHS                   VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FTOSI{Z}D, FTOSI{Z}S             VCVT{R}         VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
FTOSL, FTOUH                     VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FTOUI{Z}D, FTOUI{Z}S             VCVT{R}         VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
FTOULD, FTOULS, FUHTOD, FUHTOS   VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FUITOD, FUITOS                   VCVT            VCVT, VCVTR (between floating-point and integer, VFP) on page A8-578
FULTOD, FULTOS                   VCVT            VCVT (between floating-point and fixed-point, VFP) on page A8-582
FLDMX, FSTMX
Encodings T1/A1 of the VLDM, VPOP, VPUSH, and VSTM instructions contain an imm8 field that is set to twice
the number of doubleword registers to be transferred. Use of these encodings with an odd value in imm8 is
deprecated, and there is no UAL syntax for them.
The pre-UAL mnemonics FLDMX and FSTMX result in the same instructions as FLDMD (VLDM.64 or VPOP.64) and
FSTMD (VSTM.64 or VPUSH.64) respectively, except that imm8 is equal to twice the number of doubleword
registers plus one. Use of FLDMX and FSTMX is deprecated from ARMv6, except for disassembly purposes, and
reassembly of disassembled code.
A8.6.48 HB, HBL, HBLP, HBP
These are ThumbEE instructions. For details see HB, HBL on page A9-16, HBLP on page A9-17, and HBP
on page A9-18.
A8.6.49 ISB
Instruction Synchronization Barrier flushes the pipeline in the processor, so that all instructions following
the ISB are fetched from cache or memory, after the instruction has been completed. It ensures that the effects
of context altering operations, such as changing the ASID, or completed TLB maintenance operations, or
branch predictor maintenance operations, as well as all changes to the CP15 registers, executed before the
ISB instruction are visible to the instructions fetched after the ISB.
In addition, any branches that appear in program order after the ISB instruction are written into the branch
prediction logic with the context that is visible after the ISB instruction. This is needed to ensure correct
execution of the instruction stream.
// Encodings T1 and A1: no additional decoding required
Encoding T1 ARMv7
ISB<c> #<option>
First halfword:  [15:4] 111100111011 | [3:0] (1)(1)(1)(1)
Second halfword: [15:14] 10 | [13] (0) | [12] 0 | [11:8] (1)(1)(1)(1) | [7:4] 0110 | [3:0] option
Encoding A1 ARMv7
ISB #<option>
[31:20] 111101010111 | [19:16] (1)(1)(1)(1) | [15:12] (1)(1)(1)(1) | [11:8] (0)(0)(0)(0) | [7:4] 0110 | [3:0] option
Assembler syntax
ISB<c><q> {<opt>}
where:
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM ISB instruction must be unconditional.
<opt>
Specifies an optional limitation on the ISB operation. Values are:
SY
Full system ISB operation, encoded as option == '1111'. Can be omitted.
All other encodings of option are reserved. The corresponding instructions execute as full
system ISB operations, but must not be relied upon by software.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    InstructionSynchronizationBarrier();
Exceptions
None.
A8.6.50 IT
If Then makes up to four following instructions (the IT block) conditional. The conditions for the
instructions in the IT block can be the same, or some of them can be the inverse of others.
IT does not affect the condition code flags. Branches to any instruction in the IT block are not permitted,
apart from those performed by exception returns.
16-bit instructions in the IT block, other than CMP, CMN and TST, do not set the condition code flags. The
AL condition can be specified to get this changed behavior without conditional execution.
See also ITSTATE on page A2-17, Conditional instructions on page A4-4, and Conditional execution on
page A8-8.
if mask == ‘0000’ then SEE “Related encodings”;
if firstcond == ‘1111’ then UNPREDICTABLE;
if firstcond == ‘1110’ && BitCount(mask) != 1 then UNPREDICTABLE;
if InITBlock() then UNPREDICTABLE;
Assembler syntax
IT{x{y{z}}}<q> <firstcond>
where:
<x>
The condition for the second instruction in the IT block.
<y>
The condition for the third instruction in the IT block.
<z>
The condition for the fourth instruction in the IT block.
<q>
See Standard assembler syntax fields on page A8-7. An IT instruction must be unconditional.
<firstcond>
The condition for the first instruction in the IT block. See Table A8-1 on page A8-8 for the
range of conditions available, and the encodings.
Each of <x>, <y>, and <z> can be either:
T
Then. The condition attached to the instruction is <firstcond>.
E
Else. The condition attached to the instruction is the inverse of <firstcond>. The condition
code is the same as <firstcond>, except that the least significant bit is inverted. E must not
be specified if <firstcond> is AL.
Encoding T1 ARMv6T2, ARMv7
IT{x{y{z}}} <firstcond>    Not permitted in IT block
[15:8] 10111111 | [7:4] firstcond | [3:0] mask
Related encodings    See If-Then, and hints on page A6-12
Table A8-3 shows how the values of <x>, <y>, and <z> determine the value of the mask field.
The conditions specified in an IT instruction must match those specified in the syntax of the instructions in
its IT block. When assembling to ARM code, assemblers check IT instruction syntax for validity but do not
generate assembled instructions for them. See Conditional instructions on page A4-4.
Operation
EncodingSpecificOperations();
ITSTATE.IT<7:0> = firstcond:mask;
Exceptions
None.
Table A8-3 Determination of mask field [a]
<x>      <y>      <z>      mask[3]           mask[2]           mask[1]           mask[0]
Omitted  Omitted  Omitted  1                 0                 0                 0
T        Omitted  Omitted  firstcond[0]      1                 0                 0
E        Omitted  Omitted  NOT firstcond[0]  1                 0                 0
T        T        Omitted  firstcond[0]      firstcond[0]      1                 0
E        T        Omitted  NOT firstcond[0]  firstcond[0]      1                 0
T        E        Omitted  firstcond[0]      NOT firstcond[0]  1                 0
E        E        Omitted  NOT firstcond[0]  NOT firstcond[0]  1                 0
T        T        T        firstcond[0]      firstcond[0]      firstcond[0]      1
E        T        T        NOT firstcond[0]  firstcond[0]      firstcond[0]      1
T        E        T        firstcond[0]      NOT firstcond[0]  firstcond[0]      1
E        E        T        NOT firstcond[0]  NOT firstcond[0]  firstcond[0]      1
T        T        E        firstcond[0]      firstcond[0]      NOT firstcond[0]  1
E        T        E        NOT firstcond[0]  firstcond[0]      NOT firstcond[0]  1
T        E        E        firstcond[0]      NOT firstcond[0]  NOT firstcond[0]  1
E        E        E        NOT firstcond[0]  NOT firstcond[0]  NOT firstcond[0]  1
a. Note that at least one bit is always 1 in mask.
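The rule behind Table A8-3 can be stated compactly: each T or E contributes one bit, equal to firstcond[0] for T and its inverse for E, followed by a terminating 1 and then zeros. The Python sketch below illustrates this. The function names it_mask and it_encoding are hypothetical, and the suffix argument is simply the string of T and E letters written after IT.

```python
def it_mask(firstcond, suffix=""):
    """Compute the 4-bit mask field for IT{x{y{z}}} <firstcond>.

    firstcond is the 4-bit condition code; suffix holds up to three
    'T' or 'E' characters for <x>, <y> and <z>.
    """
    suffix = suffix.upper()
    if len(suffix) > 3 or any(ch not in "TE" for ch in suffix):
        raise ValueError("suffix must be up to three T/E characters")
    if firstcond == 0b1110 and "E" in suffix:
        raise ValueError("E must not be specified if <firstcond> is AL")
    low = firstcond & 1
    bits = [low if ch == "T" else low ^ 1 for ch in suffix]
    bits.append(1)                      # terminating 1 marks the end of the block
    bits += [0] * (4 - len(bits))       # remaining low-order bits are 0
    mask = 0
    for bit in bits:
        mask = (mask << 1) | bit
    return mask


def it_encoding(firstcond, suffix=""):
    """Assemble the 16-bit encoding T1: 10111111 : firstcond : mask."""
    return 0xBF00 | ((firstcond & 0xF) << 4) | it_mask(firstcond, suffix)
```

For example, it_encoding(0b0000, "E") models ITE EQ and it_encoding(0b0001, "T") models ITT NE.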
A8.6.51 LDC, LDC2 (immediate)
Load Coprocessor loads memory data from a sequence of consecutive memory addresses to a coprocessor.
If no coprocessor can execute the instruction, an Undefined Instruction exception is generated.
This is a generic coprocessor instruction. Some of the fields have no functionality defined by the architecture
and are free for use by the coprocessor instruction set designer. These fields are the D bit, the CRd field, and
in the Unindexed addressing mode only, the imm8 field.
For more information about the coprocessors see Coprocessor support on page A2-68.
// Encoding T1 / A1
if Rn == ‘1111’ then SEE LDC (literal);
if P == ‘0’ && U == ‘0’ && D == ‘0’ && W == ‘0’ then UNDEFINED;
if P == ‘0’ && U == ‘0’ && D == ‘1’ && W == ‘0’ then SEE MRRC, MRRC2;
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
n = UInt(Rn); cp = UInt(coproc); imm32 = ZeroExtend(imm8:‘00’, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
// Encoding T2 / A2
if Rn == ‘1111’ then SEE LDC (literal);
if P == ‘0’ && U == ‘0’ && D == ‘0’ && W == ‘0’ then UNDEFINED;
if P == ‘0’ && U == ‘0’ && D == ‘1’ && W == ‘0’ then SEE MRRC, MRRC2;
n = UInt(Rn); cp = UInt(coproc); imm32 = ZeroExtend(imm8:‘00’, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv4*, ARMv5T*, ARMv6*, ARMv7 for encoding A1
LDC{L}<c> <coproc>,<CRd>,[<Rn>,#+/-<imm>]{!}
LDC{L}<c> <coproc>,<CRd>,[<Rn>],#+/-<imm>
LDC{L}<c> <coproc>,<CRd>,[<Rn>],<option>
T1, first halfword:  [15:9] 1110110 | [8] P | [7] U | [6] D | [5] W | [4] 1 | [3:0] Rn
T1, second halfword: [15:12] CRd | [11:8] coproc | [7:0] imm8
A1: [31:28] cond | [27:25] 110 | [24] P | [23] U | [22] D | [21] W | [20] 1 | [19:16] Rn | [15:12] CRd | [11:8] coproc | [7:0] imm8
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv5T*, ARMv6*, ARMv7 for encoding A2
LDC2{L}<c> <coproc>,<CRd>,[<Rn>,#+/-<imm>]{!}
LDC2{L}<c> <coproc>,<CRd>,[<Rn>],#+/-<imm>
LDC2{L}<c> <coproc>,<CRd>,[<Rn>],<option>
T2, first halfword:  [15:9] 1111110 | [8] P | [7] U | [6] D | [5] W | [4] 1 | [3:0] Rn
T2, second halfword: [15:12] CRd | [11:8] coproc | [7:0] imm8
A2: [31:25] 1111110 | [24] P | [23] U | [22] D | [21] W | [20] 1 | [19:16] Rn | [15:12] CRd | [11:8] coproc | [7:0] imm8
Advanced SIMD and VFP See Extension register load/store instructions on page A7-26
Assembler syntax
LDC{2}{L}<c><q> <coproc>,<CRd>,[<Rn>{,#+/-<imm>}]    Offset. P = 1, W = 0.
LDC{2}{L}<c><q> <coproc>,<CRd>,[<Rn>,#+/-<imm>]!     Pre-indexed. P = 1, W = 1.
LDC{2}{L}<c><q> <coproc>,<CRd>,[<Rn>],#+/-<imm>      Post-indexed. P = 0, W = 1.
LDC{2}{L}<c><q> <coproc>,<CRd>,[<Rn>],<option>       Unindexed. P = 0, W = 0, U = 1.
where:
2
If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
L
If specified, selects the D == 1 form of the encoding. If omitted, selects the D == 0 form.
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM LDC2 instruction must be unconditional.
<coproc>
The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<CRd>
The coprocessor destination register.
<Rn>
The base register. The SP can be used. For PC use see LDC, LDC2 (literal) on page A8-108.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or - if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. Values are multiples of 4 in the range
0-1020. For the offset addressing syntax, <imm> can be omitted, meaning an offset of +0.
<option>
A coprocessor option. An integer in the range 0-255 enclosed in { }. Encoded in imm8.
The pre-UAL syntax LDC<c>L is equivalent to LDCL<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        NullCheckIfThumbEE(n);
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        repeat
            Coproc_SendLoadedWord(MemA[address,4], cp, ThisInstr()); address = address + 4;
        until Coproc_DoneLoading(cp, ThisInstr());
        if wback then R[n] = offset_addr;
Exceptions
Undefined Instruction, Data Abort.
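The way the P, U and W bits select the addressing form, the first address accessed, and the written-back base value can be sketched in Python. This is an illustrative model of the address arithmetic only. The function names are hypothetical, and the coprocessor transfer loop and exception checks are deliberately left out.

```python
MASK32 = 0xFFFFFFFF


def ldc_addressing_mode(p, u, w):
    """Name the addressing form selected by the P, U and W bits."""
    if p == 1:
        return "pre-indexed" if w == 1 else "offset"
    if w == 1:
        return "post-indexed"
    if u == 1:
        return "unindexed"
    raise ValueError("P = 0, U = 0, W = 0 is not an LDC encoding")


def ldc_addresses(rn_value, imm8, p, u, w):
    """Return (first address accessed, final value of Rn)."""
    imm32 = (imm8 & 0xFF) << 2                      # ZeroExtend(imm8:'00', 32)
    if u == 1:
        offset_addr = (rn_value + imm32) & MASK32   # add == TRUE
    else:
        offset_addr = (rn_value - imm32) & MASK32   # add == FALSE
    address = offset_addr if p == 1 else rn_value   # index == (P == 1)
    final_rn = offset_addr if w == 1 else rn_value  # wback == (W == 1)
    return address, final_rn
```

In the unindexed form imm8 carries the coprocessor option rather than an offset, and the sketch shows that it affects neither the address nor the base register.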
A8.6.52 LDC, LDC2 (literal)
Load Coprocessor loads memory data from a sequence of consecutive memory addresses to a coprocessor.
If no coprocessor can execute the instruction, an Undefined Instruction exception is generated.
This is a generic coprocessor instruction. The D bit and the CRd field have no functionality defined by the
architecture and are free for use by the coprocessor instruction set designer.
For more information about the coprocessors see Coprocessor support on page A2-68.
// Encoding T1 / A1
if P == ‘0’ && U == ‘0’ && D == ‘0’ && W == ‘0’ then UNDEFINED;
if P == ‘0’ && U == ‘0’ && D == ‘1’ && W == ‘0’ then SEE MRRC, MRRC2;
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
index = (P == ‘1’); add = (U == ‘1’); cp = UInt(coproc); imm32 = ZeroExtend(imm8:‘00’, 32);
if W == ‘1’ || (P == ‘0’ && CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
// Encoding T2 / A2
if P == ‘0’ && U == ‘0’ && D == ‘0’ && W == ‘0’ then UNDEFINED;
if P == ‘0’ && U == ‘0’ && D == ‘1’ && W == ‘0’ then SEE MRRC, MRRC2;
index = (P == ‘1’); add = (U == ‘1’); cp = UInt(coproc); imm32 = ZeroExtend(imm8:‘00’, 32);
if W == ‘1’ || (P == ‘0’ && CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv4*, ARMv5T*, ARMv6*, ARMv7 for encoding A1
LDC{L}<c> <coproc>,<CRd>,<label>
LDC{L}<c> <coproc>,<CRd>,[PC,#-0]
Special case
LDC{L}<c> <coproc>,<CRd>,[PC],<option>
15141312111098765432101514131211109876543210
1110110PUDW11111 CRd coproc imm8
313029282726252423222120191817161514131211109876543210
cond 110PUDW11111 CRd coproc imm8
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv5T*, ARMv6*, ARMv7 for encodingA2
LDC2{L}<c> <coproc>,<CRd>,<label>
LDC2{L}<c> <coproc>,<CRd>,[PC,#-0]
Special case
LDC2{L}<c> <coproc>,<CRd>,[PC],<option>
15141312111098765432101514131211109876543210
1111110PUDW11111 CRd coproc imm8
313029282726252423222120191817161514131211109876543210
1111110PUDW11111 CRd coproc imm8
Advanced SIMD and VFP See Extension register load/store instructions on page A7-26
Assembler syntax
where:
2
If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
L
If specified, selects the D == 1 form of the encoding. If omitted, selects the D == 0 form.
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM LDC2 instruction must be
unconditional.
<coproc>
The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<CRd>
The coprocessor destination register.
<label>
The label of the literal data item that is to be loaded into <CRd>. The assembler calculates the
required value of the offset from the Align(PC,4) value of this instruction to the label.
Permitted values of the offset are multiples of 4 in the range -1020 to 1020.
If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
The unindexed form is permitted for the ARM instruction set only. In it, <option> is a coprocessor option,
written as an integer 0-255 enclosed in { } and encoded in imm8.
The pre-UAL syntax LDC<c>L is equivalent to LDCL<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations();
if !Coproc_Accepted(cp, ThisInstr()) then
GenerateCoprocessorException();
else
NullCheckIfThumbEE(15);
offset_addr = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
address = if index then offset_addr else Align(PC,4);
repeat
Coproc_SendLoadedWord(MemA[address,4], cp, ThisInstr()); address = address + 4;
until Coproc_DoneLoading(cp, ThisInstr());
Exceptions
Undefined Instruction, Data Abort.
LDC{2}{L}<c><q> <coproc>, <CRd>, <label>
Normal form with P = 1, W = 0
LDC{2}{L}<c><q> <coproc>, <CRd>, [PC,#+/-<imm>]
Alternative form with P = 1, W = 0
LDC{2}{L}<c><q> <coproc>, <CRd>, [PC], <option>
Unindexed form with P = 0, U = 1, W = 0
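The relationship between a label offset and the imm32 and add values described for <label> can be sketched as follows. This is a minimal illustration only: the function names are hypothetical and not part of the architecture, and the offset is assumed to be already measured from the Align(PC,4) value of the instruction.

```python
def encode_ldc_literal_offset(offset):
    """Return (imm8, add) for an LDC (literal) label offset.

    Hypothetical helper: 'offset' is the distance from Align(PC,4) to the label.
    """
    # Permitted offsets are multiples of 4 in the range -1020 to 1020.
    if offset % 4 != 0 or not -1020 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 in the range -1020 to 1020")
    add = offset >= 0                 # zero or positive offset: add == TRUE
    imm32 = offset if add else -offset
    return imm32 >> 2, add            # imm8 holds imm32 without its two low zero bits


def decode_ldc_imm32(imm8):
    """Mirror imm32 = ZeroExtend(imm8:'00', 32)."""
    return (imm8 << 2) & 0xFFFFFFFF


print(encode_ldc_literal_offset(8))      # offset of +8 bytes
print(encode_ldc_literal_offset(-1020))  # largest negative offset
```

An offset of 0 always maps to add == TRUE in this sketch; the [PC,#-0] special case would have to be requested separately, as the alternative syntax does.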
A8.6.53 LDM / LDMIA / LDMFD
Load Multiple (Increment After) loads multiple registers from consecutive memory locations using an
address from a base register. The consecutive memory locations start at this address, and the address just
above the highest of those locations can optionally be written back to the base register. The registers loaded
can include the PC, causing a branch to a loaded address. Related system instructions are LDM (user
registers) on page B6-7 and LDM (exception return) on page B6-5.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7 (not in ThumbEE)
LDM<c> <Rn>!,<registers>    <Rn> not included in <registers>
LDM<c> <Rn>,<registers>     <Rn> included in <registers>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 0 0 1 Rn register_list
n = UInt(Rn); registers = ‘00000000’:register_list; wback = (registers<n> == ‘0’);
if BitCount(registers) < 1 then UNPREDICTABLE;
Encoding T2 ARMv6T2, ARMv7
LDM<c>.W <Rn>{!},<registers>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 0 0 0 1 0 W 1 Rn   P M (0) register_list
if W == ‘1’ && Rn == ‘1101’ then SEE POP;
n = UInt(Rn); registers = P:M:‘0’:register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 2 || (P == ‘1’ && M == ‘1’) then UNPREDICTABLE;
if registers<15> == ‘1’ && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
if wback && registers<n> == ‘1’ then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDM<c> <Rn>{!},<registers>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 0 0 0 1 0 W 1 Rn register_list
if W == ‘1’ && Rn == ‘1101’ && BitCount(register_list) >= 2 then SEE POP;
n = UInt(Rn); registers = register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;
if wback && registers<n> == ‘1’ && ArchVersion() >= 7 then UNPREDICTABLE;
Assembler syntax
LDM<c><q> <Rn>{!}, <registers>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The base register. SP can be used. If it is the SP and ! is specified, the instruction is treated
as described in POP on page A8-246.
!
Causes the instruction to write a modified value back to <Rn>. Encoded as W = 1. If ! is
omitted, the instruction does not change <Rn> in this way. Encoded as W = 0.
<registers>
Is a list of one or more registers to be loaded, separated by commas and surrounded by
{ and }. The lowest-numbered register is loaded from the lowest memory address, through
to the highest-numbered register from the highest memory address.
Encoding T2 does not support a list containing only one register. If an LDMIA instruction with
just one register <Rt> in the list is assembled to Thumb and encoding T1 is not available, it
is assembled to the equivalent LDR<c><q> <Rt>,[<Rn>]{,#4} instruction.
The SP can be in the list in ARM code, but not in Thumb code. However, ARM instructions
that include the SP in the list are deprecated.
The PC can be in the list. If it is, the instruction branches to the address loaded to the PC. In
ARMv5T and above, this is an interworking branch, see Pseudocode details of operations
on ARM core registers on page A2-12. In Thumb code, if the PC is in the list:
• the LR must not be in the list
• the instruction must be either outside any IT block, or the last instruction in an IT
block.
ARM instructions that include both the LR and the PC in the list are deprecated.
Instructions with the base register in the list and ! specified are only available in the ARM
instruction set before ARMv7, and the use of such instructions is deprecated. The value of
the base register after such an instruction is UNKNOWN.
LDMIA and LDMFD are pseudo-instructions for LDM. LDMFD refers to its use for popping data
from Full Descending stacks.
The pre-UAL syntaxes LDM<c>IA and LDM<c>FD are equivalent to LDM<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations(); NullCheckIfThumbEE(n);
address = R[n];
for i = 0 to 14
if registers<i> == ‘1’ then
R[i] = MemA[address,4]; address = address + 4;
if registers<15> == ‘1’ then
LoadWritePC(MemA[address,4]);
if wback && registers<n> == ‘0’ then R[n] = R[n] + 4*BitCount(registers);
if wback && registers<n> == ‘1’ then R[n] = bits(32) UNKNOWN;
Exceptions
Data Abort.
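The Operation pseudocode for LDM can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reference model: the register file is a plain list of 16 integers, memory is a dictionary keyed by word address (both hypothetical), LoadWritePC is reduced to a plain assignment with no interworking, and the UNKNOWN base value is modelled by leaving the loaded value in place.

```python
def ldm_ia(regs, mem, n, registers, wback):
    """Simulate LDM (Increment After) on a toy register file.

    regs      -- list of 16 register values (index 15 stands in for the PC)
    mem       -- dict mapping word-aligned addresses to 32-bit words
    n         -- base register number
    registers -- 16-bit register list, bit i set means R[i] is loaded
    wback     -- True if the base register is written back
    """
    base = regs[n]
    address = base
    for i in range(15):
        if (registers >> i) & 1:
            regs[i] = mem[address]
            address += 4
    if (registers >> 15) & 1:
        regs[15] = mem[address]          # stands in for LoadWritePC()
    if wback and not (registers >> n) & 1:
        count = bin(registers).count("1")   # BitCount(registers)
        regs[n] = (base + 4 * count) & 0xFFFFFFFF
    # If wback and the base is in the list, the architecture leaves R[n]
    # UNKNOWN; this sketch simply keeps whatever was loaded.
    return regs


regs = [0] * 16
regs[0] = 0x1000
mem = {0x1000: 11, 0x1004: 22, 0x1008: 33}
ldm_ia(regs, mem, n=0, registers=0b1110, wback=True)   # LDM R0!, {R1-R3}
print(regs[:4])
```

The lowest-numbered register receives the word at the lowest address, which is why the loop walks the register list upwards while the address increases.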
A8.6.54 LDMDA / LDMFA
Load Multiple Decrement After (Load Multiple Full Ascending) loads multiple registers from consecutive
memory locations using an address from a base register. The consecutive memory locations end at this
address, and the address just below the lowest of those locations can optionally be written back to the base
register. The registers loaded can include the PC, causing a branch to a loaded address.
Related system instructions are LDM (user registers) on page B6-7 and LDM (exception return) on
page B6-5.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDMDA<c> <Rn>{!},<registers>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 0 0 0 0 0 W 1 Rn register_list
n = UInt(Rn); registers = register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;
if wback && registers<n> == ‘1’ && ArchVersion() >= 7 then UNPREDICTABLE;
Assembler syntax
LDMDA<c><q> <Rn>{!}, <registers>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The base register. SP can be used.
!
Causes the instruction to write a modified value back to <Rn>. Encoded as W = 1.
If ! is omitted, the instruction does not change <Rn> in this way. Encoded as W = 0.
<registers>
Is a list of one or more registers to be loaded, separated by commas and surrounded by
{ and }. The lowest-numbered register is loaded from the lowest memory address, through
to the highest-numbered register from the highest memory address.
The SP can be in the list. However, instructions that include the SP in the list are deprecated.
The PC can be in the list. If it is, the instruction branches to the address (data) loaded to the
PC. In ARMv5T and above, this branch is an interworking branch, see Pseudocode details
of operations on ARM core registers on page A2-12.
Instructions that include both the LR and the PC in the list are deprecated.
Instructions with the base register in the list and ! specified are only available before
ARMv7, and the use of such instructions is deprecated. The value of the base register after
such an instruction is UNKNOWN.
LDMFA is a pseudo-instruction for LDMDA, referring to its use for popping data from Full
Ascending stacks.
The pre-UAL syntaxes LDM<c>DA and LDM<c>FA are equivalent to LDMDA<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations();
address = R[n] - 4*BitCount(registers) + 4;
for i = 0 to 14
if registers<i> == ‘1’ then
R[i] = MemA[address,4]; address = address + 4;
if registers<15> == ‘1’ then
LoadWritePC(MemA[address,4]);
if wback && registers<n> == ‘0’ then R[n] = R[n] - 4*BitCount(registers);
if wback && registers<n> == ‘1’ then R[n] = bits(32) UNKNOWN;
Exceptions
Data Abort.
A8.6.55 LDMDB / LDMEA
Load Multiple Decrement Before (Load Multiple Empty Ascending) loads multiple registers from
consecutive memory locations using an address from a base register. The consecutive memory locations end
just below this address, and the address of the lowest of those locations can optionally be written back to the
base register. The registers loaded can include the PC, causing a branch to a loaded address.
Related system instructions are LDM (user registers) on page B6-7 and LDM (exception return) on
page B6-5.
Encoding T1 ARMv6T2, ARMv7
LDMDB<c> <Rn>{!},<registers>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 0 0 1 0 0 W 1 Rn   P M (0) register_list
n = UInt(Rn); registers = P:M:‘0’:register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 2 || (P == ‘1’ && M == ‘1’) then UNPREDICTABLE;
if registers<15> == ‘1’ && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
if wback && registers<n> == ‘1’ then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDMDB<c> <Rn>{!},<registers>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 0 0 1 0 0 W 1 Rn register_list
n = UInt(Rn); registers = register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;
if wback && registers<n> == ‘1’ && ArchVersion() >= 7 then UNPREDICTABLE;
Assembler syntax
LDMDB<c><q> <Rn>{!}, <registers>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The base register. The SP can be used.
!
Causes the instruction to write a modified value back to <Rn>. Encoded as W = 1.
If ! is omitted, the instruction does not change <Rn> in this way. Encoded as W = 0.
<registers>
Is a list of one or more registers to be loaded, separated by commas and surrounded by
{ and }. The lowest-numbered register is loaded from the lowest memory address, through
to the highest-numbered register from the highest memory address.
Encoding T1 does not support a list containing only one register. If an LDMDB instruction with
just one register <Rt> in the list is assembled to Thumb, it is assembled to the equivalent
LDR<c><q> <Rt>,[<Rn>,#-4]{!} instruction.
The SP can be in the list in ARM code, but not in Thumb code. However, ARM instructions
that include the SP in the list are deprecated.
The PC can be in the list. If it is, the instruction branches to the address loaded to the PC. In
ARMv5T and above, this is an interworking branch, see Pseudocode details of operations
on ARM core registers on page A2-12. In Thumb code, if the PC is in the list:
• the LR must not be in the list
• the instruction must be either outside any IT block, or the last instruction in an IT
block.
ARM instructions that include both the LR and the PC in the list are deprecated.
Instructions with the base register in the list and ! specified are only available in the ARM
instruction set before ARMv7, and the use of such instructions is deprecated. The value of
the base register after such an instruction is UNKNOWN.
LDMEA is a pseudo-instruction for LDMDB, referring to its use for popping data from Empty
Ascending stacks.
The pre-UAL syntaxes LDM<c>DB and LDM<c>EA are equivalent to LDMDB<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations(); NullCheckIfThumbEE(n);
address = R[n] - 4*BitCount(registers);
for i = 0 to 14
if registers<i> == ‘1’ then
R[i] = MemA[address,4]; address = address + 4;
if registers<15> == ‘1’ then
LoadWritePC(MemA[address,4]);
if wback && registers<n> == ‘0’ then R[n] = R[n] - 4*BitCount(registers);
if wback && registers<n> == ‘1’ then R[n] = bits(32) UNKNOWN;
Exceptions
Data Abort.
A8.6.56 LDMIB / LDMED
Load Multiple Increment Before loads multiple registers from consecutive memory locations using an
address from a base register. The consecutive memory locations start just above this address, and the address
of the last of those locations can optionally be written back to the base register. The registers loaded can
include the PC, causing a branch to a loaded address.
Related system instructions are LDM (user registers) on page B6-7 and LDM (exception return) on
page B6-5.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDMIB<c> <Rn>{!},<registers>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 1 0 0 1 1 0 W 1 Rn register_list
n = UInt(Rn); registers = register_list; wback = (W == ‘1’);
if n == 15 || BitCount(registers) < 1 then UNPREDICTABLE;
if wback && registers<n> == ‘1’ && ArchVersion() >= 7 then UNPREDICTABLE;
Assembler syntax
LDMIB<c><q> <Rn>{!}, <registers>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rn>
The base register. The SP can be used.
!
Causes the instruction to write a modified value back to <Rn>. Encoded as W = 1.
If ! is omitted, the instruction does not change <Rn> in this way. Encoded as W = 0.
<registers>
Is a list of one or more registers to be loaded, separated by commas and surrounded by
{ and }. The lowest-numbered register is loaded from the lowest memory address, through
to the highest-numbered register from the highest memory address.
The SP can be in the list. However, instructions that include the SP in the list are deprecated.
The PC can be in the list. If it is, the instruction branches to the address (data) loaded to the
PC. In ARMv5T and above, this branch is an interworking branch, see Pseudocode details
of operations on ARM core registers on page A2-12.
Instructions that include both the LR and the PC in the list are deprecated.
Instructions with the base register in the list and ! specified are only available before
ARMv7, and the use of such instructions is deprecated. The value of the base register after
such an instruction is UNKNOWN.
LDMED is a pseudo-instruction for LDMIB, referring to its use for popping data from Empty
Descending stacks.
The pre-UAL syntaxes LDM<c>IB and LDM<c>ED are equivalent to LDMIB<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations();
address = R[n] + 4;
for i = 0 to 14
if registers<i> == ‘1’ then
R[i] = MemA[address,4]; address = address + 4;
if registers<15> == ‘1’ then
LoadWritePC(MemA[address,4]);
if wback && registers<n> == ‘0’ then R[n] = R[n] + 4*BitCount(registers);
if wback && registers<n> == ‘1’ then R[n] = bits(32) UNKNOWN;
Exceptions
Data Abort.
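The four Load Multiple variants (LDM/LDMIA, LDMIB, LDMDA, LDMDB) differ only in the first address accessed and in the value written back to the base register. The following sketch collects those two calculations from the Operation pseudocode of each variant. The function name and the two-letter mode strings are illustrative choices, not architectural names.

```python
def ldm_addresses(mode, base, count):
    """Return (first_address, written_back_base) for a Load Multiple variant.

    mode  -- "IA", "IB", "DA" or "DB" (hypothetical labels for the variants)
    base  -- value of R[n] before the instruction
    count -- BitCount(registers)
    """
    size = 4 * count
    if mode == "IA":            # LDM / LDMIA / LDMFD
        first, new_base = base, base + size
    elif mode == "IB":          # LDMIB / LDMED
        first, new_base = base + 4, base + size
    elif mode == "DA":          # LDMDA / LDMFA
        first, new_base = base - size + 4, base - size
    elif mode == "DB":          # LDMDB / LDMEA
        first, new_base = base - size, base - size
    else:
        raise ValueError("unknown mode: " + mode)
    # Addresses wrap at 32 bits.
    return first & 0xFFFFFFFF, new_base & 0xFFFFFFFF


for mode in ("IA", "IB", "DA", "DB"):
    first, new_base = ldm_addresses(mode, 0x1000, 3)
    print(mode, hex(first), hex(new_base))
```

In every variant the registers are still loaded in ascending register-number order from ascending addresses; only the starting point moves.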
A8.6.57 LDR (immediate, Thumb)
Load Register (immediate) calculates an address from a base register value and an immediate offset, loads
a word from memory, and writes it to a register. It can use offset, post-indexed, or pre-indexed addressing.
For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>, [<Rn>{,#<imm>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 1 0 1 imm5 Rn Rt
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5:‘00’, 32);
index = TRUE; add = TRUE; wback = FALSE;
Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,[SP{,#<imm>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 0 0 1 1 Rt imm8
t = UInt(Rt); n = 13; imm32 = ZeroExtend(imm8:‘00’, 32);
index = TRUE; add = TRUE; wback = FALSE;
Encoding T3 ARMv6T2, ARMv7
LDR<c>.W <Rt>,[<Rn>{,#<imm12>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 1 1 0 1 Rn   Rt imm12
if Rn == ‘1111’ then SEE LDR (literal);
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding T4 ARMv6T2, ARMv7
LDR<c> <Rt>,[<Rn>,#-<imm8>]
LDR<c> <Rt>,[<Rn>],#+/-<imm8>
LDR<c> <Rt>,[<Rn>,#+/-<imm8>]!
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 0 1 0 1 Rn   Rt 1 P U W imm8
if Rn == ‘1111’ then SEE LDR (literal);
if P == ‘1’ && U == ‘1’ && W == ‘0’ then SEE LDRT;
if Rn == ‘1101’ && P == ‘0’ && U == ‘1’ && W == ‘1’ && imm8 == ‘00000100’ then SEE POP;
if P == ‘0’ && W == ‘0’ then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
if (wback && n == t) || (t == 15 && InITBlock() && !LastInITBlock()) then UNPREDICTABLE;
Assembler syntax
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register. The SP can be used. The PC can be used, provided the instruction
is either outside an IT block or the last instruction of an IT block. If the PC is used, the
instruction branches to the address (data) loaded to the PC. In ARMv5T and above, this
branch is an interworking branch, see Pseudocode details of operations on ARM core
registers on page A2-12.
<Rn>
The base register. The SP can be used. For PC use see LDR (literal) on page A8-122.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or – if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Values are:
Encoding T1    multiples of 4 in the range 0-124
Encoding T2    multiples of 4 in the range 0-1020
Encoding T3    any value in the range 0-4095
Encoding T4    any value in the range 0-255.
Operation
if ConditionPassed() then
EncodingSpecificOperations(); NullCheckIfThumbEE(n);
offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
address = if index then offset_addr else R[n];
data = MemU[address,4];
if wback then R[n] = offset_addr;
if t == 15 then
if address<1:0> == ‘00’ then LoadWritePC(data); else UNPREDICTABLE;
elsif UnalignedSupport() || address<1:0> == ‘00’ then
R[t] = data;
else R[t] = bits(32) UNKNOWN; // Can only apply before ARMv7
Exceptions
Data Abort.
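The way index, add, and wback select between offset, pre-indexed, and post-indexed addressing in the Operation pseudocode can be sketched as follows. This is an illustrative model only: it assumes word-aligned addresses and t != 15, so the LoadWritePC and UNKNOWN cases are left out, and the register list and memory dictionary are hypothetical stand-ins.

```python
def ldr_immediate(regs, mem, t, n, imm32, index, add, wback):
    """Model the address calculation and write-back of LDR (immediate).

    regs -- list of register values; mem -- dict of word address -> word.
    Returns the address that was accessed.
    """
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base   # post-indexed uses the old base
    data = mem[address]
    if wback:
        regs[n] = offset_addr
    regs[t] = data
    return address


mem = {0x2000: 0xAA, 0x2004: 0xBB}

regs = [0, 0x2000]
ldr_immediate(regs, mem, t=0, n=1, imm32=4, index=True, add=True, wback=False)
print("offset      ", hex(regs[0]), hex(regs[1]))   # LDR R0, [R1, #4]

regs = [0, 0x2000]
ldr_immediate(regs, mem, t=0, n=1, imm32=4, index=True, add=True, wback=True)
print("pre-indexed ", hex(regs[0]), hex(regs[1]))   # LDR R0, [R1, #4]!

regs = [0, 0x2000]
ldr_immediate(regs, mem, t=0, n=1, imm32=4, index=False, add=True, wback=True)
print("post-indexed", hex(regs[0]), hex(regs[1]))   # LDR R0, [R1], #4
```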
ThumbEE instruction
ThumbEE has additional LDR (immediate) encodings. For details see LDR (immediate) on page A9-19.
LDR<c><q> <Rt>, [<Rn> {, #+/-<imm>}]
Offset: index==TRUE, wback==FALSE
LDR<c><q> <Rt>, [<Rn>, #+/-<imm>]!
Pre-indexed: index==TRUE, wback==TRUE
LDR<c><q> <Rt>, [<Rn>], #+/-<imm>
Post-indexed: index==FALSE, wback==TRUE
A8.6.58 LDR (immediate, ARM)
Load Register (immediate) calculates an address from a base register value and an immediate offset, loads
a word from memory, and writes it to a register. It can use offset, post-indexed, or pre-indexed addressing.
For information about memory accesses see Memory accesses on page A8-13.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,[<Rn>{,#+/-<imm12>}]
LDR<c> <Rt>,[<Rn>],#+/-<imm12>
LDR<c> <Rt>,[<Rn>,#+/-<imm12>]!
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 0 P U 0 W 1 Rn Rt imm12
if Rn == ‘1111’ then SEE LDR (literal);
if P == ‘0’ && W == ‘1’ then SEE LDRT;
if Rn == ‘1101’ && P == ‘0’ && U == ‘1’ && W == ‘0’ && imm12 == ‘000000000100’ then SEE POP;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
if wback && n == t then UNPREDICTABLE;
Assembler syntax
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register. The SP or the PC can be used. If the PC is used, the instruction
branches to the address (data) loaded to the PC. In ARMv5T and above, this branch is an
interworking branch, see Pseudocode details of operations on ARM core registers on
page A2-12.
<Rn>
The base register. The SP can be used. For PC use see LDR (literal) on page A8-122.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or – if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Any value in the range 0-4095 is permitted.
Operation
if ConditionPassed() then
EncodingSpecificOperations();
offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
address = if index then offset_addr else R[n];
data = MemU[address,4];
if wback then R[n] = offset_addr;
if t == 15 then
if address<1:0> == ‘00’ then LoadWritePC(data); else UNPREDICTABLE;
elsif UnalignedSupport() || address<1:0> == ‘00’ then
R[t] = data;
else // Can only apply before ARMv7
R[t] = ROR(data, 8*UInt(address<1:0>));
Exceptions
Data Abort.
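The final branch of the Operation pseudocode, which can only apply before ARMv7, rotates the loaded word right by eight bits for each byte of misalignment. A minimal sketch of that rotation follows. The helper names are hypothetical, and `data` is assumed to be the word returned by the memory access for the given address.

```python
def ror32(value, shift):
    """Rotate a 32-bit value right by 'shift' bits."""
    shift %= 32
    return ((value >> shift) | (value << (32 - shift))) & 0xFFFFFFFF


def legacy_unaligned_ldr_result(data, address):
    """Mirror R[t] = ROR(data, 8*UInt(address<1:0>))."""
    return ror32(data, 8 * (address & 0b11))


word = 0x11223344
for low_bits in range(4):
    print(low_bits, hex(legacy_unaligned_ldr_result(word, 0x1000 + low_bits)))
```

With address<1:0> equal to ‘00’ the rotation is by zero bits, so the sketch returns the word unchanged, matching the aligned case.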
LDR<c><q> <Rt>, [<Rn> {, #+/-<imm>}]
Offset: index==TRUE, wback==FALSE
LDR<c><q> <Rt>, [<Rn>, #+/-<imm>]!
Pre-indexed: index==TRUE, wback==TRUE
LDR<c><q> <Rt>, [<Rn>], #+/-<imm>
Post-indexed: index==FALSE, wback==TRUE
A8.6.59 LDR (literal)
Load Register (literal) calculates an address from the PC value and an immediate offset, loads a word from
memory, and writes it to a register. For information about memory accesses see Memory accesses on
page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,<label>
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 0 1 Rt imm8
t = UInt(Rt); imm32 = ZeroExtend(imm8:‘00’, 32); add = TRUE;
Encoding T2 ARMv6T2, ARMv7
LDR<c>.W <Rt>,<label>
LDR<c>.W <Rt>,[PC,#-0]    Special case
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 U 1 0 1 1 1 1 1   Rt imm12
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,<label>
LDR<c> <Rt>,[PC,#-0]    Special case
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 0 (1) U 0 (0) 1 1 1 1 1 Rt imm12
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
Assembler syntax
LDR<c><q> <Rt>, <label>
Normal form
LDR<c><q> <Rt>, [PC, #+/-<imm>]
Alternative form
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register. The SP can be used. The PC can be used, provided the instruction
is either outside an IT block or the last instruction of an IT block. If the PC is used, the
instruction branches to the address (data) loaded to the PC. In ARMv5T and above, this
branch is an interworking branch, see Pseudocode details of operations on ARM core
registers on page A2-12.
<label>
The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
required value of the offset from the Align(PC,4) value of this instruction to the label.
Permitted values of the offset are:
Encoding T1          multiples of four in the range 0 to 1020
Encoding T2 or A1    any value in the range -4095 to 4095.
If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
If the offset is negative, imm32 is equal to minus the offset and add == FALSE. Negative offset
is not available in encoding T1.
Note
In code examples in this manual, the syntax =<value> is used for the label of a memory word
whose contents are constant and equal to <value>. The actual syntax for such a label is
assembler-dependent.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
Operation
if ConditionPassed() then
EncodingSpecificOperations(); NullCheckIfThumbEE(15);
base = Align(PC,4);
address = if add then (base + imm32) else (base - imm32);
data = MemU[address,4];
if t == 15 then
if address<1:0> == ‘00’ then LoadWritePC(data); else UNPREDICTABLE;
elsif UnalignedSupport() || address<1:0> == ‘00’ then
R[t] = data;
else // Can only apply before ARMv7
if CurrentInstrSet() == InstrSet_ARM then
R[t] = ROR(data, 8*UInt(address<1:0>));
else
R[t] = bits(32) UNKNOWN;
Exceptions
Data Abort.
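The address calculation in the Operation pseudocode, base = Align(PC,4) followed by adding or subtracting imm32, can be sketched as below. The function names are illustrative. The `pc` argument is assumed to be the PC value as the instruction reads it, which is the address of the instruction plus 4 in Thumb state and plus 8 in ARM state.

```python
def align(value, boundary):
    """Round value down to a multiple of boundary, as Align() does."""
    return value - (value % boundary)


def literal_address(pc, imm32, add):
    """Return the address accessed by LDR (literal)."""
    base = align(pc, 4)
    return (base + imm32 if add else base - imm32) & 0xFFFFFFFF


# Thumb instruction at 0x8002: the PC reads as 0x8006, which aligns to 0x8004.
print(hex(literal_address(0x8006, 8, True)))
# ARM instruction at 0x8000: the PC reads as 0x8008, with a negative offset.
print(hex(literal_address(0x8008, 4, False)))
```

The alignment step matters only in Thumb state, where an instruction can sit at an address that is not a multiple of 4.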
A8.6.60 LDR (register)
Load Register (register) calculates an address from a base register value and an offset register value, loads
a word from memory, and writes it to a register. The offset register value can optionally be shifted. For
information about memory accesses, see Memory accesses on page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,[<Rn>,<Rm>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 0 1 1 0 0 Rm Rn Rt
if CurrentInstrSet() == InstrSet_ThumbEE then SEE “Modified operation in ThumbEE”;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
LDR<c>.W <Rt>,[<Rn>,<Rm>{,LSL #<imm2>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 0 1 0 1 Rn   Rt 0 0 0 0 0 0 imm2 Rm
if Rn == ‘1111’ then SEE LDR (literal);
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, UInt(imm2));
if BadReg(m) then UNPREDICTABLE;
if t == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDR<c> <Rt>,[<Rn>,+/-<Rm>{, <shift>}]{!}
LDR<c> <Rt>,[<Rn>],+/-<Rm>{, <shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 1 P U 0 W 1 Rn Rt imm5 type 0 Rm
if P == ‘0’ && W == ‘1’ then SEE LDRT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;
if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;
Modified operation in ThumbEE See LDR (register) on page A9-9
Assembler syntax
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register. The SP can be used. The PC can be used, provided the instruction
is either outside an IT block or the last instruction of an IT block. If the PC is used, the
instruction branches to the address (data) loaded to the PC. In ARMv5T and above, this
branch is an interworking branch, see Pseudocode details of operations on ARM core
registers on page A2-12.
<Rn>
The base register. The SP can be used. The PC can be used only in the ARM instruction set.
+/-
Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value
(add == TRUE), or – if it is to be subtracted (permitted in ARM code only, add == FALSE).
<Rm>
The offset that is optionally shifted and applied to the value of <Rn> to form the address.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. For encoding T2, <shift> can
only be omitted, encoded as imm2 = 0b00, or LSL #<imm> with <imm> = 1, 2, or 3, and <imm>
encoded in imm2. For encoding A1, see Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
EncodingSpecificOperations(); NullCheckIfThumbEE(n);
offset = Shift(R[m], shift_t, shift_n, APSR.C);
offset_addr = if add then (R[n] + offset) else (R[n] - offset);
address = if index then offset_addr else R[n];
data = MemU[address,4];
if wback then R[n] = offset_addr;
if t == 15 then
if address<1:0> == ‘00’ then LoadWritePC(data); else UNPREDICTABLE;
elsif UnalignedSupport() || address<1:0> == ‘00’ then
R[t] = data;
else // Can only apply before ARMv7
if CurrentInstrSet() == InstrSet_ARM then
R[t] = ROR(data, 8*UInt(address<1:0>));
else
R[t] = bits(32) UNKNOWN;
Exceptions
Data Abort.
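For the common case of indexing into a table of words, the offset calculation in the Operation pseudocode reduces to a left shift of the index register. The following sketch covers only that case: it models LSL, which is the only shift encodings T1 and T2 permit, and does not model the other shift types available in encoding A1. The function and parameter names are illustrative.

```python
def ldr_register_address(rn, rm, shift_n=0, add=True):
    """Address formed from a base register and an LSL-shifted offset register.

    rn, rm  -- values of the base and offset registers
    shift_n -- LSL amount (0 to 3 for encoding T2, always 0 for encoding T1)
    add     -- False subtracts the offset (permitted in ARM code only)
    """
    offset = (rm << shift_n) & 0xFFFFFFFF
    return (rn + offset if add else rn - offset) & 0xFFFFFFFF


# LDR R0, [R1, R2, LSL #2] with R1 = table base and R2 = element index 5.
print(hex(ldr_register_address(0x3000, 5, shift_n=2)))
```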
LDR<c><q> <Rt>, [<Rn>, +/-<Rm>{, <shift>}]
Offset: index==TRUE, wback==FALSE
LDR<c><q> <Rt>, [<Rn>, +/-<Rm>{, <shift>}]!
Pre-indexed: index==TRUE, wback==TRUE
LDR<c><q> <Rt>, [<Rn>], +/-<Rm>{, <shift>}
Post-indexed: index==FALSE, wback==TRUE
A8.6.61 LDRB (immediate, Thumb)
Load Register Byte (immediate) calculates an address from a base register value and an immediate offset,
loads a byte from memory, zero-extends it to form a 32-bit word, and writes it to a register. It can use offset,
post-indexed, or pre-indexed addressing. For information about memory accesses see Memory accesses on
page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDRB<c> <Rt>,[<Rn>{,#<imm5>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 1 1 1 imm5 Rn Rt
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5, 32);
index = TRUE; add = TRUE; wback = FALSE;
Encoding T2 ARMv6T2, ARMv7
LDRB<c>.W <Rt>,[<Rn>{,#<imm12>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 1 0 0 1 Rn   Rt imm12
if Rt == ‘1111’ then SEE PLD;
if Rn == ‘1111’ then SEE LDRB (literal);
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
if t == 13 then UNPREDICTABLE;
Encoding T3 ARMv6T2, ARMv7
LDRB<c> <Rt>,[<Rn>,#-<imm8>]
LDRB<c> <Rt>,[<Rn>],#+/-<imm8>
LDRB<c> <Rt>,[<Rn>,#+/-<imm8>]!
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0   15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 0 0 0 0 0 0 1 Rn   Rt 1 P U W imm8
if Rt == ‘1111’ && P == ‘1’ && U == ‘0’ && W == ‘0’ then SEE PLD;
if Rn == ‘1111’ then SEE LDRB (literal);
if P == ‘1’ && U == ‘1’ && W == ‘0’ then SEE LDRBT;
if P == ‘0’ && W == ‘0’ then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
if BadReg(t) || (wback && n == t) then UNPREDICTABLE;
Assembler syntax
LDRB<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
LDRB<c><q> <Rt>, [<Rn>, #+/-<imm>]!    Pre-indexed: index==TRUE, wback==TRUE
LDRB<c><q> <Rt>, [<Rn>], #+/-<imm>    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used. For PC use see LDRB (literal) on page A8-130.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or - if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Values are:
Encoding T1 any value in the range 0-31
Encoding T2 any value in the range 0-4095
Encoding T3 any value in the range 0-255.
The pre-UAL syntax LDR<c>B is equivalent to LDRB<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    R[t] = ZeroExtend(MemU[address,1], 32);
    if wback then R[n] = offset_addr;
Exceptions
Data Abort.
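The three addressing forms differ only in the values of index and wback. As a minimal sketch, assuming a register file modelled as a Python list and a byte-addressed memory modelled as a dict (both hypothetical stand-ins for R[] and MemU[]), the Operation pseudocode can be followed like this:

```python
def ldrb_immediate(regs, memory, t, n, imm32, index, add, wback):
    """Follow the LDRB (immediate) Operation pseudocode on a toy machine state."""
    base = regs[n]
    offset_addr = (base + imm32 if add else base - imm32) & 0xFFFFFFFF
    address = offset_addr if index else base
    regs[t] = memory[address] & 0xFF      # ZeroExtend(MemU[address,1], 32)
    if wback:
        regs[n] = offset_addr
```

Offset addressing leaves the base register unchanged, pre-indexed addressing loads from the updated address and writes it back, and post-indexed addressing loads from the original base and then writes back the updated address.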
A8.6.62 LDRB (immediate, ARM)
Load Register Byte (immediate) calculates an address from a base register value and an immediate offset,
loads a byte from memory, zero-extends it to form a 32-bit word, and writes it to a register. It can use offset,
post-indexed, or pre-indexed addressing. For information about memory accesses see Memory accesses on
page A8-13.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRB<c> <Rt>,[<Rn>{,#+/-<imm12>}]
LDRB<c> <Rt>,[<Rn>],#+/-<imm12>
LDRB<c> <Rt>,[<Rn>,#+/-<imm12>]!
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 0 P U 1 W 1 Rn Rt imm12
if Rn == ‘1111’ then SEE LDRB (literal);
if P == ‘0’ && W == ‘1’ then SEE LDRBT;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
if t == 15 || (wback && n == t) then UNPREDICTABLE;
Assembler syntax
LDRB<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
LDRB<c><q> <Rt>, [<Rn>, #+/-<imm>]!    Pre-indexed: index==TRUE, wback==TRUE
LDRB<c><q> <Rt>, [<Rn>], #+/-<imm>    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used. For PC use see LDRB (literal) on page A8-130.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or - if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Any value in the range 0-4095 is permitted.
The pre-UAL syntax LDR<c>B is equivalent to LDRB<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    R[t] = ZeroExtend(MemU[address,1], 32);
    if wback then R[n] = offset_addr;
Exceptions
Data Abort.
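The decode pseudocode for encoding A1 maps the P, U and W bits onto index, add and wback. The following Python sketch follows that decode for a 32-bit instruction word. The function name decode_ldrb_imm_a1 is hypothetical, the field positions are read from the A1 encoding diagram, and the caller is assumed to have already matched the fixed bits and checked the condition field.

```python
def decode_ldrb_imm_a1(instr: int):
    """Decode an LDRB (immediate, ARM) A1 word into the pseudocode variables."""
    P = (instr >> 24) & 1
    U = (instr >> 23) & 1
    W = (instr >> 21) & 1
    Rn = (instr >> 16) & 0xF
    Rt = (instr >> 12) & 0xF
    imm12 = instr & 0xFFF
    if Rn == 0b1111:
        return "SEE LDRB (literal)"
    if P == 0 and W == 1:
        return "SEE LDRBT"
    index = (P == 1)
    add = (U == 1)
    wback = (P == 0) or (W == 1)
    if Rt == 15 or (wback and Rn == Rt):
        return "UNPREDICTABLE"
    return {"t": Rt, "n": Rn, "imm32": imm12,
            "index": index, "add": add, "wback": wback}
```

For example, the word 0xE5D10004 decodes as the offset form with t = 0, n = 1 and imm32 = 4, while 0xE4D10004 decodes as the post-indexed form of the same load.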
A8.6.63 LDRB (literal)
Load Register Byte (literal) calculates an address from the PC value and an immediate offset, loads a byte
from memory, zero-extends it to form a 32-bit word, and writes it to a register. For information about
memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv6T2, ARMv7
LDRB<c> <Rt>,<label>
LDRB<c> <Rt>,[PC,#-0]    Special case
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
11111000U0011111 Rt imm12
if Rt == ‘1111’ then SEE PLD;
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
if t == 13 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRB<c> <Rt>,<label>
LDRB<c> <Rt>,[PC,#-0]    Special case
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 010(1)U1(0)11111 Rt imm12
t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
if t == 15 then UNPREDICTABLE;
Assembler syntax
LDRB<c><q> <Rt>, <label>    Normal form
LDRB<c><q> <Rt>, [PC, #+/-<imm>]    Alternative form
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<label>
The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
required value of the offset from the Align(PC,4) value of this instruction to the label.
Permitted values of the offset are -4095 to 4095.
If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
The pre-UAL syntax LDR<c>B is equivalent to LDRB<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(15);
    base = Align(PC,4);
    address = if add then (base + imm32) else (base - imm32);
    R[t] = ZeroExtend(MemU[address,1], 32);
Exceptions
Data Abort.
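The relationship between a label offset, imm32 and add can be sketched in Python as follows. The function names are hypothetical, and pc is assumed to be the PC value as read by the instruction, not the address of the instruction itself. An integer offset cannot express the #-0 special case, so that case is outside this sketch.

```python
def split_literal_offset(offset: int):
    """Turn a signed label offset into (imm32, add) as described for <label>."""
    if not -4095 <= offset <= 4095:
        raise ValueError("offset outside the permitted range -4095 to 4095")
    if offset >= 0:
        return offset, True
    return -offset, False


def literal_address(pc: int, imm32: int, add: bool) -> int:
    """Compute the load address from Align(PC,4), imm32 and add."""
    base = pc & ~0b11                      # Align(PC,4)
    address = base + imm32 if add else base - imm32
    return address & 0xFFFFFFFF
```

Because the base is Align(PC,4) and not the raw PC value, a Thumb instruction whose PC value is only halfword-aligned still addresses literals relative to a word boundary.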
A8.6.64 LDRB (register)
Load Register Byte (register) calculates an address from a base register value and an offset register value,
loads a byte from memory, zero-extends it to form a 32-bit word, and writes it to a register. The offset
register value can optionally be shifted. For information about memory accesses see Memory accesses on
page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDRB<c> <Rt>,[<Rn>,<Rm>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0101110 Rm Rn Rt
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
LDRB<c>.W <Rt>,[<Rn>,<Rm>{,LSL #<imm2>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110000001 Rn Rt 000000 imm2 Rm
if Rt == ‘1111’ then SEE PLD;
if Rn == ‘1111’ then SEE LDRB (literal);
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = TRUE; add = TRUE; wback = FALSE;
(shift_t, shift_n) = (SRType_LSL, UInt(imm2));
if t == 13 || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRB<c> <Rt>,[<Rn>,+/-<Rm>{, <shift>}]{!}
LDRB<c> <Rt>,[<Rn>],+/-<Rm>{, <shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0 1 1 P U 1 W 1 Rn Rt imm5 type 0 Rm
if P == ‘0’ && W == ‘1’ then SEE LDRBT;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || m == 15 then UNPREDICTABLE;
if wback && (n == 15 || n == t) then UNPREDICTABLE;
if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;
Assembler syntax
LDRB<c><q> <Rt>, [<Rn>, +/-<Rm>{, <shift>}]    Offset: index==TRUE, wback==FALSE
LDRB<c><q> <Rt>, [<Rn>, +/-<Rm>{, <shift>}]!    Pre-indexed: index==TRUE, wback==TRUE
LDRB<c><q> <Rt>, [<Rn>], +/-<Rm>{, <shift>}    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used. The PC can be used only in the ARM instruction set.
+/-
Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value
(add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
<Rm>
Contains the offset that is optionally shifted and applied to the value of <Rn> to form the
address.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. For encoding T2, <shift> can
only be omitted, encoded as imm2 = 0b00, or LSL #<imm> with <imm> = 1, 2, or 3, and <imm>
encoded in imm2. For encoding A1, see Shifts applied to a register on page A8-10.
The pre-UAL syntax LDR<c>B is equivalent to LDRB<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset = Shift(R[m], shift_t, shift_n, APSR.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    R[t] = ZeroExtend(MemU[address,1],32);
    if wback then R[n] = offset_addr;
Exceptions
Data Abort.
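For encoding T2 the only available shift is LSL by 0 to 3, and the offset is always added with no writeback. A minimal Python sketch of the resulting address calculation, under those assumptions, is shown here. The function name is hypothetical, and rn and rm stand for the values held in <Rn> and <Rm>.

```python
def ldrb_register_t2_address(rn: int, rm: int, imm2: int = 0) -> int:
    """Address used by LDRB (register) encoding T2: R[n] + (R[m] LSL imm2)."""
    if not 0 <= imm2 <= 3:
        raise ValueError("encoding T2 only permits LSL #0 to LSL #3")
    offset = (rm << imm2) & 0xFFFFFFFF     # Shift(R[m], SRType_LSL, imm2, APSR.C)
    return (rn + offset) & 0xFFFFFFFF
```

Other shift types and subtraction of the offset are only available through encoding A1.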
A8.6.65 LDRBT
Load Register Byte Unprivileged loads a byte from memory, zero-extends it to form a 32-bit word, and
writes it to a register. For information about memory accesses see Memory accesses on page A8-13.
The memory access is restricted as if the processor were running in User mode. (This makes no difference
if the processor is actually running in User mode.)
The Thumb instruction uses an offset addressing mode, that calculates the address used for the memory
access from a base register value and an immediate offset, and leaves the base register unchanged.
The ARM instruction uses a post-indexed addressing mode, that uses a base register value as the address for
the memory access, and calculates a new address from a base register value and an offset and writes it back
to the base register. The offset can be an immediate value or an optionally-shifted register value.
Encoding T1 ARMv6T2, ARMv7
LDRBT<c> <Rt>,[<Rn>,#<imm8>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110000001 Rn Rt 1110 imm8
if Rn == ‘1111’ then SEE LDRB (literal);
t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
if BadReg(t) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRBT<c> <Rt>,[<Rn>],#+/-<imm12>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0100U111 Rn Rt imm12
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == ‘1’);
register_form = FALSE; imm32 = ZeroExtend(imm12, 32);
if t == 15 || n == 15 || n == t then UNPREDICTABLE;
Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRBT<c> <Rt>,[<Rn>],+/-<Rm>{, <shift>}
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 0110U111 Rn Rt imm5 type 0 Rm
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == ‘1’);
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;
if ArchVersion() < 6 && m == n then UNPREDICTABLE;
Assembler syntax
LDRBT<c><q> <Rt>, [<Rn> {, #<imm>}]    Offset: Thumb only
LDRBT<c><q> <Rt>, [<Rn>] {, #+/-<imm>}    Post-indexed: ARM only
LDRBT<c><q> <Rt>, [<Rn>], +/-<Rm> {, <shift>}    Post-indexed: ARM only
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used.
+/-
Is + or omitted if <imm> or the optionally shifted value of <Rm> is to be added to the base
register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only,
add == FALSE).
<imm>
The immediate offset applied to the value of <Rn>. Values are 0-255 for encoding T1, and
0-4095 for encoding A1. <imm> can be omitted, meaning an offset of 0.
<Rm>
Contains the offset that is optionally shifted and applied to the value of <Rn> to form the
address.
<shift>
The shift to apply to the value read from <Rm>. If omitted, no shift is applied. Shifts applied
to a register on page A8-10 describes the shifts and how they are encoded.
The pre-UAL syntax LDR<c>BT is equivalent to LDRBT<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset = if register_form then Shift(R[m], shift_t, shift_n, APSR.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    R[t] = ZeroExtend(MemU_unpriv[address,1],32);
    if postindex then R[n] = offset_addr;
Exceptions
Data Abort.
A8.6.66 LDRD (immediate)
Load Register Dual (immediate) calculates an address from a base register value and an immediate offset,
loads two words from memory, and writes them to two registers. It can use offset, post-indexed, or
pre-indexed addressing. For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv6T2, ARMv7
LDRD<c> <Rt>,<Rt2>,[<Rn>{,#+/-<imm>}]
LDRD<c> <Rt>,<Rt2>,[<Rn>],#+/-<imm>
LDRD<c> <Rt>,<Rt2>,[<Rn>,#+/-<imm>]!
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1110100PU1W1 Rn Rt Rt2 imm8
if P == ‘0’ && W == ‘0’ then SEE “Related encodings”;
if Rn == ‘1111’ then SEE LDRD (literal);
t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn); imm32 = ZeroExtend(imm8:‘00’, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
if wback && (n == t || n == t2) then UNPREDICTABLE;
if BadReg(t) || BadReg(t2) || t == t2 then UNPREDICTABLE;
Encoding A1 ARMv5TE*, ARMv6*, ARMv7
LDRD<c> <Rt>,<Rt2>,[<Rn>{,#+/-<imm8>}]
LDRD<c> <Rt>,<Rt2>,[<Rn>],#+/-<imm8>
LDRD<c> <Rt>,<Rt2>,[<Rn>,#+/-<imm8>]!
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 000PU1W0 Rn Rt imm4H 1101 imm4L
if Rn == ‘1111’ then SEE LDRD (literal);
if Rt<0> == ‘1’ then UNDEFINED;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
if P == ‘0’ && W == ‘1’ then UNPREDICTABLE;
if wback && (n == t || n == t2) then UNPREDICTABLE;
if t2 == 15 then UNPREDICTABLE;
Related encodings See Load/store dual, load/store exclusive, table branch on page A6-24
Assembler syntax
LDRD<c><q> <Rt>, <Rt2>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
LDRD<c><q> <Rt>, <Rt2>, [<Rn>, #+/-<imm>]!    Pre-indexed: index==TRUE, wback==TRUE
LDRD<c><q> <Rt>, <Rt2>, [<Rn>], #+/-<imm>    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The first destination register. For an ARM instruction, <Rt> must be even-numbered and not R14.
<Rt2>
The second destination register. For an ARM instruction, <Rt2> must be <R(t+1)>.
<Rn>
The base register. The SP can be used. For PC use see LDRD (literal) on page A8-138.
+/-
Is + or omitted if the immediate offset is to be added to the base register value (add == TRUE),
or - if it is to be subtracted (add == FALSE). #0 and #-0 generate different instructions.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Values are:
Encoding T1 multiples of 4 in the range 0-1020
Encoding A1 any value in the range 0-255.
The pre-UAL syntax LDR<c>D is equivalent to LDRD<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    R[t] = MemA[address,4];
    R[t2] = MemA[address+4,4];
    if wback then R[n] = offset_addr;
Exceptions
Data Abort.
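The two encodings hold the offset differently: T1 stores it scaled by 4 in imm8, and A1 splits an unscaled 8-bit value across imm4H and imm4L. The Python sketch below illustrates both directions. All function names are hypothetical, and a signed integer offset cannot express #-0, so that case is not covered.

```python
def ldrd_imm32_t1(imm8: int) -> int:
    """imm32 = ZeroExtend(imm8:'00', 32) for encoding T1."""
    return (imm8 & 0xFF) << 2


def ldrd_imm32_a1(imm4H: int, imm4L: int) -> int:
    """imm32 = ZeroExtend(imm4H:imm4L, 32) for encoding A1."""
    return ((imm4H & 0xF) << 4) | (imm4L & 0xF)


def encode_ldrd_offset(offset: int, encoding: str):
    """Return (add, fields) for a signed offset, or raise ValueError."""
    add = offset >= 0
    magnitude = abs(offset)
    if encoding == "T1":
        if magnitude % 4 != 0 or magnitude > 1020:
            raise ValueError("T1 needs a multiple of 4 in the range 0-1020")
        return add, {"imm8": magnitude >> 2}
    if encoding == "A1":
        if magnitude > 255:
            raise ValueError("A1 needs a value in the range 0-255")
        return add, {"imm4H": magnitude >> 4, "imm4L": magnitude & 0xF}
    raise ValueError("unknown encoding")
```

An offset such as 6 is representable in A1 but not in T1, and an offset such as 1020 is representable in T1 but not in A1.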
A8.6.67 LDRD (literal)
Load Register Dual (literal) calculates an address from the PC value and an immediate offset, loads two
words from memory, and writes them to two registers. For information about memory accesses see Memory
accesses on page A8-13.
Encoding T1 ARMv6T2, ARMv7
LDRD<c> <Rt>,<Rt2>,<label>
LDRD<c> <Rt>,<Rt2>,[PC,#-0]    Special case
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1110100PU1(0)11111 Rt Rt2 imm8
if P == ‘0’ then SEE “Related encodings”;
t = UInt(Rt); t2 = UInt(Rt2);
imm32 = ZeroExtend(imm8:‘00’, 32); add = (U == ‘1’);
if BadReg(t) || BadReg(t2) || t == t2 then UNPREDICTABLE;
Encoding A1 ARMv5TE*, ARMv6*, ARMv7
LDRD<c> <Rt>,<Rt2>,<label>
LDRD<c> <Rt>,<Rt2>,[PC,#-0]    Special case
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 000(1)U1(0)01111 Rt imm4H 1101 imm4L
if Rt<0> == ‘1’ then UNDEFINED;
t = UInt(Rt); t2 = t+1; imm32 = ZeroExtend(imm4H:imm4L, 32); add = (U == ‘1’);
if t2 == 15 then UNPREDICTABLE;
Related encodings See Load/store dual, load/store exclusive, table branch on page A6-24
Assembler syntax
LDRD<c><q> <Rt>, <Rt2>, <label>    Normal form
LDRD<c><q> <Rt>, <Rt2>, [PC, #+/-<imm>]    Alternative form
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The first destination register. For an ARM instruction, <Rt> must be even-numbered and not R14.
<Rt2>
The second destination register. For an ARM instruction, <Rt2> must be <R(t+1)>.
<label>
The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
required value of the offset from the Align(PC,4) value of this instruction to the label.
Permitted values of the offset are:
Encoding T1 multiples of 4 in the range -1020 to 1020
Encoding A1 any value in the range -255 to 255.
If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
The pre-UAL syntax LDR<c>D is equivalent to LDRD<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(15);
    address = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    R[t] = MemA[address,4];
    R[t2] = MemA[address+4,4];
Exceptions
Data Abort.
A8.6.68 LDRD (register)
Load Register Dual (register) calculates an address from a base register value and a register offset, loads two
words from memory, and writes them to two registers. It can use offset, post-indexed, or pre-indexed
addressing. For information about memory accesses see Memory accesses on page A8-13.
Encoding A1 ARMv5TE*, ARMv6*, ARMv7
LDRD<c> <Rt>,<Rt2>,[<Rn>,+/-<Rm>]{!}
LDRD<c> <Rt>,<Rt2>,[<Rn>],+/-<Rm>
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 000PU0W0 Rn Rt (0)(0)(0)(0)1101 Rm
if Rt<0> == ‘1’ then UNDEFINED;
t = UInt(Rt); t2 = t+1; n = UInt(Rn); m = UInt(Rm);
index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
if P == ‘0’ && W == ‘1’ then UNPREDICTABLE;
if t2 == 15 || m == 15 || m == t || m == t2 then UNPREDICTABLE;
if wback && (n == 15 || n == t || n == t2) then UNPREDICTABLE;
if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;
Assembler syntax
LDRD<c><q> <Rt>, <Rt2>, [<Rn>, +/-<Rm>]    Offset: index==TRUE, wback==FALSE
LDRD<c><q> <Rt>, <Rt2>, [<Rn>, +/-<Rm>]!    Pre-indexed: index==TRUE, wback==TRUE
LDRD<c><q> <Rt>, <Rt2>, [<Rn>], +/-<Rm>    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The first destination register. This register must be even-numbered and not R14.
<Rt2>
The second destination register. This register must be <R(t+1)>.
<Rn>
The base register. The SP or the PC can be used.
+/-
Is + or omitted if the value of <Rm> is to be added to the base register value (add == TRUE), or
- if it is to be subtracted (add == FALSE).
<Rm>
Contains the offset that is applied to the value of <Rn> to form the address.
The pre-UAL syntax LDR<c>D is equivalent to LDRD<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    offset_addr = if add then (R[n] + R[m]) else (R[n] - R[m]);
    address = if index then offset_addr else R[n];
    R[t] = MemA[address,4];
    R[t2] = MemA[address+4,4];
    if wback then R[n] = offset_addr;
Exceptions
Data Abort.
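The register constraints in the A1 decode pseudocode can be checked mechanically. The following Python sketch applies them in the order given by the pseudocode. The function name and the string return values are hypothetical, and arch_version is an assumed stand-in for ArchVersion().

```python
def check_ldrd_register_a1(Rt, Rn, Rm, P, W, arch_version=7):
    """Classify an LDRD (register) A1 field combination per the decode pseudocode."""
    if Rt & 1:
        return "UNDEFINED"                 # Rt<0> == '1'
    t, t2, n, m = Rt, Rt + 1, Rn, Rm
    wback = (P == 0) or (W == 1)
    if P == 0 and W == 1:
        return "UNPREDICTABLE"
    if t2 == 15 or m == 15 or m == t or m == t2:
        return "UNPREDICTABLE"
    if wback and (n == 15 or n == t or n == t2):
        return "UNPREDICTABLE"
    if arch_version < 6 and wback and m == n:
        return "UNPREDICTABLE"
    return "OK"
```

The t2 == 15 check is what excludes R14 as <Rt>, because the second destination register would then be the PC.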
A8.6.69 LDREX
Load Register Exclusive calculates an address from a base register value and an immediate offset, loads a
word from memory, writes it to a register and:
• if the address has the Shared Memory attribute, marks the physical address as exclusive access for
  the executing processor in a shared monitor
• causes the executing processor to indicate an active exclusive access in the local monitor.
For more information about support for shared memory see Synchronization and semaphores on
page A3-12. For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv6T2, ARMv7
LDREX<c> <Rt>,[<Rn>{,#<imm>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111010000101 Rn Rt (1)(1)(1)(1) imm8
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8:‘00’, 32);
if BadReg(t) || n == 15 then UNPREDICTABLE;
Encoding A1 ARMv6*, ARMv7
LDREX<c> <Rt>,[<Rn>]
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 00011001 Rn Rt (1)(1)(1)(1)1001(1)(1)(1)(1)
t = UInt(Rt); n = UInt(Rn); imm32 = Zeros(32); // Zero offset
if t == 15 || n == 15 then UNPREDICTABLE;
Assembler syntax
LDREX<c><q> <Rt>, [<Rn> {,#<imm>}]
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used.
<imm>
The immediate offset added to the value of <Rn> to form the address. <imm> can be omitted,
meaning an offset of 0. Values are:
Encoding T1 multiples of 4 in the range 0-1020
Encoding A1 omitted or 0.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    address = R[n] + imm32;
    SetExclusiveMonitors(address,4);
    R[t] = MemA[address,4];
Exceptions
Data Abort.
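To make the role of SetExclusiveMonitors more concrete, the following Python sketch is a deliberately simplified, single-processor toy model of a local monitor. It is not the architectural monitor behaviour, which is described in Synchronization and semaphores. The class name, the dict used as memory, and the store_exclusive method with its 0 (success) or 1 (failure) status are assumptions made for illustration only.

```python
class ToyExclusiveMonitor:
    """Toy model: remembers the last exclusively loaded (address, size)."""

    def __init__(self):
        self.marked = None

    def load_exclusive(self, memory, address, size=4):
        # Models SetExclusiveMonitors(address, size) followed by the load.
        self.marked = (address, size)
        return memory[address]

    def clear(self):
        # Models anything that clears the exclusive state between load and store.
        self.marked = None

    def store_exclusive(self, memory, address, value, size=4):
        # Assumed status convention: 0 if the store happened, 1 if it did not.
        succeeded = self.marked == (address, size)
        self.marked = None
        if succeeded:
            memory[address] = value
            return 0
        return 1
```

In this model a read-modify-write sequence only commits its store if nothing cleared the monitor after the exclusive load, which is the property that software retry loops built on exclusive accesses depend on.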
A8.6.70 LDREXB
Load Register Exclusive Byte derives an address from a base register value, loads a byte from memory,
zero-extends it to form a 32-bit word, writes it to a register and:
• if the address has the Shared Memory attribute, marks the physical address as exclusive access for
  the executing processor in a shared monitor
• causes the executing processor to indicate an active exclusive access in the local monitor.
For more information about support for shared memory see Synchronization and semaphores on
page A3-12. For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv7
LDREXB<c> <Rt>, [<Rn>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111010001101 Rn Rt (1)(1)(1)(1)0100(1)(1)(1)(1)
t = UInt(Rt); n = UInt(Rn);
if BadReg(t) || n == 15 then UNPREDICTABLE;
Encoding A1 ARMv6K, ARMv7
LDREXB<c> <Rt>, [<Rn>]
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 00011101 Rn Rt (1)(1)(1)(1)1001(1)(1)(1)(1)
t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE;
Assembler syntax
LDREXB<c><q> <Rt>, [<Rn>]
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    address = R[n];
    SetExclusiveMonitors(address,1);
    R[t] = ZeroExtend(MemA[address,1], 32);
Exceptions
Data Abort.
A8.6.71 LDREXD
Load Register Exclusive Doubleword derives an address from a base register value, loads a 64-bit
doubleword from memory, writes it to two registers and:
• if the address has the Shared Memory attribute, marks the physical address as exclusive access for
  the executing processor in a shared monitor
• causes the executing processor to indicate an active exclusive access in the local monitor.
For more information about support for shared memory see Synchronization and semaphores on
page A3-12. For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv7
LDREXD<c> <Rt>,<Rt2>,[<Rn>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111010001101 Rn Rt Rt2 0111(1)(1)(1)(1)
t = UInt(Rt); t2 = UInt(Rt2); n = UInt(Rn);
if BadReg(t) || BadReg(t2) || t == t2 || n == 15 then UNPREDICTABLE;
Encoding A1 ARMv6K, ARMv7
LDREXD<c> <Rt>,<Rt2>,[<Rn>]
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 00011011 Rn Rt (1)(1)(1)(1)1001(1)(1)(1)(1)
t = UInt(Rt); t2 = t+1; n = UInt(Rn);
if Rt<0> == ‘1’ || Rt == ‘1110’ || n == 15 then UNPREDICTABLE;
Assembler syntax
LDREXD<c><q> <Rt>, <Rt2>, [<Rn>]
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The first destination register. For an ARM instruction, <Rt> must be even-numbered and not R14.
<Rt2>
The second destination register. For an ARM instruction, <Rt2> must be <R(t+1)>.
<Rn>
The base register. The SP can be used.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    address = R[n];
    SetExclusiveMonitors(address,8);
    value = MemA[address,8];
    // Extract words from 64-bit loaded value such that R[t] is
    // loaded from address and R[t2] from address+4.
    R[t] = if BigEndian() then value<63:32> else value<31:0>;
    R[t2] = if BigEndian() then value<31:0> else value<63:32>;
Exceptions
Data Abort.
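The word extraction in the Operation pseudocode ensures that R[t] always receives the word at address and R[t2] the word at address+4, whichever endianness is in use. A short Python sketch of that selection follows. The function names are hypothetical, and memory is modelled as a bytes object.

```python
def split_exclusive_doubleword(value: int, big_endian: bool):
    """Return (R[t], R[t2]) from a 64-bit loaded value, as in the pseudocode."""
    low = value & 0xFFFFFFFF               # value<31:0>
    high = (value >> 32) & 0xFFFFFFFF      # value<63:32>
    return (high, low) if big_endian else (low, high)


def load_doubleword(memory: bytes, address: int, big_endian: bool):
    """Read 8 bytes in the given byte order and split them into two words."""
    order = "big" if big_endian else "little"
    value = int.from_bytes(memory[address:address + 8], order)
    return split_exclusive_doubleword(value, big_endian)
```

In both byte orders the first element of the result equals the 32-bit word read from the lower four bytes, which is the invariant the pseudocode comment describes.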
A8.6.72 LDREXH
Load Register Exclusive Halfword derives an address from a base register value, loads a halfword from
memory, zero-extends it to form a 32-bit word, writes it to a register and:
• if the address has the Shared Memory attribute, marks the physical address as exclusive access for
  the executing processor in a shared monitor
• causes the executing processor to indicate an active exclusive access in the local monitor.
For more information about support for shared memory see Synchronization and semaphores on
page A3-12. For information about memory accesses see Memory accesses on page A8-13.
Encoding T1 ARMv7
LDREXH<c> <Rt>, [<Rn>]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111010001101 Rn Rt (1)(1)(1)(1)0101(1)(1)(1)(1)
t = UInt(Rt); n = UInt(Rn);
if BadReg(t) || n == 15 then UNPREDICTABLE;
Encoding A1 ARMv6K, ARMv7
LDREXH<c> <Rt>, [<Rn>]
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
cond 00011111 Rn Rt (1)(1)(1)(1)1001(1)(1)(1)(1)
t = UInt(Rt); n = UInt(Rn);
if t == 15 || n == 15 then UNPREDICTABLE;
Assembler syntax
LDREXH<c><q> <Rt>, [<Rn>]
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    address = R[n];
    SetExclusiveMonitors(address,2);
    R[t] = ZeroExtend(MemA[address,2], 32);
Exceptions
Data Abort.
A8.6.73 LDRH (immediate, Thumb)
Load Register Halfword (immediate) calculates an address from a base register value and an immediate
offset, loads a halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. It
can use offset, post-indexed, or pre-indexed addressing. For information about memory accesses see
Memory accesses on page A8-13.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LDRH<c> <Rt>,[<Rn>{,#<imm>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
10001 imm5 Rn Rt
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm5:‘0’, 32);
index = TRUE; add = TRUE; wback = FALSE;
Encoding T2 ARMv6T2, ARMv7
LDRH<c>.W <Rt>,[<Rn>{,#<imm12>}]
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110001011 Rn Rt imm12
if Rt == ‘1111’ then SEE “Unallocated memory hints”;
if Rn == ‘1111’ then SEE LDRH (literal);
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
index = TRUE; add = TRUE; wback = FALSE;
if t == 13 then UNPREDICTABLE;
Encoding T3 ARMv6T2, ARMv7
LDRH<c> <Rt>,[<Rn>,#-<imm8>]
LDRH<c> <Rt>,[<Rn>],#+/-<imm8>
LDRH<c> <Rt>,[<Rn>,#+/-<imm8>]!
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
111110000011 Rn Rt 1 P U W imm8
if Rn == ‘1111’ then SEE LDRH (literal);
if Rt == ‘1111’ && P == ‘1’ && U == ‘0’ && W == ‘0’ then SEE “Unallocated memory hints”;
if P == ‘1’ && U == ‘1’ && W == ‘0’ then SEE LDRHT;
if P == ‘0’ && W == ‘0’ then UNDEFINED;
t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
if BadReg(t) || (wback && n == t) then UNPREDICTABLE;
Unallocated memory hints See Load halfword, memory hints on page A6-26
Assembler syntax
LDRH<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
LDRH<c><q> <Rt>, [<Rn>, #+/-<imm>]!    Pre-indexed: index==TRUE, wback==TRUE
LDRH<c><q> <Rt>, [<Rn>], #+/-<imm>    Post-indexed: index==FALSE, wback==TRUE
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rt>
The destination register.
<Rn>
The base register. The SP can be used. For PC use see LDRH (literal) on page A8-154.
+/-
Is + or omitted to indicate that the immediate offset is added to the base register value
(add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different
instructions are generated for #0 and #-0.
<imm>
The immediate offset used to form the address. For the offset addressing syntax, <imm> can
be omitted, meaning an offset of 0. Values are:
Encoding T1 multiples of 2 in the range 0-62
Encoding T2 any value in the range 0-4095
Encoding T3 any value in the range 0-255.
The pre-UAL syntax LDR<c>H is equivalent to LDRH<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
    address = if index then offset_addr else R[n];
    data = MemU[address,2];
    if wback then R[n] = offset_addr;
    if UnalignedSupport() || address<0> == ‘0’ then
        R[t] = ZeroExtend(data, 32);
    else // Can only apply before ARMv7
        R[t] = bits(32) UNKNOWN;
Exceptions
Data Abort.
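The three Thumb encodings accept different offset ranges and addressing forms. As a rough sketch of how a tool might pick the narrowest suitable encoding, the Python function below applies only the ranges and forms listed above. The function name is hypothetical. The low_regs flag is an assumption that reflects the 3-bit Rn and Rt fields of the 16-bit T1 encoding. Real assemblers apply further rules, for example the .W qualifier.

```python
def choose_ldrh_thumb_encoding(imm, add=True, index=True, wback=False, low_regs=True):
    """Return "T1", "T2", "T3", or None if no Thumb encoding fits."""
    if imm < 0:
        raise ValueError("pass the magnitude in imm and the sign in add")
    if index and not wback:                # offset addressing
        if add:
            if low_regs and imm % 2 == 0 and imm <= 62:
                return "T1"
            return "T2" if imm <= 4095 else None
        return "T3" if imm <= 255 else None   # [<Rn>,#-<imm8>]
    if wback:                              # pre-indexed or post-indexed
        return "T3" if imm <= 255 else None
    return None                            # P == '0' && W == '0' is UNDEFINED
```

Positive offsets without writeback never select T3 in this sketch, because that bit pattern (P == ‘1’, U == ‘1’, W == ‘0’) is decoded as LDRHT.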
A8.6.74 LDRH (immediate, ARM)
Load Register Halfword (immediate) calculates an address from a base register value and an immediate
offset, loads a halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. It
can use offset, post-indexed, or pre-indexed addressing. For information about memory accesses see
Memory accesses on page A8-13.
Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRH<c> <Rt>,[<Rn>{,#+/-<imm8>}]
    LDRH<c> <Rt>,[<Rn>],#+/-<imm8>
    LDRH<c> <Rt>,[<Rn>,#+/-<imm8>]!

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
    cond   000    P   U   1   W   1   Rn     Rt     imm4H  1011  imm4L

    if Rn == ‘1111’ then SEE LDRH (literal);
    if P == ‘0’ && W == ‘1’ then SEE LDRHT;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    if t == 15 || (wback && n == t) then UNPREDICTABLE;
Assembler syntax
    LDRH<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
    LDRH<c><q> <Rt>, [<Rn>, #+/-<imm>]!      Pre-indexed: index==TRUE, wback==TRUE
    LDRH<c><q> <Rt>, [<Rn>], #+/-<imm>       Post-indexed: index==FALSE, wback==TRUE

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used. For PC use see LDRH (literal) on page A8-154.
    +/-       Is + or omitted to indicate that the immediate offset is added to the base register value
              (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different
              instructions are generated for #0 and #-0.
    <imm>     The immediate offset used to form the address. For the offset addressing syntax, <imm> can
              be omitted, meaning an offset of 0. Any value in the range 0-255 is permitted.

The pre-UAL syntax LDR<c>H is equivalent to LDRH<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations();
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,2];
        if wback then R[n] = offset_addr;
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = ZeroExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
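
As an illustration of the encoding A1 decode for LDRH (immediate, ARM), the following Python sketch extracts the P, U, W, Rn, Rt, imm4H, and imm4L fields from a 32-bit instruction word and applies the encoding-specific checks in the order given. The function name decode_ldrh_imm_a1 and the convention of returning a string for the SEE and UNPREDICTABLE cases are assumptions made for this example only.

```python
def decode_ldrh_imm_a1(instr):
    """Decode a 32-bit ARM word as LDRH (immediate), encoding A1.

    Returns a dict of decoded values, or a string naming the case the
    pseudocode redirects to. Hypothetical helper for illustration.
    """
    # Fixed bits: 27-25 = 000, bit 22 = 1, bit 20 = 1, bits 7-4 = 1011
    if (instr & 0x0E5000F0) != 0x005000B0:
        raise ValueError("not an LDRH (immediate, ARM) encoding")
    p = (instr >> 24) & 1
    u = (instr >> 23) & 1
    w = (instr >> 21) & 1
    rn = (instr >> 16) & 0xF
    rt = (instr >> 12) & 0xF
    imm4h = (instr >> 8) & 0xF
    imm4l = instr & 0xF
    if rn == 0b1111:
        return "SEE LDRH (literal)"
    if p == 0 and w == 1:
        return "SEE LDRHT"
    wback = (p == 0) or (w == 1)
    if rt == 15 or (wback and rn == rt):
        return "UNPREDICTABLE"
    return {
        "t": rt,
        "n": rn,
        "imm32": (imm4h << 4) | imm4l,  # ZeroExtend(imm4H:imm4L, 32)
        "index": p == 1,
        "add": u == 1,
        "wback": wback,
    }
```

For example, the word 0xE1D100B2 has P = 1, U = 1, W = 0, Rn = 1, Rt = 0, and imm4H:imm4L = 2, so this sketch decodes it as the offset form with t = 0, n = 1, and imm32 = 2.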
A8.6.75 LDRH (literal)
Load Register Halfword (literal) calculates an address from the PC value and an immediate offset, loads a
halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. For information
about memory accesses see Memory accesses on page A8-13.
Encoding T1    ARMv6T2, ARMv7
    LDRH<c> <Rt>,<label>
    LDRH<c> <Rt>,[PC,#-0]    Special case

    15-8      7  6-4  3-0   |  15-12  11-0
    11111000  U  011  1111  |  Rt     imm12

    if Rt == ‘1111’ then SEE “Unallocated memory hints”;
    t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
    if t == 13 then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRH<c> <Rt>,<label>
    LDRH<c> <Rt>,[PC,#-0]    Special case

    31-28  27-25  24   23  22  21   20  19-16  15-12  11-8   7-4   3-0
    cond   000    (1)  U   1   (0)  1   1111   Rt     imm4H  1011  imm4L

    t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32); add = (U == ‘1’);
    if t == 15 then UNPREDICTABLE;

Unallocated memory hints    See Load halfword, memory hints on page A6-26
Assembler syntax
    LDRH<c><q> <Rt>, <label>            Normal form
    LDRH<c><q> <Rt>, [PC, #+/-<imm>]    Alternative form

where:
    <c><q>     See Standard assembler syntax fields on page A8-7.
    <Rt>       The destination register.
    <label>    The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
               required value of the offset from the Align(PC,4) value of this instruction to this label.
               Permitted values of the offset are:
                   Encoding T1    any value in the range -4095 to 4095
                   Encoding A1    any value in the range -255 to 255.
               If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
               If the offset is negative, imm32 is equal to minus the offset and add == FALSE.

The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.

The pre-UAL syntax LDR<c>H is equivalent to LDRH<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(15);
        base = Align(PC,4);
        address = if add then (base + imm32) else (base - imm32);
        data = MemU[address,2];
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = ZeroExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
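
The mapping from a label offset to the imm32 and add values, and the address calculation from Align(PC,4), can be sketched in Python as follows. The helper names align, encode_literal_offset, and literal_address are hypothetical, and the limit parameter is an assumption of this sketch that stands for the per-encoding range (4095 for encoding T1, 255 for encoding A1).

```python
def align(x, n):
    """Align(x, n): round x down to a multiple of n."""
    return n * (x // n)

def encode_literal_offset(offset, limit):
    """Map a signed label offset to (imm32, add).

    limit is 4095 for encoding T1 or 255 for encoding A1.
    """
    if not -limit <= offset <= limit:
        raise ValueError("offset out of range for this encoding")
    # Zero or positive: add == TRUE. Negative: imm32 = -offset, add == FALSE.
    return (offset, True) if offset >= 0 else (-offset, False)

def literal_address(pc, imm32, add):
    """Address used by the literal form, from the PC value and imm32."""
    base = align(pc, 4)
    return (base + imm32 if add else base - imm32) & 0xFFFFFFFF
```

Note that encode_literal_offset never produces (0, False). That combination corresponds to the [PC,#-0] special case, which can only be written using the alternative syntax.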
A8.6.76 LDRH (register)
Load Register Halfword (register) calculates an address from a base register value and an offset register
value, loads a halfword from memory, zero-extends it to form a 32-bit word, and writes it to a register. The
offset register value can be shifted left by 0, 1, 2, or 3 bits. For information about memory accesses see
Memory accesses on page A8-13.
Encoding T1    ARMv4T, ARMv5T*, ARMv6*, ARMv7
    LDRH<c> <Rt>,[<Rn>,<Rm>]

    15-9     8-6  5-3  2-0
    0101101  Rm   Rn   Rt

    if CurrentInstrSet() == InstrSet_ThumbEE then SEE “Modified operation in ThumbEE”;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2    ARMv6T2, ARMv7
    LDRH<c>.W <Rt>,[<Rn>,<Rm>{,LSL #<imm2>}]

    15-4          3-0  |  15-12  11-6    5-4   3-0
    111110000011  Rn   |  Rt     000000  imm2  Rm

    if Rn == ‘1111’ then SEE LDRH (literal);
    if Rt == ‘1111’ then SEE “Unallocated memory hints”;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, UInt(imm2));
    if t == 13 || BadReg(m) then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRH<c> <Rt>,[<Rn>,+/-<Rm>]{!}
    LDRH<c> <Rt>,[<Rn>],+/-<Rm>

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
    cond   000    P   U   0   W   1   Rn     Rt     (0)(0)(0)(0)  1011  Rm

    if P == ‘0’ && W == ‘1’ then SEE LDRHT;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    (shift_t, shift_n) = (SRType_LSL, 0);
    if t == 15 || m == 15 then UNPREDICTABLE;
    if wback && (n == 15 || n == t) then UNPREDICTABLE;
    if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;

Unallocated memory hints         See Load halfword, memory hints on page A6-26
Modified operation in ThumbEE    See LDRH (register) on page A9-10
Assembler syntax
    LDRH<c><q> <Rt>, [<Rn>, <Rm>{, LSL #<imm>}]    Offset: index==TRUE, wback==FALSE
    LDRH<c><q> <Rt>, [<Rn>, +/-<Rm>]               Offset: index==TRUE, wback==FALSE
    LDRH<c><q> <Rt>, [<Rn>, +/-<Rm>]!              Pre-indexed: index==TRUE, wback==TRUE
    LDRH<c><q> <Rt>, [<Rn>], +/-<Rm>               Post-indexed: index==FALSE, wback==TRUE

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used. The PC can be used only in the ARM instruction set.
    +/-       Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value
              (add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
    <Rm>      Contains the offset that is optionally left shifted and added to the value of <Rn> to form the
              address.
    <imm>     If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. Only
              encoding T2 is permitted, and <imm> is encoded in imm2.
              If absent, no shift is specified and all encodings are permitted. In encoding T2, imm2 is
              encoded as 0b00.

The pre-UAL syntax LDR<c>H is equivalent to LDRH<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset = Shift(R[m], shift_t, shift_n, APSR.C);
        offset_addr = if add then (R[n] + offset) else (R[n] - offset);
        address = if index then offset_addr else R[n];
        data = MemU[address,2];
        if wback then R[n] = offset_addr;
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = ZeroExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
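
The register-offset address calculation can be sketched in Python as shown below. This models only the LSL case that these encodings use; the function name ldrh_register_address and its keyword defaults are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def ldrh_register_address(rn_val, rm_val, shift_n=0, add=True, index=True):
    """Return (address, offset_addr) for the register-offset forms.

    shift_n is the LSL amount (0-3 for encoding T2, 0 otherwise).
    Hypothetical helper for illustration.
    """
    offset = (rm_val << shift_n) & MASK32          # Shift(R[m], SRType_LSL, shift_n, ...)
    offset_addr = (rn_val + offset if add else rn_val - offset) & MASK32
    address = offset_addr if index else rn_val
    return address, offset_addr
```

A typical use of the shifted form is indexing an array of halfwords: with a base of 0x2000, an element index of 5 in <Rm>, and LSL #1, the sketch gives an address of 0x200A.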
A8.6.77 LDRHT
Load Register Halfword Unprivileged loads a halfword from memory, zero-extends it to form a 32-bit word,
and writes it to a register. For information about memory accesses see Memory accesses on page A8-13.
The memory access is restricted as if the processor were running in User mode. (This makes no difference
if the processor is actually running in User mode.)
The Thumb instruction uses an offset addressing mode, that calculates the address used for the memory
access from a base register value and an immediate offset, and leaves the base register unchanged.
The ARM instruction uses a post-indexed addressing mode, that uses a base register value as the address for
the memory access, and calculates a new address from a base register value and an offset and writes it back
to the base register. The offset can be an immediate value or a register value.
Encoding T1    ARMv6T2, ARMv7
    LDRHT<c> <Rt>,[<Rn>,#<imm8>]

    15-4          3-0  |  15-12  11-8  7-0
    111110000011  Rn   |  Rt     1110  imm8

    if Rn == ‘1111’ then SEE LDRH (literal);
    t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
    register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
    if BadReg(t) then UNPREDICTABLE;

Encoding A1    ARMv6T2, ARMv7
    LDRHT<c> <Rt>, [<Rn>] {, #+/-<imm8>}

    31-28  27-24  23  22-20  19-16  15-12  11-8   7-4   3-0
    cond   0000   U   111    Rn     Rt     imm4H  1011  imm4L

    t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == ‘1’);
    register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32);
    if t == 15 || n == 15 || n == t then UNPREDICTABLE;

Encoding A2    ARMv6T2, ARMv7
    LDRHT<c> <Rt>, [<Rn>], +/-<Rm>

    31-28  27-24  23  22-20  19-16  15-12  11-8          7-4   3-0
    cond   0000   U   011    Rn     Rt     (0)(0)(0)(0)  1011  Rm

    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == ‘1’);
    register_form = TRUE;
    if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;
Assembler syntax
    LDRHT<c><q> <Rt>, [<Rn> {, #<imm>}]       Offset: Thumb only
    LDRHT<c><q> <Rt>, [<Rn>] {, #+/-<imm>}    Post-indexed: ARM only
    LDRHT<c><q> <Rt>, [<Rn>], +/-<Rm>         Post-indexed: ARM only

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used.
    +/-       Is + or omitted if <imm> or the optionally shifted value of <Rm> is to be added to the base
              register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only,
              add == FALSE).
    <imm>     The immediate offset applied to the value of <Rn>. Any value in the range 0-255 is permitted.
              <imm> can be omitted, meaning an offset of 0.
    <Rm>      Contains the offset that is applied to the value of <Rn> to form the address.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset = if register_form then R[m] else imm32;
        offset_addr = if add then (R[n] + offset) else (R[n] - offset);
        address = if postindex then R[n] else offset_addr;
        data = MemU_unpriv[address,2];
        if postindex then R[n] = offset_addr;
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = ZeroExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
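
The effect of the unprivileged access can be illustrated with a deliberately simplified Python sketch. The DataAbort class, the mem_read helper, and the user_readable callback are all hypothetical stand-ins for the memory system and its permission checks, which are described elsewhere in this manual; the sketch assumes little-endian data and that no register is updated when the access faults.

```python
class DataAbort(Exception):
    """Stands in for the Data Abort exception in this sketch."""

def mem_read(mem, user_readable, address, size, privileged):
    """Read size bytes; fault if an unprivileged access targets a
    location that is not readable from User mode (hypothetical model)."""
    if not privileged and not user_readable(address):
        raise DataAbort(hex(address))
    return int.from_bytes(mem[address:address + size], "little")

def ldrht(regs, mem, user_readable, t, n, offset, add=True, postindex=True):
    """Model of the LDRHT Operation; offset is imm32 or the R[m] value."""
    base = regs[n]
    offset_addr = (base + offset if add else base - offset) & 0xFFFFFFFF
    address = base if postindex else offset_addr
    # MemU_unpriv: checked as a User mode access, whatever the current mode
    data = mem_read(mem, user_readable, address, 2, privileged=False)
    if postindex:
        regs[n] = offset_addr
    regs[t] = data  # zero-extended halfword
```

Calling ldrht with postindex=True corresponds to the ARM forms, and postindex=False with add=True corresponds to the Thumb form, which leaves the base register unchanged.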
A8.6.78 LDRSB (immediate)
Load Register Signed Byte (immediate) calculates an address from a base register value and an immediate
offset, loads a byte from memory, sign-extends it to form a 32-bit word, and writes it to a register. It can use
offset, post-indexed, or pre-indexed addressing. For information about memory accesses see Memory
accesses on page A8-13.
Encoding T1    ARMv6T2, ARMv7
    LDRSB<c> <Rt>,[<Rn>,#<imm12>]

    15-4          3-0  |  15-12  11-0
    111110011001  Rn   |  Rt     imm12

    if Rt == ‘1111’ then SEE PLI;
    if Rn == ‘1111’ then SEE LDRSB (literal);
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
    index = TRUE; add = TRUE; wback = FALSE;
    if t == 13 then UNPREDICTABLE;

Encoding T2    ARMv6T2, ARMv7
    LDRSB<c> <Rt>,[<Rn>,#-<imm8>]
    LDRSB<c> <Rt>,[<Rn>],#+/-<imm8>
    LDRSB<c> <Rt>,[<Rn>,#+/-<imm8>]!

    15-4          3-0  |  15-12  11  10  9  8  7-0
    111110010001  Rn   |  Rt     1   P   U  W  imm8

    if Rt == ‘1111’ && P == ‘1’ && U == ‘0’ && W == ‘0’ then SEE PLI;
    if Rn == ‘1111’ then SEE LDRSB (literal);
    if P == ‘1’ && U == ‘1’ && W == ‘0’ then SEE LDRSBT;
    if P == ‘0’ && W == ‘0’ then UNDEFINED;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
    index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
    if BadReg(t) || (wback && n == t) then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSB<c> <Rt>,[<Rn>{,#+/-<imm8>}]
    LDRSB<c> <Rt>,[<Rn>],#+/-<imm8>
    LDRSB<c> <Rt>,[<Rn>,#+/-<imm8>]!

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
    cond   000    P   U   1   W   1   Rn     Rt     imm4H  1101  imm4L

    if Rn == ‘1111’ then SEE LDRSB (literal);
    if P == ‘0’ && W == ‘1’ then SEE LDRSBT;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    if t == 15 || (wback && n == t) then UNPREDICTABLE;
Assembler syntax
    LDRSB<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
    LDRSB<c><q> <Rt>, [<Rn>, #+/-<imm>]!      Pre-indexed: index==TRUE, wback==TRUE
    LDRSB<c><q> <Rt>, [<Rn>], #+/-<imm>       Post-indexed: index==FALSE, wback==TRUE

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used. For PC use see LDRSB (literal) on page A8-162.
    +/-       Is + or omitted to indicate that the immediate offset is added to the base register value
              (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different
              instructions are generated for #0 and #-0.
    <imm>     The immediate offset used to form the address. For the offset addressing syntax, <imm> can
              be omitted, meaning an offset of 0. Values are:
                  Encoding T1          any value in the range 0-4095
                  Encoding T2 or A1    any value in the range 0-255.

The pre-UAL syntax LDR<c>SB is equivalent to LDRSB<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        R[t] = SignExtend(MemU[address,1], 32);
        if wback then R[n] = offset_addr;

Exceptions
    Data Abort.
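
Encoding T2 of LDRSB (immediate) shares its bit pattern with several other instructions, so the order of the encoding-specific checks matters. The following Python sketch applies those checks in the order given for encoding T2. The function name classify_ldrsb_t2 and the use of strings for the SEE, UNDEFINED, and UNPREDICTABLE outcomes are assumptions of this example.

```python
def classify_ldrsb_t2(rt, rn, p, u, w):
    """Apply the LDRSB (immediate) encoding T2 checks in order.

    rt, rn are 4-bit register fields; p, u, w are single bits.
    Hypothetical helper for illustration.
    """
    if rt == 0b1111 and p == 1 and u == 0 and w == 0:
        return "SEE PLI"
    if rn == 0b1111:
        return "SEE LDRSB (literal)"
    if p == 1 and u == 1 and w == 0:
        return "SEE LDRSBT"
    if p == 0 and w == 0:
        return "UNDEFINED"
    wback = (w == 1)
    bad_reg = rt in (13, 15)  # BadReg(t): SP or PC
    if bad_reg or (wback and rn == rt):
        return "UNPREDICTABLE"
    return {"index": p == 1, "add": u == 1, "wback": wback}
```

In this sketch, P = 1, U = 0, W = 0 with an ordinary destination register gives the negative-offset form without writeback, which matches the first assembler syntax listed for encoding T2.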
A8.6.79 LDRSB (literal)
Load Register Signed Byte (literal) calculates an address from the PC value and an immediate offset, loads
a byte from memory, sign-extends it to form a 32-bit word, and writes it to a register. For information about
memory accesses see Memory accesses on page A8-13.
Encoding T1    ARMv6T2, ARMv7
    LDRSB<c> <Rt>,<label>
    LDRSB<c> <Rt>,[PC,#-0]    Special case

    15-8      7  6-4  3-0   |  15-12  11-0
    11111001  U  001  1111  |  Rt     imm12

    if Rt == ‘1111’ then SEE PLI;
    t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
    if t == 13 then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSB<c> <Rt>,<label>
    LDRSB<c> <Rt>,[PC,#-0]    Special case

    31-28  27-25  24   23  22  21   20  19-16  15-12  11-8   7-4   3-0
    cond   000    (1)  U   1   (0)  1   1111   Rt     imm4H  1101  imm4L

    t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32); add = (U == ‘1’);
    if t == 15 then UNPREDICTABLE;
Assembler syntax
    LDRSB<c><q> <Rt>, <label>            Normal form
    LDRSB<c><q> <Rt>, [PC, #+/-<imm>]    Alternative form

where:
    <c><q>     See Standard assembler syntax fields on page A8-7.
    <Rt>       The destination register.
    <label>    The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
               required value of the offset from the Align(PC,4) value of this instruction to this label.
               Permitted values of the offset are:
                   Encoding T1    any value in the range -4095 to 4095
                   Encoding A1    any value in the range -255 to 255.
               If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
               If the offset is negative, imm32 is equal to minus the offset and add == FALSE.

The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.

The pre-UAL syntax LDR<c>SB is equivalent to LDRSB<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(15);
        base = Align(PC,4);
        address = if add then (base + imm32) else (base - imm32);
        R[t] = SignExtend(MemU[address,1], 32);

Exceptions
    Data Abort.
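
The difference between the sign-extending loads (LDRSB, LDRSH) and the zero-extending loads (LDRB, LDRH) lies only in how the loaded value is widened to 32 bits. A minimal Python sketch of the two operations follows; the function names and the to_bits default are assumptions of the example, modelled on the SignExtend() and ZeroExtend() pseudocode functions.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Copy bit from_bits-1 of value into all higher bits of the result."""
    sign_bit = 1 << (from_bits - 1)
    value &= (1 << from_bits) - 1
    # XOR then subtract converts the field to a signed Python integer
    extended = (value ^ sign_bit) - sign_bit
    return extended & ((1 << to_bits) - 1)

def zero_extend(value, from_bits):
    """Keep the low from_bits bits; all higher bits of the result are zero."""
    return value & ((1 << from_bits) - 1)
```

For a loaded byte of 0x80, sign_extend(0x80, 8) gives 0xFFFFFF80, the value LDRSB would write to <Rt>, while zero_extend(0x80, 8) gives 0x00000080.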
A8.6.80 LDRSB (register)
Load Register Signed Byte (register) calculates an address from a base register value and an offset register
value, loads a byte from memory, sign-extends it to form a 32-bit word, and writes it to a register. The offset
register value can be shifted left by 0, 1, 2, or 3 bits. For information about memory accesses see Memory
accesses on page A8-13.
Encoding T1    ARMv4T, ARMv5T*, ARMv6*, ARMv7
    LDRSB<c> <Rt>,[<Rn>,<Rm>]

    15-9     8-6  5-3  2-0
    0101011  Rm   Rn   Rt

    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2    ARMv6T2, ARMv7
    LDRSB<c>.W <Rt>,[<Rn>,<Rm>{,LSL #<imm2>}]

    15-4          3-0  |  15-12  11-6    5-4   3-0
    111110010001  Rn   |  Rt     000000  imm2  Rm

    if Rt == ‘1111’ then SEE PLI;
    if Rn == ‘1111’ then SEE LDRSB (literal);
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, UInt(imm2));
    if t == 13 || BadReg(m) then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSB<c> <Rt>,[<Rn>,+/-<Rm>]{!}
    LDRSB<c> <Rt>,[<Rn>],+/-<Rm>

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
    cond   000    P   U   0   W   1   Rn     Rt     (0)(0)(0)(0)  1101  Rm

    if P == ‘0’ && W == ‘1’ then SEE LDRSBT;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    (shift_t, shift_n) = (SRType_LSL, 0);
    if t == 15 || m == 15 then UNPREDICTABLE;
    if wback && (n == 15 || n == t) then UNPREDICTABLE;
    if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;
Assembler syntax
    LDRSB<c><q> <Rt>, [<Rn>, <Rm>{, LSL #<imm>}]    Offset: index==TRUE, wback==FALSE
    LDRSB<c><q> <Rt>, [<Rn>, +/-<Rm>]               Offset: index==TRUE, wback==FALSE
    LDRSB<c><q> <Rt>, [<Rn>, +/-<Rm>]!              Pre-indexed: index==TRUE, wback==TRUE
    LDRSB<c><q> <Rt>, [<Rn>], +/-<Rm>               Post-indexed: index==FALSE, wback==TRUE

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used. The PC can be used only in the ARM instruction set.
    +/-       Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value
              (add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
    <Rm>      Contains the offset that is optionally left shifted and added to the value of <Rn> to form the
              address.
    <imm>     If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3. Only
              encoding T2 is permitted, and <imm> is encoded in imm2.
              If absent, no shift is specified and all encodings are permitted. In encoding T2, imm2 is
              encoded as 0b00.

The pre-UAL syntax LDR<c>SB is equivalent to LDRSB<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset = Shift(R[m], shift_t, shift_n, APSR.C);
        offset_addr = if add then (R[n] + offset) else (R[n] - offset);
        address = if index then offset_addr else R[n];
        R[t] = SignExtend(MemU[address,1], 32);
        if wback then R[n] = offset_addr;

Exceptions
    Data Abort.
A8.6.81 LDRSBT
Load Register Signed Byte Unprivileged loads a byte from memory, sign-extends it to form a 32-bit word,
and writes it to a register. For information about memory accesses see Memory accesses on page A8-13.
The memory access is restricted as if the processor were running in User mode. (This makes no difference
if the processor is actually running in User mode.)
The Thumb instruction uses an offset addressing mode, that calculates the address used for the memory
access from a base register value and an immediate offset, and leaves the base register unchanged.
The ARM instruction uses a post-indexed addressing mode, that uses a base register value as the address for
the memory access, and calculates a new address from a base register value and an offset and writes it back
to the base register. The offset can be an immediate value or a register value.
Encoding T1    ARMv6T2, ARMv7
    LDRSBT<c> <Rt>,[<Rn>,#<imm8>]

    15-4          3-0  |  15-12  11-8  7-0
    111110010001  Rn   |  Rt     1110  imm8

    if Rn == ‘1111’ then SEE LDRSB (literal);
    t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
    register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
    if BadReg(t) then UNPREDICTABLE;

Encoding A1    ARMv6T2, ARMv7
    LDRSBT<c> <Rt>, [<Rn>] {, #+/-<imm8>}

    31-28  27-24  23  22-20  19-16  15-12  11-8   7-4   3-0
    cond   0000   U   111    Rn     Rt     imm4H  1101  imm4L

    t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == ‘1’);
    register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32);
    if t == 15 || n == 15 || n == t then UNPREDICTABLE;

Encoding A2    ARMv6T2, ARMv7
    LDRSBT<c> <Rt>, [<Rn>], +/-<Rm>

    31-28  27-24  23  22-20  19-16  15-12  11-8          7-4   3-0
    cond   0000   U   011    Rn     Rt     (0)(0)(0)(0)  1101  Rm

    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == ‘1’);
    register_form = TRUE;
    if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;
Assembler syntax
    LDRSBT<c><q> <Rt>, [<Rn> {, #<imm>}]       Offset: Thumb only
    LDRSBT<c><q> <Rt>, [<Rn>] {, #+/-<imm>}    Post-indexed: ARM only
    LDRSBT<c><q> <Rt>, [<Rn>], +/-<Rm>         Post-indexed: ARM only

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used.
    +/-       Is + or omitted if <imm> or the optionally shifted value of <Rm> is to be added to the base
              register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only,
              add == FALSE).
    <imm>     The immediate offset applied to the value of <Rn>. Any value in the range 0-255 is permitted.
              <imm> can be omitted, meaning an offset of 0.
    <Rm>      Contains the offset that is applied to the value of <Rn> to form the address.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset = if register_form then R[m] else imm32;
        offset_addr = if add then (R[n] + offset) else (R[n] - offset);
        address = if postindex then R[n] else offset_addr;
        R[t] = SignExtend(MemU_unpriv[address,1], 32);
        if postindex then R[n] = offset_addr;

Exceptions
    Data Abort.
A8.6.82 LDRSH (immediate)
Load Register Signed Halfword (immediate) calculates an address from a base register value and an
immediate offset, loads a halfword from memory, sign-extends it to form a 32-bit word, and writes it to a
register. It can use offset, post-indexed, or pre-indexed addressing. For information about memory accesses
see Memory accesses on page A8-13.
Encoding T1    ARMv6T2, ARMv7
    LDRSH<c> <Rt>,[<Rn>,#<imm12>]

    15-4          3-0  |  15-12  11-0
    111110011011  Rn   |  Rt     imm12

    if Rn == ‘1111’ then SEE LDRSH (literal);
    if Rt == ‘1111’ then SEE “Unallocated memory hints”;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm12, 32);
    index = TRUE; add = TRUE; wback = FALSE;
    if t == 13 then UNPREDICTABLE;

Encoding T2    ARMv6T2, ARMv7
    LDRSH<c> <Rt>,[<Rn>,#-<imm8>]
    LDRSH<c> <Rt>,[<Rn>],#+/-<imm8>
    LDRSH<c> <Rt>,[<Rn>,#+/-<imm8>]!

    15-4          3-0  |  15-12  11  10  9  8  7-0
    111110010011  Rn   |  Rt     1   P   U  W  imm8

    if Rn == ‘1111’ then SEE LDRSH (literal);
    if Rt == ‘1111’ && P == ‘1’ && U == ‘0’ && W == ‘0’ then SEE “Unallocated memory hints”;
    if P == ‘1’ && U == ‘1’ && W == ‘0’ then SEE LDRSHT;
    if P == ‘0’ && W == ‘0’ then UNDEFINED;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm8, 32);
    index = (P == ‘1’); add = (U == ‘1’); wback = (W == ‘1’);
    if BadReg(t) || (wback && n == t) then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSH<c> <Rt>,[<Rn>{,#+/-<imm8>}]
    LDRSH<c> <Rt>,[<Rn>],#+/-<imm8>
    LDRSH<c> <Rt>,[<Rn>,#+/-<imm8>]!

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8   7-4   3-0
    cond   000    P   U   1   W   1   Rn     Rt     imm4H  1111  imm4L

    if Rn == ‘1111’ then SEE LDRSH (literal);
    if P == ‘0’ && W == ‘1’ then SEE LDRSHT;
    t = UInt(Rt); n = UInt(Rn); imm32 = ZeroExtend(imm4H:imm4L, 32);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    if t == 15 || (wback && n == t) then UNPREDICTABLE;

Unallocated memory hints    See Load halfword, memory hints on page A6-26
Assembler syntax
    LDRSH<c><q> <Rt>, [<Rn> {, #+/-<imm>}]    Offset: index==TRUE, wback==FALSE
    LDRSH<c><q> <Rt>, [<Rn>, #+/-<imm>]!      Pre-indexed: index==TRUE, wback==TRUE
    LDRSH<c><q> <Rt>, [<Rn>], #+/-<imm>       Post-indexed: index==FALSE, wback==TRUE

where:
    <c><q>    See Standard assembler syntax fields on page A8-7.
    <Rt>      The destination register.
    <Rn>      The base register. The SP can be used. For PC use see LDRSH (literal) on page A8-170.
    +/-       Is + or omitted to indicate that the immediate offset is added to the base register value
              (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different
              instructions are generated for #0 and #-0.
    <imm>     The immediate offset used to form the address. Values are 0-4095 for encoding T1, and
              0-255 for encoding T2 or A1. For the offset syntax, <imm> can be omitted, meaning an offset
              of 0.

The pre-UAL syntax LDR<c>SH is equivalent to LDRSH<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(n);
        offset_addr = if add then (R[n] + imm32) else (R[n] - imm32);
        address = if index then offset_addr else R[n];
        data = MemU[address,2];
        if wback then R[n] = offset_addr;
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = SignExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
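
The final step of the LDRSH Operation pseudocode, where the result depends on the alignment of the address, can be sketched in Python as follows. The function name ldrsh_result, the UNKNOWN placeholder, the unaligned_support parameter, and the little-endian bytes object used as memory are assumptions made for this example.

```python
UNKNOWN = None  # stands in for "bits(32) UNKNOWN" in this sketch

def ldrsh_result(mem, address, unaligned_support=True):
    """Value written to R[t] by a halfword sign-extending load.

    mem is a bytes-like object, little-endian data assumed (hypothetical model).
    """
    data = int.from_bytes(mem[address:address + 2], "little")
    if unaligned_support or (address & 1) == 0:
        # SignExtend(data, 32): replicate bit 15 into bits 31-16
        return (data - 0x10000 if data & 0x8000 else data) & 0xFFFFFFFF
    # Misaligned access without unaligned support: can only apply before ARMv7
    return UNKNOWN
```

With memory bytes 0x34, 0x12, 0xFE, 0xFF, a load from address 0 gives 0x00001234 and a load from address 2 gives 0xFFFFFFFE. A load from the odd address 1 gives UNKNOWN when unaligned_support is False.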
A8.6.83 LDRSH (literal)
Load Register Signed Halfword (literal) calculates an address from the PC value and an immediate offset,
loads a halfword from memory, sign-extends it to form a 32-bit word, and writes it to a register. For
information about memory accesses see Memory accesses on page A8-13.
Encoding T1    ARMv6T2, ARMv7
    LDRSH<c> <Rt>,<label>
    LDRSH<c> <Rt>,[PC,#-0]    Special case

    15-8      7  6-4  3-0   |  15-12  11-0
    11111001  U  011  1111  |  Rt     imm12

    if Rt == ‘1111’ then SEE “Unallocated memory hints”;
    t = UInt(Rt); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
    if t == 13 then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSH<c> <Rt>,<label>
    LDRSH<c> <Rt>,[PC,#-0]    Special case

    31-28  27-25  24   23  22  21   20  19-16  15-12  11-8   7-4   3-0
    cond   000    (1)  U   1   (0)  1   1111   Rt     imm4H  1111  imm4L

    t = UInt(Rt); imm32 = ZeroExtend(imm4H:imm4L, 32); add = (U == ‘1’);
    if t == 15 then UNPREDICTABLE;

Unallocated memory hints    See Load halfword, memory hints on page A6-26
Assembler syntax
    LDRSH<c><q> <Rt>, <label>            Normal form
    LDRSH<c><q> <Rt>, [PC, #+/-<imm>]    Alternative form

where:
    <c><q>     See Standard assembler syntax fields on page A8-7.
    <Rt>       The destination register.
    <label>    The label of the literal data item that is to be loaded into <Rt>. The assembler calculates the
               required value of the offset from the Align(PC,4) value of this instruction to this label.
               Permitted values of the offset are:
                   Encoding T1    any value in the range -4095 to 4095
                   Encoding A1    any value in the range -255 to 255.
               If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
               If the offset is negative, imm32 is equal to minus the offset and add == FALSE.

The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be
specified separately, including permitting a subtraction of 0 that cannot be specified using the normal
syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.

The pre-UAL syntax LDR<c>SH is equivalent to LDRSH<c>.

Operation
    if ConditionPassed() then
        EncodingSpecificOperations(); NullCheckIfThumbEE(15);
        base = Align(PC,4);
        address = if add then (base + imm32) else (base - imm32);
        data = MemU[address,2];
        if UnalignedSupport() || address<0> == ‘0’ then
            R[t] = SignExtend(data, 32);
        else // Can only apply before ARMv7
            R[t] = bits(32) UNKNOWN;

Exceptions
    Data Abort.
A8.6.84 LDRSH (register)
Load Register Signed Halfword (register) calculates an address from a base register value and an offset
register value, loads a halfword from memory, sign-extends it to form a 32-bit word, and writes it to a
register. The offset register value can be shifted left by 0, 1, 2, or 3 bits. For information about memory
accesses see Memory accesses on page A8-13.
Encoding T1    ARMv4T, ARMv5T*, ARMv6*, ARMv7
    LDRSH<c> <Rt>,[<Rn>,<Rm>]

    15-9     8-6  5-3  2-0
    0101111  Rm   Rn   Rt

    if CurrentInstrSet() == InstrSet_ThumbEE then SEE “Modified operation in ThumbEE”;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, 0);

Encoding T2    ARMv6T2, ARMv7
    LDRSH<c>.W <Rt>,[<Rn>,<Rm>{,LSL #<imm2>}]

    15-4          3-0  |  15-12  11-6    5-4   3-0
    111110010011  Rn   |  Rt     000000  imm2  Rm

    if Rn == ‘1111’ then SEE LDRSH (literal);
    if Rt == ‘1111’ then SEE “Unallocated memory hints”;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = TRUE; add = TRUE; wback = FALSE;
    (shift_t, shift_n) = (SRType_LSL, UInt(imm2));
    if t == 13 || BadReg(m) then UNPREDICTABLE;

Encoding A1    ARMv4*, ARMv5T*, ARMv6*, ARMv7
    LDRSH<c> <Rt>,[<Rn>,+/-<Rm>]{!}
    LDRSH<c> <Rt>,[<Rn>],+/-<Rm>

    31-28  27-25  24  23  22  21  20  19-16  15-12  11-8          7-4   3-0
    cond   000    P   U   0   W   1   Rn     Rt     (0)(0)(0)(0)  1111  Rm

    if P == ‘0’ && W == ‘1’ then SEE LDRSHT;
    t = UInt(Rt); n = UInt(Rn); m = UInt(Rm);
    index = (P == ‘1’); add = (U == ‘1’); wback = (P == ‘0’) || (W == ‘1’);
    (shift_t, shift_n) = (SRType_LSL, 0);
    if t == 15 || m == 15 then UNPREDICTABLE;
    if wback && (n == 15 || n == t) then UNPREDICTABLE;
    if ArchVersion() < 6 && wback && m == n then UNPREDICTABLE;

Unallocated memory hints         See Load halfword, memory hints on page A6-26
Modified operation in ThumbEE    See LDRSH (register) on page A9-11
Assembler syntax
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rt>      The destination register.
<Rn>      The base register. The SP can be used. The PC can be used only in the ARM instruction set.
+/-       Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register
          value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only,
          add == FALSE).
<Rm>      Contains the offset that is optionally left shifted and added to the value of <Rn> to form
          the address.
<imm>     If present, the size of the left shift to apply to the value from <Rm>, in the range 1-3.
          Only encoding T2 is permitted, and <imm> is encoded in imm2.
          If absent, no shift is specified and all encodings are permitted. In encoding T2, imm2 is
          encoded as 0b00.
The pre-UAL syntax LDR<c>SH is equivalent to LDRSH<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset = Shift(R[m], shift_t, shift_n, APSR.C);
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if index then offset_addr else R[n];
    data = MemU[address,2];
    if wback then R[n] = offset_addr;
    if UnalignedSupport() || address<0> == ‘0’ then
        R[t] = SignExtend(data, 32);
    else // Can only apply before ARMv7
        R[t] = bits(32) UNKNOWN;
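The address calculation and sign extension above can be sketched in Python. This is a minimal model, not the architectural definition: the register list `regs`, the byte-addressed `memory` dictionary, and the helper names are hypothetical, the shift is fixed to LSL as in the three encodings, memory is assumed little-endian, and aborts and the pre-ARMv7 unaligned case are ignored.

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of `value` to a 32-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFFFFFF


def ldrsh_register(regs, memory, t, n, m, shift_n=0,
                   index=True, add=True, wback=False):
    """Model LDRSH (register) on a toy machine (illustrative only)."""
    offset = (regs[m] << shift_n) & 0xFFFFFFFF        # Shift(R[m], LSL, shift_n)
    base = regs[n]
    offset_addr = (base + offset if add else base - offset) & 0xFFFFFFFF
    address = offset_addr if index else base
    # Little-endian halfword read, standing in for MemU[address,2].
    data = memory[address] | (memory[(address + 1) & 0xFFFFFFFF] << 8)
    if wback:
        regs[n] = offset_addr
    regs[t] = sign_extend(data, 16)


regs = [0] * 16
regs[1], regs[2] = 0x1000, 4
memory = {0x1008: 0xFE, 0x1009: 0xFF}      # halfword 0xFFFE, that is -2
ldrsh_register(regs, memory, t=0, n=1, m=2, shift_n=1)   # LDRSH r0, [r1, r2, LSL #1]
print(hex(regs[0]))                        # 0xfffffffe
```

Passing `index=False, wback=True` models the post-indexed ARM form, where the access uses the unmodified base and the base is updated afterwards.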
Exceptions
Data Abort.
LDRSH<c><q> <Rt>, [<Rn>, <Rm>{, LSL #<imm>}]    Offset: index==TRUE, wback==FALSE
LDRSH<c><q> <Rt>, [<Rn>, +/-<Rm>]               Offset: index==TRUE, wback==FALSE
LDRSH<c><q> <Rt>, [<Rn>, +/-<Rm>]!              Pre-indexed: index==TRUE, wback==TRUE
LDRSH<c><q> <Rt>, [<Rn>], +/-<Rm>               Post-indexed: index==FALSE, wback==TRUE
A8.6.85 LDRSHT
Load Register Signed Halfword Unprivileged loads a halfword from memory, sign-extends it to form a
32-bit word, and writes it to a register. For information about memory accesses see Memory accesses on
page A8-13.
The memory access is restricted as if the processor were running in User mode. (This makes no difference
if the processor is actually running in User mode.)
The Thumb instruction uses an offset addressing mode, which calculates the address used for the memory
access from a base register value and an immediate offset, and leaves the base register unchanged.
The ARM instruction uses a post-indexed addressing mode, which uses the base register value as the address
for the memory access, calculates a new address from the base register value and an offset, and writes the
new address back to the base register. The offset can be an immediate value or a register value.
if Rn == ‘1111’ then SEE LDRSH (literal);
t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
if BadReg(t) then UNPREDICTABLE;
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == ‘1’);
register_form = FALSE; imm32 = ZeroExtend(imm4H:imm4L, 32);
if t == 15 || n == 15 || n == t then UNPREDICTABLE;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == ‘1’);
register_form = TRUE;
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;
Encoding T1 ARMv6T2, ARMv7
LDRSHT<c> <Rt>,[<Rn>,#<imm8>]
15141312111098765432101514131211109876543210
111110010011 Rn Rt 1110 imm8
Encoding A1 ARMv6T2, ARMv7
LDRSHT<c> <Rt>, [<Rn>] {, #+/-<imm8>}
313029282726252423222120191817161514131211109876543210
cond 0000U111 Rn Rt imm4H 1111 imm4L
Encoding A2 ARMv6T2, ARMv7
LDRSHT<c> <Rt>, [<Rn>], +/-<Rm>
313029282726252423222120191817161514131211109876543210
cond 0000U011 Rn Rt (0)(0)(0)(0)1111 Rm
Assembler syntax
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rt>      The destination register.
<Rn>      The base register. The SP can be used.
+/-       Is + or omitted if <imm> or the value of <Rm> is to be added to the base register value
          (add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
<imm>     The immediate offset applied to the value of <Rn>. Any value in the range 0-255 is
          permitted. <imm> can be omitted, meaning an offset of 0.
<Rm>      Contains the offset that is applied to the value of <Rn> to form the address.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset = if register_form then R[m] else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    data = MemU_unpriv[address,2];
    if postindex then R[n] = offset_addr;
    if UnalignedSupport() || address<0> == ‘0’ then
        R[t] = SignExtend(data, 32);
    else // Can only apply before ARMv7
        R[t] = bits(32) UNKNOWN;
Exceptions
Data Abort.
LDRSHT<c><q> <Rt>, [<Rn> {, #<imm>}]       Offset: Thumb only
LDRSHT<c><q> <Rt>, [<Rn>] {, #+/-<imm>}    Post-indexed: ARM only
LDRSHT<c><q> <Rt>, [<Rn>], +/-<Rm>         Post-indexed: ARM only
A8.6.86 LDRT
Load Register Unprivileged loads a word from memory, and writes it to a register. For information about
memory accesses see Memory accesses on page A8-13.
The memory access is restricted as if the processor were running in User mode. (This makes no difference
if the processor is actually running in User mode.)
The Thumb instruction uses an offset addressing mode, which calculates the address used for the memory
access from a base register value and an immediate offset, and leaves the base register unchanged.
The ARM instruction uses a post-indexed addressing mode, which uses the base register value as the address
for the memory access, calculates a new address from the base register value and an offset, and writes the
new address back to the base register. The offset can be an immediate value or an optionally-shifted
register value.
if Rn == ‘1111’ then SEE LDR (literal);
t = UInt(Rt); n = UInt(Rn); postindex = FALSE; add = TRUE;
register_form = FALSE; imm32 = ZeroExtend(imm8, 32);
if BadReg(t) then UNPREDICTABLE;
t = UInt(Rt); n = UInt(Rn); postindex = TRUE; add = (U == ‘1’);
register_form = FALSE; imm32 = ZeroExtend(imm12, 32);
if t == 15 || n == 15 || n == t then UNPREDICTABLE;
t = UInt(Rt); n = UInt(Rn); m = UInt(Rm); postindex = TRUE; add = (U == ‘1’);
register_form = TRUE; (shift_t, shift_n) = DecodeImmShift(type, imm5);
if t == 15 || n == 15 || n == t || m == 15 then UNPREDICTABLE;
if ArchVersion() < 6 && m == n then UNPREDICTABLE;
Encoding T1 ARMv6T2, ARMv7
LDRT<c> <Rt>,[<Rn>,#<imm8>]
15141312111098765432101514131211109876543210
111110000101 Rn Rt 1110 imm8
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRT<c> <Rt>, [<Rn>] {, #+/-<imm12>}
313029282726252423222120191817161514131211109876543210
cond 0100U011 Rn Rt imm12
Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LDRT<c> <Rt>,[<Rn>],+/-<Rm>{, <shift>}
313029282726252423222120191817161514131211109876543210
cond 0110U011 Rn Rt imm5 type0 Rm
Assembler syntax
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rt>      The destination register.
<Rn>      The base register. The SP can be used.
+/-       Is + or omitted if <imm> or the optionally shifted value of <Rm> is to be added to the base
          register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only,
          add == FALSE).
<imm>     The immediate offset applied to the value of <Rn>. Values are 0-255 for encoding T1, and
          0-4095 for encoding A1. <imm> can be omitted, meaning an offset of 0.
<Rm>      Contains the offset that is optionally shifted and applied to the value of <Rn> to form the
          address.
<shift>   The shift to apply to the value read from <Rm>. If omitted, no shift is applied. Shifts applied
          to a register on page A8-10 describes the shifts and how they are encoded.
The pre-UAL syntax LDR<c>T is equivalent to LDRT<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(n);
    offset = if register_form then Shift(R[m], shift_t, shift_n, APSR.C) else imm32;
    offset_addr = if add then (R[n] + offset) else (R[n] - offset);
    address = if postindex then R[n] else offset_addr;
    data = MemU_unpriv[address,4];
    if postindex then R[n] = offset_addr;
    if t == 15 then // Only possible for encodings A1 and A2
        if address<1:0> == ‘00’ then LoadWritePC(data); else UNPREDICTABLE;
    elsif UnalignedSupport() || address<1:0> == ‘00’ then
        R[t] = data;
    else // Can only apply before ARMv7
        if CurrentInstrSet() == InstrSet_ARM then
            R[t] = ROR(data, 8*UInt(address<1:0>));
        else
            R[t] = bits(32) UNKNOWN;
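The final branch of the pseudocode, which applies only before ARMv7 in the ARM instruction set, rotates the loaded word by eight bits for each byte of misalignment. A small Python sketch of just that rotation follows; the function names are hypothetical, and `data` is taken as given rather than modelling how the memory system produces it.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    if amount == 0:
        return value
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def legacy_unaligned_word(data: int, address: int) -> int:
    """Value written to R[t] by the line ROR(data, 8*UInt(address<1:0>))."""
    return ror32(data, 8 * (address & 0b11))


for addr in (0x1000, 0x1001, 0x1002, 0x1003):
    print(hex(addr), hex(legacy_unaligned_word(0x11223344, addr)))
```

For a word-aligned address the rotation amount is zero, so the data is returned unchanged, which matches the aligned branch of the pseudocode.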
Exceptions
Data Abort.
LDRT<c><q> <Rt>, [<Rn> {, #<imm>}]             Offset: Thumb only
LDRT<c><q> <Rt>, [<Rn>] {, #+/-<imm>}          Post-indexed: ARM only
LDRT<c><q> <Rt>, [<Rn>], +/-<Rm> {, <shift>}   Post-indexed: ARM only
A8.6.87 LEAVEX
LEAVEX causes a change from ThumbEE to Thumb state, or has no effect in Thumb state. For details see
ENTERX, LEAVEX on page A9-7.
A8.6.88 LSL (immediate)
Logical Shift Left (immediate) shifts a register value left by an immediate number of bits, shifting in zeros,
and writes the result to the destination register. It can optionally update the condition flags based on the
result.
if imm5 == ‘00000’ then SEE MOV (register);
d = UInt(Rd); m = UInt(Rm); setflags = !InITBlock();
(-, shift_n) = DecodeImmShift(‘00’, imm5);
if (imm3:imm2) == ‘00000’ then SEE MOV (register);
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘00’, imm3:imm2);
if BadReg(d) || BadReg(m) then UNPREDICTABLE;
if imm5 == ‘00000’ then SEE MOV (register);
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘00’, imm5);
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LSLS <Rd>,<Rm>,#<imm5>
Outside IT block.
LSL<c> <Rd>,<Rm>,#<imm5>
Inside IT block.
1514131211109876543210
00000 imm5 Rm Rd
Encoding T2 ARMv6T2, ARMv7
LSL{S}<c>.W <Rd>,<Rm>,#<imm5>
151413121110987654321015141312111098 7 6 543210
11101010010S1111(0) imm3 Rd imm2 00 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LSL{S}<c> <Rd>,<Rm>,#<imm5>
313029282726252423222120191817161514131211109876543210
cond 0001101S(0)(0)(0)(0) Rd imm5 000 Rm
Assembler syntax
LSL{S}<c><q> {<Rd>,} <Rm>, #<imm5>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rm>      The first operand register.
<imm5>    The shift amount, in the range 1 to 31. See Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry) = Shift_C(R[m], SRType_LSL, shift_n, APSR.C);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
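As an illustration of the flag behaviour above, the following Python sketch models a flag-setting LSL by an immediate in the range 1 to 31. The function names are hypothetical, and the carry-out is taken to be the last bit shifted out of bit 31, which is the usual reading of Shift_C for LSL.

```python
def lsl_c(value: int, shift_n: int):
    """Return (result, carry_out) for a 32-bit LSL by shift_n >= 1."""
    if shift_n < 1:
        raise ValueError("shift_n must be at least 1")
    extended = (value & 0xFFFFFFFF) << shift_n
    return extended & 0xFFFFFFFF, (extended >> 32) & 1


def lsls_immediate(rm_value: int, shift_n: int):
    """Flag-setting LSL (immediate): returns (result, N, Z, C); V is untouched."""
    result, carry = lsl_c(rm_value, shift_n)
    return result, result >> 31, int(result == 0), carry


print(lsls_immediate(0x80000001, 1))    # (2, 0, 0, 1)
```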
Exceptions
None.
A8.6.89 LSL (register)
Logical Shift Left (register) shifts a register value left by a variable number of bits, shifting in zeros, and
writes the result to the destination register. The variable number of bits is read from the bottom byte of a
register. It can optionally update the condition flags based on the result.
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LSLS <Rdn>,<Rm>
Outside IT block.
LSL<c> <Rdn>,<Rm>
Inside IT block.
1514131211109876543210
0100000010 Rm Rdn
Encoding T2 ARMv6T2, ARMv7
LSL{S}<c>.W <Rd>,<Rn>,<Rm>
15141312111098765432101514131211109876543210
11111010000S Rn 1111 Rd 0000 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LSL{S}<c> <Rd>,<Rn>,<Rm>
313029282726252423222120191817161514131211109876543210
cond 0001101S(0)(0)(0)(0) Rd Rm 0001 Rn
Assembler syntax
LSL{S}<c><q> {<Rd>,} <Rn>, <Rm>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rn>      The first operand register.
<Rm>      The register whose bottom byte contains the amount to shift by.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[m]<7:0>);
    (result, carry) = Shift_C(R[n], SRType_LSL, shift_n, APSR.C);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
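Because the shift amount comes from the bottom byte of a register, it can be anything from 0 to 255. The Python sketch below shows how those cases might play out; the function name is hypothetical, and it assumes that Shift_C returns the operand and the incoming carry unchanged when the amount is zero.

```python
def lsl_register(rn_value: int, rm_value: int, carry_in: int):
    """Return (result, carry) for LSL (register) on 32-bit values."""
    shift_n = rm_value & 0xFF                 # UInt(R[m]<7:0>)
    if shift_n == 0:
        return rn_value & 0xFFFFFFFF, carry_in
    extended = (rn_value & 0xFFFFFFFF) << shift_n
    return extended & 0xFFFFFFFF, (extended >> 32) & 1


print(lsl_register(1, 0x100, 1))   # bottom byte is 0, so nothing changes
print(lsl_register(1, 32, 0))      # bit 0 is shifted out into the carry
print(lsl_register(1, 33, 1))      # everything is shifted out, carry is 0
```

Only bits<7:0> of the second register matter, so a value such as 0x100 behaves as a shift by zero.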
Exceptions
None.
A8.6.90 LSR (immediate)
Logical Shift Right (immediate) shifts a register value right by an immediate number of bits, shifting in
zeros, and writes the result to the destination register. It can optionally update the condition flags based on
the result.
d = UInt(Rd); m = UInt(Rm); setflags = !InITBlock();
(-, shift_n) = DecodeImmShift(‘01’, imm5);
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘01’, imm3:imm2);
if BadReg(d) || BadReg(m) then UNPREDICTABLE;
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(-, shift_n) = DecodeImmShift(‘01’, imm5);
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LSRS <Rd>,<Rm>,#<imm>
Outside IT block.
LSR<c> <Rd>,<Rm>,#<imm>
Inside IT block.
1514131211109876543210
00001 imm5 Rm Rd
Encoding T2 ARMv6T2, ARMv7
LSR{S}<c>.W <Rd>,<Rm>,#<imm>
151413121110987654321015141312111098 7 6 543210
11101010010S1111(0) imm3 Rd imm2 01 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LSR{S}<c> <Rd>,<Rm>,#<imm>
313029282726252423222120191817161514131211109876543210
cond 0001101S(0)(0)(0)(0) Rd imm5 010 Rm
Assembler syntax
LSR{S}<c><q> {<Rd>,} <Rm>, #<imm>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rm>      The first operand register.
<imm>     The shift amount, in the range 1 to 32. See Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (result, carry) = Shift_C(R[m], SRType_LSR, shift_n, APSR.C);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
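The shift amount ranges from 1 to 32 but is held in a 5-bit field, so one field value has to stand for 32. The sketch below assumes the usual DecodeImmShift convention for LSR, in which an all-zero field means a shift by 32; the function names are hypothetical.

```python
def decode_lsr_shift(imm5: int) -> int:
    """Map the 5-bit field to a shift amount, assuming 0 encodes 32."""
    if not 0 <= imm5 <= 31:
        raise ValueError("imm5 must fit in 5 bits")
    return 32 if imm5 == 0 else imm5


def lsr_c(value: int, shift_n: int):
    """Return (result, carry_out) for a 32-bit LSR by shift_n >= 1."""
    value &= 0xFFFFFFFF
    return value >> shift_n, (value >> (shift_n - 1)) & 1


print(lsr_c(0x80000000, decode_lsr_shift(0)))   # (0, 1)
```

A shift by 32 clears the result and moves the old bit 31 into the carry.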
Exceptions
None.
A8.6.91 LSR (register)
Logical Shift Right (register) shifts a register value right by a variable number of bits, shifting in zeros, and
writes the result to the destination register. The variable number of bits is read from the bottom byte of a
register. It can optionally update the condition flags based on the result.
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
LSRS <Rdn>,<Rm>
Outside IT block.
LSR<c> <Rdn>,<Rm>
Inside IT block.
1514131211109876543210
0100000011 Rm Rdn
Encoding T2 ARMv6T2, ARMv7
LSR{S}<c>.W <Rd>,<Rn>,<Rm>
15141312111098765432101514131211109876543210
11111010001S Rn 1111 Rd 0000 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
LSR{S}<c> <Rd>,<Rn>,<Rm>
313029282726252423222120191817161514131211109876543210
cond 0001101S(0)(0)(0)(0) Rd Rm 0011 Rn
Assembler syntax
LSR{S}<c><q> {<Rd>,} <Rn>, <Rm>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rn>      The first operand register.
<Rm>      The register whose bottom byte contains the amount to shift by.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[m]<7:0>);
    (result, carry) = Shift_C(R[n], SRType_LSR, shift_n, APSR.C);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
A8.6.92 MCR, MCR2
Move to Coprocessor from ARM core register passes the value of an ARM core register to a coprocessor.
If no coprocessor can execute the instruction, an Undefined Instruction exception is generated.
This is a generic coprocessor instruction. Some of the fields have no functionality defined by the architecture
and are free for use by the coprocessor instruction set designer. These fields are the opc1, opc2, CRn, and
CRm fields.
For more information about the coprocessors see Coprocessor support on page A2-68.
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
t = UInt(Rt); cp = UInt(coproc);
if t == 15 || (t == 13 && (CurrentInstrSet() != InstrSet_ARM)) then UNPREDICTABLE;
t = UInt(Rt); cp = UInt(coproc);
if t == 15 || (t == 13 && (CurrentInstrSet() != InstrSet_ARM)) then UNPREDICTABLE;
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv4*, ARMv5T*, ARMv6*, ARMv7 for encoding A1
MCR<c> <coproc>,<opc1>,<Rt>,<CRn>,<CRm>{,<opc2>}
15141312111098765432101514131211109876543210
11101110 opc1 0 CRn Rt coproc opc2 1 CRm
313029282726252423222120191817161514131211109876543210
cond 1110 opc1 0 CRn Rt coproc opc2 1 CRm
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv5T*, ARMv6*, ARMv7 for encoding A2
MCR2<c> <coproc>,<opc1>,<Rt>,<CRn>,<CRm>{,<opc2>}
15141312111098765432101514131211109876543210
11111110 opc1 0 CRn Rt coproc opc2 1 CRm
313029282726252423222120191817161514131211109876543210
11111110 opc1 0 CRn Rt coproc opc2 1 CRm
Advanced SIMD and VFP See 8, 16, and 32-bit transfer between ARM core and extension registers
on page A7-31
Assembler syntax
MCR{2}<c><q> <coproc>, #<opc1>, <Rt>, <CRn>, <CRm>{, #<opc2>}
where:
2         If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM MCR2 instruction must be
          unconditional.
<coproc>  The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<opc1>    Is a coprocessor-specific opcode in the range 0 to 7.
<Rt>      Is the ARM core register whose value is transferred to the coprocessor.
<CRn>     Is the destination coprocessor register.
<CRm>     Is an additional destination coprocessor register.
<opc2>    Is a coprocessor-specific opcode in the range 0 to 7. If omitted, <opc2> is assumed to be 0.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        Coproc_SendOneWord(R[t], cp, ThisInstr());
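To make the A1 field layout concrete, the following Python sketch packs the fields of an MCR instruction into a 32-bit word. The function name and argument names are hypothetical; the bit positions follow the encoding A1 diagram (cond, 1110, opc1, 0, CRn, Rt, coproc, opc2, 1, CRm), and the default condition of 0b1110 (always) is an assumption of this example.

```python
def encode_mcr_a1(coproc, opc1, rt, crn, crm, opc2=0, cond=0b1110):
    """Assemble an ARM-state MCR (encoding A1) word from its fields."""
    fields = {"cond": (cond, 4), "opc1": (opc1, 3), "crn": (crn, 4),
              "rt": (rt, 4), "coproc": (coproc, 4), "opc2": (opc2, 3),
              "crm": (crm, 4)}
    for name, (value, width) in fields.items():
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    if coproc in (0b1010, 0b1011):
        raise ValueError("coproc 101x selects Advanced SIMD and VFP encodings")
    if rt == 15:
        raise ValueError("Rt == 15 is UNPREDICTABLE")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (crn << 16) |
            (rt << 12) | (coproc << 8) | (opc2 << 5) | (1 << 4) | crm)


# MCR p15, #0, r0, c7, c10, #4
print(hex(encode_mcr_a1(coproc=15, opc1=0, rt=0, crn=7, crm=10, opc2=4)))  # 0xee070f9a
```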
Exceptions
Undefined Instruction.
A8.6.93 MCRR, MCRR2
Move to Coprocessor from two ARM core registers passes the values of two ARM core registers to a
coprocessor. If no coprocessor can execute the instruction, an Undefined Instruction exception is generated.
This is a generic coprocessor instruction. The opc1 and CRm fields have no functionality defined by the
architecture and are free for use by the coprocessor instruction set designer.
For more information about the coprocessors see Coprocessor support on page A2-68.
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
t = UInt(Rt); t2 = UInt(Rt2); cp = UInt(coproc);
if t == 15 || t2 == 15 then UNPREDICTABLE;
if (t == 13 || t2 == 13) && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
t = UInt(Rt); t2 = UInt(Rt2); cp = UInt(coproc);
if t == 15 || t2 == 15 then UNPREDICTABLE;
if (t == 13 || t2 == 13) && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv5TE*, ARMv6*, ARMv7 for encoding A1
MCRR<c> <coproc>,<opc1>,<Rt>,<Rt2>,<CRm>
15141312111098765432101514131211109876543210
111011000100 Rt2 Rt coproc opc1 CRm
313029282726252423222120191817161514131211109876543210
cond 11000100 Rt2 Rt coproc opc1 CRm
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv6*, ARMv7 for encoding A2
MCRR2<c> <coproc>,<opc1>,<Rt>,<Rt2>,<CRm>
15141312111098765432101514131211109876543210
111111000100 Rt2 Rt coproc opc1 CRm
313029282726252423222120191817161514131211109876543210
111111000100 Rt2 Rt coproc opc1 CRm
Advanced SIMD and VFP See 64-bit transfers between ARM core and extension registers on
page A7-32
Assembler syntax
MCRR{2}<c><q> <coproc>, #<opc1>, <Rt>, <Rt2>, <CRm>
where:
2         If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM MCRR2 instruction must be
          unconditional.
<coproc>  The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<opc1>    Is a coprocessor-specific opcode in the range 0 to 15.
<Rt>      Is the first ARM core register whose value is transferred to the coprocessor.
<Rt2>     Is the second ARM core register whose value is transferred to the coprocessor.
<CRm>     Is the destination coprocessor register.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        Coproc_SendTwoWords(R[t], R[t2], cp, ThisInstr());
Exceptions
Undefined Instruction.
A8.6.94 MLA
Multiply Accumulate multiplies two register values, and adds a third register value. The least significant 32
bits of the result are written to the destination register. These 32 bits do not depend on whether the source
register values are considered to be signed values or unsigned values.
In ARM code, the condition flags can optionally be updated based on the result. Use of this option adversely
affects performance on many processor implementations.
if Ra == ‘1111’ then SEE MUL;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); setflags = FALSE;
if BadReg(d) || BadReg(n) || BadReg(m) || a == 13 then UNPREDICTABLE;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra); setflags = (S == ‘1’);
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE;
if ArchVersion() < 6 && d == n then UNPREDICTABLE;
Encoding T1 ARMv6T2, ARMv7
MLA<c> <Rd>,<Rn>,<Rm>,<Ra>
151413121110987654321 01514131211109876543210
111110110000 Rn Ra Rd 0000 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MLA{S}<c> <Rd>,<Rn>,<Rm>,<Ra>
313029282726252423222120191817161514131211109876543210
cond 0000001S Rd Ra Rm 1001 Rn
Assembler syntax
MLA{S}<c><q> <Rd>, <Rn>, <Rm>, <Ra>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
          S can be specified only for the ARM instruction set.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rn>      The first operand register.
<Rm>      The second operand register.
<Ra>      The register containing the accumulate value.
The pre-UAL syntax MLA<c>S is equivalent to MLAS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    addend = SInt(R[a]); // addend = UInt(R[a]) produces the same final results
    result = operand1 * operand2 + addend;
    R[d] = result<31:0>;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result<31:0>);
        if ArchVersion() == 4 then
            APSR.C = bit UNKNOWN;
        // else APSR.C unchanged
        // APSR.V always unchanged
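The comments in the pseudocode state that the signed and unsigned interpretations give the same final result. A short Python check of that claim follows; the helper names are hypothetical. It relies on the fact that the low 32 bits of a product and sum depend only on the operands modulo 2^32.

```python
MASK32 = 0xFFFFFFFF


def sint32(x: int) -> int:
    """Interpret a 32-bit pattern as a signed two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mla(rn: int, rm: int, ra: int, signed: bool = True) -> int:
    """Low 32 bits of rn * rm + ra under either interpretation."""
    conv = sint32 if signed else (lambda v: v & MASK32)
    return (conv(rn) * conv(rm) + conv(ra)) & MASK32


samples = [(0xFFFFFFFF, 2, 5),
           (0x80000000, 0x7FFFFFFF, 1),
           (123456789, 987654321, 0xDEADBEEF)]
for rn, rm, ra in samples:
    assert mla(rn, rm, ra, signed=True) == mla(rn, rm, ra, signed=False)

print(mla(0xFFFFFFFF, 2, 5))   # 3, that is (-1 * 2) + 5
```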
Exceptions
None.
A8.6.95 MLS
Multiply and Subtract multiplies two register values, and subtracts the product from a third register value.
The least significant 32 bits of the result are written to the destination register. These 32 bits do not depend
on whether the source register values are considered to be signed values or unsigned values.
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if BadReg(d) || BadReg(n) || BadReg(m) || BadReg(a) then UNPREDICTABLE;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); a = UInt(Ra);
if d == 15 || n == 15 || m == 15 || a == 15 then UNPREDICTABLE;
Encoding T1 ARMv6T2, ARMv7
MLS<c> <Rd>,<Rn>,<Rm>,<Ra>
151413121110987654321 01514131211109876543210
111110110000 Rn Ra Rd 0001 Rm
Encoding A1 ARMv6T2, ARMv7
MLS<c> <Rd>,<Rn>,<Rm>,<Ra>
313029282726252423222120191817161514131211109876543210
cond 00000110 Rd Ra Rm 1001 Rn
Assembler syntax
MLS<c><q> <Rd>, <Rn>, <Rm>, <Ra>
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<Rn>      The first operand register.
<Rm>      The second operand register.
<Ra>      The register containing the accumulate value.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    addend = SInt(R[a]); // addend = UInt(R[a]) produces the same final results
    result = addend - operand1 * operand2;
    R[d] = result<31:0>;
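One plausible use of this operation is computing a remainder once a quotient is known, since dividend - quotient * divisor is exactly the MLS form. The Python sketch below illustrates this under that assumption; the function names are hypothetical, and the quotient is obtained with Python's integer division rather than any particular ARM instruction.

```python
def mls(rn: int, rm: int, ra: int) -> int:
    """Low 32 bits of ra - rn * rm, as written to R[d]."""
    return (ra - rn * rm) & 0xFFFFFFFF


def remainder(dividend: int, divisor: int) -> int:
    """Unsigned remainder built from a quotient followed by MLS."""
    quotient = dividend // divisor      # stands in for a divide step done elsewhere
    return mls(quotient, divisor, dividend)


print(remainder(17, 5))     # 2
print(hex(mls(3, 4, 10)))   # 0xfffffffe, that is -2 in two's complement
```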
Exceptions
None.
A8.6.96 MOV (immediate)
Move (immediate) writes an immediate value to the destination register. It can optionally update the
condition flags based on the value.
d = UInt(Rd); setflags = !InITBlock(); imm32 = ZeroExtend(imm8, 32); carry = APSR.C;
d = UInt(Rd); setflags = (S == ‘1’); (imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) then UNPREDICTABLE;
d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm4:i:imm3:imm8, 32);
if BadReg(d) then UNPREDICTABLE;
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); setflags = (S == ‘1’); (imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
d = UInt(Rd); setflags = FALSE; imm32 = ZeroExtend(imm4:imm12, 32);
if d == 15 then UNPREDICTABLE;
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
MOVS <Rd>,#<imm8>
Outside IT block.
MOV<c> <Rd>,#<imm8>
Inside IT block.
1514131211109876543210
00100 Rd imm8
Encoding T2 ARMv6T2, ARMv7
MOV{S}<c>.W <Rd>,#<const>
15141312111098765432101514131211109876543210
11110 i 00010S11110 imm3 Rd imm8
Encoding T3 ARMv6T2, ARMv7
MOVW<c> <Rd>,#<imm16>
15141312111098765432101514131211109876543210
11110 i 100100 imm4 0 imm3 Rd imm8
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MOV{S}<c> <Rd>,#<const>
313029282726252423222120191817161514131211109876543210
cond 0011101S(0)(0)(0)(0) Rd imm12
Encoding A2 ARMv6T2, ARMv7
MOVW<c> <Rd>,#<imm16>
313029282726252423222120191817161514131211109876543210
cond 00110000 imm4 Rd imm12
Assembler syntax
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register.
<const>   The immediate value to be placed in <Rd>. The range of values is 0-255 for encoding T1 and
          0-65535 for encoding T3 or A2. See Modified immediate constants in Thumb instructions
          on page A6-17 or Modified immediate constants in ARM instructions on page A5-9 for the
          range of values for encoding T2 or A1.
          When both 32-bit encodings are available for an instruction, encoding T2 or A1 is preferred
          to encoding T3 or A2 (if encoding T3 or A2 is required, use the MOVW syntax).
The pre-UAL syntax MOV<c>S is equivalent to MOVS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = imm32;
    if d == 15 then // Can only occur for encoding A1
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
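For encoding A2 (MOVW), the 16-bit immediate is split into imm4 and imm12 around the Rd field, and the decode step rebuilds it as imm4:imm12. The following Python sketch shows that split and its inverse; the function names are hypothetical and the default condition of 0b1110 (always) is an assumption of this example.

```python
def encode_movw_a2(rd: int, imm16: int, cond: int = 0b1110) -> int:
    """Assemble an ARM-state MOVW (encoding A2) word."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0-65535")
    if not 0 <= rd <= 14:
        raise ValueError("Rd == 15 is UNPREDICTABLE")
    imm4, imm12 = imm16 >> 12, imm16 & 0xFFF
    return (cond << 28) | (0b00110000 << 20) | (imm4 << 16) | (rd << 12) | imm12


def decode_movw_a2_imm(word: int) -> int:
    """Recover imm32 = ZeroExtend(imm4:imm12, 32) from an A2 word."""
    return (((word >> 16) & 0xF) << 12) | (word & 0xFFF)


word = encode_movw_a2(rd=0, imm16=0x1234)   # MOVW r0, #0x1234
print(hex(word))                            # 0xe3010234
print(hex(decode_movw_a2_imm(word)))        # 0x1234
```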
Exceptions
None.
MOV{S}<c><q> <Rd>, #<const>    All encodings permitted
MOVW<c><q> <Rd>, #<const>      Only encoding T3 or A2 permitted
A8.6.97 MOV (register)
Move (register) copies a value from a register to the destination register. It can optionally update the
condition flags based on the value.
d = UInt(D:Rd); m = UInt(Rm); setflags = FALSE;
if d == 15 && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
d = UInt(Rd); m = UInt(Rm); setflags = TRUE;
if InITBlock() then UNPREDICTABLE;
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
if (d == 13 || BadReg(m)) && setflags then UNPREDICTABLE;
if (d == 13 && BadReg(m)) || d == 15 then UNPREDICTABLE;
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
Assembler syntax
MOV{S}<c><q> <Rd>, <Rm>
where:
S         If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
Encoding T1 ARMv6*, ARMv7 if <Rd> and <Rm> both from R0-R7
ARMv4T, ARMv5T*, ARMv6*, ARMv7 otherwise
MOV<c> <Rd>,<Rm>
If <Rd> is the PC, must be outside or last in IT block.
1514131211109876543210
01000110D Rm Rd
Encoding T2 ARMv4T, ARMv5T*, ARMv6*, ARMv7
MOVS <Rd>,<Rm>
Not permitted in IT block
1514131211109876543210
0000000000 Rm Rd
Encoding T3 ARMv6T2, ARMv7
MOV{S}<c>.W <Rd>,<Rm>
151413121110987654321015141312111098 7 6 543210
11101010010S1111(0)000 Rd 0 0 00 Rm
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MOV{S}<c> <Rd>,<Rm>
313029282726252423222120191817161514131211109876543210
cond 0001101S(0)(0)(0)(0) Rd 00000000 Rm
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>      The destination register. This register can be the SP or PC. If this register is the PC and
          S is specified, see SUBS PC, LR and related instructions on page B6-25.
          If <Rd> is the PC:
          • the instruction causes a branch to the address moved to the PC
          • in the Thumb and ThumbEE instruction sets:
            - the instruction must either be outside an IT block or the last instruction of an
              IT block
            - encoding T3 is not permitted.
          In the Thumb and ThumbEE instruction sets, S must not be specified if <Rd> is the SP. If
          <Rd> is the SP and <Rm> is the SP or PC, encoding T3 is not permitted.
<Rm>      The source register. This register can be the SP or PC. In the Thumb and ThumbEE
          instruction sets, S must not be specified if <Rm> is the SP or PC.
Note
The use of the following MOV (register) instructions is deprecated:
• ones in which <Rd> is the SP or PC and <Rm> is also the SP or PC
• ones in which S is specified and <Rd> is the SP, <Rm> is the SP, or <Rm> is the PC.
See also Changing between Thumb state and ARM state on page A4-2 about the use of the MOV PC,LR
instruction.
The pre-UAL syntax MOV<c>S is equivalent to MOVS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[m];
    if d == 15 then
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            // APSR.C unchanged
            // APSR.V unchanged
Exceptions
None.
A8.6.98 MOV (shifted register)
Move (shifted register) is a pseudo-instruction for ASR, LSL, LSR, ROR, and RRX.
For details see the following sections:
ASR (immediate) on page A8-40
ASR (register) on page A8-42
LSL (immediate) on page A8-178
LSL (register) on page A8-180
LSR (immediate) on page A8-182
LSR (register) on page A8-184
ROR (immediate) on page A8-278
ROR (register) on page A8-280
RRX on page A8-282.
Assembler syntax
Table A8-4 shows the equivalences between MOV (shifted register) and other instructions.
Disassembly produces the canonical form of the instruction.
Exceptions
None.
Table A8-4 MOV (shifted register) equivalences
MOV instruction                 Canonical form
MOV{S} <Rd>,<Rm>,ASR #<n>       ASR{S} <Rd>,<Rm>,#<n>
MOV{S} <Rd>,<Rm>,LSL #<n>       LSL{S} <Rd>,<Rm>,#<n>
MOV{S} <Rd>,<Rm>,LSR #<n>       LSR{S} <Rd>,<Rm>,#<n>
MOV{S} <Rd>,<Rm>,ROR #<n>       ROR{S} <Rd>,<Rm>,#<n>
MOV{S} <Rd>,<Rm>,ASR <Rs>       ASR{S} <Rd>,<Rm>,<Rs>
MOV{S} <Rd>,<Rm>,LSL <Rs>       LSL{S} <Rd>,<Rm>,<Rs>
MOV{S} <Rd>,<Rm>,LSR <Rs>       LSR{S} <Rd>,<Rm>,<Rs>
MOV{S} <Rd>,<Rm>,ROR <Rs>       ROR{S} <Rd>,<Rm>,<Rs>
MOV{S} <Rd>,<Rm>,RRX            RRX{S} <Rd>,<Rm>
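The equivalences in Table A8-4 amount to a textual rewrite of the operands. The following Python sketch illustrates this under stated assumptions: the name canonical_form and the regular expression are introduced here for illustration and are not part of the architecture or of any assembler. It handles only the operand forms listed in the table and ignores condition codes and qualifiers.

```python
import re

# Matches "MOV{S} <Rd>,<Rm>,<shift> <amount>" and "MOV{S} <Rd>,<Rm>,RRX".
_MOV_SHIFT = re.compile(
    r"^MOV(?P<s>S?)\s+(?P<rd>\w+)\s*,\s*(?P<rm>\w+)\s*,\s*"
    r"(?:(?P<op>ASR|LSL|LSR|ROR)\s+(?P<amt>#?\w+)|(?P<rrx>RRX))$",
    re.IGNORECASE,
)


def canonical_form(text):
    """Return the canonical shift instruction for a MOV (shifted register) string.

    Strings that do not match one of the Table A8-4 forms are returned unchanged.
    """
    m = _MOV_SHIFT.match(text.strip())
    if m is None:
        return text
    s = m.group("s").upper()
    if m.group("rrx"):
        return f"RRX{s} {m.group('rd')},{m.group('rm')}"
    op = m.group("op").upper()
    return f"{op}{s} {m.group('rd')},{m.group('rm')},{m.group('amt')}"
```

For example, under these assumptions canonical_form("MOVS R2,R3,LSL R4") gives "LSLS R2,R3,R4", while a plain "MOV R0,R1" is returned as written.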
A8.6.99 MOVT
Move Top writes an immediate value to the top halfword of the destination register. It does not affect the
contents of the bottom halfword.
Encoding T1 ARMv6T2, ARMv7
MOVT<c> <Rd>,#<imm16>
First halfword [15:0]: 11110 | i | 101100 | imm4
Second halfword [15:0]: 0 | imm3 | Rd | imm8
d = UInt(Rd); imm16 = imm4:i:imm3:imm8;
if BadReg(d) then UNPREDICTABLE;
Encoding A1 ARMv6T2, ARMv7
MOVT<c> <Rd>,#<imm16>
Bits [31:0]: cond | 00110100 | imm4 | Rd | imm12
d = UInt(Rd); imm16 = imm4:imm12;
if d == 15 then UNPREDICTABLE;
Assembler syntax
MOVT<c><q> <Rd>, #<imm16>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<imm16>
The immediate value to be written to <Rd>. It must be in the range 0-65535.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    R[d]<31:16> = imm16;
    // R[d]<15:0> unchanged
Exceptions
None.
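Because MOVT leaves the bottom halfword untouched, it can complete a 32-bit constant whose low halfword is already in the register. The Python sketch below models only the register update from the Operation pseudocode; the names movt and split_const32 are hypothetical helpers introduced for illustration.

```python
MASK32 = 0xFFFFFFFF


def movt(rd_value, imm16):
    """Model R[d]<31:16> = imm16, leaving R[d]<15:0> unchanged."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must be in the range 0-65535")
    return ((imm16 << 16) | (rd_value & 0xFFFF)) & MASK32


def split_const32(value):
    """Split a 32-bit constant into (low halfword, high halfword)."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF
```

A register that already holds the low halfword of a constant therefore ends up holding the whole constant once movt is applied with the high halfword.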
A8.6.100 MRC, MRC2
Move to ARM core register from Coprocessor causes a coprocessor to transfer a value to an ARM core
register or to the condition flags. If no coprocessor can execute the instruction, an Undefined Instruction
exception is generated.
This is a generic coprocessor instruction. Some of the fields have no functionality defined by the architecture
and are free for use by the coprocessor instruction set designer. These fields are the opc1, opc2, CRn, and
CRm fields.
For more information about the coprocessors see Coprocessor support on page A2-68.
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv4*, ARMv5T*, ARMv6*, ARMv7 for encoding A1
MRC<c> <coproc>,<opc1>,<Rt>,<CRn>,<CRm>{,<opc2>}
T1 first halfword [15:0]: 11101110 | opc1 | 1 | CRn
T1 second halfword [15:0]: Rt | coproc | opc2 | 1 | CRm
A1 bits [31:0]: cond | 1110 | opc1 | 1 | CRn | Rt | coproc | opc2 | 1 | CRm
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
t = UInt(Rt); cp = UInt(coproc);
if t == 13 && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv5T*, ARMv6*, ARMv7 for encoding A2
MRC2<c> <coproc>,<opc1>,<Rt>,<CRn>,<CRm>{,<opc2>}
T2 first halfword [15:0]: 11111110 | opc1 | 1 | CRn
T2 second halfword [15:0]: Rt | coproc | opc2 | 1 | CRm
A2 bits [31:0]: 11111110 | opc1 | 1 | CRn | Rt | coproc | opc2 | 1 | CRm
t = UInt(Rt); cp = UInt(coproc);
if t == 13 && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Advanced SIMD and VFP    See 8, 16, and 32-bit transfer between ARM core and extension registers
on page A7-31
Assembler syntax
MRC{2}<c><q> <coproc>, #<opc1>, <Rt>, <CRn>, <CRm>{, #<opc2>}
where:
2
If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM MRC2 instruction must be unconditional.
<coproc>
The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<opc1>
Is a coprocessor-specific opcode in the range 0 to 7.
<Rt>
Is the destination ARM core register. This register can be R0-R14 or APSR_nzcv. The last
form writes bits [31:28] of the transferred value to the N, Z, C and V condition flags and is
specified by setting the Rt field of the encoding to 0b1111. In pre-UAL assembler syntax,
PC was written instead of APSR_nzcv to select this form.
<CRn>
Is the coprocessor register that contains the first operand.
<CRm>
Is an additional source or destination coprocessor register.
<opc2>
Is a coprocessor-specific opcode in the range 0 to 7. If omitted, <opc2> is assumed to be 0.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        value = Coproc_GetOneWord(cp, ThisInstr());
        if t != 15 then
            R[t] = value;
        else
            APSR.N = value<31>;
            APSR.Z = value<30>;
            APSR.C = value<29>;
            APSR.V = value<28>;
            // value<27:0> are not used.
Exceptions
Undefined Instruction.
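The two destinations of MRC, a core register or the condition flags, can be modelled as follows. This is a minimal Python sketch of the write-back step only; it assumes the coprocessor has already accepted the instruction and supplied a word. The function name mrc_write and the list and dictionary used for state are hypothetical.

```python
def mrc_write(t, value, regs, apsr):
    """Apply the result of an MRC transfer.

    t     -- value of the Rt field (0-15); 15 selects the APSR_nzcv form
    value -- 32-bit word supplied by the coprocessor
    regs  -- list of 15 core register values (R0-R14), updated in place
    apsr  -- dict with keys 'N', 'Z', 'C', 'V', updated in place
    """
    value &= 0xFFFFFFFF
    if t != 15:
        regs[t] = value
    else:
        apsr["N"] = (value >> 31) & 1
        apsr["Z"] = (value >> 30) & 1
        apsr["C"] = (value >> 29) & 1
        apsr["V"] = (value >> 28) & 1
        # value<27:0> are not used in this form
```

With t equal to 15, a transferred word of 0xA0000000 sets N and C and clears Z and V, and no core register is written.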
A8.6.101 MRRC, MRRC2
Move to two ARM core registers from Coprocessor causes a coprocessor to transfer values to two ARM
core registers. If no coprocessor can execute the instruction, an Undefined Instruction exception is
generated.
This is a generic coprocessor instruction. The opc1 and CRm fields have no functionality defined by the
architecture and are free for use by the coprocessor instruction set designer.
For more information about the coprocessors see Coprocessor support on page A2-68.
Encoding T1 / A1 ARMv6T2, ARMv7 for encoding T1
ARMv5TE*, ARMv6*, ARMv7 for encoding A1
MRRC<c> <coproc>,<opc>,<Rt>,<Rt2>,<CRm>
T1 first halfword [15:0]: 111011000101 | Rt2
T1 second halfword [15:0]: Rt | coproc | opc1 | CRm
A1 bits [31:0]: cond | 11000101 | Rt2 | Rt | coproc | opc1 | CRm
if coproc == ‘101x’ then SEE “Advanced SIMD and VFP”;
t = UInt(Rt); t2 = UInt(Rt2); cp = UInt(coproc);
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE;
if (t == 13 || t2 == 13) && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Encoding T2 / A2 ARMv6T2, ARMv7 for encoding T2
ARMv6*, ARMv7 for encoding A2
MRRC2<c> <coproc>,<opc>,<Rt>,<Rt2>,<CRm>
T2 first halfword [15:0]: 111111000101 | Rt2
T2 second halfword [15:0]: Rt | coproc | opc1 | CRm
A2 bits [31:0]: 111111000101 | Rt2 | Rt | coproc | opc1 | CRm
t = UInt(Rt); t2 = UInt(Rt2); cp = UInt(coproc);
if t == 15 || t2 == 15 || t == t2 then UNPREDICTABLE;
if (t == 13 || t2 == 13) && (CurrentInstrSet() != InstrSet_ARM) then UNPREDICTABLE;
Advanced SIMD and VFP    See 64-bit transfers between ARM core and extension registers on
page A7-32
Assembler syntax
MRRC{2}<c><q> <coproc>, #<opc1>, <Rt>, <Rt2>, <CRm>
where:
2
If specified, selects encoding T2 / A2. If omitted, selects encoding T1 / A1.
<c><q>
See Standard assembler syntax fields on page A8-7. An ARM MRRC2 instruction must be unconditional.
<coproc>
The name of the coprocessor. The standard generic coprocessor names are p0, p1, …, p15.
<opc1>
Is a coprocessor-specific opcode in the range 0 to 15.
<Rt>
Is the first destination ARM core register.
<Rt2>
Is the second destination ARM core register.
<CRm>
Is the coprocessor register that supplies the data to be transferred.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if !Coproc_Accepted(cp, ThisInstr()) then
        GenerateCoprocessorException();
    else
        (R[t], R[t2]) = Coproc_GetTwoWords(cp, ThisInstr());
Exceptions
Undefined Instruction.
A8.6.102 MRS
Move to Register from Special Register moves the value from the APSR into a general-purpose register.
For details of system level use of this instruction, see MRS on page B6-10.
Encoding T1 ARMv6T2, ARMv7
MRS<c> <Rd>,<spec_reg>
First halfword [15:0]: 111100111110 | (1)(1)(1)(1)
Second halfword [15:0]: 10 | (0) | 0 | Rd | (0)(0)(0)(0)(0)(0)(0)(0)
d = UInt(Rd);
if BadReg(d) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MRS<c> <Rd>,<spec_reg>
Bits [31:0]: cond | 00010000 | (1)(1)(1)(1) | Rd | (0)(0)(0)(0) | 0000 | (0)(0)(0)(0)
d = UInt(Rd);
if d == 15 then UNPREDICTABLE;
Assembler syntax
MRS<c><q> <Rd>, <spec_reg>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<spec_reg>
Is one of:
• APSR
• CPSR.
ARM recommends the APSR form in application level code. For more information, see The
Application Program Status Register (APSR) on page A2-14.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    R[d] = APSR;
Exceptions
None.
A8.6.103 MSR (immediate)
Move immediate value to Special Register moves selected bits of an immediate value to the corresponding
bits in the APSR.
For details of system level use of this instruction, see MSR (immediate) on page B6-12.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MSR<c> <spec_reg>,#<const>
Bits [31:0]: cond | 00110010 | mask | 00 | (1)(1)(1)(1) | imm12
if mask == ‘00’ then SEE “Related encodings”;
imm32 = ARMExpandImm(imm12); write_nzcvq = (mask<1> == ‘1’); write_g = (mask<0> == ‘1’);
Assembler syntax
MSR<c><q> <spec_reg>, #<imm>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<spec_reg>
Is one of:
• APSR_<bits>
• CPSR_<fields>.
ARM recommends the APSR forms in application level code. For more information,
see The Application Program Status Register (APSR) on page A2-14.
<imm>
Is the immediate value to be transferred to <spec_reg>. See Modified immediate
constants in ARM instructions on page A5-9 for the range of values.
<bits>
Is one of nzcvq, g, or nzcvqg.
In the A and R profiles:
• APSR_nzcvq is the same as CPSR_f
• APSR_g is the same as CPSR_s
• APSR_nzcvqg is the same as CPSR_fs.
<fields>
Is a sequence of one or more of the following: s, f.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if write_nzcvq then
        APSR.N = imm32<31>;
        APSR.Z = imm32<30>;
        APSR.C = imm32<29>;
        APSR.V = imm32<28>;
        APSR.Q = imm32<27>;
    if write_g then
        APSR.GE = imm32<19:16>;
Exceptions
None.
Usage
For details of the APSR see The Application Program Status Register (APSR) on page A2-14. Because of
the Do-Not-Modify nature of its reserved bits, the immediate form of MSR is normally only useful at the
Application level for writing to APSR_nzcvq (CPSR_f).
For the A and R profiles, MSR (immediate) on page B6-12 describes additional functionality that is available
using the reserved bits. This includes some deprecated functionality that is available in unprivileged and
privileged modes and therefore can be used at the Application level.
A8.6.104 MSR (register)
Move to Special Register from ARM core register moves selected bits of a general-purpose register to the
APSR.
For details of system level use of this instruction, see MSR (register) on page B6-14.
Encoding T1 ARMv6T2, ARMv7
MSR<c> <spec_reg>,<Rn>
First halfword [15:0]: 111100111000 | Rn
Second halfword [15:0]: 10 | (0) | 0 | mask | 00 | (0)(0)(0)(0)(0)(0)(0)(0)
n = UInt(Rn); write_nzcvq = (mask<1> == ‘1’); write_g = (mask<0> == ‘1’);
if mask == ‘00’ then UNPREDICTABLE;
if n == 15 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MSR<c> <spec_reg>,<Rn>
Bits [31:0]: cond | 00010010 | mask | 00 | (1)(1)(1)(1) | (0)(0)(0)(0) | 0000 | Rn
n = UInt(Rn); write_nzcvq = (mask<1> == ‘1’); write_g = (mask<0> == ‘1’);
if mask == ‘00’ then UNPREDICTABLE;
if n == 15 then UNPREDICTABLE;
Assembler syntax
MSR<c><q> <spec_reg>, <Rn>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
<spec_reg>
Is one of:
• APSR_<bits>
• CPSR_<fields>.
ARM recommends the APSR forms in application level code. For more information, see The
Application Program Status Register (APSR) on page A2-14.
<Rn>
Is the general-purpose register to be transferred to <spec_reg>.
<bits>
Is one of nzcvq, g, or nzcvqg.
In the A and R profiles:
• APSR_nzcvq is the same as CPSR_f
• APSR_g is the same as CPSR_s
• APSR_nzcvqg is the same as CPSR_fs.
<fields>
Is a sequence of one or more of the following: s, f.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    if write_nzcvq then
        APSR.N = R[n]<31>;
        APSR.Z = R[n]<30>;
        APSR.C = R[n]<29>;
        APSR.V = R[n]<28>;
        APSR.Q = R[n]<27>;
    if write_g then
        APSR.GE = R[n]<19:16>;
Exceptions
None.
Usage
For details of the APSR see The Application Program Status Register (APSR) on page A2-14. Because of
the Do-Not-Modify nature of its reserved bits, a read / modify / write sequence is normally needed when the
MSR instruction is being used at Application level and its destination is not APSR_nzcvq (CPSR_f).
For the A and R profiles, MSR (register) on page B6-14 describes additional functionality that is available
using the reserved bits. This includes some deprecated functionality that is available in unprivileged and
privileged modes and therefore can be used at the Application level.
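The effect of the mask field on an application-level MSR (register) write can be sketched in Python as below. The bit positions come from the Operation pseudocode (N, Z, C, V, Q in bits 31:27 and GE in bits 19:16); the function name msr_apsr and the use of a plain integer for the APSR are assumptions made for illustration. All other APSR bits are left as they were, matching the Do-Not-Modify requirement.

```python
def msr_apsr(apsr, rn_value, mask):
    """Return the 32-bit APSR value after MSR (register).

    mask is the 2-bit mask field: bit 1 selects N, Z, C, V and Q,
    bit 0 selects the GE bits.
    """
    if mask == 0:
        raise ValueError("mask == '00' is UNPREDICTABLE")
    field_mask = 0
    if mask & 0b10:
        field_mask |= 0xF8000000  # APSR<31:27>: N, Z, C, V, Q
    if mask & 0b01:
        field_mask |= 0x000F0000  # APSR<19:16>: GE
    return ((apsr & ~field_mask) | (rn_value & field_mask)) & 0xFFFFFFFF
```

Writing with mask 0b10 corresponds to APSR_nzcvq, 0b01 to APSR_g, and 0b11 to APSR_nzcvqg.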
A8.6.105 MUL
Multiply multiplies two register values. The least significant 32 bits of the result are written to the
destination register. These 32 bits do not depend on whether the source register values are considered to be
signed values or unsigned values.
Optionally, it can update the condition flags based on the result. In the Thumb instruction set, this option is
limited to only a few forms of the instruction. Use of this option adversely affects performance on many
processor implementations.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
MULS <Rdm>,<Rn>,<Rdm>    Outside IT block.
MUL<c> <Rdm>,<Rn>,<Rdm>    Inside IT block.
Bits [15:0]: 0100001101 | Rn | Rdm
d = UInt(Rdm); n = UInt(Rn); m = UInt(Rdm); setflags = !InITBlock();
if ArchVersion() < 6 && d == n then UNPREDICTABLE;
Encoding T2 ARMv6T2, ARMv7
MUL<c> <Rd>,<Rn>,<Rm>
First halfword [15:0]: 111110110000 | Rn
Second halfword [15:0]: 1111 | Rd | 0000 | Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = FALSE;
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MUL{S}<c> <Rd>,<Rn>,<Rm>
Bits [31:0]: cond | 0000000 | S | Rd | (0)(0)(0)(0) | Rm | 1001 | Rn
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
if ArchVersion() < 6 && d == n then UNPREDICTABLE;
Assembler syntax
MUL{S}<c><q> {<Rd>,} <Rn>, <Rm>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
In the Thumb instruction set, S can be specified only if both <Rn> and <Rm> are R0-R7 and
the instruction is outside an IT block.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The second operand register.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    operand1 = SInt(R[n]); // operand1 = UInt(R[n]) produces the same final results
    operand2 = SInt(R[m]); // operand2 = UInt(R[m]) produces the same final results
    result = operand1 * operand2;
    R[d] = result<31:0>;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result<31:0>);
        if ArchVersion() == 4 then
            APSR.C = bit UNKNOWN;
        // else APSR.C unchanged
        // APSR.V always unchanged
Exceptions
None.
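The statement that the low 32 bits of the product do not depend on the signedness of the operands can be checked with a short Python model. This is an illustrative sketch only; the names sint32 and mul32 are hypothetical, and the flags are derived from the 32-bit value written to the destination register.

```python
MASK32 = 0xFFFFFFFF


def sint32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def mul32(rn, rm, signed):
    """Return (R[d], N, Z) for MUL, treating the operands as signed or unsigned."""
    a, b = (sint32(rn), sint32(rm)) if signed else (rn, rm)
    result = (a * b) & MASK32
    return result, result >> 31, int(result == 0)
```

Both interpretations give identical results for the same register contents, including cases where the full-width products differ, such as 0x80000000 multiplied by 2.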
A8.6.106 MVN (immediate)
Bitwise NOT (immediate) writes the bitwise inverse of an immediate value to the destination register. It can
optionally update the condition flags based on the value.
Encoding T1 ARMv6T2, ARMv7
MVN{S}<c> <Rd>,#<const>
First halfword [15:0]: 11110 | i | 00011 | S | 1111
Second halfword [15:0]: 0 | imm3 | Rd | imm8
d = UInt(Rd); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MVN{S}<c> <Rd>,#<const>
Bits [31:0]: cond | 0011111 | S | (0)(0)(0)(0) | Rd | imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); setflags = (S == ‘1’);
(imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
Assembler syntax
MVN{S}<c><q> <Rd>, #<const>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<const>
The immediate value to be bitwise inverted. See Modified immediate constants in Thumb
instructions on page A6-17 or Modified immediate constants in ARM instructions on
page A5-9 for the range of values.
The pre-UAL syntax MVN<c>S is equivalent to MVNS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = NOT(imm32);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
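For the ARM encoding, the value and carry produced by ARMExpandImm_C feed directly into the result and the C flag. The Python sketch below models the ARM modified-immediate expansion (an 8-bit value rotated right by twice a 4-bit count, as described for modified immediate constants in ARM instructions) followed by the inversion. The function names are hypothetical, and the sketch covers only the case where <Rd> is not the PC.

```python
MASK32 = 0xFFFFFFFF


def arm_expand_imm_c(imm12, carry_in):
    """Expand an ARM modified immediate; return (imm32, carry_out)."""
    rot = 2 * ((imm12 >> 8) & 0xF)
    imm8 = imm12 & 0xFF
    if rot == 0:
        return imm8, carry_in  # no rotation: carry passes through
    imm32 = ((imm8 >> rot) | (imm8 << (32 - rot))) & MASK32
    return imm32, imm32 >> 31


def mvn_imm_arm(imm12, carry_in):
    """Return (result, flags) for MVNS <Rd>,#<const> in the ARM instruction set."""
    imm32, carry = arm_expand_imm_c(imm12, carry_in)
    result = ~imm32 & MASK32
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}
```

With imm12 equal to 0x4FF the expanded constant is 0xFF000000, so the register receives 0x00FFFFFF and the carry out of the rotation is 1.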
A8.6.107 MVN (register)
Bitwise NOT (register) writes the bitwise inverse of a register value to the destination register. It can
optionally update the condition flags based on the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
MVNS <Rd>,<Rm>    Outside IT block.
MVN<c> <Rd>,<Rm>    Inside IT block.
Bits [15:0]: 0100001111 | Rm | Rd
d = UInt(Rd); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
MVN{S}<c>.W <Rd>,<Rm>{,<shift>}
First halfword [15:0]: 11101010011 | S | 1111
Second halfword [15:0]: (0) | imm3 | Rd | imm2 | type | Rm
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MVN{S}<c> <Rd>,<Rm>{,<shift>}
Bits [31:0]: cond | 0001111 | S | (0)(0)(0)(0) | Rd | imm5 | type | 0 | Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
MVN{S}<c><q> <Rd>, <Rm> {, <shift>}
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rm>
The register that is optionally shifted and used as the source register.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
The pre-UAL syntax MVN<c>S is equivalent to MVNS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = NOT(shifted);
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
A8.6.108 MVN (register-shifted register)
Bitwise NOT (register-shifted register) writes the bitwise inverse of a register-shifted register value to the
destination register. It can optionally update the condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
MVN{S}<c> <Rd>,<Rm>,<type> <Rs>
Bits [31:0]: cond | 0001111 | S | (0)(0)(0)(0) | Rd | Rs | 0 | type | 1 | Rm
d = UInt(Rd); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
MVN{S}<c><q> <Rd>, <Rm>, <type> <Rs>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rm>
The register that is shifted and used as the operand.
<type>
The type of shift to apply to the value read from <Rm>. It must be one of:
ASR    Arithmetic shift right, encoded as type = 0b10
LSL    Logical shift left, encoded as type = 0b00
LSR    Logical shift right, encoded as type = 0b01
ROR    Rotate right, encoded as type = 0b11.
<Rs>
The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax MVN<c>S is equivalent to MVNS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = NOT(shifted);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
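The operation above can be modelled in Python as follows. The sketch assumes the usual definitions of the shift functions for 32-bit values (a zero shift amount returns the value and the incoming carry unchanged); the function names shift_c and mvn_rsr and the string shift-type labels are hypothetical. Note that only the bottom byte of the shift register contributes to the shift amount.

```python
MASK32 = 0xFFFFFFFF


def shift_c(value, shift_t, amount, carry_in):
    """Shift a 32-bit value by a non-negative amount; return (result, carry_out)."""
    if amount == 0:
        return value, carry_in
    if shift_t == "LSL":
        extended = value << amount
        return extended & MASK32, (extended >> 32) & 1
    if shift_t == "LSR":
        return value >> amount, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        m = amount % 32
        result = ((value >> m) | (value << (32 - m))) & MASK32
        return result, result >> 31
    raise ValueError(f"unknown shift type: {shift_t}")


def mvn_rsr(rm, shift_t, rs, carry_in):
    """Return (result, carry) for MVN <Rd>,<Rm>,<type> <Rs>."""
    shift_n = rs & 0xFF  # only R[s]<7:0> is used
    shifted, carry = shift_c(rm, shift_t, shift_n, carry_in)
    return ~shifted & MASK32, carry
```

For instance, a shift register holding 0x104 shifts by 4, because the upper bits of <Rs> are ignored.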
A8.6.109 NEG
Negate is a pre-UAL synonym for RSB (immediate) with an immediate value of 0. For details see RSB
(immediate) on page A8-284.
Assembler syntax
NEG<c><q> <Rd>, <Rm>
This is equivalent to:
RSBS<c><q> <Rd>, <Rm>, #0
Exceptions
None.
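Since NEG is written as RSBS with an immediate of 0, its result and flags follow from computing 0 minus the operand. The Python sketch below assumes the reverse subtract is evaluated as an add of the inverted operand with a carry-in of 1, which is the form described under RSB (immediate); the names add_with_carry and neg are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sint32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def add_with_carry(x, y, carry_in):
    """32-bit add with carry-in; return (result, carry_out, overflow)."""
    unsigned_sum = x + y + carry_in
    signed_sum = sint32(x) + sint32(y) + carry_in
    result = unsigned_sum & MASK32
    carry_out = int(result != unsigned_sum)
    overflow = int(sint32(result) != signed_sum)
    return result, carry_out, overflow


def neg(rm):
    """Model NEG <Rd>,<Rm> as RSBS <Rd>,<Rm>,#0, that is, 0 - Rm."""
    result, c, v = add_with_carry(~rm & MASK32, 0, 1)
    return result, {"N": result >> 31, "Z": int(result == 0), "C": c, "V": v}
```

Negating 0x80000000 yields 0x80000000 again with the V flag set, because the positive value is not representable in 32 bits.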
A8.6.110 NOP
No Operation does nothing. This instruction can be used for code alignment purposes.
See Pre-UAL pseudo-instruction NOP on page AppxC-3 for details of NOP before the introduction of UAL
and the ARMv6K and ARMv6T2 architecture variants.
Note
The timing effects of including a NOP instruction in code are not guaranteed. It can increase execution time,
leave it unchanged, or even reduce it. NOP instructions are therefore not suitable for timing loops.
Encoding T1 ARMv6T2, ARMv7
NOP<c>
Bits [15:0]: 1011111100000000
// No additional decoding required
Encoding T2 ARMv6T2, ARMv7
NOP<c>.W
First halfword [15:0]: 111100111010 | (1)(1)(1)(1)
Second halfword [15:0]: 10 | (0) | 0 | (0) | 000 | 00000000
// No additional decoding required
Encoding A1 ARMv6K, ARMv6T2, ARMv7
NOP<c>
Bits [31:0]: cond | 001100100000 | (1)(1)(1)(1) | (0)(0)(0)(0) | 00000000
// No additional decoding required
Assembler syntax
NOP<c><q>
where:
<c><q>
See Standard assembler syntax fields on page A8-7.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    // Do nothing
Exceptions
None.
A8.6.111 ORN (immediate)
Bitwise OR NOT (immediate) performs a bitwise (inclusive) OR of a register value and the complement of
an immediate value, and writes the result to the destination register. It can optionally update the condition
flags based on the result.
Encoding T1 ARMv6T2, ARMv7
ORN{S}<c> <Rd>,<Rn>,#<const>
First halfword [15:0]: 11110 | i | 00011 | S | Rn
Second halfword [15:0]: 0 | imm3 | Rd | imm8
if Rn == ‘1111’ then SEE MVN (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) || n == 13 then UNPREDICTABLE;
Assembler syntax
ORN{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The register that contains the operand.
<const>
The immediate value to be bitwise inverted and ORed with the value obtained from <Rn>.
See Modified immediate constants in Thumb instructions on page A6-17 for the range of
values.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] OR NOT(imm32);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
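The following Python sketch models the Thumb modified-immediate expansion used by this encoding, followed by the OR NOT itself. It assumes the expansion rules given for modified immediate constants in Thumb instructions: when the top two bits of i:imm3:imm8 are zero the byte is used directly or replicated, otherwise an 8-bit value with its top bit forced to 1 is rotated right. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def thumb_expand_imm_c(imm12, carry_in):
    """Expand i:imm3:imm8; return (imm32, carry_out)."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0:
        mode = (imm12 >> 8) & 0b11
        if mode != 0 and imm8 == 0:
            raise ValueError("UNPREDICTABLE: replicated byte must be non-zero")
        imm32 = (
            imm8,                      # 00000000 00000000 00000000 abcdefgh
            imm8 << 16 | imm8,         # 00000000 abcdefgh 00000000 abcdefgh
            imm8 << 24 | imm8 << 8,    # abcdefgh 00000000 abcdefgh 00000000
            imm8 * 0x01010101,         # abcdefgh repeated in all four bytes
        )[mode]
        return imm32, carry_in
    unrotated = 0x80 | (imm12 & 0x7F)
    rot = imm12 >> 7  # imm12<11:7>, in the range 8 to 31 on this path
    imm32 = ((unrotated >> rot) | (unrotated << (32 - rot))) & MASK32
    return imm32, imm32 >> 31


def orn_imm(rn, imm12, carry_in):
    """Return (result, carry) for ORN <Rd>,<Rn>,#<const>."""
    imm32, carry = thumb_expand_imm_c(imm12, carry_in)
    return (rn | (~imm32 & MASK32)) & MASK32, carry
```

Because the constant is inverted before the OR, ORN with a small constant sets most bits of the result: with <Rn> equal to 0 and a constant of 0xFF, the result is 0xFFFFFF00.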
A8.6.112 ORN (register)
Bitwise OR NOT (register) performs a bitwise (inclusive) OR of a register value and the complement of an
optionally-shifted register value, and writes the result to the destination register. It can optionally update the
condition flags based on the result.
Encoding T1 ARMv6T2, ARMv7
ORN{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
First halfword [15:0]: 11101010011 | S | Rn
Second halfword [15:0]: (0) | imm3 | Rd | imm2 | type | Rm
if Rn == ‘1111’ then SEE MVN (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || n == 13 || BadReg(m) then UNPREDICTABLE;
Assembler syntax
ORN{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is optionally shifted and used as the second operand.
<shift>
The shift to apply to the value read from <Rm>. If omitted, no shift is applied. Shifts applied
to a register on page A8-10 describes the shifts and how they are encoded.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] OR NOT(shifted);
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
A8.6.113 ORR (immediate)
Bitwise OR (immediate) performs a bitwise (inclusive) OR of a register value and an immediate value, and
writes the result to the destination register. It can optionally update the condition flags based on the result.
Encoding T1 ARMv6T2, ARMv7
ORR{S}<c> <Rd>,<Rn>,#<const>
First halfword [15:0]: 11110 | i | 00010 | S | Rn
Second halfword [15:0]: 0 | imm3 | Rd | imm8
if Rn == ‘1111’ then SEE MOV (immediate);
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ThumbExpandImm_C(i:imm3:imm8, APSR.C);
if BadReg(d) || n == 13 then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ORR{S}<c> <Rd>,<Rn>,#<const>
Bits [31:0]: cond | 0011100 | S | Rn | Rd | imm12
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); setflags = (S == ‘1’);
(imm32, carry) = ARMExpandImm_C(imm12, APSR.C);
Assembler syntax
ORR{S}<c><q> {<Rd>,} <Rn>, #<const>
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The register that contains the operand.
<const>
The immediate value to be bitwise ORed with the value obtained from <Rn>. See Modified
immediate constants in Thumb instructions on page A6-17 or Modified immediate constants
in ARM instructions on page A5-9 for the range of values.
The pre-UAL syntax ORR<c>S is equivalent to ORRS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    result = R[n] OR imm32;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
A8.6.114 ORR (register)
Bitwise OR (register) performs a bitwise (inclusive) OR of a register value and an optionally-shifted register
value, and writes the result to the destination register. It can optionally update the condition flags based on
the result.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
ORRS <Rdn>,<Rm>    Outside IT block.
ORR<c> <Rdn>,<Rm>    Inside IT block.
Bits [15:0]: 0100001100 | Rm | Rdn
d = UInt(Rdn); n = UInt(Rdn); m = UInt(Rm); setflags = !InITBlock();
(shift_t, shift_n) = (SRType_LSL, 0);
Encoding T2 ARMv6T2, ARMv7
ORR{S}<c>.W <Rd>,<Rn>,<Rm>{,<shift>}
First halfword [15:0]: 11101010010 | S | Rn
Second halfword [15:0]: (0) | imm3 | Rd | imm2 | type | Rm
if Rn == ‘1111’ then SEE MOV (register);
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm3:imm2);
if BadReg(d) || n == 13 || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ORR{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}
Bits [31:0]: cond | 0001100 | S | Rn | Rd | imm5 | type | 0 | Rm
if Rd == ‘1111’ && S == ‘1’ then SEE SUBS PC, LR and related instructions;
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); setflags = (S == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
Assembler syntax
ORR{S}<c><q> {<Rd>,} <Rn>, <Rm> {,<shift>}
where:
S
If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>
See Standard assembler syntax fields on page A8-7.
<Rd>
The destination register.
<Rn>
The first operand register.
<Rm>
The register that is optionally shifted and used as the second operand.
<shift>
The shift to apply to the value read from <Rm>. If present, encoding T1 is not permitted. If
absent, no shift is applied and all encodings are permitted. Shifts applied to a register on
page A8-10 describes the shifts and how they are encoded.
In Thumb assembly:
• outside an IT block, if ORRS <Rd>,<Rn>,<Rd> is written with <Rd> and <Rn> both in the range R0-R7, it
is assembled using encoding T1 as though ORRS <Rd>,<Rn> had been written
• inside an IT block, if ORR<c> <Rd>,<Rn>,<Rd> is written with <Rd> and <Rn> both in the range R0-R7,
it is assembled using encoding T1 as though ORR<c> <Rd>,<Rn> had been written.
To prevent either of these happening, use the .W qualifier.
The pre-UAL syntax ORR<c>S is equivalent to ORRS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] OR shifted;
    if d == 15 then // Can only occur for ARM encoding
        ALUWritePC(result); // setflags is always FALSE here
    else
        R[d] = result;
        if setflags then
            APSR.N = result<31>;
            APSR.Z = IsZeroBit(result);
            APSR.C = carry;
            // APSR.V unchanged
Exceptions
None.
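The following Python sketch is illustrative only and is not part of the architecture definition. It models the ORR (register) operation for a destination other than the PC, under the assumption that registers are plain 32-bit unsigned integers. The names shift_c and orr_register are hypothetical, shift types are passed as strings, and RRX is not handled.

```python
MASK32 = 0xFFFFFFFF

def shift_c(value, shift_t, amount, carry_in):
    """Return (result, carry_out) for a 32-bit shift, in the spirit of Shift_C()."""
    value &= MASK32
    if amount == 0:
        return value, carry_in  # no shift: the carry passes through unchanged
    if shift_t == "LSL":
        wide = value << amount
        return wide & MASK32, (wide >> 32) & 1
    if shift_t == "LSR":
        return (value >> amount) & MASK32, (value >> (amount - 1)) & 1
    if shift_t == "ASR":
        signed = value - (1 << 32) if value >> 31 else value
        return (signed >> amount) & MASK32, (signed >> (amount - 1)) & 1
    if shift_t == "ROR":
        amount %= 32
        result = ((value >> amount) | (value << (32 - amount))) & MASK32
        return result, result >> 31
    raise ValueError("unsupported shift type: " + shift_t)

def orr_register(rn, rm, shift_t="LSL", shift_n=0, carry_in=0, setflags=False):
    """Return (result, flags); flags is None when setflags is False."""
    shifted, carry = shift_c(rm, shift_t, shift_n, carry_in)
    result = (rn | shifted) & MASK32
    if not setflags:
        return result, None
    # V is deliberately absent: the instruction leaves APSR.V unchanged.
    return result, {"N": result >> 31, "Z": int(result == 0), "C": carry}
```

For example, orr_register(0x0000F000, 0x0000000F, "LSL", 4) models ORR with a second operand of R[m] shifted left by 4. The carry flag comes from the shifter, not from the OR itself.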
A8.6.115 ORR (register-shifted register)
Bitwise OR (register-shifted register) performs a bitwise (inclusive) OR of a register value and a
register-shifted register value, and writes the result to the destination register. It can optionally update the
condition flags based on the result.
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
ORR{S}<c> <Rd>,<Rn>,<Rm>,<type> <Rs>
Bits 31-28: cond; bits 27-21: 0001100; bit 20: S; bits 19-16: Rn; bits 15-12: Rd; bits 11-8: Rs; bit 7: 0; bits 6-5: type; bit 4: 1; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); s = UInt(Rs);
setflags = (S == ‘1’); shift_t = DecodeRegShift(type);
if d == 15 || n == 15 || m == 15 || s == 15 then UNPREDICTABLE;
Assembler syntax
ORR{S}<c><q> {<Rd>,} <Rn>, <Rm>, <type> <Rs>
where:
S    If S is present, the instruction updates the flags. Otherwise, the flags are not updated.
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rn>    The first operand register.
<Rm>    The register that is shifted and used as the second operand.
<type>    The type of shift to apply to the value read from <Rm>. It must be one of:
    ASR    Arithmetic shift right, encoded as type = 0b10
    LSL    Logical shift left, encoded as type = 0b00
    LSR    Logical shift right, encoded as type = 0b01
    ROR    Rotate right, encoded as type = 0b11.
<Rs>    The register whose bottom byte contains the amount to shift by.
The pre-UAL syntax ORR<c>S is equivalent to ORRS<c>.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    shift_n = UInt(R[s]<7:0>);
    (shifted, carry) = Shift_C(R[m], shift_t, shift_n, APSR.C);
    result = R[n] OR shifted;
    R[d] = result;
    if setflags then
        APSR.N = result<31>;
        APSR.Z = IsZeroBit(result);
        APSR.C = carry;
        // APSR.V unchanged
Exceptions
None.
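As an informal illustration, the Python sketch below shows how only the bottom byte of R[s] determines the shift amount, and how amounts of 32 or more behave for each shift type. It assumes 32-bit unsigned register values and ignores the flags. The function names shift_by_register and orr_rsr are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def shift_by_register(value, shift_type, rs):
    """Shift a 32-bit value by the amount held in the bottom byte of rs."""
    amount = rs & 0xFF  # shift_n = UInt(R[s]<7:0>)
    value &= MASK32
    if amount == 0:
        return value
    if shift_type == "LSL":
        return (value << amount) & MASK32  # 32 or more gives 0
    if shift_type == "LSR":
        return value >> amount  # 32 or more gives 0
    if shift_type == "ASR":
        signed = value - (1 << 32) if value >> 31 else value
        return (signed >> amount) & MASK32  # 32 or more replicates the sign bit
    if shift_type == "ROR":
        amount %= 32  # rotation wraps modulo 32
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unsupported shift type: " + shift_type)

def orr_rsr(rn, rm, shift_type, rs):
    """Model of result = R[n] OR shifted, without flag updates."""
    return (rn | shift_by_register(rm, shift_type, rs)) & MASK32
```

With rs = 0x104 the upper bits are ignored, so orr_rsr(0x1, 0x1, "LSL", 0x104) shifts by 4, not by 260.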
A8.6.116 PKH
Pack Halfword combines one halfword of its first operand with the other halfword of its shifted second
operand.
Encoding T1 ARMv6T2, ARMv7
PKHBT<c> <Rd>,<Rn>,<Rm>{,LSL #<imm>}
PKHTB<c> <Rd>,<Rn>,<Rm>{,ASR #<imm>}
First halfword: bits 15-4: 111010101100; bits 3-0: Rn
Second halfword: bit 15: (0); bits 14-12: imm3; bits 11-8: Rd; bits 7-6: imm2; bit 5: tb; bit 4: 0; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); tbform = (tb == ‘1’);
(shift_t, shift_n) = DecodeImmShift(tb:‘0’, imm3:imm2);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv6*, ARMv7
PKHBT<c> <Rd>,<Rn>,<Rm>{,LSL #<imm>}
PKHTB<c> <Rd>,<Rn>,<Rm>{,ASR #<imm>}
Bits 31-28: cond; bits 27-20: 01101000; bits 19-16: Rn; bits 15-12: Rd; bits 11-7: imm5; bit 6: tb; bits 5-4: 01; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm); tbform = (tb == ‘1’);
(shift_t, shift_n) = DecodeImmShift(tb:‘0’, imm5);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
PKHBT<c><q> {<Rd>,} <Rn>, <Rm> {, LSL #<imm>}    tbform == FALSE
PKHTB<c><q> {<Rd>,} <Rn>, <Rm> {, ASR #<imm>}    tbform == TRUE
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rn>    The first operand register.
<Rm>    The register that is optionally shifted and used as the second operand.
<imm>    The shift to apply to the value read from <Rm>, encoded in imm3:imm2 for encoding T1 and imm5 for encoding A1.
    For PKHBT, it is one of:
    omitted    No shift, encoded as 0b00000
    1-31    Left shift by specified number of bits, encoded as a binary number.
    For PKHTB, it is one of:
    omitted    Instruction is a pseudo-instruction and is assembled as though PKHBT<c><q> <Rd>,<Rm>,<Rn> had been written
    1-32    Arithmetic right shift by specified number of bits. A shift by 32 bits is encoded as 0b00000. Other shift amounts are encoded as binary numbers.
Note
An assembler can permit <imm> = 0 to mean the same thing as omitting the shift, but this is not standard UAL and must not be used for disassembly.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    operand2 = Shift(R[m], shift_t, shift_n, APSR.C); // APSR.C ignored
    R[d]<15:0> = if tbform then operand2<15:0> else R[n]<15:0>;
    R[d]<31:16> = if tbform then R[n]<31:16> else operand2<31:16>;
Exceptions
None.
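The halfword selection can be sketched in Python as follows. This is an informal model, not part of the architecture definition. It assumes 32-bit unsigned register values and takes the already-decoded shift amount (0-31 for the LSL of PKHBT, 1-32 for the ASR of PKHTB). The function name pkh and its parameters are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def pkh(rn, rm, tbform, shift_n=0):
    """Model PKHBT (tbform False, LSL) and PKHTB (tbform True, ASR)."""
    rn &= MASK32
    rm &= MASK32
    if tbform:
        # PKHTB: top halfword from R[n], bottom halfword from R[m] ASR shift_n.
        signed = rm - (1 << 32) if rm >> 31 else rm
        operand2 = (signed >> shift_n) & MASK32
        return (rn & 0xFFFF0000) | (operand2 & 0x0000FFFF)
    # PKHBT: bottom halfword from R[n], top halfword from R[m] LSL shift_n.
    operand2 = (rm << shift_n) & MASK32
    return (operand2 & 0xFFFF0000) | (rn & 0x0000FFFF)
```

For example, pkh(0x1111AAAA, 0x2222BBBB, tbform=False, shift_n=16) combines the bottom halfword of the first operand with the shifted bottom halfword of the second, giving 0xBBBBAAAA.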
A8.6.117 PLD, PLDW (immediate)
Preload Data signals the memory system that data memory accesses from a specified address are likely in
the near future. The memory system can respond by taking actions that are expected to speed up the memory
accesses when they do occur, such as pre-loading the cache line containing the specified address into the
data cache. For more information, see Behavior of Preload Data (PLD, PLDW) and Preload Instruction
(PLI) with caches on page B2-7.
On an architecture variant that includes both the PLD and PLDW instructions, the PLD instruction signals that the likely memory access is a read, and the PLDW instruction signals that it is a write.
Encoding T1 ARMv6T2, ARMv7 for PLD; ARMv7 with MP Extensions for PLDW
PLD{W}<c> [<Rn>,#<imm12>]
First halfword: bits 15-6: 1111100010; bit 5: W; bit 4: 1; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-0: imm12
if Rn == ‘1111’ then SEE PLD (literal);
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = TRUE; is_pldw = (W == ‘1’);
Encoding T2 ARMv6T2, ARMv7 for PLD; ARMv7 with MP Extensions for PLDW
PLD{W}<c> [<Rn>,#-<imm8>]
First halfword: bits 15-6: 1111100000; bit 5: W; bit 4: 1; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: 1100; bits 7-0: imm8
if Rn == ‘1111’ then SEE PLD (literal);
n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); add = FALSE; is_pldw = (W == ‘1’);
Encoding A1 ARMv5TE*, ARMv6*, ARMv7 for PLD; ARMv7 with MP Extensions for PLDW
PLD{W} [<Rn>,#+/-<imm12>]
Bits 31-24: 11110101; bit 23: U; bit 22: R; bits 21-20: 01; bits 19-16: Rn; bits 15-12: (1)(1)(1)(1); bits 11-0: imm12
if Rn == ‘1111’ then SEE PLD (literal);
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’); is_pldw = (R == ‘0’);
Assembler syntax
PLD{W}<c><q> [<Rn> {, #+/-<imm>}]
where:
W    If specified, selects PLDW, encoded as W = 1 in Thumb encodings and R = 0 in ARM encodings. If omitted, selects PLD, encoded as W = 0 in Thumb encodings and R = 1 in ARM encodings.
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM PLD or PLDW instruction must be unconditional.
<Rn>    The base register. The SP can be used. For PC use in the PLD instruction, see PLD (literal) on page A8-238.
+/-    Is + or omitted to indicate that the immediate offset is added to the base register value (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different instructions are generated for #0 and #-0.
<imm>    The immediate offset used to form the address. This offset can be omitted, meaning an offset of 0. Values are:
    Encoding T1, A1    any value in the range 0-4095
    Encoding T2    any value in the range 0-255.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    address = if add then (R[n] + imm32) else (R[n] - imm32);
    if is_pldw then
        Hint_PreloadDataForWrite(address);
    else
        Hint_PreloadData(address);
Exceptions
None.
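As an informal aid, the Python sketch below decodes the A1 encoding of PLD and PLDW (immediate) into the values used by the operation pseudocode, and then forms the hinted address. It is not part of the architecture definition. The function names are hypothetical, the should-be-one bits 15-12 are not checked, and the example instruction words in the usage note were assembled by hand from the A1 field layout given above.

```python
MASK32 = 0xFFFFFFFF

def decode_pld_imm_a1(word):
    """Return (n, imm32, add, is_pldw) for an A1 PLD/PLDW (immediate) word."""
    if ((word >> 24) & 0xFF) != 0xF5 or ((word >> 20) & 0x3) != 0b01:
        raise ValueError("not an A1 PLD/PLDW (immediate) encoding")
    n = (word >> 16) & 0xF
    if n == 15:
        raise ValueError("Rn == '1111': see PLD (literal)")
    imm32 = word & 0xFFF             # ZeroExtend(imm12, 32)
    add = ((word >> 23) & 1) == 1    # add = (U == '1')
    is_pldw = ((word >> 22) & 1) == 0  # is_pldw = (R == '0')
    return n, imm32, add, is_pldw

def preload_address(base, imm32, add):
    """address = if add then (R[n] + imm32) else (R[n] - imm32), modulo 2^32."""
    return (base + imm32) & MASK32 if add else (base - imm32) & MASK32
```

Under these assumptions, the word 0xF5D1F004 decodes to base register R1, offset 4, add TRUE, and PLD rather than PLDW. Clearing the R bit (0xF591F004) selects PLDW, and clearing the U bit (0xF551F004) subtracts the offset.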
A8.6.118 PLD (literal)
Preload Data signals the memory system that data memory accesses from a specified address are likely in
the near future. The memory system can respond by taking actions that are expected to speed up the memory
accesses when they do occur, such as pre-loading the cache line containing the specified address into the
data cache. For more information, see Behavior of Preload Data (PLD, PLDW) and Preload Instruction
(PLI) with caches on page B2-7.
Encoding T1 ARMv6T2, ARMv7
PLD<c> <label>
PLD<c> [PC,#-0]    Special case
First halfword: bits 15-8: 11111000; bit 7: U; bit 6: 0; bit 5: (0); bit 4: 1; bits 3-0: 1111
Second halfword: bits 15-12: 1111; bits 11-0: imm12
imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
Encoding A1 ARMv5TE*, ARMv6*, ARMv7
PLD <label>
PLD [PC,#-0]    Special case
Bits 31-24: 11110101; bit 23: U; bit 22: (1); bits 21-20: 01; bits 19-16: 1111; bits 15-12: (1)(1)(1)(1); bits 11-0: imm12
imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
Assembler syntax
PLD<c><q> <label>    Normal form
PLD<c><q> [PC, #+/-<imm>]    Alternative form
where:
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM PLD instruction must be unconditional.
<label>    The label of the literal data item that is likely to be accessed in the near future. The assembler calculates the required value of the offset from the Align(PC,4) value of this instruction to the label. The offset must be in the range -4095 to 4095.
    If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
    If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
+/-    Is + or omitted to indicate that the immediate offset is added to the Align(PC,4) value (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different instructions are generated for #0 and #-0.
<imm>    The immediate offset used to form the address. Values are in the range 0-4095.
The alternative syntax permits the addition or subtraction of the offset and the immediate offset to be specified separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    address = if add then (Align(PC,4) + imm32) else (Align(PC,4) - imm32);
    Hint_PreloadData(address);
Exceptions
None.
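The label-to-offset calculation that an assembler performs can be sketched as below. This Python fragment is illustrative only. The function name literal_offset is hypothetical, and the sketch relies on a rule that is defined elsewhere in this manual, not in this section: the PC reads as the address of the current instruction plus 8 in ARM state, and plus 4 in Thumb state.

```python
def literal_offset(instr_addr, label_addr, thumb):
    """Return (imm32, add) for a PC-relative preload of label_addr."""
    pc = instr_addr + (4 if thumb else 8)  # PC value read by the instruction
    base = pc & ~3                         # Align(PC,4)
    offset = label_addr - base
    if not -4095 <= offset <= 4095:
        raise ValueError("label out of range for PLD (literal)")
    add = offset >= 0
    imm32 = offset if add else -offset
    return imm32, add
```

A zero or positive offset yields add == TRUE. The #-0 special case (imm32 of 0 with add == FALSE) cannot be produced from a label and needs the alternative syntax.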
A8.6.119 PLD, PLDW (register)
Preload Data signals the memory system that data memory accesses from a specified address are likely in
the near future. The memory system can respond by taking actions that are expected to speed up the memory
accesses when they do occur, such as pre-loading the cache line containing the specified address into the
data cache. For more information, see Behavior of Preload Data (PLD, PLDW) and Preload Instruction
(PLI) with caches on page B2-7.
On an architecture variant that includes both the PLD and PLDW instructions, the PLD instruction signals that the likely memory access is a read, and the PLDW instruction signals that it is a write.
Encoding T1 ARMv6T2, ARMv7 for PLD; ARMv7 with MP Extensions for PLDW
PLD{W}<c> [<Rn>,<Rm>{,LSL #<imm2>}]
First halfword: bits 15-6: 1111100000; bit 5: W; bit 4: 1; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-6: 000000; bits 5-4: imm2; bits 3-0: Rm
if Rn == ‘1111’ then SEE PLD (literal);
n = UInt(Rn); m = UInt(Rm); add = TRUE; is_pldw = (W == ‘1’);
(shift_t, shift_n) = (SRType_LSL, UInt(imm2));
if BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv5TE*, ARMv6*, ARMv7 for PLD; ARMv7 with MP Extensions for PLDW
PLD{W}<c> [<Rn>,+/-<Rm>{, <shift>}]
Bits 31-24: 11110111; bit 23: U; bit 22: R; bits 21-20: 01; bits 19-16: Rn; bits 15-12: (1)(1)(1)(1); bits 11-7: imm5; bits 6-5: type; bit 4: 0; bits 3-0: Rm
n = UInt(Rn); m = UInt(Rm); add = (U == ‘1’); is_pldw = (R == ‘0’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if m == 15 then UNPREDICTABLE;
Assembler syntax
PLD{W}<c><q> [<Rn>, +/-<Rm> {, <shift>}]
where:
W    If specified, selects PLDW, encoded as W = 1 in Thumb encodings and R = 0 in ARM encodings. If omitted, selects PLD, encoded as W = 0 in Thumb encodings and R = 1 in ARM encodings.
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM PLD or PLDW instruction must be unconditional.
<Rn>    Is the base register. The SP can be used.
+/-    Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
<Rm>    Contains the offset that is optionally shifted and applied to the value of <Rn> to form the address.
<shift>    The shift to apply to the value read from <Rm>. If absent, no shift is applied. For encoding T1, <shift> can only be omitted, encoded as imm2 = 0b00, or LSL #<imm> with <imm> = 1, 2, or 3, with <imm> encoded in imm2. For encoding A1, see Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, APSR.C);
    address = if add then (R[n] + offset) else (R[n] - offset);
    if is_pldw then
        Hint_PreloadDataForWrite(address);
    else
        Hint_PreloadData(address);
Exceptions
None.
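The restrictions of encoding T1 (offset always added, LSL by 0 to 3 only, Rm not SP or PC) can be made concrete with a small encoder. The Python sketch below is illustrative and is not part of the architecture definition. The function name encode_pld_register_t1 is hypothetical, and the two halfwords are assembled directly from the T1 field layout given for this instruction.

```python
def encode_pld_register_t1(rn, rm, shift_n=0, pldw=False):
    """Return (first_halfword, second_halfword) for PLD{W} [Rn, Rm, LSL #shift_n]."""
    if rn == 15:
        raise ValueError("Rn == '1111': see PLD (literal)")
    if rm in (13, 15):
        raise ValueError("BadReg(m): UNPREDICTABLE")
    if not 0 <= shift_n <= 3:
        raise ValueError("encoding T1 permits only LSL #0 to LSL #3")
    # First halfword: 1111100000 : W : 1 : Rn
    hw1 = 0xF810 | (int(pldw) << 5) | rn
    # Second halfword: 1111 : 000000 : imm2 : Rm
    hw2 = 0xF000 | (shift_n << 4) | rm
    return hw1, hw2
```

A subtracted or otherwise shifted offset has no T1 representation, which is why the subtraction form is permitted in ARM code only.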
A8.6.120 PLI (immediate, literal)
Preload Instruction signals the memory system that instruction memory accesses from a specified address
are likely in the near future. The memory system can respond by taking actions that are expected to speed
up the memory accesses when they do occur, such as pre-loading the cache line containing the specified
address into the instruction cache. For more information, see Behavior of Preload Data (PLD, PLDW) and
Preload Instruction (PLI) with caches on page B2-7.
Encoding T1 ARMv7
PLI<c> [<Rn>,#<imm12>]
First halfword: bits 15-4: 111110011001; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-0: imm12
if Rn == ‘1111’ then SEE encoding T3;
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = TRUE;
Encoding T2 ARMv7
PLI<c> [<Rn>,#-<imm8>]
First halfword: bits 15-4: 111110010001; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: 1100; bits 7-0: imm8
if Rn == ‘1111’ then SEE encoding T3;
n = UInt(Rn); imm32 = ZeroExtend(imm8, 32); add = FALSE;
Encoding T3 ARMv7
PLI<c> <label>
PLI<c> [PC,#-0]    Special case
First halfword: bits 15-8: 11111001; bit 7: U; bits 6-5: 00; bit 4: 1; bits 3-0: 1111
Second halfword: bits 15-12: 1111; bits 11-0: imm12
n = 15; imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
Encoding A1 ARMv7
PLI [<Rn>,#+/-<imm12>]
PLI <label>
PLI [PC,#-0]    Special case
Bits 31-24: 11110100; bit 23: U; bits 22-20: 101; bits 19-16: Rn; bits 15-12: (1)(1)(1)(1); bits 11-0: imm12
n = UInt(Rn); imm32 = ZeroExtend(imm12, 32); add = (U == ‘1’);
Assembler syntax
PLI<c><q> [<Rn> {, #+/-<imm>}]    Immediate form
PLI<c><q> <label>    Normal literal form
PLI<c><q> [PC, #+/-<imm>]    Alternative literal form
where:
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM PLI instruction must be unconditional.
<Rn>    Is the base register. The SP can be used.
+/-    Is + or omitted to indicate that the immediate offset is added to the base register value (add == TRUE), or - to indicate that the offset is to be subtracted (add == FALSE). Different instructions are generated for #0 and #-0.
<imm>    The immediate offset used to form the address. For the immediate form of the syntax, <imm> can be omitted, in which case the #0 form of the instruction is assembled. Values are:
    Encoding T1, T3, A1    any value in the range 0 to 4095
    Encoding T2    any value in the range 0 to 255.
<label>    The label of the instruction that is likely to be accessed in the near future. The assembler calculates the required value of the offset from the Align(PC,4) value of this instruction to the label. The offset must be in the range -4095 to 4095.
    If the offset is zero or positive, imm32 is equal to the offset and add == TRUE.
    If the offset is negative, imm32 is equal to minus the offset and add == FALSE.
For the literal forms of the instruction, encoding T3 is used, or Rn is encoded as '1111' in encoding A1, to indicate that the PC is the base register.
The alternative literal syntax permits the addition or subtraction of the offset and the immediate offset to be specified separately, including permitting a subtraction of 0 that cannot be specified using the normal syntax. For more information, see Use of labels in UAL instruction syntax on page A4-5.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    base = if n == 15 then Align(PC,4) else R[n];
    address = if add then (base + imm32) else (base - imm32);
    Hint_PreloadInstr(address);
Exceptions
None.
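The way the three Thumb encodings divide the operand space can be sketched as follows. This Python fragment is an informal illustration, not a description of any particular assembler. The function names are hypothetical, and the pc argument is assumed to be the PC value already read by the instruction.

```python
MASK32 = 0xFFFFFFFF

def select_pli_thumb_encoding(n, imm, add):
    """Pick T1, T2 or T3 for PLI with base register n and offset +/-imm."""
    if n == 15:
        if 0 <= imm <= 4095:
            return "T3"  # literal form, U bit carries the sign
    elif add and 0 <= imm <= 4095:
        return "T1"      # add == TRUE, 12-bit offset
    elif not add and 0 <= imm <= 255:
        return "T2"      # add == FALSE, 8-bit offset
    raise ValueError("no Thumb encoding for this operand")

def pli_address(n, regs, pc, imm32, add):
    """base = Align(PC,4) when n == 15, otherwise R[n]; then add or subtract."""
    base = (pc & ~3) if n == 15 else regs[n]
    return (base + imm32) & MASK32 if add else (base - imm32) & MASK32
```

Because T1 always adds and T2 always subtracts, #0 and #-0 with a base register other than the PC assemble to different encodings, consistent with the note on the +/- field.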
A8.6.121 PLI (register)
Preload Instruction signals the memory system that instruction memory accesses from a specified address
are likely in the near future. The memory system can respond by taking actions that are expected to speed
up the memory accesses when they do occur, such as pre-loading the cache line containing the specified
address into the instruction cache. For more information, see Behavior of Preload Data (PLD, PLDW) and
Preload Instruction (PLI) with caches on page B2-7.
Encoding T1 ARMv7
PLI<c> [<Rn>,<Rm>{,LSL #<imm2>}]
First halfword: bits 15-4: 111110010001; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-6: 000000; bits 5-4: imm2; bits 3-0: Rm
if Rn == ‘1111’ then SEE PLI (immediate, literal);
n = UInt(Rn); m = UInt(Rm); add = TRUE;
(shift_t, shift_n) = (SRType_LSL, UInt(imm2));
if BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv7
PLI [<Rn>,+/-<Rm>{, <shift>}]
Bits 31-24: 11110110; bit 23: U; bits 22-20: 101; bits 19-16: Rn; bits 15-12: (1)(1)(1)(1); bits 11-7: imm5; bits 6-5: type; bit 4: 0; bits 3-0: Rm
n = UInt(Rn); m = UInt(Rm); add = (U == ‘1’);
(shift_t, shift_n) = DecodeImmShift(type, imm5);
if m == 15 then UNPREDICTABLE;
Assembler syntax
PLI<c><q> [<Rn>, +/-<Rm> {, <shift>}]
where:
<c><q>    See Standard assembler syntax fields on page A8-7. An ARM PLI instruction must be unconditional.
<Rn>    Is the base register. The SP can be used.
+/-    Is + or omitted if the optionally shifted value of <Rm> is to be added to the base register value (add == TRUE), or - if it is to be subtracted (permitted in ARM code only, add == FALSE).
<Rm>    Contains the offset that is optionally shifted and applied to the value of <Rn> to form the address.
<shift>    The shift to apply to the value read from <Rm>. If absent, no shift is applied. For encoding T1, <shift> can only be omitted, encoded as imm2 = 0b00, or LSL #<imm> with <imm> = 1, 2, or 3, with <imm> encoded in imm2. For encoding A1, see Shifts applied to a register on page A8-10.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    offset = Shift(R[m], shift_t, shift_n, APSR.C);
    address = if add then (R[n] + offset) else (R[n] - offset);
    Hint_PreloadInstr(address);
Exceptions
None.
A8.6.122 POP
Pop Multiple Registers loads multiple registers from the stack, loading from consecutive memory locations
starting at the address in SP, and updates SP to point just above the loaded data.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
POP<c> <registers>
Bits 15-9: 1011110; bit 8: P; bits 7-0: register_list
registers = P:‘0000000’:register_list; if BitCount(registers) < 1 then UNPREDICTABLE;
Encoding T2 ARMv6T2, ARMv7
POP<c>.W <registers>    <registers> contains more than one register
First halfword: bits 15-0: 1110100010111101
Second halfword: bit 15: P; bit 14: M; bit 13: (0); bits 12-0: register_list
registers = P:M:‘0’:register_list;
if BitCount(registers) < 2 || (P == ‘1’ && M == ‘1’) then UNPREDICTABLE;
if registers<15> == ‘1’ && InITBlock() && !LastInITBlock() then UNPREDICTABLE;
Encoding T3 ARMv6T2, ARMv7
POP<c>.W <registers>    <registers> contains one register, <Rt>
First halfword: bits 15-0: 1111100001011101
Second halfword: bits 15-12: Rt; bits 11-0: 101100000100
t = UInt(Rt); registers = Zeros(16); registers<t> = ‘1’;
if t == 13 || (t == 15 && InITBlock() && !LastInITBlock()) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
POP<c> <registers>    <registers> contains more than one register
Bits 31-28: cond; bits 27-16: 100010111101; bits 15-0: register_list
if BitCount(register_list) < 2 then SEE LDM / LDMIA / LDMFD;
registers = register_list;
if registers<13> == ‘1’ && ArchVersion() >= 7 then UNPREDICTABLE;
Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7
POP<c> <registers>    <registers> contains one register, <Rt>
Bits 31-28: cond; bits 27-16: 010010011101; bits 15-12: Rt; bits 11-0: 000000000100
t = UInt(Rt); registers = Zeros(16); registers<t> = ‘1’;
if t == 13 then UNPREDICTABLE;
Assembler syntax
POP<c><q> <registers>    Standard syntax
LDM<c><q> SP!, <registers>    Equivalent LDM syntax
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<registers>    Is a list of one or more registers to be loaded, separated by commas and surrounded by { and }. The lowest-numbered register is loaded from the lowest memory address, through to the highest-numbered register from the highest memory address.
    If the list contains more than one register, the instruction is assembled to encoding T1, T2, or A1. If the list contains exactly one register, the instruction is assembled to encoding T1, T3, or A2.
    The SP can only be in the list in ARM code before ARMv7. ARM instructions that include the SP in the list are deprecated, and the value of the SP after such an instruction is UNKNOWN.
    The PC can be in the list. If it is, the instruction branches to the address loaded to the PC. In ARMv5T and above, this is an interworking branch, see Pseudocode details of operations on ARM core registers on page A2-12. In Thumb code, if the PC is in the list:
    - the LR must not be in the list
    - the instruction must be either outside any IT block, or the last instruction in an IT block.
    ARM instructions that include both the LR and the PC in the list are deprecated.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(13);
    address = SP;
    for i = 0 to 14
        if registers<i> == ‘1’ then
            R[i] = MemA[address,4]; address = address + 4;
    if registers<15> == ‘1’ then
        LoadWritePC(MemA[address,4]);
    if registers<13> == ‘0’ then SP = SP + 4*BitCount(registers);
    if registers<13> == ‘1’ then SP = bits(32) UNKNOWN;
Exceptions
Data Abort.
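The load order and the SP update can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not part of the architecture definition: memory is a dictionary from word-aligned addresses to 32-bit values, the register file is a list of 16 integers, a list containing the SP is rejected, and a load to the PC simply writes the register without modeling the interworking branch. The function name pop is hypothetical.

```python
def pop(regs, memory, register_list):
    """Model POP: regs is a 16-entry list, register_list a 16-bit mask."""
    if (register_list >> 13) & 1:
        raise ValueError("SP in the list is not modeled (SP becomes UNKNOWN)")
    address = regs[13]
    count = bin(register_list & 0xFFFF).count("1")  # BitCount(registers)
    for i in range(15):
        if (register_list >> i) & 1:
            regs[i] = memory[address]  # lowest register from lowest address
            address += 4
    if (register_list >> 15) & 1:
        regs[15] = memory[address]     # LoadWritePC, without interworking
    regs[13] = (regs[13] + 4 * count) & 0xFFFFFFFF
```

After the call, the SP points just above the loaded data, as described in the instruction summary.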
A8.6.123 PUSH
Push Multiple Registers stores multiple registers to the stack, storing to consecutive memory locations
ending just below the address in SP, and updates SP to point to the start of the stored data.
Encoding T1 ARMv4T, ARMv5T*, ARMv6*, ARMv7
PUSH<c> <registers>
Bits 15-9: 1011010; bit 8: M; bits 7-0: register_list
registers = ‘0’:M:‘000000’:register_list;
if BitCount(registers) < 1 then UNPREDICTABLE;
Encoding T2 ARMv6T2, ARMv7
PUSH<c>.W <registers>    <registers> contains more than one register
First halfword: bits 15-0: 1110100010101101
Second halfword: bit 15: (0); bit 14: M; bit 13: (0); bits 12-0: register_list
registers = ‘0’:M:‘0’:register_list;
if BitCount(registers) < 2 then UNPREDICTABLE;
Encoding T3 ARMv6T2, ARMv7
PUSH<c>.W <registers>    <registers> contains one register, <Rt>
First halfword: bits 15-0: 1111100001001101
Second halfword: bits 15-12: Rt; bits 11-0: 110100000100
t = UInt(Rt); registers = Zeros(16); registers<t> = ‘1’;
if BadReg(t) then UNPREDICTABLE;
Encoding A1 ARMv4*, ARMv5T*, ARMv6*, ARMv7
PUSH<c> <registers>    <registers> contains more than one register
Bits 31-28: cond; bits 27-16: 100100101101; bits 15-0: register_list
if BitCount(register_list) < 2 then SEE STMDB / STMFD;
registers = register_list;
Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7
PUSH<c> <registers>    <registers> contains one register, <Rt>
Bits 31-28: cond; bits 27-16: 010100101101; bits 15-12: Rt; bits 11-0: 000000000100
t = UInt(Rt); registers = Zeros(16); registers<t> = ‘1’;
if t == 13 then UNPREDICTABLE;
Assembler syntax
PUSH<c><q> <registers>    Standard syntax
STMDB<c><q> SP!, <registers>    Equivalent STM syntax
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<registers>    Is a list of one or more registers to be stored, separated by commas and surrounded by { and }. The lowest-numbered register is stored to the lowest memory address, through to the highest-numbered register to the highest memory address.
    If the list contains more than one register, the instruction is assembled to encoding T1, T2, or A1. If the list contains exactly one register, the instruction is assembled to encoding T1, T3, or A2.
    The SP and PC can be in the list in ARM code, but not in Thumb code. However, ARM instructions that include the SP or the PC in the list are deprecated, and if the SP is in the list, the value the instruction stores for the SP is UNKNOWN.
Operation
if ConditionPassed() then
    EncodingSpecificOperations(); NullCheckIfThumbEE(13);
    address = SP - 4*BitCount(registers);
    for i = 0 to 14
        if registers<i> == ‘1’ then
            if i == 13 && i != LowestSetBit(registers) then // Only possible for encoding A1
                MemA[address,4] = bits(32) UNKNOWN;
            else
                MemA[address,4] = R[i];
            address = address + 4;
    if registers<15> == ‘1’ then // Only possible for encoding A1 or A2
        MemA[address,4] = PCStoreValue();
    SP = SP - 4*BitCount(registers);
Exceptions
Data Abort.
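A minimal Python model of the store order is given below for illustration. It is not part of the architecture definition and assumes that memory is a dictionary from word-aligned addresses to 32-bit values, that the register file is a list of 16 integers, and that neither the SP nor the PC is in the list. The function name push is hypothetical.

```python
def push(regs, memory, register_list):
    """Model PUSH: regs is a 16-entry list, register_list a 16-bit mask."""
    if register_list & ((1 << 13) | (1 << 15)):
        raise ValueError("SP or PC in the list is not modeled")
    count = bin(register_list & 0xFFFF).count("1")  # BitCount(registers)
    new_sp = (regs[13] - 4 * count) & 0xFFFFFFFF
    address = new_sp
    for i in range(15):
        if (register_list >> i) & 1:
            memory[address] = regs[i]  # lowest register to lowest address
            address += 4
    regs[13] = new_sp                  # SP now points to the start of the data
```

The stores begin at the new SP value and work upwards, so the stored block ends just below the original SP.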
A8.6.124 QADD
Saturating Add adds two register values, saturates the result to the 32-bit signed integer range
$-2^{31} \le x \le 2^{31} - 1$, and writes the result to the destination register. If saturation occurs, it sets the Q flag in
the APSR.
Encoding T1 ARMv6T2, ARMv7
QADD<c> <Rd>,<Rm>,<Rn>
First halfword: bits 15-4: 111110101000; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: Rd; bits 7-4: 1000; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv5TE*, ARMv6*, ARMv7
QADD<c> <Rd>,<Rm>,<Rn>
Bits 31-28: cond; bits 27-20: 00010000; bits 19-16: Rn; bits 15-12: Rd; bits 11-8: (0)(0)(0)(0); bits 7-4: 0101; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
QADD<c><q> {<Rd>,} <Rm>, <Rn>
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rm>    The first operand register.
<Rn>    The second operand register.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    (R[d], sat) = SignedSatQ(SInt(R[m]) + SInt(R[n]), 32);
    if sat then
        APSR.Q = ‘1’;
Exceptions
None.
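The saturating behavior can be sketched in Python as follows. The sketch is illustrative only. It assumes 32-bit unsigned register values, and the helper names to_signed, signed_sat_q, and qadd are hypothetical stand-ins for SInt(), SignedSatQ(), and the instruction itself.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def signed_sat_q(i, bits):
    """Return (saturated bit pattern, saturated?) for the signed range of `bits`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    clamped = max(lo, min(hi, i))
    return clamped & ((1 << bits) - 1), clamped != i

def qadd(rm, rn):
    """Return (result, q) where q indicates that APSR.Q would be set."""
    return signed_sat_q(to_signed(rm, 32) + to_signed(rn, 32), 32)
```

For example, qadd(0x7FFFFFFF, 1) returns (0x7FFFFFFF, True): the sum exceeds the largest positive value, so the result is clamped and the Q flag would be set.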
A8.6.125 QADD16
Saturating Add 16 performs two 16-bit integer additions, saturates the results to the 16-bit signed integer
range $-2^{15} \le x \le 2^{15} - 1$, and writes the results to the destination register.
Encoding T1 ARMv6T2, ARMv7
QADD16<c> <Rd>,<Rn>,<Rm>
First halfword: bits 15-4: 111110101001; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: Rd; bits 7-4: 0001; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv6*, ARMv7
QADD16<c> <Rd>,<Rn>,<Rm>
Bits 31-28: cond; bits 27-20: 01100010; bits 19-16: Rn; bits 15-12: Rd; bits 11-8: (1)(1)(1)(1); bits 7-4: 0001; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
QADD16<c><q> {<Rd>,} <Rn>, <Rm>
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rn>    The first operand register.
<Rm>    The second operand register.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<15:0>) + SInt(R[m]<15:0>);
    sum2 = SInt(R[n]<31:16>) + SInt(R[m]<31:16>);
    R[d]<15:0> = SignedSat(sum1, 16);
    R[d]<31:16> = SignedSat(sum2, 16);
Exceptions
None.
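The two independent halfword additions can be modeled in Python as shown below. This is an informal sketch assuming 32-bit unsigned register values. The names to_signed16, sat16, and qadd16 are hypothetical. Consistent with the pseudocode, which uses SignedSat() and not SignedSatQ(), no Q flag is produced.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def sat16(i):
    """Clamp i to the signed 16-bit range and return its 16-bit pattern."""
    return max(-0x8000, min(0x7FFF, i)) & 0xFFFF

def qadd16(rn, rm):
    """Saturating add of the bottom and top halfwords, lane by lane."""
    sum1 = to_signed16(rn) + to_signed16(rm)              # bits <15:0>
    sum2 = to_signed16(rn >> 16) + to_signed16(rm >> 16)  # bits <31:16>
    return (sat16(sum2) << 16) | sat16(sum1)
```

Each lane saturates on its own, so an overflow in the top halfword does not disturb the bottom halfword.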
A8.6.126 QADD8
Saturating Add 8 performs four 8-bit integer additions, saturates the results to the 8-bit signed integer range
$-2^{7} \le x \le 2^{7} - 1$, and writes the results to the destination register.
Encoding T1 ARMv6T2, ARMv7
QADD8<c> <Rd>,<Rn>,<Rm>
First halfword: bits 15-4: 111110101000; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: Rd; bits 7-4: 0001; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv6*, ARMv7
QADD8<c> <Rd>,<Rn>,<Rm>
Bits 31-28: cond; bits 27-20: 01100010; bits 19-16: Rn; bits 15-12: Rd; bits 11-8: (1)(1)(1)(1); bits 7-4: 1001; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
QADD8<c><q> {<Rd>,} <Rn>, <Rm>
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rn>    The first operand register.
<Rm>    The second operand register.
Operation
if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = SInt(R[n]<7:0>) + SInt(R[m]<7:0>);
    sum2 = SInt(R[n]<15:8>) + SInt(R[m]<15:8>);
    sum3 = SInt(R[n]<23:16>) + SInt(R[m]<23:16>);
    sum4 = SInt(R[n]<31:24>) + SInt(R[m]<31:24>);
    R[d]<7:0> = SignedSat(sum1, 8);
    R[d]<15:8> = SignedSat(sum2, 8);
    R[d]<23:16> = SignedSat(sum3, 8);
    R[d]<31:24> = SignedSat(sum4, 8);
Exceptions
None.
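The four byte lanes follow the same pattern, so the operation can be expressed as a loop over lanes. The Python sketch below is illustrative and assumes 32-bit unsigned register values. The helper saturating_lane_add is a hypothetical generalization over the lane width, and qadd8 is the 8-bit case.

```python
def saturating_lane_add(rn, rm, lane_bits):
    """Signed saturating add of each lane_bits-wide lane of two 32-bit values."""
    lane_mask = (1 << lane_bits) - 1
    lo, hi = -(1 << (lane_bits - 1)), (1 << (lane_bits - 1)) - 1
    result = 0
    for shift in range(0, 32, lane_bits):
        a = (rn >> shift) & lane_mask
        b = (rm >> shift) & lane_mask
        # Convert each lane to a signed value (SInt).
        a -= (a >> (lane_bits - 1)) << lane_bits
        b -= (b >> (lane_bits - 1)) << lane_bits
        lane = max(lo, min(hi, a + b))  # SignedSat(sum, lane_bits)
        result |= (lane & lane_mask) << shift
    return result

def qadd8(rn, rm):
    """Model of QADD8: four independent signed saturating byte additions."""
    return saturating_lane_add(rn, rm, 8)
```

In qadd8(0x7F80FF01, 0x01FF0101), the top byte saturates at 0x7F, the next byte saturates at 0x80, and the two low bytes add without saturation.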
A8.6.127 QASX
Saturating Add and Subtract with Exchange exchanges the two halfwords of the second operand, performs
one 16-bit integer addition and one 16-bit subtraction, saturates the results to the 16-bit signed integer range
$-2^{15} \le x \le 2^{15} - 1$, and writes the results to the destination register.
Encoding T1 ARMv6T2, ARMv7
QASX<c> <Rd>,<Rn>,<Rm>
First halfword: bits 15-4: 111110101010; bits 3-0: Rn
Second halfword: bits 15-12: 1111; bits 11-8: Rd; bits 7-4: 0001; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if BadReg(d) || BadReg(n) || BadReg(m) then UNPREDICTABLE;
Encoding A1 ARMv6*, ARMv7
QASX<c> <Rd>,<Rn>,<Rm>
Bits 31-28: cond; bits 27-20: 01100010; bits 19-16: Rn; bits 15-12: Rd; bits 11-8: (1)(1)(1)(1); bits 7-4: 0011; bits 3-0: Rm
d = UInt(Rd); n = UInt(Rn); m = UInt(Rm);
if d == 15 || n == 15 || m == 15 then UNPREDICTABLE;
Assembler syntax
QASX<c><q> {<Rd>,} <Rn>, <Rm>
where:
<c><q>    See Standard assembler syntax fields on page A8-7.
<Rd>    The destination register.
<Rn>    The first operand register.
<Rm>    The second operand register.
The pre-UAL syntax QADDSUBX<c> is equivalent to QASX<c>.
Operation
if ConditionPassed() then
EncodingSpecificOperations();
diff = SInt(R[n]<15:0>) - SInt(R[m]<31:16>);
sum = SInt(R[n]<31:16>) + SInt(R[m]<15:0>);