ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition
Page Count: 2158
- ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition
- Contents
- Preface
- Application Level Architecture
- Introduction to the ARM Architecture
- Application Level Programmers’ Model
- A2.1 About the Application level programmers’ model
- A2.2 ARM core data types and arithmetic
- A2.3 ARM core registers
- A2.4 The Application Program Status Register (APSR)
- A2.5 Execution state registers
- A2.6 Advanced SIMD and VFP extensions
- A2.7 Floating-point data types and arithmetic
- A2.7.1 ARM standard floating-point input and output values
- A2.7.2 Advanced SIMD and VFP single-precision format
- A2.7.3 VFP double-precision format
- A2.7.4 Advanced SIMD and VFP half-precision formats
- A2.7.5 Flush-to-zero
- A2.7.6 NaN handling and the Default NaN
- A2.7.7 Floating-point exceptions
- A2.7.8 Pseudocode details of floating-point operations
- Generation of specific floating-point values
- Negation and absolute value
- Floating-point value unpacking
- Floating-point exception and NaN handling
- Floating-point rounding
- Selection of ARM standard floating-point arithmetic
- Comparisons
- Maximum and minimum
- Addition and subtraction
- Multiplication and division
- Reciprocal estimate and step
- Square root
- Reciprocal square root
- Conversions
- A2.8 Polynomial arithmetic over {0,1}
- A2.9 Coprocessor support
- A2.10 Execution environment support
- A2.11 Exceptions, debug events and checks
- Application Level Memory Model
- A3.1 Address space
- A3.2 Alignment support
- A3.3 Endian support
- A3.4 Synchronization and semaphores
- A3.4.1 Exclusive access instructions and Non-shareable memory regions
- A3.4.2 Exclusive access instructions and Shareable memory regions
- A3.4.3 Tagging and the size of the tagged memory block
- A3.4.4 Context switch support
- A3.4.5 Load-Exclusive and Store-Exclusive usage restrictions
- A3.4.6 Semaphores
- A3.4.7 Synchronization primitives and the memory order model
- A3.4.8 Use of WFE and SEV instructions by spin-locks
- A3.5 Memory types and attributes and the memory order model
- A3.6 Access rights
- A3.7 Virtual and physical addressing
- A3.8 Memory access order
- A3.9 Caches and memory hierarchy
- The Instruction Sets
- A4.1 About the instruction sets
- A4.2 Unified Assembler Language
- A4.3 Branch instructions
- A4.4 Data-processing instructions
- A4.5 Status register access instructions
- A4.6 Load/store instructions
- A4.7 Load/store multiple instructions
- A4.8 Miscellaneous instructions
- A4.9 Exception-generating and exception-handling instructions
- A4.10 Coprocessor instructions
- A4.11 Advanced SIMD and VFP load/store instructions
- A4.12 Advanced SIMD and VFP register transfer instructions
- A4.13 Advanced SIMD data-processing operations
- A4.13.1 Advanced SIMD parallel addition and subtraction
- A4.13.2 Bitwise Advanced SIMD data-processing instructions
- A4.13.3 Advanced SIMD comparison instructions
- A4.13.4 Advanced SIMD shift instructions
- A4.13.5 Advanced SIMD multiply instructions
- A4.13.6 Miscellaneous Advanced SIMD data-processing instructions
- A4.14 VFP data-processing instructions
- ARM Instruction Set Encoding
- A5.1 ARM instruction set encoding
- A5.2 Data-processing and miscellaneous instructions
- A5.2.1 Data-processing (register)
- A5.2.2 Data-processing (register-shifted register)
- A5.2.3 Data-processing (immediate)
- A5.2.4 Modified immediate constants in ARM instructions
- A5.2.5 Multiply and multiply-accumulate
- A5.2.6 Saturating addition and subtraction
- A5.2.7 Halfword multiply and multiply-accumulate
- A5.2.8 Extra load/store instructions
- A5.2.9 Extra load/store instructions (unprivileged)
- A5.2.10 Synchronization primitives
- A5.2.11 MSR (immediate), and hints
- A5.2.12 Miscellaneous instructions
- A5.3 Load/store word and unsigned byte
- A5.4 Media instructions
- A5.5 Branch, branch with link, and block data transfer
- A5.6 Supervisor Call, and coprocessor instructions
- A5.7 Unconditional instructions
- Thumb Instruction Set Encoding
- A6.1 Thumb instruction set encoding
- A6.2 16-bit Thumb instruction encoding
- A6.3 32-bit Thumb instruction encoding
- A6.3.1 Data-processing (modified immediate)
- A6.3.2 Modified immediate constants in Thumb instructions
- A6.3.3 Data-processing (plain binary immediate)
- A6.3.4 Branches and miscellaneous control
- A6.3.5 Load/store multiple
- A6.3.6 Load/store dual, load/store exclusive, table branch
- A6.3.7 Load word
- A6.3.8 Load halfword, memory hints
- A6.3.9 Load byte, memory hints
- A6.3.10 Store single data item
- A6.3.11 Data-processing (shifted register)
- A6.3.12 Data-processing (register)
- A6.3.13 Parallel addition and subtraction, signed
- A6.3.14 Parallel addition and subtraction, unsigned
- A6.3.15 Miscellaneous operations
- A6.3.16 Multiply, multiply accumulate, and absolute difference
- A6.3.17 Long multiply, long multiply accumulate, and divide
- A6.3.18 Coprocessor instructions
- Advanced SIMD and VFP Instruction Encoding
- A7.1 Overview
- A7.2 Advanced SIMD and VFP instruction syntax
- A7.3 Register encoding
- A7.4 Advanced SIMD data-processing instructions
- A7.5 VFP data-processing instructions
- A7.6 Extension register load/store instructions
- A7.7 Advanced SIMD element or structure load/store instructions
- A7.8 8, 16, and 32-bit transfer between ARM core and extension registers
- A7.9 64-bit transfers between ARM core and extension registers
- Instruction Details
- A8.1 Format of instruction descriptions
- A8.2 Standard assembler syntax fields
- A8.3 Conditional execution
- A8.4 Shifts applied to a register
- A8.5 Memory accesses
- A8.6 Alphabetical list of instructions
- A8.6.1 ADC (immediate)
- A8.6.2 ADC (register)
- A8.6.3 ADC (register-shifted register)
- A8.6.4 ADD (immediate, Thumb)
- A8.6.5 ADD (immediate, ARM)
- A8.6.6 ADD (register)
- A8.6.7 ADD (register-shifted register)
- A8.6.8 ADD (SP plus immediate)
- A8.6.9 ADD (SP plus register)
- A8.6.10 ADR
- A8.6.11 AND (immediate)
- A8.6.12 AND (register)
- A8.6.13 AND (register-shifted register)
- A8.6.14 ASR (immediate)
- A8.6.15 ASR (register)
- A8.6.16 B
- A8.6.17 BFC
- A8.6.18 BFI
- A8.6.19 BIC (immediate)
- A8.6.20 BIC (register)
- A8.6.21 BIC (register-shifted register)
- A8.6.22 BKPT
- A8.6.23 BL, BLX (immediate)
- A8.6.24 BLX (register)
- A8.6.25 BX
- A8.6.26 BXJ
- A8.6.27 CBNZ, CBZ
- A8.6.28 CDP, CDP2
- A8.6.29 CHKA
- A8.6.30 CLREX
- A8.6.31 CLZ
- A8.6.32 CMN (immediate)
- A8.6.33 CMN (register)
- A8.6.34 CMN (register-shifted register)
- A8.6.35 CMP (immediate)
- A8.6.36 CMP (register)
- A8.6.37 CMP (register-shifted register)
- A8.6.38 CPS
- A8.6.39 CPY
- A8.6.40 DBG
- A8.6.41 DMB
- A8.6.42 DSB
- A8.6.43 ENTERX
- A8.6.44 EOR (immediate)
- A8.6.45 EOR (register)
- A8.6.46 EOR (register-shifted register)
- A8.6.47 F* (former VFP instruction mnemonics)
- A8.6.48 HB, HBL, HBLP, HBP
- A8.6.49 ISB
- A8.6.50 IT
- A8.6.51 LDC, LDC2 (immediate)
- A8.6.52 LDC, LDC2 (literal)
- A8.6.53 LDM / LDMIA / LDMFD
- A8.6.54 LDMDA / LDMFA
- A8.6.55 LDMDB / LDMEA
- A8.6.56 LDMIB / LDMED
- A8.6.57 LDR (immediate, Thumb)
- A8.6.58 LDR (immediate, ARM)
- A8.6.59 LDR (literal)
- A8.6.60 LDR (register)
- A8.6.61 LDRB (immediate, Thumb)
- A8.6.62 LDRB (immediate, ARM)
- A8.6.63 LDRB (literal)
- A8.6.64 LDRB (register)
- A8.6.65 LDRBT
- A8.6.66 LDRD (immediate)
- A8.6.67 LDRD (literal)
- A8.6.68 LDRD (register)
- A8.6.69 LDREX
- A8.6.70 LDREXB
- A8.6.71 LDREXD
- A8.6.72 LDREXH
- A8.6.73 LDRH (immediate, Thumb)
- A8.6.74 LDRH (immediate, ARM)
- A8.6.75 LDRH (literal)
- A8.6.76 LDRH (register)
- A8.6.77 LDRHT
- A8.6.78 LDRSB (immediate)
- A8.6.79 LDRSB (literal)
- A8.6.80 LDRSB (register)
- A8.6.81 LDRSBT
- A8.6.82 LDRSH (immediate)
- A8.6.83 LDRSH (literal)
- A8.6.84 LDRSH (register)
- A8.6.85 LDRSHT
- A8.6.86 LDRT
- A8.6.87 LEAVEX
- A8.6.88 LSL (immediate)
- A8.6.89 LSL (register)
- A8.6.90 LSR (immediate)
- A8.6.91 LSR (register)
- A8.6.92 MCR, MCR2
- A8.6.93 MCRR, MCRR2
- A8.6.94 MLA
- A8.6.95 MLS
- A8.6.96 MOV (immediate)
- A8.6.97 MOV (register)
- A8.6.98 MOV (shifted register)
- A8.6.99 MOVT
- A8.6.100 MRC, MRC2
- A8.6.101 MRRC, MRRC2
- A8.6.102 MRS
- A8.6.103 MSR (immediate)
- A8.6.104 MSR (register)
- A8.6.105 MUL
- A8.6.106 MVN (immediate)
- A8.6.107 MVN (register)
- A8.6.108 MVN (register-shifted register)
- A8.6.109 NEG
- A8.6.110 NOP
- A8.6.111 ORN (immediate)
- A8.6.112 ORN (register)
- A8.6.113 ORR (immediate)
- A8.6.114 ORR (register)
- A8.6.115 ORR (register-shifted register)
- A8.6.116 PKH
- A8.6.117 PLD, PLDW (immediate)
- A8.6.118 PLD (literal)
- A8.6.119 PLD, PLDW (register)
- A8.6.120 PLI (immediate, literal)
- A8.6.121 PLI (register)
- A8.6.122 POP
- A8.6.123 PUSH
- A8.6.124 QADD
- A8.6.125 QADD16
- A8.6.126 QADD8
- A8.6.127 QASX
- A8.6.128 QDADD
- A8.6.129 QDSUB
- A8.6.130 QSAX
- A8.6.131 QSUB
- A8.6.132 QSUB16
- A8.6.133 QSUB8
- A8.6.134 RBIT
- A8.6.135 REV
- A8.6.136 REV16
- A8.6.137 REVSH
- A8.6.138 RFE
- A8.6.139 ROR (immediate)
- A8.6.140 ROR (register)
- A8.6.141 RRX
- A8.6.142 RSB (immediate)
- A8.6.143 RSB (register)
- A8.6.144 RSB (register-shifted register)
- A8.6.145 RSC (immediate)
- A8.6.146 RSC (register)
- A8.6.147 RSC (register-shifted register)
- A8.6.148 SADD16
- A8.6.149 SADD8
- A8.6.150 SASX
- A8.6.151 SBC (immediate)
- A8.6.152 SBC (register)
- A8.6.153 SBC (register-shifted register)
- A8.6.154 SBFX
- A8.6.155 SDIV
- A8.6.156 SEL
- A8.6.157 SETEND
- A8.6.158 SEV
- A8.6.159 SHADD16
- A8.6.160 SHADD8
- A8.6.161 SHASX
- A8.6.162 SHSAX
- A8.6.163 SHSUB16
- A8.6.164 SHSUB8
- A8.6.165 SMC (previously SMI)
- A8.6.166 SMLABB, SMLABT, SMLATB, SMLATT
- A8.6.167 SMLAD
- A8.6.168 SMLAL
- A8.6.169 SMLALBB, SMLALBT, SMLALTB, SMLALTT
- A8.6.170 SMLALD
- A8.6.171 SMLAWB, SMLAWT
- A8.6.172 SMLSD
- A8.6.173 SMLSLD
- A8.6.174 SMMLA
- A8.6.175 SMMLS
- A8.6.176 SMMUL
- A8.6.177 SMUAD
- A8.6.178 SMULBB, SMULBT, SMULTB, SMULTT
- A8.6.179 SMULL
- A8.6.180 SMULWB, SMULWT
- A8.6.181 SMUSD
- A8.6.182 SRS
- A8.6.183 SSAT
- A8.6.184 SSAT16
- A8.6.185 SSAX
- A8.6.186 SSUB16
- A8.6.187 SSUB8
- A8.6.188 STC, STC2
- A8.6.189 STM / STMIA / STMEA
- A8.6.190 STMDA / STMED
- A8.6.191 STMDB / STMFD
- A8.6.192 STMIB / STMFA
- A8.6.193 STR (immediate, Thumb)
- A8.6.194 STR (immediate, ARM)
- A8.6.195 STR (register)
- A8.6.196 STRB (immediate, Thumb)
- A8.6.197 STRB (immediate, ARM)
- A8.6.198 STRB (register)
- A8.6.199 STRBT
- A8.6.200 STRD (immediate)
- A8.6.201 STRD (register)
- A8.6.202 STREX
- A8.6.203 STREXB
- A8.6.204 STREXD
- A8.6.205 STREXH
- A8.6.206 STRH (immediate, Thumb)
- A8.6.207 STRH (immediate, ARM)
- A8.6.208 STRH (register)
- A8.6.209 STRHT
- A8.6.210 STRT
- A8.6.211 SUB (immediate, Thumb)
- A8.6.212 SUB (immediate, ARM)
- A8.6.213 SUB (register)
- A8.6.214 SUB (register-shifted register)
- A8.6.215 SUB (SP minus immediate)
- A8.6.216 SUB (SP minus register)
- A8.6.217 SUBS PC, LR and related instructions
- A8.6.218 SVC (previously SWI)
- A8.6.219 SWP, SWPB
- A8.6.220 SXTAB
- A8.6.221 SXTAB16
- A8.6.222 SXTAH
- A8.6.223 SXTB
- A8.6.224 SXTB16
- A8.6.225 SXTH
- A8.6.226 TBB, TBH
- A8.6.227 TEQ (immediate)
- A8.6.228 TEQ (register)
- A8.6.229 TEQ (register-shifted register)
- A8.6.230 TST (immediate)
- A8.6.231 TST (register)
- A8.6.232 TST (register-shifted register)
- A8.6.233 UADD16
- A8.6.234 UADD8
- A8.6.235 UASX
- A8.6.236 UBFX
- A8.6.237 UDIV
- A8.6.238 UHADD16
- A8.6.239 UHADD8
- A8.6.240 UHASX
- A8.6.241 UHSAX
- A8.6.242 UHSUB16
- A8.6.243 UHSUB8
- A8.6.244 UMAAL
- A8.6.245 UMLAL
- A8.6.246 UMULL
- A8.6.247 UQADD16
- A8.6.248 UQADD8
- A8.6.249 UQASX
- A8.6.250 UQSAX
- A8.6.251 UQSUB16
- A8.6.252 UQSUB8
- A8.6.253 USAD8
- A8.6.254 USADA8
- A8.6.255 USAT
- A8.6.256 USAT16
- A8.6.257 USAX
- A8.6.258 USUB16
- A8.6.259 USUB8
- A8.6.260 UXTAB
- A8.6.261 UXTAB16
- A8.6.262 UXTAH
- A8.6.263 UXTB
- A8.6.264 UXTB16
- A8.6.265 UXTH
- A8.6.266 VABA, VABAL
- A8.6.267 VABD, VABDL (integer)
- A8.6.268 VABD (floating-point)
- A8.6.269 VABS
- A8.6.270 VACGE, VACGT, VACLE, VACLT
- A8.6.271 VADD (integer)
- A8.6.272 VADD (floating-point)
- A8.6.273 VADDHN
- A8.6.274 VADDL, VADDW
- A8.6.275 VAND (immediate)
- A8.6.276 VAND (register)
- A8.6.277 VBIC (immediate)
- A8.6.278 VBIC (register)
- A8.6.279 VBIF, VBIT, VBSL
- A8.6.280 VCEQ (register)
- A8.6.281 VCEQ (immediate #0)
- A8.6.282 VCGE (register)
- A8.6.283 VCGE (immediate #0)
- A8.6.284 VCGT (register)
- A8.6.285 VCGT (immediate #0)
- A8.6.286 VCLE (register)
- A8.6.287 VCLE (immediate #0)
- A8.6.288 VCLS
- A8.6.289 VCLT (register)
- A8.6.290 VCLT (immediate #0)
- A8.6.291 VCLZ
- A8.6.292 VCMP, VCMPE
- A8.6.293 VCNT
- A8.6.294 VCVT (between floating-point and integer, Advanced SIMD)
- A8.6.295 VCVT, VCVTR (between floating-point and integer, VFP)
- A8.6.296 VCVT (between floating-point and fixed-point, Advanced SIMD)
- A8.6.297 VCVT (between floating-point and fixed-point, VFP)
- A8.6.298 VCVT (between double-precision and single-precision)
- A8.6.299 VCVT (between half-precision and single-precision, Advanced SIMD)
- A8.6.300 VCVTB, VCVTT (between half-precision and single-precision, VFP)
- A8.6.301 VDIV
- A8.6.302 VDUP (scalar)
- A8.6.303 VDUP (ARM core register)
- A8.6.304 VEOR
- A8.6.305 VEXT
- A8.6.306 VHADD, VHSUB
- A8.6.307 VLD1 (multiple single elements)
- A8.6.308 VLD1 (single element to one lane)
- A8.6.309 VLD1 (single element to all lanes)
- A8.6.310 VLD2 (multiple 2-element structures)
- A8.6.311 VLD2 (single 2-element structure to one lane)
- A8.6.312 VLD2 (single 2-element structure to all lanes)
- A8.6.313 VLD3 (multiple 3-element structures)
- A8.6.314 VLD3 (single 3-element structure to one lane)
- A8.6.315 VLD3 (single 3-element structure to all lanes)
- A8.6.316 VLD4 (multiple 4-element structures)
- A8.6.317 VLD4 (single 4-element structure to one lane)
- A8.6.318 VLD4 (single 4-element structure to all lanes)
- A8.6.319 VLDM
- A8.6.320 VLDR
- A8.6.321 VMAX, VMIN (integer)
- A8.6.322 VMAX, VMIN (floating-point)
- A8.6.323 VMLA, VMLAL, VMLS, VMLSL (integer)
- A8.6.324 VMLA, VMLS (floating-point)
- A8.6.325 VMLA, VMLAL, VMLS, VMLSL (by scalar)
- A8.6.326 VMOV (immediate)
- A8.6.327 VMOV (register)
- A8.6.328 VMOV (ARM core register to scalar)
- A8.6.329 VMOV (scalar to ARM core register)
- A8.6.330 VMOV (between ARM core register and single-precision register)
- A8.6.331 VMOV (between two ARM core registers and two single-precision registers)
- A8.6.332 VMOV (between two ARM core registers and a doubleword extension register)
- A8.6.333 VMOVL
- A8.6.334 VMOVN
- A8.6.335 VMRS
- A8.6.336 VMSR
- A8.6.337 VMUL, VMULL (integer and polynomial)
- A8.6.338 VMUL (floating-point)
- A8.6.339 VMUL, VMULL (by scalar)
- A8.6.340 VMVN (immediate)
- A8.6.341 VMVN (register)
- A8.6.342 VNEG
- A8.6.343 VNMLA, VNMLS, VNMUL
- A8.6.344 VORN (immediate)
- A8.6.345 VORN (register)
- A8.6.346 VORR (immediate)
- A8.6.347 VORR (register)
- A8.6.348 VPADAL
- A8.6.349 VPADD (integer)
- A8.6.350 VPADD (floating-point)
- A8.6.351 VPADDL
- A8.6.352 VPMAX, VPMIN (integer)
- A8.6.353 VPMAX, VPMIN (floating-point)
- A8.6.354 VPOP
- A8.6.355 VPUSH
- A8.6.356 VQABS
- A8.6.357 VQADD
- A8.6.358 VQDMLAL, VQDMLSL
- A8.6.359 VQDMULH
- A8.6.360 VQDMULL
- A8.6.361 VQMOVN, VQMOVUN
- A8.6.362 VQNEG
- A8.6.363 VQRDMULH
- A8.6.364 VQRSHL
- A8.6.365 VQRSHRN, VQRSHRUN
- A8.6.366 VQSHL (register)
- A8.6.367 VQSHL, VQSHLU (immediate)
- A8.6.368 VQSHRN, VQSHRUN
- A8.6.369 VQSUB
- A8.6.370 VRADDHN
- A8.6.371 VRECPE
- A8.6.372 VRECPS
- A8.6.373 VREV16, VREV32, VREV64
- A8.6.374 VRHADD
- A8.6.375 VRSHL
- A8.6.376 VRSHR
- A8.6.377 VRSHRN
- A8.6.378 VRSQRTE
- A8.6.379 VRSQRTS
- A8.6.380 VRSRA
- A8.6.381 VRSUBHN
- A8.6.382 VSHL (immediate)
- A8.6.383 VSHL (register)
- A8.6.384 VSHLL
- A8.6.385 VSHR
- A8.6.386 VSHRN
- A8.6.387 VSLI
- A8.6.388 VSQRT
- A8.6.389 VSRA
- A8.6.390 VSRI
- A8.6.391 VST1 (multiple single elements)
- A8.6.392 VST1 (single element from one lane)
- A8.6.393 VST2 (multiple 2-element structures)
- A8.6.394 VST2 (single 2-element structure from one lane)
- A8.6.395 VST3 (multiple 3-element structures)
- A8.6.396 VST3 (single 3-element structure from one lane)
- A8.6.397 VST4 (multiple 4-element structures)
- A8.6.398 VST4 (single 4-element structure from one lane)
- A8.6.399 VSTM
- A8.6.400 VSTR
- A8.6.401 VSUB (integer)
- A8.6.402 VSUB (floating-point)
- A8.6.403 VSUBHN
- A8.6.404 VSUBL, VSUBW
- A8.6.405 VSWP
- A8.6.406 VTBL, VTBX
- A8.6.407 VTRN
- A8.6.408 VTST
- A8.6.409 VUZP
- A8.6.410 VZIP
- A8.6.411 WFE
- A8.6.412 WFI
- A8.6.413 YIELD
- ThumbEE
- A9.1 The ThumbEE instruction set
- A9.2 ThumbEE instruction set encoding
- A9.3 Additional instructions in Thumb and ThumbEE instruction sets
- A9.4 ThumbEE instructions with modified behavior
- A9.5 Additional ThumbEE instructions
- System Level Architecture
- The System Level Programmers’ Model
- B1.1 About the system level programmers’ model
- B1.2 System level concepts and terminology
- B1.3 ARM processor modes and core registers
- B1.4 Instruction set states
- B1.5 The Security Extensions
- B1.6 Exceptions
- B1.6.1 Exception vectors and the exception base address
- B1.6.2 Exception priority order
- B1.6.3 Exception entry
- B1.6.4 Exception return
- B1.6.5 Exception-handling instructions
- B1.6.6 Control of exception handling by the Security Extensions
- B1.6.7 Low interrupt latency configuration
- B1.6.8 Wait For Event and Send Event
- B1.6.9 Wait For Interrupt
- B1.6.10 Reset
- B1.6.11 Undefined Instruction exception
- B1.6.12 Supervisor Call (SVC) exception
- B1.6.13 Secure Monitor Call (SMC) exception
- B1.6.14 Prefetch Abort exception
- B1.6.15 Data Abort exception
- B1.6.16 IRQ exception
- B1.6.17 FIQ exception
- B1.7 Coprocessors and system control
- B1.8 Advanced SIMD and floating-point support
- B1.9 Execution environment support
- Common Memory System Architecture Features
- B2.1 About the memory system architecture
- B2.2 Caches
- B2.2.1 Cache identification
- B2.2.2 Cache behavior
- B2.2.3 Cache enabling and disabling
- B2.2.4 Cache maintenance functionality
- B2.2.5 The interaction of cache lockdown with cache maintenance
- B2.2.6 Branch predictors
- B2.2.7 Ordering of cache and branch predictor maintenance operations
- B2.2.8 Multiprocessor effects on cache maintenance operations
- B2.2.9 System-level caches
- B2.3 Implementation defined memory system features
- B2.4 Pseudocode details of general memory system operations
- B2.4.1 Memory data type definitions
- B2.4.2 Basic memory accesses
- B2.4.3 Interfaces to memory system specific pseudocode
- B2.4.4 Aligned memory accesses
- B2.4.5 Unaligned memory accesses
- B2.4.6 Reverse endianness
- B2.4.7 Exclusive monitors operations
- B2.4.8 Access permission checking
- B2.4.9 Default memory access decode
- B2.4.10 Data Abort exception
- Virtual Memory System Architecture (VMSA)
- B3.1 About the VMSA
- B3.2 Memory access sequence
- B3.3 Translation tables
- B3.4 Address mapping restrictions
- B3.5 Secure and Non-secure address spaces
- B3.6 Memory access control
- B3.7 Memory region attributes
- B3.8 VMSA memory aborts
- B3.9 Fault Status and Fault Address registers in a VMSA implementation
- B3.9.1 About the Fault Status and Fault Address registers
- B3.9.2 Data Abort exceptions
- B3.9.3 Prefetch Abort exceptions
- B3.9.4 Fault Status Register encodings for the VMSA
- B3.9.5 Distinguishing read and write accesses on Data Abort exceptions
- B3.9.6 Provision for classification of external aborts
- B3.9.7 The Domain field in the DFSR
- B3.9.8 Auxiliary Fault Status Registers
- B3.10 Translation Lookaside Buffers (TLBs)
- B3.11 Virtual Address to Physical Address translation operations
- B3.12 CP15 registers for a VMSA implementation
- B3.12.1 Organization of the CP15 registers in a VMSA implementation
- B3.12.2 General behavior of CP15 registers
- B3.12.3 Effect of the Security Extensions on the CP15 registers
- B3.12.4 Changes to CP15 registers and the memory order model
- B3.12.5 Meaning of fixed bit values in register diagrams
- B3.12.6 CP15 c0, ID codes registers
- B3.12.7 c0, Main ID Register (MIDR)
- B3.12.8 c0, Cache Type Register (CTR)
- B3.12.9 c0, TCM Type Register (TCMTR)
- B3.12.10 c0, TLB Type Register (TLBTR)
- B3.12.11 c0, Multiprocessor Affinity Register (MPIDR)
- B3.12.12 c0, Cache Size ID Registers (CCSIDR)
- B3.12.13 c0, Cache Level ID Register (CLIDR)
- B3.12.14 c0, Implementation defined Auxiliary ID Register (AIDR)
- B3.12.15 c0, Cache Size Selection Register (CSSELR)
- B3.12.16 CP15 c1, System control registers
- B3.12.17 c1, System Control Register (SCTLR)
- B3.12.18 c1, Implementation defined Auxiliary Control Register (ACTLR)
- B3.12.19 c1, Coprocessor Access Control Register (CPACR)
- B3.12.20 c1, Secure Configuration Register (SCR)
- B3.12.21 c1, Secure Debug Enable Register (SDER)
- B3.12.22 c1, Non-Secure Access Control Register (NSACR)
- B3.12.23 CP15 c2 and c3, Memory protection and control registers
- B3.12.24 CP15 c2, Translation table support registers
- B3.12.25 c3, Domain Access Control Register (DACR)
- B3.12.26 CP15 c4, Not used
- B3.12.27 CP15 c5 and c6, Memory system fault registers
- B3.12.28 CP15 c5, Fault status registers
- B3.12.29 CP15 c6, Fault Address registers
- B3.12.30 CP15 c7, Cache maintenance and other functions
- B3.12.31 CP15 c7, Cache and branch predictor maintenance functions
- B3.12.32 CP15 c7, Virtual Address to Physical Address translation operations
- B3.12.33 CP15 c7, Miscellaneous functions
- B3.12.34 CP15 c8, TLB maintenance operations
- B3.12.35 CP15 c9, Cache and TCM lockdown registers and performance monitors
- B3.12.36 CP15 c10, Memory remapping and TLB control registers
- B3.12.37 CP15 c10, Memory Remap Registers
- B3.12.38 CP15 c11, Reserved for TCM DMA registers
- B3.12.39 CP15 c12, Security Extensions registers
- B3.12.40 c12, Vector Base Address Register (VBAR)
- B3.12.41 c12, Monitor Vector Base Address Register (MVBAR)
- B3.12.42 c12, Interrupt Status Register (ISR)
- B3.12.43 CP15 c13, Process, context and thread ID registers
- B3.12.44 c13, FCSE Process ID Register (FCSEIDR)
- B3.12.45 c13, Context ID Register (CONTEXTIDR)
- B3.12.46 CP15 c13 Software Thread ID registers
- B3.12.47 CP15 c14, Not used
- B3.12.48 CP15 c15, Implementation defined registers
- B3.13 Pseudocode details of VMSA memory system operations
- Protected Memory System Architecture (PMSA)
- B4.1 About the PMSA
- B4.2 Memory access control
- B4.3 Memory region attributes
- B4.4 PMSA memory aborts
- B4.5 Fault Status and Fault Address registers in a PMSA implementation
- B4.5.1 About the Fault Status and Fault Address registers
- B4.5.2 Data Abort exceptions
- B4.5.3 Prefetch Abort exceptions
- B4.5.4 Fault Status Register encodings for the PMSA
- B4.5.5 Distinguishing read and write accesses on Data Abort exceptions
- B4.5.6 Provision for classification of external aborts
- B4.5.7 Auxiliary Fault Status Registers
- B4.6 CP15 registers for a PMSA implementation
- B4.6.1 Organization of the CP15 registers in a PMSA implementation
- B4.6.2 General behavior of CP15 registers
- B4.6.3 Changes to CP15 registers and the memory order model
- B4.6.4 Meaning of fixed bit values in register diagrams
- B4.6.5 CP15 c0, ID codes registers
- B4.6.6 c0, Main ID Register (MIDR)
- B4.6.7 c0, Cache Type Register (CTR)
- B4.6.8 c0, TCM Type Register (TCMTR)
- B4.6.9 c0, MPU Type Register (MPUIR)
- B4.6.10 c0, Multiprocessor Affinity Register (MPIDR)
- B4.6.11 c0, Cache Size ID Registers (CCSIDR)
- B4.6.12 c0, Cache Level ID Register (CLIDR)
- B4.6.13 c0, Implementation defined Auxiliary ID Register (AIDR)
- B4.6.14 c0, Cache Size Selection Register (CSSELR)
- B4.6.15 CP15 c1, System control registers
- B4.6.16 c1, System Control Register (SCTLR)
- B4.6.17 c1, Implementation defined Auxiliary Control Register (ACTLR)
- B4.6.18 c1, Coprocessor Access Control Register (CPACR)
- B4.6.19 CP15 c2 and c3, Not used on a PMSA implementation
- B4.6.20 CP15 c4, Not used
- B4.6.21 CP15 c5 and c6, Memory system fault registers
- B4.6.22 CP15 c5, Fault status registers
- B4.6.23 CP15 c6, Fault Address registers
- B4.6.24 CP15 c6, Memory region programming registers
- c6, Data Region Base Address Register (DRBAR)
- c6, Instruction Region Base Address Register (IRBAR)
- c6, Data Region Size and Enable Register (DRSR)
- c6, Instruction Region Size and Enable Register (IRSR)
- c6, Data Region Access Control Register (DRACR)
- c6, Instruction Region Access Control Register (IRACR)
- c6, MPU Region Number Register (RGNR)
- B4.6.25 CP15 c7, Cache maintenance and other functions
- B4.6.26 CP15 c7, Cache and branch predictor maintenance functions
- B4.6.27 CP15 c7, Miscellaneous functions
- B4.6.28 CP15 c8, Not used on a PMSA implementation
- B4.6.29 CP15 c9, Cache and TCM lockdown registers and performance monitors
- B4.6.30 CP15 c10, Not used on a PMSA implementation
- B4.6.31 CP15 c11, Reserved for TCM DMA registers
- B4.6.32 CP15 c12, Not used on a PMSA implementation
- B4.6.33 CP15 c13, Context and Thread ID registers
- B4.6.34 c13, Context ID Register (CONTEXTIDR)
- B4.6.35 CP15 c13 Software Thread ID registers
- B4.6.36 CP15 c14, Not used
- B4.6.37 CP15 c15, Implementation defined registers
- B4.7 Pseudocode details of PMSA memory system operations
- The CPUID Identification Scheme
- B5.1 Introduction to the CPUID scheme
- B5.2 The CPUID registers
- B5.2.1 CP15 c0, Processor Feature registers
- B5.2.2 c0, Debug Feature Register 0 (ID_DFR0)
- B5.2.3 c0, Auxiliary Feature Register 0 (ID_AFR0)
- B5.2.4 CP15 c0, Memory Model Feature registers
- B5.2.5 CP15 c0, Instruction Set Attribute registers
- Instruction set descriptions in the CPUID scheme
- Summary of Instruction Set Attribute register attributes
- c0, Instruction Set Attribute Register 0 (ID_ISAR0)
- c0, Instruction Set Attribute Register 1 (ID_ISAR1)
- c0, Instruction Set Attribute Register 2 (ID_ISAR2)
- c0, Instruction Set Attribute Register 3 (ID_ISAR3)
- c0, Instruction Set Attribute Register 4 (ID_ISAR4)
- c0, Instruction Set Attribute Register 5 (ID_ISAR5)
- Accessing the Instruction Set Attribute registers
- B5.3 Advanced SIMD and VFP feature identification registers
- System Instructions
- B6.1 Alphabetical list of instructions
- B6.1.1 CPS
- B6.1.2 LDM (exception return)
- B6.1.3 LDM (user registers)
- B6.1.4 LDRBT, LDRHT, LDRSBT, LDRSHT, and LDRT
- B6.1.5 MRS
- B6.1.6 MSR (immediate)
- B6.1.7 MSR (register)
- B6.1.8 RFE
- B6.1.9 SMC (previously SMI)
- B6.1.10 SRS
- B6.1.11 STM (user registers)
- B6.1.12 STRBT, STRHT, and STRT
- B6.1.13 SUBS PC, LR and related instructions
- B6.1.14 VMRS
- B6.1.15 VMSR
- Debug Architecture
- Introduction to the ARM Debug Architecture
- Invasive Debug Authentication
- Debug Events
- C3.1 About debug events
- C3.2 Software debug events
- C3.2.1 Breakpoint debug events
- Generation of Breakpoint debug events
- Debug event generation conditions defined by the DBGBCR
- IVA comparisons for Debug event generation
- IVA comparisons and instruction length
- Context ID comparisons for Debug event generation
- Additional considerations for IVA mismatch breakpoints
- Additional conditions for linked BRPs
- C3.2.2 Watchpoint debug events
- C3.2.3 BKPT Instruction debug events
- C3.2.4 Vector Catch debug events
- C3.2.5 Memory addresses
- C3.2.6 Unpredictable behavior on Software debug events
- C3.2.7 Pseudocode details of Software debug events
- C3.3 Halting debug events
- C3.4 Generation of debug events
- C3.5 Debug event prioritization
- Debug Exceptions
- Debug State
- C5.1 About Debug state
- C5.2 Entering Debug state
- C5.3 Behavior of the PC and CPSR in Debug state
- C5.4 Executing instructions in Debug state
- C5.5 Privilege in Debug state
- C5.6 Behavior of non-invasive debug in Debug state
- C5.7 Exceptions in Debug state
- C5.8 Memory system behavior in Debug state
- C5.9 Leaving Debug state
- Debug Register Interfaces
- C6.1 About the debug register interfaces
- C6.2 Reset and power-down support
- C6.3 Debug register map
- C6.4 Synchronization of debug register updates
- C6.5 Access permissions
- C6.6 The CP14 debug register interfaces
- C6.7 The memory-mapped and recommended external debug interfaces
- Non-invasive Debug Authentication
- Sample-based Profiling
- Performance Monitors
- C9.1 About the performance monitors
- C9.2 Status in the ARM architecture
- C9.3 Accuracy of the performance monitors
- C9.4 Behavior on overflow
- C9.5 Interaction with Security Extensions
- C9.6 Interaction with trace
- C9.7 Interaction with power saving operations
- C9.8 CP15 c9 register map
- C9.9 Access permissions
- C9.10 Event numbers
- Debug Registers Reference
- C10.1 Accessing the debug registers
- C10.2 Debug identification registers
- C10.3 Control and status registers
- C10.3.1 Debug Status and Control Register (DBGDSCR)
- C10.3.2 Watchpoint Fault Address Register (DBGWFAR)
- C10.3.3 Debug Run Control Register (DBGDRCR), v7 Debug only
- C10.3.4 Device Power-down and Reset Control Register (DBGPRCR), v7 Debug only
- C10.3.5 Device Power-down and Reset Status Register (DBGPRSR), v7 Debug only
- C10.3.6 Program Counter Sampling Register (DBGPCSR)
- C10.3.7 Context ID Sampling Register (DBGCIDSR)
- C10.4 Instruction and data transfer registers
- C10.5 Software debug event registers
- C10.6 OS Save and Restore registers, v7 Debug only
- C10.7 Memory system control registers
- C10.8 Management registers, ARMv7 only
- C10.8.1 Processor identification registers
- C10.8.2 Integration Mode Control Register (DBGITCTRL)
- C10.8.3 Claim Tag Set Register (DBGCLAIMSET)
- C10.8.4 Claim Tag Clear Register (DBGCLAIMCLR)
- C10.8.5 Lock Access Register (DBGLAR)
- C10.8.6 Lock Status Register (DBGLSR)
- C10.8.7 Authentication Status Register (DBGAUTHSTATUS)
- C10.8.8 Device Type Register (DBGDEVTYPE)
- C10.8.9 Debug Peripheral Identification Registers (DBGPID0 to DBGPID4)
- C10.8.10 Debug Component Identification Registers (DBGCID0 to DBGCID3)
- C10.9 Performance monitor registers
- C10.9.1 c9, Performance Monitor Control Register (PMCR)
- C10.9.2 c9, Count Enable Set Register (PMCNTENSET)
- C10.9.3 c9, Count Enable Clear Register (PMCNTENCLR)
- C10.9.4 c9, Overflow Flag Status Register (PMOVSR)
- C10.9.5 c9, Software Increment Register (PMSWINC)
- C10.9.6 c9, Event Counter Selection Register (PMSELR)
- C10.9.7 c9, Cycle Count Register (PMCCNTR)
- C10.9.8 c9, Event Type Select Register (PMXEVTYPER)
- C10.9.9 c9, Event Count Register (PMXEVCNTR)
- C10.9.10 c9, User Enable Register (PMUSERENR)
- C10.9.11 c9, Interrupt Enable Set Register (PMINTENSET)
- C10.9.12 c9, Interrupt Enable Clear Register (PMINTENCLR)
- Appendices
- Recommended External Debug Interface
- A.1 System integration signals
- A.2 Recommended debug slave port
- Common VFP Subarchitecture Specification
- B.1 Scope of this appendix
- B.2 Introduction to the Common VFP subarchitecture
- B.3 Exception processing
- B.4 Support code requirements
- B.5 Context switching
- B.6 Subarchitecture additions to the VFP system registers
- B.7 Version 1 of the Common VFP subarchitecture
- B.8 Version 2 of the Common VFP subarchitecture
- Legacy Instruction Mnemonics
- Deprecated and Obsolete Features
- D.1 Deprecated features
- D.1.1 VFP vector mode
- D.1.2 VFP FLDMX and FSTMX instructions
- D.1.3 Fast context switch extension
- D.1.4 Direct manipulation of the Endianness bit
- D.1.5 Strongly-ordered memory accesses and interrupt masks
- D.1.6 Unaligned exception returns
- D.1.7 Use of AP[2] = 1, AP[1:0] = 0b10 in MMU access permissions
- D.1.8 The Domain field in the DFSR
- D.1.9 Watchpoint Fault Address Register in CP15
- D.1.10 CP15 memory barrier operations
- D.1.11 Use of Hivecs exception base address in PMSA implementations
- D.1.12 Use of Secure User halting debug
- D.1.13 Escalation of privilege on CP14 and CP15 accesses in Debug state
- D.1.14 Interrupts or asynchronous aborts in a sequence of memory transactions
- D.1.15 Reading the Debug Program Counter Sampling Registers as register 33
- D.1.16 Old mnemonics for CP15 c8 operations to invalidate entries in a unified TLB
- D.2 Deprecated terminology
- D.3 Obsolete features
- D.4 Semaphore instructions
- D.5 Use of the SP as a general-purpose register
- D.6 Explicit use of the PC in ARM instructions
- D.7 Deprecated Thumb instructions
- Fast Context Switch Extension (FCSE)
- VFP Vector Operation Support
- ARMv6 Differences
- G.1 Introduction to ARMv6
- G.2 Application level register support
- G.3 Application level memory support
- G.4 Instruction set support
- G.5 System level register support
- G.6 System level memory model
- G.7 System Control coprocessor (CP15) support
- G.7.1 Organization of CP15 registers for an ARMv6 VMSA implementation
- G.7.2 Organization of CP15 registers for an ARMv6 PMSA implementation
- G.7.3 c0, ID support
- G.7.4 c1, System control support
- G.7.5 c1, VMSA Security Extensions support
- G.7.6 c2 and c3, VMSA memory protection and control registers
- G.7.7 c5 and c6, VMSA memory system support
- G.7.8 c5 and c6, PMSA memory system support
- G.7.9 c6, Watchpoint Fault Address Register (DBGWFAR)
- G.7.10 c7, Cache operations
- G.7.11 c7, Miscellaneous functions
- G.7.12 c7, VMSA virtual to physical address translation support
- G.7.13 c8, VMSA TLB support
- G.7.14 c9, Cache lockdown support
- G.7.15 c9, TCM support
- G.7.16 c9, VMSA support for the Security Extensions
- G.7.17 c10, VMSA memory remapping support
- G.7.18 c10, VMSA TLB lockdown support
- G.7.19 c11, DMA support
- G.7.20 c12, VMSA support for the Security Extensions
- G.7.21 c13, Context ID support
- G.7.22 c15, implementation defined
- ARMv4 and ARMv5 Differences
- H.1 Introduction to ARMv4 and ARMv5
- H.2 Application level register support
- H.3 Application level memory support
- H.4 Instruction set support
- H.5 System level register support
- H.6 System level memory model
- H.7 System Control coprocessor (CP15) support
- H.7.1 Organization of CP15 registers in an ARMv4 or ARMv5 VMSA implementation
- H.7.2 Organization of CP15 registers in an ARMv4 or ARMv5 PMSA implementation
- H.7.3 c0, ID support
- H.7.4 c1, System control register support
- H.7.5 c2 and c3, VMSA memory protection and control registers
- H.7.6 c5 and c6, VMSA memory system support
- H.7.7 c2, c3, c5, and c6, PMSA support
- H.7.8 c7, Cache operations
- H.7.9 c7, Miscellaneous functions
- H.7.10 c8, VMSA TLB support
- H.7.11 c9, cache lockdown support
- H.7.12 c9, TCM support
- H.7.13 c10, VMSA TLB lockdown support
- H.7.14 c13, VMSA FCSE support
- H.7.15 c15, implementation defined
- Pseudocode Definition
- I.1 Instruction encoding diagrams and pseudocode
- I.2 Limitations of pseudocode
- I.3 Data types
- I.4 Expressions
- I.5 Operators and built-in functions
- I.5.1 Operations on generic types
- I.5.2 Operations on booleans
- I.5.3 Bitstring manipulation
- Bitstring length and most significant bit
- Bitstring concatenation and replication
- Bitstring extraction
- Logical operations on bitstrings
- Bitstring count
- Testing a bitstring for being all zero or all ones
- Lowest and highest set bits of a bitstring
- Zero-extension and sign-extension of bitstrings
- Converting bitstrings to integers
- I.5.4 Arithmetic
- I.6 Statements and program structure
- I.7 Miscellaneous helper procedures and functions
- I.7.1 ArchVersion()
- I.7.2 BadReg()
- I.7.3 Breakpoint()
- I.7.4 CallSupervisor()
- I.7.5 Coproc_Accepted()
- I.7.6 Coproc_DoneLoading()
- I.7.7 Coproc_DoneStoring()
- I.7.8 Coproc_GetOneWord()
- I.7.9 Coproc_GetTwoWords()
- I.7.10 Coproc_GetWordToStore()
- I.7.11 Coproc_InternalOperation()
- I.7.12 Coproc_SendLoadedWord()
- I.7.13 Coproc_SendOneWord()
- I.7.14 Coproc_SendTwoWords()
- I.7.15 EndOfInstruction()
- I.7.16 GenerateAlignmentException()
- I.7.17 GenerateCoprocessorException()
- I.7.18 GenerateIntegerZeroDivide()
- I.7.19 HaveMPExt()
- I.7.20 Hint_Debug()
- I.7.21 Hint_PreloadData()
- I.7.22 Hint_PreloadDataForWrite()
- I.7.23 Hint_PreloadInstr()
- I.7.24 Hint_Yield()
- I.7.25 IntegerZeroDivideTrappingEnabled()
- I.7.26 IsExternalAbort()
- I.7.27 JazelleAcceptsExecution()
- I.7.28 MemorySystemArchitecture()
- I.7.29 ProcessorID()
- I.7.30 RemapRegsHaveResetValues()
- I.7.31 SwitchToJazelleExecution()
- I.7.32 ThisInstr()
- I.7.33 UnalignedSupport()
- Pseudocode Index
- Register Index
- Glossary

Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved.
ARM DDI 0406B
ARM® Architecture
Reference Manual
ARM®v7-A and ARM®v7-R edition

ARM Architecture Reference Manual
ARMv7-A and ARMv7-R edition
Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved.
Release Information
The following changes have been made to this document.
Change History
Date           Issue  Confidentiality   Change
05 April 2007  A      Non-Confidential  New edition for ARMv7-A and ARMv7-R architecture profiles.
                                        Document number changed from ARM DDI 0100 to ARM DDI 0406 and
                                        contents restructured.
29 April 2008  B      Non-Confidential  Addition of the VFP Half-precision and Multiprocessing
                                        Extensions, and many clarifications and enhancements.
From ARMv7, the ARM® architecture defines different architectural profiles and this edition of this manual describes
only the A and R profiles. For details of the documentation of the ARMv7-M profile see Further reading on page xx.
Before ARMv7 there was only a single ARM Architecture Reference Manual, with document number DDI 0100. The first
issue of this was in February 1996, and the final issue, Issue I, was in July 2005. For more information see Further reading
on page xx.
Proprietary Notice
Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM Limited in the EU and other
countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be
the trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted
or reproduced in any material form except with the prior written permission of the copyright holder.
The product described in this document is subject to continuous developments and improvements. All particulars of the
product and its use contained in this document are given by ARM in good faith. However, all warranties implied or
expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.
1. Subject to the provisions set out below, ARM hereby grants to you a perpetual, non-exclusive, nontransferable, royalty
free, worldwide licence to use this ARM Architecture Reference Manual for the purposes of developing; (i) software
applications or operating systems which are targeted to run on microprocessor cores distributed under licence from ARM;
(ii) tools which are designed to develop software programs which are targeted to run on microprocessor cores distributed
under licence from ARM; (iii) or having developed integrated circuits which incorporate a microprocessor core
manufactured under licence from ARM.
2. Except as expressly licensed in Clause 1 you acquire no right, title or interest in the ARM Architecture Reference
Manual, or any Intellectual Property therein. In no event shall the licences granted in Clause 1, be construed as granting
you expressly or by implication, estoppel or otherwise, licences to any ARM technology other than the ARM Architecture
Reference Manual. The licence grant in Clause 1 expressly excludes any rights for you to use or take into use any ARM
patents. No right is granted to you under the provisions of Clause 1 to; (i) use the ARM Architecture Reference Manual
for the purposes of developing or having developed microprocessor cores or models thereof which are compatible in
whole or part with either or both the instructions or programmers’ models described in this ARM Architecture Reference
Manual; or (ii) develop or have developed models of any microprocessor cores designed by or for ARM; or (iii) distribute
in whole or in part this ARM Architecture Reference Manual to third parties, other than to your subcontractors for the
purposes of having developed products in accordance with the licence grant in Clause 1 without the express written
permission of ARM; or (iv) translate or have translated this ARM Architecture Reference Manual into any other
languages.
3. THE ARM ARCHITECTURE REFERENCE MANUAL IS PROVIDED "AS IS" WITH NO WARRANTIES
EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF
SATISFACTORY QUALITY, NONINFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE.
4. No licence, express, implied or otherwise, is granted to LICENSEE, under the provisions of Clause 1, to use the ARM
tradename, in connection with the use of the ARM Architecture Reference Manual or any products based thereon.
Nothing in Clause 1 shall be construed as authority for you to make any representations on behalf of ARM in respect of
the ARM Architecture Reference Manual or any products based thereon.
Where the term ARM is used to refer to the company it means “ARM or any of its subsidiaries as appropriate”.
Note
The term ARM is also used to refer to versions of the ARM architecture, for example ARMv6 refers to version 6 of the
ARM architecture. The context makes it clear when the term is used in this way.
Copyright © 1996-1998, 2000, 2004-2008 ARM Limited
110 Fulbourn Road Cambridge, England CB1 9NJ
Restricted Rights Legend: Use, duplication or disclosure by the United States Government is subject to the restrictions
set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19.
This document is Non-Confidential. The right to use, copy and disclose this document is subject to the licence set out
above.

Contents
ARM Architecture Reference Manual
ARMv7-A and ARMv7-R edition
Preface
About this manual ............................................................................... xiv
Using this manual ................................................................................ xv
Conventions ....................................................................................... xviii
Further reading .................................................................................... xx
Feedback ............................................................................................ xxi
Part A Application Level Architecture
Chapter A1 Introduction to the ARM Architecture
A1.1 About the ARM architecture ............................................................. A1-2
A1.2 The ARM and Thumb instruction sets .............................................. A1-3
A1.3 Architecture versions, profiles, and variants .................................... A1-4
A1.4 Architecture extensions .................................................................... A1-6
A1.5 The ARM memory model ................................................................. A1-7
A1.6 Debug .............................................................................................. A1-8
Chapter A2 Application Level Programmers’ Model
A2.1 About the Application level programmers’ model ............................. A2-2

A2.2 ARM core data types and arithmetic ................................................ A2-3
A2.3 ARM core registers ........................................................................ A2-11
A2.4 The Application Program Status Register (APSR) ......................... A2-14
A2.5 Execution state registers ................................................................ A2-15
A2.6 Advanced SIMD and VFP extensions ............................................ A2-20
A2.7 Floating-point data types and arithmetic ........................................ A2-32
A2.8 Polynomial arithmetic over {0,1} .................................................... A2-67
A2.9 Coprocessor support ...................................................................... A2-68
A2.10 Execution environment support ..................................................... A2-69
A2.11 Exceptions, debug events and checks ........................................... A2-81
Chapter A3 Application Level Memory Model
A3.1 Address space ................................................................................. A3-2
A3.2 Alignment support ............................................................................ A3-4
A3.3 Endian support ................................................................................. A3-7
A3.4 Synchronization and semaphores .................................................. A3-12
A3.5 Memory types and attributes and the memory order model .......... A3-24
A3.6 Access rights .................................................................................. A3-38
A3.7 Virtual and physical addressing ..................................................... A3-40
A3.8 Memory access order .................................................................... A3-41
A3.9 Caches and memory hierarchy ...................................................... A3-51
Chapter A4 The Instruction Sets
A4.1 About the instruction sets ................................................................. A4-2
A4.2 Unified Assembler Language ........................................................... A4-4
A4.3 Branch instructions .......................................................................... A4-7
A4.4 Data-processing instructions ............................................................ A4-8
A4.5 Status register access instructions ................................................ A4-18
A4.6 Load/store instructions ................................................................... A4-19
A4.7 Load/store multiple instructions ..................................................... A4-22
A4.8 Miscellaneous instructions ............................................................. A4-23
A4.9 Exception-generating and exception-handling instructions ............ A4-24
A4.10 Coprocessor instructions ............................................................... A4-25
A4.11 Advanced SIMD and VFP load/store instructions .......................... A4-26
A4.12 Advanced SIMD and VFP register transfer instructions ................. A4-29
A4.13 Advanced SIMD data-processing operations ................................. A4-30
A4.14 VFP data-processing instructions .................................................. A4-38
Chapter A5 ARM Instruction Set Encoding
A5.1 ARM instruction set encoding .......................................................... A5-2
A5.2 Data-processing and miscellaneous instructions ............................. A5-4
A5.3 Load/store word and unsigned byte ............................................... A5-19
A5.4 Media instructions .......................................................................... A5-21
A5.5 Branch, branch with link, and block data transfer .......................... A5-27
A5.6 Supervisor Call, and coprocessor instructions ............................... A5-28
A5.7 Unconditional instructions .............................................................. A5-30

Chapter A6 Thumb Instruction Set Encoding
A6.1 Thumb instruction set encoding ....................................................... A6-2
A6.2 16-bit Thumb instruction encoding ................................................... A6-6
A6.3 32-bit Thumb instruction encoding ................................................. A6-14
Chapter A7 Advanced SIMD and VFP Instruction Encoding
A7.1 Overview .......................................................................................... A7-2
A7.2 Advanced SIMD and VFP instruction syntax ................................... A7-3
A7.3 Register encoding ............................................................................ A7-8
A7.4 Advanced SIMD data-processing instructions ............................... A7-10
A7.5 VFP data-processing instructions .................................................. A7-24
A7.6 Extension register load/store instructions ...................................... A7-26
A7.7 Advanced SIMD element or structure load/store instructions ........ A7-27
A7.8 8, 16, and 32-bit transfer between ARM core and extension registers ..... A7-31
A7.9 64-bit transfers between ARM core and extension registers ......... A7-32
Chapter A8 Instruction Details
A8.1 Format of instruction descriptions .................................................... A8-2
A8.2 Standard assembler syntax fields .................................................... A8-7
A8.3 Conditional execution ....................................................................... A8-8
A8.4 Shifts applied to a register ............................................................. A8-10
A8.5 Memory accesses .......................................................................... A8-13
A8.6 Alphabetical list of instructions ....................................................... A8-14
Chapter A9 ThumbEE
A9.1 The ThumbEE instruction set ........................................................... A9-2
A9.2 ThumbEE instruction set encoding .................................................. A9-6
A9.3 Additional instructions in Thumb and ThumbEE instruction sets ..... A9-7
A9.4 ThumbEE instructions with modified behavior ................................. A9-8
A9.5 Additional ThumbEE instructions ................................................... A9-14
Part B System Level Architecture
Chapter B1 The System Level Programmers’ Model
B1.1 About the system level programmers’ model ................................... B1-2
B1.2 System level concepts and terminology ........................................... B1-3
B1.3 ARM processor modes and core registers ....................................... B1-6
B1.4 Instruction set states ...................................................................... B1-23
B1.5 The Security Extensions ................................................................ B1-25
B1.6 Exceptions ..................................................................................... B1-30
B1.7 Coprocessors and system control .................................................. B1-62
B1.8 Advanced SIMD and floating-point support .................................... B1-64
B1.9 Execution environment support ..................................................... B1-73

Chapter B2 Common Memory System Architecture Features
B2.1 About the memory system architecture ........................................... B2-2
B2.2 Caches ............................................................................................. B2-3
B2.3 Implementation defined memory system features ......................... B2-27
B2.4 Pseudocode details of general memory system operations .......... B2-29
Chapter B3 Virtual Memory System Architecture (VMSA)
B3.1 About the VMSA .............................................................................. B3-2
B3.2 Memory access sequence ............................................................... B3-4
B3.3 Translation tables ............................................................................. B3-7
B3.4 Address mapping restrictions ......................................................... B3-23
B3.5 Secure and Non-secure address spaces ....................................... B3-26
B3.6 Memory access control .................................................................. B3-28
B3.7 Memory region attributes ............................................................... B3-32
B3.8 VMSA memory aborts .................................................................... B3-40
B3.9 Fault Status and Fault Address registers in a VMSA implementation ...... B3-48
B3.10 Translation Lookaside Buffers (TLBs) ............................................ B3-54
B3.11 Virtual Address to Physical Address translation operations ........... B3-63
B3.12 CP15 registers for a VMSA implementation .................................. B3-64
B3.13 Pseudocode details of VMSA memory system operations .......... B3-156
Chapter B4 Protected Memory System Architecture (PMSA)
B4.1 About the PMSA .............................................................................. B4-2
B4.2 Memory access control .................................................................... B4-9
B4.3 Memory region attributes ............................................................... B4-11
B4.4 PMSA memory aborts .................................................................... B4-13
B4.5 Fault Status and Fault Address registers in a PMSA implementation ...... B4-18
B4.6 CP15 registers for a PMSA implementation .................................. B4-22
B4.7 Pseudocode details of PMSA memory system operations ............ B4-79
Chapter B5 The CPUID Identification Scheme
B5.1 Introduction to the CPUID scheme .................................................. B5-2
B5.2 The CPUID registers ........................................................................ B5-4
B5.3 Advanced SIMD and VFP feature identification registers .............. B5-34
Chapter B6 System Instructions
B6.1 Alphabetical list of instructions ......................................................... B6-2
Part C Debug Architecture
Chapter C1 Introduction to the ARM Debug Architecture
C1.1 Scope of part C of this manual ......................................................... C1-2
C1.2 About the ARM Debug architecture ................................................. C1-3

C1.3 Security Extensions and debug ....................................................... C1-8
C1.4 Register interfaces ........................................................................... C1-9
Chapter C2 Invasive Debug Authentication
C2.1 About invasive debug authentication ............................................... C2-2
Chapter C3 Debug Events
C3.1 About debug events ......................................................................... C3-2
C3.2 Software debug events .................................................................... C3-5
C3.3 Halting debug events ..................................................................... C3-38
C3.4 Generation of debug events ........................................................... C3-40
C3.5 Debug event prioritization .............................................................. C3-43
Chapter C4 Debug Exceptions
C4.1 About debug exceptions .................................................................. C4-2
C4.2 Effects of debug exceptions on CP15 registers and the DBGWFAR ........ C4-4
Chapter C5 Debug State
C5.1 About Debug state ........................................................................... C5-2
C5.2 Entering Debug state ....................................................................... C5-3
C5.3 Behavior of the PC and CPSR in Debug state ................................. C5-7
C5.4 Executing instructions in Debug state .............................................. C5-9
C5.5 Privilege in Debug state ................................................................. C5-13
C5.6 Behavior of non-invasive debug in Debug state ............................. C5-19
C5.7 Exceptions in Debug state ............................................................. C5-20
C5.8 Memory system behavior in Debug state ....................................... C5-24
C5.9 Leaving Debug state ...................................................................... C5-28
Chapter C6 Debug Register Interfaces
C6.1 About the debug register interfaces ................................................. C6-2
C6.2 Reset and power-down support ....................................................... C6-4
C6.3 Debug register map ....................................................................... C6-18
C6.4 Synchronization of debug register updates .................................... C6-24
C6.5 Access permissions ....................................................................... C6-26
C6.6 The CP14 debug register interfaces .............................................. C6-32
C6.7 The memory-mapped and recommended external debug interfaces ....... C6-43
Chapter C7 Non-invasive Debug Authentication
C7.1 About non-invasive debug authentication ........................................ C7-2
C7.2 v7 Debug non-invasive debug authentication .................................. C7-4
C7.3 Effects of non-invasive debug authentication .................................. C7-6
C7.4 ARMv6 non-invasive debug authentication ...................................... C7-8

Chapter C8 Sample-based Profiling
C8.1 Program Counter sampling .............................................................. C8-2
Chapter C9 Performance Monitors
C9.1 About the performance monitors ...................................................... C9-2
C9.2 Status in the ARM architecture ........................................................ C9-4
C9.3 Accuracy of the performance monitors ............................................ C9-5
C9.4 Behavior on overflow ....................................................................... C9-6
C9.5 Interaction with Security Extensions ................................................ C9-7
C9.6 Interaction with trace ........................................................................ C9-8
C9.7 Interaction with power saving operations ......................................... C9-9
C9.8 CP15 c9 register map .................................................................... C9-10
C9.9 Access permissions ....................................................................... C9-12
C9.10 Event numbers ............................................................................... C9-13
Chapter C10 Debug Registers Reference
C10.1 Accessing the debug registers ....................................................... C10-2
C10.2 Debug identification registers ......................................................... C10-3
C10.3 Control and status registers ......................................................... C10-10
C10.4 Instruction and data transfer registers ......................................... C10-40
C10.5 Software debug event registers ................................................... C10-48
C10.6 OS Save and Restore registers, v7 Debug only .......................... C10-75
C10.7 Memory system control registers ................................................. C10-80
C10.8 Management registers, ARMv7 only ............................................ C10-88
C10.9 Performance monitor registers ................................................... C10-105
Appendix A Recommended External Debug Interface
A.1 System integration signals ......................................................... AppxA-2
A.2 Recommended debug slave port ............................................. AppxA-13
Appendix B Common VFP Subarchitecture Specification
B.1 Scope of this appendix ............................................................... AppxB-2
B.2 Introduction to the Common VFP subarchitecture ..................... AppxB-3
B.3 Exception processing ................................................................. AppxB-6
B.4 Support code requirements ...................................................... AppxB-11
B.5 Context switching ..................................................................... AppxB-14
B.6 Subarchitecture additions to the VFP system registers ........... AppxB-15
B.7 Version 1 of the Common VFP subarchitecture ....................... AppxB-23
B.8 Version 2 of the Common VFP subarchitecture ....................... AppxB-24
Appendix C Legacy Instruction Mnemonics
C.1 Thumb instruction mnemonics ................................................... AppxC-2
C.2 Pre-UAL pseudo-instruction NOP .............................................. AppxC-3

Appendix D Deprecated and Obsolete Features
D.1 Deprecated features .................................................................. AppxD-2
D.2 Deprecated terminology ............................................................. AppxD-5
D.3 Obsolete features ....................................................................... AppxD-6
D.4 Semaphore instructions ............................................................. AppxD-7
D.5 Use of the SP as a general-purpose register ............................. AppxD-8
D.6 Explicit use of the PC in ARM instructions ................................. AppxD-9
D.7 Deprecated Thumb instructions ............................................... AppxD-10
Appendix E Fast Context Switch Extension (FCSE)
E.1 About the FCSE ......................................................................... AppxE-2
E.2 Modified virtual addresses ......................................................... AppxE-3
E.3 Debug and trace ........................................................................ AppxE-5
Appendix F VFP Vector Operation Support
F.1 About VFP vector mode ............................................................. AppxF-2
F.2 Vector length and stride control
F.3 VFP register banks
F.4 VFP instruction type selection
Appendix G ARMv6 Differences
G.1 Introduction to ARMv6
G.2 Application level register support
G.3 Application level memory support
G.4 Instruction set support
G.5 System level register support
G.6 System level memory model
G.7 System Control coprocessor (CP15) support
Appendix H ARMv4 and ARMv5 Differences
H.1 Introduction to ARMv4 and ARMv5
H.2 Application level register support
H.3 Application level memory support
H.4 Instruction set support
H.5 System level register support
H.6 System level memory model
H.7 System Control coprocessor (CP15) support
Appendix I Pseudocode Definition
I.1 Instruction encoding diagrams and pseudocode
I.2 Limitations of pseudocode
I.3 Data types
I.4 Expressions
I.5 Operators and built-in functions
I.6 Statements and program structure
I.7 Miscellaneous helper procedures and functions

Contents
xii Copyright © 1996-1998, 2000, 2004-2008 ARM Limited. All rights reserved. ARM DDI 0406B
Appendix J Pseudocode Index
J.1 Pseudocode operators and keywords
J.2 Pseudocode functions and procedures
Appendix K Register Index
K.1 Register index
Glossary

Preface
This preface summarizes the contents of this manual and lists the conventions it uses. It contains the
following sections:
• About this manual on page xiv
• Using this manual on page xv
• Conventions on page xviii
• Further reading on page xx
• Feedback on page xxi.

About this manual
This manual describes the ARM®v7 instruction set architecture, including its high code density Thumb®
instruction encoding and the following extensions to it:
• The System Control coprocessor, coprocessor 15 (CP15), used to control memory system
components such as caches, write buffers, Memory Management Units, and Protection Units.
• The optional Advanced SIMD extension, that provides high-performance integer and
single-precision floating-point vector operations.
• The optional VFP extension, that provides high-performance floating-point operations. It can
optionally support double-precision operations.
• The Debug architecture, that provides software access to debug features in ARM processors.
Part A describes the application level view of the architecture. It describes the application level view of the
programmers’ model and the memory model. It also describes the precise effects of each instruction in User
mode (the normal operating mode), including any restrictions on its use. This information is of primary
importance to authors and users of compilers, assemblers, and other programs that generate ARM machine
code.
Part B describes the system level view of the architecture. It gives details of system registers that are not
accessible from User mode, and the system level view of the memory model. It also gives full details of the
effects of instructions in privileged modes (any mode other than User mode), where these are different from
their effects in User mode.
Part C describes the Debug architecture. This is an extension to the ARM architecture that provides
configuration, breakpoint and watchpoint support, and a Debug Communications Channel (DCC) to a debug
host.
Assembler syntax is given for the instructions described in this manual, permitting instructions to be
specified in textual form. However, this manual is not intended as tutorial material for ARM assembler
language, nor does it describe ARM assembler language at anything other than a very basic level. To make
effective use of ARM assembler language, consult the documentation supplied with the assembler being
used.

Using this manual
The information in this manual is organized into four parts, as described below.
Part A, Application Level Architecture
Part A describes the application level view of the architecture. It contains the following chapters:
Chapter A1 Gives a brief overview of the ARM architecture, and the ARM and Thumb instruction sets.
Chapter A2 Describes the application level view of the ARM programmers’ model, including the
application level view of the Advanced SIMD and VFP extensions. It describes the types of
value that ARM instructions operate on, the general-purpose registers that contain those
values, and the Application Program Status Register.
Chapter A3 Describes the application level view of the memory model, including the ARM memory
types and attributes, and memory access control.
Chapter A4 Describes the range of instructions available in the ARM, Thumb, Advanced SIMD, and
VFP instruction sets. It also contains some details of instruction operation, where these are
common to several instructions.
Chapter A5 Gives details of the encoding of the ARM instruction set.
Chapter A6 Gives details of the encoding of the Thumb instruction set.
Chapter A7 Gives details of the encoding of the Advanced SIMD and VFP instruction sets.
Chapter A8 Provides detailed reference information about every instruction available in the Thumb,
ARM, Advanced SIMD, and VFP instruction sets, with the exception of information only
relevant in privileged modes.
Chapter A9 Provides detailed reference information about the ThumbEE (Execution Environment)
variant of the Thumb instruction set.

Part B, System Level Architecture
Part B describes the system level view of the architecture. It contains the following chapters:
Chapter B1 Describes the system level view of the programmers’ model.
Chapter B2 Describes the system level view of the memory model features that are common to all
memory systems.
Chapter B3 Describes the system level view of the Virtual Memory System Architecture (VMSA) that
is part of all ARMv7-A implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a VMSA implementation.
Chapter B4 Describes the system level view of the Protected Memory System Architecture (PMSA) that
is part of all ARMv7-R implementations. This chapter includes descriptions of all of the
CP15 System Control Coprocessor registers in a PMSA implementation.
Chapter B5 Describes the CPUID scheme.
Chapter B6 Provides detailed reference information about system instructions, and more information
about instructions where they behave differently in privileged modes.
Part C, Debug Architecture
Part C describes the Debug architecture. It contains the following chapters:
Chapter C1 Gives a brief introduction to the Debug architecture.
Chapter C2 Describes the authentication of invasive debug.
Chapter C3 Describes the debug events.
Chapter C4 Describes the debug exceptions.
Chapter C5 Describes Debug state.
Chapter C6 Describes the permitted debug register interfaces.
Chapter C7 Describes the authentication of non-invasive debug.
Chapter C8 Describes sample-based profiling.
Chapter C9 Describes the ARM performance monitors.
Chapter C10 Describes the debug registers.

Part D, Appendices
This manual contains the following appendices:
Appendix A Describes the recommended external Debug interfaces.
Note
This description is not part of the ARM architecture specification. It is included here only
as supplementary information, for the convenience of developers and users who might
require this information.
Appendix B The Common VFP subarchitecture specification.
Note
This specification is not part of the ARM architecture specification. This sub-architectural
information is included here only as supplementary information, for the convenience of
developers and users who might require this information.
Appendix C Describes the legacy mnemonics.
Appendix D Identifies the deprecated architectural features.
Appendix E Describes the Fast Context Switch Extension (FCSE). From ARMv6, the use of this feature
is deprecated, and in ARMv7 the FCSE is optional.
Appendix F Describes the VFP vector operations. Use of these operations is deprecated in ARMv7.
Appendix G Describes the differences in the ARMv6 architecture.
Appendix H Describes the differences in the ARMv4 and ARMv5 architectures.
Appendix I The formal definition of the pseudocode.
Appendix J Index to definitions of pseudocode operators, keywords, functions, and procedures.
Appendix K Index to register descriptions in the manual.

Conventions
This manual employs typographic and other conventions intended to improve its ease of use.
General typographic conventions
typewriter Is used for assembler syntax descriptions, pseudocode descriptions of instructions,
and source code examples. In the cases of assembler syntax descriptions and
pseudocode descriptions, see the additional conventions below.
The typewriter style is also used in the main text for instruction mnemonics and for
references to other items appearing in assembler syntax descriptions, pseudocode
descriptions of instructions and source code examples.
italic Highlights important notes, introduces special terminology, and denotes internal
cross-references and citations.
bold Is used for emphasis in descriptive lists and elsewhere, where appropriate.
SMALL CAPITALS Are used for a few terms that have specific technical meanings. Their meanings can
be found in the Glossary.
Signals
In general this specification does not define processor signals, but it does include some signal examples and
recommendations. It uses the following signal conventions:
Signal level The level of an asserted signal depends on whether the signal is active-HIGH or
active-LOW. Asserted means:
• HIGH for active-HIGH signals
• LOW for active-LOW signals.
Lower-case n At the start or end of a signal name denotes an active-LOW signal.
Numbers
Numbers are normally written in decimal. Binary numbers are preceded by 0b, and hexadecimal numbers
by 0x and written in a typewriter font.
Bit values
Values of bits and bitfields are normally given in binary, in single quotes. The quotes are normally omitted
in encoding diagrams and tables.
Pseudocode descriptions
This manual uses a form of pseudocode to provide precise descriptions of the specified functionality. This
pseudocode is written in a typewriter font, and is described in Appendix I Pseudocode Definition.

Assembler syntax descriptions
This manual contains numerous syntax descriptions for assembler instructions and for components of
assembler instructions. These are shown in a typewriter font, and use the conventions described in
Assembler syntax on page A8-4.

Further reading
This section lists publications from both ARM and third parties that provide more information on the ARM
family of processors.
ARM periodically provides updates and corrections to its documentation. See http://www.arm.com for
current errata sheets and addenda, and the ARM Frequently Asked Questions.
ARM publications
• ARM Debug Interface v5 Architecture Specification (ARM IHI 0031)
• ARMv7-M Architecture Reference Manual (ARM DDI 0403)
• CoreSight Architecture Specification (ARM IHI 0029)
• ARM Architecture Reference Manual (ARM DDI 0100I)
Note
— Issue I of the ARM Architecture Reference Manual (DDI 0100I) was issued in July 2005 and
describes the first version of the ARMv6 architecture, and all previous architecture versions.
— Addison-Wesley Professional publish ARM Architecture Reference Manual, Second Edition
(December 27, 2000). The contents of this are identical to Issue E of the ARM Architecture
Reference Manual (DDI 0100E). It describes ARMv5TE and earlier versions of the ARM
architecture, and is superseded by DDI 0100I.
• Embedded Trace Macrocell Architecture Specification (ARM IHI 0014)
• CoreSight Program Flow Trace Architecture Specification (ARM IHI 0035).
External publications
The following books are referred to in this manual, or provide more information:
• IEEE Std 1596.5-1993, IEEE Standard for Shared-Data Formats Optimized for Scalable Coherent
Interface (SCI) Processors, ISBN 1-55937-354-7
• IEEE Std 1149.1-2001, IEEE Standard Test Access Port and Boundary Scan Architecture (JTAG)
• ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic
• JEP106, Standard Manufacturers Identification Code, JEDEC Solid State Technology Association
• The Java Virtual Machine Specification Second Edition, Tim Lindholm and Frank Yellin, published
by Addison Wesley (ISBN: 0-201-43294-3)
• Memory Consistency Models for Shared-Memory Multiprocessors, Kourosh Gharachorloo, Stanford
University Technical Report CSL-TR-95-685

Feedback
ARM welcomes feedback on its documentation.
Feedback on this manual
If you notice any errors or omissions in this manual, send e-mail to errata@arm.com giving:
• the document title
• the document number
• the page number(s) to which your comments apply
• a concise explanation of the problem.
General suggestions for additions and improvements are also welcome.

Part A
Application Level Architecture

Chapter A1
Introduction to the ARM Architecture
This chapter introduces the ARM architecture and contains the following sections:
• About the ARM architecture on page A1-2
• The ARM and Thumb instruction sets on page A1-3
• Architecture versions, profiles, and variants on page A1-4
• Architecture extensions on page A1-6
• The ARM memory model on page A1-7
• Debug on page A1-8.

A1.1 About the ARM architecture
The ARM architecture supports implementations across a wide range of performance points. It is
established as the dominant architecture in many market segments. The architectural simplicity of ARM
processors leads to very small implementations, and small implementations mean devices can have very low
power consumption. Implementation size, performance, and very low power consumption are key attributes
of the ARM architecture.
The ARM architecture is a Reduced Instruction Set Computer (RISC) architecture, as it incorporates these
typical RISC architecture features:
• a large uniform register file
• a load/store architecture, where data-processing operations only operate on register contents, not
directly on memory contents
• simple addressing modes, with all load/store addresses being determined from register contents and
instruction fields only.
In addition, the ARM architecture provides:
• instructions that combine a shift with an arithmetic or logical operation
• auto-increment and auto-decrement addressing modes to optimize program loops
• Load and Store Multiple instructions to maximize data throughput
• conditional execution of almost all instructions to maximize execution throughput.
These enhancements to a basic RISC architecture enable ARM processors to achieve a good balance of high
performance, small code size, low power consumption, and small silicon area.
Except where the architecture specifies differently, the programmer-visible behavior of an implementation
must be the same as a simple sequential execution of the program. This programmer-visible behavior does
not include the execution time of the program.

A1.2 The ARM and Thumb instruction sets
The ARM instruction set is a set of 32-bit instructions providing comprehensive data-processing and control
functions.
The Thumb instruction set was developed as a 16-bit instruction set with a subset of the functionality of the
ARM instruction set. It provides significantly improved code density, at a cost of some reduction in
performance. A processor executing Thumb instructions can change to executing ARM instructions for
performance critical segments, in particular for handling interrupts.
In ARMv6T2, Thumb-2 technology is introduced. This technology makes it possible to extend the original
Thumb instruction set with many 32-bit instructions. The range of 32-bit Thumb instructions included in
ARMv6T2 permits Thumb code to achieve performance similar to ARM code, with code density better than
that of earlier Thumb code.
From ARMv6T2, the ARM and Thumb instruction sets provide almost identical functionality. For more
information, see Chapter A4 The Instruction Sets.

A1.3 Architecture versions, profiles, and variants
The ARM and Thumb instruction set architectures have evolved significantly since they were first
developed. They will continue to be developed in the future. Seven major versions of the instruction set have
been defined to date, denoted by the version numbers 1 to 7. Of these, the first three versions are now
obsolete.
ARMv7 provides three profiles:
ARMv7-A Application profile, described in this manual. Implements a traditional ARM architecture
with multiple modes and supporting a Virtual Memory System Architecture (VMSA) based
on an MMU. Supports the ARM and Thumb instruction sets.
ARMv7-R Real-time profile, described in this manual. Implements a traditional ARM architecture with
multiple modes and supporting a Protected Memory System Architecture (PMSA) based on
an MPU. Supports the ARM and Thumb instruction sets.
ARMv7-M Microcontroller profile, described in the ARMv7-M Architecture Reference Manual.
Implements a programmers' model designed for fast interrupt processing, with hardware
stacking of registers and support for writing interrupt handlers in high-level languages.
Implements a variant of the ARMv7 PMSA and supports a variant of the Thumb instruction
set.
Versions can be qualified with variant letters to specify additional instructions and other functionality that
are included as an architecture extension. Extensions are typically included in the base architecture of the
next version number. Provision is also made to exclude variants by prefixing the variant letter with x.
Some extensions are described separately instead of using a variant letter. For details of these extensions see
Architecture extensions on page A1-6.
The valid variants of ARMv4, ARMv5, and ARMv6 are as follows:
ARMv4 The earliest architecture variant covered by this manual. It includes only the ARM
instruction set.
ARMv4T Adds the Thumb instruction set.
ARMv5T Improves interworking of ARM and Thumb instructions. Adds count leading zeros (CLZ)
and software breakpoint (BKPT) instructions.
ARMv5TE Enhances arithmetic support for digital signal processing (DSP) algorithms. Adds preload
data (PLD), dual word load (LDRD), store (STRD), and 64-bit coprocessor register transfers
(MCRR, MRRC).
ARMv5TEJ Adds the BXJ instruction and other support for the Jazelle® architecture extension.
ARMv6 Adds many new instructions to the ARM instruction set. Formalizes and revises the memory
model and the Debug architecture.
ARMv6K Adds instructions to support multi-processing to the ARM instruction set, and some extra
memory model features.

ARMv6T2 Introduces Thumb-2 technology, giving a major development of the Thumb instruction set
to provide a similar level of functionality to the ARM instruction set.
Note
ARMv6KZ or ARMv6Z are sometimes used to describe the ARMv6K architecture with the optional
Security Extensions.
For detailed information about versions of the ARM architecture, see Appendix G ARMv6 Differences and
Appendix H ARMv4 and ARMv5 Differences.
The following architecture variants are now obsolete:
ARMv1, ARMv2, ARMv2a, ARMv3, ARMv3G, ARMv3M, ARMv4xM, ARMv4TxM, ARMv5,
ARMv5xM, ARMv5TxM, and ARMv5TExP.
Contact ARM if you require details of obsolete variants.
Instruction descriptions in this manual specify the architecture versions that support them.

A1.4 Architecture extensions
This manual describes the following extensions to the ARM and Thumb instruction set architectures:
ThumbEE Is a variant of the Thumb instruction set that is designed as a target for dynamically
generated code. It is:
• a required extension to the ARMv7-A profile
• an optional extension to the ARMv7-R profile.
VFP Is a floating-point coprocessor extension to the instruction set architectures. There
have been three main versions of VFP to date:
• VFPv1 is obsolete. Details are available on request from ARM.
• VFPv2 is an optional extension to:
— the ARM instruction set in the ARMv5TE, ARMv5TEJ, ARMv6, and
ARMv6K architectures
— the ARM and Thumb instruction sets in the ARMv6T2 architecture.
• VFPv3 is an optional extension to the ARM, Thumb and ThumbEE
instruction sets in the ARMv7-A and ARMv7-R profiles.
VFPv3 can be implemented with either thirty-two or sixteen doubleword
registers, as described in Advanced SIMD and VFP extension registers on
page A2-21. Where necessary, the terms VFPv3-D32 and VFPv3-D16 are
used to distinguish between these two implementation options. Where the
term VFPv3 is used it covers both options.
VFPv3 can be extended by the half-precision extensions that provide
conversion functions in both directions between half-precision floating-point
and single-precision floating-point.
Advanced SIMD Is an instruction set extension that provides Single Instruction Multiple Data
(SIMD) functionality. It is an optional extension to the ARMv7-A and ARMv7-R
profiles. When VFPv3 and Advanced SIMD are both implemented, they use a
shared register bank and have some shared instructions.
Advanced SIMD can be extended by the half-precision extensions that provide
conversion functions in both directions between half-precision floating-point and
single-precision floating-point.
Security Extensions Are a set of security features that facilitate the development of secure applications.
They are an optional extension to the ARMv6K architecture and the ARMv7-A
profile.
Jazelle Is the Java bytecode execution extension that extended ARMv5TE to ARMv5TEJ.
From ARMv6 Jazelle is a required part of the architecture, but is still often
described as the Jazelle extension.
Multiprocessing Extensions
Are a set of features that enhance multiprocessing functionality. They are an
optional extension to the ARMv7-A and ARMv7-R profiles.

A1.5 The ARM memory model
The ARM architecture uses a single, flat address space of 2^32 8-bit bytes. The address space is also regarded
as 2^30 32-bit words or 2^31 16-bit halfwords.
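As a quick check of these figures, the short Python sketch below derives the word and halfword counts from the byte count. The constant and function names are illustrative only and do not come from the manual.

```python
# Illustrative arithmetic only; the names below are not architectural terms.
ADDRESS_BITS = 32
BYTES = 2 ** ADDRESS_BITS          # 2^32 individually addressable bytes

HALFWORDS = BYTES // 2             # 2 bytes per halfword, giving 2^31
WORDS = BYTES // 4                 # 4 bytes per word, giving 2^30

def word_index(address):
    """Return the index of the 32-bit word that contains a byte address."""
    return address >> 2

def halfword_index(address):
    """Return the index of the 16-bit halfword that contains a byte address."""
    return address >> 1
```

The highest byte address, 0xFFFFFFFF, therefore falls in the last word (index 2^30 - 1) and the last halfword (index 2^31 - 1).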
The architecture provides facilities for:
• faulting unaligned memory accesses
• restricting access by applications to specified areas of memory
• translating virtual addresses provided by executing instructions into physical addresses
• altering the interpretation of word and halfword data between big-endian and little-endian
• optionally preventing out-of-order access to memory
• controlling caches
• synchronizing access to shared memory by multiple processors.
For more information, see:
• Chapter A3 Application Level Memory Model
• Chapter B2 Common Memory System Architecture Features
• Chapter B3 Virtual Memory System Architecture (VMSA)
• Chapter B4 Protected Memory System Architecture (PMSA).
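To illustrate the endianness facility listed above, the following Python sketch interprets the same four bytes of memory as a word and as two halfwords under each byte order. It uses only the standard struct module, and the byte values are arbitrary sample data rather than anything taken from the manual.

```python
import struct

# Four consecutive bytes at increasing addresses (arbitrary sample data).
memory = bytes([0x11, 0x22, 0x33, 0x44])

# "<" selects little-endian and ">" selects big-endian; "I" is a 32-bit
# unsigned word and "H" is a 16-bit unsigned halfword.
word_le = struct.unpack("<I", memory)[0]
word_be = struct.unpack(">I", memory)[0]
halfwords_le = struct.unpack("<2H", memory)
halfwords_be = struct.unpack(">2H", memory)
```

Byte accesses are unaffected by the byte order, which is why the facility is described in terms of word and halfword data only.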

A1.6 Debug
ARMv7 processors implement two types of debug support:
Invasive debug Debug permitting modification of the state of the processor. This is intended
primarily for run-control debugging.
Non-invasive debug Debug permitting data and program flow observation, without modifying the state
of the processor or interrupting the flow of execution.
This provides for:
• instruction and data tracing
• program counter sampling
• performance monitors.
For more information, see Chapter C1 Introduction to the ARM Debug Architecture.

Chapter A2
Application Level Programmers’ Model
This chapter gives an application level view of the ARM programmers’ model. It contains the following
sections:
• About the Application level programmers’ model on page A2-2
• ARM core data types and arithmetic on page A2-3
• ARM core registers on page A2-11
• The Application Program Status Register (APSR) on page A2-14
• Execution state registers on page A2-15
• Advanced SIMD and VFP extensions on page A2-20
• Floating-point data types and arithmetic on page A2-32
• Polynomial arithmetic over {0,1} on page A2-67
• Coprocessor support on page A2-68
• Execution environment support on page A2-69
• Exceptions, debug events and checks on page A2-81.

A2.1 About the Application level programmers’ model
This chapter contains the programmers’ model information required for application development.
The information in this chapter is distinct from the system information required to service and support
application execution under an operating system. However, some knowledge of that system information is
needed to put the Application level programmers' model into context.
System level support requires access to all features and facilities of the architecture, a mode of operation
referred to as privileged operation. System code determines whether an application runs in a privileged or
unprivileged manner. When an operating system supports both privileged and unprivileged operation, an
application usually runs unprivileged. This:
• permits the operating system to allocate system resources to it in a unique or shared manner
• provides a degree of protection from other processes and tasks, and so helps protect the operating
system from malfunctioning applications.
This chapter indicates where some system level understanding is helpful, and where appropriate it:
• gives an overview of the system level information
• gives references to the system level descriptions in Chapter B1 The System Level Programmers’
Model and elsewhere.
The Security Extensions extend the architecture to provide hardware security features that support the
development of secure applications. For more information, see The Security Extensions on page B1-25.

A2.2 ARM core data types and arithmetic
All ARMv7-A and ARMv7-R processors support the following data types in memory:
Byte 8 bits
Halfword 16 bits
Word 32 bits
Doubleword 64 bits.
Processor registers are 32 bits in size. The instruction set contains instructions supporting the following data
types held in registers:
• 32-bit pointers
• unsigned or signed 32-bit integers
• unsigned 16-bit or 8-bit integers, held in zero-extended form
• signed 16-bit or 8-bit integers, held in sign-extended form
• two 16-bit integers packed into a register
• four 8-bit integers packed into a register
• unsigned or signed 64-bit integers held in two registers.
Load and store operations can transfer bytes, halfwords, or words to and from memory. Loads of bytes or
halfwords zero-extend or sign-extend the data as it is loaded, as specified in the appropriate load instruction.
The instruction sets include load and store operations that transfer two or more words to and from memory.
You can load and store doublewords using these instructions. The exclusive doubleword load/store
instructions LDREXD and STREXD specify single-copy atomic doubleword accesses to memory.
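The zero-extension and sign-extension performed by byte and halfword loads can be sketched in Python as follows. The helper names zero_extend and sign_extend are chosen for this illustration. They model the effect of loads such as LDRB (zero-extending) and LDRSB (sign-extending) and are not a reproduction of the manual's pseudocode.

```python
def zero_extend(value, from_bits, to_bits=32):
    """Widen an unsigned from_bits-wide value; the upper bits become zero."""
    assert 0 < from_bits <= to_bits
    return value & ((1 << from_bits) - 1)

def sign_extend(value, from_bits, to_bits=32):
    """Widen a from_bits-wide value by copying its top bit into the upper bits."""
    assert 0 < from_bits <= to_bits
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):        # top bit set: value is negative
        # Set every bit from from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value
```

For example, the byte 0x80 becomes 0x00000080 when zero-extended and 0xFFFFFF80 when sign-extended to 32 bits.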
When any of the data types is described as unsigned, the N-bit data value represents a non-negative integer
in the range 0 to 2^N - 1, using normal binary format.
When any of these types is described as signed, the N-bit data value represents an integer in the range -2^(N-1)
to +2^(N-1) - 1, using two's complement format.
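These ranges can be checked with a small Python sketch. The functions uint and sint below are loose analogues of the UInt() and SInt() pseudocode functions, written for illustration. They take the bit pattern as a Python integer together with an explicit width n, which is an assumption of this sketch.

```python
def uint(bits, n):
    """Interpret the low n bits as an unsigned integer in 0 .. 2^n - 1."""
    return bits & ((1 << n) - 1)

def sint(bits, n):
    """Interpret the low n bits as a two's complement integer."""
    value = bits & ((1 << n) - 1)
    # If the top bit is set, the pattern represents value - 2^n.
    return value - (1 << n) if value >> (n - 1) else value

def unsigned_range(n):
    """Smallest and largest values of an unsigned n-bit integer."""
    return (0, (1 << n) - 1)

def signed_range(n):
    """Smallest and largest values of a signed n-bit integer."""
    return (-(1 << (n - 1)), (1 << (n - 1)) - 1)
```

The same 8-bit pattern 0xFF is 255 when read as unsigned and -1 when read as signed.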
The instructions that operate on packed halfwords or bytes include some multiply instructions that use just
one of two halfwords, and Single Instruction Multiple Data (SIMD) instructions that operate on all of the
halfwords or bytes in parallel.
Direct instruction support for 64-bit integers is limited, and most 64-bit operations require sequences of two
or more instructions to synthesize them.
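As an illustration of synthesizing a 64-bit operation, the Python sketch below adds two 64-bit values held as pairs of 32-bit words, in the manner of an ADDS instruction followed by ADC. The function name add64 and the (low, high) tuple layout are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def add64(a, b):
    """Add two 64-bit values given as (low_word, high_word) pairs."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    # First step: add the low words and record the carry out (like ADDS).
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    # Second step: add the high words plus that carry (like ADC).
    hi_sum = a_hi + b_hi + carry
    # Any carry out of the high word is discarded, so the result wraps at 2^64.
    return (lo_sum & MASK32, hi_sum & MASK32)
```

The carry passed from the first step to the second is what makes the two 32-bit additions behave as a single 64-bit addition.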

A2.2.1 Integer arithmetic
The instruction set provides a wide variety of operations on the values in registers, including bitwise logical
operations, shifts, additions, subtractions, multiplications, and many others. These operations are defined
using the pseudocode described in Appendix I Pseudocode Definition, usually in one of three ways:
• By direct use of the pseudocode operators and built-in functions defined in Operators and built-in
functions on page AppxI-11.
• By use of pseudocode helper functions defined in the main text. These can be located using the table
in Appendix J Pseudocode Index.
• By a sequence of the form:
1. Use of the SInt(), UInt(), and Int() built-in functions defined in Converting bitstrings to
integers on page AppxI-14 to convert the bitstring contents of the instruction operands to the
unbounded integers that they represent as two's complement or unsigned integers.
2. Use of mathematical operators, built-in functions and helper functions on those unbounded
integers to calculate other such integers.
3. Use of either the bitstring extraction operator defined in Bitstring extraction on page AppxI-12
or of the saturation helper functions described in Pseudocode details of saturation on
page A2-9 to convert an unbounded integer result into a bitstring result that can be written to
a register.
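The three-step pattern above can be sketched in Python, where integers are already unbounded. The example adds two signed 16-bit operands and then either truncates the result (bitstring extraction) or saturates it. The helper names are illustrative and are not the manual's saturation helper functions.

```python
def sint(bits, n):
    """Step 1: convert an n-bit pattern to the signed integer it represents."""
    value = bits & ((1 << n) - 1)
    return value - (1 << n) if value >> (n - 1) else value

def extract(value, n):
    """Step 3a: keep bits <n-1:0> of an unbounded integer (wraps on overflow)."""
    return value & ((1 << n) - 1)

def signed_saturate(value, n):
    """Step 3b: clamp to the signed n-bit range, then encode as n bits."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return extract(max(lo, min(hi, value)), n)

def add16(x_bits, y_bits, saturate=False):
    """Add two signed 16-bit operands and return a 16-bit result pattern."""
    total = sint(x_bits, 16) + sint(y_bits, 16)   # Step 2: unbounded arithmetic
    return signed_saturate(total, 16) if saturate else extract(total, 16)
```

Adding 0x7FFF and 0x0001 gives 0x8000 with extraction, because the true sum 32768 does not fit in a signed halfword, and 0x7FFF with saturation.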

Shift and rotate operations
The following types of shift and rotate operations are used in instructions:
Logical Shift Left
(LSL) moves each bit of a bitstring left by a specified number of bits. Zeros are shifted in at
the right end of the bitstring. Bits that are shifted off the left end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Logical Shift Right
(LSR) moves each bit of a bitstring right by a specified number of bits. Zeros are shifted in
at the left end of the bitstring. Bits that are shifted off the right end of the bitstring are
discarded, except that the last such bit can be produced as a carry output.
Arithmetic Shift Right
(ASR) moves each bit of a bitstring right by a specified number of bits. Copies of the leftmost
bit are shifted in at the left end of the bitstring. Bits that are shifted off the right end of the
bitstring are discarded, except that the last such bit can be produced as a carry output.
Rotate Right (ROR) moves each bit of a bitstring right by a specified number of bits. Each bit that is shifted
off the right end of the bitstring is re-introduced at the left end. The last bit shifted off the
right end of the bitstring can be produced as a carry output.
Rotate Right with Extend
(RRX) moves each bit of a bitstring right by one bit. The carry input is shifted in at the left
end of the bitstring. The bit shifted off the right end of the bitstring can be produced as a
carry output.
Pseudocode details of shift and rotate operations
These shift and rotate operations are supported in pseudocode by the following functions:
// LSL_C()
// =======
(bits(N), bit) LSL_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = x : Zeros(shift);
    result = extended_x<N-1:0>;
    carry_out = extended_x<N>;
    return (result, carry_out);

// LSL()
// =====
bits(N) LSL(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSL_C(x, shift);
    return result;

// LSR_C()
// =======
(bits(N), bit) LSR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = ZeroExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);

// LSR()
// =====
bits(N) LSR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = LSR_C(x, shift);
    return result;

// ASR_C()
// =======
(bits(N), bit) ASR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = SignExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);

// ASR()
// =====
bits(N) ASR(bits(N) x, integer shift)
    assert shift >= 0;
    if shift == 0 then
        result = x;
    else
        (result, -) = ASR_C(x, shift);
    return result;

// ROR_C()
// =======
(bits(N), bit) ROR_C(bits(N) x, integer shift)
    assert shift != 0;
    m = shift MOD N;
    result = LSR(x,m) OR LSL(x,N-m);
    carry_out = result<N-1>;
    return (result, carry_out);

// ROR()
// =====
bits(N) ROR(bits(N) x, integer shift)
    if shift == 0 then
        result = x;
    else
        (result, -) = ROR_C(x, shift);
    return result;

// RRX_C()
// =======
(bits(N), bit) RRX_C(bits(N) x, bit carry_in)
    result = carry_in : x<N-1:1>;
    carry_out = x<0>;
    return (result, carry_out);

// RRX()
// =====
bits(N) RRX(bits(N) x, bit carry_in)
    (result, -) = RRX_C(x, carry_in);
    return result;
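The carry-producing shift and rotate functions can be tried out with a short Python sketch. It is an illustration only, not part of the architecture definition: it assumes an N-bit bitstring is modelled as a non-negative Python integer together with an explicit width n, and the function names lsl_c, lsr_c, asr_c, ror_c and rrx_c are chosen here to mirror the pseudocode names.

```python
# Sketch only: an N-bit bitstring is modelled as a non-negative int plus a
# width n. The function names are illustrative, not defined by the manual.

def lsl_c(x, n, shift):
    """Logical shift left; returns (result, carry_out)."""
    assert shift > 0
    extended = x << shift                       # x : Zeros(shift)
    return extended & ((1 << n) - 1), (extended >> n) & 1


def lsr_c(x, n, shift):
    """Logical shift right; returns (result, carry_out)."""
    assert shift > 0
    return x >> shift, (x >> (shift - 1)) & 1


def asr_c(x, n, shift):
    """Arithmetic shift right; returns (result, carry_out)."""
    assert shift > 0
    # Reinterpret x as a signed value so that Python's >> copies the sign bit.
    signed = x - (1 << n) if x >> (n - 1) else x
    return (signed >> shift) & ((1 << n) - 1), (signed >> (shift - 1)) & 1


def ror_c(x, n, shift):
    """Rotate right; returns (result, carry_out)."""
    assert shift != 0
    m = shift % n
    result = ((x >> m) | (x << (n - m))) & ((1 << n) - 1)
    return result, result >> (n - 1)


def rrx_c(x, n, carry_in):
    """Rotate right by one bit through the carry; returns (result, carry_out)."""
    return (carry_in << (n - 1)) | (x >> 1), x & 1
```

For example, under these assumptions lsl_c(0x80000001, 32, 1) returns (0x00000002, 1): the top bit is shifted out and becomes the carry output.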

Pseudocode details of addition and subtraction
In pseudocode, addition and subtraction can be performed on any combination of unbounded integers and
bitstrings, provided that if they are performed on two bitstrings, the bitstrings must be identical in length.
The result is another unbounded integer if both operands are unbounded integers, and a bitstring of the same
length as the bitstring operand(s) otherwise. For the precise definition of these operations, see Addition and
subtraction on page AppxI-15.
The main addition and subtraction instructions can produce status information about both unsigned carry
and signed overflow conditions. This status information can be used to synthesize multi-word additions and
subtractions. In pseudocode the AddWithCarry() function provides an addition with a carry input and carry
and overflow outputs:
// AddWithCarry()
// ==============
(bits(N), bit, bit) AddWithCarry(bits(N) x, bits(N) y, bit carry_in)
    unsigned_sum = UInt(x) + UInt(y) + UInt(carry_in);
    signed_sum = SInt(x) + SInt(y) + UInt(carry_in);
    result = unsigned_sum<N-1:0>; // == signed_sum<N-1:0>
    carry_out = if UInt(result) == unsigned_sum then '0' else '1';
    overflow = if SInt(result) == signed_sum then '0' else '1';
    return (result, carry_out, overflow);
An important property of the AddWithCarry() function is that if:
    (result, carry_out, overflow) = AddWithCarry(x, NOT(y), carry_in)
then:
• if carry_in == '1', then result == x-y with:
    - overflow == '1' if signed overflow occurred during the subtraction
    - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x >= y
• if carry_in == '0', then result == x-y-1 with:
    - overflow == '1' if signed overflow occurred during the subtraction
    - carry_out == '1' if unsigned borrow did not occur during the subtraction, that is, if x > y.
Together, these mean that the carry_in and carry_out bits in AddWithCarry() calls can act as NOT borrow
flags for subtractions as well as carry flags for additions.
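The NOT borrow behavior can be checked with a small Python sketch of AddWithCarry(). This is an illustration under stated assumptions: bitstrings are modelled as non-negative integers with an explicit width n (default 32), and the helper names add_with_carry, subtract and subtract_64 are hypothetical names introduced for this example.

```python
# Sketch only: models AddWithCarry() on n-bit values held in Python ints.
# The helper names are illustrative, not defined by the manual.

def add_with_carry(x, y, carry_in, n=32):
    """Return (result, carry_out, overflow) for x + y + carry_in on n bits."""
    mask = (1 << n) - 1

    def sint(v):
        # Interpret an n-bit value as a two's complement signed integer.
        return v - (1 << n) if v >> (n - 1) else v

    unsigned_sum = x + y + carry_in
    signed_sum = sint(x) + sint(y) + carry_in
    result = unsigned_sum & mask
    carry_out = 0 if result == unsigned_sum else 1
    overflow = 0 if sint(result) == signed_sum else 1
    return result, carry_out, overflow


def subtract(x, y, n=32):
    """x - y expressed as AddWithCarry(x, NOT(y), '1')."""
    return add_with_carry(x, ~y & ((1 << n) - 1), 1, n)


def subtract_64(x_hi, x_lo, y_hi, y_lo):
    """Two-word subtraction: the low word's carry_out (NOT borrow) feeds the high word."""
    mask = 0xFFFFFFFF
    lo, carry, _ = add_with_carry(x_lo, ~y_lo & mask, 1)
    hi, _, _ = add_with_carry(x_hi, ~y_hi & mask, carry)
    return hi, lo
```

With this model, subtract(5, 3) produces a carry output of 1 (no borrow, because 5 >= 3), while subtract(3, 5) produces a carry output of 0. The subtract_64() helper shows how passing the low word's carry output as the high word's carry input propagates a borrow across words.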

Pseudocode details of saturation
Some instructions perform saturating arithmetic, that is, if the result of the arithmetic overflows the
destination signed or unsigned N-bit integer range, the result produced is the largest or smallest value in that
range, rather than wrapping around modulo 2^N. This is supported in pseudocode by the SignedSatQ() and
UnsignedSatQ() functions when a boolean result is wanted saying whether saturation occurred, and by the
SignedSat() and UnsignedSat() functions when only the saturated result is wanted:
// SignedSatQ()
// ============
(bits(N), boolean) SignedSatQ(integer i, integer N)
    if i > 2^(N-1) - 1 then
        result = 2^(N-1) - 1; saturated = TRUE;
    elsif i < -(2^(N-1)) then
        result = -(2^(N-1)); saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);

// UnsignedSatQ()
// ==============
(bits(N), boolean) UnsignedSatQ(integer i, integer N)
    if i > 2^N - 1 then
        result = 2^N - 1; saturated = TRUE;
    elsif i < 0 then
        result = 0; saturated = TRUE;
    else
        result = i; saturated = FALSE;
    return (result<N-1:0>, saturated);

// SignedSat()
// ===========
bits(N) SignedSat(integer i, integer N)
    (result, -) = SignedSatQ(i, N);
    return result;

// UnsignedSat()
// =============
bits(N) UnsignedSat(integer i, integer N)
    (result, -) = UnsignedSatQ(i, N);
    return result;
SatQ(i, N, unsigned) returns either UnsignedSatQ(i, N) or SignedSatQ(i, N) depending on the value of its
third argument, and Sat(i, N, unsigned) returns either UnsignedSat(i, N) or SignedSat(i, N) depending on
the value of its third argument:

// SatQ()
// ======
(bits(N), boolean) SatQ(integer i, integer N, boolean unsigned)
    (result, sat) = if unsigned then UnsignedSatQ(i, N) else SignedSatQ(i, N);
    return (result, sat);

// Sat()
// =====
bits(N) Sat(integer i, integer N, boolean unsigned)
    result = if unsigned then UnsignedSat(i, N) else SignedSat(i, N);
    return result;
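The saturation helpers can be sketched in Python, where integers are already unbounded. This is an illustrative model, not the architectural definition: the returned bitstring is represented as a non-negative integer holding the low N bits, and the function names signed_sat_q, unsigned_sat_q and sat_q are chosen here to mirror the pseudocode.

```python
# Sketch only: Python ints stand in for the pseudocode's unbounded integers.
# The function names are illustrative, not defined by the manual.

def signed_sat_q(i, n):
    """Clamp i to the signed n-bit range; return (n-bit pattern, saturated)."""
    hi = (1 << (n - 1)) - 1
    lo = -(1 << (n - 1))
    clamped = max(lo, min(hi, i))
    return clamped & ((1 << n) - 1), clamped != i


def unsigned_sat_q(i, n):
    """Clamp i to the unsigned n-bit range; return (n-bit pattern, saturated)."""
    hi = (1 << n) - 1
    clamped = max(0, min(hi, i))
    return clamped, clamped != i


def sat_q(i, n, unsigned):
    """Select the unsigned or signed variant, as SatQ() does."""
    return unsigned_sat_q(i, n) if unsigned else signed_sat_q(i, n)
```

For example, signed_sat_q(200, 8) returns (127, True), whereas a wrapping 8-bit result would have been 0xC8. The boolean corresponds to the saturation indication that some instructions record in the Q flag.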

A2.3 ARM core registers
In the application level view, an ARM processor has:
• thirteen general-purpose 32-bit registers, R0 to R12
• three 32-bit registers, R13 to R15, that sometimes or always have a special use.
Registers R13 to R15 are usually referred to by names that indicate their special uses:
SP, the Stack Pointer
Register R13 is used as a pointer to the active stack.
In Thumb code, most instructions cannot access SP. The only instructions that can access
SP are those designed to use SP as a stack pointer.
The use of SP for any purpose other than as a stack pointer is deprecated.
Note
Using SP for any purpose other than as a stack pointer is likely to break the requirements of
operating systems, debuggers, and other software systems, causing them to malfunction.
LR, the Link Register
Register R14 is used to store the return address from a subroutine. At other times, LR can
be used for other purposes.
When a BL or BLX instruction performs a subroutine call, LR is set to the subroutine return
address. To perform a subroutine return, copy LR back to the program counter. This is
typically done in one of two ways, after entering the subroutine with a BL or BLX instruction:
• Return with a BX LR instruction.
• On subroutine entry, store LR to the stack with an instruction of the form:
      PUSH {<registers>,LR}
  and use a matching instruction to return:
      POP {<registers>,PC}
ThumbEE checks and handler calls use LR in a similar way. For details see Chapter A9
ThumbEE.
PC, the Program Counter
Register R15 is the program counter:
• When executing an ARM instruction, PC reads as the address of the current
instruction plus 8.
• When executing a Thumb instruction, PC reads as the address of the current
instruction plus 4.
• Writing an address to PC causes a branch to that address.
In Thumb code, most instructions cannot access PC.
See ARM core registers on page B1-9 for the system level view of SP, LR, and PC.

Note
The names SP, LR and PC are preferred to R13, R14 and R15. However, sometimes it is simpler to use the
R13-R15 names when referring to a group of registers. For example, it is simpler to refer to Registers R8 to
R15, rather than to Registers R8 to R12, the SP, LR and PC. However these two descriptions of the group of
registers have exactly the same meaning.
A2.3.1 Pseudocode details of operations on ARM core registers
In pseudocode, the R[] function is used to:
• Read or write R0-R12, SP, and LR, using n == 0-12, 13, and 14 respectively.
• Read the PC, using n == 15.
This function has prototypes:
bits(32) R[integer n]
    assert n >= 0 && n <= 15;
R[integer n] = bits(32) value
    assert n >= 0 && n <= 14;
The full operation of this function is explained in Pseudocode details of ARM core register operations on
page B1-12.
Descriptions of ARM store instructions that store the PC value use the PCStoreValue() pseudocode function
to specify the PC value stored by the instruction:
// PCStoreValue()
// ==============
bits(32) PCStoreValue()
    // This function returns the PC value. On architecture versions before ARMv7, it
    // is permitted to instead return PC+4, provided it does so consistently. It is
    // used only to describe ARM instructions, so it returns the address of the current
    // instruction plus 8 (normally) or 12 (when the alternative is permitted).
    return PC;
Writing an address to the PC causes either a simple branch to that address or an interworking branch that
also selects the instruction set to execute after the branch. A simple branch is performed by the
BranchWritePC() function:
// BranchWritePC()
// ===============
BranchWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ARM then
        if ArchVersion() < 6 && address<1:0> != '00' then UNPREDICTABLE;
        BranchTo(address<31:2>:'00');
    else
        BranchTo(address<31:1>:'0');
An interworking branch is performed by the BXWritePC() function:

// BXWritePC()
// ===========
BXWritePC(bits(32) address)
    if CurrentInstrSet() == InstrSet_ThumbEE then
        if address<0> == '1' then
            BranchTo(address<31:1>:'0'); // Remaining in ThumbEE state
        else
            UNPREDICTABLE;
    else
        if address<0> == '1' then
            SelectInstrSet(InstrSet_Thumb);
            BranchTo(address<31:1>:'0');
        elsif address<1> == '0' then
            SelectInstrSet(InstrSet_ARM);
            BranchTo(address);
        else // address<1:0> == '10'
            UNPREDICTABLE;
The LoadWritePC() and ALUWritePC() functions are used for two cases where the behavior was systematically
modified between architecture versions:
// LoadWritePC()
// =============
LoadWritePC(bits(32) address)
    if ArchVersion() >= 5 then
        BXWritePC(address);
    else
        BranchWritePC(address);

// ALUWritePC()
// ============
ALUWritePC(bits(32) address)
    if ArchVersion() >= 7 && CurrentInstrSet() == InstrSet_ARM then
        BXWritePC(address);
    else
        BranchWritePC(address);
Note
The behavior of the PC writes performed by the ALUWritePC() function is different in Debug state, where
there are more UNPREDICTABLE cases. The pseudocode in this section only handles the non-debug cases. For
more information, see Data-processing instructions with the PC as the target in Debug state on page C5-12.
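The way the low address bits select the instruction set on a PC write can be illustrated with a Python sketch of the non-debug cases. It is a simplified model under stated assumptions: instruction sets are represented by the strings "ARM", "Thumb" and "ThumbEE", each function returns the resulting (instruction set, branch target) pair instead of updating processor state, and UNPREDICTABLE is represented by a hypothetical Unpredictable exception.

```python
# Sketch only: returns (instruction_set, target) instead of changing state.
# Unpredictable and the string-valued instruction sets are illustrative.

class Unpredictable(Exception):
    """Stands in for the pseudocode's UNPREDICTABLE."""


def branch_write_pc(address, iset, arch_version=7):
    """Simple branch: stays in the current instruction set."""
    if iset == "ARM":
        if arch_version < 6 and address & 0b11:
            raise Unpredictable("misaligned ARM branch target before ARMv6")
        return iset, address & 0xFFFFFFFC      # address<31:2>:'00'
    return iset, address & 0xFFFFFFFE          # address<31:1>:'0'


def bx_write_pc(address, iset):
    """Interworking branch: bit [0] of the address selects the instruction set."""
    if iset == "ThumbEE":
        if address & 1:
            return "ThumbEE", address & 0xFFFFFFFE
        raise Unpredictable("ThumbEE cannot interwork to ARM state")
    if address & 1:
        return "Thumb", address & 0xFFFFFFFE
    if address & 0b10 == 0:
        return "ARM", address
    raise Unpredictable("address<1:0> == '10'")


def alu_write_pc(address, iset, arch_version=7):
    """Data-processing write to the PC: interworks only in ARM state on ARMv7."""
    if arch_version >= 7 and iset == "ARM":
        return bx_write_pc(address, iset)
    return branch_write_pc(address, iset, arch_version)
```

In this model, bx_write_pc(0x8001, "ARM") yields ("Thumb", 0x8000), while alu_write_pc(0x8001, "Thumb") stays in Thumb state because ALUWritePC() performs a simple branch outside ARM state.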

A2.4 The Application Program Status Register (APSR)
Program status is reported in the 32-bit Application Program Status Register (APSR). The format of the
APSR is:
In the APSR, the bits are in the following categories:
• Reserved bits are allocated to system features, or are available for future expansion. Unprivileged
execution ignores writes to privileged fields. However, application level software that writes to the
APSR must treat reserved bits as Do-Not-Modify (DNM) bits. For more information about the
reserved bits, see Format of the CPSR and SPSRs on page B1-16.
• Flags that can be set by many instructions:
N, bit [31] Negative condition code flag. Set to bit [31] of the result of the instruction. If the result
is regarded as a two's complement signed integer, then N == 1 if the result is negative and
N == 0 if it is positive or zero.
Z, bit [30] Zero condition code flag. Set to 1 if the result of the instruction is zero, and to 0 otherwise.
A result of zero often indicates an equal result from a comparison.
C, bit [29] Carry condition code flag. Set to 1 if the instruction results in a carry condition, for
example an unsigned overflow on an addition.
V, bit [28] Overflow condition code flag. Set to 1 if the instruction results in an overflow condition,
for example a signed overflow on an addition.
Q, bit [27] Set to 1 to indicate overflow or saturation occurred in some instructions, normally related
to Digital Signal Processing (DSP). For more information, see Pseudocode details of
saturation on page A2-9.
GE[3:0], bits [19:16]
Greater than or Equal flags. SIMD instructions update these flags to indicate the results
from individual bytes or halfwords of the operation. These flags can control a later SEL
instruction. For more information, see SEL on page A8-312.
• Bits [26:24] are RAZ/SBZP. Therefore, software can use MSR instructions that write the top byte of
the APSR without using a read, modify, write sequence. If it does this, it must write zeros to
bits [26:24].
Instructions can test the N, Z, C, and V condition code flags to determine whether the instruction is to be
executed. In this way, execution of the instruction can be made conditional on the result of a previous
operation. For more information about conditional execution see Conditional execution on page A4-3 and
Conditional execution on page A8-8.
In ARMv7-A and ARMv7-R, the APSR is the same register as the CPSR, but the APSR must be used only
to access the N, Z, C, V, Q, and GE[3:0] bits. For more information, see Program Status Registers (PSRs)
on page B1-14.
APSR bit assignments:
Bits    31   30   29   28   27   26-24       23-20      19-16     15-0
Field   N    Z    C    V    Q    RAZ/SBZP    Reserved   GE[3:0]   Reserved
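The bit positions given for the APSR fields can be illustrated with a short Python sketch that extracts them from a 32-bit value. The function name decode_apsr and the dictionary layout are assumptions made for this example only.

```python
# Sketch only: extracts the application-level fields from a 32-bit APSR value.
# decode_apsr is an illustrative name, not an architectural function.

def decode_apsr(apsr):
    """Return the N, Z, C, V, Q flags and the GE[3:0] field of an APSR value."""
    return {
        "N": (apsr >> 31) & 1,    # bit [31]
        "Z": (apsr >> 30) & 1,    # bit [30]
        "C": (apsr >> 29) & 1,    # bit [29]
        "V": (apsr >> 28) & 1,    # bit [28]
        "Q": (apsr >> 27) & 1,    # bit [27]
        "GE": (apsr >> 16) & 0xF, # bits [19:16]
    }
```

For example, decoding 0x600F0000 gives Z and C set, N, V and Q clear, and GE[3:0] equal to 0b1111.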

A2.5 Execution state registers
The execution state registers modify the execution of instructions. They control:
• Whether instructions are interpreted as Thumb instructions, ARM instructions, ThumbEE
instructions, or Java bytecodes. For more information, see ISETSTATE.
• In Thumb state and ThumbEE state only, what conditions apply to the next four instructions. For
more information, see ITSTATE on page A2-17.
• Whether data is interpreted as big-endian or little-endian. For more information, see ENDIANSTATE
on page A2-19.
In ARMv7-A and ARMv7-R, the execution state registers are part of the Current Program Status Register.
For more information, see Program Status Registers (PSRs) on page B1-14.
There is no direct access to the execution state registers from application level instructions, but they can be
changed by side effects of application level instructions.
A2.5.1 ISETSTATE
The J bit and the T bit determine the instruction set used by the processor. Table A2-1 shows the encoding
of these bits.
ARM state The processor executes the ARM instruction set described in Chapter A5 ARM
Instruction Set Encoding.
Thumb state The processor executes the Thumb instruction set as described in Chapter A6
Thumb Instruction Set Encoding.
Jazelle state The processor executes Java bytecodes as part of a Java Virtual Machine (JVM). For
more information, see Jazelle direct bytecode execution support on page A2-73.
ISETSTATE bit assignments: bit [1] is J, bit [0] is T.
Table A2-1 J and T bit encoding in ISETSTATE
J   T   Instruction set state
0   0   ARM
0   1   Thumb
1   0   Jazelle
1   1   ThumbEE

ThumbEE state The processor executes a variation of the Thumb instruction set specifically targeted
for use with dynamic compilation techniques associated with an execution
environment. This can be Java or other execution environments. This feature is
required in ARMv7-A, and optional in ARMv7-R. For more information, see
Thumb Execution Environment on page A2-69.
Pseudocode details of ISETSTATE operations
The following pseudocode functions return the current instruction set and select a new instruction set:
enumeration InstrSet {InstrSet_ARM, InstrSet_Thumb, InstrSet_Jazelle, InstrSet_ThumbEE};

// CurrentInstrSet()
// =================
InstrSet CurrentInstrSet()
    case ISETSTATE of
        when '00' result = InstrSet_ARM;
        when '01' result = InstrSet_Thumb;
        when '10' result = InstrSet_Jazelle;
        when '11' result = InstrSet_ThumbEE;
    return result;

// SelectInstrSet()
// ================
SelectInstrSet(InstrSet iset)
    case iset of
        when InstrSet_ARM
            if CurrentInstrSet() == InstrSet_ThumbEE then
                UNPREDICTABLE;
            else
                ISETSTATE = '00';
        when InstrSet_Thumb
            ISETSTATE = '01';
        when InstrSet_Jazelle
            ISETSTATE = '10';
        when InstrSet_ThumbEE
            ISETSTATE = '11';
    return;

A2.5.2 ITSTATE
This field holds the If-Then execution state bits for the Thumb IT instruction. See IT on page A8-104 for a
description of the IT instruction and the associated IT block.
ITSTATE divides into two subfields:
IT[7:5] Holds the base condition for the current IT block. The base condition is the top 3 bits of the
condition specified by the IT instruction.
This subfield is 0b000 when no IT block is active.
IT[4:0] Encodes:
• The size of the IT block. This is the number of instructions that are to be conditionally
executed. The size of the block is implied by the position of the least significant 1 in
this field, as shown in Table A2-2 on page A2-18.
• The value of the least significant bit of the condition code for each instruction in the
block.
Note
Changing the value of the least significant bit of a condition code from 0 to 1 has the
effect of inverting the condition code.
This subfield is 0b00000 when no IT block is active.
When an IT instruction is executed, these bits are set according to the condition in the instruction, and the
Then and Else (T and E) parameters in the instruction. For more information, see IT on page A8-104.
An instruction in an IT block is conditional, see Conditional instructions on page A4-4 and Conditional
execution on page A8-8. The condition used is the current value of IT[7:4]. When an instruction in an IT
block completes its execution normally, ITSTATE is advanced to the next line of Table A2-2 on page A2-18.
For details of what happens if such an instruction takes an exception see Exception entry on page B1-34.
Note
Instructions that can complete their normal execution by branching are only permitted in an IT block as its
last instruction, and so always result in ITSTATE advancing to normal execution.
Note
ITSTATE affects instruction execution only in Thumb and ThumbEE states. In ARM and Jazelle states,
ITSTATE must be '00000000', otherwise behavior is UNPREDICTABLE.
ITSTATE bit assignments: bits [7:0] hold IT[7:0].

Pseudocode details of ITSTATE operations
ITSTATE advances after normal execution of an IT block instruction. This is described by the ITAdvance()
pseudocode function:
// ITAdvance()
// ===========
ITAdvance()
    if ITSTATE<2:0> == '000' then
        ITSTATE.IT = '00000000';
    else
        ITSTATE.IT<4:0> = LSL(ITSTATE.IT<4:0>, 1);
The following functions test whether the current instruction is in an IT block, and whether it is the last
instruction of an IT block:
// InITBlock()
// ===========
boolean InITBlock()
    return (ITSTATE.IT<3:0> != '0000');

// LastInITBlock()
// ===============
boolean LastInITBlock()
    return (ITSTATE.IT<3:0> == '1000');
Table A2-2 Effect of IT execution state bits
IT bits (a)
[7:5]       [4]   [3]   [2]   [1]   [0]   Note
cond_base   P1    P2    P3    P4    1     Entry point for 4-instruction IT block
cond_base   P1    P2    P3    1     0     Entry point for 3-instruction IT block
cond_base   P1    P2    1     0     0     Entry point for 2-instruction IT block
cond_base   P1    1     0     0     0     Entry point for 1-instruction IT block
000         0     0     0     0     0     Normal execution, not in an IT block
a. Combinations of the IT bits not shown in this table are reserved.
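The progression through Table A2-2 can be traced with a Python sketch of ITAdvance(), InITBlock() and LastInITBlock(). This is an illustration only: ITSTATE is modelled as an 8-bit integer, and the helper conditions() is a hypothetical function added here to list the condition (IT[7:4]) seen by each instruction of a block.

```python
# Sketch only: ITSTATE is modelled as an 8-bit int. conditions() is an
# illustrative helper, not an architectural function.

def it_advance(itstate):
    """Return ITSTATE after one IT block instruction completes normally."""
    if (itstate & 0b111) == 0:
        return 0
    low5 = ((itstate & 0x1F) << 1) & 0x1F       # LSL(IT<4:0>, 1)
    return (itstate & 0xE0) | low5


def in_it_block(itstate):
    return (itstate & 0xF) != 0


def last_in_it_block(itstate):
    return (itstate & 0xF) == 0b1000


def conditions(itstate):
    """List the 4-bit condition applied to each remaining instruction."""
    conds = []
    while in_it_block(itstate):
        conds.append(itstate >> 4)              # current value of IT[7:4]
        itstate = it_advance(itstate)
    return conds
```

As a worked case, take the 3-instruction entry point with cond_base == '101', P1 == 0, P2 == 0 and P3 == 1, that is ITSTATE == 0b10100110. The three instructions see conditions 0b1010, 0b1010 and 0b1011 (GE, GE, LT), so the last instruction executes under the inverse of the base condition, and ITSTATE then returns to 0b00000000.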

A2.5.3 ENDIANSTATE
ARMv7-A and ARMv7-R support configuration between little-endian and big-endian interpretations of
data memory, as shown in Table A2-3. The endianness is controlled by ENDIANSTATE.
The ARM and Thumb instruction sets both include an instruction to manipulate ENDIANSTATE:
SETEND BE    Sets ENDIANSTATE to 1, for big-endian operation.
SETEND LE    Sets ENDIANSTATE to 0, for little-endian operation.
The SETEND instruction is unconditional. For more information, see SETEND on page A8-314.
Pseudocode details of ENDIANSTATE operations
The BigEndian() pseudocode function tests whether big-endian memory accesses are currently selected.
// BigEndian()
// ===========
boolean BigEndian()
    return (ENDIANSTATE == '1');
Table A2-3 APSR configuration of endianness
ENDIANSTATE   Endian mapping
0             Little-endian
1             Big-endian
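The effect of ENDIANSTATE on a data access can be illustrated with a Python sketch that reads a word from a byte sequence. It is a simplified model: memory is assumed to be a bytes object, only an aligned 32-bit read is shown, and load_word is a hypothetical helper name.

```python
# Sketch only: memory is a bytes object and load_word is an illustrative name.

def load_word(memory, address, big_endian):
    """Combine the four bytes at address according to the selected endianness."""
    data = bytes(memory[address:address + 4])
    return int.from_bytes(data, "big" if big_endian else "little")
```

With the bytes 0x11, 0x22, 0x33, 0x44 stored at increasing addresses, the little-endian interpretation is 0x44332211 and the big-endian interpretation is 0x11223344.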

A2.6 Advanced SIMD and VFP extensions
Advanced SIMD and VFP are two optional extensions to ARMv7.
Advanced SIMD performs packed Single Instruction Multiple Data (SIMD) operations, either integer or
single-precision floating-point. VFP performs single-precision or double-precision floating-point
operations.
Both extensions permit floating-point exceptions, such as overflow or division by zero, to be handled in an
untrapped fashion. When handled in this way, a floating-point exception causes a cumulative status register
bit to be set to 1 and a default result to be produced by the operation.
The ARMv7 VFP implementation is VFPv3. ARMv7 also permits a variant of VFPv3, VFPv3U, that
supports the trapping of floating-point exceptions, see VFPv3U on page A2-31. VFPv2 also supports the
trapping of floating-point exceptions.
For more information about floating-point exceptions see Floating-point exceptions on page A2-42.
Each extension can be implemented at a number of levels. Table A2-4 shows the permitted combinations of
implementations of the two extensions.
The optional half-precision extensions provide conversion functions in both directions between
half-precision floating-point and single-precision floating-point. These extensions can be implemented with
any Advanced SIMD and VFP implementation that supports single-precision floating-point. The
half-precision extensions apply to both VFP and Advanced SIMD if they are both implemented.
For system-level information about the Advanced SIMD and VFP extensions see:
• Advanced SIMD and VFP extension system registers on page B1-66
• Advanced SIMD and floating-point support on page B1-64.
Table A2-4 Permitted combinations of Advanced SIMD and VFP extensions
Advanced SIMD                                 VFP
Not implemented                               Not implemented
Integer only                                  Not implemented
Integer and single-precision floating-point   Single-precision floating-point only (a)
Integer and single-precision floating-point   Single-precision and double-precision floating-point
Not implemented                               Single-precision floating-point only (a)
Not implemented                               Single-precision and double-precision floating-point
a. Must be able to load and store double-precision data.

Note
Before ARMv7, the VFP extension was called the Vector Floating-point Architecture, and was used for
vector operations. For details of these deprecated operations see Appendix F VFP Vector Operation
Support. From ARMv7:
• ARM recommends that the Advanced SIMD extension is used for single-precision vector
floating-point operations
• an implementation that requires support for vector operations must implement the Advanced SIMD
extension.
A2.6.1 Advanced SIMD and VFP extension registers
Advanced SIMD and VFPv3 use the same register set. This is distinct from the ARM core register set. These
registers are generally referred to as the extension registers.
The extension register set consists of either thirty-two or sixteen doubleword registers, as follows:
• If VFPv2 is implemented, it consists of sixteen doubleword registers.
• If VFPv3 is implemented, it consists of either thirty-two or sixteen doubleword registers. Where
necessary the terms VFPv3-D32 and VFPv3-D16 are used to distinguish between these two
implementation options.
• If Advanced SIMD is implemented, it consists of thirty-two doubleword registers. If both Advanced
SIMD and VFPv3 are implemented, VFPv3 must be implemented in its VFPv3-D32 form.
The Advanced SIMD and VFP views of the extension register set are not identical. They are described in
the following sections.
Figure A2-1 on page A2-22 shows the views of the extension register set, and the way the word,
doubleword, and quadword registers overlap.
Advanced SIMD views of the extension register set
Advanced SIMD can view this register set as:
• Sixteen 128-bit quadword registers, Q0-Q15.
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in VFPv3.
These views can be used simultaneously. For example, a program might hold 64-bit vectors in D0 and D1
and a 128-bit vector in Q1.

VFP views of the extension register set
In VFPv3-D32, the extension register set consists of thirty-two doubleword registers, that VFP can view as:
• Thirty-two 64-bit doubleword registers, D0-D31. This view is also available in Advanced SIMD.
• Thirty-two 32-bit single word registers, S0-S31. Only half of the set is accessible in this view.
In VFPv3-D16 and VFPv2, the extension register set consists of sixteen doubleword registers, that VFP can
view as:
• Sixteen 64-bit doubleword registers, D0-D15.
• Thirty-two 32-bit single word registers, S0-S31.
In each case, the two views can be used simultaneously.
Advanced SIMD and VFP register mapping
Figure A2-1 Advanced SIMD and VFP register set
[Figure A2-1: overlapping views of the extension register set: S0-S31 (VFP only), D0-D15 (VFPv2 or
VFPv3-D16), D0-D31 (VFPv3-D32 or Advanced SIMD), and Q0-Q15 (Advanced SIMD only).]

The mapping between the registers is as follows:
• S<2n> maps to the least significant half of D<n>
• S<2n+1> maps to the most significant half of D<n>
• D<2n> maps to the least significant half of Q<n>
• D<2n+1> maps to the most significant half of Q<n>.
For example, you can access the least significant half of the elements of a vector in Q6 by referring to D12,
and the most significant half of the elements by referring to D13.
Pseudocode details of Advanced SIMD and VFP extension registers
The pseudocode function VFPSmallRegisterBank() returns FALSE if all of the 32 registers D0-D31 can be
accessed, and TRUE if only the 16 registers D0-D15 can be accessed:
boolean VFPSmallRegisterBank()
In more detail, VFPSmallRegisterBank():
• returns TRUE for a VFPv2 or VFPv3-D16 implementation
• for a VFPv3-D32 implementation:
— returns FALSE if CPACR.D32DIS == 0
— returns TRUE if CPACR.D32DIS == 1 and CPACR.ASEDIS == 1
— results in UNPREDICTABLE behavior if CPACR.D32DIS == 1 and CPACR.ASEDIS == 0.
For details of the CPACR register, see:
• c1, Coprocessor Access Control Register (CPACR) on page B3-104 for a VMSA implementation
• c1, Coprocessor Access Control Register (CPACR) on page B4-51 for a PMSA implementation.
The S0-S31, D0-D31, and Q0-Q15 views of the registers are provided by the following functions:
// The 64-bit extension register bank for Advanced SIMD and VFP.
array bits(64) _D[0..31];

// S[] - non-assignment form
// =========================
bits(32) S[integer n]
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        result = D[n DIV 2]<31:0>;
    else
        result = D[n DIV 2]<63:32>;
    return result;

// S[] - assignment form
// =====================
S[integer n] = bits(32) value
    assert n >= 0 && n <= 31;
    if (n MOD 2) == 0 then
        D[n DIV 2]<31:0> = value;
    else
        D[n DIV 2]<63:32> = value;
    return;

// D[] - non-assignment form
// =========================
bits(64) D[integer n]
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    return _D[n];

// D[] - assignment form
// =====================
D[integer n] = bits(64) value
    assert n >= 0 && n <= 31;
    if n >= 16 && VFPSmallRegisterBank() then UNDEFINED;
    _D[n] = value;
    return;

// Q[] - non-assignment form
// =========================
bits(128) Q[integer n]
    assert n >= 0 && n <= 15;
    return D[2*n+1]:D[2*n];

// Q[] - assignment form
// =====================
Q[integer n] = bits(128) value
    assert n >= 0 && n <= 15;
    D[2*n] = value<63:0>;
    D[2*n+1] = value<127:64>;
    return;
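The overlapping S, D and Q views can be modelled in a few lines of Python. This sketch makes several simplifying assumptions: the class name ExtensionRegisters, its method names, and the small_bank constructor flag (standing in for the result of VFPSmallRegisterBank()) are all hypothetical, and UNDEFINED is represented by a hypothetical Undefined exception.

```python
# Sketch only: a software model of the S/D/Q register views. All names are
# illustrative; small_bank stands in for VFPSmallRegisterBank().

M32 = (1 << 32) - 1
M64 = (1 << 64) - 1


class Undefined(Exception):
    """Stands in for the pseudocode's UNDEFINED."""


class ExtensionRegisters:
    def __init__(self, small_bank=False):
        self._d = [0] * 32            # the 64-bit register bank _D[0..31]
        self.small_bank = small_bank

    def read_d(self, n):
        assert 0 <= n <= 31
        if n >= 16 and self.small_bank:
            raise Undefined("D16-D31 are not accessible")
        return self._d[n]

    def write_d(self, n, value):
        assert 0 <= n <= 31
        if n >= 16 and self.small_bank:
            raise Undefined("D16-D31 are not accessible")
        self._d[n] = value & M64

    def read_s(self, n):
        assert 0 <= n <= 31
        d = self.read_d(n // 2)
        return (d >> 32) & M32 if n % 2 else d & M32

    def write_s(self, n, value):
        assert 0 <= n <= 31
        d = self.read_d(n // 2)
        if n % 2:
            d = (d & M32) | ((value & M32) << 32)    # replace bits <63:32>
        else:
            d = (d & (M64 ^ M32)) | (value & M32)    # replace bits <31:0>
        self.write_d(n // 2, d)

    def read_q(self, n):
        assert 0 <= n <= 15
        return (self.read_d(2 * n + 1) << 64) | self.read_d(2 * n)

    def write_q(self, n, value):
        assert 0 <= n <= 15
        self.write_d(2 * n, value & M64)
        self.write_d(2 * n + 1, (value >> 64) & M64)
```

In this model, writing a 128-bit value to Q6 makes its least significant half visible as D12 and its most significant half as D13, matching the register mapping described for the extension register set.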

A2.6.2 Data types supported by the Advanced SIMD extension
When the Advanced SIMD extension is implemented, it can operate on integer and floating-point data. It
defines a set of data types to represent the different data formats. Table A2-5 shows the available formats.
Each instruction description specifies the data types that the instruction supports.
The polynomial data type is described in Polynomial arithmetic over {0,1} on page A2-67.
The .F16 data type is the half-precision data type currently selected by the FPSCR.AHP bit, see Advanced
SIMD and VFP system registers on page A2-28. It is supported only when the half-precision extensions are
implemented.
The .F32 data type is the ARM standard single-precision floating-point data type, see Advanced SIMD and
VFP single-precision format on page A2-34.
The instruction definitions use a data type specifier to define the data types appropriate to the operation.
Figure A2-2 on page A2-26 shows the hierarchy of Advanced SIMD data types.
Table A2-5 Advanced SIMD data types
Data type specifier   Meaning
.<size>               Any element of <size> bits
.F<size>              Floating-point number of <size> bits
.I<size>              Signed or unsigned integer of <size> bits
.P<size>              Polynomial over {0,1} of degree less than <size>
.S<size>              Signed integer of <size> bits
.U<size>              Unsigned integer of <size> bits

Figure A2-2 Advanced SIMD data type hierarchy
For example, a multiply instruction must distinguish between integer and floating-point data types.
However, some multiply instructions use modulo arithmetic for integer instructions and therefore do not
need to distinguish between signed and unsigned inputs.
A multiply instruction that generates a double-width (long) result must specify the input data types as signed
or unsigned, because for this operation it does make a difference.
A2.6.3 Advanced SIMD vectors
When the Advanced SIMD extension is implemented, a register can hold one or more packed elements, all
of the same size and type. The combination of a register and a data type describes a vector of elements. The
vector is considered to be an array of elements of the data type specified in the instruction. The number of
elements in the vector is implied by the size of the data elements and the size of the register.
Vector indices are in the range 0 to (number of elements – 1). An index of 0 refers to the least significant
end of the vector. Figure A2-3 on page A2-27 shows examples of Advanced SIMD vectors:
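[Figure A2-2, Advanced SIMD data type hierarchy. Each row lists the specifiers for one element size,
from the most general to the most specific.]
Any type   Integer   Signed   Unsigned   Polynomial   Floating-point
.8         .I8       .S8      .U8        .P8          -
.16        .I16      .S16     .U16       .P16         .F16 (a)
.32        .I32      .S32     .U32       -            .F32
.64        .I64      .S64     .U64       -            -
a. Supported only if the half-precision extensions are implemented.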

Figure A2-3 Examples of Advanced SIMD vectors
Pseudocode details of Advanced SIMD vectors
The pseudocode function Elem[] is used to access the element of a specified index and size in a vector:
// Elem[] - non-assignment form
// ============================
bits(size) Elem[bits(N) vector, integer e, integer size]
    assert e >= 0 && (e+1)*size <= N;
    return vector<(e+1)*size-1:e*size>;
// Elem[] - assignment form
// ========================
Elem[bits(N) vector, integer e, integer size] = bits(size) value
    assert e >= 0 && (e+1)*size <= N;
    vector<(e+1)*size-1:e*size> = value;
    return;
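The two forms of Elem[] can be sketched in Python as follows. This is an illustration only, not part of the
architecture: it assumes a vector is held as a non-negative Python integer, and the names elem_get and
elem_set are hypothetical.

```python
def elem_get(vector: int, n: int, e: int, size: int) -> int:
    """Return element e, of `size` bits, from an n-bit vector held as an int."""
    assert e >= 0 and (e + 1) * size <= n
    return (vector >> (e * size)) & ((1 << size) - 1)


def elem_set(vector: int, n: int, e: int, size: int, value: int) -> int:
    """Return a copy of the n-bit vector with element e replaced by value."""
    assert e >= 0 and (e + 1) * size <= n
    assert 0 <= value < (1 << size)
    mask = ((1 << size) - 1) << (e * size)
    return (vector & ~mask) | (value << (e * size))


# A 64-bit doubleword register viewed as four 16-bit elements.
d = 0x4444_3333_2222_1111
lowest = elem_get(d, 64, 0, 16)    # index 0 is the least significant end
highest = elem_get(d, 64, 3, 16)
updated = elem_set(d, 64, 1, 16, 0xABCD)
```

The same register value gives different elements for a different element size, for example
elem_get(d, 64, 1, 32) selects the upper word.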
[Figure A2-3: example vector layouts. Element [0] occupies the least significant bits of the register.]
Qn, bits [127:0]   128-bit vector of single-precision (32-bit) floating-point numbers: four .F32 elements, [3] to [0]
Qn, bits [127:0]   128-bit vector of 16-bit signed integers: eight .S16 elements, [7] to [0]
Dn, bits [63:0]    64-bit vector of 32-bit signed integers: two .S32 elements, [1] to [0]
Dn, bits [63:0]    64-bit vector of 16-bit unsigned integers: four .U16 elements, [3] to [0]

A2.6.4 Advanced SIMD and VFP system registers
The Advanced SIMD and VFP extensions have a shared register space for system registers. Only one
register in this space is accessible at the application level, see Floating-point Status and Control Register
(FPSCR).
See Advanced SIMD and VFP extension system registers on page B1-66 for the system level description of
the registers.
Floating-point Status and Control Register (FPSCR)
The Floating-point Status and Control Register (FPSCR) is implemented in any system that implements one
or both of:
• the VFP extension
• the Advanced SIMD extension.
The FPSCR provides all necessary User level control of the floating-point system.
The FPSCR is a 32-bit read/write system register, accessible in unprivileged and privileged modes.
The format of the FPSCR is:
Bits [31:28] Condition code bits. These are updated on floating-point comparison operations. They are
not updated on SIMD operations, and do not affect SIMD instructions.
N, bit [31] Negative condition code flag.
Z, bit [30] Zero condition code flag.
C, bit [29] Carry condition code flag.
V, bit [28] Overflow condition code flag.
QC, bit [27] Cumulative saturation flag, Advanced SIMD only. This bit is set to 1 to indicate that an
Advanced SIMD integer operation has saturated since 0 was last written to this bit. For
details of saturation, see Pseudocode details of saturation on page A2-9.
The value of this bit is ignored by the VFP extension. If Advanced SIMD is not implemented
this bit is UNK/SBZP.
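FPSCR bit assignments:
Bit [31]      N
Bit [30]      Z
Bit [29]      C
Bit [28]      V
Bit [27]      QC
Bit [26]      AHP
Bit [25]      DN
Bit [24]      FZ
Bits [23:22]  RMode
Bits [21:20]  Stride
Bit [19]      UNK/SBZP
Bits [18:16]  Len
Bit [15]      IDE
Bits [14:13]  UNK/SBZP
Bit [12]      IXE
Bit [11]      UFE
Bit [10]      OFE
Bit [9]       DZE
Bit [8]       IOE
Bit [7]       IDC
Bits [6:5]    UNK/SBZP
Bit [4]       IXC
Bit [3]       UFC
Bit [2]       OFC
Bit [1]       DZC
Bit [0]       IOC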

AHP, bit[26] Alternative half-precision control bit:
0 IEEE half-precision format selected.
1 Alternative half-precision format selected.
For more information see Advanced SIMD and VFP half-precision formats on page A2-38.
If the half-precision extensions are not implemented this bit is UNK/SBZP.
Bits [19,14:13,6:5]
Reserved. UNK/SBZP.
DN, bit [25] Default NaN mode control bit:
0 NaN operands propagate through to the output of a floating-point operation.
1 Any operation involving one or more NaNs returns the Default NaN.
For more information, see NaN handling and the Default NaN on page A2-41.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Default NaN setting, regardless of the value of the DN bit.
FZ, bit [24] Flush-to-zero mode control bit:
0 Flush-to-zero mode disabled. Behavior of the floating-point system is fully
compliant with the IEEE 754 standard.
1 Flush-to-zero mode enabled.
For more information, see Flush-to-zero on page A2-39.
The value of this bit only controls VFP arithmetic. Advanced SIMD arithmetic always uses
the Flush-to-zero setting, regardless of the value of the FZ bit.
RMode, bits [23:22]
Rounding Mode control field. The encoding of this field is:
0b00 Round to Nearest (RN) mode
0b01 Round towards Plus Infinity (RP) mode
0b10 Round towards Minus Infinity (RM) mode
0b11 Round towards Zero (RZ) mode.
The specified rounding mode is used by almost all VFP floating-point instructions.
Advanced SIMD arithmetic always uses the Round to Nearest setting, regardless of the
value of the RMode bits.
Stride, bits [21:20] and Len, bits [18:16]
Use of nonzero values of these fields is deprecated in ARMv7. For details of their use in
previous versions of the ARM architecture see Appendix F VFP Vector Operation Support.
The values of these fields are ignored by the Advanced SIMD extension.
Bits [15,12:8] Floating-point exception trap enable bits. These bits are supported only in VFPv2 and
VFPv3U. They are reserved, RAZ/SBZP, on a system that implements VFPv3.

The possible values of each bit are:
0 Untrapped exception handling selected
1 Trapped exception handling selected.
The values of these bits control only VFP arithmetic. Advanced SIMD arithmetic always
uses untrapped exception handling, regardless of the values of these bits.
For more information, see Floating-point exceptions on page A2-42.
IDE, bit [15] Input Denormal exception trap enable.
IXE, bit [12] Inexact exception trap enable.
UFE, bit [11] Underflow exception trap enable.
OFE, bit [10] Overflow exception trap enable.
DZE, bit [9] Division by Zero exception trap enable.
IOE, bit [8] Invalid Operation exception trap enable.
Bits [7,4:0] Cumulative exception flags for floating-point exceptions. Each of these bits is set to 1 to
indicate that the corresponding exception has occurred since 0 was last written to it. How
VFP instructions update these bits depends on the value of the corresponding exception trap
enable bits:
Trap enable bit = 0
If the floating-point exception occurs then the cumulative exception flag is set
to 1.
Trap enable bit = 1
If the floating-point exception occurs the trap handling software can decide
whether to set the cumulative exception flag to 1.
Advanced SIMD instructions set each cumulative exception flag if the corresponding
exception occurs in one or more of the floating-point calculations performed by the
instruction, regardless of the setting of the trap enable bits.
For more information, see Floating-point exceptions on page A2-42.
IDC, bit [7] Input Denormal cumulative exception flag.
IXC, bit [4] Inexact cumulative exception flag.
UFC, bit [3] Underflow cumulative exception flag.
OFC, bit [2] Overflow cumulative exception flag.
DZC, bit [1] Division by Zero cumulative exception flag.
IOC, bit [0] Invalid Operation cumulative exception flag.
If the processor implements the integer-only Advanced SIMD extension and does not implement the VFP
extension, all of these bits except QC are UNK/SBZP.
Writes to the FPSCR can have side-effects on various aspects of processor operation. All of these
side-effects are synchronous to the FPSCR write. This means they are guaranteed not to be visible to earlier
instructions in the execution stream, and they are guaranteed to be visible to later instructions in the
execution stream.
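The field positions described above can be collected into a small decoder. The following Python sketch is
illustrative only: it operates on a 32-bit value that is assumed to have been read from the FPSCR by other
means, and the names FPSCR_FIELDS, decode_fpscr and clear_cumulative_flags are hypothetical.

```python
# Field name -> (least significant bit, width), taken from the FPSCR description.
FPSCR_FIELDS = {
    "N": (31, 1), "Z": (30, 1), "C": (29, 1), "V": (28, 1),
    "QC": (27, 1), "AHP": (26, 1), "DN": (25, 1), "FZ": (24, 1),
    "RMode": (22, 2), "Stride": (20, 2), "Len": (16, 3),
    "IDE": (15, 1), "IXE": (12, 1), "UFE": (11, 1), "OFE": (10, 1),
    "DZE": (9, 1), "IOE": (8, 1),
    "IDC": (7, 1), "IXC": (4, 1), "UFC": (3, 1), "OFC": (2, 1),
    "DZC": (1, 1), "IOC": (0, 1),
}

# RMode encodings 0b00 to 0b11, in order.
RMODE_NAMES = ["RN", "RP", "RM", "RZ"]

# Bits [7,4:0] hold the cumulative exception flags.
CUMULATIVE_FLAGS_MASK = 0x9F


def decode_fpscr(value: int) -> dict:
    """Split a 32-bit FPSCR value into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in FPSCR_FIELDS.items()
    }


def clear_cumulative_flags(value: int) -> int:
    """Return the value to write back so that all cumulative flags read as 0."""
    return value & ~CUMULATIVE_FLAGS_MASK & 0xFFFFFFFF


fields = decode_fpscr(0x01C00001)   # FZ set, RMode == 0b11, IOC set
rounding = RMODE_NAMES[fields["RMode"]]
```

A read, modify, write sequence on the register would apply a helper such as clear_cumulative_flags between
the read and the write.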

Accessing the FPSCR
You read or write the FPSCR using the VMRS and VMSR instructions. For more information, see VMRS on
page A8-658 and VMSR on page A8-660. For example:
VMRS <Rt>, FPSCR ; Read Floating-point Status and Control Register
VMSR FPSCR, <Rt> ; Write Floating-point Status and Control Register
A2.6.5 VFPv3U
VFPv3 does not support the exception trap enable bits in the FPSCR, see Floating-point Status and Control
Register (FPSCR) on page A2-28. All floating-point exceptions are untrapped.
The VFPv3U variant of the VFPv3 architecture implements the exception trap enable bits in the FPSCR,
and provides exception handling as described in VFP support code on page B1-70. There is a separate trap
enable bit for each of the six floating-point exceptions described in Floating-point exceptions on
page A2-42. The VFPv3U architecture is otherwise identical to VFPv3.
Trapped exception handling never causes the corresponding cumulative exception bit of the FPSCR to be
set to 1. If this behavior is desired, the trap handler routine must use a read, modify, write sequence on the
FPSCR to set the cumulative exception bit.
VFPv3U is backwards compatible with VFPv2.

A2.7 Floating-point data types and arithmetic
The VFP extension supports single-precision (32-bit) and double-precision (64-bit) floating-point data
types and arithmetic as defined by the IEEE 754 floating-point standard. It also supports the ARM Standard
modifications to that arithmetic described in Flush-to-zero on page A2-39 and NaN handling and the
Default NaN on page A2-41.
Trapped floating-point exception handling is supported in the VFPv3U variant only (see VFPv3U on
page A2-31).
ARM standard floating-point arithmetic means IEEE 754 floating-point arithmetic with the ARM standard
modifications and:
• the Round to Nearest rounding mode selected
• untrapped exception handling selected for all floating-point exceptions.
The Advanced SIMD extension only supports single-precision ARM standard floating-point arithmetic.
Note
Implementations of the VFP extension require support code to be installed in the system if trapped
floating-point exception handling is required. See VFP support code on page B1-70.
They might also require support code to be installed in the system to support other aspects of their
floating-point arithmetic. It is IMPLEMENTATION DEFINED which aspects of VFP floating-point arithmetic
are supported in a system without support code installed.
Aspects of floating-point arithmetic that are implemented in support code are likely to run much more
slowly than those that are executed in hardware.
ARM recommends that:
• To maximize the chance of getting high floating-point performance, software developers use ARM
standard floating-point arithmetic.
• Software developers check whether their systems have support code installed, and if not, observe the
IMPLEMENTATION DEFINED restrictions on what operations their VFP implementation can handle
without support code.
• VFP implementation developers implement at least ARM standard floating-point arithmetic in
hardware, so that it can be executed without any need for support code.

A2.7.1 ARM standard floating-point input and output values
ARM standard floating-point arithmetic supports the following input formats defined by the IEEE 754
floating-point standard:
• Zeros.
• Normalized numbers.
• Denormalized numbers are flushed to 0 before floating-point operations. For details, see
Flush-to-zero on page A2-39.
• NaNs.
• Infinities.
ARM standard floating-point arithmetic supports the Round to Nearest rounding mode defined by the IEEE
754 standard.
ARM standard floating-point arithmetic supports the following output result formats defined by the IEEE
754 standard:
• Zeros.
• Normalized numbers.
• Results that are less than the minimum normalized number are flushed to zero, see Flush-to-zero on
page A2-39.
• NaNs produced in floating-point operations are always the default NaN, see NaN handling and the
Default NaN on page A2-41.
• Infinities.

A2.7.2 Advanced SIMD and VFP single-precision format
The single-precision floating-point format used by the Advanced SIMD and VFP extensions is as defined
by the IEEE 754 standard.
This description includes ARM-specific details that are left open by the standard. It is only intended as an
introduction to the formats and to the values they can contain. For full details, especially of the handling of
infinities, NaNs and signed zeros, see the IEEE 754 standard.
A single-precision value is a 32-bit word, and must be word-aligned when held in memory. It has the format:
Bit [31]      S (sign)
Bits [30:23]  exponent
Bits [22:0]   fraction
The interpretation of the format depends on the value of the exponent field, bits [30:23]:
0 < exponent < 0xFF
The value is a normalized number and is equal to:
$(-1)^S \times 2^{(\text{exponent} - 127)} \times (1.\text{fraction})$
The minimum positive normalized number is $2^{-126}$, or approximately $1.175 \times 10^{-38}$.
The maximum positive normalized number is $(2 - 2^{-23}) \times 2^{127}$, or approximately
$3.403 \times 10^{38}$.
exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros:
+0 when S==0
–0 when S==1.
These usually behave identically. In particular, the result is equal if +0 and –0
are compared as floating-point numbers. However, they yield different results in
some circumstances. For example, the sign of the infinity produced as the result
of dividing by zero depends on the sign of the zero. The two zeros can be
distinguished from each other by performing an integer comparison of the two
words.
fraction != 0
The value is a denormalized number and is equal to:
$(-1)^S \times 2^{-126} \times (0.\text{fraction})$
The minimum positive denormalized number is $2^{-149}$, or approximately $1.401 \times 10^{-45}$.
Denormalized numbers are flushed to zero in the Advanced SIMD extension. They are
optionally flushed to zero in the VFP extension. For details see Flush-to-zero on
page A2-39.

exponent == 0xFF
The value is either an infinity or a Not a Number (NaN), depending on the fraction bits:
fraction == 0
The value is an infinity. There are two distinct infinities:
+∞ When S==0. This represents all positive numbers that are too big to
be represented accurately as a normalized number.
-∞ When S==1. This represents all negative numbers with an absolute
value that is too big to be represented accurately as a normalized
number.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
In the VFP architecture, the two types of NaN are distinguished on the basis of
their most significant fraction bit, bit [22]:
bit [22] == 0
The NaN is a signaling NaN. The sign bit can take any value, and
the remaining fraction bits can take any value except all zeros.
bit [22] == 1
The NaN is a quiet NaN. The sign bit and remaining fraction bits
can take any value.
For details of the default NaN see NaN handling and the Default NaN on page A2-41.
Note
NaNs with different sign or fraction bits are distinct NaNs, but this does not mean you can use floating-point
comparison instructions to distinguish them. This is because the IEEE 754 standard specifies that a NaN
compares as unordered with everything, including itself. However, you can use integer comparisons to
distinguish different NaNs.
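The interpretation rules for the single-precision format can be sketched as a classifier over the 32-bit
word. This Python sketch is illustrative only and the name classify_single is hypothetical. It returns the
kind of value and, for numbers and infinities, the value as a Python float.

```python
def classify_single(word: int):
    """Classify a 32-bit single-precision bit pattern as (kind, value)."""
    s = (word >> 31) & 1
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if exponent == 0xFF:
        if fraction == 0:
            return ("infinity", sign * float("inf"))
        # Bit [22] is the most significant fraction bit.
        kind = "quiet NaN" if (fraction >> 22) & 1 else "signaling NaN"
        return (kind, None)
    if exponent == 0:
        if fraction == 0:
            return ("zero", sign * 0.0)
        return ("denormalized", sign * 2.0**-126 * (fraction / 2.0**23))
    return ("normalized", sign * 2.0**(exponent - 127) * (1 + fraction / 2.0**23))


one = classify_single(0x3F800000)
min_denormal = classify_single(0x00000001)
plus_zero = classify_single(0x00000000)
minus_zero = classify_single(0x80000000)
```

In this sketch plus_zero and minus_zero hold values that compare equal as floating-point numbers, while the
words 0x00000000 and 0x80000000 differ as integers. The same holds for distinguishing NaNs, which is why
the classifier reports a NaN by kind rather than by value.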

A2.7.3 VFP double-precision format
The double-precision floating-point format used by the VFP extension is as defined by the IEEE 754
standard.
This description includes VFP-specific details that are left open by the standard. It is only intended as an
introduction to the formats and to the values they can contain. For full details, especially of the handling of
infinities, NaNs and signed zeros, see the IEEE 754 standard.
A double-precision value consists of two 32-bit words, with the formats:
Most significant word:
Bit [31]      S (sign)
Bits [30:20]  exponent
Bits [19:0]   fraction[51:32]
Least significant word:
Bits [31:0]   fraction[31:0]
When held in memory, the two words must appear consecutively and must both be word-aligned. The order
of the two words depends on the endianness of the memory system:
• In a little-endian memory system, the least significant word appears at the lower memory address and
the most significant word at the higher memory address.
• In a big-endian memory system, the most significant word appears at the lower memory address and
the least significant word at the higher memory address.
Double-precision values represent numbers, infinities and NaNs in a similar way to single-precision values,
with the interpretation of the format depending on the value of the exponent:
0 < exponent < 0x7FF
The value is a normalized number and is equal to:
$(-1)^S \times 2^{(\text{exponent} - 1023)} \times (1.\text{fraction})$
The minimum positive normalized number is $2^{-1022}$, or approximately $2.225 \times 10^{-308}$.
The maximum positive normalized number is $(2 - 2^{-52}) \times 2^{1023}$, or approximately
$1.798 \times 10^{308}$.
exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros that behave analogously to the
two single-precision zeros:
+0 when S==0
–0 when S==1.

fraction != 0
The value is a denormalized number and is equal to:
$(-1)^S \times 2^{-1022} \times (0.\text{fraction})$
The minimum positive denormalized number is $2^{-1074}$, or approximately $4.941 \times 10^{-324}$.
Optionally, denormalized numbers are flushed to zero in the VFP extension. For details see
Flush-to-zero on page A2-39.
exponent == 0x7FF
The value is either an infinity or a NaN, depending on the fraction bits:
fraction == 0
The value is an infinity. As for single-precision, there are two infinities:
+∞ Plus infinity, when S==0
-∞ Minus infinity, when S==1.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
In the VFP architecture, the two types of NaN are distinguished on the basis of
their most significant fraction bit, bit [19] of the most significant word:
bit [19] == 0
The NaN is a signaling NaN. The sign bit can take any value, and
the remaining fraction bits can take any value except all zeros.
bit [19] == 1
The NaN is a quiet NaN. The sign bit and the remaining fraction bits
can take any value.
For details of the default NaN see NaN handling and the Default NaN on page A2-41.
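The split of a double-precision value into two words, and the order of those words in memory, can be
sketched as follows. This Python sketch is illustrative only. It uses the standard struct module to obtain
the IEEE 754 bit pattern, and the names double_words, memory_words and double_fields are hypothetical.

```python
import struct


def double_words(value: float):
    """Return (most significant word, least significant word) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    return bits >> 32, bits & 0xFFFFFFFF


def memory_words(value: float, little_endian: bool):
    """Return the two words in increasing memory address order."""
    msw, lsw = double_words(value)
    return (lsw, msw) if little_endian else (msw, lsw)


def double_fields(msw: int, lsw: int):
    """Extract (S, exponent, fraction) from the two words."""
    s = (msw >> 31) & 1
    exponent = (msw >> 20) & 0x7FF
    fraction = ((msw & 0xFFFFF) << 32) | lsw   # fraction[51:32] : fraction[31:0]
    return s, exponent, fraction


one_le = memory_words(1.0, little_endian=True)
one_be = memory_words(1.0, little_endian=False)
minus_two = double_fields(*double_words(-2.0))
```

Here one_le places the least significant word first and one_be places the most significant word first,
matching the two bullet points on word order.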

A2.7.4 Advanced SIMD and VFP half-precision formats
Two half-precision floating-point formats are used by the half-precision extensions to Advanced SIMD and
VFP:
• IEEE half-precision, as described in the revised IEEE 754 standard
• Alternative half-precision.
The description of IEEE half-precision includes ARM-specific details that are left open by the standard, and
is only an introduction to the formats and to the values they can contain. For more information, especially
on the handling of infinities, NaNs and signed zeros, see the IEEE 754 standard.
For both half-precision floating-point formats, the layout of the 16-bit number is the same. The format is:
Bit [15]      S (sign)
Bits [14:10]  exponent
Bits [9:0]    fraction
The interpretation of the format depends on the value of the exponent field, bits [14:10], and on which
half-precision format is being used.
0 < exponent < 0x1F
The value is a normalized number and is equal to:
$(-1)^S \times 2^{(\text{exponent} - 15)} \times (1.\text{fraction})$
The minimum positive normalized number is $2^{-14}$, or approximately $6.104 \times 10^{-5}$.
The maximum positive normalized number is $(2 - 2^{-10}) \times 2^{15}$, or 65504.
Larger normalized numbers can be expressed using the alternative format when the exponent == 0x1F.
exponent == 0
The value is either a zero or a denormalized number, depending on the fraction bits:
fraction == 0
The value is a zero. There are two distinct zeros:
+0 when S==0
–0 when S==1.
fraction != 0
The value is a denormalized number and is equal to:
$(-1)^S \times 2^{-14} \times (0.\text{fraction})$
The minimum positive denormalized number is $2^{-24}$, or approximately $5.960 \times 10^{-8}$.

exponent == 0x1F
The value depends on which half-precision format is being used:
IEEE Half-precision
The value is either an infinity or a Not a Number (NaN), depending on the
fraction bits:
fraction == 0
The value is an infinity. There are two distinct infinities:
+∞ When S==0. This represents all positive
numbers that are too big to be represented
accurately as a normalized number.
-∞ When S==1. This represents all negative
numbers with an absolute value that is too
big to be represented accurately as a
normalized number.
fraction != 0
The value is a NaN, and is either a quiet NaN or a signaling NaN.
The two types of NaN are distinguished by their most significant
fraction bit, bit [9]:
bit [9] == 0 The NaN is a signaling NaN. The sign bit
can take any value, and the remaining
fraction bits can take any value except all
zeros.
bit [9] == 1 The NaN is a quiet NaN. The sign bit and
remaining fraction bits can take any value.
Alternative Half-precision
The value is a normalized number and is equal to:
$(-1)^S \times 2^{16} \times (1.\text{fraction})$
The maximum positive normalized number is $(2 - 2^{-10}) \times 2^{16}$, or 131008.
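The difference between the two half-precision formats can be sketched as a single decoder that takes the
format selection as a parameter. This Python sketch is illustrative only. The name decode_half is
hypothetical, and the ahp parameter is assumed to carry the value of the FPSCR.AHP bit.

```python
def decode_half(h: int, ahp: int) -> float:
    """Decode a 16-bit half-precision pattern; ahp == 1 selects the alternative format."""
    s = (h >> 15) & 1
    exponent = (h >> 10) & 0x1F
    fraction = h & 0x3FF
    sign = -1.0 if s else 1.0
    if exponent == 0:
        # Zero or denormalized number, the same in both formats.
        return sign * 2.0**-14 * (fraction / 1024.0)
    if exponent == 0x1F and ahp == 0:
        # IEEE format only: infinity or NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    # Normalized number. In the alternative format this includes exponent == 0x1F.
    return sign * 2.0**(exponent - 15) * (1 + fraction / 1024.0)


ieee_max = decode_half(0x7BFF, ahp=0)
alt_max = decode_half(0x7FFF, ahp=1)
```

The pattern 0x7C00 shows the difference most directly: it decodes as an infinity in the IEEE format and as
a normalized number in the alternative format.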
A2.7.5 Flush-to-zero
The performance of floating-point implementations can be significantly reduced when performing
calculations involving denormalized numbers and Underflow exceptions. In particular this occurs for
implementations that only handle normalized numbers and zeros in hardware, and invoke support code to
handle any other types of value. For an algorithm where a significant number of the operands and
intermediate results are denormalized numbers, this can result in a considerable loss of performance.
In many of these algorithms, this performance can be recovered, without significantly affecting the accuracy
of the final result, by replacing the denormalized operands and intermediate results with zeros. To permit
this optimization, VFP implementations have a special processing mode called Flush-to-zero mode.
Advanced SIMD implementations always use Flush-to-zero mode.

Behavior in Flush-to-zero mode differs from normal IEEE 754 arithmetic in the following ways:
• All inputs to floating-point operations that are double-precision denormalized numbers or
single-precision denormalized numbers are treated as though they were zero. This causes an Input
Denormal exception, but does not cause an Inexact exception. The Input Denormal exception occurs
only in Flush-to-zero mode.
The FPSCR contains a cumulative exception bit FPSCR.IDC and trap enable bit FPSCR.IDE
corresponding to the Input Denormal exception. For details of how these are used when processing
the exception see Advanced SIMD and VFP system registers on page A2-28.
The occurrence of all exceptions except Input Denormal is determined using the input values after
flush-to-zero processing has occurred.
• The result of a floating-point operation is flushed to zero if the result of the operation before rounding
satisfies the condition:
0 < Abs(result) < MinNorm
where MinNorm == $2^{-126}$ for single-precision and MinNorm == $2^{-1022}$ for double-precision.
This causes the FPSCR.UFC bit to be set to 1, and prevents any Inexact exception from occurring for
the operation.
Underflow exceptions occur only when a result is flushed to zero.
In a VFPv2 or VFPv3U implementation Underflow exceptions that occur in Flush-to-zero mode are
always treated as untrapped, even when the Underflow trap enable bit, FPSCR.UFE, is set to 1.
• An Inexact exception does not occur if the result is flushed to zero, even though the final result of
zero is not equivalent to the value that would be produced if the operation were performed with
unbounded precision and exponent range.
For information on the FPSCR bits see Floating-point Status and Control Register (FPSCR) on page A2-28.
When an input or a result is flushed to zero the value of the sign bit of the zero is determined as follows:
• In VFPv3 or VFPv3U, it is preserved. That is, the sign bit of the zero matches the sign bit of the input
or result that is being flushed to zero.
• In VFPv2, it is IMPLEMENTATION DEFINED whether it is preserved or always positive. The same
choice must be made for all cases of flushing an input or result to zero.
Flush-to-zero mode has no effect on half-precision numbers that are inputs to floating-point operations, or
results from floating-point operations.
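The two flushing rules can be sketched for single-precision as follows. This Python sketch is illustrative
only. It assumes the VFPv3 behavior in which the sign of the flushed value is preserved, it models the
unrounded result as a Python float, and the names flush_input_single and flush_result_single are
hypothetical.

```python
import math

MIN_NORM_SINGLE = 2.0 ** -126


def flush_input_single(word: int):
    """Return (word, idc) for a 32-bit input in Flush-to-zero mode.

    A denormalized input is replaced by a zero of the same sign, and idc
    reports that an Input Denormal exception occurred.
    """
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if exponent == 0 and fraction != 0:
        return word & 0x80000000, True
    return word, False


def flush_result_single(unrounded: float):
    """Return (result, ufc) for an unrounded result in Flush-to-zero mode.

    A result with 0 < Abs(result) < MinNorm becomes a zero of the same sign,
    and ufc reports that the FPSCR.UFC bit would be set to 1.
    """
    if 0 < abs(unrounded) < MIN_NORM_SINGLE:
        return math.copysign(0.0, unrounded), True
    return unrounded, False


flushed_input = flush_input_single(0x80000001)      # negative denormalized input
flushed_result = flush_result_single(-2.0 ** -130)  # tiny negative result
```

Neither helper models the Inexact exception, because flushing an input or a result does not cause one.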

Note
Flush-to-zero mode is incompatible with the IEEE 754 standard, and must not be used when IEEE 754
compatibility is a requirement. Flush-to-zero mode must be treated with care. Although it can lead to a major
performance increase on many algorithms, there are significant limitations on its use. These are application
dependent:
• On many algorithms, it has no noticeable effect, because the algorithm does not normally use
denormalized numbers.
• On other algorithms, it can cause exceptions to occur or seriously reduce the accuracy of the results
of the algorithm.
A2.7.6 NaN handling and the Default NaN
The IEEE 754 standard specifies that:
• an operation that produces an Invalid Operation floating-point exception generates a quiet NaN as its
result if that exception is untrapped
• an operation involving a quiet NaN operand, but not a signaling NaN operand, returns an input NaN
as its result.
The VFP behavior when Default NaN mode is disabled adheres to this with the following extra details,
where the first operand means the first argument to the pseudocode function call that describes the
operation:
• If an untrapped Invalid Operation floating-point exception is produced because one of the operands
is a signaling NaN, the quiet NaN result is equal to the signaling NaN with its most significant
fraction bit changed to 1. If both operands are signaling NaNs, the result is produced in this way from
the first operand.
• If an untrapped Invalid Operation floating-point exception is produced for other reasons, the quiet
NaN result is the Default NaN.
• If both operands are quiet NaNs, the result is the first operand.
The VFP behavior when Default NaN mode is enabled, and the Advanced SIMD behavior in all
circumstances, is that the Default NaN is the result of all floating-point operations that:
• generate untrapped Invalid Operation floating-point exceptions
• have one or more quiet NaN inputs.
Table A2-6 on page A2-42 shows the format of the default NaN for ARM floating-point processors.
Default NaN mode is selected for VFP by setting the FPSCR.DN bit to 1, see Floating-point Status and
Control Register (FPSCR) on page A2-28.

Other aspects of the functionality of the Invalid Operation exception are not affected by Default NaN mode.
These are that:
• If untrapped, it causes the FPSCR.IOC bit to be set to 1.
• If trapped, it causes a user trap handler to be invoked. This is only possible in VFPv2 and VFPv3U.
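The NaN selection rules for a two-operand single-precision operation can be sketched as follows. This
Python sketch is illustrative only. It assumes untrapped exception handling, the dn parameter is assumed to
carry the value of the FPSCR.DN bit, and the names is_nan, is_snan and process_nans are hypothetical.

```python
DEFAULT_NAN = 0x7FC00000   # sign 0, exponent 0xFF, bit [22] == 1, bits [21:0] == 0


def is_nan(word: int) -> bool:
    return ((word >> 23) & 0xFF) == 0xFF and (word & 0x7FFFFF) != 0


def is_snan(word: int) -> bool:
    return is_nan(word) and ((word >> 22) & 1) == 0


def process_nans(op1: int, op2: int, dn: int):
    """Return (result, ioc) if either operand is a NaN, otherwise None.

    ioc reports whether an Invalid Operation exception occurred, which is
    the case when at least one operand is a signaling NaN.
    """
    if not (is_nan(op1) or is_nan(op2)):
        return None
    ioc = is_snan(op1) or is_snan(op2)
    if dn:
        return DEFAULT_NAN, ioc
    if is_snan(op1):
        return op1 | (1 << 22), ioc      # quieted copy of the first operand
    if is_snan(op2):
        return op2 | (1 << 22), ioc
    return (op1 if is_nan(op1) else op2), ioc


both_quiet = process_nans(0x7FC00005, 0x7FC00009, dn=0)
signaling_second = process_nans(0x7FC00005, 0xFF800002, dn=0)
```

With dn set to 1 the result is DEFAULT_NAN for every NaN input, which also describes the Advanced SIMD
behavior.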
A2.7.7 Floating-point exceptions
The Advanced SIMD and VFP extensions record the following floating-point exceptions in the FPSCR
cumulative flags, see Floating-point Status and Control Register (FPSCR) on page A2-28:
IOC Invalid Operation. The flag is set to 1 if the result of an operation has no mathematical value
or cannot be represented. Examples include infinity * 0 and +infinity + (–infinity).
These tests are made after flush-to-zero processing. For example, if flush-to-zero mode is
selected, multiplying a denormalized number and an infinity is treated as 0 * infinity and
causes an Invalid Operation floating-point exception.
IOC is also set on any floating-point operation with one or more signaling NaNs as
operands, except for negation and absolute value, as described in Negation and absolute
value on page A2-47.
DZC Division by Zero. The flag is set to 1 if a divide operation has a zero divisor and a dividend
that is not zero, an infinity or a NaN. These tests are made after flush-to-zero processing, so
if flush-to-zero processing is selected, a denormalized dividend is treated as zero and
prevents Division by Zero from occurring, and a denormalized divisor is treated as zero and
causes Division by Zero to occur if the dividend is a normalized number.
For the reciprocal and reciprocal square root estimate functions the dividend is assumed to
be +1.0. This means that a zero or denormalized operand to these functions sets the DZC
flag.
OFC Overflow. The flag is set to 1 if the absolute value of the result of an operation, produced
after rounding, is greater than the maximum positive normalized number for the destination
precision.
UFC Underflow. The flag is set to 1 if the absolute value of the result of an operation, produced
before rounding, is less than the minimum positive normalized number for the destination
precision, and the rounded result is inexact.
Table A2-6 Default NaN encoding
            Half-precision, IEEE Format       Single-precision                  Double-precision
Sign bit    0                                 0 (a)                             0 (a)
Exponent    0x1F                              0xFF                              0x7FF
Fraction    bit [9] == 1, bits [8:0] == 0     bit [22] == 1, bits [21:0] == 0   bit [51] == 1, bits [50:0] == 0
a. In VFPv2, the sign bit of the Default NaN is UNKNOWN.

The criteria for the Underflow exception to occur are different in Flush-to-zero mode. For
details, see Flush-to-zero on page A2-39.
IXC Inexact. The flag is set to 1 if the result of an operation is not equivalent to the value that
would be produced if the operation were performed with unbounded precision and exponent
range.
The criteria for the Inexact exception to occur are different in Flush-to-zero mode. For
details, see Flush-to-zero on page A2-39.
IDC Input Denormal. The flag is set to 1 if a denormalized input operand is replaced in the
computation by a zero, as described in Flush-to-zero on page A2-39.
With the Advanced SIMD extension and the VFPv3 extension these are non-trapping exceptions and the
data-processing instructions do not generate any trapped exceptions.
With the VFPv2 and VFPv3U extensions:
• These exceptions can be trapped, by setting trap enable flags in the FPSCR, see VFPv3U on
page A2-31. Trapped floating-point exceptions are delivered to user code in an IMPLEMENTATION
DEFINED fashion.
• The definitions of the floating-point exceptions change as follows:
— if the Underflow exception is trapped, it occurs if the absolute value of the result of an
operation, produced before rounding, is less than the minimum positive normalized number
for the destination precision, regardless of whether the rounded result is inexact
— higher priority trapped exceptions can prevent lower priority exceptions from occurring, as
described in Combinations of exceptions on page A2-44.
Table A2-7 shows the default results of the floating-point exceptions:
Table A2-7 Floating-point exception default results
Exception type            Default result for positive sign    Default result for negative sign
IOC, Invalid Operation    Quiet NaN                           Quiet NaN
DZC, Division by Zero     +∞ (plus infinity)                  –∞ (minus infinity)
OFC, Overflow             RN, RP: +∞ (plus infinity)          RN, RM: –∞ (minus infinity)
                          RM, RZ: +MaxNorm                    RP, RZ: –MaxNorm
UFC, Underflow            Normal rounded result               Normal rounded result
IXC, Inexact              Normal rounded result               Normal rounded result
IDC, Input Denormal       Normal rounded result               Normal rounded result

In Table A2-7 on page A2-43:
MaxNorm The maximum normalized number of the destination precision
RM Round towards Minus Infinity mode, as defined in the IEEE 754 standard
RN Round to Nearest mode, as defined in the IEEE 754 standard
RP Round towards Plus Infinity mode, as defined in the IEEE 754 standard
RZ Round towards Zero mode, as defined in the IEEE 754 standard
• For Invalid Operation exceptions, for details of which quiet NaN is produced as the default result see
NaN handling and the Default NaN on page A2-41.
• For Division by Zero exceptions, the sign bit of the default result is determined normally for a
division. This means it is the exclusive OR of the sign bits of the two operands.
• For Overflow exceptions, the sign bit of the default result is determined normally for the overflowing
operation.
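
The Division by Zero and Overflow rows of Table A2-7 can be expressed as two small Python functions. This is a minimal sketch with hypothetical names. It assumes single precision for MaxNorm and takes the sign of the result as an input rather than deriving it from an actual operation.

```python
import math
import struct

# MaxNorm for single precision: the largest finite single-precision value.
MAX_NORM_SINGLE = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]


def division_by_zero_default(dividend_negative: bool,
                             divisor_negative: bool) -> float:
    """Default result for Division by Zero.

    The sign is the exclusive OR of the two operand sign bits.
    """
    negative = dividend_negative != divisor_negative
    return -math.inf if negative else math.inf


def overflow_default(negative: bool, rounding: str,
                     max_norm: float = MAX_NORM_SINGLE) -> float:
    """Default result for Overflow, given the result sign and rounding mode.

    rounding is one of "RN", "RP", "RM", "RZ".
    """
    if rounding not in ("RN", "RP", "RM", "RZ"):
        raise ValueError("unknown rounding mode: " + rounding)
    if negative:
        # Modes that round away from zero for negative values give -infinity.
        return -math.inf if rounding in ("RN", "RM") else -max_norm
    # Modes that round away from zero for positive values give +infinity.
    return math.inf if rounding in ("RN", "RP") else max_norm
```

In this sketch, RZ never produces an infinity, and RP and RM produce one only when the result sign matches the rounding direction.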
Combinations of exceptions
The following pseudocode functions perform floating-point operations:
FixedToFP()
FPAbs()
FPAdd()
FPCompare()
FPCompareGE()
FPCompareGT()
FPDiv()
FPDoubleToSingle()
FPMax()
FPMin()
FPMul()
FPNeg()
FPRecipEstimate()
FPRecipStep()
FPRSqrtEstimate()
FPRSqrtStep()
FPSingleToDouble()
FPSqrt()
FPSub()
FPToFixed()
All of these operations except FPAbs() and FPNeg() can generate floating-point exceptions.
More than one exception can occur on the same operation. The only combinations of exceptions that can
occur are:
• Overflow with Inexact
• Underflow with Inexact
• Input Denormal with other exceptions.
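
The permitted combinations listed above can be checked with a short Python sketch. The function name is hypothetical. The reading that Input Denormal can accompany any otherwise permitted set, including Overflow with Inexact, is an assumption based on the wording "with other exceptions".

```python
# Mnemonics for the six floating-point exceptions, as used in Table A2-7.
KNOWN = {"IOC", "DZC", "OFC", "UFC", "IXC", "IDC"}


def is_allowed_combination(exceptions) -> bool:
    """Return True if the given set of exceptions can occur on one operation."""
    exceptions = set(exceptions)
    if not exceptions <= KNOWN:
        raise ValueError("unknown exception name")
    # Assumption: Input Denormal can accompany any otherwise permitted set.
    rest = exceptions - {"IDC"}
    if len(rest) <= 1:
        return True  # no exception or a single exception
    return rest == {"OFC", "IXC"} or rest == {"UFC", "IXC"}
```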

When none of the exceptions caused by an operation are trapped, any exception that occurs causes the
associated cumulative flag in the FPSCR to be set.
When one or more exceptions caused by an operation are trapped, the behavior of the instruction depends
on the priority of the exceptions. The Inexact exception is treated as lowest priority, and Input Denormal as
highest priority:
• If the higher priority exception is trapped, its trap handler is called. It is IMPLEMENTATION DEFINED
whether the parameters to the trap handler include information about the lower priority exception.
Apart from this, the lower priority exception is ignored in this case.
• If the higher priority exception is untrapped, its cumulative bit is set to 1 and its default result is
evaluated. Then the lower priority exception is handled normally, using this default result.
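
The priority rules above can be modeled as a loop over the exceptions raised by one operation, from highest to lowest priority. This Python sketch is a simplification with hypothetical names. It assumes the raised set is one of the permitted combinations, so at most one exception falls between Input Denormal and Inexact in priority. It does not model default results or the trap handler itself.

```python
def rank(name: str) -> int:
    """Priority rank: Input Denormal highest, Inexact lowest."""
    if name == "IDC":
        return 0
    if name == "IXC":
        return 2
    return 1  # IOC, DZC, OFC or UFC


def resolve(raised, trap_enabled, cumulative_flags):
    """Process the exceptions raised by one operation.

    Returns the name of the exception whose trap handler is called,
    or None if no raised exception is trapped. Untrapped exceptions
    that are handled set their bit in cumulative_flags (a set that is
    modified in place).
    """
    for exc in sorted(raised, key=lambda n: (rank(n), n)):
        if exc in trap_enabled:
            # Trapped: call its handler; lower priority exceptions are ignored.
            return exc
        # Untrapped: set the cumulative bit, then continue with the next one.
        cumulative_flags.add(exc)
    return None
```

For Overflow with Inexact where only Inexact is trapped, this model sets the Overflow cumulative bit and then reports Inexact as the trapped exception.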
Some floating-point instructions specify more than one floating-point operation, as indicated by the
pseudocode descriptions of the instruction. In such cases, an exception on one operation is treated as higher
priority than an exception on another operation if the occurrence of the second exception depends on the
result of the first operation. Otherwise, it is UNPREDICTABLE which exception is treated as higher priority.
For example, a VMLA.F32 instruction specifies a floating-point multiplication followed by a floating-point
addition. The addition can generate Overflow, Underflow and Inexact exceptions, all of which depend on
both operands to the addition and so are treated as lower priority than any exception on the multiplication.
The same applies to Invalid Operation exceptions on the addition caused by adding opposite-signed
infinities.
The addition can also generate an Input Denormal exception, caused by the addend being a denormalized
number while in Flush-to-zero mode. It is UNPREDICTABLE which of an Input Denormal exception on the
addition and an exception on the multiplication is treated as higher priority, because the occurrence of the
Input Denormal exception does not depend on the result of the multiplication. The same applies to an Invalid
Operation exception on the addition caused by the addend being a signaling NaN.
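
The VMLA.F32 cases above reduce to one question: does the exception on the addition depend on the result of the multiplication? The Python sketch below tabulates the cases described in the text. The table, the cause labels and the function name are illustrative and are not architectural terms.

```python
# For each (exception, cause) on the addition step of VMLA.F32, record
# whether its occurrence depends on the result of the multiplication.
# The cause labels are illustrative names, not architectural terms.
DEPENDS_ON_PRODUCT = {
    ("OFC", "addition_result"): True,
    ("UFC", "addition_result"): True,
    ("IXC", "addition_result"): True,
    ("IOC", "opposite_signed_infinities"): True,
    ("IDC", "denormal_addend"): False,
    ("IOC", "signaling_nan_addend"): False,
}


def priority_vs_multiplication(exception: str, cause: str) -> str:
    """Priority of an addition exception relative to a multiplication exception.

    Returns "lower" when the addition exception depends on the product,
    and "UNPREDICTABLE" when it does not.
    """
    if DEPENDS_ON_PRODUCT[(exception, cause)]:
        return "lower"
    return "UNPREDICTABLE"
```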
Note
Like other details of VFP instruction execution, these rules about exception handling apply to the overall
results produced by an instruction when the system uses a combination of hardware and support code to
implement it. See VFP support code on page B1-70 for more information.
These principles also apply to the multiple floating-point operations generated by VFP instructions in the
deprecated VFP vector mode of operation. For details of this mode of operation see Appendix F VFP Vector
Operation Support.