Xtensa Instruction Set Architecture (ISA) Reference Manual ASSEMBLER GUIDE

Page Count: 662

Xtensa®
Instruction Set Architecture (ISA)
Reference Manual

For All Xtensa Processor Cores

Tensilica, Inc.
3255-6 Scott Blvd.
Santa Clara, CA 95054
(408) 986-8000
fax (408) 986-8919
www.tensilica.com

© 2010 Tensilica, Inc.
Printed in the United States of America
All Rights Reserved
This publication is provided “AS IS.” Tensilica, Inc. (hereafter “Tensilica”) does not make any warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose.
Information in this document is provided solely to enable system and software developers to use Tensilica processors. Unless specifically set forth herein, there are no express or implied patent, copyright or any other intellectual property rights or licenses granted hereunder to design or fabricate Tensilica integrated circuits or integrated circuits based on the information in this document. Tensilica does not warrant that the contents of this publication, whether individually or as one or more groups, meets your requirements or that the publication is error-free. This publication could include technical inaccuracies or typographical errors. Changes may be made to the information herein, and these changes may be incorporated in new editions of this publication.
Tensilica and Xtensa are registered trademarks of Tensilica, Inc. The following terms are trademarks of Tensilica, Inc.: FLIX, OSKit, Sea of Processors, TurboXim, Vectra, Xenergy, Xplorer, and XPRES. All other trademarks and registered trademarks are the property of their respective companies.

Issue Date: 4/2010
RC-2010.1 Release
PD-09-0801-10-01


Contents

1. Introduction ................................................................................................................ 1
1.1 What Problem is Tensilica Solving? ............................................................................. 1
1.1.1 Adding Architectural Enhancements .................................................................. 1
1.1.2 Creating Custom Processor Configurations ........................................................ 4
1.1.3 Mapping the Architecture into Hardware ............................................................. 4
1.1.4 Development and Verification Tools ................................................................... 5
1.2 The Xtensa Instruction Set Architecture ....................................................................... 5
1.2.1 Configurability ................................................................................................. 7
1.2.2 Extensibility ..................................................................................................... 8
1.2.2.1 State Extensions ..................................................................................... 9
1.2.2.2 Register File Extensions .......................................................................... 9
1.2.2.3 Instruction Extensions .............................................................................. 9
1.2.2.4 Coprocessor Extensions .......................................................................... 9
1.2.3 Time-to-Market ................................................................................................ 9
1.2.4 Code Density ................................................................................................ 10
1.2.5 Low Implementation Cost ............................................................................... 10
1.2.6 Low-Power .................................................................................................... 11
1.2.7 Performance ................................................................................................. 11
1.2.8 Pipelines ....................................................................................................... 12
1.3 The Xtensa Processor Generator.............................................................................. 13
1.3.1 Processor Configuration ................................................................................. 13
1.3.2 System-Specific Instructions—The TIE Language ............................................. 13
2. Notation .................................................................................................................... 17
2.1 Bit and Byte Order .................................................................................................. 17
2.2 Expressions ........................................................................................................... 19
2.3 Unsigned Semantics ............................................................................................... 20
2.4 Case ..................................................................................................................... 20
2.5 Statements ............................................................................................................. 21
2.6 Instruction Fields..................................................................................................... 21
3. Core Architecture ...................................................................................................... 23
3.1 Overview of the Core Architecture ............................................................................ 23
3.2 Processor-Configuration Parameters......................................................................... 23
3.3 Registers ............................................................................................................... 24
3.3.1 General (AR) Registers ................................................................................... 24
3.3.2 Shifts and the Shift Amount Register (SAR) ....................................................... 25
3.3.3 Reading and Writing the Special Registers ....................................................... 26
3.4 Data Formats and Alignment .................................................................................... 26
3.5 Memory ................................................................................................................. 27
3.5.1 Memory Addressing ....................................................................................... 27
3.5.2 Addressing Modes ......................................................................................... 28
3.5.3 Program Counter ........................................................................................... 29
3.5.4 Instruction Fetch ............................................................................................ 29
3.5.4.1 Little-Endian Fetch Semantics ................................................................ 29

3.5.4.2 Big-Endian Fetch Semantics................................................................... 31
3.6 Reset..................................................................................................................... 32
3.7 Exceptions and Interrupts ........................................................................................ 32
3.8 Instruction Summary ............................................................................................... 33
3.8.1 Load Instructions ........................................................................................... 33
3.8.2 Store Instructions ........................................................................................... 36
3.8.3 Memory Access Ordering ............................................................................... 39
3.8.4 Jump and Call Instructions .............................................................................. 40
3.8.5 Conditional Branch Instructions ....................................................................... 40
3.8.6 Move Instructions .......................................................................................... 42
3.8.7 Arithmetic Instructions .................................................................................... 43
3.8.8 Bitwise Logical Instructions ............................................................................. 44
3.8.9 Shift Instructions ............................................................................................ 44
3.8.10 Processor Control Instructions ....................................................................... 45
4. Architectural Options ................................................................................................. 47
4.1 Overview of Options ................................................................................................ 47
4.2 Core Architecture .................................................................................................... 50
4.3 Options for Additional Instructions ............................................................................. 53
4.3.1 Code Density Option ...................................................................................... 53
4.3.1.1 Code Density Option Architectural Additions ............................................ 53
4.3.1.2 Branches .............................................................................................. 54
4.3.2 Loop Option .................................................................................................. 54
4.3.2.1 Loop Option Architectural Additions ......................................................... 55
4.3.2.2 Restrictions on Loops ............................................................................ 55
4.3.2.3 Loops Disabled During Exceptions .......................................................... 56
4.3.2.4 Loopback Semantics ............................................................................. 56
4.3.3 Extended L32R Option ................................................................................... 56
4.3.3.1 Extended L32R Option Architectural Additions.......................................... 56
4.3.3.2 The Literal Base Register ....................................................................... 57
4.3.4 16-bit Integer Multiply Option .......................................................................... 57
4.3.4.1 16-bit Integer Multiply Option Architectural Additions ................................. 58
4.3.5 32-bit Integer Multiply Option .......................................................................... 58
4.3.5.1 32-bit Integer Multiply Option Architectural Additions ................................. 58
4.3.6 32-bit Integer Divide Option ............................................................................ 59
4.3.6.1 32-bit Integer Divide Option Architectural Additions ................................... 59
4.3.7 MAC16 Option............................................................................................... 60
4.3.7.1 MAC16 Option Architectural Additions ..................................................... 60
4.3.7.2 Use With CLAMPS Instruction ................................................................ 62
4.3.8 Miscellaneous Operations Option .................................................................... 62
4.3.8.1 Miscellaneous Operations Option Architectural Additions ........................... 62
4.3.9 Coprocessor Option ....................................................................................... 63
4.3.9.1 Coprocessor Option Architectural Additions ............................................. 64
4.3.9.2 Coprocessor Context Switch................................................................... 64
4.3.10 Boolean Option ............................................................................................ 65
4.3.10.1 Boolean Option Architectural Additions .................................................. 65
4.3.10.2 Booleans ............................................................................................ 66
4.3.11 Floating-Point Coprocessor Option................................................................. 67
4.3.11.1 Floating-Point Coprocessor Option Architectural Additions ....................... 67

4.3.11.2 Floating-Point Representation ............................................................... 69
4.3.11.3 Floating-Point State.............................................................................. 69
4.3.11.4 Floating-Point Exceptions ..................................................................... 71
4.3.11.5 Floating-Point Instructions .................................................................... 71
4.3.12 Multiprocessor Synchronization Option ........................................................... 74
4.3.12.1 Memory Access Ordering ..................................................................... 74
4.3.12.2 Multiprocessor Synchronization Option Architectural Additions ................. 75
4.3.12.3 Inter-Processor Communication with the L32AI and S32RI Instructions ... 76
4.3.13 Conditional Store Option ............................................................................... 77
4.3.13.1 Conditional Store Option Architectural Additions ..................................... 77
4.3.13.2 Exclusive Access with the S32C1I Instruction ........................................ 78
4.3.13.3 Use Models for the S32C1I Instruction .................................................. 79
4.3.13.4 The Atomic Operation Control Register (ATOMCTL) under the Conditional Store Option ... 80
4.3.13.5 Memory Ordering and the S32C1I Instruction ........................................ 81
4.4 Options for Interrupts and Exceptions ........................................................................ 82
4.4.1 Exception Option ........................................................................................... 82
4.4.1.1 Exception Option Architectural Additions .................................................. 83
4.4.1.2 Exception Causes under the Exception Option ......................................... 85
4.4.1.3 The Miscellaneous Program State Register (PS) under the Exception Option ... 87
4.4.1.4 Value of Variables under the Exception Option ......................................... 88
4.4.1.5 The Exception Cause Register (EXCCAUSE) under the Exception Option ..... 89
4.4.1.6 The Exception Virtual Address Register (EXCVADDR) under the Exception Option ... 91
4.4.1.7 The Exception Program Counter (EPC) under the Exception Option ............ 91
4.4.1.8 The Double Exception Program Counter (DEPC) under the Exception Option ... 92
4.4.1.9 The Exception Save Register (EXCSAVE) under the Exception Option ......... 92
4.4.1.10 Handling of Exceptional Conditions under the Exception Option ............... 93
4.4.1.11 Exception Priority under the Exception Option......................................... 96
4.4.2 Relocatable Vector Option .............................................................................. 98
4.4.2.1 Relocatable Vector Option Architectural Additions ..................................... 99
4.4.3 Unaligned Exception Option ............................................................................ 99
4.4.3.1 Unaligned Exception Option Architectural Additions ................................ 100
4.4.4 Interrupt Option ........................................................................................... 100
4.4.4.1 Interrupt Option Architectural Additions .................................................. 101
4.4.4.2 Specifying Interrupts ............................................................................ 102
4.4.4.3 The Level-1 Interrupt Process ............................................................... 105
4.4.4.4 Use of Interrupt Instructions .................................................................. 106
4.4.5 High-Priority Interrupt Option ......................................................................... 106
4.4.5.1 High-Priority Interrupt Option Architectural Additions ............................... 106
4.4.5.2 Specifying High-Priority Interrupts ......................................................... 108
4.4.5.3 The High-Priority Interrupt Process ........................................................ 108
4.4.5.4 Checking for Interrupts ......................................................................... 109
4.4.6 Timer Interrupt Option .................................................................................. 110
4.4.6.1 Timer Interrupt Option Architectural Additions ......................................... 110
4.4.6.2 Clock Counting and Comparison ........................................................... 111
4.5 Options for Local Memory ...................................................................................... 111

4.5.1 General Cache Option Features ..................................................................... 111
4.5.1.1 Cache Terminology .............................................................................. 112
4.5.1.2 Cache Tag Format ............................................................................... 112
4.5.1.3 Cache Prefetch ................................................................................... 113
4.5.2 Instruction Cache Option .............................................................................. 115
4.5.2.1 Instruction Cache Option Architectural Additions ..................................... 115
4.5.3 Instruction Cache Test Option ....................................................................... 116
4.5.3.1 Instruction Cache Test Option Architectural Additions .............................. 116
4.5.4 Instruction Cache Index Lock Option .............................................................. 117
4.5.4.1 Instruction Cache Index Lock Option Architectural Additions .................... 117
4.5.5 Data Cache Option ...................................................................................... 118
4.5.5.1 Data Cache Option Architectural Additions ............................................. 119
4.5.6 Data Cache Test Option ............................................................................... 121
4.5.6.1 Data Cache Test Option Architectural Additions ...................................... 121
4.5.7 Data Cache Index Lock Option ...................................................................... 122
4.5.7.1 Data Cache Index Lock Option Architectural Additions ............................ 122
4.5.8 General RAM/ROM Option Features .............................................................. 123
4.5.9 Instruction RAM Option ................................................................................ 124
4.5.9.1 Instruction RAM Option Architectural Additions ....................................... 124
4.5.10 Instruction ROM Option .............................................................................. 125
4.5.10.1 Instruction ROM Option Architectural Additions ..................................... 125
4.5.11 Data RAM Option ....................................................................................... 126
4.5.11.1 Data RAM Option Architectural Additions ............................................. 126
4.5.12 Data ROM Option ...................................................................................... 126
4.5.12.1 Data ROM Option Architectural Additions ............................................. 127
4.5.13 XLMI Option .............................................................................................. 127
4.5.13.1 XLMI Option Architectural Additions .................................................... 127
4.5.14 Hardware Alignment Option ........................................................................ 128
4.5.15 Memory ECC/Parity Option ......................................................................... 128
4.5.15.1 Memory ECC/Parity Option Architectural Additions ............................... 129
4.5.15.2 Memory Error Information Registers .................................................... 130
4.5.15.3 The Exception Registers .................................................................... 137
4.5.15.4 Memory Error Semantics .................................................................... 137
4.6 Options for Memory Protection and Translation ........................................................ 138
4.6.1 Overview of Memory Management Concepts .................................................. 139
4.6.1.1 Overview of Memory Translation ........................................................... 139
4.6.1.2 Overview of Memory Protection ............................................................ 142
4.6.1.3 Overview of Attributes .......................................................................... 144
4.6.2 The Memory Access Process ........................................................................ 145
4.6.2.1 Choose the TLB .................................................................................. 146
4.6.2.2 Lookup in the TLB ............................................................................... 147
4.6.2.3 Check the Access Rights ..................................................................... 148
4.6.2.4 Direct the Access to Local Memory ....................................................... 148
4.6.2.5 Direct the Access to PIF ....................................................................... 150
4.6.2.6 Direct the Access to Cache .................................................................. 150
4.6.3 Region Protection Option .............................................................................. 150
4.6.3.1 Region Protection Option Architectural Additions .................................... 151
4.6.3.2 Formats for Accessing Region Protection Option TLB Entries .................. 152
4.6.3.3 Region Protection Option Memory Attributes .......................................... 154

4.6.4 Region Translation Option ............................................................................ 156
4.6.4.1 Region Translation Option Architectural Additions ................................... 156
4.6.4.2 Region Translation Option Formats for Accessing TLB Entries ................. 156
4.6.4.3 Region Translation Option Memory Attributes ......................................... 158
4.6.5 MMU Option ................................................................................................ 158
4.6.5.1 MMU Option Architectural Additions ...................................................... 159
4.6.5.2 MMU Option Register Formats.............................................................. 161
PTEVADDR ............................................................................................ 161
RASID .................................................................................................... 161
ITLBCFG ................................................................................................ 162
DTLBCFG............................................................................................... 162
4.6.5.3 The Structure of the MMU Option TLBs ................................................. 163
4.6.5.4 The MMU Option Memory Map ............................................................. 164
4.6.5.5 Formats for Writing MMU Option TLB Entries ......................................... 165
4.6.5.6 Formats for Reading MMU Option TLB Entries ....................................... 168
4.6.5.7 Formats for Probing MMU Option TLB Entries ........................................ 171
4.6.5.8 Format for Invalidating MMU Option TLB Entries .................................... 172
4.6.5.9 MMU Option Auto-Refill TLB Ways and PTE Format ............................... 174
4.6.5.10 MMU Option Memory Attributes .......................................................... 175
4.6.5.11 MMU Option Operation Semantics....................................................... 178
4.7 Options for Other Purposes .................................................................................... 179
4.7.1 Windowed Register Option ........................................................................... 180
4.7.1.1 Windowed Register Option Architectural Additions .................................. 181
4.7.1.2 Managing Physical Registers ................................................................ 183
4.7.1.3 Window Overflow Check ...................................................................... 184
4.7.1.4 Call, Entry, and Return Mechanism ....................................................... 186
4.7.1.5 Windowed Procedure-Call Protocol ....................................................... 187
4.7.1.6 Window Overflow and Underflow to and from the Program Stack .............. 192
4.7.2 Processor Interface Option ........................................................................... 194
4.7.2.1 Processor Interface Option Architectural Additions .................................. 195
4.7.3 Miscellaneous Special Registers Option ......................................................... 195
4.7.3.1 Miscellaneous Special Registers Option Architectural Additions ............... 195
4.7.4 Thread Pointer Option .................................................................................. 196
4.7.4.1 Thread Pointer Option Architectural Additions ........................................ 196
4.7.5 Processor ID Option ..................................................................................... 196
4.7.5.1 Processor ID Option Architectural Additions ........................................... 196
4.7.6 Debug Option .............................................................................................. 197
4.7.6.1 Debug Option Architectural Additions .................................................... 197
4.7.6.2 Debug Cause Register ......................................................................... 198
4.7.6.3 Using Breakpoints ............................................................................... 199
4.7.6.4 Debug Exceptions ............................................................................... 201
4.7.6.5 Instruction Counting ............................................................................. 201
4.7.6.6 Debug Registers ................................................................................. 202
4.7.6.7 Debug Interrupts ................................................................................. 203
4.7.6.8 The checkIcount Procedure .............................................................. 203
4.7.7 Trace Port Option ........................................................................................ 203
4.7.7.1 Trace Port Option Architectural Additions ............................................... 204
5. Processor State ......................................................................................................... 205

5.1 General Registers ................................................................................................. 208
5.2 Program Counter .................................................................................................. 208
5.3 Special Registers .................................................................................................. 208
5.3.1 Reading and Writing Special Registers ........................................................... 211
5.3.2 LOOP Special Registers ............................................................................... 212
5.3.3 MAC16 Special Registers ............................................................................. 213
5.3.4 Other Unprivileged Special Registers ............................................................. 215
5.3.5 Processor Status Special Register ................................................................. 216
5.3.6 Windowed Register Option Special Registers ................................................. 221
5.3.7 Memory Management Special Registers ........................................................ 221
5.3.8 Exception Support Special Registers ............................................................. 223
5.3.9 Exception State Special Registers ................................................................. 226
5.3.10 Interrupt Special Registers .......................................................................... 229
5.3.11 Timing Special Registers ............................................................................. 231
5.3.12 Breakpoint Special Registers ....................................................................... 233
5.3.13 Other Privileged Special Registers ............................................................... 235
5.4 User Registers...................................................................................................... 237
5.4.1 Reading and Writing User Registers .............................................................. 237
5.4.2 The List of User Registers ............................................................................ 238
5.5 TLB Entries .......................................................................................................... 239
5.6 Additional Register Files ........................................................................................ 240
5.7 Caches and Local Memories .................................................................................. 240
6. Instruction Descriptions ............................................................................................. 243
7. Instruction Formats and Opcodes ............................................................................... 569
7.1 Formats ............................................................................................................... 569
7.1.1 RRR ........................................................................................................... 569
7.1.2 RRI4 .......................................................................................................... 569
7.1.3 RRI8 .......................................................................................................... 570
7.1.4 RI16 ........................................................................................................... 570
7.1.5 RSR ........................................................................................................... 570
7.1.6 CALL .......................................................................................................... 571
7.1.7 CALLX........................................................................................................ 571
7.1.8 BRI8........................................................................................................... 571
7.1.9 BRI12 ......................................................................................................... 572
7.1.10 RRRN ....................................................................................................... 572
7.1.11 RI7 ........................................................................................................... 572
7.1.12 RI6 ........................................................................................................... 573
7.2 Instruction Fields................................................................................................... 573
7.3 Opcode Encodings ................................................................................................ 574
7.3.1 Opcode Maps .............................................................................................. 575
7.3.2 CUST0 and CUST1 Opcode Encodings ......................................................... 586
7.3.3 Cache-Option Opcode Encodings (Implementation-Specific) ............................ 586
Using the Xtensa Architecture ..................................................................................... 587
8.1 The Windowed Register and CALL0 ABIs ................................................................ 587
8.1.1 Windowed Register Usage and Stack Layout .................................................. 587
8.1.2 CALL0 Register Usage and Stack Layout ....................................................... 589
8.1.3 Data Types and Alignment ............................................................................ 589
8.1.4 Argument Passing ....................................................................................... 590

8.1.5 Return Values.............................................................................................. 591
8.1.6 Variable Arguments ...................................................................................... 592
8.1.7 Other Register Conventions .......................................................................... 592
8.1.8 Nested Functions ......................................................................................... 593
8.1.9 Stack Initialization ........................................................................................ 594
8.2 Other Conventions ................................................................................................ 595
8.2.1 Break Instruction Operands .......................................................................... 595
8.2.2 System Calls ............................................................................................... 597
8.3 Assembly Code .................................................................................................... 598
8.3.1 Assembler Replacements and the Underscore Form ....................................... 598
8.3.2 Instruction Idioms ........................................................................................ 598
8.3.3 Example: A FIR Filter with MAC16 Option ...................................................... 600
8.4 Performance ........................................................................................................ 605
8.4.1 Processor Performance Terminology and Modeling ......................................... 605
8.4.2 Xtensa Processor Family .............................................................................. 608
A. Differences Between Old and Current Hardware ........................................................ 611
A.1 Added Instructions ................................................................................................ 611
A.2 Xtensa Exception Architecture 1............................................................................. 611
A.2.1 Differences in the PS Register ...................................................................... 612
A.2.2 Exception Semantics ................................................................................... 612
A.2.3 Checking ICOUNT ....................................................................................... 614
A.2.4 The BREAK and BREAK.N Instructions ........................................................... 614
A.2.5 The RETW and RETW.N Instructions ............................................................... 614
A.2.6 The RFDE Instruction ................................................................................... 614
A.2.7 The RFE Instruction ..................................................................................... 614
A.2.8 The RFUE Instruction ................................................................................... 615
A.2.9 The RFWO and RFWU Instructions ................................................................... 616
A.2.10 Exception Virtual Address Register .............................................................. 616
A.2.11 Double Exceptions ..................................................................................... 616
A.2.12 Use of the RSIL Instruction ......................................................................... 616
A.2.13 Writeback Cache ....................................................................................... 616
A.2.14 The Cache Attribute Register ...................................................................... 617
A.3 New Exception Cause Values ................................................................................ 619
A.4 ICOUNTLEVEL .................................................................................................... 620
A.5 MMU Option Memory Attributes ............................................................................. 620
A.6 Special Register Read and Write ............................................................................ 620
A.7 MMU Modification ................................................................................................. 621
A.8 Reduction of SYNC Instruction Requirements .......................................................... 621


List of Figures

Figure 1–1. Xtensa LX Hardware Architecture Block Diagram .......... 6
Figure 1–2. Example Implementation Pipeline .......... 12
Figure 1–3. The Xtensa Design Flow .......... 15
Figure 2–4. Big and Little Bit Numbering for BBC/BBS Instructions .......... 17
Figure 2–5. Big and Little Endian Byte Ordering .......... 18
Figure 3–6. Virtual Address Fields .......... 27
Figure 4–7. LITBASE Register Format .......... 57
Figure 4–8. PS Register Format .......... 87
Figure 4–9. EXCCAUSE Register .......... 89
Figure 4–10. EXCVADDR Register Format .......... 91
Figure 4–11. EPC Register Format for Exception Option .......... 92
Figure 4–12. DEPC Register Format .......... 92
Figure 4–13. EXCSAVE Register Format .......... 93
Figure 4–14. Instruction and Data Cache Tag Format for Xtensa .......... 113
Figure 4–15. MESR Register Format .......... 130
Figure 4–16. MECR Register Format .......... 135
Figure 4–17. MEVADDR Register Format .......... 136
Figure 4–18. Virtual-to-Physical Address Translation .......... 140
Figure 4–19. A Single Process’ Rings .......... 143
Figure 4–20. Nested Rings of Multiple Processes with Some Sharing .......... 143
Figure 4–21. Region Protection Option Addressing (as) Format for WxTLB, RxTLB1, & PxTLB .......... 152

Figure 4–22. Region Protection Option Data (at) Format for WxTLB .......... 153
Figure 4–23. Region Protection Option Data (at) Format for RxTLB1 .......... 153
Figure 4–24. Region Protection Option Data (at) Format for PxTLB .......... 153
Figure 4–25. Region Translation Option Addressing (as) Format for WxTLB, RxTLB1, & PxTLB .......... 157
Figure 4–26. Region Translation Option Data (at) Format for WxTLB .......... 157
Figure 4–27. Region Translation Option Data (at) Format for RxTLB1 .......... 158
Figure 4–28. Region Translation Option Data (at) Format for PxTLB .......... 158
Figure 4–29. MMU Option PTEVADDR Register Format .......... 161
Figure 4–30. MMU Option RASID Register Format .......... 162
Figure 4–31. MMU Option DTLBCFG Register Format .......... 163
Figure 4–32. MMU Option Address Map with IVARWAY56 and DVARWAY56 Fixed .......... 165
Figure 4–33. MMU Option Addressing (as) Format for WxTLB .......... 167
Figure 4–34. MMU Option Data (at) Format for WxTLB .......... 168
Figure 4–35. MMU Option Addressing (as) Format for RxTLB0 and RxTLB1 .......... 169
Figure 4–36. MMU Option Data (at) Format for RxTLB0 .......... 170
Figure 4–37. MMU Option Data (at) Format for RxTLB1 .......... 171
Figure 4–38. MMU Option Addressing (as) Format for PxTLB .......... 172
Figure 4–39. MMU Option Data (at) Format for PITLB .......... 172
Figure 4–40. MMU Option Data (at) Format for PDTLB .......... 172
Figure 4–41. MMU Option Addressing (as) Format for IxTLB .......... 173
Figure 4–42. MMU Option Page Table Entry (PTE) Format .......... 174

Figure 4–43. Conceptual Register Window Read .......... 183
Figure 4–44. Faster Register Window Read .......... 184
Figure 4–45. Fastest Register Window Read .......... 184
Figure 4–46. Register Window Near Overflow .......... 185
Figure 4–47. Register Window Just Before Underflow .......... 187
Figure 4–48. Stack Frame Before alloca() .......... 189
Figure 4–49. Stack Frame After First alloca() .......... 190
Figure 4–50. Stack Frame Layout .......... 191
Figure 4–51. DEBUGCAUSE Register .......... 199
Figure 4–52. DBREAKC[i] Format .......... 202
Figure 8–53. Stack Frame for the Windowed Register ABI .......... 588
Figure 8–54. Instruction Operand Dependency Interlock .......... 607
Figure 8–55. Functional Unit Interlock .......... 607
Figure 8–56. Xtensa Pipeline Effects .......... 609
Figure 9–57. CACHEATTR Register .......... 618

List of Tables

Table 1–1. Huffman Decode Example .......... 2
Table 1–2. Comparison of Typical RISC and Xtensa ISA Features .......... 6
Table 1–3. Modular Components .......... 7
Table 2–4. Instruction-Description Expressions .......... 19
Table 2–5. Instruction-Description Statements .......... 21
Table 2–6. Uses Of Instruction Fields .......... 21
Table 3–7. Core Processor-Configuration Parameters .......... 24
Table 3–8. Core-Architecture Set .......... 24
Table 3–9. Reading and Writing Special Registers .......... 26
Table 3–10. Operand Formats and Alignment .......... 27
Table 3–11. Core Instruction Summary .......... 33
Table 3–12. Load Instructions .......... 34
Table 3–13. Store Instructions .......... 36
Table 3–14. Memory Order Instructions .......... 39
Table 3–15. Jump and Call Instructions .......... 40
Table 3–16. Conditional Branch Instructions .......... 40
Table 3–17. Branch Immediate (b4const) Encodings .......... 41
Table 3–18. Branch Unsigned Immediate (b4constu) Encodings .......... 42
Table 3–19. Move Instructions .......... 43
Table 3–20. Arithmetic Instructions .......... 43
Table 3–21. Bitwise Logical Instructions .......... 44
Table 3–22. Shift Instructions .......... 44
Table 3–23. Processor Control Instructions .......... 46
Table 4–24. Core Architecture Processor-Configurations .......... 50
Table 4–25. Core Architecture Processor-State .......... 51
Table 4–26. Core Architecture Instructions .......... 51
Table 4–27. Code Density Option Instruction Additions .......... 54
Table 4–28. Loop Option Processor-State Additions .......... 55
Table 4–29. Loop Option Instruction Additions .......... 55
Table 4–30. Extended L32R Option Processor-State Additions .......... 57
Table 4–31. 16-bit Integer Multiply Option Instruction Additions .......... 58
Table 4–32. 32-bit Integer Multiply Option Processor-Configuration Additions .......... 59
Table 4–33. 32-Bit Integer Multiply Instruction Additions .......... 59
Table 4–34. 32-bit Integer Divide Option Processor-Configuration Additions .......... 59
Table 4–35. 32-bit Integer Divide Option Exception Additions .......... 60
Table 4–36. 32-bit Integer Divide Option Instruction Additions .......... 60
Table 4–37. MAC16 Option Processor-State Additions .......... 61
Table 4–38. MAC16 Option Instruction Additions .......... 61
Table 4–39. Miscellaneous Operations Option Processor-Configuration Additions .......... 62
Table 4–40. Miscellaneous Operations Instruction Additions .......... 63
Table 4–41. Coprocessor Option Exception Additions .......... 64
Table 4–42. Coprocessor Option Processor-State Additions .......... 64
Table 4–43. Boolean Option Processor-State Additions .......... 65
Table 4–44. Boolean Option Instruction Additions .......... 66

Table 4–45. Floating-Point Coprocessor Option Processor-State Additions .......... 67
Table 4–46. Floating-Point Coprocessor Option Instruction Additions .......... 67
Table 4–47. FCR fields .......... 70
Table 4–48. FSR fields .......... 70
Table 4–49. Floating-Point Coprocessor Option Load/Store Instructions .......... 72
Table 4–50. Floating-Point Coprocessor Option Operation Instructions .......... 72
Table 4–51. Multiprocessor Synchronization Option Instruction Additions .......... 76
Table 4–52. Conditional Store Option Processor-State Additions .......... 78
Table 4–53. Conditional Store Option Instruction Additions .......... 78
Table 4–54. ATOMCTL Register Fields .......... 81
Table 4–55. Exception Option Constant Additions (Exception Causes) .......... 83
Table 4–56. Exception Option Processor-Configuration Additions .......... 83
Table 4–57. Exception Option Processor-State Additions .......... 84
Table 4–58. Exception Option Instruction Additions .......... 84
Table 4–59. Instruction Exceptions under the Exception Option .......... 85
Table 4–60. Interrupts under the Exception Option .......... 86
Table 4–61. Machine Checks under the Exception Option .......... 86
Table 4–63. PS Register Fields .......... 87
Table 4–62. Debug Conditions under the Exception Option .......... 87
Table 4–64. Exception Causes .......... 89
Table 4–65. Exception and Interrupt Information Registers by Vector .......... 94
Table 4–66. Exception and Interrupt Exception Registers by Vector .......... 95
Table 4–67. Relocatable Vector Option Processor-State Additions .......... 99
Table 4–68. Unaligned Exception Option Constant Additions (Exception Causes) .......... 100
Table 4–69. Interrupt Option Constant Additions (Exception Causes) .......... 101
Table 4–70. Interrupt Option Processor-Configuration Additions .......... 101
Table 4–71. Interrupt Option Processor-State Additions .......... 101
Table 4–72. Interrupt Option Instruction Additions .......... 102
Table 4–73. Interrupt Types .......... 103
Table 4–74. High-Priority Interrupt Option Processor-Configuration Additions .......... 107
Table 4–75. High-Priority Interrupt Option Processor-State Additions .......... 107
Table 4–76. High-Priority Interrupt Option Instruction Additions .......... 107
Table 4–77. Timer Interrupt Option Processor-Configuration Additions .......... 110
Table 4–78. Timer Interrupt Option Processor-State Additions .......... 111
Table 4–79. Instruction Cache Option Processor-Configuration Additions .......... 115
Table 4–80. Instruction Cache Option Instruction Additions .......... 116
Table 4–81. Instruction Cache Test Option Instruction Additions .......... 117
Table 4–82. Instruction Cache Index Lock Option Instruction Additions .......... 118
Table 4–83. Data Cache Option Processor-Configuration Additions .......... 119
Table 4–84. Data Cache Option Instruction Additions .......... 119
Table 4–85. Data Cache Test Option Instruction Additions .......... 121
Table 4–86. Data Cache Index Lock Option Instruction Additions .......... 122
Table 4–87. RAM/ROM Access Restrictions .......... 124
Table 4–88. Instruction RAM Option Processor-Configuration Additions .......... 124
Table 4–89. Instruction ROM Option Processor-Configuration Additions .......... 125
Table 4–90. Data RAM Option Processor-Configuration Additions .......... 126
Table 4–91. Data ROM Option Processor-Configuration Additions .......... 127
Table 4–92. XLMI Option Processor-Configuration Additions .......... 127
Table 4–93. Memory ECC/Parity Option Processor-Configuration Additions .......... 129

Table 4–94. Memory ECC/Parity Option Processor-State Additions .......... 129
Table 4–95. Memory ECC/Parity Option Instruction Additions .......... 130
Table 4–96. MESR Register Fields .......... 131
Table 4–97. MECR Register Fields .......... 135
Table 4–98. MEVADDR Contents .......... 136
Table 4–99. Access Characteristics Encoded in the Attributes .......... 144
Table 4–100. Local Memory Accesses .......... 149
Table 4–101. Region Protection Option Exception Additions .......... 151
Table 4–102. Region Protection Option Processor-State Additions .......... 151
Table 4–103. Region Protection Option Instruction Additions .......... 151
Table 4–104. Region Protection Option Attribute Field Values .......... 155
Table 4–105. MMU Option Processor-Configuration Additions .......... 159
Table 4–106. MMU Option Exception Additions .......... 159
Table 4–107. MMU Option Processor-State Additions .......... 160
Table 4–108. MMU Option Instruction Additions .......... 160
Table 4–109. MMU Option Attribute Field Values .......... 178
Table 4–110. Windowed Register Option Constant Additions (Exception Causes) .......... 181
Table 4–111. Windowed Register Option Processor-Configuration Additions .......... 181
Table 4–112. Windowed Register Option Processor-State Additions and Changes .......... 181
Table 4–113. Windowed Register Option Instruction Additions .......... 182
Table 4–114. Windowed Register Usage .......... 188
Table 4–115. Processor Interface Option Constant Additions (Exception Causes) .......... 195
Table 4–116. Miscellaneous Special Registers Option Processor-Configuration Additions .......... 195
Table 4–117. Miscellaneous Special Registers Option Processor-State Additions .......... 196
Table 4–118. Thread Pointer Option Processor-State Additions .......... 196
Table 4–119. Processor ID Option Special Register Additions .......... 197
Table 4–120. Debug Option Processor-Configuration Additions .......... 197
Table 4–121. Debug Option Processor-State Additions .......... 198
Table 4–122. Debug Option Instruction Additions .......... 198
Table 4–123. DEBUGCAUSE Fields .......... 199
Table 4–124. DBREAK Fields .......... 200
Table 4–125. DBREAKC[i] Register Fields .......... 202
Table 4–126. Trace Port Option Special Register Additions .......... 204
Table 5–127. Alphabetical List of Processor State .......... 205
Table 5–128. Numerical List of Special Registers .......... 209
Table 5–129. LBEG - Special Register #0 .......... 212
Table 5–130. LEND - Special Register #1 .......... 213
Table 5–131. LCOUNT - Special Register #2 .......... 213
Table 5–132. ACCLO - Special Register #16 .......... 214
Table 5–133. ACCHI - Special Register #17 .......... 214
Table 5–134. M0..3 - Special Register #32-35 .......... 214
Table 5–135. SAR - Special Register #3 .......... 215
Table 5–136. BR - Special Register #4 .......... 215
Table 5–137. LITBASE - Special Register #5 .......... 216
Table 5–138. SCOMPARE1 - Special Register #12 .......... 216
Table 5–139. PS - Special Register #230 .......... 217
Table 5–140. PS.INTLEVEL - Special Register #230 (part) .......... 217
Table 5–141. PS.EXCM - Special Register #230 (part) .......... 218
Table 5–142. PS.UM - Special Register #230 (part) .......... 219

Table 5–143. PS.RING - Special Register #230 (part) .......... 219
Table 5–144. PS.OWB - Special Register #230 (part) .......... 220
Table 5–145. PS.CALLINC - Special Register #230 (part) .......... 220
Table 5–146. PS.WOE - Special Register #230 (part) .......... 220
Table 5–147. WindowBase - Special Register #72 .......... 221
Table 5–148. WindowStart - Special Register #73 .......... 221
Table 5–149. PTEVADDR - Special Register #83 .......... 222
Table 5–150. RASID - Special Register #90 .......... 222
Table 5–151. ITLBCFG - Special Register #91 .......... 223
Table 5–152. DTLBCFG - Special Register #92 .......... 223
Table 5–153. EXCCAUSE - Special Register #232 .......... 224
Table 5–154. EXCVADDR - Special Register #238 .......... 224
Table 5–155. VECBASE - Special Register #231 .......... 224
Table 5–156. MESR - Special Register #109 .......... 225
Table 5–157. MECR - Special Register #110 .......... 225
Table 5–158. MEVADDR - Special Register #111 .......... 225
Table 5–159. DEBUGCAUSE - Special Register #233 .......... 226
Table 5–160. EPC1 - Special Register #177 .......... 226
Table 5–161. EPC2..7 - Special Register #178-183 .......... 226
Table 5–162. DEPC - Special Register #192 .......... 227
Table 5–163. MEPC - Special Register #106 .......... 227
Table 5–164. EPS2..7 - Special Register #194-199 .......... 227
Table 5–165. MEPS - Special Register #107 .......... 228
Table 5–166. EXCSAVE1 - Special Register #192 .......... 228
Table 5–167. EXCSAVE2..7 - Special Register #210-215 .......... 228
Table 5–168. MESAVE - Special Register #108 .......... 229
Table 5–169. INTERRUPT - Special Register #226 (read) .......... 229
Table 5–170. INTSET - Special Register #226 (write) .......... 230
Table 5–171. INTCLEAR - Special Register #227 .......... 230
Table 5–172. INTENABLE - Special Register #228 .......... 231
Table 5–173. ICOUNT - Special Register #236 .......... 231
Table 5–174. ICOUNTLEVEL - Special Register #237 .......... 232
Table 5–175. CCOUNT - Special Register #234 .......... 232
Table 5–176. CCOMPARE0..2 - Special Register #240-242 .......... 233
Table 5–177. IBREAKENABLE - Special Register #96 .......... 233
Table 5–178. IBREAKA0..1 - Special Register #128-129 .......... 234
Table 5–179. DBREAKC0..1 - Special Register #160-161 .......... 234
Table 5–180. DBREAKA0..1 - Special Register #144-145 .......... 235
Table 5–181. PRID - Special Register #235 .......... 235
Table 5–182. MMID - Special Register #89 .......... 235
Table 5–183. DDR - Special Register #104 .......... 236
Table 5–184. CPENABLE - Special Register #224 .......... 236
Table 5–185. MISC0..3 - Special Register #244-247 .......... 236
Table 5–186. ATOMCTL - Special Register #99 .......... 237
Table 5–187. Numerical List of User Registers .......... 237
Table 5–188. THREADPTR - User Register #231 .......... 238
Table 5–189. FCR - User Register #232 .......... 239
Table 5–190.
Table 7–191.
FSR - User Register #233 ......................................................................... 239
Uses Of Instruction Fields ............................................................................ 573

Table 7–192. Whole Opcode Space ........ 575
Table 7–193. QRST (from Table 7–192) Formats RRR, CALLX, and RSR (t, s, r, op2 vary) ........ 575
Table 7–194. RST0 (from Table 7–193) Formats RRR and CALLX (t, s, r vary) ........ 576
Table 7–195. ST0 (from Table 7–194) Formats RRR and CALLX (t, s vary) ........ 576
Table 7–196. SNM0 (from Table 7–195) Format CALLX (n, s vary) ........ 576
Table 7–197. JR (from Table 7–196) Format CALLX (s varies) ........ 576
Table 7–198. CALLX (from Table 7–196) Format CALLX (s varies) ........ 576
Table 7–199. SYNC (from Table 7–195) Format RRR (s varies) ........ 576
Table 7–200. RFEI (from Table 7–195) Format RRR (s varies) ........ 577
Table 7–201. RFET (from Table 7–200) Format RRR (no bits vary) ........ 577
Table 7–202. ST1 (from Table 7–194) Format RRR (t, s vary) ........ 577
Table 7–203. TLB (from Table 7–194) Format RRR (t, s vary) ........ 577
Table 7–204. RT0 (from Table 7–194) Format RRR (t, r vary) ........ 578
Table 7–205. RST1 (from Table 7–193) Format RRR (t, s, r vary) ........ 578
Table 7–206. ACCER (from Table 7–205) Format RRR (t, s vary) ........ 578
Table 7–207. IMP (from Table 7–205) Format RRR (t, s vary) (Section 7.3.3) ........ 578
Table 7–208. RFDX (from Table 7–207) Format RRR (s varies) ........ 579
Table 7–209. RST2 (from Table 7–193) Format RRR (t, s, r vary) ........ 579
Table 7–210. RST3 (from Table 7–193) Formats RRR and RSR (t, s, r vary) ........ 579
Table 7–211. LSCX (from Table 7–193) Format RRR (t, s, r vary) ........ 579
Table 7–212. LSC4 (from Table 7–193) Format RRI4 (t, s, r vary) ........ 580
Table 7–213. FP0 (from Table 7–193) Format RRR (t, s, r vary) ........ 580
Table 7–214. FP1OP (from Table 7–213) Format RRR (s, r vary) ........ 580
Table 7–215. FP1 (from Table 7–193) Format RRR (t, s, r vary) ........ 580
Table 7–216. LSAI (from Table 7–192) Formats RRI8 and RRI4 (t, s, imm8 vary) ........ 581
Table 7–217. CACHE (from Table 7–216) Formats RRI8 and RRI4 (s, imm8 vary) ........ 581
Table 7–218. DCE (from Table 7–217) Format RRI4 (s, imm4 vary) ........ 581
Table 7–219. ICE (from Table 7–217) Format RRI4 (s, imm4 vary) ........ 581
Table 7–220. LSCI (from Table 7–192) Format RRI8 (t, s, imm8 vary) ........ 582
Table 7–221. MAC16 (from Table 7–192) Format RRR (t, s, r, op1 vary) ........ 582
Table 7–222. MACID (from Table 7–221) Format RRR (t, s, r vary) ........ 582
Table 7–223. MACIA (from Table 7–221) Format RRR (t, s, r vary) ........ 582
Table 7–224. MACDD (from Table 7–221) Format RRR (t, s, r vary) ........ 583
Table 7–225. MACAD (from Table 7–221) Format RRR (t, s, r vary) ........ 583
Table 7–226. MACCD (from Table 7–221) Format RRR (t, s, r vary) ........ 583
Table 7–227. MACCA (from Table 7–221) Format RRR (t, s, r vary) ........ 583
Table 7–228. MACDA (from Table 7–221) Format RRR (t, s, r vary) ........ 584
Table 7–229. MACAA (from Table 7–221) Format RRR (t, s, r vary) ........ 584
Table 7–230. MACI (from Table 7–221) Format RRR (t, s, r vary) ........ 584
Table 7–231. MACC (from Table 7–221) Format RRR (t, s, r vary) ........ 584
Table 7–232. CALLN (from Table 7–192) Format CALL (offset varies) ........ 584
Table 7–233. SI (from Table 7–192) Formats CALL, BRI8 and BRI12 (offset varies) ........ 585
Table 7–234. BZ (from Table 7–233) Format BRI12 (s, imm12 vary) ........ 585
Table 7–235. BI0 (from Table 7–233) Format BRI8 (s, r, imm8 vary) ........ 585
Table 7–236. BI1 (from Table 7–233) Formats BRI8 and BRI12 (s, r, imm8 vary) ........ 585
Table 7–237. B1 (from Table 7–236) Format BRI8 (s, imm8 vary) ........ 585
Table 7–238. B (from Table 7–192) Format RRI8 (t, s, imm8 vary) ........ 585
Table 7–239. ST2 (from Table 7–192) Formats RI7 and RI6 (s, r vary) ........ 586
Table 7–240. ST3 (from Table 7–192) Format RRRN (t, s vary) ........ 586

Table 7–241. S3 (from Table 7–240) Format RRRN (no fields vary) ........ 586
Table 8–242. Windowed Register Usage ........ 587
Table 8–243. CALL0 Register Usage ........ 589
Table 8–244. Data Types and Alignment ........ 589
Table 8–245. Breakpoint Instruction Operand Conventions ........ 596
Table 8–246. Instruction Idioms ........ 599
Table 8–247. Xtensa Pipeline ........ 608
Table 9–248. Instructions Added ........ 611
Table 9–249. Cache Attribute Register ........ 617
Table 9–250. Cache Attribute Special Register ........ 617
Table 9–251. T1050 Additional SYNC Requirements ........ 621

Preface

This manual is written for Tensilica customers who are experienced in working with microprocessors or in writing assembly code or compilers. It is NOT a specification for one
particular implementation of the Architecture, but rather a reference for the ongoing
Instruction Set Architecture. For a detailed specification for specific products, refer to a
specific Tensilica processor data book.

Notation
• italic_name indicates a program or file name, document title, or term being defined.
• $ represents your shell prompt, in user-session examples.
• literal_input indicates literal command-line input.
• variable indicates a user parameter.
• literal_keyword (in text paragraphs) indicates a literal command keyword.
• literal_output indicates literal program output.
• ... output ... indicates unspecified program output.
• [optional-variable] indicates an optional parameter.
• [variable] indicates a parameter within literal square-braces.
• {variable} indicates a parameter within literal curly-braces.
• (variable) indicates a parameter within literal parentheses.
• | means OR.
• (var1 | var2) indicates a required choice between one of multiple parameters.
• [var1 | var2] indicates an optional choice between one of multiple parameters.
• var1 [, varn]* indicates a list of 1 or more parameters (0 or more repetitions).
• 4'b0010 is a 4-bit value specified in binary.
• 12'o7016 is a 12-bit value specified in octal.
• 13'd4839 is a 13-bit value specified in decimal.
• 32'hff2a or 32'HFF2A is a 32-bit value specified in hexadecimal.
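
The sized-literal notation above (width, apostrophe, base letter, digits) can be read mechanically. The following is a minimal Python sketch, not part of any Tensilica tool; the function name parse_sized and the decision to reject values that do not fit the stated width are assumptions made for illustration.

```python
import re

# Base letters used by the notation: b = binary, o = octal, d = decimal, h = hex.
_BASES = {"b": 2, "o": 8, "d": 10, "h": 16}
_LITERAL = re.compile(r"^(\d+)'([bodhBODH])([0-9a-fA-F]+)$")


def parse_sized(text):
    """Return (width_in_bits, value) for a literal such as 4'b0010.

    Hypothetical helper: raises ValueError for malformed input, for digits
    that are invalid in the chosen base, or for values wider than the
    declared width.
    """
    match = _LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a sized literal: {text!r}")
    width = int(match.group(1))
    base = _BASES[match.group(2).lower()]
    value = int(match.group(3), base)  # int() rejects digits invalid for the base
    if value >= (1 << width):
        raise ValueError(f"{text!r} does not fit in {width} bits")
    return width, value


if __name__ == "__main__":
    for literal in ("4'b0010", "12'o7016", "32'hff2a", "32'HFF2A"):
        print(literal, "->", parse_sized(literal))
```

The width check is a design choice of this sketch: it makes a literal whose digits overflow the declared width an error rather than silently truncating it.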

Terms
• 0x at the beginning of a value indicates a hexadecimal value.
• b means bit.
• B means byte.
• flush is deprecated due to potential ambiguity (it may mean write-back or discard).
• Mb means megabit.
• MB means megabyte.
• PC means program counter.
• word means 4 bytes.


Related Tensilica Documents
• 330HiFi Standard DSP Processor Data Book
• 388VDO Hardware User’s Guide
• 388VDO Software Guide
• 545CK Standard DSP Processor Data Book
• ConnX D2 DSP Engine User’s Guide
• ConnX Vectra™ LX DSP Engine Guide
• Diamond Series Hardware User’s Guide
• Diamond Series Upgrade Guide
• Diamond Standard Controllers Data Book
• GNU Assembler User’s Guide
• GNU Binary Utilities User’s Guide
• GNU Debugger User’s Guide
• GNU Linker User’s Guide
• GNU Profiler User’s Guide
• HiFi 2 Audio Engine Codecs Programmer’s Guides
• HiFi 2 Audio Engine Instruction Set Architecture Reference Manual
• Red Hat newlib C Library Reference Manual
• Red Hat newlib C Math Library Reference Manual
• Tensilica Avnet LX200 (XT-AV200) Board User’s Guide
• Tensilica Avnet LX60 (XT-AV60) Board User’s Guide
• Tensilica Bus Designer’s Toolkit Guide
• Tensilica C Application Programmer’s Guide
• Tensilica Instruction Extension (TIE) Language Reference Manual
• Tensilica Instruction Extension (TIE) Language User’s Guide
• Tensilica On-Chip Debugging Guide
• Tensilica Processors Bus Bridges Guide
• Tensilica Trace Solutions User’s Guide
• Xtensa® C and C++ Compiler User’s Guide
• Xtensa® Development Tools Installation Guide
• Xtensa® Energy Estimator (Xenergy) User’s Guide
• Xtensa® Hardware User’s Guide
• Xtensa® Instruction Set Architecture (ISA) Reference Manual
• Xtensa® Instruction Set Simulator (ISS) User’s Guide
• Xtensa® Linker Support Packages (LSPs) Reference Manual
• Xtensa® LX3 Microprocessor Data Book
• Xtensa® 8 Microprocessor Data Book
• Xtensa® Microprocessor Programmer’s Guide
• Xtensa® Modeling Protocol (XTMP) User’s Guide
• Xtensa® OSKit™ Guide
• Xtensa® Processor Extensions Synthesis (XPRES™) Compiler User’s Guide
• Xtensa® Processor Interface Protocol Reference Manual
• Xtensa® Software Development Toolkit User’s Guide
• Xtensa® SystemC® (XTSC) Reference Manual
• Xtensa® SystemC® (XTSC) User’s Guide
• Xtensa® System Designer’s Guide
• Xtensa® System Software Reference Manual
• Xtensa® Upgrade Guide

Changes from the Previous Version

The following changes have been made to this document for the Tensilica RC-2010.1 release:
• Deleted several extraneous blank pages that appeared between chapters in the previous release.
• Corrected erroneous cross-references to Table 4–55 through Table 4–58 in Section 4.4.1.1 on page 83.
• Clarified information about lookup rings in Section 4.6.2.2 and Section 4.6.2.3.

The following changes have been made to this document for the Tensilica RC-2009.0 release:
• A new register, ATOMCTL, has been added to Section 4.3.13 “Conditional Store Option” on page 91. The ATOMCTL register controls the interaction of the S32C1I instruction with the memory system.
• The descriptions of attributes in Section 4.6.3 “Region Protection Option” on page 187 and Section 4.6.5.10 “MMU Option Memory Attributes” on page 213 have been improved. There are no actual changes to the attributes.
• Section 4.6.5 “MMU Option” on page 196 has gained a new option: Way5 and Way6 can now be either variable or fixed. The variable version provides more flexibility in the address map and has a setting in which the MMU puts out a physical address equal to the virtual address and is, in that sense, turned off.
• Many of the SYNC instruction requirements listed in Section 5.3 “Special Registers” on page 259 have not actually been needed after T1050. Those requirements have now been removed from Section 5.3 but retained in Appendix A.
• The RER and WER instructions have been added to Chapter 6.

1. Introduction

This chapter provides an overview of Tensilica, the Xtensa Instruction Set Architecture
(ISA), and the Xtensa Processor Generator.

1.1 What Problem is Tensilica Solving?

Processors have traditionally been extremely difficult to design and modify. Therefore, most systems contain rigid processors that were designed and verified once for general-purpose use and then embedded into multiple applications over time. Because these processors are general-purpose designs, their suitability to any particular application is less than ideal. Although it would be preferable to have a processor specifically designed to execute a particular application’s code better (for example, to run faster, or consume less power, or cost less), this is rarely possible because the time, cost, and risk of modifying an existing processor or developing a new processor are very high.

It is also not appropriate to simply design traditional processors with more features to cover all applications, because any given application only requires a particular set of features; a processor with features not required by the application is overly costly and consumes unnecessary power. It is also not possible to know all of the potential application targets when a processor is initially designed.

If processor configuration could be automated and made reliable, then system designers would have the option and ability to create truly efficient application solutions.

This is just what Tensilica is about: Tensilica provides a set of techniques and tools for designing an application solution that contains one or more processors, each one configured and enhanced at design-time to fine-tune its suitability for a specific application.

Fine-tuning an architecture can consist of any combination of:
• Extensibility: Adding architectural enhancements.
• Configurability: Creating custom processor configurations.
• Retargetability: Mapping the architecture into hardware to meet different speed, area, and power targets in different processes.

1.1.1 Adding Architectural Enhancements

As an example of an architectural enhancement, consider a device designed to transmit and receive data over a channel using a complex protocol. Because the protocol is complex, the processing cannot be reasonably accomplished entirely in hard logic, and instead a programmable processor is introduced into the system for protocol processing. This processor’s programmability also allows bug fixes and upgrades to later protocols to be done by loading the instruction memories with new software. However, the processor was probably not designed for this particular application (the application may not have even existed when the processor was designed), and the application may perform operations that require many instructions, even though those operations could be accomplished with a trivial amount of additional processor logic.

Before the introduction of Tensilica’s Xtensa technology, processors could not be enhanced easily. Because of this, many system designers have been forced to solve problems by executing an inefficient pure-software solution on the available general-purpose processor. This results in a solution that may be slower, or higher power, or costlier than necessary (for example, it may require a larger, more powerful processor to execute the program at sufficient speed).

Other designers choose to provide some of the processing requirements in special-purpose hardware that they design for the application. This approach requires special code to access the custom hardware at various points in the program. However, the time to transfer data between the processor and the custom hardware limits the utility of this approach to fairly large units of work; small computations cannot sufficiently amortize the communication overhead introduced by this approach to provide a reasonable speed-up.
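
The amortization argument can be made concrete with a small calculation. The sketch below is illustrative only: the function name offload_speedup and all cycle counts are hypothetical, chosen to show the shape of the trade-off rather than to describe any measured system.

```python
def offload_speedup(software_cycles, hardware_cycles, transfer_cycles):
    """Speed-up from moving a unit of work to external custom hardware.

    The offloaded version pays the hardware compute time plus the cost of
    moving data to and from the hardware. A result below 1.0 means the
    offload is slower than staying in software.
    """
    return software_cycles / (hardware_cycles + transfer_cycles)


if __name__ == "__main__":
    # Hypothetical numbers: a fixed 40-cycle transfer overhead per unit of work.
    small = offload_speedup(software_cycles=20, hardware_cycles=1, transfer_cycles=40)
    large = offload_speedup(software_cycles=20000, hardware_cycles=1000, transfer_cycles=40)
    print(f"small unit of work: {small:.2f}x")
    print(f"large unit of work: {large:.2f}x")
```

With these assumed figures the small computation gets slower when offloaded, while the large one still benefits, which is the limitation the paragraph above describes.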
In the communication-channel application example, the protocol might require encryption, error-correction, or compression/decompression processing. Such processing
often operates on individual bits rather than a processor’s larger words. The circuitry for
a computation may be rather modest, but the need for the processor to extract each bit,
sequentially process it, and then repack the bits adds considerable overhead.
As a specific example, consider the Huffman decode shown in Table 1–1.
Table 1–1. Huffman Decode Example

Input       Value   Length
00xxxxxx    0       2
01xxxxxx    1       2
10xxxxxx    2       2
110xxxxx    3       3
1110xxxx    4       4
11110xxx    5       5
111110xx    6       6
1111110x    7       7
11111110    8       8
11111111    9       8


Both the value and the length must be computed, so that length bits can be shifted off to
find the start of the next token. (A similar encoding is used in the MPEG compression
standard.) There are many ways to code this for a conventional RISC instruction set, but
all of them require many instructions, because there are many tests to be done, and
each test requires a single cycle (as opposed to a single gate delay for logic). For example, in the MIPS instruction set, the above decode procedure might look like this:
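/* input in t0, value out in t1, length out in t2 */
/* note: the instruction after each branch is in its delay slot and always executes */
    srl   t1, t0, 6       /* t1 = top two bits of the input */
    li    t3, 3
    bne   t1, t3, 2f      /* 00, 01, or 10: value is already in t1 */
    li    t2, 2           /* (delay slot) length = 2 */
    andi  t3, t0, 0x20
    beq   t3, r0, 1f      /* 110xxxxx */
    li    t2, 3
    andi  t3, t0, 0x10
    beq   t3, r0, 1f      /* 1110xxxx */
    li    t2, 4
    andi  t3, t0, 0x08
    beq   t3, r0, 1f      /* 11110xxx */
    li    t2, 5
    andi  t3, t0, 0x04
    beq   t3, r0, 1f      /* 111110xx */
    li    t2, 6
    andi  t3, t0, 0x02
    beq   t3, r0, 1f      /* 1111110x */
    li    t2, 7
    andi  t3, t0, 0x01
    beq   t3, r0, 1f      /* 11111110 */
    li    t2, 8
    b     2f              /* 11111111: length stays 8 */
    li    t1, 9           /* (delay slot) value = 9 */
1:
    /* value = length */
    move  t1, t2
2:
    /* done */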

This is so expensive that a 256-entry lookup table is typically used instead. However, a 256-entry lookup table takes significant space and can take many cycles to access. For longer Huffman encodings, the table size would become prohibitive, leading to more complex and slower code.

The logic to decode this requires roughly 30 gates (just the combinatorial logic function, not counting instruction decode and so forth), which is less than 0.1% of a processor gate count, and can be computed by a special-purpose processor instruction in a single cycle. This is a factor of 4 to 20 speed-up over using general-purpose instructions only. A processor extended to have this logic in the form of an instruction would simply do:

    huff8 t1, t0          /* t1[3:0] is length, t1[7:4] is value */
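
For reference, the decode function of Table 1–1 can be modeled in a few lines of Python. This is a behavioral sketch only, not a description of how the hardware instruction is built. The names huff8, huff8_packed, and LOOKUP are hypothetical, and the packed layout (length in bits 3:0, value in bits 7:4) is an assumption made for this example.

```python
def huff8(byte):
    """Return (value, length) for the token at the top of an 8-bit input,
    following Table 1-1."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("input must fit in 8 bits")
    top2 = byte >> 6
    if top2 != 0b11:
        return top2, 2            # 00, 01, 10: value is the top two bits
    # Count leading one bits, starting from the most significant bit.
    ones = 0
    mask = 0x80
    while mask and (byte & mask):
        ones += 1
        mask >>= 1
    if ones == 8:
        return 9, 8               # 11111111 is the one row where value != length
    return ones + 1, ones + 1     # 110xxxxx .. 11111110


def huff8_packed(byte):
    """Pack the result into one byte (assumed layout: value in 7:4, length in 3:0)."""
    value, length = huff8(byte)
    return (value << 4) | length


# The software alternative mentioned in the text: a 256-entry lookup table.
LOOKUP = [huff8_packed(b) for b in range(256)]

if __name__ == "__main__":
    for sample in (0b00101010, 0b11010000, 0b11111110, 0b11111111):
        print(f"{sample:08b} -> value, length = {huff8(sample)}")
```

The LOOKUP list makes the space cost visible: one entry per possible input byte, which grows exponentially with the length of the longest code.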

Tensilica’s solution is to provide a mechanism with which to easily and efficiently extend
processor architecture with application-specific instructions.

1.1.2 Creating Custom Processor Configurations

While the ability to extend processor architecture, which we call extensibility, lets system designers incorporate new functionality into a processor, configurability lets processor designers specify whether (or how much) pre-designed functionality is required for a particular product.

The simplest sort of configurability is a binary choice: an architectural feature is either present or absent in a particular processor configuration. For example, a processor might be offered either with or without floating-point hardware. Multiple configurations of a set of architectural features could be created by the processor designer, not the system designer.

System-design flexibility is improved by having finer gradations in processor-configuration choices. For example, a processor configuration might allow the system designer to specify the number of registers in the register file, memory width, cache size, cache associativity, and so on.
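
The difference between a binary choice and a finer-grained choice can be sketched as a configuration record. Everything in this example is hypothetical: the class name ProcessorConfig, the field names, the default values, and the power-of-two rule are assumptions for illustration and do not describe the actual Xtensa Processor Generator inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorConfig:
    """Hypothetical configuration record mixing both kinds of choice."""

    has_floating_point: bool = False   # binary choice: feature present or absent
    num_registers: int = 32            # graded choices: "how much" of a feature
    memory_width_bits: int = 32
    cache_size_bytes: int = 8192
    cache_associativity: int = 2

    def __post_init__(self):
        # Assumed rule for this sketch: sized parameters are powers of two.
        for name in ("num_registers", "memory_width_bits",
                     "cache_size_bytes", "cache_associativity"):
            value = getattr(self, name)
            if value <= 0 or value & (value - 1):
                raise ValueError(f"{name} must be a positive power of two, got {value}")


if __name__ == "__main__":
    dsp_like = ProcessorConfig(has_floating_point=True, cache_size_bytes=16384)
    print(dsp_like)
```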

1.1.3 Mapping the Architecture into Hardware

Extensibility and configurability provide great flexibility. However, the resulting design must still be mapped into physical hardware. Synthesis, placement, and routing tools allow high-level representations of a design to be automatically mapped into more detailed designs. While these mapping operations do not change the functionality of the design, they are important building blocks that facilitate extensibility and configurability.

Many processors are manually designed all the way to the layout. For such a processor design, extensibility and configurability would require changes to the layout. By contrast, the Tensilica system builds on existing synthesis, placement, and routing tools so that configuration need only change the input to synthesis, and conventional mapping techniques are used to create physical hardware.

Some synthesis tools choose different mappings based on the designer’s goal specifications, allowing the mapping to optimize for speed, power, area, or target components. This is as close as existing mapping tools come to providing configurability: the designer can specify different synthesis parameters for a fixed input. By contrast, the Tensilica approach lets the designer alter the input to synthesis, and change its functionality.

1.1.4 Development and Verification Tools

Extending an architecture and reconfiguring a processor may require widespread changes in processor logic to keep pipeline stages synchronized. Such reconfiguration requires that the processor be re-verified. Tensilica automates these changes and makes them reliable.

In addition, when the processor changes, the software tool chain (compilers, assemblers, linkers, debuggers, simulators, and profilers) must change as well. In the past, the cost of software changes associated with processor reconfigurations has been a major impediment. Tensilica automates these changes also.

Finally, it should be possible to get feedback on the performance, cost, power, and other effects of processor reconfiguration without taking the design through the entire mapping process. This feedback can be used to direct further reconfiguration of the processor until the system design goals are achieved. Tensilica’s technology dramatically improves the feedback loop.

1.2 The Xtensa Instruction Set Architecture

The Xtensa Instruction Set Architecture (ISA) is a new post-RISC ISA targeted at embedded, communication, and consumer products. The ISA is designed to provide:
• A high degree of extensibility
• Industry-leading code density
• Optimized low-power implementation
• High performance
• Low-cost implementation

This manual describes the Xtensa ISA, both the core architecture and the architectural options. Figure 1–1 illustrates the general organization of the processor hardware in which the Xtensa ISA is implemented. This manual does not describe the memory map, protection model, or peripherals that can be implemented in particular configurations of the Xtensa ISA.

[Block diagram. Legend categories: Base ISA Feature; Configurable Function; Optional Function; Optional & Configurable; Designer-Defined Features (TIE).]

Figure 1–1. Xtensa LX Hardware Architecture Block Diagram

Table 1–2 compares the architectural features provided by the Xtensa ISA to those of typical RISC architectures. Each of the Xtensa features is described in this manual.

Table 1–2. Comparison of Typical RISC and Xtensa ISA Features

Architectural Feature                      Typical RISC         Xtensa
Instruction size                           32 bits              24 and 16 bit
Compare and branch                         no or partial        total
Application-specific instructions          no                   yes
Zero-overhead loop                         no                   yes
Funnel shift                               no (except 29000)    yes
Variable-increment register windows        no                   yes
Conditional move                           recently             yes
Compound multiply/add                      recently             yes
Advanced multiprocessor synchronization    recently             yes

1.2.1 Configurability

The Xtensa ISA goes further than incorporating post-RISC features: it is modular, consisting of a core architecture and architectural options. Table 1–3 lists the initial set of modular components.

Table 1–3. Modular Components

Component                                 Reference

Core Architecture                         Chapter 3, "Core Architecture" on page 23
Core Architecture                         Section 4.2 “Core Architecture” on page 50

Options for Additional Instructions
Code Density Option                       "Code Density Option" on page 53
Loop Option                               "Loop Option" on page 54
Extended L32R Option                      "Extended L32R Option" on page 56
16-bit Integer Multiply Option            "16-bit Integer Multiply Option" on page 57
32-bit Integer Multiply Option            "32-bit Integer Multiply Option" on page 58
MAC16 Option                              "MAC16 Option" on page 60
Miscellaneous Operations Option           "Miscellaneous Operations Option" on page 62
Coprocessor Option                        "Coprocessor Option" on page 63
Boolean Option                            "Boolean Option" on page 65
Floating-Point Coprocessor Option         "Floating-Point Coprocessor Option" on page 67
Multiprocessor Synchronization Option     "Multiprocessor Synchronization Option" on page 74
Conditional Store Option                  "Conditional Store Option" on page 77

Options for Interrupts and Exceptions
Exception Option                          "Exception Option" on page 82
Unaligned Exception Option                "Unaligned Exception Option" on page 99
Interrupt Option                          "Interrupt Option" on page 100
High-Priority Interrupt Option            "High-Priority Interrupt Option" on page 106
Timer Interrupt Option                    "Timer Interrupt Option" on page 110

Options for Memory
Instruction Cache Option                  "Instruction Cache Option" on page 115
Instruction Cache Test Option             "Instruction Cache Test Option" on page 116
Instruction Cache Index Lock Option       "Instruction Cache Index Lock Option" on page 117
Data Cache Option                         "Data Cache Option" on page 118
Data Cache Test Option                    "Data Cache Test Option" on page 121
Data Cache Index Lock Option              "Data Cache Index Lock Option" on page 122
Instruction RAM Option                    "Instruction RAM Option" on page 124
Instruction ROM Option                    "Instruction ROM Option" on page 125
Data RAM Option                           "Data RAM Option" on page 126
Data ROM Option                           "Data ROM Option" on page 126
XLMI Option                               "XLMI Option" on page 127
Hardware Alignment Option                 "Hardware Alignment Option" on page 128
Memory ECC/Parity Option                  "Memory ECC/Parity Option" on page 128

Options for Memory Protection
Region Protection Option                  "Region Protection Option" on page 150
Region Translation Option                 "Region Translation Option" on page 156
MMU Option                                "MMU Option" on page 158

Options for Other Purposes
Windowed Register Option                  "Windowed Register Option" on page 180
Processor Interface Option                "Processor Interface Option" on page 194
Miscellaneous Special Registers Option    "Miscellaneous Special Registers Option" on page 195
Thread Pointer Option                     "Thread Pointer Option" on page 196
Processor ID Option                       "Processor ID Option" on page 196
Debug Option                              "Debug Option" on page 197
Trace Port Option                         "Trace Port Option" on page 203

1.2.2 Extensibility

In addition to the Xtensa components shown in Table 1–3, designers can extend the Xtensa architecture by adding States, Register Files, and instructions that operate both on the AR Register File and on the additional states the designer has added. These instructions can be single cycle or multiple cycles, and share or re-use logic.

1.2.2.1 State Extensions

The designer can add State Registers. These State Registers can be the source or destination of various instructions and are saved and restored by the operating system.

1.2.2.2 Register File Extensions

The designer can add Register Files of widely varying size. These Register Files can be the source or destination of various instructions and are saved and restored by the operating system. The registers within them are allocated by the compiler, which can spill and re-fill them if necessary.

1.2.2.3 Instruction Extensions

The designer can define new instructions that contain simple functions consisting of combinatorial logic that takes one or two source operands from registers and produces a result to be written to a register:

    AR[r] ← f(AR[s], AR[t])

Instructions can also be much more complex with register file values and State appearing as both inputs and outputs. These Instructions are described using the Tensilica Instruction Extension (TIE) language (see Section 1.3.2).
1.2.2.4 Coprocessor Extensions
Another mechanism to extend the Xtensa ISA is to use the Coprocessor Option. A coprocessor is defined as a combination of registers, other state, and logic that operates
on that state, including loads, stores, and setting of Booleans for branch true/false operations. Each coprocessor can be enabled or disabled with a single bit, which controls
whether instructions that access that coprocessor's registers and other state may
execute.

1.2.3 Time-to-Market

The Xtensa Software Development Toolkit includes automatically generated software
that matches the designer’s processor configuration and eliminates tool headaches. The
ISA’s rich set of features (for example, interrupt and debug facilities) makes the system
designer’s job easier. The ability to create custom instructions with the TIE language
allows the designer to reach performance goals with less code-tuning or hard-to-interface-to external logic.

1.2.4 Code Density

The Xtensa core ISA is implemented as 24-bit instructions. This instruction width provides a direct 25% reduction in code size compared with 32-bit ISAs. The instructions
provide access to the entire processor hardware and support special functions, such as
single-instruction compare-and-branch, which reduce the number of instructions required to implement various applications. These special functions result in further code-size reductions.
The Xtensa ISA also includes a Code Density Option that further reduces code size.
This option adds 16-bit instructions that are distinguished by opcode, and that can be
freely intermixed with 24-bit instructions to achieve higher code density than competing
ISAs without giving up the performance of a 32-bit ISA. The 16-bit instructions add no
new functionality but provide compact encoding of the most frequently used 24-bit instructions. In typical code, roughly half of all instructions can be encoded in 16 bits.
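The arithmetic behind these figures can be sketched as follows. This is only an illustration: it assumes the same instruction count as the 32-bit ISA being compared against, which the text does not claim, and the function names are hypothetical.

```python
def average_bytes_per_instruction(fraction_16bit):
    """Average encoded size when the given fraction of instructions uses
    the 16-bit form and the remainder uses the 24-bit form."""
    return fraction_16bit * 2 + (1.0 - fraction_16bit) * 3


def reduction_vs_32bit(fraction_16bit):
    """Fractional code-size saving relative to 4-byte instructions,
    assuming (hypothetically) an identical instruction count."""
    return 1.0 - average_bytes_per_instruction(fraction_16bit) / 4.0


core_only = reduction_vs_32bit(0.0)     # 24-bit instructions only
with_density = reduction_vs_32bit(0.5)  # about half encoded in 16 bits
print(f"{core_only:.1%} {with_density:.1%}")
```

Under these assumptions the 24-bit encoding alone gives the 25% reduction quoted above, and a half-and-half mix of 16-bit and 24-bit instructions averages 2.5 bytes per instruction.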
The core ISA omits the branch delay slots required by some RISC ISAs. This increases
code density by eliminating NOPs the compiler uses to fill the slot after a branch when it
cannot find a real instruction to put there (only 50% of the branch delay slots are filled on
some RISC architectures).
The Xtensa ISA provides a Windowed Registers Option. Xtensa windowed registers reduce code size by:
- Eliminating register saves and restores at procedure entry and exit
- Reducing argument shuffling
- Allowing more local variables to live permanently in registers

1.2.5 Low Implementation Cost

The Xtensa architecture is designed to facilitate efficient implementation. It can be implemented with simple instruction pipelines and direct hardware execution without
microcode. Operations that are too complex to easily implement with single instructions are
synthesized into appropriate instruction sequences by the compiler. The base architecture avoids instructions that would need extra register file read or write ports. This keeps
the minimal configuration low-cost and low-power.
The Xtensa architecture fully supports the common data types and operations found in a
broad range of applications. The base architecture omits special-purpose data types
and operations. Optional instructions, the TIE language (see Section 1.3.2), and optional coprocessors allow the designer to add exactly the functionality needed, thus avoiding the cost and performance penalties of unused general-purpose functions.

The Xtensa ISA’s improvements in code size help reduce system cost (for example, by
reducing the amount of ROM, Flash, or RAM required). Making features like the number
of debug registers configurable allows the system designer, instead of the processor
designer, to decide the cost/benefit trade-off.

1.2.6 Low-Power

The Xtensa ISA has several energy-efficient attributes that enhance battery-operated
systems. The core ISA is built on 32-bit operations; some embedded processors of similar performance have 64-bit base operations, which consume additional power, often
unnecessarily. (TIE does allow 64-bit or greater computations to be added to the processor for those algorithms that require it, but these can be used selectively to achieve a
balance between performance and power consumption.)
The core ISA uses a register file with only two read ports and one write port, a configuration that requires fewer transistors and less power than architectures with more ports.
The Xtensa Windowed Registers Option saves power by reducing the number of dynamic data-memory references and increasing the opportunities for variables to reside
in registers, where accesses require less power than memory accesses.
The WAITI (Wait for Interrupt) instruction, which is a part of the Interrupt Option, saves
power by setting the current interrupt level, powering down the processor’s logic, and
waiting for an interrupt.

1.2.7 Performance

The Xtensa ISA achieves its extensibility, code density, and low-power advantages without sacrificing performance. For example, the Thumb and MIPS16 extensions of the
ARM and MIPS ISAs, respectively, provide improved code density by using only eight
registers and by reducing operand flexibility. By contrast, the Xtensa 24-bit instructions
can access 16 virtual registers with 3 register operands, and 16-bit instructions can
access all 16 registers with 1 to 3 register operands. The mapping of the 16 virtual
registers to the physical register file can eliminate register saves and restores at procedure entry and exit, also increasing performance.
The Xtensa ISA also enhances performance by providing:
- A complete set of compare-and-branch instructions, eliminating the need for separate comparison instructions
- LOOP, LOOPNEZ, and LOOPGTZ instructions that provide zero-overhead looping

These features are described in Section 3.8 of this manual. Other features of the architecture minimize critical paths, allow better compiler scheduling, and require fewer executed instructions to implement a given program.

1.2.8 Pipelines

The Xtensa ISA can be implemented using a variety of pipelines. A 5-stage load-store
oriented pipeline, such as is used in many RISC processors, is supported by Xtensa implementations and illustrated in Figure 1–2. Many other variations are possible. A 7-stage load-store oriented pipeline is supported by some Xtensa implementations. Instructions can also have computation in later pipe stages so that the computation can
use memory data loaded by the same instruction.

[Figure: block diagram of a five-stage pipeline. Stages: I (Instruction Fetch), R (Instruction Decode/Register Fetch Cycle), E (Execute/Effective Address Cycle), M (Memory Access/Branch Complete Cycle), W (Write Back Cycle). Blocks: Instruction RAM, Instruction ROM, Instruction Cache, Decode, General Registers (AR Registers), Coprocessor Registers, Address Generation, ALU, Coprocessor ALU, Data RAM, Data ROM, Data Cache, Xtensa Local Memory Interface (XLMI), Exception Resolution and Write Back.]

Figure 1–2. Example Implementation Pipeline
The instruction set was also designed with a 2-read, 1-write general register file (called
Address Registers) in mind. While this approach results in lower implementation cost, it
prevents the inclusion of auto-incrementing loads and indexed stores to or from the
Address Registers. For the sake of symmetry, the ISA therefore does not include auto-incrementing stores and indexed loads. However, all of these addressing modes are
possible for designer-defined loads and stores. Designers can implement register files
with more read and write ports. For example, the Xtensa Floating-Point Coprocessor
Option contains a floating point register file with three read ports.

1.3 The Xtensa Processor Generator

The Xtensa Processor Generator is the key to rapid, optimal creation of application-specific processors. Using this tool, the designer can specify and generate a complete
processor subsystem. The designer can select the instruction set, memory hierarchy,
peripherals and interface options to fit the target application.
The Generator user interface captures designer input in several ways, including:
- Configuration of the processor micro-architecture
- Configuration of Tensilica-provided instruction and coprocessor options
- Specification of designer-defined instruction and coprocessor extensions, using the
  Tensilica Instruction Extension (TIE) language

Together, these specifications make up the configuration database shown near the top
of Figure 1–3. This file is used to generate all the software tools and hardware descriptions for the final application-specific processor.

1.3.1 Processor Configuration

The Generator interface drives the creation and optimization of all forms of the processor needed for integration into the system design flow. Based on the designer’s specifications, it creates synthesizable Verilog or VHDL code, synthesis scripts, an HDL test
bench, and physical placement files. Simultaneously, an optimized C and C++ compiler,
assembler, linker, symbolic debugger, Instruction Set Simulator, libraries and verification
tests are built for the designer’s software development.
The Generator interface lets the designer specify implementation targets for speed, area
and process technology, as well as the optimization priorities used in synthesis and layout.

1.3.2

System-Specific Instructions—The TIE Language

The Tensilica Instruction Extension (TIE) language lets the designer add instructions to
the processor implementation, including full software support for generated instructions.
The specification of instruction extensions can include the following aspects as well as
many others:
- Instruction Operation: Defines the operation of an additional instruction
- Immediate and Constant Tables: Defines constant values in instructions
- Register File: Defines new register files
- State: Defines new single processor states for instructions to operate on
- Length and Format: The FLIX extensions to TIE allow for multiple instruction sizes
  and the defining of multiple operations in a single instruction
- Queues and Ports: Defines input and output queue ports and other ports for the
  Xtensa processor
- Types: Defines new C/C++ data types associated with user defined register files.
  Allows type checking and automatic loading, storing and register allocation
- Prototypes: Defines the argument types of C/C++ intrinsics for each instruction
  and the instruction sequences for loading, storing, and moving the added types
- Schedule: Defines the pipeline stages at which instructions use input values and
  produce output values

In addition to designer-defined register and register file operands, instructions can use
AR registers as source values. They may generate multiple results, including AR register
file results. These instructions should be designed to have circuit delays appropriate to
the number of cycles specified in the schedule specifications to avoid limiting the processor clock frequency. The instruction semantics are expressed in a subset of Verilog,
including all commonly used operators (multiply, add, subtract, minus, not, or, comparisons, reduction operators, shifts, concatenation, and conditionals).
The use of TIE for the creation of new instructions and coprocessors is described in the
Tensilica Instruction Extension (TIE) Language User’s Guide. The TIE language is described in the Tensilica Instruction Extension (TIE) Language Reference Manual.

Figure 1–3 illustrates the Xtensa design flow.
[Figure: flow chart. Boxes: Configure Processor (including Custom TIE Instructions); Configuration-Specific Database; Configuration-Independent XtTools; Configuration-Specific Software Development Tools; Configuration-Specific HDL Description and CAD Scripts; Install Software: Set up Environment; Compile, Assemble and Link Application Software; Simulate, Debug & Profile Application Software: Add Custom Instructions; Synthesize Logic; Place and Route; Verify Timing. Legend: Hardware User Tasks, Software User Tasks, Automatically Generated, Software, Hardware.]

Figure 1–3. The Xtensa Design Flow

Chapter 2. Notation

This manual uses the following notation for instruction descriptions. Additional notation
specific to opcode encodings is provided in "Opcode Encodings" on page 574.

2.1 Bit and Byte Order

This manual consistently uses little-endian bit ordering for describing instructions and
registers. Bits in little-endian notation are numbered starting from 0 for the least-significant bit of a field. However, this notation convention is independent of how an Xtensa
processor actually numbers bits, because a given processor can be configured for either
little- or big-endian byte and bit ordering. For most Xtensa instructions, bit numbering is
irrelevant; only the BBC and BBS instructions assign bit numbers to values on which the
processor operates. The BBC/BBS instructions use big-endian bit ordering (0 is the most-significant bit) on a big-endian processor configuration. Bit numbering by the BBC/BBS
instructions is illustrated in Figure 2–4.
In specifying little- or big-endian ordering during actual processor configuration, you are
specifying both the bit and the byte order; the two orderings have the same most-significant and least-significant ends.
Figure 2–5 on page 18 illustrates big- and little-endian byte order, as implemented by
Xtensa load (page 33) and store (page 36) instructions. Xtensa processors transfer data
to and from the system using interfaces that are configurable in width (32, 64, or 128 bits
in current implementations). These interfaces arrange their n bits according to their significance representing an n-bit unsigned integer value (that is, 0 to 2^n-1). Load and store
instructions that reference quantities less than n bits access different bits of this integer
in little-endian and big-endian byte orderings (for example, by changing the selection algorithm for loads). Xtensa processors do not rearrange bits of a word to implement endianness (for example, swapping bytes for big-endian operation).
Little-Endian bit numbering for BBC/BBS instructions:
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
←most-significant                                                  least-significant→

Big-Endian bit numbering for BBC/BBS instructions:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
←most-significant                                                  least-significant→

Figure 2–4. Big and Little Bit Numbering for BBC/BBS Instructions
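The two numberings in Figure 2–4 can be expressed as a mask computation. The sketch below is illustrative only: the function names are hypothetical, and the `msb_first` argument is assumed to correspond to the msbFirst configuration parameter described in Chapter 3.

```python
def bit_mask(bit_number, msb_first):
    """Return the 32-bit mask selected by a BBC/BBS-style bit number.

    msb_first = 0: little-endian numbering, bit 0 is least significant.
    msb_first = 1: big-endian numbering, bit 0 is most significant.
    """
    if not 0 <= bit_number <= 31:
        raise ValueError("bit number must be in 0..31")
    if msb_first:
        return 0x80000000 >> bit_number
    return 1 << bit_number


def bit_is_set(value, bit_number, msb_first):
    """True if the numbered bit of the 32-bit value is 1."""
    return (value & bit_mask(bit_number, msb_first)) != 0


print(hex(bit_mask(0, 0)), hex(bit_mask(0, 1)))
```

Bit number b in one numbering names the same physical bit as 31 - b in the other.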

Little-Endian byte addresses, 128-bit processor interface:
          127 (←most-significant)                        (least-significant→) 0
word 0    15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
word 1    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
word 2     …                                                          32

Big-Endian byte addresses, 128-bit processor interface:
          127 (←most-significant)                        (least-significant→) 0
word 0     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
word 1    16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31
word 2    32   …

Little-Endian byte addresses, 64-bit processor interface:
          63 (←most-significant)        (least-significant→) 0
word 0     7   6   5   4   3   2   1   0
word 1    15  14  13  12  11  10   9   8
word 2     …                          16

Big-Endian byte addresses, 64-bit processor interface:
          63 (←most-significant)        (least-significant→) 0
word 0     0   1   2   3   4   5   6   7
word 1     8   9  10  11  12  13  14  15
word 2    16   …

Little-Endian byte addresses, 32-bit processor interface:
          31           0
word 0     3   2   1   0
word 1     7   6   5   4
word 2     …           8

Big-Endian byte addresses, 32-bit processor interface:
          31           0
word 0     0   1   2   3
word 1     4   5   6   7
word 2     8   …

Figure 2–5. Big and Little Endian Byte Ordering
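One way to read Figure 2–5 is as a rule for which bits of the n-bit interface value hold a given byte address. The following sketch models that selection under the assumption that the interface value is treated as an unsigned integer, as described above; the function names and parameters are hypothetical.

```python
def byte_lane(address, interface_bytes, big_endian):
    """Return (low_bit, high_bit) of the interface integer that holds
    the byte at the given byte address."""
    offset = address % interface_bytes
    if big_endian:
        lane = interface_bytes - 1 - offset  # byte 0 at the most-significant end
    else:
        lane = offset                        # byte 0 at the least-significant end
    return 8 * lane, 8 * lane + 7


def load_byte(interface_value, address, interface_bytes, big_endian):
    """Select one byte from an interface word without rearranging it."""
    low_bit, _ = byte_lane(address, interface_bytes, big_endian)
    return (interface_value >> low_bit) & 0xFF


# The same 32-bit interface value yields different bytes at address 0.
print(hex(load_byte(0x11223344, 0, 4, big_endian=False)))
print(hex(load_byte(0x11223344, 0, 4, big_endian=True)))
```

Only the selection changes between the two byte orders; the bits of the interface word are not swapped, which matches the statement that Xtensa processors do not rearrange bits of a word to implement endianness.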

2.2 Expressions

Table 2–4 defines notational forms used in expressions that describe the operation of instructions. In the table, v is an n-bit quantity, u is an m-bit quantity, and t is a 1-bit
quantity.
Table 2–4. Instruction-Description Expressions

Expression Notation [1]   Definition
v_x                       Bit x of v. The result is 1 bit.
v_x..y                    Bits from position x to y of v. The result is x-y+1 bits.
v^y                       The value v replicated y times. The result is n×y bits.
array[i]                  Reference to element i of array.
u || v                    The catenation of bit strings u and v. The result is m+n bits.
not v                     Bitwise logical complement of v. The result is n bits.
u and v                   Bitwise logical and of u and v. u and v must be the same width. The result is n bits.
u or v                    Bitwise logical or of u and v. u and v must be the same width. The result is n bits.
u xor v                   Bitwise logical exclusive or of u and v. u and v must be the same width. The result is n bits.
u = v                     Test for exact equality of u and v. u and v must be the same width. The result is 1 bit.
u ≠ v                     Test for inequality of u and v. u and v must be the same width. The result is 1 bit.
u < v                     Two’s complement less-than test on u and v. u and v must be the same width. The result is 1 bit.
u ≤ v                     Two’s complement less-than or equal-to test on u and v. u and v must be the same width. The result is 1 bit.
u > v                     Two’s complement greater-than test on u and v. u and v must be the same width. The result is 1 bit.
u ≥ v                     Two’s complement greater-than or equal-to test on u and v. u and v must be the same width. The result is 1 bit.
u + v                     Two’s complement addition of u and v. u and v must be the same width. The result is n bits.
u - v                     Two’s complement subtraction of u and v. u and v must be the same width. The result is n bits.
u × v                     Low-order product of two’s complement multiplication of u and v. u and v must be the same width. The result is n bits.
u quo v                   Quotient of two’s complement division of u by v. u and v must be the same width. The result is n bits.
u rem v                   Remainder of two’s complement division of u by v. u and v must be the same width. The result is n bits.
if t then u else v        Conditional expression. The value is u if t = 1. The value is v if t = 0.
u +s v                    IEEE754 single-precision floating-point addition of u and v. u and v must be 32 bits. The result is 32 bits.
u -s v                    IEEE754 single-precision floating-point subtraction of u and v. u and v must be 32 bits. The result is 32 bits.
u ×s v                    IEEE754 single-precision floating-point multiplication of u and v. u and v must be 32 bits. The result is 32 bits.
u ÷s v                    IEEE754 single-precision floating-point division of u by v. u and v must be 32 bits. The result is 32 bits.
sqrts(u)                  IEEE754 single-precision floating-point square root of u. u must be 32 bits. The result is 32 bits.
pows(u,v)                 IEEE754 single-precision floating-point power function where u is raised to the v power. u must be 32 bits. The result is 32 bits.

1. t is a 1-bit quantity, u is an m-bit quantity, v is an n-bit quantity. Constants are written either as decimal numbers, in which case the width is
determined from context, or in binary.
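For readers more used to software than to bit-string notation, the first few forms in Table 2–4 can be mimicked with integers. This is a minimal sketch, not part of the manual's notation: the helper names are hypothetical, and operand widths must be passed explicitly because Python integers carry no width.

```python
def bits(v, x, y):
    """Bits x down to y of v (x >= y); the result is x-y+1 bits wide."""
    return (v >> y) & ((1 << (x - y + 1)) - 1)


def replicate(v, n, y):
    """The n-bit value v replicated y times; the result is n*y bits wide."""
    result = 0
    for _ in range(y):
        result = (result << n) | (v & ((1 << n) - 1))
    return result


def cat(u, v, n):
    """Catenation u || v, where v is n bits wide."""
    return (u << n) | (v & ((1 << n) - 1))


# Sign-extend an 8-bit immediate to 32 bits: (bit 7 replicated 24 times) || imm8
imm8 = 0x80
extended = cat(replicate(bits(imm8, 7, 7), 1, 24), imm8, 8)
print(hex(extended))
```

The sign-extension line shows how replication and catenation are typically combined in instruction descriptions.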

2.3 Unsigned Semantics

In this notation, prepending a zero bit is often used for unsigned semantics. For
example, the following notation indicates an unsigned less-than test:
(0 || u) < (0 || v)
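The effect of the prepended zero bit can be checked with a small model. The sketch below assumes 32-bit operands for the demonstration; the helper names are hypothetical.

```python
def to_signed(value, width):
    """Interpret a width-bit pattern as a two's complement integer."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        return value - (1 << width)
    return value


def signed_less_than(u, v, width):
    """The two's complement test written u < v in this notation."""
    return to_signed(u, width) < to_signed(v, width)


def unsigned_less_than(u, v, width):
    """(0 || u) < (0 || v): prepend a zero bit to each operand, then
    compare the results as (width+1)-bit two's complement values."""
    mask = (1 << width) - 1
    return signed_less_than(u & mask, v & mask, width + 1)


print(signed_less_than(0xFFFFFFFF, 1, 32))    # all-ones is -1 when signed
print(unsigned_less_than(0xFFFFFFFF, 1, 32))  # all-ones is the largest unsigned value
```

Because the extra high-order bit is always zero, both widened operands are non-negative, so the two's complement comparison orders them by magnitude.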

2.4 Case

Processor-state variables (for example, registers) are shown in UPPER CASE.
Temporary variables are shown in lower case. If a particular variable is in italics
(variable), it is local in the sense that it has no meaning outside the local instruction
flow. If it is plain (variable), it comes from or is used outside of the local instruction
flow such as an instruction field or the next PC.

2.5 Statements

Table 2–5 defines the notational forms used in statements that describe the operation of
instructions.
Table 2–5. Instruction-Description Statements

Statement Notation    Definition
v ← expr              Assignment of expr to v.
if t1 then            Conditional statement. If t1 = 1 then execute statements s1. Otherwise, if t2 =
    s1                1 then execute statements s2, etc. Finally if none of the previous tests are true,
[elseif t2 then       execute statements sn.
    s2]
.
.
.
[else
    sn]
endif
label:                Define label for use as a goto target.
goto label            Transfer control to label.

2.6 Instruction Fields

The fields in Table 2–6 are used in the descriptions of the instructions. Instruction formats and opcodes are described in Chapter 7, "Instruction Formats and Opcodes" on
page 569.
Table 2–6. Uses Of Instruction Fields

Field       Definition
op0         Major opcode
op1         4-bit sub-opcode for 24-bit instructions
op2         4-bit sub-opcode for 24-bit instructions
r           AR target (result), BR target (result), 4-bit immediate, 4-bit sub-opcode
s           AR source, BR source, AR target
t           AR target, BR target, AR source, BR source, 4-bit sub-opcode
n           Register window increment, 2-bit sub-opcode, n||2'b00 is used as an AR target on CALLn/CALLXn
m           2-bit sub-opcode
i           1-bit sub-opcode
z           1-bit sub-opcode
imm4        4-bit immediate
imm6        6-bit immediate (PC-relative offset)
imm7        7-bit immediate (for MOVI.N)
imm8        8-bit immediate
imm12       12-bit immediate
imm16       16-bit immediate
offset      18-bit PC-relative offset
ai4const    4-bit immediate, if 0 interpreted as -1, else sign-extended
b4const     4-bit encoded constant value
bbi         5-bit selector for Booleans in registers
sa          4- or 5-bit shift amount
sr          8-bit special register selector
x           1-bit MAC16 data register selector (m0 or m1 only)
y           1-bit MAC16 data register selector (m2 or m3 only)
w           2-bit MAC16 data register selector (m0, m1, m2, or m3)

Chapter 3. Core Architecture

The Xtensa Core Architecture provides a baseline set of instructions available in every
Xtensa implementation. Having such a baseline eases the implementation of core software such as operating system ports and a compiler. This chapter describes that Core
Architecture.

3.1 Overview of the Core Architecture

The Xtensa Instruction Set is the product of extensive research into the right balance of
features to best address the needs of the embedded processor market. It borrows the
best features of other architectures as well as bringing new ISA innovations of its own.
While the Xtensa ISA derives most of its features from RISC, it has targeted areas in
which older CISC architectures have been strongest, such as compact code.
The Xtensa core ISA is implemented as a set of 24-bit instructions that perform 32-bit
operations. The instruction width was chosen primarily with code-size economy in mind.
The instructions themselves were selected for their utility in a wide range of embedded
applications. The core ISA has many powerful features, such as compound operation
instructions, that enhance its fit to embedded applications, but it avoids features that
would benefit some applications at the expense of cost or power on others (for example,
features that require extra register-file ports). Such features can be implemented in the
Xtensa architecture using options and coprocessors specifically targeted at a particular
application area.
The Xtensa ISA is organized as a core set of instructions with various optional packages
that extend the functionality for specific application areas. This allows the designer to
include only the required functionality in the processor core, maximizing the efficiency of
the solution. The core ISA provides the functionality required for general control applications, and excels at decision-making and bit and byte manipulation. The core also provides a target for third-party software, and for this reason deletions from the core are not
supported. Conversely, numeric computing applications such as digital signal processing are best done with optional ISA packages appropriate for specific application areas,
such as the MAC16 Option for integer filters, or the Floating-Point Coprocessor Option
for high-end audio processing.

3.2 Processor-Configuration Parameters

Table 3–7 lists the processor-configuration parameters that are required in the core architecture. Additional processor-configuration parameters are listed with each option
described in Chapter 4, "Architectural Options" on page 47.

Table 3–7. Core Processor-Configuration Parameters

Parameter    Description    Valid Values
msbFirst     Byte order     0 or 1
                            0 → Little-endian (least significant bit first)
                            1 → Big-endian (most significant bit first)

3.3 Registers

Table 3–8 lists the core-architecture registers. Each register is described in the sections
that follow. Additional registers are added with many of the options described in
Chapter 4. The complete set of registers that are predefined in the architecture, including all registers used by the architectural options, is listed in Table 5–127 on page 205.
Table 3–8. Core-Architecture Set

Register Mnemonic   Quantity   Width (bits)   Register Name                           R/W   Special Register Number [1]
AR                  16 [2]     32             Address registers (general registers)   R/W   -
PC                  1          32             Program counter                         R/W   -
SAR                 1          6              Shift-amount register                   R/W   3

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 5–127 on
page 205. A dash (-) means that the register is not a Special Register.
2. See "Windowed Register Option" on page 180.

3.3.1 General (AR) Registers

Each instruction contains up to three 4-bit general-register specifiers, each of which can
select one of 16 32-bit registers. These general registers are named address registers
(AR) to distinguish them from coprocessor registers, which in many systems might serve
as “data” registers. However, the AR registers are not restricted to holding addresses;
they can also hold data.
If the Windowed Register Option is configured, the address register file is extended and
a mapping from virtual to physical registers is used.
The contents of the address register file are undefined after reset.

3.3.2 Shifts and the Shift Amount Register (SAR)

The ISA provides conventional immediate shifts (logical left, logical right, and arithmetic
right), but it does not provide single-instruction shifts in which the shift amount is a register operand. Taking the shift amount from a general register can create a critical timing
path. Also, simple shifts do not extend efficiently to larger widths. Funnel shifts (where
two data values are catenated on input to the shifter) solve this problem, but require too
many operands. The ISA solves both problems by providing a funnel shift in which the
shift amount is taken from the SAR register. Variable shifts are synthesized by the compiler using an instruction to compute SAR from the shift amount in a general register,
followed by a funnel shift.
Another advantage is that a unidirectional funnel shifter can be manipulated to provide
either right or left shifts based on the order of the source operands and transformation of
the shift amount. The ISA facilitates implementations that exploit this to reduce the logic
required by the shifter.
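The way a single right-shifting funnel shifter yields both shift directions can be modeled in a few lines. This is a sketch under stated assumptions: `src` follows the funnel-shift description above (a 64-bit catenation shifted right by SAR), while the SAR setup rules in the comments are the usual pairing of a set-shift-amount instruction with a funnel shift and are defined elsewhere in this manual, not in this section. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def src(high, low, sar):
    """Funnel shift: bits sar+31..sar of the 64-bit value high || low."""
    if not 0 <= sar <= 32:
        raise ValueError("SAR must be in 0..32")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return (combined >> sar) & MASK32


def shift_right_logical(value, amount):
    sar = amount & 31            # assumed setup: SAR <- low five bits of amount
    return src(0, value, sar)    # zeros funnel in from the high word


def shift_left_logical(value, amount):
    sar = 32 - (amount & 31)     # assumed setup: SAR <- 32 - amount
    return src(value, 0, sar)    # value in the high word, zeros below


print(hex(shift_left_logical(0x12345678, 4)))
print(hex(shift_right_logical(0x12345678, 4)))
```

A left shift by zero needs SAR = 32 in this scheme, which illustrates why the legal range of SAR has to include 32.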
Funnel shifts are also useful for working with the 40-bit accumulator values created by
the MAC16 Option.
To facilitate unsigned bit-field extraction, the EXTUI instruction takes a 4-bit mask field
that specifies how many low-order bits of the shift result are kept. The 4-bit field specifies
masks of one to 16 ones. The SRLI instruction provides shifting without a mask.
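The extraction described above amounts to a right shift followed by a mask. The sketch below is a behavioral model only; the function names are hypothetical, and the field encoding shown (mask width minus one) is an assumption inferred from the statement that a 4-bit field covers masks of one to 16 ones.

```python
def extui(value, shift, mask_bits):
    """Shift a 32-bit value right by shift (0..31) and keep the
    low-order mask_bits (1..16) bits of the result."""
    if not 0 <= shift <= 31:
        raise ValueError("shift must be in 0..31")
    if not 1 <= mask_bits <= 16:
        raise ValueError("mask width must be in 1..16")
    return ((value & 0xFFFFFFFF) >> shift) & ((1 << mask_bits) - 1)


def encode_mask_field(mask_bits):
    """Assumed 4-bit encoding: the number of mask bits minus one."""
    if not 1 <= mask_bits <= 16:
        raise ValueError("mask width must be in 1..16")
    return mask_bits - 1


print(hex(extui(0xABCD1234, 8, 8)))   # extracts bits 15..8
```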
The legal range of values for SAR is zero to 32, not zero to 31, so SAR is defined as six
bits. The use of SRC, SRA, SLL, or SRL when SAR > 32 is undefined.
SAR is undefined after processor reset.
The funnel shifter can also be used efficiently for byte alignment of unaligned memory
data. To load four bytes from an arbitrary byte boundary (in a processor that does not
have the Unaligned Exception Option), use the following code:
    l32i    a4,a3,0
    l32i    a5,a3,4
    ssa8l   a3
    src     a4,a5,a4
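The four-instruction sequence above can be traced with a small little-endian model. It assumes, as the text states, a core without the Unaligned Exception Option, so that a 32-bit load ignores the two low address bits; the Python function names mirror the instructions but are otherwise hypothetical.

```python
MASK32 = 0xFFFFFFFF


def l32i(memory, address):
    """Little-endian 32-bit load; the two low address bits are ignored."""
    base = address & ~3
    return int.from_bytes(memory[base:base + 4], "little")


def ssa8l(address):
    """Shift amount for a little-endian byte shift: 8 times the byte offset."""
    return (address & 3) * 8


def src(high, low, sar):
    """Funnel shift: bits sar+31..sar of the 64-bit value high || low."""
    return ((((high & MASK32) << 32) | (low & MASK32)) >> sar) & MASK32


def load_unaligned(memory, address):
    low = l32i(memory, address)        # l32i  a4,a3,0
    high = l32i(memory, address + 4)   # l32i  a5,a3,4
    sar = ssa8l(address)               # ssa8l a3
    return src(high, low, sar)         # src   a4,a5,a4


memory = bytes(range(16))
print(hex(load_unaligned(memory, 1)))  # bytes 1, 2, 3, 4 in little-endian order
```

When the address happens to be aligned, the shift amount is zero and the result is simply the first word loaded.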

An unaligned block copy can be done (in a processor that does not have the Unaligned
Exception Option) with the following code for little-endian and small changes for big-endian:
    l32i    a6,a3,0
    ssa8l   a3
    loopnez a4,endloop
loop:
    l32i    a7,a3,4
    src     a8,a7,a6
    s32i    a8,a2,0
    l32i    a6,a3,8
    src     a8,a6,a7
    s32i    a8,a2,4
    addi    a2,a2,8
    addi    a3,a3,8
endloop:

The overhead, compared to an aligned copy, is only one SRC per L32I.
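The loop structure of the block copy, in which each loaded word is reused as the low half of one funnel shift and the high half of the next, can be followed in this little-endian model. It is a sketch only: it assumes a word-aligned destination, a loop count given in 8-byte pairs, and loads that ignore the two low address bits; all Python names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def word_at(memory, address):
    """Little-endian 32-bit load; the two low address bits are ignored."""
    base = address & ~3
    return int.from_bytes(memory[base:base + 4], "little")


def src(high, low, sar):
    """Funnel shift: bits sar+31..sar of the 64-bit value high || low."""
    return ((((high & MASK32) << 32) | (low & MASK32)) >> sar) & MASK32


def copy_unaligned(memory, dst, source, pairs):
    """Copy pairs * 8 bytes from any byte address to a word-aligned dst."""
    previous = word_at(memory, source)               # l32i a6,a3,0
    sar = (source & 3) * 8                           # ssa8l a3
    for _ in range(pairs):                           # loopnez a4,endloop
        middle = word_at(memory, source + 4)         # l32i a7,a3,4
        first = src(middle, previous, sar)           # src  a8,a7,a6
        memory[dst:dst + 4] = first.to_bytes(4, "little")       # s32i a8,a2,0
        previous = word_at(memory, source + 8)       # l32i a6,a3,8
        second = src(previous, middle, sar)          # src  a8,a6,a7
        memory[dst + 4:dst + 8] = second.to_bytes(4, "little")  # s32i a8,a2,4
        dst += 8                                     # addi a2,a2,8
        source += 8                                  # addi a3,a3,8


memory = bytearray(range(32)) + bytearray(32)
copy_unaligned(memory, 32, 3, 2)
print(memory[32:48].hex())
```

Each iteration performs two loads, two funnel shifts, and two stores, which is consistent with the one-SRC-per-L32I overhead noted above. Note that the model, like the assembly, reads one aligned word beyond the last source byte.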

3.3.3 Reading and Writing the Special Registers

The SAR register is part of the Non-Privileged Special Register set in the Xtensa ISA (the
other registers in this set are associated with the architectural options). The contents of
the special register in the Core Architecture can be read to an AR register with the read
special register (RSR.SAR) instruction or written from an AR register with the write special register (WSR.SAR) instruction as shown in Table 3–9. The exchange special register (XSR.SAR) instruction accomplishes the combined action of the read and write instructions.
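The read, write, and exchange actions described above can be modeled as follows. This is a minimal sketch: the `Core` class and its method names are hypothetical, SAR is treated as the six-bit register described earlier, and the registers are initialized to zero only for convenience (on hardware they are undefined after reset).

```python
class Core:
    """Toy register state: sixteen AR registers plus SAR."""

    def __init__(self):
        self.ar = [0] * 16
        self.sar = 0

    def rsr_sar(self, t):
        """Read: AR[t] receives SAR zero-extended to 32 bits."""
        self.ar[t] = self.sar & 0x3F

    def wsr_sar(self, t):
        """Write: SAR receives the low six bits of AR[t]."""
        self.sar = self.ar[t] & 0x3F

    def xsr_sar(self, t):
        """Exchange: combined read and write using the same AR register."""
        old = self.sar & 0x3F
        self.sar = self.ar[t] & 0x3F
        self.ar[t] = old


core = Core()
core.ar[2] = 0x12345625
core.wsr_sar(2)      # only the low six bits are kept
core.ar[3] = 7
core.xsr_sar(3)      # AR[3] receives the old SAR, SAR receives 7
print(hex(core.ar[3]), core.sar)
```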
Table 3–9. Reading and Writing Special Registers

Register Name   Special Register Number   RSR.SAR Instruction    WSR.SAR Instruction
SAR             3                         AR[t] ← 0^26||SAR      SAR ← AR[t]_5..0

3.4 Data Formats and Alignment

The Core Architecture supports byte, 2-byte, and 4-byte data formats. Two additional
data formats are used in architectural options — a 32-bit single-precision format for the
Floating-Point Coprocessor Option, and a 40-bit accumulator value for the MAC16 Option. The MAC16 format is not a memory-operand format, but rather a temporary format
held in a special 40-bit accumulator register during MAC16 execution; the result can be
moved to two 32-bit registers for further operation or storage.
Table 3–10 summarizes the width and alignment of each data type. The processor uses
byte addressing for all data types stored in memory (that is, all except the MAC16 accumulator). Byte order can be specified as either big-endian or little-endian. In big-endian
byte order, byte 0 is the most-significant (left-most) byte. In little-endian byte order, byte
0 is the least-significant (right-most) byte. When specifying a byte order, both the byte
order and the bit order are specified: the two orderings always have the same most-significant and least-significant ends.
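
The byte-order rule above can be illustrated with a short Python sketch. It is only an illustration of the definition (byte 0 is the most-significant byte in big-endian order and the least-significant byte in little-endian order); the helper name byte_at and the sample value are assumptions made for this example and do not come from the manual.

```python
# Illustrative only: layout of a 4-byte word in memory under each byte order.
value = 0x11223344

big = value.to_bytes(4, "big")        # byte 0 holds the most-significant byte
little = value.to_bytes(4, "little")  # byte 0 holds the least-significant byte


def byte_at(word: int, index: int, msb_first: bool) -> int:
    """Return the byte stored at byte offset `index` (0..3) of a 4-byte word.

    `msb_first` mirrors the manual's msbFirst parameter (True = big-endian).
    """
    shift = (3 - index) * 8 if msb_first else index * 8
    return (word >> shift) & 0xFF
```

With these definitions, byte_at(value, 0, True) selects the same byte as big[0], and byte_at(value, 0, False) selects the same byte as little[0].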

Table 3–10. Operand Formats and Alignment

Operand                                                         Length    Alignment Address in Memory
Byte                                                            8 bits    xxxx
2-byte                                                          16 bits   xxx0
4-byte (word)                                                   32 bits   xx00
IEEE-754 single-precision (Floating-Point Coprocessor Option)   32 bits   xx00
MAC16 accumulator (MAC16 Option)                                40 bits   register image only (not in memory)

3.5

Memory

The Xtensa ISA is based on 32-bit virtual and physical memory addresses, which
provide a 2^32-byte (4 GB) address space for instructions and data.

3.5.1

Memory Addressing

Figure 3–6 shows an example of the processor’s interpretation of addresses when configured with caches. The widths of all fields are configurable, and in some cases the
width may be zero (in particular, there are always zero ignored bits today). The cache index and cache tag will overlap if the page size is smaller than the size of a single way of
the cache and if physical tags are used.
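[Figure 3–6 (diagram not reproduced): a 32-bit virtual address, bits 31..0, with the labeled fields Ignored, Cache Tag, Offset in Page, Line Index, Attribute Region, and Cache Index, shown against the Physical Address.]

Figure 3–6. Virtual Address Fields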
Without the Region Protection Option or the MMU Option, virtual and physical addresses are identical; if physical addresses are configured to be smaller than virtual addresses, virtual addresses are mapped to physical addresses only by truncation (high-order
bits are ignored). With the Region Protection Option or the MMU Option, virtual page
numbers are translated to physical page numbers.

Without the Region Protection Option or the MMU Option, the formal definition of virtual
to physical translation is as follows (note that the ring parameter is ignored):
function ftranslate(vAddr, ring)            -- fetch translate
    b ← vAddr_{(VABITS-1)..(VABITS-3)}
    cacheattr ← CACHEATTR_{(b||2'b11)..(b||2'b00)}
    attributes ← fcadecode(cacheattr)
    cause ← if invalid(attributes) then InstructionFetchErrorCause else 0
    ftranslate ← (vAddr_{PABITS-1..0}, attributes, cause)
endfunction ftranslate

function ltranslate(vAddr, ring)            -- load translate
    b ← vAddr_{(VABITS-1)..(VABITS-3)}
    cacheattr ← CACHEATTR_{(b||2'b11)..(b||2'b00)}
    attributes ← lcadecode(cacheattr)
    cause ← if invalid(attributes) then LoadStoreErrorCause else 0
    ltranslate ← (vAddr_{PABITS-1..0}, attributes, cause)
endfunction ltranslate

function stranslate(vAddr, ring)            -- store translate
    b ← vAddr_{(VABITS-1)..(VABITS-3)}
    cacheattr ← CACHEATTR_{(b||2'b11)..(b||2'b00)}
    attributes ← scadecode(cacheattr)
    cause ← if invalid(attributes) then LoadStoreErrorCause else 0
    stranslate ← (vAddr_{PABITS-1..0}, attributes, cause)
endfunction stranslate
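
The address arithmetic in these three functions can be sketched in Python as follows. This is a minimal illustration, not a model of any implementation: it assumes VABITS = 32, treats CACHEATTR as a plain 32-bit integer, and leaves the attribute decode (fcadecode, lcadecode, scadecode) out entirely. The function names are hypothetical.

```python
VABITS = 32  # assumed virtual address width for this sketch


def region_cacheattr(vaddr: int, cacheattr: int) -> int:
    """Select the 4-bit attribute field for the region containing vaddr.

    The top three virtual address bits (b) pick one of eight regions;
    the field is CACHEATTR bits (b||2'b11)..(b||2'b00), i.e. 4*b+3 .. 4*b.
    """
    b = (vaddr >> (VABITS - 3)) & 0x7
    return (cacheattr >> (4 * b)) & 0xF


def truncate_to_physical(vaddr: int, pabits: int) -> int:
    """Map a virtual address to a physical one by dropping high-order bits."""
    return vaddr & ((1 << pabits) - 1)
```

For example, with CACHEATTR holding 0x76543210, addresses in the lowest eighth of the address space select field 0 and addresses in the highest eighth select field 7.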

Translation with the MMU Option is described in Section 4.6.5.
The core ISA supports both little-endian (PC compatible) and big-endian (Internet compatible) address models as a configuration parameter. In this manual:
- msbFirst = 1 is big-endian.
- msbFirst = 0 is little-endian.

3.5.2

Addressing Modes

The core instruction set implements the register + immediate addressing mode. The
core ISA does not implement auto-incrementing stores or indexed loads. However, such
addressing modes are possible for coprocessors. For example, the Floating-Point
Coprocessor Option implements indexed as well as immediate addressing modes.

3.5.3

Program Counter

The 32-bit program counter (PC) holds a byte address and can address 4 GB of virtual
memory for instructions. However, when the Windowed Register Option is configured,
the register-window call instructions only store the low 30 bits of the return address.
Register-window return instructions leave the two most-significant bits of the PC unchanged. Therefore, subroutines called using register window instructions must be
placed in the same 1 GB address region as the call.
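
The 1 GB restriction follows directly from the 30-bit return address, as the Python sketch below suggests. It only models the bit manipulation described in the paragraph above; the function names are hypothetical, and what the call instructions place in the upper two bits of the saved value is outside the scope of this sketch.

```python
def windowed_return_target(current_pc: int, saved_return: int) -> int:
    """Combine the two most-significant bits of the current PC with the
    low 30 bits of the saved return address."""
    return (current_pc & 0xC0000000) | (saved_return & 0x3FFFFFFF)


def same_1gb_region(addr_a: int, addr_b: int) -> bool:
    """True when both addresses share their two most-significant bits."""
    return (addr_a >> 30) == (addr_b >> 30)
```

If the caller and callee were in different 1 GB regions, the return would land in the callee's region rather than at the instruction following the call.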

3.5.4

Instruction Fetch

This section describes the execution loop of the processor using the notation of
Chapter 2. The individual instruction actions are represented by the Inst() statement,
and are detailed in subsequent sections. Two versions of this code are supported; one
for little-endian (msbFirst = 0) and one for big-endian (msbFirst = 1). This definition
is in terms of a hypothetical aligned 64-bit fetch, and should not be confused with the
fetch algorithms used by specific Xtensa ISA implementations. Aligned 32-bit fetch and
unaligned fetch are other possible implementations, which would produce logically
equivalent results, but with different timings. Also, actual implementations would be expected to access memory only once for each fetch unit, not once per instruction as in the
definition in Section 3.5.4.1 and Section 3.5.4.2.
The processor may speculatively fetch instructions following the address in the program
counter. To facilitate this and to allow flexibility in the implementation, software must not
position instructions within the last 64 bytes before a boundary where protection or
cache attributes change. This exclusion does not apply if one of the two protections or
attributes is invalid. Instructions may be placed within 64 bytes before a transition from
valid to invalid or from invalid to valid — but not before any other transition. In addition, if
the Windowed Register Option is implemented, software must not position instructions
within the last 16 bytes of a 2^30-byte (1 GB) boundary, to allow flexibility in the implementation
of the register-window call and return instructions. The operation of the processor in
these exclusion regions is not defined.
3.5.4.1 Little-Endian Fetch Semantics
Little-endian instruction fetch is defined as follows for a 64-bit fetch width (other fetch
sizes are similar):
checkInterrupts()                   -- see "Checking for Interrupts" on page 109
vAddr0 ← PC_{31..3}||3'b000         -- this example is 64-bit fetch
(pAddr0, attributes, cause) ← ftranslate(vAddr0, CRING)
if invalid(attributes) then
    EXCVADDR ← vAddr0
    Exception (cause)
    goto abortInstruction
endif
(mem0, error) ← ReadInstMemory(pAddr0, attributes, 8'b11111111)
                                    -- get start of instruction
if error then
    EXCVADDR ← vAddr0
    Exception (InstructionFetchErrorCause)
    goto abortInstruction
endif
b ← 0||PC_{2..0}
if b_{2} = 0 or b_{1} = 0 or (b_{0} = 0 and mem0_{b||3'b011} = 1) then
    -- instruction contained within a single fetch (64 bits in this example)
    inst ← (undefined^{64}||mem0)_{((b+2)||3'b111)..(b||3'b000)}
else
    -- instruction crosses a fetch boundary (64 bits in this example)
    vAddr1 ← vAddr0 + 32'd8
    (pAddr1, attributes, cause) ← ftranslate(vAddr1, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr1
        Exception (cause)
        goto abortInstruction
    endif
    (mem1, error) ← ReadInstMemory(pAddr1, attributes, 8'b11111111)
    if error then
        EXCVADDR ← vAddr1
        Exception (InstructionFetchErrorCause)
        goto abortInstruction
    endif
    inst ← (mem1||mem0)_{((b+2)||3'b111)..(b||3'b000)}
endif
-- now have a 24-bit instruction (8 bits undefined if 16-bit), break it into fields
op0 ← inst_{3..0}
t ← inst_{7..4}
s ← inst_{11..8}
r ← inst_{15..12}
op1 ← inst_{19..16}
op2 ← inst_{23..20}
imm8 ← inst_{23..16}
imm12 ← inst_{23..12}
imm16 ← inst_{23..8}
offset ← inst_{23..6}
n ← inst_{5..4}
m ← inst_{7..6}
-- compute nextPC (may be overridden by branches, etc.)
nextPC ← PC + (0^{30} || (if op0_{3} then 2'b10 else 2'b11))
if LCOUNT ≠ 0^{32} and CLOOPENABLE and nextPC = LEND then
    LCOUNT ← LCOUNT − 1
    nextPC ← LBEG
endif
-- execute instruction
Inst()
checkIcount ()
abortInstruction:
PC ← nextPC
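
The field extraction at the end of the little-endian definition can be mirrored in Python. The sketch below assumes the 24-bit instruction has already been assembled into an integer named inst with bit 0 as the least-significant bit; the function names are hypothetical and the memory fetch itself is not modeled.

```python
def decode_fields_le(inst: int) -> dict:
    """Split a 24-bit little-endian instruction word into the named fields."""
    return {
        "op0": inst & 0xF,                # inst 3..0
        "t": (inst >> 4) & 0xF,           # inst 7..4
        "s": (inst >> 8) & 0xF,           # inst 11..8
        "r": (inst >> 12) & 0xF,          # inst 15..12
        "op1": (inst >> 16) & 0xF,        # inst 19..16
        "op2": (inst >> 20) & 0xF,        # inst 23..20
        "imm8": (inst >> 16) & 0xFF,      # inst 23..16
        "imm12": (inst >> 12) & 0xFFF,    # inst 23..12
        "imm16": (inst >> 8) & 0xFFFF,    # inst 23..8
        "offset": (inst >> 6) & 0x3FFFF,  # inst 23..6
        "n": (inst >> 4) & 0x3,           # inst 5..4
        "m": (inst >> 6) & 0x3,           # inst 7..6
    }


def instruction_length(op0: int) -> int:
    """Bit 3 of op0 set selects the 2-byte increment, otherwise 3 bytes."""
    return 2 if op0 & 0x8 else 3
```

Several fields overlap by design (for example imm8 covers op1 and op2); which ones are meaningful depends on the instruction format.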

3.5.4.2 Big-Endian Fetch Semantics
Big-endian instruction fetch is defined as follows for a 64-bit fetch width (other fetch
sizes are similar):

checkInterrupts()                   -- see "Checking for Interrupts" on page 109
vAddr0 ← PC_{31..3}||3'b000         -- this example is 64-bit fetch
(pAddr0, attributes, cause) ← ftranslate(vAddr0, CRING)
if invalid(attributes) then
    EXCVADDR ← vAddr0
    Exception (cause)
    goto abortInstruction
endif
(mem0, error) ← ReadInstMemory(pAddr0, attributes, 8'b11111111)
                                    -- get start of instruction
if error then
    EXCVADDR ← vAddr0
    Exception (InstructionFetchErrorCause)
    goto abortInstruction
endif
b ← 0||PC_{2..0}
p0 ← b xor 1^{4}
p2 ← (b + 2) xor 1^{4}
if b_{2} = 0 or b_{1} = 0 or (b_{0} = 0 and (mem0||undefined^{64})_{p0||3'b111} = 1) then
    -- instruction contained within a single fetch (64 bits in this example)
    inst ← (mem0||undefined^{64})_{(p0||3'b111)..(p2||3'b000)}
else
    -- instruction crosses a fetch boundary (64 bits in this example)
    vAddr1 ← vAddr0 + 32'd8
    (pAddr1, attributes, cause) ← ftranslate(vAddr1, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr1
        Exception (cause)
        goto abortInstruction
    endif
    (mem1, error) ← ReadInstMemory(pAddr1, attributes, 8'b11111111)
    if error then
        EXCVADDR ← vAddr1
        Exception (InstructionFetchErrorCause)
        goto abortInstruction
    endif
    inst ← (mem0||mem1)_{(p0||3'b111)..(p2||3'b000)}
endif
-- now have a 24-bit instruction (8 bits undefined if 16-bit), break it into fields
op0 ← inst_{23..20}
t ← inst_{19..16}
s ← inst_{15..12}
r ← inst_{11..8}
op1 ← inst_{7..4}
op2 ← inst_{3..0}
imm8 ← inst_{7..0}
imm12 ← inst_{11..0}
imm16 ← inst_{15..0}
offset ← inst_{17..0}
n ← inst_{19..18}
m ← inst_{17..16}
-- compute nextPC (may be overridden by branches, etc.)
nextPC ← PC + (0^{30} || (if op0_{3} then 2'b10 else 2'b11))
if LCOUNT ≠ 0^{32} and CLOOPENABLE and nextPC = LEND then
    LCOUNT ← LCOUNT − 1
    nextPC ← LBEG
endif
-- execute instruction
Inst()
checkIcount ()
abortInstruction:
PC ← nextPC
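
The nextPC computation that closes both fetch definitions is the same in either byte order and can be sketched in Python. The function below is a hypothetical helper that returns the new PC together with the updated LCOUNT; branches and other instructions that override nextPC are not modeled.

```python
def next_pc(pc: int, op0: int, lcount: int, lbeg: int, lend: int,
            loop_enabled: bool) -> tuple:
    """Return (nextPC, LCOUNT) after one instruction that does not branch.

    The increment is 2 bytes when bit 3 of op0 is set, otherwise 3 bytes.
    When the incremented PC reaches LEND with a non-zero LCOUNT and loops
    enabled, LCOUNT is decremented and execution continues at LBEG.
    """
    nxt = (pc + (2 if op0 & 0x8 else 3)) & 0xFFFFFFFF
    if lcount != 0 and loop_enabled and nxt == lend:
        return lbeg, lcount - 1
    return nxt, lcount
```

Once LCOUNT reaches zero the comparison with LEND no longer redirects the PC, so execution falls through past the end of the loop.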

3.6

Reset

When the processor emerges from the reset state, it initializes many registers. The ISA
guarantees the values of some states after reset but leaves many others undefined.
Actual Xtensa processor implementations will often define the values of state left
undefined by the ISA. Chapter 5, "Processor State" on page 205 contains information
about each state value, including the value to which it is reset.

3.7

Exceptions and Interrupts

The core ISA does not include support for exceptions or interrupts. These are architectural options, described in Section 4.4. Software running on a processor that is configured without an Exception Option should be well tested, as such a processor will do
something unexpected if it encounters a software error.

3.8

Instruction Summary

Table 3–11 summarizes the core instructions included in all versions of the Xtensa architecture. The remainder of this section gives an overview of the core instructions.
Table 3–11. Core Instruction Summary

Instruction Category   Instructions¹                      Reference
Load                   L8UI, L16SI, L16UI, L32I, L32R     "Load Instructions" on page 33
Store                  S8I, S16I, S32I                    "Store Instructions" on page 36
Memory ordering        MEMW, EXTW                         "Memory Access Ordering" on page 39
Jump, Call             CALL0, CALLX0, RET                 "Jump and Call Instructions" on page 40
                       J, JX
Conditional branch     BALL, BNALL, BANY, BNONE           "Conditional Branch Instructions" on page 40
                       BBC, BBCI, BBS, BBSI
                       BEQ, BEQI, BEQZ
                       BNE, BNEI, BNEZ
                       BGE, BGEI, BGEU, BGEUI, BGEZ
                       BLT, BLTI, BLTU, BLTUI, BLTZ
Move                   MOVI, MOVEQZ, MOVGEZ,              "Move Instructions" on page 42
                       MOVLTZ, MOVNEZ
Arithmetic             ADDI, ADDMI,                       "Arithmetic Instructions" on page 43
                       ADD, ADDX2, ADDX4, ADDX8,
                       SUB, SUBX2, SUBX4, SUBX8,
                       NEG, ABS
Bitwise logical        AND, OR, XOR                       "Bitwise Logical Instructions" on page 44
Shift                  EXTUI, SRLI, SRAI, SLLI            "Shift Instructions" on page 44
                       SRC, SLL, SRL, SRA
                       SSL, SSR, SSAI, SSA8B, SSA8L
Processor control      RSR, WSR, XSR, RUR, WUR,           "Processor Control Instructions" on page 45
                       ISYNC, RSYNC, ESYNC, DSYNC,
                       NOP

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

3.8.1

Load Instructions

Load instructions form a virtual address by adding a base register and an 8-bit unsigned
offset. This virtual address is translated to a physical address if necessary. The physical
address is then used to access the memory system (often through a cache). The memory system returns a data item (either 32, 64, or 128 bits, depending on the configuration). The load instructions then extract the referenced data from that memory item and
either zero-extend or sign-extend the result to be written into a register. Unless the
Unaligned Exception Option is enabled, the processor does not handle misaligned data
or trap when a misaligned address is used; instead it simply loads the aligned data item
containing the computed virtual address. This allows the funnel shifter to be used with a
pair of loads to reference data on any byte address.
Only the loads L32I, L32I.N, and L32R can access InstRAM and InstROM locations.
Table 3–12 shows the loads in the Core Architecture.
Table 3–12. Load Instructions

Instruction   Format   Definition
L8UI          RRI8     8-bit unsigned load (8-bit offset)
L16SI         RRI8     16-bit signed load (8-bit shifted offset)
L16UI         RRI8     16-bit unsigned load (8-bit shifted offset)
L32I          RRI8     32-bit load (8-bit shifted offset)
L32R          RI16     32-bit load PC-relative (16-bit negative word offset)
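
As a rough illustration of the address formation for the RRI8 loads, the Python sketch below adds a base register value to the 8-bit unsigned offset, scaled by the access size for the "shifted offset" forms. It also shows the aligned address that is accessed when the Unaligned Exception Option is not enabled. The function names are hypothetical, and L32R (PC-relative) is not covered.

```python
def load_address(base: int, imm8: int, size_bytes: int) -> int:
    """Virtual address for an RRI8 load: base + (imm8 scaled by access size)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 is an 8-bit unsigned field")
    shift = {1: 0, 2: 1, 4: 2}[size_bytes]  # byte, 2-byte, 4-byte accesses
    return (base + (imm8 << shift)) & 0xFFFFFFFF


def aligned_address(vaddr: int, size_bytes: int) -> int:
    """Address of the aligned data item containing vaddr."""
    return vaddr & ~(size_bytes - 1) & 0xFFFFFFFF
```

Under these assumptions the reachable byte offsets are 0..255 for L8UI, 0..510 for the 16-bit loads, and 0..1020 for L32I.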

Because the operation of caches is implementation-specific, this manual does not provide a formal specification of cache access.
The following routines define the load instructions:
function ReadMemory (pAddr, attributes, bytemask)
    ReadMemory ← (Memory[pAddr], 0)        -- for now, no cache
endfunction ReadMemory

function Load8 (vAddr)
    (pAddr, attributes, cause) ← ltranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2..0} xor msbFirst^{3}
    (mem64, error) ← ReadMemory(pAddr_{31..3}, attributes, 0^{7-p}||1||0^{p})
    mem8 ← mem64_{(p||3'b111)..(p||3'b000)}
    Load8 ← (mem8, error)
endfunction Load8

function Load16 (vAddr)
    if UnalignedExceptionOption & vAddr_{0} ≠ 1'b0 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← ltranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2..1} xor msbFirst^{2}
    (mem64, error) ← ReadMemory(pAddr_{31..3}, attributes,
                                (2'b00)^{3-p}||2'b11||(2'b00)^{p})
    mem16 ← mem64_{(p||4'b1111)..(p||4'b0000)}
    Load16 ← (mem16, error)
endfunction Load16

function Load32 (vAddr)
    if UnalignedExceptionOption & vAddr_{1..0} ≠ 2'b00 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← ltranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2} xor msbFirst
    (mem64, error) ← ReadMemory(pAddr_{31..3}, attributes,
                                (4'b0000)^{1-p}||4'b1111||(4'b0000)^{p})
    mem32 ← mem64_{(p||5'b11111)..(p||5'b00000)}
    Load32 ← (mem32, error)
endfunction Load32

function Load32Ring (vAddr, ring)
    if UnalignedExceptionOption & vAddr_{1..0} ≠ 2'b00 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← ltranslate(vAddr, ring)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2} xor msbFirst
    (mem64, error) ← ReadMemory(pAddr_{31..3}, attributes,
                                (4'b0000)^{1-p}||4'b1111||(4'b0000)^{p})
    mem32 ← mem64_{(p||5'b11111)..(p||5'b00000)}
    Load32Ring ← (mem32, error)
endfunction Load32Ring

function Load64 (vAddr)
    if UnalignedExceptionOption & vAddr_{2..0} ≠ 3'b000 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← ltranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    Load64 ← ReadMemory(pAddr_{31..3}, attributes, 8'b11111111)
endfunction Load64
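
The lane selection performed by Load8, Load16, and Load32 can be sketched in Python. The sketch assumes the 64-bit memory item is held in an integer named mem64 whose bit 0 is the least-significant bit, and it models only the extraction step: translation, exceptions, and the byte mask passed to ReadMemory are left out. The function names are hypothetical.

```python
def load8(mem64: int, paddr: int, msb_first: bool) -> int:
    """Extract the addressed byte: p = pAddr 2..0 xor msbFirst (replicated)."""
    p = (paddr & 0x7) ^ (0x7 if msb_first else 0)
    return (mem64 >> (p * 8)) & 0xFF


def load16(mem64: int, paddr: int, msb_first: bool) -> int:
    """Extract the addressed 2-byte item: p = pAddr 2..1 xor msbFirst."""
    p = ((paddr >> 1) & 0x3) ^ (0x3 if msb_first else 0)
    return (mem64 >> (p * 16)) & 0xFFFF


def load32(mem64: int, paddr: int, msb_first: bool) -> int:
    """Extract the addressed 4-byte item: p = pAddr 2 xor msbFirst."""
    p = ((paddr >> 2) & 0x1) ^ (0x1 if msb_first else 0)
    return (mem64 >> (p * 32)) & 0xFFFFFFFF
```

The xor with msbFirst is what lets one definition serve both byte orders: in the big-endian case the lowest address selects the most-significant lane of the 64-bit item.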

3.8.2

Store Instructions

Store instructions are similar to load instructions in address formation. Store memory
errors are not synchronous exceptions; it is expected that the memory system will use
an interrupt to indicate an error on a store.
Only the stores S32I and S32I.N can access InstRAM.
Table 3–13 shows the stores in the Core Architecture.
Table 3–13. Store Instructions

Instruction   Format   Definition
S8I           RRI8     8-bit store (8-bit offset)
S16I          RRI8     16-bit store (8-bit shifted offset)
S32I          RRI8     32-bit store (8-bit shifted offset)

The following routines define the store instructions:
procedure WriteMemory (pAddr, attributes, bytemask, data64)
    -- for now, no cache
    if bytemask_{0} then
        Memory[pAddr]_{7..0} ← data64_{7..0}
    endif
    if bytemask_{1} then
        Memory[pAddr]_{15..8} ← data64_{15..8}
    endif
    if bytemask_{2} then
        Memory[pAddr]_{23..16} ← data64_{23..16}
    endif
    if bytemask_{3} then
        Memory[pAddr]_{31..24} ← data64_{31..24}
    endif
    if bytemask_{4} then
        Memory[pAddr]_{39..32} ← data64_{39..32}
    endif
    if bytemask_{5} then
        Memory[pAddr]_{47..40} ← data64_{47..40}
    endif
    if bytemask_{6} then
        Memory[pAddr]_{55..48} ← data64_{55..48}
    endif
    if bytemask_{7} then
        Memory[pAddr]_{63..56} ← data64_{63..56}
    endif
endprocedure WriteMemory

procedure Store8 (vAddr, data8)
    (pAddr, attributes, cause) ← stranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2..0} xor msbFirst^{3}
    WriteMemory(pAddr_{31..3}, attributes, 0^{7−p}||1||0^{p},
                undefined^{(7−p)||3'b000}||data8||undefined^{p||3'b000})
endprocedure Store8

procedure Store16 (vAddr, data16)
    if UnalignedExceptionOption & vAddr_{0} ≠ 1'b0 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← stranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2..1} xor msbFirst^{2}
    WriteMemory(pAddr_{31..3}, attributes, (2'b00)^{3-p}||2'b11||(2'b00)^{p},
                undefined^{(3-p)||4'b0000}||data16||undefined^{p||4'b0000})
endprocedure Store16

procedure Store32 (vAddr, data32)
    if UnalignedExceptionOption & vAddr_{1..0} ≠ 2'b00 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← stranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2} xor msbFirst
    WriteMemory(pAddr_{31..3}, attributes, (4'b0000)^{1-p}||4'b1111||(4'b0000)^{p},
                undefined^{(1-p)||5'b00000}||data32||undefined^{p||5'b00000})
endprocedure Store32

procedure Store32Ring (vAddr, data32, ring)
    if UnalignedExceptionOption & vAddr_{1..0} ≠ 2'b00 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← stranslate(vAddr, ring)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    p ← pAddr_{2} xor msbFirst
    WriteMemory(pAddr_{31..3}, attributes, (4'b0000)^{1-p}||4'b1111||(4'b0000)^{p},
                undefined^{(1-p)||5'b00000}||data32||undefined^{p||5'b00000})
endprocedure Store32Ring

procedure Store64 (vAddr, data64)
    if UnalignedExceptionOption & vAddr_{2..0} ≠ 3'b000 then
        EXCVADDR ← vAddr
        Exception (LoadStoreAlignmentCause)
        goto abortInstruction
    endif
    (pAddr, attributes, cause) ← stranslate(vAddr, CRING)
    if invalid(attributes) then
        EXCVADDR ← vAddr
        Exception (cause)
        goto abortInstruction
    endif
    WriteMemory(pAddr_{31..3}, attributes, 8'b11111111, data64)
endprocedure Store64
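
The byte-mask mechanism used by WriteMemory and Store8 can be sketched in Python. As with the load sketch, the 64-bit memory item is assumed to be an integer with bit 0 as the least-significant bit; the function names are hypothetical, and translation and exceptions are not modeled.

```python
def write_memory(old64: int, bytemask: int, data64: int) -> int:
    """Return the 64-bit item after replacing each byte lane whose
    bytemask bit is set with the matching lane of data64."""
    result = old64
    for lane in range(8):
        if (bytemask >> lane) & 1:
            lane_bits = 0xFF << (8 * lane)
            result = (result & ~lane_bits) | (data64 & lane_bits)
    return result & 0xFFFFFFFFFFFFFFFF


def store8_bytemask(paddr: int, msb_first: bool) -> int:
    """Byte mask for Store8: a single 1 in lane p, zeros elsewhere."""
    p = (paddr & 0x7) ^ (0x7 if msb_first else 0)
    return 1 << p
```

Lanes whose mask bit is clear keep their old contents, which is why the definitions can pass undefined bits in the unused positions of the data argument.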

3.8.3

Memory Access Ordering

Xtensa implementations can perform ordinary load and store operations in any order, as
long as loads return the last (as defined by program execution order) values stored to
each byte of the load address for a single processor and a simple memory. This flexibility is appropriate because most memory accesses require only these semantics and
some implementations may be able to execute programs significantly faster by exploiting non-program order memory access. The Xtensa ISA only requires that implementations follow a simplified version of the Release Consistency model¹ of memory access
ordering, although many implement stricter orderings for simplicity. For more on the
Xtensa memory order semantics, see "Multiprocessor Synchronization Option" on page
74.
However, some load and store instructions are executed not just to read and write storage, but to cause some side effects on some other part of the system (for example,
another processor or an I/O device). In C and C++, such variables must be declared
volatile. Loads and stores to such locations must be executed in program order. The
Xtensa ISA therefore provides an instruction that can be used to give program ordering
of load and store memory accesses.
The MEMW instruction causes all memory and cache accesses (loads, stores, acquires,
releases, prefetches, and cache operations, but not instruction fetches) before itself in
program order to access memory before all memory and cache accesses (but not instruction fetches) after. At least one MEMW should be executed in between every load or
store to a volatile variable. The Multiprocessor Synchronization Option provides
some additional instructions that also affect memory ordering in a more focused fashion.
MEMW has broader applications than these other instructions (for example, when reading
and writing device registers), but it also may affect performance more than the synchronization instructions.
The EXTW instruction is similar to MEMW, but it separates all external effects of instructions before the EXTW in program order from all external effects of instructions after the
EXTW in program order. EXTW is a superset of MEMW, and includes memory accesses in
what it orders.
Table 3–14 shows the memory ordering instructions in the Core Architecture.
Table 3–14. Memory Order Instructions

Instruction   Format   Definition
MEMW          RRR      Order memory accesses before with memory access after
EXTW          RRR      Order all external effects before with all external effects after

1.

Kourosh Gharachorloo, Dan Lenoski, James Laudon, Phillip Gibbons, Anoop Gupta, and John Hennessy, “Memory consistency and event ordering in scalable shared-memory multiprocessors,” Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 15–26, May 1990.

3.8.4

Jump and Call Instructions

The unconditional branch instruction, J, has a longer range (PC-relative) than conditional branches. Calls have a slightly longer range because they target 32-bit aligned
addresses. In addition, jump and call indirect instructions provide support for case
dispatch, function variables, and dynamic linking.
Table 3–15 shows the jump and call instructions.
Table 3–15. Jump and Call Instructions

Instruction   Format   Definition
CALL0         CALL     Call subroutine, PC-relative
CALLX0        CALLX    Call subroutine, address in register
J             CALL     Unconditional jump, PC-relative
JX            CALLX    Unconditional jump, address in register
RET           CALLX    Subroutine return: jump to return address. Used to return from a routine
                       called by CALL0/CALLX0.

3.8.5

Conditional Branch Instructions

The branch instructions in Table 3–16 compare a register operand against zero, an immediate, or a second register value and conditionally branch based on the result of the
comparison. Compound compare and branch instructions improve code density and
performance compared to other ISAs. All branches are PC-relative; the immediate field
contains the difference between the target PC and the current PC plus four. The use of a
PC-relative offset of minus three to zero is illegal and reserved for future use.
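
The relationship between the immediate field and the branch target can be sketched in Python. The sketch reads the sentence above as: immediate = target PC − (current PC + 4), held as a signed field. The field width parameter (8 bits by default), the function names, and the omission of the reserved-offset check are all assumptions of this example.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(pc: int, imm: int, bits: int = 8) -> int:
    """Target address: current PC plus four plus the signed immediate."""
    return (pc + 4 + sign_extend(imm, bits)) & 0xFFFFFFFF


def branch_immediate(pc: int, target: int, bits: int = 8) -> int:
    """Immediate field for a branch at pc to target; raises if out of range."""
    delta = target - (pc + 4)
    if not -(1 << (bits - 1)) <= delta < (1 << (bits - 1)):
        raise ValueError("target out of range for this branch format")
    return delta & ((1 << bits) - 1)
```

A wider immediate field extends the reachable range in both directions, which is why the BRI12 forms can reach farther than the 8-bit forms.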
Table 3–16. Conditional Branch Instructions

Instruction   Format   Definition
BEQZ          BRI12    Branch if equal to zero
BNEZ          BRI12    Branch if not equal to zero
BGEZ          BRI12    Branch if greater than or equal to zero
BLTZ          BRI12    Branch if less than zero
BEQI          BRI8     Branch if equal immediate¹
BNEI          BRI8     Branch if not equal immediate¹
BGEI          BRI8     Branch if greater than or equal immediate¹
BLTI          BRI8     Branch if less than immediate¹
BGEUI         BRI8     Branch if greater than or equal unsigned immediate²
BLTUI         BRI8     Branch if less than unsigned immediate²
BBCI          RRI8     Branch if bit clear immediate
BBSI          RRI8     Branch if bit set immediate
BEQ           RRI8     Branch if equal
BNE           RRI8     Branch if not equal
BGE           RRI8     Branch if greater than or equal
BLT           RRI8     Branch if less than
BGEU          RRI8     Branch if greater than or equal unsigned
BLTU          RRI8     Branch if less than unsigned
BANY          RRI8     Branch if any of masked bits set
BNONE         RRI8     Branch if none of masked bits set (All Clear)
BALL          RRI8     Branch if all of masked bits set
BNALL         RRI8     Branch if not all of masked bits set
BBC           RRI8     Branch if bit clear
BBS           RRI8     Branch if bit set

1. See Table 3–17 for encoding of signed immediate constants.

2. See Table 3–18 for encoding of unsigned immediate constants.

The encodings for the branch immediate constant (b4const) field and the branch
unsigned immediate constant (b4constu) fields, shown in Table 3–17 and Table 3–18,
specify one of the sixteen most frequent compare immediates for each type of constant.
Table 3–17. Branch Immediate (b4const) Encodings

Encoding   Decimal Value of Immediate   Hex Value of Immediate
0          -1                           32'hFFFFFFFF
1          1                            32'h00000001
2          2                            32'h00000002
3          3                            32'h00000003
4          4                            32'h00000004
5          5                            32'h00000005
6          6                            32'h00000006
7          7                            32'h00000007
8          8                            32'h00000008
9          10                           32'h0000000A
10         12                           32'h0000000C
11         16                           32'h00000010
12         32                           32'h00000020
13         64                           32'h00000040
14         128                          32'h00000080
15         256                          32'h00000100
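
Table 3–17 and Table 3–18 are simple 16-entry lookup tables, which the Python sketch below transcribes. The list contents come from the two tables; the helper names and the choice to return None for values that have no encoding are assumptions of this example.

```python
# Index = 4-bit encoding, value = immediate (transcribed from Table 3-17).
B4CONST = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 16, 32, 64, 128, 256]

# Index = 4-bit encoding, value = immediate (transcribed from Table 3-18).
B4CONSTU = [32768, 65536, 2, 3, 4, 5, 6, 7, 8, 10, 12, 16, 32, 64, 128, 256]


def encode_b4const(value: int):
    """Return the 4-bit encoding for a signed compare immediate, or None."""
    return B4CONST.index(value) if value in B4CONST else None


def encode_b4constu(value: int):
    """Return the 4-bit encoding for an unsigned compare immediate, or None."""
    return B4CONSTU.index(value) if value in B4CONSTU else None
```

A compare value that is not in the relevant table cannot be expressed in the immediate branch forms and would need to be placed in a register and compared with one of the register-register branches instead.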

Table 3–18. Branch Unsigned Immediate (b4constu) Encodings

Encoding   Decimal Value of Immediate   Hex Value of Immediate
0          32768                        32'h00008000
1          65536                        32'h00010000
2          2                            32'h00000002
3          3                            32'h00000003
4          4                            32'h00000004
5          5                            32'h00000005
6          6                            32'h00000006
7          7                            32'h00000007
8          8                            32'h00000008
9          10                           32'h0000000A
10         12                           32'h0000000C
11         16                           32'h00000010
12         32                           32'h00000020
13         64                           32'h00000040
14         128                          32'h00000080
15         256                          32'h00000100

3.8.6

Move Instructions

MOVI sets a register to a constant encoded in the instruction. The conditional move
instructions shown in Table 3–19 are used for branch avoidance.

Table 3–19. Move Instructions

Instruction   Format   Definition
MOVI          RRI8     Load register with 12-bit signed constant
MOVEQZ        RRR      Conditional move if zero
MOVNEZ        RRR      Conditional move if non-zero
MOVLTZ        RRR      Conditional move if less than zero
MOVGEZ        RRR      Conditional move if greater than or equal to zero

3.8.7

Arithmetic Instructions

The arithmetic instructions that Table 3–20 lists include add and subtract with a small
shift for address calculations and for synthesizing constant multiplies. The ADDMI instruction is included for extending the range of load and store instructions.
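
Two of the uses mentioned above can be sketched in Python: synthesizing small constant multiplies from shifted add and subtract, and splitting a large load offset into an ADDMI part and a load-offset part. The helper names are hypothetical, and the splitting policy shown (low 8 bits to the load, the rest to ADDMI, for word accesses) is only one possible choice, assumed here for illustration.

```python
MASK32 = 0xFFFFFFFF


def addx(s: int, t: int, shift: int) -> int:
    """ADDX2/4/8 with shift = 1/2/3: (s shifted left) + t, modulo 2^32."""
    return ((s << shift) + t) & MASK32


def subx(s: int, t: int, shift: int) -> int:
    """SUBX2/4/8 with shift = 1/2/3: (s shifted left) - t, modulo 2^32."""
    return ((s << shift) - t) & MASK32


def split_offset(offset: int) -> tuple:
    """Split a word-aligned byte offset into (ADDMI constant, load offset).

    The ADDMI constant is a signed 8-bit value shifted left by 8, so it is a
    multiple of 256 in -32768..32512; the remainder is 0..252.
    """
    if offset % 4:
        raise ValueError("word accesses need a multiple-of-4 offset")
    high = offset & ~0xFF
    low = offset & 0xFF
    if not -32768 <= high <= 32512:
        raise ValueError("offset outside the range reachable with one ADDMI")
    return high, low
```

For example, addx(x, x, 2) computes 5·x and subx(x, x, 3) computes 7·x, each corresponding to a single instruction.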
Table 3–20. Arithmetic Instructions

Instruction   Format   Definition
ADD           RRR      Add two registers
                       AR[r] ← AR[s] + AR[t]
ADDX2         RRR      Add register to register shifted by 1
                       AR[r] ← (AR[s]_{30..0} || 0) + AR[t]
ADDX4         RRR      Add register to register shifted by 2
                       AR[r] ← (AR[s]_{29..0} || 0^{2}) + AR[t]
ADDX8         RRR      Add register to register shifted by 3
                       AR[r] ← (AR[s]_{28..0} || 0^{3}) + AR[t]
SUB           RRR      Subtract two registers
                       AR[r] ← AR[s] − AR[t]
SUBX2         RRR      Subtract register from register shifted by 1
                       AR[r] ← (AR[s]_{30..0} || 0) − AR[t]
SUBX4         RRR      Subtract register from register shifted by 2
                       AR[r] ← (AR[s]_{29..0} || 0^{2}) − AR[t]
SUBX8         RRR      Subtract register from register shifted by 3
                       AR[r] ← (AR[s]_{28..0} || 0^{3}) − AR[t]
NEG           RRR      Negate
                       AR[r] ← 0 − AR[t]
ABS           RRR      Absolute value
                       AR[r] ← if AR[s]_{31} then 0 − AR[s] else AR[s]
ADDI          RRI8     Add signed constant to register
                       AR[t] ← AR[s] + ((imm8_{7})^{24}||imm8)
ADDMI         RRI8     Add signed constant shifted by 8 to register
                       AR[t] ← AR[s] + ((imm8_{7})^{16}||imm8||0^{8})

3.8.8

Bitwise Logical Instructions

The bitwise logical instructions in Table 3–21 provide a core set from which other logicals can be synthesized. Immediate forms of these instructions are not provided because the immediate would be only four bits.
Table 3–21. Bitwise Logical Instructions

Instruction   Format   Definition
AND           RRR      Bitwise logical AND
                       AR[r] ← AR[s] and AR[t]
OR            RRR      Bitwise logical OR
                       AR[r] ← AR[s] or AR[t]
XOR           RRR      Bitwise logical exclusive OR
                       AR[r] ← AR[s] xor AR[t]

3.8.9

Shift Instructions

The shift instructions in Table 3–22 provide a rich set of operations while avoiding critical
timing paths. See Section 3.3.2 on page 25 for more information.
Table 3–22. Shift Instructions

Instruction   Format   Definition
EXTUI         RRR      Extract unsigned field immediate
                       Shifts right by 0..31 and ANDs with a mask of 1..16 ones
                       The operation of this instruction when the number of mask bits exceeds the number of
                       significant bits remaining after the shift is undefined and reserved for future use.
SLLI          RRR      Shift left logical immediate by 1..31 bit positions (see page 525 for encoding of the
                       immediate value).
SRLI          RRR      Shift right logical immediate by 0..15 bit positions
                       There is no SRLI for shifts ≥ 16; use EXTUI instead.
SRAI          RRR      Shift right arithmetic immediate by 0..31 bit positions
SRC           RRR      Shift right combined (a funnel shift with shift amount from SAR)
                       The two source registers are catenated, shifted, and the least significant 32 bits
                       returned.
SRA           RRR      Shift right arithmetic (shift amount from SAR)
SLL           RRR      Shift left logical
                       (Funnel shift AR[s] and 0 by shift amount from SAR)
SRL           RRR      Shift right logical
                       (Funnel shift 0 and AR[s] by shift amount from SAR)
SSA8B         RRR      Set shift amount register (SAR) for big-endian byte align
                       The t field must be zero.
SSA8L         RRR      Set shift amount register (SAR) for little-endian byte align
SSR           RRR      Set shift amount register (SAR) for shift right logical
                       This instruction differs from WSR to SAR in that only the five least significant bits of the
                       register are used.
SSL           RRR      Set shift amount register (SAR) for shift left logical
SSAI          RRR      Set shift amount register (SAR) immediate
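
The funnel-shift behavior behind SRC, SLL, and SRL, and the EXTUI extraction, can be sketched in Python. The function names are hypothetical and each takes the shift amount as an argument rather than reading a modeled SAR. The sketch assumes that the left-shift setup (SSL) places 32 minus the shift count in SAR, so that a funnel shift of AR[s] and 0 yields a left shift; that detail is not spelled out in the table itself.

```python
MASK32 = 0xFFFFFFFF


def src(s: int, t: int, sar: int) -> int:
    """Funnel shift: catenate s (high) and t (low), shift right by sar,
    and return the least-significant 32 bits."""
    return (((s << 32) | t) >> sar) & MASK32


def sll(s: int, count: int) -> int:
    """Left shift by count, expressed as a funnel shift of s and 0."""
    return src(s, 0, 32 - count)  # assumed SAR value after the SSL setup


def srl(s: int, count: int) -> int:
    """Logical right shift by count, expressed as a funnel shift of 0 and s."""
    return src(0, s, count)


def extui(value: int, shift: int, maskbits: int) -> int:
    """Shift right by 0..31, then AND with a mask of 1..16 ones."""
    return (value >> shift) & ((1 << maskbits) - 1)
```

For instance, extui(x, 16, 16) produces the same result as a logical right shift by 16, which is the substitution the table suggests for shift amounts that SRLI cannot encode.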

3.8.10 Processor Control Instructions
Table 3–23 contains processor control instructions. The RSR.*, WSR.*, and XSR.*
instructions read, write, and exchange Special Registers for both the Core Architecture
and the architectural options, as detailed in Table 5–128 on page 209. They save and
restore context, process interrupts and exceptions, and control address translation and
attributes. The XSR.* instruction reads and writes both the Special Register, and
AR[t]. It combines the RSR.* and WSR.* operations to exchange the Special Register
with AR[t]. The XSR.* instruction is not present in T1030 and earlier processors.
The xSYNC instructions synchronize Special Register writes and their uses. See
Chapter 5 for more information on how xSYNC instructions are used. These synchronization instructions are separate from the synchronization instructions used for multiprocessors, which are described in Section 4.3.12 on page 74.
On some Xtensa implementations the latency of RSR is greater than one cycle, and so it
is advantageous to schedule uses of the RSR result away from the RSR to avoid an
interlock.
The point at which WSR.* or XSR.* to most Special Registers affects subsequent instructions is not defined (SAR and ACC are exceptions). In these cases, Table 5–128 on
page 209 explains how to ensure the effects are seen by a particular point in the instruction stream (typically involving the use of one of the ISYNC, RSYNC, ESYNC, or DSYNC
instructions). A WSR.* or XSR.* followed by a RSR.* of the same register must be separated by an ESYNC instruction to guarantee the value written is read back. A WSR.PS or
XSR.PS followed by a RSIL also requires an ESYNC instruction.
Table 3–23. Processor Control Instructions
Instruction  Format  Definition
RSR          RSR     Read Special Register
WSR          RSR     Write Special Register
XSR          RSR     Exchange Special Register (combined RSR and WSR).
                     Not present in T1030 and earlier processors.
ISYNC        RRR     Instruction fetch synchronize: Waits for all previously fetched load,
                     store, cache, and special register write instructions that affect
                     instruction fetch to be performed before fetching the next instruction.
RSYNC        RRR     Instruction register synchronize: Waits for all previously fetched WSR
                     and XSR instructions to be performed before interpreting the register
                     fields of the next instruction. This operation is also performed as
                     part of ISYNC.
ESYNC        RRR     Register value synchronize: Waits for all previously fetched WSR and
                     XSR instructions to be performed before the next instruction uses any
                     register values. This operation is also performed as part of ISYNC
                     and RSYNC.
DSYNC        RRR     Load/store synchronize: Waits for all previously fetched WSR and XSR
                     instructions to be performed before interpreting the virtual address
                     of the next load or store instruction. This operation is also
                     performed as part of ISYNC, RSYNC, and ESYNC.
NOP          RRR     No operation
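The "also performed as part of" notes in Table 3–23 describe a nesting of the four synchronization instructions. The following Python sketch encodes only that nesting; the dictionary and function names are hypothetical, and it does not say which instruction a particular Special Register write needs (Table 5–128 covers that).

```python
# Synchronizations performed by each instruction, as stated in Table 3-23.
SYNC_INCLUDES = {
    "ISYNC": {"ISYNC", "RSYNC", "ESYNC", "DSYNC"},
    "RSYNC": {"RSYNC", "ESYNC", "DSYNC"},
    "ESYNC": {"ESYNC", "DSYNC"},
    "DSYNC": {"DSYNC"},
}

def sufficient_syncs(required):
    """Return, in alphabetical order, every instruction that performs `required`."""
    if required not in SYNC_INCLUDES:
        raise ValueError("unknown synchronization: " + required)
    return sorted(name for name, done in SYNC_INCLUDES.items() if required in done)
```

For example, a requirement for ESYNC is also met by RSYNC or ISYNC, while a requirement for ISYNC is met only by ISYNC itself.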


4. Architectural Options

This chapter defines the Xtensa ISA options. Each option adds some associated configuration resources and capabilities. Some options are dependent on the implementation
of other options. These interdependencies, if any, are listed as Prerequisites at the beginning of the description of each option. The additional parameters required to define
the option, the new state and instructions added by the option, and any other new features (such as exceptions) added by the option are listed and the operation of the option
is described.

4.1 Overview of Options

Section 4.2 provides a synopsis of the Core Architecture (covered in more detail in
Chapter 3) in a format similar to the format used for the options. The Instruction Set options available with an Xtensa processor are listed in five groups below.
"Options for Additional Instructions" on page 53 lists options whose primary function is
to add new instructions to the processor’s instruction set, including:
- The Code Density Option on page 53 adds 16-bit encodings of the most frequently used 24-bit instructions for higher code density.
- The Loop Option on page 54 adds a “zero overhead loop,” which requires neither the extra instruction for a branch at the end of a loop nor the additional delay slots that would result from the taken branch. A few fixed cycles of overhead mean that each iteration of the loop pays no cost for the loop branch.
- The Extended L32R Option on page 56 allows an additional choice in the addressing mode of the L32R instruction.
- The 16-bit Integer Multiply Option on page 57 adds signed and unsigned 16x16 multiplication instructions that produce 32-bit results.
- The 32-bit Integer Multiply Option on page 58 adds signed and unsigned 32x32 multiplication instructions that produce high and low parts of a 64-bit result.
- The 32-bit Integer Divide Option on page 59 implements signed and unsigned 32-bit division and remainder instructions.
- The MAC16 Option on page 60 adds multiply-accumulate functions that are useful in digital signal processing (DSP).
- The Miscellaneous Operations Option on page 62 provides a series of instructions useful for some applications, but which are not necessary for others. By making these optional, the Xtensa architecture allows the designer to choose only those additional instructions that benefit the application.
- The Coprocessor Option on page 63 allows the grouping of certain states in the processor and adds an enable bit, which allows for lazy context switching.
- The Boolean Option on page 65 adds a set of Boolean registers, which can be set and cleared by user instructions and that can be used as branch conditions.
- The Floating-Point Coprocessor Option on page 67 adds a floating-point unit for single precision floating point.
- The Multiprocessor Synchronization Option on page 74 adds acquire and release instructions with specific memory ordering relationships to the other Xtensa memory access instructions.
- The Conditional Store Option on page 77 adds a compare and swap type atomic operation to the instruction set.

"Options for Interrupts and Exceptions" on page 82 lists options whose primary function
is to add and control exceptions and interrupts, including:
- The Exception Option on page 82 adds the basic functions needed for the processor to take exceptions.
- The Relocatable Vector Option on page 98 adds the ability for the exception vectors to be relocated at run time.
- The Unaligned Exception Option on page 99 adds an exception for memory accesses that are not aligned by their own size. They may then be emulated in software.
- The Interrupt Option on page 100 builds upon the Exception Option to add a flexible software prioritized interrupt system.
- The High-Priority Interrupt Option on page 106 adds a hardware prioritized interrupt system for higher performance.
- The Timer Interrupt Option on page 110 adds timers and interrupts, which are caused when the timer expires.

"Options for Local Memory" on page 111 lists options whose primary function is to add
different kinds of memory, such as RAMs, ROMs, or caches to the processor, including:
- The Instruction Cache Option on page 115 adds an interface for a direct-mapped or set-associative instruction cache.
- The Instruction Cache Test Option on page 116 adds instructions to access the instruction cache tag and data.
- The Instruction Cache Index Lock Option on page 117 adds per-index locking to the instruction cache.
- The Data Cache Option on page 118 adds an interface for a direct-mapped or set-associative data cache.
- The Data Cache Test Option on page 121 adds instructions to access the data cache tag.
- The Data Cache Index Lock Option on page 122 adds per-index locking to the data cache.
- The Instruction RAM Option on page 124 adds an interface for a local instruction memory.
- The Instruction ROM Option on page 125 adds an interface for a local instruction Read Only Memory.
- The Data RAM Option on page 126 adds an interface for a local data memory.
- The Data ROM Option on page 126 adds an interface for a local data read-only memory.
- The XLMI Option on page 127 adds an interface with the timing of the local memory interfaces, but with a full enough signal set to support non-memory devices.
- The Hardware Alignment Option on page 128 adds the ability for the hardware to handle unaligned accesses to data memory.
- The Memory ECC/Parity Option on page 128 provides the ability to add parity or ECC to cache and local memories.

"Options for Memory Protection and Translation" on page 138 lists options whose primary function is to control access to and manage memory, including:
- The Region Protection Option on page 150 adds protection on memory in eight segments.
- The Region Translation Option on page 156 adds protection on memory in eight segments and allows translations from one segment to another.
- The MMU Option on page 158 adds full paging virtual memory management hardware.

"Options for Other Purposes" on page 179 lists options that do not fall conveniently into
one of the other groups, including:
- The Windowed Register Option on page 180 adds additional physical AR registers and a mapping mechanism, which together lead to smaller code size and higher performance.
- The Processor Interface Option on page 194 adds a bus interface used by memory accesses to locations other than local memories. It is used for cache misses for cacheable addresses as well as for cache bypass memory accesses.
- The Miscellaneous Special Registers Option on page 195 provides one to four scratch registers within the processor readable and writable by RSR, WSR, and XSR, which may be used for application-specific exceptions and interrupt processing tasks.
- The Thread Pointer Option on page 196 provides a Special Register that may be used for a thread pointer.
- The Processor ID Option on page 196 adds a register that software can use to distinguish which of several processors it is running on.
- The Debug Option on page 197 adds instruction-counting and breakpoint exceptions for debugging by software or external hardware.
- The Trace Port Option on page 203 adds architectural features for supporting hardware tracing of the processor.

The functionality of a fairly complete micro-controller is provided by enabling the Code
Density Option, the Exception Option, the Interrupt Option, the High-Priority Interrupt
Option, the Timer Interrupt Option, the Debug Option, and the Windowed Register Option.
The primary reason to disable the Code Density Option (16-bit instructions) is to provide
maximum opcode space for extensions. The primary reason to disable the other options
listed above is to reduce the processor core area.
The choice of Cache, RAM, or ROM Options for instruction and data depends on the
characteristics of the application. RAM is not as flexible as Cache, but it requires slightly
less area because tags are not required. RAM may also be desirable when performance
predictability is required. ROM is even less flexible than RAM, but avoids the need to
load the memory and offers some protection from program errors and tampering.

4.2 Core Architecture

The Core Architecture is not an option, but rather a minimum base of processor state
and instructions, which allows system software and compiled code to run on all Xtensa
implementations. There are no prerequisites or incompatible options, but the tables normally used to show option additions are used here to give the base set. Table 4–24
through Table 4–26 show Core Architecture processor configurations, processor state,
and instructions.
Table 4–24. Core Architecture Processor-Configurations
Parameter  Description                     Valid Values
msbFirst   Byte order for memory accesses  0 or 1
                                           0 → Little-endian (least significant bit first)
                                           1 → Big-endian (most significant bit first)


Table 4–25. Core Architecture Processor-State
Register Mnemonic  Quantity  Width (bits)  Register Name          R/W  Special Register Number¹
AR                 16        32            Address register file  R/W  -
PC                 1         32            Program counter        -    -
SAR                1         6             Shift amount register  R/W  3

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.

Table 4–26. Core Architecture Instructions
Instruction¹   Format  Definition
ABS            RRR     Absolute value
ADD            RRR     Add two registers
ADDI           RRI8    Add a register and an 8-bit immediate
ADDMI          RRI8    Add a register and a shifted 8-bit immediate
ADDX2/4/8      RRR     Add two registers with one of them shifted left by one/two/three
AND            RRR     Bitwise AND of two registers
BALL/BANY      RRI8    Branch if all/any bits specified by a mask in one register are set in another register
BBC/BBS        RRI8    Branch if the bit specified by another register is clear/set
BBCI/BBSI      RRI8    Branch if the bit specified by an immediate is clear/set
BEQ            RRI8    Branch if a register equals another register
BEQI           RRI8    Branch if a register equals an encoded constant
BEQZ           BRI12   Branch if a register equals zero
BGE            RRI8    Branch if one register is greater than or equal to a register
BGEI           RRI8    Branch if one register is greater than or equal to an encoded constant
BGEU           RRI8    Branch if one register is greater or equal to a register as unsigned
BGEUI          BRI8    Branch if one register is greater or equal to an encoded constant as unsigned
BGEZ           BRI12   Branch if a register is greater than or equal to zero
BLT            RRI8    Branch if one register is less than a register
BLTI           BRI8    Branch if one register is less than an encoded constant
BLTU           RRI8    Branch if one register is less than a register as unsigned
BLTUI          RRI8    Branch if one register is less than an encoded constant as unsigned
BLTZ           BRI12   Branch if a register is less than zero
BNALL/BNONE    RRI8    Branch if some/all bits specified by a mask in a register are clear in another register
BNE            RRI8    Branch if a register does not equal a register
BNEI           RRI8    Branch if a register does not equal an encoded constant
BNEZ           BRI12   Branch if a register does not equal zero
CALL0          CALL    Call subroutine at PC plus offset, place return address in A0
CALLX0         CALLX   Call subroutine register specified location, place return address in A0
DSYNC/ESYNC    RRR     Wait for data memory/execution related changes to resolve
EXTUI          RRR     Extract field specified by immediates from a register
EXTW           RRR     Wait for any possible external ordering requirement (added in RA-2004.1)
ISYNC          RRR     Wait for instruction fetch related changes to resolve
J              CALL    Jump to PC plus offset
JX             CALLX   Jump to register specified location
L8UI           RRI8    Load zero extended byte
L16SI/L16UI    RRI8    Load sign/zero extended 16-bit quantity
L32I           RRI8    Load 32-bit quantity
L32R           RI16    Load literal at offset from PC (or from LITBASE with the Extended L32R Option)
MEMW           RRR     Wait for any possible memory ordering requirement
MOVEQZ         RRR     Move register if the contents of a register is zero
MOVGEZ         RRR     Move register if the contents of a register is greater than or equal to zero
MOVI           RRI8    Move a 12-bit immediate to a register
MOVLTZ         RRR     Move register if the contents of a register is less than zero
MOVNEZ         RRR     Move register if the contents of a register is not zero
NEG            RRR     Negate a register
NOP            RRR     No operation (added as a full instruction in RA-2004.1)
OR             RRR     Bitwise OR two registers
RET            CALLX   Subroutine return through A0
RSR.*          RSR     Read a Special Register
RSYNC          RRR     Wait for dispatch related changes to resolve
S8I/S16I/S32I  RRI8    Store byte/16-bit quantity/32-bit quantity
SLL/SLLI       RRR     Shift left logical by SAR/immediate
SRA/SRAI       RRR     Shift right arithmetic by SAR/immediate
SRC            RRR     Shift right combined by SAR with two registers as input and one as output
SRL/SRLI       RRR     Shift right logical by SAR/immediate
SSA8B/SSA8L    RRR     Use low 2-bits of address register to prepare SAR for SRC assuming big/little endian
SSAI           RRR     Set SAR to immediate value
SSL/SSR        RRR     Set SAR from register for left/right shift
SUB            RRR     Subtract two registers
SUBX2/4/8      RRR     Subtract two registers with the un-negated one shifted left by one/two/three
WSR.*          RSR     Write a special register
XOR            RRR     Bitwise XOR two registers
XSR.*          RRR     Read and write a special register in an exchange (added in T1040)

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3 Options for Additional Instructions

The options in this section have the primary function of adding new instructions to the
processor’s instruction set. The new instructions cover a variety of purposes including
new architectural capabilities, higher performance on existing capabilities, and smaller
code.

4.3.1 Code Density Option

This option adds 16-bit encodings of the most frequently used 24-bit instructions. When
a 24-bit instruction can be encoded into a 16-bit form, the code-size savings is significant.
- Prerequisites: None
- Incompatible options: None
- Compatibility note: The additions made by this option were once considered part of the core architecture, thus compatibility with binaries for previous hardware might require the use of this option. Many available third-party software packages including some currently supported operating systems require the Code Density Option.

4.3.1.1 Code Density Option Architectural Additions
Table 4–27 shows this option’s architectural additions.


Table 4–27. Code Density Option Instruction Additions
Instruction¹  Format  Definition
ADD.N         RRRN    Add two registers (same as ADD instruction but with a 16-bit encoding).
ADDI.N        RRRN    Add register and immediate (-1 and 1..15).
BEQZ.N        RI16    Branch if register is zero with a 6-bit unsigned offset (forward only).
BNEZ.N        RI16    Branch if register is non-zero with a 6-bit unsigned offset (forward only).
BREAK.N²      RRRN    This instruction is the same as BREAK but with a 16-bit encoding.
L32I.N        RRRN    Load 32 bits, 4-bit offset
MOV.N         RRRN    Narrow move
MOVI.N        RI7     Load register with immediate (-32..95).
NOP.N         RRRN    This instruction performs no operation. It is typically used for instruction alignment.
RET.N         RRRN    The same as RET but with a 16-bit encoding.
RETW.N³       RRRN    The same as RETW but with a 16-bit encoding.
S32I.N        RRRN    Store 32 bits, 4-bit offset

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
2. Exists only if the Debug Option described in Section 4.7.6 on page 197 is configured.
3. Exists only if the Windowed Register Option described in Section 4.7.1 on page 180 is configured.
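The unusual immediate ranges in Table 4–27 (-1 and 1..15 for ADDI.N, -32..95 for MOVI.N) come from how the short immediate fields are decoded. The Python sketch below shows one decoding that produces exactly those ranges. The function names are hypothetical, and the specific mappings (field value 0 meaning -1 for ADDI.N, field values 96..127 meaning -32..-1 for MOVI.N) are assumptions based on the instruction descriptions in Chapter 6, which remains the authority.

```python
def decode_addi_n_imm(imm4):
    """Map the 4-bit ADDI.N field to its operand value (assumed: 0 encodes -1)."""
    if not 0 <= imm4 <= 15:
        raise ValueError("imm4 must be in 0..15")
    return -1 if imm4 == 0 else imm4

def decode_movi_n_imm(imm7):
    """Map the 7-bit MOVI.N field to its operand value (assumed: 96..127 are negative)."""
    if not 0 <= imm7 <= 127:
        raise ValueError("imm7 must be in 0..127")
    return imm7 - 128 if imm7 >= 96 else imm7
```

Under these assumptions every field value maps to a distinct operand, so the 7-bit MOVI.N field covers all 128 values from -32 through 95.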

4.3.1.2 Branches
For some implementations, branches to an instruction that crosses a 32-bit memory
boundary may suffer a small performance penalty. The compiler (or assembler) is expected to align performance-critical branch targets such that their byte address is 0 mod
4, 1 mod 4, or for 16-bit instructions, 2 mod 4. This can be accomplished either by converting some previous 16-bit-encoded instructions back to their 24-bit form, or by inserting a 16-bit NOP.N.
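The alignment rule above amounts to checking whether an instruction's bytes straddle a 32-bit word. A minimal Python sketch of that check follows; the function name is hypothetical and addresses are treated as plain byte addresses.

```python
def crosses_word_boundary(addr, length_bytes):
    """True if an instruction of length_bytes starting at addr spans two 32-bit words."""
    first_word = addr // 4
    last_word = (addr + length_bytes - 1) // 4
    return first_word != last_word

def good_target_offsets(length_bytes):
    """Byte offsets (mod 4) at which a branch target does not cross a word boundary."""
    return [off for off in range(4) if not crosses_word_boundary(off, length_bytes)]
```

For a 24-bit (3-byte) target this yields offsets 0 and 1, and for a 16-bit (2-byte) target it yields 0, 1, and 2, matching the rule stated in the text.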

4.3.2 Loop Option

The Loop Option adds the ability for the processor to execute a zero-overhead loop
where the number of iterations (not counting an early exit) can be determined prior to
entering the loop. This capability is useful in digital signal processing applications where
the overhead of a branch in a heavily used loop is unacceptable. A single loop instruction defines both the beginning and end of a loop, as well as a count of how many times
the loop will execute.
- Prerequisites: None
- Incompatible options: None
- Compatibility note: The additions made by this option were once considered part of the core architecture, thus compatibility with binaries for previous hardware might require the use of this option. Many available third-party software packages including some currently supported operating systems require the Loop Option.

4.3.2.1 Loop Option Architectural Additions
Table 4–28 and Table 4–29 show this option’s architectural additions.
Table 4–28. Loop Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name  R/W  Special Register Number¹
LBEG               1         32            Loop begin     R/W  0
LEND               1         32            Loop end       R/W  1
LCOUNT             1         32            Loop count     R/W  2

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.

LBEG and LEND are undefined after processor reset. LCOUNT is initialized to zero after
processor reset.
Table 4–29. Loop Option Instruction Additions
Instruction¹  Format  Definition
LOOP          BRI8    Set up a zero-overhead loop by setting LBEG, LEND, and LCOUNT special
                      registers.
LOOPGTZ       BRI8    Set up a zero-overhead loop by setting LBEG, LEND, and LCOUNT special
                      registers. Skip loop if LCOUNT is not positive.
LOOPNEZ       BRI8    Set up a zero-overhead loop by setting LBEG, LEND, and LCOUNT special
                      registers. Skip loop if LCOUNT is zero.

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3.2.2 Restrictions on Loops
There is a restriction on instruction alignment for zero-overhead loops. The first instruction after the LOOP instruction, which begins at the address written to LBEG by the LOOP
instruction, must be entirely contained within a naturally aligned, power-of-two-sized unit.
The size of that unit is the smallest power of two that is equal to or greater than the
instruction length, but not less than 4 bytes. Thus a 16-bit instruction, if it is the first in a
loop, may be at 0 mod 4, 1 mod 4, or 2 mod 4. A 24-bit instruction, if it is the first in a
loop, may be at 0 mod 4 or at 1 mod 4. As an example of a potential larger instruction, a
64-bit instruction must be aligned at 0 mod 8.
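The restriction on the first instruction of a loop can be checked mechanically. The following Python sketch is one way to express it; the function names are hypothetical and instruction lengths are given in bytes.

```python
def loop_unit_size(length_bytes):
    """Smallest power of two that is >= the instruction length and >= 4 bytes."""
    size = 4
    while size < length_bytes:
        size *= 2
    return size

def first_loop_instruction_ok(addr, length_bytes):
    """True if the instruction at addr fits entirely inside one naturally aligned unit."""
    unit = loop_unit_size(length_bytes)
    return addr // unit == (addr + length_bytes - 1) // unit
```

Evaluating first_loop_instruction_ok over a range of addresses reproduces the cases listed above: 16-bit instructions at 0, 1, or 2 mod 4, 24-bit instructions at 0 or 1 mod 4, and a 64-bit instruction only at 0 mod 8.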


The last instruction of the loop must not be a call, ISYNC, WAITI, or RSR.LCOUNT. If the
last instruction of the loop is a taken branch, then the value of LCOUNT is undefined.
Thus, a taken branch may be used to exit the loop (in which case the value of LCOUNT is
irrelevant), but not to iterate within the loop.
4.3.2.3 Loops Disabled During Exceptions
Loops are disabled when PS.EXCM is set in Xtensa Exception Architecture 2 and above.
This prevents program code from maliciously or accidentally setting LEND to an address
in an exception handler and then causing the exception, thereby transitioning to Ring 0
while retaining control of the processor.
4.3.2.4 Loopback Semantics
The processor includes the following to compute the PC of the next instruction:
if LCOUNT ≠ 0 and CLOOPENABLE and nextPC = LEND then
    LCOUNT ← LCOUNT − 1
    nextPC ← LBEG
endif

The semantics above have some non-obvious consequences. A taken branch to the address in LEND does not cause a transfer to LBEG. Thus a taken branch to the LEND instruction can be used to exit the loop prematurely. This is why a call instruction as the
last instruction of a loop will not do the obvious thing (the return will branch to the LEND
address and exit the loop). To conditionally begin the next loop iteration, a branch to a
NOP before LEND may be used.
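The consequences described above can be seen in a small Python model of the next-PC computation. This is a sketch under simplifying assumptions: the function name and the taken_target parameter are hypothetical, and a taken branch is modeled as bypassing the loopback check entirely, which is how the text describes a branch to the LEND address.

```python
def next_pc(fallthrough_pc, lbeg, lend, lcount, cloopenable=True, taken_target=None):
    """Return (nextPC, LCOUNT) after one instruction.

    fallthrough_pc is the address of the sequentially next instruction.
    taken_target, if not None, is the destination of a taken branch or return.
    """
    if taken_target is not None:
        # A taken branch goes to its target even if that target equals LEND,
        # so no loopback happens and LCOUNT is left unchanged in this model.
        return taken_target, lcount
    if lcount != 0 and cloopenable and fallthrough_pc == lend:
        return lbeg, lcount - 1
    return fallthrough_pc, lcount
```

Falling through to LEND with a nonzero LCOUNT returns to LBEG, whereas branching to LEND, or reaching LEND with LCOUNT equal to zero or with loops disabled, simply continues at LEND.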

4.3.3 Extended L32R Option

The Extended L32R Option adds functionality to the standard L32R instruction. The
standard L32R instruction has an offset that can reach as far as 256kB below the current
PC. In the case where an instruction RAM approaches or exceeds 256kB in size, accessing literal data becomes much more difficult. This option is intended to ease the access to literal data by providing an optional separate literal base register.
- Prerequisites: None
- Incompatible options: MMU Option (page 158)

4.3.3.1 Extended L32R Option Architectural Additions
Table 4–30 shows this option’s architectural additions.


Table 4–30. Extended L32R Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name  R/W  Special Register Number¹
LITBASE            1         21            Literal base²  R/W  5

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. See Figure 4–7 on page 57 for the format of this register.

4.3.3.2 The Literal Base Register
The literal base (LITBASE) register contains 20 upper bits, which define the location of
the literal base and one enable bit (En). When the enable bit is clear, the L32R instruction loads a literal at a negative offset from the PC. When the enable bit is set, the L32R
instruction loads a literal at a negative offset from the address formed by the 20 upper
bits of literal base and 12 lower bits of 12’h000. See the L32R instruction description in
Chapter 6. Figure 4–7 shows the LITBASE register format.
Bits:   31..12                11..1     0
Field:  Literal Base Address  reserved  En
Width:  20                    11        1

Figure 4–7. LITBASE Register Format

The enable bit of the literal base register is cleared after reset. The remaining bits are
undefined after reset.
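The register layout in Figure 4–7 can be unpacked with two masks. The Python sketch below is illustrative only: the function names are hypothetical, and it models just the field split described above, not the L32R address calculation itself (see the L32R description in Chapter 6 for that).

```python
def decode_litbase(value):
    """Split a 32-bit LITBASE value into the literal base address and the enable bit."""
    value &= 0xFFFFFFFF
    return {
        # Upper 20 bits of the register followed by 12 zero bits (12'h000).
        "base": value & 0xFFFFF000,
        # Bit 0 is the enable bit (En); bits 11..1 are reserved and ignored here.
        "enable": bool(value & 0x1),
    }

def encode_litbase(base, enable):
    """Build a LITBASE value from a 4 KB-aligned base address and an enable flag."""
    if base & 0xFFF:
        raise ValueError("literal base must have its low 12 bits clear")
    return (base & 0xFFFFF000) | (1 if enable else 0)
```

For instance, the value 0x40012001 decodes to a literal base of 0x40012000 with the enable bit set.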

4.3.4 16-bit Integer Multiply Option

This option provides two instructions that perform 16×16 multiplication, producing a 32-bit
16 bits or less of input precision (32 bits of input precision is provided by the 32-bit Integer Multiply Option) and do not require more than 32-bit accumulation (as provided by
the MAC16 Option). Because a 16×16 multiplier is one-fourth the area of a 32×32 multiplier, this option is less costly than the 32-bit Integer Multiply Option. Because it lacks an
accumulator and data registers, it is less costly than the MAC16 Option.
- Prerequisites: None
- Incompatible options: None
- See Also: "MAC16 Option" on page 60 and "32-bit Integer Multiply Option" on page 58


4.3.4.1 16-bit Integer Multiply Option Architectural Additions
Table 4–31 shows this option’s architectural additions. There are no configuration parameters associated with the MUL16 Option and no additional processor state.
Table 4–31. 16-bit Integer Multiply Option Instruction Additions
Instruction¹  Format  Definition
MUL16S        RRR     Signed 16×16 multiplication of the least-significant 16 bits of AR[s] and
                      AR[t], with the 32-bit product written to AR[r]
MUL16U        RRR     Unsigned 16×16 multiplication of the least-significant 16 bits of AR[s] and
                      AR[t], with the 32-bit product written to AR[r]

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
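The difference between the two instructions in Table 4–31 is only in how the low 16 bits of each operand are interpreted. A short Python sketch, with hypothetical function names and registers modeled as integers, makes this concrete.

```python
def sext16(x):
    """Interpret the low 16 bits of x as a signed two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def mul16s(ar_s, ar_t):
    """Signed 16x16 multiply of the low halves; 32-bit result as an unsigned bit pattern."""
    return (sext16(ar_s) * sext16(ar_t)) & 0xFFFFFFFF

def mul16u(ar_s, ar_t):
    """Unsigned 16x16 multiply of the low halves; the upper 16 bits of each input are ignored."""
    return ((ar_s & 0xFFFF) * (ar_t & 0xFFFF)) & 0xFFFFFFFF
```

With 0xFFFF in one operand and 2 in the other, the signed form treats 0xFFFF as -1 and the unsigned form treats it as 65535, so the two results differ.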

4.3.5 32-bit Integer Multiply Option

This option provides instructions that implement 32-bit integer multiplication. These give the compiler single-instruction targets for the multiplication operators of programming languages such as C. When this option is not enabled, the Xtensa compiler uses
subroutine calls to implement 32-bit integer multiplication. Note that various algorithms
may be used to implement multiplication, and some hardware implementations may be
slower than the software implementations for some operand values. Implementations
may allow a choice of algorithms through configuration parameters to optimize among
area, speed, and other characteristics.
There is one sub-option within this option: Mul32High. It controls whether the MULSH
and MULUH instructions are included or not. For some implementations, generating the
high 32 bits of the product requires additional hardware, and so disabling this sub-option
may reduce cost.
- Prerequisites: None
- Incompatible options: None
- See Also: "MAC16 Option" on page 60 and "16-bit Integer Multiply Option" on page 57

4.3.5.1 32-bit Integer Multiply Option Architectural Additions
Table 4–32 and Table 4–33 show this option’s architectural additions. This option adds
no new processor state.


Table 4–32. 32-bit Integer Multiply Option Processor-Configuration Additions
Parameter     Description                                                            Valid Values
Mul32High     Determines whether the MULSH and MULUH instructions are included       0 or 1
MulAlgorithm  Determines the multiplication algorithm employed                       Implementation-dependent

Table 4–33. 32-Bit Integer Multiply Instruction Additions
Instruction¹  Format  Definition
MULL          RRR     Multiply low (return least-significant 32 bits of product)
MULUH²        RRR     Multiply unsigned high (return most-significant 32 bits of product)
MULSH²        RRR     Multiply signed high (return most-significant 32 bits of product)

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
2. These instructions are part of the Mul32High sub-option of 32-bit Integer Multiply Option.
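The three instructions in Table 4–33 return different 32-bit slices of the same 64-bit product. The Python sketch below shows the relationship; the function names are hypothetical and registers are modeled as integers holding 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mull(a, b):
    """Least-significant 32 bits of the product (identical for signed and unsigned)."""
    return ((a & MASK32) * (b & MASK32)) & MASK32

def muluh(a, b):
    """Most-significant 32 bits of the unsigned 64-bit product."""
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32

def mulsh(a, b):
    """Most-significant 32 bits of the signed 64-bit product."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32
```

Because the low 32 bits do not depend on signedness, one MULL serves both cases, while the high half needs separate signed and unsigned forms.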

4.3.6 32-bit Integer Divide Option

This option provides instructions that implement 32-bit integer division and remainder
operations. When this option is not enabled, the Xtensa compiler uses subroutine calls
to implement division and remainder. Note that various algorithms may be used to implement these instructions, and some hardware implementations may be slower than the
software implementations for some operand values.
- Prerequisites: None
- Incompatible Options: None

4.3.6.1 32-bit Integer Divide Option Architectural Additions
Table 4–34 through Table 4–36 show this option’s architectural additions. This option
adds no new processor state. This option does add a new exception, Integer Divide by
Zero, which is raised when the divisor operand of a QUOS, QUOU, REMS, or REMU instruction contains zero.
Table 4–34. 32-bit Integer Divide Option Processor-Configuration Additions
Parameter     Description                                  Valid Values
DivAlgorithm  Determines the division algorithm employed   Implementation-dependent


Table 4–35. 32-bit Integer Divide Option Exception Additions
Exception            Description                            EXCCAUSE value
IntegerDivideByZero  Exception raised when divisor is zero  6

Table 4–36. 32-bit Integer Divide Option Instruction Additions
Instruction¹  Format  Definition
QUOS          RRR     Quotient Signed (divide giving 32-bit quotient)
QUOU          RRR     Quotient Unsigned (divide giving 32-bit quotient)
REMS          RRR     Remainder Signed (divide giving 32-bit remainder)
REMU          RRR     Remainder Unsigned (divide giving 32-bit remainder)

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
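A Python model of the four divide instructions helps show where the Integer Divide by Zero exception fits. This is a sketch under stated assumptions: the names are hypothetical, the exception is modeled as a Python exception carrying the EXCCAUSE value from Table 4–35, and the signed forms are assumed to truncate toward zero with the remainder taking the sign of the dividend (as C does). Chapter 6 is the authority on rounding and overflow behavior.

```python
MASK32 = 0xFFFFFFFF

class IntegerDivideByZero(Exception):
    EXCCAUSE = 6  # value from Table 4-35

def _signed(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def _check(divisor):
    if divisor & MASK32 == 0:
        raise IntegerDivideByZero("divisor operand is zero")

def quou(a, b):
    _check(b)
    return (a & MASK32) // (b & MASK32)

def remu(a, b):
    _check(b)
    return (a & MASK32) % (b & MASK32)

def quos(a, b):
    _check(b)
    sa, sb = _signed(a), _signed(b)
    q = abs(sa) // abs(sb)          # assumed: truncate toward zero
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK32

def rems(a, b):
    _check(b)
    sa, sb = _signed(a), _signed(b)
    r = abs(sa) % abs(sb)           # assumed: remainder follows the dividend's sign
    return (-r if sa < 0 else r) & MASK32
```

Under these assumptions, dividing -7 by 2 gives a quotient of -3 and a remainder of -1, both returned as 32-bit patterns.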

4.3.7 MAC16 Option

The MAC16 Option adds multiply-accumulate functions that are useful in DSP and other
media-processing operations. The option adds a 40-bit accumulator (ACC), four 32-bit
data registers (MR[n]), and 72 instructions.
The multiplier operates on two 16-bit operands from either the address registers (AR) or
MAC16 registers (MR). Each operand may be taken from either the low or high half of a
register. The result of the operation is placed in the 40-bit accumulator. The MR registers and the low 32 bits and high 8 bits of the accumulator are readable and writable with
the RSR, WSR, and XSR instructions. MR[0] and MR[1] can be used as the first multiplier
input, and MR[2] and MR[3] can be used as the second multiplier input. Four of the 72
added instructions can load the MR registers with 32-bit values from memory in parallel
with multiply-accumulate operations.
The accumulator (ACC) and data registers (MR) are undefined after reset.
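
As a rough illustration of the data flow described above, the following C sketch models the signed multiply, multiply-accumulate, and multiply-subtract operations on selected 16-bit halves with a 40-bit accumulator. The type and function names are hypothetical, and wrap-around on 40-bit overflow is an assumption of this sketch; the section above does not state the overflow behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Model state: the 40-bit accumulator, kept sign-extended in 64 bits. */
typedef struct { int64_t acc; } mac16_t;

/* Signed value of the low (hi = 0) or high (hi = 1) 16-bit half of a register. */
static int32_t half16(uint32_t reg, int hi) {
    assert(hi == 0 || hi == 1);
    uint32_t h = hi ? (reg >> 16) : (reg & 0xFFFFu);
    return (h & 0x8000u) ? (int32_t)h - 0x10000 : (int32_t)h;
}

/* Keep the low 40 bits and sign-extend (wrap on overflow is an assumption). */
static int64_t wrap40(int64_t v) {
    int64_t lo = (int64_t)((uint64_t)v & 0xFFFFFFFFFFull);
    return (lo & 0x8000000000ll) ? lo - 0x10000000000ll : lo;
}

static int64_t product(uint32_t a, int a_hi, uint32_t b, int b_hi) {
    return (int64_t)half16(a, a_hi) * half16(b, b_hi);
}

/* MUL.xx.qq style: accumulator <- product */
void mac16_mul(mac16_t *m, uint32_t a, int a_hi, uint32_t b, int b_hi) {
    m->acc = wrap40(product(a, a_hi, b, b_hi));
}

/* MULA.xx.qq style: accumulator <- accumulator + product */
void mac16_mula(mac16_t *m, uint32_t a, int a_hi, uint32_t b, int b_hi) {
    m->acc = wrap40(m->acc + product(a, a_hi, b, b_hi));
}

/* MULS.xx.qq style: accumulator <- accumulator - product */
void mac16_muls(mac16_t *m, uint32_t a, int a_hi, uint32_t b, int b_hi) {
    m->acc = wrap40(m->acc - product(a, a_hi, b, b_hi));
}

/* Views corresponding to the ACCLO (32-bit) and ACCHI (8-bit) special registers. */
uint32_t mac16_acclo(const mac16_t *m) { return (uint32_t)((uint64_t)m->acc & 0xFFFFFFFFull); }
uint32_t mac16_acchi(const mac16_t *m) { return (uint32_t)(((uint64_t)m->acc >> 32) & 0xFFull); }
```

Here the a_hi and b_hi flags play the role of the two q letters in the qq opcode suffix (H or L for each operand).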
• Prerequisites: None
• Incompatible options: None

4.3.7.1 MAC16 Option Architectural Additions
Table 4–37 and Table 4–38 show this option’s architectural additions.


Table 4–37. MAC16 Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name                       R/W  Special Register Number [1]
ACCLO              1         32            Accumulator low                     R/W  16
ACCHI              1         8             Accumulator high                    R/W  17
MR[0] [2]          1         32            MAC16 register 0 (m0 in assembler)  R/W  32
MR[1] [2]          1         32            MAC16 register 1 (m1 in assembler)  R/W  33
MR[2] [2]          1         32            MAC16 register 2 (m2 in assembler)  R/W  34
MR[3] [2]          1         32            MAC16 register 3 (m3 in assembler)  R/W  35
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. These registers are known as MR[0..3] in hardware and as m0..3 in the software.

Table 4–38. MAC16 Option Instruction Additions
Instruction [1, 2]  Definition [3]
LDDEC               Load MAC16 data register (MR) with auto decrement
LDINC               Load MAC16 data register (MR) with auto increment
MUL.AA.qq           Signed multiply of two address registers
MUL.AD.qq           Signed multiply of an address register and a MAC16 data register
MUL.DA.qq           Signed multiply of a MAC16 data register and an address register
MUL.DD.qq           Signed multiply of two MAC16 data registers
MULA.AA.qq          Signed multiply-accumulate of two address registers
MULA.AD.qq          Signed multiply-accumulate of an address register and a MAC16 data register
MULA.DA.qq          Signed multiply-accumulate of a MAC16 data register and an address register
MULA.DD.qq          Signed multiply-accumulate of two MAC16 data registers
MULS.AA.qq          Signed multiply/subtract of two address registers
MULS.AD.qq          Signed multiply/subtract of an address register and a MAC16 data register
MULS.DA.qq          Signed multiply/subtract of a MAC16 data register and an address register
MULS.DD.qq          Signed multiply/subtract of two MAC16 data registers
MULA.DA.qq.LDDEC    Signed multiply-accumulate of a MAC16 data register and an address register, and load a MAC16 data register with auto decrement
MULA.DA.qq.LDINC    Signed multiply-accumulate of a MAC16 data register and an address register, and load a MAC16 data register with auto increment
MULA.DD.qq.LDDEC    Signed multiply-accumulate of two MAC16 data registers, and load a MAC16 data register with auto decrement
MULA.DD.qq.LDINC    Signed multiply-accumulate of two MAC16 data registers, and load a MAC16 data register with auto increment
UMUL.AA.qq          Unsigned multiply of two address registers
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
2. The qq opcode parameter indicates (by HH, HL, LH, or LL) whether the operands are taken from the Low or High 16-bit half of the AR or MR registers. The first q represents the location of the first operand; the second q represents the location of the second operand.
3. The destination for all product and accumulate results is the MAC16 accumulator.

4.3.7.2 Use With CLAMPS Instruction
The CLAMPS instruction, implemented with the Miscellaneous Operations Option, is useful in conjunction with the MAC16 Option. It allows clamping results to 16 bits before
storing to memory.
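
A minimal C sketch of the clamping step follows, based on the CLAMPS definition in Table 4–40. The function name and the parameter n (standing for t+7, the index of the highest magnitude bit kept) are assumptions of this sketch, as is modeling the register as a C int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp x to the signed range [-2^n, 2^n - 1], where n corresponds to t+7
 * in Table 4-40. t is a 4-bit field, so n spans 7..22. */
int32_t model_clamps(int32_t x, unsigned n) {
    assert(n >= 7 && n <= 22);
    int32_t hi = (int32_t)((1u << n) - 1u);  /* largest value kept unchanged  */
    int32_t lo = -hi - 1;                    /* smallest value kept unchanged */
    return x < lo ? lo : (x > hi ? hi : x);
}
```

With n = 15 the result always fits a signed 16-bit store: model_clamps(40000, 15) gives 32767 and model_clamps(-40000, 15) gives -32768.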

4.3.8 Miscellaneous Operations Option

These instructions can be individually enabled in groups to provide computational capability required by a few applications.
• Prerequisites: None
• Incompatible options: None

4.3.8.1 Miscellaneous Operations Option Architectural Additions
Table 4–39 and Table 4–40 show this option’s architectural additions.
Table 4–39. Miscellaneous Operations Option Processor-Configuration Additions
Parameter          Description                                          Valid Values
InstructionCLAMPS  Enable the signed clamp instruction: CLAMPS          0 or 1
InstructionMINMAX  Enable the minimum and maximum value instructions:   0 or 1
                   MIN, MAX, MINU, MAXU
InstructionNSA     Enable the normalization shift amount instructions:  0 or 1
                   NSA, NSAU
InstructionSEXT    Enable the sign extend instruction: SEXT             0 or 1


Table 4–40. Miscellaneous Operations Instruction Additions
Instruction [1]  Format  Definition
CLAMPS           RRR     Clamp to signed power of two range
                         sign ← AR[s]_{31}
                         AR[r] ← if AR[s]_{30..(t+7)} = sign^{24−t}
                                 then AR[s]
                                 else sign^{25−t} || (not sign)^{t+7}
MAX              RRR     Maximum value signed
                         AR[r] ← if AR[s] < AR[t] then AR[t] else AR[s]
MAXU             RRR     Maximum value unsigned
                         AR[r] ← if (0||AR[s]) < (0||AR[t])
                                 then AR[t]
                                 else AR[s]
MIN              RRR     Minimum value signed
                         AR[r] ← if AR[s] < AR[t] then AR[s] else AR[t]
MINU             RRR     Minimum value unsigned
                         AR[r] ← if (0||AR[s]) < (0||AR[t])
                                 then AR[s]
                                 else AR[t]
NSA              RRR     Normalization shift amount signed
                         AR[r] ← nsa1(AR[s]_{31}, AR[s])
                         NSA returns the number of contiguous bits in the most significant end of
                         AR[s] that are equal to the sign bit (not counting the sign bit itself), or 31 if
                         AR[s] = 0 or AR[s] = -1. The result may be used as a left shift amount
                         such that the result of SLL on AR[s] will have bit_{31} ≠ bit_{30} (if AR[s] ≠ 0).
NSAU             RRR     Normalization shift amount unsigned
                         AR[r] ← nsa1(0, AR[s])
                         NSAU returns the number of contiguous zero bits in the most significant end
                         of AR[s], or 32 if AR[s] = 0. The result may be used as a left shift
                         amount such that the result of SLL on AR[s] will have bit_{31} ≠ 0 (if AR[s] ≠ 0).
SEXT             RRR     Sign extend
                         sign ← AR[s]_{t+7}
                         AR[r] ← sign^{24−t} || AR[s]_{t+7..0}
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
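
The NSA, NSAU, and SEXT definitions in Table 4–40 can be checked against a small C model. This is a sketch under assumptions: the function names are hypothetical, and the parameter n of model_sext stands for t+7, the position of the bit that is treated as the sign.

```c
#include <assert.h>
#include <stdint.h>

/* NSAU model: count of contiguous zero bits at the most significant end,
 * or 32 when x is 0. */
unsigned model_nsau(uint32_t x) {
    unsigned n = 0;
    while (n < 32 && !(x & (0x80000000u >> n))) n++;
    return n;
}

/* NSA model: count of bits below the sign bit that equal the sign bit,
 * or 31 when x is 0 or -1 (all bits equal). */
unsigned model_nsa(uint32_t x) {
    uint32_t y = (x & 0x80000000u) ? ~x : x;  /* turn the leading run into zeros */
    return model_nsau(y) - 1u;                /* y has bit 31 clear, so nsau >= 1 */
}

/* SEXT model: sign-extend x from bit n (n = t+7, so 7..22). */
int32_t model_sext(uint32_t x, unsigned n) {
    assert(n >= 7 && n <= 22);
    uint32_t signbit = 1u << n;
    uint32_t low = x & ((signbit << 1) - 1u);  /* bits n..0 */
    return (int32_t)(low ^ signbit) - (int32_t)signbit;
}
```

For example, model_nsa(1) is 30: shifting the value 1 left by 30 places bit 30 at 1 while bit 31 stays 0, which is the normalization property stated in the table.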

4.3.9 Coprocessor Option

A coprocessor is a combination of additional state, instructions and logic that operates
on that state, including moves and the setting of Booleans for branch true/false operations.
The Coprocessor Option is general in nature: it adds state that is shared by all coprocessors.
After the Coprocessor Option is added, specific coprocessors, such as the
Floating-Point Coprocessor Option, can be added, along with system-specific instructions for coprocessor operations.
• Prerequisites: Exception Option (page 82)
• Incompatible options: None

4.3.9.1 Coprocessor Option Architectural Additions
Table 4–41 and Table 4–42 show this option’s architectural additions.
Table 4–41. Coprocessor Option Exception Additions
Exception             Description                                   EXCCAUSE value
Coprocessor0Disabled  Coprocessor 0 instruction while cp0 disabled  32
Coprocessor1Disabled  Coprocessor 1 instruction while cp1 disabled  33
Coprocessor2Disabled  Coprocessor 2 instruction while cp2 disabled  34
Coprocessor3Disabled  Coprocessor 3 instruction while cp3 disabled  35
Coprocessor4Disabled  Coprocessor 4 instruction while cp4 disabled  36
Coprocessor5Disabled  Coprocessor 5 instruction while cp5 disabled  37
Coprocessor6Disabled  Coprocessor 6 instruction while cp6 disabled  38
Coprocessor7Disabled  Coprocessor 7 instruction while cp7 disabled  39

Table 4–42. Coprocessor Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name            R/W  Special Register Number [1]
CPENABLE           1         8             Coprocessor enable bits  R/W  224
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.

4.3.9.2 Coprocessor Context Switch
RUR and WUR are not created by the Coprocessor Option, but rather by TIE language
constructs. They provide a uniform way for reading and writing miscellaneous state added via the TIE language. The TIE user_register construct associates TIE state registers with RUR/WUR register numbers in 32-bit quantities. RUR reads 32 bits of TIE state
into an address register, and WUR writes 32 bits to a TIE state register from an address
register. The ISA does not define the result of additional bits read by RUR when fewer
than 32 bits of TIE state are associated with the user register.


The TIE compiler automatically generates for each coprocessor the assembly code to
save the state associated with a coprocessor to memory and to restore coprocessor
state from memory.
Tensilica reserves user register numbers for RUR and WUR in the range 192 to 255.
The CPENABLE register allows a “lazy” context switch of the coprocessor state. Any instruction that references coprocessor n state (not including the shared Boolean registers) when that coprocessor’s enable bit (bit n) is clear raises a
CoprocessornDisabled exception. CPENABLE can be cleared on context switch, and
the exception used to unload the previous task’s coprocessor state and load the current
task’s. The appropriate CPENABLE bit is then set by the exception handler, which then
returns to execute the coprocessor instruction. An RSYNC instruction must be executed
after writing CPENABLE before executing any instruction that references state controlled
by the changed bits of CPENABLE. This register is undefined after reset.
If a single instruction references state from more than one coprocessor not enabled in
CPENABLE, then one of the CoprocessornDisabled exceptions is raised. The prioritization among multiple CoprocessornDisabled exceptions is implementation-specific.
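
The lazy context switch described above can be sketched as a host-side C model. Everything here is illustrative: the structure, the function names, and the owner-tracking policy (state is only moved when a different task last used the coprocessor) are assumptions of this sketch and not an operating system interface defined by the ISA. A real handler would move the state with the TIE-generated save and restore code and would execute RSYNC after writing CPENABLE.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CP 8
#define EXCCAUSE_COPROCESSOR0_DISABLED 32  /* Table 4-41: coprocessor n raises 32 + n */
#define NO_TASK      (-1)                  /* hypothetical: no task owns the state */
#define NO_EXCEPTION (-1)                  /* hypothetical: nothing was raised */

typedef struct {
    uint8_t cpenable;       /* models the 8-bit CPENABLE register */
    int     owner[NUM_CP];  /* task whose state currently lives in coprocessor n */
    int     current;        /* id of the running task */
    int     swaps;          /* number of state save/restore operations performed */
} cp_model_t;

void cp_init(cp_model_t *m) {
    m->cpenable = 0;
    m->current = NO_TASK;
    m->swaps = 0;
    for (int n = 0; n < NUM_CP; n++) m->owner[n] = NO_TASK;
}

/* Context switch: clear CPENABLE and move no coprocessor state yet. */
void cp_context_switch(cp_model_t *m, int task) {
    m->current = task;
    m->cpenable = 0;
}

/* Model an instruction that references coprocessor n state.
 * Returns the EXCCAUSE value raised, or NO_EXCEPTION if the bit was set. */
int cp_use(cp_model_t *m, int n) {
    assert(n >= 0 && n < NUM_CP);
    if (m->cpenable & (1u << n)) return NO_EXCEPTION;

    /* CoprocessornDisabled handler: swap state only if another task owns it. */
    if (m->owner[n] != m->current) {
        m->swaps++;                 /* stands in for save old state, load new state */
        m->owner[n] = m->current;
    }
    m->cpenable |= (uint8_t)(1u << n);  /* real code: WSR CPENABLE, then RSYNC */
    return EXCCAUSE_COPROCESSOR0_DISABLED + n;
}
```

In this model a task that is switched out and back in without any other task touching the coprocessor takes one exception on its next coprocessor instruction but moves no state, which is the saving the lazy scheme is after.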

4.3.10 Boolean Option
This option makes a set of Boolean registers available, along with branches and other
operations that refer to them. Multiple coprocessors and other TIE language extensions
can use this set.
• Prerequisites: None
• Incompatible options: None

4.3.10.1 Boolean Option Architectural Additions
Table 4–43 and Table 4–44 show this option’s architectural additions.
Table 4–43. Boolean Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name      R/W  Special Register Number [1]
BR [2]             16        1             Boolean registers  R/W  4
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. This register is known as Special Register BR or as individual Boolean bits b0..15.


Table 4–44. Boolean Option Instruction Additions
Instruction [1]  Format  Definition
ALL4             RRR     4-Boolean and reduction (result is 1 if all of the 4 Booleans are true)
ALL8             RRR     8-Boolean and reduction (result is 1 if all of the 8 Booleans are true)
ANDB             RRR     Boolean and
ANDBC            RRR     Boolean and with complement
ANY4             RRR     4-Boolean or reduction (result is 1 if any of the 4 Booleans is true)
ANY8             RRR     8-Boolean or reduction (result is 1 if any of the 8 Booleans is true)
BF               RRI8    Branch if Boolean false
BT               RRI8    Branch if Boolean true
MOVF             RRR     Conditional move if false
MOVT             RRR     Conditional move if true
ORB              RRR     Boolean or
ORBC             RRR     Boolean or with complement
XORB             RRR     Boolean exclusive or
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3.10.2 Booleans
A coprocessor test or comparison produces a Boolean result. The Boolean Option provides 16 single-bit Boolean registers for storing the results of coprocessor comparisons
for testing in conditional move and branch instructions. Boolean logic may replace
branches in some situations. Compared to condition codes used by other ISAs, these
Booleans eliminate the bottleneck of having only a single place to store comparison results. It is possible, for example, to do multiple comparisons before the comparison results are used. For Single-Instruction Multiple-Data (SIMD) operations, Booleans provide up to 16 simultaneous compare results and conditionals.
Boolean-producing instructions generate only one sense of the condition (for example, =
but not ≠); all Boolean uses allow for complementing of the Boolean. Multiple Booleans
may be combined into a single Boolean using the ANY4, ALL4, and so forth instructions.
For example, this is useful after a SIMD comparison to test if any or all of the elements
satisfy the test, such as testing if any byte of a word is zero. ANY2 and ALL2 instructions
are not provided; ANDB and ORB provide this functionality given bs+0 and bs+1 as arguments.
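
The reductions described above can be modeled in C with the 16 Booleans held in one 16-bit value (bit i is b_i, matching the BR special register view). The function names, the operand order (destination first), and the requirement that the source group start at a multiple of four are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Read Boolean b_i from the 16-bit BR image. */
static unsigned br_get(uint16_t br, unsigned i) {
    assert(i < 16);
    return (br >> i) & 1u;
}

/* Return a copy of the BR image with Boolean b_i set to v. */
static uint16_t br_put(uint16_t br, unsigned i, unsigned v) {
    assert(i < 16);
    return (uint16_t)((br & ~(1u << i)) | ((v & 1u) << i));
}

/* ANY4 model: b_t <- b_s | b_s+1 | b_s+2 | b_s+3 */
uint16_t model_any4(uint16_t br, unsigned t, unsigned s) {
    assert(s % 4 == 0 && s < 16);  /* assumed alignment of the 4-Boolean group */
    return br_put(br, t, ((br >> s) & 0xFu) != 0);
}

/* ALL4 model: b_t <- b_s & b_s+1 & b_s+2 & b_s+3 */
uint16_t model_all4(uint16_t br, unsigned t, unsigned s) {
    assert(s % 4 == 0 && s < 16);
    return br_put(br, t, ((br >> s) & 0xFu) == 0xFu);
}

/* ANDB and ORB models: two-input forms, which cover the "ALL2"/"ANY2" cases. */
uint16_t model_andb(uint16_t br, unsigned r, unsigned s, unsigned t) {
    return br_put(br, r, br_get(br, s) & br_get(br, t));
}

uint16_t model_orb(uint16_t br, unsigned r, unsigned s, unsigned t) {
    return br_put(br, r, br_get(br, s) | br_get(br, t));
}
```

For instance, after a 4-element SIMD compare that writes b0..b3, model_any4(br, 8, 0) leaves in b8 whether any element satisfied the test.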


The Boolean registers are undefined after reset.
The Boolean registers are accessible from C using the xtbool, xtbool2, xtbool4,
xtbool8, and xtbool16 data types. See the Xtensa C and C++ Compiler User’s
Guide for details.

4.3.11 Floating-Point Coprocessor Option
The Floating-Point Coprocessor Option adds the logic and architectural components
needed for IEEE754 single-precision floating-point operations. These operations are
useful for DSP that requires more than 16 bits of precision, such as audio compression and decompression. Also, DSP algorithms for less precise data are more easily coded using
floating-point, and good performance is obtainable when programming in languages
such as C.
• Prerequisites: Coprocessor Option (page 63) and Boolean Option (page 65)
• Incompatible options: None

4.3.11.1 Floating-Point Coprocessor Option Architectural Additions
Table 4–45 and Table 4–46 show this option’s architectural additions.
Table 4–45. Floating-Point Coprocessor Option Processor-State Additions
Register Mnemonic  Quantity  Width (bits)  Register Name                    R/W  Register Number [1]
FR                 16        32            Floating-point register          R/W  -
FCR                1         32            Floating-point control register  R/W  User 232
FSR                1         32            Floating-point status register   R/W  User 233
1. See Table 3–23 on page 46.

Table 4–46. Floating-Point Coprocessor Option Instruction Additions
Instruction [1]  Format  Definition
ABS.S            RRR     Single-precision absolute value
ADD.S            RRR     Single-precision add
CEIL.S           RRR     Single-precision floating-point to signed integer conversion with round to +∞
FLOAT.S          RRR     Signed integer to single-precision floating-point conversion (current rounding mode)
FLOOR.S          RRR     Single-precision floating-point to signed integer conversion with round to -∞
LSI              RRI8    Load single-precision immediate
LSIU             RRI8    Load single-precision immediate with base update
LSX              RRR     Load single-precision indexed
LSXU             RRR     Load single-precision indexed with base update
MADD.S           RRR     Single-precision multiply-add
MOV.S            RRR     Single-precision move
MOVEQZ.S         RRR     Single-precision move if equal to zero
MOVF.S           RRR     Single-precision move if Boolean condition false
MOVGEZ.S         RRR     Single-precision move if greater than or equal to zero
MOVLTZ.S         RRR     Single-precision move if less than zero
MOVNEZ.S         RRR     Single-precision move if not equal to zero
MOVT.S           RRR     Single-precision move if Boolean condition true
MSUB.S           RRR     Single-precision multiply-subtract
MUL.S            RRR     Single-precision multiply
NEG.S            RRR     Single-precision negate
OEQ.S            RRR     Single-precision compare equal
OLE.S            RRR     Single-precision compare less than or equal
OLT.S            RRR     Single-precision compare less than
RFR              RRR     Read floating-point register (FR to AR)
ROUND.S          RRR     Single-precision floating-point to signed integer conversion with round to nearest
SSI              RRI8    Store single-precision immediate
SSIU             RRI8    Store single-precision immediate with base update
SSX              RRR     Store single-precision indexed
SSXU             RRR     Store single-precision indexed with base update
SUB.S            RRR     Single-precision subtract
TRUNC.S          RRR     Single-precision floating-point to signed integer conversion with round to 0
UEQ.S            RRR     Single-precision compare unordered or equal
UFLOAT.S         RRR     Unsigned integer to single-precision floating-point conversion (current rounding mode)
ULE.S            RRR     Single-precision compare unordered or less than or equal
ULT.S            RRR     Single-precision compare unordered or less than
UN.S             RRR     Single-precision compare unordered
UTRUNC.S         RRR     Single-precision floating-point to unsigned integer conversion with round to 0
WFR              RRR     Write floating-point register (AR to FR)
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3.11.2 Floating-Point Representation
The primary floating-point data type is IEEE754 single-precision:
 31  30        23 22                    0
+---+------------+-----------------------+
| s |    exp     |       fraction        |
+---+------------+-----------------------+
  1       8                 23

The other data format is a signed, 32-bit integer used by the FLOAT.S, TRUNC.S,
ROUND.S, FLOOR.S, and CEIL.S instructions.
IEEE754 uses a sign-magnitude format, with a 1-bit sign, an 8-bit exponent with bias
127, and a 24-bit significand formed from 23 stored bits representing the binary digits to
the right of the binary point, and an implicit bit to the left of the binary point (0 if exponent is
zero, 1 if exponent is non-zero). Thus, the value of the number is:
(−1)^{s} × 2^{exp−127} × implicit.fraction

For example, the representation for 1.0 is 0x3F800000, with a sign of 0, exp of 127, a zero fraction, and an implicit 1 to the left of the binary point.
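
The field layout above can be checked with a short C sketch that splits a 32-bit pattern into sign, exponent, and fraction. The type and function names are hypothetical, and the sketch assumes the host C float is an IEEE754 single-precision value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Decoded fields of a single-precision value. */
typedef struct {
    unsigned sign;      /* bit 31 */
    unsigned exp;       /* bits 30..23, biased by 127 */
    uint32_t fraction;  /* bits 22..0 */
    unsigned implicit;  /* 0 if exp is zero, 1 otherwise */
} sp_fields_t;

/* Split a raw 32-bit pattern into its fields. */
sp_fields_t sp_decode(uint32_t bits) {
    sp_fields_t f;
    f.sign     = bits >> 31;
    f.exp      = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    f.implicit = (f.exp != 0);
    return f;
}

/* Reinterpret a host float as its bit pattern without aliasing problems. */
uint32_t float_to_bits(float x) {
    uint32_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}
```

Decoding float_to_bits(1.0f) yields sign 0, exp 127, fraction 0, and implicit 1, matching the 0x3F800000 example in the text.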
The Xtensa ISA includes IEEE754 signed-zero, infinity, quiet NaN, and sub-normal representations and processing rules. The ISA does not include IEEE754 signaling NaNs or
exceptions. Integer ⇔ floating-point conversions include a binary scale factor to make
conversion into and out of fixed-point formats faster.
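
The binary scale factor can be illustrated with a C sketch of fixed-point conversion, following the FLOAT.S and TRUNC.S semantics listed in Table 4–50 (multiply by 2^-t after converting from integer, multiply by 2^t before truncating to integer). The function names are hypothetical, t is assumed to be a 4-bit field (0..15), and out-of-range inputs and NaNs are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* 2^t as a float, built by repeated doubling (exact for t in 0..15). */
static float pow2f_small(unsigned t) {
    assert(t <= 15);
    float p = 1.0f;
    while (t--) p *= 2.0f;
    return p;
}

/* FLOAT.S-like model: integer with t fraction bits -> float. */
float model_float_s(int32_t x, unsigned t) {
    return (float)x / pow2f_small(t);
}

/* TRUNC.S-like model: float -> integer with t fraction bits, rounding toward 0.
 * The C cast truncates toward zero; the caller must keep the result in range. */
int32_t model_trunc_s(float f, unsigned t) {
    return (int32_t)(f * pow2f_small(t));
}
```

With t = 15 these two functions convert between float and a Q15 fixed-point value in one step each, which is the use case the scale factor is meant to speed up.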
4.3.11.3 Floating-Point State
Table 4–45 summarizes the processor state added by the floating-point coprocessor.
The FR register file consists of 16 registers of 32 bits each and is used for all data computation. Load and store instructions transfer data between the FR’s and memory. The
FCR register file has one field that may be changed at run-time to control the operation
of various instructions. Table 4–47 lists FCR fields and their associated meanings. The
format of FCR is
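 31              12 11      7   6   5   4   3   2   1  0
+------------------+---------+---+---+---+---+---+------+
|     reserved     | ignore  | V | Z | O | U | I |  RM  |
+------------------+---------+---+---+---+---+---+------+
         20             5      1   1   1   1   1     2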


Table 4–47. FCR fields
FCR Field  Meaning
RM         Rounding mode
           0 → round to nearest
           1 → round toward 0 (TRUNC)
           2 → round toward +∞ (CEIL)
           3 → round toward −∞ (FLOOR)
I          Inexact exception enable (0 → disabled, 1 → enabled)
U          Underflow exception enable (0 → disabled, 1 → enabled)
O          Overflow exception enable (0 → disabled, 1 → enabled)
Z          Divide-by-zero exception enable (0 → disabled, 1 → enabled)
V          Invalid exception enable (0 → disabled, 1 → enabled)
ignore     Reads as 0, ignored on write
reserved   Reads back last value written. Non-zero values cause a floating-point exception on any
           floating-point instruction (see Section 4.3.11.4)

The FSR register file provides the status flags required by IEEE754. These flags are set
by any operation that raises a non-enabled exception (see Section 4.3.11.4). Enabled
exceptions abort the operation with a floating-point exception and the flags are not written:
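 31              12  11  10   9   8   7   6           0
+------------------+---+---+---+---+---+---------------+
|     reserved     | V | Z | O | U | I |    ignore     |
+------------------+---+---+---+---+---+---------------+
         20          1   1   1   1   1         7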

Table 4–48. FSR fields
FSR Field  Meaning
I          Inexact exception flag
U          Underflow exception flag
O          Overflow exception flag
Z          Divide-by-zero flag
V          Invalid exception flag
ignore     Reads as 0, ignored on write
reserved   Reads back last value written. Non-zero values cause a floating-point exception on any
           floating-point instruction (see Section 4.3.11.4)


Most architectures have a combined floating-point control and status register, instead of
separate registers. In high-performance pipelines, this combination can compromise
performance, as reads and writes must access all bits, even ones that are not required
by the program. Xtensa’s FCR may be read and written without waiting for the results of
pending floating-point operations. Writes to FCR affect subsequent floating-point operations, but there is usually little performance cost from this dependency. Only reads of
FSR need cause a significant pipeline interlock.
FCR and FSR are organized to allow implementation with a single 32-bit physical register. The separate register numbers affect only the bits read and written of this underlying
physical register. It is also possible for software to bitwise logical OR the RUR’s of FCR
and FSR to create the appearance of a single register and to write this combined value
to FCR and FSR.
The reserved bits of FCR and FSR must store the last value written, but if that value is
non-zero, this causes all floating-point operations to raise a floating-point exception.
This allows future extensions to define additional control values that if used in earlier implementations, can be emulated in software.
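
The single-physical-register organization can be sketched in C with bit masks taken from the FCR and FSR formats above. The mask and function names are hypothetical, and the idea that both registers share the same reserved bits 31..12 is this sketch's reading of the text rather than a stated implementation detail.

```c
#include <assert.h>
#include <stdint.h>

#define FCR_RM_MASK       0x00000003u  /* rounding mode, bits 1..0 */
#define FCR_ENABLE_MASK   0x0000007Cu  /* I, U, O, Z, V enables, bits 2..6 */
#define FSR_FLAG_MASK     0x00000F80u  /* I, U, O, Z, V flags, bits 7..11 */
#define FP_RESERVED_MASK  0xFFFFF000u  /* reserved, bits 31..12 */

static_assert(((FCR_RM_MASK | FCR_ENABLE_MASK) & FSR_FLAG_MASK) == 0,
              "FCR and FSR fields must not overlap");

/* Value seen when reading FCR: FSR's field positions read as 0 ("ignore"). */
uint32_t fcr_read(uint32_t phys) {
    return phys & (FP_RESERVED_MASK | FCR_ENABLE_MASK | FCR_RM_MASK);
}

/* Value seen when reading FSR: FCR's field positions read as 0 ("ignore"). */
uint32_t fsr_read(uint32_t phys) {
    return phys & (FP_RESERVED_MASK | FSR_FLAG_MASK);
}

/* Writes update only the bits that belong to the written register. */
uint32_t fcr_write(uint32_t phys, uint32_t v) {
    uint32_t mask = FP_RESERVED_MASK | FCR_ENABLE_MASK | FCR_RM_MASK;
    return (phys & ~mask) | (v & mask);
}

uint32_t fsr_write(uint32_t phys, uint32_t v) {
    uint32_t mask = FP_RESERVED_MASK | FSR_FLAG_MASK;
    return (phys & ~mask) | (v & mask);
}

/* Extract the RM field from an FCR value. */
unsigned fcr_rounding_mode(uint32_t fcr) { return fcr & FCR_RM_MASK; }
```

Software that wants a combined control and status word can OR fcr_read and fsr_read together and later pass the same combined word to both write functions; because the "ignore" fields are dropped on write, each register picks up only its own bits.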
4.3.11.4 Floating-Point Exceptions
Current implementations neither raise exceptions enabled by FCR bits nor set flag bits in
FSR. They also do not raise an exception when one of the reserved bits of FCR or FSR is
non-zero.
4.3.11.5 Floating-Point Instructions
The floating-point instructions are defined in Table 4–49 and Table 4–50. The instructions operate on data in the floating-point register file, which consists of 16 32-bit registers.
The floating-point ISA requires a triple read-port FR register file for the MADD.S and
MSUB.S operations.


Table 4–49. Floating-Point Coprocessor Option Load/Store Instructions
Instruction [1]  Format  Definition
LSI              RRI8    Load single-precision immediate
                         vAddr ← AR[s] + (0^{22}||imm8||0^{2})
                         FR[t] ← Load32(vAddr)
LSIU             RRI8    Load single-precision immediate with base update
                         vAddr ← AR[s] + (0^{22}||imm8||0^{2})
                         FR[t] ← Load32(vAddr)
                         AR[s] ← vAddr
LSX              RRR     Load single-precision indexed
                         vAddr ← AR[s] + AR[t]
                         FR[r] ← Load32(vAddr)
LSXU             RRR     Load single-precision indexed with base update
                         vAddr ← AR[s] + AR[t]
                         FR[r] ← Load32(vAddr)
                         AR[s] ← vAddr
SSI              RRI8    Store single-precision immediate
                         vAddr ← AR[s] + (0^{22}||imm8||0^{2})
                         Store32 (vAddr, FR[t])
SSIU             RRI8    Store single-precision immediate with base update
                         vAddr ← AR[s] + (0^{22}||imm8||0^{2})
                         Store32 (vAddr, FR[t])
                         AR[s] ← vAddr
SSX              RRR     Store single-precision indexed
                         vAddr ← AR[s] + AR[t]
                         Store32 (vAddr, FR[r])
SSXU             RRR     Store single-precision indexed with base update
                         vAddr ← AR[s] + AR[t]
                         Store32 (vAddr, FR[r])
                         AR[s] ← vAddr
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
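
The immediate addressing forms in Table 4–49 can be traced with a small C model. The structure, the 64-word memory, and the function names are hypothetical; imm8 here is the raw 8-bit field, which the address computation scales by 4, and the FR entries hold raw 32-bit patterns.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64u  /* hypothetical tiny memory, indexed by word */

typedef struct {
    uint32_t ar[16];          /* address registers */
    uint32_t fr[16];          /* floating-point registers as raw bit patterns */
    uint32_t mem[MEM_WORDS];  /* byte address a maps to mem[a / 4] */
} ls_model_t;

/* vAddr <- AR[s] + (0^22 || imm8 || 0^2), i.e. base plus imm8 * 4 */
static uint32_t imm_vaddr(const ls_model_t *m, unsigned s, uint8_t imm8) {
    return m->ar[s] + ((uint32_t)imm8 << 2);
}

/* Map an aligned, in-range byte address to its memory word. */
static uint32_t *word_at(ls_model_t *m, uint32_t vaddr) {
    assert(vaddr % 4 == 0 && vaddr / 4 < MEM_WORDS);
    return &m->mem[vaddr / 4];
}

/* LSI model: load without changing the base register. */
void model_lsi(ls_model_t *m, unsigned t, unsigned s, uint8_t imm8) {
    m->fr[t] = *word_at(m, imm_vaddr(m, s, imm8));
}

/* LSIU model: load, then write the effective address back to AR[s]. */
void model_lsiu(ls_model_t *m, unsigned t, unsigned s, uint8_t imm8) {
    uint32_t vaddr = imm_vaddr(m, s, imm8);
    m->fr[t] = *word_at(m, vaddr);
    m->ar[s] = vaddr;
}

/* SSI model: store without changing the base register. */
void model_ssi(ls_model_t *m, unsigned t, unsigned s, uint8_t imm8) {
    *word_at(m, imm_vaddr(m, s, imm8)) = m->fr[t];
}

/* SSIU model: store, then write the effective address back to AR[s]. */
void model_ssiu(ls_model_t *m, unsigned t, unsigned s, uint8_t imm8) {
    uint32_t vaddr = imm_vaddr(m, s, imm8);
    *word_at(m, vaddr) = m->fr[t];
    m->ar[s] = vaddr;
}
```

The update forms make it possible to walk an array with one instruction per element, since each access leaves the base register pointing at the element just touched.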

Table 4–50. Floating-Point Coprocessor Option Operation Instructions
Instruction [1]  Format  Definition
ABS.S            RRR     Single-precision absolute value
                         FR[r] ← abs_s(FR[s])
ADD.S            RRR     Single-precision add
                         FR[r] ← FR[s] +_s FR[t]
CEIL.S           RRR     Scale and convert single-precision to integer, round to +∞
                         AR[r] ← ceil_s(FR[s] ×_s pow_s(2.0,t))
FLOAT.S          RRR     Convert signed integer to single-precision and scale
                         FR[r] ← float_s(AR[s]) ×_s pow_s(2.0,-t)
FLOOR.S          RRR     Scale and convert single-precision to integer, round to −∞
                         AR[r] ← floor_s(FR[s] ×_s pow_s(2.0,t))
MADD.S           RRR     Single-precision multiply/add
                         FR[r] ← FR[r] +_s (FR[s] ×_s FR[t])
MOV.S            RRR     Single-precision move
                         FR[r] ← FR[s]
MOVEQZ.S         RRR     Single-precision conditional move if equal to zero
                         if AR[t] = 0^{32} then FR[r] ← FR[s] endif
MOVF.S           RRR     Single-precision conditional move if false
                         if BR_t = 0 then FR[r] ← FR[s] endif
MOVGEZ.S         RRR     Single-precision conditional move if greater than or equal to zero
                         if AR[t]_{31} = 0 then FR[r] ← FR[s] endif
MOVLTZ.S         RRR     Single-precision conditional move if less than zero
                         if AR[t]_{31} ≠ 0 then FR[r] ← FR[s] endif
MOVNEZ.S         RRR     Single-precision conditional move if not equal to zero
                         if AR[t] ≠ 0^{32} then FR[r] ← FR[s] endif
MOVT.S           RRR     Single-precision conditional move if true
                         if BR_t ≠ 0 then FR[r] ← FR[s] endif
MSUB.S           RRR     Single-precision multiply/subtract
                         FR[r] ← FR[r] −_s (FR[s] ×_s FR[t])
MUL.S            RRR     Single-precision multiply
                         FR[r] ← FR[s] ×_s FR[t]
NEG.S            RRR     Single-precision negate
                         FR[r] ← −_s FR[s]
OEQ.S            RRR     Single-precision compare equal
                         BR_r ← FR[s] OEQ_s FR[t]
OLE.S            RRR     Single-precision compare less than or equal
                         BR_r ← FR[s] OLE_s FR[t]
OLT.S            RRR     Single-precision compare less than
                         BR_r ← FR[s] OLT_s FR[t]
RFR              RRR     Move from FR to AR
                         AR[r] ← FR[s]
ROUND.S          RRR     Scale and convert single-precision to integer, round to nearest
                         AR[r] ← round_s(FR[s] ×_s pow_s(2.0,t))
SUB.S            RRR     Single-precision subtract
                         FR[r] ← FR[s] −_s FR[t]
TRUNC.S          RRR     Scale and convert single-precision to signed integer, round to 0
                         AR[r] ← trunc_s(FR[s] ×_s pow_s(2.0,t))
UEQ.S            RRR     Single-precision compare unordered or equal
                         BR_r ← FR[s] UEQ_s FR[t]
UFLOAT.S         RRR     Convert unsigned integer to single-precision and scale
                         FR[r] ← ufloat_s(AR[s]) ×_s pow_s(2.0,-t)
ULE.S            RRR     Single-precision compare unordered or less than or equal
                         BR_r ← FR[s] ULE_s FR[t]
ULT.S            RRR     Single-precision compare unordered or less than
                         BR_r ← FR[s] ULT_s FR[t]
UN.S             RRR     Single-precision compare unordered
                         BR_r ← FR[s] UN_s FR[t]
UTRUNC.S         RRR     Scale and convert single-precision to unsigned integer, round to 0
                         AR[r] ← utrunc_s(FR[s] ×_s pow_s(2.0,t))
WFR              RRR     Move from AR to FR
                         FR[r] ← AR[s]
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
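
The difference between the ordered (O) and unordered (U) compares in Table 4–50 can be sketched in C. The function names are hypothetical, and the sketch assumes the usual IEEE754 meaning: two values are unordered when at least one of them is a NaN, ordered compares are then false, and "unordered or" compares are then true.

```c
#include <assert.h>
#include <stdbool.h>

/* A NaN is the only value that compares unequal to itself. */
static bool is_nan(float x) { return x != x; }

/* UN.S model: true when either operand is a NaN. */
bool model_un_s(float s, float t)  { return is_nan(s) || is_nan(t); }

/* Ordered compares: false whenever the operands are unordered. */
bool model_oeq_s(float s, float t) { return !model_un_s(s, t) && s == t; }
bool model_olt_s(float s, float t) { return !model_un_s(s, t) && s <  t; }
bool model_ole_s(float s, float t) { return !model_un_s(s, t) && s <= t; }

/* "Unordered or" compares: true whenever the operands are unordered. */
bool model_ueq_s(float s, float t) { return model_un_s(s, t) || s == t; }
bool model_ult_s(float s, float t) { return model_un_s(s, t) || s <  t; }
bool model_ule_s(float s, float t) { return model_un_s(s, t) || s <= t; }

/* Consistency check: each ordered predicate implies its "unordered or" form. */
void check_pair(float s, float t) {
    assert(!model_oeq_s(s, t) || model_ueq_s(s, t));
    assert(!model_olt_s(s, t) || model_ult_s(s, t));
    assert(!model_ole_s(s, t) || model_ule_s(s, t));
}
```

Because each compare produces only one sense of a condition, the complement of an ordered compare is the opposite "unordered or" compare with swapped operands; for example, "not (s OLT t)" equals "t ULE s".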

4.3.12 Multiprocessor Synchronization Option
When multiple processors are used in a system, some sort of communication and synchronization between processors is required. (Note that multiprocessor synchronization
is distinct from pipeline synchronization between instructions as represented by the
ISYNC, RSYNC, ESYNC, and DSYNC instructions, despite the similar names.) In some
cases, self-synchronizing communication, such as input and output queues, is used. In
other cases, a shared memory model is used for communication, and it is necessary to
provide instruction-set support for synchronization because shared memory does not
provide the required semantics. The Multiprocessor Synchronization Option is designed
for this shared memory case.
• Prerequisites: None
• Incompatible options: None

4.3.12.1 Memory Access Ordering
The Xtensa ISA requires that valid programs follow a simplified version of the Release
Consistency model of memory access ordering. Xtensa implementations may perform
ordinary load and store operations to non-overlapping addresses in any order. Loads
and stores to overlapping addresses on a single processor must be executed in program
order. This flexibility is appropriate because most memory accesses require only these
semantics and some implementations may be able to execute programs significantly
faster by exploiting non-program order memory access. While these semantics are appropriate for most loads and stores, order does matter when synchronizing between processors. Xtensa’s Multiprocessor Synchronization Option therefore augments ordinary
loads and stores with acquire and release operations, which are respectively loads and
stores with more constrained memory ordering semantics relative to each other and relative to ordinary loads and stores.
The Xtensa version of Release Consistency is adapted from Memory Consistency and
Event Ordering in Scalable Shared-Memory Multiprocessors by Gharachorloo et al. in
the Proceedings of the 17th Annual International Symposium on Computer Architecture,
1990, from which the following three definitions are directly borrowed:
• A load by processor i is considered performed with respect to processor k at a point
  in time when the issuing of a store to the same address by processor k cannot affect
  the value returned by the load.
• A store by processor i is considered performed with respect to processor k at a point
  in time when an issued load to the same address by processor k returns the value
  defined by this store (or a subsequent store to the same location).
• An access is performed when it is performed with respect to all processors.

Using these definitions, Xtensa places the following requirements on memory access:
• Before an ordinary load or store access is allowed to perform with respect to any
  other processor, all previous acquire accesses must be performed, and
• Before a release access is allowed to perform with respect to any other processor,
  all previous ordinary load, store, acquire, and release accesses must be performed,
  and
• Before an acquire is allowed to perform with respect to any other processor, all
  previous acquire accesses must be performed.

Many Xtensa implementations will adopt stricter memory orderings for simplicity. However, programs should not rely on any stricter memory ordering semantics than those
specified here.
4.3.12.2 Multiprocessor Synchronization Option Architectural Additions
Table 4–51 shows this option’s architectural additions.


Table 4–51. Multiprocessor Synchronization Option Instruction Additions
Instruction [1]  Format  Definition
L32AI            RRI8    Load 32-bit acquire (8-bit shifted offset)
                         This load will perform before any subsequent loads, stores, or acquires are
                         performed. It is typically used to test the synchronization variable protecting a
                         critical region (for example, to acquire a lock).
S32RI            RRI8    Store 32-bit release (8-bit shifted offset)
                         All prior loads, stores, acquires, and releases will be performed before this
                         store is performed. It is typically used to write a synchronization variable to
                         indicate that this processor is no longer in a critical region (for example, to
                         release a lock).
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3.12.3 Inter-Processor Communication with the L32AI and S32RI Instructions
L32AI and S32RI are 32-bit load and store instructions with acquire and release semantics. These instructions are useful for controlling the ordering of memory references
in multiprocessor systems, where different memory locations may be used for synchronization and data, so that precise ordering between synchronization references must be
maintained. Other load and store instructions may be executed by processor implementations in any order that produces the same uniprocessor result.
The MEMW instruction is somewhat similar in that it enforces load and store ordering, but
is less selective. MEMW is intended for implementing C’s volatile attribute, and not for
high performance synchronization between processors.
L32AI is used to load a synchronization variable. This load will be performed before any
subsequent load, store, acquire, or release is begun. This ensures that subsequent
loads and stores do not see or modify data that is protected by the synchronization variable.
S32RI is used to store to a synchronization variable. This store will not begin until all
previous loads, stores, acquires, or releases are performed. This ensures that any loads
of the synchronization variable that see the new value will also find all protected data
available as well.
Consider the following example:
    volatile uint incount = 0;
    volatile uint outcount = 0;
    const uint bsize = 8;
    data_t buffer[bsize];
    void producer (uint n)
    {
        for (uint i = 0; i < n; i += 1) {
            data_t d = newdata();               // produce next datum
            while (outcount == i - bsize);      // wait for room
            buffer[i % bsize] = d;              // put data in buffer
            incount = i+1;                      // signal data is ready
        }
    }
    void consumer (uint n)
    {
        for (uint i = 0; i < n; i += 1) {
            while (incount == i);               // wait for data
            data_t d = buffer[i % bsize];       // read next datum
            outcount = i+1;                     // signal data read
            usedata (d);                        // use datum
        }
    }


Here, incount and outcount are synchronization variables, and buffer is a shared
data variable. producer’s writes to incount and consumer’s writes to outcount
must use S32RI and producer’s reads of outcount and consumer’s reads of
incount must use L32AI. If producer’s write to incount were done with a simple
S32I, the processor or memory system might reorder the write to buffer after the write
to incount, thereby allowing consumer to see the wrong data. Similarly, if
consumer’s read of incount were done with a simple L32I, the processor or memory
system might reorder the read to buffer before the read of incount, also causing
consumer to see the wrong data.
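
As an illustration only, the same protocol can be written with C11 atomics, where an acquire load plays the role described for L32AI and a release store the role described for S32RI. This is a minimal sketch under stated assumptions, not part of the ISA definition: the names try_produce and try_consume are hypothetical, data_t is assumed to be a 32-bit integer, and the functions return false instead of spinning so that they can be exercised from a single thread. Whether a particular compiler emits L32AI and S32RI for these operations is an assumption to check against its documentation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BSIZE 8u                      /* matches bsize in the example */

typedef uint32_t data_t;              /* assumption: any copyable type works */

static _Atomic uint32_t incount;      /* synchronization variable: items produced */
static _Atomic uint32_t outcount;     /* synchronization variable: items consumed */
static data_t buffer[BSIZE];          /* shared data, ordinary accesses */

/* Producer step: returns false (instead of spinning) when there is no room. */
bool try_produce(data_t d)
{
    /* Only the producer writes incount, so a relaxed read of it suffices. */
    uint32_t i = atomic_load_explicit(&incount, memory_order_relaxed);
    /* Acquire load of outcount: the role the text assigns to L32AI. */
    if (atomic_load_explicit(&outcount, memory_order_acquire) == i - BSIZE)
        return false;
    buffer[i % BSIZE] = d;
    /* Release store of incount: the role the text assigns to S32RI. */
    atomic_store_explicit(&incount, i + 1, memory_order_release);
    return true;
}

/* Consumer step: returns false (instead of spinning) when no data is ready. */
bool try_consume(data_t *out)
{
    assert(out != NULL);
    uint32_t i = atomic_load_explicit(&outcount, memory_order_relaxed);
    if (atomic_load_explicit(&incount, memory_order_acquire) == i)
        return false;
    *out = buffer[i % BSIZE];
    atomic_store_explicit(&outcount, i + 1, memory_order_release);
    return true;
}
```

The release store to incount is what keeps the write to buffer from being observed after the count changes, and the acquire load of incount is what keeps the read of buffer from moving ahead of it.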

4.3.13 Conditional Store Option
In addition to the memory ordering needs satisfied by the Multiprocessor Synchronization Option, a multiprocessor system can require mutual exclusion, which cannot easily
be programmed using the Multiprocessor Synchronization Option. The Conditional Store
Option is intended to add that capability. It does so by adding a single instruction
(S32C1I), which atomically stores to a memory location only if its current value is the
expected one. A state register (SCOMPARE1) is also added to provide the additional operand required. Some implementations also have a state register (ATOMCTL) for further
control of the atomic operation in cache and on the PIF bus.
- Prerequisites: Multiprocessor Synchronization Option (page 74)
- Incompatible Options: None

When the atomic operation reaches the PIF bus, it causes a Read-Compare-Write
(RCW) transaction on the PIF, which is different from normal reads and writes.
4.3.13.1 Conditional Store Option Architectural Additions
Table 4–52 and Table 4–53 show this option’s architectural additions.


Table 4–52. Conditional Store Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number [1]
SCOMPARE1 | 1 | 32 | Conditional store comparison data | R/W | 12
ATOMCTL [2] | 1 | 6 | Atomic Operation Control | R/W | 99
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. Register exists only in some implementations.

Table 4–53. Conditional Store Option Instruction Additions
Instruction [1] | Format | Definition
S32C1I | RRI8 | Store 32-bit compare conditional. Stores to a location only if the location contains the value in the SCOMPARE1 register. The comparison of the old value and the store, if equal, is atomic. The instruction also returns the old value of the memory location.
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.3.13.2 Exclusive Access with the S32C1I Instruction
L32AI and S32RI allow inter-processor communication, as in the producer-consumer
example in Section 4.3.12.3 (barrier synchronization is another example), but they are
not efficient for guaranteeing exclusive access to data (for example, locks). Some systems may provide efficient, tailored, application-specific exclusion support. When this is
not appropriate, the ISA provides another general-purpose mechanism for atomic updates of memory-based synchronization variables that can be used for exclusion algorithms. The S32C1I instruction stores to a location if the location contains the value in
the SCOMPARE1 register. The comparison of the old value and the conditional store are
atomic. S32C1I also returns the old value of the memory location, so it looks like both a
load and a store; this allows the program to determine whether the store succeeded,
and if not it can use the new value as the comparison for the next S32C1I. For example,
an atomic increment could be done as follows:
            l32ai   a3, a2, 0           // current value of memory
    loop:
            wsr     a3, scompare1       // put current value in SCOMPARE1
            mov     a4, a3              // save for comparison
            addi    a3, a3, 1           // increment value
            s32c1i  a3, a2, 0           // store new value if memory
                                        // still contains SCOMPARE1
            bne     a3, a4, loop        // if value changed, try again
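
The retry loop can also be followed in C with a small software model of the instruction. This is only a sketch: s32c1i_model, atomic_increment_model, and the scompare1 variable are hypothetical names, and the model runs single-threaded, so it shows the compare, store, and returned old value but not the atomicity that the hardware provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SCOMPARE1 special register. */
static uint32_t scompare1;

/* Models S32C1I: store newval to *addr only if *addr equals SCOMPARE1,
 * and return the old memory value either way. Real hardware performs the
 * compare and the store atomically; this plain C model does not. */
uint32_t s32c1i_model(uint32_t *addr, uint32_t newval)
{
    assert(addr != NULL);
    uint32_t old = *addr;
    if (old == scompare1)
        *addr = newval;
    return old;
}

/* Mirrors the assembly listing: retry until the store is seen to succeed.
 * Returns the incremented value that was written. */
uint32_t atomic_increment_model(uint32_t *addr)
{
    uint32_t cur = *addr;                       /* l32ai  a3, a2, 0     */
    for (;;) {
        scompare1 = cur;                        /* wsr    a3, scompare1 */
        uint32_t expected = cur;                /* mov    a4, a3        */
        cur = s32c1i_model(addr, cur + 1);      /* addi, then s32c1i    */
        if (cur == expected)                    /* bne    a3, a4, loop  */
            return expected + 1;
        /* Otherwise cur holds the value found in memory; retry with it. */
    }
}
```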


Semaphores and other exclusion operations are equally simple to create using S32C1I.
There are many possible atomic memory primitives. S32C1I was chosen for the Xtensa
ISA because it can easily synthesize all other primitives that operate on a single memory
location. Many other primitives (for example, test and set, or fetch and add) are not as
universal. Only primitives that operate on multiple memory locations are more powerful
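than S32C1I. Note that some algorithms have a subtle problem if, between a
read and the following S32C1I, the target is changed several times and ends up back at
its original value: the S32C1I then succeeds even though the location was modified in
the meantime (often called the ABA problem).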
The SCOMPARE1 register is undefined after reset.
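
To illustrate how other single-location primitives can be synthesized from a compare-and-store, the following sketch builds fetch-and-add and a test-and-set style lock from the C11 compare-exchange operations. The function names are hypothetical, and the mapping of these C11 operations onto S32C1I is an assumption about the compiler, not something this manual specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fetch-and-add built from compare-and-swap; returns the old value. */
uint32_t cas_fetch_add(_Atomic uint32_t *p, uint32_t delta)
{
    assert(p != NULL);
    uint32_t old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, old is refreshed with the value found in memory, so the
     * next attempt uses it as the new comparison value. */
    while (!atomic_compare_exchange_weak(p, &old, old + delta))
        ;
    return old;
}

/* Test-and-set style lock: 0 means free, 1 means held. */
bool cas_try_lock(_Atomic uint32_t *lock)
{
    assert(lock != NULL);
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(lock, &expected, 1);
}

/* Releasing the lock needs only a release store, not a conditional store. */
void cas_unlock(_Atomic uint32_t *lock)
{
    assert(lock != NULL);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```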
4.3.13.3 Use Models for the S32C1I Instruction
Because of its nature as an atomic read-compare-write instruction, the S32C1I instruction is unusual in its relationships to local memories, caches, and system memories. Following is a list of ways that the S32C1I instruction is able to interact with memory. Some
implementations use the ATOMCTL Special Register described below to control which
way the instruction interacts with each memory type. Other implementations interact in a
fixed way with each memory type. Refer to a specific Xtensa processor data book for
more detailed information on how a specific processor handles S32C1I instructions.
- Local Memory: Xtensa processors with the Conditional Store Option and the Data
RAM Option configured will execute S32C1I instructions whose address resolves to
a DataRAM address directly on that DataRAM. Unless access to the DataRAM is
shared with another master, no external logic is necessary in this case. None of the
other ways listed below may be used for addresses resolving to a DataRAM.
- Exception: Xtensa processors with the Conditional Store Option and the Exception Option configured can execute the S32C1I instruction by taking an exception
(LoadStoreErrorCause). The exception may be considered an error, or it may
be used as a way to emulate the effect of the S32C1I instruction. Exception may be
the only method available for certain memory types or it may be directed by the
ATOMCTL register.
- RCW Transaction: Xtensa processors with the Conditional Store Option and the
Processor Interface Option configured can execute the S32C1I instruction by sending an RCW transaction on the PIF bus. External logic must then implement the
atomic read-compare-write on the memory location. If the Data Cache Option is configured and the memory region is cacheable, any corresponding cache line will be
flushed out of the cache by the S32C1I instruction using the equivalent of a DHWBI
instruction before the RCW transaction is sent. RCW Transaction may be the only
method available for certain memory types or it may be directed by the ATOMCTL
register.


If the address of the RCW transaction targets the Inbound PIF port of another
Xtensa processor, the targeted Xtensa processor has the Conditional Store Option
and the Data RAM Option configured, and the RCW address targets the DataRAM,
the RCW will be performed atomically on the target processor’s DataRAM. No external logic other than PIF bus interconnects is necessary to allow an Xtensa processor
to atomically access a DataRAM location in another Xtensa processor in this way.
- Internal Operation: Xtensa processors with the Conditional Store Option and the
Data Cache Option configured can execute the S32C1I instruction by allocating and
filling the line in the cache and accessing the location atomically there. No external
logic is necessary in this case. Internal Operation may be the only method available
for certain memory types or it may be directed by the ATOMCTL register.

4.3.13.4 The Atomic Operation Control Register (ATOMCTL) under the Conditional Store Option
The ATOMCTL register exists in some implementations of the Conditional Store Option to
control how the S32C1I instruction interacts with the cache and with the PIF bus. Implementations without the ATOMCTL register allow only one behavior per memory type.
The diagram below shows the layout of the ATOMCTL register, and Table 4–54 describes
its fields. See Section 4.3.13.3 above for the meaning of the codes in the table.
     31                        6 5    4 3    2 1    0
    |          reserved         |  WB  |  WT  |  BY  |
                  26                2      2      2


Table 4–54. ATOMCTL Register Fields
Field | Width (bits) | Definition
WB | 2 | S32C1I to Writeback Cacheable Memory (including Writeback NoAllocate Memory): 0 → Exception - LoadStoreErrorCause; 1 → RCW Transaction; 2 → Internal Operation; 3 → Reserved
WT | 2 | S32C1I to Writethrough Cacheable Memory (including Cached-NoAllocate Memory): 0 → Exception - LoadStoreErrorCause; 1 → RCW Transaction; 2 → Internal Operation [1]; 3 → Reserved
BY | 2 | S32C1I to Bypass Memory: 0 → Exception - LoadStoreErrorCause; 1 → RCW Transaction; 2 → Reserved; 3 → Reserved
1. Some implementations do not implement this case and take an exception (LoadStoreErrorCause) instead.

ATOMCTL is defined after processor reset as shown in Table 5–186 on page 237.
An older, fixed operation, Xtensa processor which operates on all cacheable and bypass
regions by RCW transaction may be emulated by setting the ATOMCTL register to 0x15.
One which operates only on bypass regions by RCW transaction may be emulated by
setting the ATOMCTL register to 0x01.
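
The two values quoted above follow directly from the field encodings. The sketch below composes an ATOMCTL value from per-field codes, assuming the field positions WB in bits 5..4, WT in bits 3..2, and BY in bits 1..0 as laid out in the register diagram. The enum, macro, and function names are hypothetical, and actually writing the register (a WSR to special register 99) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Field codes from Table 4-54. */
enum atomctl_mode {
    ATOMCTL_EXCEPTION = 0,   /* Exception - LoadStoreErrorCause */
    ATOMCTL_RCW       = 1,   /* RCW Transaction */
    ATOMCTL_INTERNAL  = 2    /* Internal Operation (reserved for BY) */
};

/* Field positions: WB[5:4], WT[3:2], BY[1:0]. */
#define ATOMCTL_WB_SHIFT 4
#define ATOMCTL_WT_SHIFT 2
#define ATOMCTL_BY_SHIFT 0

/* Compose an ATOMCTL value; reserved codes are rejected by the assert. */
uint32_t atomctl_make(enum atomctl_mode wb, enum atomctl_mode wt,
                      enum atomctl_mode by)
{
    assert(wb <= 2 && wt <= 2 && by <= 1);
    return ((uint32_t)wb << ATOMCTL_WB_SHIFT) |
           ((uint32_t)wt << ATOMCTL_WT_SHIFT) |
           ((uint32_t)by << ATOMCTL_BY_SHIFT);
}

/* Extract one 2-bit field code given its shift. */
unsigned atomctl_field(uint32_t atomctl, unsigned shift)
{
    return (atomctl >> shift) & 0x3u;
}
```

With these definitions, RCW for all three memory types gives 0x15, and RCW for bypass memory only gives 0x01.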
Bits of the ATOMCTL register are present even when they correspond to a memory type
which is not configured in the Xtensa processor. For example, a processor configured
without a Data Cache will still contain the fields WB and WT and those fields may contain
any value. But in this case, no cacheable memory will be addressable and so it will not
be possible to make use of these fields.
In an Xtensa processor with the Data RAM Option configured, the ATOMCTL register
does not affect the "Local Memory" use model or the receiving of Inbound PIF transactions as described under the "RCW Transaction" use model in Section 4.3.13.3.
4.3.13.5 Memory Ordering and the S32C1I Instruction
With regard to the memory ordering defined for L32AI and S32RI in Section 4.3.12.1,
S32C1I plays the role of both acquire and release. That is, before the atomic pair of
memory accesses can perform, all ordinary loads, stores, acquires, and releases must
have performed. In addition, before any following ordinary load, store, acquire, or release
can be allowed to perform, the atomic pair of the S32C1I must have performed.
This allows the conditional store to make atomic changes to variables with ordering requirements, such as the counts discussed in the example in Section 4.3.12.3.

4.4 Options for Interrupts and Exceptions

The options in this section have the primary function of adding and controlling the behavior of the processor in the presence of exceptional conditions. These conditions include representatives of at least the following broad categories:
- Instruction exceptions are unusual situations or errors encountered in the execution of the current instruction stream.
- Interrupts are requests from outside the instruction stream that, if enabled, can start
the processor executing a different instruction stream.
- Machine checks are failures of the processor hardware or related hardware that
need special handling to avoid causing the overall system to fail.
- Debug conditions do not arise from the execution of the program or the surrounding
hardware, but rather from the desire of another agent to track the execution of the
processor.
- Reset redirects the processor from any state, usually the undefined state after power-on, and starts it on a known execution path.

There are many ways of handling these conditions ranging from ignoring the conditions
or freezing the clock and asserting an output signal to multi-threaded self-handling of exceptional conditions. The Exception Option provides for the self-handling of instruction
exceptions and reset. Its self-handling mechanisms for these can be extended by the
Relocatable Vector Option and the Unaligned Exception Option. In addition, it provides a
foundation for additional options such as the Interrupt Option, the High-Priority Interrupt
Option, or the Timer Interrupt Option. Again, the Debug Option can be added to provide
for hardware debugging.

4.4.1 Exception Option

The Exception Option implements basic functions needed in the management of all
types of exceptional conditions. These conditions are handled by the processor itself by
redirecting execution to an exception vector to handle the condition with the possibility of
returning to continue execution at the original code stream. The option only fully implements the management of a subset of exceptional conditions. Additional options providing additional exception types use the Exception Option as a foundation.
- Prerequisites: None
- Incompatible options: None
- Compatibility Note: Currently available hardware supports Xtensa Exception Architecture 2 (XEA2) and the descriptions in this chapter cover only XEA2. Differences
between this and Xtensa Exception Architecture 1 (XEA1) are described, for purposes of writing system software for XEA1 processors, in Section A.2 on page 611.

4.4.1.1 Exception Option Architectural Additions
Table 4–55 through Table 4–58 show this option’s architectural additions.
Table 4–55. Exception Option Constant Additions (Exception Causes)
Exception Cause | Constant Value
IllegalInstructionCause | 6'b000000 (decimal 0)
SyscallCause | 6'b000001 (decimal 1)
InstructionFetchErrorCause | 6'b000010 (decimal 2)
LoadStoreErrorCause | 6'b000011 (decimal 3)

Table 4–56. Exception Option Processor-Configuration Additions
Parameter | Description | Valid Values
NDEPC | Existence (number) of DEPC | 0..1
ResetVector | Reset exception vector (PC of first instruction executed after reset) | 32-bit address
UserExceptionVector | Vector for exceptions and level-1 interrupts when PS.EXCM = 0 and PS.UM = 1 | 32-bit address
KernelExceptionVector | Vector for exceptions and level-1 interrupts when PS.EXCM = 0 and PS.UM = 0 | 32-bit address
DoubleExceptionVector | Vector for exceptions when PS.EXCM = 1 | 32-bit address
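
The vector conditions in Table 4–56 amount to a two-bit decision on PS.EXCM and PS.UM. A minimal sketch of that decision follows; the enum and function names are hypothetical, and the actual vector addresses are configuration parameters that are not modeled here.

```c
#include <stdbool.h>

/* The three general exception vectors named in Table 4-56. */
enum general_vector {
    USER_EXCEPTION_VECTOR,
    KERNEL_EXCEPTION_VECTOR,
    DOUBLE_EXCEPTION_VECTOR
};

/* Select the general vector from the current PS.EXCM and PS.UM values. */
enum general_vector select_general_vector(bool ps_excm, bool ps_um)
{
    if (ps_excm)
        return DOUBLE_EXCEPTION_VECTOR;     /* PS.EXCM = 1, PS.UM ignored */
    return ps_um ? USER_EXCEPTION_VECTOR    /* PS.EXCM = 0, PS.UM = 1 */
                 : KERNEL_EXCEPTION_VECTOR; /* PS.EXCM = 0, PS.UM = 0 */
}
```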


Table 4–57. Exception Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number [1]
EPC[1] | 1 | 32 | Exception program counter [2] | R/W | 177
EXCCAUSE | 1 | 6 | Cause of last exception [3] | R/W | 232
EXCSAVE[1] | 1 | 32 | Save location for last exception [2] | R/W | 209
PS | 1 | - [4] | Miscellaneous processor state [5] | R/W | 230
PS.EXCM | 1 | 1 | Exception mode (see Table 4–63 on page 87) | R/W | 230
PS.UM | 1 | 1 | User vector mode (see Table 4–63 on page 87) | R/W | 230
EXCVADDR | 1 | 32 | Virtual address that caused last fetch, load, or store exception | R/W | 238
DEPC | 1 | 32 | Double exception PC (exists if NDEPC=1) | R/W | 192
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. The EPC[i] and EXCSAVE[i] registers for interrupts above level 1 are part of the High-Priority Interrupt Option (Table 4–75 on page 107).
3. See Table 4–64 on page 89 for the format of this register and Table 4–65 on page 94 for which vectors have causes reported in this register.
4. Width depends on other configuration options.
5. See "The Miscellaneous Program State Register (PS) under the Exception Option" on page 87.

Table 4–58. Exception Option Instruction Additions
Instruction [1] | Format | Definition
EXCW | RRR | Exception wait. Waits for any exceptions of previously executed instructions to occur.
SYSCALL | RRR | System call. Generates an exception.
RFE | RRR | Returns from the KernelExceptionVector exception.
RFDE | RRR | Returns from double exception (uses EPC if NDEPC=0).
ILL or illegal instruction | - | Illegal instruction executed. The opcode ILL is guaranteed to always be an illegal instruction.
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.


4.4.1.2 Exception Causes under the Exception Option
A broad set of interrupts and exceptions can be handled by the processor itself under
the Exception Option. Table 4–59 through Table 4–62 list the types of exceptional conditions other than reset that can be handled under the Exception Option either natively or
with the help of an additional option. In each table, the first column contains the name of
the condition. The second column contains a description of the condition and the third
column contains both the option required for the condition to be handled and the name
of the vector to which execution will be redirected. Reset is provided by the Exception
Option and redirects execution to ResetVector.
Table 4–59. Instruction Exceptions under the Exception Option
Condition | Description | Required Option & Vector
Illegal instruction | Attempt to execute an illegal instruction or a legal instruction under illegal conditions | Exception Option, General vector [1]
System call | Attempt to execute the SYSCALL instruction | Exception Option, General vector [1]
Instruction fetch error | Internal physical address or a data error during instruction fetch | Exception Option, General vector [1]
Load or store error | Internal physical address or data error during load or store | Exception Option, General vector [1]
Unaligned data exception | Attempt to load or store data at an address which cannot be handled due to alignment | Unaligned Exception Option, General vector [1]
Privileged instruction | Attempt to execute a privileged operation without sufficient privilege | MMU Option, General vector [1]
Memory access prohibited | Attempt to access data or instructions at a prohibited address | Region Protection Option or MMU Option, General vector [1]
Memory privilege violation | Attempt to access data or instructions without sufficient privilege | MMU Option, General vector [1]
Address translation failure | Memory access needs translation information it does not have available | MMU Option, General vector [1]
PIF bus error | Address or data error external to the processor on the PIF bus [2] | Processor Interface Option, General vector [1]
Window exception | Attempt to execute an instruction needing AR values moved between registers and stack | Windowed Register Option, WindowOverflow<n> or WindowUnderflow<n> [3]
Alloca exception | Attempt to move the stack pointer when it would cause an illegal condition on the stack | Windowed Register Option, General vector [1]
Coprocessor disabled | Attempt to execute an instruction requiring the state of a disabled coprocessor | Coprocessor Option, General vector [1]
1. General vector means DoubleExceptionVector if PS.EXCM is set. Otherwise it means UserExceptionVector if PS.UM is set or KernelExceptionVector if PS.UM is clear.
2. Imprecise errors on writes are not included.
3. n can take on the values 4, 8, or 12 in each of overflow and underflow, making a total of 6 vectors.

Table 4–60. Interrupts under the Exception Option
Condition | Description | Required Option & Vector
Level-1 interrupt | Level or edge interrupt pin assertion handled as part of general vector with software check | Interrupt Option, General vector [1]
Level-1 SW interrupt | Version of level-1 interrupt caused by software using WSR.INTSET | Interrupt Option, General vector [1]
Medium-level interrupt | Level/edge interrupt pin assertion handled with special interrupt level, masked on stack unusable | High-Priority Interrupt Option, InterruptVector[2..6] [2]
Medium-level SW interrupt | Version of medium-level interrupt caused by software using WSR.INTSET | High-Priority Interrupt Option, InterruptVector[2..6] [2]
High-level interrupt | Level/edge interrupt pin assertion handled with special interrupt level, extra stack care needed | High-Priority Interrupt Option, InterruptVector[2..6] [2]
High-level SW interrupt | Version of high-level interrupt caused by software using WSR.INTSET | High-Priority Interrupt Option, InterruptVector[2..6] [2]
Non-maskable interrupt | Edge-triggered interrupt pin that cannot be masked by software | High-Priority Interrupt Option, InterruptVector[2..7] [2]
Peripheral interrupt | Internal hardware (e.g., timers) causes one of the above interrupts without an external pin | Timer Interrupt Option (asserts another interrupt type)
1. General vector means DoubleExceptionVector if PS.EXCM is set. Otherwise it means UserExceptionVector if PS.UM is set or KernelExceptionVector if PS.UM is clear.
2. Medium- and high-level interrupts may use any level 2..6 not used for debug conditions. NMI is one level higher than the highest medium, high, or debug level.

Table 4–61. Machine Checks under the Exception Option
Condition | Description | Required Option & Vector
ECC/parity error | An access to cache or local memory produced an ECC or parity error | Memory ECC/Parity Option, MemoryErrorVector


Table 4–62. Debug Conditions under the Exception Option
Condition | Description | Required Option & Vector
ICOUNT exception | An instruction would have incremented the ICOUNT register to zero. | Debug Option, InterruptVector[dbg] [1]
BREAK exception | Attempt to execute the BREAK or BREAK.N instruction. | Debug Option, InterruptVector[dbg] [1]
Instruction breakpoint | Attempt to execute an instruction matching one of the instruction breakpoint registers. | Debug Option, InterruptVector[dbg] [1]
Data breakpoint | Attempt to load or store to a data location matching one of the data breakpoint registers. | Debug Option, InterruptVector[dbg] [1]
Debug interrupt | An interrupt through OCD. | Debug Option [2], InterruptVector[dbg] [1]
1. Debug exceptions use an interrupt level provided by the High-Priority Interrupt Option. That level is labeled "dbg" in this table.
2. The debug interrupt is actually created by the OCD Option under the Debug Option.

4.4.1.3 The Miscellaneous Program State Register (PS) under the Exception Option
The PS register contains miscellaneous fields that are grouped together primarily so that
they can be saved and restored easily for interrupts and context switching. Figure 4–8
shows its layout and Table 4–63 describes its fields. Section 5.3.5 “Processor Status
Special Register” describes the fields of this register in greater detail. On processor reset, the processor initializes these fields as follows: PS.INTLEVEL (if it exists) is set to 15,
PS.EXCM is set to 1, and the other fields are set to zero.

     31      19  18   17    16 15  12 11   8 7    6  5    4    3        0
    |    *     | WOE | CALLINC |  *   | OWB  | RING | UM | EXCM | INTLEVEL |
         13       1       2       4      4      2     1     1        4

Figure 4–8. PS Register Format
Table 4–63. PS Register Fields
Field | Width (bits) | Definition [Required Option]
INTLEVEL | 4 | Interrupt-level disable [Interrupt Option]. Used to compute the current interrupt level of the processor (Section 4.4.1.4).
EXCM | 1 | Exception mode [Exception Option]. 0 → normal operation; 1 → exception mode. Overrides the values of certain other PS fields (Section 4.4.1.4).
UM | 1 | User vector mode [Exception Option]. 0 → kernel vector mode: exceptions do not need to switch stacks; 1 → user vector mode: exceptions need to switch stacks. This bit does not affect protection. It is modified by software and affects the vector used for a general exception.
RING | 2 | Privilege level [MMU Option].
OWB | 4 | Old window base [Windowed Register Option]. The value of WindowBase before window overflow or underflow.
CALLINC | 2 | Call increment [Windowed Register Option]. Set to window increment by CALL instructions. Used by ENTRY to rotate window.
WOE | 1 | Window overflow-detection enable [Windowed Register Option]. 0 → overflow detection disabled; 1 → overflow detection enabled. Used to compute the current window overflow enable (Section 4.4.1.4).
* | | Reserved for future use. Writing a non-zero value to these fields results in undefined processor behavior.

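As a reading aid, the PS field positions shown in Figure 4–8 can be expressed as C extraction macros. This is a minimal sketch; the macro and function names are hypothetical and do not come from any Tensilica header, and PS_RESET_VALUE assumes a configuration in which PS.INTLEVEL exists.

```c
#include <stdint.h>

/* Field extraction, using the bit positions of Figure 4-8. */
#define PS_INTLEVEL(ps) (((ps) >> 0)  & 0xFu)   /* bits 3..0   */
#define PS_EXCM(ps)     (((ps) >> 4)  & 0x1u)   /* bit  4      */
#define PS_UM(ps)       (((ps) >> 5)  & 0x1u)   /* bit  5      */
#define PS_RING(ps)     (((ps) >> 6)  & 0x3u)   /* bits 7..6   */
#define PS_OWB(ps)      (((ps) >> 8)  & 0xFu)   /* bits 11..8  */
#define PS_CALLINC(ps)  (((ps) >> 16) & 0x3u)   /* bits 17..16 */
#define PS_WOE(ps)      (((ps) >> 18) & 0x1u)   /* bit  18     */

/* Reserved fields: bits 31..19 and bits 15..12. */
#define PS_RESERVED_MASK 0xFFF8F000u

/* Reset state described in the text: INTLEVEL = 15, EXCM = 1, others 0. */
#define PS_RESET_VALUE 0x0000001Fu

/* Returns nonzero if no reserved bit is set, since writing a non-zero
 * value to the reserved fields gives undefined behavior. */
int ps_reserved_bits_clear(uint32_t ps)
{
    return (ps & PS_RESERVED_MASK) == 0;
}
```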

4.4.1.4 Value of Variables under the Exception Option
The fields of the PS register listed in Table 4–63 affect many functions in the processor
through these variables:
The current interrupt level (CINTLEVEL) defines which levels of interrupts are currently
enabled and which are not. Interrupts at levels above CINTLEVEL are enabled. Those at
or below CINTLEVEL are disabled. To enable a given interrupt, CINTLEVEL must be
less than its level, and its INTENABLE bit must be 1. The level is defined by:
CINTLEVEL ← max(PS.EXCM∗EXCMLEVEL,PS.INTLEVEL)
PS.EXCM and PS.INTLEVEL are part of the PS register in Table 4–63. EXCMLEVEL is
defined in Table 4–74. CINTLEVEL is also used by the Debug Option.
The current ring (CRING) determines which ASIDs from the RASID register will cause a
privilege violation. ASIDs with position (in RASID) equal to or greater than CRING may
be used in translation while those with position less than CRING will cause a privilege violation. Privileged instructions may only be executed if CRING is zero. CRING is defined
by:
CRING ← if (MMU Option configured && PS.EXCM = 0) then PS.RING else 0
PS.EXCM and PS.RING are part of the PS register in Table 4–63.


The current window overflow enable (CWOE) defines whether window overflow exceptions are currently enabled. It is defined by:
CWOE ← if PS.EXCM then 0 else PS.WOE
PS.EXCM and PS.WOE are part of the PS register in Table 4–63.
The current loop enable (CLOOPENABLE) determines whether the loop-back function of
the zero-overhead loop instruction is enabled or not.
CLOOPENABLE ← PS.EXCM = 0
PS.EXCM is part of the PS register in Table 4–63.
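
The four definitions in this section can be restated as C functions over already-decoded PS fields. This is a sketch only: the struct and function names are hypothetical, EXCMLEVEL and the MMU Option are passed in as parameters because they are configuration-dependent, and interrupt_enabled simply restates the enabling rule given for CINTLEVEL above.

```c
#include <stdbool.h>

/* Decoded PS fields used by the definitions in this section. */
struct ps_fields {
    unsigned intlevel;   /* PS.INTLEVEL, 0..15 */
    bool     excm;       /* PS.EXCM */
    unsigned ring;       /* PS.RING, 0..3 */
    bool     woe;        /* PS.WOE */
};

/* CINTLEVEL <- max(PS.EXCM * EXCMLEVEL, PS.INTLEVEL) */
unsigned cintlevel(const struct ps_fields *ps, unsigned excmlevel)
{
    unsigned a = ps->excm ? excmlevel : 0;
    return a > ps->intlevel ? a : ps->intlevel;
}

/* CRING <- if (MMU Option configured && PS.EXCM = 0) then PS.RING else 0 */
unsigned cring(const struct ps_fields *ps, bool mmu_configured)
{
    return (mmu_configured && !ps->excm) ? ps->ring : 0;
}

/* CWOE <- if PS.EXCM then 0 else PS.WOE */
bool cwoe(const struct ps_fields *ps)
{
    return ps->excm ? false : ps->woe;
}

/* CLOOPENABLE <- PS.EXCM = 0 */
bool cloopenable(const struct ps_fields *ps)
{
    return !ps->excm;
}

/* An interrupt is enabled only if CINTLEVEL is less than its level and
 * its INTENABLE bit is 1. */
bool interrupt_enabled(unsigned level, unsigned cur_level, bool intenable_bit)
{
    return cur_level < level && intenable_bit;
}
```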
4.4.1.5 The Exception Cause Register (EXCCAUSE) under the Exception Option
After an exception that redirects execution to one of the general exception vectors
(UserExceptionVector, KernelExceptionVector, or DoubleExceptionVector), the EXCCAUSE register contains a value that specifies the cause of the last exception. Figure 4–9 shows the EXCCAUSE register. Table 4–64 describes the 6-bit binary-value encodings for the register. EXCCAUSE is undefined after processor reset.
     31                        6 5            0
    |          reserved         |   EXCCAUSE   |
                  26                   6

Figure 4–9. EXCCAUSE Register
Table 4–64. Exception Causes
EXCCAUSE Code | Cause Name | Cause Description [Required Option] | EXCVADDR Loaded
0 | IllegalInstructionCause | Illegal instruction [Exception Option] | No
1 | SyscallCause | SYSCALL instruction [Exception Option] | No
2 | InstructionFetchErrorCause | Processor internal physical address or data error during instruction fetch [Exception Option] | Yes
3 | LoadStoreErrorCause | Processor internal physical address or data error during load or store [Exception Option] | Yes
4 | Level1InterruptCause | Level-1 interrupt as indicated by set level-1 bits in the INTERRUPT register [Interrupt Option] | No
5 | AllocaCause | MOVSP instruction, if caller’s registers are not in the register file [Windowed Register Option] | No
6 | IntegerDivideByZeroCause | QUOS, QUOU, REMS, or REMU divisor operand is zero [32-bit Integer Divide Option] | No
7 | Reserved for Tensilica | |
8 | PrivilegedCause | Attempt to execute a privileged operation when CRING ≠ 0 [MMU Option] | No
9 | LoadStoreAlignmentCause | Load or store to an unaligned address [Unaligned Exception Option] | Yes
10..11 | Reserved for Tensilica | |
12 | InstrPIFDataErrorCause | PIF data error during instruction fetch [Processor Interface Option] | Yes
13 | LoadStorePIFDataErrorCause | Synchronous PIF data error during LoadStore access [Processor Interface Option] | Yes
14 | InstrPIFAddrErrorCause | PIF address error during instruction fetch [Processor Interface Option] | Yes
15 | LoadStorePIFAddrErrorCause | Synchronous PIF address error during LoadStore access [Processor Interface Option] | Yes
16 | InstTLBMissCause | Error during Instruction TLB refill [MMU Option] | Yes
17 | InstTLBMultiHitCause | Multiple instruction TLB entries matched [MMU Option] | Yes
18 | InstFetchPrivilegeCause | An instruction fetch referenced a virtual address at a ring level less than CRING [MMU Option] | Yes
19 | Reserved for Tensilica | |
20 | InstFetchProhibitedCause | An instruction fetch referenced a page mapped with an attribute that does not permit instruction fetch [Region Protection Option or MMU Option] | Yes
21..23 | Reserved for Tensilica | |
24 | LoadStoreTLBMissCause | Error during TLB refill for a load or store [MMU Option] | Yes
25 | LoadStoreTLBMultiHitCause | Multiple TLB entries matched for a load or store [MMU Option] | Yes
26 | LoadStorePrivilegeCause | A load or store referenced a virtual address at a ring level less than CRING [MMU Option] | Yes
27 | Reserved for Tensilica | |
28 | LoadProhibitedCause | A load referenced a page mapped with an attribute that does not permit loads [Region Protection Option or MMU Option] | Yes
29 | StoreProhibitedCause | A store referenced a page mapped with an attribute that does not permit stores [Region Protection Option or MMU Option] | Yes
30..31 | Reserved for Tensilica | |
32..39 | CoprocessornDisabled | Coprocessor n instruction when cpn disabled. n varies 0..7 as the cause varies 32..39 [Coprocessor Option] | No
40..63 | Reserved | |

Exceptions that redirect execution to other vectors that do not use EXCCAUSE may either
report details in a different cause register or may have only a single cause and no need
for additional cause information.
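
As a rough illustration of how handler code might consult Table 4–64, the following C++ sketch maps a handful of EXCCAUSE codes to their cause names and to the "EXCVADDR Loaded" column. The names `CauseInfo`, `describe_cause`, `coprocessor_number`, and `cause_name` are hypothetical, and only a subset of the table is encoded; every code not listed is reported as not covered rather than guessed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper type: the software-visible part of one Table 4-64 row.
struct CauseInfo {
    const char* name;      // cause name from the table
    bool excvaddr_loaded;  // "EXCVADDR Loaded" column
};

// Sketch only: encodes a subset of EXCCAUSE codes from Table 4-64.
inline CauseInfo describe_cause(std::uint32_t exccause) {
    assert(exccause <= 63);  // the table lists codes 0..63
    if (exccause >= 32 && exccause <= 39) {
        return {"CoprocessornDisabled", false};
    }
    switch (exccause) {
        case 9:  return {"LoadStoreAlignmentCause", true};
        case 20: return {"InstFetchProhibitedCause", true};
        case 24: return {"LoadStoreTLBMissCause", true};
        case 25: return {"LoadStoreTLBMultiHitCause", true};
        case 26: return {"LoadStorePrivilegeCause", true};
        case 28: return {"LoadProhibitedCause", true};
        case 29: return {"StoreProhibitedCause", true};
        default: return {"(not covered by this sketch)", false};
    }
}

// For causes 32..39 the coprocessor number n is the offset from 32;
// any other code yields -1.
inline int coprocessor_number(std::uint32_t exccause) {
    return (exccause >= 32 && exccause <= 39)
               ? static_cast<int>(exccause - 32)
               : -1;
}

// Convenience wrapper returning the cause name as a std::string.
inline std::string cause_name(std::uint32_t exccause) {
    return describe_cause(exccause).name;
}
```

A handler written along these lines would read EXCVADDR only when `excvaddr_loaded` is true for the reported cause.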
4.4.1.6 The Exception Virtual Address Register (EXCVADDR) under the Exception Option
The exception virtual address (EXCVADDR) register contains the virtual byte address that
caused the most recent fetch, load, or store exception. Table 4–64 shows, for every exception cause value, whether or not the exception virtual address register will be set.
This register is undefined after processor reset. Because EXCVADDR may be changed
by any TLB miss, even if the miss is handled entirely by processor hardware, code that
counts on it not changing value must guarantee that no TLB miss is possible by using
only static translations for both instruction and data accesses. Figure 4–10 shows the
EXCVADDR register format.
Bits 31..0 (32 bits): Exception Virtual Address
Figure 4–10. EXCVADDR Register Format
4.4.1.7 The Exception Program Counter (EPC) under the Exception Option
The exception program counter (EPC) register contains the virtual byte address of the
instruction that caused the most recent exception or the next instruction to be executed
in the case of a level-1 interrupt. This instruction has not been executed. Software may
restart execution at this address by using the RFE instruction after fixing the cause of the
exception or handling and clearing the interrupt. This register is undefined after processor reset and its value might change whenever PS.EXCM is 0.

The Exception Option defines only one EPC value (EPC[1]). The High-Priority Interrupt
Option extends the EPC concept by adding one EPC value per high-priority interrupt
level (EPC[2..NLEVEL+NNMI]).
Figure 4–11 shows the EPC register format.
Bits 31..0 (32 bits): Exception Instruction Virtual Address
Figure 4–11. EPC Register Format for Exception Option
4.4.1.8 The Double Exception Program Counter (DEPC) under the Exception Option
The double exception program counter (DEPC) register contains the virtual byte address of the instruction that caused the most recent double exception. A double exception is one that is raised when PS.EXCM is set. This instruction has not been executed.
Many double exceptions cannot be restarted, but those that can may be restarted at this
address by using an RFDE instruction after fixing the cause of the exception.
The DEPC register exists only if the configuration parameter NDEPC=1. If DEPC does not
exist, the EPC register is used in its place when a double exception is taken and when
the RFDE instruction is executed. The consequence is that it is not possible to recover
from most double exceptions. NDEPC=1 is required if both the Windowed Register
Option and the MMU Option are configured. DEPC is undefined after processor reset.
Figure 4–12 shows the DEPC register format.
Bits 31..0 (32 bits): Exception Instruction Virtual Address
Figure 4–12. DEPC Register Format
4.4.1.9 The Exception Save Register (EXCSAVE) under the Exception Option
The exception save register (EXCSAVE[1]) is simply a read/write 32-bit register intended for saving one AR register in the exception vector software. This register is undefined
after processor reset and there are many software reasons its value might change
whenever PS.EXCM is 0.
The Exception Option defines only one exception save register (EXCSAVE[1]). The
High-Priority Interrupt Option extends this concept by adding one EXCSAVE register per
high-priority interrupt level (EXCSAVE[2..NLEVEL+NNMI]).

Figure 4–13 shows the EXCSAVE register format.
Bits 31..0 (32 bits): For Software Use
Figure 4–13. EXCSAVE Register Format
4.4.1.10 Handling of Exceptional Conditions under the Exception Option
Under the Exception Option, exceptional conditions are handled by saving some state
and redirecting execution to one of a set of exception vector locations as listed in
Table 4–59 through Table 4–62 along with ResetVector. This section looks at this process from the other end and describes how the code at a vector can determine the nature of the exceptional condition that has just occurred.
Table 4–65 shows, for each vector, how the code can determine what has happened.
The first column lists the possible vectors, not just for the Exception Option itself, but
also for other options that add on to the Exception Option. For vectors which can be
reached for more than one cause, the second column indicates the register containing
the main indicator of that cause. The third column indicates other registers that may
contain secondary information under that vector. The last column shows the option that
is required for the vector and the other listed registers to exist.
The three exception vectors that use EXCCAUSE for the primary cause information form
a set called the “general vector.” If PS.EXCM is set when one of the exceptional conditions is raised, then the processor is already handling an exceptional condition and the
exception goes to the DoubleExceptionVector. Only a few double exceptions are
recoverable, including a TLB miss during a register window overflow or underflow exception. For these, EXCCAUSE (and EXCSAVE in Table 4–66) must be well enough understood not to need duplication. Otherwise (PS.EXCM clear), if PS.UM is set the exception goes to the UserExceptionVector, and if not the exception goes to the
KernelExceptionVector. The Exception Option effectively defines two operating
modes: user vector mode and kernel vector mode, controlled by the PS.UM bit. The
combination of user vector mode and kernel vector mode is provided so that the user
vector exception handler can switch to an exception stack before processing the exception, whereas the kernel vector exception handler can continue using the kernel stack.
Single or multiple high-priority interrupts can be configured for any hardware prioritized
levels 2..6. These will redirect to the InterruptVector[i] where “i” is the level. One
of those levels, often the highest one, can be chosen as the debug level and will redirect
execution to InterruptVector[d] where “d” is the debug level. The level one higher
than the highest high-priority interrupt can be chosen as an NMI, which will redirect execution to InterruptVector[n] where “n” is the NMI level (2..7).

Table 4–65. Exception and Interrupt Information Registers by Vector
Vector | Main Cause | Other Information | Required Option
ResetVector | - | - | Exception Option
UserExceptionVector | EXCCAUSE | INTERRUPT, EXCVADDR | Exception Option
KernelExceptionVector | EXCCAUSE | INTERRUPT, EXCVADDR | Exception Option
DoubleExceptionVector | EXCCAUSE | EXCVADDR | Exception Option
WindowOverflow4 | - | - | Windowed Register Option
WindowOverflow8 | - | - | Windowed Register Option
WindowOverflow12 | - | - | Windowed Register Option
WindowUnderflow4 | - | - | Windowed Register Option
WindowUnderflow8 | - | - | Windowed Register Option
WindowUnderflow12 | - | - | Windowed Register Option
MemoryErrorVector | MESR | MECR, MEVADDR | High-Priority Interrupt Option
InterruptVector[i]¹ | INTERRUPT | - | High-Priority Interrupt Option
InterruptVector[d]² | DEBUGCAUSE | - | Debug Option
InterruptVector[n]³ | - | - | High-Priority Interrupt Option
1. "i" indicates an arbitrary interrupt level. Medium- and high-level interrupts may be levels 2..6.
2. "d" indicates the debug level. It may be levels 2..6 but is usually the highest level other than NMI.
3. "n" indicates the NMI level. It may be levels 2..7. It must be the highest level but contiguous with other levels.

In addition to these characteristics of Vectors, when the Relocatable Vector Option
(page 98) is configured, the vectors are divided into two groups and within each group
are required to be in increasing address order as listed below:
Static Vector Group:
• ResetVector
• MemoryErrorVector
Dynamic Vector Group:
• WindowOverflow4
• WindowUnderflow4
• WindowOverflow8
• WindowUnderflow8
• WindowOverflow12
• WindowUnderflow12
• InterruptVector[2]
• InterruptVector[3]
• InterruptVector[4]
• InterruptVector[5]
• InterruptVector[6]
• InterruptVector[7]
• KernelExceptionVector
• UserExceptionVector
• DoubleExceptionVector

Table 4–66 shows, for each vector in the first column, which registers are involved in the
process of taking the exception and returning from it for that vector. Since there is no return from the ResetVector, it has no entries in the other four columns of this table.
Otherwise all entries have a second column entry of where the PC is saved and a fifth
column entry of the instruction which should be used for returning. The third column
shows where the current PS register value is saved before being changed, while the
fourth column shows where the handler may find a scratch register. Note that the general vector entries and the window vector entries modify the PS only in ways that their respective return instructions undo, and therefore there is no required PS save register.
The window vector entries do not need scratch space because they are loading and
storing a block of AR registers that they can use for scratch where they need it.
Table 4–66. Exception and Interrupt Exception Registers by Vector
Vector | PC | PS | Scratch | Return Instr.
ResetVector | - | - | - | -
UserExceptionVector | EPC | - | EXCSAVE | RFE
KernelExceptionVector | EPC | - | EXCSAVE | RFE
DoubleExceptionVector | DEPC | - | EXCSAVE | RFDE
WindowOverflow4 | EPC | - | - | RFWO
WindowOverflow8 | EPC | - | - | RFWO
WindowOverflow12 | EPC | - | - | RFWO
WindowUnderflow4 | EPC | - | - | RFWU
WindowUnderflow8 | EPC | - | - | RFWU
WindowUnderflow12 | EPC | - | - | RFWU
MemoryErrorVector | MEPC | MEPS | MESAVE | RFME
InterruptVector[i]¹ | EPCi¹ | EPSi¹ | EXCSAVEi¹ | RFI i¹
InterruptVector[d]² | EPCd² | EPSd² | EXCSAVEd² | RFI d²
InterruptVector[n]³ | EPCn³ | EPSn³ | EXCSAVEn³ | RFI n³
1. "i" indicates an arbitrary interrupt level. Medium- and high-level interrupts may be levels 2..6.
2. "d" indicates the debug level. It may be levels 2..6 but is usually the highest level other than NMI.
3. "n" indicates the NMI level. It may be levels 2..7. It must be the highest level but contiguous with other levels.

The taking of an exception under the Exception Option has the following semantics:
procedure Exception(cause)
    if (PS.EXCM & NDEPC=1) then
        DEPC ← PC
        nextPC ← DoubleExceptionVector
    elseif PS.EXCM then
        EPC[1] ← PC
        nextPC ← DoubleExceptionVector
    elseif PS.UM then
        EPC[1] ← PC
        nextPC ← UserExceptionVector
    else
        EPC[1] ← PC
        nextPC ← KernelExceptionVector
    endif
    EXCCAUSE ← cause
    PS.EXCM ← 1
endprocedure Exception
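
The procedure above can be mirrored in a small software model, which may help when reasoning about which vector is entered and which PC register is written. The following C++ sketch is only an illustration: `CpuState`, `Vector`, `take_exception`, and `exception_example` are hypothetical names, and the model tracks only the state that the procedure itself touches.

```cpp
#include <cassert>
#include <cstdint>

// The three vectors that the Exception procedure can select.
enum class Vector { DoubleException, UserException, KernelException };

// Hypothetical container for the state read or written by Exception(cause).
struct CpuState {
    std::uint32_t pc = 0;
    std::uint32_t epc1 = 0;      // EPC[1]
    std::uint32_t depc = 0;      // DEPC (meaningful only when has_depc is true)
    std::uint32_t exccause = 0;  // EXCCAUSE
    bool ps_excm = false;        // PS.EXCM
    bool ps_um = false;          // PS.UM
    bool has_depc = true;        // configuration parameter NDEPC=1
};

// Software model of the Exception procedure; returns the selected vector.
inline Vector take_exception(CpuState& s, std::uint32_t cause) {
    Vector next;
    if (s.ps_excm && s.has_depc) {
        s.depc = s.pc;
        next = Vector::DoubleException;
    } else if (s.ps_excm) {
        s.epc1 = s.pc;  // without DEPC, EPC[1] is overwritten
        next = Vector::DoubleException;
    } else if (s.ps_um) {
        s.epc1 = s.pc;
        next = Vector::UserException;
    } else {
        s.epc1 = s.pc;
        next = Vector::KernelException;
    }
    s.exccause = cause;
    s.ps_excm = true;
    return next;
}

// Usage example: a user-mode exception followed by a double exception.
inline void exception_example() {
    CpuState s;
    s.pc = 0x1000u;
    s.ps_um = true;
    assert(take_exception(s, 9u) == Vector::UserException);
    s.pc = 0x2000u;  // a second exception arrives while PS.EXCM is still set
    assert(take_exception(s, 24u) == Vector::DoubleException);
    assert(s.epc1 == 0x1000u && s.depc == 0x2000u);
}
```

The example shows why DEPC matters: with NDEPC=1 the original EPC[1] value survives the double exception, whereas with `has_depc` set to false it would be overwritten.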

4.4.1.11 Exception Priority under the Exception Option
In implementations where instruction execution is overlapped (for example, via a pipeline), multiple instructions can cause exceptions. In this case, priority is given to the exception caused by the earliest instruction.
When a given instruction causes multiple exceptions, the priority order for choosing the
exception to be reported is listed below from highest priority to lowest. In cases where it
is possible to have more than one occurrence of the same cause within the same instruction, the priority among the occurrences is undefined.
Pre-Instruction Exceptions:
• Non-maskable interrupt
• High-priority interrupt (including debug exception for DEBUG INTERRUPT)
• Level1InterruptCause
• Debug exception for ICOUNT
• Debug exception for IBREAK
Fetch Exceptions:
• Instruction-fetch translation errors
  - InstTLBMultiHitCause
  - InstTLBMissCause
  - InstFetchPrivilegeCause
  - InstFetchProhibitedCause
• InstructionFetchErrorCause (Instruction-fetch address or instruction data errors)
• ECC/parity exception for Instruction-fetch
Decode Exceptions:
• IllegalInstructionCause
• PrivilegedCause
• SyscallCause (SYSCALL instruction)
• Debug exception for BREAK (BREAK, BREAK.N instructions)
Execute Register Exceptions:
• Register window overflow
• Register window underflow (RETW, RETW.N instructions)
• AllocaCause (MOVSP instruction)
• CoprocessornDisabledCause
Execute Data Exceptions:
• Divide by Zero
Execute Memory Exceptions:
• LoadStoreAlignmentCause (in the absence of the Hardware Alignment Option)
• Debug exception for DBREAK
• IHI, PITLB, IPF, IPFL, or IHU target translation errors, in order of priority:
  - InstTLBMultiHitCause
  - InstTLBMissCause
  - InstFetchPrivilegeCause
  - InstFetchProhibitedCause
• Load or store translation errors, in order of priority:
  - LoadStoreTLBMultiHitCause
  - LoadStoreTLBMissCause
  - LoadStorePrivilegeCause
  - LoadProhibitedCause
  - StoreProhibitedCause
• InstructionFetchErrorCause (IPFL target address or data errors)
• LoadStoreAlignmentCause (in the presence of the Hardware Alignment Option)
• LoadStoreErrorCause (Load or store external address or data errors)
• ECC/parity exception for all accesses except instruction-fetch

Exceptions are grouped in the priority list by what information is necessary to determine
whether or not the exception is to be raised. The pre-instruction exceptions may be evaluated before the instruction begins because they require nothing but the PC of the instruction. Fetch exceptions are encountered in the process of fetching the instruction.
Decode exceptions may be evaluated after obtaining the instruction itself. Execute register exceptions require internal register state and execute memory exceptions involve the
process of accessing the memory on which the instruction operates.
Exceptions are not necessarily precise. On some implementations, some exceptions are
raised after subsequent instructions have been executed. In such implementations, the
EXCW instruction can be used to prevent unwanted effects of imprecise exceptions. The
EXCW instruction causes the processor to wait until all previous instructions have taken
their exceptions, if any.
Interrupts have an implicit EXCW; when an interrupt is taken, all instructions prior to the
instruction addressed by EPC have been executed and any exceptions caused by those
instructions have been raised. Interrupts are listed at the top of the priority list. Because
the relative cycle position of an internal instruction and an interrupt pin assertion is not
well-defined, the priority of interrupts with respect to exceptions is not truly well-defined
either.

4.4.2 Relocatable Vector Option
This option splits Exception Vectors into two groups and adds a choice of two base addresses for one group and a Special Register as a base for the other group.
• Prerequisites: Exception Option (page 82)
• Incompatible options: None

Under the Relocatable Vector Option, exception vectors are more restricted than they
are without it. The vectors are organized into two groups, a "Static" group and a "Dynamic" group. Within each group there is a required order among the vectors which exist. The list immediately after Table 4–65 (page 94) indicates both the group and the order within the group. Some implementations may place an upper bound on the size of
each group of vectors as measured by the difference between the address of the highest numbered vector in the group and the address of the lowest numbered vector in the
group.
The Static group of vectors is not movable under software control. Two base addresses
for the Static group are set by the designer at configuration time and an input pin of the
processor is sampled at reset to determine which of the two configured addresses will
be used. The base address will not change after reset. The offsets from this base are
also chosen at configuration time and will not change.
The Dynamic group of vectors is movable under software control. The Special Register,
VECBASE, described in Table 5–155 on page 224, holds the current base for the Dynamic group. The special register resets to a value set by the designer at configuration time
but is freely writable using the WSR.VECBASE instruction. The offsets from the base
must increase in the order indicated by Table 4–66 and are also set by the designer at
configuration time.
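
To make the base-plus-offset arrangement concrete, here is a minimal C++ sketch that forms a Dynamic-group vector address from a base and an offset and checks that a list of offsets follows the required order. The function names and the numeric offsets are hypothetical, and "increasing" is assumed here to mean strictly increasing; actual offsets and any group-size limit are fixed by the designer at configuration time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Checks that offsets, listed in the architecturally required vector order,
// are strictly increasing (an assumption of this sketch).
inline bool offsets_in_required_order(const std::vector<std::uint32_t>& offsets) {
    for (std::size_t i = 1; i < offsets.size(); ++i) {
        if (offsets[i] <= offsets[i - 1]) {
            return false;
        }
    }
    return true;
}

// Address of a Dynamic-group vector: current base (VECBASE) plus its offset.
inline std::uint32_t dynamic_vector_address(std::uint32_t vecbase,
                                            std::uint32_t offset) {
    return vecbase + offset;
}

// Usage example with made-up offsets for three vectors, lowest first.
inline void relocation_example() {
    const std::vector<std::uint32_t> offsets = {0x000u, 0x040u, 0x080u};
    assert(offsets_in_required_order(offsets));
    assert(dynamic_vector_address(0x40000000u, offsets[2]) == 0x40000080u);
}
```

Moving the whole Dynamic group then amounts to calling `dynamic_vector_address` with a different base while the offsets stay fixed.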
4.4.2.1 Relocatable Vector Option Architectural Additions
Table 4–67 shows this option’s architectural additions.
Table 4–67. Relocatable Vector Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹
VECBASE | 1 | 28 | Vector base | R/W | Table 5–155

4.4.3 Unaligned Exception Option

This option causes an exception to be raised on any unaligned memory access whether it is generated by core architecture memory instructions, by optional instructions, or by a designer’s TIE instructions.¹ With system software cooperation, occasional unaligned accesses can be handled correctly.

1. In the T1050 release, which was the first for the Unaligned Exception Option, only Core Architecture memory instructions raise the unaligned exception.

Cache line oriented instructions such as prefetch and cache management instructions
will not raise the unaligned exception. Special instructions such as LICW that use a generated address for something other than an actual memory address also will not raise
the exception. Individual instruction listings list the unaligned exception when it can be
raised by that instruction.
Memory access instructions raise the exception whenever the address is not a multiple of the access size associated with the instruction, whether or not the access crosses any particular size boundary. For example, an L16UI instruction that generates the address 32’h00000005 raises the unaligned exception even though both bytes of the access lie within a single aligned 32-bit word. The exception cause register will contain LoadStoreAlignmentCause (Table 4–68) and the exception virtual address register will contain the virtual address of the unaligned access.
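
The alignment rule reduces to a single remainder test. The C++ sketch below is only an illustration of that rule; the function name `raises_unaligned` is hypothetical, and the access size in bytes is assumed to be supplied by the caller (2 for L16UI in the example from the text).

```cpp
#include <cassert>
#include <cstdint>

// True when an access of size_bytes at vaddr would raise the unaligned
// exception, i.e. when vaddr is not a multiple of the access size.
inline bool raises_unaligned(std::uint32_t vaddr, std::uint32_t size_bytes) {
    assert(size_bytes != 0);  // an access size of zero is meaningless here
    return (vaddr % size_bytes) != 0;
}
```

With this helper, `raises_unaligned(0x00000005, 2)` is true for the L16UI case described above, while the same 16-bit access at address 0x00000006 is aligned.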
• Prerequisites: Exception Option (page 82)
• Incompatible options: None

4.4.3.1 Unaligned Exception Option Architectural Additions
Table 4–68 shows this option’s architectural additions.
Table 4–68. Unaligned Exception Option Constant Additions (Exception Causes)
Exception Cause | Description | Constant Value
LoadStoreAlignmentCause | Load or store to an unaligned address (see Table 4–64 on page 89) | 6'b001001 (decimal 9)

4.4.4 Interrupt Option

The Interrupt Option implements level-1 interrupts. These are asynchronous exceptions
on processor input signals or software exceptions. They have the lowest priority of all interrupts. Level-1 interrupts are handled differently than the high-priority interrupts at priority levels 2 through 6 or NMI. The Interrupt Option is a prerequisite for the High-Priority
Interrupt Option, Timer Interrupt Option, and Debug Option.
Certain aspects of high-priority interrupts are specified along with those of level-1 interrupts in the Interrupt Option. Specifically, the following parameters are specified:
• NINTERRUPT: Total number of level-1 plus high-priority interrupts.
• INTTYPE[0..NINTERRUPT-1]: Interrupt type (level, edge, software, or internal) for level-1 plus high-priority interrupts.
• INTENABLE: Interrupt-enable mask for level-1 plus high-priority interrupts.
• INTERRUPT: Interrupt-request register for level-1 plus high-priority interrupts.
Nevertheless, high-priority interrupts specified in the Interrupt Option are not operational without implementation of the High-Priority Interrupt Option.
• Prerequisites: Exception Option (page 82)
• Incompatible options: None

4.4.4.1 Interrupt Option Architectural Additions
Table 4–69 through Table 4–72 show this option’s architectural additions.
Table 4–69. Interrupt Option Constant Additions (Exception Causes)
Exception Cause | Description | Constant Value
Level1InterruptCause | Level-1 interrupt (see Table 4–64 on page 89) | 6'b000100 (decimal 4)

Table 4–70. Interrupt Option Processor-Configuration Additions
Parameter | Description | Valid Values
NINTERRUPT | Number of level-1, high-priority, and non-maskable interrupts | 1..32
INTTYPE[0..NINTERRUPT-1] | Interrupt type for level-1, high-priority, and non-maskable interrupts (Section 4.4.4.2) | See Table 4–73
LEVEL[0..NINTERRUPT-1] | Priority level of level-1 interrupts¹ | 1
1. This parameter has a fixed, implicit value. The parameter associates the level-1 interrupts with their interrupt priority (level) which, by definition, is always level 1 (lowest priority). The parameter must be explicitly specified only for the high-priority interrupts (Table 4–74 on page 107), each of which can be assigned different priority levels, from 2 to 15.

Table 4–71. Interrupt Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹
PS.INTLEVEL | 1 | 4 | Interrupt-level disable (see Table 4–63 on page 87) | R/W | See Table 4–63 on page 87
INTENABLE | 1 | NINTERRUPT | Interrupt enable mask (level-1 and high-priority interrupts). There is one bit for each level-1 and high-priority interrupt, except non-maskable interrupt (NMI) and Debug interrupt. To enable a given interrupt, CINTLEVEL (Table 4–57 on page 84) must be less than the level assigned by LEVEL[i] to that interrupt, and the INTENABLE bit for that interrupt must be set to 1. | R/W | 228
INTERRUPT (the mnemonics INTERRUPT, INTSET, and INTCLEAR are used depending on the type of access) | 1 | NINTERRUPT | Interrupt request register (level-1 and high-priority interrupts). This holds pending level-1 and high-priority interrupt requests. There is 1 bit per pending interrupt, except non-maskable interrupt (NMI). If the bit is set to 1, an interrupt request is pending. External level interrupt bits are not writable. | R or R/W² | 226 for read, 226 for set, and 227 for clear
1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
2. Level-sensitive interrupt bits are read-only, edge-triggered interrupt bits are read/clear, and software interrupt bits are read/write. Two register numbers are provided for software modification to the INTERRUPT register: one that sets bits, and one that clears them.

Table 4–72. Interrupt Option Instruction Additions
Instruction¹ | Format | Definition
RSIL | RRR | Read and set interrupt level
WAITI | RRR | Wait for interrupt
1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.4.4.2 Specifying Interrupts
Interrupt types (INTTYPE in Table 4–70) can be any of the values listed in Table 4–73.
The column labeled “Priority” shows the possible range of priorities for the interrupt type.
The column labeled “Pin” indicates whether there is an Xtensa core pin associated with
the interrupt, while the column labeled “Bit” indicates whether or not there is a bit in the
INTERRUPT and INTENABLE Special Registers corresponding to the interrupt. The last
two columns indicate how the interrupt may be set and how it may be cleared.

Table 4–73. Interrupt Types
Type | Priority¹ | Pin? | Bit? | How Interrupt is Set | How Interrupt is Cleared
Level | 1 to N | Yes | Yes | Signal level from device | At device
Edge | 1 to N | Yes | Yes | Signal rising edge | WSR.INTCLEAR ‘1’
NMI | N+1 | Yes | No | Signal rising edge | Automatically cleared by HW
Software | 1 to N | No | Yes | WSR.INTSET ‘1’ | WSR.INTCLEAR ‘1’
Timer | 1 to N | No | Yes | CCOUNT=CCOMPAREn | WSR.CCOMPAREn
Debug² | 2 to N | No² | No | Debug hardware² | Automatically cleared by HW
WriteErr | 1 to N | No | Yes | Bus error on write | WSR.INTCLEAR ‘1’
1. Possible priorities where N is NLEVEL
2. See Section 4.7.6 “Debug Option” on page 197 for more detail

A variable number (NINTERRUPT) of interrupts can be defined during processor configuration. External interrupt requests are signaled to the processor by either level-sensitive
or edge-triggered inputs. Software can test these interrupt requests at any time by reading the INTERRUPT register. An arbitrary number of software interrupts, not tied to an
external signal, can also be configured. Level-1 interrupts use either the
UserExceptionVector or KernelExceptionVector defined in Table 4–56 on
page 83, depending on the current setting of the PS.UM bit.
Software can manipulate the interrupt-enable bits (INTENABLE register) and then set
PS.INTLEVEL back to 0 to re-enable other interrupts, and thereby create arbitrary prioritizations. This is illustrated by the following C++ code:
#include <cstdint>
#include <vector>
using std::vector;
typedef unsigned int uint;

// rsr(), wsr(), and rsil() stand for the RSR, WSR, and RSIL instructions;
// they are assumed to be provided by the environment.

class Interrupt {
public:
    uint32_t bit;
    void handler();
};

class Level1Interrupt {
    static const uint NPRIORITY = 4;    // number of priority groupings of level1 interrupts
    struct InterruptGroup {
        uint32_t allbits;               // all INTERRUPT register bits at this priority
        uint32_t mask;                  // mask of interrupt bits at this priority and lower
        vector<Interrupt> intlist;      // list of interrupts at this priority
    } priority[NPRIORITY];
public:
    void handler();
};

// Called for all Level1 Interrupts
void
Level1Interrupt::handler ()
{
    // determine software priority of this level1 interrupt
    uint32_t interrupts = rsr(INTERRUPT);
    uint p;
    for (p = NPRIORITY-1; (interrupts & priority[p].allbits) == 0; p -= 1) {
        if (p == 0)
            return;
    }
    // found interrupts at priority p
    uint32_t save_enable = rsr(INTENABLE);              // save interrupt enables
    wsr (INTENABLE, save_enable &~ priority[p].mask);   // disable lower-priority ints
    // no xSYNC instruction should be necessary here because INTENABLE and
    // PS.INTLEVEL are both written and both used in the same pipe stages
    uint32_t save_ps = rsil (0);                        // save PS, then set level to 0
    // now higher-priority level1 interrupts are enabled
    // service all the priority p interrupts
    do {
        // first service the priority p interrupts we read earlier
        for (vector<Interrupt>::iterator i = priority[p].intlist.begin();
             i != priority[p].intlist.end(); i++) {
            if (interrupts & i->bit) {
                // interrupt i is asserted
                i->handler();           // call i’s handler
                                        // this should clear the interrupt condition before it returns
                interrupts &= ~i->bit;  // clear i’s bit from request
                if ((interrupts & priority[p].allbits) == 0)    // early check for done
                    break;
            }
        }
        // check if any more priority p interrupts arrived while we were servicing the previous batch
        interrupts = rsr(INTERRUPT);
    } while ((interrupts & priority[p].allbits) != 0);  // repeat while priority p requests remain
    // no more priority p interrupts
    wsr (PS, save_ps);                  // return to PS.INTLEVEL=1, disabling
                                        // all level1 interrupts, before returning
    wsr (INTENABLE, save_enable);       // restore original enables to allow lower
                                        // priority level1 interrupts
    // return to general exception handler
}

4.4.4.3 The Level-1 Interrupt Process
With respect to level-1 interrupts, the processor takes an interrupt when any level-1 interrupt, i, satisfies:
INTERRUPT_i and INTENABLE_i and (1 > CINTLEVEL)

Level-1 interrupts use the UserExceptionVector and KernelExceptionVector,
implemented by the Exception Option (Table 4–56 on page 83). The interrupt cause is
reported as Level1InterruptCause (Table 4–64). The interrupt handler can determine which level-1 interrupt caused the exception by doing an RSR of the INTERRUPT
register and ANDing with the contents of the INTENABLE register. The exact semantics
of the check for interrupts is given in "Checking for Interrupts" on page 109.
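
The condition above, together with the RSR-and-AND step a handler uses to find the requesting interrupt, can be sketched in C++ as follows. This is an illustrative model only: `level1_takeable` is a hypothetical name, and `level1_mask` is an assumed input holding a 1 for every interrupt number whose configured LEVEL[i] is 1.

```cpp
#include <cassert>
#include <cstdint>

// Returns the level-1 interrupts that satisfy
//   INTERRUPT_i and INTENABLE_i and (1 > CINTLEVEL).
// A result of zero means no level-1 interrupt would be taken.
inline std::uint32_t level1_takeable(std::uint32_t interrupt,
                                     std::uint32_t intenable,
                                     std::uint32_t level1_mask,
                                     unsigned cintlevel) {
    assert(cintlevel <= 15);  // PS.INTLEVEL is a 4-bit field (Table 4-71)
    if (!(1u > cintlevel)) {
        return 0;  // level-1 interrupts are masked at this interrupt level
    }
    return interrupt & intenable & level1_mask;
}
```

For example, with INTERRUPT = 0xA, INTENABLE = 0x6, and all four low interrupts at level 1, only bit 1 is both pending and enabled, so the result is 0x2 when CINTLEVEL is 0 and 0 when CINTLEVEL is 1 or higher.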
The process of taking an interrupt does not clear the interrupt request. The process
does set PS.EXCM to 1, which disables level-1 interrupts in the interrupt handler. Typically, PS.EXCM is reset to 0 by the handler, after it has set up the stack frame and
masked the interrupt. This allows other level-1 interrupts to be serviced. For level-sensitive interrupts, the handler must cause the source of the interrupt to deassert its interrupt
request before re-enabling the interrupt. For edge-triggered interrupts or software interrupts, the handler clears the interrupt condition by writing to the INTCLEAR register.
The WAITI instruction sets the current interrupt level in the PS.INTLEVEL register. In
some implementations it also powers down the processor’s logic, and waits for an interrupt. After executing the interrupt handler, execution continues with the instruction following the WAITI.
The INTENABLE register and the software and edge-triggered bits of the INTERRUPT
register are undefined after processor reset.
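
The different write behaviours of the INTERRUPT register bits can be pictured with a small software model. The C++ sketch below is an assumption-laden illustration, not a description of the hardware: `InterruptReg`, `settable_mask`, and `clearable_mask` are hypothetical names, where `settable_mask` marks software interrupt bits (set through INTSET) and `clearable_mask` marks the bits whose type Table 4–73 lists as cleared through WSR.INTCLEAR. Level-sensitive bits appear in neither mask and so cannot be changed by these writes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of software writes to the INTERRUPT register.
struct InterruptReg {
    std::uint32_t pending = 0;         // value software would read with RSR
    std::uint32_t settable_mask = 0;   // software interrupt bits
    std::uint32_t clearable_mask = 0;  // bits cleared through INTCLEAR

    // Model of WSR.INTSET: a written 1 sets the bit if it is software-settable.
    void write_intset(std::uint32_t value) {
        pending |= value & settable_mask;
    }

    // Model of WSR.INTCLEAR: a written 1 clears the bit if it is clearable.
    void write_intclear(std::uint32_t value) {
        pending &= ~(value & clearable_mask);
    }
};

// Usage example: bit 0 is a software interrupt, bit 1 is edge-triggered,
// and bit 2 is level-sensitive (all three assignments are made up).
inline std::uint32_t interrupt_reg_example() {
    InterruptReg r;
    r.settable_mask = 0x1u;
    r.clearable_mask = 0x3u;
    r.pending = 0x6u;         // edge and level requests already pending
    r.write_intset(0x5u);     // only bit 0 can be set by software
    assert(r.pending == 0x7u);
    r.write_intclear(0x7u);   // bits 0 and 1 clear; the level bit remains
    return r.pending;
}
```

In this model the level-sensitive request stays pending until the device itself deasserts it, which matches the handler requirement described above.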

4.4.4.4 Use of Interrupt Instructions
The RSIL instruction reads the PS register and sets the interrupt level. It is typically
used as follows:
    RSIL    a2, newlevel
    code to be executed at newlevel
    WSR     a2, PS

A SYNC instruction is not required after the RSIL.
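
In higher-level code the same save, raise, and restore pattern is often wrapped in a scope guard. The C++ sketch below models that idea against a simulated PS value so that it can run anywhere; `SimPs`, `rsil`, and `IntLevelGuard` are hypothetical names, and the placement of PS.INTLEVEL in the low four bits of PS is an assumption of this sketch (the field layout is given in Table 4–63, which is not reproduced here).

```cpp
#include <cassert>
#include <cstdint>

// Simulated PS register used in place of the real special register.
struct SimPs {
    std::uint32_t value = 0;
};

// Model of RSIL: return the old PS, then set the interrupt-level field.
inline std::uint32_t rsil(SimPs& ps, std::uint32_t newlevel) {
    assert(newlevel <= 15);  // the field is 4 bits wide (Table 4-71)
    const std::uint32_t old = ps.value;
    ps.value = (ps.value & ~0xFu) | newlevel;  // assumes INTLEVEL is bits 3..0
    return old;
}

// Scope guard: raises the level on construction (RSIL) and writes the
// saved PS back on destruction (the WSR a2, PS step).
class IntLevelGuard {
public:
    IntLevelGuard(SimPs& ps, std::uint32_t newlevel)
        : ps_(ps), saved_(rsil(ps, newlevel)) {}
    ~IntLevelGuard() { ps_.value = saved_; }
    IntLevelGuard(const IntLevelGuard&) = delete;
    IntLevelGuard& operator=(const IntLevelGuard&) = delete;

private:
    SimPs& ps_;
    std::uint32_t saved_;
};
```

Code placed inside the guard's scope corresponds to "code to be executed at newlevel"; leaving the scope restores the whole saved PS, including the previous interrupt level.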

4.4.5 High-Priority Interrupt Option

The High-Priority Interrupt Option implements a configurable number of interrupt levels
between level 2 and level 6, and an optional non-maskable interrupt (NMI) at an implicit
infinite priority level. Like level-1 interrupts, high-priority interrupts are external, internal
or software interrupts. Unlike level-1 interrupts, however, each high-priority interrupt level has its own interrupt vector and special registers dedicated for saving state
(EPC[level], EPS[level] and EXCSAVE[level]). This allows much lower latency
interrupts as well as very efficient handler mechanisms. The EPC, EPS and EXCSAVE
registers are undefined after reset.
Certain aspects of high-priority interrupts are specified along with those of level-1 interrupts in the Interrupt Option, including the total number of level-1 plus high-priority interrupts (NINTERRUPT), the interrupt type for level-1 plus high-priority interrupts
(INTTYPE), the interrupt-enable mask for level-1 plus high-priority interrupts
(INTENABLE), and the interrupt-request register for level-1 plus high-priority interrupts
(INTERRUPT).
- Prerequisites: Interrupt Option (page 100)
- Incompatible options: None

4.4.5.1 High-Priority Interrupt Option Architectural Additions
Table 4–74 through Table 4–76 show this option’s architectural additions.


Table 4–74. High-Priority Interrupt Option Processor-Configuration Additions

Parameter                        Description                               Valid Values
NLEVEL                           Number of high-priority interrupt levels  2..6 [1]
EXCMLEVEL                        Highest level masked by PS.EXCM           1..NLEVEL [2]
NNMI                             Number of non-maskable interrupts (NMI)   0 or 1
LEVEL[0..NINTERRUPT-1]           Priority levels of interrupts             1..NLEVEL [3]
InterruptVector[2..NLEVEL+NNMI]  High-priority interrupt vectors           32-bit address, aligned on a 4-byte boundary
LEVELMASK[1..NLEVEL-1]           Interrupt-level masks                     computed [4]

[1] An interrupt’s “level” expresses its priority. The NLEVEL parameter defines the number of total interrupt levels (including level 1). Without the High-Priority Interrupt Option, NLEVEL is fixed at 1. With the High-Priority Interrupt Option, NLEVEL ≥ 2.
[2] EXCMLEVEL was required to be 1 before the RA-2004.1 release. In the presence of the Debug Option, it still must be less than DEBUGLEVEL.
[3] This parameter associates interrupt levels (priorities) with interrupt numbers. Level-1 interrupts, by definition, are always priority level 1 (lowest priority), and are defined in Table 4–70 on page 101. Non-maskable interrupts (NMI) have many characteristics of the level NLEVEL+1. There is no level 0.
[4] This is computed as: LEVELMASK[j]_i = (LEVEL[i] = j), where j is the level specified for interrupt i, and the width of each LEVELMASK is NINTERRUPT. Thus, there are NLEVEL-1 masks (one for each high-priority interrupt level), and each mask is NINTERRUPT bits wide. A bit number set to 1 in a LEVELMASK means that the corresponding interrupt number has that priority level. The masks are used in the formal semantics to test whether an interrupt is taken on a given instruction ("Checking for Interrupts" on page 109).

Table 4–75. High-Priority Interrupt Option Processor-State Additions

Register Mnemonic         Quantity       Width (bits)         Register Name                                      R/W  Special Register Number [1]
EPC[2..NLEVEL+NNMI]       NLEVEL+NNMI-1  32                   Exception program counter                          R/W  178-183
EPS[2..NLEVEL+NNMI]       NLEVEL+NNMI-1  same as PS register  Exception program state                            R/W  194-199
EXCSAVE[2..NLEVEL+NNMI]   NLEVEL+NNMI-1  32                   Save location for high-priority interrupt handler  R/W  210-215

[1] Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.

Table 4–76. High-Priority Interrupt Option Instruction Additions

Instruction [1]  Format  Definition
RFI              RRR     Return from high-priority interrupt

[1] These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.


4.4.5.2 Specifying High-Priority Interrupts
The total number of level-1 plus high-priority interrupts (NINTERRUPT) and the interrupt
type for level-1 plus high-priority interrupts (INTTYPE) are specified in Table 4–70 on
page 101. The type of each high-priority interrupt level may be edge-triggered, level-sensitive, timer, write-error, or software.
The interrupt-enable mask for level-1 plus high-priority interrupts (INTENABLE) and the
interrupt-request register for level-1 plus high-priority interrupts (INTERRUPT) are specified in Table 4–71 on page 101.
The total number of interrupt levels is NLEVEL+NNMI (see Table 4–74). Specific interrupt
numbers are assigned interrupt levels using the LEVEL parameter in Table 4–74. A non-maskable interrupt may be configured with the NNMI parameter in Table 4–74. The non-maskable interrupt signal, if implemented, will be edge-triggered. Unlike other edge-triggered interrupts, there is no need to reset the NMI interrupt by writing to INTCLEAR.
4.4.5.3 The High-Priority Interrupt Process
Each high-priority interrupt level has three registers used to save processor state, as
shown in Table 4–75. The processor sets EPC[i] and EPS[i] when the interrupt is taken. EXCSAVE[i] exists for software. The RFI instruction reverses the interrupt process,
restoring processor state from EPC[i] and EPS[i].
The number of high-priority interrupt levels is expected to be small, due to the cost of
providing separate exception-state registers for each level. Interrupt numbers that share
level 1 are not limited to a single priority, because software can manipulate the interrupt-enable bits (INTENABLE register) to create arbitrary prioritizations.
The processor takes an interrupt only when some interrupt i satisfies:
    INTERRUPT_i and INTENABLE_i and (level[i] > CINTLEVEL)

where level[i] is the configured interrupt level of interrupt number i. Each level of
high-priority interrupt has its own interrupt vector (InterruptVector in Table 4–74).
Interrupt numbers that share a level (and associated vector) can read the INTERRUPT
register (and INTENABLE) with the RSR instruction to determine which interrupt(s) raised
the exception. The non-maskable interrupt (NMI), if implemented, is taken regardless of
the current interrupt level (CINTLEVEL) or of INTENABLE.
The value of CINTLEVEL is set to at least EXCMLEVEL whenever PS.EXCM=1. Thus, all
interrupts at level EXCMLEVEL and below are masked during the time PS.EXCM=1. This
allows interrupt handlers for levels not greater than EXCMLEVEL to be coded in a high-level language with the Windowed Register Option. High-priority interrupts with levels at or below EXCMLEVEL are often called medium-priority interrupts.
The interrupt latency is somewhat lower for levels greater than EXCMLEVEL, but handlers are more flexible for those whose level is not greater than EXCMLEVEL.
There are other conditions besides those in this section that can postpone the taking of
an interrupt. For details, refer to a specific Xtensa processor data book.
4.4.5.4 Checking for Interrupts
The example below checks for interrupts. This is the checkInterrupts() procedure
called in the code example shown in Section 3.5.4 “Instruction Fetch” on page 29. The
procedure itself checks for interrupts and takes the highest priority interrupt that is pending.
The chkinterrupt() function for non-NMI levels returns one if:
- the current interrupt level is not masking the interrupt (CINTLEVEL < level),
- the interrupt is asserted (INTERRUPT),
- the corresponding interrupt enable is set (INTENABLE), and
- the interrupt is configured at the level being checked (LEVELMASK[level]).
For NMI-level interrupts, no masking is done, but the edge sensor (made from
NMIinput and lastNMIinput) is explicitly included to avoid repeating the NMI every
cycle.
The takeinterrupt() function saves PC and PS in registers and changes them to
take the interrupt.
    procedure checkInterrupts()
        if chkinterrupt(NLEVEL+NNMI) then
            takeinterrupt(NLEVEL+NNMI)
        elseif chkinterrupt(NLEVEL+NNMI-1) then
            .
            .
            .
        elseif chkinterrupt(2) then
            takeinterrupt(2)
        elseif chkinterrupt(1) then
            Exception (Level1InterruptCause)
        endif
    endprocedure checkInterrupts

where chkinterrupt and takeinterrupt are defined as:

    function chkinterrupt(level)
        if level = NLEVEL+1 and NNMI = 1 then
            chkinterrupt ← NMIinput = 1 and lastNMIinput = 0
            lastNMIinput ← NMIinput
        elseif level ≤ NLEVEL then
            chkinterrupt ← (CINTLEVEL < level) and
                ((LEVELMASK[level] and INTERRUPT and INTENABLE) ≠ 0)
        else
            chkinterrupt ← 0
        endif
    endfunction chkinterrupt

    function takeinterrupt(level)
        EPC[level] ← PC
        EPS[level] ← PS
        PC ← InterruptVector[level]
        PS.INTLEVEL ← level
        PS.EXCM ← 1
    endfunction takeinterrupt
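
The selection logic above can be exercised with a small software model. The following Python sketch is illustrative only and is not part of the architecture definition: the class name Core, its field names, and the decision to return the selected level instead of modifying PC and PS are assumptions made for the example. LEVELMASK is derived from the LEVEL configuration as described in the notes to Table 4–74.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical model of the state used by checkInterrupts()."""
    nlevel: int              # NLEVEL: number of interrupt levels, including level 1
    nnmi: int                # NNMI: 0 or 1
    level: list              # LEVEL[i]: configured level of interrupt number i
    interrupt: int = 0       # INTERRUPT register, bit i = request i pending
    intenable: int = 0       # INTENABLE register, bit i = interrupt i enabled
    cintlevel: int = 0       # CINTLEVEL: current interrupt level
    nmi_input: int = 0       # NMIinput
    last_nmi_input: int = 0  # lastNMIinput

    def levelmask(self, lvl):
        # Bit i is set when interrupt number i is configured at level lvl.
        return sum(1 << i for i, l in enumerate(self.level) if l == lvl)

    def chkinterrupt(self, lvl):
        if lvl == self.nlevel + 1 and self.nnmi == 1:
            # NMI: edge-detected, never masked.
            taken = self.nmi_input == 1 and self.last_nmi_input == 0
            self.last_nmi_input = self.nmi_input
            return taken
        if lvl <= self.nlevel:
            pending = self.levelmask(lvl) & self.interrupt & self.intenable
            return self.cintlevel < lvl and pending != 0
        return False

    def check_interrupts(self):
        # Walk from the highest level down, as the if/elseif chain does,
        # and report the first level whose check succeeds (None if no level qualifies).
        for lvl in range(self.nlevel + self.nnmi, 0, -1):
            if self.chkinterrupt(lvl):
                return lvl
        return None
```

For instance, with `Core(nlevel=3, nnmi=1, level=[1, 2, 3, 3])`, a pending and enabled interrupt number 1 selects level 2 while CINTLEVEL is below 2, and raising `nmi_input` from 0 to 1 selects level 4 exactly once.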

4.4.6 Timer Interrupt Option

The Timer Interrupt Option is an in-core peripheral option for Xtensa processors. The
Timer Interrupt Option can be used to generate periodic interrupts from a 32-bit counter
and up to three 32-bit comparators. One counter period typically represents a number of
seconds of elapsed time, depending on the clock rate at which the processor is configured.
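
As a worked example of the counter period, the following Python sketch computes how long a 32-bit cycle counter takes to wrap. The function name and the clock rates used are assumptions chosen for illustration, not values taken from any particular processor configuration.

```python
def ccount_wrap_seconds(clock_hz):
    """Seconds for a 32-bit cycle counter to wrap once at clock_hz."""
    return 2**32 / clock_hz


# Hypothetical clock rates, for illustration only.
period_100mhz = ccount_wrap_seconds(100_000_000)   # about 42.9 seconds
period_1ghz = ccount_wrap_seconds(1_000_000_000)   # about 4.3 seconds
```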
- Prerequisites: Interrupt Option (page 100)
- Incompatible options: None

4.4.6.1 Timer Interrupt Option Architectural Additions
Table 4–77 and Table 4–78 show this option’s architectural additions.
Table 4–77. Timer Interrupt Option Processor-Configuration Additions

Parameter                  Description                           Valid Values
NCCOMPARE                  Number of 32-bit comparators          0..3 [1][2]
TIMERINT[0..NCCOMPARE-1]   Interrupt number for each comparator  0..NINTERRUPT-1 [3]

[1] The comparison registers can easily be multiplexed among multiple uses, so more than one comparator is usually not useful unless each comparator uses a different TIMERINT interrupt level.
[2] NCCOMPARE=0 with the Timer Interrupt Option specifies that CCOUNT exists, but there are no CCOMPARE registers or interrupts.
[3] NINTERRUPT is defined in the Interrupt Option, Table 4–70.


Table 4–78. Timer Interrupt Option Processor-State Additions

Register Mnemonic  Quantity   Width (bits)  Register Name                             R/W      Special Register Number [1]
CCOUNT             1          32            Processor-clock count                     R/W [2]  234
CCOMPARE           NCCOMPARE  32            Processor-clock compare (CCOUNT value at  R/W [3]  240-242
                                            which an interrupt is generated)

[1] Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 3–23 on page 46.
[2] This register is not normally written except after reset; it is writable primarily for testing purposes.
[3] Writing CCOMPARE clears a pending interrupt.

4.4.6.2 Clock Counting and Comparison
The CCOUNT register increments on every processor-clock cycle. When CCOUNT =
CCOMPARE[i], a TIMERINT[i] interrupt request is generated. Although CCOUNT continues to increment and thus matches for only one cycle, the interrupt request is remembered until the interrupt is taken. In spite of this, timer interrupts are cleared by writing
CCOMPARE[i], not by writing INTCLEAR. Interrupt configuration determines the interrupt number and level. The interrupt type is automatically Internal (the INTTYPE[i]
configuration parameter, Table 4–70).
For most applications, only one CCOMPARE register is required, because it can easily be
shared for multiple uses. Applications that require a greater range of counting than that
provided by the 32-bit CCOMPARE register can maintain a 64-bit cycle count and compare the upper bits in software.
CCOUNT and CCOMPARE[0..NCCOMPARE-1] are undefined after processor reset.
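
The behavior described in this section can be sketched as a small Python model. It is illustrative only: the class and method names are assumptions, and the 64-bit extension shown is one possible software scheme, which assumes that CCOUNT is sampled at least once per 32-bit wrap.

```python
MASK32 = 0xFFFFFFFF


class Timer:
    """Hypothetical model of CCOUNT and one CCOMPARE register."""

    def __init__(self):
        self.ccount = 0
        self.ccompare = 0
        self.pending = False

    def tick(self, cycles=1):
        for _ in range(cycles):
            self.ccount = (self.ccount + 1) & MASK32
            if self.ccount == self.ccompare:
                # The match lasts one cycle, but the request is remembered.
                self.pending = True

    def write_ccompare(self, value):
        self.ccompare = value & MASK32
        self.pending = False  # writing CCOMPARE clears the request


class Cycles64:
    """Software-maintained 64-bit cycle count built on 32-bit samples."""

    def __init__(self):
        self.high = 0
        self.last_low = 0

    def update(self, ccount):
        if ccount < self.last_low:  # the 32-bit counter wrapped since last sample
            self.high += 1
        self.last_low = ccount
        return (self.high << 32) | ccount
```

A periodic interrupt handler would typically re-arm the timer by writing CCOMPARE with the previous compare value plus the period, modulo 2^32, which also clears the pending request.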

4.5 Options for Local Memory

The options in this section have the primary function of adding different kinds of memory, such as RAMs, ROMs, or caches to the processor. The added memories are tightly
integrated into the processor pipeline for highest performance.

4.5.1 General Cache Option Features

This subsection describes general characteristics of caches that are referred to in multiple later subsections about specific cache options.


4.5.1.1 Cache Terminology
In the cache documentation a “line” is the smallest unit of data that can be moved between the cache and other parts of the system. If the cache is “direct-mapped,” each
byte of memory may be placed in only one position in the cache. In a direct-mapped
cache, the “index” refers to the portion of the address that is necessary to identify the
cache line containing the access.
A cache is “set-associative” if there is more than one location in the cache into which
any given line may be placed. It is “N-way set-associative” if there are N locations into
which any given line may be placed. The set of all locations into which one line may be
placed is called a “set” and the “index” refers to the portion of the address that is necessary to identify the set containing the access. The various locations within the set that
are capable of containing a line are called the “ways” of the set. And the union of the Nth
way of each set of the cache is the Nth “way” of the cache.
For example, a 4-way set-associative, 16k-byte cache with a 32-byte line size contains
512 lines. There are 128 sets of 4 lines each. The index is a 7-bit value that would most
likely consist of Address<11:5> and is used to determine what set contains the line. The
cache consists of 4 ways, each of which is 4k-bytes in size. A set represents 128 bytes
of storage made up of four lines of 32 bytes each.
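
The arithmetic in this example can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes that the line size and the number of sets are powers of two and that the index is taken from the address bits immediately above the line offset, which the text above describes only as the most likely arrangement.

```python
def cache_geometry(cache_bytes, way_count, line_bytes):
    """Derive line, set, and index figures for a set-associative cache."""
    lines = cache_bytes // line_bytes
    sets = lines // way_count
    offset_bits = line_bytes.bit_length() - 1  # log2(line_bytes)
    index_bits = sets.bit_length() - 1         # log2(sets)
    return {
        "lines": lines,
        "sets": sets,
        "index_bits": index_bits,
        "index_low_bit": offset_bits,
        "index_high_bit": offset_bits + index_bits - 1,
        "way_bytes": cache_bytes // way_count,
        "set_bytes": way_count * line_bytes,
    }


def set_index(addr, cache_bytes, way_count, line_bytes):
    """Extract the set index from an address under the assumptions above."""
    g = cache_geometry(cache_bytes, way_count, line_bytes)
    return (addr >> g["index_low_bit"]) & ((1 << g["index_bits"]) - 1)


# The 4-way, 16k-byte, 32-byte-line cache from the text.
geometry = cache_geometry(16 * 1024, 4, 32)
```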
4.5.1.2 Cache Tag Format
Figure 4–14 shows the instruction- and data-cache tag format for Xtensa. The number of
bits in the tag is a configuration parameter. So that all lines may be differentiated, the tag
field must always be at least 32−log2(CacheBytes/CacheWayCount) bits wide. If
an MMU with pages smaller than a way of the cache is used, the tag field must also be
at least 32−log2(MinPageSize) bits wide. The actual tag field size is the maximum of
these two values. The bits used in the tag field are the upper bits of the virtual address
left justified in the register (the most significant bit of the register represents the most
significant bit of the virtual address, bit 31). For example:
- A 16 kB direct-mapped cache would have an 18-bit tag field.
- A 16 kB 2-way associative cache would have a 19-bit tag field.
- A 16 kB 2-way associative cache in conjunction with an MMU with a 4kB minimum
  page size would have a 20-bit tag field.
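
The three tag widths above follow directly from the two lower bounds given in the text. A minimal Python sketch, assuming power-of-two way and page sizes (the function name and parameter names are hypothetical):

```python
def tag_field_bits(cache_bytes, way_count, min_page_bytes=None):
    """Minimum tag width: the larger of the way-size and page-size bounds."""
    way_bytes = cache_bytes // way_count
    bits = 32 - (way_bytes.bit_length() - 1)  # 32 - log2(CacheBytes/CacheWayCount)
    if min_page_bytes is not None and min_page_bytes < way_bytes:
        # An MMU page smaller than a way requires more tag bits.
        bits = max(bits, 32 - (min_page_bytes.bit_length() - 1))
    return bits
```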

The V bit is the line valid bit; 0 → invalid, 1 → valid. The three flag bits exist only for certain cache configurations. Any of the flag bits in Figure 4–14 not used in a particular configuration are reserved for future use and writing nonzero values to them gives undefined behavior. If the cache is set-associative, then bit[1] is the F bit and is used for
cache miss refill way selection. If the cache is a data cache with writeback functionality,
then the lowest remaining bit is the D bit, or dirty bit, and is used to signify whether the
cache contains a value more recent than its backing store and must be written back. If
the Index Lock Option is selected for that cache, the lowest remaining bit is the L bit, or
lock bit, and is used to signify whether or not the line is locked and may not be replaced. [1]
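
    31                             4 3      1  0
    +-----------+-------------------+---------+---+
    |    Tag    |     reserved      |  Flag   | V |
    +-----------+-------------------+---------+---+
          n            28-n              3      1

Figure 4–14. Instruction and Data Cache Tag Format for Xtensa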
4.5.1.3 Cache Prefetch
There are two types of cache prefetch instructions. Normal prefetch instructions make
no change in the architecturally visible state but simply attempt to move cache lines
closer to the processor core. Any exception that might be raised causes the instruction
to become a NOP rather than actually raising an exception. This allows prefetch instructions to be used without penalty in places where their addresses may not represent legal
memory locations.
IPF attempts to move cache lines to the instruction cache. DPFR, DPFRO, DPFW, and
DPFWO attempt to move cache lines to the data cache. The differences are that the *R*
versions indicate that a write is not expected to the location in the immediate future while
the *W* versions indicate that a write to the location is likely in the near future. The *O
versions indicate that the most likely behavior is that the location is accessed in the near
future, but that it is not worth keeping after that access as another access is not expected. DPFWO indicates that either a write or a read followed by a write is expected soon.
The *O versions may be placed in different cache ways or kept in a separate buffer in
some implementations.
The second type of prefetch instructions, prefetch and lock instructions, are only available under their respective Cache Index Lock Options. They also do not change the operation of memory loads and stores and they affect only cache tag state, which affects
only future invalidation or line replacement operations on these lines. They are heavyweight operations and, unlike normal prefetch instructions, are only expected to be executed by code that sets up the caches for best performance.
The functions iprefetch and dprefetch are described below. Because they modify
no architectural state, they are described only by comments.

[1] Note that the three flag bits are added sequentially from the right. The bits that exist are always contiguous with each other and with the V bit on the right. For the instruction cache, the valid combinations are 0-L-F, 0-0-F, and 0-0-0 because the instruction cache cannot be writeback and the Index Lock Option is only available for set-associative caches. For the data cache, the valid combinations are 0-L-F, 0-0-F, 0-0-0, L-D-F, 0-D-F, and 0-0-D, which are the same three with and without the dirty bit inserted in its order.


    function iprefetch(vAddr, pAddr, lock) -- instruction prefetch
        if lock then
            -- move the line specified by vAddr/pAddr into the instruction cache
            -- mark the line locked
        else
            -- no architecturally visible operation performed
            -- no exception raised
            -- try to move the line specified by vAddr/pAddr into the instruction cache
        endif
    endfunction iprefetch

    function dprefetch(vAddr, pAddr, excl, once, lock) -- data prefetch
        if lock then
            -- move the line specified by vAddr/pAddr into the data cache
            -- mark the line locked
        elseif excl then
            -- no architecturally visible operation performed
            -- no exception raised
            -- if caches are coherent, get an exclusive copy
            if once then
                -- try to move the line specified by vAddr/pAddr where it can be
                -- read and written once
            else
                -- try to move the line specified by vAddr/pAddr into the data cache
            endif
        else
            -- no architecturally visible operation performed
            -- no exception raised
            if once then
                -- try to move the line specified by vAddr/pAddr where it can be
                -- read once
            else
                -- try to move the line specified by vAddr/pAddr into the data cache
            endif
        endif
    endfunction dprefetch
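
To relate the four data-prefetch mnemonics to the excl and once arguments, the following Python sketch may help. The mapping table is an assumption inferred from the prose in this section (the W variants request an exclusive copy, the O variants are the once forms); the authoritative argument values are in the individual instruction descriptions in Chapter 6. The function name and the returned strings are hypothetical.

```python
# Assumed (excl, once) arguments per mnemonic; the lock path is not listed.
DPREFETCH_HINTS = {
    "DPFR":  (False, False),
    "DPFW":  (True,  False),
    "DPFRO": (False, True),
    "DPFWO": (True,  True),
}


def dprefetch_intent(excl, once, lock=False):
    """Summarize the branch of dprefetch that the arguments select."""
    if lock:
        return "move line into data cache and mark it locked"
    parts = []
    if excl:
        parts.append("get an exclusive copy if caches are coherent")
    if once:
        how = "read and written once" if excl else "read once"
        parts.append("place line where it can be " + how)
    else:
        parts.append("try to move line into data cache")
    return "; ".join(parts)
```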


4.5.2 Instruction Cache Option

The Instruction Cache Option adds an on-chip first-level instruction cache. The Instruction
Cache Option also adds a few new instructions for prefetching and invalidation.
- Prerequisites: Processor Interface Option (page 194)
- Incompatible options: None

4.5.2.1 Instruction Cache Option Architectural Additions
Table 4–79 through Table 4–80 show this option’s architectural additions.
Table 4–79. Instruction Cache Option Processor-Configuration Additions

Parameter           Description                                 Valid Values
InstCacheWayCount   Instruction-cache set associativity (ways)  1..4 [1]
InstCacheLineBytes  Instruction-cache line size (bytes)         16, 32, 64, 128, 256 [1]
InstCacheBytes      Instruction-cache size (bytes)              1kB, 1.5kB, 2kB, 3kB, ... 32kB [1]
MemErrDetection     Error detection type [2]                    None, parity, ECC
MemErrEnable        Error enable                                No-detect, detect [3]

[1] Valid values vary per implementation. Refer to information on local memories in a specific Xtensa processor data book.
[2] Must be identical for every instruction memory.
[3] Detection may be enabled only when the Memory ECC/Parity Option is configured.


Table 4–80. Instruction Cache Option Instruction Additions

Instruction [1]  Format  Definition
IPF              RRI8    Instruction-cache prefetch
                         This instruction checks whether the line containing the specified address is
                         present in the instruction cache, and if not, begins the transfer of the line from
                         memory to the cache. In some implementations, prefetching an instruction line
                         may prevent the processor from taking an instruction cache miss later.
IHI              RRI8    Instruction-cache hit invalidate
                         This instruction invalidates a line in the instruction cache if present and not
                         locked. If the specified address is not in the instruction cache then this
                         instruction has no effect. If the specified line is present and not locked, it is
                         invalidated. This instruction is required before executing instructions that have
                         been written by this processor, another processor, or DMA.
III              RRI8    Instruction-cache index invalidate
                         This instruction uses the virtual address to choose a location in the instruction
                         cache and invalidates the specified line if it is not locked. The method for
                         mapping the virtual address to an instruction cache location is implementation-specific.
                         This instruction is primarily useful for instruction cache initialization
                         after power-up (note that if the Instruction Cache Index Lock Option is
                         implemented, an IIU instruction should precede the III).

[1] These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

See Section 5.7 “Caches and Local Memories” on page 240 for more information about
synchronizations required when using the instruction cache.

4.5.3 Instruction Cache Test Option

The Instruction Cache Test Option is currently added to every processor that has an Instruction Cache Option; therefore, it is not actually a separate option. It adds instructions
capable of reading and writing the tag and data of the instruction cache. These instructions are intended to be used in testing the instruction cache, rather than in operational
code and may not be implemented in a binary compatible way in all future processors.
- Prerequisites: Processor Interface Option (page 194) and Instruction Cache Option
  (page 115)
- Incompatible options: None

4.5.3.1 Instruction Cache Test Option Architectural Additions
Table 4–81 shows this option’s architectural additions.


Table 4–81. Instruction Cache Test Option Instruction Additions

Instruction [1]  Format  Definition
LICT             RRR     Load instruction cache tag
                         This instruction uses its address to specify a line in the instruction cache and
                         loads the tag for that line into a register.
LICW             RRR     Load instruction cache word
                         This instruction uses its address to specify a word in the instruction cache and
                         loads that word into a register.
SICT             RRR     Store instruction cache tag
                         This instruction uses its address to specify a line in the instruction cache and
                         stores the tag for that line from a register.
SICW             RRR     Store instruction cache word
                         This instruction uses its address to specify a word in the instruction cache and
                         stores that word from a register.

[1] These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

The instruction-cache access instructions must be fetched from a region of memory that
has the bypass attribute. Use an ISYNC instruction before transferring back to cached
instruction space. See Section 5.7 “Caches and Local Memories” for more information
about synchronizations required when using the instruction cache.

4.5.4 Instruction Cache Index Lock Option

The Instruction Cache Index Lock Option adds the capability of individually locking each
line of the instruction cache. This option may only be added to a cache that has two or
more ways. One bit is added to the instruction cache tag RAM format. The Instruction
Cache Index Lock Option also adds new instructions for locking and unlocking lines.
- Prerequisites: Processor Interface Option (page 194) and Instruction Cache Option
  (page 115)
- Incompatible options: None

4.5.4.1 Instruction Cache Index Lock Option Architectural Additions
Table 4–82 shows this option’s architectural additions.


Table 4–82. Instruction Cache Index Lock Option Instruction Additions

Instruction [1]  Format  Definition
IPFL             RRI4    Instruction-cache prefetch and lock
                         This instruction checks whether the line containing the specified address is present in
                         the instruction cache, and if not, begins the transfer of the line from memory to the
                         cache. The line is placed in the instruction cache and the line marked as locked, that is,
                         not replaceable by ordinary instruction cache misses. To unlock the line, use IHU or
                         IIU. This instruction raises an illegal instruction exception on implementations that do
                         not support instruction cache locking.
IHU              RRI4    Instruction-cache hit unlock
                         This instruction unlocks a line in the instruction cache if present. If the specified
                         address is not in the instruction cache then this instruction has no effect. If the
                         specified line is present, it is unlocked. This instruction (or IIU) is required before
                         invalidating a line if it is locked.
IIU              RRI4    Instruction-cache index unlock
                         This instruction uses the virtual address to choose a location in the instruction cache
                         and unlocks the specified line. The method for mapping the virtual address to an
                         instruction cache location is implementation-specific. This instruction is primarily useful
                         for unlocking the entire instruction cache. This instruction (or IHU) is required before
                         invalidating a line if it is locked.

[1] These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

See Section 5.7 “Caches and Local Memories” for more information about synchronizations required when using the instruction cache.

4.5.5 Data Cache Option

The Data Cache Option adds an on-chip first-level data cache. It supports prefetching, writing back, and invalidation.
The data-cache prefetch read/write/once instructions have been provided to improve
performance, not to affect the processor state. Therefore, some implementations may
choose to implement these instructions as no-op instructions. In general, the performance improvement from using these instructions is implementation-dependent. In
some implementations, these instructions check whether the line containing the specified address is present in the data cache, and if not, begin the transfer of the line from
memory.
- Prerequisites: Processor Interface Option (page 194)
- Incompatible options: None


4.5.5.1 Data Cache Option Architectural Additions
Table 4–83 and Table 4–84 show this option’s architectural additions.
Table 4–83. Data Cache Option Processor-Configuration Additions

Parameter           Description                          Valid Values
DataCacheWayCount   Data-cache set associativity (ways)  1..4 [1]
DataCacheLineBytes  Data-cache line size (bytes)         16, 32, 64, 128, 256 [1]
DataCacheBytes      Data-cache size (bytes)              1kB, 1.5kB, 2kB, 3kB, ... 32kB [1]
IsWriteback         Data-cache configured as writeback   Yes, No
MemErrDetection     Error detection type [2]             None, parity, ECC
MemErrEnable        Error enable                         No-detect, detect [3]

[1] Valid values vary per implementation. Refer to information on local memories in a specific Xtensa processor data book.
[2] Must be identical for every data memory.
[3] Detection may be enabled only when the Memory ECC/Parity Option is configured.

Table 4–84. Data Cache Option Instruction Additions

Instruction [1]  Format  Definition
DPFR,            RRI8    Data-cache prefetch {read,write}{,once}
DPFW,                    The four variants specify various “hints” about how the data is likely to be used in the
DPFRO,                   future. DPFW and DPFWO indicate that the data is likely to be written in the near
DPFWO                    future. On some systems this is used to fetch the data with write permission (e.g. in a
                         system with shared and exclusive states). DPFR and DPFRO indicate that the data is
                         likely only to be read. The once forms, DPFRO and DPFWO, indicate that the data is
                         likely to be read or written only once before it is replaced in the cache. On some
                         implementations this might be used to select a specific cache way, or to select a
                         streaming buffer instead of the cache.
DHWB             RRI8    Data-cache hit writeback
                         If IsWriteback, this instruction forces dirty data in the data cache to be written
                         back to memory. If the specified address is not in the data cache, or is present but
                         unmodified, then this instruction has no effect. If the specified address is present and
                         modified in the data cache, the line containing it is written back, and marked
                         unmodified. This instruction is useful before a DMA read from memory, or to force
                         writes to a frame buffer to become visible, or to force writes to memory shared by two
                         processors.
                         If not IsWriteback, DHWB is a no-op.
DHWBI            RRI8    Data-cache hit writeback invalidate
                         If IsWriteback, this instruction forces dirty data in the data cache to be written
                         back to memory. If the specified address is not in the data cache then this instruction
                         has no effect. If the specified address is present and modified in the data cache, the
                         line containing it is written back. After the writeback, if any, the line containing the
                         specified address is invalidated if present and not locked. This instruction is useful in
                         the same circumstances as DHWB and also before a DMA write to memory that does
                         not completely overwrite the line.
                         If not IsWriteback, DHWBI is identical to DHI except for privilege.
DIWB             RRI4    Data-cache index writeback (added in T1050)
                         If IsWriteback, this instruction forces dirty data in the data cache to be written
                         back to memory. The virtual address is used, in an implementation-dependent manner,
                         to choose a cache line to write back. If the chosen line is unmodified, then this
                         instruction has no effect. If the chosen line is modified in the data cache, the line
                         containing it is written back, and marked unmodified. This instruction is useful for
                         writing back the entire cache.
                         If not IsWriteback, DIWB is a no-op.
DIWBI            RRI4    Data-cache index writeback invalidate (added in T1050)
                         If IsWriteback, this instruction forces dirty data in the data cache to be written
                         back to memory. The virtual address is used, in an implementation-dependent manner,
                         to choose a cache line to write back. If the chosen line is modified in the data cache,
                         the line containing it is written back, and marked unmodified. After the writeback, if
                         any, the chosen line is invalidated if it is not locked. This instruction is useful for writing
                         back and invalidating the entire cache.
                         If not IsWriteback, DIWBI simply invalidates without writeback.
DHI              RRI8    Data-cache hit invalidate
                         This instruction invalidates a line in the data cache if present and not locked. If the
                         specified address is not in the data cache then this instruction has no effect. If the
                         specified address is present and not locked, it is invalidated. This instruction is useful
                         before a DMA write to memory that overwrites the entire line.
DII              RRI4    Data-cache index invalidate
                         This instruction uses the virtual address to choose a location in the data cache and
                         invalidates the specified line if it is not locked. The method for mapping the virtual
                         address to a data cache location is implementation-specific. This instruction is
                         primarily useful for data cache initialization after power-up.

[1] These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

See Section 5.7 “Caches and Local Memories” for more information about synchronizations required when using the data cache.


If IsWriteback, there is a dirty bit added to the data cache tag RAM format. The attributes described in Section 4.6.3.3 and Section 4.6.5.10 are then capable of setting a
region of memory to be either write-back or write-through. If not IsWriteback, both attribute settings result in write-through semantics.
When a region of memory is marked write-back, any store that hits in the cache writes
only the cache (setting the dirty bit, if it is not already set) and does not send a write on
the PIF. Any store that does not hit in the cache causes a miss. When the line is filled,
the semantics of a cache hit described above are followed. If a dirty line is evicted to use
the space in the cache, the entire line will be written on the PIF. The DHWB, DHWBI, DIWB, and DIWBI instructions will also write back a line if it is marked dirty.
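The write-back behavior described above can be sketched as a toy model of a single cache line. This is only an illustration under stated assumptions: the type and function names (`line_t`, `stats_t`, `store_writeback`, `writeback_line`) are hypothetical, a line is identified by its tag alone, and the PIF is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one data-cache line (names are illustrative, not from the ISA). */
typedef struct {
    bool     valid;
    bool     dirty;   /* the dirty bit that exists when IsWriteback */
    uint32_t tag;
} line_t;

typedef struct {
    unsigned line_fills;   /* lines filled after a miss */
    unsigned castouts;     /* whole lines written on the PIF */
} stats_t;

/* Store to a write-back region: a miss fills the line first, then the
 * store follows hit semantics (write the cache only, set the dirty bit). */
static void store_writeback(line_t *l, uint32_t tag, stats_t *s)
{
    assert(l && s);
    if (!l->valid || l->tag != tag) {      /* miss */
        if (l->valid && l->dirty)
            s->castouts++;                 /* evicted dirty line goes out on the PIF */
        l->valid = true;
        l->tag = tag;
        l->dirty = false;
        s->line_fills++;
    }
    l->dirty = true;                       /* no write is sent on the PIF */
}

/* Models the writeback part of DHWB/DIWB: only a dirty line is written back. */
static void writeback_line(line_t *l, stats_t *s)
{
    assert(l && s);
    if (l->valid && l->dirty) {
        s->castouts++;
        l->dirty = false;                  /* line is marked unmodified */
    }
}
```

In this model two stores to the same line cause one fill and no PIF write; the line reaches memory only when it is evicted or explicitly written back.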

4.5.6 Data Cache Test Option

The Data Cache Test Option is currently added to every processor that has a Data
Cache Option and therefore is not actually a separate option. It adds instructions capable of reading and writing the tag of the data cache. These instructions are intended to
be used in testing the data cache rather than in operational code, and may not be implemented in a binary-compatible way in all future processors.
- Prerequisites: Processor Interface Option (page 194) and Data Cache Option (page 118)
- Incompatible options: None

4.5.6.1 Data Cache Test Option Architectural Additions
Table 4–85 shows this option’s architectural additions.
Table 4–85. Data Cache Test Option Instruction Additions
Instruction¹ | Format | Definition

LDCT | RRR | Load data cache tag
This instruction uses its address to specify a line in the data cache and
loads the tag for that line into a register.

SDCT | RRR | Store data cache tag
This instruction uses its address to specify a line in the data cache and
stores the tag for that line from a register.

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

There are no instructions to access the data-cache data array. Normal loads and stores
can be used for this purpose with the isolate attribute.
See Section 5.7 “Caches and Local Memories” for more information about synchronizations required when using the data cache.


4.5.7 Data Cache Index Lock Option

The Data Cache Index Lock Option adds the capability of individually locking each line
of the data cache. One bit is added to the data cache tag RAM format. The Data Cache
Index Lock Option also adds new instructions for locking and unlocking lines.
- Prerequisites: Processor Interface Option (page 194) and Data Cache Option (page 118)
- Incompatible options: None

4.5.7.1 Data Cache Index Lock Option Architectural Additions
Table 4–86 shows this option’s architectural additions.
Table 4–86. Data Cache Index Lock Option Instruction Additions
Instruction¹ | Format | Definition

DPFL | RRI4 | Data-cache prefetch and lock
This instruction checks whether the line containing the specified address is
present in the data cache, and if not, begins the transfer of the line from
memory to the cache. The line is placed in the data cache and the line marked
as locked, that is, not replaceable by ordinary data cache misses. To unlock
the line, use DHU or DIU. This instruction raises an illegal instruction
exception on implementations that do not support data cache locking.

DHU | RRI4 | Data-cache hit unlock
This instruction unlocks a line in the data cache if present. If the specified
address is not in the data cache then this instruction has no effect. If the
specified address is present, it is unlocked. This instruction (or DIU) is
required before invalidating a line if it is locked.

DIU | RRI4 | Data-cache index unlock
This instruction uses the virtual address to choose a location in the data cache
and unlocks the specified line. The method for mapping the virtual address to a
data cache location is implementation-specific. This instruction is primarily
useful for unlocking the entire data cache. This instruction (or DHU) is required
before invalidating a line if it is locked.

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

See Section 5.7 “Caches and Local Memories” for more information about synchronizations required when using the data cache.


4.5.8 General RAM/ROM Option Features

The RAM and ROM options both provide internal memories that are part of the processor’s address space and are accessed with the same timing as cache. These memories
should not be confused with system RAM and ROM located outside of the processor,
which are often larger, and may be used for both instructions and data, and shared between processors and other processing elements.
The basic configuration parameters are the size and base address of the memory. It is
possible to configure cache, RAM, and ROM independently for both instruction and data; however, some implementations may require an increased clock period if multiple instruction or multiple data memories are specified, or if the memory sizes are large. It is
sometimes appropriate for the system designer to instead place RAMs and ROMs external to the processor and access these through the cache.
Every Instruction and Data RAM and ROM is always required to be naturally aligned
(aligned on a boundary of a power of two which is equal to or larger than the size of the
RAM/ROM) in physical address space. The mapping from virtual address space to physical address space must have the property that the Index bits of the RAM/ROM are identity mapped. This is a slightly less restrictive condition than requiring that the RAM/ROM
must be contiguous and naturally aligned in virtual address space but this latter condition will always meet the requirement.
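The alignment and mapping conditions above can be checked with a few bit operations. The sketch below is an illustration, not part of the architecture definition: the function names are hypothetical, the memory size is assumed to be a power of two (as all listed sizes are), and "Index bits" is read here as the low log2(size) address bits that select a location inside the RAM/ROM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_two(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* True if a RAM/ROM of `size` bytes based at physical address `base`
 * is naturally aligned, that is, `base` is a multiple of `size`. */
static bool is_naturally_aligned(uint32_t base, uint32_t size)
{
    assert(is_power_of_two(size));
    return (base & (size - 1)) == 0;
}

/* True if the index bits (assumed: the low log2(size) bits) are the same
 * in the virtual and the physical address, i.e. identity mapped. */
static bool index_bits_identity_mapped(uint32_t vaddr, uint32_t paddr,
                                       uint32_t size)
{
    assert(is_power_of_two(size));
    return ((vaddr ^ paddr) & (size - 1)) == 0;
}
```

For example, a 128 KB memory at physical address 0x40000000 passes the alignment check, while the same memory at 0x40010000 does not.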
Instruction RAM can be referenced as data only by the L32I, L32R and S32I instructions and Instruction ROM referenced as data only by the L32I and L32R instructions.
This functionality is provided for initialization and test purposes, for which performance
is not critical, so these operations may be significantly slower on some Xtensa implementations. Most Xtensa code makes extensive use of L32R instructions, which load
values from a location relative to the current PC. For this to perform well for code located
in an instruction RAM or ROM, some sort of data memory (either internal or external)
should be located within the 256 KB range of the L32R instruction or else the Extended
L32R Option should be used.
Table 4–87 summarizes the restrictions on instruction and data RAM and ROM access.
The exceptions listed assume no memory protection exception has already been raised
on the access.


Table 4–87. RAM/ROM Access Restrictions
Memory | Instruction Fetch | L32R, L32I, L32I.N | Other Loads | S32I, S32I.N | Other Stores
InstROM | ok | ok¹ | undefined | LSE³ | LSE³
InstRAM | ok | ok¹ | undefined | ok¹ | undefined
DataROM | IFE² | ok | ok | LSE³ | LSE³
DataRAM | IFE² | ok | ok | ok | ok
UnifiedRAM | ok | ok | ok | ok | ok

1. Reduced performance on some Xtensa implementations
2. Instruction fetch error exception
3. Load store error exception

4.5.9 Instruction RAM Option

This option provides an internal, read-write instruction memory. It is typically useful as
the only processor instruction store (no instruction cache) when all of the code for an application will fit in a small memory, or as an additional instruction store in parallel with the
cache for code that must have constant access time for performance reasons.
- Prerequisites: None
- Incompatible options: None

4.5.9.1 Instruction RAM Option Architectural Additions
Table 4–88 shows this option’s configuration parameters. There are no processor state
or instruction additions.
Table 4–88. Instruction RAM Option Processor-Configuration Additions
Parameter | Description | Valid Values
InstRAMBytes | Instruction RAM size (bytes) | 512, 1kB, 2kB, 4kB, ... 256kB¹
InstRAMPAddr | Instruction RAM base physical address | 32-bit address, aligned on multiple of its size
MemErrDetection | Error detection type² | None, parity, ECC
MemErrEnable | Error enable | No-detect, detect³

1. Refer to information on local memories in a specific Xtensa processor data book.
2. Must be identical for every instruction memory
3. Detection may be enabled only when the Memory ECC/Parity Option is configured.


Instruction RAM may be accessed as data using the L32I, L32R, and S32I instructions. The operation of other loads and stores on InstRAM addresses is not defined.
S32I is useful for copying code into the InstRAM; L32I is useful for diagnostic testing of
InstRAM, and L32R allows constants to be loaded from InstRAM if no data memory is
within range. While L32I, L32R, and S32I to InstRAM are defined, on many implementations these accesses are much slower than references to data RAM, ROM, or cache,
and thus the use of InstRAM for data storage is not recommended.

4.5.10 Instruction ROM Option
This option provides an internal, read-only instruction memory. It is typically useful as
the only processor instruction store (no instruction cache) when all of the code for an application will fit in a small memory, or as an additional instruction store in parallel with the
cache for code that must have constant access time for performance reasons. Because
ROM is read-only, only code that is not subject to change should be put here.
- Prerequisites: None
- Incompatible options: None

4.5.10.1 Instruction ROM Option Architectural Additions
Table 4–89 shows this option’s configuration parameters. There are no processor state
or instruction additions.
Table 4–89. Instruction ROM Option Processor-Configuration Additions
Parameter | Description | Valid Values
InstROMBytes | Instruction ROM size (bytes) | 512, 1kB, 2kB, 4kB, ... 256kB¹
InstROMPAddr | Instruction ROM base physical address | 32-bit address, aligned on multiple of its size

1. Refer to information on Local Memories in a specific Xtensa processor data book.

Instruction ROM may be accessed as data using the L32I and L32R instructions. The
operation of other loads on InstROM addresses is not defined. L32I is useful for diagnostic testing of InstROM, and L32R allows constants to be loaded from InstROM if no
data memory is within range. While L32I and L32R to InstROM are defined, on many
implementations these accesses are much slower than references to data RAM, ROM,
or cache, and thus the use of InstROM for data storage is not recommended.


4.5.11 Data RAM Option
This option provides an internal, read-write data memory. It is typically useful as the only
processor data store (no data cache) when all of the data for an application will fit in a
small memory, or as an additional data store in parallel with the cache for data that must
have constant access time for performance reasons.
- Prerequisites: None
- Incompatible options: None

4.5.11.1 Data RAM Option Architectural Additions
Table 4–90 shows this option’s configuration parameters. There are no processor state
or instruction additions.
Table 4–90. Data RAM Option Processor-Configuration Additions
Parameter | Description | Valid Values
DataRAMBytes | Data RAM size (bytes) | 512, 1kB, 2kB, 4kB, ... 256kB¹
DataRAMPAddr | Data RAM base physical address | 32-bit address, aligned on multiple of its size
MemErrDetection | Error detection type² | None, parity, ECC
MemErrEnable | Error enable | No-detect, detect³

1. Refer to information on Local Memories in a specific Xtensa processor data book.
2. Must be identical for every data memory
3. Detection may be enabled only when the Memory ECC/Parity Option is configured.

In the absence of the Extended L32R Option it is recommended that processors with
data RAM or ROM and no data cache be configured with the DataRAMPAddr or
DataROMPAddr below the lowest instruction address and above the highest instruction
address minus 256 KB, so that the L32R literals can be stored in RAM or ROM for fast
access. The processor will fetch L32R literals from instruction RAM or ROM, but in
many implementations several cycles are required for the fetch, making the use of this
feature undesirable. The Extended L32R Option allows less restricted placement.
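The placement recommendation above amounts to a simple range test. The following sketch is a coarse check under stated assumptions: the function name and parameters are hypothetical, addresses are treated as plain 32-bit values, and the exact per-instruction L32R offset rules (which depend on the PC of each L32R) are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define L32R_RANGE_BYTES (256u * 1024u)   /* 256 KB reach of L32R */

/* True if a literal at `literal_addr` lies below the lowest instruction
 * address and above the highest instruction address minus 256 KB, so that
 * every instruction in [lowest_insn, highest_insn] is within range of it. */
static bool literal_placement_ok(uint32_t literal_addr,
                                 uint32_t lowest_insn,
                                 uint32_t highest_insn)
{
    assert(lowest_insn <= highest_insn);
    if (literal_addr >= lowest_insn)
        return false;                      /* must sit below the code */
    /* literal_addr < lowest_insn <= highest_insn, so no underflow here */
    return (highest_insn - literal_addr) < L32R_RANGE_BYTES;
}
```

With code occupying 0x40000000 to 0x40008000, a literal at 0x3FFF0000 satisfies the recommendation, while one at 0x3FF00000 is too far below the highest instruction.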

4.5.12 Data ROM Option
This option provides an internal, read-only data memory. It is typically useful as an additional data store in parallel with the cache for data that must have constant access time for
performance reasons.
- Prerequisites: None
- Incompatible options: None


4.5.12.1 Data ROM Option Architectural Additions
Table 4–91 shows this option’s configuration parameters. There are no processor state
or instruction additions.
Table 4–91. Data ROM Option Processor-Configuration Additions
Parameter | Description | Valid Values
DataROMBytes | Data ROM size (bytes) | 512, 1kB, 2kB, 4kB, ... 256kB¹
DataROMPAddr | Data ROM base physical address | 32-bit address, aligned on multiple of its size

1. Refer to information on local memories in a specific Xtensa processor data book.

4.5.13 XLMI Option
The XLMI Option, or Xtensa Local Memory Interface Option, allows the attachment of
hardware other than caches, RAMs, and ROMs into the pipeline of the processor rather
than on the processor interface bus. The advantage of the XLMI is that the latency is
lower. The disadvantage is that speculation must be explicitly allowed for on loads. The
XLMI port contains signals that inform external devices after the fact concerning whether
a load was or was not speculative. Stores are never speculative. Refer to a specific
Xtensa processor data book for more detail.
- Prerequisites: None
- Incompatible options: None

Instructions may not be fetched from an XLMI interface. The virtual and physical addresses of the entire XLMI region must be identical in all bits.
4.5.13.1 XLMI Option Architectural Additions
Table 4–92 shows this option’s configuration parameters. There are no processor state
or instruction additions.
Table 4–92. XLMI Option Processor-Configuration Additions
Parameter | Description | Valid Values
XLMIBytes | XLMI size (bytes) | 512, 1kB, 2kB, 4kB, ... 256kB¹
XLMIPAddr | XLMI base physical address | 32-bit address, aligned on multiple of its size

1. Refer to information on local memories in a specific Xtensa processor data book.


4.5.14 Hardware Alignment Option
The Hardware Alignment Option adds hardware to the processor which allows loads and
stores to work correctly at any arbitrary alignment. It does this by making multiple accesses where necessary and combining the results. Unaligned accesses are still slower
than aligned accesses, but this option is more efficient than the Unaligned Exception
Option with a software handler. In addition, the Hardware Alignment Option will work in situations where a software handler is difficult to write (for example, a load-and-operate instruction).
- Prerequisites: Unaligned Exception Option (page 99)
- Incompatible options: None
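The idea of "making multiple accesses where necessary and combining the results" can be illustrated in software. The sketch below is not a description of the hardware; it assumes a little-endian byte order, models memory as an array of aligned 32-bit words, and uses a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>

/* Load a 32-bit value from byte offset `addr` inside the word array `mem`,
 * at any alignment, using at most two aligned word accesses.
 * Assumes little-endian byte order and that the word after the first one
 * exists whenever the access is unaligned. */
static uint32_t load32_any_alignment_le(const uint32_t *mem, uint32_t addr)
{
    assert(mem);
    uint32_t word  = addr >> 2;            /* index of the first aligned word */
    unsigned shift = (addr & 3u) * 8u;     /* bit offset inside that word */
    uint32_t lo = mem[word];               /* first portion of the access */
    if (shift == 0)
        return lo;                         /* aligned: one access suffices */
    uint32_t hi = mem[word + 1];           /* second portion of the access */
    return (lo >> shift) | (hi << (32u - shift));
}
```

The aligned case takes the single-access path, which mirrors the statement that unaligned accesses remain slower than aligned ones.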

The Hardware Alignment Option builds on the Unaligned Exception Option so that almost all potential LoadStoreAlignmentCause exceptions are handled transparently
by hardware instead. A few situations, which are never expected to happen in real software, still raise a LoadStoreAlignmentCause exception. In order to properly handle
all TLB misses and other exceptions, the priority of the LoadStoreAlignmentCause
exception is lower when the Hardware Alignment Option is present than when it is not.
Exception priorities are listed in Section 4.4.1.11.
A LoadStoreAlignmentCause exception may still be raised in some implementations
with the Hardware Alignment Option if the address of a load or store instruction is not a
multiple of its size and any of the following conditions is also true:
- The instruction is one of L32AI, S32RI, or S32C1I.
- The memory type for either portion is XLMI, IRAM, or IROM.
- The memory types (cache, DataRAM, bypass) of the two portions differ.
- The cache attribute for either portion is Isolate.
- The column labeled "Meaning for Cache Access" in either Table 4–104 on page 155 or Table 4–109 on page 178 is different for the two portions of the access.

4.5.15 Memory ECC/Parity Option
The Memory ECC/Parity Option allows the local memories and caches of Xtensa processors to be protected against errors by either parity or error correcting code (ECC). It
does not affect the processor interface and system memories must maintain their own
error detection and correction. Local memories must be wide enough to contain the additional bits required. The generation and checking of parity or ECC is done in the
Xtensa core through a combination of hardware and software mechanisms.
- Prerequisites: Exception Option (page 82)
- Incompatible options: None


Each memory may be protected or not protected individually. All protected instruction
memories must use a single protection type (parity or ECC). Likewise, all protected data
memories must use a single protection type. For parity protection, data memories require one additional bit per byte while instruction memories require one additional bit per
four bytes and cache tags require one additional bit per tag. For ECC protection, instruction memories require 7 additional bits per 32-bit word, data memories require 5 additional bits per byte, and cache tags require 7 additional bits per tag.
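The per-memory overhead stated above can be turned into a small calculation. This is a sketch for illustration only: the enum and function names are hypothetical, cache tags are left out, and the memory size is assumed to be a multiple of four bytes.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PROT_NONE, PROT_PARITY, PROT_ECC } prot_t;
typedef enum { MEM_INSTRUCTION, MEM_DATA } mem_kind_t;

/* Number of additional check bits needed to protect `bytes` bytes.
 * Parity: 1 bit per byte (data) or 1 bit per four bytes (instruction).
 * ECC:    5 bits per byte (data) or 7 bits per 32-bit word (instruction). */
static uint64_t check_bits(mem_kind_t kind, prot_t prot, uint32_t bytes)
{
    assert(bytes % 4 == 0);
    uint64_t words = bytes / 4;
    switch (prot) {
    case PROT_PARITY:
        return kind == MEM_DATA ? (uint64_t)bytes : words;
    case PROT_ECC:
        return kind == MEM_DATA ? 5u * (uint64_t)bytes : 7u * words;
    default:
        return 0;                          /* unprotected memory */
    }
}
```

For a 1 KB memory this gives 1024 parity bits or 5120 ECC bits for data, and 256 parity bits or 1792 ECC bits for instructions.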
The core computes parity or ECC bits on every store without doing a read-modify-write.
On every load or instruction fetch, these bits are checked and an exception is raised for
parity errors or for uncorrectable ECC errors. For correctable errors, a control bit in the
memory error status register (Table 4–94) indicates whether to raise an exception or
simply correct the value to be used (but not the value in memory) and continue. In addition, correctable ECC errors assert an output pin which may be used as an interrupt. Implementations may or may not implement hardware correction. If they do not implement
it, the exception is always raised.
4.5.15.1 Memory ECC/Parity Option Architectural Additions
Table 4–93 through Table 4–95 show this option’s architectural additions.
Table 4–93. Memory ECC/Parity Option Processor-Configuration Additions
Parameter | Description | Valid Values
MemoryErrorVector | Exception vector for memory errors | 32-bit address

Each RAM/Cache has configuration additions valid when the Memory ECC/Parity Option is configured.

Table 4–94. Memory ECC/Parity Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Access
MEPC | 1 | 32 | Memory error PC register | R/W | 106
MEPS | 1 | same as PS register¹ | Memory error PS register | R/W | 107
MESAVE | 1 | 32 | Memory error save register | R/W | 108
MESR | 1 | 19 | Memory error status register | R/W | 109
MECR | 1 | 22 | Memory error check register | R/W | 110
MEVADDR | 1 | 32 | Memory error virtual address register | R/W | 111

1. There are enough bits to save all configured PS Register Fields. See Table 4–63 on page 87.


Table 4–95. Memory ECC/Parity Option Instruction Additions
Instruction¹ | Format | Definition
RFME | RRR | Return from memory error

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.5.15.2 Memory Error Information Registers
Three registers are used to maintain information about a memory error. They are updated for memory errors which do not raise an exception, as well as those which do. The
memory error status register (MESR), shown in Figure 4–15 with further description in
Table 4–96, contains control bits that control the handling of memory errors and status
bits that hold information about memory errors that have occurred.
Under normal operation, check bits are always calculated and written to local memories.
When ECC is enabled, an uncorrectable error, or a correctable error for which the
MESR.DataExc or MESR.InstExc bit is set, will raise an exception whenever it is encountered during either a load or a dirty castout. Inbound PIF operations return an error
when appropriate but the error will not be noted by the local processor. Correctable errors during a dirty castout when MESR.DataExc is clear may, in some implementations,
correct the error on the fly without setting MESR.RCE or associated status.
When ECC is enabled and either the MESR.DataExc bit or the MESR.InstExc bit is
clear or the MESR.MemE bit is set, hardware may be able to correct an error without raising an exception. This may cause MESR.RCE (along with many other fields),
MESR.DLCE, or MESR.ILCE to be set by hardware at an arbitrary time.
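The decision between raising a memory error exception and correcting in hardware can be summarized as a predicate. This is a simplified reading of the rules in this section, not a definitive statement of hardware behavior: the names are hypothetical, hardware correction is assumed to be implemented, and dirty castouts and inbound PIF operations are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    ERR_PARITY            = 1,
    ERR_ECC_CORRECTABLE   = 2,
    ERR_ECC_UNCORRECTABLE = 3
} err_type_t;

/* The MESR control and status bits that influence the decision. */
typedef struct {
    bool err_enab;   /* MESR.ErrEnab */
    bool data_exc;   /* MESR.DataExc */
    bool inst_exc;   /* MESR.InstExc */
    bool mem_e;      /* MESR.MemE    */
} mesr_ctl_t;

/* True if the detected error raises a memory error exception. */
static bool raises_memory_error_exception(err_type_t e, bool is_ifetch,
                                          mesr_ctl_t c)
{
    assert(e >= ERR_PARITY && e <= ERR_ECC_UNCORRECTABLE);
    if (!c.err_enab)
        return false;          /* checking disabled: nothing is raised */
    if (e != ERR_ECC_CORRECTABLE)
        return true;           /* parity and uncorrectable ECC always raise */
    if (c.mem_e)
        return false;          /* handler in progress: hardware corrects */
    return is_ifetch ? c.inst_exc : c.data_exc;
}
```

When the predicate is false for a correctable error, hardware corrects the value and records the event through MESR.RCE, MESR.DLCE, or MESR.ILCE.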
In addition, an external pin reflects the state of MESR.RCE and can be connected to an
interrupt input on the Xtensa processor itself or on another processor. This interrupt may
be at a much lower priority than the memory error exception handler, but it can still repair the memory itself and/or log the error much as the memory error exception handler
might. MESR.RCE must be cleared by software to return the external pin to zero and to
re-arm the mechanism for recording correctable errors.
Bits | Width | Field
31..30 | 2 | Error Type
29..28 | 2 | *
27..24 | 4 | Memory Type
23..22 | 2 | *
21..20 | 2 | Access Type
19..18 | 2 | *
17..16 | 2 | Way Number
15..12 | 4 | *
11 | 1 | InstExc
10 | 1 | DataExc
9 | 1 | ErrTest
8 | 1 | ErrEnab
7 | 1 | *
6 | 1 | ILCE
5 | 1 | DLCE
4 | 1 | RCE
3..2 | 2 | *
1 | 1 | DME
0 | 1 | MemE

Figure 4–15. MESR Register Format
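A handler that has read MESR into a general register typically needs to pick the fields apart. The sketch below decodes a raw 32-bit value under the assumption that the fields sit where Figure 4–15 places them (Error Type in bits 31..30, Memory Type in 27..24, Access Type in 21..20, Way Number in 17..16, InstExc, DataExc, ErrTest, ErrEnab in bits 11..8, ILCE, DLCE, RCE in bits 6..4, DME in bit 1, MemE in bit 0). The struct and function names are hypothetical, and how the value is read from the special register is outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of MESR (illustrative names). */
typedef struct {
    bool     mem_e, dme, rce, dlce, ilce;
    bool     err_enab, err_test, data_exc, inst_exc;
    unsigned way_number;    /* bits 17..16 */
    unsigned access_type;   /* bits 21..20 */
    unsigned memory_type;   /* bits 27..24 */
    unsigned error_type;    /* bits 31..30 */
} mesr_fields_t;

/* Extract bits hi..lo (inclusive) of v. */
static uint32_t field(uint32_t v, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (v >> lo) & mask;
}

static mesr_fields_t mesr_decode(uint32_t mesr)
{
    mesr_fields_t f;
    f.mem_e       = field(mesr, 0, 0) != 0;
    f.dme         = field(mesr, 1, 1) != 0;
    f.rce         = field(mesr, 4, 4) != 0;
    f.dlce        = field(mesr, 5, 5) != 0;
    f.ilce        = field(mesr, 6, 6) != 0;
    f.err_enab    = field(mesr, 8, 8) != 0;
    f.err_test    = field(mesr, 9, 9) != 0;
    f.data_exc    = field(mesr, 10, 10) != 0;
    f.inst_exc    = field(mesr, 11, 11) != 0;
    f.way_number  = field(mesr, 17, 16);
    f.access_type = field(mesr, 21, 20);
    f.memory_type = field(mesr, 27, 24);
    f.error_type  = field(mesr, 31, 30);
    return f;
}
```

Reserved bits are ignored by the decoder, which matches the statement that they read as undefined.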


Table 4–96. MESR Register Fields
Field | Width (bits) | Definition

MemE | 1 | Memory error.
0 → Memory error exception not in progress.
1 → Memory error exception in progress.
Set on taking memory error exception. Cleared by RFME instruction. Software reads
and writes MemE normally.

DME | 1 | Double memory error.
0 → Normal memory error exception.
1 → Current memory error exception encountered during a memory error exception.
Set on taking memory error exception while MemE is set. Hardware does not clear.
Software reads and writes DME normally.

RCE (ECC¹) | 1 | Recorded correctable error. (Exists only if ECC is configured.)
0 → Status refers to something else.
1 → Status refers to an error corrected by hardware.
RCE means that status refers to a correctable memory error that has been fixed in
hardware. Status, here, means the group of state that contains information about a
memory error. It consists of the status fields of MESR (Way Number, Access
Type, Memory Type, and Error Type) and the contents of the MECR and
MEVADDR registers. The recorded information may be used to fix the error in the
memory copy or to log the error.
RCE is set by hardware whenever MemE is clear, RCE is clear, and a correctable
error is fixed in hardware. RCE is cleared by hardware when a memory exception is
raised, as the recorded information is lost and either DLCE or ILCE is set in its place.
Software reads and writes RCE normally.

DLCE (ECC¹) | 1 | Data lost correctable error. (Exists only if ECC is configured.)
0 → No information has been lost about data hardware corrected memory errors.
1 → Information has been lost about data hardware corrected memory errors.
DLCE means that there has been a correctable error on a data (execute) access
which has not been recorded because 1) it happened during a memory error exception
(MemE set), 2) a memory error exception happened before it was recorded (RCE now
cleared), or 3) it happened after another correctable error and before that error was
recorded (RCE also set).
DLCE is set by hardware whenever any data (execute) correctable error is fixed in
hardware but MemE or RCE is set and the new Access Type is not instruction
fetch. DLCE is also set by hardware when any memory exception is raised with RCE
set and the current Access Type is not instruction fetch. DLCE is never
cleared by hardware. Software reads and writes DLCE normally.

1. In some implementations the bits used with ECC may exist as state bits without effect even when only parity is configured.


Table 4–96. MESR Register Fields (continued)
Field | Width (bits) | Definition

ILCE (ECC¹) | 1 | Instruction fetch (Ifetch) lost correctable error. (Exists only if ECC is configured.)
0 → No information has been lost about ifetch hardware corrected memory errors.
1 → Information has been lost about ifetch hardware corrected memory errors.
ILCE means that there has been a correctable error on an Ifetch access which has
not been recorded because 1) it happened during a memory error exception (MemE
set), 2) a memory error exception happened before it was recorded (RCE now
cleared), or 3) it happened after another correctable error and before that error was
recorded (RCE also set).
ILCE is set by hardware whenever any Ifetch correctable error is fixed in hardware
but MemE or RCE is set and the new Access Type is instruction fetch. ILCE is
also set by hardware when any memory exception is raised with RCE set and the
current Access Type is instruction fetch. ILCE is never cleared by hardware.
Software reads and writes ILCE normally.

ErrEnab | 1 | Enable Memory ECC/Parity Option errors.
0 → Memory errors are disabled.
1 → Memory errors are enabled.
When ErrEnab is set, memory error exceptions and corrections are enabled. When
ErrEnab is clear, the same values are written to memories, but no checks are made and no
exceptions are raised on memory reads. Operation is undefined when both
ErrEnab and ErrTest are set. ErrEnab is not modified by hardware.

ErrTest | 1 | Memory error test mode.
0 → Normal memory error operation.
1 → Special memory error test operation.
When ErrTest is set, the memory write instructions S32I, S32I.N, SICT,
SICW, and SDCT insert the actual contents of the MECR register into the memory
check bits and the memory read instructions L32I, L32I.N, LICT, LICW, and
LDCT always place the actual check bits read from memory into the MECR register.
The operation of other memory access instructions is undefined when ErrTest is
set. When ErrTest is clear, memory writes compute appropriate check bits for
each write and memory reads do not affect the MECR register (unless a memory error
is detected). Cache fills and Inbound PIF operations are unaffected by the setting of
the ErrTest bit. Operation is undefined when both ErrEnab and ErrTest are
set. ErrTest is not modified by hardware.

1. In some implementations the bits used with ECC may exist as state bits without effect even when only parity is configured.


Table 4–96. MESR Register Fields (continued)
Field | Width (bits) | Definition

DataExc (ECC¹) | 1 | Data exception. (Exists only if ECC is configured.)
0 → No exception on hardware correctable data memory errors.
1 → Memory error exception on hardware correctable data memory errors.
Set by software to cause memory errors which might be handled in hardware on data
accesses to raise the memory error exception instead. This bit is forced to 1 (cannot
be cleared) if hardware is unable to handle any data access errors. If MemE is set, no
exception is raised for errors which hardware can handle even if DataExc is set.
DataExc is not modified by hardware.

InstExc (ECC¹) | 1 | Instruction exception. (Exists only if ECC is configured.)
0 → No exception on hardware correctable instruction fetch memory errors.
1 → Memory error exception on hardware correctable instruction fetch memory errors.
Set by software to cause memory errors which might be handled in hardware on
instruction fetches to raise the memory error exception instead. This bit is forced to 1
(cannot be cleared) if hardware is unable to handle any instruction fetch errors. If
MemE is set, no exception is raised for errors which hardware can handle even if
InstExc is set. InstExc is not modified by hardware.

Way Number | 2 | Cache way number of a memory error. (Exists only if a multiway cache is configured.)
When RCE or MemE is set and the Memory Type field points to a cache, this field
contains the cache way number containing the error.
Way Number is set by hardware whenever MemE is clear, RCE is clear, and a
correctable error is fixed in hardware or whenever a memory exception is raised.

Access Type | 2 | Access type of an access with memory error.
0 → Memory error during load or store
1 → Memory error during instruction fetch
2 → Memory error during instruction memory access (such as IPFL or IHI)
3 → Memory error during dirty line castout
When RCE or MemE is set, this field contains an indication of the access type which
caused the memory error.
Access Type is set by hardware whenever MemE is clear, RCE is clear, and a
correctable error is fixed in hardware or whenever a memory exception is raised.

1. In some implementations the bits used with ECC may exist as state bits without effect even when only parity is configured.


Table 4–96. MESR Register Fields (continued)
Field | Width (bits) | Definition

Memory Type | 4 | Memory type to which the access with memory error was directed.
0 → Error in instruction RAM 0.
1 → Error in data RAM 0.
2 → Error in instruction cache data array.
3 → Error in data cache data array.
4 → Error in instruction RAM 1.
5 → Error in data RAM 1.
6 → Error in instruction cache tag array.
7 → Error in data cache tag array.
8-15 → Reserved
When RCE or MemE is set, this field contains a pointer to the memory which caused
the memory error.
Memory Type is set by hardware whenever MemE is clear, RCE is clear, and a
correctable error is fixed in hardware or whenever a memory exception is raised.

Error Type | 2 | Type of memory error.
0 → Reserved
1 → Parity error
2 → Correctable ECC error
3 → Uncorrectable ECC error
When RCE or MemE is set, this field contains an indicator of the type of memory error
which caused the memory error.
Error Type is set by hardware whenever MemE is clear, RCE is clear, and a
correctable error is fixed in hardware or whenever a memory exception is raised.

* | | Reserved for future use
Writing a non-zero value to one of these fields results in undefined processor behavior.
These bits read as undefined.

1. In some implementations the bits used with ECC may exist as state bits without effect even when only parity is configured.
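The interplay of MemE, DME, RCE, DLCE, and ILCE described in Table 4–96 can be sketched as a small state model. This is one reading of the table, given for illustration: the type and function names are hypothetical, the "current Access Type" at exception time is taken to be the one held in the recorded status, and the status fields themselves (MECR, MEVADDR, and so on) are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* The five MESR status bits that hardware updates (illustrative model). */
typedef struct {
    bool mem_e, dme, rce, dlce, ilce;
} mesr_state_t;

/* A correctable error was fixed by hardware without raising an exception. */
static void on_hw_corrected(mesr_state_t *m, bool is_ifetch)
{
    assert(m);
    if (!m->mem_e && !m->rce) {
        m->rce = true;          /* status is recorded for this error */
    } else if (is_ifetch) {
        m->ilce = true;         /* could not be recorded: ifetch error lost */
    } else {
        m->dlce = true;         /* could not be recorded: data error lost */
    }
}

/* A memory error exception is taken. `recorded_is_ifetch` is the Access
 * Type held in the status that is about to be overwritten. */
static void on_memory_exception(mesr_state_t *m, bool recorded_is_ifetch)
{
    assert(m);
    if (m->rce) {               /* recorded information is lost */
        if (recorded_is_ifetch)
            m->ilce = true;
        else
            m->dlce = true;
        m->rce = false;
    }
    if (m->mem_e)
        m->dme = true;          /* exception inside the handler */
    m->mem_e = true;
}

/* RFME clears MemE; the other bits are left for software to clear. */
static void on_rfme(mesr_state_t *m)
{
    assert(m);
    m->mem_e = false;
}
```

Because DLCE, ILCE, and DME are never cleared by hardware in this model, a handler or logging routine has to clear them in software once they have been noted.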

The memory error check register (MECR), shown in Figure 4–16 with further description
in Table 4–97, contains syndrome bits that indicate what error occurred. For data memories, all four check fields are used so that all bytes may be covered. For instruction
memories or for cache tags, only the Check 0 field is used.
When the ErrEnab bit of the MESR register is set and the RCE or MemE bit of the MESR
register is turned on, this register contains error syndromes. For parity memories, the error syndrome is ’1’ corresponding to a parity error and ’0’ corresponding to no parity error. For ECC memories, the error syndrome is a set of bits equal in length to the number
of check bits associated with that portion of memory. The bits are all zero where there is
no error. Non-zero values give more information about which bit or bits are in error. The
exact encoding depends on the implementation. See the Xtensa Microprocessor Data
Book for more information on the encoding.
When the ErrTest bit of the MESR register is set, MECR is loaded by every L32I,
L32I.N, LICT, LICW, and LDCT instruction with the actual check bits which have been
read from memory. When the ErrTest bit of the MESR register is set, the fields of MECR
are used by the S32I, S32I.N, SICT, SICW, and SDCT instructions to write the memory
check bits. Operation of other memory access instructions is not defined when ErrTest
is set. Operation is not defined if both ErrEnab and ErrTest are set.
Error addresses are reported with reference to the 32-bit word containing the error, regardless of the size of the access; for all errors, MEVADDR contains an address
aligned to 32 bits. For data memories, the check field(s) in MECR corresponding to the
damaged byte(s) contains a non-zero syndrome. For tag memories and instruction
memories, the Check 0 field of MECR contains the syndrome for the entire word. Errors
in portions of the word not actually used by the access may or may not be reported in
MECR.
Bits     Field     Width (bits)
31..29   *         3
28..24   Check 3   5
23..21   *         3
20..16   Check 2   5
15..13   *         3
12..8    Check 1   5
7        *         1
6..0     Check 0   7

Figure 4–16. MECR Register Format
Table 4–97. MECR Register Fields

Field: Check 3
Width (bits): 5
Definition: Check bits for the high order byte of a 32 bit data word.
This field is valid for accesses to data RAM and data cache. It contains 5 check bits for
ECC memories and 1 check bit (at the right end of the field) for parity memories. The
field is associated with the highest address byte in little endian processors and the
lowest address byte in big endian processors.

Field: Check 2
Width (bits): 5
Definition: Check bits for the next high order byte of a 32 bit data word.
This field is valid for accesses to data RAM and data cache. It contains 5 check bits for
ECC memories and 1 check bit (at the right end of the field) for parity memories. The
field is associated with the second highest address byte in little endian processors and
the second lowest address byte in big endian processors.

Field: Check 1
Width (bits): 5
Definition: Check bits for the next low order byte of a 32 bit data word.
This field is valid for accesses to data RAM and data cache. It contains 5 check bits for
ECC memories and 1 check bit (at the right end of the field) for parity memories. The
field is associated with the second lowest address byte in little endian processors and
the second highest address byte in big endian processors.

Field: Check 0
Width (bits): 7
Definition: Check bits for the low order byte of a 32 bit data word.
For accesses to data RAM and data cache this field contains 5 check bits for ECC
memories and 1 check bit (at the right end of the field) for parity memories and is
associated with the lowest address byte in little endian processors and the highest
address byte in big endian processors.
For accesses to instruction RAM, instruction cache and all cache tags, this field
contains 7 check bits for ECC memories and 1 check bit (at the right end of the field)
for parity memories and covers the whole 32-bit word or tag.

Field: *
Definition: Reserved for future use
Writing a non-zero value to one of these fields results in undefined processor behavior.
These bits read as undefined.
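As a rough illustration of the layout in Figure 4–16, the following C sketch splits a raw MECR value into its four check fields. The type and function names are hypothetical and are not part of any Tensilica header; how the 32-bit value is read from the register is outside the sketch.

```c
#include <stdint.h>

/* Hypothetical container for the MECR check fields (Figure 4-16). */
typedef struct {
    uint8_t check3; /* bits 28..24 */
    uint8_t check2; /* bits 20..16 */
    uint8_t check1; /* bits 12..8  */
    uint8_t check0; /* bits 6..0   */
} mecr_fields;

/* Extract the check fields. Reserved bits (marked * in the figure)
   are masked off because they read as undefined. */
static mecr_fields mecr_decode(uint32_t mecr)
{
    mecr_fields f;
    f.check3 = (uint8_t)((mecr >> 24) & 0x1Fu);
    f.check2 = (uint8_t)((mecr >> 16) & 0x1Fu);
    f.check1 = (uint8_t)((mecr >> 8) & 0x1Fu);
    f.check0 = (uint8_t)(mecr & 0x7Fu);
    return f;
}
```

For data memories a non-zero field marks the corresponding byte as damaged; for instruction memories and cache tags only check0 is meaningful.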

The memory error virtual address register (MEVADDR), shown in Figure 4–17, contains
address information regarding the location of the error. Table 4–98 details its contents as
a function of two fields of the MESR register. For errors in cache tags and for errors in
castout data, MEVADDR contains only index information. Along with the Way Number
field in MESR, this allows the incorrect memory bits to be located. For errors in instructions or data being accessed, MEVADDR contains the full virtual address used by the instruction. Along with other status information, MEVADDR is written when the ErrEnab bit
of the MESR register is set and the RCE or MemE bit of the MESR register is turned on.
Bits    Field                          Width (bits)
31..0   Memory Error Virtual Address   32

Figure 4–17. MEVADDR Register Format
Table 4–98. MEVADDR Contents
MESR Memory Type               MESR Access Type   MEVADDR Contents
Instruction RAM n                                 Full virtual address used in instruction.
Data RAM n                                        Full virtual address used in instruction.
Instruction cache tag array                       Index bits are valid, other bits are undefined.
Instruction cache data array                      Full virtual address used in instruction. (1)
Data cache tag array                              Index bits are valid, other bits are undefined.
Data cache data array          LoadStore          Full virtual address used in instruction. (1)
Data cache data array          Castout            Index bits are valid, other bits are undefined.

1. For LICW instructions or Isolate cache attributes, only the index and way bits along with lower order bits are valid.


4.5.15.3 The Exception Registers
Three of the new registers created by this option are used in order to be able to take a
memory error exception at any time and return. As an exception, memory error cannot
be masked except by the MESR.ErrEnab bit. Whenever the exception is taken, the PC
of the instruction taking the error is saved in the MEPC register, the PS register is saved
in the MEPS register, and the MESAVE register is available for software use in the exception handler.
When an actual memory error exception is taken, the MEPC and MEPS registers are loaded with the original values of PC and PS, and then PS.INTLEVEL is raised to NLEVEL so
that all interrupts except NMI are masked and the PS.EXCM bit is set so that an ordinary
exception will cause a double exception. When hardware corrects a correctable memory
error, these actions are not taken, allowing memory error corrections even in the memory error exception handler.
A memory error exception may be taken at any time. This means that, even without
hardware correction, a memory error can be handled any time except during a memory
error handler. With hardware correction, only an uncorrectable memory error taken during a handler for another uncorrectable memory error is fatal.
4.5.15.4 Memory Error Semantics
Memory errors have the following semantics:
procedure MemoryError
    return if !MESR.ErrEnab
    exc ← ParityError | UncorrectableECCError
    exc ← 1 if !MESR.MemE & MESR.InsExc & AccessType = IFetch
    exc ← 1 if !MESR.MemE & MESR.DatExc & AccessType ≠ IFetch
    MESR.ILCE ← 1 if exc & MESR.RCE & MESR.AccessType = IFetch
    MESR.DLCE ← 1 if exc & MESR.RCE & MESR.AccessType ≠ IFetch
    MESR.ILCE ← 1 if !exc & MESR.RCE & AccessType = IFetch
    MESR.DLCE ← 1 if !exc & MESR.RCE & AccessType ≠ IFetch
    MESR.ILCE ← 1 if !exc & MESR.MemE & AccessType = IFetch
    MESR.DLCE ← 1 if !exc & MESR.MemE & AccessType ≠ IFetch
    if exc | !MESR.RCE then
        MESR.WayNumber ← WayNumber
        MESR.AccessType ← AccessType
        MESR.MemoryType ← MemoryType
        MESR.ErrorType ← ErrorType
        MECR ← CheckBits
        if MESR.AccessType = Castout then
            MEVADDR ← Undefined||CacheIndex||Undefined
        elsif MESR.MemoryType = Tag then
            MEVADDR ← Undefined||CacheIndex||Undefined
        else
            MEVADDR ← VAddr
        endif
        MESR.RCE ← !exc
    endif
    if exc then
        MESR.DME ← MESR.MemE
        MESR.MemE ← 1
        MEPC ← PC
        MEPS ← PS
        nextPC ← MemoryErrorExceptionVector
        PS.INTLEVEL ← NLEVEL
        PS.EXCM ← 1
    endif
endprocedure MemoryError
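The first part of the procedure, which decides whether an error raises an exception, can be modeled in C roughly as follows. This is only a sketch of the `exc` computation: the struct, the enum names, and the access-type values are hypothetical, while the error-type values follow the Error Type field encoding (1 = parity, 2 = correctable ECC, 3 = uncorrectable ECC).

```c
#include <stdbool.h>

/* Hypothetical encodings for illustration only. */
typedef enum { ACCESS_IFETCH, ACCESS_LOADSTORE, ACCESS_CASTOUT } access_type;
typedef enum {
    ERR_PARITY = 1,
    ERR_CORRECTABLE_ECC = 2,
    ERR_UNCORRECTABLE_ECC = 3
} error_type;

/* The MESR bits consulted by the exc computation. */
typedef struct {
    bool err_enab; /* MESR.ErrEnab */
    bool mem_e;    /* MESR.MemE    */
    bool ins_exc;  /* MESR.InsExc  */
    bool dat_exc;  /* MESR.DatExc  */
} mesr_bits;

/* Mirrors the "exc" lines of procedure MemoryError. */
static bool memory_error_raises_exception(mesr_bits m, access_type a,
                                          error_type e)
{
    if (!m.err_enab)
        return false; /* "return if !MESR.ErrEnab" */

    /* Parity and uncorrectable ECC errors always raise an exception. */
    bool exc = (e == ERR_PARITY) || (e == ERR_UNCORRECTABLE_ECC);

    /* Correctable errors raise one only on request, and only when no
       memory error exception is already being handled (MemE clear). */
    if (!m.mem_e && m.ins_exc && a == ACCESS_IFETCH)
        exc = true;
    if (!m.mem_e && m.dat_exc && a != ACCESS_IFETCH)
        exc = true;
    return exc;
}
```

The sketch leaves out the ILCE/DLCE bookkeeping and the register updates, which follow the pseudocode directly.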

4.6 Options for Memory Protection and Translation

Xtensa processors employ one of the options in this section for memory protection and
translation. The introduction in Section 4.6.1 provides background information for the
options in this section. The Region Protection Option described in Section 4.6.3 provides control of memory by 512 MB regions. Within each region, accessibility, cacheability, and characteristics of cacheability can be controlled. The Region Translation Option
described in Section 4.6.4 builds on that and adds a translation table with an entry for
each region so that virtual addresses in that region can be translated to corresponding
physical addresses in any of the 512 MB regions. The MMU Option described in
Section 4.6.5 is a full paging memory management unit. It supports hardware refill of the
TLB from page tables in memory.

4.6.1 Overview of Memory Management Concepts

Section 4.6.1.1 gives an overview of the basic memory translation scheme used in
Xtensa processors. Section 4.6.1.2 gives an overview of the basic memory protection
scheme used in Xtensa processors, and Section 4.6.1.3 gives an overview of the concept of attributes. These subsections take a broader view of the overall process and indicate the direction future memory protection and translation options may take.
4.6.1.1 Overview of Memory Translation
This subsection presents an overview of the thinking behind the memory translation in
the available options. It also provides insight into the kinds of extensions that are likely
in the future.
The available memory protection and translation options that support virtual-to-physical
address translation do so via an instruction TLB and a data TLB. (“TLB” was originally
an acronym for translation lookaside buffer, but this meaning is no longer entirely accurate; in this document TLB simply means the translation hardware.) These two hardware
structures may, in some configurations, act as translation caches that are refilled by
hardware from a common page table structure in memory. In other configurations, a TLB
may be self-sufficient for its translations, and no page tables are required.
A TLB consists of several entries, each of which maps one page (the page size may
vary with each entry). Virtual-to-physical address translation consists of searching the
TLB for an entry that matches the most significant bits of the virtual address and replacing those bits with bits from the TLB entry. The least significant bits of the virtual address
are identical between the virtual and physical addresses. The translation input and output are called the virtual page number (VPN) and the physical page number (PPN) respectively. The TLB search also involves matching the address space identifier (ASID)
bits of the TLB entry to one of the current ASIDs stored in the RASID register (more on
this below). The number of bits not translated is determined by the page size, which can
be dynamically programmed from a set of configuration specified values. The TLB entry
also supplies some attribute bits for the page, including bits that determine the cacheability of the page’s data, whether it is writable or not, and so forth. This is illustrated in
Figure 4–18.
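The address arithmetic described above can be sketched in C. The sketch assumes 32-bit addresses and a page size of 2^page_bits bytes with 0 < page_bits < 32; the function names are illustrative, and the TLB search that supplies the PPN is not modeled.

```c
#include <stdint.h>

/* Virtual page number: the translated (most significant) bits. */
static uint32_t vpn_of(uint32_t vaddr, unsigned page_bits)
{
    return vaddr >> page_bits;
}

/* Replace the VPN with the PPN from the matching TLB entry and keep
   the page index (least significant bits) unchanged. */
static uint32_t translate(uint32_t vaddr, uint32_t ppn, unsigned page_bits)
{
    uint32_t index_mask = (1u << page_bits) - 1u;
    return (ppn << page_bits) | (vaddr & index_mask);
}
```

For example, with 4 KB pages (page_bits = 12) the virtual address 0x12345678 has VPN 0x12345, and a PPN of 0xABCDE yields the physical address 0xABCDE678.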
It is illegal for more than one TLB entry to match both the virtual address and the ASID.
This is true even if the entries have different ASIDs which match at different ring levels.
Software is responsible for making sure that the address ranges of all TLB entries visible according to the ASID values in the RASID register never overlap. Implementations may
detect this situation and take a MultiHit exception to aid in debugging.

The instruction and data TLBs can be configured independently for most parameters,
which is appropriate because the instruction and data references of processors can
have fairly different requirements, and in some systems additional flexibility may be appropriate on one but not the other. However, when the two TLBs both refill from the common memory page table, the associated parameters are shared.

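[Figure: the virtual address (bits VABITS-1..0) divides into VPN and Page Index. The VPN, together with the ASID3..ASID0 fields of RASID, is looked up in the TLB, which supplies the PPN and Attributes. The physical address (bits PABITS-1..0) is the PPN followed by the Page Index.]
Figure 4–18. Virtual-to-Physical Address Translation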
Xtensa implementations may perform virtual-to-physical address translation in parallel
or series with cache, RAM, ROM, and XLMI access. However, the translated physical
address is always used to decide which cache, RAM, or ROM access to use. Thus caches are potentially virtually indexed, even though they are always physically tagged.
When the number of cache index bits (that is log2(CacheBytes/WayCount)) is
greater than a page index and the same physical memory is mapped at multiple virtual
addresses, there is the possibility of multiple cache locations being used for the same
physical memory line, which can lead to the multiple views of memory being inconsistent. In such a system, software typically avoids this situation by restricting the virtual
addresses for multiply mapped physical memory. This software restriction is often referred to as “page coloring.” If physically indexed caches are necessary (and generally
they are not), the system designer may configure the TLBs such that cache index is a
physical address by using a large page size or a high cache associativity so that the
cache index bits are within the portion of the virtual and physical addresses that are
identical.
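Whether a given configuration needs page coloring can be checked with a few lines of C. This is a sketch under the assumption that cache size, way count, and page size are powers of two; the function names are hypothetical, and the color count (two to the power of the excess index bits) is the usual reading of the restriction rather than a formula stated in this manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Integer log2 for powers of two. */
static unsigned log2_u32(uint32_t x)
{
    unsigned n = 0;
    while (x > 1u) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Number of cache index bits: log2(CacheBytes / WayCount). */
static unsigned cache_index_bits(uint32_t cache_bytes, uint32_t way_count)
{
    return log2_u32(cache_bytes / way_count);
}

/* Coloring is needed when index bits extend above the page index. */
static bool needs_page_coloring(uint32_t cache_bytes, uint32_t way_count,
                                unsigned page_bits)
{
    return cache_index_bits(cache_bytes, way_count) > page_bits;
}

/* Number of distinct colors; 1 means no restriction is needed. */
static unsigned color_count(uint32_t cache_bytes, uint32_t way_count,
                            unsigned page_bits)
{
    unsigned bits = cache_index_bits(cache_bytes, way_count);
    return bits > page_bits ? 1u << (bits - page_bits) : 1u;
}
```

A 32 KB two-way cache with 4 KB pages has 14 index bits against a 12-bit page index, so multiply mapped pages must agree in two virtual address bits; a 16 KB four-way cache needs no coloring.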

The TLBs are N-way set-associative structures with heterogeneous “ways” and a configurable N. Each way has its own parameters, such as the number of entries, page
size(s), constant or variable virtual address, and constant or variable physical address
and attributes. It is the ability to specify constant translations in some or all of the ways
that allows Xtensa’s TLBs to span smoothly from a fixed memory map to a fully programmable one. Fully or partially constant entries can be converted to logic gates in the
TLB at significantly lower cost than a run-time programmable way. In addition, even processors with generally programmable MMUs often have a few hardwired translations.
Xtensa can easily represent these hardwired translations with its constant TLB entries.
Xtensa actually requires a few constant TLB entries to provide translation in some circumstances, such as at reset and during exception handling.
The virtual address input to the TLBs is actually the catenation of an address space
identifier (ASID) specified in a processor register with the 32-bit virtual address from the
fetch, load, or store address calculation. ASIDs allow software to change the address
space seen by the processor (for example, on a context switch) with a simple register
write without changing the TLB contents. The TLB stores an ASID with each entry, and
so can simultaneously hold translations for multiple address spaces. The number of
ASID bits is configurable. ASIDs are also an integral part of protection, as they specify
the accessibility of memory by the processor at different privilege levels, as described in
the next section.
Xtensa TLBs do not have a separate valid bit in each entry. Instead, a reserved ASID
value of 0 is used to indicate an invalid entry. This can be viewed as saving a bit, or as
almost doubling the number of ASIDs for the same number of hardware bits stored in a
TLB entry.
Non-constant ways may be configured as AutoRefill. If no entry matching an access is
found in a TLB with one or more AutoRefill ways, the processor will attempt to load a
page table entry (PTE) from memory and write it into an entry of one of the AutoRefill
ways. A TLB with no AutoRefill ways does not use the page table.
Each way of a TLB is configured with a list of page sizes (expressed as the number of
bits in a page index). If the list has one element, the page size for that way is fixed. If the
list has more than one element, the page size of the way may be varied at runtime via
the ITLBCFG or DTLBCFG registers. When AutoRefill ways have programmable page
size, the PTE has a page size field (the value is an index into the PTEPageSizes configuration parameter), and hardware refill restricts the refill way selection to ways programmed with a page size matching the page size in the PTE. When looking up an address in the TLB, each way’s page size determines which bits are used to select one of
the way’s entries for comparison: bits P+log2(IndexCount)-1..P of vAddr form the way index, where P is
the number of bits configured or programmed for the way page size.
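The way index selection can be written as a one-line C helper. The sketch assumes IndexCount is a power of two and 0 < P < 32; the function name is illustrative.

```c
#include <stdint.h>

/* Way index: bits P+log2(IndexCount)-1..P of the virtual address,
   where page_bits is P and index_count is the way's entry count. */
static unsigned tlb_way_index(uint32_t vaddr, unsigned page_bits,
                              unsigned index_count)
{
    return (unsigned)(vaddr >> page_bits) & (index_count - 1u);
}
```

With 4 KB pages and a four-entry way, address 0x12345678 selects entry 1, because the two bits just above the page index are 01.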

4.6.1.2 Overview of Memory Protection
Many processors implement two levels of privilege, often called kernel and user, so that
the most privileged code need not depend on the correctness of less privileged code.
The operating system kernel has access to the entire processor, but disables access to
certain features while application code runs to prevent the application from accessing or
corrupting the kernel or other applications. This mechanism facilitates debugging and
improves system reliability.
Some processors implement multiple levels of decreasing privilege, called rings, often
with elaborate mechanisms for switching between rings. The Xtensa processor provides
a configurable number of rings (RingCount), but without the elaborate ring-to-ring transition mechanisms. When configured with two rings, it provides the common kernel/user
modes of operation, with Ring 0 being kernel and Ring 1 being user. With three or four
rings configured, the Xtensa processor provides the same functionality as more advanced processors, but with the requirement that ring-to-ring transitions must be provided by Ring 0 (kernel) software.
Without the MMU Option, or with the MMU Option and RingCount = 1, the Xtensa processor has a single level of privilege, and all instructions are always available.
With RingCount > 1, software executing with CRING = 0 (see Table 4–63 on page 87
and the description of PS.EXCM) is able to execute all Xtensa instructions; other rings
may only execute non-privileged instructions. The only distinctions among the rings
greater than zero are those created by software in the virtual-to-physical translations in
the page table. The name “ring” is derived from an accessibility diagram for a single process such as that shown in Figure 4–19. At Ring 0 (that is, when CRING = 0), the processor can access all of the current process’ pages (that is, Ring 0 to RingCount-1
pages). At Ring 1 it can access all Ring 1 to RingCount-1 pages. Thus, when the processor is executing with Ring 1 privileges, its address space is a subset of that at Ring 0
privilege, as Figure 4–19 illustrates. This concentric nesting of privilege levels continues
to ring RingCount-1, which can access only ring RingCount-1 pages.
It is illegal for more than one TLB entry to match both the virtual address and the ASID.
This is true even if the entries have different ASIDs which match at different ring levels.
One ring’s mapping cannot override another.
Systems that require only traditional kernel/user privilege levels can, of course, configure RingCount to be 2. However, rings can also be useful for sharing. Many operating
systems implement the notion of multiple threads sharing an address space, except for
a small number of per-thread pages. Such a system could use Ring 0 for the shared kernel address space, Ring 1 for per-process kernel address space, Ring 2 for shared application address space, and Ring 3 for per-thread application address space.
[Figure: concentric rings, labeled Ring 1 ... N-1.]
Figure 4–19. A Single Process’ Rings
Each Xtensa ring has its own ASID. Ring 0’s ASID is hardwired to 1. The ASIDs for
Rings 1 to RingCount-1 are specified in the RASID register. The ASIDs for each ring
in RASID must be different. Each ASID has a single ring level, though there may be
many ASIDs at the same ring level (except Ring 0). This allows nested privileges with
sharing such as shown in Figure 4–20. The ring number of a page is not stored in the
TLB; only the ASID is stored. When a TLB is searched for a virtual address match, the
ASIDs of all rings specified in RASID are tried. The position of the matching ASID in
RASID gives the ring number of the page. If the page’s ring number is less than the processor’s current ring number (CRING), then the access is denied with an exception (either InstFetchPrivilegeCause or LoadStorePrivilegeCause, as appropriate).
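The ring lookup just described can be sketched in C. The sketch assumes ASIDs fit in 8 bits and that the per-ring ASIDs have already been unpacked into an array indexed by ring number (entry 0 being the hardwired Ring 0 ASID of 1); the packing of RASID itself is not modeled, and all names are hypothetical.

```c
#include <stdint.h>

#define RING_NONE (-1) /* hypothetical marker: no ASID matched */

/* Return the ring whose ASID equals the TLB entry's ASID, or
   RING_NONE. The position of the match gives the ring number. */
static int lookup_ring(uint8_t entry_asid, const uint8_t *asid_for_ring,
                       int ring_count)
{
    if (entry_asid == 0)
        return RING_NONE; /* ASID 0 marks an invalid TLB entry */
    for (int r = 0; r < ring_count; r++) {
        if (asid_for_ring[r] == entry_asid)
            return r;
    }
    return RING_NONE;
}

/* An access is denied when the page's ring is less than CRING. */
static int access_permitted(int page_ring, int cring)
{
    return page_ring != RING_NONE && page_ring >= cring;
}
```

Because the ASIDs in RASID must all be different, at most one position can match, so the loop's first hit is the only hit.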
[Figure: a shared Ring 0 enclosing the nested rings of several processes, each labeled Ring 1 ... N-1.]
Figure 4–20. Nested Rings of Multiple Processes with Some Sharing

Why not store the ring number of the page in the TLB, and then use a single ASID for all
rings, instead of having an ASID per ring? Because the latter allows sharing of TLB entries, and the former does not. For example, it is desirable at the very least to reuse the
same TLB entries for all kernel mapped addresses, instead of having the same PTEs
loaded into the TLB with different ASIDs. The Xtensa mechanism is more general than
adding a “global” bit to each entry (to ignore the ASID match) in that it allows finer granularity, as Figure 4–20 illustrates, not just all or nothing.
The kernel typically assigns ASIDs dynamically as it runs code in different address spaces. When no more ASIDs are available for a new address space, the kernel flushes the
Instruction and Data TLBs, and begins assigning ASIDs anew. For example, with
ASIDBits = 8 and RingCount = 2, a TLB flush need occur at most every 254 context
switches, if every context switch is to a new address space.
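The figure of 254 is consistent with the following arithmetic, shown here as a C sketch: of the 2^ASIDBits values, 0 is reserved for invalid entries and 1 is hardwired to Ring 0, leaving the rest for dynamic assignment. Generalizing to RingCount > 2 by charging each new address space RingCount-1 fresh ASIDs is an assumption of this sketch (it ignores sharing between address spaces), and the function names are illustrative.

```c
/* ASIDs available for dynamic assignment: all values except
   0 (invalid entry) and 1 (hardwired Ring 0 ASID). */
static unsigned usable_asids(unsigned asid_bits)
{
    return (1u << asid_bits) - 2u;
}

/* Worst-case number of context switches between TLB flushes when
   every switch enters a new address space. Assumes ring_count >= 2
   and that each address space consumes ring_count - 1 new ASIDs. */
static unsigned max_switches_between_flushes(unsigned asid_bits,
                                             unsigned ring_count)
{
    return usable_asids(asid_bits) / (ring_count - 1u);
}
```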
Note that CRING = 0 is the only requirement for privileged instructions to execute and
CRING is the only field that controls access to memory. The PS.UM bit is named User
Vector Mode and has nothing to do with privilege for either instructions or memory access. It controls only which exception vector is taken for general exceptions.
4.6.1.3 Overview of Attributes
Both page table entries (PTEs) and TLB entries store attribute bits that control whether
and how the processor accesses memory. The number of potential attributes required
by systems is large; to encode all the access capabilities required by any potential system would make this field too big to fit into a 4-byte PTE. However, the subset of values
required for any particular system is usually much smaller. Each memory protection and
translation option has a set of attributes, each of which encodes a set of capabilities
from Table 4–99 for loads along with a set for stores and a set for instruction fetches.
More capabilities are likely to be added in future implementations.
Table 4–99. Access Characteristics Encoded in the Attributes
Characteristic   Description                                           Used by
Invalid          Exception on access                                   Fetch, Load, Store
Isolate          Read/write cache contents regardless of tag compare   Load, Store
Bypass           Ignore cache contents regardless of tag compare;      Fetch, Load, Store
                 always access memory for this page
No-allocate      Do not refill cache on miss                           Fetch, Load, Store
Write-through    Write memory in addition to DataCache                 Store
Guarded          Access bytes on this page exactly when required by    Load (1)
                 the program (i.e. neither speculative references to
                 reduce latency nor multiple accesses are allowed).

1. Instruction fetch is always non-guarded. Stores are always guarded.

The assignment of capabilities to the attribute field of PTEs may be done with only one
encoding for each distinct set of capabilities, or in such a way that each characteristic
has its own bit, or anything in between. Often, single bits are used for a valid bit and a
write-enable. For a valid bit, all of the attribute values with this bit zero would specify the
Invalid characteristic so that any access causes an InstFetchProhibitedCause,
LoadProhibitedCause, or StoreProhibitedCause exception, depending on the
type of access. Similarly for the write-enable bit, all attribute values with write-enable
zero would specify the Invalid characteristic for stores, so that any store causes a
StoreProhibitedCause exception.
For systems that implement demand paging, software requires a page dirty bit to indicate that the page has been modified and must be written back to disk if it is replaced.
This may be provided by creating a write-enable bit as described above, and using it as
the per-page dirty bit. The first write to a clean (non-dirty) page causes a
StoreProhibitedCause exception. The exception handler checks one of the software bits, which indicates whether the page is really writable or not; if it is, it then sets
the hardware write-enable bit in both the TLB and the page table, and continues execution.
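A minimal sketch of that handler logic in C follows. The PTE bit positions and all names are hypothetical, chosen only for illustration; a real handler would also have to update the matching TLB entry, which is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE bit assignments for illustration only. */
#define PTE_HW_WRITE_ENABLE (1u << 0) /* doubles as the dirty bit */
#define PTE_SW_WRITABLE     (1u << 1) /* software bit: really writable */

/* Called on a StoreProhibitedCause exception for the page's PTE.
   Returns true if the store should be retried, false if the page is
   genuinely read-only and the fault must be reported. */
static bool handle_store_prohibited(uint32_t *pte)
{
    if ((*pte & PTE_SW_WRITABLE) == 0)
        return false;
    *pte |= PTE_HW_WRITE_ENABLE; /* first write: mark the page dirty */
    return true;
}

/* A page is dirty once its hardware write-enable bit has been set. */
static bool page_is_dirty(uint32_t pte)
{
    return (pte & PTE_HW_WRITE_ENABLE) != 0;
}
```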

4.6.2 The Memory Access Process

All accesses to memory, whether to cache, local memories, XLMI, or PIF and whether
caused by instruction fetch, the instructions themselves, or hardware TLB refill, follow
certain steps. Following is a short description of these steps; each is discussed in more
detail in Section 4.6.2.1 through Section 4.6.2.6.
1. Choose the TLB: Determine, from the instruction opcode or the reason for the hardware access, which TLB, if any, is used for the access (see Section 4.6.2.1 on
page 146 for details).
2. Lookup in the TLB: In that TLB, find an entry whose virtual page number
matches the upper bits of the virtual address of the access and, for appropriate
options, whose ASID matches one of the entries in the RASID register. Exactly
one match is needed to continue beyond this point, although exceptions may be
handled and the memory access process restarted (see Section 4.6.2.2 on
page 147 for details).
3. Check the access rights: If the attribute is invalid or, for appropriate
options, if the ring corresponding to the ASID matched in the RASID register is too low, raise
an exception. The operating system may, among other choices, modify the TLB
entries and retry the access (see Section 4.6.2.3 on page 148 for details).
4. Direct the access to local memory: If the physical address of the access
matches an instruction RAM or ROM, a data RAM or ROM, or an XLMI port then
direct the access to that local memory or XLMI. An exception is possible at this
stage for certain conditions, such as attempting to write to a ROM (see
Section 4.6.2.4 on page 148 for details).
5. Direct the access to PIF: For the given cache configuration and using the attribute, determine whether to execute the required access on the processor interface bus (PIF) and make that access if necessary (see Section 4.6.2.5 on
page 150 for details).
6. Direct the access to cache: Using the cache that corresponds to the TLB in
Step 1 above, look up the memory location in the cache, using the value if it is
there. If not, fill the cache from the PIF and then do the access (see
Section 4.6.2.6 on page 150 for details).
Logically, the steps are done in order. The TLB lookup is done first (in steps 1 through 3
above) and the memory access afterwards (in steps 4 through 6 above). For performance reasons, they are actually done in parallel. This has two consequences:
1. First, the virtual and physical addresses of an access to an XLMI port must be identical so that the full address can be provided at the desired time.
2. Second, for all other local memory accesses and cacheable addresses, the index
bits of the cache or local memory must be the same in both virtual and physical address. This means that caches which contain ways larger than the smallest page
size in the system require “page coloring” as described in Section 4.6.1.1 on
page 139.
For local memories, the second consequence requires a similar restriction on how they
can be mapped. Note that local memories do not require that sequential virtual pages be
mapped to sequential physical pages, but only that each virtual page be mapped to a
physical page with which it shares the values of index bits.
For the purposes of understanding exceptions raised by memory accesses, all the steps
above are done sequentially and the first exception encountered takes priority over later
ones. For performance reasons, again, all steps are done in parallel and the results prioritized afterward.
The above steps are further expanded in the following subsections.
4.6.2.1 Choose the TLB
Several instructions do not actually address memory. They simply use the bits of an address to access a cache and do something directly to it. The following groups of instructions have this property:

- III, IIU
- DII, DIU, DIWB, DIWBI
- LICT, SICT, LICW, SICW
- LDCT, SDCT
For each of these instructions, no TLB is accessed and the remainder of the steps are
not followed. No memory access exceptions are possible as the addresses are not really
addresses but only pointers to cache locations.
For the data accesses of instructions IHI, IHU, IPF, and IPFL, as well as all instruction
fetches, the instruction TLB is used for subsequent steps.
For the data accesses of all other instructions and for the hardware TLB refill accesses
(regardless of which TLB is being refilled) the data TLB is used for subsequent steps.
The above choices are reflected in Table 4–100 in the second column.
For compatibility the two TLBs should never give conflicting translations or protection attributes for any access as future processors may implement them with only a single set
of entries.
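The selection rules of this subsection can be summarized in a small C function. The grouping of accesses into the categories below and all names are illustrative choices of this sketch, not an encoding defined by the architecture.

```c
typedef enum { TLB_NONE, TLB_INSTRUCTION, TLB_DATA } tlb_choice;

/* Hypothetical classification of what initiated the access. */
typedef enum {
    SRC_INSTRUCTION_FETCH,
    SRC_CACHE_INDEX_OP,    /* III, IIU, DII, DIU, DIWB, DIWBI,
                              LICT, SICT, LICW, SICW, LDCT, SDCT */
    SRC_ICACHE_ADDRESS_OP, /* data accesses of IHI, IHU, IPF, IPFL */
    SRC_OTHER_DATA_ACCESS, /* data accesses of all other instructions */
    SRC_HW_TLB_REFILL      /* refill of either TLB */
} access_source;

/* Pick the TLB used for the subsequent steps of the access. */
static tlb_choice choose_tlb(access_source source)
{
    switch (source) {
    case SRC_CACHE_INDEX_OP:
        return TLB_NONE; /* address is only a pointer into the cache */
    case SRC_INSTRUCTION_FETCH:
    case SRC_ICACHE_ADDRESS_OP:
        return TLB_INSTRUCTION;
    case SRC_OTHER_DATA_ACCESS:
    case SRC_HW_TLB_REFILL:
        return TLB_DATA;
    }
    return TLB_NONE; /* not reached for valid inputs */
}
```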
4.6.2.2 Lookup in the TLB
Each TLB lookup takes a virtual address as an operand and produces a physical address, a lookup ring, and attributes as a result. This process is described in more detail
in Section 4.6.1.1. Each way of the TLB is read using the appropriate address bits for
that way as index bits. For variable sized ways, the ITLBCFG or DTLBCFG register helps
determine which address bits are the index bits.
For options without ASIDs (Region Protection Option), a way matches the access if its
virtual page number (VPN) matches the VPN of the access. The lookup ring produced is
defined to be 0.
For options with ASIDs (MMU Option), a way matches the access if its Virtual Page
Number (VPN) matches the VPN of the access and the ASID of the way matches one of
the ASIDs in the RASID register. The lookup ring is determined by which ASID in the
RASID register is matched. Because the four entries in the RASID register are required
to be different and non-zero, the lookup ring is well determined.
There should not be a match for more than one of the ways. However, this condition currently raises an InstTLBMultiHitCause or a LoadStoreTLBMultiHitCause exception as a debugging aid. If two entries contain the same VPN, but different ASIDs,
they may co-exist in the TLB at the same time as long as the RASID never contains both
ASIDs at the same time.
If none of the ways match, options without auto-refill ways (Region Protection Option)
will raise an InstTLBMissCause or a LoadStoreTLBMissCause exception so that
system software can take appropriate action and possibly retry the access. Options with
auto-refill ways (MMU Option) will, automatically in hardware, use PTEVADDR to access
page tables in memory and replace an entry in one of the auto-refill ways. The access
will then be automatically retried. An error of any sort during the automatic refill process
will raise an InstTLBMissCause or a LoadStoreTLBMissCause exception so that
system software can take appropriate action and possibly retry the access.
If no exception is raised, the physical page number and attributes of the matching entry
along with the lookup ring defined above are the results of the lookup and the access
continues with the next step.
4.6.2.3 Check the Access Rights
First, the lookup ring of the entry is checked against the ring of the access. The ring of
the access is usually CRING, but for L32E and S32E, for example, it is PS.RING instead.
If the lookup ring of the entry is smaller than the ring of the access, an
InstFetchPrivilegeCause or a LoadStorePrivilegeCause exception is raised.
This situation means that an instruction has attempted access to a region of memory at
a lower numbered ring than the one for which it has privilege.
Second, the attribute of the lookup is checked for validity. If the attribute is not valid, an
exception is raised. If the access chose the Instruction TLB in Section 4.6.2.1, it raises
an InstFetchProhibitedCause exception. If it chose the data TLB, it raises either a
LoadProhibitedCause exception or a StoreProhibitedCause exception, depending on whether it was a load or a store.
If no exception is raised, the access continues with the next step using the physical address and the attribute (which is known to be valid for access, but may still affect how
caches are used).
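The two checks in this step can be modeled roughly as follows. This is a sketch under stated assumptions: the function name, the string return values, and the `attr_valid` flag (standing in for the option-specific attribute validity test) are all hypothetical.

```python
def check_access_rights(lookup_ring: int, access_ring: int, kind: str, attr_valid: bool):
    """Return the name of the exception raised, or None if the access may continue.

    kind is 'fetch', 'load', or 'store'; attr_valid is assumed to be computed
    elsewhere from the attribute of the matching entry.
    """
    # First check: the lookup ring must not be smaller than the ring of the access.
    if lookup_ring < access_ring:
        return "InstFetchPrivilegeCause" if kind == "fetch" else "LoadStorePrivilegeCause"
    # Second check: the attribute must be valid for this kind of access.
    if not attr_valid:
        return {
            "fetch": "InstFetchProhibitedCause",
            "load": "LoadProhibitedCause",
            "store": "StoreProhibitedCause",
        }[kind]
    return None
```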
4.6.2.4 Direct the Access to Local Memory
The physical address of each access is compared to the address ranges of any instruction RAM, instruction ROM, data RAM, data ROM, or XLMI options that may exist in the
processor. Table 4–100 indicates what happens when an access initiated by the entry in the Instruction column (which uses the TLB named in the second column) has an address that falls within the (abbreviated) option named in one of the last five columns.
OK means the access is completed normally. NOP means the access is completed but
by its nature does nothing. IFE and LSE mean that an exception is raised. TLBI and
TLBD mean that an InstTLBMissCause or a LoadStoreTLBMissCause exception is
raised. Undef means the behavior is not defined.


Table 4–100. Local Memory Accesses
Instruction                                              TLB Used [1]  InstRAM   InstROM   DataRAM   DataROM   XLMI
Instruction-fetch                                        ITLB          OK        OK        IFE [2]   IFE [2]   IFE [2]
IHI, IHU, IPF                                            ITLB          NOP       NOP       NOP       NOP       NOP
III, IIU                                                 none          —         —         —         —         —
IPFL                                                     ITLB          IFE [5]   IFE [5]   IFE [2]   IFE [2]   IFE [2]
L32I, L32R                                               DTLB          OK [3]    OK [3]    OK        OK        OK
L8UI, L16SI, L16UI, L32AI, L32E, FP Loads, MAC16 Loads   DTLB          LSE [4]   LSE [4]   OK        OK        OK
LICT, LICW, LDCT                                         none          —         —         —         —         —
S32I                                                     DTLB          OK [3]    LSE [4]   OK        LSE [4]   OK
S8I, S16I, S32E, S32RI, FP Stores                        DTLB          LSE [4]   LSE [4]   OK        LSE [4]   OK
S32C1I                                                   DTLB          LSE [4]   LSE [4]   OK [7]    LSE [4]   Undef
SICT, SICW, SDCT                                         none          —         —         —         —         —
DHI, DHU, DHWB, DHWBI                                    DTLB          NOP       NOP       NOP       NOP       NOP
DII, DIU, DIWB, DIWBI                                    none          —         —         —         —         —
DPFR, DPFRO, DPFW, DPFWO                                 DTLB          NOP       NOP       NOP       NOP       NOP
DPFL                                                     DTLB          LSE [4]   LSE [4]   LSE [6]   LSE [6]   LSE [6]
Hardware ITLB Refill                                     DTLB          TLBI [8]  TLBI [8]  OK        OK        OK
Hardware DTLB Refill                                     DTLB          TLBD [8]  TLBD [8]  OK        OK        OK
Designer defined loads                                   DTLB          LSE [4]   LSE [4]   OK        OK        OK
Designer defined stores                                  DTLB          LSE [4]   LSE [4]   OK        LSE [4]   OK

1. As described in Section 4.6.2.1 on page 146
2. Raises exception - InstFetchErrorCause
3. These accesses may be slow in some implementations.
4. Raises exception - LoadStoreErrorCause
5. Raises exception - InstFetchErrorCause - but not in all implementations
6. Raises exception - LoadStoreErrorCause - but not in all implementations
7. Works in newer implementations but in some older implementations raises an exception.
8. Raises exception - InstTLBMissCause or a LoadStoreTLBMissCause depending on the original access.

Using the definition of guarded in Table 4–99, instruction-fetch accesses are never
guarded. Stores are always guarded. Loads to instruction RAM, instruction ROM, data
RAM, and data ROM are never guarded. These ports are assumed to be connected only
to devices with memory semantics so that no guarding is needed for loads. Loads to
XLMI are only guarded in the sense that the load will be retired only under the conditions
for a guarded access. For all these memories, assertion of the memory enable is no
guarantee that the load was needed.
If none of the comparisons produces a match, the access continues with the next step
using the physical address and the attribute.
4.6.2.5 Direct the Access to PIF
The access is sent to the processor interface if any of the following is true:
• The attribute indicates that the cache should be bypassed.
• The chosen TLB in Section 4.6.2.1 and in Table 4–100 is the ITLB and the Instruction Cache Option is not configured.
• The chosen TLB in Section 4.6.2.1 and in Table 4–100 is the DTLB and the Data Cache Option is not configured.

Using the definition of guarded in Table 4–99 on page 144, instruction-fetch accesses to
the PIF are never guarded. Stores to the PIF are always guarded. Loads that are sent to
the PIF under this section (without being cached) are guarded if the attribute says that
they should be.
If the conditions of this section are not met, the access is cached and continues with the
next step using the physical address and the attribute.
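The three conditions of this section reduce to a short predicate. The following is a minimal sketch; the function and parameter names are hypothetical and are not part of the architecture.

```python
def goes_to_pif(bypass_attr: bool, uses_itlb: bool,
                icache_configured: bool, dcache_configured: bool) -> bool:
    """Return True if the access is sent to the PIF rather than to a cache."""
    if bypass_attr:
        return True                   # attribute asks for cache bypass
    if uses_itlb:
        return not icache_configured  # ITLB access without an instruction cache
    return not dcache_configured      # DTLB access without a data cache
```

When the predicate is false, the access is cached and proceeds to the next step.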
4.6.2.6 Direct the Access to Cache
The access is cached. The attribute determines how the cache operates, including the
possibility of a write-through to the PIF.
The concept of guarding cannot be carried out for loads through the cache. Extra bytes
have been loaded simply to fill the cache line and the line may have been filled long before the access. Inherently, the line is filled a different number of times than an access is
executed and the line may be invalidated or evicted at any time and refilled later. Caching should not be used on ranges of memory addresses where guarding is important.

4.6.3 Region Protection Option

The simplest of the options, the Region Protection Option, provides a protection field for
each of the eight 512 MB regions in the address space. The field can allow access to the
region and it can set caching characteristics for the region, such as whether or not the
cache is used and if it is write-through or write-back.
• Prerequisites: Exception Option (page 82)
• Incompatible options: MMU Option (page 158)

This simple option is built from the capabilities discussed in the introduction
(Section 4.6.1). It uses RingCount = 1, so the processor can always execute privileged
instructions. It sets ASIDBits to 0, which disables the ASID feature. The instruction
and data TLBs are programmed to each have one way of eight entries, and the VPNs
(virtual page numbers) and PPNs (physical page numbers) of these entries are constant
and hardwired to the identity map (that is, PPN = VPN). Only the attributes are not constant; they are writable using the WITLB and WDTLB instructions.
4.6.3.1 Region Protection Option Architectural Additions
Table 4–101 through Table 4–103 show this option’s architectural additions.
Table 4–101. Region Protection Option Exception Additions
Exception                   Description                                   EXCCAUSE value
InstFetchProhibitedCause    Instruction fetch is not allowed in region    20
LoadProhibitedCause         Load is not allowed in region                 28
StoreProhibitedCause        Store is not allowed in region                29

Table 4–102. Region Protection Option Processor-State Additions
Register Mnemonic   Quantity   Width (bits)   Register Name             R/W   Access
ITLB Entries        8          4              Instruction TLB entries   R/W   see Table 4–103
DTLB Entries        8          4              Data TLB entries          R/W   see Table 4–103

Table 4–103. Region Protection Option Instruction Additions
Instruction [1]   Format   Definition
IDTLB             RRR      Invalidate data TLB entry
IITLB             RRR      Invalidate instruction TLB entry
PDTLB             RRR      Probe data TLB
PITLB             RRR      Probe instruction TLB
RDTLB0            RRR      Read data TLB virtual
RDTLB1            RRR      Read data TLB translation
RITLB0            RRR      Read instruction TLB virtual
RITLB1            RRR      Read instruction TLB translation
WDTLB             RRR      Write data TLB
WITLB             RRR      Write instruction TLB

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.6.3.2 Formats for Accessing Region Protection Option TLB Entries
During normal operation when instructions and data are being accessed from memory,
only lookups are being done in the TLBs. For maintenance of the TLBs, however, the
entries in the TLBs are accessed by the instructions in Table 4–103. Note that unused
bits at Bit 12 and above are ignored on write, and zero on read, so that those bits may
simply contain the address for access to all ways of both TLBs. Unused bits at Bit 11 and
below are required to be zero on write and undefined on read for forward compatibility.
The format of the as register used in all instructions in the table is shown in Figure 4–21.
The upper three bits are used as an index among the TLB entries just as they would be
when addressing memory. They are the Virtual Page Number (VPN) or upper three bits
of address. The remaining bits are ignored.
Bits     Field     Width
31..29   VPN       3
28..0    Ignored   29
Figure 4–21. Region Protection Option Addressing (as) Format for WxTLB, RxTLB1, & PxTLB
The WITLB and WDTLB instructions write the TLB entries. The as register is formatted
according to Figure 4–21, while the at register is formatted according to Figure 4–22.
The attribute for the region is described in detail in Section 4.6.3.3. The remaining bits
are ignored or required to be zero.
After modifying any TLB entry with a WITLB instruction, an ISYNC must be executed before executing any instruction from that region. In the special case of the WITLB changing the attribute of its own region, the ISYNC must immediately follow the WITLB and
both must be within the same memory region and, if the region is cacheable, within the
same cache line.


Bits     Field       Width
31..12   Ignored     20
11..4    Zero        8
3..0     Attribute   4
Figure 4–22. Region Protection Option Data (at) Format for WxTLB
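As an illustration of the two register formats, the operand values for a WITLB or WDTLB could be computed as in the sketch below. The helper names are hypothetical, and the code only models the bit layout; it does not execute any TLB instruction.

```python
def rp_region_index(vaddr: int) -> int:
    """Index of the 512 MB region (the 3-bit VPN) containing vaddr."""
    return (vaddr >> 29) & 0x7

def rp_wxtlb_operands(vaddr: int, attr: int):
    """Return (as_value, at_value) for writing attr to the region of vaddr."""
    if not 0 <= attr <= 0xF:
        raise ValueError("attribute is a 4-bit field")
    as_value = rp_region_index(vaddr) << 29  # bits 28..0 are ignored
    at_value = attr                          # bits 11..4 must be zero
    return as_value, at_value
```

Since the low bits of the as register are ignored, passing any address inside the region would serve equally well; the sketch clears them only to make the index visible.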
The RITLB0 and RDTLB0 instructions exist under this option but do not return interesting information because the entire VPN is used as an index. The as register is formatted
according to Figure 4–21. The read instructions return zero in the at register.
The RITLB1 and RDTLB1 instructions return the at data format in Figure 4–23. The Attribute for the region is described in detail in Section 4.6.3.3. The VPN is returned in the
upper three bits as the Physical Page Number (PPN) because there is no translation.
The remaining bits are zero or undefined. The as register is formatted according to
Figure 4–21.
Bits     Field       Width
31..29   PPN         3
28..12   Zero        17
11..4    Undefined   8
3..0     Attribute   4
Figure 4–23. Region Protection Option Data (at) Format for RxTLB1
The PITLB and PDTLB instructions exist under this option but do not return interesting
information because all accesses hit in the respective TLBs and the TLBs have only a
single way. The as register is formatted according to Figure 4–21. The TLB probe instructions return the at data format in Figure 4–24. The VPN is returned in the upper
bits. The low bit is set because the probe always hits and the remaining bits are zero or
undefined.
Bits     Field       Width
31..29   VPN         3
28..12   Zero        17
11..1    Undefined   11
0        1           1
Figure 4–24. Region Protection Option Data (at) Format for PxTLB
The IITLB and IDTLB instructions exist under this option and their as register is formatted according to Figure 4–21, but they have no effect because the entries cannot be
removed from the respective TLBs.


4.6.3.3 Region Protection Option Memory Attributes
The memory attributes written into the TLB entries by the WxTLB instructions and read
from them by the RxTLB1 instructions control access to memory and, where there is a
cache, how the cache is used. Table 4–104 shows the meanings of the attributes for instruction fetch, data load, and data store. For a more detailed description of the memory
access process and the place of these attributes in it, see Section 4.6.2.
The first column in Table 4–104 indicates the attribute from the TLB while the
remaining columns indicate various effects on the access. The columns are described in
the following bullets:
• Attr: the value of the 4-bit Attribute field of the TLB entry.
• Rights: whether the TLB entry may successfully translate a data load, a data store, or an instruction fetch.
  - The first character is an r if the entry is valid for a data load and a dash ("-") if not.
  - The second character is a w if the entry is valid for a data store and a dash ("-") if not.
  - The third character is an x if the entry is valid for an instruction fetch and a dash ("-") if not.
  If the translation is not successful, an exception is raised.
  Local memory accesses (including XLMI) consult only the Rights column.
• WB: some rows are split by whether the configured cache is writeback or not. Rows without an entry apply to both cache types.
• Meaning for Cache Access: the verbal description of the type of access made to the cache.
• Access Cache: indicates whether the cache provides the data.
  - The first character is an h if the cache provides the data when the tag indicates a hit and a dash ("-") if it does not.
  - The second character is an m if the cache provides the data when the tag indicates a miss and a dash ("-") if it does not. This capability is used only for Isolate mode.
• Fill Cache: indicates whether an allocate and fill is done to the cache if the tag indicates a miss.
  - The first character is an r if the cache is filled on a data load and a dash ("-") if it is not.
  - The second character is a w if the cache is filled on a data store and a dash ("-") if it is not.
  - The third character is an x if the cache is filled on an instruction fetch and a dash ("-") if it is not.
• Guard Load: refers to the guarded attribute as described in Table 4–99 on page 144. Stores are always guarded and instruction fetches are never guarded, but loads are guarded where there is a “yes” in this column. Local memory loads are not guarded.
• Write Thru: indicates whether a write is done through the PIF interface.
  - The first character is an h if a Write Thru occurs when the tag indicates a hit and a dash ("-") if it does not.
  - The second character is an m if a Write Thru occurs when the tag indicates a miss and a dash ("-") if it does not.
  Writes to local memories are never Write-Thru. In most implementations, a write-thru will only occur after any needed cache fill is complete.
Table 4–104. Region Protection Option Attribute Field Values
Attr     Rights   Meaning for Cache Access   Access Cache   Fill Cache   Guard Load   Write Thru
0        rw-      Cached, No Allocate        h-             ---          -            hm
1        rwx      Cached, WrtThru            h-             r-x          -            hm
2        rwx      Bypass cache               --             ---          yes          hm
3 [1]    --x      Cached                     h-             --x          -            --
4        rwx      Cached, WrtBack alloc      h-             rwx          -            --
5        rwx      Cached, WrtBack noalloc    h-             r-x          -            -m
6-13     ---      Reserved [2]               —              —            —            —
14 [1]   rw-      Cache Isolated [3]         hm             ---          -            --
15       ---      illegal [2]                --             ---          -            --

1. Attribute not supported in all implementations. Please refer to a specific Xtensa processor data book for supported attributes.
2. Raises exception. EXCCAUSE is set to InstFetchProhibitedCause, LoadProhibitedCause, or StoreProhibitedCause depending on access type.
3. For test only, implementation dependent, uses data cache like local memories and ignores tag.

All attribute entries in the ITLB and DTLB are set to cache bypass (4’h2) after reset.
In the absence of the Instruction Cache Option, Cached regions behave as Bypass regions on instruction fetch. In the absence of the Data Cache Option, Cached regions behave as Bypass regions on data load or store. If the Data Cache is not configured as
writeback (Section 4.5.5.1 on page 119), Attributes 4 and 5 behave as Attribute 1 instead
of as they are listed in Table 4–104.
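The Rights column of Table 4–104 can be treated as a lookup that decides whether an access raises a prohibited exception. The sketch below is a model only; the dictionary and function names are hypothetical, and attributes 6-13 are treated as having no rights because they are listed as reserved.

```python
# Rights per attribute value, transcribed from the Rights column of Table 4-104.
RP_RIGHTS = {
    0: "rw-", 1: "rwx", 2: "rwx", 3: "--x",
    4: "rwx", 5: "rwx", 14: "rw-", 15: "---",
}

def rp_prohibited_cause(attr: int, kind: str):
    """Return the exception name for a disallowed access, or None if allowed.

    kind is 'load', 'store', or 'fetch'.
    """
    if not 0 <= attr <= 0xF:
        raise ValueError("attribute is a 4-bit field")
    rights = RP_RIGHTS.get(attr, "---")  # 6-13 are reserved: no access allowed
    flag = {"load": "r", "store": "w", "fetch": "x"}[kind]
    if flag in rights:
        return None
    return {
        "load": "LoadProhibitedCause",
        "store": "StoreProhibitedCause",
        "fetch": "InstFetchProhibitedCause",
    }[kind]
```

For example, the reset attribute (4’h2, cache bypass) allows all three kinds of access, while attribute 0 prohibits instruction fetch.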
After changing the attribute of any memory region with a WITLB instruction, an ISYNC
must be executed before executing any instruction from that region. In the special case
of the WITLB changing the attribute of its own region, the ISYNC must immediately follow the WITLB and both must be within the same cache line.


After changing the attribute of a region by WDTLB, the operation of loads from and stores
to that region is undefined until a DSYNC instruction is executed.

4.6.4 Region Translation Option

Building on the Region Protection Option is the Region Translation Option, which adds a
virtual-to-physical translation on the upper three bits of the address. Thus, each of the
eight 512 MB regions, in addition to the attributes provided by the Region Protection Option, may be redirected to access a different region of physical address space.
• Prerequisites: Exception Option (page 82) and Region Protection Option (page 150)
• Incompatible options: MMU Option (page 158)

With this option, the Physical Page Number (PPN) of each of the TLB entries is now
writable instead of constant and identity mapped. In this way, the same region of memory may be accessed with different attributes by the use of different virtual addresses.
This simple option is built from the capabilities discussed in the introduction (see
Section 4.6.1). It uses RingCount = 1, so the processor can always execute privileged
instructions. It sets ASIDBits to 0, which disables the ASID feature. The instruction
and data TLBs are programmed to each have one way of eight entries, and only the attributes and Physical Page Numbers (PPNs) are not constant; they are writable using
the WITLB and WDTLB instructions.
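The translation performed by this option replaces only the upper three address bits. A minimal model, assuming a hypothetical list `ppn_by_region` that holds the 3-bit PPN written into each of the eight entries:

```python
def region_translate(vaddr: int, ppn_by_region) -> int:
    """Translate a 32-bit virtual address using one 3-bit PPN per 512 MB region."""
    region = (vaddr >> 29) & 0x7   # VPN: upper three bits select the entry
    offset = vaddr & 0x1FFFFFFF    # low 29 bits pass through unchanged
    return ((ppn_by_region[region] & 0x7) << 29) | offset
```

With the identity map that is in place after reset, `region_translate` returns its input. Writing PPN 5 into entry 1, for instance, would redirect virtual region 1 to physical region 5.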
4.6.4.1 Region Translation Option Architectural Additions
There are no new exceptions, no new state registers, and no new instructions added to
those in the Region Protection Option. The TLB entries contain three additional bits of
state. Access to these bits is described in Section 4.6.4.2.
4.6.4.2 Region Translation Option Formats for Accessing TLB Entries
During normal operation when instructions and data are being accessed from memory,
only lookups are being done in the TLBs. For maintenance of the TLBs, however, the
entries in the TLBs are accessed by the instructions in Table 4–103 on page 151. Note
that unused bits at Bit 12 and above are ignored on write and zero on read so that those
bits may simply contain the address for access to all ways of both TLBs. Unused bits at
Bit 11 and below are required to be zero on write and undefined on read for forward
compatibility.
The register formats used by the TLB instructions are very similar to those described in
Section 4.6.3.2 for the Region Protection Option. The only difference is the presence of
a Physical Page Number (PPN) in the upper three bits of the WxTLB, RxTLB1, and
PxTLB register formats.


The format of the as register used in all instructions in the table is shown in Figure 4–25.
The upper three bits are used as an index among the TLB entries just as they would be
when addressing memory. They are the Virtual Page Number (VPN) or upper three bits
of address. The remaining bits are ignored.
Bits     Field     Width
31..29   VPN       3
28..0    Ignored   29
Figure 4–25. Region Translation Option Addressing (as) Format for WxTLB, RxTLB1, & PxTLB
The WITLB and WDTLB instructions write the TLB entries. The as register is formatted
according to Figure 4–25, while the at register is formatted according to Figure 4–26.
The attribute for the region is described in detail in Section 4.6.3.3 on page 154. The remaining bits are ignored or required to be zero.
After modifying any TLB entry with a WITLB instruction, an ISYNC must be executed before executing any instruction from that region. In the special case of the WITLB changing the attribute of its own region, the ISYNC must immediately follow the WITLB and
both must be within the same memory region and, if the region is cacheable, within the
same cache line.
After modifying any TLB entry with a WDTLB instruction, the operation of loads from and
stores to that region is undefined until a DSYNC instruction is executed.
Bits     Field       Width
31..29   PPN         3
28..12   Ignored     17
11..4    Zero        8
3..0     Attribute   4
Figure 4–26. Region Translation Option Data (at) Format for WxTLB
The RITLB0 and RDTLB0 instructions exist under this option but do not return interesting information because the entire VPN is used as an index. The as register is formatted
according to Figure 4–25. The read instructions return zero in the at register.
The RITLB1 and RDTLB1 instructions return the at data format in Figure 4–27. The attribute for the region is described in detail in Section 4.6.3.3. The Physical Page Number
(PPN) is returned in the upper three bits. The remaining bits are zero or undefined. The
as register is formatted according to Figure 4–25.


Bits     Field       Width
31..29   PPN         3
28..12   Zero        17
11..4    Undefined   8
3..0     Attribute   4
Figure 4–27. Region Translation Option Data (at) Format for RxTLB1
The PITLB and PDTLB instructions return the at data format in Figure 4–28. The Virtual
Page Number (VPN) is returned in the upper bits. The low bit is set because the probe
always hits, and the remaining bits are zero or undefined. The as register is formatted
according to Figure 4–25. These instructions work for their intended purpose, but do not
provide useful information under this simple option because the TLBs always hit and
have only a single way.
Bits     Field       Width
31..29   VPN         3
28..12   Zero        17
11..1    Undefined   11
0        1           1
Figure 4–28. Region Translation Option Data (at) Format for PxTLB
The IITLB and IDTLB instructions exist under this option and their as register is formatted according to Figure 4–25, but they have no effect because the entries cannot be
removed from the respective TLBs.
4.6.4.3 Region Translation Option Memory Attributes
The memory attributes written into the TLB entries by the WxTLB instructions and read
from them by the RxTLB1 instructions are exactly the same as under the Region Protection Option.
As with the Region Protection Option, all attributes in both TLBs are set to cache bypass
(4’b0010) after reset. In addition, the translation entries in both TLBs are set to identity
map after reset.

4.6.5 MMU Option

The MMU Option is a memory management unit created to run protected operating systems such as Linux on the Xtensa processor. It provides demand paging hardware with a memory-based page table.
• Prerequisites: Exception Option (page 82)
• Incompatible options: Region Protection Option (page 150), Extended L32R Option (page 56)

This option is also built from the capabilities discussed in the introduction
(Section 4.6.1). It uses RingCount = 4 and only Ring 0 may execute privileged instructions. The option sets ASIDBits to 8, which allows for lower TLB management overhead.
The instruction and data TLBs are programmed to have seven and ten ways, respectively (see Section 4.6.5.3). Some of the ways are constants; others can be set to arbitrary
values. Still others auto-refill from a page table in memory that contains 4-byte PTEs,
each mapping a 4kB page with a 20-bit PPN, a 2-bit ring number, a 4-bit attribute, and 6
bits reserved for software. For a programmer’s view of the MMU, refer to the Xtensa
Microprocessor Programmer’s Guide.
4.6.5.1 MMU Option Architectural Additions
Table 4–105 through Table 4–108 show this option’s architectural additions.
Table 4–105. MMU Option Processor-Configuration Additions
Parameter         Description                                                                       Valid Values
NIREFILLENTRIES   Number of auto-refill entries in the ITLB (divided among 4 ways)                  16, 32 (4, 8 entries per TLB way)
NDREFILLENTRIES   Number of auto-refill entries in the DTLB (divided among 4 ways)                  16, 32 (4, 8 entries per TLB way)
IVARWAY56         Ways 5&6 of the ITLB can be variable for greater flexibility in mapping memory    Variable or Fixed [1]
DVARWAY56         Ways 5&6 of the DTLB can be variable for greater flexibility in mapping memory    Variable or Fixed [1]

1. Implementations may allow only Fixed, only Variable or a choice of either for this value.

Table 4–106. MMU Option Exception Additions
Exception                    Description                                             EXCCAUSE Value
PrivilegedCause              Privileged instruction attempted with CRING ≠ 0         8
InstTLBMissCause             Instruction fetch finds no entry in ITLB                16
InstTLBMultiHitCause         Instruction fetch finds multiple entries in ITLB        17
InstFetchPrivilegeCause      Instruction fetch matching entry requires lower CRING   18
InstFetchProhibitedCause     Instruction fetch is not allowed in region              20
LoadStoreTLBMissCause        Load/store finds no entry in DTLB                       24
LoadStoreTLBMultiHitCause    Load/store finds multiple entries in DTLB               25
LoadStorePrivilegeCause      Load/store matching entry requires lower CRING          26
LoadProhibitedCause          Load is not allowed in region                           28
StoreProhibitedCause         Store is not allowed in region                          29

Table 4–107. MMU Option Processor-State Additions
Register Mnemonic   Quantity          Width (bits)   Register Name                                 R/W   Special Register Number [1]
PS.RING             1                 2              Privilege level (see Table 4–63 on page 87)   R/W   230
PTEVADDR            1                 32             Page Table Virtual Address                    R/W   83
RASID               1                 32             Per-ring ASIDs                                R/W   90
ITLBCFG             1                 2/4            Instruction TLB configuration                 R/W   91
DTLBCFG             1                 2/4            Data TLB configuration                        R/W   92
ITLB Entries        24,32,40,48 [2]   variable       Instruction TLB entries                       R/W   Table 4–108
DTLB Entries        27,35,43,51 [2]   variable       Data TLB entries                              R/W   Table 4–108

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 5–127 on page 205. The TLB Entries are not Special Registers, but are accessed by the instructions in Table 4–108 on page 160.
2. See Section 4.6.5.3 on page 163 for more information on TLB structure.

Table 4–108. MMU Option Instruction Additions
Instruction [1]   Format   Definition
IDTLB             RRR      Invalidate data TLB entry
IITLB             RRR      Invalidate instruction TLB entry
PDTLB             RRR      Probe data TLB
PITLB             RRR      Probe instruction TLB
RDTLB0            RRR      Read data TLB virtual
RDTLB1            RRR      Read data TLB translation
RITLB0            RRR      Read instruction TLB virtual
RITLB1            RRR      Read instruction TLB translation
WDTLB             RRR      Write data TLB
WITLB             RRR      Write instruction TLB

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

4.6.5.2 MMU Option Register Formats
This section describes the address and data formats needed for reading and writing the
instruction and data TLBs.
PTEVADDR
Because four ways of each TLB are configured as AutoRefill, the MMU Option supports
hardware refill of the TLB from a page table (Section 4.6.5.9). The base virtual address
of the current page table is specified in the PTEBase field of the PTEVADDR register.
When read, PTEVADDR returns the PTEBase field in its upper bits, EXCVADDR[31:12] in the
field labeled VPN, and two zero bits in the lowest positions, as shown in Figure 4–29.
When PTEVADDR is written, only the PTEBase field is modified. PTEVADDR is undefined
after reset.
Bits     Field     Width
31..22   PTEBase   10
21..2    VPN       20
1..0     0         2
Figure 4–29. MMU Option PTEVADDR Register Format
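The value returned by a read of PTEVADDR can be modeled from its two inputs. This is a sketch with hypothetical names: `ptebase` is the 10-bit field last written by software and `excvaddr` is the current EXCVADDR value.

```python
def ptevaddr_read(ptebase: int, excvaddr: int) -> int:
    """Compose the PTEVADDR read value from PTEBase and EXCVADDR[31:12]."""
    vpn = (excvaddr >> 12) & 0xFFFFF          # 20-bit virtual page number
    return ((ptebase & 0x3FF) << 22) | (vpn << 2)  # low two bits read as zero
```

Because the VPN is shifted left by two, consecutive virtual pages yield addresses four bytes apart, which lines up with the 4-byte page table entries described for this option.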
RASID
The Ring ASID (RASID) register holds the current ASIDs for each ring. The register is
divided into four 8-bit sections, one for each ASID. The Ring 0 ASID is hardwired to 1.
The operation of the processor is undefined if any two of the four ASIDs are equal or if
the register contains an ASID of zero. RASID is 32’h04030201 after reset. Figure 4–30 shows
the RASID register format.


Bits     Field        Width
31..24   Ring3 ASID   8
23..16   Ring2 ASID   8
15..8    Ring1 ASID   8
7..0     8’h01        8
Figure 4–30. MMU Option RASID Register Format
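System software that builds a new RASID value may want to confirm that it satisfies the constraints above before writing it. A minimal check, with a hypothetical function name, might look like this:

```python
def rasid_is_valid(rasid: int) -> bool:
    """Check the RASID constraints: Ring 0 ASID is 1, no zero ASID, all distinct."""
    asids = [(rasid >> (8 * ring)) & 0xFF for ring in range(4)]
    return asids[0] == 0x01 and 0 not in asids and len(set(asids)) == 4
```

The reset value 32’h04030201 passes this check. The Ring 0 test is included only to model the hardwired field; a real register read always returns 8’h01 there.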
ITLBCFG
Because one or three ways of the instruction TLB are configured with variable page sizes (depending on whether IVARWAY56 is, respectively, fixed or variable), the ITLBCFG
register specifies the page size for those ways. Regardless of IVARWAY56, the Size
field in bits[17:16] of the register controls the size of the entries in Way 4 and has the
values 2’b00 = 1 MB, 2’b01 = 4 MB, 2’b10 = 16 MB, and 2’b11 = 64 MB. If IVARWAY56 is
Variable, the Sz field in bit[20] of the register controls the size of the entries in Way 5 and
has the values 1’b0 = 128MB and 1’b1 = 256MB. If IVARWAY56 is Variable, the Sz field
in bit[24] of the register controls the size of the entries in Way 6 and has the values 1’b0
= 512MB and 1’b1 = 256MB. MBZ means “must be zero”. The entire TLB way should be
invalidated when its size is changed. The ITLBCFG register is zero after reset. The following shows the ITLBCFG register format.
Bits     Field          Width
31..25   MBZ            7
24       Sz (Way 6)     1
23..21   MBZ            3
20       Sz (Way 5)     1
19..18   MBZ            2
17..16   Size (Way 4)   2
15..0    MBZ            16
MMU Option ITLBCFG Register Format
DTLBCFG
Because one or three ways of the data TLB are configured with variable page sizes (depending on whether DVARWAY56 is, respectively, fixed or variable), the DTLBCFG register specifies the page size for those ways. Regardless of DVARWAY56, the Size field in
bits[17:16] of the register controls the size of the entries in Way 4 and has the values
2’b00 = 1 MB, 2’b01 = 4 MB, 2’b10 = 16 MB, and 2’b11 = 64 MB. If DVARWAY56 is Variable, the Sz field in bit[20] of the register controls the size of the entries in Way 5 and
has the values 1’b0 = 128MB and 1’b1 = 256MB. If DVARWAY56 is Variable, the Sz field
in bit[24] of the register controls the size of the entries in Way 6 and has the values 1’b0
= 512MB and 1’b1 = 256MB. MBZ means “must be zero”. The entire TLB way should be
invalidated when its size is changed. The DTLBCFG register is zero after reset.
Figure 4–31 shows the DTLBCFG register format.


Bits     Field          Width
31..25   MBZ            7
24       Sz (Way 6)     1
23..21   MBZ            3
20       Sz (Way 5)     1
19..18   MBZ            2
17..16   Size (Way 4)   2
15..0    MBZ            16
Figure 4–31. MMU Option DTLBCFG Register Format
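Since ITLBCFG and DTLBCFG share one layout, a single decoder can model both. The sketch below uses hypothetical names and follows the field encodings exactly as listed in the text above, including the Way 6 encoding (1’b0 = 512MB, 1’b1 = 256MB).

```python
MB = 1 << 20
WAY4_SIZES = (1 * MB, 4 * MB, 16 * MB, 64 * MB)  # indexed by Size, bits[17:16]

def decode_tlbcfg(cfg: int, varway56: bool) -> dict:
    """Return {way: page size in bytes} for the variable-size ways."""
    sizes = {4: WAY4_SIZES[(cfg >> 16) & 0x3]}
    if varway56:  # the Sz fields apply only when xVARWAY56 is Variable
        sizes[5] = 256 * MB if (cfg >> 20) & 1 else 128 * MB
        sizes[6] = 256 * MB if (cfg >> 24) & 1 else 512 * MB
    return sizes
```

With the reset value of zero, Way 4 maps 1 MB pages and, when the ways are variable, Way 5 maps 128 MB pages and Way 6 maps 512 MB pages.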
4.6.5.3 The Structure of the MMU Option TLBs
The instruction TLB is 7-way set-associative. Ways 0-3 are AutoRefill ways used for
hardware refill of 4 kB page table entries from the page table when no matching TLB entry is found. The AutoRefill ways contain a total of either 16 entries (four per way) or 32
entries (eight per way) depending on NIREFILLENTRIES. Way 4 is a variable size way
of four entries and is used for mapping large pages of 1 MB, 4 MB, 16 MB, or 64 MB as
configured by the ITLBCFG register. The ASID fields in these ways are set to zero (invalid) after reset.
Way 5 (IVARWAY56 Fixed), with two constant entries, statically maps the 128 MB region
32'hD0000000–32'hD7FFFFFF to the first 128 MB of physical memory
(32'h00000000–32'h07FFFFFF) as cached memory (attribute 4’h7 as described in
Section 4.6.5.10), and the next 128 MB region (32'hD8000000–32'hDFFFFFFF) to
the same 128 MB of physical memory as cache bypassed memory (attribute 4’h3 as described in Section 4.6.5.10). The ASID for both entries is 8’h01. These 128 MB
regions are intended for the operating system kernel’s first 128 MB of code and data
(see Figure 4–32). Using a pair of large static mappings reduces the load on the demand refill portion of the instruction TLB and also provides access using two attributes
for the same memory. Physical memory above the first 128 MB is accessed via dynamically mapped virtual address space.
Way 5 (IVARWAY56 Variable) is a variable size way of four entries and is used for mapping very large pages of 128 MB or 256 MB as configured by the ITLBCFG register. The
ASID fields in this way are set to zero (invalid) after reset. This way may be used to emulate Way 5 (IVARWAY56 Fixed), or it may be used for a more flexible arrangement.
Way 6 (IVARWAY56 Fixed), also with 2 constant entries, statically maps the 256 MB region 32'hE0000000–32'hEFFFFFFF to the last 256 MB of physical memory
(32'hF0000000–32'hFFFFFFFF) as cached memory (attribute 4’h7 as described in
Section 4.6.5.10), and the next 256 MB region (32'hF0000000–32'hFFFFFFFF) to
the same 256 MB of physical memory as cache bypassed memory (attribute 4’h3 as described in Section 4.6.5.10). The ASID for both entries is 8’h01. These 256 MB
regions are intended for addressing the system peripherals (for example, a PCI or other
I/O bus) and system ROM (see Figure 4–32).


Way 6 (IVARWAY56 Variable) is a variable size way of eight entries and is used for
mapping very large pages of 512 MB or 256 MB as configured by the ITLBCFG register.
The ASID fields in this way are set to one and the Attribute fields in this way are set to 4’h2
(Bypass) after reset, and the other fields are set so that this way directly maps all of
memory after reset. This way may be used to emulate Way 6 (IVARWAY56 Fixed), it may
be used to effectively "turn off" the ITLB, or it may be used for a more flexible arrangement.
The data TLB is 10-way set-associative. It has the same seven ways as the instruction
TLB above (using DTLBCFG/DVARWAY56, instead of ITLBCFG/IVARWAY56), with the
addition of Ways 7-9, which are single-entry ways for 4 kB pages. These ways are intended to hold translations required to map the page table for hardware refill and for entries that are not to be replaced by refill. The ASID fields in these ways are set to zero
(invalid) after reset.
All ASID fields in the ITLB and DTLB, except those in Way 5 & Way 6, are set to zero (invalid) after reset. ASID fields in Way 5 are set to zero (invalid) after reset if
IVARWAY56/DVARWAY56 is Variable.
4.6.5.4 The MMU Option Memory Map
The memory map is determined by the TLB configurations given in Section 4.6.5.3.
Figure 4–32 shows a graphical representation of the constant translations in Way 5 and
Way 6 when IVARWAY56 and DVARWAY56 are Fixed, as well as the regions that are
mapped by more flexible ways than these. Way 5 and Way 6 may be used to emulate
this same arrangement when IVARWAY56 and DVARWAY56 are Variable.

Virtual address range   Access   Physical address range   Use
F0000000-FFFFFFFF       bypass   F0000000-FFFFFFFF        peripherals
E0000000-EFFFFFFF       cached   F0000000-FFFFFFFF        peripherals
D8000000-DFFFFFFF       bypass   00000000-07FFFFFF        low 128 MB of memory
D0000000-D7FFFFFF       cached   00000000-07FFFFFF        low 128 MB of memory
00000000-CFFFFFFF       mapped   (dynamically mapped)

Figure 4–32. MMU Option Address Map with IVARWAY56 and DVARWAY56 Fixed
This configuration provides both bypass and cached access to peripherals. Bypass access is used for devices and cached access is used for ROMs, for example. It also provides bypass and cached access to the low 128 MB of memory. This allows system software to access its memory without competing with user code for other TLB entries.
These are available after reset. The large page way (Way 4) and the auto-refill ways
(Ways 0-3) may be used to map as much additional space as desired (Section 4.6.5.9).
In the data TLB, Ways 7-9 may be used to map single pages so that they are always
available.
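The constant translations of Way 5 and Way 6 (with IVARWAY56 and DVARWAY56 Fixed) can be modeled with a few comparisons. The sketch below is illustrative only: static_xlat and fixed_way_translate are hypothetical names, and the ASID and ring check that the hardware also performs is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     hit;   /* true if vaddr falls in a constant-way region */
    uint32_t paddr; /* translated physical address */
    unsigned attr;  /* 4-bit attribute: 7 = cached, 3 = bypass */
} static_xlat;

/* Hypothetical model of the fixed Way 5 / Way 6 mappings (no ASID check). */
static static_xlat fixed_way_translate(uint32_t vaddr)
{
    static_xlat r = { false, 0, 0 };

    if (vaddr >= 0xF0000000u) {        /* Way 6, bypass: last 256 MB */
        r = (static_xlat){ true, vaddr, 3 };
    } else if (vaddr >= 0xE0000000u) { /* Way 6, cached: last 256 MB */
        r = (static_xlat){ true, 0xF0000000u | (vaddr & 0x0FFFFFFFu), 7 };
    } else if (vaddr >= 0xD8000000u) { /* Way 5, bypass: first 128 MB */
        r = (static_xlat){ true, vaddr & 0x07FFFFFFu, 3 };
    } else if (vaddr >= 0xD0000000u) { /* Way 5, cached: first 128 MB */
        r = (static_xlat){ true, vaddr & 0x07FFFFFFu, 7 };
    }
    return r; /* below 0xD0000000 the address is dynamically mapped */
}
```

Addresses below 32'hD0000000 return hit = false in this model because they are translated by Ways 0-4 (and Ways 7-9 in the data TLB) rather than by the constant ways.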
4.6.5.5 Formats for Writing MMU Option TLB Entries
During normal operation when instructions and data are being accessed from memory,
only lookups are being done in the TLBs. For maintenance of the TLBs, however, the
entries in the TLBs are accessed by the instructions in Table 4–108 on page 160.


Writing the TLB with the WITLB and WDTLB instructions requires the formats for the as
and at registers shown in Figure 4–33 and Figure 4–34. These figures show, in parallel,
the formats for different ways of the TLB and different conditions. For Ways 0-3, there
are two conditions that depend on the configuration parameter NIREFILLENTRIES or
NDREFILLENTRIES (see Figure 4–105 on page 159) and can have the values of 16 or
32 auto-refill entries per TLB (four or eight per TLB way). For Way 4, there are four conditions, which are the four values of the respective ITLBCFG or DTLBCFG fields and indicate the size of pages currently contained within that way. Ways 5 and 6 can be Fixed or
Variable as determined by the IVARWAY56 and DVARWAY56 parameters. If they are variable then there are still two conditions which are the two values of the respective ITLBCFG or DTLBCFG fields and indicate the size of pages currently contained within that
way. Each row, then, contains the format for the way and condition indicated in the left
column. Note that writing to Way-5 and Way-6 when the IVARWAY56 and DVARWAY56
parameters are "Fixed" causes no changes because those ways are constant.
Writing ITLB Ways 7-15 or DTLB ways 10-15 is undefined.
The format of the as register used for the WITLB and WDTLB instructions is shown in
Figure 4–33. The low order four bits contain the way to be accessed. The upper bits
contain the Virtual Page Number (VPN). For clarity, the Index bits are separated out
from the rest of the VPN in this figure. Note that unused bits at Bit 12 and above are ignored so that those bits may simply contain the address for access to all ways of both
TLBs. Unused bits at Bit 11 and below are reserved for forward compatibility. They may
either be zero or they may be the result of the probe instruction (Section 4.6.5.7).

Way             VPN without Index   Index     Ignored            bits[11:4]   bits[3:0]
0-3 (16entry)   [31:14]             [13:12]   -                  Reserved     4’h0,1,2,3
0-3 (32entry)   [31:15]             [14:12]   -                  Reserved     4’h0,1,2,3
4 (1MB)         [31:22]             [21:20]   [19:12]            Reserved     4’h4
4 (4MB)         [31:24]             [23:22]   [21:12]            Reserved     4’h4
4 (16MB)        [31:26]             [25:24]   [23:12]            Reserved     4’h4
4 (64MB)        [31:28]             [27:26]   [25:12]            Reserved     4’h4
5 (Fixed)       -                   [27]      [31:28], [26:12]   Reserved     4’h5
5 (128MB)       [31:29]             [28:27]   [26:12]            Reserved     4’h5
5 (256MB)       [31:30]             [29:28]   [27:12]            Reserved     4’h5
6 (Fixed)       -                   [28]      [31:29], [27:12]   Reserved     4’h6
6 (512MB)       -                   [31:29]   [28:12]            Reserved     4’h6
6 (256MB)       [31]                [30:28]   [27:12]            Reserved     4’h6

Figure 4–33. MMU Option Addressing (as) Format for WxTLB
The format of the at register used for the WITLB and WDTLB instructions is shown in
Figure 4–34. The low order four bits contain the attribute to be written (see
Section 4.6.5.10). The two bits above those contain the ring for which this TLB entry is
to be written. The ASID taken from the RASID register (see Section 4.6.5.2) corresponding to this ring is stored with the TLB entry. It is not possible to write an entry with an
ASID which is not currently in the RASID register. The upper bits contain the Physical
Page Number (PPN) of the translation. Way-5 and Way-6 are constant ways when the
IVARWAY56 and DVARWAY56 parameters are "Fixed": The PPN remains as described in
Section 4.6.5.3, the ASID is not written but always matches Ring 0, and the attribute remains as described in Section 4.6.5.3, no matter what is in register at. As with the address format, unused bits at Bit 12 and above are ignored so that a 20-bit PPN may be
used with all ways of the TLB, and unused bits at Bit 11 and below are required to be
zero for forward compatibility.

Way             PPN       Ignored   bits[11:6]   bits[5:4]   bits[3:0]
0-3 (16entry)   [31:12]   -         6’h00        Ring        Attribute
0-3 (32entry)   [31:12]   -         6’h00        Ring        Attribute
4 (1MB)         [31:20]   [19:12]   6’h00        Ring        Attribute
4 (4MB)         [31:22]   [21:12]   6’h00        Ring        Attribute
4 (16MB)        [31:24]   [23:12]   6’h00        Ring        Attribute
4 (64MB)        [31:26]   [25:12]   6’h00        Ring        Attribute
5 (Fixed)       -         [31:12]   6’h00        Ignored     Ignored
5 (128MB)       [31:27]   [26:12]   6’h00        Ring        Attribute
5 (256MB)       [31:28]   [27:12]   6’h00        Ring        Attribute
6 (Fixed)       -         [31:12]   6’h00        Ignored     Ignored
6 (512MB)       [31:29]   [28:12]   6’h00        Ring        Attribute
6 (256MB)       [31:28]   [27:12]   6’h00        Ring        Attribute
7-9 (DTLB)      [31:12]   -         6’h00        Ring        Attribute

Figure 4–34. MMU Option Data (at) Format for WxTLB
After modifying any TLB entry with a WITLB instruction, an ISYNC must be executed before executing any instruction that depends on the modification. The ITLB entry currently
being used for instruction fetch may not be changed.
After modifying any TLB entry with a WDTLB instruction, the operation of loads and
stores that depend on that TLB entry is undefined until a DSYNC instruction is executed.
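As a sketch of how the two operands for WITLB or WDTLB might be assembled in software, the helpers below build the as and at words from an address, way, ring, and attribute. The names wxtlb_as and wxtlb_at are hypothetical. The code relies on the rule stated above that unused bits at Bit 12 and above are ignored, so a full 20-bit page number can be passed for any way; it does not issue the instruction or the required ISYNC/DSYNC.

```c
#include <assert.h>
#include <stdint.h>

/* as operand: address in bits[31:12], Reserved bits[11:4] zero, way in bits[3:0]. */
static uint32_t wxtlb_as(uint32_t vaddr, unsigned way)
{
    assert(way <= 9); /* ITLB has ways 0-6, DTLB has ways 0-9 */
    return (vaddr & 0xFFFFF000u) | way;
}

/* at operand: PPN in bits[31:12], bits[11:6] zero, ring in [5:4], attribute in [3:0]. */
static uint32_t wxtlb_at(uint32_t paddr, unsigned ring, unsigned attr)
{
    assert(ring <= 3 && attr <= 15);
    return (paddr & 0xFFFFF000u) | (ring << 4) | attr;
}
```

For example, wxtlb_as(0x40000000, 4) and wxtlb_at(0x08000000, 0, 7) would describe a Ring 0, cached, read/write/execute entry in Way 4; the ASID actually stored is whatever RASID currently holds for Ring 0.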
4.6.5.6 Formats for Reading MMU Option TLB Entries
Reading the TLB with the RITLB0, RITLB1, RDTLB0, and RDTLB1 instructions requires
the formats for the as and at registers shown in Figure 4–35 through Figure 4–37.
These figures show, in parallel, the formats for different ways of the TLB and different
conditions. For Ways 0-3, there are two conditions that depend on the configuration parameter NIREFILLENTRIES or NDREFILLENTRIES (see Figure 4–105 on page 159)
and can have the values of 16 or 32 auto-refill entries per TLB (four or eight per TLB
way). For Way 4, there are four conditions, which are the four values of the respective ITLBCFG or DTLBCFG fields and indicate the size of pages currently contained within


that way. Ways 5 and 6 can be Fixed or Variable as determined by the IVARWAY56 and
DVARWAY56 parameters. If they are variable then there are still two conditions which are
the two values of the respective ITLBCFG or DTLBCFG fields and indicate the size of
pages currently contained within that way. Each row, then, contains the format for the
way and condition indicated in the left column.
Reading ITLB ways 7-15 or DTLB ways 10-15 is undefined.
The format of the as register used for the RITLB0, RITLB1, RDTLB0, and RDTLB1 instructions is shown in Figure 4–35. The low order four bits contain the way to be accessed. Besides the Way bits, only the Index bits are needed for reading the TLB. Depending
on the TLB way being accessed, and other conditions such as the size assigned to the
variable size way or the number of auto refill entries in the TLB, different bits of address
may be needed as shown. Note that unused bits at Bit 12 and above are ignored so that
an entire 20-bit VPN may be used when accessing all ways of both TLBs. Unused bits at
Bit 11 and below are reserved for forward compatibility. They may either be zero or they
may be the result of the probe instruction (Section 4.6.5.7).
Way             Index     Ignored            bits[11:4]   bits[3:0]
0-3 (16entry)   [13:12]   [31:14]            Reserved     4’h0,1,2,3
0-3 (32entry)   [14:12]   [31:15]            Reserved     4’h0,1,2,3
4 (1MB)         [21:20]   [31:22], [19:12]   Reserved     4’h4
4 (4MB)         [23:22]   [31:24], [21:12]   Reserved     4’h4
4 (16MB)        [25:24]   [31:26], [23:12]   Reserved     4’h4
4 (64MB)        [27:26]   [31:28], [25:12]   Reserved     4’h4
5 (Fixed)       [27]      [31:28], [26:12]   Reserved     4’h5
5 (128MB)       [28:27]   [31:29], [26:12]   Reserved     4’h5
5 (256MB)       [29:28]   [31:30], [27:12]   Reserved     4’h5
6 (Fixed)       [28]      [31:29], [27:12]   Reserved     4’h6
6 (512MB)       [31:29]   [28:12]            Reserved     4’h6
6 (256MB)       [30:28]   [31], [27:12]      Reserved     4’h6
7-9 (DTLB)      -         [31:12]            Reserved     4’h7,8,9

Figure 4–35. MMU Option Addressing (as) Format for RxTLB0 and RxTLB1


Because reading generates more information than can fit in one 32-bit register, there are
two read instructions that return different values. The data resulting from the RITLB0
and RDTLB0 instructions is shown in Figure 4–36. The low bits contain the ASID stored
with the entry, while the upper bits contain the Virtual Page Number (VPN) without the
Index bits that were used in the address of the read. Unused bits at Bit 12 and above of
the data result of these instructions are defined to be zero so that the entire 20-bit field
may always be used as a VPN whatever the size of the way. Unused bits at Bit 11 and
below are undefined for forward compatibility.
Way             bits[31:12]                                    bits[11:8]   bits[7:0]
0-3 (16entry)   VPN without Index [31:14], 2’b00 [13:12]       Undefined    ASID
0-3 (32entry)   VPN without Index [31:15], 3’b000 [14:12]      Undefined    ASID
4 (1MB)         VPN without Index [31:22], 10’h000 [21:12]     Undefined    ASID
4 (4MB)         VPN without Index [31:24], 12’h000 [23:12]     Undefined    ASID
4 (16MB)        VPN without Index [31:26], 14’h0000 [25:12]    Undefined    ASID
4 (64MB)        VPN without Index [31:28], 16’h0000 [27:12]    Undefined    ASID
5 (Fixed)       4’b1101 [31:28], 16’h0000 [27:12]              Undefined    ASID
5 (128MB)       VPN [31:29], 17’h00000 [28:12]                 Undefined    ASID
5 (256MB)       VPN [31:30], 18’h00000 [29:12]                 Undefined    ASID
6 (Fixed)       3’b111 [31:29], 17’h00000 [28:12]              Undefined    ASID
6 (512MB)       20’h00000 [31:12]                              Undefined    ASID
6 (256MB)       VPN [31], 19’h00000 [30:12]                    Undefined    ASID
7-9 (DTLB)      VPN [31:12]                                    Undefined    ASID

Figure 4–36. MMU Option Data (at) Format for RxTLB0
The data resulting from the RITLB1 and RDTLB1 instructions is shown in Figure 4–37.
The low order four bits contain the attribute stored with the TLB entry (Section 4.6.5.10).
The upper bits contain the Physical Page Number (PPN) of the entry. Unused bits at Bit
12 and above of the data result of these instructions are defined to be zero so that the
entire 20-bit field may always be used as a PPN, whatever the size of the way. Unused
bits at Bit 11 and below are undefined for forward compatibility.

Way             bits[31:12]                               bits[11:4]   bits[3:0]
0-3 (16entry)   PPN [31:12]                               Undefined    Attribute
0-3 (32entry)   PPN [31:12]                               Undefined    Attribute
4 (1MB)         PPN [31:20], 8’h00 [19:12]                Undefined    Attribute
4 (4MB)         PPN [31:22], 10’h000 [21:12]              Undefined    Attribute
4 (16MB)        PPN [31:24], 12’h000 [23:12]              Undefined    Attribute
4 (64MB)        PPN [31:26], 14’h0000 [25:12]             Undefined    Attribute
5 (Fixed)       5’b00000 [31:27], 15’h0000 [26:12]        Undefined    Attribute
5 (128MB)       PPN [31:27], 15’h0000 [26:12]             Undefined    Attribute
5 (256MB)       PPN [31:28], 16’h0000 [27:12]             Undefined    Attribute
6 (Fixed)       4’b1111 [31:28], 16’h0000 [27:12]         Undefined    Attribute
6 (512MB)       PPN [31:29], 17’h0000 [28:12]             Undefined    Attribute
6 (256MB)       PPN [31:28], 16’h0000 [27:12]             Undefined    Attribute
7-9 (DTLB)      PPN [31:12]                               Undefined    Attribute

Figure 4–37. MMU Option Data (at) Format for RxTLB1
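The two read results can be combined into one view of a TLB entry. The following is a minimal sketch with hypothetical names (tlb_entry_view, tlb_entry_decode, tlb_entry_is_valid); it takes the raw values returned by RxTLB0 and RxTLB1 as plain integers and masks off the bits that the figures mark as Undefined.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t vpn_no_index; /* RxTLB0 bits[31:12]; index and unused bits read as zero */
    unsigned asid;         /* RxTLB0 bits[7:0] */
    uint32_t ppn;          /* RxTLB1 bits[31:12] */
    unsigned attr;         /* RxTLB1 bits[3:0] */
} tlb_entry_view;

/* Hypothetical helper: r0 is the RxTLB0 result, r1 is the RxTLB1 result. */
static tlb_entry_view tlb_entry_decode(uint32_t r0, uint32_t r1)
{
    tlb_entry_view e;
    e.vpn_no_index = r0 >> 12;
    e.asid         = r0 & 0xFFu; /* bits[11:8] are undefined and dropped */
    e.ppn          = r1 >> 12;
    e.attr         = r1 & 0xFu;  /* bits[11:4] are undefined and dropped */
    return e;
}

/* An ASID of zero marks an invalid entry. */
static bool tlb_entry_is_valid(tlb_entry_view e)
{
    return e.asid != 0;
}
```

To recover the full VPN, software would still need to merge the Index bits that were supplied in the as register of the read, since RxTLB0 returns the VPN without them.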
4.6.5.7 Formats for Probing MMU Option TLB Entries
Probing the TLB with the PITLB and PDTLB instructions requires the formats for the as
and at registers shown in Figure 4–38 and Figure 4–39. Unlike writing and reading the
TLBs as explained in the previous two sections, the operation of probing a TLB begins
without knowing the way containing the sought after value. The formats do not, therefore, vary with the way being accessed. The probe instructions answer the question of
what entry in this TLB, if any, would be used to translate an access with a particular address from a particular ring. The sought for address is given in the as register as shown
in Figure 4–38 and the ring is given by PS.RING (not CRING, so that while PS.EXCM is
set, a probe may be done for a user program). If, for example, there is an entry that
matches in address, but its ASID does not match any ASID in the RASID register, or an
entry that matches in address, but the ASID corresponds in the RASID register to a ring
of lower number than the current PS.RING, the probe will not return a hit.
The format of the as register used for the PITLB and PDTLB instructions is shown in
Figure 4–38. Any address may be used as input to the probe instructions.

Bits    31..0
Field   Probe Address
Width   32

Figure 4–38. MMU Option Addressing (as) Format for PxTLB
The data resulting from the PITLB and PDTLB instructions is shown in Figure 4–39 and
Figure 4–40. The low three/four bits contain the Way (if any), which would be used to
translate the address and the next bit up is set if there is a translation in the TLB, and
clear if there is not. Some bits are undefined for forward compatibility but the result is
such that, if Hit=1, it may be used as the as register for WxTLB, RxTLB0, RxTLB1, or
IxTLB.
Bits    31..12  11..4      3    2..0
Field   VPN     Undefined  Hit  Way
Width   20      8          1    3

Figure 4–39. MMU Option Data (at) Format for PITLB

Bits    31..12  11..5      4    3..0
Field   VPN     Undefined  Hit  Way
Width   20      7          1    4

Figure 4–40. MMU Option Data (at) Format for PDTLB
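Because the Hit and Way fields sit at different bit positions for the two TLBs, decoding a probe result needs one routine per TLB. A minimal sketch, with hypothetical names probe_result, pitlb_decode, and pdtlb_decode:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     hit; /* a translation exists in the TLB */
    unsigned way; /* meaningful only when hit is true */
    uint32_t vpn; /* bits[31:12] of the result */
} probe_result;

/* PITLB result: Hit is bit 3, Way is bits[2:0]. */
static probe_result pitlb_decode(uint32_t at)
{
    probe_result p = { ((at >> 3) & 1u) != 0, at & 0x7u, at >> 12 };
    return p;
}

/* PDTLB result: Hit is bit 4, Way is bits[3:0]. */
static probe_result pdtlb_decode(uint32_t at)
{
    probe_result p = { ((at >> 4) & 1u) != 0, at & 0xFu, at >> 12 };
    return p;
}
```

When hit is true, the undecoded result word can be passed as the as register to WxTLB, RxTLB0, RxTLB1, or IxTLB, as stated above.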
4.6.5.8 Format for Invalidating MMU Option TLB Entries
Invalidating the TLB with the IITLB and IDTLB instructions requires the formats for the
as register shown in Figure 4–41. This figure shows, in parallel, the formats for different
ways of the TLB and different conditions. For Ways 0-3, there are two conditions that
depend on the configuration parameter NIREFILLENTRIES or NDREFILLENTRIES
(Figure 4–105) and can have the values of 16 or 32 auto-refill entries per TLB (4 or 8 per
TLB way). For Way 4, there are four conditions, which are the four values of the respective ITLBCFG or DTLBCFG fields and indicate the size of pages currently contained within
that way. Ways 5 and 6 can be Fixed or Variable as determined by the IVARWAY56 and
DVARWAY56 parameters. If they are variable then there are still two conditions which are
the two values of the respective ITLBCFG or DTLBCFG fields and indicate the size of
pages currently contained within that way. Each row, then, contains the format for the


way and condition indicated in the left column. Note that invalidating Way-5 and Way-6
when the IVARWAY56 and DVARWAY56 parameters are "Fixed" causes no changes because those ways are constant.
Invalidation of ITLB ways 7-15 or DTLB ways 10-15 is undefined.
The format of the as register used for the IITLB and IDTLB instructions is shown in
Figure 4–41. The low order four bits contain the way to be accessed. The upper bits
contain at least the Index from the Virtual Page Number (VPN). Note that unused bits at
Bit 12 and above are ignored so that those bits may simply contain the address for access to all ways of both TLBs. Unused bits at Bit 11 and below are reserved for forward
compatibility. They may either be zero or they may be the result of the probe instruction
(Section 4.6.5.7 on page 171).
Invalidation of an entry sets the corresponding ASID to zero so that it no longer responds when an address is looked up in the TLB.
Way             Index     Ignored            bits[11:4]   bits[3:0]
0-3 (16entry)   [13:12]   [31:14]            Reserved     4’h0,1,2,3
0-3 (32entry)   [14:12]   [31:15]            Reserved     4’h0,1,2,3
4 (1MB)         [21:20]   [31:22], [19:12]   Reserved     4’h4
4 (4MB)         [23:22]   [31:24], [21:12]   Reserved     4’h4
4 (16MB)        [25:24]   [31:26], [23:12]   Reserved     4’h4
4 (64MB)        [27:26]   [31:28], [25:12]   Reserved     4’h4
5 (Fixed)       [27]      [31:28], [26:12]   Reserved     4’h5
5 (128MB)       [28:27]   [31:29], [26:12]   Reserved     4’h5
5 (256MB)       [29:28]   [31:30], [27:12]   Reserved     4’h5
6 (Fixed)       [28]      [31:29], [27:12]   Reserved     4’h6
6 (512MB)       [31:29]   [28:12]            Reserved     4’h6
6 (256MB)       [30:28]   [31], [27:12]      Reserved     4’h6
7-9 (DTLB)      -         [31:12]            Reserved     4’h7,8,9

Figure 4–41. MMU Option Addressing (as) Format for IxTLB


After modifying any TLB entry with an IITLB instruction, an ISYNC must be executed before executing any instruction that depends on the modification. After modifying any TLB
entry with an IDTLB instruction, the operation of loads and stores that depend on
that TLB entry is undefined until a DSYNC instruction is executed.
4.6.5.9 MMU Option Auto-Refill TLB Ways and PTE Format
When no TLB entry matches the ASIDs and the virtual address presented to the MMU,
the MMU attempts to automatically load the appropriate page table entry (PTE) from the
page table and write it into the TLB in one of the AutoRefill ways. This hardware-generated load from the page table itself requires virtual-to-physical address translation,
which executes at Ring 0 so that it has access to the page table and uses the DTLB. An
error of any sort during the automatic refill process will cause an InstTLBMissCause
or a LoadStoreTLBMissCause exception to be raised so that system software can
take appropriate action and possibly retry the access. This combination of hardware and
software refill gives excellent performance while minimizing processor complexity. If the
second translation succeeds, the PTE load is done through the DataCache, if one is
configured, and the attributes for the page containing the PTE enable such a cache access. The PTE’s Ring field is then used as an index into the RASID register, and the resulting ASID is written together with the rest of the PTE into the TLB.
Xtensa’s TLB refill mechanism requires the page table for the current address space to
reside in the current virtual address space. The PTEBase field of the PTEVADDR register
gives the base address of the page table. On a TLB miss, the processor forms the virtual
address of the PTE by catenating the PTEBase portion of PTEVADDR, the Virtual Page
Number (VPN) bits of the miss virtual address, and 2 zero bits. The bits used from
PTEVADDR and from the virtual address are configuration dependent; the exact calculation for 4-byte PTEs is
PTEVADDR[31..22] || vAddr[31..12] || 2'b00
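The same calculation, written as a C sketch for the 4-byte PTE case (the function name pte_vaddr is hypothetical, and the bit ranges are those of the expression above rather than of any other configuration):

```c
#include <stdint.h>

/* Virtual address of the PTE that translates vaddr. */
static uint32_t pte_vaddr(uint32_t ptevaddr, uint32_t vaddr)
{
    uint32_t base   = ptevaddr & 0xFFC00000u; /* PTEVADDR[31..22] */
    uint32_t offset = (vaddr >> 12) << 2;     /* vAddr[31..12] || 2'b00 */
    return base | offset;
}
```

The 20-bit VPN shifted left by two occupies bits 21..0, so it never overlaps the 10-bit PTEBase field, and the whole page table spans a 4 MB aligned region of virtual address space.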

The format of the PTEs is shown in Figure 4–42. The most significant bits hold the Physical Page Number (PPN), the translation of the virtual address corresponding to this entry. The Sw bits are available for software use in the page table (they are not stored in
the TLB). The Ring field specifies the privilege level required to access this page; this is
used to choose one of the four ASIDs from RASID when the TLB is written. The attribute
field gives the access attributes for this page (see Section 4.6.5.10).
Bits    31..12  11..6  5..4  3..0
Field   PPN     Sw     Ring  Attribute
Width   20      6      2     4

Figure 4–42. MMU Option Page Table Entry (PTE) Format
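System software that builds page tables needs to pack and unpack this format. A minimal sketch follows; pte_fields, pte_pack, and pte_unpack are hypothetical names, not part of any Xtensa header.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t ppn;  /* bits[31:12], 20 bits */
    unsigned sw;   /* bits[11:6], software use, not stored in the TLB */
    unsigned ring; /* bits[5:4] */
    unsigned attr; /* bits[3:0] */
} pte_fields;

static uint32_t pte_pack(pte_fields f)
{
    assert(f.ppn <= 0xFFFFFu && f.sw <= 0x3Fu && f.ring <= 3u && f.attr <= 0xFu);
    return (f.ppn << 12) | ((uint32_t)f.sw << 6) | ((uint32_t)f.ring << 4) | f.attr;
}

static pte_fields pte_unpack(uint32_t pte)
{
    pte_fields f = { pte >> 12, (pte >> 6) & 0x3Fu, (pte >> 4) & 0x3u, pte & 0xFu };
    return f;
}
```

Apart from the Sw bits, the layout matches the at operand of WxTLB, which has zeros where the PTE has Sw.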


The configuration described in Section 4.6.5.4 (with IVARWAY56/DVARWAY56 Fixed)
provides a maximum of 3328 MB of dynamically mapped space (4 GB of total virtual address space with 768 MB of statically mapped space). The page table for this maximum
size requires 851968 PTEs (3328 MB/4 kB). The entire set of PTEs requires 3328 kB of
virtual address space (at 4 bytes per PTE). The PTEs themselves are at virtual addresses and, therefore, 832 of the PTEs in the table are for mapping the page table itself.
These PTEs for mapping the page table will fit onto a single page, the mapping for which
may be written into one of the single-entry ways (Ways 7-9) of the data TLB for guaranteed access.
For example, if PTEVADDR is set to 32’hCFC00000, then the virtual address space between there and 32’hCFF3FFFF is used as the page table. That page table is mapped
by the 832 entries between 32’hCFF3F000 and 32’hCFF3FCFF. The translation for the
page at 32’hCFF3F000 is placed in one of the single-entry ways of the data TLB. (The
accesses that might have used the remaining 192 PTE entries on that page would already have been translated by one of the constant ways.) Many of those 832 entries
may be marked invalid and the physical address space required for the page table may
be made very small.
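The sizing arithmetic in this example can be checked with a few lines of C. The function names below are hypothetical; the constants are the 4 kB page size and 4-byte PTE size used throughout this section.

```c
#include <stdint.h>

enum { PAGE_BYTES = 4096, PTE_BYTES = 4 };

/* Number of PTEs needed to map mapped_bytes of virtual space. */
static uint64_t num_ptes(uint64_t mapped_bytes)
{
    return mapped_bytes / PAGE_BYTES;
}

/* Virtual address space occupied by those PTEs. */
static uint64_t table_bytes(uint64_t mapped_bytes)
{
    return num_ptes(mapped_bytes) * PTE_BYTES;
}

/* Number of PTEs that map the page table itself. */
static uint64_t self_map_ptes(uint64_t mapped_bytes)
{
    return table_bytes(mapped_bytes) / PAGE_BYTES;
}
```

For 3328 MB of dynamically mapped space these give 851968 PTEs, 3328 kB of table, and 832 self-mapping PTEs. Since one 4 kB page holds 1024 PTEs, the 832 entries fit on a single page with 192 entries left over.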
In systems with large memories, the above maximum configuration may be improved in
performance by mapping the entire page table into the constant way (Way 5). If
PTEVADDR is set to 32’hD4000000, for example, the virtual address space between
there and 32’hD433FFFF, which maps to the physical address space between
32’h04000000 and 32’h0433FFFF (between 64 MB and about 68 MB) is used for a
flat page table mapping all of memory. Any TLB miss will now be handled by the hardware refill as the translation for the PTE will be handled by the constant way. The disadvantage is that over 3 MB of memory must be allocated to the page table.
In a small system, where all processes are limited to the first 8 MB of virtual space,
PTEVADDR might be set to 32’hCFC00000 and two of the single entry ways set to map
the page at 32’hCFC00000 and the page at 32’hCFC01000. One or both pages of
PTEs could be used for translations and the hardware refill would always succeed for legal addresses.
4.6.5.10 MMU Option Memory Attributes
Currently available hardware supports the memory attributes described in this section.
T1050 hardware supported somewhat different memory attributes, which are described
in Section A.5 “MMU Option Memory Attributes”. System software may use the subset of
attributes (1, 3, 5, 7, 12, 13, and 14) which have not changed to support all Xtensa processors.


The memory attributes discussed in this section apply both to attribute values written in
and read from the TLBs (see Section 4.6.5.5 and Section 4.6.5.6) and to attribute values
stored in the PTE entries and written into the AutoRefill ways of the TLBs (see
Section 4.6.5.9).
Table 4–109 shows the meanings of the attributes for instruction fetch, data load, and
data store. For a more detailed description of the memory access process and the place
of these attributes in it, see Section 4.6.2.
The first column in Table 4–109 indicates the attribute from the TLB while the remaining
columns indicate various effects on the access. The columns are described in the following bullets:
* Attr: the value of the 4-bit Attribute field of the TLB entry.
* Rights: whether the TLB entry may successfully translate a data load, a data store, or an instruction fetch.
  - The first character is an r if the entry is valid for a data load and a dash ("-") if not.
  - The second character is a w if the entry is valid for a data store and a dash ("-") if not.
  - The third character is an x if the entry is valid for an instruction fetch and a dash ("-") if not.
  If the translation is not successful, an exception is raised.
  Local memory accesses (including XLMI) consult only the Rights column.
* WB: some rows are split by whether or not the configured cache is writeback. Rows without an entry apply to both cache types.
* Meaning for Cache Access: the verbal description of the type of access made to the cache.
* Access Cache: indicates whether the cache provides the data.
  - The first character is an h if the cache provides the data when the tag indicates a hit and a dash ("-") if it does not.
  - The second character is an m if the cache provides the data when the tag indicates a miss and a dash ("-") if it does not. This capability is used only for Isolate mode.
* Fill Cache: indicates whether an allocate and fill is done to the cache if the tag indicates a miss.
  - The first character is an r if the cache is filled on a data load and a dash ("-") if it is not.
  - The second character is a w if the cache is filled on a data store and a dash ("-") if it is not.
  - The third character is an x if the cache is filled on an instruction fetch and a dash ("-") if it is not.
* Guard Load: refers to the guarded attribute as described in Table 4–99 on page 144. Stores are always guarded and instruction fetches are never guarded, but loads are guarded where there is a “yes” in this column. Local memory loads are not guarded.
* Write Thru: indicates whether a write is done through the PIF interface.
  - The first character is an h if a Write Thru occurs when the tag indicates a hit and a dash ("-") if it does not.
  - The second character is an m if a Write Thru occurs when the tag indicates a miss and a dash ("-") if it does not.

Writes to local memories are never Write-Thru. In most implementations, a write-thru will
only occur after any needed cache fill is complete.

Xtensa Instruction Set Architecture (ISA) Reference Manual

177

Chapter 4. Architectural Options

Table 4–109. MMU Option Attribute Field Values
Attr   Rights   Meaning for Cache Access   Access Cache   Fill Cache   Guard Load   Write Thru
0      r--      Bypass cache               --             ---          yes          --
1      r-x      Bypass cache               --             ---          yes          --
2      rw-      Bypass cache               --             ---          yes          hm
3      rwx      Bypass cache               --             ---          yes          hm
4      r--      Cached, WrtBack alloc      h-             r--          -            --
5      r-x      Cached, WrtBack alloc      h-             r-x          -            --
6      rw-      Cached, WrtBack alloc      h-             rw-          -            --
7      rwx      Cached, WrtBack alloc      h-             rwx          -            --
8      r--      Cached, WrtThru            h-             r--          -            --
9      r-x      Cached, WrtThru            h-             r-x          -            --
10     rw-      Cached, WrtThru            h-             r--          -            hm
11     rwx      Cached, WrtThru            h-             r-x          -            hm
12     ---      illegal (1)                --             ---          -            --
13     rw-      Cache Isolated (2)         hm             ---          -            --
14     ---      illegal (1)                --             ---          -            --
15     ---      Reserved (1)               -              -            -            -

(1) Raises exception. EXCCAUSE is set to InstFetchProhibitedCause, LoadProhibitedCause, or StoreProhibitedCause depending on access type.
(2) For test only, implementation dependent, uses data cache like local memories and ignores tag.

In the absence of the Instruction Cache Option, Cached regions behave as Bypass regions on instruction fetch. In the absence of the Data Cache Option, Cached regions behave as Bypass regions on data load or store. If the Data Cache is not configured as
writeback (Section 4.5.5.1 on page 119) Attributes 4, 5, 6, and 7 behave as Attributes 8,
9, 10, and 11 respectively instead of as they are listed in Table 4–109.
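The Rights column lends itself to a small lookup table. The sketch below uses hypothetical names (access_type, attr_rights, attr_permits) and models only the Rights column, not caching behavior. Treating the reserved attribute 15 as granting no access is an assumption based on its footnote, which says it raises an exception.

```c
#include <stdbool.h>

typedef enum { ACCESS_LOAD = 0, ACCESS_STORE = 1, ACCESS_FETCH = 2 } access_type;

/* Rights column of Table 4-109, indexed by the 4-bit attribute value. */
static const char *const attr_rights[16] = {
    "r--", "r-x", "rw-", "rwx", /* 0-3:  bypass cache */
    "r--", "r-x", "rw-", "rwx", /* 4-7:  cached, writeback allocate */
    "r--", "r-x", "rw-", "rwx", /* 8-11: cached, write-through */
    "---", "rw-", "---", "---"  /* 12 illegal, 13 isolate, 14 illegal, 15 reserved (assumed no access) */
};

/* True if the attribute allows the given kind of access. */
static bool attr_permits(unsigned attr, access_type t)
{
    return attr_rights[attr & 0xFu][t] != '-';
}
```

A caller that gets false from attr_permits would correspond to the hardware raising one of the prohibited-access exceptions named in the table footnote.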
4.6.5.11 MMU Option Operation Semantics
The following functions are used in the operation sections of the individual instruction
definitions:
function ltranslate(vAddr, ring)
    ltranslate ← (pAddr, attributes, cause)
endfunction ltranslate
function ASID(ring)
    ASID ← RASID[ring*8+ASIDBits-1..ring*8]
endfunction ASID
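The ASID function above is a plain bit-field extraction from RASID. A C sketch, in which asid_for_ring is a hypothetical name and asid_bits stands for the configuration value ASIDBits:

```c
#include <assert.h>
#include <stdint.h>

/* Extract RASID[ring*8+asid_bits-1 .. ring*8]. */
static unsigned asid_for_ring(uint32_t rasid, unsigned ring, unsigned asid_bits)
{
    assert(ring <= 3 && asid_bits >= 1 && asid_bits <= 8);
    return (rasid >> (ring * 8)) & ((1u << asid_bits) - 1u);
}
```

With a RASID value of 0x04030201 and eight ASID bits, rings 0 through 3 would yield ASIDs 1 through 4.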


function InstPageBits(wi)
    sizecodebits ← ceil(log2(InstTLB[wi].PageSizeCount))
    sizecode ← IPAGESIZE_{wi*4+sizecodebits-1..wi*4}
    InstPageBits ← InstTLB[wi].PageBits[sizecode]
endfunction InstPageBits
function SplitInstTLBEntrySpec(spec)
    wih ← ceil(log2(InstTLBWayCount)) − 1
    wi ← spec_{wih..0}
    eil ← InstPageBits(wi)
    eih ← eil + log2(InstTLB[wi].IndexCount)
    ei ← spec_{eih..eil}
    vpn ← spec_{InstTLBVAddrBits-1..eih+1}
    SplitInstTLBEntrySpec ← (vpn, ei, wi)
endfunction SplitInstTLBEntrySpec
function ProbeInstTLB (vAddr)
match ← 0
vpn ← undefined
ei ← undefined
wi ← undefined
for i in 0..InstTLBWayCount-1 do
if then
match ← match + 1
vpn ← x
ei ← x
wi ← i
endif
endfor
ProbeInstTLB ← (match, vpn, ei, wi)
endfunction ProbeInstTLB
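
The ASID function above selects one field of the RASID register per ring. A minimal Python sketch of that bit selection follows; the function name, the default of 8 ASID bits, and the sample RASID value are assumptions chosen for illustration only.

```python
def asid(rasid: int, ring: int, asid_bits: int = 8) -> int:
    """Return RASID bits ring*8+asid_bits-1 .. ring*8 for the given ring."""
    if not 0 <= ring <= 3:
        raise ValueError("ring must be in 0..3")
    return (rasid >> (ring * 8)) & ((1 << asid_bits) - 1)


# Hypothetical RASID contents: one byte per ring, ring 0 in the low byte.
sample_rasid = 0x04030201
ring_asids = [asid(sample_rasid, ring) for ring in range(4)]
print(ring_asids)
```

With fewer than 8 ASID bits configured, the upper bits of each byte-wide field are simply ignored by the mask.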

4.7 Options for Other Purposes

This section contains options that do not fit easily into the previous sections. The Windowed Register Option provides the hardware for a memory efficient ABI. The Processor Interface Option provides a standard interface to system memory. The Miscellaneous Special Registers Option provides additional scratch registers. The Processor ID
Option provides the ability for software to determine on which processor it is running.
The Debug Option provides hardware to assist in debugging processors.

4.7.1 Windowed Register Option

The Windowed Register Option replaces the simple 16-entry AR register file with a larger register file from which a window of 16 entries is visible at any given time. The window
is rotated on subroutine entry and exit, automatically saving and restoring some registers. When the window is rotated far enough to require registers to be saved to or restored from the program stack, an exception is raised to move some of the register values between the register file and the program stack. The option reduces code size and
increases performance of programs by eliminating register saves and restores at procedure entry and exit, and by reducing argument-shuffling at calls. It allows more local
variables to live permanently in registers, reducing the need for stack-frame maintenance in non-leaf routines.
Xtensa ISA register windows are different from register windows in other instruction
sets. Xtensa register increments are 4, 8, and 12 on a per-call basis, not a fixed increment as in other instruction sets. Also, Xtensa processors have no global address registers. The caller specifies the increment amount, while the callee performs the actual increment by the ENTRY instruction. The compiler uses an increment sufficient to hide the
registers that are live at the point of the call (which the compiler can pack into the fewest
possible at the low end of the register-number space). The number of physical registers
is 32 or 64, which makes this a more economical configuration. Sixteen registers are visible at one time. Assuming that the average number of live registers at the point of call is
6.5 (return address, stack pointer, and 4.5 local variables), and that the last routine uses
12 registers at its peak, this allows nine call levels to live in 64 registers (8×6.5+12=64).
As an example, an average of 6.5 live registers might represent 50% of the calls using
an increment of 4, 38% using an increment of 8, and 12% using an increment of 12.
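
The arithmetic in the paragraph above can be checked with a short Python sketch. The call-mix percentages and the 12-register peak come from the text; the helper name `call_levels` is an assumption introduced here.

```python
# Call mix from the text: share of calls using each window increment.
call_mix = {4: 0.50, 8: 0.38, 12: 0.12}

# Weighted average number of registers hidden per call (about 6.5).
average_live = sum(increment * share for increment, share in call_mix.items())


def call_levels(nareg: int, live_per_call: float, last_routine_regs: int) -> int:
    """Call levels that fit in nareg physical registers.

    Every level except the last hides live_per_call registers; the last
    routine needs last_routine_regs registers at its peak.
    """
    return int((nareg - last_routine_regs) // live_per_call) + 1


print(round(average_live, 2))
print(call_levels(64, 6.5, 12))
```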
• Prerequisites: Exception Option (page 82)
• Incompatible options: None

The rotation of the 16-entry visible window within the larger register file is controlled by
the WindowBase Special Register added by the option. The rotation always occurs in
units of four registers, causing the number of bits in WindowBase to be log2(NAREG/4).
Rotation at the time of a call can instantly save some registers and provide new registers for the called routine. Each saved register has a reserved location on the stack, to
which it may be saved if the call stack extends enough farther to need to re-use the
physical registers. The WindowStart Special Register, which is also added by the option
and consists of NAREG/4 bits, indicates which four register units are currently cached in
the physical register file instead of residing in their stack locations. An attempt to use
registers live with values from a parent routine raises an Overflow Exception which
saves those values and frees the registers for use. A return to a calling routine whose
registers have been previously saved to the stack raises an Underflow Exception which
restores those values. Programs without wide swings in the depth of the call stack save
and restore values only occasionally.
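
As a quick check of the register widths stated above, the following Python sketch computes the WindowBase and WindowStart widths for the two valid NAREG values. The function name is an assumption for illustration.

```python
def window_register_widths(nareg: int) -> dict:
    """Widths in bits of WindowBase and WindowStart for a given NAREG."""
    if nareg not in (32, 64):
        raise ValueError("NAREG must be 32 or 64")
    groups = nareg // 4  # the window rotates in units of four registers
    return {
        "WindowBase": groups.bit_length() - 1,  # log2(NAREG/4)
        "WindowStart": groups,                  # one bit per four registers
    }


for nareg in (32, 64):
    print(nareg, window_register_widths(nareg))
```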

4.7.1.1 Windowed Register Option Architectural Additions
Table 4–110 through Table 4–113 show this option’s architectural additions.
Table 4–110. Windowed Register Option Constant Additions (Exception Causes)
Exception Cause | Description | Constant Value
AllocaCause | MOVSP instruction, if the caller’s registers are not present in the register file (see Table 4–64 on page 89) | 6'b000101 (decimal 5)

Table 4–111. Windowed Register Option Processor-Configuration Additions
Parameter | Description | Valid Values
WindowOverflow4 | Window overflow exception vector for 4-register stack frame | 32-bit address¹
WindowUnderflow4 | Window underflow exception vector for 4-register stack frame | 32-bit address¹
WindowOverflow8 | Window overflow exception vector for 8-register stack frame | 32-bit address¹
WindowUnderflow8 | Window underflow exception vector for 8-register stack frame | 32-bit address¹
WindowOverflow12 | Window overflow exception vector for 12-register stack frame | 32-bit address¹
WindowUnderflow12 | Window underflow exception vector for 12-register stack frame | 32-bit address¹
NAREG | Number of address registers | 32 or 64

1. Some implementations have restrictions on the alignment and relative location of the WindowOverflowN and WindowUnderflowN vectors. See
“procedure WindowCheck (wr, ws, wt)” in Section 4.7.1.3 “Window Overflow Check” on page 184 for how these are used.

Table 4–112. Windowed Register Option Processor-State Additions and Changes
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹
AR | NAREG | 32 | Address registers (general registers) | R/W | (none)
WindowBase | 1 | log2(NAREG/4) | Base of current address-register window | R/W | 72
WindowStart | 1 | NAREG/4 | Call-window start bits | R/W | 73
PS.CALLINC | 1 | 2 | Miscellaneous processor state, window increment from call (see Table 4–63 on page 87) | R/W | 230
PS.OWB | 1 | 4 | Miscellaneous processor state, old window base (see Table 4–63 on page 87) | R/W | 230
PS.WOE | 1 | 1 | Miscellaneous processor state, window overflow enable (see Table 4–63 on page 87) | R/W | 230

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 5–127 on
page 205.

Table 4–113. Windowed Register Option Instruction Additions
Instruction¹ | Format | Definition
MOVSP | RRR | Atomic check window and move
CALL4, CALL8, CALL12 | CALL | Call subroutine, PC-relative. These instructions communicate the number of registers to hide using PS.CALLINC in addition to the operation of CALL0.
CALLX4, CALLX8, CALLX12 | CALLX | Call subroutine, address in register. These instructions communicate the number of registers to hide using PS.CALLINC in addition to the operation of CALLX0.
ENTRY | BRI12 | Subroutine entry: rotate registers, adjust stack pointer. This instruction should not be used in a routine called by CALL0 or CALLX0.
RETW | CALLX | Subroutine return: unrotate registers, jump to return address. Used to return from a routine called by CALL4, CALL8, CALL12, CALLX4, CALLX8, or CALLX12.
RETW.N² | RRRN | Same as RETW in a 16-bit encoding
ROTW | RRR | Rotate window by a constant. ROTW is intended for use in exception handlers and context switch.
L32E | RRI4 | Load 32 bits for window exception
S32E | RRI4 | Store 32 bits for window exception
RFWO | RRR | Return from window overflow exception
RFWU | RRR | Return from window underflow exception

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.

2. Exists only if the Code Density Option described in Section 4.3.1 on page 53 is configured.

4.7.1.2 Managing Physical Registers
The WindowBase Special Register gives the position of the current window into the
physical register file. In the instruction descriptions, AR[i] is a short-hand for a reference to the physical register file AddressRegister defined as follows:
AddressRegister[((2'b00||i_{3..2}) + WindowBase) || i_{1..0}]
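
The mapping from a windowed register number to a physical register can be sketched in Python as follows. The function name is an assumption, and the modulo models the fixed-width addition wrapping around the register file, which the formula above implies but does not spell out.

```python
def physical_index(i: int, window_base: int, nareg: int = 64) -> int:
    """Physical register index for windowed register AR[i].

    The upper two bits of i select a group of four registers relative to
    WindowBase; the lower two bits select a register within that group.
    """
    if not 0 <= i <= 15:
        raise ValueError("i must be in 0..15")
    groups = nareg // 4
    group = ((i >> 2) + window_base) % groups  # assumed wraparound
    return (group << 2) | (i & 0b11)


print(physical_index(5, window_base=3))    # second group of the window
print(physical_index(15, window_base=15))  # wraps past the end of the file
```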

The WindowStart Special Register gives the state of physical registers (unused or part
of a window). WindowStart is used both to detect overflow and underflow on register
use and procedure return, as well as to determine the number of registers to be saved in
a given stack frame when handling exceptions and switching contexts. There is one bit
in WindowStart for each four physical registers. This bit is set if those four registers
are AR[0] to AR[3] for some call. WindowStart bits are set by ENTRY and cleared by
RETW.
The WindowBase and WindowStart registers are undefined after processor reset, and
should be initialized by the reset exception vector code.
Figure 4–43 through Figure 4–45 show three functionally identical implementations of
windowed registers. Figure 4–43 shows the concept of how the registers are addressed.
Figure 4–44 shows logic with the same functional result but with little or no penalty paid
in timing for the addition of the WindowBase value. Figure 4–45 shows a third version of
the logic with the same functional result but with no timing loss at all caused by the addition of the WindowBase value.
Figure 4–43. Conceptual Register Window Read [diagram omitted; labels: WindowBase, Inst fields s and t, adders, 64:1 multiplexers, 64 32-bit registers]

Figure 4–44. Faster Register Window Read [diagram omitted; labels: WindowBase, Inst fields s and t, adders, 16:1 and 4:1 multiplexers, 64 32-bit registers]
Figure 4–45. Fastest Register Window Read [diagram omitted; labels: WindowBase, Inst fields s and t, 16:1 multiplexers, 64 32-bit registers]
4.7.1.3 Window Overflow Check
The ENTRY instruction moves the register window, but does not guarantee that all the
registers in the current window are available for use. Instead, the processor waits for the
first reference to an occupied physical register before triggering a window overflow. This
prevents unnecessary overflows, because many routines do not use all 16 of their virtual
registers. Figure 4–46 shows the state of the register file just prior to a reference that
causes an overflow. The WS(n) notation shows which WindowStart bits are set in this
example, and gives the distance to the next bit set (that is, the number of registers
stored for the corresponding stack frame). In the figure, “rmax” indicates the maximum
register that the current procedure uses and “Base” is an abbreviation for WindowBase.
Note that registers are considered in groups of four here.
Figure 4–46. Register Window Near Overflow [diagram omitted; labels: WS(n) markers, Invalid, Valid Data, Active regs, Base, Base+rmax, Base+16, Window]
The check for overflow is done as follows:
WindowCheck (
    if ref(AR[r]) then r_{3..2} else 2'b00,
    if ref(AR[s]) then s_{3..2} else 2'b00,
    if ref(AR[t]) then t_{3..2} else 2'b00)

where ref() is 1 if the register is used by the instruction, and 0 otherwise, and
WindowCheck is defined as follows:
procedure WindowCheck (wr, ws, wt)
    n ← if (wr ≠ 2'b00 or ws ≠ 2'b00 or wt ≠ 2'b00)
            and WindowStart_{WindowBase+1} then 2'b01
        else if (wr_{1} or ws_{1} or wt_{1})
            and WindowStart_{WindowBase+2} then 2'b10
        else if (wr = 2'b11 or ws = 2'b11 or wt = 2'b11)
            and WindowStart_{WindowBase+3} then 2'b11
        else 2'b00
    if CWOE = 1 and n ≠ 2'b00 then
        PS.OWB ← WindowBase
        m ← WindowBase + (2'b00||n)
        PS.EXCM ← 1
        EPC[1] ← PC
        nextPC ← if WindowStart_{m+1} then WindowOverflow4
                 else if WindowStart_{m+2} then WindowOverflow8
                 else WindowOverflow12
        WindowBase ← m
    endif
endprocedure WindowCheck

A single instruction may raise multiple window overflow exceptions. For example, suppose that registers 4..7 of the current window still contain a previous call frame’s values (WindowStart_{WindowBase+1} is set), and 8..15 are part of the subroutine called by
that frame (WindowStart_{WindowBase+2} is also set), and an instruction references register 10. The processor will raise an exception to spill registers 4..7 and then return to
retry the instruction, which will then raise another exception to spill registers 8..15. On
return from this overflow handler, the reference will finally succeed.
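
The two-exception scenario above can be traced with a small Python model of the WindowCheck decision. This is a sketch under stated assumptions: CWOE is taken to be 1, the function names are invented for illustration, WindowStart is modeled as an integer bit mask, and the clearing of a WindowStart bit after each spill stands in for what the overflow handler's return is assumed to do.

```python
def ws_bit(window_start: int, index: int, groups: int) -> int:
    """Bit `index` of WindowStart, with the index wrapping modulo groups."""
    return (window_start >> (index % groups)) & 1


def window_check(wr, ws, wt, window_start, window_base, nareg=64):
    """Return (n, vector); (0, None) means no overflow is raised."""
    groups = nareg // 4
    refs = (wr, ws, wt)
    if any(x != 0 for x in refs) and ws_bit(window_start, window_base + 1, groups):
        n = 1
    elif any(x & 0b10 for x in refs) and ws_bit(window_start, window_base + 2, groups):
        n = 2
    elif any(x == 0b11 for x in refs) and ws_bit(window_start, window_base + 3, groups):
        n = 3
    else:
        return 0, None
    m = window_base + n
    if ws_bit(window_start, m + 1, groups):
        return n, "WindowOverflow4"
    if ws_bit(window_start, m + 2, groups):
        return n, "WindowOverflow8"
    return n, "WindowOverflow12"


# WindowBase = 0; frames start at register groups 0, 1, 2 and 4.
window_start = 0b10111
ref_group = 10 >> 2  # register 10 lies in group 2 of the window

first = window_check(ref_group, 0, 0, window_start, 0)
window_start &= ~(1 << 1)  # assumed effect of spilling registers 4..7
second = window_check(ref_group, 0, 0, window_start, 0)
window_start &= ~(1 << 2)  # assumed effect of spilling registers 8..15
third = window_check(ref_group, 0, 0, window_start, 0)
print(first, second, third)
```

The first check selects the 4-register handler, the retry selects the 8-register handler, and the third attempt proceeds without an exception.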
4.7.1.4 Call, Entry, and Return Mechanism
The register window mechanics of the {CALL, CALLX}{4,8,12}, ENTRY, and {RETW,
RETW.N} instructions are:
CALLn/CALLXn
    WindowCheck (2'b00, 2'b00, n)
    PS.CALLINC ← n
    AR[n||2'b00] ← n || (PC + 3)_{29..0}
ENTRY s, imm12
    AR[PS.CALLINC||s_{1..0}] ← AR[s] − (0^{17}||imm12||0^{3})
    WindowBase ← WindowBase + (0^{2}||PS.CALLINC)
    WindowStart_{WindowBase} ← 1

In the definition of ENTRY above, the AR read and the AR write refer to different registers.
RETW/RETW.N
    n ← AR[0]_{31..30}
    nextPC ← PC_{31..30} || AR[0]_{29..0}
    owb ← WindowBase
    m ← if WindowStart_{WindowBase-4'b0001} then 2'b01
        elsif WindowStart_{WindowBase-4'b0010} then 2'b10
        elsif WindowStart_{WindowBase-4'b0011} then 2'b11
        else 2'b00
    if n = 2'b00 | (m ≠ 2'b00 & m ≠ n) | PS.WOE=0 | PS.EXCM=1 then
        -- undefined operation
        -- may raise illegal instruction exception
    else
        WindowBase ← WindowBase − (0^{2}||n)
        if WindowStart_{WindowBase} ≠ 0 then
            WindowStart_{owb} ← 0
        else
            -- Underflow exception
            PS.EXCM ← 1
            EPC[1] ← PC
            PS.OWB ← owb
            nextPC ← if n = 2'b01 then WindowUnderflow4
                     else if n = 2'b10 then WindowUnderflow8
                     else WindowUnderflow12
        endif
    endif

The RETW opcode assignment is such that the s and t fields are both zero, so that the
hardware may use either AR[s] or AR[t] in place of AR[0] above. Underflow is detected by the caller’s window’s WindowStart bit being clear (that is, not valid).
Figure 4–47 shows the register file just before a RETW that raises an underflow exception. Window overflow and window underflow exceptions leave PS.UM unchanged.
Figure 4–47. Register Window Just Before Underflow [diagram omitted; labels: WS, Invalid, Active Regs, Valid Data, Base, Base+rmax, Base+16, Window]
4.7.1.5 Windowed Procedure-Call Protocol
While the procedure-call protocol is a matter for the compiler and ABI, the Xtensa ISA,
and particularly the Windowed Register Option, was designed with the following goals in
mind:
• Provide highly efficient call/return (measured in both code size and execution time)
• Support per-call register window increments
• Use a single stack for both register save/restore and local variables
• Support variable frame sizes (for example, alloca)
• Support programming language exception handling (for example, setjmp/longjmp, catch/throw, and so forth)
• Support debuggers
• Require minimal special ISA features (special registers and so forth)

Table 4–114 shows the register usage in the Windowed Register Option. Refer to
Section 8.1 “The Windowed Register and CALL0 ABIs” for a more complete description
of the Windowed Register ABI.
Table 4–114. Windowed Register Usage
Callee Register | Register Name | Usage
0 | a0 | Return address
1 | a1/sp | Stack pointer
2..7 | a2..a7 | In, out, inout, and return values

Calls to routines that use only a2..a3 as parameters may use the CALL4, CALL8, or
CALL12 instructions to save 4, 8, or 12 live registers. Calls to routines that use a2..a7
for parameters may use only CALL4 or CALL8. The following assembly language illustrates the call protocol.
// In procedure g, the call
//     z = f(x, y)
// would compile into
    mov     a6, x           // a6 is f’s a2 (x)
    mov     a7, y           // a7 is f’s a3 (y)
    call4   f               // put return address in f’s a0,
                            // goto f
    mov     z, a6           // a6 is f’s a2 (return value)
// The function
//     int f(int a, int *b) { return a + *b; }
// would compile into
f:  entry   sp, framesize   // allocate stack frame, rotate regs
    // on entry, a0/ return address, a1/ stack pointer,
    // a2/ a, a3/ *b
    l32i    a3, a3, 0       // *b
    add     a2, a2, a3      // *b + a
    retw

The “highly efficient call/return” goal requires that there not be separate stack and frame
pointer registers in cases where they would differ by a constant (that is, no alloca is
used). There are simply not enough registers to waste. For routines that do call alloca,
the compiler will copy the initial stack pointer to another register and use that for addressing all locals.
The variable allocation,
p1 = alloca(n1);

will be implemented as

    movi    t4, -16             // for alignment to 16-byte boundary
    sub     t5, sp, n1          // reserve stack space
    and     t4, t5, t4          // ...
    movsp   sp, t4              // atomically set sp
    addi    p1, sp, -16+botsize // save pointer

The botsize in the last statement allows the compiler to maintain a block of words at
the bottom of the stack (for example, this block might be for memory arguments to routines). The -16 is a constant of the call protocol; it puts 16 bytes of the bottom area below the stack pointer (since they are infrequently referenced), leaving the limited range
of the ISA’s load/store offsets available for more frequently referenced locals.
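
The address arithmetic of the alloca sequence can be followed with a small Python sketch. The function name and the sample values for sp, n1, and botsize are assumptions; the steps mirror the movi, sub, and, movsp, and addi instructions of the sequence, with addresses modeled as 32-bit unsigned integers.

```python
MASK32 = 0xFFFFFFFF


def alloca_sketch(sp: int, n1: int, botsize: int) -> tuple:
    """Return (new_sp, p1) for p1 = alloca(n1)."""
    t4 = -16 & MASK32                      # movi  t4, -16
    t5 = (sp - n1) & MASK32                # sub   t5, sp, n1
    new_sp = t5 & t4                       # and   t4, t5, t4 ; movsp sp, t4
    p1 = (new_sp - 16 + botsize) & MASK32  # addi  p1, sp, -16+botsize
    return new_sp, p1


new_sp, p1 = alloca_sketch(sp=0x1000, n1=20, botsize=32)
print(hex(new_sp), hex(p1))
```

The new stack pointer is rounded down to a 16-byte boundary, so the allocation may reserve slightly more than n1 bytes.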
Figure 4–48 and Figure 4–49 show the stack frame before and after alloca.

Figure 4–48. Stack Frame Before alloca() [diagram omitted; labels: minimum frame size (specified in ENTRY instruction), locals, sp, lp, bottom, sp-16]

Figure 4–49. Stack Frame After First alloca() [diagram omitted; labels: locals, lp, n1 bytes, p1, sp, bottom, sp-16]
Figure 4–50 shows the stacking of frames when the stack grows downward, as on most
other systems. The window save area for a frame is addressed with negative offsets
from the next stack frame’s sp. Four registers are saved in the base save area. If more
than four registers are saved, they are stored at the top of the stack frame, in the extra
save area, which can be found with negative offsets from the previous stack frame’s sp.
This unusual split allows for simple backtrace while providing for a variable sized save
area.

Figure 4–50. Stack Frame Layout [diagram omitted; regions listed from larger to smaller addresses]
    Stack Pointer i-2
    extra save area i-1      (Frame i-1, previous frame)
    locals i-1
    Stack Pointer i-1
    base save area i-2
    extra save area i        (Frame i, current frame)
    locals i
    Stack Pointer (a1/sp)
    base save area i-1
Several of the goals listed on page 187 require that call stacks be backward-traceable.
That is, from the state of call[i], it must be possible to determine the state of
call[i-1]. It is best if the state of call[i] can be summarized in a single pointer (at
least when the registers have been saved), in which case this requirement is best described as: There must be a means of determining the pointer for call[i-1] from the
pointer of call[i]. For managing register-window overflow or underflow, this method
should also be very efficient; it should not, for example, involve routine-specific information or other table lookup (for example, frame size or stack offsets).
The Xtensa ISA represents the state of call[i] with its stack pointer (not the frame
pointer, as that is routine-specific and would cost too much to look up). This can be made
to work even with alloca. Therefore it must be possible to read the stack pointer for
call[i-1] at a fixed offset from the stack pointer (not the frame pointer) for call[i].
Thus, the stack pointer for call[i-1] is stored in the area labeled “base save area i-1”
in Figure 4–50.
For efficiency, the call[i-1] stack pointer is only stored into call[i]’s frame when
call[i-1]’s registers are stored into the stack on overflow. This is sufficient for register window underflow handling. Other back-tracing operations should begin by storing
registers of all call frames back into the stack.
Because the call[i-1] stack pointer is referenced infrequently, it is stored at a negative offset from the stack pointer. This leaves the ISA’s limited positive offsets available
for more frequent uses. Thus, the stack always reaches to 16 bytes below the contents
of the stack pointer. Interrupts and such must respect this 16-byte reserved space below
the stack pointer. Because the minimum number of registers to save is four, the processor stores four of call[i-1]’s registers, a0..a3, in this space; the rest (if any) are
saved in call[i-1]’s own frame.
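
The split between the base save area and the extra save area can be illustrated with a Python sketch that computes where each register of a spilled frame lands. The function and parameter names and the sample addresses are assumptions; the offsets are the ones used by the S32E and L32E instructions in the handler examples of Section 4.7.1.6.

```python
def save_slots(frame_regs: int, next_sp: int, prev_sp: int = 0) -> dict:
    """Map register number -> save address for a frame of 4, 8, or 12 registers.

    next_sp: stack pointer of the frame called by the spilled frame
             (a0..a3 go into the 16 bytes below it).
    prev_sp: stack pointer of the spilled frame's caller (registers from
             a4 upward go below it, beneath that caller's base save area).
    """
    if frame_regs not in (4, 8, 12):
        raise ValueError("frame_regs must be 4, 8, or 12")
    slots = {r: next_sp - 16 + 4 * r for r in range(4)}
    for r in range(4, frame_regs):
        slots[r] = prev_sp - 4 * frame_regs + 4 * (r - 4)
    return slots


slots8 = save_slots(8, next_sp=0x2000, prev_sp=0x3000)
slots12 = save_slots(12, next_sp=0x2000, prev_sp=0x3000)
print({f"a{r}": hex(addr) for r, addr in slots8.items()})
```

In both the 8-register and 12-register cases the last extra register sits at offset -20, directly beneath the 16-byte base save area that precedes it in memory.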
The register-window call instructions only store the least-significant 30 bits of the return
address. Register-window return instructions leave the two most-significant bits of the
PC unchanged. Therefore, subroutines called using register window instructions must
be placed in the same 1 GB address region as the call.
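
The 1 GB restriction follows directly from the return-address encoding, as the following Python sketch shows. The function names and the sample addresses are assumptions; the bit manipulation follows the CALLn and RETW definitions in Section 4.7.1.4.

```python
LOW30 = 0x3FFFFFFF
TOP2 = 0xC0000000


def call_return_register(pc: int, n: int) -> int:
    """Value CALLn/CALLXn writes to the callee's a0: n || (PC + 3) bits 29..0."""
    if n not in (1, 2, 3):
        raise ValueError("n must be 1, 2, or 3 (CALL4, CALL8, CALL12)")
    return (n << 30) | ((pc + 3) & LOW30)


def retw_target(pc: int, a0: int) -> int:
    """Address RETW jumps to: PC bits 31..30 || a0 bits 29..0."""
    return (pc & TOP2) | (a0 & LOW30)


a0 = call_return_register(pc=0x40001000, n=2)         # CALL8 at 0x40001000
same_region = retw_target(pc=0x40002000, a0=a0)       # callee in the same 1 GB
other_region = retw_target(pc=0x80002000, a0=a0)      # callee in another 1 GB
print(hex(a0), hex(same_region), hex(other_region))
```

When the callee lies in a different 1 GB region, the reconstructed address no longer equals the intended return address, which is why caller and callee must share a region.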
4.7.1.6 Window Overflow and Underflow to and from the Program Stack
Register-window underflow occurs when a return instruction decrements to a window
that has been spilled (indicated by its WindowStart bit being cleared). The processor
saves the current PC in EPC[1] and transfers to one of three underflow handlers based
on the register window decrement. When the MMU Option is configured, it is necessary
for the handlers to access the stack with the same privilege level as the code that raised
the exception. Two special instructions, L32E and S32E, are therefore added by the
Windowed Register Option for this purpose. In addition, these instructions use negative
offsets in the formation of the virtual address, which saves several instructions in the
handlers. The exception handlers could be as simple as the following:
WindowOverflow4:
    // inside call[i] referencing a register that
    // contains data from call[j]
    // On entry here: window rotated to call[j] start point; the
    // registers to be saved are a0-a3; a4-a15 must be preserved
    // a5 is call[j+1]’s stack pointer
    s32e    a0, a5, -16     // save a0 to call[j+1]’s frame
    s32e    a1, a5, -12     // save a1 to call[j+1]’s frame
    s32e    a2, a5, -8      // save a2 to call[j+1]’s frame
    s32e    a3, a5, -4      // save a3 to call[j+1]’s frame
    rfwo                    // rotates back to call[i] position

WindowUnderflow4:
    // returning from call[i+1] to call[i] where
    // call[i]’s registers must be reloaded
    // On entry here: a0-a3 are to be reloaded with
    // call[i].reg[0..3] but initially contain garbage.
    // a4-a15 are call[i+1].reg[0..11],
    // (in particular, a5 is call[i+1]’s stack pointer)
    // and must be preserved
    l32e    a0, a5, -16     // restore a0 from call[i+1]’s frame
    l32e    a1, a5, -12     // restore a1 from call[i+1]’s frame
    l32e    a2, a5, -8      // restore a2 from call[i+1]’s frame
    l32e    a3, a5, -4      // restore a3 from call[i+1]’s frame
    rfwu
WindowOverflow8:
    // On entry here: window rotated to call[j]; the registers to be
    // saved are a0-a7; a8-a15 must be preserved
    // a9 is call[j+1]’s stack pointer
    s32e    a0, a9, -16     // save a0 to call[j+1]’s frame
    l32e    a0, a1, -12     // a0 <- call[j-1]’s sp
    s32e    a1, a9, -12     // save a1 to call[j+1]’s frame
    s32e    a2, a9, -8      // save a2 to call[j+1]’s frame
    s32e    a3, a9, -4      // save a3 to call[j+1]’s frame
    s32e    a4, a0, -32     // save a4 to call[j]’s frame
    s32e    a5, a0, -28     // save a5 to call[j]’s frame
    s32e    a6, a0, -24     // save a6 to call[j]’s frame
    s32e    a7, a0, -20     // save a7 to call[j]’s frame
    rfwo                    // rotates back to call[i] position

WindowUnderflow8:
    // On entry here: a0-a7 are call[i].reg[0..7] and initially
    // contain garbage, a8-a15 are call[i+1].reg[0..7],
    // (in particular, a9 is call[i+1]’s stack pointer)
    // and must be preserved
    l32e    a0, a9, -16     // restore a0 from call[i+1]’s frame
    l32e    a1, a9, -12     // restore a1 from call[i+1]’s frame
    l32e    a2, a9, -8      // restore a2 from call[i+1]’s frame
    l32e    a7, a1, -12     // a7 <- call[i-1]’s sp
    l32e    a3, a9, -4      // restore a3 from call[i+1]’s frame
    l32e    a4, a7, -32     // restore a4 from call[i]’s frame
    l32e    a5, a7, -28     // restore a5 from call[i]’s frame
    l32e    a6, a7, -24     // restore a6 from call[i]’s frame
    l32e    a7, a7, -20     // restore a7 from call[i]’s frame
    rfwu
WindowOverflow12:
    // On entry here: window rotated to call[j]; the registers to be
    // saved are a0-a11; a12-a15 must be preserved
    // a13 is call[j+1]’s stack pointer
    s32e    a0, a13, -16    // save a0 to call[j+1]’s frame
    l32e    a0, a1, -12     // a0 <- call[j-1]’s sp
    s32e    a1, a13, -12    // save a1 to call[j+1]’s frame
    s32e    a2, a13, -8     // save a2 to call[j+1]’s frame
    s32e    a3, a13, -4     // save a3 to call[j+1]’s frame
    s32e    a4, a0, -48     // save a4 to end of call[j]’s frame
    s32e    a5, a0, -44     // save a5 to end of call[j]’s frame
    s32e    a6, a0, -40     // save a6 to end of call[j]’s frame
    s32e    a7, a0, -36     // save a7 to end of call[j]’s frame
    s32e    a8, a0, -32     // save a8 to end of call[j]’s frame
    s32e    a9, a0, -28     // save a9 to end of call[j]’s frame
    s32e    a10, a0, -24    // save a10 to end of call[j]’s frame
    s32e    a11, a0, -20    // save a11 to end of call[j]’s frame
    rfwo                    // rotates back to call[i] position

WindowUnderflow12:
    // On entry here: a0-a11 are call[i].reg[0..11] and initially
    // contain garbage, a12-a15 are call[i+1].reg[0..3],
    // (in particular, a13 is call[i+1]’s stack pointer)
    // and must be preserved
    l32e    a0, a13, -16    // restore a0 from call[i+1]’s frame
    l32e    a1, a13, -12    // restore a1 from call[i+1]’s frame
    l32e    a2, a13, -8     // restore a2 from call[i+1]’s frame
    l32e    a11, a1, -12    // a11 <- call[i-1]’s sp
    l32e    a3, a13, -4     // restore a3 from call[i+1]’s frame
    l32e    a4, a11, -48    // restore a4 from end of call[i]’s frame
    l32e    a5, a11, -44    // restore a5 from end of call[i]’s frame
    l32e    a6, a11, -40    // restore a6 from end of call[i]’s frame
    l32e    a7, a11, -36    // restore a7 from end of call[i]’s frame
    l32e    a8, a11, -32    // restore a8 from end of call[i]’s frame
    l32e    a9, a11, -28    // restore a9 from end of call[i]’s frame
    l32e    a10, a11, -24   // restore a10 from end of call[i]’s frame
    l32e    a11, a11, -20   // restore a11 from end of call[i]’s frame
    rfwu

4.7.2 Processor Interface Option

The Processor Interface Option adds a bus interface used for memory accesses
to locations other than local memories (page 123 through page 126). It is used for
cache misses for cacheable addresses (page 111 through page 122), as well as for
cache bypass memory accesses.
Direct memory access to local memories from outside may also be configured through
the bus interface added by the Processor Interface Option. The direct memory access
may either be top priority for highest bandwidth or intermediate priority for greatest efficiency.
• Prerequisites: None
• Incompatible options: None
• Historical note: The additions made by this option were once considered part of the Core Architecture and so compatibility with previous hardware might require the use of this option.

Refer to a specific Xtensa processor data book for more detail on the Processor Interface Option.
4.7.2.1 Processor Interface Option Architectural Additions
Table 4–115 shows this option’s architectural additions (see Table 4–64 on page 89 for
more). Note that asynchronous load/store errors are delivered via a configuration-dependent interrupt.
Table 4–115. Processor Interface Option Constant Additions (Exception Causes)
Exception Cause | Description | Constant Value
InstrPIFDataErrorCause | PIF data error during instruction fetch | 6'b001100 (decimal 12)
LoadStorePIFDataErrorCause | Synchronous PIF data error during LoadStore access | 6'b001101 (decimal 13)
InstrPIFAddrErrorCause | PIF address error during instruction fetch | 6'b001110 (decimal 14)
LoadStorePIFAddrErrorCause | Synchronous PIF address error during LoadStore access | 6'b001111 (decimal 15)

4.7.3 Miscellaneous Special Registers Option

The Miscellaneous Special Registers Option provides zero to four scratch registers within the processor readable and writable by RSR, WSR, and XSR. These registers are privileged. They may be useful for some application-specific exception and interrupt processing tasks in the kernel. The MISC registers are undefined after reset.
• Prerequisites: None
• Incompatible options: None

4.7.3.1 Miscellaneous Special Registers Option Architectural Additions
Table 4–116 and Table 4–117 show this option’s architectural additions.
Table 4–116. Miscellaneous Special Registers Option Processor-Configuration Additions
Parameter | Description | Valid Values
NMISC | Number of miscellaneous 32-bit Special Registers | 0..4

Table 4–117. Miscellaneous Special Registers Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹
MISC | NMISC | 32 | Miscellaneous privileged register | R/W | 244-247

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 5–127 on
page 205.

4.7.4 Thread Pointer Option

The Thread Pointer Option provides an additional register to facilitate implementation of
Thread Local Storage by operating systems and tools. The register is readable and writable by RUR and WUR. The register is unprivileged and is undefined after reset.
• Prerequisites: None
• Incompatible options: None

4.7.4.1 Thread Pointer Option Architectural Additions
Table 4–118 shows this option’s architectural additions.
Table 4–118. Thread Pointer Option Processor-State Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Register Number¹
THREADPTR | 1 | 32 | Thread pointer | R/W | User 231

1. See Table 5–127 on page 205.

4.7.5 Processor ID Option

In some applications there are multiple Xtensa processors executing from the same instruction memory, and there is a need to distinguish one processor from another. This
option allows the system logic to give each processor an identity, which software obtains by reading the PRID
register. The PRID value for each processor is typically in the range
0..NPROCESSORS-1, but this is not required. The PRID register is constant after reset.
• Prerequisites: None
• Incompatible options: None

4.7.5.1 Processor ID Option Architectural Additions
Table 4–119 shows this option’s architectural additions.

Table 4–119. Processor ID Option Special Register Additions
Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹
PRID | 1 | 32² | Processor Id | R | 235

1. Registers with a Special Register assignment are read with the RSR instruction. See Table 5–127 on page 205.

2. Some implementations may support only the low 16 bits of the PRID register.

4.7.6 Debug Option

The Debug Option implements instruction-counting and breakpoint exceptions for debugging by software or external hardware. The option uses an interrupt level previously
defined in the High-Priority Interrupt Option. In some implementations, some debug interrupts may not be masked by PS.INTLEVEL (see the Tensilica On-Chip Debugging
Guide). The Debug Option is useful when configuring a new (not previously debugged)
Xtensa processor configuration or for running previously untested software on a processor.
- Prerequisites: High-Priority Interrupt Option (page 106)
- Incompatible options: None

Some of the features listed below are added only when the OCD Option (see the Tensilica On-Chip Debugging Guide) is configured in addition to the Debug Option. Those features are included here, under the Debug Option, so that their architectural aspects are
documented, but marked as “available only with OCD Option.”
4.7.6.1 Debug Option Architectural Additions
Table 4–120 through Table 4–122 show this option’s architectural additions.
Table 4–120. Debug Option Processor-Configuration Additions
| Parameter | Description | Valid Values |
| --- | --- | --- |
| DEBUGLEVEL | Debug interrupt level | 2..NLEVEL¹,² |
| NIBREAK | Number of instruction breakpoints (break registers) | 0..2 |
| NDBREAK | Number of data breakpoints (break registers) | 0..2 |
| SZICOUNT | Number of bits in the ICOUNT register | 2, 32 |

1. NLEVEL is specified in the High-Priority Interrupt Option, Table 4–74 on page 107.
2. DEBUGLEVEL must be greater than EXCMLEVEL (see Table 4–74 on page 107).


Table 4–121. Debug Option Processor-State Additions
| Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹ |
| --- | --- | --- | --- | --- | --- |
| ICOUNT | 1 | 2, 32 | Instruction count | R/W | 236 |
| ICOUNTLEVEL | 1 | 4 | Instruction-count level | R/W | 237 |
| IBREAKA | NIBREAK | 32 | Instruction-break address | R/W | 128-129 |
| IBREAKENABLE | 1 | NIBREAK | Instruction-break enable bits | R/W | 96 |
| DBREAKA | NDBREAK | 32 | Data-break address | R/W | 144-145 |
| DBREAKC | NDBREAK | 8² | Data break control | R/W | 160-161 |
| DEBUGCAUSE | 1 | 10 | Cause of last debug exception | R | 233 |
| DDR³ | 1 | 32 | Debug data register | R/W | 104 |

1. Registers with a Special Register assignment are read and/or written with the RSR, WSR, and XSR instructions. See Table 5–127 on page 205.
2. See Figure 4–52 on page 202 for the DBREAKC register format.
3. The DDR register may have separate physical registers for in and out directions in some implementations. The register is only available with the OCD Option, for which the Debug Option is a prerequisite.

Table 4–122. Debug Option Instruction Additions
| Instruction¹ | Format | Definition |
| --- | --- | --- |
| BREAK | RRR | Breakpoint |
| BREAK.N² | RRRN | Narrow breakpoint |

1. These instructions are fully described in Chapter 6, "Instruction Descriptions" on page 243.
2. Exists only if the Code Density Option described in Section 4.3.1 on page 53 is configured.

4.7.6.2 Debug Cause Register
The DEBUGCAUSE register contains a coded value giving the reason(s) that the processor took the debug exception. It is implementation-specific whether all applicable bits
are set or whether lower-priority conditions are undetected in the presence of higher-priority conditions.
For the priority of the bits in the DEBUGCAUSE register, see Section 4.4.1.11.
Figure 4–51 below shows the bits in the DEBUGCAUSE register, and Table 4–123 describes more fully the meaning of each bit.


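| Bits | 31..12 | 11..8 | 7..6 | 5 | 4 | 3 | 2 | 1 | 0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Field | reserved | DBNUM | reserved | DI | BN | BI | DB | IB | IC |

Figure 4–51. DEBUGCAUSE Register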
Table 4–123. DEBUGCAUSE Fields
| Bit | Field | Meaning |
| --- | --- | --- |
| 0 | IC | ICOUNT exception |
| 1 | IB | IBREAK exception |
| 2 | DB | DBREAK exception |
| 3 | BI | BREAK instruction |
| 4 | BN | BREAK.N instruction |
| 5 | DI | Debug interrupt¹ |
| 11-8 | DBNUM | Which of the DBREAK registers matched (added in RA-2004.1 release) |

1. The debug interrupt is only available with the OCD Option.

The DEBUGCAUSE register is undefined after processor reset and when CINTLEVEL <
DEBUGLEVEL.
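As an illustration of the field layout in Table 4–123, the following C sketch splits a DEBUGCAUSE value into its fields. The type and function names are hypothetical and are not part of the ISA or of any Tensilica header; the value is assumed to have been read beforehand (for example with RSR) and passed in as a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the DEBUGCAUSE fields of Table 4-123. */
typedef struct {
    bool ic;        /* bit 0: ICOUNT exception      */
    bool ib;        /* bit 1: IBREAK exception      */
    bool db;        /* bit 2: DBREAK exception      */
    bool bi;        /* bit 3: BREAK instruction     */
    bool bn;        /* bit 4: BREAK.N instruction   */
    bool di;        /* bit 5: debug interrupt (OCD) */
    unsigned dbnum; /* bits 11..8: matching DBREAK  */
} debugcause_fields;

/* Decode a raw DEBUGCAUSE value that the caller has already read. */
debugcause_fields decode_debugcause(uint32_t value)
{
    debugcause_fields f;
    f.ic = (value >> 0) & 1u;
    f.ib = (value >> 1) & 1u;
    f.db = (value >> 2) & 1u;
    f.bi = (value >> 3) & 1u;
    f.bn = (value >> 4) & 1u;
    f.di = (value >> 5) & 1u;
    f.dbnum = (value >> 8) & 0xFu;
    return f;
}
```

Because more than one cause bit may be set, a handler built on this sketch would test each flag rather than assume exactly one.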
4.7.6.3 Using Breakpoints
BREAK and BREAK.N are 24-bit and 16-bit instructions that simply raise a DEBUGLEVEL
exception with DEBUGCAUSE bit 3 or 4 set, respectively, when executed. Software can
replace an instruction with a breakpoint instruction to transfer control to a debug monitor
when execution reaches the replaced instruction.
The BREAK and BREAK.N instructions cannot be used on ROM code, and so the ISA
provides a configurable number of instruction-address breakpoint registers. When the
processor is about to complete the execution of the instruction fetched from virtual address IBREAKA[i], and bit i of IBREAKENABLE is set, it raises an exception instead. It is up
to the software to compare the PC to the various IBREAKA/IBREAKENABLE pairs to determine which comparison caused the exception.
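The comparison that software must perform can be sketched in C as follows. This is a minimal illustration, assuming the handler has already copied the exception PC, the IBREAKA values, and IBREAKENABLE into ordinary variables; the function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Return the index of the first enabled instruction breakpoint whose
 * address equals pc, or -1 if none matches.
 * ibreaka      : copies of IBREAKA[0..nibreak-1]
 * ibreakenable : copy of IBREAKENABLE (bit i enables IBREAKA[i])
 */
int find_ibreak_match(uint32_t pc, const uint32_t *ibreaka,
                      uint32_t ibreakenable, int nibreak)
{
    for (int i = 0; i < nibreak; i++) {
        if (((ibreakenable >> i) & 1u) && ibreaka[i] == pc) {
            return i;
        }
    }
    return -1;
}
```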
The processor also provides a configurable number of data-address breakpoint registers. Each breakpoint specifies a naturally aligned, power-of-two-sized block of bytes between one byte and 64 bytes in the processor’s address space and whether the break
should occur on a load or a store or both. The lowest address of the covered block of
bytes is placed in one of the DBREAKA registers. The size of the covered block of bytes
is placed in the low bits of the corresponding DBREAKC register while the upper two bits
of the DBREAKC register contain an indication of which access types should raise the exception. The settings for each possible block size are shown in Table 4–124. The ‘x’ values under DBREAKA[i]_{5..0} allow any naturally aligned address to be specified for that
size. The result of other combinations of DBREAKC and DBREAKA is not defined.
Table 4–124. DBREAK Fields
| Desired DBREAK Size | DBREAKC[i]_{5..0} | DBREAKA[i]_{5..0} |
| --- | --- | --- |
| 1 Byte | 6’b111111 | 6’bxxxxxx |
| 2 Bytes | 6’b111110 | 6’bxxxxx0 |
| 4 Bytes | 6’b111100 | 6’bxxxx00 |
| 8 Bytes | 6’b111000 | 6’bxxx000 |
| 16 Bytes | 6’b110000 | 6’bxx0000 |
| 32 Bytes | 6’b100000 | 6’bx00000 |
| 64 Bytes | 6’b000000 | 6’b000000 |
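The mask column of Table 4–124 follows a regular pattern: the low six bits of DBREAKC are the complement of (size - 1). The C sketch below computes that value and checks the alignment rule for DBREAKA. Both function names are hypothetical helpers, not part of any Tensilica API.

```c
#include <assert.h>
#include <stdint.h>

/* Low six bits of DBREAKC for a block of `size` bytes (1, 2, 4, ..., 64). */
uint32_t dbreakc_mask_for_size(uint32_t size)
{
    /* Only power-of-two sizes from 1 to 64 are defined by Table 4-124. */
    assert(size >= 1u && size <= 64u && (size & (size - 1u)) == 0u);
    return ~(size - 1u) & 0x3Fu;
}

/* Nonzero if addr is naturally aligned for a block of `size` bytes. */
int dbreaka_is_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0u;
}
```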

When any of the bytes accessed by a load or store matches any of the bytes of the block
specified by one of the DBREAK[i] register pairs, the processor raises an exception instead of executing the load or store. Specifically, “match” is defined as:
(if load then DBREAKC[i]_{30} else DBREAKC[i]_{31}) and
(DBREAKA[i] >= ((1^{26} || DBREAKC[i]_{5..0}) and vAddr)) and
(DBREAKA[i] <= ((1^{26} || DBREAKC[i]_{5..0}) and (vAddr+sz-1)))

where sz is the number of bytes in the memory access and 1^{26} denotes 26 one bits. That is, both the first and last
byte addresses of the memory access are masked by (1^{26} || DBREAKC[i]_{5..0}). This operation aligns
both byte addresses to the DBREAK size indicated by DBREAKC[i] as in Table 4–124. If
the first or last masked address or any address between them matches DBREAKA[i],
then a match exists. Note that bits in DBREAKA[i]_{5..0} corresponding to clear bits in
DBREAKC[i]_{5..0} should also be clear.
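The match condition above can be expressed as a short C function. This is a sketch of the stated definition only, with register values passed in as plain integers; the function name and argument order are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Evaluate the DBREAK match condition for one DBREAKA/DBREAKC pair.
 * vaddr   : virtual address of the first byte accessed
 * sz      : number of bytes accessed
 * is_load : true for a load, false for a store
 */
bool dbreak_match(uint32_t dbreaka, uint32_t dbreakc,
                  uint32_t vaddr, uint32_t sz, bool is_load)
{
    /* Bit 30 enables load matches, bit 31 enables store matches. */
    bool enabled = is_load ? ((dbreakc >> 30) & 1u) != 0u
                           : ((dbreakc >> 31) & 1u) != 0u;

    /* 26 one bits concatenated with the six mask bits of DBREAKC. */
    uint32_t mask = 0xFFFFFFC0u | (dbreakc & 0x3Fu);

    uint32_t first = vaddr & mask;
    uint32_t last = (vaddr + sz - 1u) & mask;

    return enabled && dbreaka >= first && dbreaka <= last;
}
```

For a 4-byte load breakpoint at 0x1000, an unaligned 4-byte load starting at 0x0FFE matches under this definition because its last byte falls inside the watched block.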
For the DBREAK exception, the DBNUM field of the DEBUGCAUSE register records, as a
four-bit encoded number, which of the possible DBREAK[i] registers raised the exception. If more than one DBREAK[i] matches, one of those that matched is recorded in
DBNUM.
The processor clears IBREAKENABLE on processor reset; the IBREAKA, DBREAKA, and
DBREAKC registers are undefined after reset.


4.7.6.4 Debug Exceptions
Typically DEBUGLEVEL is set to NLEVEL (highest priority for maskable interrupts) to allow debugging of other exception handlers. DEBUGLEVEL may, in certain cases, be set to
a lower level than NLEVEL.
The relation between the current interrupt level (CINTLEVEL, Table 4–63) and the specified debug interrupt level (DEBUGLEVEL, Table 4–120) determines whether debug interrupts can be taken. All debug exceptions (ICOUNT, IBREAK, DBREAK, BREAK, BREAK.N)
are disabled when CINTLEVEL ≥ DEBUGLEVEL. In this case, the BREAK and BREAK.N
instructions perform no operation.
4.7.6.5 Instruction Counting
The ICOUNT register counts instruction completions when CINTLEVEL is less than
ICOUNTLEVEL. Instructions that raise an exception (including the ICOUNT exception) do
not increment ICOUNT. When ICOUNT would increment to 0, it instead generates an
ICOUNT exception. (See "The checkIcount Procedure" on page 203 for the formal specification.) Because ICOUNT has priority ahead of other exceptions (see
Section 4.4.1.11), it is taken even if another exception would have kept the instruction
from completing and, therefore, ICOUNT from incrementing.
When ICOUNTLEVEL is 1, for example, ICOUNT stops counting when an interrupt or exception occurs and starts again at the return. Neither the instruction not executed nor
the return increment ICOUNT, but the re-execution of the instruction does. By this
mechanism, the count of instructions can be made the same whether or not the interrupt
or exception is taken. When incrementing is turned on or off by RSIL, WSR.PS, or
XSR.PS instructions, the state of CINTLEVEL and ICOUNTLEVEL before the instruction
begins determines whether or not the increment is done, as well as whether or not the
exception is raised.
Instruction counting may be used to implement single or multi-stepping. For repeatable
programs, it can also be used to determine the instruction count of the point of failure,
and allow the program to be re-run up to some point before the point of failure so that
the failure can be directly observed with tracing or stepping.
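For stepping, the rule that ICOUNT raises its exception instead of incrementing to 0 implies a starting value of -(n + 1) to let n instructions complete. The C sketch below shows that arithmetic together with a tiny model of the counting rule. It assumes CINTLEVEL stays below ICOUNTLEVEL for the whole run, and it ignores any instructions executed between writing ICOUNT and resuming the program under test; both function names are hypothetical.

```c
#include <stdint.h>

/* ICOUNT value that allows n instructions to complete before the exception. */
uint32_t icount_for_steps(uint32_t n)
{
    return (uint32_t)0 - (n + 1u);
}

/*
 * Model of the counting rule: each completed instruction increments
 * ICOUNT; when ICOUNT is all ones the next instruction takes the
 * ICOUNT exception instead. Returns how many instructions complete.
 */
uint32_t completed_before_icount_exception(uint32_t icount)
{
    uint32_t completed = 0;
    while (icount != 0xFFFFFFFFu) {
        icount++;
        completed++;
    }
    return completed;
}
```

Single-stepping in this model corresponds to `icount_for_steps(1)`, that is, an ICOUNT value of -2.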
The purpose of the ICOUNTLEVEL register is to allow various levels of exception and interrupt processing to be visible or invisible for debugging. An ICOUNTLEVEL setting of 1
causes single-stepping to ignore exceptions and interrupts, whereas setting it to
DEBUGLEVEL allows the programmer to debug exception and interrupt handlers. The
ICOUNTLEVEL register should only be modified while PS.INTLEVEL or PS.EXCM is
high enough that both before and after the change, ICOUNT is not incrementing.


This discussion applies for SZICOUNT=32. If SZICOUNT=2, then the upper bits appear
as all ones for all purposes of reading with RSR and for comparing. In that case,
WSR.ICOUNT affects only the lower two bits. The result is that the feature is really only
useful for single stepping because it cannot count very far. But in other respects it behaves in the same fashion.
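The SZICOUNT=2 behavior can be modeled in C as below. This is only a sketch of the read and write rules just described, with hypothetical function names; it shows why such a counter can run for at most three instructions before the exception.

```c
#include <stdint.h>

/* Value stored by WSR.ICOUNT when SZICOUNT=2: only the low two bits. */
uint32_t icount2_write(uint32_t value)
{
    return value & 0x3u;
}

/* Value seen by RSR and by comparisons: upper 30 bits read as ones. */
uint32_t icount2_read(uint32_t stored)
{
    return 0xFFFFFFFCu | (stored & 0x3u);
}
```

Writing 0 therefore reads back as -4, which is the farthest this two-bit counter can be from the exception point.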
ICOUNTLEVEL is undefined after reset. The ICOUNT register should be read or written
only when CINTLEVEL is greater than or equal to ICOUNTLEVEL, where the ICOUNT
register is not incrementing (see Table 5–173).
4.7.6.6 Debug Registers
Like all special registers, the IBREAKA, IBREAKENABLE, DBREAKA, DBREAKC, and
ICOUNT registers are read and written using the RSR, WSR, and XSR instructions.
Figure 4–52 shows the format of the DBREAKC registers and Table 4–125 shows the
DBREAKC[i] register fields.
| Bits | 31 | 30 | 29..6 | 5..0 |
| --- | --- | --- | --- | --- |
| Field | SB | LB | reserved | MASK |

Figure 4–52. DBREAKC[i] Format
Table 4–125. DBREAKC[i] Register Fields
| Field | Width (bits) | Definition |
| --- | --- | --- |
| MASK | 6 | Mask specifying which bits of vAddr to compare to DBREAKA[i]. See "Using Breakpoints" on page 199 for details. |
| LB | 1 | Load data address match enable. 0 → no exception on load data address match; 1 → exception on load data address match |
| SB | 1 | Store data address match enable. 0 → no exception on store data address match; 1 → exception on store data address match |
| reserved | | Reserved for future use. Writing a non-zero value to one of these fields results in undefined processor behavior. |
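A DBREAKC value can be assembled from the fields of Table 4–125 as in the following C sketch. The function name is hypothetical; the reserved bits are left at zero because writing non-zero values to them is undefined.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a DBREAKC[i] value.
 * on_store : SB, bit 31
 * on_load  : LB, bit 30
 * mask6    : MASK, bits 5..0 (see Table 4-124)
 */
uint32_t dbreakc_pack(bool on_store, bool on_load, uint32_t mask6)
{
    return ((uint32_t)on_store << 31)
         | ((uint32_t)on_load << 30)
         | (mask6 & 0x3Fu);
}
```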


4.7.6.7 Debug Interrupts
The debug data register (DDR) allows communication between a debug supervisor executing on the processor and a debugger executing on a remote host. To stop an executing program being debugged, the external debugger may use a debug interrupt. Debug
interrupts share the same vector as other debug exceptions
(InterruptVector[DEBUGLEVEL]), but are distinguished by the setting of the DI bit of the
DEBUGCAUSE register. Both the DDR register and the debug interrupt are only available
with the OCD option (see the Tensilica On-Chip Debugging Guide).
The INTENABLE register (see Section 4.4.4) does not contain a bit for the debug interrupt.
4.7.6.8 The checkIcount Procedure
The definition of checkIcount, used in Section 3.5.4.1 “Little-Endian Fetch Semantics”
on page 29 and Section 3.5.4.2 “Big-Endian Fetch Semantics” on page 31, is:
procedure checkIcount ()
    if CINTLEVEL < ICOUNTLEVEL then
        if ICOUNT ≠ -1 then
            ICOUNT ← ICOUNT + 1
        elseif CINTLEVEL < DEBUGLEVEL then
            -- Exception
            DEBUGCAUSE ← 1
            EPC[DEBUGLEVEL] ← PC
            EPS[DEBUGLEVEL] ← PS
            PC ← InterruptVector[DEBUGLEVEL]
            PS.EXCM ← 1
            PS.INTLEVEL ← DEBUGLEVEL
        endif
    endif
endprocedure checkIcount
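The procedure translates almost line for line into C. The sketch below is a software model for experimentation, not processor code: the struct, its field names, and the function signature are hypothetical, CINTLEVEL is supplied by the caller, and PS is assumed to hold INTLEVEL in bits 3..0 and EXCM in bit 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state touched by checkIcount. */
typedef struct {
    uint32_t icount;
    uint32_t icountlevel;
    uint32_t debugcause;
    uint32_t pc;
    uint32_t ps;        /* assumed: INTLEVEL bits 3..0, EXCM bit 4 */
    uint32_t epc_debug; /* EPC[DEBUGLEVEL] */
    uint32_t eps_debug; /* EPS[DEBUGLEVEL] */
} cpu_model;

/* Returns true if the ICOUNT exception was raised. */
bool check_icount(cpu_model *m, uint32_t cintlevel,
                  uint32_t debuglevel, uint32_t debug_vector)
{
    if (cintlevel < m->icountlevel) {
        if (m->icount != 0xFFFFFFFFu) {
            m->icount++;
        } else if (cintlevel < debuglevel) {
            m->debugcause = 1u;       /* IC bit */
            m->epc_debug = m->pc;
            m->eps_debug = m->ps;
            m->pc = debug_vector;     /* InterruptVector[DEBUGLEVEL] */
            /* PS.EXCM <- 1, PS.INTLEVEL <- DEBUGLEVEL */
            m->ps = (m->ps & ~0x1Fu) | 0x10u | (debuglevel & 0xFu);
            return true;
        }
    }
    return false;
}
```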

4.7.7 Trace Port Option

The Trace Port Option provides outputs for tracing the processor’s activity without the
effect on processor timing that software profiling would have. For more information on this option, see the Xtensa Microprocessor Data Book. Because the Trace Port
Option provides only additional outputs, it adds only the few architectural features listed
below.
- Prerequisites: None
- Incompatible options: None


4.7.7.1 Trace Port Option Architectural Additions
Table 4–126 shows this option’s architectural additions.
Table 4–126. Trace Port Option Special Register Additions
| Register Mnemonic | Quantity | Width (bits) | Register Name | R/W | Special Register Number¹ |
| --- | --- | --- | --- | --- | --- |
| MMID | 1 | 32 | Memory Map Id | W | 89 |

1. Registers with a Special Register assignment are read with the RSR instruction. See Table 5–127 on page 205.

The MMID register is a write-only location whose contents affect the output to the trace
port and help in decoding the trace output by defining which memory map is in force.


Chapter 5. Processor State

5. Processor State

The architectural state of an Xtensa machine consists of its AR register file, a PC, Special
Registers, User Registers, TLB entries, and additional register files (added by options
and designer’s TIE). The Windowed Register Option causes an increase in the physical
size of the AR register file but does not change the number of registers visible by instructions at any given time. To a lesser extent, caches and local memories can be considered in some ways to be architectural state. The subsections of this chapter cover each
of these categories of state in turn.
The Floating-Point Coprocessor Option adds the FR register file and two User Registers
called FCR and FSR. The Region Protection Option and the MMU Option add ITLB Entries and DTLB Entries. Other options add only Special Registers. Designer’s TIE may
add User Registers, and additional register files. Only the AR register file, the PC, and
SAR are in all Xtensa processors.
Table 5–127 contains an alphabetical list of all Tensilica-defined registers that make up
Xtensa processor state, including the registers added by all architectural options. The
Special Register number column of most entries contains a Special Register number,
which can be looked up in Section 5.3 for more information. The last column contains a
reference where more information can be found in the pages following the table.
Table 5–127. Alphabetical List of Processor State
| Name¹ | Description | Required Configuration Option | Special Register Number | More Detail |
| --- | --- | --- | --- | --- |
| ACCHI | Accumulator high bits | MAC16 Option | 17 | Table 5–133 |
| ACCLO | Accumulator low bits | MAC16 Option | 16 | Table 5–132 |
| AR | Address registers (general registers) | Core Architecture | - | Section 5.1 |
| ATOMCTL | Atomic Operation Control | Conditional Store Option | 99 | Table 5–186 |
| BR | Boolean registers / register file | Boolean Option | 4 | Table 5–136 |
| CACHEATTR | Cache attribute | XEA1 Only - see page 611 | 98 | Table 9-250 |
| CCOMPARE0..2 | Cycle number to interrupt | Timer Interrupt Option | 240-242 | Table 5–176 |
| CCOUNT | Cycle count | Timer Interrupt Option | 234 | Table 5–175 |
| CPENABLE | Coprocessor enable bits | Coprocessor Option | 224 | Table 5–184 |
| DBREAKA0..2 | Data break address | Debug Option | 144-145 | Table 5–180 |
| DBREAKC0..2 | Data break control | Debug Option | 160-161 | Table 5–179 |
| DEBUGCAUSE | Cause of last debug exception | Debug Option | 233 | Table 5–159 |
| DDR | Debug data register | Debug Option | 104 | Table 5–183 |
| DEPC | Double exception PC | Exception Option | 192 | Table 5–162 |
| DTLB Entries | Data TLB entries | Region Protection Option or MMU Option | - | Section 5.5 |
| DTLBCFG | Data TLB configuration | MMU Option | 92 | Table 5–152 |
| EPC1 | Level-1 exception PC | Exception Option | 177 | Table 5–160 |
| EPC2..7 | High level exception PC | High-Priority Interrupt Option | 178-183 | Table 5–161 |
| EPS2..7 | High level exception PS | High-Priority Interrupt Option | 194-199 | Table 5–164 |
| EXCCAUSE | Cause of last exception | Exception Option | 232 | Table 5–153 |
| EXCSAVE1 | Level-1 exception save location | Exception Option | 209 | Table 5–166 |
| EXCSAVE2..7 | High level exception save location | High-Priority Interrupt Option | 210-215 | Table 5–167 |
| EXCVADDR | Exception virtual address | Exception Option | 238 | Table 5–154 |
| FCR | Floating point control register | Floating-Point Coprocessor Option | - | Table 5–189 |
| FR | Floating point registers | Floating-Point Coprocessor Option | - | Section 5.6 |
| FSR | Floating point status register | Floating-Point Coprocessor Option | - | Table 5–190 |
| IBREAKA0..2 | Instruction break address | Debug Option | 128-129 | Table 5–178 |
| IBREAKENABLE | Instruction break enable bits | Debug Option | 96 | Table 5–177 |
| ICOUNT | Instruction count | Debug Option | 236 | Table 5–173 |
| ICOUNTLEVEL | Instruction count level | Debug Option | 237 | Table 5–174 |
| INTCLEAR | Clear requests in INTERRUPT | Interrupt Option | 227 | Table 5–171 |
| INTENABLE | Interrupt enable bits | Interrupt Option | 228 | Table 5–172 |
| INTERRUPT | Interrupt request bits | Interrupt Option | 226 | Table 5–169 |
| INTSET | Set Requests in INTERRUPT | Interrupt Option | 226 | Table 5–170 |
| ITLB Entries | Instruction TLB entries | Region Protection Option or MMU Option | - | Section 5.5 |
| ITLBCFG | Instruction TLB configuration | MMU Option | 91 | Table 5–151 |
| LBEG | Loop-begin address | Loop Option | 0 | Table 5–129 |
| LCOUNT | Loop count | Loop Option | 2 | Table 5–131 |
| LEND | Loop-end address | Loop Option | 1 | Table 5–130 |
| LITBASE | Literal base | Extended L32R Option | 5 | Table 5–137 |
| M0..3 | MAC16 data registers/register file | MAC16 Option | 32-35 | Table 5–134 |
| MECR | Memory error check register | Memory ECC/Parity Option | 110 | Table 5–157 |
| MEPC | Memory error PC register | Memory ECC/Parity Option | 106 | Table 5–163 |
| MEPS | Memory error PS register | Memory ECC/Parity Option | 107 | Table 5–165 |
| MESAVE | Memory error save register | Memory ECC/Parity Option | 108 | Table 5–168 |
| MESR | Memory error status register | Memory ECC/Parity Option | 109 | Table 5–156 |
| MEVADDR | Memory error virtual addr register | Memory ECC/Parity Option | 111 | Table 5–158 |
| MISC0..3 | Misc register 0-3 | Miscellaneous Special Registers Option | 244-247 | Table 5–185 |
| MMID | Memory map ID | Trace Port Option | 89 | Table 5–182 |
| MR | MAC16 Data registers/register file | MAC16 Option | 32-35 | Table 5–134 |
| PC | Program counter | Core Architecture | - | Section 5.2 |
| PRID | Processor Id | Processor ID Option | 235 | Table 5–181 |
| PS | Processor state | See Table 4–63 on page 87 | 230 | Table 5–139 |
| PTEVADDR | Page table virtual address | MMU Option | 83 | Table 5–149 |
| RASID | Ring ASID values | MMU Option | 90 | Table 5–150 |
| SAR | Shift-amount register | Core Architecture | 3 | Table 5–135 |
| SCOMPARE1 | Expected data value for S32C1I | Multiprocessor Synchronization Option | 12 | Table 5–138 |
| THREADPTR | Thread pointer | Thread Pointer Option | - | Table 5–188 |
| VECBASE | Vector Base | Relocatable Vector Option | 231 | Table 5–155 |
| WindowBase | Base of current AR window | Windowed Register Option | 72 | Table 5–147 |
| WindowStart | Call-window start bits | Windowed Register Option | 73 | Table 5–148 |

1. Used in RSR, WSR, and XSR instructions.
2. FCR & FSR are User Registers where most are system registers. These names are used in RUR and WUR instructions.


5.1 General Registers

Many Xtensa instructions operate on the general registers in the AR register file. The instructions view sixteen such registers at any given time and usually have a 4-bit specifier field in the instruction for each register they access.
These general registers are named address registers (AR) to distinguish them from the
many different types of data registers that can be added to the instruction set
(Section 5.6). Although the AR registers can be used to hold data as well, they are involved with both the instruction set and the execution pipeline in such a way as to make
them ideally suited to contain addresses and the information used to compute addresses. They are ideally suited to computing branch conditions and targets as well, and as
such fill the role of general registers in the Xtensa instruction set.
When the Windowed Register Option is enabled, there are actually more than sixteen
registers in the AR register file. The windowed register ABI, described in Section 8.1,
can be used in combination with the Windowed Register Option to make use of the additional registers and avoid many of the register saves and restores that would normally
be associated with calls and returns. This improves both the speed and the code density
of Xtensa processors.
Reads from and writes to the AR register file are always interlocked by hardware. No
synchronization instructions are ever required by them.
The contents of the AR register file are undefined after reset.

5.2 Program Counter

The program counter (PC) holds the address of the next instruction to execute. It is
updated by instructions as they execute. Non-branch instructions simply increment it by
their length. Branch instructions, when taken, load it with a new value. Call and return instructions exist, which move values between the PC and general register AR[0]. Options such as the Loop Option change the PC in other useful ways.
Changes to and uses of the PC are always interlocked by hardware. No synchronization
instructions are ever required by them.

5.3 Special Registers

Special Registers hold the majority of the state added to the processor by the Options
listed in Chapter 4. Table 5–128 shows the Special Registers in numerical order with references to a more detailed description. Special Registers not listed in Table 5–128 are
reserved for future use.


Table 5–128. Numerical List of Special Registers
| Name¹ | Description | Required Configuration Option | Special Register Number | More Detail |
| --- | --- | --- | --- | --- |
| LBEG | Loop-begin address | Loop Option | 0 | Table 5–129 |
| LEND | Loop-end address | Loop Option | 1 | Table 5–130 |
| LCOUNT | Loop count | Loop Option | 2 | Table 5–131 |
| SAR | Shift-amount register | Core Architecture | 3 | Table 5–135 |
| BR | Boolean registers / register file | Boolean Option | 4 | Table 5–136 |
| LITBASE | Literal base | Extended L32R Option | 5 | Table 5–137 |
| SCOMPARE1 | Expected data value for S32C1I | Conditional Store Option | 12 | Table 5–138 |
| ACCLO | Accumulator low bits | MAC16 Option | 16 | Table 5–132 |
| ACCHI | Accumulator high bits | MAC16 Option | 17 | Table 5–133 |
| M0..3 / MR | MAC16 data registers / register file | MAC16 Option | 32-35 | Table 5–134 |
| WindowBase | Base of current AR window | Windowed Register Option | 72 | Table 5–147 |
| WindowStart | Call-window start bits | Windowed Register Option | 73 | Table 5–148 |
| PTEVADDR | Page table virtual address | MMU Option | 83 | Table 5–149 |
| MMID | Memory map ID | Trace Port Option | 89 | Table 5–182 |
| RASID | Ring ASID values | MMU Option | 90 | Table 5–150 |
| ITLBCFG | Instruction TLB configuration | MMU Option | 91 | Table 5–151 |
| DTLBCFG | Data TLB configuration | MMU Option | 92 | Table 5–152 |
| IBREAKENABLE | Instruction break enable bits | Debug Option | 96 | Table 5–177 |
| CACHEATTR | Cache attribute | XEA1 Only - see page 611 | 98 | Table 9-250 |
| ATOMCTL | Atomic Operation Control | Conditional Store Option | 99 | Table 5–186 |
| DDR | Debug data register | Debug Option | 104 | Table 5–183 |
| MEPC | Memory error PC register | Memory ECC/Parity Option | 106 | Table 5–163 |
| MEPS | Memory error PS register | Memory ECC/Parity Option | 107 | Table 5–165 |
| MESAVE | Memory error save register | Memory ECC/Parity Option | 108 | Table 5–168 |
| MESR | Memory error status register | Memory ECC/Parity Option | 109 | Table 5–156 |
| MECR | Memory error check register | Memory ECC/Parity Option | 110 | Table 5–157 |
| MEVADDR | Memory error virtual addr register | Memory ECC/Parity Option | 111 | Table 5–158 |
| IBREAKA0..1 | Instruction break address | Debug Option | 128-129 | Table 5–178 |
| DBREAKA0..1 | Data break address | Debug Option | 144-145 | Table 5–180 |
| DBREAKC0..1 | Data break control | Debug Option | 160-161 | Table 5–179 |
| EPC1 | Level-1 exception PC | Exception Option | 177 | Table 5–160 |
| EPC2..7 | High level exception PC | High-Priority Interrupt Option | 178-183 | Table 5–161 |
| DEPC | Double exception PC | Exception Option | 192 | Table 5–162 |
| EPS2..7 | High level exception PS | High-Priority Interrupt Option | 194-199 | Table 5–164 |
| EXCSAVE1 | Level-1 exception save location | Exception Option | 209 | Table 5–166 |
| EXCSAVE2..7 | High level exception save location | High-Priority Interrupt Option | 210-215 | Table 5–167 |
| CPENABLE | Coprocessor enable bits | Coprocessor Option | 224 | Table 5–184 |
| INTERRUPT | Interrupt request bits | Interrupt Option | 226 | Table 5–169 |
| INTSET | Set requests in INTERRUPT | Interrupt Option | 226 | Table 5–170 |
| INTCLEAR | Clear requests in INTERRUPT | Interrupt Option | 227 | Table 5–171 |
| INTENABLE | Interrupt enable bits | Interrupt Option | 228 | Table 5–172 |
| PS | Processor state | See Table 4–63 on page 87 | 230 | Table 5–139 |
| VECBASE | Vector Base | Relocatable Vector Option | 231 | Table 5–155 |
| EXCCAUSE | Cause of last exception | Exception Option | 232 | Table 5–153 |
| DEBUGCAUSE | Cause of last debug exception | Debug Option | 233 | Table 5–159 |
| CCOUNT | Cycle count | Timer Interrupt Option | 234 | Table 5–175 |
| PRID | Processor Id | Processor ID Option | 235 | Table 5–181 |
| ICOUNT | Instruction count | Debug Option | 236 | Table 5–173 |
| ICOUNTLEVEL | Instruction count level | Debug Option | 237 | Table 5–174 |
| EXCVADDR | Exception virtual address | Exception Option | 238 | Table 5–154 |
| CCOMPARE0..2 | Cycle number to generate interrupt | Timer Interrupt Option | 240-242 | Table 5–176 |
| MISC0..3 | Misc register 0-3 | Miscellaneous Special Registers Option | 244-247 | Table 5–185 |

1. Used in RSR, WSR, and XSR instructions.

Section 5.3.1 describes the process of reading and writing these special registers, while
the sections that follow describe groups of specific Special Registers in more detail. A
table is included for each special register, which includes information specific to that
special register. The gray shaded rows describe the information that is contained in the
unshaded rows immediately below them.


The first row shows the Special Register number, the Name (which is used in the RSR.*,
WSR.*, and XSR.* instruction names), a short description, and the value immediately
after reset.
The second row shows the Option that creates the Special Register, the count or number of such special registers, the number of bits in the special register, whether access
to the register is privileged (requires CRING=0) or not, and whether XSR.* is a legal instruction or not. The Option that creates the Special Register is described in Chapter 4
including more information on each Special Register.
The third row shows the function of the WSR.* and RSR.* instructions for this Special
Register. The function of the XSR.* instruction is the combination of the RSR.* and the
WSR.* instructions.
The fourth row shows the other instructions that affect or are affected by this Special
Register.
The last row of each Special Register’s table shows what SYNC instructions are
required when using this Special Register. If no SYNC instructions are ever required, the
row is left out. On the left is an instruction or other action that changes the value of the
Special Register. On the right is an instruction or other action that makes use of the value of the Special Register. If a SYNC instruction is required for this pair of operations to
work as they should, it is listed in the middle. Wherever a DSYNC is required an ISYNC,
RSYNC, or ESYNC can also be used. Wherever an ESYNC is required an ISYNC or RSYNC
can also be used. Wherever an RSYNC is required an ISYNC can also be used. Note that
the 16-bit versions (*.N) of 24-bit instructions are not listed separately but always have
exactly the same requirements. Versions T1050 and before required additional SYNC
instructions in some cases as described in Section A.8 on page 621.
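The substitution rules above amount to a strict ordering of the four SYNC instructions, DSYNC < ESYNC < RSYNC < ISYNC, in which a stronger instruction may always stand in for a weaker one. A minimal C sketch of that ordering follows; the enum and function names are hypothetical.

```c
/* SYNC instructions ordered from weakest to strongest. */
typedef enum {
    SYNC_DSYNC = 0,
    SYNC_ESYNC = 1,
    SYNC_RSYNC = 2,
    SYNC_ISYNC = 3
} sync_kind;

/* Nonzero if `used` is acceptable where `required` is listed. */
int sync_satisfies(sync_kind used, sync_kind required)
{
    return used >= required;
}
```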
Because of the importance of its subfields, the PS Special Register is a special case. Its
subfields are listed in the same format as special registers. The synchronizations needed simply because the register has been written are listed under the entire register,
while the synchronizations needed because the value of a subfield has been changed
are listed under the subfield.

5.3.1 Reading and Writing Special Registers

The RSR.*, WSR.*, and XSR.* instructions access the special registers. The accesses
to the Special Registers act as separate instructions in many ways. For the full instruction name, replace the ‘*’ in the instructions with the name as given in the Special
Register Tables in this section.
Each RSR.* instruction moves a value from a Special Register to a general (AR) register. Each WSR.* instruction moves a value from a general (AR) register to a Special Register. Each XSR.* instruction exchanges the values in a general (AR) register and a Special Register. Some Special Registers do not allow this exchange. The Special Register
tables in this section show which do and do not allow this exchange. The exchange
takes place with the two reads taking place first, and then the two writes. In some cases,
the write of a Special Register can affect other behavior of the processor. In general, this
behavior change does not occur until after the instruction (including XSR.*) has completed execution.
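The read-then-write ordering of XSR.* can be modeled in C as below. This is a sketch of the register exchange only, using ordinary variables in place of AR[t] and the Special Register; the function name is hypothetical, and the model does not capture any side effects that writing a particular Special Register may have.

```c
#include <stdint.h>

/* Model of XSR.*: both reads happen first, then both writes. */
void xsr_model(uint32_t *ar_t, uint32_t *sr)
{
    uint32_t old_ar = *ar_t; /* read AR[t]              */
    uint32_t old_sr = *sr;   /* read Special Register   */
    *sr = old_ar;            /* write Special Register  */
    *ar_t = old_sr;          /* write AR[t]             */
}
```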
Some of the Special Registers have interactions with other instructions or with hardware
execution. These interactions are also listed in the Special Register tables in this section. Because modification of many Special Registers is an unusual occurrence, synchronization instructions are used to ensure that their values have propagated everywhere before certain other actions are allowed to take place. Some of the interlocks
would be costly in performance or in gates if done in hardware, and the synchronization
instructions can be the most efficient solution.

5.3.2 LOOP Special Registers

The Loop Option adds the three registers shown in Table 5–129 through Table 5–131 for
controlling zero overhead loops. When the PC reaches LEND, it executes at LBEG instead and decrements LCOUNT. When LCOUNT reaches zero, the loop back does not occur.
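
The loop-back rule described above can be sketched as follows. This is a simplified model, not the hardware definition: the function name `run_loop`, the fixed instruction size, and the example addresses are assumptions made for illustration, and the model ignores the conditions (such as PS.EXCM) under which the hardware suppresses the loop back.

```python
def run_loop(lbeg, lend, lcount, insn_size=4):
    """Count instructions executed by a zero overhead loop (toy model).

    lbeg/lend/lcount play the role of LBEG, LEND, and LCOUNT.
    insn_size is a hypothetical fixed instruction size.
    """
    pc, executed = lbeg, 0
    while pc < lend:
        executed += 1          # execute the instruction at pc
        pc += insn_size
        if pc == lend and lcount != 0:
            lcount -= 1        # loop back and decrement LCOUNT
            pc = lbeg
    return executed, lcount


# Two-instruction body with LCOUNT starting at 2: the body runs three times.
result = run_loop(0x10, 0x18, 2)
```

In this model an initial LCOUNT of n produces n + 1 passes through the body, because the final pass falls through LEND once LCOUNT has reached zero.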
Table 5–129. LBEG - Special Register #0
SR#: 0
Name: LBEG
Description: Loop begin - address of beginning of zero overhead loop
Reset Value: Undefined
Option: Loop Option
Count: 1
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: LBEG ← AR[t]
RSR Function: AR[t] ← LBEG
Other Changes to the Register: LOOP/LOOPGTZ/LOOPNEZ
Other Effects of the Register: Branch at end of zero overhead loop
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR LBEG ⇒ ISYNC ⇒ Potential branch caused by attempt to execute LEND

Table 5–130. LEND - Special Register #1
SR#: 1
Name: LEND
Description: Loop end - address of instruction after zero overhead loop
Reset Value: Undefined
Option: Loop Option
Count: 1
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: LEND ← AR[t]
RSR Function: AR[t] ← LEND
Other Changes to the Register: LOOP/LOOPGTZ/LOOPNEZ
Other Effects of the Register: Branch at end of zero overhead loop
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR LEND ⇒ ISYNC ⇒ Potential branch caused by attempt to execute LEND

Table 5–131. LCOUNT - Special Register #2
SR#: 2
Name: LCOUNT
Description: Loop count remaining
Reset Value: Undefined
Option: Loop Option
Count: 1
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: LCOUNT ← AR[t]
RSR Function: AR[t] ← LCOUNT
Other Changes to the Register: LOOP/LOOPGTZ/LOOPNEZ
Other Effects of the Register: Branch at end of zero overhead loop
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR LCOUNT ⇒ ESYNC ⇒ RSR/XSR LCOUNT
    WSR/XSR LCOUNT ⇒ ISYNC ⇒ Potential branch caused by attempt to execute LEND
    WSR/XSR LCOUNT to zero ⇒ ISYNC ⇒ WSR/XSR PS.EXCM with zero (for protection)

5.3.3

MAC16 Special Registers

The MAC16 Option adds the six registers described in Table 5–132 through
Table 5–134.

Table 5–132. ACCLO - Special Register #16
SR#: 16
Name: ACCLO
Description: Accumulator - low bits
Reset Value: Undefined
Option: MAC16 Option
Count: 1
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: ACC_{31..0} ← AR[t]
RSR Function: AR[t] ← ACC_{31..0}
Other Changes to the Register: MUL.*/MULA.*/MULS.*/UMUL.*
Other Effects of the Register: MULA.*/MULS.*

Table 5–133. ACCHI - Special Register #17
SR#: 17
Name: ACCHI
Description: Accumulator - high bits
Reset Value: Undefined
Option: MAC16 Option
Count: 1
Bits: 8
Privileged?: No
XSR Legal?: Yes
WSR Function: ACC_{39..32} ← AR[t]_{7..0}
    Undefined if AR[t]_{31..8} ≠ (AR[t]_7)^{24}
RSR Function: AR[t] ← (ACC_{39})^{24} || ACC_{39..32}
Other Changes to the Register: MUL.*/MULA.*/MULS.*/UMUL.*
Other Effects of the Register: MULA.*/MULS.*
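
The ACCHI read and write functions amount to a sign extension of the upper eight accumulator bits. The sketch below illustrates that reading of the table; the function names are hypothetical, and the accumulator is modeled as a plain 40-bit integer.

```python
def rsr_acchi(acc):
    """Return the 32-bit value RSR.ACCHI would produce for a 40-bit acc.

    Bits 39..32 are returned in bits 7..0, and bit 39 is replicated
    into bits 31..8 (sign extension).
    """
    hi = (acc >> 32) & 0xFF
    sign = (hi >> 7) & 1
    return (0xFFFFFF00 if sign else 0) | hi


def wsr_acchi_is_defined(ar):
    """WSR.ACCHI is defined only if bits 31..8 of AR[t] replicate bit 7."""
    sign = (ar >> 7) & 1
    return (ar >> 8) == (0xFFFFFF if sign else 0)
```

A value read with RSR.ACCHI always satisfies the WSR.ACCHI condition, so saving and restoring the accumulator with these two instructions stays within the defined behavior.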

Table 5–134. M0..3 - Special Register #32-35
SR#: 32-35
Name: M0..3 / MR (see note 1)
Description: MAC16 data registers / register file (see note 1)
Reset Value: Undefined
Option: MAC16 Option
Count: 4
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: M[sr_{1..0}] ← AR[t]
RSR Function: AR[t] ← M[sr_{1..0}]
Other Changes to the Register: LDDEC/LDINC/MULA*.LDDEC/MULA*.LDINC
Other Effects of the Register: MUL.*D*/MULA.*D*/MULS.*D*
Note 1: These registers are known as MR[0..3] in hardware and as m0..3 in the software.

5.3.4

Other Unprivileged Special Registers

The SAR Special Register is included in the Xtensa Core Architecture, while the BR,
LITBASE, and SCOMPARE1 Special Registers are added by the options shown along
with other information about them in Table 5–135 through Table 5–138.
Table 5–135. SAR - Special Register #3
SR#: 3
Name: SAR
Description: Shift amount register
Reset Value: Undefined
Option: Core Architecture (see page 25)
Count: 1
Bits: 6
Privileged?: No
XSR Legal?: Yes
WSR Function: SAR ← AR[t]_{5..0}
    Undefined if AR[t]_{31..6} ≠ 0^{26}
RSR Function: AR[t] ← 0^{26} || SAR
Other Changes to the Register: SSL/SSR/SSAI/SSA8B/SSA8L
Other Effects of the Register: SLL/SRL/SRA/SRC
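
Several Special Registers narrower than 32 bits (SAR here, and others later in this section) share the same pattern: the write takes the low bits of AR[t] and is undefined if any upper bit is set, and the read zero-extends. A minimal sketch of that pattern follows; the helper names `wsr_narrow` and `rsr_narrow` are hypothetical.

```python
def wsr_narrow(ar_value, width):
    """Model a WSR to a register of `width` bits.

    Returns (new_register_value, defined), where `defined` is False when
    any bit above the register width is set in AR[t].
    """
    mask = (1 << width) - 1
    defined = (ar_value >> width) == 0
    return ar_value & mask, defined


def rsr_narrow(reg_value, width):
    """Model an RSR: the register value is zero-extended to 32 bits."""
    return reg_value & ((1 << width) - 1)


# SAR is 6 bits wide: 0x1F is a defined write, 0x40 is not.
sar_ok = wsr_narrow(0x1F, 6)
sar_bad = wsr_narrow(0x40, 6)
```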

Table 5–136. BR - Special Register #4
SR#: 4
Name: BR / b0..15 (see note 1)
Description: Boolean register / register file (see note 1)
Reset Value: Undefined
Option: Boolean Option
Count: 1
Bits: 16
Privileged?: No
XSR Legal?: Yes
WSR Function: BR ← AR[t]_{15..0}
    Undefined if AR[t]_{31..16} ≠ 0^{16}
RSR Function: AR[t] ← 0^{16} || BR
Other Changes to the Register: ALL4/ALL8/ANDB/ANDBC/ANY4/ANY8/ORB/ORBC/XORB/OEQ.S/OLE.S/OLT.S/UEQ.S/ULE.S/ULT.S/UN.S/User TIE
Other Effects of the Register: ALL4/ALL8/ANDB/ANDBC/ANY4/ANY8/ORB/ORBC/XORB/BF/BT/MOVF/MOVF.S/MOVT/MOVT.S
Note 1: This register is known as Special Register BR or as individual Boolean bits b0..15.

Table 5–137. LITBASE - Special Register #5
SR#: 5
Name: LITBASE
Description: Literal base register
Reset Value: bit-0 clear (see note 1)
Option: Extended L32R Option
Count: 1
Bits: 21
Privileged?: No
XSR Legal?: Yes
WSR Function: LITBASE ← AR[t]_{31..12} || 0^{11} || AR[t]_0
    Undefined if AR[t]_{11..1} ≠ 0^{11}
RSR Function: AR[t] ← LITBASE_{31..12} || 0^{11} || LITBASE_0
Other Changes to the Register: (none listed)
Other Effects of the Register: L32R
Note 1: After reset bit-0 is clear but the remainder of the register is undefined.
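
The LITBASE write function keeps bits 31..12 and bit 0 of AR[t] and requires bits 11..1 to be zero, which accounts for the 21 implemented bits. A small sketch of that masking, with hypothetical helper names:

```python
LITBASE_MASK = 0xFFFFF001      # bits 31..12 and bit 0 are implemented
LITBASE_MUST_BE_ZERO = 0x00000FFE  # bits 11..1 must be written as zero


def wsr_litbase(ar):
    """Value stored in LITBASE by a write of `ar` (implemented bits only)."""
    return ar & LITBASE_MASK


def wsr_litbase_is_defined(ar):
    """The write is defined only if bits 11..1 of AR[t] are zero."""
    return (ar & LITBASE_MUST_BE_ZERO) == 0


implemented_bits = bin(LITBASE_MASK).count("1")
```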

Table 5–138. SCOMPARE1 - Special Register #12
SR#: 12
Name: SCOMPARE1
Description: Comparison register for the S32C1I instruction
Reset Value: Undefined
Option: Conditional Store Option
Count: 1
Bits: 32
Privileged?: No
XSR Legal?: Yes
WSR Function: SCOMPARE1 ← AR[t]
RSR Function: AR[t] ← SCOMPARE1
Other Changes to the Register: (none listed)
Other Effects of the Register: S32C1I

5.3.5

Processor Status Special Register

The Processor Status Special Register is made up of multiple fields with different purposes within the processor. They are combined into one register to simplify the saving
and restoring of state for exceptions, interrupts, and context switches. Table 5–139
describes the register as a whole, while Table 5–140 through Table 5–146 describe the
individual pieces of the register in a similar format.
The synchronization section of Table 5–139 gives requirements that must be met whenever the PS register is written regardless of whether any of its bits are changed. The
synchronization sections of Table 5–140 through Table 5–146 give requirements that
must be met only if that portion of the PS register is being modified.

Table 5–139. PS - Special Register #230
SR#: 230
Name: PS
Description: Miscellaneous program state
Reset Value: 0x10 or 0x1F (see note 1)
Option: Exception Option
Count: 1
Bits: 15
Privileged?: Yes
XSR Legal?: Yes
WSR Function: PS ← 0^{13} || AR[t]_{18..16} || 0^{4} || AR[t]_{11..0}
    PS.RING should be changed only when CEXCM=1 before the instruction making the change.
RSR Function: AR[t] ← PS
Other Changes to the Register: CALL[X]4-12/RFE/RFDO/RFDD/RFWO/RFWU/RFI/RSIL/WAITI/interrupts/exceptions
Other Effects of the Register: CALL[X]4-12/ENTRY/RETW/interrupts/loop-back/Privileged-instructions/ld-st-instructions/exceptions
Instruction ⇒ xSYNC ⇒ Instruction:
    See following entries for subfields of PS. Write to PS.X means a write to PS that changes subfield X.
Note 1: PS is 5’h1F after reset if the Interrupt Option is configured but reads as 5’h10 if it is not.
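
The PS write function keeps bits 18..16 and 11..0 of AR[t] and forces the remaining bits to zero. The sketch below models that masking and extracts the subfields described in the tables that follow. The bit positions in `PS_FIELDS` are an assumption taken from the PS layout given elsewhere in this manual (only the field widths appear in this section), and the helper names are hypothetical.

```python
# name: (least significant bit, width); positions assumed from the PS layout.
PS_FIELDS = {
    "INTLEVEL": (0, 4),
    "EXCM": (4, 1),
    "UM": (5, 1),
    "RING": (6, 2),
    "OWB": (8, 4),
    "CALLINC": (16, 2),
    "WOE": (18, 1),
}

PS_WRITE_MASK = 0x00070FFF  # bits 18..16 and 11..0


def wsr_ps(ar):
    """Value stored in PS by WSR.PS: unimplemented bits are forced to zero."""
    return ar & PS_WRITE_MASK


def ps_field(ps, name):
    """Extract one PS subfield by name."""
    lsb, width = PS_FIELDS[name]
    return (ps >> lsb) & ((1 << width) - 1)


total_bits = sum(width for _, width in PS_FIELDS.values())
```

With the Interrupt Option configured, the reset value 0x1F decodes to PS.EXCM = 1 and PS.INTLEVEL = 15.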

Table 5–140. PS.INTLEVEL - Special Register #230 (part)
SR#: 230 Part
Name: PS.INTLEVEL
Description: Interrupt level mask part of PS (Table 5–139)
Reset Value: 0x0 or 0xF (see note 1)
Option: Interrupt Option
Count: 1
Bits: 4
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO/RSIL/WAITI/Hi-level-interrupts/debug-exceptions/NMI
Other Effects of the Register: RSIL/interrupts/debug-exceptions
Instruction ⇒ xSYNC ⇒ Instruction:
    Write to PS.INTLEVEL is a write to PS that changes subfield INTLEVEL.
    WSR/XSR PS.INTLEVEL ⇒ RSYNC ⇒ Change in accepting interrupts
        If PS.EXCM and PS.INTLEVEL are both changed in the same WSR.PS or XSR.PS instruction in such a way that a particular interrupt is forbidden both before and after the instruction, there will be no cycle during the instruction where the interrupt may be taken. Thus PS.EXCM may be cleared and PS.INTLEVEL raised (or PS.EXCM set and PS.INTLEVEL lowered) in the same instruction and no gap is opened between them.
    WSR/XSR PS.INTLEVEL ⇒ DSYNC ⇒ Change in taking debug exception (interrupt level)
    RFI/RFDD/RFDO/RSIL/WAITI ⇒ (none) ⇒ RSIL or change in accepting interrupts/debug-exceptions
    Hi-level-interrupts/debug-excep/NMI ⇒ (none) ⇒ RSIL or change in accepting interrupts/debug-exceptions
Note 1: PS.INTLEVEL is 4’hF after reset if the Interrupt Option is configured but reads as 4’h0 if it is not.
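
The guarantee that no gap opens when PS.EXCM and PS.INTLEVEL change together can be illustrated with a small model. The sketch assumes the current interrupt level is the larger of PS.INTLEVEL and, when PS.EXCM is set, a configured EXCMLEVEL, and that an interrupt is accepted only when its level exceeds that value; this rule is taken from the interrupt description elsewhere in the manual rather than from this table, and the value chosen for `EXCMLEVEL` is hypothetical.

```python
EXCMLEVEL = 3  # hypothetical configuration value


def cintlevel(excm, intlevel):
    """Current interrupt level (assumed rule, see lead-in)."""
    return max(EXCMLEVEL if excm else 0, intlevel)


def interrupt_allowed(level, excm, intlevel):
    """An interrupt is accepted only if its level exceeds CINTLEVEL."""
    return level > cintlevel(excm, intlevel)


# One WSR.PS clears PS.EXCM and raises PS.INTLEVEL to 3:
before = interrupt_allowed(2, excm=1, intlevel=0)
after = interrupt_allowed(2, excm=0, intlevel=3)

# If PS.EXCM were cleared first in a separate write, a window would open:
gap = interrupt_allowed(2, excm=0, intlevel=0)
```

The level-2 interrupt is masked both before and after the combined write, whereas splitting the update into two writes would leave an intermediate state in which it could be taken.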

Table 5–141. PS.EXCM - Special Register #230 (part)
SR#: 230 Part
Name: PS.EXCM
Description: Exception mask part of PS (Table 5–139)
Reset Value: 0x1
Option: Exception Option
Count: 1
Bits: 1
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO/RFE/RFWO/RFWU/interrupts/exceptions
Other Effects of the Register: CALL[X]4-12/ENTRY/RETW/interrupts/loop-back/Ifetch/privileged-instr/ld-st-instructions/exceptions
Instruction ⇒ xSYNC ⇒ Instruction:
    Write to PS.EXCM is a write to PS that changes subfield EXCM.
    WSR/XSR PS.EXCM ⇒ ISYNC ⇒ Changes in instruction fetch privilege
    WSR/XSR PS.EXCM ⇒ RSYNC ⇒ Change in accepting Interrupts
        If PS.EXCM and PS.INTLEVEL are both changed in the same WSR.PS or XSR.PS instruction in such a way that a particular interrupt is forbidden both before and after the instruction, there will be no cycle during the instruction where the interrupt may be taken. Thus PS.EXCM may be cleared and PS.INTLEVEL raised (or PS.EXCM set and PS.INTLEVEL lowered) in the same instruction without a gap in interrupt masking.
    WSR/XSR PS.EXCM to one ⇒ (none) ⇒ Restore non-zero LCOUNT value
    WSR/XSR LCOUNT to zero ⇒ ISYNC ⇒ WSR/XSR PS.EXCM with zero (for protection)
    WSR/XSR PS.EXCM ⇒ ESYNC ⇒ CALL[X]4-12/ENTRY/RETW
        Note: In the Windowed Register Option, any instruction with an AR register operand can cause overflow exceptions.
    WSR/XSR PS.EXCM ⇒ DSYNC ⇒ Changes in data fetch privilege
    WSR/XSR PS.EXCM ⇒ (none) ⇒ Double exception vector or not
    RFI/RFDD/RFDO/RFE ⇒ (none) ⇒ Anything
    RFWO/RFWU ⇒ (none) ⇒ Anything
    Interrupts/exceptions ⇒ (none) ⇒ Anything

Table 5–142. PS.UM - Special Register #230 (part)
SR#: 230 Part
Name: PS.UM
Description: User vector mode part of PS (Table 5–139)
Reset Value: 0x0
Option: Exception Option
Count: 1
Bits: 1
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO
Other Effects of the Register: RSIL/level-1-interrupts/general-exceptions/debug-exceptions
Instruction ⇒ xSYNC ⇒ Instruction:
    Write to PS.UM is a write to PS that changes subfield UM.
    WSR/XSR PS.UM ⇒ RSYNC ⇒ Level-1-interrupts/general-exceptions/debug-exceptions
        Note: In the Windowed Register Option, any instruction with an AR register operand can cause overflow exceptions.

Table 5–143. PS.RING - Special Register #230 (part)
SR#: 230 Part
Name: PS.RING
Description: Ring part of PS (Table 5–139)
Reset Value: 0x0
Option: MMU Option
Count: 1
Bits: 2
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO
Other Effects of the Register: Hi-level-interrupts/debug-exception/Privileged-instructions/ld-st-instructions
Instruction ⇒ xSYNC ⇒ Instruction:
    Write to PS.RING is a write to PS that changes subfield RING.
    WSR/XSR PS.RING ⇒ ISYNC ⇒ Changes in instruction fetch privilege
    WSR/XSR PS.RING ⇒ DSYNC ⇒ Changes in data fetch privilege

Table 5–144. PS.OWB - Special Register #230 (part)
SR#: 230 Part
Name: PS.OWB
Description: Old window base part of PS (Table 5–139)
Reset Value: 0x0
Option: Windowed Register Option
Count: 1
Bits: 4
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO/overflow-or-underflow-exception
Other Effects of the Register: RFWO/RFWU/RSIL/hi-level-interrupt/debug-exception

Table 5–145. PS.CALLINC - Special Register #230 (part)
SR#: 230 Part
Name: PS.CALLINC
Description: Call increment part of PS (Table 5–139)
Reset Value: 0x0
Option: Windowed Register Option
Count: 1
Bits: 2
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: CALL[X]4-12/RFI/RFDD/RFDO
Other Effects of the Register: ENTRY/RSIL/hi-level-interrupt/debug-exception

Table 5–146. PS.WOE - Special Register #230 (part)
SR#: 230 Part
Name: PS.WOE
Description: Window overflow enable part of PS (Table 5–139)
Reset Value: 0x0
Option: Windowed Register Option
Count: 1
Bits: 1
Privileged?: Yes
XSR Legal?: Yes
WSR Function: (see Table 5–139)
RSR Function: (see Table 5–139)
Other Changes to the Register: RFI/RFDD/RFDO
Other Effects of the Register: CALL4-12/CALLX4-12/ENTRY/RETW/RSIL/Hi-level-interrupt/debug-exception/overflow-exception
Instruction ⇒ xSYNC ⇒ Instruction:
    Write to PS.WOE is a write to PS that changes subfield WOE.
    WSR/XSR PS.WOE ⇒ RSYNC ⇒ CALL4-12/CALLX4-12/ENTRY/RETW
    WSR/XSR PS.WOE ⇒ RSYNC ⇒ Overflow-exception
        Note: In the Windowed Register Option, any instruction with an AR register operand can cause overflow exceptions.

5.3.6

Windowed Register Option Special Registers

The Windowed Register Option Special registers are described in Table 5–147 and
Table 5–148.
Table 5–147. WindowBase - Special Register #72
SR#: 72
Name: WindowBase
Description: Base of current AR register window
Reset Value: Undefined
Option: Windowed Register Option
Count: 1
Bits: log2(NAREG/4)
Privileged?: Yes
XSR Legal?: Yes
WSR Function: WindowBase ← AR[t]_{X-1..0}
    Undefined if AR[t]_{31..X} ≠ 0^{32-X}
    X = log2(NAREG/4)
RSR Function: AR[t] ← 0^{32-X} || WindowBase
    X = log2(NAREG/4)
Other Changes to the Register: ENTRY/MOVSP/RETW/RFW*/ROTW/overflow-or-underflow-exception
Other Effects of the Register: Any instruction which accesses the AR register file
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR WINDOWBASE ⇒ RSYNC ⇒ Any use or def of an AR register

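
The width of WindowBase depends on the configured number of AR registers. The sketch below works through X = log2(NAREG/4) for two configurations and applies the write rule; the function names and the example NAREG values are illustrative assumptions.

```python
def windowbase_bits(nareg):
    """X = log2(NAREG/4), computed with integer arithmetic."""
    return (nareg // 4).bit_length() - 1


def wsr_windowbase(ar, nareg):
    """Model WSR.WINDOWBASE: returns (new_value, defined).

    The write is undefined if any bit of AR[t] above bit X-1 is set.
    """
    x = windowbase_bits(nareg)
    defined = (ar >> x) == 0
    return ar & ((1 << x) - 1), defined


bits_64 = windowbase_bits(64)   # 64 AR registers
bits_32 = windowbase_bits(32)   # 32 AR registers
```

For the same configurations, WindowStart has NAREG/4 bits, that is, 16 and 8 bits respectively.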
Table 5–148. WindowStart - Special Register #73
SR#: 73
Name: WindowStart
Description: Call-window start bits
Reset Value: Undefined
Option: Windowed Register Option
Count: 1
Bits: NAREG/4
Privileged?: Yes
XSR Legal?: Yes
WSR Function: WindowStart ← AR[t]_{NAREG/4-1..0}
    Undefined if AR[t]_{31..NAREG/4} ≠ 0^{32-NAREG/4}
RSR Function: AR[t] ← 0^{32-NAREG/4} || WindowStart
Other Changes to the Register: ENTRY/MOVSP/RETW/RFWO/RFWU
Other Effects of the Register: Any instruction which accesses the AR register file
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR WINDOWSTART ⇒ RSYNC ⇒ Any use of an AR register when CWOE=1
    WSR/XSR WINDOWSTART ⇒ RSYNC ⇒ Any def of an AR register when CWOE=1

5.3.7

Memory Management Special Registers

The Special Registers for managing memory are described in Table 5–149 through
Table 5–152.

Table 5–149. PTEVADDR - Special Register #83
SR#: 83
Name: PTEVADDR
Description: Virtual address for page table lookups
Reset Value: Undefined
Option: MMU Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: PTEVADDR_{VABITS-1..X} ← AR[t]_{VABITS-1..X}
    X = VABITS + log2(PTEbytes) − min(PTEPageSizes)
RSR Function: AR[t] ← PTEVADDR_{VABITS-1..Y} || 0^{Y}
    Y = log2(PTEbytes)
Other Changes to the Register: (none listed)
Other Effects of the Register: Any instruction/data address translation
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR PTEVADDR ⇒ ISYNC ⇒ Any instruction access that might miss the ITLB
    WSR/XSR PTEVADDR ⇒ DSYNC ⇒ Any load/store access that might miss the DTLB

Table 5–150. RASID - Special Register #90
SR#: 90
Name: RASID
Description: Current ASID values for each protection ring
Reset Value: 0x04030201
Option: MMU Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: RASID ← AR[t]_{31..8} || 0^{7} || 1^{1}
RSR Function: AR[t] ← RASID
Other Changes to the Register: (none listed)
Other Effects of the Register: Any instruction/data address translation
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR RASID ⇒ ISYNC ⇒ Instruction address translation that depends on the change
    WSR/XSR RASID ⇒ DSYNC ⇒ Data address translation that depends on the change
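
The RASID write function replaces the low byte of AR[t] with the constant 0x01, so only the upper three bytes are writable. The sketch below models that write and splits the register into one byte per ring; the per-ring byte layout (ring r in bits 8r+7..8r) is an assumption consistent with the reset value 0x04030201, and the function names are hypothetical.

```python
def wsr_rasid(ar):
    """Model WSR.RASID: bits 31..8 come from AR[t], bits 7..0 are forced to 0x01."""
    return (ar & 0xFFFFFF00) | 0x01


def ring_asids(rasid):
    """Split RASID into one ASID byte per ring (ring 0 first, assumed layout)."""
    return [(rasid >> (8 * ring)) & 0xFF for ring in range(4)]


reset_asids = ring_asids(0x04030201)
```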

Table 5–151. ITLBCFG - Special Register #91
SR#: 91
Name: ITLBCFG
Description: Instruction TLB configuration
Reset Value: 0x00000000
Option: MMU Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: ITLBCFG ← AR[t]
    Affected ways should be invalidated after change.
RSR Function: AR[t] ← ITLBCFG
Other Changes to the Register: (none listed)
Other Effects of the Register: Any instruction address translation
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR ITLBCFG ⇒ ISYNC ⇒ Instruction address translation that depends on the change

Table 5–152. DTLBCFG - Special Register #92
SR#: 92
Name: DTLBCFG
Description: Data TLB configuration
Reset Value: 0x00000000
Option: MMU Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: DTLBCFG ← AR[t]
    Affected ways should be invalidated after change.
RSR Function: AR[t] ← DTLBCFG
Other Changes to the Register: (none listed)
Other Effects of the Register: Any data address translation
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR DTLBCFG ⇒ DSYNC ⇒ Any data address translation that depends on the change

5.3.8

Exception Support Special Registers

The Special Registers that provide information for the handling of an exception are
described in Table 5–153 through Table 5–159.

Table 5–153. EXCCAUSE - Special Register #232
SR#: 232
Name: EXCCAUSE
Description: Exception cause register
Reset Value: Undefined
Option: Exception Option
Count: 1
Bits: 6
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EXCCAUSE ← AR[t]_{5..0}
    Undefined if AR[t]_{31..6} ≠ 0^{26}
RSR Function: AR[t] ← 0^{26} || EXCCAUSE
Other Changes to the Register: Exception or interrupt
Other Effects of the Register: (none listed)

Table 5–154. EXCVADDR - Special Register #238
SR#: 238
Name: EXCVADDR
Description: Exception virtual address register
Reset Value: Undefined
Option: Exception Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EXCVADDR ← AR[t]
RSR Function: AR[t] ← EXCVADDR
    AR[t] is undefined if CEXCM = 0
Other Changes to the Register: Some exceptions (see Table 4–64 on page 89), hardware table walk (see Section 4.6.5.9 on page 174)
Other Effects of the Register: (none listed)

Table 5–155. VECBASE - Special Register #231
SR#: 231
Name: VECBASE
Description: Vector Base
Reset Value: User Defined (see note 1)
Option: Relocatable Vector Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: VECBASE ← AR[t]
RSR Function: AR[t] ← VECBASE
Other Changes to the Register: (none listed)
Other Effects of the Register: Exception Vector Locations
Note 1: The reset value of VECBASE is set by the user as part of the configuration.

Table 5–156. MESR - Special Register #109
SR#: 109
Name: MESR
Description: Memory error status register
Reset Value: 32’hXXXX0C00
Option: Memory ECC/Parity Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MESR ← AR[t]
RSR Function: AR[t] ← MESR
Other Changes to the Register: Memory-error-exception, memory error without exception
Other Effects of the Register: Controls memory error logic
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR MESR ⇒ ISYNC ⇒ Change in error behavior on instruction memories
    WSR/XSR MESR ⇒ DSYNC ⇒ Change in error behavior on data memories

Table 5–157. MECR - Special Register #110
SR#: 110
Name: MECR
Description: Memory error check register
Reset Value: Undefined
Option: Memory ECC/Parity Option
Count: 1
Bits: 22
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MECR ← AR[t]
RSR Function: AR[t] ← MECR
Other Changes to the Register: Memory-error-exception, memory error without exception, loads when MESR[9] is set.
Other Effects of the Register: Stores when MESR[9] is set.
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR MECR ⇒ ISYNC ⇒ Check bit write to instruction memories
    WSR/XSR MECR ⇒ DSYNC ⇒ Check bit write to data memories

Table 5–158. MEVADDR - Special Register #111
SR#: 111
Name: MEVADDR
Description: Memory error virtual address register
Reset Value: Undefined
Option: Memory ECC/Parity Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MEVADDR ← AR[t]
RSR Function: AR[t] ← MEVADDR
Other Changes to the Register: Memory-error-exception, memory error without exception
Other Effects of the Register: (none listed)

Table 5–159. DEBUGCAUSE - Special Register #233
SR#: 233
Name: DEBUGCAUSE
Description: Debug cause register
Reset Value: Undefined
Option: Debug Option
Count: 1
Bits: 12
Privileged?: Yes
XSR Legal?: No
WSR Function: Reserved
RSR Function: AR[t] ← 0^{20} || DEBUGCAUSE
Other Changes to the Register: Debug exception or interrupt
Other Effects of the Register: (none listed)

5.3.9

Exception State Special Registers

The Special Registers that save the PC and PS values and an initial register value for
each of the levels are described in Table 5–160 through Table 5–168.
Table 5–160. EPC1 - Special Register #177
SR#: 177
Name: EPC1
Description: Exception PC[1]
Reset Value: Undefined
Option: Exception Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EPC[1] ← AR[t]
RSR Function: AR[t] ← EPC[1]
Other Changes to the Register: General-exception/overflow-or-underflow-exception
Other Effects of the Register: RFE/RFWO/RFWU

Table 5–161. EPC2..7 - Special Register #178-183
SR#: 178-183
Name: EPC2..7
Description: Exception PC[2..7]
Reset Value: Undefined
Option: High-Priority Interrupt Option
Count: NLEVEL+NNMI-1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EPC[sr_{3..0}] ← AR[t]
RSR Function: AR[t] ← EPC[sr_{3..0}]
    AR[t] is undefined if sr_{3..0} > NLEVEL+NNMI
Other Changes to the Register: Level[sr_{3..0}]-Interrupt/debug-exception/NMI
Other Effects of the Register: RFI[sr_{3..0}]/RFDO/RFDD

Table 5–162. DEPC - Special Register #192
SR#: 192
Name: DEPC
Description: Double exception PC
Reset Value: Undefined
Option: Exception Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: DEPC ← AR[t]
RSR Function: AR[t] ← DEPC
Other Changes to the Register: Double exception
Other Effects of the Register: RFDE

Table 5–163. MEPC - Special Register #106
SR#: 106
Name: MEPC
Description: Memory error PC register
Reset Value: Undefined
Option: Memory ECC/Parity Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MEPC ← AR[t]
RSR Function: AR[t] ← MEPC
    AR[t] is undefined unless MESR[0] is set.
Other Changes to the Register: Memory-error-exception
Other Effects of the Register: RFME

Table 5–164. EPS2..7 - Special Register #194-199
SR#: 194-199
Name: EPS2..7
Description: Exception processor status register[2..7]
Reset Value: Undefined
Option: High-Priority Interrupt Option
Count: NLEVEL+NNMI-1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EPS[sr_{3..0}] ← AR[t]
RSR Function: AR[t] ← EPS[sr_{3..0}]
    AR[t] is undefined if sr_{3..0} > NLEVEL+NNMI
Other Changes to the Register: Level[sr_{3..0}]-Interrupt/debug-exception/NMI
Other Effects of the Register: RFI[sr_{3..0}]/RFDO/RFDD

Table 5–165. MEPS - Special Register #107
SR#: 107
Name: MEPS
Description: Memory error PS register
Reset Value: Undefined
Option: Memory ECC/Parity Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MEPS ← AR[t]
RSR Function: AR[t] ← MEPS
    AR[t] is undefined unless MESR[0] is set.
Other Changes to the Register: Memory-error-exception
Other Effects of the Register: RFME

Table 5–166. EXCSAVE1 - Special Register #209
SR#: 209
Name: EXCSAVE1
Description: Exception save register[1]
Reset Value: Undefined
Option: Exception Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EXCSAVE[1] ← AR[t]
RSR Function: AR[t] ← EXCSAVE[1]
Other Changes to the Register: (none listed)
Other Effects of the Register: (none listed)

Table 5–167. EXCSAVE2..7 - Special Register #210-215
SR#: 210-215
Name: EXCSAVE2..7
Description: Exception save register[2..7]
Reset Value: Undefined
Option: High-Priority Interrupt Option
Count: NLEVEL+NNMI-1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: EXCSAVE[sr_{3..0}] ← AR[t]
RSR Function: AR[t] ← EXCSAVE[sr_{3..0}]
    AR[t] is undefined if sr_{3..0} > NLEVEL+NNMI
Other Changes to the Register: (none listed)
Other Effects of the Register: (none listed)

Table 5–168. MESAVE - Special Register #108
SR#: 108
Name: MESAVE
Description: Memory error save register
Reset Value: Undefined
Option: Memory ECC/Parity Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: MESAVE ← AR[t]
RSR Function: AR[t] ← MESAVE
Other Changes to the Register: (none listed)
Other Effects of the Register: (none listed)

5.3.10 Interrupt Special Registers
The Special Registers that manage interrupt handling are described in Table 5–169
through Table 5–172.
Table 5–169. INTERRUPT - Special Register #226 (read)
SR#: 226
Name: INTERRUPT
Description: Interrupt pending register
Reset Value: Undefined
Option: Interrupt Option
Count: 1
Bits: NINTERRUPT
Privileged?: Yes
XSR Legal?: No
WSR Function: see Table 5–170 and Table 5–171
RSR Function: AR[t] ← 0^{32-NINTERRUPT} || INTERRUPT
Other Changes to the Register: Assertion/deassertion of interrupt signals/WSR.CCOMPAREn
Other Effects of the Register: Pipeline takes interrupt
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR INTSET ⇒ ESYNC ⇒ RSR INTERRUPT
    WSR INTCLEAR ⇒ ESYNC ⇒ RSR INTERRUPT

Table 5–170. INTSET - Special Register #226 (write)
SR#: 226
Name: INTSET
Description: Interrupt set register
Reset Value: No separate state
Option: Interrupt Option
Count: 1
Bits: NINTERRUPT
Privileged?: Yes
XSR Legal?: No
WSR Function: INTERRUPT ← INTERRUPT or AR[t]_{X-1..0}
    Undefined if AR[t]_{31..X} ≠ 0^{32-X}
    X = NINTERRUPT
    Only software interrupt bits can be set.
RSR Function: see Table 5–169
Other Changes to the Register: (State is INTERRUPT)
Other Effects of the Register: (State is INTERRUPT)
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR INTSET ⇒ ESYNC ⇒ RSR INTERRUPT
    WSR INTSET ⇒ RSYNC ⇒ Instruction which must execute after the software interrupt

Table 5–171. INTCLEAR - Special Register #227
SR#: 227
Name: INTCLEAR
Description: Interrupt clear register
Reset Value: No separate state
Option: Interrupt Option
Count: 1
Bits: NINTERRUPT
Privileged?: Yes
XSR Legal?: No
WSR Function: INTERRUPT ← INTERRUPT and not AR[t]_{X-1..0}
    Undefined if AR[t]_{31..X} ≠ 0^{32-X}
    X = NINTERRUPT
    Bits in AR[t]_{X-1..0} may be set without causing harm.
    Only bits which can be cleared by this write are affected.
RSR Function: AR[t] ← undefined^{32}
Other Changes to the Register: (State is INTERRUPT)
Other Effects of the Register: (State is INTERRUPT)
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR INTCLEAR ⇒ ESYNC ⇒ RSR INTERRUPT
    WSR INTCLEAR ⇒ RSYNC ⇒ Instruction which must execute after the cleared interrupt
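
INTSET and INTCLEAR have no state of their own; both writes update the INTERRUPT register. A minimal sketch of the two write functions follows. The masks `software_mask` and `clearable_mask` stand in for configuration-dependent information about which interrupt bits software may set or clear; their names and the example values are hypothetical.

```python
def wsr_intset(interrupt, ar, software_mask):
    """INTERRUPT <- INTERRUPT or AR[t]; only software interrupt bits can be set."""
    return interrupt | (ar & software_mask)


def wsr_intclear(interrupt, ar, clearable_mask):
    """INTERRUPT <- INTERRUPT and not AR[t]; only clearable bits are affected."""
    return interrupt & ~(ar & clearable_mask)


# Hypothetical configuration: bit 7 is a software interrupt,
# and bits 7 and 3 can be cleared by software.
pending = 0b0000_1000
pending = wsr_intset(pending, 0b1000_0001, software_mask=0b1000_0000)
after_set = pending
pending = wsr_intclear(pending, 0b1111_1111, clearable_mask=0b1000_1000)
after_clear = pending
```

Writing extra set bits to INTCLEAR is harmless in this model, matching the note that only bits which can be cleared by the write are affected.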

Table 5–172. INTENABLE - Special Register #228
SR#: 228
Name: INTENABLE
Description: Interrupt enable register
Reset Value: Undefined
Option: Interrupt Option
Count: 1
Bits: NINTERRUPT
Privileged?: Yes
XSR Legal?: Yes
WSR Function: INTENABLE ← AR[t]_{NINTERRUPT-1..0}
    Undefined if AR[t]_{31..X} ≠ 0^{32-X}
    X = NINTERRUPT
RSR Function: AR[t] ← 0^{32-NINTERRUPT} || INTENABLE
Other Changes to the Register: (none listed)
Other Effects of the Register: Pipeline takes interrupt
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR INTENABLE ⇒ ESYNC ⇒ RSR/XSR INTENABLE
    WSR/XSR INTENABLE ⇒ RSYNC ⇒ Any instruction which must wait for INTENABLE changes

5.3.11 Timing Special Registers
The Special Registers that manage instruction counting and cycle counting, including
timer interrupts, are described in Table 5–173 through Table 5–176.
Table 5–173. ICOUNT - Special Register #236
SR#: 236
Name: ICOUNT
Description: Instruction count register
Reset Value: Undefined
Option: Debug Option
Count: 1
Bits: 2 or 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: ICOUNT ← AR[t]
    Write when CINTLEVEL ≥ ICOUNTLEVEL
RSR Function: AR[t] ← ICOUNT
    Defined only when CINTLEVEL ≥ ICOUNTLEVEL
Other Changes to the Register: Increment on appropriate cycles
Other Effects of the Register: Debug exception
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR ICOUNT ⇒ ESYNC ⇒ RSR/XSR ICOUNT
    WSR/XSR ICOUNT ⇒ ISYNC ⇒ Ending CINTLEVEL ≥ ICOUNTLEVEL

Table 5–174. ICOUNTLEVEL - Special Register #237
SR#: 237
Name: ICOUNTLEVEL
Description: Instruction count level register
Reset Value: Undefined
Option: Debug Option
Count: 1
Bits: 4
Privileged?: Yes
XSR Legal?: Yes
WSR Function: ICOUNTLEVEL ← AR[t]_{3..0}
    Undefined if AR[t]_{31..4} ≠ 0^{28}
    Write when CINTLEVEL ≥ old ICOUNTLEVEL
    Write when CINTLEVEL ≥ new ICOUNTLEVEL
RSR Function: AR[t] ← 0^{28} || ICOUNTLEVEL
Other Changes to the Register: (none listed)
Other Effects of the Register: Debug exception
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR ICOUNTLEVEL ⇒ ISYNC ⇒ Ending CINTLEVEL ≥ old ICOUNTLEVEL
    WSR/XSR ICOUNTLEVEL ⇒ ISYNC ⇒ Ending CINTLEVEL ≥ new ICOUNTLEVEL

Table 5–175. CCOUNT - Special Register #234
SR#: 234
Name: CCOUNT
Description: Cycle count register
Reset Value: Undefined
Option: Timer Interrupt Option
Count: 1
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: CCOUNT ← AR[t]
    Precise cycle of write is not defined.
    Not usually written during normal operation.
RSR Function: AR[t] ← CCOUNT
    Precise cycle of read is not defined.
Other Changes to the Register: Increment each cycle
Other Effects of the Register: Generates Timer Interrupt
Instruction ⇒ xSYNC ⇒ Instruction:
    WSR/XSR CCOUNT ⇒ ESYNC ⇒ RSR/XSR CCOUNT

Table 5–176. CCOMPARE0..2 - Special Register #240-242
SR#: 240-242
Name: CCOMPARE0..2
Description: Cycle count compare registers
Reset Value: Undefined
Option: Timer Interrupt Option
Count: NCCOMPARE
Bits: 32
Privileged?: Yes
XSR Legal?: Yes
WSR Function: CCOMPARE[sr_{1..0}] ← AR[t]
    INTERRUPT_i ← 0; i is position of timer interrupt
RSR Function: AR[t] ← CCOMPARE[sr_{1..0}]
    AR[t] is undefined if sr_{1..0} ≥ NCCOMPARE
Other Changes to the Register: (none listed)
Other Effects of the Register: Timer Interrupt
Instruction ⇒ xSYNC ⇒ Instruction:
WSR/XSR CCOMPARE0..2 ⇒ ESYNC ⇒ RSR/XSR CCOUNT (to ensure CCOUNT