Intel XScale Microarchitecture User's Manual
Page Count: 198
- Intel® XScale™ Microarchitecture for the PXA255 Processor
- 1 Introduction
- 2 Programming Model
- 3 Memory Management
- 4 Instruction Cache
- 5 Branch Target Buffer
- 6 Data Cache
- 7 Configuration
- 7.1 Overview
- 7.2 CP15 Registers
- 7.2.1 Register 0: ID & Cache Type Registers
- 7.2.2 Register 1: Control & Auxiliary Control Registers
- 7.2.3 Register 2: Translation Table Base Register
- 7.2.4 Register 3: Domain Access Control Register
- 7.2.5 Register 5: Fault Status Register
- 7.2.6 Register 6: Fault Address Register
- 7.2.7 Register 7: Cache Functions
- 7.2.8 Register 8: TLB Operations
- 7.2.9 Register 9: Cache Lock Down
- 7.2.10 Register 10: TLB Lock Down
- 7.2.11 Register 13: Process ID
- 7.2.12 Register 14: Breakpoint Registers
- 7.2.13 Register 15: Coprocessor Access Register
- 7.3 CP14 Registers
- 8 Performance Monitoring
- 9 Test
- 9.1 Boundary-Scan Architecture and Overview
- 9.2 Reset
- 9.3 Instruction Register
- 9.4 Test Data Registers
- 9.5 TAP Controller
- 9.5.1 Test Logic Reset State
- 9.5.2 Run-Test/Idle State
- 9.5.3 Select-DR-Scan State
- 9.5.4 Capture-DR State
- 9.5.5 Shift-DR State
- 9.5.6 Exit1-DR State
- 9.5.7 Pause-DR State
- 9.5.8 Exit2-DR State
- 9.5.9 Update-DR State
- 9.5.10 Select-IR Scan State
- 9.5.11 Capture-IR State
- 9.5.12 Shift-IR State
- 9.5.13 Exit1-IR State
- 9.5.14 Pause-IR State
- 9.5.15 Exit2-IR State
- 9.5.16 Update-IR State
- 10 Software Debug
- 10.1 Introduction
- 10.2 Debug Registers
- 10.3 Debug Control and Status Register (DCSR)
- 10.4 Debug Exceptions
- 10.5 HW Breakpoint Resources
- 10.6 Software Breakpoints
- 10.7 Transmit/Receive Control Register (TXRXCTRL)
- 10.8 Transmit Register (TX)
- 10.9 Receive Register (RX)
- 10.10 Debug JTAG Access
- 10.11 Trace Buffer
- 10.12 Trace Buffer Entries
- 10.13 Downloading Code into the Instruction Cache
- 10.14 Halt Mode Software Protocol
- 10.15 Software Debug Notes
- 11 Performance Considerations
- 11.1 Branch Prediction
- 11.2 Instruction Latencies
- 11.2.1 Performance Terms
- 11.2.2 Branch Instruction Timings
- 11.2.3 Data Processing Instruction Timings
- 11.2.4 Multiply Instruction Timings
- 11.2.5 Saturated Arithmetic Instructions
- 11.2.6 Status Register Access Instructions
- 11.2.7 Load/Store Instructions
- 11.2.8 Semaphore Instructions
- 11.2.9 Coprocessor Instructions
- 11.2.10 Miscellaneous Instruction Timing
- 11.2.11 Thumb Instructions
- 11.3 Interrupt Latency
- A Optimization Guide
- A.1 Introduction
- A.2 Intel® XScale™ Core Pipeline
- A.3 Basic Optimizations
- A.4 Cache and Prefetch Optimizations
- A.4.1 Instruction Cache
- A.4.2 Data and Mini Cache
- A.4.3 Cache Considerations
- A.4.4 Prefetch Considerations
- A.4.4.1 Prefetch Distances
- A.4.4.2 Prefetch Loop Scheduling
- A.4.4.3 Compute vs. Data Bus Bound
- A.4.4.4 Low Number of Iterations
- A.4.4.5 Bandwidth Limitations
- A.4.4.6 Cache Memory Considerations
- A.4.4.7 Cache Blocking
- A.4.4.8 Prefetch Unrolling
- A.4.4.9 Pointer Prefetch
- A.4.4.10 Loop Interchange
- A.4.4.11 Loop Fusion
- A.4.4.12 Prefetch to Reduce Register Pressure
- A.5 Instruction Scheduling
- A.5.1 Scheduling Loads
- A.5.2 Scheduling Data Processing Instructions
- A.5.3 Scheduling Multiply Instructions
- A.5.4 Scheduling SWP and SWPB Instructions
- A.5.5 Scheduling the MRA and MAR Instructions (MRRC/MCRR)
- A.5.6 Scheduling the MIA and MIAPH Instructions
- A.5.7 Scheduling MRS and MSR Instructions
- A.5.8 Scheduling Coprocessor Instructions
- A.6 Optimizations for Size