SiFive U54-MC Manual v19.08p3p0
© SiFive, Inc.
Proprietary Notice
Copyright © 2018–2020, SiFive Inc. All rights reserved.
Information in this document is provided "as is," with all faults.
SiFive expressly disclaims all warranties, representations, and conditions of any kind, whether express or implied, including, but not limited to, the implied warranties or conditions of merchantability, fitness for a particular purpose and non-infringement.
SiFive does not assume any liability arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation indirect, incidental, special, exemplary, or consequential damages.
SiFive reserves the right to make changes without further notice to any products herein.
Release Information

Version      Date
v19.08p3p0   April 30, 2020
v19.08p2p0   December 06, 2019
v19.08p1p0   November 08, 2019
v19.08p0     September 17, 2019

Changes:
• Fixed issue in which mcause values did not reset to 0 after reset
• Added the "Disable Speculative I$ Refill" bit to the Feature Disable CSR to partially mitigate undesired speculative accesses to the Memory Port
• Fixed issue in which a read of the L2 Cache via its sideband interface could fail to report an ECC error if the read occurred immediately after the corrupt data was written
• Fixed issue in which a read of the L2 Cache via its sideband interface could erroneously report an ECC error if the read occurred immediately after data with good ECC was written
• Fixed issue in which unused logic in asynchronous crossings (as found in the Debug connection to the core) would cause CDC lint warnings
• Fixed issue in which WFI did not gate the clock if the following instruction was a memory access
• Fixed issue in which performance counters set to count both exceptions and other retirement events only counted the exceptions
• Added a "Suppress Grant on Corrupt Data" bit to the Feature Disable CSR to mitigate an issue in which it was impossible to clear an uncorrectable D-Cache ECC error from the core experiencing the error
• Fixed a potential bus hang when flushing the L2
• Various documentation fixes and improvements
• Fixed erratum in which the TDO pin may remain driven after reset
• Fixed erratum in which Debug.SBCS had an incorrect reset value for SBACCESS
• Fixed typos and other minor documentation errors
• The Debug Module memory region is no longer accessible in M-mode
• Addition of the CDISCARD instruction for invalidating data cache lines without writeback
Version    Date
v19.05p2   August 26, 2019
v19.05p1   July 22, 2019
v19.05     June 09, 2019
v19.02     February 28, 2019
v1p0       October 10, 2018

Changes:
• Fix for errata on 5-series cores with L1 data caches or L2 caches in which CFLUSH.D.L1 followed by a load that is nack'd could cause core lockup
• Configuration of standard core parameters updated to match the web specification. The L2 cache parameters were updated: the number of L2 banks is now 4, L2 associativity is now 16, and L2 size is now 2 MiB
• SiFive Insight is enabled
• Enable debugger reads of Debug Module registers when periphery is in reset
• Fix errata to get illegal instruction exception when executing DRET outside of Debug Mode
• v19.05 release of the U54-MC. No functional changes
• Changed the date-based release numbering system
• Top-level module name [U54MC_CoreIPSubsystem]
• SiFive Insight [enabled]
• WFI-based clock-gating [enabled]
• Core local interrupts [16 0]
• Global interrupts [128 127]
• Initial release
Contents

1 Introduction
  1.1 U54-MC Overview
  1.2 S5 RISC-V Monitor Core
  1.3 U5 RISC-V Application Cores
  1.4 Debug Support
  1.5 Interrupts
2 List of Abbreviations and Terms
3 S5 RISC-V Core
  3.1 Instruction Memory System
    3.1.1 I-Cache Reconfigurability
  3.2 Instruction Fetch Unit
  3.3 Execution Pipeline
  3.4 Data Memory System
  3.5 Atomic Memory Operations
  3.6 Supported Modes
  3.7 Physical Memory Protection (PMP)
    3.7.1 Functional Description
    3.7.2 Region Locking
  3.8 Hardware Performance Monitor
4 U5 RISC-V Core
  4.1 Instruction Memory System
    4.1.1 I-Cache Reconfigurability
  4.2 Instruction Fetch Unit
  4.3 Execution Pipeline
  4.4 Data Memory System
  4.5 Atomic Memory Operations
  4.6 Floating-Point Unit (FPU)
  4.7 Virtual Memory Support
  4.8 Supported Modes
  4.9 Physical Memory Protection (PMP)
    4.9.1 Functional Description
    4.9.2 Region Locking
  4.10 Hardware Performance Monitor
5 Memory Map
6 Interrupts
  6.1 Interrupt Concepts
  6.2 Interrupt Operation
    6.2.1 Interrupt Entry and Exit
  6.3 Interrupt Control Status Registers
    6.3.1 Machine Status Register (mstatus)
    6.3.2 Machine Trap Vector (mtvec)
    6.3.3 Machine Interrupt Enable (mie)
    6.3.4 Machine Interrupt Pending (mip)
    6.3.5 Machine Cause (mcause)
  6.4 Supervisor Mode Interrupts
    6.4.1 Delegation Registers (mideleg and medeleg)
    6.4.2 Supervisor Status Register (sstatus)
    6.4.3 Supervisor Interrupt Enable Register (sie)
    6.4.4 Supervisor Interrupt Pending (sip)
    6.4.5 Supervisor Cause Register (scause)
    6.4.6 Supervisor Trap Vector (stvec)
    6.4.7 Delegated Interrupt Handling
  6.5 Interrupt Priorities
  6.6 Interrupt Latency
7 Bus-Error Unit
  7.1 Bus-Error Unit Overview
  7.2 Reportable Errors
  7.3 Functional Behavior
  7.4 Memory Map
8 Core-Local Interruptor (CLINT)
  8.1 CLINT Memory Map
  8.2 MSIP Registers
  8.3 Timer Registers
  8.4 Supervisor Mode Delegation
9 Level 2 Cache Controller
  9.1 Level 2 Cache Controller Overview
  9.2 Functional Description
    9.2.1 Way Enable and the L2 Loosely-Integrated Memory (L2 LIM)
    9.2.2 Way Masking and Locking
    9.2.3 L2 Zero Device
    9.2.4 Error Correcting Codes (ECC)
  9.3 Memory Map
  9.4 Register Descriptions
    9.4.1 Cache Configuration Register (Config)
    9.4.2 Way Enable Register (WayEnable)
    9.4.3 ECC Error Injection Register (ECCInjectError)
    9.4.4 ECC Directory Fix Address (DirECCFix*)
    9.4.5 ECC Directory Fix Count (DirECCFixCount)
    9.4.6 ECC Directory Fail Address (DirECCFail*)
    9.4.7 ECC Data Fix Address (DatECCFix*)
    9.4.8 ECC Data Fix Count (DatECCFixCount)
    9.4.9 ECC Data Fail Address (DatECCFail*)
    9.4.10 ECC Data Fail Count (DatECCFailCount)
    9.4.11 Cache Flush Registers (Flush*)
    9.4.12 Way Mask Registers (WayMask*)
10 Platform-Level Interrupt Controller (PLIC)
  10.1 Memory Map
  10.2 Interrupt Sources
  10.3 Interrupt Priorities
  10.4 Interrupt Pending Bits
  10.5 Interrupt Enables
  10.6 Priority Thresholds
  10.7 Interrupt Claim Process
  10.8 Interrupt Completion
11 Custom Instructions
  11.1 CFLUSH.D.L1
  11.2 CDISCARD.D.L1
  11.3 Other Custom Instructions
12 Debug
  12.1 Debug CSRs
    12.1.1 Trace and Debug Register Select (tselect)
    12.1.2 Trace and Debug Data Registers (tdata1-3)
    12.1.3 Debug Control and Status Register (dcsr)
    12.1.4 Debug PC (dpc)
    12.1.5 Debug Scratch (dscratch)
  12.2 Breakpoints
    12.2.1 Breakpoint Match Control Register (mcontrol)
    12.2.2 Breakpoint Match Address Register (maddress)
    12.2.3 Breakpoint Execution
    12.2.4 Sharing Breakpoints Between Debug and Machine Mode
  12.3 Debug Memory Map
    12.3.1 Debug RAM and Program Buffer (0x300–0x3FF)
    12.3.2 Debug ROM (0x800–0xFFF)
    12.3.3 Debug Flags (0x100–0x110, 0x400–0x7FF)
    12.3.4 Safe Zero Address
  12.4 Debug Module Interface
    12.4.1 DM Registers
    12.4.2 Abstract Commands
    12.4.3 Multi-core Synchronization
    12.4.4 System Bus Access
13 Error Correction Codes (ECC)
  13.1 ECC Configuration
    13.1.1 ECC Initialization
  13.2 ECC Interrupt Handling and Error Injection
  13.3 Hardware Operation Upon ECC Error
14 References
Chapter 1
Introduction
SiFive's U54-MC is a full-Linux-capable, cache-coherent 64-bit RISC-V processor available as an IP block. The SiFive U54-MC is guaranteed to be compatible with all applicable RISC-V standards, and this document should be read together with the official RISC-V user-level, privileged, and external debug architecture specifications.
A summary of features in the U54-MC can be found in Table 1.
U54-MC Feature Set

Feature                           Description
Number of Harts                   5 harts.
S5 Core                           1× S5 RISC-V core.
U5 Core                           4× U5 RISC-V cores.
Level-2 Cache                     2 MiB, 16-way L2 Cache.
PLIC Interrupts                   127 interrupt signals, which can be connected to off-core-complex devices.
PLIC Priority Levels              The PLIC supports 7 priority levels.
Hardware Breakpoints              2 hardware breakpoints.
Physical Memory Protection Unit   PMP with 8 regions and a minimum granularity of 4 bytes.

Table 1: U54-MC Feature Set
1.1 U54-MC Overview
An overview of the SiFive U54-MC is shown in Figure 1. This RISC-V Core IP includes 5× 64-bit RISC-V cores, along with local and global interrupt support and physical memory protection. The memory system consists of a Data Cache, Data Tightly-Integrated Memory, and an Instruction Cache with configurable Tightly-Integrated Memory. The U54-MC also includes a debug unit, one incoming Port, and three outgoing Ports.
Figure 1: U54-MC Block Diagram
The U54-MC memory map is detailed in Chapter 5, and the interfaces are described in full in the U54-MC User Guide.
1.2 S5 RISC-V Monitor Core
The U54-MC includes a 64-bit S5 RISC-V core, which has a high-performance, single-issue, in-order execution pipeline with a peak sustainable execution rate of one instruction per clock cycle. The S5 core supports Machine and User privilege modes, as well as the standard Multiply, Atomic, and Compressed RISC-V extensions (RV64IMAC).
The monitor core is described in more detail in Chapter 3.
1.3 U5 RISC-V Application Cores
The U54-MC includes four 64-bit U5 RISC-V cores, each of which has a high-performance, single-issue, in-order execution pipeline with a peak sustainable execution rate of one instruction per clock cycle. Each U5 core supports Machine, Supervisor, and User privilege modes, as well as the standard Multiply, Single-Precision Floating-Point, Double-Precision Floating-Point, Atomic, and Compressed RISC-V extensions (RV64GC).
The application cores are described in more detail in Chapter 4.
1.4 Debug Support
The U54-MC provides external debugger support over an industry-standard JTAG port, including 2 hardware-programmable breakpoints per hart.
Debug support is described in detail in Chapter 12, and the debug interface is described in the U54-MC User Guide.
1.5 Interrupts
This Core Complex includes a RISC-V standard Platform-Level Interrupt Controller (PLIC), which supports 136 global interrupts with 7 priority levels pre-integrated with the on-core-complex peripherals.
This Core Complex also provides the standard RISC-V machine-mode timer and software interrupts via the Core-Local Interruptor (CLINT).
Interrupts are described in Chapter 6. The CLINT is described in Chapter 8. The PLIC is described in Chapter 10.
Chapter 2
List of Abbreviations and Terms
Term       Definition
BHT        Branch History Table
BTB        Branch Target Buffer
RAS        Return-Address Stack
CLINT      Core-Local Interruptor. Generates per-hart software interrupts and timer interrupts.
CLIC       Core-Local Interrupt Controller. Configures priorities and levels for core local interrupts.
hart       HARdware Thread
DTIM       Data Tightly Integrated Memory
IJTP       Indirect-Jump Target Predictor
ITIM       Instruction Tightly Integrated Memory
JTAG       Joint Test Action Group
LIM        Loosely Integrated Memory. Used to describe memory space delivered in a SiFive Core Complex but not tightly integrated to a CPU core.
PMP        Physical Memory Protection
PLIC       Platform-Level Interrupt Controller. The global interrupt controller in a RISC-V system.
TileLink   A free and open interconnect standard originally developed at UC Berkeley.
RO         Used to describe a Read Only register field.
RW         Used to describe a Read/Write register field.
WO         Used to describe a Write Only register field.
WARL       Write-Any Read-Legal field. A register field that can be written with any value, but returns only supported values when read.
WIRI       Writes-Ignored, Reads-Ignore field. A read-only register field reserved for future use. Writes to the field are ignored, and reads should ignore the value returned.
WLRL       Write-Legal, Read-Legal field. A register field that should only be written with legal values and that only returns a legal value if it was last written with a legal value.
WPRI       Writes-Preserve Reads-Ignore field. A register field that might contain unknown information. Reads should ignore the value returned, but writes to the whole register should preserve the original value.
Chapter 3
S5 RISC-V Core
This chapter describes the 64-bit S5 RISC-V processor core used in the U54-MC. The S5 processor core comprises an instruction memory system, an instruction fetch unit, an execution pipeline, a data memory system, and support for global, software, and timer interrupts.
The S5 feature set is summarized in Table 2.
Feature                                 Description
ISA                                     RV64IMAC.
Instruction Cache                       16 KiB 2-way instruction cache.
Instruction Tightly Integrated Memory   The S5 has support for an ITIM with a maximum size of 8 KiB.
Data Tightly Integrated Memory          8 KiB DTIM.
ECC Support                             Single error correction, double error detection on the ITIM and DTIM.
Modes                                   The S5 supports the following modes: Machine Mode, User Mode.

Table 2: S5 Feature Set
3.1 Instruction Memory System
The instruction memory system consists of a dedicated 16 KiB 2-way set-associative instruction cache. The access latency of all blocks in the instruction memory system is one clock cycle. The instruction cache is not kept coherent with the rest of the platform memory system. Writes to instruction memory must be synchronized with the instruction fetch stream by executing a FENCE.I instruction.
The instruction cache has a line size of 64 bytes, and a cache line fill triggers a burst access outside of the U54-MC. The core caches instructions from executable addresses, with the exception of the Instruction Tightly Integrated Memory (ITIM), which is further described in Section 3.1.1. See the U54-MC Memory Map in Chapter 5 for a description of executable address regions that are denoted by the attribute X.
Trying to execute an instruction from a non-executable address results in a synchronous trap.
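As an illustration of the FENCE.I requirement above, the following bare-metal C sketch (assuming a GCC-style RISC-V toolchain; patch_area is a hypothetical executable buffer, not a name defined by this manual) copies new code into memory and then synchronizes the instruction fetch stream before the code may be executed:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical destination in executable memory; placement and name
     * are illustrative only. */
    extern uint32_t patch_area[];

    void install_code(const uint32_t *code, size_t words)
    {
        memcpy(patch_area, code, words * sizeof(uint32_t));
        /* The I-cache is not kept coherent with stores, so synchronize
         * the fetch stream before executing the newly written code. */
        __asm__ volatile ("fence.i" ::: "memory");
    }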
3.1.1 I-Cache Reconfigurability
The instruction cache can be partially reconfigured into ITIM, which occupies a fixed address range in the memory map. ITIM provides high-performance, predictable instruction delivery. Fetching an instruction from ITIM is as fast as an instruction-cache hit, with no possibility of a cache miss. ITIM can hold data as well as instructions, though loads and stores from a core to its ITIM are not as performant as loads and stores to its Data Tightly Integrated Memory (DTIM). Memory requests from one core to any other core's ITIM are not as performant as memory requests from a core to its own ITIM.
The instruction cache can be configured as ITIM for all ways except for 1 in units of cache lines (64 bytes). A single instruction cache way must remain an instruction cache. ITIM is allocated simply by storing to it. A store to the nth byte of the ITIM memory map reallocates the first n+1 bytes of instruction cache as ITIM, rounded up to the next cache line.
ITIM is deallocated by storing zero to the first byte after the ITIM region, that is, 28 KiB after the base address of ITIM as indicated in the Memory Map in Chapter 5. The deallocated ITIM space is automatically returned to the instruction cache.
For determinism, software must clear the contents of ITIM after allocating it. It is unpredictable whether ITIM contents are preserved between deallocation and allocation.
ITIM devices are shown in the memory map as two regions: a "mem" region and a "control" region. The "mem" region corresponds to the address range which can be allocated for ITIM use from the Instruction Cache. The "control" region corresponds to the first byte after the ITIM "mem" region.
3.2 Instruction Fetch Unit
The S5 instruction fetch unit contains branch prediction hardware to improve performance of the processor core. The branch predictor comprises a 28-entry branch target buffer (BTB) which predicts the target of taken branches, a 512-entry branch history table (BHT), which predicts the direction of conditional branches, and a 6-entry return-address stack (RAS) which predicts the target of procedure returns. The branch predictor has a one-cycle latency, so that correctly predicted control-flow instructions result in no penalty. Mispredicted control-flow instructions incur a three-cycle penalty.
The S5 implements the standard Compressed (C) extension to the RISC-V architecture, which allows for 16-bit RISC-V instructions.
3.3 Execution Pipeline
The S5 execution unit is a single-issue, in-order pipeline. The pipeline comprises five stages: instruction fetch, instruction decode and register fetch, execute, data memory access, and register writeback.
The pipeline has a peak execution rate of one instruction per clock cycle, and is fully bypassed so that most instructions have a one-cycle result latency. There are several exceptions:
• LW has a two-cycle result latency, assuming a cache hit.
• LH, LHU, LB, and LBU have a three-cycle result latency, assuming a cache hit.
• CSR reads have a three-cycle result latency.
• MUL, MULH, MULHU, and MULHSU have a 1-cycle result latency.
• DIV, DIVU, REM, and REMU have between a 3-cycle and 64-cycle result latency, depending on the operand values.
The pipeline only interlocks on read-after-write and write-after-write hazards, so instructions may be scheduled to avoid stalls.
The S5 implements the standard Multiply (M) extension to the RISC-V architecture for integer multiplication and division. The S5 has a 64-bit-per-cycle hardware multiplier and a 1-bit-per-cycle hardware divider. The multiplier is fully pipelined and can begin a new operation on each cycle, with a maximum throughput of one operation per cycle.
The hart will not abandon a divide instruction in flight. This means that if an interrupt handler tries to use a register that is the destination register of a divide instruction, the pipeline stalls until the divide is complete.
Branch and jump instructions transfer control from the memory access pipeline stage. Correctly predicted branches and jumps incur no penalty, whereas mispredicted branches and jumps incur a three-cycle penalty.
Most CSR writes result in a pipeline flush with a five-cycle penalty.
3.4 Data Memory System
The S5 data memory system consists of a DTIM interface, which supports up to 8 KiB. The access latency from a core to its own DTIM is two clock cycles for full words and three clock cycles for smaller quantities. Memory requests from one core to any other core's DTIM are not as performant as memory requests from a core to its own DTIM. Misaligned accesses are not supported in hardware and result in a trap to allow software emulation.
Stores are pipelined and commit on cycles where the data memory system is otherwise idle. Loads to addresses currently in the store pipeline result in a five-cycle penalty.
3.5 Atomic Memory Operations
The S5 core supports the RISC-V standard Atomic (A) extension on the DTIM and the Peripheral Port. Atomic memory operations to regions that do not support them generate an access exception precisely at the core.
The load-reserved and store-conditional instructions are only supported on cached regions, and thus generate an access exception on the DTIM and other uncached memory regions.
See The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Version 2.1 for more information on the instructions added by this extension.
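As a brief example of the A extension in practice (a sketch assuming a C11-capable RISC-V compiler targeting RV64 with atomics enabled), a shared counter can be updated with a single atomic memory operation, provided it resides in a region whose memory-map attributes include A:

    #include <stdatomic.h>
    #include <stdint.h>

    /* Must be placed in a region that supports atomics (attribute "A"
     * in the memory map, Chapter 5). */
    static _Atomic uint64_t counter;

    uint64_t bump(void)
    {
        /* The compiler lowers this fetch-add to an AMOADD instruction. */
        return atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    }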
3.6 Supported Modes
The S5 supports RISC-V user mode, providing two levels of privilege: machine (M) and user (U). U-mode provides a mechanism to isolate application processes from each other and from trusted code running in M-mode.
See The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10 for more information on the privilege modes.
3.7 Physical Memory Protection (PMP)
The S5 includes a Physical Memory Protection (PMP) unit compliant with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10. PMP can be used to set memory access privileges (read, write, execute) for specified memory regions. The S5 PMP supports 8 regions with a minimum region size of 4 bytes.
This section describes how PMP concepts in the RISC-V architecture apply to the S5. The definitive resource for information about the RISC-V PMP is The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
3.7.1 Functional Description
The S5 includes a PMP unit, which can be used to restrict access to memory and isolate processes from each other.
The S5 PMP unit has 8 regions and a minimum granularity of 4 bytes. Overlapping regions are permitted. The S5 PMP unit implements the architecturally defined pmpcfgX CSR pmpcfg0, supporting 8 regions. pmpcfg2 is implemented, but hardwired to zero. Access to pmpcfg1 or pmpcfg3 results in an illegal instruction exception.
The PMP registers may only be programmed in M-mode. Ordinarily, the PMP unit enforces permissions on U-mode accesses. However, locked regions (see Section 3.7.2) additionally enforce their permissions on M-mode.
3.7.2 Region Locking
The PMP allows for region locking whereby, once a region is locked, further writes to the configuration and address registers are ignored. Locked PMP entries may only be unlocked with a system reset. A region may be locked by setting the L bit in the pmpicfg register.
In addition to locking the PMP entry, the L bit indicates whether the R/W/X permissions are enforced on M-Mode accesses. When the L bit is clear, the R/W/X permissions apply only to Umode.
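The following M-mode sketch shows one way to program a single PMP region using the architectural pmpaddr0/pmpcfg0 CSRs (the region base, size, and helper names are illustrative; bit encodings follow the RISC-V privileged specification cited above):

    #include <stdint.h>

    #define PMP_R      (1u << 0)   /* read permission */
    #define PMP_W      (1u << 1)   /* write permission */
    #define PMP_X      (1u << 2)   /* execute permission */
    #define PMP_NAPOT  (3u << 3)   /* A field: naturally aligned power-of-two */
    #define PMP_L      (1u << 7)   /* lock; permissions also apply to M-mode */

    /* Program PMP entry 0 for a power-of-two sized, naturally aligned
     * region. Note: this csrw clobbers the other pmpNcfg fields packed
     * into pmpcfg0; production code would read-modify-write. */
    static void pmp_set_region0(uintptr_t base, uintptr_t size, uint8_t cfg)
    {
        uintptr_t napot = (base >> 2) | ((size >> 3) - 1);
        __asm__ volatile ("csrw pmpaddr0, %0" :: "r"(napot));
        __asm__ volatile ("csrw pmpcfg0, %0"  :: "r"((uintptr_t)cfg));
    }

    /* Example: make a 4 KiB region at 0x8000_0000 readable and writable
     * from U-mode; OR in PMP_L to lock the entry and enforce it in M-mode. */
    void pmp_example(void)
    {
        pmp_set_region0(0x80000000u, 0x1000u, PMP_R | PMP_W | PMP_NAPOT);
    }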
3.8 Hardware Performance Monitor
The U54-MC supports a basic hardware performance monitoring facility compliant with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10. The mcycle CSR holds a count of the number of clock cycles the hart has executed since some arbitrary time in the past. The minstret CSR holds a count of the number of instructions the hart has retired since some arbitrary time in the past. Both are 64-bit counters.
The hardware performance monitor includes two additional event counters, mhpmcounter3 and mhpmcounter4. The event selector CSRs mhpmevent3 and mhpmevent4 are registers that control which event causes the corresponding counter to increment. The mhpmcounters are 40-bit counters.
The event selectors are partitioned into two fields, as shown in Table 3: the lower 8 bits select an event class, and the upper bits form a mask of events in that class. The counter increments if the event corresponding to any set mask bit occurs. For example, if mhpmevent3 is set to 0x4200, then mhpmcounter3 will increment when either a load instruction or a conditional branch instruction retires. An event selector of 0 means "count nothing."
Note that in-flight and recently retired instructions may or may not be reflected when reading or writing the performance counters or writing the event selectors.
Machine Hardware Performance Monitor Event Register

Instruction Commit Events, mhpmeventX[7:0] = 0
Bits   Meaning
8      Exception taken
9      Integer load instruction retired
10     Integer store instruction retired
11     Atomic memory operation retired
12     System instruction retired
13     Integer arithmetic instruction retired
14     Conditional branch retired
15     JAL instruction retired
16     JALR instruction retired
17     Integer multiplication instruction retired
18     Integer division instruction retired

Microarchitectural Events, mhpmeventX[7:0] = 1
Bits   Meaning
8      Load-use interlock
9      Long-latency interlock
10     CSR read interlock
11     Instruction cache/ITIM busy
12     Data cache/DTIM busy
13     Branch direction misprediction
14     Branch/jump target misprediction
15     Pipeline flush from CSR write
16     Pipeline flush from other event
17     Integer multiplication interlock

Memory System Events, mhpmeventX[7:0] = 2
Bits   Meaning
8      Instruction cache miss
9      Memory-mapped I/O access

Table 3: mhpmevent Register Description
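Continuing the mhpmevent3 = 0x4200 example above, a minimal M-mode sketch (helper names are illustrative) programs the event selector and reads the counter around a region of interest:

    #include <stdint.h>

    /* 0x4200 = event class 0 with mask bits 9 (integer load retired) and
     * 14 (conditional branch retired) set, as in the example above. */
    #define HPM3_LOADS_AND_BRANCHES  0x4200ull

    static inline void hpm3_start(uint64_t event_mask)
    {
        __asm__ volatile ("csrw mhpmevent3, %0" :: "r"(event_mask));
        __asm__ volatile ("csrw mhpmcounter3, zero");   /* clear the counter */
    }

    static inline uint64_t hpm3_read(void)
    {
        uint64_t count;
        __asm__ volatile ("csrr %0, mhpmcounter3" : "=r"(count));
        return count;
    }

    /* Usage: hpm3_start(HPM3_LOADS_AND_BRANCHES); run_workload(); n = hpm3_read(); */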
Chapter 4
U5 RISC-V Core
This chapter describes the 64-bit U5 RISC-V processor core used in the U54-MC. The U5 processor core comprises an instruction memory system, an instruction fetch unit, an execution pipeline, a floating-point unit, a data memory system, a memory management unit, and support for global, software, and timer interrupts.
The U5 feature set is summarized in Table 4.
Feature                                 Description
ISA                                     RV64GC.
Instruction Cache                       32 KiB 8-way instruction cache.
Instruction Tightly Integrated Memory   The U5 has support for an ITIM with a maximum size of 28 KiB.
Data Cache                              32 KiB 8-way data cache.
ECC Support                             Single error correction, double error detection on the ITIM and Data Cache.
Virtual Memory Support                  The U5 has support for Sv39 virtual memory with a 39-bit virtual address space, 38-bit physical address space, and a 32-entry TLB.
Modes                                   The U5 supports the following modes: Machine Mode, Supervisor Mode, User Mode.

Table 4: U5 Feature Set
4.1 Instruction Memory System
The instruction memory system consists of a dedicated 32 KiB 8-way set-associative instruction cache. The access latency of all blocks in the instruction memory system is one clock cycle. The instruction cache is not kept coherent with the rest of the platform memory system. Writes to instruction memory must be synchronized with the instruction fetch stream by executing a FENCE.I instruction.
The instruction cache has a line size of 64 bytes, and a cache line fill triggers a burst access outside of the U54-MC. The core caches instructions from executable addresses, with the exception of the Instruction Tightly Integrated Memory (ITIM), which is further described in Section 4.1.1. See the U54-MC Memory Map in Chapter 5 for a description of executable address regions that are denoted by the attribute X.
Trying to execute an instruction from a non-executable address results in a synchronous trap.
4.1.1 I-Cache Reconfigurability
The instruction cache can be partially reconfigured into ITIM, which occupies a fixed address range in the memory map. ITIM provides high-performance, predictable instruction delivery. Fetching an instruction from ITIM is as fast as an instruction-cache hit, with no possibility of a cache miss. ITIM can hold data as well as instructions, though loads and stores from a core to its ITIM are not as performant as loads and stores to its Data Tightly Integrated Memory (DTIM). Memory requests from one core to any other core's ITIM are not as performant as memory requests from a core to its own ITIM.
The instruction cache can be configured as ITIM for all ways except for 1 in units of cache lines (64 bytes). A single instruction cache way must remain an instruction cache. ITIM is allocated simply by storing to it. A store to the nth byte of the ITIM memory map reallocates the first n+1 bytes of instruction cache as ITIM, rounded up to the next cache line.
ITIM is deallocated by storing zero to the first byte after the ITIM region, that is, 28 KiB after the base address of ITIM as indicated in the Memory Map in Chapter 5. The deallocated ITIM space is automatically returned to the instruction cache.
For determinism, software must clear the contents of ITIM after allocating it. It is unpredictable whether ITIM contents are preserved between deallocation and allocation.
ITIM devices are shown in the memory map as two regions: a "mem" region and a "control" region. The "mem" region corresponds to the address range which can be allocated for ITIM use from the Instruction Cache. The "control" region corresponds to the first byte after the ITIM "mem" region.
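As a minimal bare-metal sketch of the allocation rule above (assuming the ITIM region at 0x0182_0000 from Chapter 5 belongs to the hart executing the code, and the 28 KiB maximum from Table 4; both values are configuration-dependent):

    #include <stdint.h>
    #include <stddef.h>

    #define ITIM_BASE       0x01820000UL      /* assumed ITIM "mem" region */
    #define ITIM_MAX_BYTES  (28 * 1024)       /* assumed maximum ITIM size */

    /* Storing to byte n-1 of the ITIM region reallocates the first n bytes
     * of the instruction cache as ITIM, rounded up to a 64-byte line. */
    static void itim_allocate(size_t n_bytes)
    {
        volatile uint8_t *itim = (volatile uint8_t *)ITIM_BASE;
        itim[n_bytes - 1] = 0;
    }

    /* Storing zero to the first byte after the ITIM region (the "control"
     * location) returns the allocated space to the instruction cache. */
    static void itim_deallocate(void)
    {
        *(volatile uint8_t *)(ITIM_BASE + ITIM_MAX_BYTES) = 0;
    }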
4.2 Instruction Fetch Unit
The U5 instruction fetch unit contains branch prediction hardware to improve performance of the processor core. The branch predictor comprises a 28-entry branch target buffer (BTB) which predicts the target of taken branches, a 512-entry branch history table (BHT), which predicts the direction of conditional branches, and a 6-entry return-address stack (RAS) which predicts the target of procedure returns. The branch predictor has a one-cycle latency, so that correctly predicted control-flow instructions result in no penalty. Mispredicted control-flow instructions incur a three-cycle penalty.
The U5 implements the standard Compressed (C) extension to the RISC-V architecture, which allows for 16-bit RISC-V instructions.
4.3 Execution Pipeline
The U5 execution unit is a single-issue, in-order pipeline. The pipeline comprises five stages: instruction fetch, instruction decode and register fetch, execute, data memory access, and register writeback.
The pipeline has a peak execution rate of one instruction per clock cycle, and is fully bypassed so that most instructions have a one-cycle result latency. There are several exceptions:
• LW has a two-cycle result latency, assuming a cache hit.
• LH, LHU, LB, and LBU have a three-cycle result latency, assuming a cache hit.
• CSR reads have a three-cycle result latency.
• MUL, MULH, MULHU, and MULHSU have a 1-cycle result latency.
• DIV, DIVU, REM, and REMU have between a 3-cycle and 64-cycle result latency, depending on the operand values.
The pipeline only interlocks on read-after-write and write-after-write hazards, so instructions may be scheduled to avoid stalls.
The U5 implements the standard Multiply (M) extension to the RISC-V architecture for integer multiplication and division. The U5 has a 64-bit-per-cycle hardware multiplier and a 1-bit-per-cycle hardware divider. The multiplier is fully pipelined and can begin a new operation on each cycle, with a maximum throughput of one operation per cycle.
The hart will not abandon a divide instruction in flight. This means that if an interrupt handler tries to use a register that is the destination register of a divide instruction, the pipeline stalls until the divide is complete.
Branch and jump instructions transfer control from the memory access pipeline stage. Correctly predicted branches and jumps incur no penalty, whereas mispredicted branches and jumps incur a three-cycle penalty.
Most CSR writes result in a pipeline flush with a five-cycle penalty.
4.4 Data Memory System
The U5 data memory system has an 8-way set-associative 32 KiB write-back data cache that supports 64-byte cache lines. The access latency is two clock cycles for words and doublewords, and three clock cycles for smaller quantities. Misaligned accesses are not supported in hardware and result in a trap to support software emulation. The data caches are kept coherent by a directory-based cache coherence manager, which resides in the outer L2 cache.
Stores are pipelined and commit on cycles where the data memory system is otherwise idle. Loads to addresses currently in the store pipeline result in a five-cycle penalty.
4.5 Atomic Memory Operations
The U5 core supports the RISC-V standard Atomic (A) extension on the DTIM and the Peripheral Port. Atomic memory operations to regions that do not support them generate an access exception precisely at the core.
The load-reserved and store-conditional instructions are only supported on cached regions, and thus generate an access exception on the DTIM and other uncached memory regions.
See The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Version 2.1 for more information on the instructions added by this extension.
4.6 Floating-Point Unit (FPU)
The U5 FPU provides full hardware support for the IEEE 754-2008 floating-point standard for 32-bit single-precision and 64-bit double-precision arithmetic. The FPU includes a fully pipelined fused-multiply-add unit and an iterative divide and square-root unit, magnitude comparators, and float-to-integer conversion units, all with full hardware support for subnormals and all IEEE default values.
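Before the FPU can be used, software must enable it; this is governed by the mstatus.FS field of the RISC-V privileged architecture referenced in this manual rather than by anything U5-specific. A minimal M-mode sketch:

    #include <stdint.h>

    /* mstatus.FS occupies bits [14:13]; writing a non-zero state (here
     * 0b01, "Initial") permits execution of floating-point instructions. */
    #define MSTATUS_FS_INITIAL  (1ull << 13)

    static inline void fpu_enable(void)
    {
        __asm__ volatile ("csrs mstatus, %0" :: "r"(MSTATUS_FS_INITIAL));
        /* Start from a known rounding mode and clear exception flags. */
        __asm__ volatile ("csrw fcsr, zero");
    }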
4.7 Virtual Memory Support
The U5 has support for virtual memory through the use of a Memory Management Unit (MMU). The MMU supports the Bare and Sv39 modes as described in The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
The U5 MMU has a 39-bit virtual address space mapped to a 38-bit physical address space. A hardware page-table walker refills the address translation caches. Both first-level instruction and data address translation caches are fully associative and have 32 entries.
The MMU supports 2 MiB megapages and 1 GiB gigapages to reduce translation overheads for large contiguous regions of virtual and physical address space.
Note that the U5 does not automatically set the Accessed (A) and Dirty (D) bits in an Sv39 Page Table Entry (PTE). Instead, the U5 MMU raises a page fault exception for a read from a page with PTE.A=0 or a write to a page with PTE.D=0.
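A minimal sketch of turning on Sv39 translation (assuming the caller has already built a valid root page table at a 4 KiB-aligned physical address; because the U5 does not update PTE.A/PTE.D in hardware, the operating system must also be prepared to handle the resulting page faults):

    #include <stdint.h>

    #define SATP_MODE_SV39  (8ull << 60)   /* satp.MODE = 8 selects Sv39 */

    static void enable_sv39(uint64_t root_page_table_pa)
    {
        uint64_t satp = SATP_MODE_SV39 | (root_page_table_pa >> 12);
        __asm__ volatile ("csrw satp, %0" :: "r"(satp));
        /* Flush stale entries from the address-translation caches. */
        __asm__ volatile ("sfence.vma" ::: "memory");
    }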
4.8 Supported Modes
The U5 supports RISC-V supervisor and user modes, providing three levels of privilege: machine (M), supervisor (S), and user (U). U-mode provides a mechanism to isolate application processes from each other and from trusted code running in M-mode. S-mode adds a number of additional CSRs and capabilities.
See The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10 for more information on the privilege modes.
4.9 Physical Memory Protection (PMP)
The U5 includes a Physical Memory Protection (PMP) unit compliant with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10. PMP can be used to set memory access privileges (read, write, execute) for specified memory regions. The U5 PMP supports 8 regions with a minimum region size of 4 bytes.
This section describes how PMP concepts in the RISC-V architecture apply to the U5. The definitive resource for information about the RISC-V PMP is The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
4.9.1 Functional Description
The U5 includes a PMP unit, which can be used to restrict access to memory and isolate processes from each other.
The U5 PMP unit has 8 regions and a minimum granularity of 4 bytes. Overlapping regions are permitted. The U5 PMP unit implements the architecturally defined pmpcfgX CSR pmpcfg0, supporting 8 regions. pmpcfg2 is implemented, but hardwired to zero. Access to pmpcfg1 or pmpcfg3 results in an illegal instruction exception.
The PMP registers may only be programmed in M-mode. Ordinarily, the PMP unit enforces permissions on S-mode and U-mode accesses. However, locked regions (see Section 4.9.2) additionally enforce their permissions on M-mode.
4.9.2 Region Locking
The PMP allows for region locking whereby, once a region is locked, further writes to the configuration and address registers are ignored. Locked PMP entries may only be unlocked with a system reset. A region may be locked by setting the L bit in the pmpicfg register.
In addition to locking the PMP entry, the L bit indicates whether the R/W/X permissions are enforced on M-Mode accesses. When the L bit is clear, the R/W/X permissions apply to S-mode and U-mode.
4.10 Hardware Performance Monitor
The U54-MC supports a basic hardware performance monitoring facility compliant with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10. The mcycle CSR holds a count of the number of clock cycles the hart has executed since some arbitrary time in the past. The minstret CSR holds a count of the number of instructions the hart has retired since some arbitrary time in the past. Both are 64-bit counters.
The hardware performance monitor includes two additional event counters, mhpmcounter3 and mhpmcounter4. The event selector CSRs mhpmevent3 and mhpmevent4 are registers that control which event causes the corresponding counter to increment. The mhpmcounters are 40-bit counters.
The event selectors are partitioned into two fields, as shown in Table 5: the lower 8 bits select an event class, and the upper bits form a mask of events in that class. The counter increments if the event corresponding to any set mask bit occurs. For example, if mhpmevent3 is set to 0x4200, then mhpmcounter3 will increment when either a load instruction or a conditional branch instruction retires. An event selector of 0 means "count nothing."
Note that in-flight and recently retired instructions may or may not be reflected when reading or writing the performance counters or writing the event selectors.
Machine Hardware Performance Monitor Event Register

Instruction Commit Events, mhpmeventX[7:0] = 0
Bits   Meaning
8      Exception taken
9      Integer load instruction retired
10     Integer store instruction retired
11     Atomic memory operation retired
12     System instruction retired
13     Integer arithmetic instruction retired
14     Conditional branch retired
15     JAL instruction retired
16     JALR instruction retired
17     Integer multiplication instruction retired
18     Integer division instruction retired
19     Floating-point load instruction retired
20     Floating-point store instruction retired
21     Floating-point addition retired
22     Floating-point multiplication retired
23     Floating-point fused multiply-add retired
24     Floating-point division or square-root retired
25     Other floating-point instruction retired

Microarchitectural Events, mhpmeventX[7:0] = 1
Bits   Meaning
8      Load-use interlock
9      Long-latency interlock
10     CSR read interlock
11     Instruction cache/ITIM busy
12     Data cache/DTIM busy
13     Branch direction misprediction
14     Branch/jump target misprediction
15     Pipeline flush from CSR write
16     Pipeline flush from other event
17     Integer multiplication interlock
18     Floating-point interlock

Memory System Events, mhpmeventX[7:0] = 2
Bits   Meaning
8      Instruction cache miss
9      Data cache miss or memory-mapped I/O access
10     Data cache writeback
11     Instruction TLB miss
12     Data TLB miss

Table 5: mhpmevent Register Description
Chapter 5
Memory Map
The memory map of the U54-MC is shown in Table 6.
Base          Top           Attr.    Description
0x0000_0000   0x0000_0FFF   RWX A    Debug
0x0000_1000   0x00FF_FFFF            Reserved
0x0100_0000   0x0100_1FFF   RWX A    DTIM (8 KiB)
0x0100_2000   0x016F_FFFF            Reserved
0x0170_0000   0x0170_0FFF   RW A     BusError
0x0170_1000   0x0170_1FFF   RW A     BusError
0x0170_2000   0x0170_2FFF   RW A     BusError
0x0170_3000   0x0170_3FFF   RW A     BusError
0x0170_4000   0x0170_4FFF   RW A     BusError
0x0170_5000   0x017F_FFFF            Reserved
0x0180_0000   0x0180_3FFF   RWX A    ITIM
0x0180_4000   0x0181_FFFF            Reserved
0x0182_0000   0x0182_7FFF   RWX A    ITIM
0x0182_8000   0x0183_FFFF            Reserved
0x0184_0000   0x0184_7FFF   RWX A    ITIM
0x0184_8000   0x0185_FFFF            Reserved
0x0186_0000   0x0186_7FFF   RWX A    ITIM
0x0186_8000   0x0187_FFFF            Reserved
0x0188_0000   0x0188_7FFF   RWX A    ITIM
0x0188_8000   0x01FF_FFFF            Reserved
0x0200_0000   0x0200_FFFF   RW A     CLINT
0x0201_0000   0x0201_0FFF   RW A     L2 Cache Controller
0x0201_1000   0x07FF_FFFF            Reserved
0x0800_0000   0x081F_FFFF   RWX A    L2 LIM
0x0820_0000   0x0BFF_FFFF            Reserved
0x0C00_0000   0x0FFF_FFFF   RW A     PLIC
0x1000_0000   0x1FFF_FFFF            Reserved
0x2000_0000   0x3FFF_FFFF   RWX A    Peripheral Port (512 MiB)
0x4000_0000   0x5FFF_FFFF   RWX      System Port (512 MiB)
0x6000_0000   0x7FFF_FFFF            Reserved
0x8000_0000   0x9FFF_FFFF   RWXC A   Memory Port (512 MiB)
0xA000_0000   0xFFFF_FFFF            Reserved

Table 6: U54-MC Memory Map. Memory Attributes: R - Read, W - Write, X - Execute, C - Cacheable, A - Atomics
Chapter 6
Interrupts
This chapter describes how interrupt concepts in the RISC-V architecture apply to the U54-MC. The definitive resource for information about the RISC-V interrupt architecture is The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
6.1 Interrupt Concepts
The U54-MC supports Machine Mode and Supervisor Mode interrupts. It also has support for the following types of RISC-V interrupts: local and global.

Local interrupts are signaled directly to an individual hart with a dedicated interrupt value. This allows for reduced interrupt latency, as no arbitration is required to determine which hart will service a given request and no additional memory accesses are required to determine the cause of the interrupt. Software and timer interrupts are local interrupts generated by the Core-Local Interruptor (CLINT). The U54-MC contains no other local interrupt sources.

Global interrupts, by contrast, are routed through a Platform-Level Interrupt Controller (PLIC), which can direct interrupts to any hart in the system via the external interrupt. Decoupling global interrupts from the harts allows the design of the PLIC to be tailored to the platform, permitting a broad range of attributes like the number of interrupts and the prioritization and routing schemes.

By default, all interrupts are handled in machine mode. For harts that support supervisor mode, it is possible to selectively delegate interrupts to supervisor mode.

This chapter describes the U54-MC interrupt architecture. Chapter 8 describes the Core-Local Interruptor. Chapter 10 describes the global interrupt architecture and the PLIC design.
The U54-MC interrupt architecture is depicted in Figure 2.
Figure 2: U54-MC Interrupt Architecture Block Diagram
6.2 Interrupt Operation
Within a privilege mode x, if the associated global interrupt-enable bit xIE is clear, then no interrupts will be taken in that privilege mode, but a pending-enabled interrupt in a higher privilege mode will preempt the current execution. If xIE is set, then pending-enabled interrupts at a higher interrupt level in the same privilege mode will preempt the current execution and run the interrupt handler for the higher interrupt level.
When an interrupt or synchronous exception is taken, the privilege mode is modified to reflect the new privilege mode. The global interrupt-enable bit of the handler's privilege mode is cleared.
6.2.1 Interrupt Entry and Exit

When an interrupt occurs:

• The value of mstatus.MIE is copied into mstatus.MPIE, and then mstatus.MIE is cleared, effectively disabling interrupts.
• The privilege mode prior to the interrupt is encoded in mstatus.MPP.
• The current pc is copied into the mepc register, and then pc is set to the value specified by mtvec, as defined by the mtvec.MODE field described in Table 9.
At this point, control is handed over to software in the interrupt handler with interrupts disabled. Interrupts can be re-enabled by explicitly setting mstatus.MIE or by executing an MRET instruction to exit the handler. When an MRET instruction is executed, the following occurs:

• The privilege mode is set to the value encoded in mstatus.MPP.
• The global interrupt enable, mstatus.MIE, is set to the value of mstatus.MPIE.
• The pc is set to the value of mepc.

At this point control is handed over to software.
The Control and Status Registers involved in handling RISC-V interrupts are described in Section 6.3.
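The entry/exit sequence above is normally wrapped by the trap handler's prologue and epilogue. As a sketch, assuming a GCC-style RISC-V toolchain that provides the interrupt function attribute, a handler whose address is installed in mtvec can be written as:

    /* The "interrupt" attribute (a toolchain extension, not part of this
     * manual) makes the compiler save/restore registers and return with
     * MRET, which restores mstatus.MIE from mstatus.MPIE and jumps to mepc. */
    void __attribute__((interrupt("machine"))) machine_trap_handler(void)
    {
        /* Read mcause here and dispatch to the appropriate handler. */
    }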
6.3 Interrupt Control Status Registers
The U54-MC-specific implementation of the interrupt CSRs is described below. For a complete description of RISC-V interrupt behavior and how to access CSRs, please consult The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
6.3.1 Machine Status Register (mstatus)
The mstatus register keeps track of and controls the hart's current operating state, including whether or not interrupts are enabled. A summary of the mstatus fields related to interrupts in the U54-MC is provided in Table 7. Note that this is not a complete description of mstatus as it contains fields unrelated to interrupts. For the full description of mstatus, please consult The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
Machine Status Register (mstatus)

CSR Bits   Field Name   Attr.   Description
0          Reserved     WPRI
1          SIE          RW      Supervisor Interrupt Enable
2          Reserved     WPRI
3          MIE          RW      Machine Interrupt Enable
4          Reserved     WPRI
5          SPIE         RW      Supervisor Previous Interrupt Enable
6          Reserved     WPRI
7          MPIE         RW      Machine Previous Interrupt Enable
8          SPP          RW      Supervisor Previous Privilege Mode
[10:9]     Reserved     WPRI
[12:11]    MPP          RW      Machine Previous Privilege Mode

Table 7: U54-MC mstatus Register (partial)
Interrupts are enabled by setting the MIE bit in mstatus and by enabling the desired individual interrupt in the mie register, described in Section 6.3.3.
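For example, the following M-mode sketch enables the machine timer interrupt using the bit positions from Table 7 and Table 10:

    #include <stdint.h>

    #define MSTATUS_MIE  (1ull << 3)   /* global machine interrupt enable */
    #define MIE_MTIE     (1ull << 7)   /* machine timer interrupt enable  */

    static inline void enable_machine_timer_interrupt(void)
    {
        __asm__ volatile ("csrs mie, %0"     :: "r"(MIE_MTIE));
        __asm__ volatile ("csrs mstatus, %0" :: "r"(MSTATUS_MIE));
    }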
6.3.2 Machine Trap Vector (mtvec)
The mtvec register has two main functions: defining the base address of the trap vector, and setting the mode by which the U54-MC will process interrupts. For Direct and Vectored modes, the interrupt processing mode is defined in the lower two bits of the mtvec register. The mtvec register is described in Table 8.
Machine Trap Vector Register (mtvec)

CSR Bits   Field Name   Attr.   Description
[1:0]      MODE         WARL    MODE sets the interrupt processing mode. The encoding for the U54-MC supported modes is described in Table 9.
[63:2]     BASE[63:2]   WARL    Interrupt Vector Base Address. When operating in Direct Mode, requires 4-byte alignment. When operating in Vectored Mode, requires 4 × XLEN byte alignment.

Table 8: mtvec Register
MODE Field Encoding mtvec.MODE

Value   Name       Description
0x0     Direct     All exceptions set pc to BASE.
0x1     Vectored   Asynchronous interrupts set pc to BASE + 4 × mcause.EXCCODE.
≥2      Reserved

Table 9: Encoding of mtvec.MODE
See Table 8 for a description of the mtvec register. See Table 9 for a description of the mtvec.MODE field. See Table 13 for the U54-MC interrupt exception code values.
Mode Direct
When operating in direct mode, all synchronous exceptions and asynchronous interrupts trap to the mtvec.BASE address. Inside the trap handler, software must read the mcause register to determine what triggered the trap.
When operating in Direct Mode, BASE must be 4-byte aligned.
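A minimal sketch of a direct-mode handler is shown below. It assumes a bare-metal RV64 toolchain with GCC's RISC-V interrupt attribute, and the handler name and dispatch structure are illustrative rather than taken from this manual.

```c
#include <stdint.h>

#define MCAUSE_INTERRUPT (1UL << 63)   /* mcause.Interrupt, Table 12 */
#define MCAUSE_EXCCODE   0x3FFUL       /* mcause[9:0], Table 12      */

void __attribute__((interrupt)) trap_handler(void)
{
    uint64_t mcause;
    __asm__ volatile ("csrr %0, mcause" : "=r"(mcause));

    if (mcause & MCAUSE_INTERRUPT) {
        uint64_t code = mcause & MCAUSE_EXCCODE;  /* e.g. 7 = machine timer, Table 13 */
        (void)code;                               /* dispatch to the matching handler */
    } else {
        /* synchronous exception: mepc holds the trapping pc */
    }
}
```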
Mode Vectored
While operating in vectored mode, interrupts set the pc to mtvec.BASE + 4 × exception code (mcause.EXCCODE). For example, if a machine timer interrupt is taken, the pc is set to mtvec.BASE + 0x1C. Typically, the trap vector table is populated with jump instructions to transfer control to interrupt-specific trap handlers.
In vectored interrupt mode, BASE must be 4 × XLEN byte aligned.
All machine external interrupts (global interrupts) are mapped to exception code 11. Thus, when interrupt vectoring is enabled, the pc is set to address mtvec.BASE + 0x2C for any global interrupt.
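A minimal sketch of selecting Vectored mode is shown below; `trap_vector_table` is a hypothetical, suitably aligned table of jump instructions and is not defined in this manual.

```c
extern void trap_vector_table(void);   /* hypothetical, suitably aligned vector table */

#define MTVEC_MODE_VECTORED 0x1UL      /* Table 9 */

static inline void set_mtvec_vectored(void)
{
    unsigned long value = (unsigned long)&trap_vector_table | MTVEC_MODE_VECTORED;
    __asm__ volatile ("csrw mtvec, %0" :: "r"(value));  /* BASE | MODE */
}
```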
6.3.3 Machine Interrupt Enable (mie)
Individual interrupts are enabled by setting the appropriate bit in the mie register. The mie register is described in Table 10.
Machine Interrupt Enable Register (mie)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WPRI
1         SSIE        RW     Supervisor Software Interrupt Enable
2         Reserved    WPRI
3         MSIE        RW     Machine Software Interrupt Enable
4         Reserved    WPRI
5         STIE        RW     Supervisor Timer Interrupt Enable
6         Reserved    WPRI
7         MTIE        RW     Machine Timer Interrupt Enable
8         Reserved    WPRI
9         SEIE        RW     Supervisor External Interrupt Enable
10        Reserved    WPRI
11        MEIE        RW     Machine External Interrupt Enable
[63:12]   Reserved    WPRI
Table 10: mie Register
6.3.4 Machine Interrupt Pending (mip)
The machine interrupt pending (mip) register indicates which interrupts are currently pending. The mip register is described in Table 11.
Machine Interrupt Pending Register (mip)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WIRI
1         SSIP        RW     Supervisor Software Interrupt Pending
2         Reserved    WIRI
3         MSIP        RO     Machine Software Interrupt Pending
4         Reserved    WIRI
5         STIP        RW     Supervisor Timer Interrupt Pending
6         Reserved    WIRI
7         MTIP        RO     Machine Timer Interrupt Pending
8         Reserved    WIRI
9         SEIP        RW     Supervisor External Interrupt Pending
10        Reserved    WIRI
11        MEIP        RO     Machine External Interrupt Pending
[63:12]   Reserved    WIRI
Table 11: mip Register
6.3.5 Machine Cause (mcause)
When a trap is taken in machine mode, mcause is written with a code indicating the event that caused the trap. When the event that caused the trap is an interrupt, the most-significant bit of mcause is set to 1, and the least-significant bits indicate the interrupt number, using the same encoding as the bit positions in mip. For example, a Machine Timer Interrupt causes mcause to be set to 0x8000_0000_0000_0007. mcause is also used to indicate the cause of synchronous exceptions, in which case the most-significant bit of mcause is set to 0.
See Table 12 for more details about the mcause register. Refer to Table 13 for a list of synchronous exception codes.
Machine Cause Register (mcause)
CSR Bits  Field Name      Attr.  Description
[9:0]     Exception Code  WLRL   A code identifying the last exception.
[62:10]   Reserved        WLRL
63        Interrupt       WARL   1 if the trap was caused by an interrupt; 0 otherwise.
Table 12: mcause Register
Interrupt Exception Codes
Interrupt  Exception Code  Description
1          0               Reserved
1          1               Supervisor software interrupt
1          2               Reserved
1          3               Machine software interrupt
1          4               Reserved
1          5               Supervisor timer interrupt
1          6               Reserved
1          7               Machine timer interrupt
1          8               Reserved
1          9               Supervisor external interrupt
1          10              Reserved
1          11              Machine external interrupt
1          12              Reserved
0          0               Instruction address misaligned
0          1               Instruction access fault
0          2               Illegal instruction
0          3               Breakpoint
0          4               Load address misaligned
0          5               Load access fault
0          6               Store/AMO address misaligned
0          7               Store/AMO access fault
0          8               Environment call from U-mode
0          9               Environment call from S-mode
0          10              Reserved
0          11              Environment call from M-mode
0          12              Instruction page fault
0          13              Load page fault
0          14              Reserved
0          15              Store/AMO page fault
0          16              Reserved
Table 13: mcause Exception Codes
6.4 Supervisor Mode Interrupts
The U54-MC supports the ability to selectively direct interrupts and exceptions to supervisor mode, resulting in improved performance by eliminating the need for additional mode changes.
This capability is enabled by the interrupt and exception delegation CSRs; mideleg and medeleg, respectively. Supervisor interrupts and exceptions can be managed via supervisor versions of the interrupt CSRs, specifically: stvec, sip, sie, and scause.
Machine mode software can also directly write to the sip register, which effectively sends an interrupt to supervisor mode. This is especially useful for timer and software interrupts as it may be desired to handle these interrupts in both machine mode and supervisor mode.
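As a sketch of this technique (bare-metal M-mode, GCC-style inline assembly; the function name is illustrative), the write below raises a supervisor software interrupt by setting sip.SSIP, whose bit position follows Table 18.

```c
static inline void raise_supervisor_software_interrupt(void)
{
    unsigned long ssip = 1UL << 1;                   /* sip.SSIP, Table 18 */
    __asm__ volatile ("csrs sip, %0" :: "r"(ssip));  /* pends an S-mode software interrupt */
}
```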
The delegation and supervisor CSRs are described in the sections below. The definitive resource for information about RISC-V supervisor interrupts is The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
6.4.1 Delegation Registers (mideleg and medeleg)
By default, all traps are handled in machine mode. Machine mode software can selectively delegate interrupts and exceptions to supervisor mode by setting the corresponding bits in mideleg and medeleg CSRs. The exact mapping is provided in Table 14 and Table 15 and matches the mcause interrupt and exception codes defined in Table 13.
Note that local interrupts may be delegated to supervisor mode.
Machine Interrupt Delegation Register (mideleg)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WARL
1         SSIP        RW     Delegate Supervisor Software Interrupt
[4:2]     Reserved    WARL
5         STIP        RW     Delegate Supervisor Timer Interrupt
[8:6]     Reserved    WARL
9         SEIP        RW     Delegate Supervisor External Interrupt
[63:10]   Reserved    WARL
Table 14: mideleg Register
Machine Exception Delegation Register (medeleg)
CSR Bits  Attr.  Description
0         RW     Delegate Instruction Address Misaligned Exception
1         RW     Delegate Instruction Access Fault Exception
2         RW     Delegate Illegal Instruction Exception
3         RW     Delegate Breakpoint Exception
4         RW     Delegate Load Address Misaligned Exception
5         RW     Delegate Load Access Fault Exception
6         RW     Delegate Store/AMO Address Misaligned Exception
7         RW     Delegate Store/AMO Access Fault Exception
8         RW     Delegate Environment Call from U-Mode
9         RW     Delegate Environment Call from S-Mode
[11:10]   WARL   Reserved
12        RW     Delegate Instruction Page Fault
13        RW     Delegate Load Page Fault
14        WARL   Reserved
15        RW     Delegate Store/AMO Page Fault Exception
[63:16]   WARL   Reserved
Table 15: medeleg Register
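The sketch below, which assumes M-mode and GCC-style inline assembly, delegates the three supervisor interrupts of Table 14 so that they trap directly to S-mode once enabled there.

```c
#define MIDELEG_SSIP (1UL << 1)   /* supervisor software interrupt, Table 14 */
#define MIDELEG_STIP (1UL << 5)   /* supervisor timer interrupt              */
#define MIDELEG_SEIP (1UL << 9)   /* supervisor external interrupt           */

static inline void delegate_supervisor_interrupts(void)
{
    unsigned long bits = MIDELEG_SSIP | MIDELEG_STIP | MIDELEG_SEIP;
    __asm__ volatile ("csrs mideleg, %0" :: "r"(bits));
}
```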
6.4.2 Supervisor Status Register (sstatus)
Similar to machine mode, supervisor mode has a register dedicated to keeping track of the hart's current state, called sstatus. sstatus is effectively a restricted view of mstatus, described in Section 6.3.1: changes made to sstatus are reflected in mstatus and vice versa, with the exception of the machine mode fields, which are not visible in sstatus.
A summary of the sstatus fields related to interrupts in the U54-MC is provided in Table 16. Note that this is not a complete description of sstatus, as it also contains fields unrelated to interrupts. For the full description of sstatus, consult The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
Supervisor Status Register (sstatus)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WPRI
1         SIE         RW     Supervisor Interrupt Enable
[4:2]     Reserved    WPRI
5         SPIE        RW     Supervisor Previous Interrupt Enable
[7:6]     Reserved    WPRI
8         SPP         RW     Supervisor Previous Privilege Mode
[12:9]    Reserved    WPRI
Table 16: U54-MC sstatus Register (partial)
Interrupts are enabled by setting the SIE bit in sstatus and by enabling the desired individual interrupt in the sie register, described in Section 6.4.3.
6.4.3 Supervisor Interrupt Enable Register (sie)
Supervisor interrupts are enabled by setting the appropriate bit in the sie register. The U54-MC sie register is described in Table 17.
Supervisor Interrupt Enable Register (sie)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WPRI
1         SSIE        RW     Supervisor Software Interrupt Enable
[4:2]     Reserved    WPRI
5         STIE        RW     Supervisor Timer Interrupt Enable
[8:6]     Reserved    WPRI
9         SEIE        RW     Supervisor External Interrupt Enable
[63:10]   Reserved    WPRI
Table 17: sie Register
6.4.4 Supervisor Interrupt Pending (sip)
The supervisor interrupt pending (sip) register indicates which interrupts are currently pending. The U54-MC sip register is described in Table 18.
Supervisor Interrupt Pending Register (sip)
CSR Bits  Field Name  Attr.  Description
0         Reserved    WIRI
1         SSIP        RW     Supervisor Software Interrupt Pending
[4:2]     Reserved    WIRI
5         STIP        RW     Supervisor Timer Interrupt Pending
[8:6]     Reserved    WIRI
9         SEIP        RW     Supervisor External Interrupt Pending
[63:10]   Reserved    WIRI
Table 18: sip Register
6.4.5 Supervisor Cause Register (scause)
When a trap is taken in supervisor mode, scause is written with a code indicating the event that caused the trap. When the event that caused the trap is an interrupt, the most-significant bit of scause is set to 1, and the least-significant bits indicate the interrupt number, using the same encoding as the bit positions in sip. For example, a Supervisor Timer Interrupt causes scause to be set to 0x8000_0000_0000_0005.
scause is also used to indicate the cause of synchronous exceptions, in which case the most-significant bit of scause is set to 0. Refer to Table 20 for a list of synchronous exception codes.
Supervisor Cause Register (scause)
CSR Bits  Field Name                Attr.  Description
[62:0]    Exception Code (EXCCODE)  WLRL   A code identifying the last exception.
63        Interrupt                 WARL   1 if the trap was caused by an interrupt; 0 otherwise.
Table 19: scause Register
Supervisor Interrupt Exception Codes
Interrupt  Exception Code  Description
1          0               Reserved
1          1               Supervisor software interrupt
1          2–4             Reserved
1          5               Supervisor timer interrupt
1          6–8             Reserved
1          9               Supervisor external interrupt
1          10              Reserved
0          0               Instruction address misaligned
0          1               Instruction access fault
0          2               Illegal instruction
0          3               Breakpoint
0          4               Reserved
0          5               Load access fault
0          6               Store/AMO address misaligned
0          7               Store/AMO access fault
0          8               Environment call from U-mode
0          9–11            Reserved
0          12              Instruction page fault
0          13              Load page fault
0          14              Reserved
0          15              Store/AMO page fault
0          16              Reserved
Table 20: scause Exception Codes
6.4.6 Supervisor Trap Vector (stvec)
By default, all interrupts trap to a single address defined in the stvec register. It is up to the interrupt handler to read scause and react accordingly. RISC-V and the U54-MC also support the ability to optionally enable interrupt vectors. When vectoring is enabled, each interrupt defined in sie will trap to its own specific interrupt handler.
Vectored interrupts are enabled when the MODE field of the stvec register is set to 1.
Supervisor Trap Vector Register (stvec)
CSR Bits  Field Name  Attr.  Description
[1:0]     MODE        WARL   MODE determines whether or not interrupt vectoring is enabled. The encoding for the MODE field is described in Table 22.
[63:2]    BASE[63:2]  WARL   Interrupt Vector Base Address. Must be aligned on a 128-byte boundary when MODE=1. Note that BASE[1:0] is not present in this register and is implicitly 0.
Table 21: stvec Register
MODE Field Encoding stvec.MODE
Value  Name      Description
0      Direct    All exceptions set pc to BASE.
1      Vectored  Asynchronous interrupts set pc to BASE + 4 × scause.EXCCODE.
2      Reserved
Table 22: Encoding of stvec.MODE
If vectored interrupts are disabled (stvec.MODE=0), all interrupts trap to the stvec.BASE address. If vectored interrupts are enabled (stvec.MODE=1), interrupts set the pc to stvec.BASE + 4 × exception code (scause.EXCCODE). For example, if a supervisor timer interrupt is taken, the pc is set to stvec.BASE + 0x14. Typically, the trap vector table is populated with jump instructions to transfer control to interrupt-specific trap handlers.
In vectored interrupt mode, BASE must be 128-byte aligned.
All supervisor external interrupts (global interrupts) are mapped to exception code 9. Thus, when interrupt vectoring is enabled, the pc is set to address stvec.BASE + 0x24 for any global interrupt.
See Table 21 for a description of the stvec register. See Table 22 for a description of the stvec.MODE field. See Table 20 for the U54-MC supervisor mode interrupt exception code values.
6.4.7 Delegated Interrupt Handling
Upon taking a delegated trap, the following occurs:
• The value of sstatus.SIE is copied into sstatus.SPIE, and then sstatus.SIE is cleared, effectively disabling interrupts.
• The current pc is copied into the sepc register, and then pc is set to the value of stvec. In the case where vectored interrupts are enabled, pc is set to stvec.BASE + 4 × exception code (scause.EXCCODE).
• The privilege mode prior to the interrupt is encoded in sstatus.SPP.
At this point, control is handed over to software in the interrupt handler with interrupts disabled. Interrupts can be re-enabled by explicitly setting sstatus.SIE or by executing an SRET instruction to exit the handler. When an SRET instruction is executed, the following occurs:
• The privilege mode is set to the value encoded in sstatus.SPP.
• The value of sstatus.SPIE is copied into sstatus.SIE.
• The pc is set to the value of sepc.
At this point, control is handed over to software.
6.5 Interrupt Priorities
Individual priorities of global interrupts are determined by the PLIC, as discussed in Chapter 10.
U54-MC interrupts are prioritized as follows, in decreasing order of priority:
• Machine external interrupts
• Machine software interrupts
• Machine timer interrupts
• Supervisor external interrupts
• Supervisor software interrupts
• Supervisor timer interrupts
6.6 Interrupt Latency
Interrupt latency for the U54-MC is 4 cycles, as counted by the number of cycles it takes from signaling of the interrupt to the hart to the first instruction fetch of the handler.
Global interrupts routed through the PLIC incur an additional latency of 3 cycles, where the PLIC is clocked by clock. This means that the total latency, in cycles, for a global interrupt is 4 + 3 × (core_clock_0 Hz ÷ clock Hz). This is a best-case cycle count and assumes the handler is cached or located in the ITIM. It does not take into account additional latency from a peripheral source.
Chapter 7
Bus-Error Unit
This chapter describes the operation of the SiFive Bus-Error Unit.
7.1 Bus-Error Unit Overview
The Bus-Error Unit (BEU) is a per-processor device that records erroneous events and reports them using platform-level and hart-local interrupts. The BEU can be configured to generate interrupts on correctable memory errors, uncorrectable memory errors, and/or TileLink bus errors.
7.2 Reportable Errors
Table 23 lists the events that a Bus-Error Unit may report.
Cause  Meaning
0      No error
1      Reserved
2      Instruction cache or ITIM correctable ECC error
3      ITIM uncorrectable ECC error
4      Reserved
5      Load or store TileLink bus error
6      Data cache correctable ECC error
7      Data cache uncorrectable ECC error
Table 23: enable Register Description
7.3 Functional Behavior
When one of the events listed in Table 23 occurs, the Bus-Error Unit can record information about the event and can generate an interrupt to the PLIC or locally to the hart. The enable register contains a mask of which events the BEU can record. Each bit in enable corresponds to an event in Table 23; for example, if enable[3] is set, the BEU will record uncorrectable ITIM errors.
The cause register indicates the event the BEU has most recently recorded, e.g., a value of 3 indicates an uncorrectable ITIM error. The cause value 0 is reserved to indicate no error. cause is only written for events enabled in the enable register. Furthermore, cause is only written when its current value is 0; that is, if multiple events occur, only the first one is latched, until software clears the cause register.
The value register supplies the physical address that caused the event, or 0 if the address is unknown. The BEU writes the value register whenever it writes the cause register: i.e., when an event enabled in the enable register occurs, and when cause contains 0.
The accrued register indicates which events have occurred since the last time it was cleared by software. Its format is the same as the enable register. The BEU sets bits in the accrued register whether or not they are enabled in the enable register.
The plic_interrupt register indicates which accrued events should generate an interrupt to the PLIC. An interrupt is generated when any bit is set in both accrued and plic_interrupt, i.e., when (accrued & plic_interrupt) != 0.
The local_interrupt register indicates which accrued events should generate an interrupt directly to the hart associated with this bus-error unit. An interrupt is generated when any bit is set in both accrued and local_interrupt, i.e., when (accrued & local_interrupt) != 0. The interrupt cause is 128; it does not have a bit in the mie CSR, so it is always enabled; nor does it have a bit in the mideleg CSR, so it cannot be delegated to a mode less privileged than M-mode.
7.4 Memory Map
The Bus-Error Unit memory map is shown in Table 24.
Offset  Name             Description
0x00    cause            Cause of error event
0x08    value            Physical address of error event
0x10    enable           Event enable mask
0x18    plic_interrupt   Platform-level interrupt enable mask
0x20    accrued          Accrued event mask
0x28    local_interrupt  Hart-local interrupt-enable mask
Table 24: Bus-Error Unit Memory Map
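A minimal sketch of driving the BEU through this memory map is shown below. The register layout follows Table 24 (64-bit registers are assumed from the 8-byte offsets), and BEU_BASE is a hypothetical base address that must be taken from the platform memory map.

```c
#include <stdint.h>

struct beu_regs {
    volatile uint64_t cause;           /* 0x00: cause of error event            */
    volatile uint64_t value;           /* 0x08: physical address of error event */
    volatile uint64_t enable;          /* 0x10: event enable mask               */
    volatile uint64_t plic_interrupt;  /* 0x18: platform-level interrupt mask   */
    volatile uint64_t accrued;         /* 0x20: accrued event mask              */
    volatile uint64_t local_interrupt; /* 0x28: hart-local interrupt mask       */
};

#define BEU_BASE ((struct beu_regs *)0x01700000UL)  /* hypothetical base address */

static inline void beu_enable_dcache_ecc_reporting(void)
{
    BEU_BASE->enable         = (1UL << 6) | (1UL << 7);  /* events 6 and 7, Table 23 */
    BEU_BASE->plic_interrupt = (1UL << 7);               /* uncorrectable -> PLIC    */
    BEU_BASE->cause          = 0;                        /* re-arm event latching    */
}
```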
Chapter 8
Core-Local Interruptor (CLINT)
The CLINT block holds memory-mapped control and status registers associated with software and timer interrupts. The U54-MC CLINT complies with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10.
8.1 CLINT Memory Map
Table 25 shows the memory map for CLINT on SiFive U54-MC.
Address                    Width  Attr.  Description          Notes
0x0200_0000                4B     RW     msip for hart 0      MSIP Registers (1 bit wide)
0x0200_0004                4B     RW     msip for hart 1
0x0200_0008                4B     RW     msip for hart 2
0x0200_000C                4B     RW     msip for hart 3
0x0200_0010                4B     RW     msip for hart 4
0x0200_0014 – 0x0200_3FFF                Reserved
0x0200_4000                8B     RW     mtimecmp for hart 0  MTIMECMP Registers
0x0200_4008                8B     RW     mtimecmp for hart 1
0x0200_4010                8B     RW     mtimecmp for hart 2
0x0200_4018                8B     RW     mtimecmp for hart 3
0x0200_4020                8B     RW     mtimecmp for hart 4
0x0200_4028 – 0x0200_BFF7                Reserved
0x0200_BFF8                8B     RW     mtime                Timer Register
0x0200_C000                              Reserved
Table 25: CLINT Register Map
8.2 MSIP Registers
Machine-mode software interrupts are generated by writing to the memory-mapped control register msip. Each msip register is a 32-bit wide WARL register where the upper 31 bits are tied to 0. The least significant bit is reflected in the MSIP bit of the mip CSR. Other bits in the msip registers are hardwired to zero. On reset, each msip register is cleared to zero.
Software interrupts are most useful for interprocessor communication in multi-hart systems, as harts may write each other's msip bits to effect interprocessor interrupts.
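A minimal sketch of such an interprocessor interrupt is shown below, using the msip addresses from Table 25 (0x0200_0000 + 4 × hartid); the function name is illustrative.

```c
#include <stdint.h>

static inline void send_software_interrupt(unsigned int hartid)
{
    volatile uint32_t *msip = (volatile uint32_t *)(0x02000000UL + 4UL * hartid);
    *msip = 1;  /* sets mip.MSIP on the target hart; the handler clears it by writing 0 */
}
```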
8.3 Timer Registers
mtime is a 64-bit read-write register that contains the number of cycles counted from the rtc_toggle signal described in the U54-MC User Guide. A timer interrupt is pending whenever mtime is greater than or equal to the value in the mtimecmp register. The timer interrupt is reflected in the mtip bit of the mip register described in Chapter 6.
On reset, mtime is cleared to zero. The mtimecmp registers are not reset.
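A minimal sketch of scheduling a timer interrupt for hart 0 is shown below, using the mtime and mtimecmp addresses from Table 25; the function name is illustrative.

```c
#include <stdint.h>

#define CLINT_MTIME     ((volatile uint64_t *)0x0200BFF8UL)  /* Table 25 */
#define CLINT_MTIMECMP0 ((volatile uint64_t *)0x02004000UL)  /* hart 0   */

static inline void schedule_timer_interrupt(uint64_t ticks)
{
    /* Writing a compare value above the current mtime clears the pending
     * timer interrupt until mtime catches up to the new value. */
    *CLINT_MTIMECMP0 = *CLINT_MTIME + ticks;
}
```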
8.4 Supervisor Mode Delegation
By default, all interrupts trap to machine mode, including timer and software interrupts. In order for supervisor timer and software interrupts to trap directly to supervisor mode, supervisor timer and software interrupts must first be delegated to supervisor mode.
Please see Section 6.4 for more details on supervisor mode interrupts.
Chapter 9
Level 2 Cache Controller
This chapter describes the functionality of the Level 2 Cache Controller used in the U54-MC.
9.1 Level 2 Cache Controller Overview
The SiFive Level 2 Cache Controller is used to provide access to fast copies of memory for masters in a Core Complex. The Level 2 Cache Controller also acts as a directory-based coherency manager. The SiFive Level 2 Cache Controller offers extensive flexibility as it allows for several features in addition to the Level 2 Cache functionality. These include memory-mapped access to L2 Cache RAM for disabled cache ways, scratchpad functionality, way masking and locking, ECC support with error tracking statistics, error injection, and interrupt signaling capabilities. These features are described in Section 9.2.
9.2 Functional Description
The U54-MC L2 Cache Controller is configured into 4 banks. Each bank contains 512 sets of 16 ways, and each way contains a 64-byte block. This subdivision into banks increases the available bandwidth between CPU masters and the L2 Cache, as each bank has its own dedicated 64-bit TL-C inner port; multiple requests to different banks may therefore proceed in parallel. The outer Memory Port of the L2 Cache Controller is a 128-bit TL-C port shared among all banks and is typically connected to cacheable memory such as a DDR controller. The overall organization of the L2 Cache Controller is depicted in Figure 3.
Figure 3: Organization of the SiFive L2 Cache Controller
9.2.1 Way Enable and the L2 Loosely-Integrated Memory (L2 LIM)
Similar to the ITIM discussed in Chapter 4, the SiFive Level 2 Cache Controller allows for its SRAMs to act either as direct addressed memory in the Core Complex address space or as a cache that is controlled by the L2 Cache Controller and which can contain a copy of any cacheable address.
When cache ways are disabled, they are addressable in the L2 Loosely-Integrated Memory (L2 LIM) address space as described in the U54-MC memory map in Chapter 5. Fetching instructions or data from the L2 LIM provides deterministic behavior equivalent to an L2 cache hit, with no possibility of a cache miss. Accesses to the L2 LIM are always given priority over cache way accesses that target the same L2 cache bank.
Out of reset, all ways, except for way 0, are disabled. Cache ways can be enabled by writing to the WayEnable register described in Section 9.4.2. Once a cache way is enabled, it cannot be disabled unless the U54-MC is reset. The highest numbered L2 Cache Way is mapped to the lowest L2 LIM address space, and way 1 occupies the highest L2 LIM address range. As L2 cache ways are enabled, the size of the L2 LIM address space shrinks. The mapping of L2 cache ways to L2 LIM address space is shown in Figure 4.
Figure 4: Mapping of L2 Cache Ways to L2 LIM Addresses
9.2.2 Way Masking and Locking
The SiFive L2 Cache Controller can control the amount of cache memory a CPU master is able to allocate into by using the WayMaskX register described in Section 9.4.12. Note that WayMaskX registers only affect allocations, and reads can still occur to ways that are masked. As such, it becomes possible to lock down specific cache ways by masking them in all WayMaskX registers. In this scenario, all masters can still read data in the locked cache ways but cannot evict data.
9.2.3 L2 Zero Device
The SiFive L2 Cache Controller has a dedicated scratchpad address region that allows for allocation into the cache using an address range which is not memory backed. This address region is denoted as the L2 Zero Device in the Memory Map in Chapter 5. Writes to the scratchpad region allocate into cache ways that are enabled and not masked. Care must be taken with the scratchpad, however, as there is no memory backing this address space. Cache evictions from addresses in the scratchpad result in data loss.
The main advantage of the L2 Zero Device over the L2 LIM is that it is a cacheable region allowing for data stored to the scratchpad to also be cached in a master's L1 data cache resulting in faster access.
The recommended procedure for using the L2 Zero Device is as follows:
1. Use the WayEnable register to enable the desired cache ways.
2. Designate a single master that will allocate into the scratchpad. For this procedure, we designate this master as Master S. All other masters (CPU and non-CPU) are denoted as Masters X.
3. Masters X: Write to the WayMaskX register to mask the ways that are to be used for the scratchpad. This prevents Masters X from evicting cache lines in the designated scratchpad ways.
4. Master S: Write to the WayMaskX register to mask all ways except the ways that are to be used for the scratchpad. At this point, Master S should only be able to allocate into the cache ways meant to be used as a scratchpad.
5. Master S: Write scratchpad data into the L2 Zero Device address range.
6. Master S: Repeat steps 4 and 5 for each way to be used as scratchpad.
7. Master S: Use the WayMaskX register to mask the scratchpad ways for Master S so that it is no longer able to evict cache lines from the designated scratchpad ways.
8. At this point, the scratchpad ways should contain the scratchpad data, with all masters able to read, write, and execute from this address space, and no masters able to evict the scratchpad contents.
9.2.4 Error Correcting Codes (ECC)
The SiFive Level 2 Cache Controller supports ECC. ECC is applied to both categories of SRAM used: the data SRAMs and the metadata SRAMs (index, tag, and directory information). Both categories use a Single-Error Correcting, Double-Error Detecting (SECDED) code.
Whenever a correctable error is detected, the cache immediately repairs the corrupted bit and writes it back to SRAM. This corrective procedure is completely invisible to application software. However, to support diagnostics, the cache records the address of the most recently corrected metadata and data errors. Whenever a new error is corrected, a counter is increased and an interrupt is raised. There are independent addresses, counters, and interrupts for correctable metadata and data errors.
The DirError, DirFail, DataError, and DataFail signals indicate a corrected L2 metadata error, an uncorrectable L2 metadata error, a corrected L2 data error, and an uncorrectable L2 data error, respectively. These signals are connected to the PLIC as described in Chapter 10 and are cleared upon reading their respective count registers.
9.3 Memory Map
The L2 Cache Controller memory map is shown in Table 26.
Offset  Name             Description
0x000   Config           Information about the cache configuration
0x008   WayEnable        The index of the largest way which has been enabled; may only be increased
0x040   ECCInjectError   Inject an ECC error
0x100   DirECCFixLow     Low 32 bits of the most recently corrected metadata address
0x104   DirECCFixHigh    High 32 bits of the most recently corrected metadata address
0x108   DirECCFixCount   Number of corrected metadata ECC errors
0x120   DirECCFailLow    Low 32 bits of the most recent uncorrected metadata address
0x124   DirECCFailHigh   High 32 bits of the most recent uncorrected metadata address
0x128   DirECCFailCount  Number of uncorrected metadata ECC errors
0x140   DatECCFixLow     Low 32 bits of the most recently corrected data address
0x144   DatECCFixHigh    High 32 bits of the most recently corrected data address
0x148   DatECCFixCount   Number of corrected data ECC errors
0x160   DatECCFailLow    Low 32 bits of the most recent uncorrected data address
0x164   DatECCFailHigh   High 32 bits of the most recent uncorrected data address
0x168   DatECCFailCount  Number of uncorrected data ECC errors
0x200   Flush64          Flush the cache block containing the physical address equal to the 64-bit written data
0x240   Flush32          Flush the cache block containing the physical address equal to the 32-bit written data << 4
0x800   WayMask0         Master 0 way mask register
0x808   WayMask1         Master 1 way mask register
0x810   WayMask2         Master 2 way mask register
0x818   WayMask3         Master 3 way mask register
0x820   WayMask4         Master 4 way mask register
0x828   WayMask5         Master 5 way mask register
0x830   WayMask6         Master 6 way mask register
0x838   WayMask7         Master 7 way mask register
0x840   WayMask8         Master 8 way mask register
0x848   WayMask9         Master 9 way mask register
0x850   WayMask10        Master 10 way mask register
0x858   WayMask11        Master 11 way mask register
0x860   WayMask12        Master 12 way mask register
0x868   WayMask13        Master 13 way mask register
0x870   WayMask14        Master 14 way mask register
Table 26: Register offsets within the L2 Cache Controller Control Memory Map
9.4 Register Descriptions
This section describes the functionality of the memory-mapped registers in the Level 2 Cache Controller.
9.4.1 Cache Configuration Register (Config)
The Config Register can be used to programmatically determine information regarding the cache size and organization.
Config Register
Register Offset: 0x0
Bits      Field Name    Attr.  Rst.  Description
[7:0]     Banks         RO     0x4   Number of banks in the cache
[15:8]    Ways          RO     0x10  Number of ways per bank
[23:16]   lgSets        RO     0x9   Base-2 logarithm of the sets per bank
[31:24]   lgBlockBytes  RO     0x6   Base-2 logarithm of the bytes per cache block
Table 27: Information about the Cache Configuration
9.4.2 Way Enable Register (WayEnable)
The WayEnable register determines which ways of the Level 2 Cache Controller are enabled as cache. Cache ways that are not enabled are mapped into the U54-MC's L2 LIM (Loosely-Integrated Memory) as described in the memory map in Chapter 5.
This register is initialized to 0 on reset and may only be increased. This means that, out of reset, only a single L2 cache way is enabled, as one cache way must always remain enabled. Once a cache way is enabled, the only way to map it back into the L2 LIM address space is by a reset.
WayEnable Register
Register Offset: 0x8
Bits   Field Name  Attr.  Rst.  Description
[7:0]  WayEnable   RW     0x0   The index of the largest way which has been enabled. May only be increased.
Table 28: The index of the largest way which has been enabled. May only be increased.
9.4.3 ECC Error Injection Register (ECCInjectError)
The ECCInjectError register can be used to insert an ECC error into either the backing data or metadata SRAM. This function can be used to test error correction logic, measurement, and recovery.
ECCInjectError Register
Register Offset: 0x40
Bits     Field Name     Attr.  Rst.  Description
[7:0]    ECCToggleBit   RW     0x0   Toggle (corrupt) this bit index on the next cache operation
[15:8]   Reserved
16       ECCToggleType  RW     0x0   Toggle (corrupt) a bit in 0=data or 1=directory
[31:17]  Reserved
Table 29: Inject an ECC Error
9.4.4 ECC Directory Fix Address (DirECCFix*)
The DirECCFixHigh and DirECCFixLow registers are read-only registers that contain the address of the most recently corrected metadata error. This field supplies only the portions of the address that correspond to the affected set and bank, since all ways are corrected together.
9.4.5 ECC Directory Fix Count (DirECCFixCount)
The DirECCFixCount register is a read-only register that contains the number of corrected L2 metadata errors.
Reading this register clears the DirError interrupt signal described in Section 9.2.4.
9.4.6 ECC Directory Fail Address (DirECCFail*)
The DirECCFailLow and DirECCFailHigh registers are read-only registers that contain the address of the most recent uncorrected L2 metadata error.
9.4.7 ECC Data Fix Address (DatECCFix*)
The DatECCFixLow and DatECCFixHigh registers are read-only registers that contain the address of the most recently corrected L2 data error.
9.4.8 ECC Data Fix Count (DatECCFixCount)
The DatECCFixCount register is a read-only register that contains the number of corrected data errors.
Reading this register clears the DataError interrupt signal described in Section 9.2.4.
9.4.9 ECC Data Fail Address (DatECCFail*)
The DatECCFailLow and DatECCFailHigh registers are read-only registers that contain the address of the most recent uncorrected L2 data error.
9.4.10 ECC Data Fail Count (DatECCFailCount)
The DatECCFailCount register is a read-only register that contains the number of uncorrected data errors.
Reading this register clears the DataFail interrupt signal described in Section 9.2.4.
9.4.11 Cache Flush Registers (Flush*)
The U54-MC L2 Cache Controller provides two registers that can be used for flushing specific cache blocks.
Flush64 is a 64-bit write-only register that flushes the cache block containing the physical address written to it. Flush32 is a 32-bit write-only register that flushes the cache block containing the physical address formed by shifting the written 32-bit value left by 4 bits. In both registers, all bits must be written in a single access for the flush to take effect.
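A minimal sketch of a block flush through Flush64 is shown below. The L2 controller base address is not given in this chapter; 0x0201_0000 is an assumed value and must be confirmed against the platform memory map described in Chapter 5.

```c
#include <stdint.h>

#define L2_CTRL_BASE 0x02010000UL                                  /* assumed base address */
#define L2_FLUSH64   ((volatile uint64_t *)(L2_CTRL_BASE + 0x200)) /* Table 26 offset      */

static inline void l2_flush_block(uint64_t paddr)
{
    *L2_FLUSH64 = paddr;  /* single full-width write; flushes the containing cache block */
}
```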
9.4.12 Way Mask Registers (WayMask*)
The WayMaskX register allows a master connected to the L2 Cache Controller to specify which L2 cache ways can be evicted by master X. Masters can still access memory cached in masked ways. The mapping between masters and their L2 master IDs is shown in Table 31.
At least one cache way must be enabled. It is recommended to set/clear bits in this register using atomic operations.
WayMask0 Register
Register Offset: 0x800
Bits  Field Name    Attr.  Rst.  Description
0 WayMask0[0] RW 0x1 Enable way 0 for Master 0
1 WayMask0[1] RW 0x1 Enable way 1 for Master 0
2 WayMask0[2] RW 0x1 Enable way 2 for Master 0
3 WayMask0[3] RW 0x1 Enable way 3 for Master 0
4 WayMask0[4] RW 0x1 Enable way 4 for Master 0
5 WayMask0[5] RW 0x1 Enable way 5 for Master 0
6 WayMask0[6] RW 0x1 Enable way 6 for Master 0
7 WayMask0[7] RW 0x1 Enable way 7 for Master 0
8 WayMask0[8] RW 0x1 Enable way 8 for Master 0
9 WayMask0[9] RW 0x1 Enable way 9 for Master 0
10 WayMask0[10] RW 0x1 Enable way 10 for Master 0
11 WayMask0[11] RW 0x1 Enable way 11 for Master 0
12 WayMask0[12] RW 0x1 Enable way 12 for Master 0
13 WayMask0[13] RW 0x1 Enable way 13 for Master 0
14 WayMask0[14] RW 0x1 Enable way 14 for Master 0
15 WayMask0[15] RW 0x1 Enable way 15 for Master 0
Table 30: Master 0 way mask register
Master ID  Description
0          Hart 0 DCache MMIO
1          Hart 0 ICache
2          Hart 1 DCache
3          Hart 1 ICache
4          Hart 2 DCache
5          Hart 2 ICache
6          Hart 3 DCache
7          Hart 3 ICache
8          Hart 4 DCache
9          Hart 4 ICache
10         Debug
11         AXI4 Front Port ID#0
12         AXI4 Front Port ID#1
13         AXI4 Front Port ID#2
14         AXI4 Front Port ID#3
Table 31: Master IDs in the L2 Cache Controller
Chapter 10
Platform-Level Interrupt Controller (PLIC)
This chapter describes the operation of the platform-level interrupt controller (PLIC) on the U54-MC. The PLIC complies with The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10 and can support a maximum of 136 external interrupt sources with 7 priority levels. The U54-MC PLIC resides in the clock timing domain, allowing for relaxed timing requirements. The latency of global interrupts, as perceived by a hart, increases with the ratio of the core_clock_0 frequency and the clock frequency.
10.1 Memory Map
The memory map for the U54-MC PLIC control registers is shown in Table 32. The PLIC memory map only supports aligned 32-bit memory accesses.
PLIC Register Map
Address       Width  Attr.  Description                             Notes
0x0C00_0000                 Reserved
0x0C00_0004   4B     RW     Source 1 priority                       See Section 10.3 for more information
...
0x0C00_0220   4B     RW     Source 136 priority                     See Section 10.3 for more information
0x0C00_0224   ...            Reserved
0x0C00_1000   4B     RO     Start of pending array                  See Section 10.4 for more information
...
0x0C00_1010   4B     RO     Last word of pending array              See Section 10.4 for more information
0x0C00_1014   ...            Reserved
0x0C00_2000   4B     RW     Start Hart 0 M-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2010   4B     RW     End Hart 0 M-Mode interrupt enables
0x0C00_2014   ...            Reserved
0x0C00_2080   4B     RW     Start Hart 1 M-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2090   4B     RW     End Hart 1 M-Mode interrupt enables
0x0C00_2094   ...            Reserved
0x0C00_2100   4B     RW     Start Hart 1 S-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2110   4B     RW     End Hart 1 S-Mode interrupt enables
0x0C00_2114   ...            Reserved
0x0C00_2180   4B     RW     Start Hart 2 M-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2190   4B     RW     End Hart 2 M-Mode interrupt enables
0x0C00_2194   ...            Reserved
0x0C00_2200   4B     RW     Start Hart 2 S-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2210   4B     RW     End Hart 2 S-Mode interrupt enables
0x0C00_2214   ...            Reserved
0x0C00_2280   4B     RW     Start Hart 3 M-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2290   4B     RW     End Hart 3 M-Mode interrupt enables
0x0C00_2294   ...            Reserved
0x0C00_2300   4B     RW     Start Hart 3 S-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2310   4B     RW     End Hart 3 S-Mode interrupt enables
0x0C00_2314   ...            Reserved
0x0C00_2380   4B     RW     Start Hart 4 M-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2390   4B     RW     End Hart 4 M-Mode interrupt enables
0x0C00_2394   ...            Reserved
0x0C00_2400   4B     RW     Start Hart 4 S-Mode interrupt enables   See Section 10.5 for more information
...
0x0C00_2410   4B     RW     End Hart 4 S-Mode interrupt enables
0x0C00_2414   ...            Reserved
0x0C20_0000   4B     RW     Hart 0 M-Mode priority threshold        See Section 10.6 for more information
0x0C20_0004   4B     RW     Hart 0 M-Mode claim/complete            See Section 10.7 for more information
0x0C20_0008   ...            Reserved
0x0C20_1000   4B     RW     Hart 1 M-Mode priority threshold        See Section 10.6 for more information
0x0C20_1004   4B     RW     Hart 1 M-Mode claim/complete            See Section 10.7 for more information
0x0C20_1008   ...            Reserved
0x0C20_2000   4B     RW     Hart 1 S-Mode priority threshold        See Section 10.6 for more information
0x0C20_2004   4B     RW     Hart 1 S-Mode claim/complete            See Section 10.7 for more information
0x0C20_2008   ...            Reserved
0x0C20_3000   4B     RW     Hart 2 M-Mode priority threshold        See Section 10.6 for more information
0x0C20_3004   4B     RW     Hart 2 M-Mode claim/complete            See Section 10.7 for more information
0x0C20_3008   ...            Reserved
0x0C20_4000   4B     RW     Hart 2 S-Mode priority threshold        See Section 10.6 for more information
0x0C20_4004   4B     RW     Hart 2 S-Mode claim/complete            See Section 10.7 for more information
0x0C20_4008   ...            Reserved
0x0C20_5000   4B     RW     Hart 3 M-Mode priority threshold        See Section 10.6 for more information
0x0C20_5004   4B     RW     Hart 3 M-Mode claim/complete            See Section 10.7 for more information
0x0C20_5008   ...            Reserved
0x0C20_6000   4B     RW     Hart 3 S-Mode priority threshold        See Section 10.6 for more information
0x0C20_6004   4B     RW     Hart 3 S-Mode claim/complete            See Section 10.7 for more information
0x0C20_6008   ...            Reserved
0x0C20_7000   4B     RW     Hart 4 M-Mode priority threshold        See Section 10.6 for more information
0x0C20_7004   4B     RW     Hart 4 M-Mode claim/complete            See Section 10.7 for more information
0x0C20_7008   ...            Reserved
0x0C20_8000   4B     RW     Hart 4 S-Mode priority threshold        See Section 10.6 for more information
0x0C20_8004   4B     RW     Hart 4 S-Mode claim/complete            See Section 10.7 for more information
0x0C20_8008   ...            Reserved
0x1000_0000                 End of PLIC memory map
Table 32: PLIC Register Map
10.2 Interrupt Sources
The U54-MC has 136 interrupt sources. 127 of these are external global interrupts. The remainder are driven by various on-chip devices as listed in Table 33. These signals are positive-level triggered.
In the PLIC, as specified in The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.10, Global Interrupt ID 0 is defined to mean "no interrupt"; hence, global_interrupts[0] corresponds to PLIC Interrupt ID 1.
See the U54-MC User Guide for a description of global interrupts.
Source Start  Source End  Source
1             127         External Global Interrupts
128           131         L2 Cache
132           132         Bus Error Unit
133           133         Bus Error Unit
134           134         Bus Error Unit
135           135         Bus Error Unit
136           136         Bus Error Unit
Table 33: PLIC Interrupt Source Mapping
10.3 Interrupt Priorities
Each PLIC interrupt source can be assigned a priority by writing to its 32-bit memory-mapped priority register. The U54-MC supports 7 levels of priority. A priority value of 0 is reserved to mean "never interrupt" and effectively disables the interrupt. Priority 1 is the lowest active priority, and priority 7 is the highest. Ties between global interrupts of the same priority are broken by the Interrupt ID; interrupts with the lowest ID have the highest effective priority. See Table 34 for the detailed register description.
PLIC Interrupt Priority Register (priority)
Base Address: 0x0C00_0000 + 4 × Interrupt ID
Bits    Field Name  Attr.  Rst.  Description
[2:0]   Priority    RW     X     Sets the priority for a given global interrupt.
[31:3]  Reserved    RO     0
Table 34: PLIC Interrupt Priority Register
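A minimal sketch of programming a source priority is shown below, using the per-source register address from Table 34; the function name is illustrative.

```c
#include <stdint.h>

static inline void plic_set_priority(uint32_t source_id, uint32_t priority)
{
    volatile uint32_t *reg = (volatile uint32_t *)(0x0C000000UL + 4UL * source_id);
    *reg = priority & 0x7;  /* 0 = never interrupt, 1 = lowest, 7 = highest */
}
```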
10.4 Interrupt Pending Bits
The current status of the interrupt source pending bits in the PLIC core can be read from the pending array, which is organized as 5 words of 32 bits. The pending bit for interrupt ID N is stored in bit (N mod 32) of word (N ÷ 32). As such, the U54-MC has 5 interrupt pending registers. Bit 0 of word 0, which represents the non-existent interrupt source 0, is hardwired to zero.
A pending bit in the PLIC core can be cleared by setting the associated enable bit and then performing a claim as described in Section 10.7.
PLIC Interrupt Pending Register 1 (pending1)
Base Address: 0x0C00_1000
Bits  Field Name            Attr.  Rst.  Description
0     Interrupt 0 Pending   RO     0     Non-existent global interrupt 0 is hardwired to zero
1     Interrupt 1 Pending   RO     0     Pending bit for global interrupt 1
2     Interrupt 2 Pending   RO     0     Pending bit for global interrupt 2
...
31    Interrupt 31 Pending  RO     0     Pending bit for global interrupt 31
Table 35: PLIC Interrupt Pending Register 1
PLIC Interrupt Pending Register 5 (pending5)
Base Address: 0x0C00_1010
Bits    Field Name             Attr.  Rst.  Description
0       Interrupt 128 Pending  RO     0     Pending bit for global interrupt 128
...
8       Interrupt 136 Pending  RO     0     Pending bit for global interrupt 136
[31:9]  Reserved               WIRI   X
Table 36: PLIC Interrupt Pending Register 5
10.5 Interrupt Enables
Each global interrupt can be enabled by setting the corresponding bit in the enables registers. The enables registers are accessed as a contiguous array of 5 × 32-bit words, packed the same way as the pending bits. Bit 0 of enable word 0 represents the non-existent interrupt ID 0 and is hardwired to 0.
64-bit and 32-bit word accesses are supported by the enables array in SiFive RV64 systems.
PLIC Interrupt Enable Register 1 (enable1) for Hart 0 M-Mode
Base Address: 0x0C00_2000
Bits  Field Name           Attr.  Rst.  Description
0     Interrupt 0 Enable   RO     0     Non-existent global interrupt 0 is hardwired to zero
1     Interrupt 1 Enable   RW     X     Enable bit for global interrupt 1
2     Interrupt 2 Enable   RW     X     Enable bit for global interrupt 2
...
31    Interrupt 31 Enable  RW     X     Enable bit for global interrupt 31
Table 37: PLIC Interrupt Enable Register 1 for Hart 0 M-Mode
PLIC Interrupt Enable Register 5 (enable5) for Hart 4 S-Mode
Base Address: 0x0C00_2410
Bits    Field Name            Attr.  Rst.  Description
0       Interrupt 128 Enable  RW     X     Enable bit for global interrupt 128
...
8       Interrupt 136 Enable  RW     X     Enable bit for global interrupt 136
[31:9]  Reserved              RO     0
Table 38: PLIC Interrupt Enable Register 5 for Hart 4 S-Mode
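A minimal sketch of setting an enable bit for hart 0 M-mode is shown below, using the enables base address from Table 32 and the word/bit packing described above; the function name is illustrative.

```c
#include <stdint.h>

static inline void plic_enable_hart0_m(uint32_t source_id)
{
    volatile uint32_t *enables = (volatile uint32_t *)0x0C002000UL;  /* Table 32 */
    enables[source_id / 32] |= 1U << (source_id % 32);
}
```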
10.6 Priority Thresholds
The U54-MC supports setting of an interrupt priority threshold via the threshold register. The threshold is a WARL field, where the U54-MC supports a maximum threshold of 7.
The U54-MC masks all PLIC interrupts of a priority less than or equal to threshold. For example, a threshold value of zero permits all interrupts with non-zero priority, whereas a value of 7 masks all interrupts.
PLIC Interrupt Priority Threshold Register (threshold)
Base Address: 0x0C20_0000
Bits    Field Name  Attr.  Rst.  Description
[2:0]   Threshold   RW     X     Sets the priority threshold
[31:3]  Reserved    RO     0
Table 39: PLIC Interrupt Threshold Register
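A minimal sketch of programming the hart 0 M-mode threshold is shown below, using the address from Table 32; the function name is illustrative.

```c
#include <stdint.h>

static inline void plic_set_threshold_hart0_m(uint32_t threshold)
{
    *(volatile uint32_t *)0x0C200000UL = threshold & 0x7;  /* 0 admits all non-zero priorities */
}
```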
10.7 Interrupt Claim Process
A U54-MC hart can perform an interrupt claim by reading the claim/complete register (Table 40), which returns the ID of the highest-priority pending interrupt or zero if there is no pending interrupt. A successful claim also atomically clears the corresponding pending bit on the interrupt source.
A U54-MC hart can perform a claim at any time, even if the MEIP bit in its mip (Table 11) register is not set.
The claim operation is not affected by the setting of the priority threshold register.
10.8 Interrupt Completion
A U54-MC hart signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register (Table 40). The PLIC does not check whether the completion ID is the same as the last claim ID for that target. If the completion ID does not match an interrupt source that is currently enabled for the target, the completion is silently ignored.
PLIC Claim/Complete Register (claim) for Hart 0 M-Mode
Base Address: 0x0C20_0004
Bits    Field Name                                  Attr.  Rst.  Description
[31:0]  Interrupt Claim/Complete for Hart 0 M-Mode  RW     X     A read of zero indicates that no interrupts are pending. A non-zero read contains the ID of the highest-priority pending interrupt. A write to this register signals completion of the interrupt ID written.
Table 40: PLIC Interrupt Claim/Complete Register for Hart 0 M-Mode
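A minimal sketch of the claim/complete sequence for hart 0 M-mode is shown below; `handle_device_irq` is a hypothetical per-device dispatch function and is not part of this manual.

```c
#include <stdint.h>

#define PLIC_CLAIM_HART0_M ((volatile uint32_t *)0x0C200004UL)  /* Table 40 */

extern void handle_device_irq(uint32_t id);  /* hypothetical per-device handler */

void plic_external_interrupt(void)
{
    uint32_t id = *PLIC_CLAIM_HART0_M;   /* claim: highest-priority pending ID, 0 if none */
    if (id != 0) {
        handle_device_irq(id);
        *PLIC_CLAIM_HART0_M = id;        /* complete: write back the claimed ID */
    }
}
```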
Chapter 11
Custom Instructions
These custom instructions use the SYSTEM instruction encoding space, which is the same as custom CSR encoding space, but with funct3=0.
11.1 CFLUSH.D.L1
• Implemented as a state machine in the L1 D$, for cores with data caches.
• Only available in M-mode.
• Opcode 0xFC000073, with optional rs1 field in bits [19:15].
• When rs1 = x0, CFLUSH.D.L1 writes back and invalidates all lines in the L1 D$.
• When rs1 != x0, CFLUSH.D.L1 writes back and invalidates the L1 D$ line containing the virtual address in integer register rs1.
• If the effective privilege mode does not have write permissions to the address in rs1, then a store access or store page-fault exception is raised.
• If the address in rs1 is in an uncacheable region with write permissions, the instruction has no effect but raises no exceptions.
• Note that if the PMP scheme write-protects only part of a cache line, then using a value for rs1 in the write-protected region will cause an exception, whereas using a value for rs1 in the write-permitted region will write back the entire cache line.
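Because a stock assembler may not recognize the CFLUSH.D.L1 mnemonic, the raw encoding can be emitted directly, as in the minimal sketch below. Both helpers must run in M-mode; the 0xFC050073 encoding for the rs1 = a0 form is an assumption derived from placing x10 in bits [19:15] of opcode 0xFC000073.

```c
static inline void cflush_d_l1_all(void)
{
    /* rs1 = x0: write back and invalidate the entire L1 D$ */
    __asm__ volatile (".word 0xFC000073" ::: "memory");
}

static inline void cflush_d_l1_line(unsigned long vaddr)
{
    register unsigned long a0 __asm__("a0") = vaddr;
    /* rs1 = a0: write back and invalidate the line containing vaddr */
    __asm__ volatile (".word 0xFC050073" :: "r"(a0) : "memory");
}
```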
11.2 CDISCARD.D.L1
• Implemented as a state machine in the L1 D$, for cores with data caches.
• Only available in M-mode.
• Opcode 0xFC200073, with optional rs1 field in bits [19:15].
• When rs1 = x0, CDISCARD.D.L1 invalidates, but does not write back, all lines in the L1 D$. Dirty data within the cache is lost.
• When rs1 != x0, CDISCARD.D.L1 invalidates, but does not write back, the L1 D$ line containing the virtual address in integer register rs1. Dirty data within the cache line is lost.
• If the effective privilege mode does not have write permissions to the address in rs1, then a store access or store page-fault exception is raised.
• If the address in rs1 is in an uncacheable region with write permissions, the instruction has no effect but raises no exceptions.
• Note that if the PMP scheme write-protects only part of a cache line, then using a value for rs1 in the write-protected region will cause an exception, whereas using a value for rs1 in the write-permitted region will invalidate and discard the entire cache line.
11.3 Other Custom Instructions
Other custom instructions may be implemented, but their functionality is not documented further here and they should not be used in this version of the U54-MC.
Chapter 12
Debug
This chapter describes the operation of the SiFive debug hardware, which follows The RISC-V Debug Specification, Version 0.13. Currently, only interactive debug and hardware breakpoints are supported.
12.1 Debug CSRs
This section describes the per-hart trace and debug registers (TDRs), which are mapped into the CSR space as follows:
CSR Name  Description                        Allowed Access Modes
tselect   Trace and debug register select    D, M
tdata1    First field of selected TDR        D, M
tdata2    Second field of selected TDR       D, M
tdata3    Third field of selected TDR        D, M
dcsr      Debug control and status register  D
dpc       Debug PC                           D
dscratch  Debug scratch register             D
Table 41: Debug Control and Status Registers
The dcsr, dpc, and dscratch registers are only accessible in debug mode, while the tselect and tdata1-3 registers are accessible from either debug mode or machine mode.
12.1.1 Trace and Debug Register Select (tselect)
To support a large and variable number of TDRs for tracing and breakpoints, they are accessed through one level of indirection: the tselect register selects which bank of three tdata1-3 registers is accessed via the other three addresses.
The tselect register has the format shown below:
Trace and Debug Select Register (tselect)
CSR Bits  Field Name  Attr.  Description
[31:0]    index       WARL   Selection index of trace and debug registers
Table 42: tselect CSR
The index field is a WARL field that does not hold indices of unimplemented TDRs. Even if index can hold a given TDR index, this does not guarantee that the TDR exists; the type field of tdata1 must be inspected to determine whether the TDR exists.
12.1.2 Trace and Debug Data Registers (tdata1-3)
The tdata1-3 registers are XLEN-bit read/write registers selected from a larger underlying bank of TDR registers by the tselect register.
Trace and Debug Data Register 1 (tdata1)
CSR Bits  Field Name         Attr.  Description
[27:0]    TDR-Specific Data
[31:28]   type               RO     Type of the trace & debug register selected by tselect
Table 43: tdata1 CSR
Trace and Debug Data Registers 2 and 3 (tdata2/3)
CSR Bits  Field Name         Attr.  Description
[31:0]    TDR-Specific Data
Table 44: tdata2/3 CSRs
The high nibble of tdata1 contains a 4-bit type code that is used to identify the type of TDR selected by tselect. The currently defined types are shown below:
Type  Description
0     No such TDR register
1     Reserved
2     Address/Data Match Trigger
3     Reserved
Table 45: tdata Types
The dmode bit selects between the debug mode (dmode=1) and machine mode (dmode=0) views of the registers, where only debug mode code can access the debug mode view of the TDRs. Any attempt to read or write the tdata1-3 registers in machine mode when dmode=1 raises an illegal instruction exception.
12.1.3 Debug Control and Status Register (dcsr)
This register gives information about debug capabilities and status. Its detailed functionality is described in The RISC-V Debug Specification, Version 0.13.
12.1.4 Debug PC (dpc)
When entering debug mode, the current PC is copied here. When leaving debug mode, execution resumes at this PC.
12.1.5 Debug Scratch (dscratch)
This register is generally reserved for use by the Debug ROM in order to save registers needed by the code in the Debug ROM. The debugger may use it as described in The RISC-V Debug Specification, Version 0.13.
12.2 Breakpoints
The U54-MC supports two hardware breakpoint registers per hart, which can be flexibly shared between debug mode and machine mode.
When a breakpoint register is selected with tselect, the other CSRs access the following information for the selected breakpoint:
CSR Name  Breakpoint Alias  Description
tselect   tselect           Breakpoint selection index
tdata1    mcontrol          Breakpoint match control
tdata2    maddress          Breakpoint match address
tdata3    N/A               Reserved
Table 46: TDR CSRs when used as Breakpoints
12.2.1 Breakpoint Match Control Register (mcontrol)
Each breakpoint control register is a read/write register laid out as shown in Table 47.
Breakpoint Control Register (mcontrol)
Bits     Field Name  Attr.  Rst.  Description
0        R           WARL   X     Address match on LOAD
1        W           WARL   X     Address match on STORE
2        X           WARL   X     Address match on instruction FETCH
3        U           WARL   X     Address match on User mode
4        S           WARL   X     Address match on Supervisor mode
5        Reserved    WPRI   X     Reserved
6        M           WARL   X     Address match on Machine mode
[10:7]   match       WARL   X     Breakpoint address match type
11       chain       WARL   0     Chain adjacent conditions
[15:12]  action      WARL   0     Breakpoint action to take
[17:16]  sizelo      WARL   0     Size of the breakpoint. Always 0.
18       timing      WARL   0     Timing of the breakpoint. Always 0.
19       select      WARL   0     Perform match on address or data. Always 0.
20       Reserved    WPRI   X     Reserved
[26:21]  maskmax     RO     4     Largest supported NAPOT range
27       dmode       RW     0     Debug-only access mode
[31:28]  type        RO     2     Address/data match type. Always 2.
Table 47: Breakpoint Match Control Register (mcontrol)
The type field is a 4-bit read-only field holding the value 2 to indicate this is a breakpoint containing address match logic.
The action field is a 4-bit read-write WARL field that specifies the available actions when the address match is successful. The value 0 generates a breakpoint exception. The value 1 enters debug mode. Other actions are not implemented.
The R/W/X bits are individual WARL fields, and if set, indicate an address match should only be successful for loads/stores/instruction fetches, respectively, and all combinations of implemented bits must be supported.
The M/S/U bits are individual WARL fields, and if set, indicate that an address match should only be successful in the machine/supervisor/user modes, respectively, and all combinations of implemented bits must be supported.
The match field is a 4-bit read-write WARL field that encodes the type of address range for breakpoint address matching. Three different match settings are currently supported: exact, NAPOT, and arbitrary range. A single breakpoint register supports both exact address matches and matches with address ranges that are naturally aligned powers-of-two (NAPOT) in size. Breakpoint registers can be paired to specify arbitrary exact ranges, with the lower-numbered breakpoint register giving the byte address at the bottom of the range and the higher-numbered breakpoint register giving the address 1 byte above the breakpoint range, and using the chain bit to indicate both must match for the action to be taken.
NAPOT ranges make use of low-order bits of the associated breakpoint address register to encode the size of the range as follows:
maddress     Match type and size
a...aaaaaa   Exact 1 byte
a...aaaaa0   2-byte NAPOT range
a...aaaa01   4-byte NAPOT range
a...aaa011   8-byte NAPOT range
a...aa0111   16-byte NAPOT range
a...a01111   32-byte NAPOT range
...          ...
a01...1111   2^31-byte NAPOT range
Table 48: NAPOT Size Encoding
The maskmax field is a 6-bit read-only field that specifies the largest supported NAPOT range. The value is the logarithm base 2 of the number of bytes in the largest supported NAPOT range. A value of 0 indicates that only exact address matches are supported (1-byte range). A value of 31 corresponds to the maximum NAPOT range, which is 2^31 bytes in size. The largest range is encoded in maddress with the 30 least-significant bits set to 1, bit 30 set to 0, and bit 31 holding the only address bit considered in the address comparison.
To provide breakpoints on an exact range, two neighboring breakpoints can be combined with the chain bit. The first breakpoint can be set to match on an address using match value 2 (greater than or equal). The second breakpoint can be set to match on an address using match value 3 (less than). Setting the chain bit on the first breakpoint prevents the second breakpoint from firing unless they both match.
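A minimal sketch of programming a single NAPOT breakpoint from M-mode is shown below. Field positions follow Table 47 and the size encoding follows Table 48; the match-field value 1 for NAPOT comes from The RISC-V Debug Specification, Version 0.13, and the function name is illustrative.

```c
#include <stdint.h>

static inline void set_fetch_breakpoint_64B(uint64_t region_base)
{
    uint64_t mcontrol = (1UL << 2)    /* X: match on instruction fetch     */
                      | (1UL << 6)    /* M: match in machine mode          */
                      | (1UL << 7);   /* match = 1: NAPOT range (action=0) */
    uint64_t maddress = (region_base & ~0x3FUL) | 0x1F;  /* 64-byte NAPOT, Table 48 */

    __asm__ volatile ("csrw tselect, %0" :: "r"(0UL));       /* breakpoint 0 */
    __asm__ volatile ("csrw tdata2, %0"  :: "r"(maddress));  /* maddress     */
    __asm__ volatile ("csrw tdata1, %0"  :: "r"(mcontrol));  /* mcontrol     */
}
```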
12.2.2 Breakpoint Match Address Register (maddress)
Each breakpoint match address register is an XLEN-bit read/write register used to hold significant address bits for address matching and also the unary-encoded address masking information for NAPOT ranges.
12.2.3 Breakpoint Execution
Breakpoint traps are taken precisely. Implementations that emulate misaligned accesses in software will generate a breakpoint trap when either half of the emulated access falls within the address range. Implementations that support misaligned accesses in hardware must trap if any byte of an access falls within the matching range.
Debug-mode breakpoint traps jump to the debug trap vector without altering machine-mode registers.
Machine-mode breakpoint traps jump to the exception vector with "Breakpoint" set in the mcause register and with badaddr holding the instruction or data address that caused the trap.
12.2.4 Sharing Breakpoints Between Debug and Machine Mode
When debug mode uses a breakpoint register, it is no longer visible to machine mode (that is, its type field will read as 0). Typically, a debugger leaves the breakpoints alone until it needs them, either because a user explicitly requested one or because the user is debugging code in ROM.
12.3 Debug Memory Map
This section describes the debug module's memory map when accessed via the regular system interconnect. The debug module is only accessible to debug code running in debug mode on a hart (or via a debug transport module).
12.3.1 Debug RAM and Program Buffer (0x300–0x3FF)
The U54-MC has 16 32-bit words of program buffer for the debugger to direct a hart to execute arbitrary RISC-V code. Its location in memory can be determined by executing auipc instructions and storing the result into the program buffer.
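For example, a short sequence placed in the program buffer can capture its own program counter with auipc, after which the debugger reads the result back from the destination register. The C wrapper below is only an illustration of that technique and is not part of the debug ROM.

#include <stdint.h>

/* Illustrative sketch: when executed from the program buffer, this captures
 * the current PC, from which the buffer's base address can be derived. */
static inline uintptr_t current_pc(void)
{
    uintptr_t pc;
    __asm__ volatile ("auipc %0, 0" : "=r"(pc));
    return pc;
}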
The U54-MC has two 32-bit words of debug data RAM. Its location can be determined by reading the DMHARTINFO register as described in the RISC-V Debug Specification. This RAM space is used to pass data for the Access Register abstract command described in the RISC-V Debug Specification. The U54-MC supports only general-purpose register access when harts are halted. All other commands must be implemented by executing from the debug program buffer.
In the U54-MC, both the program buffer and debug data RAM are general-purpose RAM and are mapped contiguously in the Core Complex memory space. Therefore, additional data can be passed in the program buffer, and additional instructions can be stored in the debug data RAM.
Debuggers must not execute program buffer programs that access any debug module memory except defined program buffer and debug data addresses.
The U54-MC does not implement the DMSTATUS.anyhavereset or DMSTATUS.allhavereset bits.
12.3.2 Debug ROM (0x800–0xFFF)
This ROM region holds the debug routines on SiFive systems. The actual total size may vary between implementations.
12.3.3 Debug Flags (0x100–0x110, 0x400–0x7FF)
The flag registers in the debug module are used by the debug module to communicate with each hart. These flags are set and read by the debug ROM and should not be accessed by any program buffer code. The specific behavior of the flags is not documented further here.
12.3.4 Safe Zero Address
In the U54-MC, the debug module contains the addresses 0x0 through 0xFFF in the memory map. Memory accesses to these addresses raise access exceptions, unless the hart is in debug mode. This property allows a "safe" location for unprogrammed parts, as the default mtvec location is 0x0.
12.4 Debug Module Interface
The SiFive Debug Module (DM) conforms to the RISC-V Debug Specification, Version 0.13. A debug probe or agent connects to the Debug Module through the Debug Module Interface (DMI). The following sections describe notable specification options used in this implementation and should be read in conjunction with the RISC-V Debug Specification.
12.4.1 DM Registers
dmstatus register
dmstatus holds the DM version number and other implementation information. Most importantly, it contains status bits that indicate the current state of the selected hart(s).
dmcontrol register
A debugger performs most hart control through the dmcontrol register.
Control         Function
dmactive        This bit enables the DM and is reflected in the dmactive output signal. When dmactive=0, the clock to the DM is gated off.
ndmreset        This is a read/write bit that drives the ndreset output signal.
resethaltreq    When set, the DM will halt the hart when it emerges from reset.
hartreset       Not supported.
hartsel         This field selects the hart to operate on.
hasel           When set, additional hart(s) in the hart array mask register are selected in addition to the one selected by hartsel.

Table 49: Debug Control Register
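As an illustration of typical dmcontrol usage, the debugger-side sketch below halts hart 0 using the field positions defined in the RISC-V Debug Specification, Version 0.13. The dmi_read/dmi_write functions are hypothetical stand-ins for whatever debug transport the probe or agent provides.

#include <stdint.h>

/* Hypothetical debug-transport primitives (e.g. over a JTAG DTM); these are
 * not part of the manual and stand in for the probe library in use. */
extern void     dmi_write(uint32_t addr, uint32_t data);
extern uint32_t dmi_read(uint32_t addr);

/* DMI register addresses and fields per RISC-V Debug Spec v0.13. */
#define DM_DMCONTROL       0x10
#define DM_DMSTATUS        0x11
#define DMCONTROL_DMACTIVE (1u << 0)
#define DMCONTROL_HALTREQ  (1u << 31)
#define DMSTATUS_ALLHALTED (1u << 9)

/* Halt hart 0: activate the DM, request a halt, and poll for completion. */
static void halt_hart0(void)
{
    dmi_write(DM_DMCONTROL, DMCONTROL_DMACTIVE);                     /* enable DM      */
    dmi_write(DM_DMCONTROL, DMCONTROL_DMACTIVE | DMCONTROL_HALTREQ); /* hartsel = 0    */
    while (!(dmi_read(DM_DMSTATUS) & DMSTATUS_ALLHALTED))
        ;                                                            /* wait for halt  */
    dmi_write(DM_DMCONTROL, DMCONTROL_DMACTIVE);                     /* clear haltreq  */
}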
hawindow register
This register contains a bitmap where bit 0 corresponds to hart 0, bit 1 to hart 1, etc. Any bits set in this register select the corresponding hart in addition to the hart selected by hartsel.
12.4.2 Abstract Commands
Abstract commands provide a debugger with a path to read and write processor state. Many aspects of abstract commands are optional in the RISC-V Debug Specification and are implemented as described below.
Cmdtype          Feature          Support
Access Register  GPR registers    Access Register command, register number 0x1000–0x101f.
                 CSR registers    Not supported. CSRs are accessed using the Program Buffer.
                 FPU registers    Not supported. FPU registers are accessed using the Program Buffer.
                 Autoexec         Both autoexecprogbuf and autoexecdata are supported.
                 Post-increment   Not supported.
Quick Access                      Not supported.
Access Memory                     Not supported. Memory access is accomplished using the Program Buffer.

Table 50: Debug Abstract Commands
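As a hedged example of the supported Access Register command, the sketch below reads a general-purpose register from a halted hart by writing the command register over the DMI and then reading the data words back. The encoding follows the RISC-V Debug Specification, Version 0.13; the dmi_read/dmi_write helpers are hypothetical, as in the previous sketch, and a production debugger would also poll abstractcs for completion and command errors.

#include <stdint.h>

/* Hypothetical debug-transport primitives; see the dmcontrol sketch above. */
extern void     dmi_write(uint32_t addr, uint32_t data);
extern uint32_t dmi_read(uint32_t addr);

/* DMI addresses and Access Register command fields per Debug Spec v0.13. */
#define DM_DATA0             0x04
#define DM_DATA1             0x05
#define DM_COMMAND           0x17
#define CMD_ACCESS_REGISTER  (0u << 24)   /* cmdtype = 0                           */
#define CMD_AARSIZE_64       (3u << 20)   /* 64-bit register access                */
#define CMD_TRANSFER         (1u << 17)   /* move data between regno and data0/1   */

/* Read a general-purpose register (regno 0x1000-0x101f) from a halted hart. */
static uint64_t read_gpr(unsigned n)
{
    dmi_write(DM_COMMAND, CMD_ACCESS_REGISTER | CMD_AARSIZE_64 |
                          CMD_TRANSFER | (0x1000u + n));
    /* A real debugger would poll abstractcs.busy and check cmderr here. */
    return (uint64_t)dmi_read(DM_DATA0) | ((uint64_t)dmi_read(DM_DATA1) << 32);
}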
12.4.3 Multi-core Synchronization
The DM is configured with one Halt Group which may be programmed to synchronize execution between harts or between hart(s) and external logic such as a cross-trigger matrix. The Halt Group is configured using the dmcs2 register.
12.4.4 System Bus Access
System Bus Access (SBA) provides an alternative method to access memory. SBA operation conforms to the RISC-V Debug Spec and the description is not duplicated here. Comparing Program Buffer memory access and SBA:
Program Buffer Memory Access                   SBA Memory Access
Virtual address                                Physical address
Subject to Physical Memory Protection (PMP)    Not subject to PMP
Cache coherent                                 Cache coherent
Hart must be halted                            Hart may be halted or running

Table 51: System Bus vs. Program Buffer Comparison
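For illustration, the sketch below performs a 32-bit SBA read of a physical address using the sbcs, sbaddress0, and sbdata0 registers defined in the RISC-V Debug Specification, Version 0.13. The dmi_read/dmi_write helpers are again hypothetical, and busy/error handling via sbcs is omitted for brevity.

#include <stdint.h>

/* Hypothetical debug-transport primitives; see the dmcontrol sketch above. */
extern void     dmi_write(uint32_t addr, uint32_t data);
extern uint32_t dmi_read(uint32_t addr);

/* SBA DMI addresses and sbcs fields per Debug Spec v0.13. */
#define DM_SBCS            0x38
#define DM_SBADDRESS0      0x39
#define DM_SBDATA0         0x3c
#define SBCS_SBREADONADDR  (1u << 20)   /* start a read when the address is written */
#define SBCS_SBACCESS_32   (2u << 17)   /* 32-bit accesses                          */

/* Read a 32-bit word from a physical address without involving the hart. */
static uint32_t sba_read32(uint32_t paddr)
{
    dmi_write(DM_SBCS, SBCS_SBREADONADDR | SBCS_SBACCESS_32);
    dmi_write(DM_SBADDRESS0, paddr);     /* triggers the bus read */
    /* A real debugger would poll sbcs.sbbusy and check sbcs.sberror here. */
    return dmi_read(DM_SBDATA0);
}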
Chapter 13
Error Correction Codes (ECC)
Error correction codes (ECC) are implemented on various memories within the U54-MC, allowing for the detection and, in some cases, correction of memory errors. The following SRAM blocks on the U54-MC support ECC:
• Instruction Cache
• Data Cache
• L2 Cache
The minimal case of an ECC error is a single-bit error that is detected, reported via an interrupt, and corrected automatically by hardware without any software intervention. More difficult scenarios involve double-bit or multi-bit errors that are still reported and tracked in hardware but are not correctable. The ECC hardware includes logic for detection and correction, in addition to 7 redundant bits per 32-bit codeword or 8 redundant bits per 64-bit codeword.
13.1 ECC Configuration
All blocks with ECC support are enabled globally through the Bus Error Unit (BEU) configuration registers. The BEU is used to configure ECC reporting and enable interrupt handling via the global or local interrupt controller. The global interrupt controller is the Platform-Level Interrupt Controller (PLIC). The local interrupt controller is the Core-Local Interruptor (CLINT). The BEU registers plic_interrupt and local_interrupt are used to route the errors to the respective interrupt controller. Additionally, the BEU can be used for TileLink bus errors.
13.1.1 ECC Initialization
Any SRAM block or cache memory containing ECC functionality needs to be initialized prior to use. ECC corrects defective bits based on memory contents, so if memory is not first initialized to a known state, ECC will not operate as expected. It is recommended to use a DMA, if available, to write the entire SRAM or cache to zeros prior to enabling ECC reporting. If no DMA is present, use store instructions issued from the processor. Initializing memory with ECC from an external bus is not recommended. After initialization, the ECC-related registers can be written to zero, and then ECC reporting can be enabled. 64-bit aligned writes are recommended.
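When no DMA engine is available, the initialization can be performed from the processor with aligned stores. The C sketch below is a minimal illustration, with the base address and size supplied by the caller.

#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch: initialize an ECC-protected SRAM region with 64-bit
 * aligned stores when no DMA engine is available. The base address and size
 * are placeholders for the target memory. */
static void ecc_init_region(volatile uint64_t *base, size_t bytes)
{
    for (size_t i = 0; i < bytes / sizeof(uint64_t); i++)
        base[i] = 0;    /* establishes valid ECC check bits for each codeword */
}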
13.2 ECC Interrupt Handling and Error Injection
Single bit errors are automatically repaired by the hardware.
The plic_interrupt register indicates which accrued events should generate an interrupt to the PLIC. An interrupt is generated when any bit is set in both accrued and plic_interrupt, i.e., when (accrued & plic_interrupt) != 0.
The local_interrupt register indicates which accrued events should generate an interrupt directly to the hart associated with this Bus Error Unit. An interrupt is generated when any bit is set in both accrued and local_interrupt, i.e., when (accrued & local_interrupt) != 0.
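As a hedged sketch of this routing, the C fragment below clears any accrued events and then routes a chosen set of events to the PLIC. The register names follow this section, but the pointers are placeholders: the actual addresses must be taken from the Core Complex memory map.

#include <stdint.h>

/* Placeholder handles for the memory-mapped BEU registers; their addresses
 * come from the Core Complex memory map and are not defined here. */
extern volatile uint64_t *beu_accrued;
extern volatile uint64_t *beu_plic_interrupt;
extern volatile uint64_t *beu_local_interrupt;

/* Route the selected accrued error events to the PLIC and none to the hart. */
static void beu_route_to_plic(uint64_t event_mask)
{
    *beu_accrued         = 0;            /* clear any previously accrued events       */
    *beu_local_interrupt = 0;            /* no direct-to-hart routing                 */
    *beu_plic_interrupt  = event_mask;   /* interrupt when (accrued & event_mask) != 0 */
}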
BEU errors are always enabled and thus do not have a control bit in the mie (Machine Interrupt Enable) CSR. Likewise, there is no dedicated control bit for BEU errors in the mideleg (Machine Interrupt Delegation) CSR, so they cannot be delegated to a privilege mode lower than M-mode.
Monitoring overall ECC events can be accomplished in software via the interrupt handler.
The L2 Cache Controller contains hardware counters to track ECC events, and optionally inject ECC errors to test the software handling of ECC events. The L2 Cache Controller is further described in Chapter 9.
The exception code value is located in the mcause (Machine Trap Cause) CSR. When BEU interrupts are routed through the PLIC, the default exception code value is 11 (0xB). When ECC interrupts are routed through the CLINT, the default exception code value is 128 (0x80). These exception codes are further detailed in Section 6.3.5.
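A minimal trap-handler fragment that distinguishes the two delivery paths by exception code might look like the following sketch; the handler bodies and the follow-up BEU/PLIC handling are left as placeholders.

#include <stdint.h>

/* Exception code values follow this section; MCAUSE_INT is the standard
 * interrupt bit in mcause on RV64. */
#define MCAUSE_INT         (1ULL << 63)
#define CODE_EXT_INTERRUPT 11    /* machine external interrupt via the PLIC */
#define CODE_BEU_LOCAL     128   /* BEU local interrupt via the CLINT       */

static void handle_trap(void)
{
    uint64_t mcause;
    __asm__ volatile ("csrr %0, mcause" : "=r"(mcause));

    if (mcause & MCAUSE_INT) {
        uint64_t code = mcause & ~MCAUSE_INT;
        if (code == CODE_EXT_INTERRUPT) {
            /* claim the interrupt from the PLIC, then query the BEU */
        } else if (code == CODE_BEU_LOCAL) {
            /* BEU local interrupt routed through the CLINT */
        }
    }
}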
13.3 Hardware Operation Upon ECC Error
Hardware operates differently depending on which memory type encounters an ECC error:
• Instruction Cache: The error is corrected and the cache line is flushed.
• Data Cache: The error is corrected and the cache line is invalidated and written back to the next level of memory.
• L2 Cache: Single-bit correction for L2 data and metadata (metadata includes index, tag, and directory information). Double-bit detection only on the L2 data array.
Double bit errors are reported at the Core Complex boundary via the halt_from_tile_X signal that, if asserted, remains high until reset.
Chapter 14
References
Visit the SiFive forums for support and answers to frequently asked questions: https://forums.sifive.com

[1] A. Waterman and K. Asanovic, Eds., The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Version 2.2, June 2019. [Online]. Available: https://riscv.org/specifications/
[2] ----, The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Version 1.11, June 2019. [Online]. Available: https://riscv.org/specifications/privileged-isa
[3] ----, SiFive TileLink Specification, Version 1.8.0, August 2019. [Online]. Available: https://sifive.com/documentation/tilelink/tilelink-spec
[4] A. Chang, D. Barbier, and P. Dabbelt, RISC-V Platform-Level Interrupt Controller (PLIC) Specification. [Online]. Available: https://github.com/riscv/riscv-plic-spec