Chisel Manual
Jonathan Bachrach, Huy Vo, Krste Asanović
EECS Department, UC Berkeley
{jrb|huytbvo|krste}@eecs.berkeley.edu
April 10, 2016
1 Introduction
This document is a manual for Chisel (Constructing Hardware In a Scala
Embedded Language). Chisel is a hardware construction language embedded in
the high-level programming language Scala. A separate Chisel tutorial
document provides a gentle introduction to using Chisel, and should be read
first. This manual provides a comprehensive overview and specification of
the Chisel language, which is really only a set of special class
definitions, predefined objects, and usage conventions within Scala. When
you write a Chisel program you are actually writing a Scala program. In
this manual, we presume that you already understand the basics of Scala. If
you are unfamiliar with Scala, we recommend you consult one of the
excellent Scala books ([3], [2]).
2 Nodes
Any hardware design in Chisel is ultimately represented by a graph of node
objects. User code in Chisel generates this graph of nodes, which is then
passed to the Chisel backends to be translated into Verilog or C++ code.
Nodes are defined as follows:
class Node {
// name assigned by user or from introspection
var name: String = ""
// incoming graph edges
def inputs: ArrayBuffer[Node]
// outgoing graph edges
def consumers: ArrayBuffer[Node]
// node specific width inference
def inferWidth: Int
// get width immediately inferrable
def getWidth: Int
// get first raw node
def getRawNode: Node
// convert to raw bits
def toBits: Bits
// convert from raw bits
def fromBits(x: Bits): this.type
// return lit value if inferrable else null
def litOf: Lit
// return value of lit if litOf is non null
def litValue(default: BigInt = BigInt(-1)): BigInt
}
The uppermost levels of the node class hierarchy
are shown in Figure 1. The basic categories are:
Lit – constants or literals,
Op – logical or arithmetic operations,
Updateable – conditionally updated nodes,
Data – typed wires or ports,
Reg – positive-edge-triggered registers, and
Mem – memories.
Figure 1: Node hierarchy (Node at the root; Updateable, Lit, and Op on the
second level; Data, Reg, and Mem on the third).
3 Lits
Raw literals are represented as Lit nodes defined as follows:
class Lit extends Node {
// original value
val inputVal: BigInt
}
Raw literals contain a collection of bits. Users do
not create raw literals directly, but instead use type
constructors defined in Section 5.
4 Ops
Raw operations are represented as Op nodes defined as follows:
class Op extends Node {
// op name used during emission
val op: String
}
Ops compute a combinational function of their inputs.
5 Types
A Chisel graph representing a hardware design contains raw and type nodes.
The Chisel type system is maintained separately from the underlying Scala
type system, and so type nodes are interspersed between raw nodes to allow
Chisel to check and respond to Chisel types. Chisel type nodes are erased
before the hardware design is translated into C++ or Verilog. The
getRawNode operator, defined in the base Node class, skips type nodes and
returns the first raw node found. Figure 2 shows the built-in Chisel type
hierarchy, with Data as the topmost node.
Figure 2: Chisel type hierarchy (nodes shown: Data, Bits, Bool, Num, UInt,
SInt, Aggregate, Bundle, Vec).
Built-in scalar types include Bool, SInt, and UInt, and the built-in
aggregate types Bundle and Vec allow the user to expand the set of Chisel
datatypes with collections of other types.
Data itself is a node:
abstract class Data extends Node {
override def cloneType(): this.type =
this.getClass.newInstance.
asInstanceOf[this.type]
// simple conversions
def toSInt: SInt
def toUInt: UInt
def toBool: Bool
def toBits: Bits
// flatten out to leaves of tree
def flatten: Array[(String, Data)]
// port direction if leaf
def dir: PortDir
// change dir to OUTPUT
def asOutput:this.type
// change dir to INPUT
def asInput:this.type
// change polarity of dir
def flip:this.type
// assign to input
def :=[T<: Data](t: T)
// bulk assign to input
def <>(t: Data)
}
The Data class has methods for converting between types and for delegating
port methods to its single input. We will discuss ports in Section 10.
Finally, users can override the cloneType method in their own type nodes
(e.g., bundles) in order to reflect construction parameters that are
necessary for cloning.
Data nodes can be used for four purposes:
• types – UInt(width = 8) – record intermediate types in the graph
  specifying at minimum bitwidth (described in this section),
• wires – UInt(width = 8) – serve as forward declarations of data allowing
  future conditional updates (described in Section 6),
• ports – UInt(dir = OUTPUT, width = 8) – are specialized wires defining
  module interfaces, and additionally specify direction (described in
  Section 10), and
• literals – UInt(1) or UInt(1, 8) – can be constructed using type object
  constructors specifying their value and optional width.
5.1 Bits
In Chisel, a raw collection of bits is represented by
the Bits type defined as follows:
object Bits {
def apply(dir: PortDir = null,
width: Int = -1): Bits
// create literal from BigInt or Int
def apply(value: BigInt, width: Int = -1): Bits
// create literal from String using
// base_char digit+ string format
def apply(value: String, width: Int = -1): Bits
}
class Bits extends Data with Updateable {
// bitwise-not
def unary_~(): Bits
// bitwise-and
def & (b: Bits): Bits
// bitwise-or
def | (b: Bits): Bits
// bitwise-xor
def ^ (b: Bits): Bits
// and-reduction
def andR(): Bool
// or-reduction
def orR(): Bool
// xor-reduction
def xorR(): Bool
// logical NOT
def unary_!(): Bool
// logical AND
def && (b: Bool): Bool
// logical OR
def || (b: Bool): Bool
// equality
def ===(b: Bits): Bool
// inequality
def != (b: Bits): Bool
// logical left shift
def << (b: UInt): Bits
// logical right shift
def >> (b: UInt): Bits
// concatenate
def ## (b: Bits): Bits
// extract single bit, LSB is 0
def apply(x: Int): Bits
// extract bit field from end to start bit pos
def apply(hi: Int, lo: Int): Bits
}
def Cat[T <: Data](elt: T, elts: T*): Bits
Bits has methods for simple bit operations. Note that ## is binary
concatenation, while Cat is an n-ary concatenation. To avoid colliding with
Scala’s built-in ==, Chisel’s bitwise comparison is named ===.
A field of width n can be created from a single bit using Fill:
def Fill(n: Int, field: Bits): Bits
and two inputs can be selected using Mux:
def Mux[T <: Data](sel: Bits, cons: T, alt: T): T
Constant or literal values are expressed using
Scala integers or strings passed to constructors for
the types:
UInt(1)       // decimal 1-bit lit from Scala Int.
UInt("ha")    // hex 4-bit lit from string.
UInt("o12")   // octal 4-bit lit from string.
UInt("b1010") // binary 4-bit lit from string.
producing a Lit as shown in the leftmost subfigure of Figure 3.
Operations return an actual operator node with a type node combining the
input type nodes. See Figure 3 for successively more complicated examples.
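As a rough illustration of the "base_char digit+" string format used by the literal constructors above, the following plain-Scala sketch decodes such a string into a value and a width. LitSketch and parse are hypothetical names that are not part of Chisel, and the width rule (the fewest bits that hold the value) is an assumption chosen only because it agrees with the three string examples above.

```scala
// Hypothetical helper, not part of Chisel: decodes "base_char digit+" literals.
object LitSketch {
  // Returns (value, width) for strings such as "ha", "o12", or "b1010".
  def parse(s: String): (BigInt, Int) = {
    require(s.length >= 2, "expected a base character followed by digits")
    val radix = s.head match {
      case 'h' => 16
      case 'o' => 8
      case 'b' => 2
      case c   => throw new IllegalArgumentException(s"unknown base character: $c")
    }
    val value = BigInt(s.tail, radix)
    // Assumption: the width is the fewest bits that hold the value (at least 1).
    (value, math.max(value.bitLength, 1))
  }
}
```

Under this assumption, "ha", "o12", and "b1010" all decode to the value 10 with a width of 4 bits.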
5.2 Bools
Boolean values are represented as Bools:
object Bool {
def apply(dir: PortDir = null): Bool
// create literal
def apply(value: Boolean): Bool
}
class Bool extends UInt
Bool is equivalent to UInt(width = 1).
5.3 Nums
Num is a type node which defines arithmetic operations:
class Num extends Bits {
// Negation
def unary_-(): Bits
// Addition
def +(b: Num): Num
// Subtraction
def -(b: Num): Num
// Multiplication
def *(b: Num): Num
// Greater than
def >(b: Num): Bool
// Less than
def <(b: Num): Bool
// Less than or equal
def <=(b: Num): Bool
// Greater than or equal
def >=(b: Num): Bool
}
Signed and unsigned integers are considered subsets of fixed-point numbers
and are represented by types SInt and UInt respectively:
object SInt {
def apply (dir: PortDir = null,
width: Int = -1): SInt
// create literal
def apply (value: BigInt, width: Int = -1): SInt
def apply (value: String, width: Int = -1): SInt
}
class SInt extends Num
object UInt {
def apply(dir: PortDir = null,
width: Int = -1): UInt
// create literal
def apply(value: BigInt, width: Int = -1): UInt
def apply(value: String, width: Int = -1): UInt
}
class UInt extends Num {
// arithmetic right shift
override def >> (b: UInt): SInt
}
Figure 3: Chisel Op/Lit graphs constructed with algebraic expressions showing the insertion of type nodes (panels, left to right: a = UInt(1); b = a & UInt(2); b | UInt(3)).
Signed fixed-point numbers, including integers, are
represented using two’s-complement format.
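The two’s-complement format mentioned above can be illustrated with a small plain-Scala sketch that maps a signed value to its w-bit pattern and back. TwosComplementSketch, encode, and decode are hypothetical names used only for illustration; they are not Chisel functions.

```scala
// Hypothetical illustration of w-bit two's-complement encoding (not Chisel API).
object TwosComplementSketch {
  // Encode a signed value into its w-bit two's-complement bit pattern.
  def encode(value: BigInt, w: Int): BigInt = {
    val modulus = BigInt(1) << w
    val half = modulus >> 1
    require(value >= -half && value < half, s"$value does not fit in $w bits")
    value.mod(modulus)
  }

  // Decode a w-bit pattern back into the signed value it represents.
  def decode(bits: BigInt, w: Int): BigInt = {
    val modulus = BigInt(1) << w
    if (bits.testBit(w - 1)) bits - modulus else bits
  }
}
```

For example, with w = 8 the value -1 is encoded as the pattern 255 (all ones), and decoding 255 gives -1 again.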
5.4 Bundles
Bundles group together several named fields of potentially different types
into a coherent unit, much like a struct in C:
class Bundle extends Data {
// shallow named bundle elements
def elements: ArrayBuffer[(String, Data)]
}
The name and type of each element in a Bundle can be obtained with the
elements method, and the flatten method returns the elements at the leaves
for nested aggregates. Users can define new bundles by subclassing Bundle
as follows:
class MyFloat extends Bundle {
val sign = Bool()
val exponent = UInt(width = 8)
val significand = UInt(width = 23)
}
Elements are accessed using Scala field access:
val x = new MyFloat()
val xs = x.sign
The names given to a bundle’s elements when
they are emitted by a C++ or Verilog backend are
obtained from their bundle field names, using Scala
introspection.
5.5 Vecs
Vecs create an indexable vector of elements:
object Vec {
def apply[T <: Data](elts: Seq[T]): Vec[T]
def apply[T <: Data](elt0: T, elts: T*): Vec[T]
def fill[T <: Data](n: Int)(gen: => T): Vec[T]
def tabulate[T <: Data](n: Int)
(gen: (Int) => T): Vec[T]
def tabulate[T <: Data](n1: Int, n2: Int)
(gen: (Int, Int) => T): Vec[Vec[T]]
}
class Vec[T <: Data](n: Int, val gen: () => T)
extends Data {
def apply(idx: UInt): T
def apply(idx: Int): T
def forall(p: T => Bool): Bool
def exists(p: T => Bool): Bool
def contains[T <: Bits](x: T): Bool
def count(p: T => Bool): UInt
def indexWhere(p: T => Bool): UInt
def lastIndexWhere(p: T => Bool): UInt
}
with n elements of type defined with the gen thunk. Users can access
elements statically with an Int index or dynamically using a UInt index,
where dynamic access creates a virtual type node (representing a read
“port”) that records the read using the given address. In either case,
users can wire to the result of a read as follows:
v(a) := d
Read-only memories can be expressed as Vecs of
literals:
val rom = Vec(UInt(3), UInt(7), UInt(4), UInt(0)) {
UInt(width=3) }
val dout = rom(addr)
5.6 Bit Width Inference
Users are required to set bit widths of ports and registers, but otherwise,
bit widths on nodes are automatically inferred unless set manually by the
user (using Extract or Cat). The bit-width inference engine starts from the
graph’s input ports and calculates node output bit widths from their
respective input bit widths according to the following set of rules:
operation            bit width
z = x + y            wz = max(wx, wy)
z = x - y            wz = max(wx, wy)
z = x & y            wz = max(wx, wy)
z = Mux(c, x, y)     wz = max(wx, wy)
z = x * y            wz = wx + wy
z = x << n           wz = wx + maxNum(n)
z = x >> n           wz = wx - minNum(n)
z = Cat(x, y)        wz = wx + wy
z = Fill(n, x)       wz = wx * maxNum(n)
where for instance wz is the bit width of wire z, and the & rule applies to
all bitwise logical operations.
The bit-width inference process continues until no bit width changes.
Except for right shifts by known constant amounts, the bit-width inference
rules specify output bit widths that are never smaller than the input bit
widths, and thus, output bit widths either grow or stay the same.
Furthermore, the width of a register must be specified by the user either
explicitly or from the bitwidth of the reset value. From these two
requirements, we can show that the bit-width inference process will
converge to a fixpoint.
Shouldn’t & return bitwidth that is min() of inputs?
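The width rules listed above can be sketched as a recursive function over a tiny expression tree. This is only an illustration in plain Scala, not the Chisel inference engine: WidthSketch and its case classes are hypothetical names, and the shift and Fill rules are left out because maxNum and minNum are not defined in this manual.

```scala
// Hypothetical model of the width rules for +, -, &, Mux, *, and Cat.
object WidthSketch {
  sealed trait Expr
  case class Wire(width: Int) extends Expr          // a node of known width
  case class Add(x: Expr, y: Expr) extends Expr
  case class Sub(x: Expr, y: Expr) extends Expr
  case class And(x: Expr, y: Expr) extends Expr
  case class Mux(c: Expr, x: Expr, y: Expr) extends Expr
  case class Mul(x: Expr, y: Expr) extends Expr
  case class Cat(x: Expr, y: Expr) extends Expr

  // Compute the output width of an expression from its input widths.
  def width(e: Expr): Int = e match {
    case Wire(w)      => w
    case Add(x, y)    => math.max(width(x), width(y))
    case Sub(x, y)    => math.max(width(x), width(y))
    case And(x, y)    => math.max(width(x), width(y))
    case Mux(_, x, y) => math.max(width(x), width(y))
    case Mul(x, y)    => width(x) + width(y)
    case Cat(x, y)    => width(x) + width(y)
  }
}
```

For instance, adding an 8-bit and a 4-bit wire gives 8 bits, and multiplying that sum by a 3-bit wire gives 11 bits.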
6 Updateables
When describing the operation of wire and state nodes, it is often useful
to give the specification as a series of conditional updates to the output
value and to spread out these updates across several separate statements.
For example, the output of a Data node can be referenced immediately, but
its input can be set later. Updateable represents a conditionally
updateable node, which accumulates accesses to the node and which can later
generate muxes to combine these accesses in the circuit.
abstract class Updateable extends Node {
// conditional reads
def reads: Queue[(Bool, UInt)]
// conditional writes
def writes: Queue[(Bool, UInt, Node)]
// gen mux integrating all conditional writes
def genMuxes(default: Node)
override def := (x: Node): this.type
}
Chisel provides conditional update rules in the form of the when construct
to support this style of sequential logic description:
object when {
def apply(cond: Bool)(block: => Unit): when
}
class when (prevCond: Bool) {
def elsewhen (cond: Bool)(block: => Unit): when
def otherwise (block: => Unit): Unit
}
when manipulates a global condition stack with dynamic scope. Therefore,
when creates a new condition that is in force across function calls. For
example:
def updateWhen (c: Bool, d: Data) =
when (c) { r := d }
when (a) {
updateWhen(b, x)
}
is the same as:
when (a) {
when (b) { r := x }
}
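The dynamically scoped condition stack described above can be modeled in a few lines of plain Scala. This is a hedged sketch, not Chisel’s implementation: WhenSketch, assign, and the use of strings to stand in for Bool nodes are all assumptions made for illustration. It shows why a when inside a called function still sees the conditions pushed by its caller.

```scala
import scala.collection.mutable

// Hypothetical model of a global condition stack with dynamic scope.
object WhenSketch {
  // Conditions currently in force, innermost last.
  private val conds = mutable.ArrayBuffer.empty[String]
  // Recorded updates as (guard, assignment) pairs.
  val updates = mutable.ArrayBuffer.empty[(String, String)]

  // Push the condition, run the block, then pop the condition again.
  def when(cond: String)(block: => Unit): Unit = {
    conds += cond
    try block
    finally { conds.remove(conds.length - 1); () }
  }

  // Record an assignment guarded by every condition on the stack.
  def assign(target: String, value: String): Unit = {
    val guard = if (conds.isEmpty) "true" else conds.mkString(" && ")
    updates += ((guard, s"$target := $value"))
  }

  // Mirrors the updateWhen helper from the example above.
  def updateWhen(c: String, d: String): Unit =
    when(c) { assign("r", d) }
}
```

Calling WhenSketch.when("a") { WhenSketch.updateWhen("b", "x") } records the update r := x under the guard a && b, which is the same guard the explicitly nested form would produce.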
Chisel provides some syntactic sugar for other
common forms of conditional updates:
def unless(c: Bool)(block: => Unit) =
when (!c) { block }
and
def otherwise(block: => Unit) =
when (Bool(true)) { block }
We introduce the switch statement for conditional updates involving a
series of comparisons against a common key:
def switch(c: UInt)(block: => Unit): Unit
def is(v: Bool)(block: => Unit)
7 Forward Declarations
Purely combinational circuits are not allowed to have cycles between nodes,
and Chisel will report an error if such a cycle is detected. Because they
do not have cycles, legal combinational circuits can always be constructed
in a feed-forward manner, by adding new nodes whose inputs are derived from
nodes that have already been defined. Sequential circuits naturally have
feedback between nodes, and so it is sometimes necessary to reference an
output wire before the producing node has been defined. Because Scala
evaluates program statements sequentially, we have allowed data nodes to
serve as a wire providing a declaration of a node that can be used
immediately, but whose input will be set later. For example, in a simple
CPU, we need to define the pcPlus4 and brTarget wires so they can be
referenced before definition:
val pcPlus4 = UInt()
val brTarget = UInt()
val pcNext = Mux(pcSel, brTarget, pcPlus4)
val pcReg = RegNext(pcNext)
pcPlus4 := pcReg + UInt(4)
...
brTarget := addOut
The wiring operator := is used to wire up the connection after pcReg and
addOut are defined. After all assignments are made and the circuit is being
elaborated, it is an error if a forward declaration is unassigned.
8 Regs
The simplest form of state element supported by
Chisel is a positive-edge-triggered register defined
as follows:
object Reg {
def apply[T <: Data]
(data: T, next: T = null, init: T = null): T
}
object RegNext {
def apply[T <: Data] (next: T, init: T = null): T
}
object RegInit {
def apply[T <: Data] (init: T): T
}
class Reg extends Updateable
where it can be constructed as follows:
val r1 = RegNext(io.in)
val r2 = RegInit(UInt(1, 8))
val r3 = RegNext(io.in, UInt(1))
val r4 = Reg(UInt(width = 8))
where init is the value a register takes on when the implicit reset is
Bool(true).
9 Mems
Chisel supports random-access memories via the Mem construct. Writes to
Mems are positive-edge-triggered and reads are either combinational or
positive-edge-triggered.
object Mem {
def apply[T <: Data](depth: Int, gen: => T,
seqRead: Boolean = false): Mem
}
class Mem[T <: Data](gen: () => T, depth: Int,
seqRead: Boolean = false)
extends Updateable {
def apply(idx: UInt): T
}
Ports into Mems are created by applying a UInt index. A 32-entry register
file with one write port and two combinational read ports might be
expressed as follows:
val rf = Mem(32, UInt(width = 64))
when (wen) { rf(waddr) := wdata }
val dout1 = rf(raddr1)
val dout2 = rf(raddr2)
If the optional parameter seqRead is set, Chisel
will attempt to infer sequential read ports when a
Reg is assigned the output of a Mem. A one-read,
one-write SRAM might be described as follows:
val ram1r1w =
Mem(1024, UInt(width = 32), seqRead = true)
val dout = Reg(UInt())
when (wen) { ram1r1w(waddr) := wdata }
when (ren) { dout := ram1r1w(raddr) }
Single-ported SRAMs can be inferred when the
read and write conditions are mutually exclusive in
the same when chain:
val ram1p =
Mem(1024, UInt(width = 32), seqRead = true)
val dout = Reg(UInt())
when (wen) { ram1p(waddr) := wdata }
.elsewhen (ren) { dout := ram1p(raddr) }
If the same Mem address is both written and sequentially read on the same
clock edge, or if a sequential read enable is cleared, then the read data
is implementation-defined.
Mem also supports write masks for subword
writes. A given bit is written if the corresponding
mask bit is set.
val ram = Mem(256, UInt(width = 32))
when (wen) { ram.write(waddr, wdata, wmask) }
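The effect of a write mask on a single memory word can be sketched in plain Scala as follows. MaskedWriteSketch and maskedWrite are hypothetical names; the function only models the rule stated above (a bit is written where the mask bit is set, and keeps its old value elsewhere) and says nothing about how Chisel implements masked writes.

```scala
// Hypothetical model of a masked subword write on one memory word.
object MaskedWriteSketch {
  // Bits of `old` are replaced by the matching bits of `data` only where `mask` is 1.
  def maskedWrite(old: BigInt, data: BigInt, mask: BigInt): BigInt =
    (old &~ mask) | (data & mask)
}
```

For example, writing 0xAABBCCDD over 0x12345678 with mask 0x0000FF00 changes only the second-lowest byte and yields 0x1234CC78.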
10 Ports
Ports are Data-derived nodes used as interfaces to hardware modules. A port
is a directional version of a primitive Data object. Port directions are
defined as follows:
trait PortDir
object INPUT extends PortDir
object OUTPUT extends PortDir
Aggregate ports can be recursively constructed using either a vec or bundle
with instances of Ports as leaves.
11 Modules
In Chisel, modules are very similar to modules in Verilog, defining a
hierarchical structure in the generated circuit. The hierarchical module
namespace is accessible in downstream tools to aid in debugging and
physical layout. A user-defined module is defined as a class which:
• inherits from Module,
• contains an interface Bundle stored in a field named io, and
• wires together subcircuits in its constructor.
Users write their own modules by subclassing
Module which is defined as follows:
abstract class Module {
val io: Bundle
var name: String = ""
def compileV: Unit
def compileC: Unit
}
and defining their own io field. For example, to define a two-input mux, we
would define a module as follows:
class Mux2 extends Module {
val io = new Bundle {
val sel = Bool(INPUT)
val in0 = Bool(INPUT)
val in1 = Bool(INPUT)
val out = Bool(OUTPUT)
}
io.out := (io.sel & io.in1) | (~io.sel & io.in0)
}
The := assignment operator, used in the body of a module definition, is a
special operator in Chisel that wires the input of the left-hand side to
the output of the right-hand side. It is typically used to connect an
output port to its definition.
The <> operator bulk connects interfaces of opposite gender between sibling
modules or interfaces of same gender between parent/child modules. Bulk
connections connect leaf ports using pathname matching. Connections are
only made if one of the ports is non-null, allowing users to repeatedly
bulk-connect partially filled interfaces. After all connections are made
and the circuit is being elaborated, Chisel warns users if ports have other
than exactly one connection to them.
The names given to the nodes and submodules stored in a module when they
are emitted by a C++ or Verilog backend are obtained from their module
field names, using Scala introspection. Use the function setName() to set
the names for nodes or submodules.
12 BlackBox
Black boxes allow users to define interfaces to circuits defined outside of
Chisel. The user defines:
• a module as a subclass of BlackBox,
• an io field with the interface, and
• optionally a subclass of VerilogParameters.
For example, one could define a simple ROM blackbox as:
class RomIo extends Bundle {
val isVal = Bool(INPUT)
val raddr = UInt(INPUT, 32)
val rdata = UInt(OUTPUT, 32)
raddr.setName("RADDR")
}
class RomParams extends VerilogParameters {
val MY_STR = "Test"
val MY_INT = 5
}
class Rom extends BlackBox {
val io = new RomIo()
val romParams = new RomParams()
setVerilogParameters(romParams)
renameClock(Driver.implicitClock, "clock_A")
renameClock("my_other_clock", "test_clock")
renameReset("rst")
// Define how to use in simulation here
}
The parameters are transformed to the Verilog parameters with names and
values used in the class definition. setVerilogParameters can also take a
string directly. The function renameClock can take a Clock object or a
string name of the clock to rename the BlackBox output clock. The function
renameReset will rename the implicit reset. If other resets need to be
named, call setName(). An example of the use of setName() is shown in the
io class. Rather than being called io_raddr for the io of the BlackBox, it
will be RADDR. The blackbox behaves as a module in C++ simulation. This
means you can implement the functionality of the BlackBox using the io so
that you can verify your design.
13 Printf and Sprintf
Chisel provides the ability to format and print strings for debugging
purposes. The printf and sprintf constructs are similar to their C
namesakes: they take a format string and a variable number of arguments,
then print or return a string, respectively. During simulation, printf
prints the formatted string to the console on rising clock edges. sprintf,
on the other hand, returns the formatted string as a bit vector.
Supported format specifiers are %b (binary number), %d (decimal number),
%x (hexadecimal number), and %s (string consisting of a sequence of 8-bit
extended ASCII characters). (%% specifies a literal %.) Unlike in C, there
are no width modifiers: the bit width of the corresponding argument
determines the width in the string representation.
The following example prints the line "0x4142 16706 AB" on cycles when c is
true:
val x = Bits(0x4142)
val s1 = sprintf("%x %s", x, x)
when (c) { printf("%d %s\n", x, s1) }
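To make the role of the argument’s bit width concrete, here is a plain-Scala sketch of how a single value might be rendered under each format specifier. FormatSketch and render are hypothetical names, and the details (zero padding to the full width for %b and %x, no prefix, most significant byte first for %s) are assumptions for illustration; the exact text produced by Chisel’s printf may differ.

```scala
// Hypothetical rendering of one argument under %b, %d, %x, and %s.
object FormatSketch {
  // Render the low `width` bits of `value` according to one format specifier.
  def render(spec: Char, value: BigInt, width: Int): String = spec match {
    case 'b' =>
      // One character per bit, most significant bit first.
      (width - 1 to 0 by -1).map(i => if (value.testBit(i)) '1' else '0').mkString
    case 'd' =>
      value.toString
    case 'x' =>
      // One hex digit per 4 bits, zero padded (an assumption).
      val digits = (width + 3) / 4
      val raw = value.toString(16)
      "0" * (digits - raw.length) + raw
    case 's' =>
      // One character per 8 bits, most significant byte first.
      (width / 8 - 1 to 0 by -1).map(i => ((value >> (8 * i)) & 0xff).toInt.toChar).mkString
    case other =>
      throw new IllegalArgumentException(s"unsupported specifier: %$other")
  }
}
```

With the 16-bit value 0x4142, the sketch renders %d as 16706, %x as 4142, and %s as AB, since 0x41 and 0x42 are the ASCII codes of A and B.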
14 Assert
Runtime assertions are provided by the assert construct. During simulation,
if an assertion’s argument is false on a rising clock edge, an error is
printed and simulation terminates. For example, the following will
terminate simulation after ten clock cycles:
val x = Reg(init = UInt(0, 4))
x := x + UInt(1)
assert(x < UInt(10))
Figure 4: DUT run using a Tester object in Scala with stdin and stdout connected.
15 Main and Testing
In order to construct a circuit, the user calls
chiselMain from their top level main function:
object chiselMain {
def apply[T <: Module]
(args: Array[String], comp: () => T): T
}
which when run creates C++ files named module_name.cpp and module_name.h in
the directory specified with the --targetDir dir_name argument.
Testing is a crucial part of circuit design, and
thus in Chisel we provide a mechanism for testing
circuits by providing test vectors within Scala using
subclasses of the Tester class:
class Tester[T <: Module]
(val c:T,val isTrace: Boolean = true) {
var t: Int
var ok: Boolean
val rnd: Random
def int(x: Boolean): BigInt
def int(x: Int): BigInt
def int(x: Bits): BigInt
def reset(n: Int = 1)
def step(n: Int): Int
def pokeAt(data: Mem[T], index: Int, x: BigInt)
def poke(data: Bits, x: BigInt)
def poke(data: Aggregate, x: Array[BigInt])
def peekAt(data: Mem[T], index: Int)
def peek(data: Bits): BigInt
def peek(data: Aggregate): Array[BigInt]
def expect (good: Boolean, msg: String): Boolean
def expect (data: Bits, target: BigInt): Boolean
}
which binds a tester to a module and allows users
to write tests using the given debug protocol. In
particular, users utilize:
• poke to set input port and state values,
• step to execute the circuit one time unit,
• peek to read port and state values, and
• expect to compare peeked circuit values to expected arguments.
Users connect tester instances to modules using:
object chiselMainTest {
def apply[T <: Module]
(args: Array[String], comp: () => T)(
tester: T => Tester[T]): T
}
When --test is given as an argument to chiselMain, a tester instance runs
the Design Under Test (DUT) in a separate process with stdin and stdout
connected so that debug commands can be sent to the DUT and responses can
be received from the DUT as shown in Figure 4.
For example, in the following:
class Mux2Tests(c: Mux2) extends Tester(c) {
val n= pow(2, 3).toInt
for (s <- 0 until 2) {
for (i0 <- 0 until 2) {
for (i1 <- 0 until 2) {
poke(c.io.sel, s)
poke(c.io.in1, i1)
poke(c.io.in0, i0)
step(1)
expect(c.io.out, (if (s == 1) i1 else i0))
}
}
}
}
the inputs of Mux2 are set to the appropriate values using poke. For this
particular example, we are testing the Mux2 by hardcoding the inputs to
some known values and checking if the output corresponds to the known one.
To do this, on each iteration we generate appropriate inputs to the module
and tell the simulation to assign these values to the inputs of the device
we are testing, c, step the circuit, and test the expected value. Finally,
the following shows how the tester is invoked:
chiselMainTest(args :+ "--test", () => new Mux2()) {
c => new Mux2Tests(c)
}
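The expectation used in Mux2Tests can be cross-checked without any simulator by evaluating the Mux2 expression on plain integers. The sketch below is ordinary Scala, not Chisel; Mux2Sketch, mux2, and allPass are hypothetical names, and 1-bit signals are modeled as the Ints 0 and 1.

```scala
// Hypothetical software model of the Mux2 body, checked exhaustively.
object Mux2Sketch {
  // Same expression as io.out in Mux2, masked down to a single bit.
  def mux2(sel: Int, in0: Int, in1: Int): Int =
    ((sel & in1) | (~sel & in0)) & 1

  // Enumerate all eight input combinations, mirroring the loops in Mux2Tests.
  def allPass: Boolean = {
    val results =
      for (s <- 0 until 2; i0 <- 0 until 2; i1 <- 0 until 2)
        yield mux2(s, i0, i1) == (if (s == 1) i1 else i0)
    results.forall(identity)
  }
}
```

Mux2Sketch.allPass evaluates to true, which confirms that the gate-level expression and the if/else expectation agree on every input combination.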
Finally, command arguments for chiselMain* are as follows:
--targetDir     target pathname prefix
--genHarness    generate harness file for C++
--debug         put all wires in C++ class file
--compile       compiles generated C++
--test          runs tests using C++ app
--backend v     generate Verilog
--backend c     generate C++ (default)
--vcd           enable vcd dumping
16 C++ Emulator
The C++ emulator is based on a fast multiword library using C++ templates.
A single word is defined by val_t as follows:
typedef uint64_t val_t;
typedef int64_t sval_t;
typedef uint32_t half_val_t;
and multiwords are defined by dat_t as follows:
template <int w>
class dat_t{
public:
const static int n_words;
inline int width ( void );
inline int n_words_of ( void );
inline bool to_bool ( void );
inline val_t lo_word ( void );
inline unsigned long to_ulong ( void );
std::string to_str ();
dat_t<w> ();
template <int sw>
dat_t<w> (const dat_t<sw>& src);
dat_t<w> (const dat_t<w>& src);
dat_t<w> (val_t val);
template <int sw>
dat_t<w> mask(dat_t<sw> fill, int n);
template <int dw>
dat_t<dw> mask(int n);
template <int n>
dat_t<n> mask(void);
dat_t<w> operator + ( dat_t<w> o );
dat_t<w> operator - ( dat_t<w> o );
dat_t<w> operator - ( );
dat_t<w+w> operator *( dat_t<w> o );
dat_t<w+w> fix_times_fix( dat_t<w> o );
dat_t<w+w> ufix_times_fix( dat_t<w> o );
dat_t<w+w> fix_times_ufix( dat_t<w> o );
dat_t<1> operator < ( dat_t<w> o );
dat_t<1> operator > ( dat_t<w> o );
dat_t<1> operator >= ( dat_t<w> o );
dat_t<1> operator <= ( dat_t<w> o );
dat_t<1> gt ( dat_t<w> o );
dat_t<1> gte ( dat_t<w> o );
dat_t<1> lt ( dat_t<w> o );
dat_t<1> lte ( dat_t<w> o );
dat_t<w> operator ^ ( dat_t<w> o );
dat_t<w> operator & ( dat_t<w> o );
dat_t<w> operator | ( dat_t<w> o );
dat_t<w> operator ~(void );
dat_t<1> operator !(void );
dat_t<1> operator && ( dat_t<1> o );
dat_t<1> operator || ( dat_t<1> o );
dat_t<1> operator == ( dat_t<w> o );
dat_t<1> operator == ( datz_t<w> o );
dat_t<1> operator != ( dat_t<w> o );
dat_t<w> operator <<(int amount );
dat_t<w> operator << ( dat_t<w> o );
dat_t<w> operator >>(int amount );
dat_t<w> operator >> ( dat_t<w> o );
dat_t<w> rsha ( dat_t<w> o);
dat_t<w>& operator = ( dat_t<w> o );
dat_t<w> fill_bit(val_t bit);
dat_t<w> fill_byte
(val_t byte, int nb, int n);
template <int dw, int n>
dat_t<dw> fill( void );
template <int dw, int nw>
dat_t<dw> fill( dat_t<nw> n );
template <int dw>
dat_t<dw> extract();
template <int dw>
dat_t<dw> extract(val_t e, val_t s);
template <int dw, int iwe, int iws>
dat_t<dw> extract
(dat_t<iwe> e, dat_t<iws> s);
template <int sw>
dat_t<w> inject
(dat_t<sw> src, val_t e, val_t s);
template <int sw, int iwe, int iws>
dat_t<w> inject
(dat_t<sw> src,
dat_t<iwe> e, dat_t<iws> s);
template <int dw>
dat_t<dw> log2();
dat_t<1> bit(val_t b);
val_t msb();
template <int iw>
dat_t<1> bit(dat_t<iw> b);
};
template <int w, int sw>
dat_t<w> DAT(dat_t<sw> dat);
template <int w>
dat_t<w> LIT(val_t value);
template <int w> dat_t<w>
mux ( dat_t<1> t, dat_t<w> c, dat_t<w> a );
where w is the bit width parameter.
The Chisel compiler compiles top level modules into a single flattened
mod_t class that can be created and executed:
class mod_t{
public:
// initialize module
virtual void init (void) { };
// compute all combinational logic
virtual void clock_lo (dat_t<1> reset) { };
// commit state updates
virtual void clock_hi (dat_t<1> reset) { };
// print printer specd node values to stdout
virtual void print (FILE*f) { };
// scan scanner specd node values from stdin
virtual bool scan (FILE*f) { return true; };
// dump vcd file
virtual void dump (FILE*f, int t) { };
};
Either the Chisel compiler can create a harness
or the user can write a harness themselves. The
following is an example of a harness for a CPU
module:
#include "cpu.h"
int main (int argc, char*argv[]) {
cpu_t*c = new cpu_t();
int lim = (argc > 1) ? atoi(argv[1]) : -1;
c->init();
for (int t = 0; lim < 0 || t < lim; t++) {
dat_t<1> reset = LIT<1>(t == 0);
if (!c->scan(stdin)) break;
c->clock_lo(reset);
c->clock_hi(reset);
c->print(stdout);
}
}
17 Verilog
Chisel generates Verilog when the --v argument is passed into chiselMain.
For example, from SBT, the following
run --v
would produce a single Verilog file named module_name.v in the target
directory. The file will contain one module per module defined as
submodules of the top level module created in chiselMain. Modules with the
same interface and body are cached and reused.
18 Multiple Clock Domains
Chisel 2.0 introduced support for multiple clock domains.
18.1 Creating Clock domains
In order to use multiple clock domains, users must create multiple clocks.
In Chisel, clocks are first-class nodes created with a reset signal
parameter and defined as follows:
class Clock (reset: Bool) extends Node {
def reset: Bool // returns reset pin
}
In Chisel there is a builtin implicit clock that state
elements use by default:
var implicitClock = new Clock(implicitReset)
The clock for state elements and modules can
be defined using an additional named parameter
called clock:
Reg(... clock: Clock = implicitClock)
Mem(... clock: Clock = implicitClock)
Module(... clock: Clock = implicitClock)
18.2 Crossing Clock Domains
There are two ways that circuits can be defined to send data between clock
domains. The first and most primitive way is by using a synchronizer
circuit composed of two registers as follows:
// signalA is in clock domain clockA,
// want a version in clockB as signalB
val s1 = Reg(init = UInt(0), clock = clockB)
val s2 = Reg(init = UInt(0), clock = clockB)
s1 := signalA
s2 := s1
signalB := s2
Due to metastability issues, this technique is limited to communicating
one-bit data between domains.
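The cycle-level behavior of the two-register synchronizer can be sketched in plain Scala as below. This is only a functional model under stated assumptions: SyncSketch, Sync, tick, and run are hypothetical names, signalA is sampled once per rising edge of clockB, and metastability itself (the reason for having two registers) is not modeled.

```scala
// Hypothetical cycle-level model of the two-register synchronizer.
object SyncSketch {
  // State of the registers s1 and s2, both clocked by clockB and reset to 0.
  final case class Sync(s1: Int = 0, s2: Int = 0) {
    // One rising edge of clockB: both registers load their inputs at once,
    // so s2 receives the value s1 held before the edge.
    def tick(signalA: Int): Sync = Sync(s1 = signalA, s2 = s1)
    def signalB: Int = s2
  }

  // Feed one sample of signalA per clockB edge and collect signalB after each edge.
  def run(inputs: Seq[Int]): Seq[Int] =
    inputs.scanLeft(Sync())((state, a) => state.tick(a)).tail.map(_.signalB)
}
```

Feeding the samples 1, 1, 1, 0 produces 0, 1, 1, 1 on signalB: in this model the value seen on signalB after an edge is the value sampled at the previous edge.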
The second and more general way to send data
between domains is by using an asynchronous fifo:
class AsyncFifo[T<:Data](gen: T, entries: Int, enq_clk:
Clock, deq_clock: Clock)
extends Module
We can then get a version of signalA from clock domains clockA to clockB by
specifying the standard fifo parameters and the two clocks and then using
the standard decoupled ready/valid signals:
val fifo = new AsyncFifo(UInt(width = 32), 2, clockA,
clockB)
fifo.io.enq.bits := signalA
signalB := fifo.io.deq.bits
fifo.io.enq.valid := condA
fifo.io.deq.ready := condB
...
18.3 Backend Specific Multiple Clock
Domains
Clock domains can be mapped to both the C++ and Verilog backends in a
domain-specific manner. For the purposes of showing how to drive a multi
clock design, consider the example of hardware with two modules
communicating using an AsyncFifo with each module on separate clocks:
fastClock and slowClock.
18.3.1 C++
In the C++ backend, for every clock i there is a
• uint64_t clk_i field representing the clock i’s
period,
• uint64_t clk_i_cnt field representing the clock
i’s current count,
• clock_lo_i and clock_hi_i functions,
• int reset() function which ensures that all
clock_lo and clock_hi functions are called at
least once, and
• int clock(reset) function which computes
min delta, invokes appropriate clock_lo and
clock_hi’s and returns min delta used.
In order to set up a C++ simulation, the user
• initializes all period fields to the desired period,
• initializes all count fields to the desired phase,
• calls reset, and then
• repeatedly calls clock to step the simulation.
The following is a C++ example of a main function
for the slowClock / fastClock example:
int main(int argc, char** argv) {
  ClkDomainTest_t dut;
  dut.init(1);
  // period and initial count (phase) of each clock
  dut.clk = 2;
  dut.clk_cnt = 1;
  dut.fastClock = 4;
  dut.fastClock_cnt = 0;
  dut.slowClock = 6;
  dut.slowClock_cnt = 0;
  // reset, then step the simulation
  for (int i = 0; i < 12; i ++)
    dut.reset();
  for (int i = 0; i < 96; i ++)
    dut.clock(LIT<1>(0));
}
```
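One way to read the "min delta" step is sketched below as a plain Scala software model. It is not the generated C++ code; ClockState, MultiClockModel and step are hypothetical names, and the exact ordering used by the backend is an assumption. The model takes the smallest count as the delta, advances every clock by that delta, and fires and reloads the clocks whose count reached zero.

```scala
// Software model (not the generated C++): per-clock period and count.
// ClockState and MultiClockModel are hypothetical names.
final class ClockState(val name: String, val period: Long, var count: Long)

object MultiClockModel {
  // One simulation step. Returns the delta that was used and the
  // names of the clocks that fired in this step.
  def step(clocks: Seq[ClockState]): (Long, Seq[String]) = {
    val delta = clocks.map(_.count).min        // assumed meaning of "min delta"
    clocks.foreach(c => c.count -= delta)      // advance time for every clock
    val fired = clocks.filter(_.count == 0)    // clocks whose edge is due now
    fired.foreach(c => c.count = c.period)     // reload count with the period
    (delta, fired.map(_.name))
  }
}
```

With the periods and counts from the main function above (clk: 2 and 1, fastClock: 4 and 0, slowClock: 6 and 0), the first step of this model uses a delta of 0 and fires fastClock and slowClock, and the second step uses a delta of 1 and fires clk.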
18.3.2 Verilog
In Verilog,
• Chisel creates a new port for each clock / reset,
• Chisel wires all the clocks to the top module,
and
• the user must create an always block clock
driver for every clock i.
The following is a Verilog example of a top level
harness to drive the slowClock / fastClock example
circuit:
module emulator;
  reg fastClock = 0, slowClock = 0,
      resetFast = 1, resetSlow = 1;
  wire [31:0] add, mul, test;
  always #2 fastClock = ~fastClock;
  always #4 slowClock = ~slowClock;
  initial begin
    #8
    resetFast = 0;
    resetSlow = 0;
    #400
    $finish;
  end
  ClkDomainTest dut (
    .fastClock(fastClock),
    .slowClock(slowClock),
    .io_resetFast(resetFast),
    .io_resetSlow(resetSlow),
    .io_add(add), .io_mul(mul), .io_test(test));
endmodule
See http://www.asic-world.com/verilog/verifaq2.html
for more information about simulating clocks in
Verilog.
19 Extra Stuff
def ListLookup[T <: Bits]
(addr: UInt, default: List[T],
mapping: Array[(UInt, List[T])]): List[T]
def Lookup[T <: Data]
(addr: UInt, default: T,
mapping: Seq[(UInt, T)]): T
// n-way multiplexer
def MuxCase[T <: Data]
(default: T, mapping: Seq[(Bool, T)]): T
// n-way indexed multiplexer:
def MuxLookup[S <: UInt, T <: Data]
(key: S, default: T, mapping: Seq[(S, T)]): T
// create n enum values of given type
def Enum[T <: UInt]
(n: Int)(gen: => T): List[T]
// create enum values of given type and names
def Enum[T <: UInt]
(l: Symbol *)(gen: => T): Map[Symbol, T]
// create enum values of given type and names
def Enum[T <: UInt]
(l: List[Symbol])(gen: => T): Map[Symbol, T]
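The selection behaviour of the n-way multiplexers can be sketched in plain Scala over ordinary values. This is a software model, not the Chisel definitions: muxCase and muxLookup are hypothetical lower-case names, and the rule that the first matching entry wins is an assumption of the sketch.

```scala
// Software model (not Chisel) of n-way multiplexer selection.
// MuxModels, muxCase and muxLookup are hypothetical names.
object MuxModels {
  // The first mapping whose condition is true wins; otherwise the default.
  def muxCase[T](default: T, mapping: Seq[(Boolean, T)]): T =
    mapping.collectFirst { case (true, value) => value }.getOrElse(default)

  // The first mapping whose key equals the lookup key wins; otherwise the default.
  def muxLookup[K, T](key: K, default: T, mapping: Seq[(K, T)]): T =
    mapping.collectFirst { case (k, value) if k == key => value }.getOrElse(default)
}
```

For example, MuxModels.muxLookup(2, 0, Seq(1 -> 10, 2 -> 20)) selects 20, and a key that is absent from the mapping selects the default 0.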
20 Standard Library
20.1 Math
// Returns the log base 2 of the input
// Scala Integer rounded up
def log2Up(in: Int): Int
// Returns the log base 2 of the input
// Scala Integer rounded down
def log2Down(in: Int): Int
// Returns true if the input Scala Integer
// is a power of 2
def isPow2(in: Int): Boolean
// linear feedback shift register
def LFSR16(increment: Bool = Bool(true)): UInt
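The rounding behaviour described for the log functions can be sketched with integer arithmetic in plain Scala. These are not the library definitions: ceilLog2, floorLog2 and isPowerOfTwo are hypothetical names, and the library's treatment of edge cases such as an input of 1 is not specified here and may differ from this sketch.

```scala
// Software sketch (not the Chisel library code) of the math helpers.
// Log2Models and its method names are hypothetical.
object Log2Models {
  // Log base 2 rounded up, for positive inputs.
  def ceilLog2(in: Int): Int = {
    require(in > 0, "input must be positive")
    32 - Integer.numberOfLeadingZeros(in - 1)
  }

  // Log base 2 rounded down, for positive inputs.
  def floorLog2(in: Int): Int = {
    require(in > 0, "input must be positive")
    31 - Integer.numberOfLeadingZeros(in)
  }

  // True if exactly one bit of a positive input is set.
  def isPowerOfTwo(in: Int): Boolean = in > 0 && (in & (in - 1)) == 0
}
```

A typical use is sizing an address: a memory with 5 entries needs Log2Models.ceilLog2(5) address bits, which is 3.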
20.2 Sequential
// Returns the n-cycle delayed version
// of the input signal
// Has an optional enable signal defaulting to true
def ShiftRegister[T <: Data](in: T, n: Int,
  en: Bool = Bool(true)): T
def Counter(cond: Bool, n: Int) = {
  val c = RegReset(UInt(0, log2Up(n)))
  val wrap = c === UInt(n-1)
  when (cond) {
    c := Mux(Bool(!isPow2(n)) && wrap, UInt(0),
      c + UInt(1))
  }
  (c, wrap && cond)
}
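The cycle-by-cycle behaviour of these sequential helpers can be sketched as plain Scala software models. They are not Chisel code: CounterModel and ShiftRegisterModel are hypothetical names, the init parameter of the shift register model is an assumption made so that the model has a defined starting value, and each call to step stands for one clock edge.

```scala
// Software models (not Chisel); each step() call stands for one clock edge.
// CounterModel and ShiftRegisterModel are hypothetical names.
final class CounterModel(n: Int) {
  require(n >= 2, "this sketch assumes at least two count values")
  private var c = 0

  // Current register value, in the range 0 to n-1.
  def value: Int = c

  // Returns (wrap && cond) as seen before the edge, then updates the register.
  def step(cond: Boolean): Boolean = {
    val wrap = c == n - 1
    if (cond) c = if (wrap) 0 else c + 1
    wrap && cond
  }
}

final class ShiftRegisterModel[T](n: Int, init: T) {
  require(n >= 1, "delay must be at least one cycle")
  private var stages: Vector[T] = Vector.fill(n)(init)

  // Output of the last stage: the input from n enabled edges ago.
  def out: T = stages.last

  // Shift in a new value when enabled; hold all stages otherwise.
  def step(in: T, en: Boolean = true): Unit =
    if (en) stages = in +: stages.init
}
```

With n = 3, the counter model reports a wrap on every third enabled step and returns to 0 afterwards.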
20.3 UInt
// Returns the number of bits set in the
// input signal. Causes an exception if
// the input is wider than 32 bits.
def PopCount(in: UInt): UInt
// Returns the reverse of the input signal
def Reverse(in: UInt): UInt
// returns the one hot encoding of
// the input UInt
def UIntToOH(in: UInt, width: Int): UInt
// does the inverse of UIntToOH
def OHToUInt(in: UInt): UInt
def OHToUInt(in: Seq[Bool]): UInt
// Builds a Mux tree out of the input
// signal vector using a one hot encoded
// select signal. Returns the output of
// the Mux tree
def Mux1H[T <: Data]
(sel: UInt, in: Vec[T]): T
def Mux1H[T <: Data]
(sel: Vec[Bool], in: Vec[T]): T
// Builds a Mux tree under the
// assumption that multiple
// select signals can be enabled.
// Priority is given to the first
// select signal. Returns the output
// of the Mux tree.
def PriorityMux[T <: Data]
(sel: UInt, in: Seq[T]): T
def PriorityMux[T <: Data]
(sel: Seq[UInt], in: Seq[T]): T
// Returns the bit position of the
// trailing 1 in the input vector with
// the assumption that multiple bits of
// the input bit vector can be set
def PriorityEncoder(in: UInt): UInt
def PriorityEncoder(in: Seq[Bool]): UInt
// Returns the bit position of the
// trailing 1 in the input vector with
// the assumption that only one bit in
// the input vector can be set
def PriorityEncoderOH(in: UInt): UInt
def PriorityEncoderOH(in: Seq[Bool]): UInt
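The intent of several of these utilities can be sketched in plain Scala on 32-bit integers and Boolean sequences. These are software models rather than the Chisel definitions: all names in BitModels are hypothetical, the explicit width parameter of reverse is an assumption of the model, and "trailing 1" is taken to mean the least significant set bit.

```scala
// Software models (not Chisel) of some bit-level utilities.
// BitModels and its method names are hypothetical.
object BitModels {
  // Number of bits set in the input.
  def popCount(in: Int): Int = Integer.bitCount(in)

  // Reverse the low `width` bits of the input.
  def reverse(in: Int, width: Int): Int = {
    require(width >= 1 && width <= 32, "width must be in 1..32")
    Integer.reverse(in) >>> (32 - width)
  }

  // One-hot encoding of a bit position.
  def uintToOH(in: Int): Int = {
    require(in >= 0 && in < 32, "bit position must be in 0..31")
    1 << in
  }

  // Inverse of uintToOH; the input must have exactly one bit set.
  def ohToUInt(in: Int): Int = {
    require(Integer.bitCount(in) == 1, "input must be one-hot")
    Integer.numberOfTrailingZeros(in)
  }

  // Position of the least significant set bit; several bits may be set.
  def priorityEncoder(in: Int): Int = {
    require(in != 0, "at least one bit must be set")
    Integer.numberOfTrailingZeros(in)
  }

  // Select the input whose (single) select signal is true.
  def mux1H[T](sel: Seq[Boolean], in: Seq[T]): T = {
    require(sel.count(identity) == 1, "select must be one-hot")
    in(sel.indexOf(true))
  }

  // Select the input of the first true select signal.
  def priorityMux[T](sel: Seq[Boolean], in: Seq[T]): T = {
    require(sel.contains(true), "at least one select must be true")
    in(sel.indexOf(true))
  }
}
```

The two mux models differ only in their precondition: mux1H insists on a one-hot select, while priorityMux accepts several true selects and takes the first.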
20.4 Decoupled
// Adds a ready-valid handshaking
// protocol to any interface. The
// standard used is that the
// consumer uses the flipped
// interface.
class DecoupledIO[+T <: Data](gen: T)
extends Bundle {
val ready = Bool(INPUT)
val valid = Bool(OUTPUT)
val bits = gen.cloneType.asOutput
}
// Adds a valid protocol to any
// interface. The standard used is
// that the consumer uses the
// flipped interface.
class ValidIO[+T <: Data](gen: T)
extends Bundle {
val valid = Bool(OUTPUT)
val bits = gen.cloneType.asOutput
}
// Hardware module that is used to
// sequence n producers into 1 consumer.
// Priority is given to the lower-indexed
// producer
// Example usage:
// val arb = new Arbiter(UInt(), 2)
// arb.io.in(0) <> producer0.io.out
// arb.io.in(1) <> producer1.io.out
// consumer.io.in <> arb.io.out
class Arbiter[T <: Data](gen: T, n: Int)
extends Module
// Hardware module that is used to
// sequence n producers into 1 consumer.
// Producers are chosen in round robin
// order
// Example usage:
// val arb = new RRArbiter(UInt(), 2)
// arb.io.in(0) <> producer0.io.out
// arb.io.in(1) <> producer1.io.out
// consumer.io.in <> arb.io.out
class RRArbiter[T <: Data](gen: T, n: Int)
extends Module
// Generic hardware queue. The required
// parameter entries controls the
// depth of the queue. The width of
// the queue is determined from the
// inputs.
// Example usage:
// val q = new Queue(UInt(), 16)
// q.io.enq <> producer.io.out
// consumer.io.in <> q.io.deq
class Queue[T <: Data]
(gen: T, entries: Int,
pipe: Boolean = false,
flow: Boolean = false)
extends Module
// A hardware module that delays data
// coming down the pipeline by the
// number of cycles set by the
// latency parameter. Functionality
// is similar to ShiftRegister but
// this exposes a Pipe interface.
// Example usage:
// val pipe = new Pipe(UInt())
// pipe.io.enq <> producer.io.out
// consumer.io.in <> pipe.io.deq
class Pipe[T <: Data]
(gen: T, latency: Int = 1) extends Module
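The difference between the two arbitration policies can be sketched as plain Scala functions over the producers' valid signals. This is a software model, not the Arbiter or RRArbiter implementation: the names below are hypothetical, and the round robin rule (search starts just after the last granted producer and wraps around) is an assumption of the sketch.

```scala
// Software model (not Chisel) of arbiter grant selection.
// ArbiterModels and its method names are hypothetical.
object ArbiterModels {
  // Fixed priority: the lowest-indexed valid producer, if any.
  def priorityGrant(valid: Seq[Boolean]): Option[Int] = {
    val i = valid.indexOf(true)
    if (i >= 0) Some(i) else None
  }

  // Round robin: the first valid producer after lastGrant, wrapping around.
  // lastGrant is expected to be in the range 0 until valid.length.
  def roundRobinGrant(valid: Seq[Boolean], lastGrant: Int): Option[Int] = {
    val n = valid.length
    (1 to n).map(k => (lastGrant + k) % n).find(i => valid(i))
  }
}
```

With all three producers valid and producer 1 granted last, the fixed priority model still picks producer 0, while the round robin model moves on to producer 2.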
References
[1] Bachrach, J., Vo, H., Richards, B., Lee, Y.,
Waterman, A., Avižienis, R., Wawrzynek, J.,
Asanović, K. Chisel: Constructing Hardware in a
Scala Embedded Language. In DAC ’12.
[2] Odersky, M., Spoon, L., Venners, B.
Programming in Scala. Artima.
[3] Payne, A., Wampler, D. Programming Scala.
O’Reilly.