NIMBLE User Manual
NIMBLE Development Team
Version 0.6-12

https://r-nimble.org
https://github.com/nimble-dev/nimble


Contents

Part I: Introduction

1 Welcome to NIMBLE
  1.1 What does NIMBLE do?
  1.2 How to use this manual

2 Lightning introduction
  2.1 A brief example
  2.2 Creating a model
  2.3 Compiling the model
  2.4 One-line invocation of MCMC
  2.5 Creating, compiling and running a basic MCMC configuration
  2.6 Customizing the MCMC
  2.7 Running MCEM
  2.8 Creating your own functions

3 More introduction
  3.1 NIMBLE adopts and extends the BUGS language for specifying models
  3.2 nimbleFunctions for writing algorithms
  3.3 The NIMBLE algorithm library

4 Installing NIMBLE
  4.1 Requirements to run NIMBLE
  4.2 Installing a C++ compiler for NIMBLE to use
    4.2.1 OS X
    4.2.2 Linux
    4.2.3 Windows
  4.3 Installing the NIMBLE package
    4.3.1 Problems with installation
  4.4 Customizing your installation
    4.4.1 Using your own copy of Eigen
    4.4.2 Using libnimble
    4.4.3 BLAS and LAPACK
    4.4.4 Customizing compilation of the NIMBLE-generated C++

Part II: Models in NIMBLE

5 Writing models in NIMBLE's dialect of BUGS
  5.1 Comparison to BUGS dialects supported by WinBUGS, OpenBUGS and JAGS
    5.1.1 Supported features of BUGS and JAGS
    5.1.2 NIMBLE's Extensions to BUGS and JAGS
    5.1.3 Not-yet-supported features of BUGS and JAGS
  5.2 Writing models
    5.2.1 Declaring stochastic and deterministic nodes
    5.2.2 More kinds of BUGS declarations
    5.2.3 Vectorized versus scalar declarations
    5.2.4 Available distributions
    5.2.5 Available BUGS language functions
    5.2.6 Available link functions
    5.2.7 Truncation, censoring, and constraints

6 Building and using models
  6.1 Creating model objects
    6.1.1 Using nimbleModel to create a model
    6.1.2 Creating a model from standard BUGS and JAGS input files
    6.1.3 Making multiple instances from the same model definition
  6.2 NIMBLE models are objects you can query and manipulate
    6.2.1 What are variables and nodes?
    6.2.2 Determining the nodes and variables in a model
    6.2.3 Accessing nodes
    6.2.4 How nodes are named
    6.2.5 Why use node names?
    6.2.6 Checking if a node holds data

Part III: Algorithms in NIMBLE

7 MCMC
  7.1 One-line invocation of MCMC: nimbleMCMC
  7.2 The MCMC configuration
    7.2.1 Default MCMC configuration
    7.2.2 Customizing the MCMC configuration
  7.3 Building and compiling the MCMC
  7.4 User-friendly execution of MCMC algorithms: runMCMC
  7.5 Running the MCMC
    7.5.1 Measuring sampler computation times: getTimes
  7.6 Extracting MCMC samples
  7.7 Calculating WAIC
  7.8 k-fold cross-validation
  7.9 Samplers provided with NIMBLE
    7.9.1 Conjugate ('Gibbs') samplers
    7.9.2 Customized log-likelihood evaluations: RW_llFunction sampler
    7.9.3 Particle MCMC sampler
  7.10 Detailed MCMC example: litters
  7.11 Comparing different MCMCs with MCMCsuite and compareMCMCs
    7.11.1 MCMC Suite example: litters
    7.11.2 MCMC Suite outputs
    7.11.3 Customizing MCMC Suite

8 Sequential Monte Carlo and MCEM
  8.1 Particle Filters / Sequential Monte Carlo
    8.1.1 Filtering Algorithms
    8.1.2 Particle MCMC (PMCMC)
  8.2 Monte Carlo Expectation Maximization (MCEM)
    8.2.1 Estimating the Asymptotic Covariance From MCEM

9 Spatial models
  9.1 Intrinsic Gaussian CAR model: dcar_normal
    9.1.1 Specification and density
    9.1.2 Example
  9.2 Proper Gaussian CAR model: dcar_proper
    9.2.1 Specification and density
    9.2.2 Example
  9.3 MCMC Sampling of CAR models
    9.3.1 Initial values
    9.3.2 Zero-neighbor regions
    9.3.3 Zero-mean constraint

10 Bayesian nonparametric models
  10.1 Bayesian nonparametric mixture models
  10.2 Chinese Restaurant Process model
    10.2.1 Specification and density
    10.2.2 Example
  10.3 Stick-breaking model
    10.3.1 Specification and function
    10.3.2 Example
  10.4 MCMC sampling of BNP models
    10.4.1 Sampling CRP models
    10.4.2 Sampling stick-breaking models

Part IV: Programming with NIMBLE

Overview

11 Writing simple nimbleFunctions
  11.1 Introduction to simple nimbleFunctions
  11.2 R functions (or variants) implemented in NIMBLE
    11.2.1 Finding help for NIMBLE's versions of R functions
    11.2.2 Basic operations
    11.2.3 Math and linear algebra
    11.2.4 Distribution functions
    11.2.5 Flow control: if-then-else, for, while, and stop
    11.2.6 print and cat
    11.2.7 Checking for user interrupts: checkInterrupt
    11.2.8 Optimization: optim and nimOptim
    11.2.9 'nim' synonyms for some functions
  11.3 How NIMBLE handles types of variables
    11.3.1 nimbleList data structures
    11.3.2 How numeric types work
  11.4 Declaring argument and return types
  11.5 Compiled nimbleFunctions pass arguments by reference
  11.6 Calling external compiled code
  11.7 Calling uncompiled R functions from compiled nimbleFunctions

12 Creating user-defined BUGS distributions and functions
  12.1 User-defined functions
  12.2 User-defined distributions
    12.2.1 Using registerDistributions for alternative parameterizations and providing other information

13 Working with NIMBLE models
  13.1 The variables and nodes in a NIMBLE model
    13.1.1 Determining the nodes in a model
    13.1.2 Understanding lifted nodes
    13.1.3 Determining dependencies in a model
  13.2 Accessing information about nodes and variables
    13.2.1 Getting distributional information about a node
    13.2.2 Getting information about a distribution
    13.2.3 Getting distribution parameter values for a node
    13.2.4 Getting distribution bounds for a node
  13.3 Carrying out model calculations
    13.3.1 Core model operations: calculation and simulation
    13.3.2 Pre-defined nimbleFunctions for operating on model nodes: simNodes, calcNodes, and getLogProbNodes
    13.3.3 Accessing log probabilities via logProb variables

14 Data structures in NIMBLE
  14.1 The modelValues data structure
    14.1.1 Creating modelValues objects
    14.1.2 Accessing contents of modelValues
  14.2 The nimbleList data structure
    14.2.1 Using eigen and svd in nimbleFunctions

15 Writing nimbleFunctions to interact with models
  15.1 Overview
  15.2 Using and compiling nimbleFunctions
  15.3 Writing setup code
    15.3.1 Useful tools for setup functions
    15.3.2 Accessing and modifying numeric values from setup
    15.3.3 Determining numeric types in nimbleFunctions
    15.3.4 Control of setup outputs
  15.4 Writing run code
    15.4.1 Driving models: calculate, calculateDiff, simulate, getLogProb
    15.4.2 Getting and setting variable and node values
    15.4.3 Getting parameter values and node bounds
    15.4.4 Using modelValues objects
    15.4.5 Using model variables and modelValues in expressions
    15.4.6 Including other methods in a nimbleFunction
    15.4.7 Using other nimbleFunctions
    15.4.8 Virtual nimbleFunctions and nimbleFunctionLists
    15.4.9 Character objects
    15.4.10 User-defined data structures
  15.5 Example: writing user-defined samplers to extend NIMBLE's MCMC engine
  15.6 Copying nimbleFunctions (and NIMBLE models)
  15.7 Debugging nimbleFunctions
  15.8 Timing nimbleFunctions with run.time
  15.9 Clearing and unloading compiled objects
  15.10 Reducing memory usage

Part I

Introduction


Chapter 1

Welcome to NIMBLE
NIMBLE is a system for building and sharing analysis methods for statistical models from R, especially for hierarchical models and computationally intensive methods. While NIMBLE is embedded in R, it goes beyond R by supporting separate programming of models and algorithms along with compilation for fast execution.
As of version 0.6.13, NIMBLE has been around for a while and is reasonably stable, but we have a lot of plans to expand and improve it. The algorithm library provides MCMC with a lot of user control and the ability to write new samplers easily. Other algorithms include particle filtering (sequential Monte Carlo) and Monte Carlo Expectation Maximization (MCEM).
But NIMBLE is about much more than providing an algorithm library. It provides a language for
writing model-generic algorithms. We hope you will program in NIMBLE and make an R package
providing your method. Of course, NIMBLE is open source, so we also hope you’ll contribute to
its development.
Please join the mailing lists (see R-nimble.org/more/issues-and-groups) and help improve NIMBLE
by telling us what you want to do with it, what you like, and what could be better. We have a
lot of ideas for how to improve it, but we want your help and ideas too. You can also follow and
contribute to developer discussions on the wiki of our GitHub repository.
If you use NIMBLE in your work, please cite us, as this helps justify past and future funding for the development of NIMBLE. For citation information, call citation('nimble') in R.

1.1 What does NIMBLE do?

NIMBLE makes it easier to program statistical algorithms that will run efficiently and work on
many different models from R.
You can think of NIMBLE as comprising four pieces:
1. A system for writing statistical models flexibly, which is an extension of the BUGS language.[1]
2. A library of algorithms such as MCMC.
3. A language, called NIMBLE, embedded within and similar in style to R, for writing algorithms that operate on models written in BUGS.
4. A compiler that generates C++ for your models and algorithms, compiles that C++, and lets you use it seamlessly from R without knowing anything about C++.

[1] See Chapter 5 for information about NIMBLE's version of BUGS.

NIMBLE stands for Numerical Inference for statistical Models for Bayesian and Likelihood Estimation.
Although NIMBLE was motivated by algorithms for hierarchical statistical models, it's useful for other goals too. You could use it for simpler models. And since NIMBLE can automatically compile R-like functions into C++ that use the Eigen library for fast linear algebra, you can use it to program fast numerical functions without any model involved.[2]
One of the beauties of R is that many of the high-level analysis functions are themselves written in
R, so it is easy to see their code and modify them. The same is true for NIMBLE: the algorithms
are themselves written in the NIMBLE language.
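To give a flavor of this, here is a minimal sketch (our own illustration, not an example from later chapters; the function name sumSquares is made up). A nimbleFunction written in R-like code can be compiled to C++ and then called from R:

```r
library(nimble)

# A hypothetical nimbleFunction: columnwise sums of squares,
# written in NIMBLE's R-like language.
sumSquares <- nimbleFunction(
    run = function(x = double(2)) {      # x is a matrix of doubles
        nc <- dim(x)[2]
        ans <- numeric(nc)
        for (j in 1:nc)
            ans[j] <- sum(x[, j]^2)
        returnType(double(1))            # returns a numeric vector
        return(ans)
    })

CsumSquares <- compileNimble(sumSquares) # generate, compile, and load C++
CsumSquares(matrix(1:6, nrow = 2))       # callable like an ordinary R function
```

The uncompiled sumSquares can also be run directly in R for debugging before compiling.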

1.2 How to use this manual

We suggest everyone start with the Lightning Introduction in Chapter 2.
Then, if you want to jump into using NIMBLE’s algorithms without learning about NIMBLE’s
programming system, go to Part II to learn how to build your model and Part III to learn how to
apply NIMBLE’s built-in algorithms to your model.
If you want to learn about NIMBLE programming (nimbleFunctions), go to Part IV. This teaches how to program user-defined functions or distributions to use in BUGS code, compile your R code for faster operations, and write algorithms with NIMBLE. These algorithms could be specific algorithms for your particular model (such as a user-defined MCMC sampler for a parameter in your model) or general algorithms you can distribute to others. In fact, the algorithms provided as part of NIMBLE and described in Part III are written as nimbleFunctions.

[2] The packages Rcpp and RcppEigen provide different ways of connecting C++, the Eigen library, and R. In those packages you program directly in C++, while in NIMBLE you program in R in a nimbleFunction and the NIMBLE compiler turns it into C++.

Chapter 2

Lightning introduction
2.1 A brief example

Here we'll give a simple example of building a model and running some algorithms on the model, as well as creating our own user-specified algorithm. The goal is to give you a sense for what one can do in the system. Later sections will provide more detail.
We'll use the pump model example from BUGS.[1] We could load the model from the standard BUGS example file formats (Section 6.1.2), but instead we'll show how to enter it directly in R.
In this 'lightning introduction' we will:

1. Create the model for the pump example.
2. Compile the model.
3. Create a basic MCMC configuration for the pump model.
4. Compile and run the MCMC.
5. Customize the MCMC configuration and compile and run that.
6. Create, compile and run a Monte Carlo Expectation Maximization (MCEM) algorithm, which illustrates some of the flexibility NIMBLE provides to combine R and NIMBLE.
7. Write a short nimbleFunction to generate simulations from designated nodes of any model.

2.2 Creating a model

First we define the model code, its constants, data, and initial values for MCMC.
pumpCode <- nimbleCode({
    for (i in 1:N){
        theta[i] ~ dgamma(alpha, beta)
        lambda[i] <- theta[i]*t[i]
        x[i] ~ dpois(lambda[i])
    }
    alpha ~ dexp(1.0)
    beta ~ dgamma(0.1, 1.0)
})

pumpConsts <- list(N = 10,
                   t = c(94.3, 15.7, 62.9, 126, 5.24,
                         31.4, 1.05, 1.05, 2.1, 10.5))

pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))

pumpInits <- list(alpha = 1, beta = 1,
                  theta = rep(0.1, pumpConsts$N))

[1] The data set describes failure rates of some pumps.
Here x[i] is the number of failures recorded during a time duration of length t[i] for the ith pump. theta[i] is a failure rate, and the goal is to estimate the parameters alpha and beta. Now let's create the model and look at some of its nodes.
pump <- nimbleModel(code = pumpCode, name = "pump", constants = pumpConsts,
                    data = pumpData, inits = pumpInits)

pump$getNodeNames()

##  [1] "alpha"               "beta"                "lifted_d1_over_beta"
##  [4] "theta[1]"            "theta[2]"            "theta[3]"
##  [7] "theta[4]"            "theta[5]"            "theta[6]"
## [10] "theta[7]"            "theta[8]"            "theta[9]"
## [13] "theta[10]"           "lambda[1]"           "lambda[2]"
## [16] "lambda[3]"           "lambda[4]"           "lambda[5]"
## [19] "lambda[6]"           "lambda[7]"           "lambda[8]"
## [22] "lambda[9]"           "lambda[10]"          "x[1]"
## [25] "x[2]"                "x[3]"                "x[4]"
## [28] "x[5]"                "x[6]"                "x[7]"
## [31] "x[8]"                "x[9]"                "x[10]"

pump$x

##  [1]  5  1  5 14  3 19  1  1  4 22

pump$logProb_x

## [1]  -2.998011  -1.118924  -1.882686  -2.319466  -4.254550 -20.739651
## [7]  -2.358795  -2.358795  -9.630645 -48.447798

pump$alpha

## [1] 1

pump$theta

##  [1] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1

pump$lambda

##  [1]  9.430  1.570  6.290 12.600  0.524  3.140  0.105  0.105  0.210  1.050

Notice that in the list of nodes, NIMBLE has introduced a new node, lifted_d1_over_beta. We
call this a ‘lifted’ node. Like R, NIMBLE allows alternative parameterizations, such as the scale or
rate parameterization of the gamma distribution. Choice of parameterization can generate a lifted
node, as can using a link function or a distribution argument that is an expression. It’s helpful to
know why they exist, but you shouldn’t need to worry about them.
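In the pump model, the source is the rate parameterization of dgamma: NIMBLE computes internally with the scale, so the conversion 1/beta becomes the lifted node lifted_d1_over_beta. A distribution argument that is an expression is lifted the same way; here is a hypothetical sketch (this model is our own illustration, not part of the pump example):

```r
library(nimble)

liftedCode <- nimbleCode({
    mu ~ dnorm(0, sd = 10)
    sigma ~ dunif(0, 10)
    # The expression mu + 2 gets its own lifted deterministic node,
    # so changes to mu propagate through it to y correctly.
    y ~ dnorm(mu + 2, sd = sigma)
})
```

Calling getNodeNames() on a model built from this code would show the lifted node alongside mu, sigma, and y.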
Thanks to the plotting capabilities of the igraph package that NIMBLE uses to represent the
directed acyclic graph, we can plot the model (Figure 2.1).
pump$plotGraph()
You are in control of the model. By default, nimbleModel does its best to initialize a model, but
let’s say you want to re-initialize theta. To simulate from the prior for theta (overwriting the
initial values previously in the model) we first need to be sure the parent nodes of all theta[i]
nodes are fully initialized, including any non-stochastic nodes such as lifted nodes. We then use
the simulate function to simulate from the distribution for theta. Finally we use the calculate
function to calculate the dependencies of theta, namely lambda and the log probabilities of x to
ensure all parts of the model are up to date. First we show how to use the model’s getDependencies
method to query information about its graph.
# Show all dependencies of alpha and beta terminating in stochastic nodes
pump$getDependencies(c("alpha", "beta"))

##  [1] "alpha"               "beta"                "lifted_d1_over_beta"
##  [4] "theta[1]"            "theta[2]"            "theta[3]"
##  [7] "theta[4]"            "theta[5]"            "theta[6]"
## [10] "theta[7]"            "theta[8]"            "theta[9]"
## [13] "theta[10]"

# Now show only the deterministic dependencies
pump$getDependencies(c("alpha", "beta"), determOnly = TRUE)

## [1] "lifted_d1_over_beta"

[Figure 2.1: Directed Acyclic Graph plot of the pump model, thanks to the igraph package]


# Check that the lifted node was initialized.
pump[["lifted_d1_over_beta"]] # It was.

## [1] 1

# Now let's simulate new theta values
set.seed(1) # This makes the simulations here reproducible
pump$simulate("theta")
pump$theta  # the new theta values

##  [1] 0.15514136 1.88240160 1.80451250 0.83617765 1.22254365 1.15835525
##  [7] 0.99001994 0.30737332 0.09461909 0.15720154

# lambda and logProb_x haven't been re-calculated yet
pump$lambda # these are the same values as above

##  [1]  9.430  1.570  6.290 12.600  0.524  3.140  0.105  0.105  0.210  1.050

pump$logProb_x

## [1]  -2.998011  -1.118924  -1.882686  -2.319466  -4.254550 -20.739651
## [7]  -2.358795  -2.358795  -9.630645 -48.447798

pump$getLogProb("x") # The sum of logProb_x

## [1] -96.10932

pump$calculate(pump$getDependencies(c("theta")))

## [1] -262.204

pump$lambda # Now they have.

##  [1]  14.6298299  29.5537051 113.5038360 105.3583839   6.4061287
##  [6]  36.3723548   1.0395209   0.3227420   0.1987001   1.6506161

pump$logProb_x

## [1]  -6.002009 -26.167496 -94.632145 -65.346457  -2.626123  -7.429868
## [7]  -1.000761  -1.453644  -9.840589 -39.096527

Notice that the first getDependencies call returned dependencies from alpha and beta down to the
next stochastic nodes in the model. The second call requested only deterministic dependencies. The
call to pump$simulate("theta") expands "theta" to include all nodes in theta. After simulating
into theta, we can see that lambda and the log probabilities of x still reflect the old values of theta,
so we calculate them and then see that they have been updated.
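The simulate-then-calculate pattern above can be wrapped in a small helper written in ordinary R (our own sketch, not a function provided by NIMBLE):

```r
# Re-simulate 'nodes' from their distributions and refresh everything
# downstream of them. Returns the summed log probability of the
# recalculated nodes. Works on uncompiled and compiled models alike.
resimulate <- function(model, nodes) {
    model$simulate(nodes)
    model$calculate(model$getDependencies(nodes))
}

resimulate(pump, "theta")
```

Because getDependencies includes the nodes themselves, the returned value covers theta, lambda, and x together.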


2.3 Compiling the model

Next we compile the model, which means generating C++ code, compiling that code, and loading
it back into R with an object that can be used just like the uncompiled model. The values in
the compiled model will be initialized from those of the original model in R, but the original and
compiled models are distinct objects so any subsequent changes in one will not be reflected in the
other.
Cpump <- compileNimble(pump)
Cpump$theta

##  [1] 0.15514136 1.88240160 1.80451250 0.83617765 1.22254365 1.15835525
##  [7] 0.99001994 0.30737332 0.09461909 0.15720154

Note that the compiled model is used when running any NIMBLE algorithms via C++, so the
model needs to be compiled before (or at the same time as) any compilation of algorithms, such as
the compilation of the MCMC done in the next section.

2.4 One-line invocation of MCMC

The most direct approach to invoking NIMBLE’s MCMC engine is using the nimbleMCMC function.
This function would generally take the code, data, constants, and initial values as input, but it can
also accept the (compiled or uncompiled) model object as an argument. It provides a variety of
options for executing and controlling multiple chains of NIMBLE’s default MCMC algorithm, and
returning posterior samples, posterior summary statistics, and/or WAIC values.
For example, to execute two MCMC chains of 10,000 samples each, and return samples, summary
statistics, and WAIC values:
mcmc.out <- nimbleMCMC(code = pumpCode, constants = pumpConsts,
                       data = pumpData, inits = pumpInits,
                       nchains = 2, niter = 10000,
                       summary = TRUE, WAIC = TRUE,
                       monitors = c('alpha','beta','theta'))

names(mcmc.out)

## [1] "samples" "summary" "WAIC"
mcmc.out$summary

## $chain1
##                 Mean     Median    St.Dev.   95%CI_low 95%CI_upp
## alpha     0.69804352 0.65835063 0.27037676 0.287898244 1.3140461
## beta      0.92862598 0.82156847 0.54969128 0.183699137 2.2872696
## theta[1]  0.06019274 0.05676327 0.02544956 0.021069950 0.1199230
## theta[2]  0.10157737 0.08203988 0.07905076 0.008066869 0.3034085
## theta[3]  0.08874755 0.08396502 0.03760562 0.031186960 0.1769982
## theta[4]  0.11567784 0.11301465 0.03012598 0.064170937 0.1824525
## theta[5]  0.60382223 0.54935089 0.31219612 0.159731108 1.3640771
## theta[6]  0.61204831 0.60085518 0.13803302 0.372712375 0.9135269
## theta[7]  0.90263434 0.70803389 0.73960182 0.074122175 2.7598261
## theta[8]  0.89021051 0.70774794 0.72668155 0.072571029 2.8189252
## theta[9]  1.57678136 1.44390008 0.76825189 0.455195149 3.4297368
## theta[10] 1.98954127 1.96171250 0.42409802 1.241383787 2.9012192
##
## $chain2
##                 Mean     Median    St.Dev.   95%CI_low 95%CI_upp
## alpha     0.69101961 0.65803654 0.26548378 0.277195564 1.2858148
## beta      0.91627273 0.81434426 0.53750825 0.185772263 2.2702428
## theta[1]  0.05937364 0.05611283 0.02461866 0.020956151 0.1161870
## theta[2]  0.10017726 0.08116259 0.07855024 0.008266343 0.3010355
## theta[3]  0.08908126 0.08390782 0.03704170 0.031330829 0.1736876
## theta[4]  0.11592652 0.11356920 0.03064645 0.063595333 0.1829574
## theta[5]  0.59755632 0.54329373 0.31871551 0.149286703 1.3748728
## theta[6]  0.61080189 0.59946693 0.13804343 0.371373877 0.9097319
## theta[7]  0.89902759 0.70901502 0.72930369 0.076243503 2.7441445
## theta[8]  0.89954594 0.70727079 0.73345905 0.071250926 2.8054633
## theta[9]  1.57530029 1.45005738 0.75242164 0.469959364 3.3502795
## theta[10] 1.98911473 1.96227061 0.42298189 1.246910723 2.9102326
##
## $all.chains
##                 Mean     Median    St.Dev.  95%CI_low 95%CI_upp
## alpha     0.69453156 0.65803654 0.26795776 0.28329854 1.2999319
## beta      0.92244935 0.81828160 0.54365539 0.18549077 2.2785444
## theta[1]  0.05978319 0.05646474 0.02504028 0.02102807 0.1183433
## theta[2]  0.10087731 0.08162361 0.07880204 0.00811108 0.3017967
## theta[3]  0.08891440 0.08394667 0.03732417 0.03123228 0.1749967
## theta[4]  0.11580218 0.11326039 0.03038683 0.06385253 0.1827382
## theta[5]  0.60068928 0.54668011 0.31548032 0.15363752 1.3686801
## theta[6]  0.61142510 0.60015416 0.13803618 0.37203765 0.9122467
## theta[7]  0.90083096 0.70852800 0.73445465 0.07550465 2.7534885
## theta[8]  0.89487822 0.70761105 0.73007484 0.07211191 2.8067373
## theta[9]  1.57604083 1.44719278 0.76035931 0.46374515 3.3866706
## theta[10] 1.98932800 1.96195345 0.42352979 1.24334249 2.9068229

mcmc.out$WAIC

## [1] 48.69896
See Section 7.1 or help(nimbleMCMC) for more details about using nimbleMCMC.
Note that the WAIC value varies depending on what quantities are treated as parameters; see
Section 7.7 for more details.
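Because nchains = 2 here, mcmc.out$samples is a list with one matrix per chain, where rows are iterations and columns are monitored scalar nodes. A sketch of inspecting it, assuming the run above:

```r
samples1 <- mcmc.out$samples$chain1
dim(samples1)               # 10000 iterations by number of monitored nodes
head(samples1[ , "alpha"])  # first posterior draws of alpha
mean(samples1[ , "alpha"])  # compare with the Mean column of the summary
```

With a single chain, nimbleMCMC instead returns the samples matrix directly rather than a list.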


2.5 Creating, compiling and running a basic MCMC configuration

At this point we have initial values for all of the nodes in the model, and we have both the original
and compiled versions of the model. As a first algorithm to try on our model, let’s use NIMBLE’s
default MCMC. Note that conjugate relationships are detected for all nodes except for alpha, on
which the default sampler is a random walk Metropolis sampler.
pumpConf <- configureMCMC(pump, print = TRUE)

## [1]  RW sampler: alpha
## [2]  conjugate_dgamma_dgamma sampler: beta
## [3]  conjugate_dgamma_dpois sampler: theta[1]
## [4]  conjugate_dgamma_dpois sampler: theta[2]
## [5]  conjugate_dgamma_dpois sampler: theta[3]
## [6]  conjugate_dgamma_dpois sampler: theta[4]
## [7]  conjugate_dgamma_dpois sampler: theta[5]
## [8]  conjugate_dgamma_dpois sampler: theta[6]
## [9]  conjugate_dgamma_dpois sampler: theta[7]
## [10] conjugate_dgamma_dpois sampler: theta[8]
## [11] conjugate_dgamma_dpois sampler: theta[9]
## [12] conjugate_dgamma_dpois sampler: theta[10]

pumpConf$addMonitors(c("alpha", "beta", "theta"))

## thin = 1: alpha, beta, theta
pumpMCMC <- buildMCMC(pumpConf)
CpumpMCMC <- compileNimble(pumpMCMC, project = pump)
niter <- 1000
set.seed(1)
samples <- runMCMC(CpumpMCMC, niter = niter)
par(mfrow = c(1, 4), mai = c(.6, .4, .1, .2))
plot(samples[ , "alpha"], type = "l", xlab = "iteration",
ylab = expression(alpha))
plot(samples[ , "beta"], type = "l", xlab = "iteration",
ylab = expression(beta))
plot(samples[ , "alpha"], samples[ , "beta"], xlab = expression(alpha),
ylab = expression(beta))
plot(samples[ , "theta[1]"], type = "l", xlab = "iteration",
ylab = expression(theta[1]))

[Figure: trace plots of alpha, beta, and theta[1], and a scatterplot of alpha vs. beta, produced by the code above]
acf(samples[, "alpha"]) # plot autocorrelation of alpha sample
acf(samples[, "beta"]) # plot autocorrelation of beta sample

[Figure: autocorrelation (ACF) plots for the alpha and beta samples]

Notice the posterior correlation between alpha and beta. The autocorrelation of each parameter's
samples, shown in the acf plots, gives a measure of how well each parameter is mixing.

2.6 Customizing the MCMC

Let’s add an adaptive block sampler on alpha and beta jointly and see if that improves the mixing.
pumpConf$addSampler(target = c("alpha", "beta"), type = "RW_block",
control = list(adaptInterval = 100))
pumpMCMC2 <- buildMCMC(pumpConf)
# need to reset the nimbleFunctions in order to add the new MCMC
CpumpNewMCMC <- compileNimble(pumpMCMC2, project = pump,
resetFunctions = TRUE)


set.seed(1)
CpumpNewMCMC$run(niter)
## NULL
samplesNew <- as.matrix(CpumpNewMCMC$mvSamples)


par(mfrow = c(1, 4), mai = c(.6, .4, .1, .2))
plot(samplesNew[ , "alpha"], type = "l", xlab = "iteration",
ylab = expression(alpha))
plot(samplesNew[ , "beta"], type = "l", xlab = "iteration",
ylab = expression(beta))
plot(samplesNew[ , "alpha"], samplesNew[ , "beta"], xlab = expression(alpha),
ylab = expression(beta))
plot(samplesNew[ , "theta[1]"], type = "l", xlab = "iteration",
ylab = expression(theta[1]))

[Figure: trace plots of alpha, beta, and theta[1], and a scatterplot of alpha vs. beta, for the customized MCMC]

acf(samplesNew[, "alpha"]) # plot autocorrelation of alpha sample
acf(samplesNew[, "beta"]) # plot autocorrelation of beta sample

[Figure: autocorrelation (ACF) plots for the alpha and beta samples from the customized MCMC]


We can see that the block sampler has decreased the autocorrelation for both alpha and beta. Of
course these are just short runs, and what we are really interested in is the effective sample size of
the MCMC per computation time, but that’s not the point of this example.
Once you learn the MCMC system, you can write your own samplers and include them. The entire
system is written in nimbleFunctions.

2.7 Running MCEM

NIMBLE is a system for working with algorithms, not just an MCMC engine. So let's try maximizing the marginal likelihood for alpha and beta using Monte Carlo Expectation Maximization².
pump2 <- pump$newModel()
box <- list(list(c("alpha", "beta"), c(0, Inf)))
pumpMCEM <- buildMCEM(model = pump2, latentNodes = "theta[1:10]",
boxConstraints = box)
pumpMLE <- pumpMCEM$run()

## Iteration Number: 1.
## Current number of MCMC iterations: 1000.
## Parameter Estimates:
##     alpha      beta
## 0.8160625 1.1230921
## Convergence Criterion: 1.001.
## Iteration Number: 2.
## Current number of MCMC iterations: 1000.
## Parameter Estimates:
##     alpha      beta
## 0.8045037 1.1993128
## Convergence Criterion: 0.0223464.
## Monte Carlo error too big: increasing MCMC sample size.
## Iteration Number: 3.
## Current number of MCMC iterations: 1250.
## Parameter Estimates:
##     alpha      beta
## 0.8203178 1.2497067
## Convergence Criterion: 0.004913688.
## Monte Carlo error too big: increasing MCMC sample size.
## Monte Carlo error too big: increasing MCMC sample size.
## Monte Carlo error too big: increasing MCMC sample size.
## Iteration Number: 4.
## Current number of MCMC iterations: 3032.

² Note that for this model, one could analytically integrate over theta and then numerically maximize the resulting marginal likelihood.


## Parameter Estimates:
##     alpha      beta
## 0.8226618 1.2602452
## Convergence Criterion: 0.0004201048.
pumpMLE
##     alpha      beta
## 0.8226618 1.2602452
Both estimates are within 0.01 of the values reported by George et al. (1993)³. Some discrepancy
is to be expected since it is a Monte Carlo algorithm.

2.8 Creating your own functions

Now let’s see an example of writing our own algorithm and using it on the model. We’ll do
something simple: simulating multiple values for a designated set of nodes and calculating every
part of the model that depends on them. More details on programming in NIMBLE are in Part
IV.
Here is our nimbleFunction:
simNodesMany <- nimbleFunction(
    setup = function(model, nodes) {
        mv <- modelValues(model)
        deps <- model$getDependencies(nodes)
        allNodes <- model$getNodeNames()
    },
    run = function(n = integer()) {
        resize(mv, n)
        for(i in 1:n) {
            model$simulate(nodes)
            model$calculate(deps)
            copy(from = model, nodes = allNodes,
                 to = mv, rowTo = i, logProb = TRUE)
        }
    })
simNodesTheta1to5 <- simNodesMany(pump, "theta[1:5]")
simNodesTheta6to10 <- simNodesMany(pump, "theta[6:10]")
Here are a few things to notice about the nimbleFunction.
1. The setup function is written in R. It creates relevant information specific to our model for
use in the run-time code.
³ Table 2 of the paper accidentally swapped the two estimates.


2. The setup code creates a modelValues object to hold multiple sets of values for variables in
the model provided.
3. The run function is written in NIMBLE. It carries out the calculations using the information
determined once for each set of model and nodes arguments by the setup code. The run-time
code is what will be compiled.
4. The run code requires type information about the argument n. In this case it is a scalar
integer.
5. The for-loop looks just like R, but only sequential integer iteration is allowed.
6. The functions calculate and simulate, which were introduced above in R, can be used in
NIMBLE.
7. The special function copy is used here to record values from the model into the modelValues
object.
8. Multiple instances, or ‘specializations’, can be made by calling simNodesMany with different arguments. Above, simNodesTheta1to5 has been made by calling simNodesMany with the pump
model and nodes "theta[1:5]" as inputs to the setup function, while simNodesTheta6to10
differs by providing "theta[6:10]" as an argument. The returned objects are objects of a
uniquely generated R reference class with fields (member data) for the results of the setup
code and a run method (member function).
By the way, simNodesMany is very similar to a standard nimbleFunction provided with NIMBLE,
simNodesMV.
Now let’s execute this nimbleFunction in R, before compiling it.
set.seed(1) # make the calculation repeatable
pump$alpha <- pumpMLE[1]
pump$beta <- pumpMLE[2]
# make sure to update deterministic dependencies of the altered nodes
pump$calculate(pump$getDependencies(c("alpha","beta"), determOnly = TRUE))
## [1] 0
saveTheta <- pump$theta
simNodesTheta1to5$run(10)
simNodesTheta1to5$mv[["theta"]][1:2]
## [[1]]
## [1] 0.21829875 1.93210969 0.62296551 0.34197266 3.45729601 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
##
## [[2]]
## [1] 0.82759981 0.08784057 0.34414959 0.29521943 0.14183505 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154


simNodesTheta1to5$mv[["logProb_x"]][1:2]

## [[1]]
## [1] -10.250111 -26.921849 -25.630612 -15.594173 -11.217566  -7.429868
## [7]  -1.000761  -1.453644  -9.840589 -39.096527
##
## [[2]]
## [1] -61.043876  -1.057668 -11.060164 -11.761432  -3.425282  -7.429868
## [7]  -1.000761  -1.453644  -9.840589 -39.096527

In this code we have initialized the values of alpha and beta to their MLE and then recorded
the theta values to use below. Then we have requested 10 simulations from simNodesTheta1to5.
Shown are the first two simulation results for theta and the log probabilities of x. Notice that
theta[6:10] and the corresponding log probabilities for x[6:10] are unchanged because the nodes
being simulated are only theta[1:5]. In R, this function runs slowly.
Finally, let’s compile the function and run that version.
CsimNodesTheta1to5 <- compileNimble(simNodesTheta1to5,
project = pump, resetFunctions = TRUE)
Cpump$alpha <- pumpMLE[1]
Cpump$beta <- pumpMLE[2]
Cpump$calculate(Cpump$getDependencies(c("alpha","beta"), determOnly = TRUE))

## [1] 0
Cpump$theta <- saveTheta
set.seed(1)
CsimNodesTheta1to5$run(10)
## NULL
CsimNodesTheta1to5$mv[["theta"]][1:2]
## [[1]]
## [1] 0.21829875 1.93210969 0.62296551 0.34197266 3.45729601 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154
##
## [[2]]
## [1] 0.82759981 0.08784057 0.34414959 0.29521943 0.14183505 1.15835525
## [7] 0.99001994 0.30737332 0.09461909 0.15720154


CsimNodesTheta1to5$mv[["logProb_x"]][1:2]

## [[1]]
## [1] -10.250111 -26.921849 -25.630612 -15.594173 -11.217566  -2.782156
## [7]  -1.042151  -1.004362  -1.894675  -3.081102
##
## [[2]]
## [1] -61.043876  -1.057668 -11.060164 -11.761432  -3.425282  -2.782156
## [7]  -1.042151  -1.004362  -1.894675  -3.081102

Given the same initial values and the same random number generator seed, we got identical results
for theta[1:5] and their dependencies, but it happened much faster.


Chapter 3

More introduction
Now that we have shown a brief example, we will introduce more about the concepts and design of
NIMBLE.
One of the most important concepts behind NIMBLE is to allow a combination of high-level processing in R and low-level processing in C++. For example, when we write a Metropolis-Hastings
MCMC sampler in the NIMBLE language, the inspection of the model structure related to one
node is done in R, and the actual sampler calculations are done in C++. This separation between
setup and run steps will become clearer as we go.

3.1 NIMBLE adopts and extends the BUGS language for specifying models

We adopted the BUGS language, and we have extended it to make it more flexible. The BUGS
language became widely used in WinBUGS, then in OpenBUGS and JAGS. These systems all
provide automatically-generated MCMC algorithms, but we have adopted only the language for
describing models, not their systems for generating MCMCs.
NIMBLE extends BUGS by:
1. allowing you to write new functions and distributions and use them in BUGS models;
2. allowing you to define multiple models in the same code using conditionals evaluated when
the BUGS code is processed;
3. supporting a variety of more flexible syntax such as R-like named parameters and more general
algebraic expressions.
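As a small sketch of several of these extensions at once, the following illustrative model code (not one of the manual's worked examples) uses named parameters, an R-style sd parameterization, and an algebraic expression as a distribution parameter:

```r
library(nimble)

exampleCode <- nimbleCode({
    mu ~ dnorm(0, sd = 10)        # named parameter; sd instead of a precision
    sigma ~ dunif(0, 100)
    for(i in 1:10) {
        y[i] ~ dnorm(mu + 0.5 * i, sd = sigma)  # parameter is an expression
    }
})
```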
By supporting new functions and distributions, NIMBLE makes BUGS an extensible language,
which is a major departure from previous packages that implement BUGS.
We adopted BUGS because it has been so successful, with over 30,000 users by the time they
stopped counting (Lunn et al., 2009). Many papers and books provide BUGS code as a way to
document their statistical models. We describe NIMBLE’s version of BUGS later. The web sites
for WinBUGS, OpenBUGS and JAGS provide other useful documentation on writing models in
BUGS. For the most part, if you have BUGS code, you can try NIMBLE.
NIMBLE does several things with BUGS code:
1. NIMBLE creates a model definition object that knows everything about the variables and
their relationships written in the BUGS code. Usually you’ll ignore the model definition and
let NIMBLE’s default options take you directly to the next step.
2. NIMBLE creates a model object¹. This can be used to manipulate variables and operate
the model from R. Operating the model includes calculating, simulating, or querying the log
probability value of model nodes. These basic capabilities, along with the tools to query
model structure, allow one to write programs that use the model and adapt to its structure.
3. When you’re ready, NIMBLE can generate customized C++ code representing the model,
compile the C++, load it back into R, and provide a new model object that uses the compiled
model internally. We use the word ‘compile’ to refer to all of these steps together.

As an example of how radical a departure NIMBLE is from previous BUGS implementations,
consider a situation where you want to simulate new data from a model written in BUGS code.
Since NIMBLE creates model objects that you can control from R, simulating new data is trivial.
With previous BUGS-based packages, this isn’t possible.
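For instance, with the pump model from Chapter 2 (assuming pump exists as created there), a sketch of simulating new data might look like this:

```r
# Simulate new values for all stochastic dependencies of alpha and beta,
# including the data nodes x; includeData allows data nodes to be overwritten.
dataNodes <- pump$getDependencies(c("alpha", "beta"), self = FALSE)
pump$simulate(dataNodes, includeData = TRUE)
pump$x   # now contains newly simulated data
```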
More information about specifying and manipulating models is in Chapters 6 and 13.

3.2 nimbleFunctions for writing algorithms

NIMBLE provides nimbleFunctions for writing functions that can (but don’t have to) use BUGS
models. The main ways that nimbleFunctions can use BUGS models are:
1. inspecting the structure of a model, such as determining the dependencies between variables,
in order to do the right calculations with each model;
2. accessing values of the model’s variables;
3. controlling execution of the model’s probability calculations or corresponding simulations;
4. managing modelValues data structures for multiple sets of model values and probabilities.
In fact, the calculations of the model are themselves constructed as nimbleFunctions, as are the
algorithms provided in NIMBLE's algorithm library².
Programming with nimbleFunctions involves a fundamental distinction between two stages of processing:
1. A setup function within a nimbleFunction gives the steps that need to happen only once for
each new situation (e.g., for each new model). Typically such steps include inspecting the
model’s variables and their relationships, such as determining which parts of a model will
need to be calculated for an MCMC sampler. Setup functions are executed in R and never
compiled.
2. One or more run functions within a nimbleFunction give steps that need to happen multiple
times using the results of the setup function, such as the iterations of an MCMC sampler.
Formally, run code is written in the NIMBLE language, which you can think of as a small
subset of R along with features for operating models and related data structures. The NIMBLE
language is what the NIMBLE compiler can automatically turn into C++ as part of a compiled
nimbleFunction.

¹ or multiple model objects
² That's why it's easy to use new functions and distributions written as nimbleFunctions in BUGS code.
What NIMBLE does with a nimbleFunction is similar to what it does with a BUGS model:
1. NIMBLE creates a working R version of the nimbleFunction. This is most useful for debugging
(Section 15.7).
2. When you are ready, NIMBLE can generate C++ code, compile it, load it back into R and
give you new objects that use the compiled C++ internally. Again, we refer to these steps all
together as ‘compilation’. The behavior of compiled nimbleFunctions is usually very similar,
but not identical, to their uncompiled counterparts.
If you are familiar with object-oriented programming, you can think of a nimbleFunction as a class
definition. The setup function initializes a new object and run functions are class methods. Member
data are determined automatically as the objects from a setup function needed in run functions. If
no setup function is provided, the nimbleFunction corresponds to a simple (compilable) function
rather than a class.
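For example, here is a sketch of a nimbleFunction with no setup code (an illustrative function, not from the algorithm library); it behaves like a simple function and can be compiled:

```r
library(nimble)

doubleIt <- nimbleFunction(
    run = function(x = double(1)) {   # type declaration: numeric vector
        returnType(double(1))
        return(2 * x)
    })

doubleIt(1:3)                         # runs uncompiled in R
cDoubleIt <- compileNimble(doubleIt)  # compiled version behaves the same
```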
More about writing algorithms is in Chapter 15.

3.3 The NIMBLE algorithm library

In Version 0.6-13, the NIMBLE algorithm library includes:
1. MCMC with samplers including conjugate (Gibbs), slice, adaptive random walk (with options
for reflection or sampling on a log scale), adaptive block random walk, and elliptical slice,
among others. You can modify sampler choices and configurations from R before compiling
the MCMC. You can also write new samplers as nimbleFunctions.
2. WAIC calculation for model comparison after an MCMC algorithm has been run.
3. A set of particle filter (sequential Monte Carlo) methods including a basic bootstrap filter,
auxiliary particle filter, and Liu-West filter.
4. An ascent-based Monte Carlo Expectation Maximization (MCEM) algorithm.
5. A variety of basic functions that can be used as programming tools for larger algorithms.
These include:
a. A likelihood function for arbitrary parts of any model.
b. Functions to simulate one or many sets of values for arbitrary parts of any model.
c. Functions to calculate the summed log probability (density) for one or many sets of
values for arbitrary parts of any model along with stochastic dependencies in the model
structure.
More about the NIMBLE algorithm library is in Chapter 8.


Chapter 4

Installing NIMBLE
4.1 Requirements to run NIMBLE

You can run NIMBLE on any of the three common operating systems: Linux, Mac OS X, or
Windows.
The following are required to run NIMBLE.
1. R, of course.
2. The igraph and coda R packages.
3. A working C++ compiler that NIMBLE can use from R on your system. There are standard
open-source C++ compilers that the R community has already made easy to install. See
Section 4.2 for instructions. You don’t need to know anything about C++ to use NIMBLE.
This must be done before installing NIMBLE.
NIMBLE also uses a couple of C++ libraries that you don’t need to install, as they will already be
on your system or are provided by NIMBLE.
1. The Eigen C++ library for linear algebra. This comes with NIMBLE, or you can use your
own copy.
2. The BLAS and LAPACK numerical libraries. These come with R, but see Section 4.4.3 for
how to use a faster version of the BLAS.

Most fairly recent versions of these requirements should work.

4.2 Installing a C++ compiler for NIMBLE to use

NIMBLE needs a C++ compiler and the standard utility make in order to generate and compile
C++ for models and algorithms.¹

¹ This differs from most packages, which might need a C++ compiler only when the package is built. If you
normally install R packages using install.packages on Windows or OS X, the package arrives already built for your
system.

4.2.1 OS X

On OS X, you should install Xcode. The command-line tools, which are available as a smaller
installation, should be sufficient. This is freely available from the Apple developer site and the App
Store.
For the compiler to work correctly for OS X, the installed R must be for the correct version of OS
X. For example, R for Snow Leopard (OS X version 10.6) will attempt to use an incorrect C++
compiler if the installed OS X is actually version 10.9 or higher.
In the somewhat unlikely event you want to install from the source package rather than the CRAN
binary package, the easiest approach is to use the source package provided at R-nimble.org. If you
do want to install from the source package provided by CRAN, you’ll need to install this gfortran
package.

4.2.2 Linux

On Linux, you can install the GNU compiler suite (gcc/g++). You can use the package manager
to install pre-built binaries. On Ubuntu, the following command will install or update make, gcc
and libc.
sudo apt-get install build-essential

4.2.3 Windows

On Windows, you should download and install Rtools.exe, available from
https://cran.r-project.org/bin/windows/Rtools/. Select the appropriate executable corresponding to your version of R
(and follow the urge to update your version of R if you notice it is not the most recent). This installer
leads you through several ‘pages’. We think you can accept the defaults with one exception: check
the PATH checkbox (page 5) so that the installer will add the location of the C++ compiler and
related tools to your system’s PATH, ensuring that R can find them. After you click ‘Next’, you
will get a page with a window for customizing the new PATH variable. You shouldn’t need to do
anything there, so you can simply click ‘Next’ again.
The checkbox for the ‘R 2.15+ toolchain’ (page 4) must be checked (in order to have gcc/g++,
make, etc. installed). This should be checked by default.

4.3 Installing the NIMBLE package

Since NIMBLE is an R package, you can install it in the usual way, via install.packages("nimble")
in R or using the R CMD INSTALL method if you download the package source directly.
NIMBLE can also be obtained from the NIMBLE website. To install from our website, please see
our Download page for the specific invocation of install.packages.

4.3.1 Problems with installation

We have tested the installation on the three commonly used platforms – MacOS, Linux, Windows².
We don't anticipate problems with installation, but we want to hear about any and help resolve
them. Please post about installation problems to the nimble-users Google group or email
nimble.stats@gmail.com.

4.4 Customizing your installation

For most installations, you can ignore low-level details. However, there are some options that some
users may want to utilize.

4.4.1 Using your own copy of Eigen

NIMBLE uses the Eigen C++ template library for linear algebra. Version 3.2.1 of Eigen is included
in the NIMBLE package and that version will be used unless the package’s configuration script finds
another version on the machine. This works well, and the following is only relevant if you want to
use a different (e.g., newer) version.
The configuration script looks in the standard include directories, e.g. /usr/include and
/usr/local/include for the header file Eigen/Dense. You can specify a particular location in
either of two ways:
1. Set the environment variable EIGEN_DIR before installing the R package, for example: export
EIGEN_DIR=/usr/include/eigen3 in the bash shell.
2. Use R CMD INSTALL --configure-args='--with-eigen=/path/to/eigen' nimble_VERSION.tar.gz
or install.packages("nimble", configure.args = "--with-eigen=/path/to/eigen")
In these cases, the directory should be the full path to the directory that contains the Eigen
directory, e.g., /usr/include/eigen3. It is not the full path to the Eigen directory itself, i.e.,
NOT /usr/include/eigen3/Eigen.

4.4.2 Using libnimble

NIMBLE generates specialized C++ code for user-specified models and nimbleFunctions. This
code uses some NIMBLE C++ library classes and functions. By default, on Linux the library code
is compiled once as a linkable library - libnimble.so. This single instance of the library is then linked
with the code for each generated model. In contrast, the default for Windows and Mac OS X is to
compile the library code as a static library - libnimble.a - that is compiled into each model’s and
each algorithm’s own dynamically loadable library (DLL). This does repeat the same code across
models and so occupies more memory. There may be a marginal speed advantage. If one would
like to enable the linkable library in place of the static library (do this only on Mac OS X and other
UNIX variants and not on Windows), one can install the source package with the configuration
argument --enable-dylib set to true. First obtain the NIMBLE source package (which will have
the extension .tar.gz) from our website and then install as follows, replacing VERSION with the
appropriate version number:
R CMD INSTALL --configure-args='--enable-dylib=true' nimble_VERSION.tar.gz

² We've tested NIMBLE on Windows 7, 8 and 10.

4.4.3 BLAS and LAPACK

NIMBLE also uses BLAS and LAPACK for some of its linear algebra (in particular calculating
density values and generating random samples from multivariate distributions). NIMBLE will use
the same BLAS and LAPACK installed on your system that R uses. Note that a fast (and where
appropriate, threaded) BLAS can greatly increase the speed of linear algebra calculations. See
Section A.3.1 of the R Installation and Administration manual available on CRAN for more details
on providing a fast BLAS for your R installation.

4.4.4 Customizing compilation of the NIMBLE-generated C++

For each model or nimbleFunction, NIMBLE can generate and compile C++. To compile generated
C++, NIMBLE makes system calls starting with R CMD SHLIB and therefore uses the regular R
configuration in ${R_HOME}/etc/${R_ARCH}/Makeconf. NIMBLE places a Makevars file in the
directory in which the code is generated, and R CMD SHLIB uses this file as usual.
In all but specialized cases, the general compilation mechanism will suffice. However, one can
customize this. One can specify the location of an alternative Makevars (or Makevars.win) file
to use. Such an alternative file should define the variables PKG_CPPFLAGS and PKG_LIBS. These
should contain, respectively, the pre-processor flag to locate the NIMBLE include directory, and
the necessary libraries to link against (and their location as necessary), e.g., Rlapack and Rblas on
Windows, and libnimble. Advanced users can also change their default compilers by editing the
Makevars file, see Section 1.2.1 of the Writing R Extensions manual available on CRAN.
Use of this file allows users to specify additional compilation and linking flags. See the Writing R
Extensions manual for more details of how this can be used and what it can contain.
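A hypothetical alternative Makevars file might look like the following sketch; the paths are placeholders for illustration, not NIMBLE's actual install locations:

```make
# Pre-processor flag locating the NIMBLE include directory (illustrative path)
PKG_CPPFLAGS = -I/path/to/R/library/nimble/include
# Libraries to link against (on Windows, Rlapack and Rblas would also be listed)
PKG_LIBS = -L/path/to/R/library/nimble/CppCode -lnimble
```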

Part II

Models in NIMBLE

Chapter 5

Writing models in NIMBLE's dialect of BUGS
Models in NIMBLE are written using a variation on the BUGS language. From BUGS code,
NIMBLE creates a model object. This chapter describes NIMBLE’s version of BUGS. The next
chapter explains how to build and manipulate model objects.

5.1 Comparison to BUGS dialects supported by WinBUGS, OpenBUGS and JAGS

Many users will come to NIMBLE with some familiarity with WinBUGS, OpenBUGS, or JAGS, so
we start by summarizing how NIMBLE is similar to and different from those before documenting
NIMBLE’s version of BUGS more completely. In general, NIMBLE aims to be compatible with the
original BUGS language and also JAGS’ version. However, at this point, there are some features
not supported by NIMBLE, and there are some extensions that are planned but not implemented.

5.1.1 Supported features of BUGS and JAGS

1. Stochastic and deterministic¹ node declarations.
2. Most univariate and multivariate distributions.
3. Link functions.
4. Most mathematical functions.
5. 'for' loops for iterative declarations.
6. Arrays of nodes up to 4 dimensions.
7. Truncation and censoring as in JAGS using the T() notation and dinterval.
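For example, truncation with T() can be sketched as follows (an illustrative declaration, restricting a normal distribution to positive values):

```r
truncCode <- nimbleCode({
    mu ~ dnorm(0, sd = 10)
    y ~ T(dnorm(mu, sd = 1), 0, Inf)  # truncated below at 0
})
```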

¹ NIMBLE calls non-stochastic nodes 'deterministic', whereas BUGS calls them 'logical'. NIMBLE uses 'logical' in
the way R does, to refer to boolean (TRUE/FALSE) variables.

5.1.2 NIMBLE's Extensions to BUGS and JAGS

NIMBLE extends the BUGS language in the following ways:

1. User-defined functions and distributions – written as nimbleFunctions – can be used in model
code. See Chapter 12.
2. Multiple parameterizations for distributions, similar to those in R, can be used.
3. Named parameters for distributions and functions, similar to R function calls, can be used.
4. Linear algebra, including for vectorized calculations of simple algebra, can be used in deterministic declarations.
5. Distribution parameters can be expressions, as in JAGS but not in WinBUGS. Caveat: parameters to multivariate distributions (e.g., dmnorm) cannot be expressions (but an expression
can be defined in a separate deterministic expression and the resulting variable then used).
6. Alternative models can be defined from the same model code by using if-then-else statements
that are evaluated when the model is defined.
7. More flexible indexing of vector nodes within larger variables is allowed. For example one can
place a multivariate normal vector arbitrarily within a higher-dimensional object, not just in
the last index.
8. More general constraints can be declared using dconstraint, which extends the concept of
JAGS’ dinterval.
9. Link functions can be used in stochastic, as well as deterministic, declarations.²
10. Data values can be reset, and which parts of a model are flagged as data can be changed,
allowing one model to be used for different data sets without rebuilding the model each time.
11. As of Version 0.6-6 we now support stochastic/dynamic indexes. More specifically in earlier
versions all indexes needed to be constants. Now indexes can be other nodes or functions
of other nodes. For a given dimension of a node being indexed, if the index is not constant, it must be a scalar value. So expressions such as mu[k[i], 3] or mu[k[i], 1:3] or
mu[k[i], j[i]] are allowed, but not mu[k[i]:(k[i]+1)]. Nested dynamic indexes such as
mu[k[j[i]]] are also allowed.
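A sketch of a dynamic index in model code, in which a categorical variable selects which mean applies to each observation (illustrative, not one of the manual's worked examples):

```r
mixCode <- nimbleCode({
    for(j in 1:2)
        mu[j] ~ dnorm(0, sd = 10)
    for(i in 1:N) {
        k[i] ~ dcat(p[1:2])             # k[i] is itself a node
        y[i] ~ dnorm(mu[k[i]], sd = 1)  # dynamic (stochastic) index
    }
})
```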

5.1.3 Not-yet-supported features of BUGS and JAGS

In this release, the following are not supported.
1. The appearance of the same node on the left-hand side of both a <- and a ∼ declaration
(used in WinBUGS for data assignment for the value of a stochastic node).
2. Multivariate nodes must appear with brackets, even if they are empty. E.g., x cannot be
multivariate but x[] or x[2:5] can be.
3. NIMBLE generally determines the dimensionality and sizes of variables from the BUGS code.
However, when a variable appears with blank indices, such as in x.sum <- sum(x[]), and
if the dimensions of the variable are not clearly defined in other declarations, NIMBLE currently requires that the dimensions of x be provided when the model object is created (via
nimbleModel).
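In such a case the sizes can be passed to nimbleModel via its dimensions argument; a sketch:

```r
code <- nimbleCode({
    x.sum <- sum(x[])          # x appears only with a blank index
    y ~ dnorm(x.sum, sd = 1)
})
# The size of x cannot be deduced from the code, so provide it explicitly:
model <- nimbleModel(code, dimensions = list(x = 5))
```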

5.2 Writing models

Here we introduce NIMBLE’s version of BUGS. The WinBUGS, OpenBUGS and JAGS manuals
are also useful resources for writing BUGS models, including many examples.
² But beware of the possibility of needing to set values for 'lifted' nodes created by NIMBLE.

5.2.1 Declaring stochastic and deterministic nodes

BUGS is a declarative language for graphical (or hierarchical) models. Most programming languages
are imperative, which means a series of commands will be executed in the order they are written. A
declarative language like BUGS is more like building a machine before using it. Each line declares
that a component should be plugged into the machine, but it doesn’t matter in what order they
are declared as long as all the right components are plugged in by the end of the code.
The machine in this case is a graphical model³. A node (sometimes called a vertex) holds one value,
which may be a scalar or a vector. Edges define the relationships between nodes. A huge variety
of statistical models can be thought of as graphs.
Here is the code to define and create a simple linear regression model with four observations.
library(nimble)
mc <- nimbleCode({
    intercept ~ dnorm(0, sd = 1000)
    slope ~ dnorm(0, sd = 1000)
    sigma ~ dunif(0, 100)
    for(i in 1:4) {
        predicted.y[i] <- intercept + slope * x[i]
        y[i] ~ dnorm(predicted.y[i], sd = sigma)
    }
})
model <- nimbleModel(mc, data = list(y = rnorm(4)))
library(igraph)
layout <- matrix(ncol = 2, byrow = TRUE,
    # These seem to be rescaled to fit in the plot area,
    # so I'll just use 0-100 as the scale
    data = c(33, 100,
             66, 100,
             50, 0, # first three are parameters
             15, 50, 35, 50, 55, 50, 75, 50, # x's
             20, 75, 40, 75, 60, 75, 80, 75, # predicted.y's
             25, 25, 45, 25, 65, 25, 85, 25) # y's
    )
sizes <- c(45, 30, 30,
           rep(20, 4),
           rep(50, 4),
           rep(20, 4))
edge.color <- "black"
# c(
#   rep("green", 8),
#   rep("red", 4),
#   rep("blue", 4),
#   rep("purple", 4))
stoch.color <- "deepskyblue2"
det.color <- "orchid3"
rhs.color <- "gray73"
fill.color <- c(
    rep(stoch.color, 3),
    rep(rhs.color, 4),
    rep(det.color, 4),
    rep(stoch.color, 4)
)
plot(model$graph, vertex.shape = "crectangle",
     vertex.size = sizes,
     vertex.size2 = 20,
     layout = layout,
     vertex.label.cex = 3.0,
     vertex.color = fill.color,
     edge.width = 3,
     asp = 0.5,
     edge.color = edge.color)

³ Technically, a directed acyclic graph.

[Figure 5.1: Graph of a linear regression model. Nodes: intercept, slope, sigma; x[1]-x[4]; predicted.y[1]-predicted.y[4]; y[1]-y[4].]
The graph representing the model is shown in Figure 5.1. Each observation, y[i], is a node whose
edges say that it follows a normal distribution depending on a predicted value, predicted.y[i],
and standard deviation, sigma, which are each nodes. Each predicted value is a node whose edges
say how it is calculated from slope, intercept, and one value of an explanatory variable, x[i],
which are each nodes.
This graph is created from the following BUGS code:
{
    intercept ~ dnorm(0, sd = 1000)
    slope ~ dnorm(0, sd = 1000)
    sigma ~ dunif(0, 100)
    for(i in 1:4) {
        predicted.y[i] <- intercept + slope * x[i]
        y[i] ~ dnorm(predicted.y[i], sd = sigma)
    }
}
In this code, stochastic relationships are declared with '~' and deterministic relationships are
declared with '<-'. For example, each y[i] follows a normal distribution with mean predicted.y[i]
and standard deviation sigma. Each predicted.y[i] is the result of intercept + slope * x[i].
The for-loop yields the equivalent of writing four lines of code, each with a different value of i.
It does not matter in what order the nodes are declared. Imagine that each line of code draws
part of Figure 5.1, and all that matters is that everything gets drawn in the end. Available
distributions, default and alternative parameterizations, and functions are listed in Section 5.2.4.
An equivalent graph can be created by this BUGS code:
{
    intercept ~ dnorm(0, sd = 1000)
    slope ~ dnorm(0, sd = 1000)
    sigma ~ dunif(0, 100)
    for(i in 1:4) {
        y[i] ~ dnorm(intercept + slope * x[i], sd = sigma)
    }
}
In this case, the predicted.y[i] nodes in Figure 5.1 will still be created automatically by NIMBLE,
but with different, NIMBLE-generated names.

5.2.2 More kinds of BUGS declarations

Here are some examples of valid lines of BUGS code. This code does not describe a sensible or
complete model, and it includes some arbitrary indices (e.g. mvx[8:10, i]) to illustrate flexibility.
Instead the purpose of each line is to illustrate a feature of NIMBLE’s version of BUGS.
{
    # 1. normal distribution with BUGS parameter order
    x ~ dnorm(a + b * c, tau)
    # 2. normal distribution with a named parameter
    y ~ dnorm(a + b * c, sd = sigma)
    # 3. For-loop and nested indexing
    for(i in 1:N) {
        for(j in 1:M[i]) {
            z[i,j] ~ dexp(r[ blockID[i] ])
        }
    }
    # 4. multivariate distribution with arbitrary indexing
    for(i in 1:3)
        mvx[8:10, i] ~ dmnorm(mvMean[3:5], cov = mvCov[1:3, 1:3, i])
    # 5. User-provided distribution
    w ~ dMyDistribution(hello = x, world = y)
    # 6. Simple deterministic node
    d1 <- a + b
    # 7. Vector deterministic node with matrix multiplication
    d2[] <- A[ , ] %*% mvMean[1:5]
    # 8. Deterministic node with user-provided function
    d3 <- foo(x, hooray = y)
}
When a variable appears only on the right-hand side, it can be provided via constants (in which
case it can never be changed) or via data or inits, as discussed in Chapter 6.
Notes on the comment-numbered lines are:
1. x follows a normal distribution with mean a + b*c and precision tau (default BUGS second
parameter for dnorm).
2. y follows a normal distribution with the same mean as x but a named standard deviation
parameter instead of a precision parameter (sd = 1/sqrt(precision)).
3. z[i, j] follows an exponential distribution with parameter r[ blockID[i] ]. This shows
how for-loops can be used for indexing of variables containing multiple nodes. Variables that
define for-loop indices (N and M) must also be provided as constants.
4. The arbitrary block mvx[8:10, i] follows a multivariate normal distribution, with a named
covariance matrix instead of BUGS’ default of a precision matrix. As in R, curly braces for
for-loop contents are only needed if there is more than one line.
5. w follows a user-defined distribution. See Chapter 12.
6. d1 is a scalar deterministic node that, when calculated, will be set to a + b.
7. d2 is a vector deterministic node using matrix multiplication in R’s syntax.
8. d3 is a deterministic node using a user-provided function. See Chapter 12.
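The relationship between the precision and standard-deviation parameterizations in notes 1 and 2 (sd = 1/sqrt(precision)) is plain arithmetic and can be checked numerically. The following is a pure-Python sketch of the two forms of the normal density, for illustration only; model code itself uses NIMBLE's dnorm:

```python
import math

def dnorm_tau(x, mean, tau):
    """Normal density parameterized by precision tau (BUGS default)."""
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-tau * (x - mean) ** 2 / 2)

def dnorm_sd(x, mean, sd):
    """Normal density parameterized by standard deviation sd."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

# sd = 1/sqrt(tau), so the two parameterizations give the same density
tau = 4.0
sd = 1 / math.sqrt(tau)  # 0.5
assert abs(dnorm_tau(1.3, 0.0, tau) - dnorm_sd(1.3, 0.0, sd)) < 1e-12
```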
5.2.2.1 More about indexing

Examples of allowed indexing include:
• x[i]              # a single index
• x[i:j]            # a range of indices
• x[i:j, k:l]       # multiple single indices or ranges for higher-dimensional arrays
• x[i:j, ]          # blank indices indicating the full range
• x[3*i+7]          # computed indices
• x[(3*i):(5*i+1)]  # computed lower and upper ends of an index range
• x[k[i]+1]         # a dynamic (and computed) index
• x[k[j[i]]]        # nested dynamic indexes
• x[k[i], 1:3]      # nested indexing of rows or columns

NIMBLE does not allow multivariate nodes to be used without square brackets, which is an
incompatibility with JAGS. Therefore a statement like xbar <- mean(x) in JAGS must be converted to
xbar <- mean(x[]) (if x is a vector) or xbar <- mean(x[,]) (if x is a matrix) for NIMBLE.⁴
Section 6.1.1.5 discusses how to provide NIMBLE with dimensions of x when needed.
Generally NIMBLE supports R-like linear algebra expressions and attempts to follow the same
rules as R about dimensions (although in some cases this is not possible). For example, x[1:3]
%*% y[1:3] converts x[1:3] into a row vector and thus computes the inner product, which is
returned as a 1 × 1 matrix (use inprod to get it as a scalar, which is typically easier). Like in R, a
scalar index will result in dropping a dimension unless the argument drop=FALSE is provided. For
example, mymatrix[i, 1:3] will be a vector of length 3, but mymatrix[i, 1:3, drop=FALSE]
will be a 1 × 3 matrix. More about indexing and dimensions is discussed in Section 11.3.2.6.
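The difference between %*% and inprod above is a difference in return shape, not in the arithmetic. A small pure-Python analogy (illustration only; these are NIMBLE operators in model code) makes the distinction concrete:

```python
# x[1:3] %*% y[1:3] treats x as a 1 x n row vector and returns a 1 x 1 matrix,
# while inprod(x[1:3], y[1:3]) returns the same inner product as a scalar.
def matmul_rowvec(x, y):
    """Mimic x %*% y for two length-n vectors: result is a 1x1 matrix."""
    return [[sum(xi * yi for xi, yi in zip(x, y))]]

def inprod(x, y):
    """Mimic inprod(x, y): the same inner product, returned as a scalar."""
    return sum(xi * yi for xi, yi in zip(x, y))

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
assert matmul_rowvec(x, y) == [[32.0]]  # 1x1 matrix
assert inprod(x, y) == 32.0             # plain scalar
```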

5.2.3 Vectorized versus scalar declarations

Suppose you need nodes logY[i] that should be the log of the corresponding Y[i], say for i from
1 to 10. Conventionally this would be created with a for loop:
{
    for(i in 1:10) {
        logY[i] <- log(Y[i])
    }
}
Since NIMBLE supports R-like algebraic expressions, an alternative in NIMBLE’s dialect of BUGS
is to use a vectorized declaration like this:
{
    logY[1:10] <- log(Y[1:10])
}
There is an important difference between the models that are created by the above two methods.
The first creates 10 scalar nodes, logY[1] , . . . , logY[10]. The second creates one vector node,
logY[1:10]. If each logY[i] is used separately by an algorithm, it may be more efficient computationally if they are declared as scalars. If they are all used together, it will often make sense to
declare them as a vector.
⁴ In nimbleFunctions, as explained in later chapters, square brackets with blank indices are not necessary for multivariate objects.


5.2.4 Available distributions

5.2.4.1 Distributions

NIMBLE supports most of the distributions allowed in BUGS and JAGS. Table 5.1 lists the distributions
that are currently supported, with their default parameterizations, which match those of
BUGS.⁵ NIMBLE also allows one to use alternative parameterizations for a variety of distributions,
as described next. See Section 12.2 to learn how to write new distributions using nimbleFunctions.

Table 5.1: Distributions with their default order of parameters. The value of the random variable is denoted by x.

Name                Usage and constraints                    Density                                          Lower  Upper
Bernoulli           dbern(prob = p), 0 < p < 1               p^x (1-p)^(1-x)                                  0      1
Beta                dbeta(shape1 = a, shape2 = b),           x^(a-1) (1-x)^(b-1) / β(a, b)                    0      1
                    a > 0, b > 0
Binomial            dbin(prob = p, size = n),                choose(n, x) p^x (1-p)^(n-x)                     0      n
                    0 < p < 1, n in N*
CAR (intrinsic)     dcar_normal(adj, weights, num,           see chapter 9 for details
                    tau, c, zero_mean)
CAR (proper)        dcar_proper(mu, C, adj, num,             see chapter 9 for details
                    M, tau, gamma)
Categorical         dcat(prob = p), p in (R+)^N              p_x / Σ_i p_i                                    1      N
Chi-square          dchisq(df = k), k > 0                    x^(k/2 - 1) exp(-x/2) / (2^(k/2) Γ(k/2))         0
Double exponential  ddexp(location = µ, rate = τ), τ > 0     (τ/2) exp(-τ |x - µ|)
Dirichlet           ddirch(alpha = α), α_j >= 0              Γ(Σ_j α_j) Π_j x_j^(α_j - 1) / Γ(α_j)            0
Exponential         dexp(rate = λ), λ > 0                    λ exp(-λx)                                       0
Flat                dflat()                                  ∝ 1 (improper)
Gamma               dgamma(shape = r, rate = λ),             λ^r x^(r-1) exp(-λx) / Γ(r)                      0
                    λ > 0, r > 0
Half flat           dhalfflat()                              ∝ 1 (improper)                                   0
Inverse gamma       dinvgamma(shape = r, scale = λ),         λ^r x^(-(r+1)) exp(-λ/x) / Γ(r)                  0
                    λ > 0, r > 0
Logistic            dlogis(location = µ, rate = τ), τ > 0    τ exp{(x-µ)τ} / [1 + exp{(x-µ)τ}]^2
Log-normal          dlnorm(meanlog = µ, taulog = τ), τ > 0   (τ/(2π))^(1/2) x^(-1) exp{-τ(log(x)-µ)^2 / 2}    0

⁵ Note that the same distributions are available for writing nimbleFunctions, but in that case the default parameterizations and function names match R's when possible. Please see Section 11.2.4 for how to use distributions in nimbleFunctions.

Name                 Usage and constraints                    Density                                              Lower  Upper
Multinomial          dmulti(prob = p, size = n),              n! Π_j p_j^(x_j) / x_j!
                     Σ_j x_j = n
Multivariate normal  dmnorm(mean = µ, prec = Λ),              (2π)^(-d/2) |Λ|^(1/2) exp{-(x-µ)^T Λ (x-µ) / 2}
                     Λ positive definite
Multivariate         dmvt(mu = µ, prec = Λ, df = ν),          Γ((ν+d)/2) / (Γ(ν/2) (νπ)^(d/2)) |Λ|^(1/2)
Student t            Λ positive def.                          (1 + (x-µ)^T Λ (x-µ)/ν)^(-(ν+d)/2)
Negative binomial    dnegbin(prob = p, size = r),             choose(x+r-1, x) p^r (1-p)^x                         0
                     0 < p <= 1, r >= 0
Normal               dnorm(mean = µ, tau = τ), τ > 0          (τ/(2π))^(1/2) exp{-τ(x-µ)^2 / 2}
Poisson              dpois(lambda = λ), λ > 0                 exp(-λ) λ^x / x!                                     0
Student t            dt(mu = µ, tau = τ, df = k),             Γ((k+1)/2) / Γ(k/2) (τ/(kπ))^(1/2)
                     τ > 0, k > 0                             {1 + τ(x-µ)^2 / k}^(-(k+1)/2)
Uniform              dunif(min = a, max = b), a < b           1 / (b-a)                                            a      b
Weibull              dweib(shape = v, lambda = λ),            vλ x^(v-1) exp(-λ x^v)                               0
                     v > 0, λ > 0
Wishart              dwish(R = R, df = k),                    |x|^((k-p-1)/2) |R|^(k/2) exp{-tr(Rx)/2} /
                     R p × p pos. def., k >= p                (2^(pk/2) π^(p(p-1)/4) Π_{i=1}^p Γ((k+1-i)/2))
Inverse Wishart      dinvwish(S = S, df = k),                 |x|^(-(k+p+1)/2) |S|^(k/2) exp{-tr(S x^(-1))/2} /
                     S p × p pos. def., k >= p                (2^(pk/2) π^(p(p-1)/4) Π_{i=1}^p Γ((k+1-i)/2))

5.2.4.1.1 Improper distributions

Note that dcar_normal, dflat and dhalfflat specify improper prior distributions and should
only be used when the posterior distribution of the model is known to be proper. Also for these
distributions, the density function returns the unnormalized density and the simulation function
returns NaN so these distributions are not appropriate for algorithms that need to simulate from
the prior or require proper (normalized) densities.
5.2.4.2 Alternative parameterizations for distributions

NIMBLE allows one to specify distributions in model code using a variety of parameterizations,
including the BUGS parameterizations. Available parameterizations are listed in Table 5.2. To
understand how NIMBLE handles alternative parameterizations, it is useful to distinguish three
cases, using the gamma distribution as an example:
1. A canonical parameterization is used directly for computations.⁶ For gamma, this is (shape, scale).

   ⁶ Usually this is the parameterization in the Rmath header of R's C implementation of distributions.
2. The BUGS parameterization is the one defined in the original BUGS language. In general this
is the parameterization for which conjugate MCMC samplers can be executed most efficiently.
For dgamma, this is (shape, rate).
3. An alternative parameterization is one that must be converted into the canonical parameterization.
For dgamma, NIMBLE provides both (shape, rate) and (mean, sd) parameterizations
and creates nodes to calculate (shape, scale) from either (shape, rate) or (mean, sd). In the
case of dgamma, the BUGS parameterization is also an alternative parameterization.

Since NIMBLE provides compatibility with existing BUGS and JAGS code, the order of parameters places the BUGS parameterization first. For example, the order of parameters for dgamma is
dgamma(shape, rate, scale, mean, sd). Like R, if parameter names are not given, they are
taken in order, so that (shape, rate) is the default. This happens to match R’s order of parameters,
but it need not. If names are given, they can be given in any order. NIMBLE knows that rate is
an alternative to scale and that (mean, sd) are an alternative to (shape, scale or rate).
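The conversions NIMBLE performs for alternative parameterizations are plain arithmetic, so they can be verified directly. This Python sketch (illustration only, not NIMBLE code) checks the gamma (mean, sd) and beta (mean, sd) conversions listed in Table 5.2 by recovering the original moments from the converted parameters:

```python
import math

# Gamma: (mean, sd) -> (shape, scale) with shape = mean^2/sd^2, scale = sd^2/mean.
mean, sd = 10.0, 2.0
shape = mean**2 / sd**2
scale = sd**2 / mean
assert abs(shape * scale - mean) < 1e-12           # gamma mean = shape * scale
assert abs(math.sqrt(shape) * scale - sd) < 1e-12  # gamma sd = sqrt(shape) * scale

# Beta: (mean, sd) -> (shape1, shape2) as in Table 5.2.
mean, sd = 0.3, 0.1
shape1 = mean**2 * (1 - mean) / sd**2 - mean
shape2 = mean * (1 - mean)**2 / sd**2 + mean - 1
s = shape1 + shape2
assert abs(shape1 / s - mean) < 1e-12                           # beta mean = a/(a+b)
assert abs(shape1 * shape2 / (s**2 * (s + 1)) - sd**2) < 1e-12  # beta variance
```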
Table 5.2: Distribution parameterizations allowed in NIMBLE. The first column indicates the supported parameterizations for distributions given in Table 5.1. The second column indicates the relationship to the canonical parameterization used in NIMBLE.

Parameterization              NIMBLE re-parameterization
dbern(prob)                   dbin(size = 1, prob)
dbeta(shape1, shape2)         canonical
dbeta(mean, sd)               dbeta(shape1 = mean^2 * (1-mean) / sd^2 - mean,
                                    shape2 = mean * (1-mean)^2 / sd^2 + mean - 1)
dbin(prob, size)              canonical
dcat(prob)                    canonical
dchisq(df)                    canonical
ddexp(location, scale)        canonical
ddexp(location, rate)         ddexp(location, scale = 1 / rate)
ddexp(location, var)          ddexp(location, scale = sqrt(var / 2))
ddirch(alpha)                 canonical
dexp(rate)                    canonical
dexp(scale)                   dexp(rate = 1 / scale)
dgamma(shape, scale)          canonical
dgamma(shape, rate)           dgamma(shape, scale = 1 / rate)
dgamma(mean, sd)              dgamma(shape = mean^2 / sd^2, scale = sd^2 / mean)
dinvgamma(shape, rate)        canonical
dinvgamma(shape, scale)       dinvgamma(shape, rate = 1 / scale)
dlogis(location, scale)       canonical
dlogis(location, rate)        dlogis(location, scale = 1 / rate)
dlnorm(meanlog, sdlog)        canonical
dlnorm(meanlog, taulog)       dlnorm(meanlog, sdlog = 1 / sqrt(taulog))
dlnorm(meanlog, varlog)       dlnorm(meanlog, sdlog = sqrt(varlog))
dmulti(prob, size)            canonical

Parameterization                       NIMBLE re-parameterization
dmnorm(mean, cholesky, prec_param=1)   canonical (precision)
dmnorm(mean, cholesky, prec_param=0)   canonical (covariance)
dmnorm(mean, prec)                     dmnorm(mean, cholesky = chol(prec), prec_param=1)
dmnorm(mean, cov)                      dmnorm(mean, cholesky = chol(cov), prec_param=0)
dmvt(mu, cholesky, df, prec_param=1)   canonical (precision/inverse scale)
dmvt(mu, cholesky, df, prec_param=0)   canonical (scale)
dmvt(mu, prec, df)                     dmvt(mu, cholesky = chol(prec), df, prec_param=1)
dmvt(mu, scale, df)                    dmvt(mu, cholesky = chol(scale), df, prec_param=0)
dnegbin(prob, size)                    canonical
dnorm(mean, sd)                        canonical
dnorm(mean, tau)                       dnorm(mean, sd = 1 / sqrt(tau))
dnorm(mean, var)                       dnorm(mean, sd = sqrt(var))
dpois(lambda)                          canonical
dt(mu, sigma, df)                      canonical
dt(mu, tau, df)                        dt(mu, sigma = 1 / sqrt(tau), df)
dt(mu, sigma2, df)                     dt(mu, sigma = sqrt(sigma2), df)
dunif(min, max)                        canonical
dweib(shape, scale)                    canonical
dweib(shape, rate)                     dweib(shape, scale = 1 / rate)
dweib(shape, lambda)                   dweib(shape, scale = lambda^(-1 / shape))
dwish(cholesky, df, scale_param=1)     canonical (scale)
dwish(cholesky, df, scale_param=0)     canonical (inverse scale)
dwish(R, df)                           dwish(cholesky = chol(R), df, scale_param = 0)
dwish(S, df)                           dwish(cholesky = chol(S), df, scale_param = 1)
dinvwish(cholesky, df, scale_param=1)  canonical (scale)
dinvwish(cholesky, df, scale_param=0)  canonical (inverse scale)
dinvwish(R, df)                        dinvwish(cholesky = chol(R), df, scale_param = 0)
dinvwish(S, df)                        dinvwish(cholesky = chol(S), df, scale_param = 1)

Note that for the multivariate normal, multivariate t, Wishart, and inverse Wishart distributions, the
canonical parameterization uses the Cholesky decomposition of either the precision/inverse-scale
matrix or the covariance/scale matrix. For example, for the multivariate normal, if prec_param=TRUE,
the cholesky argument is treated as the Cholesky decomposition of a precision matrix; otherwise it
is treated as the Cholesky decomposition of a covariance matrix.
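To make the cholesky convention concrete: supplying chol(cov) means supplying the upper-triangular factor U with U^T U = cov (R's chol() returns the upper triangle). A minimal pure-Python check of that identity for a 2 × 2 matrix, for illustration only:

```python
import math

def chol2x2(m):
    """Upper-triangular Cholesky factor U of a 2x2 SPD matrix, so that U^T U = m
    (matching R's chol(), which returns the upper triangle)."""
    u11 = math.sqrt(m[0][0])
    u12 = m[0][1] / u11
    u22 = math.sqrt(m[1][1] - u12**2)
    return [[u11, u12], [0.0, u22]]

cov = [[4.0, 1.0], [1.0, 2.0]]
U = chol2x2(cov)
# Reconstruct (U^T U)[i][j] and compare with cov
recon = [[sum(U[k][i] * U[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
for i in range(2):
    for j in range(2):
        assert abs(recon[i][j] - cov[i][j]) < 1e-12
```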
In addition, NIMBLE supports alternative distribution names, known as aliases, as in JAGS, as
specified in Table 5.3.

Table 5.3: Distributions with alternative names (aliases)
Distribution          Canonical name   Alias
Binomial              dbin             dbinom
Chi-square            dchisq           dchisqr
Double exponential    ddexp            dlaplace
Dirichlet             ddirch           ddirich
Multinomial           dmulti           dmultinom
Negative binomial     dnegbin          dnbinom
Weibull               dweib            dweibull
Wishart               dwish            dwishart

We plan to, but do not currently, include the following distributions as part of core NIMBLE:
beta-binomial, Dirichlet-multinomial, F, Pareto, or forms of the multivariate t other than the
standard one provided.

5.2.5 Available BUGS language functions

Tables 11.2-11.3 show the available operators and functions. Support for more general R expressions
is covered in Chapter 11 about programming with nimbleFunctions.
For the most part NIMBLE supports the functions used in BUGS and JAGS, with exceptions
indicated in the table. Additional functions provided by NIMBLE are also listed. Note that we
provide distribution functions for use in calculations, namely the ‘p’, ‘q’, and ‘d’ functions. See
Section 11.2.4 for details on the syntax for using distribution functions as functions in deterministic
calculations, as only some parameterizations are allowed and the names of some distributions differ
from those used to define stochastic nodes in a model.
Table 5.4: Functions operating on scalars, many of which can operate on each element (component-wise) of vectors and matrices. Status column indicates if the function is currently provided in NIMBLE. Vector input column indicates if the function can take a vector as an argument (i.e., if the function is vectorized).

Usage                         Description                       Comments                    Status  Vector input
x | y, x & y                  logical OR (|) and AND (&)                                    yes     yes
!x                            logical not                                                   yes     yes
x > y, x >= y                 greater than (and or equal to)                                yes     yes
x < y, x <= y                 less than (and or equal to)                                   yes     yes
x != y, x == y                (not) equals                                                  yes     yes
x + y, x - y, x * y           component-wise operators          mix of scalar and vector    yes     yes
x / y                         component-wise division           vector x and scalar y       yes     yes
x^y, pow(x, y)                power                             x^y; vector x, scalar y     yes     yes
x %% y                        modulo (remainder)                                            yes     no
min(x1, x2), max(x1, x2)      min. (max.) of two scalars                                    yes     See pmin, pmax
exp(x)                        exponential                                                   yes     yes
log(x)                        natural logarithm                                             yes     yes
sqrt(x)                       square root                                                   yes     yes
abs(x)                        absolute value                                                yes     yes
step(x)                       step function at 0                0 if x < 0, 1 if x >= 0     yes     yes
equals(x, y)                  equality of two scalars           1 if x == y, 0 if x != y    yes     yes
cube(x)                       third power                       x^3                         yes     yes
sin(x), cos(x), tan(x)        trigonometric functions                                       yes     yes
asin(x), acos(x), atan(x)     inverse trigonometric functions                               yes     yes
asinh(x), acosh(x), atanh(x)  inv. hyperbolic trig. functions                               yes     yes
logit(x)                      logit                             log(x/(1-x))                yes     yes
ilogit(x), expit(x)           inverse logit                     exp(x)/(1+exp(x))           yes     yes
probit(x)                     probit (Gaussian quantile)        Φ^(-1)(x)                   yes     yes
iprobit(x), phi(x)            inverse probit (Gaussian CDF)     Φ(x)                        yes     yes
cloglog(x)                    complementary log log             log(-log(1-x))              yes     yes
icloglog(x)                   inverse complementary log log     1 - exp(-exp(x))            yes     yes
ceiling(x)                    ceiling function                  ⌈x⌉                         yes     yes
floor(x)                      floor function                    ⌊x⌋                         yes     yes
round(x)                      round to integer                                              yes     yes
trunc(x)                      truncation to integer                                         yes     yes
lgamma(x), loggam(x)          log gamma function                log Γ(x)                    yes     yes
besselK(k, nu, expon.scaled)  modified bessel function
                              of the second kind                                            yes     yes
log1p(x)                      log of 1 + x                      log(1 + x)                  yes     yes
lfactorial(x), logfact(x)     log factorial                     log x!                      yes     yes
qDIST(x, PARAMS)              "q" distribution functions        canonical parameterization  yes     yes
pDIST(x, PARAMS)              "p" distribution functions        canonical parameterization  yes     yes
rDIST(x, PARAMS)              "r" distribution functions        canonical parameterization  yes     yes
dDIST(x, PARAMS)              "d" distribution functions        canonical parameterization  yes     yes
sort(x)                                                                                     no
rank(x, s)                                                                                  no
ranked(x, s)                                                                                no
order(x)                                                                                    no

Table 5.5: Functions operating on vectors and matrices. Status column indicates if the function is currently provided in NIMBLE.

Usage                   Description                                  Comments                    Status
inverse(x)              matrix inverse                               x symmetric, positive def.  yes
chol(x)                 matrix Cholesky factorization                x symmetric, positive def.  yes
t(x)                    matrix transpose                             x^T                         yes
x %*% y                 matrix multiply                              xy; x, y conformant         yes
inprod(x, y)            dot product                                  x^T y; x and y vectors      yes
solve(x, y)             solve system of equations                    x^(-1) y; y matrix or       yes
                                                                     vector
forwardsolve(x, y)      solve lower-triangular system of equations   x^(-1) y; x lower-          yes
                                                                     triangular
backsolve(x, y)         solve upper-triangular system of equations   x^(-1) y; x upper-          yes
                                                                     triangular
logdet(x)               log matrix determinant                       log |x|                     yes
asRow(x)                convert vector to 1-row matrix               sometimes automatic         yes
asCol(x)                convert vector to 1-column matrix            sometimes automatic         yes
sum(x)                  sum of elements of x                                                     yes
mean(x)                 mean of elements of x                                                    yes
sd(x)                   standard deviation of elements of x                                      yes
prod(x)                 product of elements of x                                                 yes
min(x), max(x)          min. (max.) of elements of x                                             yes
pmin(x, y), pmax(x, y)  vector of mins (maxs) of elements of x and y                             yes
interp.lin(x, v1, v2)   linear interpolation
eigen(x)$values         matrix eigenvalues                           x symmetric
eigen(x)$vectors        matrix eigenvectors                          x symmetric
svd(x)$d                matrix singular values
svd(x)$u                matrix left singular vectors
svd(x)$v                matrix right singular vectors
See Section 12.1 to learn how to use nimbleFunctions to write new functions for use in BUGS code.

5.2.6 Available link functions

NIMBLE allows the link functions listed in Table 5.6.
Table 5.6: Link functions.

Link function      Description              Range      Inverse
cloglog(y) <- x    Complementary log log    0 < y < 1  y <- icloglog(x)
log(y) <- x        Log                      0 < y      y <- exp(x)
logit(y) <- x      Logit                    0 < y < 1  y <- expit(x)
probit(y) <- x     Probit                   0 < y < 1  y <- iprobit(x)
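The inverse links in Table 5.6 are the inverse-link functions from Table 5.4, so each pair round-trips. A quick Python check of the logit and cloglog pairs (illustration only; in model code these are NIMBLE functions):

```python
import math

def logit(p): return math.log(p / (1 - p))
def expit(x): return math.exp(x) / (1 + math.exp(x))  # inverse logit (ilogit)
def cloglog(p): return math.log(-math.log(1 - p))
def icloglog(x): return 1 - math.exp(-math.exp(x))    # inverse cloglog

for p in (0.1, 0.5, 0.9):
    assert abs(expit(logit(p)) - p) < 1e-12
    assert abs(icloglog(cloglog(p)) - p) < 1e-12
```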

5.2.7.2 Censoring

Right-censoring can be declared with censored[i] ~ dinterval(t[i], c[i]), with censored[i]
set (as data) to 1 if the observation is right-censored (t[i] > c[i]) and 0 if it is observed. The
data vector for t should have NA (indicating missing data) for any censored t[i] entries. (As a
result, these nodes will be sampled in an MCMC.) The data vector for c should give the censoring
times corresponding to censored entries and a value above the observed times for uncensored
entries (e.g., Inf). Left-censored observations would be specified by setting censored[i] to 0
and t[i] to NA.
The dinterval is not really a distribution but rather a trick: in the above example when
censored[i] = 1 it gives a ‘probability’ of 1 if t[i] > c[i] and 0 otherwise. This means
that t[i] <= c[i] is treated as impossible. More generally than simple right- or left-censoring,
censored[i] ~ dinterval(t[i], c[i, ]) is defined such that for a vector of increasing
cutpoints, c[i, ], t[i] is enforced to fall within the censored[i]-th cutpoint interval. This is
done by setting data censored[i] as follows:
censored[i] = 0 if t[i] <= c[i, 1]
censored[i] = m if c[i, m] < t[i] <= c[i, m+1] for 1 <= m <= M-1
censored[i] = M if c[i, M] < t[i]
(The i index is provided only for consistency with the previous example.) The most common uses
of dinterval will be for left- and right-censored data, in which case c[i,] will be a single value
(and typically given as simply c[i]), and for interval-censored data, in which case c[i,] will be a
vector of two values.
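The cutpoint rule above is simple to state as a function. This Python sketch (illustration only; in a model, dinterval is a NIMBLE distribution) computes the interval index that censored[i] must encode, with cutpoints given in increasing order as c[i, 1], ..., c[i, M] in the manual's notation:

```python
def dinterval_index(t, cutpoints):
    """Interval index enforced by censored ~ dinterval(t, c):
    0 if t <= c[1]; m if c[m] < t <= c[m+1]; M if t > c[M]
    (1-based cutpoint indices as in the manual)."""
    m = 0
    for c in cutpoints:
        if t > c:
            m += 1
        else:
            break
    return m

cuts = [1.0, 2.0, 3.0]
assert dinterval_index(0.5, cuts) == 0   # t <= c[1]
assert dinterval_index(1.5, cuts) == 1   # c[1] < t <= c[2]
assert dinterval_index(3.5, cuts) == 3   # t > c[M]
# Right-censoring uses a single cutpoint: censored == 1 means t > c
assert dinterval_index(5.0, [2.0]) == 1
assert dinterval_index(1.0, [2.0]) == 0
```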
Nodes following a dinterval distribution should normally be set as data with known values. Otherwise, the node may be simulated during initialization in some algorithms (e.g., MCMC) and
thereby establish a permanent, perhaps unintended, constraint.
Censoring differs from truncation because censoring an observation involves bounds on a random
variable that could have taken any value, while in truncation we know a priori that a datum could
not have occurred outside the truncation range.
5.2.7.3 Constraints and ordering

NIMBLE provides a more general way to enforce constraints using dconstraint(cond). For example, we could specify that the sum of mu1 and mu2 must be positive like this:
mu1 ~ dnorm(0, 1)
mu2 ~ dnorm(0, 1)
constraint_data ~ dconstraint( mu1 + mu2 > 0 )
with constraint_data set (as data) to 1. This is equivalent to a half-normal distribution on the
half-plane µ1 + µ2 > 0. Nodes following dconstraint should be provided as data for the same
reason of avoiding unintended initialization described above for dinterval.
Formally, dconstraint(cond) is a probability distribution on {0, 1} such that P (1) = 1 if cond is
TRUE and P (0) = 1 if cond is FALSE.
Of course, in many cases, parameterizing the model so that the constraints are automatically
respected may be a better strategy than using dconstraint. One should be cautious about
constraints that would make it hard for an MCMC or optimization to move through the parameter
space (such as equality constraints that involve two or more parameters). For such restrictive
constraints, general purpose algorithms that are not tailored to the constraints may fail or be inefficient.
If constraints are used, it will generally be wise to ensure the model is initialized with values that
satisfy them.
5.2.7.3.1 Ordering

To specify an ordering of parameters, such as alpha1 <= alpha2 <= alpha3, one can use dconstraint as follows:
constraint_data ~ dconstraint( alpha1 <= alpha2 & alpha2 <= alpha3 )
Note that unlike in BUGS, one cannot specify prior ordering using syntax such as
alpha[1] ~ dnorm(0, 1) I(, alpha[2])
alpha[2] ~ dnorm(0, 1) I(alpha[1], alpha[3])
alpha[3] ~ dnorm(0, 1) I(alpha[2], )
as this does not represent a directed acyclic graph.
Also note that specifying prior ordering using T(,) can result in possibly unexpected results. For
example:
alpha1 ~ dnorm(0, 1)
alpha2 ~ dnorm(0, 1) T(alpha1, )
alpha3 ~ dnorm(0, 1) T(alpha2, )
will enforce alpha1 ≤ alpha2 ≤ alpha3, but it does not treat the three parameters symmetrically.
Instead it puts a marginal prior on alpha1 that is standard normal and then constrains alpha2
and alpha3 to follow truncated normal distributions. This is not equivalent to a symmetric prior
on the three alphas that assigns zero probability density when values are not in order.
NIMBLE does not support the JAGS sort syntax.


Chapter 6

Building and using models
This chapter explains how to build and manipulate model objects starting from BUGS code.

6.1 Creating model objects

NIMBLE provides two functions for creating model objects: nimbleModel and readBUGSmodel. The
first, nimbleModel, is more general and was illustrated in Chapter 2. The second, readBUGSmodel,
provides compatibility with BUGS file formats for models, variables, data, and initial values for
MCMC.
In addition one can create new model objects from existing model objects.

6.1.1 Using nimbleModel to create a model

nimbleModel processes BUGS code to determine all the nodes, variables, and their relationships
in a model. Any constants must be provided at this step. Data and initial values can optionally
be provided. BUGS code passed to nimbleModel must go through nimbleCode.
We look again at the pump example from the introduction:
pumpCode <- nimbleCode({
    for (i in 1:N){
        theta[i] ~ dgamma(alpha, beta)
        lambda[i] <- theta[i]*t[i]
        x[i] ~ dpois(lambda[i])
    }
    alpha ~ dexp(1.0)
    beta ~ dgamma(0.1, 1.0)
})

pumpConsts <- list(N = 10,
                   t = c(94.3, 15.7, 62.9, 126, 5.24,
                         31.4, 1.05, 1.05, 2.1, 10.5))

pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))

pumpInits <- list(alpha = 1, beta = 1,
                  theta = rep(0.1, pumpConsts$N))

pump <- nimbleModel(code = pumpCode, name = "pump", constants = pumpConsts,
                    data = pumpData, inits = pumpInits)

6.1.1.1 Data and constants

NIMBLE makes a distinction between data and constants:
• Constants can never be changed and must be provided when a model is defined. For example,
a vector of known index values, such as for block indices, helps define the model graph itself
and must be provided as constants. Variables used in the index ranges of for-loops must also
be provided as constants.
• Data is a label for the role a node plays in the model. Nodes marked as data will by default
be protected from any functions that would simulate over their values (see simulate in
Chapter 13), but it is possible to over-ride that default or to change their values by direct
assignment. This allows an algorithm to be applied to many data sets in the same model
without re-creating the model each time. It also allows simulation of data in a model.

WinBUGS, OpenBUGS and JAGS do not allow data values to be changed or different nodes to be
labeled as data without starting from the beginning again. Hence they do not distinguish between
constants and data.
For compatibility with BUGS and JAGS, NIMBLE allows both to be provided in the constants
argument to nimbleModel, in which case NIMBLE handles values for stochastic nodes as data and
everything else as constants.
Values for nodes that appear only on the right-hand side of BUGS declarations (e.g., covariates/predictors) can be provided as constants or as data or initial values. There is no real difference
between providing as data or initial values and the values can be added after building a model via
setInits or setData.
6.1.1.2 Providing data and initial values to an existing model

Whereas constants must be provided during the call to nimbleModel, data and initial values can be
provided later via the model member functions setData and setInits. For example, if pumpData
is a named list of data values (as above), then pump$setData(pumpData) sets the named variables
to the values in the list.
setData does two things: it sets the values of the data nodes, and it flags those nodes as containing
data. nimbleFunction programmers can then use that information to control whether an algorithm
should over-write data or not. For example, NIMBLE's simulate functions by default do not
overwrite data values but can be told to do so. Values of data variables can be replaced, and the
indication of which nodes should be treated as data can be reset by using the resetData method,
e.g. pump$resetData().
6.1.1.3 Missing data values

Sometimes one needs a model variable to have a mix of data and non-data, often due to missing
data values. In NIMBLE, when data values are provided, any nodes with NA values will not be
labeled as data. A node following a multivariate distribution must be either entirely observed or
entirely missing.
Here’s an example of running an MCMC on the pump model, with two of the observations taken
to be missing. Some of the steps in this example are documented more below. NIMBLE’s default
MCMC configuration will treat the missing values as unknowns to be sampled, as can be seen in
the MCMC output here.
pumpMiss <- pump$newModel()
pumpMiss$resetData()
pumpDataNew <- pumpData
pumpDataNew$x[c(1, 3)] <- NA
pumpMiss$setData(pumpDataNew)
pumpMissConf <- configureMCMC(pumpMiss)
pumpMissConf$addMonitors('x', 'alpha', 'beta', 'theta')
## thin = 1: alpha, beta, x, theta
pumpMissMCMC <- buildMCMC(pumpMissConf)
Cobj <- compileNimble(pumpMiss, pumpMissMCMC)
niter <- 10
set.seed(0)
Cobj$pumpMissMCMC$run(niter)
## NULL
samples <- as.matrix(Cobj$pumpMissMCMC$mvSamples)
samples[1:5, 13:17]

##      x[1] x[2] x[3] x[4] x[5]
## [1,]   17    1    2   14    3
## [2,]   11    1    4   14    3
## [3,]   14    1    9   14    3
## [4,]   11    1   24   14    3
## [5,]    9    1   29   14    3

Missing values may also occur in explanatory/predictor variables. Values for such variables should
be passed in via the data argument to nimbleModel, with NA for the missing values. In some
contexts, one would want to specify distributions for such explanatory variables, for example so
that an MCMC would impute the missing values.
6.1.1.4 Defining alternative models with the same code

Avoiding code duplication is a basic principle of good programming. In NIMBLE, one can use
definition-time if-then-else statements to create different models from the same code. As a simple
example, say we have a linear regression model and want to consider including or omitting x[2] as
an explanatory variable:
regressionCode <- nimbleCode({
    intercept ~ dnorm(0, sd = 1000)
    slope1 ~ dnorm(0, sd = 1000)
    if(includeX2) {
        slope2 ~ dnorm(0, sd = 1000)
        for(i in 1:N)
            predictedY[i] <- intercept + slope1 * x1[i] + slope2 * x2[i]
    } else {
        for(i in 1:N) predictedY[i] <- intercept + slope1 * x1[i]
    }
    sigmaY ~ dunif(0, 100)
    for(i in 1:N) Y[i] ~ dnorm(predictedY[i], sigmaY)
})
includeX2 <- FALSE
modelWithoutX2 <- nimbleModel(regressionCode, constants = list(N = 30),
                              check = FALSE)
modelWithoutX2$getVarNames()
## [1] "intercept"                       "slope1"
## [3] "predictedY"                      "sigmaY"
## [5] "lifted_d1_over_sqrt_oPsigmaY_cP" "Y"
## [7] "x1"

includeX2 <- TRUE
modelWithX2 <- nimbleModel(regressionCode, constants = list(N = 30),
                           check = FALSE)
modelWithX2$getVarNames()
## [1] "intercept"                       "slope1"
## [3] "slope2"                          "predictedY"
## [5] "sigmaY"                          "lifted_d1_over_sqrt_oPsigmaY_cP"
## [7] "Y"                               "x1"
## [9] "x2"


Whereas the constants are a property of the model definition – since they may help determine the
model structure itself – data nodes can be different in different copies of the model generated from
the same model definition. The setData and setInits methods described above can be used for each copy
of the model.

6.1.1.5 Providing dimensions via nimbleModel

nimbleModel can usually determine the dimensions of every variable from the declarations in the
BUGS code. However, it is possible to use a multivariate object only with empty indices (e.g. x[,]),
in which case the dimensions must be provided as an argument to nimbleModel.
Here’s an example with multivariate nodes. The first provides indices, so no dimensions argument
is needed, while the second omits the indices and provides a dimensions argument instead.
code <- nimbleCode({
    y[1:K] ~ dmulti(p[1:K], n)
    p[1:K] ~ ddirch(alpha[1:K])
    log(alpha[1:K]) ~ dmnorm(alpha0[1:K], R[1:K, 1:K])
})
K <- 5
model <- nimbleModel(code, constants = list(n = 3, K = K,
                                            alpha0 = rep(0, K), R = diag(K)),
                     check = FALSE)

codeAlt <- nimbleCode({
    y[] ~ dmulti(p[], n)
    p[] ~ ddirch(alpha[])
    log(alpha[]) ~ dmnorm(alpha0[], R[ , ])
})
model <- nimbleModel(codeAlt, constants = list(n = 3, K = K, alpha0 = rep(0, K),
                                               R = diag(K)),
                     dimensions = list(y = K, p = K, alpha = K),
                     check = FALSE)
In that example, since alpha0 and R are provided as constants, we don’t need to specify their
dimensions.

6.1.2 Creating a model from standard BUGS and JAGS input files

Users with BUGS and JAGS experience may have files set up in standard formats for use in
BUGS and JAGS. readBUGSmodel can read in the model, data/constant values and initial values
in those formats. It can also take information directly from R objects somewhat more flexibly than
nimbleModel, specifically allowing inputs set up similarly to those for BUGS and JAGS. In either
case, after processing the inputs, it calls nimbleModel. Note that unlike BUGS and JAGS, only a
single set of initial values can be specified in creating a model. Please see help(readBUGSmodel)
for argument details.
As an example of using readBUGSmodel, let’s create a model for the pump example from BUGS.
pumpDir <- system.file('classic-bugs', 'vol1', 'pump', package = 'nimble')
pumpModel <- readBUGSmodel('pump.bug', data = 'pump-data.R',
                           inits = 'pump-init.R', dir = pumpDir)
## Detected x as data within 'constants'.
Note that readBUGSmodel allows one to include var and data blocks in the model file as in some
of the BUGS examples (such as inhaler). The data block pre-computes constant and data values.
Also note that if data and inits are provided as files, the files should contain R code that creates
objects analogous to what would populate the list if a list were provided instead. Please see the
JAGS manual examples or the classic-bugs directory in the NIMBLE package for example syntax.
NIMBLE by and large does not need the information given in a var block but occasionally this is
used to determine dimensionality, such as in the case of syntax like xbar <- mean(x[]) where x is
a variable that appears only on the right-hand side of BUGS expressions.
Note that NIMBLE does not handle formatting such as in some of the original BUGS examples in
which data was indicated with syntax such as data x in 'x.txt'.

6.1.3 Making multiple instances from the same model definition

Sometimes it is useful to have more than one copy of the same model. For example, an algorithm
(i.e., nimbleFunction) such as an MCMC will be bound to a particular model before it is run. A user
could build multiple algorithms to use the same model instance, or they may want each algorithm
to have its own instance of the model.
There are two ways to create new instances of a model, shown in this example:
simpleCode <- nimbleCode({
    for(i in 1:N) x[i] ~ dnorm(0, 1)
})
# Return the model definition only, not a built model
simpleModelDefinition <- nimbleModel(simpleCode, constants = list(N = 10),
                                     returnDef = TRUE, check = FALSE)
# Make one instance of the model
simpleModelCopy1 <- simpleModelDefinition$newModel(check = FALSE)
# Make another instance from the same definition
simpleModelCopy2 <- simpleModelDefinition$newModel(check = FALSE)
# Ask simpleModelCopy2 for another copy of itself
simpleModelCopy3 <- simpleModelCopy2$newModel(check = FALSE)
Each copy of the model can have different nodes flagged as data and different values in any nodes.
They cannot have different values of N because that is a constant; it must be a constant because it
helps define the model.

6.2 NIMBLE models are objects you can query and manipulate

NIMBLE models are objects that can be modified and manipulated from R. In this section we
introduce some basic ways to use a model object. Chapter 13 covers more topics for writing
algorithms that use models.

6.2.1 What are variables and nodes?

This section introduces the basic concepts and terminology needed to speak clearly about NIMBLE models.
Suppose we have created a model from the following BUGS code.
mc <- nimbleCode({
    a ~ dnorm(0, 0.001)
    for(i in 1:5) {
        y[i] ~ dnorm(a, sd = 0.1)
        for(j in 1:3)
            z[i,j] ~ dnorm(y[i], tau)
    }
    tau ~ dunif(0, 20)
    y.squared[1:5] <- y[1:5]^2
})
model <- nimbleModel(mc, data = list(z = matrix(rnorm(15), nrow = 5)))
In NIMBLE terminology:
• The variables of this model are a, y, z, and y.squared.
• The nodes of this model are a, y[1], ..., y[5], z[1,1], ..., z[5,3], and y.squared[1:5].
In graph terminology, nodes are vertices in the model graph.
• The node functions of this model are a ~ dnorm(0, 0.001), y[i] ~ dnorm(a, sd = 0.1),
z[i,j] ~ dnorm(y[i], tau), and y.squared[1:5] <- y[1:5]^2. Each node’s
calculations are handled by a node function. Sometimes the distinction between nodes and
node functions is important, but when it is not important we may refer to both simply as
nodes.
• The scalar elements of this model include all the scalar nodes as well as the scalar elements
y.squared[1], ..., y.squared[5] of the multivariate node y.squared[1:5].

6.2.2 Determining the nodes and variables in a model

One can determine the variables in a model using getVarNames and the nodes in a model using
getNodeNames. Optional arguments to getNodeNames allow you to select only certain types of
nodes, as discussed in Section 13.1.1 and in the R help for getNodeNames.


model$getVarNames()

## [1] "a"
"y"
## [3] "lifted_d1_over_sqrt_oPtau_cP" "z"
## [5] "tau"
"y.squared"
model$getNodeNames()

##  [1] "a"                            "tau"
##  [3] "y[1]"                         "y[2]"
##  [5] "y[3]"                         "y[4]"
##  [7] "y[5]"                         "lifted_d1_over_sqrt_oPtau_cP"
##  [9] "y.squared[1:5]"               "z[1, 1]"
## [11] "z[1, 2]"                      "z[1, 3]"
## [13] "z[2, 1]"                      "z[2, 2]"
## [15] "z[2, 3]"                      "z[3, 1]"
## [17] "z[3, 2]"                      "z[3, 3]"
## [19] "z[4, 1]"                      "z[4, 2]"
## [21] "z[4, 3]"                      "z[5, 1]"
## [23] "z[5, 2]"                      "z[5, 3]"

Note that some of the nodes may be ‘lifted’ nodes introduced by NIMBLE (Section 13.1.2). In this
case lifted_d1_over_sqrt_oPtau_cP (this is a node for the standard deviation of the z nodes
using NIMBLE’s canonical parameterization of the normal distribution) is the only lifted node in
the model.
To determine the dependencies of one or more nodes in the model, you can use getDependencies
as discussed in Section 13.1.3.
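A few of those optional arguments can be sketched as follows, assuming the model object built above (a non-exhaustive sample; see help(getNodeNames) for the full set):

```r
# Filtering arguments to getNodeNames select subsets of nodes.
model$getNodeNames(stochOnly = TRUE)   # only stochastic nodes
model$getNodeNames(determOnly = TRUE)  # only deterministic nodes
model$getNodeNames(dataOnly = TRUE)    # only nodes flagged as data
model$getNodeNames(topOnly = TRUE)     # only top-level nodes
```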

6.2.3 Accessing nodes

Model variables can be accessed and set just as in R using $ and [[ ]]. For example
model$a <- 5
model$a
## [1] 5
model[["a"]]
## [1] 5


model$y[2:4] <- rnorm(3)
model$y
## [1]         NA -0.9261095 -0.1771040  0.4020118         NA
model[["y"]][c(1, 5)] <- rnorm(2)
model$y
## [1] -0.7317482 -0.9261095 -0.1771040  0.4020118  0.8303732
model$z[1,]
## [1] -0.3340008  1.2079084  0.5210227

While nodes that are part of a variable can be accessed as above, each node also has its own name
that can be used to access it directly. For example, y[2] has the name ‘y[2]’ and can be accessed
by that name as follows:
model[["y[2]"]]
## [1] -0.9261095
model[["y[2]"]] <- -5
model$y
## [1] -0.7317482 -5.0000000 -0.1771040

0.4020118

0.8303732

model[["z[2, 3]"]]
## [1] -0.1587546
model[["z[2:4, 1:2]"]][1, 2]
## [1] -1.231323
model$z[2, 2]
## [1] -1.231323
Notice that node names can include index blocks, such as model[["z[2:4, 1:2]"]], and these are
not strictly required to correspond to actual nodes. Such blocks can be subsequently sub-indexed
in the regular R manner, such as model[["z[2:4, 1:2]"]][1, 2].

6.2.4 How nodes are named

Every node has a name that is a character string including its indices, with a space after every
comma. For example, X[1, 2, 3] has the name ‘X[1, 2, 3]’. Nodes following multivariate
distributions have names that include their index blocks. For example, a multivariate node for
X[6:10, 3] has the name ‘X[6:10, 3]’.
The definitive source for node names in a model is getNodeNames, described previously.
In the event you need to ensure that a name is formatted correctly, you can use the
expandNodeNames method. For example, to get the spaces correctly inserted into ‘X[1,1:5]’:
multiVarCode <- nimbleCode({
    X[1, 1:5] ~ dmnorm(mu[], cov[,])
    X[6:10, 3] ~ dmnorm(mu[], cov[,])
})
multiVarModel <- nimbleModel(multiVarCode,
                             dimensions = list(mu = 5, cov = c(5, 5)),
                             calculate = FALSE)
multiVarModel$expandNodeNames("X[1,1:5]")
## [1] "X[1, 1:5]"
Alternatively, for those inclined to R’s less commonly used features, a nice trick is to use its parse
and deparse functions.
deparse(parse(text = "X[1,1:5]", keep.source = FALSE)[[1]])
## [1] "X[1, 1:5]"
The keep.source = FALSE makes parse more efficient.
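Building on that trick, one could wrap it in a small base-R helper (this function is illustrative, not part of NIMBLE) to normalize the spacing of a whole vector of node-name strings:

```r
# Illustrative helper (not a NIMBLE function): normalize node-name
# spacing by round-tripping each string through parse() and deparse().
normalizeNodeNames <- function(nodeNames) {
    vapply(nodeNames,
           function(nm) deparse(parse(text = nm, keep.source = FALSE)[[1]]),
           character(1), USE.NAMES = FALSE)
}

normalizeNodeNames(c("X[1,1:5]", "X[6:10,3]"))
## [1] "X[1, 1:5]" "X[6:10, 3]"
```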

6.2.5 Why use node names?

Syntax like model[["z[2, 3]"]] may seem strange at first, because the natural habit of an R user
would be model[["z"]][2,3]. To see its utility, consider the example of writing the nimbleFunction
given in Section 2.8. By giving every scalar node a name, even if it is part of a multivariate variable,
one can write functions in R or NIMBLE that access any single node by a name, regardless of the
dimensionality of the variable in which it is embedded. This is particularly useful for NIMBLE,
which resolves how to access a particular node during the compilation process.

6.2.6 Checking if a node holds data

Finally, you can query whether a node is flagged as data using the isData method applied to one
or more nodes or nodes within variables:

model$isData('z[1]')
## [1] TRUE
model$isData(c('z[1]', 'z[2]', 'a'))
## [1]  TRUE  TRUE FALSE
model$isData('z')
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [15] TRUE
model$isData('z[1:3, 1]')
## [1] TRUE TRUE TRUE


Part III

Algorithms in NIMBLE

Chapter 7

MCMC
NIMBLE provides a variety of paths to creating and executing an MCMC algorithm, which differ
greatly in their simplicity of use and in the options available for customization.
The most direct approach to invoking the MCMC engine is using the nimbleMCMC function (Section
7.1). This one-line call creates and executes an MCMC, and provides a wide range of options for
controlling the MCMC: specifying monitors, burn-in, and thinning, running multiple MCMC chains
with different initial values, and returning posterior samples, summary statistics, and/or a WAIC
value. However, this approach is restricted to using NIMBLE’s default MCMC algorithm; further
customization of, for example, the specific samplers employed, is not possible.
The lengthier and more customizable approach to invoking the MCMC engine on a particular
NIMBLE model object involves the following steps:
1. (Optional) Create and customize an MCMC configuration for a particular model:
a. Use configureMCMC to create an MCMC configuration (see Section 7.2). The configuration contains a list of samplers with the node(s) they will sample.
b. (Optional) Customize the MCMC configuration:
i. Add, remove, or re-order the list of samplers (Section 7.9 and help(samplers) in
R for details), including adding your own samplers (Section 15.5);
ii. Change the tuning parameters or adaptive properties of individual samplers;
iii. Change the variables to monitor (record for output) and thinning intervals for
MCMC samples.
2. Use buildMCMC to build the MCMC object and its samplers either from the model (using
default MCMC configuration) or from a customized MCMC configuration (Section 7.3).
3. Compile the MCMC object (and the model), unless one is debugging and wishes to run the
uncompiled MCMC.
4. Run the MCMC and extract the samples (Sections 7.4, 7.5 and 7.6).
5. Optionally, calculate the WAIC using the posterior samples (Section 7.7).
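The steps above can be sketched in code as follows (a sketch, not a complete script; it assumes code, data, and inits objects as in the examples of Chapter 2):

```r
# Sketch of the step-by-step MCMC workflow described above.
Rmodel   <- nimbleModel(code, data = data, inits = inits)
mcmcConf <- configureMCMC(Rmodel)                   # step 1: configure
Rmcmc    <- buildMCMC(mcmcConf)                     # step 2: build
Cmodel   <- compileNimble(Rmodel)                   # step 3: compile model...
Cmcmc    <- compileNimble(Rmcmc, project = Rmodel)  # ...and MCMC
Cmcmc$run(10000)                                    # step 4: run...
samples  <- as.matrix(Cmcmc$mvSamples)              # ...and extract samples
```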
NIMBLE provides two additional functions which facilitate the comparison of multiple, different,
MCMC algorithms:

• MCMCsuite can run multiple, different MCMCs for the same model. These can include multiple NIMBLE MCMCs from different configurations as well as external MCMCs such as from
WinBUGS, OpenBUGS, JAGS or Stan (Section 7.11).
• compareMCMCs manages multiple calls to MCMCsuite and generates html pages comparing
performance of different MCMCs.

End-to-end examples of MCMC in NIMBLE can be found in Sections 2.5-2.6 and Section 7.10.

7.1 One-line invocation of MCMC: nimbleMCMC

The most direct approach to executing an MCMC algorithm in NIMBLE is using nimbleMCMC.
This single function can be used to create an underlying model and associated MCMC algorithm,
compile both of these, execute the MCMC, and return samples, summary statistics, and a WAIC
value. This approach circumvents the longer (and more flexible) approach using nimbleModel,
configureMCMC, buildMCMC, compileNimble, and runMCMC, which is described subsequently.
The nimbleMCMC function provides control over the:
• number of MCMC iterations in each chain;
• number of MCMC chains to execute;
• number of burn-in samples to discard from each chain;
• thinning interval on which samples should be recorded;
• model variables to monitor and return posterior samples;
• initial values, or a function for generating initial values for each chain;
• setting the random number seed;
• returning posterior samples as a matrix or a coda mcmc object;
• returning posterior summary statistics; and
• returning a WAIC value calculated using samples from all chains.

The entry point for using nimbleMCMC is the code, constants, data, and inits arguments that
are used for building a NIMBLE model (see Chapters 5 and 6). However, when using nimbleMCMC,
the inits argument can also specify a list of lists of initial values that will be used for each MCMC
chain, or a function that generates a list of initial values, which will be generated at the onset
of each chain. As an alternative entry point, a NIMBLE model object can also be supplied to
nimbleMCMC, in which case this model will be used to build the MCMC algorithm.
Based on its arguments, nimbleMCMC optionally returns any combination of
• Posterior samples,
• Posterior summary statistics, and
• WAIC value.
The above are calculated and returned for each MCMC chain. Additionally, posterior summary
statistics are calculated for all chains combined when multiple chains are run.
Several example usages of nimbleMCMC are shown below:


code <- nimbleCode({
    mu ~ dnorm(0, sd = 1000)
    sigma ~ dunif(0, 1000)
    for(i in 1:10)
        x[i] ~ dnorm(mu, sd = sigma)
})
data <- list(x = c(2, 5, 3, 4, 1, 0, 1, 3, 5, 3))
initsFunction <- function() list(mu = rnorm(1, 0, 1), sigma = runif(1, 0, 10))

# execute one MCMC chain, monitoring the "mu" and "sigma" variables,
# with thinning interval 10. fix the random number seed for reproducible
# results. by default, only returns posterior samples.
mcmc.out <- nimbleMCMC(code = code, data = data, inits = initsFunction,
                       monitors = c("mu", "sigma"), thin = 10,
                       niter = 20000, nchains = 1, setSeed = TRUE)

# note that the inits argument to nimbleModel must be a list of
# initial values, whereas nimbleMCMC can accept inits as a function
# for generating new initial values for each chain.
initsList <- initsFunction()
Rmodel <- nimbleModel(code, data = data, inits = initsList)

# using the existing Rmodel object, execute three MCMC chains with
# specified burn-in. return samples, summary statistics, and WAIC.
mcmc.out <- nimbleMCMC(model = Rmodel,
                       niter = 20000, nchains = 3, nburnin = 2000,
                       summary = TRUE, WAIC = TRUE)

# run ten chains, generating random initial values for each
# chain using the inits function specified above.
# only return summary statistics from each chain; not all the samples.
mcmc.out <- nimbleMCMC(model = Rmodel, nchains = 10, inits = initsFunction,
                       samples = FALSE, summary = TRUE)
See help(nimbleMCMC) for further details.

7.2 The MCMC configuration

The MCMC configuration contains information needed for building an MCMC. When no customization is needed, one can jump directly to the buildMCMC step below. An MCMC configuration is an
object of class MCMCconf, which includes:
• The model on which the MCMC will operate
• The model nodes which will be sampled (updated) by the MCMC
• The samplers and their internal configurations, called control parameters

• Two sets of variables that will be monitored (recorded) during execution of the MCMC and
thinning intervals for how often each set will be recorded. Two sets are allowed because it
can be useful to monitor different variables at different intervals

7.2.1 Default MCMC configuration

Assuming we have a model named Rmodel, the following will generate a default MCMC configuration:
mcmcConf <- configureMCMC(Rmodel)

The default configuration will contain a single sampler for each node in the model, and the default
ordering follows the topological ordering of the model.

7.2.1.1 Default assignment of sampler algorithms

The default sampler assigned to a stochastic node is determined by the following, in order of
precedence:

1. If the node has no stochastic dependents, a posterior_predictive sampler is assigned. This
sampler sets the new value for the node simply by simulating from its distribution.
2. If the node has a conjugate relationship between its prior distribution and the distributions
of its stochastic dependents, a conjugate (‘Gibbs’) sampler is assigned.
3. If the node follows a multinomial distribution, then a RW_multinomial sampler is assigned.
This is a discrete random-walk sampler in the space of multinomial outcomes.
4. If a node follows a Dirichlet distribution, then a RW_dirichlet sampler is assigned. This is
a random walk sampler in the space of the simplex defined by the Dirichlet.
5. If the node follows any other multivariate distribution, then a RW_block sampler is assigned for
all elements. This is a Metropolis-Hastings adaptive random-walk sampler with a multivariate
normal proposal (Roberts and Sahu, 1997).
6. If the node is binary-valued (strictly taking values 0 or 1), then a binary sampler is assigned.
This sampler calculates the conditional probability for both possible node values and draws
the new node value from the conditional distribution, in effect making a Gibbs sampler.
7. If the node is otherwise discrete-valued, then a slice sampler is assigned (Neal, 2003).
8. If none of the above criteria are satisfied, then a RW sampler is assigned. This is a Metropolis-Hastings adaptive random-walk sampler with a univariate normal proposal distribution.

These sampler assignment rules can be inspected, reordered, and easily modified using the
system option nimbleOptions("MCMCdefaultSamplerAssignmentRules") and customized
samplerAssignmentRules objects.
Details of each sampler and its control parameters can be found by invoking help(samplers).

7.2.1.2 Sampler assignment rules

The behavior of configureMCMC can be customized to control how samplers are assigned. A new
set of sampler assignment rules can be created using samplerAssignmentRules, which can be modified using the addRule and reorder methods, then passed as an argument to configureMCMC.
Alternatively, the default behavior of configureMCMC can be altered by setting the system option MCMCdefaultSamplerAssignmentRules to a custom samplerAssignmentRules object. See
help(samplerAssignmentRules) for details.

7.2.1.3 Options to control default sampler assignments

As a lightweight alternative to using samplerAssignmentRules, very basic control of default sampler assignments is provided via two arguments to configureMCMC. The useConjugacy argument
controls whether conjugate samplers are assigned when possible, and the multivariateNodesAsScalars
argument controls whether scalar elements of multivariate nodes are sampled individually. See
help(configureMCMC) for usage details.
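For example, assuming a model object Rmodel, the two arguments just described might be used together as follows:

```r
# Disable conjugate (Gibbs) samplers, and sample the scalar elements of
# multivariate nodes individually rather than jointly.
mcmcConf <- configureMCMC(Rmodel, useConjugacy = FALSE,
                          multivariateNodesAsScalars = TRUE)
```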

7.2.1.4 Default monitors

The default MCMC configuration includes monitors on all top-level stochastic nodes of the model.
Only variables that are monitored will have their samples saved for use outside of the MCMC.
MCMC configurations include two sets of monitors, each with different thinning intervals. By
default, the second set of monitors (monitors2) is empty.

7.2.1.5 Automated parameter blocking

The default configuration may be replaced by one generated from an automated parameter blocking
algorithm. This algorithm determines groupings of model nodes that, when jointly sampled with a
RW_block sampler, increase overall MCMC efficiency. Overall efficiency is defined as the effective
sample size of the slowest-mixing node divided by computation time. This is done by:
autoBlockConf <- configureMCMC(Rmodel, autoBlock = TRUE)
Note that using autoBlock = TRUE compiles and runs MCMCs, progressively exploring different
sampler assignments, so it takes some time and generates some output. It is most useful for
determining effective blocking strategies that can be re-used for later runs. The additional control
argument autoIt may also be provided to indicate the number of MCMC samples to be used in
each trial of the automated blocking procedure (default 20,000).

7.2.2 Customizing the MCMC configuration

The MCMC configuration may be customized in a variety of ways, either through additional named
arguments to configureMCMC or by calling methods of an existing MCMCconf object.

7.2.2.1 Controlling which nodes to sample

One can create an MCMC configuration with default samplers on just a particular set of nodes
using the nodes argument to configureMCMC. The value for the nodes argument may be a character
vector containing node and/or variable names. In the case of a variable name, a default sampler
will be added for all stochastic nodes in the variable. The order of samplers will match the order
of nodes. Any deterministic nodes will be ignored.
If a data node is included in nodes, it will be assigned a sampler. This is the only way in which
a default sampler may be placed on a data node and will result in overwriting data values in the
node.
7.2.2.2 Creating an empty configuration

If you plan to customize the choice of all samplers, it can be useful to obtain a configuration with
no sampler assignments at all. This can be done by any of nodes = NULL, nodes = character(),
or nodes = list().
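An empty configuration is typically followed by fully manual sampler assignment, along these lines (the model and node names here are assumed purely for illustration):

```r
# Start with no samplers at all, then assign each one explicitly.
mcmcConf <- configureMCMC(Rmodel, nodes = NULL)
mcmcConf$addSampler(target = "mu", type = "slice")
mcmcConf$addSampler(target = "sigma", type = "RW")
mcmcConf$printSamplers()
```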
7.2.2.3 Overriding the default sampler assignment rules

The default rules used for assigning samplers to model nodes can be overridden using the rules
argument to configureMCMC. This argument must be an object of class samplerAssignmentRules,
which defines an ordered set of rules for assigning samplers. Rules can be modified and reordered, to
give different precedence to particular samplers, or to assign user-defined samplers (see section 15.5).
The following example creates a new set of rules (which initially contains the default assignment
rules), reorders the rules, adds a new rule, then uses these rules to create an MCMC configuration
object.
my_rules <- samplerAssignmentRules()
my_rules$reorder(c(8, 1:7))
my_rules$addRule(condition = quote(model$getDistribution(node) == "dmnorm"),
                 sampler = new_dmnorm_sampler)
mcmcConf <- configureMCMC(Rmodel, rules = my_rules)
In addition, the default behavior of configureMCMC can be altered by setting the system
option nimbleOptions(MCMCdefaultSamplerAssignmentRules = my_rules), or reset to
the original default behavior using nimbleOptions(MCMCdefaultSamplerAssignmentRules =
samplerAssignmentRules()).
7.2.2.4 Overriding the default sampler control list values

The default values of control list elements for all sampling algorithms may be overridden through
use of the control argument to configureMCMC, which should be a named list. Named elements in
the control argument will be used for all default samplers and any subsequent sampler added via
addSampler (see below). For example, the following will create the default MCMC configuration,
except all RW samplers will have their initial scale set to 3, and none of the samplers (RW, or
otherwise) will be adaptive.


mcmcConf <- configureMCMC(Rmodel, control = list(scale = 3, adaptive = FALSE))
When adding samplers to a configuration using addSampler, the default control list can also be
overridden.
7.2.2.5 Adding samplers to the configuration: addSampler

Samplers may be added to a configuration using the addSampler method of the MCMCconf object.
The first argument gives the node(s) to be sampled, called the target, as a character vector. The
second argument gives the type of sampler, which may be provided as a character string or a
nimbleFunction object. Valid character strings are indicated in help(samplers) (do not include
"sampler_"). Added samplers can be labeled with a name argument, which is used in output of
printSamplers.
Writing a new sampler as a nimbleFunction is covered in Section 15.5.
The hierarchy of precedence for control list elements for samplers is:
1. The control list argument to addSampler;
2. The control list argument to configureMCMC;
3. The default values, as defined in the sampling algorithm setup function.
Samplers added by addSampler will be appended to the end of current sampler list. Adding a
sampler for a node will not automatically remove any existing samplers on that node.
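For example, assuming an existing configuration mcmcConf for a model containing a scalar node x and a multivariate node beta[1:3] (the names are illustrative):

```r
# Replace the default sampler on "x" with a slice sampler, and add a
# labeled block sampler on "beta[1:3]".
mcmcConf$removeSamplers("x")             # addSampler does not remove old ones
mcmcConf$addSampler(target = "x", type = "slice",
                    control = list(adaptInterval = 100))
mcmcConf$addSampler(target = "beta[1:3]", type = "RW_block",
                    name = "beta_block") # label shown by printSamplers
```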
7.2.2.6 Printing, re-ordering, modifying and removing samplers: printSamplers, removeSamplers, setSamplers, and getSamplerDefinition

The current, ordered, list of all samplers in the MCMC configuration may be printed by calling
the printSamplers method. When you want to see only samplers acting on specific model nodes
or variables, provide those names as an argument to printSamplers. The printSamplers method
accepts arguments controlling the level of detail displayed as discussed in its R help information.
# Print all samplers
mcmcConf$printSamplers()
# Print all samplers operating on node "a[1]",
# or any of the "beta[]" variables
mcmcConf$printSamplers(c("a[1]", "beta"))
# Print all conjugate and slice samplers
mcmcConf$printSamplers(type = c("conjugate", "slice"))
# Print all RW samplers operating on "x"
mcmcConf$printSamplers("x", type = "RW")


# Print the first 100 samplers
mcmcConf$printSamplers(1:100)
# Print all samplers in their order of execution
mcmcConf$printSamplers(executionOrder = TRUE)
Samplers may be removed from the configuration object using removeSamplers, which accepts a
character vector of node or variable names, or a numeric vector of indices.
# Remove all samplers acting on "x" or any component of it
mcmcConf$removeSamplers("x")
# Remove all samplers acting on "alpha[1]" and "beta[1]"
mcmcConf$removeSamplers(c("alpha[1]", "beta[1]"))
# Remove the first five samplers
mcmcConf$removeSamplers(1:5)
# Providing no argument removes all samplers
mcmcConf$removeSamplers()
Samplers to retain, and their ordering, may be specified using setSamplers, which also accepts a
character vector of node or variable names, or a numeric vector of indices.
# Set the list of samplers to those acting on any components of the
# model variables "x", "y", or "z".
mcmcConf$setSamplers(c("x", "y", "z"))
# Set the list of samplers to only those acting on model nodes
# "alpha[1]", "alpha[2]", ..., "alpha[10]"
mcmcConf$setSamplers("alpha[1:10]")
# Truncate the current list of samplers to the first 10 and the 100th
mcmcConf$setSamplers(ind = c(1:10, 100))
The nimbleFunction definition underlying a particular sampler may be viewed using the
getSamplerDefinition method, using the sampler index as an argument. A node name argument
may also be supplied, in which case the definition of the first sampler acting on that node is
returned. In all cases, getSamplerDefinition only returns the definition of the first sampler
specified either by index or node name.
# Return the definition of the third sampler in the mcmcConf object
mcmcConf$getSamplerDefinition(3)
# Return the definition of the first sampler acting on node "x",
# or the first of any indexed nodes comprising the variable "x"
mcmcConf$getSamplerDefinition("x")

7.2.2.7 Customizing individual sampler configurations: getSamplers, setSamplers, setName, setSamplerFunction, setTarget, and setControl

Each sampler in an MCMCconf object is represented by a sampler configuration as a samplerConf
object. Each samplerConf is a reference class object containing the following (required) fields: name
(a character string), samplerFunction (a valid nimbleFunction sampler), target (the model node
to be sampled), and control (list of control arguments). The MCMCconf method getSamplers
allows access to the samplerConf objects. These can be modified and then passed as an argument to setSamplers to overwrite the current list of samplers in the MCMC configuration object.
However, no checking of the validity of this modified list is performed; if the list of samplerConf
objects is corrupted to be invalid, incorrect behavior will result at the time of calling buildMCMC.
The fields of a samplerConf object can be modified using the access functions setName(name),
setSamplerFunction(fun), setTarget(target, model), and setControl(control).
Here are some examples:
# retrieve samplerConf list
samplerConfList <- mcmcConf$getSamplers()
# change the name of the first sampler
samplerConfList[[1]]$setName("newNameForThisSampler")
# change the sampler function of the second sampler,
# assuming existence of a nimbleFunction 'anotherSamplerNF',
# which represents a valid nimbleFunction sampler.
samplerConfList[[2]]$setSamplerFunction(anotherSamplerNF)
# change the 'adaptive' element of the control list of the third sampler
control <- samplerConfList[[3]]$control
control$adaptive <- FALSE
samplerConfList[[3]]$setControl(control)
# change the target node of the fourth sampler
samplerConfList[[4]]$setTarget("y", model)  # model argument required
# use this modified list of samplerConf objects in the MCMC configuration
mcmcConf$setSamplers(samplerConfList)

7.2.2.8 Customizing the sampler execution order

The ordering of sampler execution can be controlled as well. This allows for sampler functions
to execute multiple times within a single MCMC iteration, or the execution of different sampler
functions to be interleaved with one another.
The sampler execution order is set using the function setSamplerExecutionOrder, and the current
ordering of execution is retrieved using getSamplerExecutionOrder. For example, assuming the
MCMC configuration object mcmcConf contains five samplers:


# first sampler to execute twice, in succession:
mcmcConf$setSamplerExecutionOrder(c(1, 1, 2, 3, 4, 5))
# first sampler to execute multiple times, interleaved:
mcmcConf$setSamplerExecutionOrder(c(1, 2, 1, 3, 1, 4, 1, 5))
# only the fourth sampler executes, 10 times
mcmcConf$setSamplerExecutionOrder(rep(4, 10))
# omitting the argument to setSamplerExecutionOrder()
# resets the ordering to each sampler executing once, sequentially
mcmcConf$setSamplerExecutionOrder()
# retrieve the current ordering of sampler execution
ordering <- mcmcConf$getSamplerExecutionOrder()
# print the sampler functions in the order of execution
mcmcConf$printSamplers(executionOrder = TRUE)

7.2.2.9 Monitors and thinning intervals: printMonitors, getMonitors, addMonitors, setThin, and resetMonitors

An MCMC configuration object contains two independent sets of variables to monitor, each with its own thinning interval: thin corresponds to monitors, and thin2 corresponds to monitors2. Monitors operate at the variable level: only entire model variables may be monitored. Specifying a monitor on a node, e.g., x[1], will result in the entire variable x being monitored.
The variables specified in monitors and monitors2 will be recorded (with thinning intervals thin and thin2, respectively) into objects called mvSamples and mvSamples2, contained within the MCMC object. These are both modelValues objects; modelValues are NIMBLE data structures used to store multiple sets of values of model variables¹. They can be accessed as the member data mvSamples and mvSamples2 of the MCMC object, and they can be converted to matrices using as.matrix (see Section 7.6).
Monitors may be added to the MCMC configuration either in the original call to configureMCMC
or using the addMonitors method:
# Using an argument to configureMCMC
mcmcConf <- configureMCMC(Rmodel, monitors = c("alpha", "beta"),
monitors2 = "x")
# Calling a member method of the mcmcConf object
# This results in the same monitors as above
mcmcConf$addMonitors("alpha", "beta")
mcmcConf$addMonitors2("x")
Similarly, either thinning interval may be set at either step:
¹ See Section 14.1 for general information on modelValues.


# Using an argument to configureMCMC
mcmcConf <- configureMCMC(Rmodel, thin = 1, thin2 = 100)
# Calling a member method of the mcmcConf object
# This results in the same thinning intervals as above
mcmcConf$setThin(1)
mcmcConf$setThin2(100)
The current lists of monitors and thinning intervals may be displayed using the printMonitors
method. Both sets of monitors (monitors and monitors2) may be reset to empty character vectors
by calling the resetMonitors method. The methods getMonitors and getMonitors2 return the
currently specified monitors and monitors2 as character vectors.
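As a short illustration of these methods (assuming the mcmcConf object from the examples above):

```r
# display the current monitors, monitors2, and thinning intervals
mcmcConf$printMonitors()
# retrieve the monitored variable names as character vectors
mons  <- mcmcConf$getMonitors()
mons2 <- mcmcConf$getMonitors2()
# reset both sets of monitors to empty character vectors
mcmcConf$resetMonitors()
```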
7.2.2.10 Monitoring model log-probabilities
To record model log-probabilities from an MCMC, one can add monitors for logProb variables (which begin with the prefix logProb_) corresponding to any variables that contain stochastic nodes.
For example, to record and extract log-probabilities for the variables alpha, sigma_mu, and Y:
mcmcConf <- configureMCMC(Rmodel)
mcmcConf$addMonitors("logProb_alpha", "logProb_sigma_mu", "logProb_Y")
Rmcmc <- buildMCMC(mcmcConf)
Cmodel <- compileNimble(Rmodel)
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)
Cmcmc$run(10000)
samples <- as.matrix(Cmcmc$mvSamples)
The samples matrix will contain both MCMC samples and model log-probabilities.

7.3 Building and compiling the MCMC

Once the MCMC configuration object has been created, and customized to one’s liking, it may be
used to build an MCMC function:
Rmcmc <- buildMCMC(mcmcConf, enableWAIC = TRUE)
buildMCMC is a nimbleFunction. The returned object Rmcmc is an instance of the nimbleFunction
specific to configuration mcmcConf (and of course its associated model).
Note that if you would like to be able to calculate the WAIC of the model after the MCMC
function has been run, you must set enableWAIC = TRUE as an argument to either configureMCMC
or buildMCMC, or set nimbleOptions(enableWAIC = TRUE), which will enable WAIC calculations
for all subsequently built MCMC functions. For more information on WAIC calculations, see
Section 7.7 or help(buildMCMC) in R.
When no customization is needed, one can skip configureMCMC and simply provide a model object
to buildMCMC. The following two MCMC functions will be identical:

mcmcConf <- configureMCMC(Rmodel)   # default MCMC configuration
Rmcmc1 <- buildMCMC(mcmcConf)
Rmcmc2 <- buildMCMC(Rmodel)         # uses the default configuration for Rmodel

For speed of execution, we usually want to compile the MCMC function to C++ (as is the case
for other NIMBLE functions). To do so, we use compileNimble. If the model has already been
compiled, it should be provided as the project argument so the MCMC will be part of the same
compiled project. A typical compilation call looks like:
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)
Alternatively, if the model has not already been compiled, they can be compiled together in one
line:
Cmcmc <- compileNimble(Rmodel, Rmcmc)
Note that if you compile the MCMC with another object (the model in this case), you’ll need to
explicitly refer to the MCMC component of the resulting object to be able to run the MCMC:
Cmcmc$Rmcmc$run(niter = 1000)

7.4 User-friendly execution of MCMC algorithms: runMCMC

Once an MCMC algorithm has been created using buildMCMC, the function runMCMC can be used to run multiple chains and extract posterior samples, summary statistics, and/or a WAIC value. This is a simpler approach to executing an MCMC algorithm than the process of executing and extracting samples described in Sections 7.5 and 7.6.
runMCMC also provides several user-friendly options, such as burn-in, thinning, running multiple chains, and different initial values for each chain. However, runMCMC does not support several lower-level options, such as timing the individual samplers internal to the MCMC, continuing an existing MCMC run (picking up where it left off), or modifying the sampler execution ordering.
runMCMC takes arguments that control the following aspects of the MCMC:
• Number of iterations in each chain;
• Number of chains;
• Number of burn-in samples to discard from each chain;
• Thinning interval for recording samples;
• Initial values, or a function for generating initial values for each chain;
• Setting the random number seed;
• Returning the posterior samples as a coda mcmc object;
• Returning summary statistics calculated from each chain; and
• Returning a WAIC value calculated using samples from all chains.


The following examples demonstrate some uses of runMCMC, and assume the existence of Cmcmc, a
compiled MCMC algorithm.
# run a single chain, and return a matrix of samples
mcmc.out <- runMCMC(Cmcmc)
# run three chains of 10000 samples, discard initial burnin of 1000,
# record samples thereafter using a thinning interval of 10,
# and return a list of sample matrices
mcmc.out <- runMCMC(Cmcmc, niter=10000, nburnin=1000, thin=10, nchains=3)
# run three chains, returning posterior samples, summary statistics,
# and the WAIC value for each chain
mcmc.out <- runMCMC(Cmcmc, nchains = 3, summary = TRUE, WAIC = TRUE)
# run two chains, and specify the initial values for each
initsList <- list(list(mu = 1, sigma = 1),
list(mu = 2, sigma = 10))
mcmc.out <- runMCMC(Cmcmc, nchains = 2, inits = initsList)
# run ten chains of 100,000 iterations each, using a function to
# generate initial values and a fixed random number seed for each chain.
# only return summary statistics from each chain; not all the samples.
initsFunction <- function()
list(mu = rnorm(1,0,1), sigma = runif(1,0,100))
mcmc.out <- runMCMC(Cmcmc, niter = 100000, nchains = 10,
inits = initsFunction, setSeed = TRUE,
samples = FALSE, summary = TRUE)
See help(runMCMC) for further details.

7.5 Running the MCMC

The MCMC algorithm (either the compiled or uncompiled version) can be executed using the member method mcmc$run (see help(buildMCMC) in R). The run method has one required argument,
niter, the number of iterations to be run.
The run method has optional arguments nburnin, thin, and thin2. These can be used to specify the number of pre-thinning burn-in samples to discard and the post-burn-in thinning intervals for recording samples (corresponding to monitors and monitors2). If either thin or thin2 is provided, it will override the corresponding thinning interval that was specified in the original MCMC configuration object.
The run method has an optional reset argument. When reset = TRUE (the default value), the
following occurs prior to running the MCMC:
1. All model nodes are checked and filled or updated as needed, in valid (topological) order. If a stochastic node is missing a value, it is populated using a call to simulate, and its log probability value is calculated. The values of deterministic nodes are calculated from their parent nodes. If any right-hand-side-only nodes (e.g., explanatory variables) are missing a value, an error results.
2. All MCMC sampler functions are reset to their initial state: the initial values of any sampler control parameters (e.g., scale, sliceWidth, or propCov) are reset to their initial values, as specified by the original MCMC configuration.
3. The internal modelValues objects mvSamples and mvSamples2 are each resized to the appropriate length for holding the requested number of samples (niter/thin and niter/thin2, respectively).

When mcmc$run(niter, reset = FALSE) is called, the MCMC picks up from where it left off,
continuing the previous chain and expanding the output as needed. No values in the model are
checked or altered, and sampler functions are not reset to their initial states.
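For example, assuming a compiled MCMC Cmcmc, a chain can be extended without resetting the model or the samplers:

```r
Cmcmc$run(10000)                 # run 10000 iterations from scratch
Cmcmc$run(5000, reset = FALSE)   # continue the same chain for 5000 more
# mvSamples now holds the (thinned) samples from all 15000 iterations
samples <- as.matrix(Cmcmc$mvSamples)
```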

7.5.1 Measuring sampler computation times: getTimes

If you want to obtain the computation time spent in each sampler, you can set time = TRUE as a run-time argument and then use the method getTimes() to obtain the times. For example,
Cmcmc$run(niter, time = TRUE)
Cmcmc$getTimes()
will return a vector of the total time spent in each sampler, measured in seconds.

7.6 Extracting MCMC samples

After executing the MCMC, the output samples can be extracted as follows:
mvSamples <- mcmc$mvSamples
mvSamples2 <- mcmc$mvSamples2
These modelValues objects can be converted into matrices using as.matrix:
samplesMatrix <- as.matrix(mvSamples)
samplesMatrix2 <- as.matrix(mvSamples2)
The column names of the matrices will be the node names of nodes in the monitored variables.
Then, for example, the mean of the samples for node x[2] could be calculated as:
mean(samplesMatrix[, "x[2]"])
Obtaining samples as matrices is most common, but see Section 14.1 for more about programming
with modelValues objects, especially if you want to write nimbleFunctions to use the samples.

7.7 Calculating WAIC

Once an MCMC algorithm has been run, as described in Section 7.5, the WAIC (Watanabe, 2010)
can be calculated from the posterior samples produced by the MCMC algorithm. Note that in
order to calculate the WAIC value after running an MCMC algorithm, the argument enableWAIC
= TRUE must have been supplied to configureMCMC or to buildMCMC, or the enableWAIC NIMBLE
option must have been set to TRUE.
The WAIC is calculated by calling the member method mcmc$calculateWAIC (see help(buildMCMC)
in R for more details). The calculateWAIC method has one required argument, nburnin, the
number of initial samples to discard prior to WAIC calculation. nburnin defaults to 0.
Cmcmc$calculateWAIC(nburnin = 100)
Note that there is not a unique value of WAIC for a model. WAIC is calculated from Equations 5, 12, and 13 in Gelman et al. (2014) (i.e., using pWAIC2). In NIMBLE, the set of all stochastic nodes monitored by the MCMC object is treated as θ for the purposes of Equation 5 from Gelman et al. (2014).
In many cases one would want θ to be the set of all stochastic nodes in the model, in which case the user must set the MCMC monitors to include all stochastic nodes; by default the MCMC monitors are only the top-level nodes of the model.

7.8 k-fold cross-validation

The runCrossValidate function in NIMBLE performs k-fold cross-validation on a nimbleModel
fit via MCMC. More information can be found by calling help(runCrossValidate).
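As a rough sketch of a call — the argument names shown here (k, foldFunction, lossFunction) are assumptions that should be verified against help(runCrossValidate):

```r
# assumes an MCMC configuration 'mcmcConf' built for a nimbleModel;
# argument names are illustrative -- see help(runCrossValidate)
cvResults <- runCrossValidate(MCMCconfiguration = mcmcConf,
                              k = 5,                    # number of folds
                              foldFunction = "random",  # how data are split
                              lossFunction = "MSE")     # predictive loss
```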

7.9 Samplers provided with NIMBLE

Most documentation of MCMC samplers provided with NIMBLE can be found by invoking
help(samplers) in R. Here we provide additional explanation of conjugate samplers and how
complete customization can be achieved by making a sampler use an arbitrary log-likelihood
function, such as to build a particle MCMC algorithm.

7.9.1 Conjugate (‘Gibbs’) samplers

By default, configureMCMC() and buildMCMC() will assign conjugate samplers to all nodes satisfying a conjugate relationship, unless the option useConjugacy = FALSE is specified.
The current release of NIMBLE supports conjugate sampling of the relationships listed in Table 7.1².

² NIMBLE’s internal definitions of these relationships can be viewed with nimble:::conjugacyRelationshipsInputList.
Table 7.1: Conjugate relationships supported by NIMBLE’s MCMC engine.

Prior Distribution     Sampling (Dependent Node) Distribution    Parameter
Beta                   Bernoulli                                 prob
                       Binomial                                  prob
                       Negative Binomial                         prob
Dirichlet              Multinomial                               prob
                       Categorical                               prob
Flat                   Normal                                    mean
                       Lognormal                                 meanlog
Gamma                  Poisson                                   lambda
                       Normal                                    tau
                       Lognormal                                 taulog
                       Gamma                                     rate
                       Inverse Gamma                             scale
                       Exponential                               rate
                       Weibull                                   lambda
Halfflat               Normal                                    sd
                       Lognormal                                 sdlog
Inverse Gamma          Normal                                    var
                       Lognormal                                 varlog
                       Gamma                                     scale
                       Inverse Gamma                             rate
                       Exponential                               scale
Normal                 Normal                                    mean
                       Lognormal                                 meanlog
Multivariate Normal    Multivariate Normal                       mean
Wishart                Multivariate Normal                       prec
Inverse Wishart        Multivariate Normal                       cov

Conjugate sampler functions may (optionally) dynamically check that their own posterior likelihood calculations are correct. If incorrect, a warning is issued. However, this functionality will roughly double the run-time required for conjugate sampling. By default, this option is disabled in NIMBLE; it may be enabled with nimbleOptions(verifyConjugatePosteriors = TRUE).
If one wants information about conjugate node relationships for other purposes, it can be obtained using the checkConjugacy method on a model. This returns a named list describing all conjugate relationships. The checkConjugacy method can also accept a character vector argument specifying a subset of node names to check for conjugacy.
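As a sketch, assuming a model object Rmodel (the node names "alpha" and "beta" here are hypothetical):

```r
# named list describing all conjugate relationships in the model
conjInfo <- Rmodel$checkConjugacy()
names(conjInfo)   # names of the conjugate nodes
# restrict the conjugacy check to a subset of nodes
Rmodel$checkConjugacy(c("alpha", "beta"))
```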

7.9.2 Customized log-likelihood evaluations: RW_llFunction sampler

Sometimes it is useful to control the log-likelihood calculations used for an MCMC updater instead of simply using the model. For example, one could use a sampler with a log-likelihood that analytically (or numerically) integrates over latent model nodes. Or one could use a sampler with a log-likelihood that comes from a stochastic approximation such as a particle filter (see below), allowing composition of a particle MCMC (PMCMC) algorithm (Andrieu et al., 2010). The RW_llFunction sampler handles this by using a Metropolis-Hastings algorithm with a normal proposal distribution and a user-provided log-likelihood function. To allow compiled execution, the log-likelihood function must be provided as a specialized instance of a nimbleFunction. The log-likelihood function may use the same model as the MCMC as a setup argument (as does the example below), but if so the state of the model should be unchanged during execution of the function (or you must understand the implications otherwise).
The RW_llFunction sampler can be customized using the control list argument to set the initial
proposal distribution scale and the adaptive properties for the Metropolis-Hastings sampling. In
addition, the control list argument must contain a named llFunction element. This is the
specialized nimbleFunction that calculates the log-likelihood; it must accept no arguments and
return a scalar double number. The return value must be the total log-likelihood of all stochastic
dependents of the target nodes – and, if includesTarget = TRUE, of the target node(s) themselves
– or whatever surrogate is being used for the total log-likelihood. This is a required control list
element with no default. See help(samplers) for details.
Here is a complete example:
code <- nimbleCode({
    p ~ dunif(0, 1)
    y ~ dbin(p, n)
})
Rmodel <- nimbleModel(code, data = list(y = 3), inits = list(p = 0.5, n = 10))

llFun <- nimbleFunction(
    setup = function(model) { },
    run = function() {
        y <- model$y
        p <- model$p
        n <- model$n
        ll <- lfactorial(n) - lfactorial(y) - lfactorial(n-y) +
              y * log(p) + (n-y) * log(1-p)
        returnType(double())
        return(ll)
    }
)
RllFun <- llFun(Rmodel)

mcmcConf <- configureMCMC(Rmodel, nodes = NULL)
mcmcConf$addSampler(target = "p", type = "RW_llFunction",
                    control = list(llFunction = RllFun, includesTarget = FALSE))
Rmcmc <- buildMCMC(mcmcConf)

7.9.3 Particle MCMC sampler

For state space models, a particle MCMC (PMCMC) sampler can be specified for top-level parameters. This sampler is described in Section 8.1.2.

7.10 Detailed MCMC example: litters

Here is a detailed example of specifying, building, compiling, and running two MCMC algorithms. We use the litters example from the BUGS examples.
###############################
##### model configuration #####
###############################

# define our model using BUGS syntax
litters_code <- nimbleCode({
    for (i in 1:G) {
        a[i] ~ dgamma(1, .001)
        b[i] ~ dgamma(1, .001)
        for (j in 1:N) {
            r[i,j] ~ dbin(p[i,j], n[i,j])
            p[i,j] ~ dbeta(a[i], b[i])
        }
        mu[i] <- a[i] / (a[i] + b[i])
        theta[i] <- 1 / (a[i] + b[i])
    }
})

# list of fixed constants
constants <- list(G = 2,
                  N = 16,
                  n = matrix(c(13, 12, 12, 11, 9, 10, 9, 9, 8, 11, 8, 10, 13,
                               10, 12, 9, 10, 9, 10, 5, 9, 9, 13, 7, 5, 10, 7, 6,
                               10, 10, 10, 7), nrow = 2))

# list specifying model data
data <- list(r = matrix(c(13, 12, 12, 11, 9, 10, 9, 9, 8, 10, 8, 9, 12, 9,
                          11, 8, 9, 8, 9, 4, 8, 7, 11, 4, 4, 5, 5, 3, 7, 3,
                          7, 0), nrow = 2))

# list specifying initial values
inits <- list(a = c(1, 1),
              b = c(1, 1),
              p = matrix(0.5, nrow = 2, ncol = 16),
              mu = c(.5, .5),
              theta = c(.5, .5))

# build the R model object
Rmodel <- nimbleModel(litters_code,
                      constants = constants,
                      data = data,
                      inits = inits)

###########################################
##### MCMC configuration and building #####
###########################################
# generate the default MCMC configuration;
# only wish to monitor the derived quantity "mu"
mcmcConf <- configureMCMC(Rmodel, monitors = "mu")
# check the samplers assigned by default MCMC configuration
mcmcConf$printSamplers()
# double-check our monitors, and thinning interval
mcmcConf$printMonitors()
# build the executable R MCMC function
mcmc <- buildMCMC(mcmcConf)
# let's try another MCMC, as well,
# this time using the crossLevel sampler for top-level nodes
# generate an empty MCMC configuration
# we need a new copy of the model to avoid compilation errors
Rmodel2 <- Rmodel$newModel()
mcmcConf_CL <- configureMCMC(Rmodel2, nodes = NULL, monitors = "mu")
# add two crossLevel samplers
mcmcConf_CL$addSampler(target = c("a[1]", "b[1]"), type = "crossLevel")
mcmcConf_CL$addSampler(target = c("a[2]", "b[2]"), type = "crossLevel")
# let's check the samplers
mcmcConf_CL$printSamplers()
# build this second executable R MCMC function
mcmc_CL <- buildMCMC(mcmcConf_CL)

###################################
##### compile to C++, and run #####
###################################


# compile the two copies of the model
Cmodel <- compileNimble(Rmodel)
Cmodel2 <- compileNimble(Rmodel2)
# compile both MCMC algorithms, in the same
# project as the R model object
# NOTE: at this time, we recommend compiling ALL
# executable MCMC functions together
Cmcmc <- compileNimble(mcmc, project = Rmodel)
Cmcmc_CL <- compileNimble(mcmc_CL, project = Rmodel2)
# run the default MCMC function,
# and examine the mean of mu[1]
Cmcmc$run(1000)
cSamplesMatrix <- as.matrix(Cmcmc$mvSamples)
mean(cSamplesMatrix[, "mu[1]"])
# run the crossLevel MCMC function,
# and examine the mean of mu[1]
Cmcmc_CL$run(1000)
cSamplesMatrix_CL <- as.matrix(Cmcmc_CL$mvSamples)
mean(cSamplesMatrix_CL[, "mu[1]"])

###################################
#### run multiple MCMC chains #####
###################################
# run 3 chains of the crossLevel MCMC
samplesList <- runMCMC(Cmcmc_CL, niter=1000, nchains=3)
lapply(samplesList, dim)

7.11 Comparing different MCMCs with MCMCsuite and compareMCMCs

NIMBLE’s MCMCsuite function automatically runs WinBUGS, OpenBUGS, JAGS, Stan, and/or
multiple NIMBLE configurations on the same model. Note that the BUGS code must be compatible
with whichever BUGS packages are included, and separate Stan code must be provided. NIMBLE’s
compareMCMCs manages calls to MCMCsuite for multiple sets of comparisons and organizes the
output(s) for generating html pages summarizing results. It also allows multiple results to be
combined and allows some different options for how results are processed, such as how effective
sample size is estimated.
We first show how to use MCMCsuite for the same litters example used in Section 7.10. Subsequently, additional details of MCMCsuite are given. Since use of compareMCMCs is similar, we refer readers to help(compareMCMCs) and the functions listed under ‘See also’ on that R help page.

7.11.1 MCMC Suite example: litters

The following code executes these MCMC algorithms on the litters example:
• WinBUGS
• JAGS
• NIMBLE default configuration
• NIMBLE custom configuration using two crossLevel samplers

output <- MCMCsuite(
    code = litters_code,
    constants = constants,
    data = data,
    inits = inits,
    monitors = 'mu',
    MCMCs = c('winbugs', 'jags', 'nimble', 'nimble_CL'),
    MCMCdefs = list(
        nimble_CL = quote({
            mcmcConf <- configureMCMC(Rmodel, nodes = NULL)
            mcmcConf$addSampler(target = c('a[1]', 'b[1]'),
                                type = 'crossLevel')
            mcmcConf$addSampler(target = c('a[2]', 'b[2]'),
                                type = 'crossLevel')
            mcmcConf
        })),
    plotName = 'littersSuite'
)

7.11.2 MCMC Suite outputs

Executing the MCMC Suite returns a named list containing various outputs, and also generates and saves traceplots and posterior density plots. The default elements of this return list object are:
Samples
samples is a three-dimensional array, containing all MCMC samples from each algorithm. The first
dimension of the samples array corresponds to each MCMC algorithm, and may be indexed by
the name of the algorithm. The second dimension of the samples array corresponds to each node
which was monitored, and may be indexed by the node name. The third dimension of samples
contains the MCMC samples, and has length niter/thin - burnin.
Summary
The MCMC suite output contains a variety of pre-computed summary statistics, which are stored
in the summary matrix. For each monitored node and each MCMC algorithm, the following default
summary statistics are calculated: mean, median, sd, the 2.5% quantile, and the 97.5% quantile.
These summary statistics are easily viewable, as:


output$summary
# , , mu[1]
#
#                mean    median         sd  quant025  quant975
# winbugs   0.8795868 0.8889000 0.04349589 0.7886775 0.9205025
# jags      0.8872778 0.8911989 0.02911325 0.8287991 0.9335317
# nimble    0.8562232 0.8983763 0.12501395 0.4071524 0.9299781
# nimble_CL 0.8871314 0.8961146 0.05243039 0.7640730 0.9620532
#
# , , mu[2]
#
#                mean    median         sd  quant025  quant975
# winbugs   0.7626974 0.7678000 0.04569705 0.6745975 0.8296025
# jags      0.7635539 0.7646913 0.03803033 0.6824946 0.8313314
# nimble    0.7179094 0.7246935 0.06061116 0.6058669 0.7970130
# nimble_CL 0.7605938 0.7655945 0.09138471 0.5822785 0.9568195

Timing
timing contains a named vector of the runtime for each MCMC algorithm, the total compile time for the NIMBLE model and MCMC algorithms, and the compile time for Stan (if specified). All run and compile times are given in seconds.
Efficiency
Using the MCMCsuite option calculateEfficiency = TRUE will also provide several measures of
MCMC sampling efficiency. Additional summary statistics are provided for each node: the total
number of samples collected (n), the effective sample size resulting from these samples (ess), and
the effective sample size per second of algorithm runtime (efficiency).
In addition to these node-by-node measures of efficiency, an additional return list element is also
provided. This element, efficiency, is itself a named list containing two elements: min and mean,
which contain the minimal and mean efficiencies (effective sample size per second of algorithm
run-time) across all monitored nodes, separately for each algorithm.
Plots
Executing MCMCsuite provides and saves several plots. These include trace plots and posterior
density plots for each monitored node, under each algorithm.
Note that generating MCMC Suite plots in RStudio may result in several warning messages from R (regarding graphics devices), but the plots will be produced without any problems.

7.11.3 Customizing MCMC Suite

MCMCsuite is customizable in terms of all of the following:
• MCMC algorithms to execute, optionally including WinBUGS, OpenBUGS, JAGS, Stan, and
various flavors of NIMBLE’s MCMC;
• custom-configured NIMBLE MCMC algorithms;
• automated parameter blocking for efficient MCMC sampling;
• nodes to monitor;

• number of MCMC iterations;
• thinning interval;
• burn-in;
• summary statistics to report;
• calculating sampling efficiency (effective sample size per second of algorithm run-time); and
• generating and saving plots.

NIMBLE MCMC algorithms may be specified using the MCMCs argument to MCMCsuite, which is a character vector defining the MCMC algorithms to run. The MCMCs argument may include any of the following algorithms:
• “winbugs”: WinBUGS MCMC algorithm
• “openbugs”: OpenBUGS MCMC algorithm
• “jags”: JAGS MCMC algorithm
• “Stan”: Stan default MCMC algorithm
• “nimble”: NIMBLE MCMC using the default configuration
• “nimble_noConj”: NIMBLE MCMC using the default configuration with useConjugacy = FALSE
• “autoBlock”: NIMBLE MCMC algorithm with block sampling of dynamically determined parameter groups, attempting to maximize sampling efficiency
The default value for the MCMCs argument is "nimble", which specifies only the default NIMBLE
MCMC algorithm.
The names of additional, custom MCMC algorithms may also be provided in the MCMCs argument, so long as these custom algorithms are defined in the MCMCdefs argument. An example of this usage is given with the crossLevel algorithm in the litters example in Section 7.11.1.
The MCMCdefs argument should be a named list of definitions for any custom MCMC algorithms specified in the MCMCs argument. If MCMCs specifies an algorithm called "myMCMC", then MCMCdefs must contain an element named "myMCMC". The contents of this element must be a block of code
that, when executed, returns the desired MCMC configuration object. This block of code may
assume the existence of the R model object, Rmodel. Further, this block of code need not worry
about adding monitors to the MCMC configuration; it need only specify the samplers.
As a final important point, execution of this block of code must return the MCMC configuration
object. Therefore, elements supplied in the MCMCdefs argument should usually take the form:
MCMCdefs = list(
    myMCMC = quote({
        mcmcConf <- configureMCMC(Rmodel, ....)
        mcmcConf$addSampler(.....)
        mcmcConf   # returns the MCMC configuration object
    })
)
Full details of the arguments and customization of the MCMC Suite are available via help(MCMCsuite).


Chapter 8

Sequential Monte Carlo and MCEM

The NIMBLE algorithm library is growing; as of Version 0.6-7 it includes a suite of sequential Monte Carlo algorithms as well as a more robust MCEM and k-fold cross-validation.

8.1 Particle Filters / Sequential Monte Carlo

8.1.1 Filtering Algorithms

NIMBLE includes algorithms for four different types of sequential Monte Carlo (also known as
particle filters), which can be used to sample from the latent states and approximate the log
likelihood of a state space model. The particle filters currently implemented in NIMBLE are the
bootstrap filter, the auxiliary particle filter, the Liu-West filter, and the ensemble Kalman filter,
which can be built, respectively, with calls to buildBootstrapFilter, buildAuxiliaryFilter,
buildLiuWestFilter, and buildEnsembleKF. Each particle filter requires setup arguments model
and nodes; the latter should be a character vector specifying latent model nodes. In addition, each
particle filter can be customized using a control list argument. Details on the control options and
specifics of the filtering algorithms can be found in the help pages for the functions.
Once built, each filter can be run by specifying the number of particles. Each filter has a modelValues object named mvEWSamples that is populated with equally-weighted samples from the posterior
distribution of the latent states (and in the case of the Liu-West filter, the posterior distribution of
the top level parameters as well) as the filter is run. The bootstrap, auxiliary, and Liu-West filters
also have another modelValues object, mvWSamples, which has unequally-weighted samples from
the posterior distribution of the latent states, along with weights for each particle. In addition,
the bootstrap and auxiliary particle filters return estimates of the log-likelihood of the given state
space model.
We first create a linear state-space model to use as an example for our particle filter algorithms.
# Building a simple linear state-space model.
# x is latent space, y is observed data
timeModelCode <- nimbleCode({
    x[1] ~ dnorm(mu_0, 1)
    y[1] ~ dnorm(x[1], 1)
    for(i in 2:t){
        x[i] ~ dnorm(x[i-1] * a + b, 1)
        y[i] ~ dnorm(x[i] * c, 1)
    }
    a ~ dunif(0, 1)
    b ~ dnorm(0, 1)
    c ~ dnorm(1, 1)
    mu_0 ~ dnorm(0, 1)
})

# simulate some data
t <- 25; mu_0 <- 1
x <- rnorm(1, mu_0, 1)
y <- rnorm(1, x, 1)
a <- 0.5; b <- 1; c <- 1
for(i in 2:t){
    x[i] <- rnorm(1, x[i-1] * a + b, 1)
    y[i] <- rnorm(1, x[i] * c, 1)
}

# build the model
rTimeModel <- nimbleModel(timeModelCode, constants = list(t = t),
                          data = list(y = y), check = FALSE)
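Once a model like rTimeModel is available, the general pattern for building and running a filter looks like the following sketch; the control option shown (saveAll) and the number-of-particles run argument are illustrative and should be checked against help(buildBootstrapFilter):

```r
# build a bootstrap filter for the latent states x, then compile and run it
rBootF <- buildBootstrapFilter(rTimeModel, nodes = "x",
                               control = list(saveAll = TRUE))
cTimeModel <- compileNimble(rTimeModel)
cBootF <- compileNimble(rBootF, project = rTimeModel)
logLikEst <- cBootF$run(10000)   # run the filter with 10000 particles
# equally-weighted posterior samples of the latent states
filterSamples <- as.matrix(cBootF$mvEWSamples)
```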
# Set parameter
rTimeModel$a <- a

Usage                Description                        Comments    Status    Vector input
x | y, x & y         logical OR (|) and AND (&)                     yes       yes
!x                   logical not                                    yes       yes
x > y, x >= y        greater than (and or equal to)                 yes       yes
x < y, x <= y        less than (and or equal to)                    yes       yes

11.2. R FUNCTIONS (OR VARIANTS) IMPLEMENTED IN NIMBLE
Usage                        Description                       Comments                        Status  Vector input
x != y, x == y               (not) equals                                                      yes     yes
x + y, x - y, x * y          component-wise operators          mix of scalar and vector        yes     yes
x / y                        component-wise division           vector x and scalar y           yes     yes
x^y, pow(x, y)               power                             x^y; vector x, scalar y         yes     yes
x %% y                       modulo (remainder)                                                yes     no
min(x1, x2), max(x1, x2)     min. (max.) of two scalars                                        yes     See pmin, pmax
exp(x)                       exponential                                                       yes     yes
log(x)                       natural logarithm                                                 yes     yes
sqrt(x)                      square root                                                       yes     yes
abs(x)                       absolute value                                                    yes     yes
step(x)                      step function at 0                0 if x < 0, 1 if x >= 0         yes     yes
equals(x, y)                 equality of two scalars           1 if x == y, 0 if x != y        yes     yes
cube(x)                      third power                       x^3                             yes     yes
sin(x), cos(x), tan(x)       trigonometric functions                                           yes     yes
asin(x), acos(x), atan(x)    inverse trigonometric functions                                   yes     yes
asinh(x), acosh(x), atanh(x) inv. hyperbolic trig. functions                                   yes     yes
logit(x)                     logit                             log(x/(1 - x))                  yes     yes
ilogit(x), expit(x)          inverse logit                     exp(x)/(1 + exp(x))             yes     yes
probit(x)                    probit (Gaussian quantile)        Φ⁻¹(x)                          yes     yes
iprobit(x), phi(x)           inverse probit (Gaussian CDF)     Φ(x)                            yes     yes
cloglog(x)                   complementary log log             log(-log(1 - x))                yes     yes
icloglog(x)                  inverse complementary log log     1 - exp(-exp(x))                yes     yes
ceiling(x)                   ceiling function                  ⌈x⌉                             yes     yes
floor(x)                     floor function                    ⌊x⌋                             yes     yes
round(x)                     round to integer                                                  yes     yes
trunc(x)                     truncation to integer                                             yes     yes
lgamma(x), loggam(x)         log gamma function                log Γ(x)                        yes     yes
besselK(k, nu, expon.scaled) modified bessel function                                          yes     yes
                             of the second kind
log1p(x)                     log of 1 + x                      log(1 + x)                      yes     yes
lfactorial(x), logfact(x)    log factorial                     log x!                          yes     yes
qDIST(x, PARAMS)             “q” distribution functions        canonical parameterization      yes     yes
pDIST(x, PARAMS)             “p” distribution functions        canonical parameterization      yes     yes
rDIST(x, PARAMS)             “r” distribution functions        canonical parameterization      yes     yes
dDIST(x, PARAMS)             “d” distribution functions        canonical parameterization      yes     yes
sort(x)                                                                                        no
rank(x, s)                                                                                     no
ranked(x, s)                                                                                   no
order(x)                                                                                       no

128
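As a base-R cross-check of the link functions in the table (these helper definitions are written out here for illustration; in NIMBLE code the functions are built in):

```r
# Base-R definitions matching the Comments column above
logit    <- function(x) log(x / (1 - x))       # same as qlogis(x)
ilogit   <- function(x) exp(x) / (1 + exp(x))  # same as plogis(x)
probit   <- function(x) qnorm(x)               # Gaussian quantile
iprobit  <- function(x) pnorm(x)               # Gaussian CDF
cloglog  <- function(x) log(-log(1 - x))
icloglog <- function(x) 1 - exp(-exp(x))

# Each inverse undoes its link:
ilogit(logit(0.3))      # 0.3
icloglog(cloglog(0.3))  # 0.3
```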

Table 11.3: Functions operating on vectors and matrices. The Status column
indicates if the function is currently provided in NIMBLE.

Usage                    Description                                  Comments                     Status
inverse(x)               matrix inverse                               x symmetric, positive def.   yes
chol(x)                  matrix Cholesky factorization                x symmetric, positive def.   yes
t(x)                     matrix transpose                             x⊤                           yes
x %*% y                  matrix multiply                              xy; x, y conformant          yes
inprod(x, y)             dot product                                  x⊤y; x and y vectors         yes
solve(x, y)              solve system of equations                    x⁻¹y; y matrix or vector     yes
forwardsolve(x, y)       solve lower-triangular system of equations   x⁻¹y; x lower-triangular     yes
backsolve(x, y)          solve upper-triangular system of equations   x⁻¹y; x upper-triangular     yes
logdet(x)                log matrix determinant                       log |x|                      yes
asRow(x)                 convert vector to 1-row matrix               sometimes automatic          yes
asCol(x)                 convert vector to 1-column matrix            sometimes automatic          yes
sum(x)                   sum of elements of x                                                      yes
mean(x)                  mean of elements of x                                                     yes
sd(x)                    standard deviation of elements of x                                       yes
prod(x)                  product of elements of x                                                  yes
min(x), max(x)           min. (max.) of elements of x                                              yes
pmin(x, y), pmax(x, y)   vector of mins (maxs) of elements of x and y                              yes
interp.lin(x, v1, v2)    linear interpolation                                                      no
eigen(x)                 matrix eigendecomposition; returns a         x symmetric                  yes
                         nimbleList of type eigenNimbleList
svd(x)                   matrix singular value decomposition;                                      yes
                         returns a nimbleList of type svdNimbleList
More information on the nimbleLists returned by the eigen and svd functions in NIMBLE can be
found in Section 14.2.1.
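As a sketch of how eigen can be used inside run code (the element names values and vectors are those of eigenNimbleList; assuming the nimble package is installed):

```r
library(nimble)

# Sketch: eigendecomposition of a symmetric matrix inside a nimbleFunction.
# eigen() returns an eigenNimbleList with elements $values and $vectors.
eigDemo <- nimbleFunction(
  run = function(M = double(2)) {
    e <- eigen(M)
    return(e$values)       # vector of eigenvalues
    returnType(double(1))
  })
```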

11.2.4 Distribution functions

Distribution ‘d’, ‘r’, ‘p’, and ‘q’ functions can all be used from nimbleFunctions (and in BUGS
model code), but care is needed in the syntax, as follows.
• Names of the distributions generally (but not always) match those of R, which sometimes
differ from BUGS. See the list below.
• Supported parameterizations are also indicated in the list below.
• For multivariate distributions (multivariate normal, Dirichlet, and Wishart), ‘r’ functions only
return one random draw at a time, and the first argument must always be 1.
• R’s recycling rule (re-use of an argument as needed to accommodate longer values of other
arguments) is generally followed, but the returned object is always a scalar or a vector, not a
matrix or array.


As in R (and nimbleFunctions), arguments are matched by order or by name (if given). Standard
arguments to distribution functions in R (log, log.p, lower.tail) can be used and have the same
defaults. User-defined distributions for BUGS (Chapter 12) can also be used from nimbleFunctions.
For standard distributions, we rely on R's regular help pages (e.g., help(dgamma)). For distributions
unique to NIMBLE (e.g., dexp_nimble, ddirch), we provide help pages.
Supported distributions, listed by their ‘d’ function, include:
• dbinom(x, size, prob, log)
• dcat(x, prob, log)
• dmulti(x, size, prob, log)
• dnbinom(x, size, prob, log)
• dpois(x, lambda, log)
• dbeta(x, shape1, shape2, log)
• dchisq(x, df, log)
• dexp(x, rate, log)
• dexp_nimble(x, rate, log)
• dexp_nimble(x, scale, log)
• dgamma(x, shape, rate, log)
• dgamma(x, shape, scale, log)
• dinvgamma(x, shape, rate, log)
• dinvgamma(x, shape, scale, log)
• dlnorm(x, meanlog, sdlog, log)
• dlogis(x, location, scale, log)
• dnorm(x, mean, sd, log)
• dt_nonstandard(x, df, mu, sigma, log)
• dt(x, df, log)
• dunif(x, min, max, log)
• dweibull(x, shape, scale, log)
• ddirch(x, alpha, log)
• dmnorm_chol(x, mean, cholesky, prec_param, log)
• dmvt_chol(x, mu, cholesky, df, prec_param, log)
• dwish_chol(x, cholesky, df, scale_param, log)

In the last three, cholesky stands for the Cholesky decomposition of the relevant matrix; prec_param
is a logical indicating whether the Cholesky factor is of a precision matrix (TRUE) or a covariance
matrix (FALSE)³; and scale_param is a logical indicating whether the Cholesky factor is of a scale
matrix (TRUE) or an inverse scale matrix (FALSE).
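For example, the multivariate normal density can be evaluated from a covariance matrix by passing its Cholesky factor (a sketch; function and variable names are illustrative, and the nimble package is assumed to be loaded):

```r
library(nimble)

# Sketch: evaluate a multivariate normal log-density given a covariance
# matrix, using its (upper-triangular) Cholesky factor.
mvnLogDens <- nimbleFunction(
  run = function(x = double(1), mean = double(1), Sigma = double(2)) {
    ch <- chol(Sigma)
    # prec_param = FALSE: 'ch' is the Cholesky factor of a covariance matrix
    return(dmnorm_chol(x, mean, ch, prec_param = FALSE, log = TRUE))
    returnType(double(0))
  })
```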

³ For the multivariate t, these are more properly termed the ‘inverse scale’ and ‘scale’ matrices.

11.2.5 Flow control: if-then-else, for, while, and stop

These basic flow-control structures use the same syntax as in R. However, for-loops are limited to
sequential integer indexing. For example, for(i in 2:5) {...} works as it does in R. Decreasing
index sequences are not allowed. Unlike in R, if is not itself a function that returns a value.
We plan to include more flexible for-loops in the future, but for now we've included just
one additional useful feature: for(i in seq_along(NFL)) will work as in R, where NFL is a
nimbleFunctionList. This is described in Section 15.4.8.
stop, or equivalently nimStop, throws control to R's error-handling system and can take a character
argument that will be displayed in an error message.
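A minimal sketch of these structures in run code (function and variable names are illustrative):

```r
library(nimble)

# Sketch: if, for, and stop in a nimbleFunction
sumFirstN <- nimbleFunction(
  run = function(n = integer(0)) {
    if(n < 1) stop("n must be at least 1")  # nimStop is equivalent
    total <- 0
    for(i in 1:n) {        # sequential integer indexing only
      total <- total + i
    }
    return(total)
    returnType(double(0))
  })
```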

11.2.6 print and cat
print, or equivalently nimPrint, prints an arbitrary set of outputs in order and adds a newline
character at the end. cat or nimCat is identical, except without a newline at the end.

11.2.7 Checking for user interrupts: checkInterrupt

When you write algorithms that will run for a long time in C++, you may want to explicitly check
whether a user has tried to interrupt the execution (i.e., by pressing Control-C). Simply include
checkInterrupt in run code at places where a check should be done. If there has been an interrupt
waiting to be handled, the process will stop and return control to R.
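For example (a sketch; the loop and the check interval are illustrative):

```r
library(nimble)

# Sketch: allow Ctrl-C to interrupt a long-running compiled loop
longComputation <- nimbleFunction(
  run = function(n = integer(0)) {
    total <- 0
    for(i in 1:n) {
      total <- total + sqrt(i)
      if(i %% 100000 == 0) checkInterrupt()  # check periodically, not every pass
    }
    return(total)
    returnType(double(0))
  })
```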

11.2.8 Optimization: optim and nimOptim

NIMBLE provides a nimOptim function that partially implements R's optim function, with some minor
differences. nimOptim supports the optimization methods 'Nelder-Mead', 'BFGS', 'CG', and 'L-BFGS-B',
but does not support 'SANN' or 'Brent'. NIMBLE's nimOptim supports gradients via user-provided
functions if available, or finite differences otherwise, but it does not currently support Hessian
computations. Currently nimOptim does not support extra parameters to the function being optimized
(via ...), but a work-around is to create a new nimbleFunction that binds those fixed parameters.
Finally, nimOptim requires a nimbleList datatype for the control parameter, whereas R's optim uses
a simple R list. To define the control parameter, create a default value with the
nimOptimDefaultControl function and set any desired fields. For example
usage, see the unit tests in tests/test-optim.R.
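A sketch of the pattern (the objective and all names are illustrative; treat the details of the returned list as an assumption to check against the unit tests mentioned above):

```r
library(nimble)

# Sketch: minimize f(par) = sum((par - 3)^2) from within a nimbleFunction
objective <- nimbleFunction(
  run = function(par = double(1)) {
    return(sum((par - 3)^2))
    returnType(double(0))
  })
minimizer <- nimbleFunction(
  run = function(start = double(1)) {
    control <- nimOptimDefaultControl()
    res <- optim(start, objective, method = "BFGS", control = control)
    return(res$par)        # should be near (3, 3, ...)
    returnType(double(1))
  })
```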

11.2.9 ‘nim’ synonyms for some functions

NIMBLE uses some keywords, such as dim and print, in ways similar but not identical to R. In
addition, there are some keywords in NIMBLE that have the same names as R functions with
quite different functionality. For example, step is part of the BUGS language, but it is also an R
function for stepwise model selection. And equals is part of the BUGS language but is also used
in the testthat package, which we use in testing NIMBLE.
NIMBLE tries to avoid conflicts by replacing some keywords immediately upon creating a
nimbleFunction. These replacements include:
• c → nimC
• copy → nimCopy
• dim → nimDim
• print → nimPrint
• cat → nimCat
• step → nimStep
• equals → nimEquals
• rep → nimRep
• round → nimRound
• seq → nimSeq
• stop → nimStop
• switch → nimSwitch
• numeric, integer, logical → nimNumeric, nimInteger, nimLogical
• matrix, array → nimMatrix, nimArray

This system gives programmers a choice: use keywords like nimPrint directly, to avoid confusion in
their own code about which ‘print’ is being used, or use the more intuitive keywords like print
while remembering that they are not the same as R’s functions.
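For instance (a small sketch):

```r
library(nimble)

# Sketch: the 'nim' synonyms can be written explicitly for clarity
synDemo <- nimbleFunction(
  run = function() {
    v <- nimNumeric(3, value = 1)  # same as numeric(3, value = 1)
    nimPrint("v = ", v)            # same as print("v = ", v)
  })
```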

11.3 How NIMBLE handles types of variables

Variables in the NIMBLE language are statically typed. Once a variable is used for one type, it
can’t subsequently be used for a different type. This rule facilitates NIMBLE’s compilation to C++.
The NIMBLE compiler often determines types automatically, but sometimes the programmer needs
to explicitly provide them.
The elemental types supported by NIMBLE include double (floating-point), integer, logical, and
character. The type of a numeric or logical object refers to the number of dimensions and the
elemental type of the elements. Hence if x is created as a double matrix, it can only be used
subsequently for a double matrix. The size of each dimension is not part of its type and thus can
be changed. Up to four dimensions are supported for double, integer, and logical. Only vectors
(one dimension) are supported for character. Unlike R, NIMBLE supports true scalars, which have
0 dimensions.

11.3.1 nimbleList data structures

A nimbleList is a data structure that can contain arbitrary other NIMBLE objects, including
other nimbleLists. Like other NIMBLE types, nimbleLists are strongly typed: each nimbleList
is created from a configuration that declares what types of objects it will hold. nimbleLists are
covered in Section 14.2.
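As a brief sketch (names are illustrative; see Section 14.2 for the full interface), a nimbleList definition is created by declaring the types of its elements, and instances are made with new:

```r
library(nimble)

# Sketch: declare a nimbleList type holding a double vector and a scalar,
# then create an instance of it
pointListDef <- nimbleList(coords = double(1), label = double(0))
p <- pointListDef$new(coords = c(1.5, 2.5), label = 7)
p$coords   # a double vector; its type is fixed by the declaration
```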

11.3.2 How numeric types work

R’s dynamic types support easy programming because one type can sometimes be transformed
to another type automatically when an expression is evaluated. NIMBLE’s static types make it
stricter than R.


11.3.2.1 When NIMBLE can automatically set a numeric type
When a variable is first created by assignment, its type is determined automatically by that assignment. For example, if x has not appeared before, then
x <- A %*% B # assume A and B are double matrices or vectors
will create x to be a double matrix of the correct size (determined during execution).
11.3.2.1.1 Avoid changing types of a variable within a nimbleFunction

Because NIMBLE is statically typed, you cannot use the same variable name for two objects of
different types (including objects of different dimensions).
Suppose we have (implicitly) created x as a double matrix. If x is used subsequently, it can only
be used as a double matrix. This is true even if it is assigned a new value, which will again set its
size automatically but cannot change its type.
x <- A %*% B # assume A and B are double matrices or vectors
x <- nimMatrix(0, nrow = 5, ncol = 2) # OK: 'x' is still a double matrix
x <- rnorm(10) # NOT OK: 'x' is a double vector

11.3.2.2 When a numeric object needs to be created before being used
If the contents of a variable are to be populated by assignment into some indices in steps, the
variable must be created first. Further, it must be large enough for its eventual contents; it will
not be automatically resized if assignments are made beyond its current size. For example, in the
following code, x must be created before being filled with contents for specific indices.
x <- numeric(10)
for(i in 1:10)
    x[i] <- foo(y[i])

11.3.2.3 Changing the sizes of existing objects: setSize
setSize changes the size of an object, preserving its contents in column-major order.
# Example of creating and resizing a floating-point vector
# myNumericVector will be of length 10, with all elements initialized to 2
myNumericVector <- numeric(10, value = 2)
# resize this numeric vector to be length 20; last 10 elements will be 0
setSize(myNumericVector, 20)


# Example of creating a 1-by-10 matrix with values 1:10 and resizing it
myMatrix <- matrix(1:10, nrow = 1, ncol = 10)
# resize this matrix to be a 10-by-10 matrix
setSize(myMatrix, c(10, 10))
# The first column will contain the values 1:10

11.3.2.4 Confusions between scalars and length-one vectors
In R, there is no such thing as a true scalar; scalars can always be treated as vectors of length one.
NIMBLE allows true scalars, which can create confusion. For example, consider the following
code:
myfun <- nimbleFunction(
    run = function(i = integer()) {    # i is an integer scalar
        randomValues <- rnorm(10)      # double vector
        a <- randomValues[i]           # double scalar
        b <- randomValues[i:i]         # double vector
        d <- a + b                     # double vector
        f <- c(i)                      # integer vector
    })
In the line that creates b, the index range i:i is not evaluated until run time. Even though i:i
will always evaluate to simply i, the compiler does not determine that. Since a vector index
range is provided, the result of randomValues[i:i] is determined to be a vector. The following line
then creates d as a vector, because a vector plus a scalar returns a vector. Another way to create
a vector from a scalar is to use c, as illustrated in the last line.
11.3.2.5 Confusions between vectors and one-column or one-row matrices
Consider the following code:
myfun <- nimbleFunction(
    run = function() {
        A <- matrix(value = rnorm(9), nrow = 3)
        B <- rnorm(3)
        Cmatrix <- A %*% B              # double matrix, one column
        Cvector <- (A %*% B)[, 1]       # double vector
        Cmatrix <- (A %*% B)[, 1]       # error: vector assigned to matrix
        Cmatrix[, 1] <- (A %*% B)[, 1]  # ok, if Cmatrix is large enough
    })
This creates a matrix A, a vector B, and matrix-multiplies them. The vector B is automatically
treated as a one-column matrix in matrix algebra computations. The result of matrix multiplication
is always a matrix, but a programmer may expect a vector, since they know the result will have
one column. To make it a vector, simply extract the first column. More information about such
handling is provided in the next section.


11.3.2.6 Understanding dimensions and sizes from linear algebra
As much as possible, NIMBLE behaves like R when determining types and sizes returned from linear
algebra expressions, but in some cases this is not possible because R uses run-time information while
NIMBLE must determine dimensions at compile time. For example, when matrix multiplying a
matrix by a vector, R treats the vector as a one-column matrix unless treating it as a one-row
matrix is the only way to make the expression valid, as determined at run time. NIMBLE usually
must assume during compilation that it should be a one-column matrix, unless it can determine
not just the number of dimensions but the actual sizes during compilation. When needed, asRow
and asCol can control how a vector will be treated as a matrix.
Here is a guide to such issues. Suppose v1 and v2 are vectors, and M1 is a matrix. Then
• v1 + M1 generates a compilation error unless one dimension of M1 is known at compile-time
to be 1. If so, then v1 is promoted to a 1-row or 1-column matrix to conform with M1, and
the result is a matrix of the same sizes. This behavior occurs for all component-wise binary
functions.
• v1 %*% M1 defaults to promoting v1 to a 1-row matrix, unless it is known at compile-time
that M1 has 1 row, in which case v1 is promoted to a 1-column matrix.
• M1 %*% v1 defaults to promoting v1 to a 1-column matrix, unless it is known at compile time
that M1 has 1 column, in which case v1 is promoted to a 1-row matrix.
• v1 %*% v2 promotes v1 to a 1-row matrix and v2 to a 1-column matrix, so the returned
value is a 1x1 matrix containing the inner product of v1 and v2. If you want the inner product as
a scalar, use inprod(v1, v2).
• asRow(v1) explicitly promotes v1 to a 1-row matrix. Therefore v1 %*% asRow(v2) gives the
outer product of v1 and v2.
• asCol(v1) explicitly promotes v1 to a 1-column matrix.
• The default promotion for a vector is to a 1-column matrix. Therefore, v1 %*% t(v2) is
equivalent to v1 %*% asRow(v2) .
• When indexing, dimensions with scalar indices will be dropped. For example, M1[1,] and
M1[,1] are both vectors. If you do not want this behavior, use drop=FALSE just as in R. For
example, M1[1,,drop=FALSE] is a matrix.
• The left-hand side of an assignment can use indexing, but if so it must already be correctly
sized for the result. For example, Y[5:10, 20:30] <- x will not work, and could crash
your R session with a segmentation fault, if Y is not already at least 10x30 in size. Y can
be sized appropriately with setSize(Y, c(10, 30)). See Section 11.3.2.3 for more details. Note that
non-indexed assignment to Y, such as Y <- x, will automatically set Y to the necessary size.
Here are some examples to illustrate the above points, assuming M2 is a square matrix.
• Y <- v1 + M2 %*% v2 will return a 1-column matrix. If Y is created by this statement, it
will be a 2-dimensional variable. If Y already exists, it must already be 2-dimensional, and it
will be automatically re-sized for the result.
• Y <- v1 + (M2 %*% v2)[,1] will return a vector. Y will either be created as a vector or
must already exist as a vector and will be re-sized for the result.
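A sketch combining several of these rules (names are illustrative):

```r
library(nimble)

# Sketch: vector promotion in matrix algebra inside a nimbleFunction
promoDemo <- nimbleFunction(
  run = function(v1 = double(1), v2 = double(1), M2 = double(2)) {
    ip <- inprod(v1, v2)         # scalar inner product
    op <- v1 %*% asRow(v2)       # outer product: length(v1) x length(v2) matrix
    Y <- v1 + (M2 %*% v2)[, 1]   # vector result, as in the example above
    return(Y)
    returnType(double(1))
  })
```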


11.3.2.7 Size warnings and the potential for crashes
For matrix algebra, NIMBLE cannot ensure perfect behavior because sizes are not known until
run time. Therefore, it is possible to write code that will crash your R session. In Version
0.6-13, NIMBLE attempts to issue a warning if sizes are not compatible, but it does not halt
execution. Therefore, if you execute A <- M1 %*% M2, and M1 and M2 are not compatible for matrix
multiplication, NIMBLE will output a warning that the number of columns of M1 does not match the
number of rows of M2. After that warning the statement will be executed and may result in
a crash. Another easy way to write code that will crash is to assign into indices beyond an
object's current size, such as Y[5:10, 20:30] <- x when Y has not been made large enough (see
Section 11.3.2.3).

If your nimbleFunctions are very fast, say under 1 ms, then timing them from R with microbenchmark
will be inaccurate due to R-to-C++ conversion overhead (which won't occur when the functions are
called from other compiled code). To get timing information from within C++, NIMBLE provides a
run.time function that avoids the R-to-C++ overhead.
myMicrobenchmark <- compileNimble(nimbleFunction(
    run = function(iters = integer(0)){
        time1 <- run.time({
            for (t in 1:iters) myCompiledFunVersion1(1.234)
        })
        time2 <- run.time({
            for (t in 1:iters) myCompiledFunVersion2(1.234)
        })
        return(c(time1, time2))
        returnType(double(1))
    }))
print(myMicrobenchmark(100000))

15.9 Clearing and unloading compiled objects

Sometimes it is useful to clear all the compiled objects from a project and unload the shared library
produced by your C++ compiler. To do so, you can use nimble:::clearCompiled(obj) where
obj is a compiled object such as a compiled model or nimbleFunction (e.g., a compiled MCMC
algorithm). This will clear all compiled objects associated with your NIMBLE project. For example,
if cModel is a compiled model, nimble:::clearCompiled(cModel) will clear both the model and
all associated nimbleFunctions, such as compiled MCMCs, that use that model. Be careful: use of
nimble:::clearCompiled can be dangerous. If you keep copies of R objects that interfaced to
compiled C++ objects and attempt to use those R objects after clearing their compiled counterparts,
you may crash R. We have tried to minimize that risk, but we can't guarantee safe behavior.

15.10 Reducing memory usage

NIMBLE can create a lot of objects in its processing, and some of them use R features such as
reference classes that are heavy in memory usage. We have noticed that building large models can
use lots of memory. To help alleviate this, we provide two options, which can be controlled via
nimbleOptions.
As noted above, the option buildInterfacesForCompiledNestedNimbleFunctions defaults to
FALSE, which means NIMBLE will not build full interfaces to compiled nimbleFunctions that only
appear within other nimbleFunctions. If you want access to all such nimbleFunctions, set
buildInterfacesForCompiledNestedNimbleFunctions = TRUE. This will use more memory but
can be useful for debugging.
The option clearNimbleFunctionsAfterCompiling is more drastic, and it is experimental, so
‘buyer beware’. This will clear much of the contents of an uncompiled nimbleFunction object after
it has been compiled in an effort to free some memory. We expect to be able to keep making
NIMBLE more efficient – faster execution and lower memory use – in the future.
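Both options are set through nimbleOptions, for example:

```r
library(nimble)

# Reduce memory use: skip full interfaces for nested nimbleFunctions
# (this is the default), and, experimentally, clear uncompiled
# nimbleFunction contents after compilation.
nimbleOptions(buildInterfacesForCompiledNestedNimbleFunctions = FALSE)
nimbleOptions(clearNimbleFunctionsAfterCompiling = TRUE)
```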



