Neural Network Design & Deployment
Olivier Bichler, David Briand, Victor Gacoin, Benjamin Bertelone
Wednesday 14th February, 2018

Commissariat à l’Energie Atomique et aux Energies Alternatives
Institut List | CEA Saclay Nano-INNOV | Bât. 861-PC142
91191 Gif-sur-Yvette Cedex - FRANCE
Tel. : +33 (0)1.69.08.49.67 | Fax : +33(0)1.69.08.83.95
www-list.cea.fr
Établissement Public à caractère Industriel et Commercial | RCS Paris B 775 685 019

Département Architecture Conception et Logiciels Embarqués

Contents

1 Presentation
  1.1 Database handling
  1.2 Data pre-processing
  1.3 Deep network building
  1.4 Performances evaluation
  1.5 Hardware exports
  1.6 Summary

2 About N2D2-IP

3 Performing simulations
  3.1 Obtaining the latest version of this manual
  3.2 Minimum system requirements
  3.3 Obtaining N2D2
    3.3.1 Prerequisites (Red Hat Enterprise Linux (RHEL) 6, Ubuntu, Windows)
    3.3.2 Getting the sources
    3.3.3 Compilation
  3.4 Downloading training datasets
  3.5 Run the learning
  3.6 Test a learned network
    3.6.1 Interpreting the results (recognition rate, confusion matrix, memory and computation requirements, kernels and weights distribution, output maps activity)
  3.7 Export a learned network
    3.7.1 C export
    3.7.2 CPP_OpenCL export
    3.7.3 CPP_TensorRT export
    3.7.4 CPP_cuDNN export
    3.7.5 C_HLS export

4 INI file interface
  4.1 Syntax
    4.1.1 Properties
    4.1.2 Sections
    4.1.3 Case sensitivity
    4.1.4 Comments
    4.1.5 Quoted values
    4.1.6 Whitespace
    4.1.7 Escape characters
  4.2 Template inclusion syntax
    4.2.1 Variable substitution
    4.2.2 Control statements (block, for, if, include)
  4.3 Global parameters
  4.4 Databases
    4.4.1 MNIST
    4.4.2 GTSRB
    4.4.3 Directory
    4.4.4 Other built-in databases (CIFAR10_Database, CIFAR100_Database, CKP_Database, Caltech101_DIR_Database, Caltech256_DIR_Database, CaltechPedestrian_Database, Daimler_Database, FDDB_Database, GTSDB_DIR_Database, ILSVRC2012_Database, KITTI_Database, KITTI_Road_Database, LITISRouen_Database)
    4.4.5 Dataset images slicing
  4.5 Stimuli data analysis
    4.5.1 Zero-mean and unity standard deviation normalization
    4.5.2 Subtracting the mean image of the set
  4.6 Environment
    4.6.1 Built-in transformations (AffineTransformation, ApodizationTransformation, ChannelExtractionTransformation, ColorSpaceTransformation, DFTTransformation, DistortionTransformation, EqualizeTransformation, ExpandLabelTransformation, FilterTransformation, FlipTransformation, GradientFilterTransformation, LabelSliceExtractionTransformation, MagnitudePhaseTransformation, MorphologicalReconstructionTransformation, MorphologyTransformation, NormalizeTransformation, PadCropTransformation, RandomAffineTransformation, RangeAffineTransformation, RangeClippingTransformation, RescaleTransformation, ReshapeTransformation, SliceExtractionTransformation, ThresholdTransformation, TrimTransformation, WallisFilterTransformation)
  4.7 Network layers
    4.7.1 Layer definition
    4.7.2 Weight fillers (ConstantFiller, NormalFiller, UniformFiller, XavierFiller)
    4.7.3 Weight solvers (SGDSolver_Frame, SGDSolver_Frame_CUDA)
    4.7.4 Activation functions (Logistic, LogisticWithLoss, Rectifier, Saturation, Softplus, Tanh, TanhLeCun)
    4.7.5 Anchor
    4.7.6 Conv
    4.7.7 Deconv
    4.7.8 Pool
    4.7.9 Unpool
    4.7.10 ElemWise (Sum, AbsSum, EuclideanSum, Prod, Max operations)
    4.7.11 FMP
    4.7.12 Fc
    4.7.13 Rbf
    4.7.14 Softmax
    4.7.15 LRN
    4.7.16 Dropout
    4.7.17 BatchNorm
    4.7.18 Transformation

5 Tutorials
  5.1 Building a classifier neural network
  5.2 Building a segmentation neural network
    5.2.1 Faces detection
    5.2.2 Gender recognition
    5.2.3 ROIs extraction
    5.2.4 Data visualization
  5.3 Transcoding a learned network in spike-coding
    5.3.1 Render the network compatible with spike simulations
    5.3.2 Configure spike-coding parameters

1 Presentation

The N2D2 platform is a comprehensive solution for fast and accurate Deep Neural Network (DNN) simulation and for the fully automated building of DNN-based applications. The platform integrates database construction, data pre-processing, network building, benchmarking and hardware export to various targets. It is particularly useful for DNN design and exploration, allowing simple and fast prototyping of DNNs with different topologies. It is possible to define and learn multiple network topology variations and compare their performances (in terms of recognition rate and computational cost) automatically. Export targets include CPU, DSP and GPU with OpenMP, OpenCL, Cuda, cuDNN and TensorRT programming models, as well as custom hardware IP code generation with High-Level Synthesis for FPGA and a dedicated configurable DNN accelerator IP (ongoing work).
In the following, the first section describes the database handling capabilities of the tool, which can automatically generate learning, validation and testing data sets from any hand-made database (for example from simple file directories). The second section briefly describes the data pre-processing capabilities built into the tool, which do not require any external pre-processing step and can handle many data transformations, normalizations and augmentations (for example using elastic distortion to improve the learning). The third section shows an example of DNN building using a simple INI text configuration file. The fourth section shows some examples of metrics obtained after the learning and testing to evaluate the performances of the learned DNN. Next, the fifth section introduces the DNN hardware export capabilities of the toolflow, which can automatically generate ready-to-use code for various targets such as embedded GPUs or full custom dedicated FPGA IP. Finally, we conclude by summarising the main features of the tool.

1.1 Database handling

The tool integrates everything needed to handle custom or hand-made databases:
• Genericity: load images and sounds, with 1D, 2D or 3D data;
• Associate a label to each data point (useful for scene labeling for example) or a single label to each data file (one object/class per image for example), with 1D or 2D labels;
• Advanced Region of Interest (ROI) handling:
– Support arbitrary ROI shapes (circular, rectangular, polygonal or pixelwise defined);
– Convert ROIs to data point (pixelwise) labels;
– Extract one or multiple ROIs from an initial dataset to create as many corresponding additional data to feed the DNN;
• Native support of file directory-based databases, where each sub-directory represents a different label. The most used image file formats are supported (JPEG, PNG, PGM...);
• Possibility to add custom data file formats in the tool without any change in the code base;
• Automatic random partitioning of the database into learning, validation and testing sets.

1.2 Data pre-processing

Data pre-processing, such as image rescaling, normalization or filtering, is directly integrated into the toolflow, with no need for external tools or pre-processing. Each pre-processing step is called a transformation.
The full sequence of transformations can be specified easily in an INI text configuration file. For example:
; First step: convert the image to grayscale
[env.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray
; Second step: rescale the image to a 29x29 size
[env.Transformation-2]
Type=RescaleTransformation
Width=29
Height=29
; Third step: apply histogram equalization to the image
[env.Transformation-3]
Type=EqualizeTransformation
; Fourth step (only during learning): apply random elastic distortions to the images to extend the learning set
[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly
ElasticGaussianSize=21
ElasticSigma=6.0
ElasticScaling=20.0
Scaling=15.0
Rotation=15.0

Examples of pre-processing transformations built into the tool are:
• Image color space change and color channel extraction;
• Elastic distortion;
• Histogram equalization (including CLAHE);
• Convolutional filtering of the image with custom or pre-defined kernels (Gaussian, Gabor...);
• (Random) image flipping;
• (Random) extraction of fixed-size slices in a given label (for multi-label images);
• Normalization;
• Rescaling, padding/cropping, trimming;
• Image data range clipping;
• (Random) extraction of fixed-size slices.

1.3 Deep network building

The building of a deep network is straightforward and can be done within the same INI configuration file. Several layer types are available: convolutional, pooling, fully connected, Radial-basis function (RBF) and softmax. The tool is highly modular and new layer types can be added without any change in the code base. Parameters of each layer type are modifiable; for example, for the convolutional layer, one can specify the size of the convolution kernels, the stride, the number of kernels per input map and the learning parameters (learning rate, initial weights value...). For the learning, the data precision can be chosen between 16-bit (with NVIDIA® cuDNN, on future GPUs), 32-bit and 64-bit floating point numbers.
The following example, which will serve as the use case for the rest of this presentation, shows how to build a DNN with 5 layers: one convolution layer, followed by one MAX pooling layer, followed by two fully connected layers and a softmax output layer.
; Specify the input data format
[env]
SizeX=24
SizeY=24
BatchSize=12
; First layer: convolutional with 3x3 kernels
[conv1]
Input=env
Type=Conv
KernelWidth=3
KernelHeight=3
NbChannels=32
Stride=1
; Second layer: MAX pooling with pooling area 2x2
[pool1]
Input=conv1
Type=Pool
Pooling=Max
PoolWidth=2
PoolHeight=2
NbChannels=32
Stride=2
Mapping.Size=1 ; one to one connection between convolution output maps and pooling input maps
; Third layer: fully connected layer with 60 neurons
[fc1]
Input=pool1
Type=Fc
NbOutputs=60
; Fourth layer: fully connected with 10 neurons
[fc2]
Input=fc1
Type=Fc
NbOutputs=10
; Final layer: softmax
[softmax]
Input=fc2
Type=Softmax
NbOutputs=10
WithLoss=1
[softmax.Target]
TargetValue=1.0
DefaultValue=0.0

The resulting DNN is shown in figure 1.

[Figure 1: Automatically generated and ready-to-learn DNN from the INI configuration file example: env (24x24) → conv1 (32 maps, 22x22) → pool1, Max (32 maps, 11x11) → fc1 (60) → fc2 (10) → softmax (10).]

The learning is accelerated on GPU using the NVIDIA® cuDNN framework, integrated into the toolflow. Using GPU acceleration, learning times can typically be reduced by two orders of magnitude, enabling the learning of large databases within tens of minutes to a few hours, instead of several days or weeks for non-GPU accelerated learning.

1.4 Performances evaluation

The software automatically outputs all the information needed for the analysis of the network's applicative performances, such as the recognition rate and the validation score during the learning; the confusion matrix during learning, validation and test; the memory and computation requirements of the network; the output maps activity for each layer; and so on, as shown in figure 2.

[Figure 2: Example of information automatically generated by the software during and after learning: recognition rate and validation score, confusion matrix, memory and computation requirements, output maps activity.]

1.5 Hardware exports

Once the learned DNN recognition rate performances are satisfying, an optimized version of the
network can be automatically exported for various embedded targets. An automated network
computation performances benchmarking can also be performed among different targets.
The following targets are currently supported by the toolflow:
• Plain C code (no dynamic memory allocation, no floating point processing);

8/69

env

conv1

24x24

32 (22x22)

pool1
32 (11x11)

Max

fc1

fc2

softmax

60

10

10

Figure 1: Automatically generated and ready to learn DNN from the INI configuration file example.

Recognition rate and validation score

Confusion matrix

Memory and computation requirements

Output maps activity

Figure 2: Example of information automatically generated by the software during and after learning.

• C code accelerated with OpenMP;
• C code tailored for High-Level Synthesis (HLS) with Xilinx® Vivado® HLS;
– Direct synthesis to FPGA, with timing and utilization after routing;
– Possibility to constrain the maximum number of clock cycles desired to compute the whole network;
– FPGA utilization vs. number of clock cycles trade-off analysis;
• OpenCL code optimized for either CPU/DSP or GPU;
• Cuda kernels, cuDNN and TensorRT code optimized for NVIDIA® GPUs.
Different automated optimizations are embedded in the exports:
• DNN weights and signal data precision reduction (down to 8-bit integers or less for custom FPGA IPs);
• Non-linear network activation function approximations;
• Different weights discretization methods.
The exports are generated automatically and come with a Makefile and a working testbench, including the pre-processed testing dataset. Once generated, the testbench is ready to be compiled and executed on the target platform. The applicative performance (recognition rate) as well as the computing time per input data can then be directly measured by the testbench.

[Figure 3: Example of network benchmarking on different hardware targets (OpenMP, OpenCL, CUDA, HLS FPGA), in Kpixels image/s, log scale.]

Figure 3 shows an example of benchmarking results of the previous DNN on different targets (in log scale). Compared to desktop CPUs, the number of input image pixels processed per second is more than one order of magnitude higher with GPUs, and at least two orders of magnitude better with a synthesized DNN on FPGA.

1.6 Summary

The N2D2 platform is today a complete and production-ready neural network building tool, which does not require advanced knowledge in deep learning to be used. It is tailored for fast neural network application generation and porting, with minimum overhead in terms of database creation and management, data pre-processing, network configuration and optimized code generation, which can turn months of manual porting and verification effort into a single automated step in the tool.


2 About N2D2-IP

While N2D2 is our deep learning open-source core framework, some modules, referred to as "N2D2-IP" in this manual, are only available through a custom license agreement with CEA LIST.
If you are interested in obtaining some of these modules, please contact our business developer for more information on the available licensing options:
Sandrine VARENNE (Sandrine.VARENNE@cea.fr)
In addition to N2D2-IP modules, we can also provide our expertise to design specific solutions for integrating DNNs in embedded hardware systems, where power, latency, form factor and/or cost are constrained. We can target CPU/DSP/GPU COTS hardware as well as our own PNeuro (programmable) and DNeuro (dataflow) dedicated hardware accelerator IPs for DNNs on FPGA or ASIC.

3 Performing simulations

3.1 Obtaining the latest version of this manual

Before going further, please make sure you are reading the latest version of this manual. It is located
in the manual sub-directory. To compile the manual in PDF, just run the following command:
cd manual && make

In order to compile the manual, you must have pdflatex and bibtex installed, as well as some
common LaTeX packages.
• On Ubuntu, this can be done by installing the texlive and texlive-latex-extra software
packages.
• On Windows, you can install the MiKTeX software, which includes everything needed and will
install the required LaTeX packages on the fly.

3.2 Minimum system requirements

• Supported processors:
– ARM Cortex A15 (tested on Tegra K1)
– ARM Cortex A53/A57 (tested on Tegra X1)
– Pentium-compatible PC (Pentium III, Athlon or more recent system recommended)
• Supported operating systems:
– Windows ≥ 7 or Windows Server ≥ 2012, 64 bits, with Visual Studio ≥ 2013.3 (2013 Update 3)
– GNU/Linux with GCC ≥ 4.4 (tested on RHEL ≥ 6, Debian ≥ 6, Ubuntu ≥ 14.04)
• At least 256 MB of RAM (1 GB with GPU/CUDA) for MNIST dataset processing
• At least 150 MB of available hard disk space + 350 MB for MNIST dataset processing
For CUDA acceleration:
• CUDA ≥ 6.5 and cuDNN ≥ 1.0
• NVIDIA GPU with CUDA compute capability ≥ 3 (starting from the Kepler micro-architecture)
• At least 512 MB of GPU RAM for MNIST dataset processing


3.3 Obtaining N2D2

3.3.1 Prerequisites

Red Hat Enterprise Linux (RHEL) 6 Make sure you have the following packages installed:
• cmake
• gnuplot
• opencv
• opencv-devel (may require the rhel-x86_64-workstation-optional-6 repository channel)

Plus, to be able to use GPU acceleration:
• Install the CUDA repository package:
rpm -Uhv http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.5-18.x86_64.rpm
yum clean expire-cache
yum install cuda
• Install cuDNN from the NVIDIA website: register to NVIDIA Developer and download the latest version of cuDNN. Simply copy the header and library files from the cuDNN archive to the corresponding directories in the CUDA installation path (by default: /usr/local/cuda/include and /usr/local/cuda/lib64, respectively).
• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.
Ubuntu Make sure you have the following packages installed, if they are available on your Ubuntu version:
• cmake
• gnuplot
• libopencv-dev
• libcv-dev
• libhighgui-dev

Plus, to be able to use GPU acceleration:
• Install the CUDA repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
• Install the cuDNN repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
dpkg -i nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
Note that the cuDNN repository package is provided by NVIDIA for Ubuntu starting from version 14.04.
• Update the package lists: apt-get update
• Install the CUDA and cuDNN required packages:
apt-get install cuda-core-7-5 cuda-cudart-dev-7-5 cuda-cublas-dev-7-5 cuda-curand-dev-7-5 libcudnn5-dev
• Make sure there is a symlink to /usr/local/cuda:
ln -s /usr/local/cuda-7.5 /usr/local/cuda
• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.

Windows On Windows 64 bits, Visual Studio ≥ 2013.3 (2013 Update 3) is required.
Make sure you have the following software installed:
• CMake (http://www.cmake.org/): download and run the Windows installer.
• dirent.h C++ header (https://github.com/tronkko/dirent): to be put in the Visual Studio include path.
• Gnuplot (http://www.gnuplot.info/): the bin sub-directory in the install path needs to be added to the Windows PATH environment variable.
• OpenCV (http://opencv.org/): download the latest 2.x version for Windows and extract it to, for example, C:\OpenCV\. Make sure to define the environment variable OpenCV_DIR to point to C:\OpenCV\opencv\build. Make sure to add the bin sub-directory (C:\OpenCV\opencv\build\x64\vc12\bin) to the Windows PATH environment variable.
Plus, to be able to use GPU acceleration:
• Download and install the CUDA toolkit 8.0 located at https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_windows-exe:
rename cuda_8.0.44_windows-exe cuda_8.0.44_windows.exe
cuda_8.0.44_windows.exe -s compiler_8.0 cublas_8.0 cublas_dev_8.0 cudart_8.0 curand_8.0 curand_dev_8.0
• Update the PATH environment variable:
set PATH=%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin;%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\libnvvp;%PATH%
• Download and install cuDNN for CUDA 8.0 located at http://developer.download.nvidia.com/compute/redist/cudnn/v5.1/cudnn-8.0-windows7-x64-v5.1.zip (the following command assumes that you have 7-Zip installed):
7z x cudnn-8.0-windows7-x64-v5.1.zip
copy cuda\include\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include\"
copy cuda\lib\x64\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64\"
copy cuda\bin\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\"

3.3.2 Getting the sources
Use the following command:
git clone git@github.com:CEA-LIST/N2D2.git

3.3.3 Compilation

To compile the program:
mkdir build
cd build
cmake .. && make

On Windows, you may have to specify the generator, for example:
cmake .. -G"Visual Studio 12"

Then open the newly created N2D2 project in Visual Studio 2013. Select "Release" for the build
target. Right click on ALL_BUILD item and select "Build".


3.4 Downloading training datasets

A Python script located in the repository root directory allows you to select and automatically download some well-known datasets, like MNIST and GTSRB (the script requires Python 2.x with bindings for the GTK 2 package):
./tools/install_stimuli_gui.py

By default, the datasets are downloaded in the path specified in the N2D2_DATA environment
variable, which is the root path used by the N2D2 tool to locate the databases. If the N2D2_DATA
variable is not set, the default value used is /local/$USER/n2d2_data/ (or /local/n2d2_data/ if
the USER environment variable is not set) on Linux and C:\n2d2_data\ on Windows.
Please make sure you have write access to the N2D2_DATA path, or if not set, in the default
/local/$USER/n2d2_data/ path.

3.5 Run the learning

The following command will run the learning for 600,000 image presentations/steps and log the
performances of the network every 10,000 steps:
./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -learn 600000 -log 10000

Note: you may want to check the gradient computation using the -check option. Note that it
can be extremely long and can occasionally fail if the required precision is too high.

3.6 Test a learned network

After the learning is completed, the following command evaluates the network performances on the test data set:
./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -test

3.6.1 Interpreting the results

Recognition rate The recognition rate and the validation score are reported during the learning in the TargetScore_*/Success_validation.png file, as shown in figure 4.
Confusion matrix The software automatically outputs the confusion matrix during learning, validation and test, with an example shown in figure 5. Each row of the matrix contains the number of occurrences estimated by the network for each label, for all the data corresponding to a single actual, target label. Or, equivalently, each column of the matrix contains the number of actual, target label occurrences corresponding to the same estimated label. Ideally, the matrix should be diagonal, with no occurrence of an estimated label for a different actual label (network mistake).
The confusion matrix reports can be found in the simulation directory:
• TargetScore_*/ConfusionMatrix_learning.png;
• TargetScore_*/ConfusionMatrix_validation.png;
• TargetScore_*/ConfusionMatrix_test.png.
Memory and computation requirements The software also reports the memory and computation requirements of the network, as shown in figure 6. The corresponding report can be found in the stats sub-directory of the simulation.

[Figure 6: Example of memory and computation requirements of the network.]

Kernels and weights distribution The synaptic weights obtained during and after the learning can be analyzed, in terms of distribution (weights sub-directory of the simulation) or in terms of kernels (kernels sub-directory of the simulation), as shown in figure 7.

[Figure 7: Example of kernels and weights distribution analysis for two convolutional layers (conv1 and conv2 kernels, conv1 and conv2 weights distributions).]

Figure 4: Recognition rate and validation score during learning.

Figure 5: Example of confusion matrix obtained after the learning.

Output maps activity The initial output maps activity for each layer can be visualized in the outputs_init sub-directory of the simulation, as shown in figure 8.

[Figure 8: Output maps activity example of the first convolutional layer of the network.]

3.7 Export a learned network


./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -export CPP_OpenCL

Export types:
• C C export using OpenMP;
• C_HLS C export tailored for HLS with Vivado HLS;
• CPP_OpenCL C++ export using OpenCL;
• CPP_Cuda C++ export using Cuda;
• CPP_cuDNN C++ export using cuDNN;
• CPP_TensorRT C++ export using tensorRT 2.1 API;
• SC_Spike SystemC spike export.
Other program options related to the exports:
Option [default value]  Description
-nbbits [8]  Number of bits for the weights and signals. Must be 8, 16, 32 or 64 for integer export, or -32, -64 for floating point export. The number of bits can be arbitrary for the C_HLS export (for example, 6 bits)
-calib [0]  Number of stimuli used for the calibration. 0 = no calibration (default), -1 = use the full test dataset for calibration
-calib-passes [2]  Number of KL passes for determining the layer output values distribution truncation threshold (0 = use the max. value, no truncation)
-no-unsigned  If present, disable the use of unsigned data type in integer exports
-db-export [-1]  Max. number of stimuli to export (0 = no dataset export, -1 = unlimited)



3.7.1 C export (N2D2-IP only)

Test the exported network:
cd export_C_int8
make
./bin/n2d2_test

The result should look like:
...
1652.00/1762    (avg = 93.757094%)
1653.00/1763    (avg = 93.760635%)
1654.00/1764    (avg = 93.764172%)
Tested 1764 stimuli
Success rate = 93.764172%
Process time per stimulus = 187.548186 us (12 threads)
Confusion matrix:
-------------------------------------------------
| T \ E |       0 |       1 |       2 |       3 |
-------------------------------------------------
|     0 |     329 |       1 |       5 |       2 |
|       |  97.63% |   0.30% |   1.48% |   0.59% |
|     1 |       0 |     692 |       2 |       6 |
|       |   0.00% |  98.86% |   0.29% |   0.86% |
|     2 |      11 |      27 |     609 |      55 |
|       |   1.57% |   3.85% |  86.75% |   7.83% |
|     3 |       0 |       0 |       1 |      24 |
|       |   0.00% |   0.00% |   4.00% |  96.00% |
-------------------------------------------------
T: Target
E: Estimated

3.7.2 CPP_OpenCL export (N2D2-IP only)

The OpenCL export can run the generated program on GPU or CPU architectures. Compilation features:

Preprocessor command [default value]  Description
PROFILING [0]  Compile the binary with a synchronization between each layer and return the mean execution time of each layer. This preprocessor option can decrease performances.
GENERATE_KBIN [0]  Generate the binary output of the OpenCL kernel .cl file used. The binary is stored in the /bin folder.
LOAD_KBIN [0]  Indicate to the program to load an OpenCL kernel as a binary from the /bin folder instead of a .cl file.
CUDA [1]  Use the CUDA OpenCL SDK located at /usr/local/cuda
MALI [0]  Use the MALI OpenCL SDK located at /usr/Mali_OpenCL_SDK_vXXX
INTEL [0]  Use the INTEL OpenCL SDK located at /opt/intel/opencl
AMD [0]  Use the AMD OpenCL SDK located at /opt/AMDAPPSDK-XXX

Program options related to the OpenCL export:
Option [default value]  Description
-cpu  If present, force to use a CPU architecture to run the program
-gpu  If present, force to use a GPU architecture to run the program
-batch [1]  Size of the batch to use
-stimulus [NULL]  Path to a specific input stimulus to test. For example: -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:
cd export_CPP_OpenCL_float32
make
./bin/n2d2_opencl_test -gpu

3.7.3 CPP_TensorRT export

The TensorRT 2.1 API export can run the generated program on NVIDIA GPU architectures. It uses CUDA and the TensorRT 2.1 API library. The layers currently supported by the TensorRT 2.1 export are: Convolutional, Pooling, Concatenation, Fully-Connected, Softmax and all activation types. Custom layer implementation through the plugin factory and generic 8-bit calibration inference features are under development.
Program options related to the TensorRT 2.1 API export:
Option [default value]  Description
-batch [1]  Size of the batch to use
-dev [0]  CUDA device ID selection
-stimulus [NULL]  Path to a specific input stimulus to test. For example: -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.
-prof  Activates the layer-wise profiling mechanism. This option can decrease execution time performance.
-iter-build [1]  Sets the number of minimization build iterations done by the TensorRT builder to find the best layer tactics.

Test the exported network with layer wise profiling:
cd export_CPP_TensorRT_float32
make
./bin/n2d2_tensorRT_test -prof

The results of the layer-wise profiling should look like:
(19%) ************************** CONV1 + CONV1_ACTIVATION: 0.0219467 ms
(05%) ******* POOL1: 0.00675573 ms
(13%) ****************** CONV2 + CONV2_ACTIVATION: 0.0159089 ms
(05%) ******* POOL2: 0.00616047 ms
(14%) ******************* CONV3 + CONV3_ACTIVATION: 0.0159713 ms
(19%) ************************** FC1 + FC1_ACTIVATION: 0.0222242 ms
(13%) ****************** FC2: 0.0149013 ms
(08%) *********** SOFTMAX: 0.0100633 ms
Average profiled TensorRT process time per stimulus = 0.113932 ms

3.7.4 CPP_cuDNN export

The cuDNN export can run the generated program on NVIDIA GPU architectures. It uses CUDA and the cuDNN library. Compilation features:
Preprocessor command [default value]  Description
PROFILING [0]  Compile the binary with a synchronization between each layer and return the mean execution time of each layer. This preprocessor option can decrease performances.
ARCH32 [0]  Compile the binary with 32-bit architecture compatibility.

Program options related to the cuDNN export:
Option [default value]  Description
-batch [1]  Size of the batch to use
-dev [0]  CUDA device ID selection
-stimulus [NULL]  Path to a specific input stimulus to test. For example: -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:
cd export_CPP_cuDNN_float32
make
./bin/n2d2_cudnn_test

3.7.5 C_HLS export (N2D2-IP only)

Test the exported network:
cd export_C_HLS_int8
make
./bin/n2d2_test

Run the High-Level Synthesis (HLS) with Xilinx® Vivado® HLS:
vivado_hls -f run_hls.tcl


4 INI file interface

The INI file interface is the primary way of using N2D2. It is a simple, lightweight and user-friendly format for specifying a complete DNN-based application, including dataset instantiation, data pre-processing, neural network layers instantiation and post-processing, with all its hyperparameters.

4.1 Syntax
INI files are simple text files with a basic structure composed of sections, properties and values.
4.1.1 Properties
The basic element contained in an INI file is the property. Every property has a name and a value,
delimited by an equals sign (=). The name appears to the left of the equals sign.
name=value

4.1.2 Sections
Properties may be grouped into arbitrarily named sections. The section name appears on a line
by itself, in square brackets ([ and ]). All properties after the section declaration are associated
with that section. There is no explicit "end of section" delimiter; sections end at the next section
declaration, or the end of the file. Sections may not be nested.
[section]
a=a
b=b

4.1.3 Case sensitivity
Section and property names are case sensitive.
4.1.4 Comments

Semicolons (;) or number signs (#) at the beginning or in the middle of a line indicate a comment. Comments are ignored.
; comment text
a=a # comment text
a="a ; not a comment" ; comment text

4.1.5 Quoted values
Values can be quoted, using double quotes. This allows for explicit declaration of whitespace,
and/or for quoting of special characters (equals, semicolon, etc.).
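For example (an illustrative snippet; the property name is arbitrary):
; The quotes preserve the inner spaces and prevent the semicolon from starting a comment
path="  /data/my dataset ; v2  "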
4.1.6 Whitespace
Leading and trailing whitespace on a line are ignored.
4.1.7 Escape characters
A backslash (\) followed immediately by EOL (end-of-line) causes the line break to be ignored.
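For example, a long property value can be split across two physical lines (an illustrative snippet; the property name and values are arbitrary):
; Read as a single line: ValidExtensions=jpg png pgm
ValidExtensions=jpg png \
pgm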


4.2 Template inclusion syntax

It is possible to recursively include templated INI files. For example, the main INI file can include a templated file like the following:
[inception@inception_model.ini.tpl]
INPUT=layer_x
SIZE=32
ARRAY=2 ; Must be the number of elements in the array
ARRAY[0].P1=Conv
ARRAY[0].P2=32
ARRAY[1].P1=Pool
ARRAY[1].P2=64

If the inception_model.ini.tpl template file content is:
[{{SECTION_NAME}}_layer1]
Input={{INPUT}}
Type=Conv
NbChannels={{SIZE}}
[{{SECTION_NAME}}_layer2]
Input={{SECTION_NAME}}_layer1
Type=Fc
NbOutputs={{SIZE}}
{% block ARRAY %}
[{{SECTION_NAME}}_array{{#}}]
Prop1=Config{{.P1}}
Prop2={{.P2}}
{% endblock %}

The resulting equivalent content for the main INI file will be:
[inception_layer1]
Input=layer_x
Type=Conv
NbChannels=32
[inception_layer2]
Input=inception_layer1
Type=Fc
NbOutputs=32
[inception_array0]
Prop1=ConfigConv
Prop2=32
[inception_array1]
Prop1=ConfigPool
Prop2=64

The SECTION_NAME template parameter is automatically generated from the name of the including
section (before @).
4.2.1 Variable substitution

{{VAR}} is replaced by the value of the VAR template parameter.

4.2.2 Control statements

Control statements are placed between {% and %} delimiters.

block {% block ARRAY %} ... {% endblock %}
The # template parameter is automatically generated from the {% block ... %} template control statement and corresponds to the current item position, starting from 0.

for {% for VAR in range([START, ]END) %} ... {% endfor %}
If START is not specified, the loop begins at 0 (first value of VAR). The last value of VAR is END-1.

if {% if VAR OP [VALUE] %} ... [{% else %}] ... {% endif %}
OP may be ==, !=, exists or not_exists.

include {% include FILENAME %}
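As an illustration, the hypothetical template fragment below combines a for loop and an if test; it assumes that the loop variable can be substituted with {{i}} like any other template parameter, and the section names and values are arbitrary:
{% for i in range(2) %}
[{{SECTION_NAME}}_conv{{i}}]
Type=Conv
NbChannels={{SIZE}}
{% endfor %}
{% if SIZE == 32 %}
; this part is included only when the SIZE template parameter equals 32
{% endif %}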

4.3 Global parameters

Option [default value]  Description
DefaultModel [Transcode]  Default layers model. Can be Frame, Frame_CUDA, Transcode or Spike
SignalsDiscretization [0]  Number of levels for signal discretization
FreeParametersDiscretization [0]  Number of levels for weights discretization
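For example, to run all layers with the GPU-accelerated model by default, the following line can be placed at the top level of the INI file, outside any section (an illustrative sketch):
DefaultModel=Frame_CUDA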

4.4 Databases

The tool integrates pre-defined modules for several well-known databases used in the deep learning community, such as MNIST, GTSRB, CIFAR10 and so on. That way, no extra step is necessary to directly build a network and learn it on these databases.
4.4.1 MNIST

MNIST (LeCun et al., 1998) is already split into a learning set and a testing set, with:
• 60,000 digits in the learning set;
• 10,000 digits in the testing set.
Example:
[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Option [default value]  Description
Validation [0.0]  Fraction of the learning set used for validation
DataPath [$N2D2_DATA/mnist]  Path to the database
4.4.2 GTSRB

GTSRB (Stallkamp et al., 2012) is already split into a learning set and a testing set, with:
• 39,209 images in the learning set;
• 12,630 images in the testing set.
Example:


[database]
Type=GTSRB_DIR_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Option [default value]  Description
Validation [0.0]  Fraction of the learning set used for validation
DataPath [$N2D2_DATA/GTSRB]  Path to the database
4.4.3 Directory

Hand-made databases stored in file directories are directly supported with the DIR_Database module. For example, suppose your database is organized as follows (in the path specified in the N2D2_DATA environment variable):
• GST/airplanes: 800 images
• GST/car_side: 123 images
• GST/Faces: 435 images
• GST/Motorbikes: 798 images

You can then instantiate this database as input of your neural network using the following parameters:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4 ; 40% of images of the smallest category = 49 (0.4x123) images for each category will be used for learning
Validation=0.2 ; 20% of images of the smallest category = 25 (0.2x123) images for each category will be used for validation
; the remaining images will be used for testing

Each subdirectory will be treated as a different label, so there will be 4 different labels, named
after the directory name.
The stimuli are equi-partitioned for the learning set and the validation set, meaning that the
same number of stimuli for each category is used. If the learn fraction is 0.4 and the validation
fraction is 0.2, as in the example above, the partitioning will be the following:
Label ID  Label name  Learn set  Validation set  Test set
0         airplanes   49         25              726
1         car_side    49         25              49
2         Faces       49         25              361
3         Motorbikes  49         25              724
Total:                196        100             1860
Option [default value]  Description
DataPath (mandatory)  Path to the root stimuli directory
Learn (mandatory)  If PerLabelPartitioning is true, fraction of images used for the learning; else, number of images used for the learning, regardless of their labels
LoadInMemory [0]  Load the whole database into memory
Depth [1]  Number of sub-directory levels to include. Examples:
  Depth = 0: load stimuli only from the current directory (DataPath)
  Depth = 1: load stimuli from DataPath and stimuli contained in the sub-directories of DataPath
  Depth < 0: load stimuli recursively from DataPath and all its sub-directories
LabelName []  Base stimuli label name
LabelDepth [1]  Number of sub-directory name levels used to form the stimuli labels. Examples:
  LabelDepth = -1: no label for all stimuli (label ID = -1)
  LabelDepth = 0: uses LabelName for all stimuli
  LabelDepth = 1: uses LabelName for stimuli in the current directory (DataPath) and LabelName/sub-directory name for stimuli in the sub-directories
PerLabelPartitioning [1]  If true, the stimuli are equi-partitioned for the learn/validation/test sets, meaning that the same number of stimuli for each label is used
Validation [0.0]  If PerLabelPartitioning is true, fraction of images used for the validation; else, number of images used for the validation, regardless of their labels
Test [1.0-Learn-Validation]  If PerLabelPartitioning is true, fraction of images used for the test; else, number of images used for the test, regardless of their labels
ValidExtensions []  List of space-separated valid stimulus file extensions (if left empty, any file extension is considered a valid stimulus)
LoadMore []  Name of another section with the same options to load a different DataPath
ROIFile []  File containing the stimuli ROIs. If a ROI file is specified, LabelDepth should be set to -1
DefaultLabel []  Label name for pixels outside any ROI (default is no label, pixels are ignored)
ROIsMargin [0]  Number of pixels around ROIs that are ignored (and not considered as DefaultLabel pixels)
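The snippet below is an illustrative sketch (fractions and paths are arbitrary) combining a few of these options to load a directory database recursively while keeping only JPEG and PNG files:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
ValidExtensions=jpg png
Depth=-1       ; load stimuli recursively from DataPath and all its sub-directories
LabelDepth=1   ; label = sub-directory name
Learn=0.6
Validation=0.2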

To load and partition more than one DataPath, one can use the LoadMore option:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.6
Validation=0.4
LoadMore=database.test
; Load stimuli from the "GST_Test" path in the test dataset
[database.test]
DataPath=${N2D2_DATA}/GST_Test
Learn=0.0
Test=1.0
; The LoadMore option is recursive:
; LoadMore=database.more
; [database.more]
; Load even more data here


4.4.4 Other built-in databases
CIFAR10_Database CIFAR10 database (Krizhevsky, 2009).
Option [default value]  Description
Validation [0.0]  Fraction of the learning set used for validation
DataPath [$N2D2_DATA/cifar-10-batches-bin]  Path to the database
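An illustrative instantiation (the validation fraction is arbitrary):
[database]
Type=CIFAR10_Database
Validation=0.1 ; Fraction of learning stimuli used for the validation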
CIFAR100_Database CIFAR100 database (Krizhevsky, 2009).
Option [default value]  Description
Validation [0.0]  Fraction of the learning set used for validation
UseCoarse [0]  If true, use the coarse labeling (10 labels instead of 100)
DataPath [$N2D2_DATA/cifar-100-binary]  Path to the database
CKP_Database The Extended Cohn-Kanade (CK+) database for expression recognition (Lucey et al., 2010).
Option [default value]  Description
Learn  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
DataPath [$N2D2_DATA/cohn-kanade-images]  Path to the database
Caltech101_DIR_Database Caltech 101 database (Fei-Fei et al., 2004).
Option [default value]  Description
Learn  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
IncClutter [0]  If true, includes the BACKGROUND_Google directory of the database
DataPath [$N2D2_DATA/101_ObjectCategories]  Path to the database
Caltech256_DIR_Database Caltech 256 database (Griffin et al., 2007).
Option [default value]  Description
Learn  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
IncClutter [0]  If true, includes the BACKGROUND_Google directory of the database
DataPath [$N2D2_DATA/256_ObjectCategories]  Path to the database

CaltechPedestrian_Database Caltech Pedestrian database (Dollár et al., 2009).
Note that the images and annotations must first be extracted from the seq video data located in the videos directory, using the dbExtract.m Matlab tool provided in the "Matlab evaluation/labeling code" downloadable on the dataset website.
Assuming the following directory structure (in the path specified in the N2D2_DATA environment variable):
• CaltechPedestrians/data-USA/videos/... (from the setxx.tar files)
• CaltechPedestrians/data-USA/annotations/... (from the setxx.tar files)
• CaltechPedestrians/tools/piotr_toolbox/toolbox (from the Piotr's Matlab Toolbox archive)
• CaltechPedestrians/*.m including dbExtract.m (from the Matlab evaluation/labeling code)

Use the following commands in Matlab to generate the images and annotations:
cd([getenv('N2D2_DATA') '/CaltechPedestrians'])
addpath(genpath('tools/piotr_toolbox/toolbox')) % add the Piotr's Matlab Toolbox to the Matlab path
dbInfo('USA')
dbExtract()

Option [default value]  Description
Validation [0.0]  Fraction of the learning set used for validation
SingleLabel [1]  Use the same label for "person" and "people" bounding boxes
IncAmbiguous [0]  Include ambiguous bounding boxes labeled "person?" using the same label as "person"
DataPath [$N2D2_DATA/CaltechPedestrians/data-USA/images]  Path to the database images
LabelPath [$N2D2_DATA/CaltechPedestrians/data-USA/annotations]  Path to the database annotations
Daimler_Database Daimler Monocular Pedestrian Detection Benchmark (Daimler Pedestrian).
Option [default value]  Description
Learn [1.0]  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
Test [0.0]  Fraction of images used for the test
Fully [0]  When activated, the test dataset is used for learning. Use only in fully-CNN mode
FDDB_Database Face Detection Data Set and Benchmark (FDDB) (Jain and Learned-Miller, 2010).
Option [default value]  Description
Learn  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
DataPath [$N2D2_DATA/FDDB]  Path to the images (decompressed originalPics.tar.gz)
LabelPath [$N2D2_DATA/FDDB]  Path to the annotations (decompressed FDDB-folds.tgz)

GTSDB_DIR_Database GTSDB database (Houben et al., 2013).
Option [default value]  Description
Learn  Fraction of images used for the learning
Validation [0.0]  Fraction of images used for the validation
DataPath [$N2D2_DATA/FullIJCNN2013]  Path to the database
ILSVRC2012_Database ILSVRC2012 database (Russakovsky et al., 2015).
Option [default value]  Description
Learn  Fraction of images used for the learning
DataPath [$N2D2_DATA/ILSVRC2012]  Path to the database
LabelPath [$N2D2_DATA/ILSVRC2012/synsets.txt]  Path to the database labels list file
KITTI_Database KITTI Database.
Option [default value]  Description
Learn [0.8]  Fraction of images used for the learning
Validation [0.2]  Fraction of images used for the validation

KITTI_Road_Database KITTI Road Database. The KITTI Road Database provides ROIs which can be used for road segmentation.
Option [default value]  Description
Learn [0.8]  Fraction of images used for the learning
Validation [0.2]  Fraction of images used for the validation

LITISRouen_Database LITIS Rouen audio scene dataset (Rakotomamonjy and Gasso, 2014).
Option [default value]  Description
Learn [0.4]  Fraction of images used for the learning
Validation [0.4]  Fraction of images used for the validation
DataPath [$N2D2_DATA/data_rouen]  Path to the database
4.4.5 Dataset images slicing
It is possible to automatically slice images from a dataset, with a given slice size and stride, using
the .slicing attribute. This effectively increases the number of stimuli in the set.
[database.slicing]
ApplyTo=NoLearn
Width=2048
Height=1024
StrideX=2048
StrideY=1024


4.5 Stimuli data analysis

You can enable stimuli data reporting with the following section (the name of the section must start with env.StimuliData):
[env.StimuliData-raw]
ApplyTo=LearnOnly
LogSizeRange=1
LogValueRange=1

The stimuli data reported for the full MNIST learning set will look like:
env.StimuliData-raw data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [0, 255]
Value mean: 33.3184
Value std. dev.: 78.5675

4.5.1 Zero-mean and unity standard deviation normalization

It is possible to normalize the whole database to have zero mean and unity standard deviation on the learning set using a RangeAffineTransformation transformation:
; Stimuli normalization based on learning set global mean and std.dev.
[env.Transformation-normalize]
Type=RangeAffineTransformation
FirstOperator=Minus
FirstValue=[env.StimuliData-raw]_GlobalValue.mean
SecondOperator=Divides
SecondValue=[env.StimuliData-raw]_GlobalValue.stdDev

The variables _GlobalValue.mean and _GlobalValue.stdDev are automatically generated in the [env.StimuliData-raw] block. Thanks to this facility, unknown and arbitrary databases can be analysed and normalized in one single step without requiring any external data manipulation.
After normalization, the stimuli data reported is:
env.StimuliData-normalized data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [-0.424074, 2.82154]
Value mean: 2.64796e-07
Value std. dev.: 1

Here we can check that the global mean is close to 0 and the standard deviation is 1 on the
whole dataset. The result of the transformation on the first images of the set can be checked in the
generated frames folder, as shown in figure 9.
4.5.2

Subtracting the mean image of the set

Using the StimuliData object followed with an AffineTransformation, it is also possible to use the
mean image of the dataset to normalize the data:
[env.StimuliData-meanData]
ApplyTo=LearnOnly
MeanData=1 ; Provides the _MeanData parameter used in the transformation
[env.Transformation]
Type=AffineTransformation
FirstOperator=Minus
FirstValue=[env.StimuliData-meanData]_MeanData


Figure 9: Image of the set after normalization.

The resulting global mean image can be visualized in env.StimuliData-meanData/meanData.bin.png
and is shown in figure 10.

Figure 10: Global mean image generated by StimuliData with the MeanData parameter enabled.

After this transformation, the reported stimuli data becomes:
env.StimuliData-processed data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [-139.554, 254.979]
Value mean: -3.45583e-08
Value std. dev.: 66.1288

The result of the transformation on the first images of the set can be checked in the generated
frames folder, as shown in figure 11.

Figure 11: Image of the set after the AffineTransformation subtracting the global mean image (keep in
mind that the original image value range is [0, 255]).

4.6

Environment

The environment simply specifies the input data format of the network (width, height and batch
size). Example:
[env]
SizeX=24
SizeY=24
BatchSize=12 ; [default: 1]

Option [default value]: Description
SizeX: Environment width
SizeY: Environment height
NbChannels [1]: Number of channels (applicable only if there is no env.ChannelTransformation[...])
BatchSize [1]: Batch size
CompositeStimuli [0]: If true, use pixel-wise stimuli labels
CachePath []: Stimuli cache path (no cache if left empty)

StimulusType [SingleBurst]: Method for converting stimuli into spike trains. Can be any of SingleBurst, Periodic, JitteredPeriodic or Poissonian
DiscardedLateStimuli [1.0]: The pixels in the pre-processed stimuli with a value above this limit never generate spiking events
PeriodMeanMin [50 TimeMs]: Mean minimum period Tmin, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 0 (which are supposed to be the most significant pixels)
PeriodMeanMax [12 TimeS]: Mean maximum period Tmax, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 1 (which are supposed to be the least significant pixels). This maximum period may never be reached if DiscardedLateStimuli is lower than 1.0
PeriodRelStdDev [0.1]: Relative standard deviation, used for periodic temporal codings, applied to the spiking period of a pixel
PeriodMin [11 TimeMs]: Absolute minimum period, or spiking interval, used for periodic temporal codings, for any pixel

4.6.1

Built-in transformations

There are 6 possible categories of transformations:
• env.Transformation[...] Transformations applied to the input images before channels creation;
• env.OnTheFlyTransformation[...] On-the-fly transformations applied to the input images before
channels creation;
• env.ChannelTransformation[...] Create or add transformation for a specific channel;
• env.ChannelOnTheFlyTransformation[...] Create or add on-the-fly transformation for a specific
channel;
• env.ChannelsTransformation[...] Transformations applied to all the channels of the input
images;
• env.ChannelsOnTheFlyTransformation[...] On-the-fly transformations applied to all the channels
of the input images.
Example:
[env.Transformation]
Type=PadCropTransformation
Width=24
Height=24

Several transformations can be applied successively. In this case, to be able to apply multiple
transformations of the same category, a different suffix ([...]) must be added to each transformation.
The transformations will be processed in their order of appearance in the INI file,
regardless of their suffix.
Common set of parameters for any kind of transformation:
Option [default value]: Description
ApplyTo [All]: Apply the transformation only to the specified stimuli sets. Can be:
  LearnOnly: learning set only
  ValidationOnly: validation set only
  TestOnly: testing set only
  NoLearn: validation and testing sets only
  NoValidation: learning and testing sets only
  NoTest: learning and validation sets only
  All: all sets (default)

Example:
[env.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray
[env.Transformation-2]
Type=RescaleTransformation
Width=29
Height=29
[env.Transformation-3]
Type=EqualizeTransformation
[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly ; Apply this transformation for the Learning set only
ElasticGaussianSize=21
ElasticSigma=6.0
ElasticScaling=20.0
Scaling=15.0
Rotation=15.0

List of available transformations:
AffineTransformation Apply an element-wise affine transformation to the image with matrices
of the same size.
Option [default value]: Description
FirstOperator: First element-wise operator, can be Plus, Minus, Multiplies, Divides
FirstValue: First matrix file name
SecondOperator [Plus]: Second element-wise operator, can be Plus, Minus, Multiplies, Divides
SecondValue []: Second matrix file name

The final operation is the following, with A the image matrix, B_1st and B_2nd the matrices to
add/subtract/multiply/divide and op the element-wise operator:
f(A) = (A op_1st B_1st) op_2nd B_2nd

ApodizationTransformation Apply an apodization window to each data row.
Option [default value]: Description
Size: Window total size (must match the number of data columns)
WindowName [Rectangular]: Window name. Possible values are:
  Rectangular: Rectangular
  Hann: Hann
  Hamming: Hamming
  Cosine: Cosine
  Gaussian: Gaussian
  Blackman: Blackman
  Kaiser: Kaiser

Gaussian window Gaussian window.
Option [default value]: Description
WindowName.Sigma [0.4]: Sigma

Blackman window Blackman window.
Option [default value]: Description
WindowName.Alpha [0.16]: Alpha

Kaiser window Kaiser window.
Option [default value]: Description
WindowName.Beta [5.0]: Beta

ChannelExtractionTransformation Extract an image channel.
Option: Description
CSChannel: Can be:
  Blue: blue channel in the BGR colorspace, or first channel of any colorspace
  Green: green channel in the BGR colorspace, or second channel of any colorspace
  Red: red channel in the BGR colorspace, or third channel of any colorspace
  Hue: hue channel in the HSV colorspace
  Saturation: saturation channel in the HSV colorspace
  Value: value channel in the HSV colorspace
  Gray: gray conversion
  Y: Y channel in the YCbCr colorspace
  Cb: Cb channel in the YCbCr colorspace
  Cr: Cr channel in the YCbCr colorspace

ColorSpaceTransformation Change the current image colorspace.
Option
ColorSpace

Description
BGR: if the image is in grayscale, convert it in BGR
HSV
HLS
YCrCb
CIELab
CIELuv

DFTTransformation Apply a DFT to the data. The input data must be single channel, the
resulting data is two channels, the first for the real part and the second for the imaginary part.
Option [default value]
TwoDimensional [1]

Description
If true, compute a 2D image DFT. Otherwise, compute the
1D DFT of each data row

Note that this transformation can add zero-padding if required by the underlying FFT implementation.

N2D2 IP only

DistortionTransformation Apply elastic distortion to the image. This transformation is generally used on-the-fly (so that a different distortion is performed for each image), and for the learning
only.
Option [default value]: Description
ElasticGaussianSize [15]: Size of the gaussian for elastic distortion (in pixels)
ElasticSigma [6.0]: Sigma of the gaussian for elastic distortion
ElasticScaling [0.0]: Scaling of the gaussian for elastic distortion
Scaling [0.0]: Maximum random scaling amplitude (+/-, in percentage)
Rotation [0.0]: Maximum random rotation amplitude (+/-, in °)

N2D2 IP only

EqualizeTransformation Image histogram equalization.
Option [default value]: Description
Method [Standard]: Can be Standard (standard histogram equalization) or CLAHE (contrast limited adaptive histogram equalization)
CLAHE_ClipLimit [40.0]: Threshold for contrast limiting (for CLAHE only)
CLAHE_GridSize [8]: Size of grid for histogram equalization (for CLAHE only). Input image will be divided into equally sized rectangular tiles. This parameter defines the number of tiles in row and column.

N2D2 IP only

ExpandLabelTransformation Expand single image label (1x1 pixel) to full frame label.
FilterTransformation Apply a convolution filter to the image.
Option [default value]: Description
Kernel: Convolution kernel. Possible values are:
  *: custom kernel
  Gaussian: Gaussian kernel
  LoG: Laplacian Of Gaussian kernel
  DoG: Difference Of Gaussian kernel
  Gabor: Gabor kernel

* kernel Custom kernel.
Option [default value]: Description
Kernel.SizeX [0]: Width of the kernel (number of columns)
Kernel.SizeY [0]: Height of the kernel (number of rows)
Kernel.Mat: List of row-major ordered coefficients of the kernel

If both Kernel.SizeX and Kernel.SizeY are 0, the kernel is assumed to be square.
Gaussian kernel Gaussian kernel.
Option [default value]: Description
Kernel.SizeX: Width of the kernel (number of columns)
Kernel.SizeY: Height of the kernel (number of rows)
Kernel.Positive [1]: If true, the center of the kernel is positive
Kernel.Sigma [√2.0]: Sigma of the kernel

LoG kernel Laplacian Of Gaussian kernel.
Option [default value]: Description
Kernel.SizeX: Width of the kernel (number of columns)
Kernel.SizeY: Height of the kernel (number of rows)
Kernel.Positive [1]: If true, the center of the kernel is positive
Kernel.Sigma [√2.0]: Sigma of the kernel

DoG kernel Difference Of Gaussian kernel.
Option [default value]: Description
Kernel.SizeX: Width of the kernel (number of columns)
Kernel.SizeY: Height of the kernel (number of rows)
Kernel.Positive [1]: If true, the center of the kernel is positive
Kernel.Sigma1 [2.0]: Sigma1 of the kernel
Kernel.Sigma2 [1.0]: Sigma2 of the kernel

Gabor kernel Gabor kernel.
Option [default value]: Description
Kernel.SizeX: Width of the kernel (number of columns)
Kernel.SizeY: Height of the kernel (number of rows)
Kernel.Theta: Theta of the kernel
Kernel.Sigma [√2.0]: Sigma of the kernel
Kernel.Lambda [10.0]: Lambda of the kernel
Kernel.Psi [π/2.0]: Psi of the kernel
Kernel.Gamma [0.5]: Gamma of the kernel

FlipTransformation Image flip transformation.
Option [default value]: Description
HorizontalFlip [0]: If true, flip the image horizontally
VerticalFlip [0]: If true, flip the image vertically
RandomHorizontalFlip [0]: If true, randomly flip the image horizontally
RandomVerticalFlip [0]: If true, randomly flip the image vertically

N2D2 IP only

GradientFilterTransformation Compute image gradient.
Option [default value]: Description
Scale [1.0]: Scale to apply to the computed gradient
Delta [0.0]: Bias to add to the computed gradient
GradientFilter [Sobel]: Filter type to use for computing the gradient. Possible options are: Sobel, Scharr and Laplacian
KernelSize [3]: Size of the filter kernel (has no effect when using the Scharr filter, whose kernel size is always 3x3)
ApplyToLabels [0]: If true, use the computed gradient to filter the image label and ignore pixel areas where the gradient is below the Threshold. In this case, only the labels are modified, not the image
InvThreshold [0]: If true, ignored label pixels will be the ones with a low gradient (low contrasted areas)
Threshold [0.5]: Threshold applied on the image gradient
Label []: List of labels to filter (space-separated)
GradientScale [1.0]: Rescale the image by this factor before applying the gradient and the threshold, then scale it back to filter the labels

N2D2 IP only

LabelSliceExtractionTransformation Extract a slice from an image belonging to a given label.
Option [default value]: Description
Width: Width of the slice to extract
Height: Height of the slice to extract
Label [-1]: Slice should belong to this label ID. If -1, the label ID is random

MagnitudePhaseTransformation Compute the magnitude and phase of a complex two-channel
input data, with the first channel x being the real part and the second channel y the imaginary
part. The resulting data is two channels, the first one with the magnitude and the second one with
the phase.
Option [default value]: Description
LogScale [0]: If true, compute the magnitude in log scale

The magnitude is:
M_{i,j} = √(x_{i,j}² + y_{i,j}²)
If LogScale = 1, compute M'_{i,j} = log(1 + M_{i,j}).
The phase is:
θ_{i,j} = atan2(y_{i,j}, x_{i,j})
N2D2 IP only

MorphologicalReconstructionTransformation Apply a morphological reconstruction transformation to the image. This transformation is also useful for post-processing.


Option [default value]: Description
Operation: Morphological operation to apply. Can be:
  ReconstructionByErosion: reconstruction by erosion operation
  ReconstructionByDilation: reconstruction by dilation operation
  OpeningByReconstruction: opening by reconstruction operation
  ClosingByReconstruction: closing by reconstruction operation
Size: Size of the structuring element
ApplyToLabels [0]: If true, apply the transformation to the labels instead of the image
Shape [Rectangular]: Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross.
NbIterations [1]: Number of times erosion and dilation are applied for opening and closing reconstructions

N2D2 IP only

MorphologyTransformation Apply a morphology transformation to the image. This transformation is also useful for post-processing.
Option [default value]: Description
Operation: Morphological operation to apply. Can be:
  Erode: erode operation (= erode(src))
  Dilate: dilate operation (= dilate(src))
  Opening: opening operation (open(src) = dilate(erode(src)))
  Closing: closing operation (close(src) = erode(dilate(src)))
  Gradient: morphological gradient (= dilate(src) − erode(src))
  TopHat: top hat (= src − open(src))
  BlackHat: black hat (= close(src) − src)
Size: Size of the structuring element
ApplyToLabels [0]: If true, apply the transformation to the labels instead of the image
Shape [Rectangular]: Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross.
NbIterations [1]: Number of times erosion and dilation are applied

NormalizeTransformation Normalize the image.
Option [default value]: Description
Norm [MinMax]: Norm type, can be:
  L1: L1 normalization
  L2: L2 normalization
  Linf: Linf normalization
  MinMax: min-max normalization
NormValue [1.0]: Norm value (for L1, L2 and Linf), such that ||data||Lp = NormValue
NormMin [0.0]: Min value (for MinMax only), such that min(data) = NormMin
NormMax [1.0]: Max value (for MinMax only), such that max(data) = NormMax
PerChannel [0]: If true, normalize each channel individually
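For example, a simple min-max normalization of the input data to the [0, 1] range could be written as follows (a sketch; the section name is arbitrary):
[env.Transformation-minmax]
Type=NormalizeTransformation
Norm=MinMax
NormMin=0.0
NormMax=1.0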


PadCropTransformation Pad/crop the image to a specified size.
Option [default value]: Description
Width: Width of the padded/cropped image
Height: Height of the padded/cropped image
PaddingBackground [MeanColor]: Background color used when padding. Possible values:
  MeanColor: pad with the mean color of the image
  BlackColor: pad with black

N2D2 IP only

RandomAffineTransformation Apply a global random affine transformation to the values of the image.
Option [default value]: Description
GainVar: Random gain is in range ±GainVar
BiasVar [0.0]: Random bias is in range ±BiasVar

RangeAffineTransformation Apply an affine transformation to the values of the image.
Option [default value]: Description
FirstOperator: First operator, can be Plus, Minus, Multiplies, Divides
FirstValue: First value
SecondOperator [Plus]: Second operator, can be Plus, Minus, Multiplies, Divides
SecondValue [0.0]: Second value

The final operation is the following:
f(x) = (x op_1st val_1st) op_2nd val_2nd

N2D2 IP only

RangeClippingTransformation Clip the value range of the image.
Option [default value]: Description
RangeMin [min(data)]: Image values below RangeMin are clipped to 0
RangeMax [max(data)]: Image values above RangeMax are clipped to 1 (or the maximum integer value of the data type)

RescaleTransformation Rescale the image to a specified size.
Option [default value]: Description
Width: Width of the rescaled image
Height: Height of the rescaled image
KeepAspectRatio [0]: If true, keeps the aspect ratio of the image
ResizeToFit [1]: If true, resize along the longest dimension when KeepAspectRatio is true

ReshapeTransformation Reshape the data to a specified size.
Option [default value]: Description
NbRows: New number of rows
NbCols [0]: New number of cols (0 = no check)
NbChannels [0]: New number of channels (0 = no change)


N2D2 IP only

SliceExtractionTransformation Extract a slice from an image.
Option [default value]: Description
Width: Width of the slice to extract
Height: Height of the slice to extract
OffsetX [0]: X offset of the slice to extract
OffsetY [0]: Y offset of the slice to extract
RandomOffsetX [0]: If true, the X offset is chosen randomly
RandomOffsetY [0]: If true, the Y offset is chosen randomly
AllowPadding [0]: If true, zero-padding is allowed if the image is smaller than the slice to extract

ThresholdTransformation Apply a thresholding transformation to the image. This transformation is also useful for post-processing.
Option [default value]: Description
Threshold: Threshold value
OtsuMethod [0]: Use Otsu's method to determine the optimal threshold (if true, the Threshold value is ignored)
Operation [Binary]: Thresholding operation to apply. Can be:
  Binary
  BinaryInverted
  Truncate
  ToZero
  ToZeroInverted
MaxValue [1.0]: Max. value to use with Binary and BinaryInverted operations

TrimTransformation Trim the image.
Option [default value]: Description
NbLevels: Number of levels for the color discretization of the image
Method [Discretize]: Possible values are:
  Reduce: discretization using K-means
  Discretize: simple discretization

N2D2 IP only

WallisFilterTransformation Apply Wallis filter to the image.
Option [default value]: Description
Size: Size of the filter
Mean [0.0]: Target mean value
StdDev [1.0]: Target standard deviation
PerChannel [0]: If true, apply Wallis filter to each channel individually (this parameter is meaningful only if Size is 0)

4.7

Network layers

4.7.1

Layer definition

Common set of parameters for any kind of layer.


Option [default value]: Description
Input: Name of the section(s) for the input layer(s). Comma separated
Type: Type of the layer. Can be any of the types described below
Model [DefaultModel]: Layer model to use
ConfigSection []: Name of the configuration section for the layer

To specify that the back-propagated error must be computed at the output of a given layer
(generally the last layer, or output layer), one must add a target section named LayerName .Target:
...
[LayerName.Target]
TargetValue=1.0 ; default: 1.0
DefaultValue=0.0 ; default: -1.0

4.7.2

Weight fillers

Fillers to initialize weights and biases in the different types of layers.
Usage example:
[conv1]
...
WeightsFiller=NormalFiller
WeightsFiller.Mean=0.0
WeightsFiller.StdDev=0.05
...

The initial weights distribution for each layer can be checked in the weights_init folder, with
an example shown in figure 12.

Figure 12: Initial weights distribution of a layer using a normal distribution (NormalFiller) with a 0 mean
and a 0.05 standard deviation.


ConstantFiller Fill with a constant value.
Option: Description
FillerName.Value: Value for the filling

NormalFiller Fill with a normal distribution.
Option [default value]: Description
FillerName.Mean [0.0]: Mean value of the distribution
FillerName.StdDev [1.0]: Standard deviation of the distribution

UniformFiller Fill with a uniform distribution.
Option [default value]: Description
FillerName.Min [0.0]: Min. value
FillerName.Max [1.0]: Max. value

XavierFiller Fill with a uniform distribution with normalized variance (Glorot and Bengio, 2010).
Option [default value]: Description
FillerName.VarianceNorm [FanIn]: Normalization, can be FanIn, Average or FanOut
FillerName.Distribution [Uniform]: Distribution, can be Uniform or Normal
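A possible usage sketch, combining these options in a layer section (the layer name [fc1] is only an example):
[fc1]
...
WeightsFiller=XavierFiller
WeightsFiller.VarianceNorm=FanIn
WeightsFiller.Distribution=Uniform
...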

Use a uniform distribution with interval [−scale, scale], with scale = √(3.0/n):
• n = fan-in with FanIn, resulting in Var(W) = 1/fan-in
• n = (fan-in + fan-out)/2 with Average, resulting in Var(W) = 2/(fan-in + fan-out)
• n = fan-out with FanOut, resulting in Var(W) = 1/fan-out

4.7.3

Weight solvers

SGDSolver_Frame SGD Solver for Frame models.
Option [default value]: Description
SolverName.LearningRate [0.01]: Learning rate
SolverName.Momentum [0.0]: Momentum
SolverName.Decay [0.0]: Decay
SolverName.LearningRatePolicy [None]: Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay, PolyDecay
SolverName.LearningRateStepSize [1]: Learning rate step size (in number of stimuli)
SolverName.LearningRateDecay [0.1]: Learning rate decay
SolverName.Clamping [0]: If true, clamp the weights and bias between -1 and 1
SolverName.Power [0.0]: Polynomial learning rule power parameter
SolverName.MaxIterations [0.0]: Polynomial learning rule maximum number of iterations


The learning rate decay policies are the following:
• StepDecay: every SolverName.LearningRateStepSize stimuli, the learning rate is reduced by a factor SolverName.LearningRateDecay;
• ExponentialDecay: the learning rate is α = α0 exp(−kt), with α0 the initial learning rate SolverName.LearningRate, k the rate decay SolverName.LearningRateDecay and t the step number (one step every SolverName.LearningRateStepSize stimuli);
• InvTDecay: the learning rate is α = α0/(1 + kt), with α0 the initial learning rate SolverName.LearningRate, k the rate decay SolverName.LearningRateDecay and t the step number (one step every SolverName.LearningRateStepSize stimuli);
• InvDecay: the learning rate is α = α0 × (1 + kt)^(−n), with α0 the initial learning rate SolverName.LearningRate, k the rate decay SolverName.LearningRateDecay, t the current iteration and n the power parameter SolverName.Power;
• PolyDecay: the learning rate is α = α0 × (1 − k/t)^n, with α0 the initial learning rate SolverName.LearningRate, k the current iteration, t the maximum number of iterations SolverName.MaxIterations and n the power parameter SolverName.Power.
A configuration example is sketched below.
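For instance, a step-decay schedule shared by all the solvers of a layer could be configured as follows (a minimal sketch; the section names and values are only an example, using the Solvers.* prefix described in the layer configuration parameters):
[conv1]
...
ConfigSection=common.config

[common.config]
Solvers.LearningRate=0.01
Solvers.Momentum=0.9
Solvers.LearningRatePolicy=StepDecay
Solvers.LearningRateStepSize=[sp]_EpochSize
Solvers.LearningRateDecay=0.993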

SGDSolver_Frame_CUDA SGD Solver for Frame_CUDA models.
Option [default value]: Description
SolverName.LearningRate [0.01]: Learning rate
SolverName.Momentum [0.0]: Momentum
SolverName.Decay [0.0]: Decay
SolverName.LearningRatePolicy [None]: Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay
SolverName.LearningRateStepSize [1]: Learning rate step size (in number of stimuli)
SolverName.LearningRateDecay [0.1]: Learning rate decay
SolverName.Clamping [0]: If true, clamp the weights and bias between -1 and 1

The learning rate decay policies are identical to the ones in the SGDSolver_Frame solver.
4.7.4

Activation functions

Activation function to be used at the output of layers.
Usage example:
[conv1]
...
ActivationFunction=Rectifier
ActivationFunction.LeakSlope=0.01
ActivationFunction.Clipping=20
...

Logistic Logistic activation function.
LogisticWithLoss Logistic with loss activation function.

Rectifier Rectifier or ReLU activation function.
Option [default value]: Description
ActivationFunction.LeakSlope [0.0]: Leak slope for negative inputs
ActivationFunction.Clipping [0.0]: Clipping value for positive outputs

Saturation Saturation activation function.

Softplus Softplus activation function.
Tanh Tanh activation function.
Computes y = tanh(αx).
Option [default value]: Description
ActivationFunction.Alpha [1.0]: α parameter
TanhLeCun Tanh activation function with an α parameter of 1.7159 × (2.0/3.0).
4.7.5

Anchor

Anchor layer for Faster R-CNN.
Option [default value]
Input

Anchor[*]
ScoresCls

Description
This layer takes one or two inputs. The total number of
input channels must be ScoresCls + 4, with ScoresCls being
equal to 1 or 2.
Anchors definition. For each anchor, there must be two
space-separated values: the root area and the aspect ratio.
Number of classes per anchor. Must be 1 (if the scores input
uses logistic regression) or 2 (if the scores input is a two-class
softmax layer)

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
PositiveIoU [0.7] (all Frame): Assign a positive label for anchors whose IoU overlap is higher than PositiveIoU with any ground-truth box
NegativeIoU [0.3] (all Frame): Assign a negative label for non-positive anchors whose IoU overlap is lower than NegativeIoU for all ground-truth boxes
LossLambda [10.0] (all Frame): Balancing parameter λ
LossPositiveSample [128] (all Frame): Number of random positive samples for the loss computation
LossNegativeSample [128] (all Frame): Number of random negative samples for the loss computation


Usage example:
; RPN network: cls layer
[scores]
Input=...
Type=Conv
KernelWidth=1
KernelHeight=1
; 18 channels for 9 anchors
NbChannels=18
...
[scores.softmax]
Input=scores
Type=Softmax
NbOutputs=[scores]NbChannels
WithLoss=1
; RPN network: coordinates layer
[coordinates]
Input=...
Type=Conv
KernelWidth=1
KernelHeight=1
; 36 channels for 4 coordinates x 9 anchors
NbChannels=36
...
; RPN network: anchors
[anchors]
Input=scores.softmax,coordinates
Type=Anchor
ScoresCls=2 ; using a two-class softmax for the scores
Anchor[0]=32 1.0
Anchor[1]=48 1.0
Anchor[2]=64 1.0
Anchor[3]=80 1.0
Anchor[4]=96 1.0
Anchor[5]=112 1.0
Anchor[6]=128 1.0
Anchor[7]=144 1.0
Anchor[8]=160 1.0
ConfigSection=anchors.config
[anchors.config]
PositiveIoU=0.7
NegativeIoU=0.3
LossLambda=1.0

Outputs remapping Outputs remapping allows converting the scores and coordinates output feature
maps layout from an ordering other than the one used in the N2D2 Anchor layer, during weights
import/export.
For example, let's consider that the imported weights correspond to the following output feature
maps ordering:
0 anchor[0].y
1 anchor[0].x
2 anchor[0].h
3 anchor[0].w
4 anchor[1].y
5 anchor[1].x

6 anchor[1].h
7 anchor[1].w
8 anchor[2].y
9 anchor[2].x
10 anchor[2].h
11 anchor[2].w

The output feature maps ordering required by the Anchor layer is:
0 anchor[0].x
1 anchor[1].x
2 anchor[2].x
3 anchor[0].y
4 anchor[1].y
5 anchor[2].y
6 anchor[0].w
7 anchor[1].w
8 anchor[2].w
9 anchor[0].h
10 anchor[1].h
11 anchor[2].h

The feature maps ordering can be changed during weights import/export:
; RPN network: coordinates layer
[coordinates]
Input=...
Type=Conv
KernelWidth=1
KernelHeight=1
; 36 channels for 4 coordinates x 9 anchors
NbChannels=36
...
ConfigSection=coordinates.config
[coordinates.config]
WeightsExportFormat=HWCO ; Weights format used by TensorFlow
OutputsRemap=1:4,0:4,3:4,2:4

4.7.6

Conv

Convolutional layer.
Option [default value]: Description
KernelWidth: Width of the kernel
KernelHeight: Height of the kernel
NbChannels: Number of output channels
SubSampleX [1]: X-axis subsampling factor of the output feature maps
SubSampleY [1]: Y-axis subsampling factor of the output feature maps
SubSample [1]: Subsampling factor of the output feature maps (mutually exclusive with SubSampleX and SubSampleY)
StrideX [1]: X-axis stride of the kernels
StrideY [1]: Y-axis stride of the kernels
Stride [1]: Stride of the kernels (mutually exclusive with StrideX and StrideY)
PaddingX [0]: X-axis input padding
PaddingY [0]: Y-axis input padding
Padding [0]: Input padding (mutually exclusive with PaddingX and PaddingY)
ActivationFunction [Tanh]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
WeightsFiller [NormalFiller(0.0, 0.05)]: Weights initial values filler
BiasFiller [NormalFiller(0.0, 0.05)]: Biases initial values filler
Mapping.SizeX [1]: Mapping canvas pattern default width
Mapping.SizeY [1]: Mapping canvas pattern default height
Mapping.Size [1]: Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)
Mapping.StrideX [1]: Mapping canvas default X-axis step
Mapping.StrideY [1]: Mapping canvas default Y-axis step
Mapping.Stride [1]: Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)
Mapping.OffsetX [0]: Mapping canvas default X-axis offset
Mapping.OffsetY [0]: Mapping canvas default Y-axis offset
Mapping.Offset [0]: Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)
Mapping.NbIterations [0]: Mapping canvas pattern default number of iterations (0 means no limit)
Mapping(in).SizeX [1]: Mapping canvas pattern default width for input layer in
Mapping(in).SizeY [1]: Mapping canvas pattern default height for input layer in
Mapping(in).Size [1]: Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)
Mapping(in).StrideX [1]: Mapping canvas default X-axis step for input layer in
Mapping(in).StrideY [1]: Mapping canvas default Y-axis step for input layer in
Mapping(in).Stride [1]: Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)
Mapping(in).OffsetX [0]: Mapping canvas default X-axis offset for input layer in
Mapping(in).OffsetY [0]: Mapping canvas default Y-axis offset for input layer in
Mapping(in).Offset [0]: Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)
Mapping(in).NbIterations [0]: Mapping canvas pattern default number of iterations for input layer in (0 means no limit)
WeightsSharing []: Share the weights with another layer
BiasesSharing []: Share the biases with another layer

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
NoBias [1] (all Frame): If true, don't use bias
Solvers.* (all Frame): Any solver parameters
WeightsSolver.* (all Frame): Weights solver parameters, take precedence over the Solvers.* parameters
BiasSolver.* (all Frame): Bias solver parameters, take precedence over the Solvers.* parameters
WeightsExportFormat [OCHW] (all Frame): Weights import/export format. Can be OCHW or HWCO, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position)
WeightsExportTranspose [0] (all Frame): If true, import/export transposed kernels
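As an illustration, a possible convolutional layer declaration combining these options might look like the following (a sketch only; the section names and values are arbitrary):
[conv1]
Input=sp
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=16
Stride=1
Padding=2
ActivationFunction=Rectifier
WeightsFiller=XavierFiller
ConfigSection=common.config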

Configuration parameters (Spike models)
Experimental options (implementation may be wrong or susceptible to change)

Option [default value] (Model(s)): Description
IncomingDelay [1 TimePs;100 TimeFs] (all Spike): Synaptic incoming delay wdelay
Threshold [1.0] (Spike, Spike_RRAM): Threshold of the neuron Ithres
BipolarThreshold [1] (Spike, Spike_RRAM): If true, the threshold is also applied to the absolute value of negative values (generating negative spikes)
Leak [0.0] (Spike, Spike_RRAM): Neural leak time constant τleak (if 0, no leak)
Refractory [0.0] (Spike, Spike_RRAM): Neural refractory period Trefrac
WeightsRelInit [0.0;0.05] (Spike): Relative initial synaptic weight winit
WeightsMinMean [1;0.1] (Spike_RRAM): Mean minimum synaptic weight wmin
WeightsMaxMean [100;10.0] (Spike_RRAM): Mean maximum synaptic weight wmax
WeightsMinVarSlope [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMinVarOrigin [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMaxVarSlope [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMaxVarOrigin [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsSetProba [1.0] (Spike_RRAM): Intrinsic SET switching probability PSET (upon receiving a SET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)
WeightsResetProba [1.0] (Spike_RRAM): Intrinsic RESET switching probability PRESET (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)
SynapticRedundancy [1] (Spike_RRAM): Synaptic redundancy (number of RRAM devices per synapse)
BipolarWeights [0] (Spike_RRAM): Bipolar weights
BipolarIntegration [0] (Spike_RRAM): Bipolar integration
LtpProba [0.2] (Spike_RRAM): Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability PSET)
LtdProba [0.1] (Spike_RRAM): Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability PRESET)
StdpLtp [1000 TimePs] (Spike_RRAM): STDP LTP time window TLTP
InhibitRefractory [0 TimePs] (Spike_RRAM): Neural lateral inhibition period Tinhibit
EnableStdp [1] (Spike_RRAM): If false, STDP is disabled (no synaptic weight change)
RefractoryIntegration [1] (Spike_RRAM): If true, reset the integration to 0 during the refractory period
DigitalIntegration [0] (Spike_RRAM): If false, the analog value of the devices is integrated, instead of their binary value

4.7.7

Deconv

Deconvolution layer.
Option [default value]: Description
KernelWidth: Width of the kernel
KernelHeight: Height of the kernel
NbChannels: Number of output channels
StrideX [1]: X-axis stride of the kernels
StrideY [1]: Y-axis stride of the kernels
Stride [1]: Stride of the kernels (mutually exclusive with StrideX and StrideY)
PaddingX [0]: X-axis input padding
PaddingY [0]: Y-axis input padding
Padding [0]: Input padding (mutually exclusive with PaddingX and PaddingY)
ActivationFunction [Tanh]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
WeightsFiller [NormalFiller(0.0, 0.05)]: Weights initial values filler
BiasFiller [NormalFiller(0.0, 0.05)]: Biases initial values filler
Mapping.SizeX [1]: Mapping canvas pattern default width
Mapping.SizeY [1]: Mapping canvas pattern default height
Mapping.Size [1]: Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)
Mapping.StrideX [1]: Mapping canvas default X-axis step
Mapping.StrideY [1]: Mapping canvas default Y-axis step
Mapping.Stride [1]: Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)
Mapping.OffsetX [0]: Mapping canvas default X-axis offset
Mapping.OffsetY [0]: Mapping canvas default Y-axis offset
Mapping.Offset [0]: Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)
Mapping.NbIterations [0]: Mapping canvas pattern default number of iterations (0 means no limit)
Mapping(in).SizeX [1]: Mapping canvas pattern default width for input layer in
Mapping(in).SizeY [1]: Mapping canvas pattern default height for input layer in
Mapping(in).Size [1]: Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)
Mapping(in).StrideX [1]: Mapping canvas default X-axis step for input layer in
Mapping(in).StrideY [1]: Mapping canvas default Y-axis step for input layer in
Mapping(in).Stride [1]: Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)
Mapping(in).OffsetX [0]: Mapping canvas default X-axis offset for input layer in
Mapping(in).OffsetY [0]: Mapping canvas default Y-axis offset for input layer in
Mapping(in).Offset [0]: Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)
Mapping(in).NbIterations [0]: Mapping canvas pattern default number of iterations for input layer in (0 means no limit)
WeightsSharing []: Share the weights with another layer
BiasesSharing []: Share the biases with another layer

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
NoBias [1] (all Frame): If true, don't use bias
BackPropagate [1] (all Frame): If true, enable back-propagation
Solvers.* (all Frame): Any solver parameters
WeightsSolver.* (all Frame): Weights solver parameters, take precedence over the Solvers.* parameters
BiasSolver.* (all Frame): Bias solver parameters, take precedence over the Solvers.* parameters
WeightsExportFormat [OCHW] (all Frame): Weights import/export format. Can be OCHW or HWCO, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position)
WeightsExportTranspose [0] (all Frame): If true, import/export transposed kernels

4.7.8

Pool

Pooling layer.
Option [default value]: Description
Pooling: Type of pooling (Max or Average)
PoolWidth: Width of the pooling area
PoolHeight: Height of the pooling area
NbChannels: Number of output channels
StrideX [1]: X-axis stride of the pooling areas
StrideY [1]: Y-axis stride of the pooling areas
Stride [1]: Stride of the pooling areas (mutually exclusive with StrideX and StrideY)
PaddingX [0]: X-axis input padding
PaddingY [0]: Y-axis input padding
Padding [0]: Input padding
ActivationFunction [Linear]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
Mapping.SizeX [1]: Mapping canvas pattern default width
Mapping.SizeY [1]: Mapping canvas pattern default height
Mapping.Size [1]: Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)
Mapping.StrideX [1]: Mapping canvas default X-axis step
Mapping.StrideY [1]: Mapping canvas default Y-axis step
Mapping.Stride [1]: Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)
Mapping.OffsetX [0]: Mapping canvas default X-axis offset
Mapping.OffsetY [0]: Mapping canvas default Y-axis offset
Mapping.Offset [0]: Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)
Mapping.NbIterations [0]: Mapping canvas pattern default number of iterations (0 means no limit)
Mapping(in).SizeX [1]: Mapping canvas pattern default width for input layer in
Mapping(in).SizeY [1]: Mapping canvas pattern default height for input layer in
Mapping(in).Size [1]: Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)
Mapping(in).StrideX [1]: Mapping canvas default X-axis step for input layer in
Mapping(in).StrideY [1]: Mapping canvas default Y-axis step for input layer in
Mapping(in).Stride [1]: Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)
Mapping(in).OffsetX [0]: Mapping canvas default X-axis offset for input layer in
Mapping(in).OffsetY [0]: Mapping canvas default Y-axis offset for input layer in
Mapping(in).Offset [0]: Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)
Mapping(in).NbIterations [0]: Mapping canvas pattern default number of iterations for input layer in (0 means no limit)
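For reference, a typical non-overlapping 2x2 max-pooling layer (the same pattern is used in the tutorial section) can be declared as follows:
[pool1]
Input=conv1
Type=Pool
PoolWidth=2
PoolHeight=2
NbChannels=[conv1]NbChannels
Stride=2
Pooling=Max
Mapping.Size=1 ; One to one connection between input and output channels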

Configuration parameters (Spike models)
Option [default value] (Model(s)): Description
IncomingDelay [1 TimePs;100 TimeFs] (all Spike): Synaptic incoming delay wdelay

4.7.9

Unpool

Unpooling layer.
Option [default value]: Description
Pooling: Type of pooling (Max or Average)
PoolWidth: Width of the pooling area
PoolHeight: Height of the pooling area
NbChannels: Number of output channels
ArgMax: Name of the associated pool layer for the argmax (the pool layer input and the unpool layer output dimensions must match)
StrideX [1]: X-axis stride of the pooling areas
StrideY [1]: Y-axis stride of the pooling areas
Stride [1]: Stride of the pooling areas (mutually exclusive with StrideX and StrideY)
PaddingX [0]: X-axis input padding
PaddingY [0]: Y-axis input padding
Padding [0]: Input padding
ActivationFunction [Linear]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
Mapping.SizeX [1]: Mapping canvas pattern default width
Mapping.SizeY [1]: Mapping canvas pattern default height
Mapping.Size [1]: Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)
Mapping.StrideX [1]: Mapping canvas default X-axis step
Mapping.StrideY [1]: Mapping canvas default Y-axis step
Mapping.Stride [1]: Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)
Mapping.OffsetX [0]: Mapping canvas default X-axis offset
Mapping.OffsetY [0]: Mapping canvas default Y-axis offset
Mapping.Offset [0]: Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)
Mapping.NbIterations [0]: Mapping canvas pattern default number of iterations (0 means no limit)
Mapping(in).SizeX [1]: Mapping canvas pattern default width for input layer in
Mapping(in).SizeY [1]: Mapping canvas pattern default height for input layer in
Mapping(in).Size [1]: Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)
Mapping(in).StrideX [1]: Mapping canvas default X-axis step for input layer in
Mapping(in).StrideY [1]: Mapping canvas default Y-axis step for input layer in
Mapping(in).Stride [1]: Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)
Mapping(in).OffsetX [0]: Mapping canvas default X-axis offset for input layer in
Mapping(in).OffsetY [0]: Mapping canvas default Y-axis offset for input layer in
Mapping(in).Offset [0]: Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)
Mapping(in).NbIterations [0]: Mapping canvas pattern default number of iterations for input layer in (0 means no limit)

4.7.10

ElemWise

Element-wise operation layer.
Option [default value]: Description
NbOutputs: Number of output neurons
Operation: Type of operation (Sum, AbsSum, EuclideanSum, Prod, or Max)
Weights []: Weights for the Sum, AbsSum, and EuclideanSum operations, in the same order as the inputs
ActivationFunction [Linear]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Given N input tensors T_i, performs the following operation:
Sum operation: T_out = Σ_{i=1}^{N} (w_i T_i)
AbsSum operation: T_out = Σ_{i=1}^{N} (w_i |T_i|)
EuclideanSum operation: T_out = √( Σ_{i=1}^{N} (w_i T_i)² )
Prod operation: T_out = Π_{i=1}^{N} (T_i)
Max operation: T_out = MAX_{i=1}^{N} (T_i)
Examples

Sum of two inputs (Tout = T1 + T2 ):

[elemwise_sum]
Input=layer1,layer2
Type=ElemWise
NbOutputs=[layer1]NbChannels
Operation=Sum

Weighted sum of two inputs, by a factor 0.5 for layer1 and 1.0 for layer2 (Tout = 0.5×T1 +1.0×T2 ):
[elemwise_weighted_sum]
Input=layer1,layer2
Type=ElemWise
NbOutputs=[layer1]NbChannels
Operation=Sum
Weights=0.5 1.0

Single input scaling by a factor 0.5 (Tout = 0.5 × T1 ):
[elemwise_scale]
Input=layer1
Type=ElemWise
NbOutputs=[layer1]NbChannels
Operation=Sum
Weights=0.5

Absolute value of an input (Tout = |T1|), using the AbsSum operation with a single input:
[elemwise_abs]
Input=layer1
Type=ElemWise
NbOutputs=[layer1]NbChannels
Operation=AbsSum

4.7.11

FMP

Fractional max pooling layer (Graham, 2014).


Option [default value]: Description
NbChannels: Number of output channels
ScalingRatio: Scaling ratio. The output size is round(input size / scaling ratio).
ActivationFunction [Linear]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Overlapping [1] (all Frame): If true, use overlapping regions, else use disjoint regions
PseudoRandom [1] (all Frame): If true, use pseudorandom sequences, else use random sequences

4.7.12

Fc

Fully connected layer.
Option [default value]: Description
NbOutputs: Number of output neurons
WeightsFiller [NormalFiller(0.0, 0.05)]: Weights initial values filler
BiasFiller [NormalFiller(0.0, 0.05)]: Biases initial values filler
ActivationFunction [Tanh]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
NoBias [1] (all Frame): If true, don't use bias
BackPropagate [1] (all Frame): If true, enable back-propagation
Solvers.* (all Frame): Any solver parameters
WeightsSolver.* (all Frame): Weights solver parameters, take precedence over the Solvers.* parameters
BiasSolver.* (all Frame): Bias solver parameters, take precedence over the Solvers.* parameters
DropConnect [1.0] (Frame): If below 1.0, fraction of synapses that are disabled with drop connect
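For instance, DropConnect could be enabled on a fully connected layer through its configuration section (a sketch; names and values are illustrative):
[fc1]
Input=conv3
Type=Fc
NbOutputs=84
ActivationFunction=Rectifier
ConfigSection=fc1.config

[fc1.config]
DropConnect=0.5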

Configuration parameters (Spike models)


Option [default value] (Model(s)): Description
IncomingDelay [1 TimePs;100 TimeFs] (all Spike): Synaptic incoming delay wdelay
Threshold [1.0] (Spike, Spike_RRAM): Threshold of the neuron Ithres
BipolarThreshold [1] (Spike, Spike_RRAM): If true, the threshold is also applied to the absolute value of negative values (generating negative spikes)
Leak [0.0] (Spike, Spike_RRAM): Neural leak time constant τleak (if 0, no leak)
Refractory [0.0] (Spike, Spike_RRAM): Neural refractory period Trefrac
TerminateDelta [0] (Spike, Spike_RRAM): Terminate delta
WeightsRelInit [0.0;0.05] (Spike): Relative initial synaptic weight winit
WeightsMinMean [1;0.1] (Spike_RRAM): Mean minimum synaptic weight wmin
WeightsMaxMean [100;10.0] (Spike_RRAM): Mean maximum synaptic weight wmax
WeightsMinVarSlope [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMinVarOrigin [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMaxVarSlope [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsMaxVarOrigin [0.0] (Spike_RRAM): OXRAM specific parameter
WeightsSetProba [1.0] (Spike_RRAM): Intrinsic SET switching probability PSET (upon receiving a SET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)
WeightsResetProba [1.0] (Spike_RRAM): Intrinsic RESET switching probability PRESET (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)
SynapticRedundancy [1] (Spike_RRAM): Synaptic redundancy (number of RRAM devices per synapse)
BipolarWeights [0] (Spike_RRAM): Bipolar weights
BipolarIntegration [0] (Spike_RRAM): Bipolar integration
LtpProba [0.2] (Spike_RRAM): Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability PSET)
LtdProba [0.1] (Spike_RRAM): Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability PRESET)
StdpLtp [1000 TimePs] (Spike_RRAM): STDP LTP time window TLTP
InhibitRefractory [0 TimePs] (Spike_RRAM): Neural lateral inhibition period Tinhibit
EnableStdp [1] (Spike_RRAM): If false, STDP is disabled (no synaptic weight change)
RefractoryIntegration [1] (Spike_RRAM): If true, reset the integration to 0 during the refractory period
DigitalIntegration [0] (Spike_RRAM): If false, the analog value of the devices is integrated, instead of their binary value

N2D2 IP only

4.7.13

Rbf

Radial basis function fully connected layer.
Option [default value]: Description
NbOutputs: Number of output neurons
CentersFiller [NormalFiller(0.5, 0.05)]: Centers initial values filler
ScalingFiller [NormalFiller(10.0, 0.05)]: Scaling initial values filler

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Solvers.* (all Frame): Any solver parameters
CentersSolver.* (all Frame): Centers solver parameters, take precedence over the Solvers.* parameters
ScalingSolver.* (all Frame): Scaling solver parameters, take precedence over the Solvers.* parameters
RbfApprox [None] (Frame): Approximation for the Gaussian function, can be any of: None, Rectangular or SemiLinear

4.7.14

Softmax

Softmax layer.
Option [default value]: Description
NbOutputs: Number of output neurons
WithLoss [0]: Softmax followed with a multinomial logistic layer

The softmax function performs the following operation, with a^i_{x,y} and b^i_{x,y} the input and the
output respectively at position (x, y) on channel i:

b^i_{x,y} = exp(a^i_{x,y}) / Σ_{j=0}^{N} exp(a^j_{x,y})

and

da^i_{x,y} = Σ_{j=0}^{N} (δ_{ij} − a^i_{x,y} a^j_{x,y}) db^j_{x,y}

When the WithLoss option is enabled, the gradient is computed directly with respect to the cross-entropy
loss:

L_{x,y} = Σ_{j=0}^{N} t^j_{x,y} log(b^j_{x,y})

In this case, the gradient output becomes:

da^i_{x,y} = db^i_{x,y}, with db^i_{x,y} = t^i_{x,y} − b^i_{x,y}

4.7.15

LRN

Local Response Normalization (LRN) layer.
Option [default value]: Description
NbOutputs: Number of output neurons

The response-normalized activity b^i_{x,y} is given by the expression:

b^i_{x,y} = a^i_{x,y} / ( k + α Σ_{j=max(0,i−n/2)}^{min(N−1,i+n/2)} (a^j_{x,y})² )^β

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
N [5] (all Frame): Normalization window width in elements
Alpha [1.0e-4] (all Frame): Value of the alpha variance scaling parameter in the normalization formula
Beta [0.75] (all Frame): Value of the beta power parameter in the normalization formula
K [2.0] (all Frame): Value of the k parameter in the normalization formula

4.7.16

Dropout

Dropout layer (Srivastava et al., 2012).
Option [default value]: Description
NbOutputs: Number of output neurons
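A typical usage is to insert a dropout layer between two fully connected layers, as in the classifier tutorial later in this manual:
[fc1.drop]
Input=fc1
Type=Dropout
NbOutputs=[fc1]NbOutputs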

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Dropout [0.5] (all Frame): The probability with which the value from the input would be dropped

4.7.17

BatchNorm

Batch Normalization layer (Ioffe and Szegedy, 2015).
Option [default value]: Description
NbOutputs: Number of output neurons
ActivationFunction [Tanh]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
ScalesSharing []: Share the scales with another layer
BiasesSharing []: Share the biases with another layer
MeansSharing []: Share the means with another layer
VariancesSharing []: Share the variances with another layer
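For example, a batch normalization layer following a convolution could be sketched as follows (the layer names are only illustrative):
[bn1]
Input=conv1
Type=BatchNorm
NbOutputs=[conv1]NbChannels
ActivationFunction=Rectifier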

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Solvers.* (all Frame): Any solver parameters
ScaleSolver.* (all Frame): Scale solver parameters, take precedence over the Solvers.* parameters
BiasSolver.* (all Frame): Bias solver parameters, take precedence over the Solvers.* parameters
Epsilon [0.0] (all Frame): Epsilon value used in the batch normalization formula. If 0.0, automatically choose the minimum possible value.

4.7.18

Transformation

Transformation layer, which can apply any transformation described in section 4.6.1. Useful for
fully-CNN post-processing, for example.
Option [default value]: Description
NbOutputs: Number of outputs
Transformation: Name of the transformation to apply

The transformation's own options must be placed in the same section.
Usage example for fully CNNs:
[post.Transformation-thres]
Input=... ; for example, network’s logistic of softmax output layer
NbOutputs=1
Type=Transformation
Transformation=ThresholdTransformation
Operation=ToZero
Threshold=0.75
[post.Transformation-morpho]
Input=post.Transformation-thres
NbOutputs=1
Type=Transformation
Transformation=MorphologyTransformation
Operation=Opening
Size=3


5

Tutorials

5.1

Building a classifier neural network

For this tutorial, we will use the classical MNIST handwritten digit dataset. A driver module
already exists for this dataset, named MNIST_IDX_Database.
To instantiate it, just add the following lines in a new INI file:
[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Use 20% of the dataset for validation

In order to create a neural network, we first need to define its input, which is declared with a
[sp] section (sp for StimuliProvider). In this section, we configure the size of the input and the
batch size:
[sp]
SizeX=32
SizeY=32
BatchSize=128

We can also add pre-processing transformations to the StimuliProvider, knowing that the final
data size after transformations must match the size declared in the [sp] section. Here, we must
rescale the MNIST 28x28 images to match the 32x32 network input size.
[sp.Transformation_1]
Type=RescaleTransformation
Width=[sp]SizeX
Height=[sp]SizeY

Next, we declare the neural network layers. In this example, we reproduced the well-known
LeNet network. The first layer is a 5x5 convolutional layer, with 6 channels. Since there is only one
input channel, there will be only 6 convolution kernels in this layer.
[conv1]
Input=sp
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=6

The next layer is a 2x2 MAX pooling layer, with a stride of 2 (non-overlapping MAX pooling).
[pool1]
Input=conv1
Type=Pool
PoolWidth=2
PoolHeight=2
NbChannels=[conv1]NbChannels
Stride=2
Pooling=Max
Mapping.Size=1 ; One to one connection between input and output channels

The next layer is a 5x5 convolutional layer with 16 channels.
[conv2]
Input=pool1
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=16

Note that in LeNet, the [conv2] layer is not fully connected to the pooling layer. In N2D2, a
custom mapping can be defined for each input connection. The connection of n-th output map to
the inputs is defined by the n-th column of the matrix below, where the rows correspond to the
inputs.

Map(pool1)=\
1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 \
1 1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 \
1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1 \
0 1 1 1 0 0 1 1 1 1 0 0 1 0 1 1 \
0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 \
0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 1

Another MAX pooling and convolution layer follow:
[pool2]
Input=conv2
Type=Pool
PoolWidth=2
PoolHeight=2
NbChannels=[conv2]NbChannels
Stride=2
Pooling=Max
Mapping.Size=1
[conv3]
Input=pool2
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=120

The network is composed of two fully-connected layers of 84 and 10 neurons respectively:
[fc1]
Input=conv3
Type=Fc
NbOutputs=84
[fc2]
Input=fc1
Type=Fc
NbOutputs=10

Finally, we use a softmax layer to obtain output classification probabilities and compute the
loss function.
[softmax]
Input=fc2
Type=Softmax
NbOutputs=[fc2]NbOutputs
WithLoss=1

In order to tell N2D2 to compute the error and the classification score on this softmax layer, one
must attach a N2D2 Target to this layer, with a section with the same name suffixed with .Target:
[softmax.Target]

By default, the activation function for the convolution and the fully-connected layers is the
hyperbolic tangent. Because the [fc2] layer is fed to a softmax, it should not have any activation
function. We can specify it by adding the following line in the [fc2] section:
[fc2]
...
ActivationFunction=Linear

To further improve the network's performance, several things can be done:
• Use ReLU activation functions. In order to do so, just add the following in the [conv1],
[conv2], [conv3] and [fc1] layer sections:
ActivationFunction=Rectifier


For the ReLU activation function to be effective, the weights must be initialized carefully, in
order to avoid dead units that would be stuck in the ] − ∞, 0] output range before the ReLU
function. In N2D2, one can use a custom WeightsFiller for the weights initialization. For the
ReLU activation function, a popular and efficient filler is the so-called XavierFiller (see the
4.7.2 section for more information):
WeightsFiller=XavierFiller

• Use dropout layers. Dropout is highly effective to improve the network generalization
capacity. Here is an example of a dropout layer inserted between the [fc1] and [fc2] layers:
[fc1]
...
[fc1.drop]
Input=fc1
Type=Dropout
NbOutputs=[fc1]NbOutputs
[fc2]
Input=fc1.drop ; Replaces "Input=fc1"
...

• Tune the learning parameters. You may want to tune the learning rate and other learning
parameters depending on the learning problem at hand. In order to do so, you can add a
configuration section that can be common (or not) to all the layers. Here is an example of
configuration section:
[conv1]
...
ConfigSection=common.config
[...]
...
[common.config]
NoBias=1
WeightsSolver.LearningRate=0.05
WeightsSolver.Decay=0.0005
Solvers.LearningRatePolicy=StepDecay
Solvers.LearningRateStepSize=[sp]_EpochSize
Solvers.LearningRateDecay=0.993
Solvers.Clamping=1

For more details on the configuration parameters for the Solver, see section 4.7.3.
• Add input distortion. See for example the DistortionTransformation (section 4.6.1).
The complete INI model corresponding to this tutorial can be found in models/LeNet.ini.
In order to use CUDA/GPU accelerated learning, the default layer model should be switched to
Frame_CUDA. You can enable this model by adding the following line at the top of the INI file (before
the first section):
DefaultModel=Frame_CUDA
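For reference, the learning of this model can then be launched with the n2d2 executable, using the
same invocation as at the end of section 5.3; the number of learning steps and the log period are
only indicative, and the -test invocation on the formal model is given here by analogy:
./n2d2 "$N2D2_MODELS/LeNet.ini" -learn 6000000 -log 100000
./n2d2 "$N2D2_MODELS/LeNet.ini" -test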

5.2 Building a segmentation neural network

In this tutorial, we will learn how to do image segmentation with N2D2. As an example, we will
implement a face detection and gender recognition neural network, using the IMDB-WIKI dataset.
First, we need to instantiate the built-in N2D2 driver for the IMDB-WIKI dataset:
[database]
Type=IMDBWIKI_Database


WikiSet=1 ; Use the WIKI part of the dataset
IMDBSet=0 ; Don’t use the IMDB part (less accurate annotation)
Learn=0.90
Validation=0.05
DefaultLabel=background ; Label for pixels outside any ROI (default is no label, pixels are ignored)

We must specify a default label for the background, because we want to learn to differentiate
faces from the background (and not simply ignore the background during learning).
The network input is then declared:
[sp]
SizeX=480
SizeY=360
BatchSize=48
CompositeStimuli=1

In order to work with segmented data, i.e. data with bounding box annotations or pixel-wise
annotations (as opposed to a single label per stimulus), one must enable the CompositeStimuli option in
the [sp] section.
We can then perform various operations on the data before feeding it to the network, like for
example converting the 3-channels RGB input images to single-channel gray images:
[sp.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray

We then only need to rescale the images to match the network's input size. This can be done using
a RescaleTransformation, followed by a PadCropTransformation if one wants to keep the image aspect
ratio.
[sp.Transformation-2]
Type=RescaleTransformation
Width=[sp]SizeX
Height=[sp]SizeY
KeepAspectRatio=1 ; Keep images aspect ratio
; Required to ensure all the images are the same size
[sp.Transformation-3]
Type=PadCropTransformation
Width=[sp]SizeX
Height=[sp]SizeY

A common additional operation to extend the learning set is to apply a random horizontal flip
to the images. This can be achieved with the following FlipTransformation:
[sp.OnTheFlyTransformation-4]
Type=FlipTransformation
RandomHorizontalFlip=1
ApplyTo=LearnOnly ; Apply this transformation only on the learning set

Note that this is an on-the-fly transformation, meaning it cannot be cached and is re-executed
every time even for the same stimuli. We also apply this transformation only on the learning set,
with the ApplyTo option.
Next, the neural network can be described:
[conv1.1]
Input=sp
Type=Conv
...
[pool1]
...
[...]


...
[fc2]
Input=drop1
Type=Conv
...
[drop2]
Input=fc2
Type=Dropout
NbOutputs=[fc2]NbChannels

A full network description can be found in the IMDBWIKI.ini file in the models directory of
N2D2. It is a fully convolutional network.
Here we will focus on the output layers required to detect the faces and classify their gender.
We start from the [drop2] layer, which has 128 channels of size 60x45 (the 480x360 input down-sampled by a factor of 8).
5.2.1 Face detection

We first want to add an output stage for face detection. It is a 1x1 convolutional layer with a
single 60x45 output map. For each output pixel, this layer outputs the probability that the pixel
belongs to a face.
[fc3.face]
Input=drop2
Type=Conv
KernelWidth=1
KernelHeight=1
NbChannels=1
Stride=1
ActivationFunction=LogisticWithLoss
WeightsFiller=XavierFiller
ConfigSection=common.config ; Same solver options as the other layers

In order to do so, the activation function of this layer must be of type LogisticWithLoss.
We must also tell N2D2 to compute the error and the classification score on this layer,
by attaching an N2D2 Target to it, with a section with the same name suffixed with .Target:
[fc3.face.Target]
LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat
; Visualization parameters
NoDisplayLabel=0
LabelsHueOffset=90

In this Target, we must specify how the dataset annotations are mapped to the layer's output.
This can be done in a separate file using the LabelsMapping parameter. Here, since the output layer
has a single output per pixel, the target value can only be 0 or 1. A target value of -1 means that
this output is ignored (no error back-propagated). Since the only annotations in the IMDB-WIKI
dataset are faces, the mapping described in the IMDBWIKI_target_face.dat file is straightforward:
# background
background 0
# padding (*) is ignored (-1)
* -1
# not background = face
default 1


5.2.2 Gender recognition

We can also add a second output stage for gender recognition. As before, it is a 1x1
convolutional layer with a single 60x45 output map, but here, for each output pixel, this layer
outputs the probability that the pixel represents a female face.
[fc3.gender]
Input=drop2
Type=Conv
KernelWidth=1
KernelHeight=1
NbChannels=1
Stride=1
ActivationFunction=LogisticWithLoss
WeightsFiller=XavierFiller
ConfigSection=common.config

The output layer is therefore identical to the face detection output layer, but the target mapping is
different. For the target mapping, the idea is simply to ignore all pixels not belonging to a face and
to assign the target 0 to male pixels and the target 1 to female pixels.
[fc3.gender.Target]
LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat
; Only display gender probability for pixels detected as face pixels
MaskLabelTarget=fc3.face.Target
MaskedLabel=1

The content of the IMDBWIKI_target_gender.dat file would therefore look like:
# background
# ?-* (unknown gender)
# padding
default -1
# male gender
M-? 0 # unknown age
M-0 0
M-1 0
M-2 0
...
M-98 0
M-99 0
# female gender
F-? 1 # unknown age
F-0 1
F-1 1
F-2 1
...
F-98 1
F-99 1

5.2.3 ROIs extraction

The next step is to extract the detected face ROIs and to assign to each ROI the most probable
gender. To this end, we can first set a detection threshold, in terms of probability, to select face
pixels. In the following, the threshold is fixed at a 75% face probability:
[post.Transformation-thres]
Input=fc3.face
Type=Transformation
NbOutputs=1
Transformation=ThresholdTransformation
Operation=ToZero
Threshold=0.75


We can then assign a target of type TargetROIs to this layer, which will automatically create the
bounding boxes using a segmentation algorithm.
[post.Transformation-thres.Target-face]
Type=TargetROIs
MinOverlap=0.33 ; Min. overlap fraction to match the ROI to an annotation
FilterMinWidth=5 ; Min. ROI width
FilterMinHeight=5 ; Min. ROI height
FilterMinAspectRatio=0.5 ; Min. ROI aspect ratio
FilterMaxAspectRatio=1.5 ; Max. ROI aspect ratio
LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat

In order to assign a gender to the extracted ROIs, the above target must be modified to:
[post.Transformation-thres.Target-gender]
Type=TargetROIs
ROIsLabelTarget=fc3.gender.Target
MinOverlap=0.33
FilterMinWidth=5
FilterMinHeight=5
FilterMinAspectRatio=0.5
FilterMaxAspectRatio=1.5
LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat

Here, we use the fc3.gender.Target target to determine the most probable gender of the ROI.
5.2.4 Data visualization

For each Target in the network, a corresponding folder is created in the simulation directory, which
contains the learning, validation and test confusion matrices. The output estimation of the network for
each stimulus is also generated automatically for the test dataset and can be visualized with the
./test.py helper tool. An example is shown in figure 13.

Figure 13: Example of the target visualization helper tool, showing the pixel-level input labels (dataset annotation), the network output estimation (most probable object type for each pixel), the image selection panel and the labels legend (object type).


5.3 Transcoding a learned network in spike-coding

N2D2 embeds an event-based simulator (historically known as 'Xnet') and allows one to transcode a
whole DNN into a spike-coding version and to evaluate the performance of the resulting spiking neural network.
In this tutorial, we will transcode the LeNet network described in section 5.1.
5.3.1 Render the network compatible with spike simulations

The first step is to specify that we want to use a transcode model (allowing both formal and spike
simulation of the same network), by changing the DefaultModel to:
DefaultModel=Transcode_CUDA

In order to perform spike simulations, the input of the network must be of type Environment,
which is a derived class of StimuliProvider that adds spike coding support. In the INI model file, it
is therefore necessary to replace the [sp] section by an [env] section and to replace all references
to sp with env.
Note that at this point, these changes have no impact at all on the formal coding simulations.
The beginning of the INI file should be:
DefaultModel=Transcode_CUDA
; Database
[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Use 20% of the dataset for validation
; Environment
[env]
SizeX=32
SizeY=32
BatchSize=128
[env.Transformation_1]
Type=RescaleTransformation
Width=[env]SizeX
Height=[env]SizeY
[conv1]
Input=env
...

The dropout layer has no equivalent in spike-coding inference and must be removed:
...
[fc1.drop]
Input=fc1
Type=Dropout
NbOutputs=[fc1]NbOutputs
[fc2]
Input=fc1.drop
...

The softmax layer has no equivalent in spike-coding inference and must be removed as well.
The Target must therefore be attached to [fc2]:
...
[softmax]
Input=fc2
Type=Softmax
NbOutputs=[fc2]NbOutputs
WithLoss=1


[softmax.Target]
[fc2.Target]
...

The network is now compatible with spike-coding simulations. However, we did not specify at
this point how to translate the input stimuli data into spikes, nor the spiking neuron parameters
(threshold value, leak time constant...).
5.3.2 Configure spike-coding parameters

The first step is to configure how the input stimuli data must be coded into spikes. To this end, we
must attach a configuration section to the Environment. Here, we specify a periodic coding with
random initial jitter, with a mean period ranging from 10 ns to 100 µs and a minimum period of 1 ns:
[env]
...
ConfigSection=env.config
[env.config]
; Spike-based computing
StimulusType=JitteredPeriodic
PeriodMin=1,000,000 ; unit = fs
PeriodMeanMin=10,000,000 ; unit = fs
PeriodMeanMax=100,000,000,000 ; unit = fs
PeriodRelStdDev=0.0
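Note that the period parameters above are expressed in femtoseconds: 1,000,000 fs = 1 ns, 10,000,000 fs = 10 ns and 100,000,000,000 fs = 100 µs.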

The next step is to specify the neuron parameters, which are common to all layers and can
therefore be specified in the [common.config] section. In N2D2, the base spike-coding layers use a
Leaky Integrate-and-Fire (LIF) neuron model. By default, the leak time constant is zero, resulting
in simple Integrate-and-Fire (IF) neurons.
Here we simply specify that the neuron threshold is unity, that the threshold is only
positive and that there is no incoming synaptic delay:
[common.config]
...
; Spike-based computing
Threshold=1.0
BipolarThreshold=0
IncomingDelay=0

Finally, we can limit the number of spikes required for the computation of each stimulus by
adding a decision delta threshold at the output layer:
[fc2]
...
ConfigSection=common.config,fc2.config
[fc2.Target]
[fc2.config]
; Spike-based computing
TerminateDelta=4
BipolarThreshold=1
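Assuming the natural interpretation of this parameter (not detailed in this tutorial), TerminateDelta=4 means that the processing of a stimulus can stop as soon as one output neuron has emitted 4 more spikes than any other output neuron, which bounds the number of spikes, and therefore of operations, needed per decision.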

The complete INI model corresponding to this tutorial can be found in models/LeNet_Spike.ini.
Here is a summary of the steps required to reproduce the whole experiment:
./n2d2 "$N2D2_MODELS/LeNet.ini" -learn 6000000 -log 100000
./n2d2 "$N2D2_MODELS/LeNet_Spike.ini" -test

The final recognition rate reported at the end of the spike inference should be almost identical
to the formal coding network (around 99% for the LeNet network).

Various statistics are available at the end of the spike-coding simulation in the stats_spike
folder and the stats_spike.log file. Looking in the stats_spike.log file, one can read the following
line towards the end of the file:
Read events per virtual synapse per pattern (average): 0.654124

This line reports the average number of accumulation operations per synapse per input stimulus
in the network. If this number is below 1.0, it means that the spiking version of the network is
more efficient than its formal counterpart in terms of total number of operations!
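For illustration, with a purely hypothetical count of one million virtual synapses, 0.654 read events per synapse per pattern corresponds to about 654,000 accumulations per stimulus, whereas the formal network performs roughly one multiply-accumulate per virtual synapse per stimulus, i.e. about one million operations.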


References
P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: A benchmark. In CVPR,
2009.
L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples:
an incremental bayesian approach tested on 101 object categories. In IEEE. CVPR 2004,
Workshop on Generative-Model Based Vision, 2004.
X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks.
In International conference on artificial intelligence and statistics, pages 249–256, 2010.
B. Graham. Fractional max-pooling. CoRR, abs/1412.6071, 2014.
G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset, 2007.
S. Houben, J. Stallkamp, J. Salmen, M. Schlipsing, and C. Igel. Detection of traffic signs in
real-world images: The German Traffic Sign Detection Benchmark. In International Joint
Conference on Neural Networks, number 1288, 2013.
S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing
internal covariate shift. CoRR, abs/1502.03167, 2015.
V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings,
2010.
A. Krizhevsky. Learning multiple layers of features from tiny images, 2009.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document
recognition. In Proceedings of the IEEE, volume 86, pages 2278–2324, 1998.
P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews. The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression.
2010.
A. Rakotomamonjy and G. Gasso. Histogram of gradients of time-frequency representations for
audio scene detection, 2014.
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy,
A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. doi:
10.1007/s11263-015-0816-y.
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple
way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:
1929–1958, 2014.
J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. Man vs. computer: Benchmarking machine
learning algorithms for traffic sign recognition. Neural Networks, 2012. ISSN 0893-6080. doi:
10.1016/j.neunet.2012.02.016.



