Neural Network Design & Deployment

Olivier Bichler, David Briand, Victor Gacoin, Benjamin Bertelone, Thibault Allenet, Johannes C. Thiele

Wednesday 20th February, 2019

Commissariat à l'Energie Atomique et aux Energies Alternatives
Institut List | CEA Saclay Nano-INNOV | Bât. 861-PC142
91191 Gif-sur-Yvette Cedex - FRANCE
Tel.: +33 (0)1.69.08.49.67 | Fax: +33 (0)1.69.08.83.95
www-list.cea.fr
Établissement Public à caractère Industriel et Commercial | RCS Paris B 775 685 019
Département Architecture Conception et Logiciels Embarqués

Contents

1 Presentation
  1.1 Database handling
  1.2 Data pre-processing
  1.3 Deep network building
  1.4 Performances evaluation
  1.5 Hardware exports
  1.6 Summary
2 About N2D2-IP
3 Performing simulations
  3.1 Obtaining the latest version of this manual
  3.2 Minimum system requirements
  3.3 Obtaining N2D2
    3.3.1 Prerequisites (Red Hat Enterprise Linux (RHEL) 6, Ubuntu, Windows)
    3.3.2 Getting the sources
    3.3.3 Compilation
  3.4 Downloading training datasets
  3.5 Run the learning
  3.6 Test a learned network
    3.6.1 Interpreting the results (Recognition rate, Confusion matrix, Memory and computation requirements, Kernels and weights distribution, Output maps activity)
  3.7 Export a learned network
    3.7.1 C export
    3.7.2 CPP_OpenCL export
    3.7.3 CPP_TensorRT export
    3.7.4 CPP_cuDNN export
    3.7.5 C_HLS export
4 INI file interface
  4.1 Syntax
    4.1.1 Properties
    4.1.2 Sections
    4.1.3 Case sensitivity
    4.1.4 Comments
    4.1.5 Quoted values
    4.1.6 Whitespace
    4.1.7 Escape characters
  4.2 Template inclusion syntax
    4.2.1 Variable substitution
    4.2.2 Control statements (block, for, if, include)
  4.3 Global parameters
  4.4 Databases
    4.4.1 MNIST
    4.4.2 GTSRB
    4.4.3 Directory
    4.4.4 Other built-in databases (CIFAR10_Database, CIFAR100_Database, CKP_Database, Caltech101_DIR_Database, Caltech256_DIR_Database, CaltechPedestrian_Database, Cityscapes_Database, Daimler_Database, DOTA_Database, FDDB_Database, GTSDB_DIR_Database, ILSVRC2012_Database, KITTI_Database, KITTI_Road_Database, KITTI_Object_Database, LITISRouen_Database)
    4.4.5 Dataset images slicing
  4.5 Stimuli data analysis
    4.5.1 Zero-mean and unity standard deviation normalization
    4.5.2 Subtracting the mean image of the set
  4.6 Environment
    4.6.1 Built-in transformations (AffineTransformation, ApodizationTransformation, ChannelExtractionTransformation, ColorSpaceTransformation, DFTTransformation, DistortionTransformation, EqualizeTransformation, ExpandLabelTransformation, FilterTransformation, FlipTransformation, GradientFilterTransformation, LabelSliceExtractionTransformation, MagnitudePhaseTransformation, MorphologicalReconstructionTransformation, MorphologyTransformation, NormalizeTransformation, PadCropTransformation, RandomAffineTransformation, RangeAffineTransformation, RangeClippingTransformation, RescaleTransformation, ReshapeTransformation, SliceExtractionTransformation, ThresholdTransformation, TrimTransformation, WallisFilterTransformation)
  4.7 Network layers
    4.7.1 Layer definition
    4.7.2 Weight fillers (ConstantFiller, HeFiller, NormalFiller, UniformFiller, XavierFiller)
    4.7.3 Weight solvers (SGDSolver_Frame, SGDSolver_Frame_CUDA, AdamSolver_Frame, AdamSolver_Frame_CUDA)
    4.7.4 Activation functions (Logistic, LogisticWithLoss, Rectifier, Saturation, Softplus, Tanh, TanhLeCun)
    4.7.5 Anchor (Configuration parameters (Frame models), Outputs remapping)
    4.7.6 Conv (Configuration parameters (Frame models), Configuration parameters (Spike models))
    4.7.7 Deconv (Configuration parameters (Frame models))
    4.7.8 Pool (Maxout example, Configuration parameters (Spike models))
    4.7.9 Unpool
    4.7.10 ElemWise (Sum, AbsSum, EuclideanSum, Prod and Max operations, Examples)
    4.7.11 FMP (Configuration parameters (Frame models))
    4.7.12 Fc (Configuration parameters (Frame models), Configuration parameters (Spike models))
    4.7.13 Rbf (Configuration parameters (Frame models))
    4.7.14 Softmax
    4.7.15 LRN (Configuration parameters (Frame models))
    4.7.16 LSTM (Global layer parameters (Frame_CUDA models), Configuration parameters (Frame_CUDA models), Current restrictions, Further development requirements, Development guidance)
    4.7.17 Dropout (Configuration parameters (Frame models))
    4.7.18 Padding
    4.7.19 Resize (Configuration parameters)
    4.7.20 BatchNorm (Configuration parameters (Frame models))
    4.7.21 Transformation
5 Tutorials
  5.1 Learning deep neural networks: tips and tricks
    5.1.1 Choose the learning solver
    5.1.2 Choose the learning hyper-parameters
    5.1.3 Convergence and normalization
  5.2 Building a classifier neural network
  5.3 Building a segmentation neural network
    5.3.1 Faces detection
    5.3.2 Gender recognition
    5.3.3 ROIs extraction
    5.3.4 Data visualization
  5.4 Transcoding a learned network in spike-coding
    5.4.1 Render the network compatible with spike simulations
    5.4.2 Configure spike-coding parameters

1 Presentation

The N2D2 platform is a comprehensive solution for fast and accurate Deep Neural Network (DNN) simulation and for fully automated DNN-based application building. The platform integrates database construction, data pre-processing, network building, benchmarking and hardware export to various targets. It is particularly useful for DNN design and exploration, allowing simple and fast prototyping of DNNs with different topologies. It is possible to define and learn multiple network topology variations and compare their performances (in terms of recognition rate and computational cost) automatically. Export targets include CPU, DSP and GPU with OpenMP, OpenCL, CUDA, cuDNN and TensorRT programming models, as well as custom hardware IP code generation with High-Level Synthesis for FPGA and a dedicated configurable DNN accelerator IP (ongoing work).

In the following, the first section describes the database handling capabilities of the tool, which can automatically generate learning, validation and testing data sets from any hand-made database (for example from simple file directories). The second section briefly describes the data pre-processing capabilities built into the tool, which do not require any external pre-processing step and can handle many data transformations, normalizations and augmentations (for example using elastic distortion to improve the learning).
The third section shows an example of DNN building using a simple INI text configuration file. The fourth section shows some examples of metrics obtained after learning and testing to evaluate the performance of the learned DNN. Next, the fifth section introduces the DNN hardware export capabilities of the toolflow, which can automatically generate ready-to-use code for various targets such as embedded GPUs or fully custom dedicated FPGA IP. Finally, we conclude by summarising the main features of the tool.

1.1 Database handling

The tool integrates everything needed to handle custom or hand-made databases:
• Genericity: load image and sound, 1D, 2D or 3D data;
• Associate a label to each data point (useful for scene labeling for example) or a single label to each data file (one object/class per image for example), with 1D or 2D labels;
• Advanced Region of Interest (ROI) handling:
  - Support of arbitrary ROI shapes (circular, rectangular, polygonal or pixelwise defined);
  - Conversion of ROIs to data point (pixelwise) labels;
  - Extraction of one or multiple ROIs from an initial dataset to create as many corresponding additional data samples to feed the DNN;
• Native support of file directory-based databases, where each sub-directory represents a different label. The most common image file formats are supported (JPEG, PNG, PGM...);
• Possibility to add custom data file formats to the tool without any change in the code base;
• Automatic random partitioning of the database into learning, validation and testing sets.

1.2 Data pre-processing

Data pre-processing, such as image rescaling, normalization or filtering, is directly integrated into the toolflow, with no need for any external tool or pre-processing step. Each pre-processing step is called a transformation. The full sequence of transformations can be specified easily in an INI text configuration file. For example:

; First step: convert the image to grayscale
[env.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray

; Second step: rescale the image to a 29x29 size
[env.Transformation-2]
Type=RescaleTransformation
Width=29
Height=29

; Third step: apply histogram equalization to the image
[env.Transformation-3]
Type=EqualizeTransformation

; Fourth step (only during learning): apply random elastic distortions to the images to extend the learning set
[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly
ElasticGaussianSize=21
ElasticSigma=6.0
ElasticScaling=20.0
Scaling=15.0
Rotation=15.0

Examples of pre-processing transformations built into the tool are:
• Image color space change and color channel extraction;
• Elastic distortion;
• Histogram equalization (including CLAHE);
• Convolutional filtering of the image with custom or pre-defined kernels (Gaussian, Gabor...);
• (Random) image flipping;
• (Random) extraction of fixed-size slices in a given label (for multi-label images);
• Normalization;
• Rescaling, padding/cropping, trimming;
• Image data range clipping;
• (Random) extraction of fixed-size slices.

1.3 Deep network building

The building of a deep network is straightforward and can be done within the same INI configuration file. Several layer types are available: convolutional, pooling, fully connected, Radial-basis function (RBF) and softmax. The tool is highly modular and new layer types can be added without any change in the code base.
Parameters of each layer type are modifiable; for example, for the convolutional layer, one can specify the size of the convolution kernels, the stride, the number of kernels per input map and the learning parameters (learning rate, initial weights value...). For the learning, the data dynamic can be chosen between 16 bits (with NVIDIA® cuDNN, on future GPUs), 32 bits and 64 bits floating point numbers.

The following example, which will serve as the use case for the rest of this presentation, shows how to build a DNN with 5 layers: one convolution layer, followed by one MAX pooling layer, followed by two fully connected layers and a softmax output layer.

; Specify the input data format
[env]
SizeX=24
SizeY=24
BatchSize=12

; First layer: convolutional with 3x3 kernels
[conv1]
Input=env
Type=Conv
KernelWidth=3
KernelHeight=3
NbOutputs=32
Stride=1

; Second layer: MAX pooling with pooling area 2x2
[pool1]
Input=conv1
Type=Pool
Pooling=Max
PoolWidth=2
PoolHeight=2
NbOutputs=32
Stride=2
Mapping.Size=1 ; one to one connection between convolution output maps and pooling input maps

; Third layer: fully connected layer with 60 neurons
[fc1]
Input=pool1
Type=Fc
NbOutputs=60

; Fourth layer: fully connected with 10 neurons
[fc2]
Input=fc1
Type=Fc
NbOutputs=10

; Final layer: softmax
[softmax]
Input=fc2
Type=Softmax
NbOutputs=10
WithLoss=1

[softmax.Target]
TargetValue=1.0
DefaultValue=0.0

The resulting DNN is shown in figure 1.

Figure 1: Automatically generated and ready-to-learn DNN from the INI configuration file example (env 24x24, conv1 32 maps (22x22), pool1 32 maps (11x11) Max, fc1 60, fc2 10, softmax 10).

The learning is accelerated on GPU using the NVIDIA® cuDNN framework, integrated into the toolflow. Using GPU acceleration, learning times can typically be reduced by two orders of magnitude, enabling the learning of large databases within tens of minutes to a few hours, instead of several days or weeks for non-GPU accelerated learning.

1.4 Performances evaluation

The software automatically outputs all the information needed for the analysis of the applicative performance of the network, such as the recognition rate and the validation score during the learning; the confusion matrix during learning, validation and test; the memory and computation requirements of the network; the output maps activity for each layer; and so on, as shown in figure 2.

Figure 2: Example of information automatically generated by the software during and after learning (recognition rate and validation score, confusion matrix, memory and computation requirements, output maps activity).

1.5 Hardware exports

Once the recognition rate of the learned DNN is satisfactory, an optimized version of the network can be automatically exported for various embedded targets. Automated benchmarking of the network computation performance can also be performed across the different targets. The following targets are currently supported by the toolflow:
• Plain C code (no dynamic memory allocation, no floating point processing);
• C code accelerated with OpenMP;
• C code tailored for High-Level Synthesis (HLS) with Xilinx® Vivado® HLS:
  - Direct synthesis to FPGA, with timing and utilization after routing;
  - Possibility to constrain the maximum number of clock cycles desired to compute the whole network;
  - FPGA utilization vs number of clock cycles trade-off analysis;
• OpenCL code optimized for either CPU/DSP or GPU;
• CUDA kernels, cuDNN and TensorRT code optimized for NVIDIA® GPUs.
Different automated optimizations are embedded in the exports:
• DNN weights and signal data precision reduction (down to 8-bit integers or less for custom FPGA IPs);
• Non-linear network activation function approximations;
• Different weight discretization methods.

The exports are generated automatically and come with a Makefile and a working testbench, including the pre-processed testing dataset. Once generated, the testbench is ready to be compiled and executed on the target platform. The applicative performance (recognition rate) as well as the computing time per input data can then be directly measured by the testbench.

Figure 3: Example of network benchmarking on different hardware targets (Kpixels image / s, log scale, for OpenMP, OpenCL, CUDA and HLS FPGA targets).

Figure 3 shows an example of benchmarking results of the previous DNN on different targets (in log scale). Compared to desktop CPUs, the number of input image pixels processed per second is more than one order of magnitude higher with GPUs and at least two orders of magnitude higher with a synthesized DNN on FPGA.

1.6 Summary

The N2D2 platform is today a complete and production-ready neural network building tool, which does not require advanced knowledge in deep learning to be used. It is tailored for fast neural network application generation and porting with minimum overhead in terms of database creation and management, data pre-processing, network configuration and optimized code generation, turning what would otherwise be months of manual porting and verification effort into a single automated step in the tool.

2 About N2D2-IP

While N2D2 is our deep learning open-source core framework, some modules, referred to as "N2D2-IP" in this manual, are only available through a custom license agreement with CEA LIST. If you are interested in obtaining some of these modules, please contact our business developer for more information on available licensing options:

Sandrine VARENNE (Sandrine.VARENNE@cea.fr)

In addition to N2D2-IP modules, we can also provide our expertise to design specific solutions for integrating DNNs in embedded hardware systems, where power, latency, form factor and/or cost are constrained. We can target CPU/DSP/GPU COTS hardware as well as our own PNeuro (programmable) and DNeuro (dataflow) dedicated hardware accelerator IPs for DNN on FPGA or ASIC.

3 Performing simulations

3.1 Obtaining the latest version of this manual

Before going further, please make sure you are reading the latest version of this manual. It is located in the manual sub-directory. To compile the manual to PDF, just run the following command:

cd manual && make

In order to compile the manual, you must have pdflatex and bibtex installed, as well as some common LaTeX packages.
• On Ubuntu, this can be done by installing the texlive and texlive-latex-extra software packages.
• On Windows, you can install the MiKTeX software, which includes everything needed and will install the required LaTeX packages on the fly.
3.2 Minimum system requirements

• Supported processors:
  - ARM Cortex A15 (tested on Tegra K1);
  - ARM Cortex A53/A57 (tested on Tegra X1);
  - Pentium-compatible PC (Pentium III, Athlon or more recent system recommended).
• Supported operating systems:
  - Windows ≥ 7 or Windows Server ≥ 2012, 64 bits, with Visual Studio ≥ 2015.2 (2015 Update 2);
  - GNU/Linux with GCC ≥ 4.4 (tested on RHEL ≥ 6, Debian ≥ 6, Ubuntu ≥ 14.04).
• At least 256 MB of RAM (1 GB with GPU/CUDA) for MNIST dataset processing.
• At least 150 MB of available hard disk space + 350 MB for MNIST dataset processing.

For CUDA acceleration:
• CUDA ≥ 6.5 and cuDNN ≥ 1.0;
• NVIDIA GPU with CUDA compute capability ≥ 3 (starting from the Kepler micro-architecture);
• At least 512 MB of GPU RAM for MNIST dataset processing.

3.3 Obtaining N2D2

3.3.1 Prerequisites

Red Hat Enterprise Linux (RHEL) 6
Make sure you have the following packages installed:
• cmake
• gnuplot
• opencv
• opencv-devel (may require the rhel-x86_64-workstation-optional-6 repository channel)

Plus, to be able to use GPU acceleration:
• Install the CUDA repository package:
rpm -Uhv http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.5-18.x86_64.rpm
yum clean expire-cache
yum install cuda
• Install cuDNN from the NVIDIA website: register to NVIDIA Developer and download the latest version of cuDNN. Simply copy the header and library files from the cuDNN archive to the corresponding directories in the CUDA installation path (by default: /usr/local/cuda/include and /usr/local/cuda/lib64, respectively).
• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.

Ubuntu
Make sure you have the following packages installed, if they are available on your Ubuntu version:
• cmake
• gnuplot
• libopencv-dev
• libcv-dev
• libhighgui-dev

Plus, to be able to use GPU acceleration:
• Install the CUDA repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
• Install the cuDNN repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
dpkg -i nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
Note that the cuDNN repository package is provided by NVIDIA for Ubuntu starting from version 14.04.
• Update the package lists: apt-get update
• Install the CUDA and cuDNN required packages:
apt-get install cuda-core-7-5 cuda-cudart-dev-7-5 cuda-cublas-dev-7-5 cuda-curand-dev-7-5 libcudnn5-dev
• Make sure there is a symlink to /usr/local/cuda:
ln -s /usr/local/cuda-7.5 /usr/local/cuda
• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.

Windows
On Windows 64 bits, Visual Studio ≥ 2015.2 (2015 Update 2) is required. Make sure you have the following software installed:
• CMake (http://www.cmake.org/): download and run the Windows installer.
• dirent.h C++ header (https://github.com/tronkko/dirent): to be put in the Visual Studio include path.
• Gnuplot (http://www.gnuplot.info/): the bin sub-directory in the install path needs to be added to the Windows PATH environment variable.
• OpenCV (http://opencv.org/): download the latest 2.x version for Windows and extract it to, for example, C:\OpenCV\.
Make sure to define the environment variable OpenCV_DIR to point to C:\OpenCV\opencv\build. Make sure to add the bin sub-directory (C:\OpenCV\opencv\build\x64\vc12\bin) to the Windows PATH environment variable.

Plus, to be able to use GPU acceleration:
• Download and install the CUDA toolkit 8.0 located at https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_windows-exe:
rename cuda_8.0.44_windows-exe cuda_8.0.44_windows.exe
cuda_8.0.44_windows.exe -s compiler_8.0 cublas_8.0 cublas_dev_8.0 cudart_8.0 curand_8.0 curand_dev_8.0
• Update the PATH environment variable:
set PATH=%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin;%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\libnvvp;%PATH%
• Download and install cuDNN v5.1 for CUDA 8.0, located at http://developer.download.nvidia.com/compute/redist/cudnn/v5.1/cudnn-8.0-windows7-x64-v5.1.zip (the following commands assume that you have 7-Zip installed):
7z x cudnn-8.0-windows7-x64-v5.1.zip
copy cuda\include\*.* ^
 "%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include\"
copy cuda\lib\x64\*.* ^
 "%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64\"
copy cuda\bin\*.* ^
 "%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\"

3.3.2 Getting the sources

Use the following command:

git clone git@github.com:CEA-LIST/N2D2.git

3.3.3 Compilation

To compile the program:

mkdir build
cd build
cmake .. && make

On Windows, you may have to specify the generator, for example:

cmake .. -G"Visual Studio 14"

Then open the newly created N2D2 project in Visual Studio 2015. Select "Release" for the build target. Right click on the ALL_BUILD item and select "Build".

3.4 Downloading training datasets

A Python script, located in the tools sub-directory of the repository, allows you to select and automatically download some well-known datasets, like MNIST and GTSRB (the script requires Python 2.x with the GTK 2 bindings):

./tools/install_stimuli_gui.py

By default, the datasets are downloaded in the path specified in the N2D2_DATA environment variable, which is the root path used by the N2D2 tool to locate the databases. If the N2D2_DATA variable is not set, the default value used is /local/$USER/n2d2_data/ (or /local/n2d2_data/ if the USER environment variable is not set) on Linux and C:\n2d2_data\ on Windows. Please make sure you have write access to the N2D2_DATA path, or, if it is not set, to the default /local/$USER/n2d2_data/ path.

3.5 Run the learning

The following command will run the learning for 600,000 image presentations/steps and log the performance of the network every 10,000 steps:

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -learn 600000 -log 10000

Note: you may want to check the gradient computation using the -check option. Be aware that it can be extremely long and can occasionally fail if the required precision is too high.

3.6 Test a learned network

After the learning is completed, this command evaluates the network performance on the test data set:

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -test

3.6.1 Interpreting the results

Recognition rate
The recognition rate and the validation score are reported during the learning in the TargetScore_*/Success_validation.png file, as shown in figure 4.

Figure 4: Recognition rate and validation score during learning.

Confusion matrix
The software automatically outputs the confusion matrix during learning, validation and test, with an example shown in figure 5. Each row of the matrix contains the number of occurrences estimated by the network for each label, for all the data corresponding to a single actual, target label. Or equivalently, each column of the matrix contains the number of actual, target label occurrences corresponding to the same estimated label. Ideally, the matrix should be diagonal, with no occurrence of an estimated label for a different actual label (network mistake).
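As an illustration with made-up numbers, consider a two-class case where 350 stimuli of target label 0 and 125 stimuli of target label 1 are tested; if the network misclassifies 10 stimuli of label 0 as label 1 and 5 stimuli of label 1 as label 0, the confusion matrix reads:

| T \ E |   0 |   1 |
|   0   | 340 |  10 |
|   1   |   5 | 120 |
T: Target
E: Estimated

The off-diagonal entries count the network mistakes.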
The confusion matrix reports can be found in the simulation directory:
• TargetScore_*/ConfusionMatrix_learning.png;
• TargetScore_*/ConfusionMatrix_validation.png;
• TargetScore_*/ConfusionMatrix_test.png.

Figure 5: Example of confusion matrix obtained after the learning.

Memory and computation requirements
The software also reports the memory and computation requirements of the network, as shown in figure 6. The corresponding report can be found in the stats sub-directory of the simulation.

Figure 6: Example of memory and computation requirements of the network.

Kernels and weights distribution
The synaptic weights obtained during and after the learning can be analyzed, in terms of distribution (weights sub-directory of the simulation) or in terms of kernels (kernels sub-directory of the simulation), as shown in figure 7.

Figure 7: Example of kernels and weights distribution analysis for two convolutional layers (conv1 kernels, conv2 kernels, conv1 weights distribution, conv2 weights distribution).

Output maps activity
The initial output maps activity for each layer can be visualized in the outputs_init sub-directory of the simulation, as shown in figure 8.

Figure 8: Output maps activity example of the first convolutional layer of the network.

3.7 Export a learned network

A learned network is exported with the -export option, for example:

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -export CPP_OpenCL

Export types:
• C: C export using OpenMP;
• C_HLS: C export tailored for HLS with Vivado HLS;
• CPP_OpenCL: C++ export using OpenCL;
• CPP_Cuda: C++ export using CUDA;
• CPP_cuDNN: C++ export using cuDNN;
• CPP_TensorRT: C++ export using the TensorRT 2.1 API;
• SC_Spike: SystemC spike export.

Other program options related to the exports:
-nbbits [8]: Number of bits for the weights and signals. Must be 8, 16, 32 or 64 for integer export, or -32, -64 for floating point export. The number of bits can be arbitrary for the C_HLS export (for example, 6 bits).
-calib [0]: Number of stimuli used for the calibration. 0 = no calibration (default), -1 = use the full test dataset for calibration.
-calib-passes [2]: Number of KL passes for determining the layer output values distribution truncation threshold (0 = use the max. value, no truncation).
-no-unsigned: If present, disable the use of unsigned data types in integer exports.
-db-export [-1]: Max. number of stimuli to export (0 = no dataset export, -1 = unlimited).

3.7.1 C export (N2D2-IP only)

Test the exported network:

cd export_C_int8
make
./bin/n2d2_test

The result should look like:

...
1652.00/1762 (avg = 93.757094%)
1653.00/1763 (avg = 93.760635%)
1654.00/1764 (avg = 93.764172%)
Tested 1764 stimuli
Success rate = 93.764172%
Process time per stimulus = 187.548186 us (12 threads)

Confusion matrix:
-------------------------------------------------
| T \ E |      0 |      1 |      2 |      3 |
-------------------------------------------------
|     0 |    329 |      1 |      5 |      2 |
|       | 97.63% |  0.30% |  1.48% |  0.59% |
|     1 |      0 |    692 |      2 |      6 |
|       |  0.00% | 98.86% |  0.29% |  0.86% |
|     2 |     11 |     27 |    609 |     55 |
|       |  1.57% |  3.85% | 86.75% |  7.83% |
|     3 |      0 |      0 |      1 |     24 |
|       |  0.00% |  0.00% |  4.00% | 96.00% |
-------------------------------------------------
T: Target
E: Estimated

3.7.2 CPP_OpenCL export (N2D2-IP only)

The OpenCL export can run the generated program on GPU or CPU architectures. Compilation features:

PROFILING [0]: Compile the binary with a synchronization between each layer and report the mean execution time of each layer. This preprocessor option can decrease performance.
GENERATE_KBIN [0]: Generate the binary output of the OpenCL kernel .cl file used. The binary is stored in the /bin folder.
LOAD_KBIN [0]: Instruct the program to load an OpenCL kernel as a binary from the /bin folder instead of a .cl file.
CUDA [1]: Use the CUDA OpenCL SDK located at /usr/local/cuda.
MALI [0]: Use the MALI OpenCL SDK located at /usr/Mali_OpenCL_SDK_vXXX.
INTEL [0]: Use the INTEL OpenCL SDK located at /opt/intel/opencl.
AMD [0]: Use the AMD OpenCL SDK located at /opt/AMDAPPSDK-XXX.

Program options related to the OpenCL export:
-cpu: If present, force the use of a CPU architecture to run the program.
-gpu: If present, force the use of a GPU architecture to run the program.
-batch [1]: Size of the batch to use.
-stimulus [NULL]: Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:

cd export_CPP_OpenCL_float32
make
./bin/n2d2_opencl_test -gpu

3.7.3 CPP_TensorRT export

The TensorRT 2.1 API export can run the generated program on NVIDIA GPU architectures. It uses CUDA and the TensorRT 2.1 API library. The layers currently supported by the TensorRT 2.1 export are: Convolutional, Pooling, Concatenation, Fully-Connected, Softmax and all activation types. Custom layer implementation through the plugin factory and generic 8-bit calibrated inference are under development.

Program options related to the TensorRT 2.1 API export:
-batch [1]: Size of the batch to use.
-dev [0]: CUDA device ID selection.
-stimulus [NULL]: Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.
-prof: Activate the layer-wise profiling mechanism. This option can decrease execution time performance.
-iter-build [1]: Set the number of minimization build iterations done by the TensorRT builder to find the best layer tactics.

Test the exported network with layer-wise profiling:

cd export_CPP_TensorRT_float32
make
./bin/n2d2_tensorRT_test -prof

The results of the layer-wise profiling should look like:

(19%) **************************************** CONV1 + CONV1_ACTIVATION: 0.0219467 ms
(05%) ************ POOL1: 0.00675573 ms
(13%) **************************** CONV2 + CONV2_ACTIVATION: 0.0159089 ms
(05%) ************ POOL2: 0.00616047 ms
(14%) ****************************** CONV3 + CONV3_ACTIVATION: 0.0159713 ms
(19%) **************************************** FC1 + FC1_ACTIVATION: 0.0222242 ms
(13%) **************************** FC2: 0.0149013 ms
(08%) ***************** SOFTMAX: 0.0100633 ms
Average profiled TensorRT process time per stimulus = 0.113932 ms
3.7.4 CPP_cuDNN export

The cuDNN export can run the generated program on NVIDIA GPU architectures. It uses the CUDA and cuDNN libraries. Compilation features:

PROFILING [0]: Compile the binary with a synchronization between each layer and report the mean execution time of each layer. This preprocessor option can decrease performance.
ARCH32 [0]: Compile the binary with 32-bit architecture compatibility.

Program options related to the cuDNN export:
-batch [1]: Size of the batch to use.
-dev [0]: CUDA device ID selection.
-stimulus [NULL]: Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:

cd export_CPP_cuDNN_float32
make
./bin/n2d2_cudnn_test

3.7.5 C_HLS export (N2D2-IP only)

Test the exported network:

cd export_C_HLS_int8
make
./bin/n2d2_test

Run the High-Level Synthesis (HLS) with Xilinx® Vivado® HLS:

vivado_hls -f run_hls.tcl

4 INI file interface

The INI file interface is the primary way of using N2D2. It is a simple, lightweight and user-friendly format for specifying a complete DNN-based application, including dataset instantiation, data pre-processing, neural network layers instantiation and post-processing, with all its hyperparameters.

4.1 Syntax

INI files are simple text files with a basic structure composed of sections, properties and values.

4.1.1 Properties

The basic element contained in an INI file is the property. Every property has a name and a value, delimited by an equals sign (=). The name appears to the left of the equals sign.

name=value

4.1.2 Sections

Properties may be grouped into arbitrarily named sections. The section name appears on a line by itself, in square brackets ([ and ]). All properties after the section declaration are associated with that section. There is no explicit "end of section" delimiter; sections end at the next section declaration, or at the end of the file. Sections may not be nested.

[section]
a=a
b=b

4.1.3 Case sensitivity

Section and property names are case sensitive.

4.1.4 Comments

A semicolon (;) or a number sign (#) at the beginning or in the middle of a line indicates a comment. Comments are ignored.

; comment text
a=a # comment text
a="a ; not a comment" ; comment text

4.1.5 Quoted values

Values can be quoted, using double quotes. This allows for explicit declaration of whitespace, and/or for quoting of special characters (equals, semicolon, etc.).

4.1.6 Whitespace

Leading and trailing whitespace on a line are ignored.

4.1.7 Escape characters

A backslash (\) followed immediately by EOL (end-of-line) causes the line break to be ignored.
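The following minimal snippet, with section and property names chosen purely for illustration, combines the rules above (comments, quoted values and line continuation):

; a hypothetical section illustrating the syntax rules
[example]
Name=value ; an inline comment
Quoted="a value ; with a semicolon kept in the value"
Long=a value continued \
on the next line # the line break above is ignored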
4.2 Template inclusion syntax

It is possible to recursively include templated INI files. For example, the main INI file can include a templated file like the following:

[inception@inception_model.ini.tpl]
INPUT=layer_x
SIZE=32
ARRAY=2 ; Must be the number of elements in the array
ARRAY[0].P1=Conv
ARRAY[0].P2=32
ARRAY[1].P1=Pool
ARRAY[1].P2=64

If the inception_model.ini.tpl template file content is:

[{{SECTION_NAME}}_layer1]
Input={{INPUT}}
Type=Conv
NbOutputs={{SIZE}}

[{{SECTION_NAME}}_layer2]
Input={{SECTION_NAME}}_layer1
Type=Fc
NbOutputs={{SIZE}}

{% block ARRAY %}
[{{SECTION_NAME}}_array{{#}}]
Prop1=Config{{.P1}}
Prop2={{.P2}}
{% endblock %}

the resulting equivalent content for the main INI file will be:

[inception_layer1]
Input=layer_x
Type=Conv
NbOutputs=32

[inception_layer2]
Input=inception_layer1
Type=Fc
NbOutputs=32

[inception_array0]
Prop1=ConfigConv
Prop2=32

[inception_array1]
Prop1=ConfigPool
Prop2=64

The SECTION_NAME template parameter is automatically generated from the name of the including section (before the @).

4.2.1 Variable substitution

{{VAR}} is replaced by the value of the VAR template parameter.

4.2.2 Control statements

Control statements are placed between {% and %} delimiters.

block
{% block ARRAY %} ... {% endblock %}
The # template parameter is automatically generated from the {% block ... %} template control statement and corresponds to the current item position, starting from 0.

for
{% for VAR in range([START, ]END) %} ... {% endfor %}
If START is not specified, the loop begins at 0 (first value of VAR). The last value of VAR is END-1.

if
{% if VAR OP [VALUE] %} ... [{% else %}] ... {% endif %}
OP may be ==, !=, exists or not_exists.

include
{% include FILENAME %}

4.3 Global parameters

DefaultModel [Transcode]: Default layers model. Can be Frame, Frame_CUDA, Transcode or Spike.
DefaultDataType [Float32]: Default layers data type. Can be Float16, Float32 or Float64.
SignalsDiscretization [0]: Number of levels for signal discretization.
FreeParametersDiscretization [0]: Number of levels for weights discretization.

4.4 Databases

The tool integrates pre-defined modules for several well-known databases used in the deep learning community, such as MNIST, GTSRB, CIFAR10 and so on. That way, no extra step is necessary to be able to directly build a network and learn it on these databases.

4.4.1 MNIST

MNIST (LeCun et al., 1998) is already split into a learning set and a testing set, with:
• 60,000 digits in the learning set;
• 10,000 digits in the testing set.

Example:

[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Validation [0.0]: Fraction of the learning set used for validation.
DataPath [$N2D2_DATA/mnist]: Path to the database.

4.4.2 GTSRB

GTSRB (Stallkamp et al., 2012) is already split into a learning set and a testing set, with:
• 39,209 images in the learning set;
• 12,630 images in the testing set.

Example:

[database]
Type=GTSRB_DIR_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Validation [0.0]: Fraction of the learning set used for validation.
DataPath [$N2D2_DATA/GTSRB]: Path to the database.

4.4.3 Directory

Hand-made databases stored in file directories are directly supported with the DIR_Database module.
For example, suppose your database is organized as follows (in the path specified in the N2D2_DATA environment variable):
• GST/airplanes: 800 images
• GST/car_side: 123 images
• GST/Faces: 435 images
• GST/Motorbikes: 798 images

You can then instantiate this database as input of your neural network using the following parameters:

[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4 ; 40% of images of the smallest category = 49 (0.4x123) images for each category will be used for learning
Validation=0.2 ; 20% of images of the smallest category = 25 (0.2x123) images for each category will be used for validation
; the remaining images will be used for testing

Each sub-directory will be treated as a different label, so there will be 4 different labels, named after the directory names. The stimuli are equi-partitioned for the learning set and the validation set, meaning that the same number of stimuli for each category is used. If the learn fraction is 0.4 and the validation fraction is 0.2, as in the example above, the partitioning will be the following:

Label ID  Label name   Learn set  Validation set  Test set
0         airplanes    49         25              726
1         car_side     49         25              49
2         Faces        49         25              361
3         Motorbikes   49         25              724
Total:                 196        100             1860

Options of the DIR_Database module (mandatory options have no default value):

DataPath (mandatory): Path to the root stimuli directory.
Learn (mandatory): If PerLabelPartitioning is true, fraction of images used for the learning; else, number of images used for the learning, regardless of their labels.
LoadInMemory [0]: Load the whole database into memory.
Depth [1]: Number of sub-directory levels to include. Examples: Depth = 0: load stimuli only from the current directory (DataPath); Depth = 1: load stimuli from DataPath and the sub-directories of DataPath; Depth < 0: load stimuli recursively from DataPath and all its sub-directories.
LabelName []: Base stimuli label name.
LabelDepth [1]: Number of sub-directory name levels used to form the stimuli labels. Examples: LabelDepth = -1: no label for all stimuli (label ID = -1); LabelDepth = 0: uses LabelName for all stimuli; LabelDepth = 1: uses LabelName for stimuli in the current directory (DataPath) and LabelName/sub-directory name for stimuli in the sub-directories.
PerLabelPartitioning [1]: If true, the stimuli are equi-partitioned for the learn/validation/test sets, meaning that the same number of stimuli for each label is used.
Validation [0.0]: If PerLabelPartitioning is true, fraction of images used for the validation; else, number of images used for the validation, regardless of their labels.
Test [1.0-Learn-Validation]: If PerLabelPartitioning is true, fraction of images used for the test; else, number of images used for the test, regardless of their labels.
ValidExtensions []: List of space-separated valid stimulus file extensions (if left empty, any file extension is considered a valid stimulus).
LoadMore []: Name of another section with the same options, used to load a different DataPath.
ROIFile []: File containing the stimuli ROIs. If a ROI file is specified, LabelDepth should be set to -1.
DefaultLabel []: Label name for pixels outside any ROI (by default there is no label and those pixels are ignored).
ROIsMargin [0]: Number of pixels around ROIs that are ignored (and not considered as DefaultLabel pixels).
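Putting several of these options together, a DIR_Database section might look like the following sketch; the directory name, extensions and fractions are purely illustrative:

[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/my_dataset ; hypothetical dataset root
Depth=1 ; load stimuli from DataPath and its direct sub-directories
LabelDepth=1 ; one label per sub-directory name
ValidExtensions=jpg png ; only consider JPEG and PNG files
PerLabelPartitioning=1
Learn=0.6
Validation=0.2
; Test defaults to 1.0-Learn-Validation = 0.2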
To load and partition more than one DataPath, one can use the LoadMore option:

[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.6
Validation=0.4
LoadMore=database.test

; Load stimuli from the "GST_Test" path in the test dataset
[database.test]
DataPath=${N2D2_DATA}/GST_Test
Learn=0.0
Test=1.0
; The LoadMore option is recursive:
; LoadMore=database.more

; [database.more]
; Load even more data here

4.4.4 Other built-in databases

CIFAR10_Database
CIFAR10 database (Krizhevsky, 2009).
Validation [0.0]: Fraction of the learning set used for validation.
DataPath [$N2D2_DATA/cifar-10-batches-bin]: Path to the database.
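As with MNIST and GTSRB above, a built-in database is selected simply by its Type; a minimal, illustrative setup for CIFAR10 could be:

[database]
Type=CIFAR10_Database
Validation=0.1 ; illustrative fraction of the learning set held out for validation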
CIFAR100_Database
CIFAR100 database (Krizhevsky, 2009).
Validation [0.0]: Fraction of the learning set used for validation.
UseCoarse [0]: If true, use the coarse labeling (20 coarse labels instead of 100).
DataPath [$N2D2_DATA/cifar-100-binary]: Path to the database.

CKP_Database
The Extended Cohn-Kanade (CK+) database for expression recognition (Lucey et al., 2010).
Learn: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
DataPath [$N2D2_DATA/cohn-kanade-images]: Path to the database.

Caltech101_DIR_Database
Caltech 101 database (Fei-Fei et al., 2004).
Learn: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
IncClutter [0]: If true, includes the BACKGROUND_Google directory of the database.
DataPath [$N2D2_DATA/101_ObjectCategories]: Path to the database.

Caltech256_DIR_Database
Caltech 256 database (Griffin et al., 2007).
Learn: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
IncClutter [0]: If true, includes the BACKGROUND_Google directory of the database.
DataPath [$N2D2_DATA/256_ObjectCategories]: Path to the database.

CaltechPedestrian_Database
Caltech Pedestrian database (Dollár et al., 2009).
Note that the images and annotations must first be extracted from the seq video data located in the videos directory, using the dbExtract.m Matlab tool provided in the "Matlab evaluation/labeling code" downloadable on the dataset website.
Assuming the following directory structure (in the path specified in the N2D2_DATA environment variable):
• CaltechPedestrians/data-USA/videos/... (from the setxx.tar files)
• CaltechPedestrians/data-USA/annotations/... (from the setxx.tar files)
• CaltechPedestrians/tools/piotr_toolbox/toolbox (from the Piotr's Matlab Toolbox archive)
• CaltechPedestrians/*.m including dbExtract.m (from the Matlab evaluation/labeling code)

use the following commands in Matlab to generate the images and annotations:

cd([getenv('N2D2_DATA') '/CaltechPedestrians'])
addpath(genpath('tools/piotr_toolbox/toolbox')) % add the Piotr's Matlab Toolbox to the Matlab path
dbInfo('USA')
dbExtract()

Validation [0.0]: Fraction of the learning set used for validation.
SingleLabel [1]: Use the same label for the "person" and "people" bounding boxes.
IncAmbiguous [0]: Include ambiguous bounding boxes labeled "person?", using the same label as "person".
DataPath [$N2D2_DATA/CaltechPedestrians/data-USA/images]: Path to the database images.
LabelPath [$N2D2_DATA/CaltechPedestrians/data-USA/annotations]: Path to the database annotations.

Cityscapes_Database
Cityscapes database (Cordts et al., 2016).
IncTrainExtra [0]: If true, includes the left 8-bit images - trainextra set (19,998 images).
UseCoarse [0]: If true, only use coarse annotations (which are the only annotations available for the trainextra set).
SingleInstanceLabels [1]: If true, convert group labels to single instance labels (for example, cargroup becomes car).
DataPath [$N2D2_DATA/Cityscapes/leftImg8bit, or $CITYSCAPES_DATASET if defined]: Path to the database images.
LabelPath []: Path to the database annotations (deduced from DataPath if left empty).

Daimler_Database
Daimler Monocular Pedestrian Detection Benchmark (Daimler Pedestrian).
Learn [1.0]: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
Test [0.0]: Fraction of images used for the test.
Fully [0]: When activated, the test dataset is used for learning. Use only in fully-CNN mode.

DOTA_Database
DOTA database (Xia et al., 2017).
Learn: Fraction of images used for the learning.
DataPath [$N2D2_DATA/DOTA]: Path to the database.
LabelPath []: Path to the database labels list file.

FDDB_Database
Face Detection Data Set and Benchmark (FDDB) (Jain and Learned-Miller, 2010).
Learn: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
DataPath [$N2D2_DATA/FDDB]: Path to the images (decompressed originalPics.tar.gz).
LabelPath [$N2D2_DATA/FDDB]: Path to the annotations (decompressed FDDB-folds.tgz).

GTSDB_DIR_Database
GTSDB database (Houben et al., 2013).
Learn: Fraction of images used for the learning.
Validation [0.0]: Fraction of images used for the validation.
DataPath [$N2D2_DATA/FullIJCNN2013]: Path to the database.

ILSVRC2012_Database
ILSVRC2012 database (Russakovsky et al., 2015).
Learn: Fraction of images used for the learning.
DataPath [$N2D2_DATA/ILSVRC2012]: Path to the database.
LabelPath [$N2D2_DATA/ILSVRC2012/synsets.txt]: Path to the database labels list file.

KITTI_Database
The KITTI database provides ROIs which can be used for autonomous driving and environment perception. The database provides 8 different labeled classes. Utilization of the KITTI database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_tracking.php and download the left color images (15 GB) and the training labels of the tracking data set (9 MB). Extract the downloaded archives in your $N2D2_DATA/KITTI folder.
Learn [0.8]: Fraction of images used for the learning.
Validation [0.2]: Fraction of images used for the validation.

KITTI_Road_Database
The KITTI Road database provides ROIs which can be used for road segmentation. The dataset provides 1 labeled class (road) on 289 training images. The 290 test images are not labeled. Utilization of the KITTI Road database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_road.php and download the "base kit" (0.5 GB) with left color images, calibration and training labels.
Extract the downloaded archive in your $N2D2_DATA/KITTI folder.

Option [default value]: Description
Learn [0.8]: Fraction of images used for the learning
Validation [0.2]: Fraction of images used for the validation

KITTI_Object_Database
The KITTI Object Database provides ROIs which can be used for autonomous driving and environment perception. The database provides 8 different labeled classes on 7481 training images. The 7518 test images are not labeled. The whole database provides 80256 labeled objects. Utilization of the KITTI Object Database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark and download the "left color images" (12 GB) and the training labels of the object data set (5 MB). Extract the downloaded archives in your $N2D2_DATA/KITTI_Object folder.

Option [default value]: Description
Learn [0.8]: Fraction of images used for the learning
Validation [0.2]: Fraction of images used for the validation

LITISRouen_Database
LITIS Rouen audio scene dataset (Rakotomamonjy and Gasso, 2014).

Option [default value]: Description
Learn [0.4]: Fraction of images used for the learning
Validation [0.4]: Fraction of images used for the validation
DataPath [$N2D2_DATA/data_rouen]: Path to the database

4.4.5 Dataset images slicing
It is possible to automatically slice images from a dataset, with a given slice size and stride, using the .slicing attribute. This effectively increases the number of stimuli in the set.

[database.slicing]
ApplyTo=NoLearn
Width=2048
Height=1024
StrideX=2048
StrideY=1024
RandomShuffle=1 ; 1 is the default value

The RandomShuffle option, enabled by default, randomly shuffles the dataset after slicing. If disabled, the slices are added in order at the end of the dataset.

4.5 Stimuli data analysis
You can enable stimuli data reporting with the following section (the name of the section must start with env.StimuliData):

[env.StimuliData-raw]
ApplyTo=LearnOnly
LogSizeRange=1
LogValueRange=1

The stimuli data reported for the full MNIST learning set will look like:

env.StimuliData-raw data:
  Number of stimuli: 60000
  Data width range: [28, 28]
  Data height range: [28, 28]
  Data channels range: [1, 1]
  Value range: [0, 255]
  Value mean: 33.3184
  Value std. dev.: 78.5675

4.5.1 Zero-mean and unity standard deviation normalization
It is possible to normalize the whole database to have zero mean and unity standard deviation on the learning set using a RangeAffineTransformation transformation:

; Stimuli normalization based on learning set global mean and std.dev.
[env.Transformation-normalize]
Type=RangeAffineTransformation
FirstOperator=Minus
FirstValue=[env.StimuliData-raw]_GlobalValue.mean
SecondOperator=Divides
SecondValue=[env.StimuliData-raw]_GlobalValue.stdDev

The variables _GlobalValue.mean and _GlobalValue.stdDev are automatically generated in the [env.StimuliData-raw] block. Thanks to this facility, an unknown and arbitrary database can be analysed and normalized in one single step without requiring any external data manipulation. After normalization, the reported stimuli data is:

env.StimuliData-normalized data:
  Number of stimuli: 60000
  Data width range: [28, 28]
  Data height range: [28, 28]
  Data channels range: [1, 1]
  Value range: [-0.424074, 2.82154]
  Value mean: 2.64796e-07
  Value std. dev.: 1

We can check that the global mean is close to 0 and the standard deviation is 1 on the whole dataset. The result of the transformation on the first images of the set can be checked in the generated frames folder, as shown in figure 9.

4.5.2 Subtracting the mean image of the set
Using the StimuliData object followed by an AffineTransformation, it is also possible to use the mean image of the dataset to normalize the data:

[env.StimuliData-meanData]
ApplyTo=LearnOnly
MeanData=1 ; Provides the _MeanData parameter used in the transformation

[env.Transformation]
Type=AffineTransformation
FirstOperator=Minus
FirstValue=[env.StimuliData-meanData]_MeanData

Figure 9: Image of the set after normalization.

The resulting global mean image can be visualized in env.StimuliData-meanData/meanData.bin.png and is shown in figure 10.

Figure 10: Global mean image generated by StimuliData with the MeanData parameter enabled.

After this transformation, the reported stimuli data becomes:

env.StimuliData-processed data:
  Number of stimuli: 60000
  Data width range: [28, 28]
  Data height range: [28, 28]
  Data channels range: [1, 1]
  Value range: [-139.554, 254.979]
  Value mean: -3.45583e-08
  Value std. dev.: 66.1288

The result of the transformation on the first images of the set can be checked in the generated frames folder, as shown in figure 11.

Figure 11: Image of the set after the AffineTransformation subtracting the global mean image (keep in mind that the original image value range is [0, 255]).

4.6 Environment
The environment simply specifies the input data format of the network (width, height and batch size). Example:

[env]
SizeX=24
SizeY=24
BatchSize=12 ; [default: 1]

Option [default value]: Description
SizeX: Environment width
SizeY: Environment height
NbChannels [1]: Number of channels (applicable only if there is no env.ChannelTransformation[...])
BatchSize [1]: Batch size
CompositeStimuli [0]: If true, use pixel-wise stimuli labels
CachePath []: Stimuli cache path (no cache if left empty)
StimulusType [SingleBurst]: Method for converting stimuli into spike trains. Can be any of SingleBurst, Periodic, JitteredPeriodic or Poissonian
DiscardedLateStimuli [1.0]: The pixels in the pre-processed stimuli with a value above this limit never generate spiking events
PeriodMeanMin [50 TimeMs]: Mean minimum period Tmin, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 0 (which are supposed to be the most significant pixels)
PeriodMeanMax [12 TimeS]: Mean maximum period Tmax, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 1 (which are supposed to be the least significant pixels). This maximum period may never be reached if DiscardedLateStimuli is lower than 1.0
PeriodRelStdDev [0.1]: Relative standard deviation, used for periodic temporal codings, applied to the spiking period of a pixel
PeriodMin [11 TimeMs]: Absolute minimum period, or spiking interval, used for periodic temporal codings, for any pixel

4.6.1 Built-in transformations
There are 6 possible categories of transformations:
• env.Transformation[...] Transformations applied to the input images before channels creation;
• env.OnTheFlyTransformation[...] On-the-fly transformations applied to the input images before channels creation;
• env.ChannelTransformation[...]
Create or add transformation for a specific channel; • env.ChannelOnTheFlyTransformation[...] Create or add on-the-fly transformation for a specific channel; • env.ChannelsTransformation[...] Transformations applied to all the channels of the input images; • env.ChannelsOnTheFlyTransformation[...] On-the-fly transformations applied to all the channels of the input images. Example: [env.Transformation] Type=PadCropTransformation Width=24 Height=24 Several transformations can applied successively. In this case, to be able to apply multiple transformations of the same category, a different suffix ([...]) must be added to each transformation. The transformations will be processed in the order of appearance in the INI file regardless of their suffix. Common set of parameters for any kind of transformation: Option [default value] ApplyTo [All] Description Apply the transformation only to the specified stimuli sets. Can be: LearnOnly: learning set only ValidationOnly: validation set only TestOnly: testing set only 33/82 NoLearn: validation and testing sets only learning and testing sets only NoTest: learning and validation sets only All: all sets (default) NoValidation: Example: [env.Transformation-1] Type=ChannelExtractionTransformation CSChannel=Gray [env.Transformation-2] Type=RescaleTransformation Width=29 Height=29 [env.Transformation-3] Type=EqualizeTransformation [env.OnTheFlyTransformation] Type=DistortionTransformation ApplyTo=LearnOnly ; Apply this transformation for the Learning set only ElasticGaussianSize=21 ElasticSigma=6.0 ElasticScaling=20.0 Scaling=15.0 Rotation=15.0 List of available transformations: AffineTransformation Apply an element-wise affine transformation to the image with matrixes of the same size. Option [default value] FirstOperator Description First element-wise operator, can be Plus, Minus, Multiplies, Divides FirstValue SecondOperator [Plus] First matrix file name Second element-wise operator, can be Plus, Minus, Multiplies, Divides SecondValue [] Second matrix file name The final operation is the following, with A the image matrix, B1st , B2nd the matrixes to add/substract/multiply/divide and the element-wise operator : f (A) = A op1st B1st op2nd B2nd ApodizationTransformation Apply an apodization window to each data row. Option [default value] Size WindowName [Rectangular] Description Window total size (must match the number of data columns) Window name. Possible values are: Rectangular: Rectangular Hann: Hann Hamming: Hamming Cosine: Cosine Gaussian: Gaussian Blackman: Blackman Kaiser: Kaiser 34/82 Gaussian window Gaussian window. Option [default value] WindowName .Sigma [0.4] Blackman window Sigma Blackman window. Option [default value] WindowName .Alpha [0.16] Kaiser window Description Description Alpha Kaiser window. Option [default value] WindowName .Beta [5.0] Description Beta ChannelExtractionTransformation Extract an image channel. Option CSChannel Description Blue: blue channel in the BGR colorspace, or first channel of any colorspace Green: green channel in the BGR colorspace, or second channel of any colorspace Red: red channel in the BGR colorspace, or third channel of any colorspace Hue: hue channel in the HSV colorspace Saturation: saturation channel in the HSV colorspace Value: value channel in the HSV colorspace Gray: gray conversion Y: Y channel in the YCbCr colorspace Cb: Cb channel in the YCbCr colorspace Cr: Cr channel in the YCbCr colorspace ColorSpaceTransformation Change the current image colorspace. 
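A minimal usage example, assuming the default BGR input and converting it to HSV (the section suffix is arbitrary; the possible ColorSpace values are listed in the table below):

[env.Transformation-colorspace]
Type=ColorSpaceTransformation
ColorSpace=HSV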
35/82 Option ColorSpace Description BGR: convert any gray, BGR or BGRA image to BGR RGB: convert any gray, BGR or BGRA image to RGB HSV: convert BGR image to HSV HLS: convert BGR image to HLS YCrCb: convert BGR image to YCrCb CIELab: convert BGR image to CIELab CIELuv: convert BGR image to CIELuv RGB_to_BGR: convert RGB image to BGR RGB_to_HSV: convert RGB image to HSV RGB_to_HLS: convert RGB image to HLS RGB_to_YCrCb: convert RGB image to YCrCb RGB_to_CIELab: convert RGB image to CIELab RGB_to_CIELuv: convert RGB image to CIELuv HSV_to_BGR: convert HSV image to BGR HSV_to_RGB: convert HSV image to RGB HLS_to_BGR: convert HLS image to BGR HLS_to_RGB: convert HLS image to RGB YCrCb_to_BGR: convert YCrCb image to BGR YCrCb_to_RGB: convert YCrCb image to RGB CIELab_to_BGR: convert CIELab image to BGR CIELab_to_RGB: convert CIELab image to RGB CIELuv_to_BGR: convert CIELuv image to BGR CIELuv_to_RGB: convert CIELuv image to RGB Note that the default colorspace in N2D2 is BGR, the same as in OpenCV. DFTTransformation Apply a DFT to the data. The input data must be single channel, the resulting data is two channels, the first for the real part and the second for the imaginary part. Option [default value] TwoDimensional [1] Description If true, compute a 2D image DFT. Otherwise, compute the 1D DFT of each data row Note that this transformation can add zero-padding if required by the underlying FFT implementation. N2D2 IP only DistortionTransformation Apply elastic distortion to the image. This transformation is generally used on-the-fly (so that a different distortion is performed for each image), and for the learning only. Option [default value] ElasticGaussianSize [15] ElasticSigma [6.0] ElasticScaling [0.0] Scaling [0.0] Rotation [0.0] N2D2 IP only Description Size of the gaussian for elastic distortion (in pixels) Sigma of the gaussian for elastic distortion Scaling of the gaussian for elastic distortion Maximum random scaling amplitude (+/-, in percentage) Maximum random rotation amplitude (+/-, in °) EqualizeTransformation Image histogram equalization. 36/82 Option [default value] Method [Standard] Description Standard: standard histogram equalization CLAHE: contrast limited adaptive histogram equalization Threshold for contrast limiting (for CLAHE only) Size of grid for histogram equalization (for CLAHE only). Input image will be divided into equally sized rectangular tiles. This parameter defines the number of tiles in row and column. [40.0] [8] CLAHE_ClipLimit CLAHE_GridSize N2D2 IP only ExpandLabelTransformation Expand single image label (1x1 pixel) to full frame label. FilterTransformation Apply a convolution filter to the image. Option [default value] Description Convolution kernel. Possible values are: *: custom kernel Gaussian: Gaussian kernel LoG: Laplacian Of Gaussian kernel DoG: Difference Of Gaussian kernel Gabor: Gabor kernel Kernel * kernel Custom kernel. Option Kernel.SizeX Kernel.SizeY [0] [0] Kernel.Mat Description Width of the kernel (numer of columns) Height of the kernel (number of rows) List of row-major ordered coefficients of the kernel If both Kernel.SizeX and Kernel.SizeY are 0, the kernel is assumed to be square. Gaussian kernel Gaussian kernel. Option [default value] Kernel.SizeX Kernel.SizeY √ [1] Kernel.Sigma [ 2.0] Kernel.Positive LoG kernel Laplacian Of Gaussian kernel. 
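As an illustration, a FilterTransformation using the LoG kernel might be declared as follows (kernel size and sigma are placeholder values; the kernel parameters are listed below):

[env.Transformation-log]
Type=FilterTransformation
Kernel=LoG
Kernel.SizeX=5
Kernel.SizeY=5
Kernel.Sigma=1.4 ; placeholder value (the default is sqrt(2))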
Option [default value] Kernel.SizeX Kernel.SizeY √ [1] Kernel.Sigma [ 2.0] Kernel.Positive DoG kernel Description Width of the kernel (numer of columns) Height of the kernel (number of rows) If true, the center of the kernel is positive Sigma of the kernel Description Width of the kernel (numer of columns) Height of the kernel (number of rows) If true, the center of the kernel is positive Sigma of the kernel Difference Of Gaussian kernel kernel. Option [default value] Kernel.SizeX Kernel.SizeY [1] Kernel.Sigma1 [2.0] Kernel.Sigma2 [1.0] Kernel.Positive Description Width of the kernel (numer of columns) Height of the kernel (number of rows) If true, the center of the kernel is positive Sigma1 of the kernel Sigma2 of the kernel 37/82 Gabor kernel Gabor kernel. Option [default value] Kernel.SizeX Kernel.SizeY Kernel.Theta √ [ 2.0] Kernel.Lambda [10.0] Kernel.Psi [π/2.0] Kernel.Gamma [0.5] Kernel.Sigma Description Width of the kernel (numer of columns) Height of the kernel (number of rows) Theta of the kernel Sigma of the kernel Lambda of the kernel Psi of the kernel Gamma of the kernel FlipTransformation Image flip transformation. Option [default value] HorizontalFlip [0] VerticalFlip [0] RandomHorizontalFlip [0] RandomVerticalFlip [0] N2D2 IP only GradientFilterTransformation Compute image gradient. Option [default value] Scale [1.0] Delta [0.0] GradientFilter [Sobel] KernelSize [3] ApplyToLabels InvThreshold Threshold Label [0] [0] [0.5] [] GradientScale N2D2 IP only Description If true, flip the image horizontally If true, flip the image vertically If true, randomly flip the image horizontally If true, randomly flip the image vertically [1.0] Description Scale to apply to the computed gradient Bias to add to the computed gradient Filter type to use for computing the gradient. Possible options are: Sobel, Scharr and Laplacian Size of the filter kernel (has no effect when using the Scharr filter, which kernel size is always 3x3) If true, use the computed gradient to filter the image label and ignore pixel areas where the gradient is below the Threshold. In this case, only the labels are modified, not the image If true, ignored label pixels will be the ones with a low gradient (low contrasted areas) Threshold applied on the image gradient List of labels to filter (space-separated) Rescale the image by this factor before applying the gradient and the threshold, then scale it back to filter the labels LabelSliceExtractionTransformation Extract a slice from an image belonging to a given label. 38/82 Option [default value] Width Height Label [-1] RandomRotation [0] RandomRotationRange SlicesMargin [0.0 360.0] [0] KeepComposite [0] Description Width of the slice to extract Height of the slice to extract Slice should belong to this label ID. If -1, the label ID is random If true, extract randomly rotated slices Range of the random rotations, in degrees, counterclockwise (if RandomRotation is enabled) Positive or negative, indicates the margin around objects that can be extracted in the slice If false, the 2D label image is reduced to a single value corresponding to the extracted object label (useful for patches classification tasks). Note that if SlicesMargin is > 0, the 2D label image may contain other labels before reduction. For pixel-wise segmentation tasks, set KeepComposite to true. This transformation is useful to learn sparse object occurrences in a lot of background. 
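As a sketch, such a transformation could be applied on-the-fly during learning to extract random labeled patches (the section name and slice sizes are placeholders):

[env.OnTheFlyTransformation-slices]
Type=LabelSliceExtractionTransformation
ApplyTo=LearnOnly
Width=64 ; placeholder slice width
Height=64 ; placeholder slice height
Label=-1 ; -1: the label ID is chosen randomly for each slice
RandomRotation=1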
If the dataset is very unbalanced towards background, this transformation will ensure that the learning is done on a more balanced set of every labels, regardless of their actual pixel-wise ratio. (a) Randomly extracted slices with label 0. (b) Randomly extracted slices with label 1. Figure 12: Illustration of the working behavior of LabelSliceExtractionTransformation with SlicesMargin = 0. When SlicesMargin is 0, only slices that fully include a given label are extracted, as shown in figure 12. The behavior with SlicesMargin < 0 is illustrated in figure 13. Note that setting a negative SlicesMargin larger in absolute value than Width/2 or Height/2 will lead in some (random) cases in incorrect slice labels in respect to the majority pixel label in the slice. MagnitudePhaseTransformation Compute the magnitude and phase of a complex two channels input data, with the first channel x being the real part and the second channel y the imaginary part. The resulting data is two channels, the first one with the magnitude and the second one with the phase. Option [default value] LogScale [0] Description If true, compute the magnitude in log scale 39/82 (a) Randomly extracted slices including label 0. (b) Randomly extracted slices including label 1. Figure 13: Illustration of the working behavior of LabelSliceExtractionTransformation with SlicesMargin = -32. The magnitude is: Mi,j = q x2i,j + x2i,j 0 = log(1 + M ). If LogScale = 1, compute Mi,j i,j The phase is: θi,j = atan2(yi,j , xi,j ) N2D2 IP only MorphologicalReconstructionTransformation Apply a morphological reconstruction transformation to the image. This transformation is also useful for post-processing. Option [default value] Operation Size ApplyToLabels [0] Shape [Rectangular] NbIterations N2D2 IP only [1] Description Morphological operation to apply. Can be: ReconstructionByErosion: reconstruction by erosion operation ReconstructionByDilation: reconstruction by dilation operation OpeningByReconstruction: opening by reconstruction operation ClosingByReconstruction: closing by reconstruction operation Size of the structuring element If true, apply the transformation to the labels instead of the image Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross. Number of times erosion and dilation are applied for opening and closing reconstructions MorphologyTransformation Apply a morphology transformation to the image. This transformation is also useful for post-processing. 40/82 Option [default value] Operation Size ApplyToLabels [0] Shape [Rectangular] NbIterations [1] Description Morphological operation to apply. Can be: Erode: erode operation (= erode(src)) Dilate: dilate operation (= dilate(src)) Opening: opening operation (open(src) = dilate(erode(src))) Closing: closing operation (close(src) = erode(dilate(src))) Gradient: morphological gradient (= dilate(src)−erode(src)) TopHat: top hat (= src − open(src)) BlackHat: black hat (= close(src) − src) Size of the structuring element If true, apply the transformation to the labels instead of the image Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross. Number of times erosion and dilation are applied NormalizeTransformation Normalize the image. 
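A minimal usage example, performing a min-max normalization to the [0, 1] range (the section suffix is arbitrary; the options are detailed in the table below):

[env.Transformation-normalize-minmax]
Type=NormalizeTransformation
Norm=MinMax
NormMin=0.0
NormMax=1.0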
Option [default value] Norm [MinMax] NormValue [1.0] NormMin [0.0] NormMax [1.0] PerChannel [0] Description Norm type, can be: L1: L1 normalization L2: L2 normalization Linf: Linf normalization MinMax: min-max normalization Norm value (for L1, L2 and Linf) Such that ||data||Lp = N ormV alue Min value (for MinMax only) Such that min(data) = N ormM in Max value (for MinMax only) Such that max(data) = N ormM ax If true, normalize each channel individually PadCropTransformation Pad/crop the image to a specified size. Option [default value] Width Height PaddingBackground [MeanColor] N2D2 IP only Description Width of the padded/cropped image Height of the padded/cropped image Background color used when padding. Possible values: MeanColor: pad with the mean color of the image BlackColor: pad with black RandomAffineTransformation Apply a global random affine transformation to the values of the image. 41/82 Option [default value] GainRange [1.0 1.0] GainRange[*] [1.0 1.0] BiasRange [0.0 0.0] [0.0 0.0] BiasRange[*] [1.0 1.0] GammaRange[*] [1.0 1.0] GammaRange GainVarProb [1.0] BiasVarProb [1.0] GammaVarProb [1.0] DisjointGamma ChannelsMask [0] [] Description Random gain (α) range (identical for all channels) Random gain (α) range for channel *. Mutually exclusive with GainRange. If any specified, a different random gain will always be sampled for each channel. Default gain is 1.0 (no gain) for missing channels The gain control the contrast of the image Random bias (β) range (identical for all channels) Random bias (β) range for channel *. Mutually exclusive with BiasRange. If any specified, a different random bias will always be sampled for each channel. Default bias is 0.0 (no bias) for missing channels The bias control the brightness of the image Random gamma (γ) range (identical for all channels) Random gamma (γ) range for channel *. Mutually exclusive with GammaRange. If any specified, a different random gamma will always be sampled for each channel. Default gamma is 1.0 (no change) for missing channels The gamma control more or less the exposure of the image Probability to have a gain variation for each channel. If only one value is specified, the same probability applies to all the channels. In this case, the same gain variation will be sampled for all the channels only if a single range if specified for all the channels using GainRange. If more than one value is specified, a different random gain will always be sampled for each channel, even if the probabilities and ranges are identical Probability to have a bias variation for each channel. If only one value is specified, the same probability applies to all the channels. In this case, the same bias variation will be sampled for all the channels only if a single range if specified for all the channels using BiasRange. If more than one value is specified, a different random bias will always be sampled for each channel, even if the probabilities and ranges are identical Probability to have a gamma variation for each channel. If only one value is specified, the same probability applies to all the channels. In this case, the same gamma variation will be sampled for all the channels only if a single range if specified for all the channels using GammaRange. If more than one value is specified, a different random gamma will always be sampled for each channel, even if the probabilities and ranges are identical If true, gamma variation and gain/bias variation are mutually exclusive. 
The probability to have a random gamma variation is therefore GammaVarProb and the probability to have a gain/bias variation is 1-GammaVarProb. If not empty, specifies on which channels the transformation is applied. For example, to apply the transformation only to the first and third channel, set ChannelsMask to 1 0 1 The equation of the transformation is: 42/82 ( S= numeric_limits::max() if is_integer 1.0 otherwise v(i, j) = cv::saturate_cast α v(i, j) S γ S + β.S RangeAffineTransformation Apply an affine transformation to the values of the image. Option [default value] FirstOperator FirstValue SecondOperator [Plus] SecondValue [0.0] Description First operator, can be Plus, Minus, Multiplies, Divides First value Second operator, can be Plus, Minus, Multiplies, Divides Second value The final operation is the following: f (x) = (x opo1st val1st ) N2D2 IP only o op2nd val2nd RangeClippingTransformation Clip the value range of the image. Option [default value] RangeMin [min(data)] RangeMax [max(data)] Description Image values below RangeMin are clipped to 0 Image values above RangeMax are clipped to 1 (or the maximum integer value of the data type) RescaleTransformation Rescale the image to a specified size. Option [default value] Width Height KeepAspectRatio ResizeToFit [0] [1] Description Width of the rescaled image Height of the rescaled image If true, keeps the aspect ratio of the image If true, resize along the longest dimension when KeepAspectRatio is true ReshapeTransformation Reshape the data to a specified size. Option [default value] NbRows NbCols [0] NbChannels N2D2 IP only [0] Description New number of rows New number of cols (0 = no check) New number of channels (0 = no change) SliceExtractionTransformation Extract a slice from an image. 43/82 Option [default value] Width Height OffsetX OffsetY [0] [0] [0] RandomOffsetY [0] RandomRotation [0] RandomOffsetX RandomRotationRange RandomScaling [0] RandomScalingRange AllowPadding [0.0 360.0] [0.8 1.2] [0] Description Width of the slice to extract Height of the slice to extract X offset of the slice to extract Y offset of the slice to extract If true, the X offset is chosen randomly If true, the Y offset is chosen randomly If true, extract randomly rotated slices Range of the random rotations, in degrees, counterclockwise (if RandomRotation is enabled) If true, extract randomly scaled slices Range of the random scaling (if RandomRotation is enabled) If true, zero-padding is allowed if the image is smaller than the slice to extract ThresholdTransformation Apply a thresholding transformation to the image. This transformation is also useful for post-processing. Option [default value] Threshold OtsuMethod Operation [0] [Binary] Description Threshold value Use Otsu’s method to determine the optimal threshold (if true, the Threshold value is ignored) Thresholding operation to apply. Can be: Binary BinaryInverted Truncate ToZero ToZeroInverted MaxValue [1.0] Max. value to use with Binary and BinaryInverted operations TrimTransformation Trim the image. Option [default value] NbLevels Method [Discretize] N2D2 IP only Description Number of levels for the color discretization of the image Possible values are: Reduce: discretization using K-means Discretize: simple discretization WallisFilterTransformation Apply Wallis filter to the image. 
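A possible declaration, with placeholder values for the filter size and the target statistics (the options are listed in the table below):

[env.Transformation-wallis]
Type=WallisFilterTransformation
Size=31 ; placeholder filter size
Mean=0.0 ; target mean value
StdDev=1.0 ; target standard deviation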
Option [default value] Size [0.0] [1.0] PerChannel [0] Mean StdDev 4.7 4.7.1 Description Size of the filter Target mean value Target standard deviation If true, apply Wallis filter to each channel individually (this parameter is meaningful only if Size is 0) Network layers Layer definition Common set of parameters for any kind of layer. 44/82 Option [default value] Input Type Model [DefaultModel] DataType [DefaultDataType] ConfigSection [] Description Name of the section(s) for the input layer(s). Comma separated Type of the layer. Can be any of the type described below Layer model to use Layer data type to use. Please note that some layers may not support every data type. Name of the configuration section for layer To specify that the back-propagated error must be computed at the output of a given layer (generally the last layer, or output layer), one must add a target section named LayerName .Target: ... [LayerName.Target] TargetValue=1.0 ; default: 1.0 DefaultValue=0.0 ; default: -1.0 4.7.2 Weight fillers Fillers to initialize weights and biases in the different type of layer. Usage example: [conv1] ... WeightsFiller=NormalFiller WeightsFiller.Mean=0.0 WeightsFiller.StdDev=0.05 ... The initial weights distribution for each layer can be checked in the weights_init folder, with an example shown in figure 14. Figure 14: Initial weights distribution of a layer using a normal distribution (NormalFiller) with a 0 mean and a 0.05 standard deviation. 45/82 ConstantFiller Fill with a constant value. Option FillerName .Value Description Value for the filling HeFiller Fill with an normal distribution with normalized variance taking into account the rectifier nonlinearity (He et al., 2015). This filler is sometimes referred as MSRA filler. Option [default value] FillerName .VarianceNorm [FanIn] FillerName .Scaling [1.0] Description Normalization, can be FanIn, Average or FanOut Scaling factor Use a normal distribution with standard deviation • n = f an-in with FanIn, resulting in V ar(W ) = • n= (f an-in+f an-out) 2 q 2.0 n . 2 f an-in with Average, resulting in V ar(W ) = • n = f an-out with FanOut, resulting in V ar(W ) = 4 f an-in+f an-out 2 f an-out NormalFiller Fill with a normal distribution. Option [default value] FillerName .Mean [0.0] FillerName .StdDev [1.0] Description Mean value of the distribution Standard deviation of the distribution UniformFiller Fill with an uniform distribution. Option [default value] FillerName .Min [0.0] FillerName .Max [1.0] Description Min. value Max. value XavierFiller Fill with an uniform distribution with normalized variance (Glorot and Bengio, 2010). Option [default value] FillerName .VarianceNorm [FanIn] FillerName .Distribution [Uniform] FillerName .Scaling [1.0] Description Normalization, can be FanIn, Average or FanOut Distribution, can be Uniform or Normal Scaling factor Use an uniform distribution with interval [−scale, scale], with scale = • n = f an-in with FanIn, resulting in V ar(W ) = • n= (f an-in+f an-out) 2 q 3.0 n . 1 f an-in with Average, resulting in V ar(W ) = • n = f an-out with FanOut, resulting in V ar(W ) = 2 f an-in+f an-out 1 f an-out 46/82 4.7.3 Weight solvers SGDSolver_Frame SGD Solver for Frame models. Option [default value] SolverName .LearningRate [0.01] SolverName .Momentum [0.0] SolverName .Decay [0.0] SolverName . LearningRatePolicy [None] SolverName . 
LearningRateStepSize [1] SolverName .LearningRateDecay [0.1] SolverName .Clamping [0] SolverName .Power [0.0] SolverName .MaxIterations [0.0] Description Learning rate Momentum Decay Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay, PolyDecay Learning rate step size (in number of stimuli) Learning rate decay If true, clamp the weights and bias between -1 and 1 Polynomial learning rule power parameter Polynomial learning rule maximum number of iterations The learning rate decay policies are the following: • StepDecay: every SolverName .LearningRateStepSize stimuli, the learning rate is reduced by a factor SolverName .LearningRateDecay; • ExponentialDecay: the learning rate is α = α0 exp(−kt), with α0 the initial learning rate SolverName .LearningRate, k the rate decay SolverName .LearningRateDecay and t the step number (one step every SolverName .LearningRateStepSize stimuli); • InvTDecay: the learning rate is α = α0 /(1 + kt), with α0 the initial learning rate SolverName . LearningRate, k the rate decay SolverName .LearningRateDecay and t the step number (one step every SolverName .LearningRateStepSize stimuli). • InvDecay: the learning rate is α = α0 ∗ (1 + kt)−n , with α0 the initial learning rate SolverName .LearningRate, k the rate decay SolverName .LearningRateDecay, t the current iteration and n the power parameter SolverName .Power • PolyDecay: the learning rate is α = α0 ∗ (1 − kt )n , with α0 the initial learning rate SolverName .LearningRate, k the current iteration, t the maximum number of iteration SolverName . MaxIterations and n the power parameter SolverName .Power SGDSolver_Frame_CUDA SGD Solver for Frame_CUDA models. 47/82 Option [default value] SolverName .LearningRate [0.01] SolverName .Momentum [0.0] SolverName .Decay [0.0] SolverName . LearningRatePolicy [None] SolverName . LearningRateStepSize [1] SolverName .LearningRateDecay [0.1] SolverName .Clamping [0] Description Learning rate Momentum Decay Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay Learning rate step size (in number of stimuli) Learning rate decay If true, clamp the weights and bias between -1 and 1 The learning rate decay policies are identical to the ones in the SGDSolver\_Frame solver. AdamSolver_Frame Adam Solver for Frame models (Kingma and Ba, 2014). Option [default value] SolverName .LearningRate [0.001] SolverName .Beta1 [0.9] SolverName .Beta2 [0.999] SolverName .Epsilon [1.0e-8] Description Learning rate (stepsize) Exponential decay rate of these moving average of the first moment Exponential decay rate of these moving average of the second moment Epsilon AdamSolver_Frame_CUDA Adam Solver for Frame_CUDA models (Kingma and Ba, 2014). Option [default value] SolverName .LearningRate [0.001] SolverName .Beta1 [0.9] SolverName .Beta2 [0.999] SolverName .Epsilon [1.0e-8] 4.7.4 Description Learning rate (stepsize) Exponential decay rate of these moving average of the first moment Exponential decay rate of these moving average of the second moment Epsilon Activation functions Activation function to be used at the output of layers. Usage example: [conv1] ... ActivationFunction=Rectifier ActivationFunction.LeakSlope=0.01 ActivationFunction.Clipping=20 ... Logistic Logistic activation function. LogisticWithLoss Logistic with loss activation function. 48/82 Rectifier Rectifier or ReLU activation function. 
Option [default value] ActivationFunction.LeakSlope Description Leak slope for negative inputs [0.0] ActivationFunction.Clipping Clipping value for positive outputs [0.0] Saturation Saturation activation function. Softplus Softplus activation function. Tanh Tanh activation function. Computes y = tanh(αx). Option [default value] ActivationFunction.Alpha [1.0] Description α parameter TanhLeCun Tanh activation function with an α parameter of 1.7159 × (2.0/3.0). 4.7.5 Anchor Anchor layer for Faster R-CNN or Single Shot Detector. Option [default value] Input Anchor[*] ScoresCls FeatureMapWidth Description This layer takes one or two inputs. The total number of input channels must be ScoresCls + 4, with ScoresCls being equal to 1 or 2. Anchors definition. For each anchor, there must be two space-separated values: the root area and the aspect ratio. Number of classes per anchor. Must be 1 (if the scores input uses logistic regression) or 2 (if the scores input is a two-class softmax layer) Reference width use to scale anchors coordinate. [StimuliProvider.Width] Reference height use to scale anchors coordinate. FeatureMapHeight [StimuliProvider.Height] Configuration parameters (Frame models) Option [default value] PositiveIoU [0.7] NegativeIoU LossLambda [0.3] all Frame [10.0] LossPositiveSample Model(s) all Frame [128] all Frame all Frame Description Assign a positive label for anchors whose IoU overlap is higher than PositiveIoU with any ground-truth box Assign a negative label for non-positive anchors whose IoU overlap is lower than NegativeIoU for all groundtruth boxes Balancing parameter λ Number of random positive samples for the loss computation 49/82 LossNegativeSample [128] all Frame Number of random negative samples for the loss computation Usage example: ; RPN network: cls layer [scores] Input=... Type=Conv KernelWidth=1 KernelHeight=1 ; 18 channels for 9 anchors NbOutputs=18 ... [scores.softmax] Input=scores Type=Softmax NbOutputs=[scores]NbOutputs WithLoss=1 ; RPN network: coordinates layer [coordinates] Input=... Type=Conv KernelWidth=1 KernelHeight=1 ; 36 channels for 4 coordinates x 9 anchors NbOutputs=36 ... ; RPN network: anchors [anchors] Input=scores.softmax,coordinates Type=Anchor ScoresCls=2 ; using a two-class softmax for the scores Anchor[0]=32 1.0 Anchor[1]=48 1.0 Anchor[2]=64 1.0 Anchor[3]=80 1.0 Anchor[4]=96 1.0 Anchor[5]=112 1.0 Anchor[6]=128 1.0 Anchor[7]=144 1.0 Anchor[8]=160 1.0 ConfigSection=anchors.config [anchors.config] PositiveIoU=0.7 NegativeIoU=0.3 LossLambda=1.0 Outputs remapping Outputs remapping allows to convert scores and coordinates output feature maps layout from another ordering that the one used in the N2D2 Anchor layer, during weights import/export. For example, lets consider that the imported weights corresponds to the following output feature maps ordering: 0 anchor[0].y 1 anchor[0].x 50/82 2 anchor[0].h 3 anchor[0].w 4 anchor[1].y 5 anchor[1].x 6 anchor[1].h 7 anchor[1].w 8 anchor[2].y 9 anchor[2].x 10 anchor[2].h 11 anchor[2].w The output feature maps ordering required by the Anchor layer is: 0 anchor[0].x 1 anchor[1].x 2 anchor[2].x 3 anchor[0].y 4 anchor[1].y 5 anchor[2].y 6 anchor[0].w 7 anchor[1].w 8 anchor[2].w 9 anchor[0].h 10 anchor[1].h 11 anchor[2].h The feature maps ordering can be changed during weights import/export: ; RPN network: coordinates layer [coordinates] Input=... Type=Conv KernelWidth=1 KernelHeight=1 ; 36 channels for 4 coordinates x 9 anchors NbOutputs=36 ... 
ConfigSection=coordinates.config [coordinates.config] WeightsExportFormat=HWCO ; Weights format used by TensorFlow OutputsRemap=1:4,0:4,3:4,2:4 4.7.6 Conv Convolutional layer. Option [default value] KernelWidth KernelHeight KernelDepth [] Description Width of the kernels Height of the kernels Depth of the kernels (implies 3D kernels) OR KernelSize [] Kernels size (implies 2D square kernels) KernelDims [] List of space-separated dimensions for N-D kernels Number of output channels X-axis subsampling factor of the output feature maps Y-axis subsampling factor of the output feature maps Z-axis subsampling factor of the output feature maps OR NbOutputs [1] SubSampleY [1] SubSampleZ [] SubSampleX OR 51/82 SubSample [1] Subsampling factor of the output feature maps OR SubSampleDims [] List of space-separated subsampling dimensions for N-D kernels X-axis stride of the kernels Y-axis stride of the kernels Z-axis stride of the kernels [1] [1] StrideZ [] StrideX StrideY OR Stride [1] Stride of the kernels OR [] PaddingX [0] PaddingY [0] PaddingZ [] List of space-separated stride dimensions for N-D kernels X-axis input padding Y-axis input padding Z-axis input padding StrideDims OR Padding [0] Input padding OR [] [1] DilationY [1] DilationZ [] List of space-separated padding dimensions for N-D kernels X-axis dilation of the kernels Y-axis dilation of the kernels Z-axis dilation of the kernels PaddingDims DilationX OR Dilation [1] Dilation of the kernels OR DilationDims [] List of space-separated dilation dimensions for N-D kernels Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Weights initial values filler ActivationFunction [Tanh] WeightsFiller [NormalFiller(0.0, 0.05)] Biases initial values filler BiasFiller [NormalFiller(0.0, 0.05)] Mapping.NbGroups [] Mapping.ChannelsPerGroup [1] [1] Mapping.Size [1] Mapping.SizeX Mapping.SizeY [1] Mapping.StrideY [1] Mapping.Stride [1] Mapping.StrideX [0] [0] Mapping.Offset [0] Mapping.OffsetX Mapping.OffsetY Mapping.NbIterations [1] Mapping(in).SizeY [1] Mapping(in).SizeX [0] [] Mapping: number of groups (mutually exclusive with all other Mapping.* options) Mapping: number of channels per group (mutually exclusive with all other Mapping.* options) Mapping canvas pattern default width Mapping canvas pattern default height Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY) Mapping canvas default X-axis step Mapping canvas default Y-axis step Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY) Mapping canvas default X-axis offset Mapping canvas default Y-axis offset Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY) Mapping canvas pattern default number of iterations (0 means no limit) Mapping canvas pattern default width for input layer in Mapping canvas pattern default height for input layer in 52/82 Mapping(in).Size [1] [1] Mapping(in).StrideY [1] Mapping(in).Stride [1] Mapping(in).StrideX [0] [0] Mapping(in).Offset [0] Mapping(in).OffsetX Mapping(in).OffsetY Mapping(in).NbIterations [] [] WeightsSharing BiasesSharing [0] Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY) Mapping canvas default X-axis step for input layer in Mapping canvas default Y-axis step for input layer in Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY) 
Mapping canvas default X-axis offset for input layer in Mapping canvas default Y-axis offset for input layer in Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY) Mapping canvas pattern default number of iterations for input layer in (0 means no limit) Share the weights with an other layer Share the biases with an other layer Configuration parameters (Frame models) Option [default value] NoBias [0] Solvers.* WeightsSolver.* Model(s) all Frame all Frame all Frame BiasSolver.* all Frame WeightsExportFormat all Frame [OCHW] WeightsExportFlip [0] all Frame Description If true, don’t use bias Any solver parameters Weights solver parameters, take precedence over the Solvers.* parameters Bias solver parameters, take precedence over the Solvers.* parameters Weights import/export format. Can be OCHW or OCHW, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position) If true, import/export flipped kernels Configuration parameters (Spike models) Experimental option (implementation may be wrong or susceptible to change) Option [default value] IncomingDelay [1 TimePs ;100 TimeFs] Threshold [1.0] BipolarThreshold [1] Leak [0.0] Refractory Model(s) all Spike Description Synaptic incoming delay wdelay Spike, Spike_RRAM Threshold of the neuron Ithres If true, the threshold is also applied to the absolute value of negative values (generating negative spikes) Neural leak time constant τleak (if 0, no leak) Neural refractory period Tref rac Spike, Spike_RRAM Spike, Spike_RRAM [0.0] Spike, Spike_RRAM 53/82 [0.0;0.05] [1;0.1] WeightsMaxMean [100;10.0] WeightsMinVarSlope [0.0] WeightsMinVarOrigin [0.0] WeightsMaxVarSlope [0.0] WeightsMaxVarOrigin [0.0] WeightsSetProba [1.0] WeightsRelInit Spike WeightsMinMean Spike_RRAM WeightsResetProba Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM [1.0] Spike_RRAM [1] Spike_RRAM SynapticRedundancy BipolarWeights Spike_RRAM [0] BipolarIntegration Spike_RRAM [0] Spike_RRAM LtpProba [0.2] Spike_RRAM LtdProba [0.1] Spike_RRAM StdpLtp [1000 TimePs] Spike_RRAM [0 InhibitRefractory Spike_RRAM Relative initial synaptic weight winit Mean minimum synaptic weight wmin Mean maximum synaptic weight wmax OXRAM specific parameter OXRAM specific parameter OXRAM specific parameter OXRAM specific parameter Intrinsic SET switching probability PSET (upon receiving a SET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM) Intrinsic RESET switching probability PRESET (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM) Synaptic redundancy (number of RRAM device per synapse) Bipolar weights Bipolar integration Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability PSET ) Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability PRESET ) STDP LTP time window TLT P Neural lateral inhibition period Tinhibit TimePs] EnableStdp [1] Spike_RRAM RefractoryIntegration DigitalIntegration 4.7.7 [1] [0] Spike_RRAM Spike_RRAM If false, STDP is disabled (no synaptic weight change) If true, reset the integration to 0 during the refractory period If false, the analog value of the devices is integrated, instead of their binary value Deconv Deconvolutionlayer. 
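A minimal layer declaration might look as follows, by analogy with the Conv layer (the input name and all values are placeholders; the available options are detailed in the table below):

[deconv1]
Input=conv2 ; placeholder input layer name
Type=Deconv
KernelSize=2
Stride=2
NbOutputs=16
ActivationFunction=Rectifier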
Option [default value] KernelWidth KernelHeight KernelDepth [] Description Width of the kernels Height of the kernels Depth of the kernels (implies 3D kernels) OR KernelSize [] Kernels size (implies 2D square kernels) KernelDims [] List of space-separated dimensions for N-D kernels Number of output channels X-axis stride of the kernels Y-axis stride of the kernels Z-axis stride of the kernels OR NbOutputs [1] StrideY [1] StrideZ [] StrideX OR Stride [1] Stride of the kernels 54/82 OR [] PaddingX [0] PaddingY [0] PaddingZ [] List of space-separated stride dimensions for N-D kernels X-axis input padding Y-axis input padding Z-axis input padding StrideDims OR Padding [0] Input padding OR [] DilationX [1] DilationY [1] DilationZ [] List of space-separated padding dimensions for N-D kernels X-axis dilation of the kernels Y-axis dilation of the kernels Z-axis dilation of the kernels PaddingDims OR Dilation [1] Dilation of the kernels OR DilationDims [] List of space-separated dilation dimensions for N-D kernels Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Weights initial values filler ActivationFunction [Tanh] WeightsFiller [NormalFiller(0.0, 0.05)] Biases initial values filler BiasFiller [NormalFiller(0.0, 0.05)] Mapping.NbGroups [] Mapping.ChannelsPerGroup [1] Mapping.SizeY [1] Mapping.Size [1] Mapping.SizeX [1] [1] Mapping.Stride [1] Mapping.StrideX Mapping.StrideY [0] Mapping.OffsetY [0] Mapping.Offset [0] Mapping.OffsetX Mapping.NbIterations [0] [1] [1] Mapping(in).Size [1] Mapping(in).SizeX Mapping(in).SizeY [1] [1] Mapping(in).Stride [1] Mapping(in).StrideX Mapping(in).StrideY Mapping(in).OffsetX [0] [] Mapping: number of groups (mutually exclusive with all other Mapping.* options) Mapping: number of channels per group (mutually exclusive with all other Mapping.* options) Mapping canvas pattern default width Mapping canvas pattern default height Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY) Mapping canvas default X-axis step Mapping canvas default Y-axis step Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY) Mapping canvas default X-axis offset Mapping canvas default Y-axis offset Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY) Mapping canvas pattern default number of iterations (0 means no limit) Mapping canvas pattern default width for input layer in Mapping canvas pattern default height for input layer in Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY) Mapping canvas default X-axis step for input layer in Mapping canvas default Y-axis step for input layer in Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY) Mapping canvas default X-axis offset for input layer in 55/82 [0] [0] Mapping(in).OffsetY Mapping(in).Offset Mapping(in).NbIterations [] BiasesSharing [] WeightsSharing [0] Mapping canvas default Y-axis offset for input layer in Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY) Mapping canvas pattern default number of iterations for input layer in (0 means no limit) Share the weights with an other layer Share the biases with an other layer Configuration parameters (Frame models) Option [default value] NoBias [0] BackPropagate [1] Solvers.* WeightsSolver.* Model(s) all 
Frame all Frame all Frame all Frame BiasSolver.* all Frame WeightsExportFormat all Frame [OCHW] WeightsExportFlip 4.7.8 [0] all Frame Description If true, don’t use bias If true, enable backpropogation Any solver parameters Weights solver parameters, take precedence over the Solvers.* parameters Bias solver parameters, take precedence over the Solvers.* parameters Weights import/export format. Can be OCHW or OCHW, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position) If true, import/export flipped kernels Pool Pooling layer. There are two CUDA models for this cell: • Frame_CUDA, which uses CuDNN as back-end and only supports one-to-one input to output map connection; • Frame_EXT_CUDA, which uses custom CUDA kernels and allows arbitrary connections between input and output maps (and can therefore be used to implement Maxout or both Maxout and Pooling simultaneously). Maxout example In the following INI section, one implements a Maxout between each consecutive pair of 8 input maps: [maxout_layer] Input=... Type=Pool Model=Frame_EXT_CUDA PoolWidth=1 PoolHeight=1 NbOutputs=4 Pooling=Max Mapping.SizeY=2 Mapping.StrideY=2 56/82 # input map The layer connectivity is the following: 1 2 3 4 5 6 7 8 1 2 3 4 # output map Option [default value] Description Type of pooling (Max or Average) Width of the pooling area Height of the pooling area Depth of the pooling area (implies 3D pooling area) Pooling PoolWidth PoolHeight PoolDepth [] OR PoolSize [] Pooling area size (implies 2D square pooling area) PoolDims [] List of space-separated dimensions for N-D pooling area Number of output channels X-axis stride of the pooling area Y-axis stride of the pooling area Z-axis stride of the pooling area OR NbOutputs [1] [1] StrideZ [] StrideX StrideY OR Stride [1] Stride of the pooling area OR StrideDims [] List of space-separated stride dimensions for N-D pooling area X-axis input padding Y-axis input padding Z-axis input padding [0] PaddingY [0] PaddingZ [] PaddingX OR Padding [0] Input padding OR PaddingDims [] ActivationFunction [Linear] Mapping.NbGroups [] Mapping.ChannelsPerGroup [1] [1] Mapping.Size [1] Mapping.SizeX Mapping.SizeY [1] Mapping.StrideY [1] Mapping.StrideX [] List of space-separated padding dimensions for N-D pooling area Activation function. 
Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Mapping: number of groups (mutually exclusive with all other Mapping.* options) Mapping: number of channels per group (mutually exclusive with all other Mapping.* options) Mapping canvas pattern default width Mapping canvas pattern default height Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY) Mapping canvas default X-axis step Mapping canvas default Y-axis step 57/82 Mapping.Stride [1] [0] [0] Mapping.Offset [0] Mapping.OffsetX Mapping.OffsetY Mapping.NbIterations [0] [1] [1] Mapping(in).Size [1] Mapping(in).SizeX Mapping(in).SizeY [1] [1] Mapping(in).Stride [1] Mapping(in).StrideX Mapping(in).StrideY [0] [0] Mapping(in).Offset [0] Mapping(in).OffsetX Mapping(in).OffsetY Mapping(in).NbIterations [0] Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY) Mapping canvas default X-axis offset Mapping canvas default Y-axis offset Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY) Mapping canvas pattern default number of iterations (0 means no limit) Mapping canvas pattern default width for input layer in Mapping canvas pattern default height for input layer in Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY) Mapping canvas default X-axis step for input layer in Mapping canvas default Y-axis step for input layer in Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY) Mapping canvas default X-axis offset for input layer in Mapping canvas default Y-axis offset for input layer in Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY) Mapping canvas pattern default number of iterations for input layer in (0 means no limit) Configuration parameters (Spike models) Option [default value] IncomingDelay [1 TimePs ;100 TimeFs] value 4.7.9 Model(s) all Spike Description Synaptic incoming delay wdelay Unpool Unpooling layer. 
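A sketch of an unpooling layer tied to a previous pooling layer through its argmax (layer names and sizes are placeholders; as noted in the table below, the pool layer input and the unpool layer output dimensions must match):

[unpool1]
Input=conv3 ; placeholder input layer name
Type=Unpool
ArgMax=pool1 ; placeholder name of the associated Pool layer
Pooling=Max
PoolSize=2
Stride=2
NbOutputs=16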
Option [default value] Pooling PoolWidth PoolHeight PoolDepth [] Description Type of pooling (Max or Average) Width of the pooling area Height of the pooling area Depth of the pooling area (implies 3D pooling area) OR PoolSize [] Pooling area size (implies 2D square pooling area) PoolDims [] List of space-separated dimensions for N-D pooling area Number of output channels OR NbOutputs 58/82 Name of the associated pool layer for the argmax (the pool layer input and the unpool layer output dimension must match) X-axis stride of the pooling area Y-axis stride of the pooling area Z-axis stride of the pooling area ArgMax [1] StrideY [1] StrideZ [] StrideX OR Stride [1] Stride of the pooling area OR StrideDims [] List of space-separated stride dimensions for N-D pooling area X-axis input padding Y-axis input padding Z-axis input padding [0] PaddingY [0] PaddingZ [] PaddingX OR Padding [0] Input padding OR PaddingDims [] ActivationFunction [Linear] Mapping.NbGroups [] Mapping.ChannelsPerGroup [1] [1] Mapping.Size [1] Mapping.SizeX Mapping.SizeY [1] Mapping.StrideY [1] Mapping.Stride [1] Mapping.StrideX [0] [0] Mapping.Offset [0] Mapping.OffsetX Mapping.OffsetY Mapping.NbIterations [0] [1] Mapping(in).SizeY [1] Mapping(in).Size [1] Mapping(in).SizeX [1] Mapping(in).StrideY [1] Mapping(in).Stride [1] Mapping(in).StrideX [0] Mapping(in).OffsetY [0] Mapping(in).OffsetX [] List of space-separated padding dimensions for N-D pooling area Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Mapping: number of groups (mutually exclusive with all other Mapping.* options) Mapping: number of channels per group (mutually exclusive with all other Mapping.* options) Mapping canvas pattern default width Mapping canvas pattern default height Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY) Mapping canvas default X-axis step Mapping canvas default Y-axis step Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY) Mapping canvas default X-axis offset Mapping canvas default Y-axis offset Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY) Mapping canvas pattern default number of iterations (0 means no limit) Mapping canvas pattern default width for input layer in Mapping canvas pattern default height for input layer in Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY) Mapping canvas default X-axis step for input layer in Mapping canvas default Y-axis step for input layer in Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY) Mapping canvas default X-axis offset for input layer in Mapping canvas default Y-axis offset for input layer in 59/82 Mapping(in).Offset [0] Mapping(in).NbIterations 4.7.10 Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY) Mapping canvas pattern default number of iterations for input layer in (0 means no limit) [0] ElemWise Element-wise operation layer. Option [default value] Description Number of output neurons Type of operation (Sum, AbsSum, EuclideanSum, Prod, or Max) Weights for the Sum, AbsSum, and EuclideanSum operation, in the same order as the inputs Shifts for the Sum and EuclideanSum operation, in the same order as the inputs Activation function. 
Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh NbOutputs Operation Weights Shifts [1.0] [0.0] ActivationFunction [Linear] Given N input tensors Ti , performs the following operation: Sum operation Tout = PN 1 AbsSum operation Tout = (wi Ti + si ) PN 1 (wi |Ti |) EuclideanSum operation Tout = Prod operation Tout = QN 1 q PN 1 (wi Ti + si )2 (Ti ) Max operation Tout = M AX1N (Ti ) Examples Sum of two inputs (Tout = T1 + T2 ): [elemwise_sum] Input=layer1,layer2 Type=ElemWise NbOutputs=[layer1]NbOutputs Operation=Sum Weighted sum of two inputs, by a factor 0.5 for layer1 and 1.0 for layer2 (Tout = 0.5×T1 +1.0×T2 ): [elemwise_weighted_sum] Input=layer1,layer2 Type=ElemWise NbOutputs=[layer1]NbOutputs Operation=Sum Weights=0.5 1.0 Single input scaling by a factor 0.5 and shifted by 0.1 (Tout = 0.5 × T1 + 0.1): 60/82 [elemwise_scale] Input=layer1 Type=ElemWise NbOutputs=[layer1]NbOutputs Operation=Sum Weights=0.5 Shifts=0.1 Absolute value of an input (Tout = |T1 |): [elemwise_abs] Input=layer1 Type=ElemWise NbOutputs=[layer1]NbOutputs Operation=Abs 4.7.11 FMP Fractional max pooling layer (Graham, 2014). Option [default value] NbOutputs ScalingRatio ActivationFunction [Linear] Description Number of output channels input size Scaling ratio. The output size is round scaling ratio . Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Configuration parameters (Frame models) Option [default value] Overlapping [1] PseudoRandom [1] 4.7.12 Model(s) all Frame all Frame Description If true, use overlapping regions, else use disjoint regions If true, use pseudorandom sequences, else use random sequences Fc Fully connected layer. Option [default value] NbOutputs WeightsFiller Description Number of output neurons Weights initial values filler [NormalFiller(0.0, 0.05)] BiasFiller [NormalFiller(0.0, 0.05)] ActivationFunction [Tanh] Biases initial values filler Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh Configuration parameters (Frame models) 61/82 Option [default value] NoBias [0] BackPropagate [1] Solvers.* WeightsSolver.* Model(s) all Frame all Frame all Frame all Frame BiasSolver.* all Frame DropConnect [1.0] Frame Description If true, don’t use bias If true, enable backpropogation Any solver parameters Weights solver parameters, take precedence over the Solvers.* parameters Bias solver parameters, take precedence over the Solvers.* parameters If below 1.0, fraction of synapses that are disabled with drop connect Configuration parameters (Spike models) Option [default value] IncomingDelay [1 TimePs ;100 TimeFs] Threshold [1.0] BipolarThreshold [1] Leak [0.0] Model(s) all Spike Description Synaptic incoming delay wdelay Spike, Spike_RRAM Spike_RRAM Threshold of the neuron Ithres If true, the threshold is also applied to the absolute value of negative values (generating negative spikes) Neural leak time constant τleak (if 0, no leak) Neural refractory period Tref rac Terminate delta Relative initial synaptic weight winit Mean minimum synaptic weight wmin Mean maximum synaptic weight wmax OXRAM specific parameter OXRAM specific parameter OXRAM specific parameter OXRAM specific parameter Intrinsic SET switching probability PSET (upon receiving a SET programming pulse). 
Assuming uniform statistical distribution (not well supported by experiments on RRAM) Intrinsic RESET switching probability PRESET (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM) Synaptic redundancy (number of RRAM device per synapse) Bipolar weights Bipolar integration Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability PSET ) Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability PRESET ) STDP LTP time window TLT P Neural lateral inhibition period Tinhibit Spike_RRAM If false, STDP is disabled (no synaptic weight change) Spike, Spike_RRAM Spike, Spike_RRAM [0.0] TerminateDelta [0] WeightsRelInit [0.0;0.05] WeightsMinMean [1;0.1] WeightsMaxMean [100;10.0] WeightsMinVarSlope [0.0] WeightsMinVarOrigin [0.0] WeightsMaxVarSlope [0.0] WeightsMaxVarOrigin [0.0] WeightsSetProba [1.0] Refractory WeightsResetProba Spike, Spike_RRAM Spike Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM Spike_RRAM [1.0] Spike_RRAM [1] Spike_RRAM SynapticRedundancy BipolarWeights Spike, Spike_RRAM [0] BipolarIntegration Spike_RRAM [0] Spike_RRAM LtpProba [0.2] Spike_RRAM LtdProba [0.1] Spike_RRAM StdpLtp [1000 TimePs] InhibitRefractory Spike_RRAM [0 TimePs] EnableStdp [1] 62/82 RefractoryIntegration DigitalIntegration N2D2 IP only 4.7.13 [1] [0] If true, reset the integration to 0 during the refractory period If false, the analog value of the devices is integrated, instead of their binary value Spike_RRAM Spike_RRAM Rbf Radial basis function fully connected layer. Option [default value] Description Number of output neurons Centers initial values filler NbOutputs CentersFiller [NormalFiller(0.5, 0.05)] Scaling initial values filler ScalingFiller [NormalFiller(10.0, 0.05)] Configuration parameters (Frame models) Option [default value] Solvers.* CentersSolver.* Model(s) all Frame all Frame ScalingSolver.* all Frame RbfApprox [None] Frame 4.7.14 Description Any solver parameters Centers solver parameters, take precedence over the Solvers.* parameters Scaling solver parameters, take precedence over the Solvers.* parameters Approximation for the Gaussian function, can be any of: None, Rectangular or SemiLinear Softmax Softmax layer. Option [default value] NbOutputs WithLoss [0] [0] GroupSize Description Number of output neurons Softmax followed with a multinomial logistic layer Softmax is applied on groups of outputs. The group size must be a divisor of NbOutputs parameter. The softmax function performs the following operation, with aix,y and bix,y the input and the output respectively at position (x, y) on channel i: bix,y = exp(aix,y ) N P exp(ajx,y ) j=0 63/82 and daix,y = N X δij − aix,y ajx,y dbjx,y j=0 When the WithLoss option is enabled, compute the gradient directly in respect of the cross-entropy loss: Lx,y = N X tjx,y log(bjx,y ) j=0 In this case, the gradient output becomes: daix,y = dbix,y with dbix,y = tix,y − bix,y 4.7.15 LRN Local Response Normalization (LRN) layer. 
Option [default value]: Description
NbOutputs: Number of output neurons

The response-normalized activity $b^i_{x,y}$ is given by the expression:

$$b^i_{x,y} = \frac{a^i_{x,y}}{\left(k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \left(a^j_{x,y}\right)^2\right)^{\beta}}$$

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
N [5] (all Frame): Normalization window width in elements
Alpha [1.0e-4] (all Frame): Value of the alpha variance scaling parameter in the normalization formula
Beta [0.75] (all Frame): Value of the beta power parameter in the normalization formula
K [2.0] (all Frame): Value of the k parameter in the normalization formula

4.7.16 LSTM

Long Short-Term Memory layer (Hochreiter and Schmidhuber, 1997).

Global layer parameters (Frame_CUDA models)
Option [default value]: Description
SeqLength: Maximum sequence length that the LSTM can take as an input.
BatchSize: Number of sequences used for a single weight update, i.e. the size of the batch.
InputDim: Dimension of every element composing a sequence.
HiddenSize: Dimension of the LSTM inner state and output.
SingleBackpropFeeding [1]: If disabled, the full output sequence is returned.
Bidirectional [0]: If enabled, build a bidirectional structure.
AllGatesWeightsFiller: Initial values filler for the weights of all gates.
AllGatesBiasFiller: Initial values filler for the biases of all gates.
WeightsInputGateFiller: Input gate previous-layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.
WeightsForgetGateFiller: Forget gate previous-layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.
WeightsCellGateFiller: Cell gate (or new memory) previous-layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.
WeightsOutputGateFiller: Output gate previous-layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.
BiasInputGateFiller: Input gate previous-layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.
BiasRecurrentForgetGateFiller: Forget gate recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter. Often set to 1.0 for better convergence.
BiasPreviousLayerForgetGateFiller: Forget gate previous-layer bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.
BiasCellGateFiller: Cell gate (or new memory) previous-layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.
BiasOutputGateFiller: Output gate previous-layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.
HxFiller: Initialization of the recurrent previous state. Often set to 0.0.
CxFiller: Initialization of the recurrent previous LSTM inner state. Often set to 0.0.

Configuration parameters (Frame_CUDA models)
Option [default value] (Model(s)): Description
Solvers.* (all Frame): Any solver parameters
Dropout [0.0] (all Frame): The probability with which the value from the input would be dropped.
InputMode [] (all Frame): If enabled, drop the matrix multiplication of the input data.
Algo [0] (all Frame): Allows choosing between different cuDNN implementations. Can be 0: STANDARD, 1: STATIC, 2: DYNAMIC. Cases 1 and 2 are not supported yet.

Current restrictions:
• Only the Frame_CUDA version is supported yet.
• The implementation only supports input sequences with a fixed length associated with a single label.
• cuDNN structures require the input data to be ordered as [1, InputDim, BatchSize, SeqLength].
Depending on the use case (for instance sequential-MNIST), the input data would need to be shuffled between the stimuli provider and the RNN in order to process batches of data. No shuffling layer is operational yet; in that case, set the batch size to one for the first experiments.

Further development requirements:

When it comes to RNNs, two main factors need to be considered to build proper interfaces:
1. Whether the input data has a variable or a fixed length over the database, that is to say whether the input data will have a variable or a fixed sequence length. Of course, the main strength of an RNN is to process variable-length data.
2. The labelling granularity of the input data, that is to say whether every element of a sequence is labelled or the sequence itself has only one label.

For instance, let's consider sentences as sequences of words in which every word is part of a vocabulary. Sentences could have a variable length and every element/word would have a label. In that case, every relevant element of the output sequence from the recurrent structure is turned into a prediction through a fully connected layer with a linear activation function and a softmax. Conversely, with the sequential-MNIST database, the sequence length is the same for every image and there is only one label per image. In that case, the last element of the output sequence is the most relevant one to be turned into a prediction, as it carries the information of the entire input sequence.

To provide flexibility according to these factors, the first implementation choice is to set a maximum sequence length SeqLength as a hyperparameter provided by the user. Variable-length sequences can then be processed by padding the remaining steps of the input sequence. Two cases then occur, depending on whether the labelling granularity is at the level of each element of the sequence or of the sequence itself:
1. The sequence itself has only one label: the model has a fixed size, with one fully connected layer mapped to the relevant element of the output sequence according to the input sequence.
2. Every element of the sequence is labelled: the model has a fixed size, with one big fully connected layer (or Tmax fully connected layers) mapped to the relevant elements of the output sequence according to the input sequence. The remaining elements need to be masked so that they do not influence longer sequences.

Figure 15: RNN model: variable sequence length and labelling scaled at the sequence
Figure 16: RNN model: variable sequence length and labelling scaled at each element of the sequence

Development guidance:
• Replace the inner local variables of LSTMCell_Frame_Cuda with a generic (on-device) shuffling layer to enable the processing of data batches.
• Develop some kind of label embedding within the layer to better articulate the labelling granularity of the input data.
• Adapt the structures to support the STATIC and DYNAMIC algorithms of the cuDNN functions.

4.7.17 Dropout

Dropout layer (Srivastava et al., 2012).

Option [default value]: Description
NbOutputs: Number of output neurons

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Dropout [0.5] (all Frame): The probability with which the value from the input would be dropped
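A minimal usage example is sketched below (the layer and section names are illustrative; the same pattern is used in the tutorial of section 5.2). The drop probability is set explicitly here through a configuration section:

[fc1.drop]
Input=fc1                  ; illustrative name of the preceding layer
Type=Dropout
NbOutputs=[fc1]NbOutputs   ; a Dropout layer keeps the same number of outputs as its input
ConfigSection=drop.config

[drop.config]
Dropout=0.5                ; probability with which a value is dropped (default value)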
Configuration parameters
Option [default value] (Model(s)): Description
AlignCorners [True] (all Frame): Corner alignment mode if BilinearTF is used as interpolation mode

4.7.20 BatchNorm

Batch Normalization layer (Ioffe and Szegedy, 2015).

Option [default value]: Description
NbOutputs: Number of output neurons
ActivationFunction [Tanh]: Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh
ScalesSharing []: Share the scales with another layer
BiasesSharing []: Share the biases with another layer
MeansSharing []: Share the means with another layer
VariancesSharing []: Share the variances with another layer

Configuration parameters (Frame models)
Option [default value] (Model(s)): Description
Solvers.* (all Frame): Any solver parameters
ScaleSolver.* (all Frame): Scale solver parameters, take precedence over the Solvers.* parameters
BiasSolver.* (all Frame): Bias solver parameters, take precedence over the Solvers.* parameters
Epsilon [0.0] (all Frame): Epsilon value used in the batch normalization formula. If 0.0, automatically choose the minimum possible value.

4.7.21 Transformation

Transformation layer, which can apply any transformation described in section 4.6.1. Useful for example for fully-CNN post-processing. 

Option [default value]: Description
NbOutputs: Number of outputs
Transformation: Name of the transformation to apply

The options of the applied transformation must be placed in the same section. Usage example for fully CNNs:

[post.Transformation-thres]
Input=... ; for example, the network's logistic or softmax output layer
NbOutputs=1
Type=Transformation
Transformation=ThresholdTransformation
Operation=ToZero
Threshold=0.75

[post.Transformation-morpho]
Input=post.Transformation-thres
NbOutputs=1
Type=Transformation
Transformation=MorphologyTransformation
Operation=Opening
Size=3

5 Tutorials

5.1 Learning deep neural networks: tips and tricks

5.1.1 Choose the learning solver

Generally, you should use the SGD solver with momentum (typical value for the momentum: 0.9). It generalizes better, often significantly better, than adaptive methods like Adam (Wilson et al., 2017). Adaptive solvers, like Adam, may be used for fast exploration and prototyping, thanks to their fast convergence.

5.1.2 Choose the learning hyper-parameters

You can use the -find-lr option available in the n2d2 executable to automatically find the best learning rate for a given neural network. Usage example:

./n2d2 model.ini -find-lr 10000

This command starts from a very low learning rate (1.0e-6) and increases it exponentially to reach the maximum value (10.0) after 10000 steps, as shown in figure 17. The loss change during this phase is then plotted as a function of the learning rate, as shown in figure 18.

Figure 17: Exponential increase of the learning rate over the specified number of iterations, equal to the number of steps divided by the batch size (here: 24).
Figure 18: Loss change as a function of the learning rate.

Note that in N2D2, the learning rate is automatically normalized by the global batch size (N × IterationSize) for the SGDSolver. A simple linear scaling rule is used, as recommended in (Goyal et al., 2017). The effective learning rate $\alpha_{\text{eff}}$ applied for the parameter updates is therefore:

$$\alpha_{\text{eff}} = \frac{\alpha}{N \times \text{IterationSize}} \quad \text{with} \quad \alpha = \text{LearningRate}$$

Typical values for the SGDSolver are:

Solvers.LearningRate=0.01
Solvers.Decay=0.0001
Solvers.Momentum=0.9

5.1.3 Convergence and normalization

Deep networks (> 30 layers) and especially residual networks usually don't converge without normalization. Indeed, batch normalization is almost always used. ZeroInit is a method that can be used to overcome this issue without normalization (Zhang et al., 2019).
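As an illustration, a normalization layer can be inserted after each convolution using the BatchNorm layer type described in section 4.7.20. The snippet below is a minimal sketch (the layer names and kernel sizes are arbitrary); one common pattern, assumed here, is to set the convolution activation to Linear so that the non-linearity is applied after the normalization:

[conv1]
Input=sp
Type=Conv
KernelWidth=3
KernelHeight=3
NbOutputs=32
ActivationFunction=Linear    ; the non-linearity is applied after the normalization

[bn1]
Input=conv1
Type=BatchNorm
NbOutputs=[conv1]NbOutputs   ; same number of outputs as the convolution
ActivationFunction=Rectifier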
5.2 Building a classifier neural network For this tutorial, we will use the classical MNIST handwritten digit dataset. A driver module already exists for this dataset, named MNIST_IDX_Database. To instantiate it, just add the following lines in a new INI file: [database] Type=MNIST_IDX_Database Validation=0.2 ; Use 20% of the dataset for validation In order to create a neural network, we first need to define its input, which is declared with a [sp] section (sp for StimuliProvider). In this section, we configure the size of the input and the batch size: 71/82 [sp] SizeX=32 SizeY=32 BatchSize=128 We can also add pre-processing transformations to the StimuliProvider, knowing that the final data size after transformations must match the size declared in the [sp] section. Here, we must rescale the MNIST 28x28 images to match the 32x32 network input size. [sp.Transformation_1] Type=RescaleTransformation Width=[sp]SizeX Height=[sp]SizeY Next, we declare the neural network layers. In this example, we reproduced the well-known LeNet network. The first layer is a 5x5 convolutional layer, with 6 channels. Since there is only one input channel, there will be only 6 convolution kernels in this layer. [conv1] Input=sp Type=Conv KernelWidth=5 KernelHeight=5 NbOutputs=6 The next layer is a 2x2 MAX pooling layer, with a stride of 2 (non-overlapping MAX pooling). [pool1] Input=conv1 Type=Pool PoolWidth=2 PoolHeight=2 NbOutputs=[conv1]NbOutputs Stride=2 Pooling=Max Mapping.Size=1 ; One to one connection between input and output channels The next layer is a 5x5 convolutional layer with 16 channels. [conv2] Input=pool1 Type=Conv KernelWidth=5 KernelHeight=5 NbOutputs=16 Note that in LeNet, the [conv2] layer is not fully connected to the pooling layer. In N2D2, a custom mapping can be defined for each input connection. The connection of n-th output map to the inputs is defined by the n-th column of the matrix below, where the rows correspond to the inputs. Mapping(pool1)=\ 1 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 0 1 1 1 0 0 0 1 1 1 0 1 1 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 1 1 1 1 0 0 1 1 1 0 1 1 0 0 1 1 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 \ \ \ \ \ Another MAX pooling and convolution layer follow: [pool2] Input=conv2 Type=Pool PoolWidth=2 PoolHeight=2 72/82 NbOutputs=[conv2]NbOutputs Stride=2 Pooling=Max Mapping.Size=1 [conv3] Input=pool2 Type=Conv KernelWidth=5 KernelHeight=5 NbOutputs=120 The network is composed of two fully-connected layers of 84 and 10 neurons respectively: [fc1] Input=conv3 Type=Fc NbOutputs=84 [fc2] Input=fc1 Type=Fc NbOutputs=10 Finally, we use a softmax layer to obtain output classification probabilities and compute the loss function. [softmax] Input=fc2 Type=Softmax NbOutputs=[fc2]NbOutputs WithLoss=1 In order to tell N2D2 to compute the error and the classification score on this softmax layer, one must attach a N2D2 Target to this layer, with a section with the same name suffixed with .Target: [softmax.Target] By default, the activation function for the convolution and the fully-connected layers is the hyperbolic tangent. Because the [fc2] layer is fed to a softmax, it should not have any activation function. We can specify it by adding the following line in the [fc2] section: [fc2] ... ActivationFunction=Linear In order to improve further the networks performances, several things can be done: • Use ReLU activation functions. 
In order to do so, just add the following in the [conv1], [conv2], [conv3] and [fc1] layer sections: ActivationFunction=Rectifier For the ReLU activation function to be effective, the weights must be initialized carefully, in order to avoid dead units that would be stuck in the ] − ∞, 0] output range before the ReLU function. In N2D2, one can use a custom WeightsFiller for the weights initialization. For the ReLU activation function, a popular and efficient filler is the so-called XavierFiller (see the 4.7.2 section for more information): WeightsFiller=XavierFiller • Use dropout layers. Dropout is highly effective to improve the network generalization capacity. Here is an example of a dropout layer inserted between the [fc1] and [fc2] layers: [fc1] ... 73/82 [fc1.drop] Input=fc1 Type=Dropout NbOutputs=[fc1]NbOutputs [fc2] Input=fc1.drop ; Replaces "Input=fc1" ... • Tune the learning parameters. You may want to tune the learning rate and other learning parameters depending on the learning problem at hand. In order to do so, you can add a configuration section that can be common (or not) to all the layers. Here is an example of configuration section: [conv1] ... ConfigSection=common.config [...] ... [common.config] NoBias=1 WeightsSolver.LearningRate=0.05 WeightsSolver.Decay=0.0005 Solvers.LearningRatePolicy=StepDecay Solvers.LearningRateStepSize=[sp]_EpochSize Solvers.LearningRateDecay=0.993 Solvers.Clamping=1 For more details on the configuration parameters for the Solver, see section 4.7.3. • Add input distortion. See for example the DistortionTransformation (section 4.6.1). The complete INI model corresponding to this tutorial can be found in models/LeNet.ini. In order to use CUDA/GPU accelerated learning, the default layer model should be switched to Frame_CUDA. You can enable this model by adding the following line at the top of the INI file (before the first section): DefaultModel=Frame_CUDA 5.3 Building a segmentation neural network In this tutorial, we will learn how to do image segmentation with N2D2. As an example, we will implement a face detection and gender recognition neural network, using the IMDB-WIKI dataset. First, we need to instanciate the IMDB-WIKI dataset built-in N2D2 driver: [database] Type=IMDBWIKI_Database WikiSet=1 ; Use the WIKI part of the dataset IMDBSet=0 ; Don’t use the IMDB part (less accurate annotation) Learn=0.90 Validation=0.05 DefaultLabel=background ; Label for pixels outside any ROI (default is no label, pixels are ignored) We must specify a default label for the background, because we want to learn to differenciate faces from the background (and not simply ignore the background for the learning). The network input is then declared: [sp] SizeX=480 74/82 SizeY=360 BatchSize=48 CompositeStimuli=1 In order to work with segmented data, i.e. data with bounding box annotations or pixel-wise annotations (as opposed to a single label per data), one must enable the CompositeStimuli option in the [sp] section. We can then perform various operations on the data before feeding it to the network, like for example converting the 3-channels RGB input images to single-channel gray images: [sp.Transformation-1] Type=ChannelExtractionTransformation CSChannel=Gray We must only rescale the images to match the networks input size. This can be done using a RescaleTransformation, followed by a PadCropTransformation if one want to keep the images aspect ratio. 
[sp.Transformation-2] Type=RescaleTransformation Width=[sp]SizeX Height=[sp]SizeY KeepAspectRatio=1 ; Keep images aspect ratio ; Required to ensure all the images are the same size [sp.Transformation-3] Type=PadCropTransformation Width=[sp]SizeX Height=[sp]SizeY A common additional operation to extend the learning set is to apply random horizontal mirror to images. This can be achieved with the following FlipTransformation: [sp.OnTheFlyTransformation-4] Type=FlipTransformation RandomHorizontalFlip=1 ApplyTo=LearnOnly ; Apply this transformation only on the learning set Note that this is an on-the-fly transformation, meaning it cannot be cached and is re-executed every time even for the same stimuli. We also apply this transformation only on the learning set, with the ApplyTo option. Next, the neural network can be described: [conv1.1] Input=sp Type=Conv ... [pool1] ... [...] ... [fc2] Input=drop1 Type=Conv ... [drop2] Input=fc2 Type=Dropout NbOutputs=[fc2]NbOutputs 75/82 A full network description can be found in the IMDBWIKI.ini file in the models directory of N2D2. It is a fully-CNN network. Here we will focus on the output layers required to detect the faces and classify their gender. We start from the [drop2] layer, which has 128 channels of size 60x45. 5.3.1 Faces detection We want to first add an output stage for the faces detection. It is a 1x1 convolutional layer with a single 60x45 output map. For each output pixel, this layer outputs the probability that the pixel belongs to a face. [fc3.face] Input=drop2 Type=Conv KernelWidth=1 KernelHeight=1 NbOutputs=1 Stride=1 ActivationFunction=LogisticWithLoss WeightsFiller=XavierFiller ConfigSection=common.config ; Same solver options that the other layers In order to do so, the activation function of this layer must be of type LogisticWithLoss. We must also tell N2D2 to compute the error and the classification score on this softmax layer, by attaching a N2D2 Target to this layer, with a section with the same name suffixed with .Target: [fc3.face.Target] LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat ; Visualization parameters NoDisplayLabel=0 LabelsHueOffset=90 In this Target, we must specify how the dataset annotations are mapped to the layer’s output. This can be done in a separate file using the LabelsMapping parameter. Here, since the output layer has a single output per pixel, the target value can only be 0 or 1. A target value of -1 means that this output is ignored (no error back-propagated). Since the only annotations in the IMDB-WIKI dataset are faces, the mapping described in the IMDBWIKI_target_face.dat file is easy: # background background 0 # padding (*) is ignored (-1) * -1 # not background = face default 1 5.3.2 Gender recognition We can also add a second output stage for gender recognition. Like before, it would be a 1x1 convolutional layer with a single 60x45 output map. But here, for each output pixel, this layer would output the probability that the pixel represents a female face. [fc3.gender] Input=drop2 Type=Conv KernelWidth=1 KernelHeight=1 NbOutputs=1 Stride=1 ActivationFunction=LogisticWithLoss WeightsFiller=XavierFiller 76/82 ConfigSection=common.config The output layer is therefore identical to the face’s output layer, but the target mapping is different. For the target mapping, the idea is simply to ignore all pixels not belonging to a face and affect the target 0 to male pixels and the target 1 to female pixels. 
[fc3.gender.Target] LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat ; Only display gender probability for pixels detected as face pixels MaskLabelTarget=fc3.face.Target MaskedLabel=1 The content of the IMDBWIKI_target_gender.dat file would therefore look like: # background # ?-* (unknown gender) # padding default -1 # male gender M-? 0 # unknown age M-0 0 M-1 0 M-2 0 ... M-98 0 M-99 0 # female gender F-? 1 # unknown age F-0 1 F-1 1 F-2 1 ... F-98 1 F-99 1 5.3.3 ROIs extraction The next step would be to extract detected face ROIs and assign for each ROI the most probable gender. To this end, we can first set a detection threshold, in terms of probability, to select face pixels. In the following, the threshold is fixed to 75% face probability: [post.Transformation-thres] Input=fc3.face Type=Transformation NbOutputs=1 Transformation=ThresholdTransformation Operation=ToZero Threshold=0.75 We can then assign a target of type TargetROIs to this layer that will automatically create the bounding box using a segmentation algorithm. [post.Transformation-thres.Target-face] Type=TargetROIs MinOverlap=0.33 ; Min. overlap fraction to match the ROI to an annotation FilterMinWidth=5 ; Min. ROI width FilterMinHeight=5 ; Min. ROI height FilterMinAspectRatio=0.5 ; Min. ROI aspect ratio FilterMaxAspectRatio=1.5 ; Max. ROI aspect ratio LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat In order to assign a gender to the extracted ROIs, the above target must be modified to: 77/82 [post.Transformation-thres.Target-gender] Type=TargetROIs ROIsLabelTarget=fc3.gender.Target MinOverlap=0.33 FilterMinWidth=5 FilterMinHeight=5 FilterMinAspectRatio=0.5 FilterMaxAspectRatio=1.5 LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat Here, we use the fc3.gender.Target target to determine the most probable gender of the ROI. 5.3.4 Data visualization For each Target in the network, a corresponding folder is created in the simulation directory, which contains learning, validation and test confusion matrixes. The output estimation of the network for each stimulus is also generated automatically for the test dataset and can be visualized with the ./test.py helper tool. An example is shown in figure 19. Pixels input label (dataset annotation) Network output estimation: pixels most probable object type Image selection Labels legend (object type) Figure 19: Example of the target visualization helper tool. 5.4 Transcoding a learned network in spike-coding N2D2 embeds an event-based simulator (historically known as ’Xnet’) and allows to transcode a whole DNN in a spike-coding version and evaluate the resulting spiking neural network performances. In this tutorial, we will transcode the LeNet network described in section 5.2. 5.4.1 Render the network compatible with spike simulations The first step is to specify that we want to use a transcode model (allowing both formal and spike simulation of the same network), by changing the DefaultModel to: DefaultModel=Transcode_CUDA In order to perform spike simulations, the input of the network must be of type Environment, which is a derived class of StimuliProvider that adds spike coding support. In the INI model file, it 78/82 is therefore necessary to replace the [sp] section by an [env] section and replace all references of sp to env. Note that these changes have at this point no impact at all on the formal coding simulations. 
The beginning of the INI file should be: DefaultModel=Transcode_CUDA ; Database [database] Type=MNIST_IDX_Database Validation=0.2 ; Use 20% of the dataset for validation ; Environment [env] SizeX=32 SizeY=32 BatchSize=128 [env.Transformation_1] Type=RescaleTransformation Width=[env]SizeX Height=[env]SizeY [conv1] Input=env ... The dropout layer has no equivalence in spike-coding inference and must be removed: ... [fc1.drop] Input=fc1 Type=Dropout NbOutputs=[fc1]NbOutputs [fc2] Input=fc1.drop ... The softmax layer has no equivalence in spike-coding inference and must be removed as well. The Target must therefore be attached to [fc2]: ... [softmax] Input=fc2 Type=Softmax NbOutputs=[fc2]NbOutputs WithLoss=1 [softmax.Target] [fc2.Target] ... The network is now compatible with spike-coding simulations. However, we did not specify at this point how to translate the input stimuli data into spikes, nor the spiking neuron parameters (threshold value, leak time constant...). 5.4.2 Configure spike-coding parameters The first step is to configure how the input stimuli data must be coded into spikes. To this end, we must attach a configuration section to the Environment. Here, we specify a periodic coding with random initial jitter with a minimum period of 10 ns and a maximum period of 100 us: 79/82 [env] ... ConfigSection=env.config [env.config] ; Spike-based computing StimulusType=JitteredPeriodic PeriodMin=1,000,000 ; unit = fs PeriodMeanMin=10,000,000 ; unit = fs PeriodMeanMax=100,000,000,000 ; unit = fs PeriodRelStdDev=0.0 The next step is to specify the neurons parameters, that will be common to all layers and can therefore be specified in the [common.config] section. In N2D2, the base spike-coding layers use a Leaky Integrate-and-Fire (LIF) neuron model. By default, the leak time constant is zero, resulting to simple Integrate-and-Fire (IF) neurons. Here we simply specify that the neurons threshold must be the unity, that the threshold is only positive and that there is no incoming synaptic delay: [common.config] ... ; Spike-based computing Threshold=1.0 BipolarThreshold=0 IncomingDelay=0 Finally, we can limit the number of spikes required for the computation of each stimulus by adding a decision delta threshold at the output layer: [fc2] ... ConfigSection=common.config,fc2.config [fc2.Target] [fc2.config] ; Spike-based computing TerminateDelta=4 BipolarThreshold=1 The complete INI model corresponding to this tutorial can be found in models/LeNet_Spike.ini. Here is a summary of the steps required to reproduce the whole experiment: ./n2d2 "$N2D2_MODELS/LeNet.ini" -learn 6000000 -log 100000 ./n2d2 "$N2D2_MODELS/LeNet_Spike.ini" -test The final recognition rate reported at the end of the spike inference should be almost identical to the formal coding network (around 99% for the LeNet network). Various statistics are available at the end of the spike-coding simulation in the stats_spike folder and the stats_spike.log file. Looking in the stats_spike.log file, one can read the following line towards the end of the file: Read events per virtual synapse per pattern (average): 0.654124 This line reports the average number of accumulation operations per synapse per input stimulus in the network. If this number if below 1.0, it means that the spiking version of the network is more efficient than its formal counterpart in terms of total number of operations! 80/82 References M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. 
The cityscapes dataset for semantic urban scene understanding. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: A benchmark. In CVPR, 2009. L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories. In IEEE. CVPR 2004, Workshop on Generative-Model Based Vision, 2004. X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International conference on artificial intelligence and statistics, page 249–256, 2010. P. Goyal, P. Dollár, R. B. Girshick, P. Noordhuis, L. Wesolowski, A. Kyrola, A. Tulloch, Y. Jia, and K. He. Accurate, large minibatch SGD: training imagenet in 1 hour. CoRR, abs/1706.02677, 2017. URL http://arxiv.org/abs/1706.02677. B. Graham. Fractional max-pooling. CoRR, abs/1412.6071, 2014. G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset, 2007. K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), ICCV ’15, pages 1026–1034, 2015. doi: 10.1109/ICCV.2015.123. S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735. S. Houben, J. Stallkamp, J. Salmen, M. Schlipsing, and C. Igel. Detection of traffic signs in real-world images: The German Traffic Sign Detection Benchmark. In International Joint Conference on Neural Networks, number 1288, 2013. S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. CoRR, abs/1502.03167, 2015. V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings, 2010. D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014. URL http://arxiv.org/abs/1412.6980. A. Krizhevsky. Learning multiple layers of features from tiny images, 2009. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. In Proceedings of the IEEE, volume 86, pages 2278–2324, 1998. P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews. The Extended CohnKanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. 2010. A. Rakotomamonjy and G. Gasso. Histogram of gradients of time-frequency representations for audio scene detection, 2014. 81/82 O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. doi: 10.1007/s11263-015-0816-y. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from voverfitting. Journal of Machine Learning Research, 15: 1929–1958, 2012. J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Networks, 2012. ISSN 0893-6080. doi: 10.1016/j.neunet.2012.02.016. A. C. Wilson, R. Roelofs, M. Stern, N. Srebro, and B. Recht. The Marginal Value of Adaptive Gradient Methods in Machine Learning. arXiv e-prints, art. arXiv:1705.08292, May 2017. G. Xia, X. Bai, J. Ding, Z. Zhu, S. 
J. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang. DOTA: A large-scale dataset for object detection in aerial images. CoRR, abs/1711.10398, 2017. URL http://arxiv.org/abs/1711.10398. H. Zhang, Y. N. Dauphin, and T. Ma. Residual learning without normalization via better initialization. In International Conference on Learning Representations, 2019. URL https: //openreview.net/forum?id=H1gsz30cKX. 82/82