
Commissariat à l’Energie Atomique et aux Energies Alternatives Département Architecture Conception et Logiciels Embarqués

Institut List | CEA Saclay Nano-INNOV | Bât. 861-PC142

91191 Gif-sur-Yvette Cedex - FRANCE

Tel. : +33 (0)1.69.08.49.67 | Fax : +33(0)1.69.08.83.95

www-list.cea.fr

Établissement Public à caractère Industriel et Commercial | RCS Paris B 775 685 019

Neural Network Design & Deployment

Olivier Bichler, David Briand, Victor Gacoin, Benjamin Bertelone, Thibault Allenet

Monday 21st January, 2019

Contents

1 Presentation
    1.1 Database handling
    1.2 Data pre-processing
    1.3 Deep network building
    1.4 Performances evaluation
    1.5 Hardware exports
    1.6 Summary
2 About N2D2-IP
3 Performing simulations
    3.1 Obtaining the latest version of this manual
    3.2 Minimum system requirements
    3.3 Obtaining N2D2
        3.3.1 Prerequisites
            Red Hat Enterprise Linux (RHEL) 6
            Ubuntu
            Windows
        3.3.2 Getting the sources
        3.3.3 Compilation
    3.4 Downloading training datasets
    3.5 Run the learning
    3.6 Test a learned network
        3.6.1 Interpreting the results
            Recognition rate
            Confusion matrix
            Memory and computation requirements
            Kernels and weights distribution
            Output maps activity
    3.7 Export a learned network
        3.7.1 C export (N2D2-IP only)
        3.7.2 CPP_OpenCL export (N2D2-IP only)
        3.7.3 CPP_TensorRT export
        3.7.4 CPP_cuDNN export
        3.7.5 C_HLS export (N2D2-IP only)
4 INI file interface
    4.1 Syntax
        4.1.1 Properties
        4.1.2 Sections
        4.1.3 Case sensitivity
        4.1.4 Comments
        4.1.5 Quoted values
        4.1.6 Whitespace
        4.1.7 Escape characters
    4.2 Template inclusion syntax
        4.2.1 Variable substitution
        4.2.2 Control statements
            block
            for
            if
            include
    4.3 Global parameters
    4.4 Databases
        4.4.1 MNIST
        4.4.2 GTSRB
        4.4.3 Directory
        4.4.4 Other built-in databases
            CIFAR10_Database
            CIFAR100_Database
            CKP_Database
            Caltech101_DIR_Database
            Caltech256_DIR_Database
            CaltechPedestrian_Database
            Cityscapes_Database
            Daimler_Database
            DOTA_Database
            FDDB_Database
            GTSDB_DIR_Database
            ILSVRC2012_Database
            KITTI_Database
            KITTI_Road_Database
            KITTI_Object_Database
            LITISRouen_Database
        4.4.5 Dataset images slicing
    4.5 Stimuli data analysis
        4.5.1 Zero-mean and unity standard deviation normalization
        4.5.2 Subtracting the mean image of the set
    4.6 Environment
        4.6.1 Built-in transformations
            AffineTransformation
            ApodizationTransformation
            ChannelExtractionTransformation
            ColorSpaceTransformation
            DFTTransformation
            DistortionTransformation (N2D2-IP only)
            EqualizeTransformation (N2D2-IP only)
            ExpandLabelTransformation (N2D2-IP only)
            FilterTransformation
            FlipTransformation
            GradientFilterTransformation (N2D2-IP only)
            LabelSliceExtractionTransformation (N2D2-IP only)
            MagnitudePhaseTransformation
            MorphologicalReconstructionTransformation (N2D2-IP only)
            MorphologyTransformation (N2D2-IP only)
            NormalizeTransformation
            PadCropTransformation
            RandomAffineTransformation (N2D2-IP only)
            RangeAffineTransformation
            RangeClippingTransformation (N2D2-IP only)
            RescaleTransformation
            ReshapeTransformation
            SliceExtractionTransformation (N2D2-IP only)
            ThresholdTransformation
            TrimTransformation
            WallisFilterTransformation (N2D2-IP only)
    4.7 Network layers
        4.7.1 Layer definition
        4.7.2 Weight fillers
            ConstantFiller
            HeFiller
            NormalFiller
            UniformFiller
            XavierFiller
        4.7.3 Weight solvers
            SGDSolver_Frame
            SGDSolver_Frame_CUDA
            AdamSolver_Frame
            AdamSolver_Frame_CUDA
        4.7.4 Activation functions
            Logistic
            LogisticWithLoss
            Rectifier
            Saturation
            Softplus
            Tanh
            TanhLeCun
        4.7.5 Anchor
            Configuration parameters (Frame models)
            Outputs remapping
        4.7.6 Conv
            Configuration parameters (Frame models)
            Configuration parameters (Spike models)
        4.7.7 Deconv
            Configuration parameters (Frame models)
        4.7.8 Pool
            Maxout example
            Configuration parameters (Spike models)
        4.7.9 Unpool
        4.7.10 ElemWise
            Sum operation
            AbsSum operation
            EuclideanSum operation
            Prod operation
            Max operation
            Examples
        4.7.11 FMP
            Configuration parameters (Frame models)
        4.7.12 Fc
            Configuration parameters (Frame models)
            Configuration parameters (Spike models)
        4.7.13 Rbf (N2D2-IP only)
            Configuration parameters (Frame models)
        4.7.14 Softmax
        4.7.15 LRN
            Configuration parameters (Frame models)
        4.7.16 LSTM
            Global layer parameters (Frame_CUDA models)
            Configuration parameters (Frame_CUDA models)
            Current restrictions
            Further development requirements
            Development guidance
        4.7.17 Dropout
            Configuration parameters (Frame models)
        4.7.18 Padding
        4.7.19 Resize
            Configuration parameters
        4.7.20 BatchNorm
            Configuration parameters (Frame models)
        4.7.21 Transformation
5 Tutorials
    5.1 Learning deep neural networks: tips and tricks
        5.1.1 Choose the learning solver
        5.1.2 Choose the learning hyper-parameters
        5.1.3 Convergence and normalization
    5.2 Building a classifier neural network
    5.3 Building a segmentation neural network
        5.3.1 Faces detection
        5.3.2 Gender recognition
        5.3.3 ROIs extraction
        5.3.4 Data visualization
    5.4 Transcoding a learned network in spike-coding
        5.4.1 Render the network compatible with spike simulations
        5.4.2 Configure spike-coding parameters

1 Presentation

The N2D2 platform is a comprehensive solution for fast and accurate Deep Neural Network (DNN) simulation and for fully automated building of DNN-based applications. The platform integrates database construction, data pre-processing, network building, benchmarking and hardware export to various targets. It is particularly useful for DNN design and exploration, allowing simple and fast prototyping of DNNs with different topologies. Multiple network topology variations can be defined and learned, and their performances (in terms of recognition rate and computational cost) compared automatically. Export targets include CPU, DSP and GPU with OpenMP, OpenCL, Cuda, cuDNN and TensorRT programming models, as well as custom hardware IP code generation with High-Level Synthesis for FPGA and a dedicated configurable DNN accelerator IP [1].

In the following, the first section describes the database handling capabilities of the tool, which can automatically generate learning, validation and testing data sets from any hand-made database (for example from simple file directories). The second section briefly describes the data pre-processing capabilities built into the tool, which does not require any external pre-processing step and can handle many data transformation, normalization and augmentation operations (for example using elastic distortion to improve the learning). The third section shows an example of DNN building using a simple INI text configuration file. The fourth section shows some examples of metrics obtained after the learning and testing to evaluate the performances of the learned DNN. Next, the fifth section introduces the DNN hardware export capabilities of the toolflow, which can automatically generate ready-to-use code for various targets such as embedded GPUs or fully custom dedicated FPGA IP. Finally, we conclude by summarising the main features of the tool.

1.1 Database handling

The tool integrates everything needed to handle custom or hand made databases:

• Genericity: load image and sound, 1D, 2D or 3D data;

• Associate a label with each data point (useful for scene labeling, for example) or a single label with each data file (one object/class per image, for example), with 1D or 2D labels;

• Advanced Region of Interest (ROI) handling:

    - Support arbitrary ROI shapes (circular, rectangular, polygonal or pixelwise defined);

    - Convert ROIs to data point (pixelwise) labels;

    - Extract one or multiple ROIs from an initial dataset to create as many corresponding additional data to feed the DNN;

• Native support of file directory-based databases, where each sub-directory represents a different label. Most used image file formats are supported (JPEG, PNG, PGM...);

• Possibility to add custom data file formats to the tool without any change in the code base;

• Automatic random partitioning of the database into learning, validation and testing sets.
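As an illustration of the last points of this list, the sketch below derives a label from the parent sub-directory of each file and randomly splits the files into learning, validation and testing sets. It is a minimal Python sketch under stated assumptions, not the N2D2 implementation: the function names (`label_from_path`, `random_partition`), the file names, the split ratios and the fixed seed are all hypothetical.

```python
import random
from pathlib import PurePosixPath


def label_from_path(path):
    """Use the name of the parent sub-directory as the label (assumption)."""
    return PurePosixPath(path).parent.name


def random_partition(items, learn=0.6, validation=0.2, seed=0):
    """Shuffle the items and split them into learn/validation/test lists.

    The ratios are illustrative; whatever is left after the learning and
    validation shares goes to the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_learn = int(len(shuffled) * learn)
    n_valid = int(len(shuffled) * validation)
    return (shuffled[:n_learn],
            shuffled[n_learn:n_learn + n_valid],
            shuffled[n_learn + n_valid:])


# Hypothetical directory-based database: one sub-directory per label.
files = (["db/cat/%d.png" % i for i in range(5)]
         + ["db/dog/%d.png" % i for i in range(5)])
learn_set, valid_set, test_set = random_partition(files)
labels = sorted({label_from_path(f) for f in files})
print(len(learn_set), len(valid_set), len(test_set), labels)
```

Every file ends up in exactly one of the three sets, and fixing the seed makes the partition reproducible between runs.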

1.2 Data pre-processing

Data pre-processing, such as image rescaling, normalization or filtering, is directly integrated into the toolflow, with no need for an external tool or pre-processing step. Each pre-processing step is called a transformation.

The full sequence of transformations can be specified easily in an INI text configuration file. For example:

; First step: convert the image to grayscale
[env.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray

; Second step: rescale the image to a 29x29 size
[env.Transformation-2]
Type=RescaleTransformation
Width=29
Height=29

; Third step: apply histogram equalization to the image
[env.Transformation-3]
Type=EqualizeTransformation

; Fourth step (only during learning): apply random elastic distortions to the images to extend the learning set
[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly
ElasticGaussianSize=21
ElasticSigma=6.0
ElasticScaling=20.0
Scaling=15.0
Rotation=15.0

[1] Ongoing work.
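To make the structure of such a file concrete, the following sketch reads a shortened copy of the transformation sequence with Python's standard `configparser` module and lists the steps in file order. This is only an illustration: `configparser` implements Python's own INI dialect, not the N2D2 parser, so N2D2-specific syntax (described in the INI file interface section) may not be accepted by it. The variable names are hypothetical.

```python
import configparser

# Shortened copy of the example above, embedded so the sketch is self-contained.
INI_TEXT = """
; First step: convert the image to grayscale
[env.Transformation-1]
Type=ChannelExtractionTransformation
CSChannel=Gray

; Second step: rescale the image to a 29x29 size
[env.Transformation-2]
Type=RescaleTransformation
Width=29
Height=29

[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly
"""

parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
parser.optionxform = str  # keep property names exactly as written
parser.read_string(INI_TEXT)

# Sections are returned in file order, which is the order of the steps.
steps = [(name, dict(parser[name])) for name in parser.sections()
         if name.startswith("env.")]
for name, props in steps:
    print(name, "->", props["Type"])
```

All values are returned as strings; converting `Width` or `Height` to integers is left to the caller.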

Examples of pre-processing transformations built into the tool are:

• Image color space change and color channel extraction;

• Elastic distortion;

• Histogram equalization (including CLAHE);

• Convolutional filtering of the image with custom or pre-defined kernels (Gaussian, Gabor...);

• (Random) image flipping;

• (Random) extraction of fixed-size slices in a given label (for multi-label images);

• Normalization;

• Rescaling, padding/cropping, trimming;

• Image data range clipping;

• (Random) extraction of fixed-size slices.

1.3 Deep network building

The building of a deep network is straightforward and can be done within the same INI configuration file. Several layer types are available: convolutional, pooling, fully connected, Radial-basis function (RBF) and softmax. The tool is highly modular and new layer types can be added without any change in the code base. The parameters of each layer type are modifiable: for the convolutional layer, for example, one can specify the size of the convolution kernels, the stride, the number of kernels per input map and the learning parameters (learning rate, initial weights value...). For the learning, the data precision can be chosen between 16 bit (with NVIDIA cuDNN [2]), 32 bit and 64 bit floating point numbers.

The following example, which will serve as the use case for the rest of this presentation, shows

how to build a DNN with 5 layers: one convolution layer, followed by one MAX pooling layer,

followed by two fully connected layers and a softmax output layer.

; Specify the input data format
[env]
SizeX=24
SizeY=24
BatchSize=12

; First layer: convolutional with 3x3 kernels
[conv1]
Input=env
Type=Conv
KernelWidth=3
KernelHeight=3
NbOutputs=32
Stride=1

; Second layer: MAX pooling with pooling area 2x2
[pool1]
Input=conv1
Type=Pool
Pooling=Max
PoolWidth=2
PoolHeight=2
NbOutputs=32
Stride=2
Mapping.Size=1 ; one to one connection between convolution output maps and pooling input maps

; Third layer: fully connected layer with 60 neurons
[fc1]
Input=pool1
Type=Fc
NbOutputs=60

; Fourth layer: fully connected with 10 neurons
[fc2]
Input=fc1
Type=Fc
NbOutputs=10

; Final layer: softmax
[softmax]
Input=fc2
Type=Softmax
NbOutputs=10
WithLoss=1

[softmax.Target]
TargetValue=1.0
DefaultValue=0.0

[2] On future GPUs.

The resulting DNN is shown in figure 1.
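The map sizes reported in figure 1 (22x22 after conv1, 11x11 after pool1) follow from the kernel sizes and strides of the configuration. The short Python sketch below reproduces this arithmetic; it assumes no padding, which is consistent with the sizes in the figure, and the helper name `output_size` is hypothetical.

```python
def output_size(input_size, kernel, stride):
    """Output length along one axis, assuming no padding."""
    return (input_size - kernel) // stride + 1


conv1 = output_size(24, kernel=3, stride=1)     # 24x24 input, 3x3 kernels
pool1 = output_size(conv1, kernel=2, stride=2)  # 2x2 pooling, stride 2
fc1_inputs = 32 * pool1 * pool1                 # 32 maps feed the first Fc layer

print(conv1, pool1, fc1_inputs)  # 22 11 3872
```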

The learning is accelerated on GPU using the NVIDIA cuDNN framework, integrated into the toolflow. With GPU acceleration, learning times can typically be reduced by two orders of magnitude, enabling the learning of large databases within tens of minutes to a few hours instead of several days or weeks for non-GPU accelerated learning.

1.4 Performances evaluation

The software automatically outputs all the information needed to analyze the applicative performances of the network, such as the recognition rate and the validation score during the learning; the confusion matrix during learning, validation and test; the memory and computation requirements of the network; the output maps activity for each layer, and so on, as shown in figure 2.

1.5 Hardware exports

Once the recognition rate of the learned DNN is satisfactory, an optimized version of the network can be automatically exported for various embedded targets. The computation performances of the network can also be benchmarked automatically across different targets.

The following targets are currently supported by the toolflow:

• Plain C code (no dynamic memory allocation, no floating point processing);

• C code accelerated with OpenMP;

• C code tailored for High-Level Synthesis (HLS) with Xilinx® Vivado® HLS:

    - Direct synthesis to FPGA, with timing and utilization after routing;

    - Possibility to constrain the maximum number of clock cycles desired to compute the whole network;

    - FPGA utilization vs number of clock cycles trade-off analysis;

• OpenCL code optimized for either CPU/DSP or GPU;

• Cuda kernels, cuDNN and TensorRT code optimized for NVIDIA® GPUs.

Figure 1: Automatically generated and ready to learn DNN from the INI configuration file example (env 24x24, conv1 32 (22x22), pool1 32 (11x11) Max, fc1, fc2, softmax).

Figure 2: Example of information automatically generated by the software during and after learning (panels: recognition rate and validation score; confusion matrix; memory and computation requirements; output maps activity).

Different automated optimizations are embedded in the exports:

• DNN weights and signal data precision reduction (down to 8 bit integers or less for custom FPGA IPs);

• Non-linear network activation functions approximations;

• Different weights discretization methods.
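To give an idea of what precision reduction to 8 bit integers involves, the sketch below applies a generic symmetric linear quantization to a few weights. This is a textbook technique chosen for illustration only; the manual does not say which discretization methods the exports use, and the function names and weight values are hypothetical.

```python
def quantize(weights, nb_bits=8):
    """Map floats to signed integers of nb_bits using one shared scale."""
    q_max = 2 ** (nb_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Recover approximate float values from the integer weights."""
    return [q * scale for q in q_weights]


weights = [0.5, -0.25, 0.125, -1.0]          # made-up example values
q, scale = quantize(weights)
restored = dequantize(q, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, worst_error <= scale / 2 + 1e-12)
```

With round-to-nearest, the error on each weight is bounded by half the quantization step, which is why fewer bits (a larger step) cost accuracy.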

The exports are generated automatically and come with a Makefile and a working testbench, including the pre-processed testing dataset. Once generated, the testbench is ready to be compiled and executed on the target platform. The applicative performance (recognition rate) as well as the computing time per input data can then be directly measured by the testbench.

Figure 3: Example of network benchmarking on different hardware targets (OpenMP, OpenCL, CUDA, HLS FPGA; axis in Kpixels image / s, from 100 to 100000).

Figure 3 shows an example of benchmarking results of the previous DNN on different targets (in log scale). Compared to desktop CPUs, the number of input image pixels processed per second is more than one order of magnitude higher with GPUs and at least two orders of magnitude higher with the DNN synthesized on FPGA.

1.6 Summary

The N2D2 platform is today a complete and production-ready neural network building tool, which does not require advanced knowledge of deep learning to be used. It is tailored for fast neural network application generation and porting, with minimum overhead in terms of database creation and management, data pre-processing, network configuration and optimized code generation. This can reduce months of manual porting and verification effort to a single automated step in the tool.


2 About N2D2-IP

While N2D2 is our open-source deep learning core framework, some modules, referred to as "N2D2-IP" in the manual, are only available through a custom license agreement with CEA LIST.

If you are interested in obtaining some of these modules, please contact our business developer

for more information on available licensing options:

Sandrine VARENNE (Sandrine.VARENNE@cea.fr)

In addition to N2D2-IP modules, we can also provide our expertise to design specific solutions for integrating DNNs in embedded hardware systems, where power, latency, form factor and/or cost are constrained. We can target CPU/DSP/GPU COTS hardware as well as our own PNeuro (programmable) and DNeuro (dataflow) dedicated hardware accelerator IPs for DNN on FPGA or ASIC.

3 Performing simulations

3.1 Obtaining the latest version of this manual

Before going further, please make sure you are reading the latest version of this manual. It is located

in the manual sub-directory. To compile the manual in PDF, just run the following command:

cd manual && make

In order to compile the manual, you must have pdflatex and bibtex installed, as well as some common LaTeX packages.

• On Ubuntu, this can be done by installing the texlive and texlive-latex-extra software packages.

• On Windows, you can install the MiKTeX software, which includes everything needed and will install the required LaTeX packages on the fly.

3.2 Minimum system requirements

• Supported processors:

    - ARM Cortex A15 (tested on Tegra K1)

    - ARM Cortex A53/A57 (tested on Tegra X1)

    - Pentium-compatible PC (Pentium III, Athlon or more-recent system recommended)

• Supported operating systems:

    - Windows ≥ 7 or Windows Server ≥ 2012, 64 bits with Visual Studio ≥ 2015.2 (2015 Update 2)

    - GNU/Linux with GCC ≥ 4.4 (tested on RHEL ≥ 6, Debian ≥ 6, Ubuntu ≥ 14.04)

• At least 256 MB of RAM (1 GB with GPU/CUDA) for MNIST dataset processing

• At least 150 MB available hard disk space + 350 MB for MNIST dataset processing

For CUDA acceleration:

• CUDA ≥ 6.5 and cuDNN ≥ 1.0

• NVIDIA GPU with CUDA compute capability ≥ 3 (starting from Kepler micro-architecture)

• At least 512 MB GPU RAM for MNIST dataset processing


3.3 Obtaining N2D2

3.3.1 Prerequisites

Red Hat Enterprise Linux (RHEL) 6

Make sure you have the following packages installed:

• cmake

• gnuplot

• opencv

• opencv-devel (may require the rhel-x86_64-workstation-optional-6 repository channel)

Plus, to be able to use GPU acceleration:

• Install the CUDA repository package:

rpm -Uhv http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-7.5-18.x86_64.rpm
yum clean expire-cache
yum install cuda

• Install cuDNN from the NVIDIA website: register with NVIDIA Developer and download the latest version of cuDNN. Simply copy the header and library files from the cuDNN archive to the corresponding directories in the CUDA installation path (by default: /usr/local/cuda/include and /usr/local/cuda/lib64, respectively).

• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.

Ubuntu

Make sure you have the following packages installed, if they are available on your Ubuntu version:

• cmake

• gnuplot

• libopencv-dev

• libcv-dev

• libhighgui-dev

Plus, to be able to use GPU acceleration:

• Install the CUDA repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:

wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb

• Install the cuDNN repository package matching your distribution. For example, for Ubuntu 14.04 64 bits:

wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
dpkg -i nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb

Note that the cuDNN repository package is provided by NVIDIA for Ubuntu starting from version 14.04.

• Update the package lists:

apt-get update

• Install the CUDA and cuDNN required packages:

apt-get install cuda-core-7-5 cuda-cudart-dev-7-5 cuda-cublas-dev-7-5 cuda-curand-dev-7-5 libcudnn5-dev

• Make sure there is a symlink to /usr/local/cuda:

ln -s /usr/local/cuda-7.5 /usr/local/cuda

• Make sure the CUDA library path (e.g. /usr/local/cuda/lib64) is added to the LD_LIBRARY_PATH environment variable.
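The last step, which applies to both the RHEL and Ubuntu instructions, can be checked with a few lines of Python. The sketch below tests whether a directory is already listed in a colon-separated search path such as LD_LIBRARY_PATH; the helper name `in_search_path` is hypothetical and the directory is the default location quoted above.

```python
import os
import posixpath


def in_search_path(directory, env_value):
    """True if directory is one of the entries of a colon-separated path list."""
    wanted = posixpath.normpath(directory)
    return any(posixpath.normpath(entry) == wanted
               for entry in env_value.split(":") if entry)


cuda_lib = "/usr/local/cuda/lib64"
current = os.environ.get("LD_LIBRARY_PATH", "")
if not in_search_path(cuda_lib, current):
    # Suggest the usual shell command; nothing is modified by this script.
    print("export LD_LIBRARY_PATH=%s:$LD_LIBRARY_PATH" % cuda_lib)
```

Paths are normalized before comparison, so an entry written with a trailing slash still matches.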


Windows

On Windows 64 bits, Visual Studio ≥ 2015.2 (2015 Update 2) is required.

Make sure you have the following software installed:

• CMake (http://www.cmake.org/): download and run the Windows installer.

• dirent.h C++ header (https://github.com/tronkko/dirent): to be put in the Visual Studio include path.

• Gnuplot (http://www.gnuplot.info/): the bin sub-directory in the install path needs to be added to the Windows PATH environment variable.

• OpenCV (http://opencv.org/): download the latest 2.x version for Windows and extract it to, for example, C:\OpenCV\. Make sure to define the environment variable OpenCV_DIR to point to C:\OpenCV\opencv\build. Make sure to add the bin sub-directory (C:\OpenCV\opencv\build\x64\vc12\bin) to the Windows PATH environment variable.

Plus, to be able to use GPU acceleration:

• Download and install CUDA toolkit 8.0 located at https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_windows-exe:

rename cuda_8.0.44_windows-exe cuda_8.0.44_windows.exe
cuda_8.0.44_windows.exe -s compiler_8.0 cublas_8.0 cublas_dev_8.0 cudart_8.0 curand_8.0 curand_dev_8.0

• Update the PATH environment variable:

set PATH=%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin;%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\libnvvp;%PATH%

• Download and install cuDNN 5.1 for CUDA 8.0 located at http://developer.download.nvidia.com/compute/redist/cudnn/v5.1/cudnn-8.0-windows7-x64-v5.1.zip (the following commands assume that you have 7-Zip installed):

7z x cudnn-8.0-windows7-x64-v5.1.zip
copy cuda\include\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include\"
copy cuda\lib\x64\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64\"
copy cuda\bin\*.* ^
"%ProgramFiles%\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\"

3.3.2 Getting the sources

Use the following command:

git clone git@github.com:CEA-LIST/N2D2.git

3.3.3 Compilation

To compile the program:

mkdir build

cd build

cmake .. && make

On Windows, you may have to specify the generator, for example:

cmake .. -G"Visual Studio 14"

Then open the newly created N2D2 project in Visual Studio 2015. Select "Release" for the build

target. Right click on ALL_BUILD item and select "Build".


3.4 Downloading training datasets

A Python script in the repository, run here from the root directory, allows you to select and automatically download some well-known datasets, like MNIST and GTSRB (the script requires Python 2.x with bindings for the GTK 2 package):

./tools/install_stimuli_gui.py

By default, the datasets are downloaded to the path specified in the N2D2_DATA environment variable, which is the root path used by the N2D2 tool to locate the databases. If the N2D2_DATA variable is not set, the default value used is /local/$USER/n2d2_data/ (or /local/n2d2_data/ if the USER environment variable is not set) on Linux and C:\n2d2_data\ on Windows.

Please make sure you have write access to the N2D2_DATA path or, if it is not set, to the default /local/$USER/n2d2_data/ path.
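The fallback rules above can be summarized by the following Python sketch. It only restates the rules given in this section and is not taken from the N2D2 sources; the function name `default_data_path` is hypothetical, and treating an empty N2D2_DATA value as unset is an assumption.

```python
import os


def default_data_path(environ, is_windows=False):
    """Return the dataset root path following the rules described above."""
    if environ.get("N2D2_DATA"):           # assumption: empty counts as unset
        return environ["N2D2_DATA"]
    if is_windows:
        return "C:\\n2d2_data\\"
    user = environ.get("USER")
    return "/local/%s/n2d2_data/" % user if user else "/local/n2d2_data/"


path = default_data_path(dict(os.environ), is_windows=(os.name == "nt"))
print(path, "writable" if os.access(path, os.W_OK) else "not writable or missing")
```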

3.5 Run the learning

The following command will run the learning for 600,000 image presentations/steps and log the

performances of the network every 10,000 steps:

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -learn 600000 -log 10000

Note: you may want to check the gradient computation using the -check option. This check can take an extremely long time and can occasionally fail if the required precision is too high.
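When scripting several runs, the learning command line can be assembled programmatically. The Python sketch below builds the same argument list as the command above and counts how many performance logs the run would produce; the helper name `learn_command` is hypothetical, and only the -learn and -log options shown in this section are used.

```python
def learn_command(ini_file, steps, log_every):
    """Build the argument list for a learning run (options as shown above)."""
    return ["./n2d2", ini_file, "-learn", str(steps), "-log", str(log_every)]


cmd = learn_command("mnist24_16c4s2_24c5s2_150_10.ini", 600000, 10000)
log_points = 600000 // 10000   # one log every 10,000 steps

print(" ".join(cmd))
print(log_points, "log points")
```

The resulting list can be passed as is to `subprocess.run(cmd, check=True)` from the directory containing the n2d2 executable.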

3.6 Test a learned network

After the learning is completed, this command evaluates the network performances on the test data set:

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -test

3.6.1 Interpreting the results

Recognition rate

The recognition rate and the validation score are reported during the learning in the TargetScore_*/Success_validation.png file, as shown in figure 4.

Confusion matrix

The software automatically outputs the confusion matrix during learning, validation and test, with an example shown in figure 5. Each row of the matrix contains the number of occurrences estimated by the network for each label, for all the data corresponding to a single actual (target) label. Equivalently, each column of the matrix contains the number of actual (target) label occurrences corresponding to the same estimated label. Ideally, the matrix should be diagonal, with no occurrence of an estimated label for a different actual label (network mistake).

The confusion matrix reports can be found in the simulation directory:

•TargetScore_*/ConfusionMatrix_learning.png;

•TargetScore_*/ConfusionMatrix_validation.png;

•TargetScore_*/ConfusionMatrix_test.png.
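The relation between the confusion matrix and the recognition rate can be checked with a short Python sketch. The counts used here are the 4-class values printed by the C export test bench in section 3.7.1 of this manual; the function names are hypothetical.

```python
# Rows are target (actual) labels, columns are estimated labels.
confusion = [
    [329, 1, 5, 2],
    [0, 692, 2, 6],
    [11, 27, 609, 55],
    [0, 0, 1, 24],
]


def recognition_rate(matrix):
    """Fraction of stimuli on the diagonal (estimated label == target label)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def row_percentages(matrix):
    """Per-target distribution of the estimated labels, in percent."""
    return [[100.0 * n / sum(row) for n in row] for row in matrix]


print("%.6f%%" % (100 * recognition_rate(confusion)))        # 93.764172%
print(["%.2f" % p for p in row_percentages(confusion)[2]])  # target label 2
```

The overall rate is 1654 correct stimuli out of 1764, and the percentages of each row sum to 100% because a row gathers all the stimuli of one target label.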

Memory and computation requirements

The software also reports the memory and computation requirements of the network, as shown in figure 6. The corresponding report can be found in the stats sub-directory of the simulation.

Kernels and weights distribution

The synaptic weights obtained during and after the learning can be analyzed, in terms of distribution (weights sub-directory of the simulation) or in terms of kernels (kernels sub-directory of the simulation), as shown in figure 7.


Figure 4: Recognition rate and validation score during learning.

Figure 5: Example of confusion matrix obtained after the learning.

Output maps activity

The initial output maps activity for each layer can be visualized in the outputs_init sub-directory of the simulation, as shown in figure 8.

3.7 Export a learned network


Figure 6: Example of memory and computation requirements of the network.

./n2d2 "mnist24_16c4s2_24c5s2_150_10.ini" -export CPP_OpenCL

Export types:

• C: C export using OpenMP;

• C_HLS: C export tailored for HLS with Vivado HLS;

• CPP_OpenCL: C++ export using OpenCL;

• CPP_Cuda: C++ export using Cuda;

• CPP_cuDNN: C++ export using cuDNN;

• CPP_TensorRT: C++ export using TensorRT 2.1 API;

• SC_Spike: SystemC spike export.

Other program options related to the exports:

Option [default value] Description

-nbbits [8] Number of bits for the weights and signals. Must be 8, 16, 32 or 64 for integer export, or -32, -64 for floating point export. The number of bits can be arbitrary for the C_HLS export (for example, 6 bits)

-calib [0] Number of stimuli used for the calibration. 0 = no calibration (default), -1 = use the full test dataset for calibration

-calib-passes [2] Number of KL passes for determining the layer output values distribution truncation threshold (0 = use the max. value, no truncation)

-no-unsigned If present, disable the use of unsigned data type in integer exports

-db-export [-1] Max. number of stimuli to export (0 = no dataset export, -1 = unlimited)

Figure 7: Example of kernels and weights distribution analysis for two convolutional layers (panels: conv1 kernels, conv2 kernels, conv1 weights distribution, conv2 weights distribution).

3.7.1 C export (N2D2 IP only)

Test the exported network:

cd export_C_int8

make

./bin/n2d2_test

The result should look like:

...

1652.00/1762 (avg = 93.757094%)
1653.00/1763 (avg = 93.760635%)
1654.00/1764 (avg = 93.764172%)
Tested 1764 stimuli
Success rate = 93.764172%
Process time per stimulus = 187.548186 us (12 threads)
Confusion matrix:
-------------------------------------------------
| T \ E |       0 |       1 |       2 |       3 |
-------------------------------------------------
|     0 |     329 |       1 |       5 |       2 |
|       |  97.63% |   0.30% |   1.48% |   0.59% |
|     1 |       0 |     692 |       2 |       6 |
|       |   0.00% |  98.86% |   0.29% |   0.86% |
|     2 |      11 |      27 |     609 |      55 |
|       |   1.57% |   3.85% |  86.75% |   7.83% |
|     3 |       0 |       0 |       1 |      24 |
|       |   0.00% |   0.00% |   4.00% |  96.00% |
-------------------------------------------------
T: Target E: Estimated

Figure 8: Output maps activity example of the first convolutional layer of the network.
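As a cross-check of the figures printed in this test output, the success rate and the per-target percentages can be recomputed from the confusion matrix counts. The sketch below is only an illustration and is not part of the generated export.

```python
# Counts copied from the confusion matrix above (rows: target, columns: estimated).
matrix = [
    [329, 1, 5, 2],
    [0, 692, 2, 6],
    [11, 27, 609, 55],
    [0, 0, 1, 24],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))
success_rate = 100.0 * correct / total

# Percentage of correctly estimated stimuli for each target label.
row_rates = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]

print(f"Tested {total} stimuli")
print(f"Success rate = {success_rate:.6f}%")
print([f"{rate:.2f}%" for rate in row_rates])
```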

3.7.2 CPP_OpenCL export (N2D2 IP only)

The OpenCL export can run the generated program on GPU or CPU architectures. Compilation features:

Preprocessor command [default value] Description

PROFILING [0] Compile the binary with a synchronization between each layer and return the mean execution time of each layer. This preprocessor option can decrease performance.

GENERATE_KBIN [0] Generate the binary output of the OpenCL kernel .cl file used. The binary is stored in the /bin folder.

LOAD_KBIN [0] Tell the program to load an OpenCL kernel as a binary from the /bin folder instead of a .cl file.

CUDA [0] Use the CUDA OpenCL SDK located at /usr/local/cuda

MALI [0] Use the MALI OpenCL SDK located at /usr/MaliOpenCLSDKvXXX

INTEL [0] Use the INTEL OpenCL SDK located at /opt/intel/opencl

AMD [1] Use the AMD OpenCL SDK located at /opt/AMDAPPSDK-XXX

Program options related to the OpenCL export:

Option [default value] Description

-cpu If present, force the use of a CPU architecture to run the program

-gpu If present, force the use of a GPU architecture to run the program

-batch [1] Size of the batch to use

-stimulus [NULL] Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:

cd export_CPP_OpenCL_float32

make

./bin/n2d2_opencl_test -gpu

3.7.3 CPP_TensorRT export

The tensorRT 2.1 API export can run the generated program on NVIDIA GPU architectures. It uses CUDA and the tensorRT 2.1 API library. The layers currently supported by the tensorRT 2.1 export are: Convolutional, Pooling, Concatenation, Fully-Connected, Softmax and all activation types. Custom layer implementation through the plugin factory and generic 8-bit calibration inference features are under development.

Program options related to the tensorRT 2.1 API export:

Option [default value] Description

-batch [1] Size of the batch to use

-dev [0] CUDA Device ID selection

-stimulus [NULL] Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

-prof Activates the layer-wise profiling mechanism. This option can degrade execution time.

-iter-build [1] Sets the number of minimization build iterations done by the tensorRT builder to find the best layer tactics.

Test the exported network with layer wise proﬁling:

cd export_CPP_TensorRT_float32

make

./bin/n2d2_tensorRT_test -prof

The results of the layer wise proﬁling should look like:

(19%) **************************************** CONV1 + CONV1_ACTIVATION: 0.0219467 ms
(05%) ************ POOL1: 0.00675573 ms
(13%) **************************** CONV2 + CONV2_ACTIVATION: 0.0159089 ms
(05%) ************ POOL2: 0.00616047 ms
(14%) ****************************** CONV3 + CONV3_ACTIVATION: 0.0159713 ms
(19%) **************************************** FC1 + FC1_ACTIVATION: 0.0222242 ms
(13%) **************************** FC2: 0.0149013 ms
(08%) ****************** SOFTMAX: 0.0100633 ms
Average profiled tensorRT process time per stimulus = 0.113932 ms
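The percentages in this profiling report can be reproduced from the per-layer times. The sketch below assumes that each share is the layer time divided by the sum of all layer times, truncated to an integer; that truncation is inferred from the printed figures, not from the export source code.

```python
# Per-layer times (ms) copied from the profiling output above.
layer_times_ms = {
    "CONV1 + CONV1_ACTIVATION": 0.0219467,
    "POOL1": 0.00675573,
    "CONV2 + CONV2_ACTIVATION": 0.0159089,
    "POOL2": 0.00616047,
    "CONV3 + CONV3_ACTIVATION": 0.0159713,
    "FC1 + FC1_ACTIVATION": 0.0222242,
    "FC2": 0.0149013,
    "SOFTMAX": 0.0100633,
}

total_ms = sum(layer_times_ms.values())

# Share of each layer, truncated to an integer percentage (assumption).
shares = {name: int(100 * t / total_ms) for name, t in layer_times_ms.items()}

for name, share in shares.items():
    print(f"({share:02d}%) {name}: {layer_times_ms[name]} ms")
print(f"Total = {total_ms:.6f} ms")
```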

3.7.4 CPP_cuDNN export

The cuDNN export can run the generated program on NVIDIA GPU architectures. It uses the CUDA and cuDNN libraries. Compilation features:

Preprocessor command [default value] Description

PROFILING [0] Compile the binary with a synchronization between each layer and return the mean execution time of each layer. This preprocessor option can decrease performance.

ARCH32 [0] Compile the binary with 32-bit architecture compatibility.

Program options related to the cuDNN export:

Option [default value] Description

-batch [1] Size of the batch to use

-dev [0] CUDA Device ID selection

-stimulus [NULL] Path to a specific input stimulus to test. For example, -stimulus /stimulus/env0000.pgm will test the file env0000.pgm of the stimulus folder.

Test the exported network:

cd export_CPP_cuDNN_float32

make

./bin/n2d2_cudnn_test

3.7.5 C_HLS export (N2D2 IP only)

Test the exported network:

cd export_C_HLS_int8

make

./bin/n2d2_test

Run the High-Level Synthesis (HLS) with Xilinx®Vivado®HLS:

vivado_hls -f run_hls.tcl

4 INI ﬁle interface

The INI file interface is the primary way of using N2D2. It is a simple, lightweight and user-friendly format for specifying a complete DNN-based application, including dataset instantiation, data pre-processing, neural network layers instantiation and post-processing, with all its hyperparameters.

4.1 Syntax

INI ﬁles are simple text ﬁles with a basic structure composed of sections, properties and values.

4.1.1 Properties

The basic element contained in an INI ﬁle is the property. Every property has a name and a value,

delimited by an equals sign (=). The name appears to the left of the equals sign.

name=value

4.1.2 Sections

Properties may be grouped into arbitrarily named sections. The section name appears on a line

by itself, in square brackets ([ and ]). All properties after the section declaration are associated

with that section. There is no explicit "end of section" delimiter; sections end at the next section

declaration, or the end of the ﬁle. Sections may not be nested.

[section]

a=a

b=b

4.1.3 Case sensitivity

Section and property names are case sensitive.

4.1.4 Comments

Semicolons (;) or number signs (#) at the beginning or in the middle of a line indicate a comment. Comments are ignored.

; comment text

a=a # comment text

a="a ; not a comment" ; comment text

4.1.5 Quoted values

Values can be quoted, using double quotes. This allows for explicit declaration of whitespace,

and/or for quoting of special characters (equals, semicolon, etc.).

4.1.6 Whitespace

Leading and trailing whitespace on a line are ignored.

4.1.7 Escape characters

A backslash (\) followed immediately by EOL (end-of-line) causes the line break to be ignored.
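The comment, quoting and whitespace rules of this section can be sketched as a small line parser. This is an illustrative sketch, not the N2D2 parser: the function names are hypothetical, and line continuation with a trailing backslash is deliberately not handled.

```python
def strip_comment(line):
    """Remove a ';' or '#' comment, ignoring markers inside double quotes."""
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes
        elif char in ";#" and not in_quotes:
            return line[:index].strip()
    return line.strip()


def parse_property(line):
    """Return (name, value) for a 'name=value' line, or None otherwise."""
    text = strip_comment(line)
    if not text or "=" not in text:
        return None
    name, value = text.split("=", 1)
    value = value.strip()
    # Quoted values keep their inner whitespace and special characters.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return name.strip(), value


print(parse_property("; comment text"))
print(parse_property("a=a # comment text"))
print(parse_property('a="a ; not a comment" ; comment text'))
```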

4.2 Template inclusion syntax

It is possible to recursively include templated INI files. For example, the main INI file can include a templated file like the following:

[inception@inception_model.ini.tpl]

INPUT=layer_x

SIZE=32

ARRAY=2 ; Must be the number of elements in the array

ARRAY[0].P1=Conv

ARRAY[0].P2=32

ARRAY[1].P1=Pool

ARRAY[1].P2=64

If the inception_model.ini.tpl template ﬁle content is:

[{{SECTION_NAME}}_layer1]

Input={{INPUT}}

Type=Conv

NbOutputs={{SIZE}}

[{{SECTION_NAME}}_layer2]

Input={{SECTION_NAME}}_layer1

Type=Fc

NbOutputs={{SIZE}}

{% block ARRAY %}

[{{SECTION_NAME}}_array{{#}}]

Prop1=Config{{.P1}}

Prop2={{.P2}}

{% endblock %}

The resulting equivalent content for the main INI ﬁle will be:

[inception_layer1]

Input=layer_x

Type=Conv

NbOutputs=32

[inception_layer2]

Input=inception_layer1

Type=Fc

NbOutputs=32

[inception_array0]

Prop1=ConfigConv

Prop2=32

[inception_array1]

Prop1=ConfigPool

Prop2=64

The SECTION_NAME template parameter is automatically generated from the name of the including section (before @).

4.2.1 Variable substitution

{{VAR}} is replaced by the value of the VAR template parameter.
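A minimal sketch of this substitution, applied to the first section of the template example above. The `substitute` helper is hypothetical and only handles plain `{{VAR}}` parameters; it is not the N2D2 implementation.

```python
import re


def substitute(template, params):
    """Replace each {{NAME}} with params[NAME]; unknown names are left as-is."""
    def replace(match):
        name = match.group(1)
        return str(params.get(name, match.group(0)))
    return re.sub(r"\{\{(\w+)\}\}", replace, template)


template = (
    "[{{SECTION_NAME}}_layer1]\n"
    "Input={{INPUT}}\n"
    "Type=Conv\n"
    "NbOutputs={{SIZE}}"
)
params = {"SECTION_NAME": "inception", "INPUT": "layer_x", "SIZE": "32"}
rendered = substitute(template, params)
print(rendered)
```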

4.2.2 Control statements

Control statements are between {% and %} delimiters.

block {%block ARRAY %} ... {%endblock %}

The {{#}} template parameter is automatically generated from the {%block ... %} template control statement and corresponds to the current item position, starting from 0.

for {%for VAR in range([START, ]END) %} ... {%endfor %}

If START is not specified, the loop begins at 0 (first value of VAR). The last value of VAR is END-1.

if {%if VAR OP [VALUE] %} ... [{%else %}] ... {%endif %}

OP may be ==, !=, exists or not_exists.

include {%include FILENAME %}
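The loop bounds and the item position described above can be illustrated in Python, whose `range` follows the same convention (END excluded). The helper name and the data layout are hypothetical; the array values come from the inclusion example of section 4.2.

```python
def loop_values(end, start=0):
    """Values taken by VAR in a 'for VAR in range([START, ]END)' loop."""
    return list(range(start, end))


print(loop_values(3))     # START omitted: begins at 0, ends at END-1
print(loop_values(5, 2))  # START given: begins at START, ends at END-1

# Expansion of the ARRAY block: the position plays the role of {{#}}.
items = [{"P1": "Conv", "P2": "32"}, {"P1": "Pool", "P2": "64"}]
sections = [
    f"[inception_array{position}]\nProp1=Config{item['P1']}\nProp2={item['P2']}"
    for position, item in enumerate(items)
]
print("\n".join(sections))
```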

4.3 Global parameters

Option [default value] Description

DefaultModel [Transcode] Default layers model. Can be Frame, Frame_CUDA, Transcode or Spike

DefaultDataType [Float32] Default layers data type. Can be Float16, Float32 or Float64

SignalsDiscretization [0] Number of levels for signal discretization

FreeParametersDiscretization [0] Number of levels for weights discretization

4.4 Databases

The tool integrates pre-defined modules for several well-known databases used in the deep learning community, such as MNIST, GTSRB, CIFAR10 and so on. That way, no extra step is necessary to directly build a network and train it on these databases.

4.4.1 MNIST

MNIST (LeCun et al., 1998) is already split into a learning set and a testing set, with:

•60,000 digits in the learning set;

•10,000 digits in the testing set.

Example:

[database]

Type=MNIST_IDX_Database

Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Option [default value] Description

Validation [0.0] Fraction of the learning set used for validation

DataPath [$N2D2_DATA/mnist] Path to the database

4.4.2 GTSRB

GTSRB (Stallkamp et al., 2012) is already split into a learning set and a testing set, with:

•39,209 images in the learning set;

•12,630 images in the testing set.

Example:

[database]

Type=GTSRB_DIR_Database

Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]

Option [default value] Description

Validation [0.0] Fraction of the learning set used for validation

DataPath [$N2D2_DATA/GTSRB] Path to the database

4.4.3 Directory

Hand-made databases stored in file directories are directly supported with the DIR_Database module. For example, suppose your database is organized as follows (in the path specified in the N2D2_DATA environment variable):

•GST/airplanes: 800 images

•GST/car_side: 123 images

•GST/Faces: 435 images

•GST/Motorbikes: 798 images

You can then instantiate this database as input of your neural network using the following parameters:

[database]

Type=DIR_Database

DataPath=${N2D2_DATA}/GST

Learn=0.4 ; 40% of images of the smallest category = 49 (0.4x123) images for each category will be used for learning

Validation=0.2 ; 20% of images of the smallest category = 25 (0.2x123) images for each category will be used for validation

; the remaining images will be used for testing

Each subdirectory will be treated as a diﬀerent label, so there will be 4 diﬀerent labels, named

after the directory name.

The stimuli are equi-partitioned for the learning set and the validation set, meaning that the

same number of stimuli for each category is used. If the learn fraction is 0.4 and the validation

fraction is 0.2, as in the example above, the partitioning will be the following:

Label ID Label name Learn set Validation set Test set

0 airplanes 49 25 726

1 car_side 49 25 49

2 Faces 49 25 361

3 Motorbikes 49 25 724

Total: 196 100 1860
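The partitioning table can be reproduced with a few lines of arithmetic. The sketch below assumes that the per-label counts are the fractions of the smallest category rounded to the nearest integer, which matches the 49 and 25 given in the example; the exact rounding rule used by the software is not stated here.

```python
# Number of images per category, from the directory listing above.
counts = {"airplanes": 800, "car_side": 123, "Faces": 435, "Motorbikes": 798}
learn_fraction = 0.4
validation_fraction = 0.2

smallest = min(counts.values())
learn = round(learn_fraction * smallest)            # 0.4 x 123 = 49.2
validation = round(validation_fraction * smallest)  # 0.2 x 123 = 24.6

# (learn, validation, test) per label: the remaining images go to the test set.
partition = {
    label: (learn, validation, nb_images - learn - validation)
    for label, nb_images in counts.items()
}
totals = tuple(sum(column) for column in zip(*partition.values()))

for label, sets in partition.items():
    print(label, *sets)
print("Total:", *totals)
```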

Mandatory option

Option [default value] Description

DataPath Path to the root stimuli directory

Learn If PerLabelPartitioning is true, fraction of images used for the learning; else, number of images used for the learning, regardless of their labels

LoadInMemory [0] Load the whole database into memory

Depth [1] Number of sub-directory levels to include. Examples:

Depth = 0: load stimuli only from the current directory (DataPath)

Depth = 1: load stimuli from DataPath and stimuli contained in the sub-directories of DataPath

Depth < 0: load stimuli recursively from DataPath and all its sub-directories

LabelName [] Base stimuli label name

LabelDepth [1] Number of sub-directory name levels used to form the stimuli labels. Examples:

LabelDepth = -1: no label for all stimuli (label ID = -1)

LabelDepth = 0: uses LabelName for all stimuli

LabelDepth = 1: uses LabelName for stimuli in the current directory (DataPath) and LabelName/sub-directory name for stimuli in the sub-directories

PerLabelPartitioning [1] If true, the stimuli are equi-partitioned for the learn/validation/test sets, meaning that the same number of stimuli for each label is used

Validation [0.0] If PerLabelPartitioning is true, fraction of images used for the validation; else, number of images used for the validation, regardless of their labels

Test [1.0-Learn-Validation] If PerLabelPartitioning is true, fraction of images used for the test; else, number of images used for the test, regardless of their labels

ValidExtensions [] List of space-separated valid stimulus file extensions (if left empty, any file extension is considered a valid stimulus)

LoadMore [] Name of another section with the same options to load a different DataPath

ROIFile [] File containing the stimuli ROIs. If a ROI file is specified, LabelDepth should be set to -1

DefaultLabel [] Label name for pixels outside any ROI (default is no label, pixels are ignored)

ROIsMargin [0] Number of pixels around ROIs that are ignored (and not considered as DefaultLabel pixels)

To load and partition more than one DataPath, one can use the LoadMore option:

[database]

Type=DIR_Database

DataPath=${N2D2_DATA}/GST

Learn=0.6

Validation=0.4

LoadMore=database.test

; Load stimuli from the "GST_Test" path in the test dataset

[database.test]

DataPath=${N2D2_DATA}/GST_Test

Learn=0.0

Test=1.0

; The LoadMore option is recursive:

; LoadMore=database.more

; [database.more]

; Load even more data here

4.4.4 Other built-in databases

CIFAR10_Database CIFAR10 database (Krizhevsky,2009).

Option [default value] Description

Validation [0.0] Fraction of the learning set used for validation

DataPath [$N2D2_DATA/cifar-10-batches-bin] Path to the database

CIFAR100_Database CIFAR100 database (Krizhevsky,2009).

Option [default value] Description

Validation [0.0] Fraction of the learning set used for validation

UseCoarse [0] If true, use the coarse labeling (20 labels instead of 100)

DataPath [$N2D2_DATA/cifar-100-binary] Path to the database

CKP_Database The Extended Cohn-Kanade (CK+) database for expression recognition (Lucey et al., 2010).

Option [default value] Description

Learn Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

DataPath [$N2D2_DATA/cohn-kanade-images] Path to the database

Caltech101_DIR_Database Caltech 101 database (Fei-Fei et al.,2004).

Option [default value] Description

Learn Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

IncClutter [0]

If true, includes the BACKGROUND_Google directory of

the database

DataPath [$N2D2_DATA/101_ObjectCategories] Path to the database

Caltech256_DIR_Database Caltech 256 database (Griﬃn et al.,2007).

Option [default value] Description

Learn Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

IncClutter [0]

If true, includes the BACKGROUND_Google directory of

the database

DataPath [$N2D2_DATA/256_ObjectCategories] Path to the database

CaltechPedestrian_Database Caltech Pedestrian database (Dollár et al.,2009).

Note that the images and annotations must first be extracted from the seq video data located in the videos directory using the dbExtract.m Matlab tool provided in the "Matlab evaluation/labeling code" downloadable on the dataset website.

Assuming the following directory structure (in the path specified in the N2D2_DATA environment variable):

•CaltechPedestrians/data-USA/videos/... (from the setxx.tar ﬁles)

•CaltechPedestrians/data-USA/annotations/... (from the setxx.tar ﬁles)

•CaltechPedestrians/tools/piotr_toolbox/toolbox (from the Piotr’s Matlab Toolbox archive)

•CaltechPedestrians/*.m including dbExtract.m (from the Matlab evaluation/labeling code)

Use the following command in Matlab to generate the images and annotations:

cd([getenv('N2D2_DATA') '/CaltechPedestrians'])
addpath(genpath('tools/piotr_toolbox/toolbox')) % add Piotr's Matlab Toolbox to the Matlab path
dbInfo('USA')
dbExtract()

Option [default value] Description

Validation [0.0] Fraction of the learning set used for validation

SingleLabel [1] Use the same label for "person" and "people" bounding box

IncAmbiguous [0]

Include ambiguous bounding box labeled "person?" using the

same label as "person"

DataPath [$N2D2_DATA/CaltechPedestrians/data-USA/images] Path to the database images

LabelPath [$N2D2_DATA/CaltechPedestrians/data-USA/annotations] Path to the database annotations

Cityscapes_Database Cityscapes database (Cordts et al.,2016).

Option [default value] Description

IncTrainExtra [0] If true, includes the left 8-bit images - trainextra set (19,998 images)

UseCoarse [0] If true, only use coarse annotations (which are the only annotations available for the trainextra set)

SingleInstanceLabels [1] If true, convert group labels to single instance labels (for example, cargroup becomes car)

DataPath [$N2D2_DATA/Cityscapes/leftImg8bit] or [$CITYSCAPES_DATASET] if defined Path to the database images

LabelPath [] Path to the database annotations (deduced from DataPath if left empty)

Daimler_Database Daimler Monocular Pedestrian Detection Benchmark (Daimler Pedestrian).

Option [default value] Description

Learn [1.0] Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

Test [0.0] Fraction of images used for the test

Fully [0] When activated, the test dataset is also used for learning. Use only in fully-CNN mode

DOTA_Database DOTA database (Xia et al.,2017).

Option [default value] Description

Learn Fraction of images used for the learning

DataPath [$N2D2_DATA/DOTA] Path to the database

LabelPath [] Path to the database labels list file

FDDB_Database Face Detection Data Set and Benchmark (FDDB) (Jain and Learned-Miller, 2010).

Option [default value] Description

Learn Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

DataPath [$N2D2_DATA/FDDB] Path to the images (decompressed originalPics.tar.gz)

LabelPath [$N2D2_DATA/FDDB] Path to the annotations (decompressed FDDB-folds.tgz)

GTSDB_DIR_Database GTSDB database (Houben et al.,2013).

Option [default value] Description

Learn Fraction of images used for the learning

Validation [0.0] Fraction of images used for the validation

DataPath [$N2D2_DATA/FullIJCNN2013] Path to the database

ILSVRC2012_Database ILSVRC2012 database (Russakovsky et al.,2015).

Option [default value] Description

Learn Fraction of images used for the learning

DataPath [$N2D2_DATA/ILSVRC2012] Path to the database

LabelPath [$N2D2_DATA/ILSVRC2012/synsets.txt] Path to the database labels list file

KITTI_Database The KITTI Database provides ROIs which can be used for autonomous driving and environment perception. The database provides 8 different labeled classes. Use of the KITTI Database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_tracking.php and download the left color images (15 GB) and the training labels of the tracking data set (9 MB). Extract the downloaded archives in your $N2D2_DATA/KITTI folder.

Option [default value] Description

Learn [0.8] Fraction of images used for the learning

Validation [0.2] Fraction of images used for the validation

KITTI_Road_Database The KITTI Road Database provides ROIs which can be used for road segmentation. The dataset provides 1 labeled class (road) on 289 training images. The 290 test images are not labeled. Use of the KITTI Road Database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_road.php and download the "base kit" (0.5 GB) with left color images, calibration and training labels. Extract the downloaded archive in your $N2D2_DATA/KITTI folder.

Option [default value] Description

Learn [0.8] Fraction of images used for the learning

Validation [0.2] Fraction of images used for the validation

KITTI_Object_Database The KITTI Object Database provides ROIs which can be used for autonomous driving and environment perception. The database provides 8 different labeled classes on 7481 training images. The 7518 test images are not labeled. The whole database provides 80256 labeled objects. Use of the KITTI Object Database is subject to licensing conditions and requires an email registration. To install it, follow this link: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark and download the "left color images" (12 GB) and the training labels of the object data set (5 MB). Extract the downloaded archives in your $N2D2_DATA/KITTI_Object folder.

Option [default value] Description

Learn [0.8] Fraction of images used for the learning

Validation [0.2] Fraction of images used for the validation

LITISRouen_Database LITIS Rouen audio scene dataset (Rakotomamonjy and Gasso,2014).

Option [default value] Description

Learn [0.4] Fraction of images used for the learning

Validation [0.4] Fraction of images used for the validation

DataPath [$N2D2_DATA/data_rouen] Path to the database

4.4.5 Dataset images slicing

It is possible to automatically slice images from a dataset, with a given slice size and stride, using

the .slicing attribute. This eﬀectively increases the number of stimuli in the set.

[database.slicing]

ApplyTo=NoLearn

Width=2048

Height=1024

StrideX=2048

StrideY=1024
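To get a feel for how slicing multiplies the number of stimuli, the sketch below counts the slices obtained from one image. It assumes that only slices lying fully inside the image are kept; how the software treats partial slices at the image borders is not specified here, and the 4096x2048 image size is purely hypothetical.

```python
def count_slices(image_width, image_height, width, height, stride_x, stride_y):
    """Number of full slices in one image (partial border slices are ignored)."""
    if image_width < width or image_height < height:
        return 0
    nb_x = (image_width - width) // stride_x + 1
    nb_y = (image_height - height) // stride_y + 1
    return nb_x * nb_y


# Slicing parameters from the example above, on a hypothetical 4096x2048 image.
nb_slices = count_slices(4096, 2048, 2048, 1024, 2048, 1024)
print(nb_slices)
```

Under this assumption, each such image would yield four stimuli instead of one.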

4.5 Stimuli data analysis

You can enable stimuli data reporting with the following section (the name of the section must

start with env.StimuliData):

[env.StimuliData-raw]

ApplyTo=LearnOnly

LogSizeRange=1

LogValueRange=1

The stimuli data reported for the full MNIST learning set will look like:

env.StimuliData-raw data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [0, 255]
Value mean: 33.3184
Value std. dev.: 78.5675

4.5.1 Zero-mean and unity standard deviation normalization

It is possible to normalize the whole database to have zero mean and unity standard deviation on the learning set using a RangeAffineTransformation transformation:

; Stimuli normalization based on learning set global mean and std.dev.

[env.Transformation-normalize]

Type=RangeAffineTransformation

FirstOperator=Minus

FirstValue=[env.StimuliData-raw]_GlobalValue.mean

SecondOperator=Divides

SecondValue=[env.StimuliData-raw]_GlobalValue.stdDev

The variables _GlobalValue.mean and _GlobalValue.stdDev are automatically generated in the [env.StimuliData-raw] block. Thanks to this facility, unknown and arbitrary databases can be analysed and normalized in one single step without requiring any external data manipulation.

After normalization, the stimuli data reported is:

env.StimuliData-normalized data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [-0.424074, 2.82154]
Value mean: 2.64796e-07
Value std. dev.: 1

We can check that the global mean is close to 0 and that the standard deviation is 1 on the whole dataset. The result of the transformation on the first images of the set can be checked in the generated frames folder, as shown in figure 9.
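The reported value range after normalization can be recomputed from the raw statistics. This is only a sanity-check sketch: it applies (value - mean) / std. dev. to the raw range bounds, using the rounded mean and standard deviation printed in the raw report.

```python
# Global statistics reported for the raw MNIST learning set.
mean = 33.3184
std_dev = 78.5675


def normalize(value):
    """Minus the global mean, then Divides by the global standard deviation."""
    return (value - mean) / std_dev


low = normalize(0)     # lower bound of the raw value range
high = normalize(255)  # upper bound of the raw value range
print(f"Value range: [{low:.6f}, {high:.5f}]")
```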

4.5.2 Subtracting the mean image of the set

Using the StimuliData object followed by an AffineTransformation, it is also possible to use the mean image of the dataset to normalize the data:

[env.StimuliData-meanData]

ApplyTo=LearnOnly

MeanData=1 ; Provides the _MeanData parameter used in the transformation

[env.Transformation]

Type=AffineTransformation

FirstOperator=Minus

FirstValue=[env.StimuliData-meanData]_MeanData

The resulting global mean image can be visualized in env.StimuliData-meanData/meanData.bin.png and is shown in figure 10.

After this transformation, the reported stimuli data becomes:

Figure 9: Image of the set after normalization.

Figure 10: Global mean image generated by StimuliData with the MeanData parameter enabled.

env.StimuliData-processed data:
Number of stimuli: 60000
Data width range: [28, 28]
Data height range: [28, 28]
Data channels range: [1, 1]
Value range: [-139.554, 254.979]
Value mean: -3.45583e-08
Value std. dev.: 66.1288

The result of the transformation on the ﬁrst images of the set can be checked in the generated

frames folder, as shown in ﬁgure 11.
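The principle of the mean image subtraction can be sketched on a toy dataset. The two 2x2 images below are hypothetical, and the helper functions are illustrative only; in the tool the same operation is carried out by StimuliData (MeanData) and AffineTransformation.

```python
def mean_image(images):
    """Pixel-wise mean over a list of equally sized single-channel images."""
    count = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(image[y][x] for image in images) / count for x in range(width)]
        for y in range(height)
    ]


def subtract(image, mean):
    """Element-wise 'Minus' between an image and the mean image."""
    return [[p - m for p, m in zip(row, mean_row)]
            for row, mean_row in zip(image, mean)]


# Two hypothetical 2x2 images with values in [0, 255].
images = [[[0, 255], [10, 20]], [[0, 1], [30, 40]]]
mean = mean_image(images)
centered = [subtract(image, mean) for image in images]
print(mean)
print(centered)
```

After the subtraction, each pixel position has zero mean over the set, while the values are no longer confined to [0, 255].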

Figure 11: Image of the set after the AffineTransformation subtracting the global mean image (keep in mind that the original image value range is [0, 255]).

4.6 Environment

The environment simply specifies the input data format of the network (width, height and batch size). Example:

[env]

SizeX=24

SizeY=24

BatchSize=12 ; [default: 1]

Option [default value] Description

SizeX Environment width

SizeY Environment height

NbChannels [1] Number of channels (applicable only if there is no env.ChannelTransformation[...])

BatchSize [1] Batch size

CompositeStimuli [0] If true, use pixel-wise stimuli labels

CachePath [] Stimuli cache path (no cache if left empty)

StimulusType [SingleBurst] Method for converting stimuli into spike trains. Can be any of SingleBurst, Periodic, JitteredPeriodic or Poissonian

DiscardedLateStimuli [1.0] The pixels in the pre-processed stimuli with a value above this limit never generate spiking events

PeriodMeanMin [50 TimeMs] Mean minimum period Tmin, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 0 (which are supposed to be the most significant pixels)

PeriodMeanMax [12 TimeS] Mean maximum period Tmax, used for periodic temporal codings, corresponding to pixels in the pre-processed stimuli with a value of 1 (which are supposed to be the least significant pixels). This maximum period may never be reached if DiscardedLateStimuli is lower than 1.0

PeriodRelStdDev [0.1] Relative standard deviation, used for periodic temporal codings, applied to the spiking period of a pixel

PeriodMin [11 TimeMs] Absolute minimum period, or spiking interval, used for periodic temporal codings, for any pixel

4.6.1 Built-in transformations

There are 6 possible categories of transformations:

•env.Transformation[...] Transformations applied to the input images before channels creation;

•env.OnTheFlyTransformation[...] On-the-fly transformations applied to the input images before channels creation;

•env.ChannelTransformation[...] Create or add transformation for a specific channel;

•env.ChannelOnTheFlyTransformation[...] Create or add on-the-fly transformation for a specific channel;

•env.ChannelsTransformation[...] Transformations applied to all the channels of the input images;

•env.ChannelsOnTheFlyTransformation[...] On-the-fly transformations applied to all the channels of the input images.

Example:

[env.Transformation]

Type=PadCropTransformation

Width=24

Height=24

Several transformations can be applied successively. In this case, to be able to apply multiple transformations of the same category, a different suffix ([...]) must be added to each transformation. The transformations will be processed in the order of appearance in the INI file regardless of their suffix.

Common set of parameters for any kind of transformation:

Option [default value] Description

ApplyTo [All]

Apply the transformation only to the speciﬁed stimuli sets.

Can be:

LearnOnly: learning set only

ValidationOnly: validation set only

TestOnly: testing set only

NoLearn: validation and testing sets only

NoValidation: learning and testing sets only

NoTest: learning and validation sets only

All: all sets (default)
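The ApplyTo values listed above amount to a simple lookup from option value to stimuli sets. The sketch below encodes that table; the set names "learn", "validation" and "test" and the `applies` helper are hypothetical names chosen for the illustration.

```python
# Stimuli sets selected by each ApplyTo value, as listed in the table above.
APPLY_TO = {
    "LearnOnly": {"learn"},
    "ValidationOnly": {"validation"},
    "TestOnly": {"test"},
    "NoLearn": {"validation", "test"},
    "NoValidation": {"learn", "test"},
    "NoTest": {"learn", "validation"},
    "All": {"learn", "validation", "test"},
}


def applies(apply_to, stimuli_set):
    """Return True if a transformation with this ApplyTo value covers the set."""
    return stimuli_set in APPLY_TO[apply_to]


print(applies("LearnOnly", "test"))
print(applies("NoLearn", "test"))
```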

Example:

[env.Transformation-1]

Type=ChannelExtractionTransformation

CSChannel=Gray

[env.Transformation-2]

Type=RescaleTransformation

Width=29

Height=29

[env.Transformation-3]

Type=EqualizeTransformation

[env.OnTheFlyTransformation]

Type=DistortionTransformation

ApplyTo=LearnOnly ; Apply this transformation for the Learning set only

ElasticGaussianSize=21

ElasticSigma=6.0

ElasticScaling=20.0

Scaling=15.0

Rotation=15.0

List of available transformations:

AffineTransformation Apply an element-wise affine transformation to the image with matrices of the same size.

Option [default value] Description

FirstOperator First element-wise operator, can be Plus, Minus, Multiplies or Divides

FirstValue First matrix file name

SecondOperator [Plus] Second element-wise operator, can be Plus, Minus, Multiplies or Divides

SecondValue [] Second matrix file name

The final operation is the following, with $A$ the image matrix, $B_{1st}$ and $B_{2nd}$ the matrices to add/subtract/multiply/divide and $\odot$ the element-wise operator:

$$f(A) = \left(A \underset{op_{1st}}{\odot} B_{1st}\right) \underset{op_{2nd}}{\odot} B_{2nd}$$
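The operation can be sketched with plain nested lists. This is an illustration under the assumption that the first operator is applied before the second one (left to right); the function names and the sample matrices are hypothetical.

```python
import operator

# Mapping from the operator names of the table above to Python functions.
OPERATORS = {
    "Plus": operator.add,
    "Minus": operator.sub,
    "Multiplies": operator.mul,
    "Divides": operator.truediv,
}


def elementwise(op_name, a, b):
    """Apply the named operator element by element on two same-size matrices."""
    op = OPERATORS[op_name]
    return [[op(x, y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def affine(a, first_op, b1, second_op="Plus", b2=None):
    """Compute (A op1st B1st) op2nd B2nd; the second step is optional."""
    result = elementwise(first_op, a, b1)
    if b2 is not None:
        result = elementwise(second_op, result, b2)
    return result


a = [[10.0, 20.0], [30.0, 40.0]]
b1 = [[10.0, 10.0], [10.0, 10.0]]
b2 = [[2.0, 5.0], [10.0, 3.0]]
out = affine(a, "Minus", b1, "Divides", b2)
print(out)
```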

ApodizationTransformation Apply an apodization window to each data row.

Option [default value] Description

Size

Window total size (must match the number of data columns)

WindowName [Rectangular] Window name. Possible values are:

Rectangular: Rectangular

Hann: Hann

Hamming: Hamming

Cosine: Cosine

Gaussian: Gaussian

Blackman: Blackman

Kaiser: Kaiser

Gaussian window Gaussian window.

Option [default value] Description

WindowName.Sigma [0.4] Sigma

Blackman window Blackman window.

Option [default value] Description

WindowName.Alpha [0.16] Alpha

Kaiser window Kaiser window.

Option [default value] Description

WindowName.Beta [5.0] Beta
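For reference, the sketch below gives the usual textbook (symmetric) definitions of some of these windows, with Sigma and Alpha playing the role of the parameters above. It is an assumption that the software uses exactly these definitions; the Kaiser window is left out because it requires a Bessel function.

```python
import math


def hann(size):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))
            for n in range(size)]


def hamming(size):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1))
            for n in range(size)]


def gaussian(size, sigma=0.4):
    half = (size - 1) / 2
    return [math.exp(-0.5 * ((n - half) / (sigma * half)) ** 2)
            for n in range(size)]


def blackman(size, alpha=0.16):
    a0, a1, a2 = (1 - alpha) / 2, 0.5, alpha / 2
    return [a0
            - a1 * math.cos(2 * math.pi * n / (size - 1))
            + a2 * math.cos(4 * math.pi * n / (size - 1))
            for n in range(size)]


# Each data row would be multiplied element-wise by the window of matching size.
row = [1.0, 1.0, 1.0, 1.0, 1.0]
apodized = [value * weight for value, weight in zip(row, hann(len(row)))]
print(apodized)
```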

ChannelExtractionTransformation Extract an image channel.

Option Description

CSChannel Blue: blue channel in the BGR colorspace, or first channel of any colorspace

Green: green channel in the BGR colorspace, or second channel of any colorspace

Red: red channel in the BGR colorspace, or third channel of any colorspace

Hue: hue channel in the HSV colorspace

Saturation: saturation channel in the HSV colorspace

Value: value channel in the HSV colorspace

Gray: gray conversion

Y: Y channel in the YCbCr colorspace

Cb: Cb channel in the YCbCr colorspace

Cr: Cr channel in the YCbCr colorspace

ColorSpaceTransformation Change the current image colorspace.

Option Description

ColorSpace BGR: if the image is in grayscale, convert it to BGR

HSV

HLS

YCrCb

CIELab

CIELuv

DFTTransformation Apply a DFT to the data. The input data must be single channel; the resulting data is two channels, the first for the real part and the second for the imaginary part.

Option [default value] Description

TwoDimensional [1] If true, compute a 2D image DFT. Otherwise, compute the 1D DFT of each data row

Note that this transformation can add zero-padding if required by the underlying FFT implementation.

DistortionTransformation (N2D2 IP only) Apply elastic distortion to the image. This transformation is generally used on-the-fly (so that a different distortion is performed for each image), and for the learning set only.

Option [default value] Description

ElasticGaussianSize [15] Size of the gaussian for elastic distortion (in pixels)

ElasticSigma [6.0] Sigma of the gaussian for elastic distortion

ElasticScaling [0.0] Scaling of the gaussian for elastic distortion

Scaling [0.0] Maximum random scaling amplitude (+/-, in percentage)

Rotation [0.0] Maximum random rotation amplitude (+/-, in °)

EqualizeTransformation (N2D2 IP only) Image histogram equalization.

Option [default value] Description

Method [Standard] Standard: standard histogram equalization

CLAHE: contrast limited adaptive histogram equalization

CLAHE_ClipLimit [40.0] Threshold for contrast limiting (for CLAHE only)

CLAHE_GridSize [8] Size of grid for histogram equalization (for CLAHE only). Input image will be divided into equally sized rectangular tiles. This parameter defines the number of tiles in row and column.
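A minimal sketch of selecting the CLAHE method with the options listed above. The values are the documented defaults; the tile count in the comment assumes the grid size applies identically to rows and columns, as the description suggests.

```ini
[env.Transformation-1]
Type=EqualizeTransformation
Method=CLAHE
CLAHE_ClipLimit=40.0
CLAHE_GridSize=8 ; assumed to split the image into 8 x 8 tiles
```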

ExpandLabelTransformation (N2D2 IP only) Expand single image label (1x1 pixel) to full frame label.

FilterTransformation Apply a convolution ﬁlter to the image.

Option [default value] Description

Kernel Convolution kernel. Possible values are:

*: custom kernel

Gaussian: Gaussian kernel

LoG: Laplacian Of Gaussian kernel

DoG: Diﬀerence Of Gaussian kernel

Gabor: Gabor kernel

* kernel Custom kernel.

Option [default value] Description

Kernel.SizeX [0] Width of the kernel (number of columns)

Kernel.SizeY [0] Height of the kernel (number of rows)

Kernel.Mat List of row-major ordered coefficients of the kernel

If both Kernel.SizeX and Kernel.SizeY are 0, the kernel is assumed to be square.

Gaussian kernel Gaussian kernel.

Option [default value] Description

Kernel.SizeX Width of the kernel (number of columns)

Kernel.SizeY Height of the kernel (number of rows)

Kernel.Positive [1] If true, the center of the kernel is positive

Kernel.Sigma [√2.0] Sigma of the kernel

LoG kernel Laplacian Of Gaussian kernel.

Option [default value] Description

Kernel.SizeX Width of the kernel (number of columns)

Kernel.SizeY Height of the kernel (number of rows)

Kernel.Positive [1] If true, the center of the kernel is positive

Kernel.Sigma [√2.0] Sigma of the kernel

DoG kernel Difference Of Gaussian kernel.

Option [default value] Description

Kernel.SizeX Width of the kernel (number of columns)

Kernel.SizeY Height of the kernel (number of rows)

Kernel.Positive [1] If true, the center of the kernel is positive

Kernel.Sigma1 [2.0] Sigma1 of the kernel

Kernel.Sigma2 [1.0] Sigma2 of the kernel

Gabor kernel Gabor kernel.

Option [default value] Description

Kernel.SizeX Width of the kernel (number of columns)

Kernel.SizeY Height of the kernel (number of rows)

Kernel.Theta Theta of the kernel

Kernel.Sigma [√2.0] Sigma of the kernel

Kernel.Lambda [10.0] Lambda of the kernel

Kernel.Psi [π/2.0] Psi of the kernel

Kernel.Gamma [0.5] Gamma of the kernel
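A minimal sketch of a FilterTransformation using the Gabor kernel options listed above. The kernel size and orientation are illustrative values; the unit of Kernel.Theta is not stated in the table, so radians are assumed here.

```ini
[env.Transformation-1]
Type=FilterTransformation
Kernel=Gabor
Kernel.SizeX=9 ; illustrative size
Kernel.SizeY=9
Kernel.Theta=0.0 ; unit not stated in the table, radians assumed
Kernel.Lambda=10.0 ; default value
Kernel.Gamma=0.5 ; default value
```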

FlipTransformation Image ﬂip transformation.

Option [default value] Description

HorizontalFlip [0] If true, ﬂip the image horizontally

VerticalFlip [0] If true, ﬂip the image vertically

RandomHorizontalFlip [0] If true, randomly ﬂip the image horizontally

RandomVerticalFlip [0] If true, randomly ﬂip the image vertically

GradientFilterTransformation (N2D2 IP only) Compute image gradient.

Option [default value] Description

Scale [1.0] Scale to apply to the computed gradient

Delta [0.0] Bias to add to the computed gradient

GradientFilter [Sobel] Filter type to use for computing the gradient. Possible options are: Sobel, Scharr and Laplacian

KernelSize [3] Size of the filter kernel (has no effect when using the Scharr filter, whose kernel size is always 3x3)

ApplyToLabels [0] If true, use the computed gradient to filter the image label and ignore pixel areas where the gradient is below the Threshold. In this case, only the labels are modified, not the image

InvThreshold [0] If true, ignored label pixels will be the ones with a low gradient (low contrasted areas)

Threshold [0.5] Threshold applied on the image gradient

Label [] List of labels to filter (space-separated)

GradientScale [1.0] Rescale the image by this factor before applying the gradient and the threshold, then scale it back to filter the labels

LabelSliceExtractionTransformation (N2D2 IP only) Extract a slice from an image belonging to a given label.

Option [default value] Description

Width Width of the slice to extract

Height Height of the slice to extract

Label [-1] Slice should belong to this label ID. If -1, the label ID is random

MagnitudePhaseTransformation Compute the magnitude and phase of a complex two-channel input data, with the first channel $x$ being the real part and the second channel $y$ the imaginary part. The resulting data is two channels, the first one with the magnitude and the second one with the phase.

Option [default value] Description

LogScale [0] If true, compute the magnitude in log scale

The magnitude is:

$$M_{i,j} = \sqrt{x_{i,j}^2 + y_{i,j}^2}$$

If LogScale = 1, compute $M'_{i,j} = \log(1 + M_{i,j})$.

The phase is:

$$\theta_{i,j} = \mathrm{atan2}(y_{i,j}, x_{i,j})$$
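Since DFTTransformation produces a two-channel real/imaginary output, it can be chained with this transformation. A minimal sketch, assuming single-channel input data and the section numbering used in the transformation example earlier in this section:

```ini
; Step 1: 1D DFT of each data row (output: real and imaginary channels)
[env.Transformation-1]
Type=DFTTransformation
TwoDimensional=0

; Step 2: convert real/imaginary channels to log-magnitude and phase
[env.Transformation-2]
Type=MagnitudePhaseTransformation
LogScale=1
```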

MorphologicalReconstructionTransformation (N2D2 IP only) Apply a morphological reconstruction transformation to the image. This transformation is also useful for post-processing.

Option [default value] Description

Operation Morphological operation to apply. Can be:

ReconstructionByErosion: reconstruction by erosion operation

ReconstructionByDilation: reconstruction by dilation operation

OpeningByReconstruction: opening by reconstruction operation

ClosingByReconstruction: closing by reconstruction operation

Size Size of the structuring element

ApplyToLabels [0] If true, apply the transformation to the labels instead of the image

Shape [Rectangular] Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross.

NbIterations [1] Number of times erosion and dilation are applied for opening and closing reconstructions

MorphologyTransformation (N2D2 IP only) Apply a morphology transformation to the image. This transformation is also useful for post-processing.

Option [default value] Description

Operation Morphological operation to apply. Can be:

Erode: erode operation (= erode(src))

Dilate: dilate operation (= dilate(src))

Opening: opening operation (open(src) = dilate(erode(src)))

Closing: closing operation (close(src) = erode(dilate(src)))

Gradient: morphological gradient (= dilate(src) − erode(src))

TopHat: top hat (= src − open(src))

BlackHat: black hat (= close(src) − src)

Size Size of the structuring element

ApplyToLabels [0] If true, apply the transformation to the labels instead of the image

Shape [Rectangular] Shape of the structuring element used for morphology operations. Can be Rectangular, Elliptic or Cross.

NbIterations [1] Number of times erosion and dilation are applied
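A minimal sketch of an opening (erosion followed by dilation, as defined above) with a small elliptic structuring element. The size is an illustrative value, not a recommendation from this manual.

```ini
[env.Transformation-1]
Type=MorphologyTransformation
Operation=Opening ; open(src) = dilate(erode(src))
Size=3 ; illustrative structuring element size
Shape=Elliptic
NbIterations=1
```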

NormalizeTransformation Normalize the image.

Option [default value] Description

Norm [MinMax] Norm type, can be:

L1: L1 normalization

L2: L2 normalization

Linf: Linf normalization

MinMax: min-max normalization

NormValue [1.0] Norm value (for L1, L2 and Linf), such that $\|data\|_{L_p} = NormValue$

NormMin [0.0] Min value (for MinMax only), such that $\min(data) = NormMin$

NormMax [1.0] Max value (for MinMax only), such that $\max(data) = NormMax$

PerChannel [0] If true, normalize each channel individually
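A minimal sketch of a min-max normalization applied to each channel separately, using only the options listed above. With these values, each channel would be rescaled so that its minimum is 0.0 and its maximum is 1.0.

```ini
[env.Transformation-1]
Type=NormalizeTransformation
Norm=MinMax
NormMin=0.0
NormMax=1.0
PerChannel=1 ; normalize each channel individually
```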

PadCropTransformation Pad/crop the image to a speciﬁed size.

Option [default value] Description

Width Width of the padded/cropped image

Height Height of the padded/cropped image

PaddingBackground [MeanColor] Background color used when padding. Possible values:

MeanColor: pad with the mean color of the image

BlackColor: pad with black

RandomAffineTransformation (N2D2 IP only) Apply a global random affine transformation to the values of the image.

Option [default value] Description

GainVar Random gain is in range ±GainVar

BiasVar [0.0] Random bias is in range ±BiasVar

RangeAffineTransformation Apply an aﬃne transformation to the values of the image.

Option [default value] Description

FirstOperator First operator, can be Plus, Minus, Multiplies, Divides

FirstValue First value

SecondOperator [Plus] Second operator, can be Plus, Minus, Multiplies, Divides

SecondValue [0.0] Second value

The final operation is the following:

$$f(x) = \left(x \underset{op_{1st}}{\circ} val_{1st}\right) \underset{op_{2nd}}{\circ} val_{2nd}$$
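As a worked example of this formula, the sketch below computes f(x) = (x − 128) / 128. Assuming 8-bit input values in [0, 255] (an assumption about the input data, not a requirement of the transformation), this maps 0 to −1.0, 128 to 0.0 and 255 to 0.9921875.

```ini
; f(x) = (x - 128.0) / 128.0
[env.Transformation-1]
Type=RangeAffineTransformation
FirstOperator=Minus
FirstValue=128.0
SecondOperator=Divides
SecondValue=128.0
```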

RangeClippingTransformation (N2D2 IP only) Clip the value range of the image.

Option [default value] Description

RangeMin [min(data)] Image values below RangeMin are clipped to 0

RangeMax [max(data)] Image values above RangeMax are clipped to 1 (or the maximum integer value of the data type)

RescaleTransformation Rescale the image to a speciﬁed size.

Option [default value] Description

Width Width of the rescaled image

Height Height of the rescaled image

KeepAspectRatio [0] If true, keeps the aspect ratio of the image

ResizeToFit [1] If true, resize along the longest dimension when KeepAspectRatio is true

ReshapeTransformation Reshape the data to a speciﬁed size.

Option [default value] Description

NbRows New number of rows

NbCols [0] New number of cols (0 = no check)

NbChannels [0] New number of channels (0 = no change)

SliceExtractionTransformation (N2D2 IP only) Extract a slice from an image.

Option [default value] Description

Width Width of the slice to extract

Height Height of the slice to extract

OffsetX [0] X offset of the slice to extract

OffsetY [0] Y offset of the slice to extract

RandomOffsetX [0] If true, the X offset is chosen randomly

RandomOffsetY [0] If true, the Y offset is chosen randomly

RandomRotation [0] If true, extract randomly rotated slices

RandomRotationRange [0.0 360.0] Range of the random rotations, in degrees, counterclockwise (if RandomRotation is enabled)

AllowPadding [0] If true, zero-padding is allowed if the image is smaller than the slice to extract
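A minimal sketch of random slice extraction used on-the-fly for the learning set, reusing the [env.OnTheFlyTransformation] section and ApplyTo option shown in the transformation example earlier in this section. The slice size is an illustrative value.

```ini
[env.OnTheFlyTransformation]
Type=SliceExtractionTransformation
ApplyTo=LearnOnly
Width=24 ; illustrative slice size
Height=24
RandomOffsetX=1 ; a different X offset is drawn for each stimulus
RandomOffsetY=1
```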

ThresholdTransformation Apply a thresholding transformation to the image. This transformation is also useful for post-processing.

Option [default value] Description

Threshold Threshold value

OtsuMethod [0] Use Otsu’s method to determine the optimal threshold (if true, the Threshold value is ignored)

Operation [Binary] Thresholding operation to apply. Can be:

Binary

BinaryInverted

Truncate

ToZero

ToZeroInverted

MaxValue [1.0] Max. value to use with Binary and BinaryInverted operations
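A minimal sketch of a binary thresholding that relies on Otsu's method. Threshold has no default in the table above, so it is still given here even though its value is ignored when OtsuMethod is enabled; whether it can be omitted in that case is not stated.

```ini
[env.Transformation-1]
Type=ThresholdTransformation
Threshold=0.5 ; ignored when OtsuMethod=1, kept because no default is listed
OtsuMethod=1
Operation=Binary
MaxValue=1.0
```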

TrimTransformation Trim the image.

Option [default value] Description

NbLevels Number of levels for the color discretization of the image

Method [Discretize] Possible values are:

Reduce: discretization using K-means

Discretize: simple discretization

WallisFilterTransformation (N2D2 IP only) Apply Wallis filter to the image.

Option [default value] Description

Size Size of the filter

Mean [0.0] Target mean value

StdDev [1.0] Target standard deviation

PerChannel [0] If true, apply Wallis filter to each channel individually (this parameter is meaningful only if Size is 0)

4.7 Network layers

4.7.1 Layer deﬁnition

Common set of parameters for any kind of layer.

Option [default value] Description

Input Name of the section(s) for the input layer(s). Comma separated

Type Type of the layer. Can be any of the types described below

Model [DefaultModel] Layer model to use

DataType [DefaultDataType] Layer data type to use. Please note that some layers may not support every data type.

ConfigSection [] Name of the configuration section for the layer

To specify that the back-propagated error must be computed at the output of a given layer (generally the last layer, or output layer), one must add a target section named LayerName.Target:

...

[LayerName.Target]

TargetValue=1.0 ; default: 1.0

DefaultValue=0.0 ; default: -1.0

4.7.2 Weight ﬁllers

Fillers to initialize weights and biases in the different types of layers.

Usage example:

[conv1]

...

WeightsFiller=NormalFiller

WeightsFiller.Mean=0.0

WeightsFiller.StdDev=0.05

...

The initial weights distribution for each layer can be checked in the weights_init folder, with an example shown in figure 12.

Figure 12: Initial weights distribution of a layer using a normal distribution (NormalFiller) with a 0 mean and a 0.05 standard deviation.

ConstantFiller Fill with a constant value.

Option Description

FillerName.Value Value for the ﬁlling

HeFiller Fill with a normal distribution with normalized variance taking into account the rectifier nonlinearity (He et al., 2015). This filler is sometimes referred to as MSRA filler.

Option [default value] Description

FillerName.VarianceNorm [FanIn] Normalization, can be FanIn, Average or FanOut

Use a normal distribution with standard deviation $\sqrt{\frac{2.0}{n}}$.

• $n = \text{fan-in}$ with FanIn, resulting in $Var(W) = \frac{2}{\text{fan-in}}$

• $n = \frac{\text{fan-in} + \text{fan-out}}{2}$ with Average, resulting in $Var(W) = \frac{4}{\text{fan-in} + \text{fan-out}}$

• $n = \text{fan-out}$ with FanOut, resulting in $Var(W) = \frac{2}{\text{fan-out}}$
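A minimal sketch of selecting this filler for a layer, in the style of the usage example at the start of this section. The fan-in figure is an assumption for illustration: for a 3x3 kernel over 64 input channels, taking fan-in as 3 × 3 × 64 = 576, FanIn normalization would give a standard deviation of $\sqrt{2.0/576} \approx 0.0589$.

```ini
[conv1]
...
WeightsFiller=HeFiller
WeightsFiller.VarianceNorm=FanIn ; default value, shown for clarity
...
```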

NormalFiller Fill with a normal distribution.

Option [default value] Description

FillerName.Mean [0.0] Mean value of the distribution

FillerName.StdDev [1.0] Standard deviation of the distribution

UniformFiller Fill with a uniform distribution.

Option [default value] Description

FillerName.Min [0.0] Min. value

FillerName.Max [1.0] Max. value

XavierFiller Fill with a uniform distribution with normalized variance (Glorot and Bengio, 2010).

Option [default value] Description

FillerName.VarianceNorm [FanIn] Normalization, can be FanIn, Average or FanOut

FillerName.Distribution [Uniform] Distribution, can be Uniform or Normal

Use a uniform distribution with interval $[-scale, scale]$, with $scale = \sqrt{\frac{3.0}{n}}$.

• $n = \text{fan-in}$ with FanIn, resulting in $Var(W) = \frac{1}{\text{fan-in}}$

• $n = \frac{\text{fan-in} + \text{fan-out}}{2}$ with Average, resulting in $Var(W) = \frac{2}{\text{fan-in} + \text{fan-out}}$

• $n = \text{fan-out}$ with FanOut, resulting in $Var(W) = \frac{1}{\text{fan-out}}$
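A minimal sketch of selecting this filler with Average normalization. The layer name fc1 and the fan figures are hypothetical: with fan-in = 400 and fan-out = 100, n = 250, so $scale = \sqrt{3.0/250} \approx 0.1095$ and $Var(W) = scale^2/3 = 2/500 = 0.004$, which matches the Average case above.

```ini
[fc1] ; hypothetical layer name
...
WeightsFiller=XavierFiller
WeightsFiller.VarianceNorm=Average
WeightsFiller.Distribution=Uniform ; default value, shown for clarity
...
```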

4.7.3 Weight solvers

SGDSolver_Frame SGD Solver for Frame models.

Option [default value] Description

SolverName.LearningRate [0.01] Learning rate

SolverName.Momentum [0.0] Momentum

SolverName.Decay [0.0] Decay

SolverName.LearningRatePolicy [None] Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay, PolyDecay

SolverName.LearningRateStepSize [1] Learning rate step size (in number of stimuli)

SolverName.LearningRateDecay [0.1] Learning rate decay

SolverName.Clamping [0] If true, clamp the weights and bias between -1 and 1

SolverName.Power [0.0] Polynomial learning rule power parameter

SolverName.MaxIterations [0.0] Polynomial learning rule maximum number of iterations

The learning rate decay policies are the following:

• StepDecay: every SolverName.LearningRateStepSize stimuli, the learning rate is reduced by a factor SolverName.LearningRateDecay;

• ExponentialDecay: the learning rate is $\alpha_0 \exp(-k t)$, with $\alpha_0$ the initial learning rate SolverName.LearningRate, $k$ the rate decay SolverName.LearningRateDecay and $t$ the step number (one step every SolverName.LearningRateStepSize stimuli);

• InvTDecay: the learning rate is $\alpha_0 / (1 + k t)$, with $\alpha_0$ the initial learning rate SolverName.LearningRate, $k$ the rate decay SolverName.LearningRateDecay and $t$ the step number (one step every SolverName.LearningRateStepSize stimuli);

• InvDecay: the learning rate is $\alpha_0 * (1 + k t)^{-n}$, with $\alpha_0$ the initial learning rate SolverName.LearningRate, $k$ the rate decay SolverName.LearningRateDecay, $t$ the current iteration and $n$ the power parameter SolverName.Power;

• PolyDecay: the learning rate is $\alpha_0 * \left(1 - \frac{k}{t}\right)^{n}$, with $\alpha_0$ the initial learning rate SolverName.LearningRate, $k$ the current iteration, $t$ the maximum number of iterations SolverName.MaxIterations and $n$ the power parameter SolverName.Power.
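A minimal sketch of a StepDecay schedule, assuming the solver options are set through the Solvers.* prefix in a layer configuration section (the layer and section names conv1 and conv1.config are hypothetical). Assuming "reduced by a factor" means the rate is multiplied by LearningRateDecay at each step, the rate would be 0.01 for the first 50000 stimuli, 0.001 for the next 50000, then 0.0001, and so on.

```ini
[conv1] ; hypothetical layer name
...
ConfigSection=conv1.config

[conv1.config]
Solvers.LearningRate=0.01
Solvers.LearningRatePolicy=StepDecay
Solvers.LearningRateStepSize=50000 ; illustrative value, in number of stimuli
Solvers.LearningRateDecay=0.1
```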

SGDSolver_Frame_CUDA SGD Solver for Frame_CUDA models.

Option [default value] Description

SolverName.LearningRate [0.01] Learning rate

SolverName.Momentum [0.0] Momentum

SolverName.Decay [0.0] Decay

SolverName.LearningRatePolicy [None] Learning rate decay policy. Can be any of None, StepDecay, ExponentialDecay, InvTDecay

SolverName.LearningRateStepSize [1] Learning rate step size (in number of stimuli)

SolverName.LearningRateDecay [0.1] Learning rate decay

SolverName.Clamping [0] If true, clamp the weights and bias between -1 and 1

The learning rate decay policies are identical to the ones in the SGDSolver_Frame solver.

AdamSolver_Frame Adam Solver for Frame models (Kingma and Ba, 2014).

Option [default value] Description

SolverName.LearningRate [0.001] Learning rate (stepsize)

SolverName.Beta1 [0.9] Exponential decay rate of the moving average of the first moment

SolverName.Beta2 [0.999] Exponential decay rate of the moving average of the second moment

SolverName.Epsilon [1.0e-8] Epsilon

AdamSolver_Frame_CUDA Adam Solver for Frame_CUDA models (Kingma and Ba, 2014).

Option [default value] Description

SolverName.LearningRate [0.001] Learning rate (stepsize)

SolverName.Beta1 [0.9] Exponential decay rate of the moving average of the first moment

SolverName.Beta2 [0.999] Exponential decay rate of the moving average of the second moment

SolverName.Epsilon [1.0e-8] Epsilon

4.7.4 Activation functions

Activation function to be used at the output of layers.

Usage example:

[conv1]

...

ActivationFunction=Rectifier

ActivationFunction.LeakSlope=0.01

ActivationFunction.Clipping=20

...

Logistic Logistic activation function.

LogisticWithLoss Logistic with loss activation function.

Rectifier Rectifier or ReLU activation function.

Option [default value] Description

ActivationFunction.LeakSlope [0.0] Leak slope for negative inputs

ActivationFunction.Clipping [0.0] Clipping value for positive outputs

Saturation Saturation activation function.

Softplus Softplus activation function.

Tanh Tanh activation function.

Computes $y = \tanh(\alpha x)$.

Option [default value] Description

ActivationFunction.Alpha [1.0] α parameter

TanhLeCun Tanh activation function with an α parameter of 1.7159 × (2.0/3.0).

4.7.5 Anchor

Anchor layer for Faster R-CNN or Single Shot Detector.

Option [default value] Description

Input This layer takes one or two inputs. The total number of input channels must be ScoresCls + 4, with ScoresCls being equal to 1 or 2.

Anchor[*] Anchors definition. For each anchor, there must be two space-separated values: the root area and the aspect ratio.

ScoresCls Number of classes per anchor. Must be 1 (if the scores input uses logistic regression) or 2 (if the scores input is a two-class softmax layer)

FeatureMapWidth [StimuliProvider.Width] Reference width used to scale the anchors coordinates.

FeatureMapHeight [StimuliProvider.Height] Reference height used to scale the anchors coordinates.

Configuration parameters (Frame models)

Option [default value] Model(s) Description

PositiveIoU [0.7] all Frame Assign a positive label for anchors whose IoU overlap is higher than PositiveIoU with any ground-truth box

NegativeIoU [0.3] all Frame Assign a negative label for non-positive anchors whose IoU overlap is lower than NegativeIoU for all ground-truth boxes

LossLambda [10.0] all Frame Balancing parameter λ

LossPositiveSample [128] all Frame Number of random positive samples for the loss computation

LossNegativeSample [128] all Frame Number of random negative samples for the loss computation

Usage example:

; RPN network: cls layer

[scores]

Input=...

Type=Conv

KernelWidth=1

KernelHeight=1

; 18 channels for 9 anchors

NbOutputs=18

...

[scores.softmax]

Input=scores

Type=Softmax

NbOutputs=[scores]NbOutputs

WithLoss=1

; RPN network: coordinates layer

[coordinates]

Input=...

Type=Conv

KernelWidth=1

KernelHeight=1

; 36 channels for 4 coordinates x 9 anchors

NbOutputs=36

...

; RPN network: anchors

[anchors]

Input=scores.softmax,coordinates

Type=Anchor

ScoresCls=2 ; using a two-class softmax for the scores

Anchor[0]=32 1.0

Anchor[1]=48 1.0

Anchor[2]=64 1.0

Anchor[3]=80 1.0

Anchor[4]=96 1.0

Anchor[5]=112 1.0

Anchor[6]=128 1.0

Anchor[7]=144 1.0

Anchor[8]=160 1.0

ConfigSection=anchors.config

[anchors.config]

PositiveIoU=0.7

NegativeIoU=0.3

LossLambda=1.0

Outputs remapping

Outputs remapping allows converting the layout of the scores and coordinates output feature maps from an ordering other than the one used in the N2D2 Anchor layer, during weights import/export.

For example, let's consider that the imported weights correspond to the following output feature maps ordering:

0 anchor[0].y

1 anchor[0].x

2 anchor[0].h

3 anchor[0].w

4 anchor[1].y

5 anchor[1].x

6 anchor[1].h

7 anchor[1].w

8 anchor[2].y

9 anchor[2].x

10 anchor[2].h

11 anchor[2].w

The output feature maps ordering required by the Anchor layer is:

0 anchor[0].x

1 anchor[1].x

2 anchor[2].x

3 anchor[0].y

4 anchor[1].y

5 anchor[2].y

6 anchor[0].w

7 anchor[1].w

8 anchor[2].w

9 anchor[0].h

10 anchor[1].h

11 anchor[2].h

The feature maps ordering can be changed during weights import/export:

; RPN network: coordinates layer

[coordinates]

Input=...

Type=Conv

KernelWidth=1

KernelHeight=1

; 36 channels for 4 coordinates x 9 anchors

NbOutputs=36

...

ConfigSection=coordinates.config

[coordinates.config]

WeightsExportFormat=HWCO ; Weights format used by TensorFlow

OutputsRemap=1:4,0:4,3:4,2:4
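The syntax of the OutputsRemap value is not described further in this section. Reading it against the two orderings listed above suggests that each comma-separated offset:step entry selects the imported maps offset, offset + step, offset + 2 × step, and so on. This reading is an inference from the example, not a documented rule. For the 3-anchor, 12-map orderings above, it would expand as follows (annotated copy of the same setting):

```ini
[coordinates.config]
WeightsExportFormat=HWCO
; Assumed reading of each "offset:step" entry, for the 12 imported maps listed above:
;   1:4 -> imported maps 1, 5, 9   (anchor[0..2].x) -> Anchor layer maps 0, 1, 2
;   0:4 -> imported maps 0, 4, 8   (anchor[0..2].y) -> Anchor layer maps 3, 4, 5
;   3:4 -> imported maps 3, 7, 11  (anchor[0..2].w) -> Anchor layer maps 6, 7, 8
;   2:4 -> imported maps 2, 6, 10  (anchor[0..2].h) -> Anchor layer maps 9, 10, 11
OutputsRemap=1:4,0:4,3:4,2:4
```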

4.7.6 Conv

Convolutional layer.

Option [default value] Description

KernelWidth Width of the kernels

KernelHeight Height of the kernels

KernelDepth [] Depth of the kernels (implies 3D kernels)

KernelSize [] Kernels size (implies 2D square kernels)

KernelDims [] List of space-separated dimensions for N-D kernels

NbOutputs Number of output channels

SubSampleX [1] X-axis subsampling factor of the output feature maps

SubSampleY [1] Y-axis subsampling factor of the output feature maps

SubSampleZ [] Z-axis subsampling factor of the output feature maps

SubSample [1] Subsampling factor of the output feature maps

SubSampleDims [] List of space-separated subsampling dimensions for N-D kernels

StrideX [1] X-axis stride of the kernels

StrideY [1] Y-axis stride of the kernels

StrideZ [] Z-axis stride of the kernels

Stride [1] Stride of the kernels

StrideDims [] List of space-separated stride dimensions for N-D kernels

PaddingX [0] X-axis input padding

PaddingY [0] Y-axis input padding

PaddingZ [] Z-axis input padding

Padding [0] Input padding

PaddingDims [] List of space-separated padding dimensions for N-D kernels

ActivationFunction [Tanh] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

WeightsFiller [NormalFiller(0.0, 0.05)] Weights initial values filler

BiasFiller [NormalFiller(0.0, 0.05)] Biases initial values filler

Mapping.NbGroups [] Mapping: number of groups (mutually exclusive with all other Mapping.* options)

Mapping.ChannelsPerGroup [] Mapping: number of channels per group (mutually exclusive with all other Mapping.* options)

Mapping.SizeX [1] Mapping canvas pattern default width

Mapping.SizeY [1] Mapping canvas pattern default height

Mapping.Size [1] Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)

Mapping.StrideX [1] Mapping canvas default X-axis step

Mapping.StrideY [1] Mapping canvas default Y-axis step

Mapping.Stride [1] Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)

Mapping.OffsetX [0] Mapping canvas default X-axis offset

Mapping.OffsetY [0] Mapping canvas default Y-axis offset

Mapping.Offset [0] Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)

Mapping.NbIterations [0] Mapping canvas pattern default number of iterations (0 means no limit)

Mapping(in).SizeX [1] Mapping canvas pattern default width for input layer in

Mapping(in).SizeY [1] Mapping canvas pattern default height for input layer in

Mapping(in).Size [1] Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)

Mapping(in).StrideX [1] Mapping canvas default X-axis step for input layer in

Mapping(in).StrideY [1] Mapping canvas default Y-axis step for input layer in

Mapping(in).Stride [1] Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)

Mapping(in).OffsetX [0] Mapping canvas default X-axis offset for input layer in

Mapping(in).OffsetY [0] Mapping canvas default Y-axis offset for input layer in

Mapping(in).Offset [0] Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)

Mapping(in).NbIterations [0] Mapping canvas pattern default number of iterations for input layer in (0 means no limit)

WeightsSharing [] Share the weights with another layer

BiasesSharing [] Share the biases with another layer

Conﬁguration parameters (Frame models)

Option [default value] Model(s) Description

NoBias [0] all Frame If true, don’t use bias

Solvers.* all Frame Any solver parameters

WeightsSolver.* all Frame Weights solver parameters, take precedence over the Solvers.* parameters

BiasSolver.* all Frame Bias solver parameters, take precedence over the Solvers.* parameters

WeightsExportFormat [OCHW] all Frame Weights import/export format. Can be OCHW, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position)

WeightsExportFlip [0] all Frame If true, import/export flipped kernels
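A minimal sketch tying the layer options and the Frame configuration parameters together. The section names conv1 and conv1.config and all numeric values are hypothetical; only option names listed in the tables above are used, and the input is left elided as in the other examples of this manual.

```ini
[conv1] ; hypothetical layer name
Input=...
Type=Conv
KernelWidth=5
KernelHeight=5
NbOutputs=16
Stride=1
Padding=2
ActivationFunction=Rectifier
WeightsFiller=HeFiller
ConfigSection=conv1.config

[conv1.config]
NoBias=0
; Applies to both weights and biases solvers
Solvers.LearningRate=0.01
; Takes precedence over Solvers.LearningRate for the biases only
BiasSolver.LearningRate=0.02
```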

Conﬁguration parameters (Spike models)

Experimental option (implementation may be wrong or susceptible to change)

Option [default value] Model(s) Description

IncomingDelay

TimePs

;100 TimeFs]

all Spike Synaptic incoming delay wdelay

Threshold [1.0] Spike, Spike_RRAM Threshold of the neuron $I_{thres}$

BipolarThreshold [1] Spike, Spike_RRAM If true, the threshold is also applied to the absolute value of negative values (generating negative spikes)

Leak [0.0] Spike, Spike_RRAM Neural leak time constant $\tau_{leak}$ (if 0, no leak)

Refractory [0.0] Spike, Spike_RRAM Neural refractory period $T_{refrac}$

WeightsRelInit [0.0;0.05] Spike Relative initial synaptic weight $w_{init}$

WeightsMinMean [1;0.1] Spike_RRAM Mean minimum synaptic weight $w_{min}$

WeightsMaxMean [100;10.0] Spike_RRAM Mean maximum synaptic weight $w_{max}$

WeightsMinVarSlope [0.0] Spike_RRAM OXRAM specific parameter

WeightsMinVarOrigin [0.0] Spike_RRAM OXRAM specific parameter

WeightsMaxVarSlope [0.0] Spike_RRAM OXRAM specific parameter

WeightsMaxVarOrigin [0.0] Spike_RRAM OXRAM specific parameter

WeightsSetProba [1.0] Spike_RRAM Intrinsic SET switching probability $P_{SET}$ (upon receiving a SET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)

WeightsResetProba [1.0] Spike_RRAM Intrinsic RESET switching probability $P_{RESET}$ (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)

SynapticRedundancy [1] Spike_RRAM Synaptic redundancy (number of RRAM devices per synapse)

BipolarWeights [0] Spike_RRAM Bipolar weights

BipolarIntegration [0] Spike_RRAM Bipolar integration

LtpProba [0.2] Spike_RRAM Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability $P_{SET}$)

LtdProba [0.1] Spike_RRAM Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability $P_{RESET}$)

StdpLtp [1000 TimePs] Spike_RRAM STDP LTP time window $T_{LTP}$

InhibitRefractory

TimePs]

Spike_RRAM Neural lateral inhibition period Tinhibit

EnableStdp [1] Spike_RRAM If false, STDP is disabled (no synaptic weight change)

RefractoryIntegration [1] Spike_RRAM If true, reset the integration to 0 during the refractory period

DigitalIntegration [0] Spike_RRAM If false, the analog value of the devices is integrated, instead of their binary value

4.7.7 Deconv

Deconvolution layer.

Option [default value] Description

KernelWidth Width of the kernels

KernelHeight Height of the kernels

KernelDepth [] Depth of the kernels (implies 3D kernels)

KernelSize [] Kernels size (implies 2D square kernels)

KernelDims [] List of space-separated dimensions for N-D kernels

NbOutputs Number of output channels

StrideX [1] X-axis stride of the kernels

StrideY [1] Y-axis stride of the kernels

StrideZ [] Z-axis stride of the kernels

Stride [1] Stride of the kernels

StrideDims [] List of space-separated stride dimensions for N-D kernels

PaddingX [0] X-axis input padding

PaddingY [0] Y-axis input padding

PaddingZ [] Z-axis input padding

Padding [0] Input padding

PaddingDims [] List of space-separated padding dimensions for N-D kernels

ActivationFunction [Tanh] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

WeightsFiller [NormalFiller(0.0, 0.05)] Weights initial values filler

BiasFiller [NormalFiller(0.0, 0.05)] Biases initial values filler

Mapping.NbGroups [] Mapping: number of groups (mutually exclusive with all other Mapping.* options)

Mapping.ChannelsPerGroup [] Mapping: number of channels per group (mutually exclusive with all other Mapping.* options)

Mapping.SizeX [1] Mapping canvas pattern default width

Mapping.SizeY [1] Mapping canvas pattern default height

Mapping.Size [1] Mapping canvas pattern default size (mutually exclusive with Mapping.SizeX and Mapping.SizeY)

Mapping.StrideX [1] Mapping canvas default X-axis step

Mapping.StrideY [1] Mapping canvas default Y-axis step

Mapping.Stride [1] Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)

Mapping.OffsetX [0] Mapping canvas default X-axis offset

Mapping.OffsetY [0] Mapping canvas default Y-axis offset

Mapping.Offset [0] Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)

Mapping.NbIterations [0] Mapping canvas pattern default number of iterations (0 means no limit)

Mapping(in).SizeX [1] Mapping canvas pattern default width for input layer in

Mapping(in).SizeY [1] Mapping canvas pattern default height for input layer in

Mapping(in).Size [1] Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)

Mapping(in).StrideX [1] Mapping canvas default X-axis step for input layer in

Mapping(in).StrideY [1] Mapping canvas default Y-axis step for input layer in

Mapping(in).Stride [1] Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)

Mapping(in).OffsetX [0] Mapping canvas default X-axis offset for input layer in

Mapping(in).OffsetY [0] Mapping canvas default Y-axis offset for input layer in

Mapping(in).Offset [0] Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)

Mapping(in).NbIterations [0] Mapping canvas pattern default number of iterations for input layer in (0 means no limit)

WeightsSharing [] Share the weights with another layer

BiasesSharing [] Share the biases with another layer

Configuration parameters (Frame models)

Option [default value] Model(s) Description

NoBias [0] all Frame If true, don’t use bias

BackPropagate [1] all Frame If true, enable backpropagation

Solvers.* all Frame Any solver parameters

WeightsSolver.* all Frame Weights solver parameters, take precedence over the Solvers.* parameters

BiasSolver.* all Frame Bias solver parameters, take precedence over the Solvers.* parameters

WeightsExportFormat [OCHW] all Frame Weights import/export format. Can be OCHW, with O the output feature map, C the input feature map (channel), H the kernel row and W the kernel column, in the order of the outermost dimension (in the leftmost position) to the innermost dimension (in the rightmost position)

WeightsExportFlip [0] all Frame If true, import/export flipped kernels

4.7.8 Pool

Pooling layer.

There are two CUDA models for this cell:

• Frame_CUDA, which uses CuDNN as back-end and only supports one-to-one input to output map connection;

• Frame_EXT_CUDA, which uses custom CUDA kernels and allows arbitrary connections between input and output maps (and can therefore be used to implement Maxout or both Maxout and Pooling simultaneously).

Maxout example

In the following INI section, one implements a Maxout between each consecutive pair of 8 input maps:

[maxout_layer]

Input=...

Type=Pool

Model=Frame_EXT_CUDA

PoolWidth=1

PoolHeight=1

NbOutputs=4

Pooling=Max

Mapping.SizeY=2

Mapping.StrideY=2

The layer connectivity is the following (rows are input maps, columns are output maps, X marks a connection):

# input map   # output map
              1  2  3  4
1             X
2             X
3                X
4                X
5                   X
6                   X
7                      X
8                      X
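The Maxout connectivity described above can be illustrated numerically. The following is only a sketch, not N2D2 code: `maxout_pairs` is a hypothetical helper, and each map is flattened to a short list of values for brevity. It takes the element-wise maximum over each consecutive group of input maps, which is what a 1x1 Max pooling with Mapping.SizeY=2 and Mapping.StrideY=2 is expected to compute.

```python
# Hypothetical helper (not part of N2D2): Maxout over consecutive groups of maps.
def maxout_pairs(input_maps, group_size=2):
    """Return the element-wise maximum over each consecutive group of input maps."""
    if len(input_maps) % group_size != 0:
        raise ValueError("number of input maps must be a multiple of group_size")
    outputs = []
    for start in range(0, len(input_maps), group_size):
        group = input_maps[start:start + group_size]
        # zip(*group) walks the grouped maps position by position.
        outputs.append([max(values) for values in zip(*group)])
    return outputs


# 8 input maps, each flattened to 3 values (arbitrary example data).
maps = [[i, -i, i % 3] for i in range(8)]
result = maxout_pairs(maps)
print(len(result), result[0])
```

With 8 input maps and groups of 2, the sketch yields 4 output maps, matching NbOutputs=4 in the INI section.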

Option [default value] Description

Pooling Type of pooling (Max or Average)

PoolWidth Width of the pooling area

PoolHeight Height of the pooling area

PoolDepth [] Depth of the pooling area (implies 3D pooling area)

PoolSize [] Pooling area size (implies 2D square pooling area)

PoolDims [] List of space-separated dimensions for N-D pooling area

NbOutputs Number of output channels

StrideX [1] X-axis stride of the pooling area

StrideY [1] Y-axis stride of the pooling area

StrideZ [] Z-axis stride of the pooling area

Stride [1] Stride of the pooling area

StrideDims [] List of space-separated stride dimensions for N-D pooling area

PaddingX [0] X-axis input padding

PaddingY [0] Y-axis input padding

PaddingZ [] Z-axis input padding

Padding [0] Input padding

PaddingDims [] List of space-separated padding dimensions for N-D pooling area

ActivationFunction [Linear] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Mapping.NbGroups [] Mapping: number of groups

(mutually exclusive with all other Mapping.* options)

Mapping.ChannelsPerGroup [] Mapping: number of channels per group

(mutually exclusive with all other Mapping.* options)

Mapping.SizeX [1] Mapping canvas pattern default width

Mapping.SizeY [1] Mapping canvas pattern default height

Mapping.Size [1] Mapping canvas pattern default size

(mutually exclusive with Mapping.SizeX and Mapping.SizeY)

Mapping.StrideX [1] Mapping canvas default X-axis step

Mapping.StrideY [1] Mapping canvas default Y-axis step

Mapping.Stride [1] Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)

Mapping.OffsetX [0] Mapping canvas default X-axis offset

Mapping.OffsetY [0] Mapping canvas default Y-axis offset

Mapping.Offset [0] Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)

Mapping.NbIterations [0] Mapping canvas pattern default number of iterations (0 means no limit)

Mapping(in).SizeX [1] Mapping canvas pattern default width for input layer in

Mapping(in).SizeY [1] Mapping canvas pattern default height for input layer in

Mapping(in).Size [1] Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)

Mapping(in).StrideX [1] Mapping canvas default X-axis step for input layer in

Mapping(in).StrideY [1] Mapping canvas default Y-axis step for input layer in

Mapping(in).Stride [1] Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)

Mapping(in).OffsetX [0] Mapping canvas default X-axis offset for input layer in

Mapping(in).OffsetY [0] Mapping canvas default Y-axis offset for input layer in

Mapping(in).Offset [0] Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)

Mapping(in).NbIterations [0] Mapping canvas pattern default number of iterations for input layer in (0 means no limit)

Configuration parameters (Spike models)

Option [default value] Model(s) Description

IncomingDelay [TimePs;100 TimeFs] all Spike Synaptic incoming delay $w_{delay}$

4.7.9 Unpool

Unpooling layer.

Option [default value] Description

Pooling Type of pooling (Max or Average)

PoolWidth Width of the pooling area

PoolHeight Height of the pooling area

PoolDepth [] Depth of the pooling area (implies 3D pooling area)

PoolSize [] Pooling area size (implies 2D square pooling area)

PoolDims [] List of space-separated dimensions for N-D pooling area

NbOutputs Number of output channels

ArgMax Name of the associated pool layer for the argmax (the pool layer input and the unpool layer output dimension must match)

StrideX [1] X-axis stride of the pooling area

StrideY [1] Y-axis stride of the pooling area

StrideZ [] Z-axis stride of the pooling area

Stride [1] Stride of the pooling area

StrideDims [] List of space-separated stride dimensions for N-D pooling area

PaddingX [0] X-axis input padding

PaddingY [0] Y-axis input padding

PaddingZ [] Z-axis input padding

Padding [0] Input padding

PaddingDims [] List of space-separated padding dimensions for N-D pooling area

ActivationFunction [Linear] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Mapping.NbGroups [] Mapping: number of groups

(mutually exclusive with all other Mapping.* options)

Mapping.ChannelsPerGroup [] Mapping: number of channels per group

(mutually exclusive with all other Mapping.* options)

Mapping.SizeX [1] Mapping canvas pattern default width

Mapping.SizeY [1] Mapping canvas pattern default height

Mapping.Size [1] Mapping canvas pattern default size

(mutually exclusive with Mapping.SizeX and Mapping.SizeY)

Mapping.StrideX [1] Mapping canvas default X-axis step

Mapping.StrideY [1] Mapping canvas default Y-axis step

Mapping.Stride [1] Mapping canvas default step (mutually exclusive with Mapping.StrideX and Mapping.StrideY)

Mapping.OffsetX [0] Mapping canvas default X-axis offset

Mapping.OffsetY [0] Mapping canvas default Y-axis offset

Mapping.Offset [0] Mapping canvas default offset (mutually exclusive with Mapping.OffsetX and Mapping.OffsetY)

Mapping.NbIterations [0] Mapping canvas pattern default number of iterations (0 means no limit)

Mapping(in).SizeX [1] Mapping canvas pattern default width for input layer in

Mapping(in).SizeY [1] Mapping canvas pattern default height for input layer in

Mapping(in).Size [1] Mapping canvas pattern default size for input layer in (mutually exclusive with Mapping(in).SizeX and Mapping(in).SizeY)

Mapping(in).StrideX [1] Mapping canvas default X-axis step for input layer in

Mapping(in).StrideY [1] Mapping canvas default Y-axis step for input layer in

Mapping(in).Stride [1] Mapping canvas default step for input layer in (mutually exclusive with Mapping(in).StrideX and Mapping(in).StrideY)

Mapping(in).OffsetX [0] Mapping canvas default X-axis offset for input layer in

Mapping(in).OffsetY [0] Mapping canvas default Y-axis offset for input layer in

Mapping(in).Offset [0] Mapping canvas default offset for input layer in (mutually exclusive with Mapping(in).OffsetX and Mapping(in).OffsetY)

Mapping(in).NbIterations [0] Mapping canvas pattern default number of iterations for input layer in (0 means no limit)

4.7.10 ElemWise

Element-wise operation layer.

Option [default value] Description


NbOutputs Number of output neurons

Operation Type of operation (Sum, AbsSum, EuclideanSum, Prod or Max)

Weights [1.0] Weights for the Sum, AbsSum and EuclideanSum operations, in the same order as the inputs

Shifts [0.0] Shifts for the Sum and EuclideanSum operations, in the same order as the inputs

ActivationFunction [Linear] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

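Given $N$ input tensors $T_i$, performs the following operation:

Sum operation: $T_{out} = \sum_{i=1}^{N} (w_i T_i + s_i)$

AbsSum operation: $T_{out} = \sum_{i=1}^{N} (w_i |T_i|)$

EuclideanSum operation: $T_{out} = \sqrt{\sum_{i=1}^{N} (w_i T_i + s_i)^2}$

Prod operation: $T_{out} = \prod_{i=1}^{N} (T_i)$

Max operation: $T_{out} = \max_{i=1}^{N} (T_i)$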
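The five operations listed above can be sketched in a few lines of Python. This is an illustration of the formulas only, not the N2D2 implementation: `elemwise` is a hypothetical helper, tensors are flattened to plain lists, and the defaults (weights of 1.0, shifts of 0.0) follow the option table.

```python
import math


# Hypothetical helper (not part of N2D2): element-wise combination of N inputs.
def elemwise(operation, inputs, weights=None, shifts=None):
    """Combine equally sized input lists element by element."""
    n = len(inputs)
    weights = weights if weights is not None else [1.0] * n
    shifts = shifts if shifts is not None else [0.0] * n
    out = []
    for values in zip(*inputs):  # one value per input at the same position
        if operation == "Sum":
            out.append(sum(w * t + s for w, t, s in zip(weights, values, shifts)))
        elif operation == "AbsSum":
            out.append(sum(w * abs(t) for w, t in zip(weights, values)))
        elif operation == "EuclideanSum":
            out.append(math.sqrt(sum((w * t + s) ** 2
                                     for w, t, s in zip(weights, values, shifts))))
        elif operation == "Prod":
            out.append(math.prod(values))
        elif operation == "Max":
            out.append(max(values))
        else:
            raise ValueError("unknown operation: " + operation)
    return out


layer1 = [1.0, 2.0]
layer2 = [3.0, 4.0]
weighted = elemwise("Sum", [layer1, layer2], weights=[0.5, 1.0])
print(weighted)
```

Passing a single input with one weight and one shift gives the scaling case, and AbsSum with a single input and the default weight gives the absolute value.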

Examples

Sum of two inputs ($T_{out} = T_1 + T_2$):

[elemwise_sum]

Input=layer1,layer2

Type=ElemWise

NbOutputs=[layer1]NbOutputs

Operation=Sum

Weighted sum of two inputs, by a factor 0.5 for layer1 and 1.0 for layer2 ($T_{out} = 0.5 \times T_1 + 1.0 \times T_2$):

[elemwise_weighted_sum]

Input=layer1,layer2

Type=ElemWise

NbOutputs=[layer1]NbOutputs

Operation=Sum

Weights=0.5 1.0

Single input scaling by a factor 0.5 and shifted by 0.1 ($T_{out} = 0.5 \times T_1 + 0.1$):

[elemwise_scale]

Input=layer1

Type=ElemWise

NbOutputs=[layer1]NbOutputs

Operation=Sum

Weights=0.5

Shifts=0.1

Absolute value of an input ($T_{out} = |T_1|$), using the AbsSum operation with a single input and the default weight of 1.0:

[elemwise_abs]

Input=layer1

Type=ElemWise

NbOutputs=[layer1]NbOutputs

Operation=AbsSum


4.7.11 FMP

Fractional max pooling layer (Graham, 2014).

Option [default value] Description

NbOutputs Number of output channels

ScalingRatio Scaling ratio. The output size is $\mathrm{round}\left(\frac{\text{input size}}{\text{scaling ratio}}\right)$.

ActivationFunction [Linear] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Configuration parameters (Frame models)

Option [default value] Model(s) Description

Overlapping [1] all Frame If true, use overlapping regions, else use disjoint regions

PseudoRandom [1] all Frame If true, use pseudorandom sequences, else use random sequences

4.7.12 Fc

Fully connected layer.

Option [default value] Description

NbOutputs Number of output neurons

WeightsFiller [NormalFiller(0.0, 0.05)] Weights initial values filler

BiasFiller [NormalFiller(0.0, 0.05)] Biases initial values filler

ActivationFunction [Tanh] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

Configuration parameters (Frame models)

Option [default value] Model(s) Description

NoBias [0] all Frame If true, don’t use bias

BackPropagate [1] all Frame If true, enable backpropagation

Solvers.* all Frame Any solver parameters

WeightsSolver.* all Frame Weights solver parameters, take precedence over the Solvers.* parameters

BiasSolver.* all Frame Bias solver parameters, take precedence over the Solvers.* parameters

DropConnect [1.0] Frame If below 1.0, fraction of synapses that are disabled with drop connect


Configuration parameters (Spike models)

Option [default value] Model(s) Description

IncomingDelay [TimePs;100 TimeFs] all Spike Synaptic incoming delay $w_{delay}$

Threshold [1.0] Spike, Spike_RRAM Threshold of the neuron $I_{thres}$

BipolarThreshold [1] Spike, Spike_RRAM If true, the threshold is also applied to the absolute value of negative values (generating negative spikes)

Leak [0.0] Spike, Spike_RRAM Neural leak time constant $\tau_{leak}$ (if 0, no leak)

Refractory [0.0] Spike, Spike_RRAM Neural refractory period $T_{refrac}$

TerminateDelta [0] Spike, Spike_RRAM Terminate delta

WeightsRelInit [0.0;0.05] Spike Relative initial synaptic weight $w_{init}$

WeightsMinMean [1;0.1] Spike_RRAM Mean minimum synaptic weight $w_{min}$

WeightsMaxMean [100;10.0] Spike_RRAM Mean maximum synaptic weight $w_{max}$

WeightsMinVarSlope [0.0] Spike_RRAM OXRAM specific parameter

WeightsMinVarOrigin [0.0] Spike_RRAM OXRAM specific parameter

WeightsMaxVarSlope [0.0] Spike_RRAM OXRAM specific parameter

WeightsMaxVarOrigin [0.0] Spike_RRAM OXRAM specific parameter

WeightsSetProba [1.0] Spike_RRAM Intrinsic SET switching probability $P_{SET}$ (upon receiving a SET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)

WeightsResetProba [1.0] Spike_RRAM Intrinsic RESET switching probability $P_{RESET}$ (upon receiving a RESET programming pulse). Assuming uniform statistical distribution (not well supported by experiments on RRAM)

SynapticRedundancy [1] Spike_RRAM Synaptic redundancy (number of RRAM devices per synapse)

BipolarWeights [0] Spike_RRAM Bipolar weights

BipolarIntegration [0] Spike_RRAM Bipolar integration

LtpProba [0.2] Spike_RRAM Extrinsic STDP LTP probability (cumulative with intrinsic SET switching probability $P_{SET}$)

LtdProba [0.1] Spike_RRAM Extrinsic STDP LTD probability (cumulative with intrinsic RESET switching probability $P_{RESET}$)

StdpLtp [1000 TimePs] Spike_RRAM STDP LTP time window $T_{LTP}$

InhibitRefractory [TimePs] Spike_RRAM Neural lateral inhibition period $T_{inhibit}$

EnableStdp [1] Spike_RRAM If false, STDP is disabled (no synaptic weight change)

RefractoryIntegration [1] Spike_RRAM If true, reset the integration to 0 during the refractory period

DigitalIntegration [0] Spike_RRAM If false, the analog value of the devices is integrated, instead of their binary value

4.7.13 Rbf (N2D2-IP only)

Radial basis function fully connected layer.

Option [default value] Description

NbOutputs Number of output neurons

CentersFiller [NormalFiller(0.5, 0.05)] Centers initial values filler

ScalingFiller [NormalFiller(10.0, 0.05)] Scaling initial values filler

Configuration parameters (Frame models)

Option [default value] Model(s) Description

Solvers.* all Frame Any solver parameters

CentersSolver.* all Frame Centers solver parameters, take precedence over the Solvers.* parameters

ScalingSolver.* all Frame Scaling solver parameters, take precedence over the Solvers.* parameters

RbfApprox [None] Frame Approximation for the Gaussian function, can be any of: None, Rectangular or SemiLinear

4.7.14 Softmax

Softmax layer.

Option [default value] Description

NbOutputs Number of output neurons

WithLoss [0] Softmax followed with a multinomial logistic layer

GroupSize [0] Softmax is applied on groups of outputs. The group size must be a divisor of the NbOutputs parameter.

The softmax function performs the following operation, with $a^i_{x,y}$ and $b^i_{x,y}$ the input and the output respectively at position $(x, y)$ on channel $i$:

$$b^i_{x,y} = \frac{\exp(a^i_{x,y})}{\sum_{j=0}^{N-1} \exp(a^j_{x,y})}$$

and

$$da^i_{x,y} = \sum_{j=0}^{N-1} \left(\delta_{ij} - a^i_{x,y}\right) a^j_{x,y} \, db^j_{x,y}$$

When the WithLoss option is enabled, the gradient is computed directly with respect to the cross-entropy loss:

$$L_{x,y} = \sum_{j=0}^{N-1} t^j_{x,y} \log(b^j_{x,y})$$

In this case, the gradient output becomes:

$$da^i_{x,y} = db^i_{x,y}$$

with

$$db^i_{x,y} = t^i_{x,y} - b^i_{x,y}$$
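As a numerical illustration of the softmax output and of the WithLoss gradient $t - b$ given above, here is a minimal sketch for a single position $(x, y)$. The function names are hypothetical, the values are arbitrary, and subtracting the maximum before the exponential is a standard numerical-stability step that is assumed here, not taken from the manual.

```python
import math


# Hypothetical helper: softmax over the channels at one (x, y) position.
def softmax(a):
    m = max(a)  # shifting by the maximum leaves the result unchanged
    exps = [math.exp(v - m) for v in a]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical helper: gradient with WithLoss enabled, db = t - b.
def with_loss_gradient(b, t):
    return [ti - bi for ti, bi in zip(t, b)]


a = [1.0, 2.0, 3.0]   # inputs on three channels
t = [0.0, 0.0, 1.0]   # one-hot target: channel 2 is the correct class
b = softmax(a)
grad = with_loss_gradient(b, t)
print(b, grad)
```

The outputs sum to one, and the gradient is positive only on the target channel, pushing its score up and the others down.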

4.7.15 LRN

Local Response Normalization (LRN) layer.

Option [default value] Description

NbOutputs Number of output neurons

The response-normalized activity $b^i_{x,y}$ is given by the expression:

$$b^i_{x,y} = \frac{a^i_{x,y}}{\left(k + \alpha \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)} \left(a^j_{x,y}\right)^2\right)^{\beta}}$$

Configuration parameters (Frame models)

Option [default value] Model(s) Description

N [5] all Frame Normalization window width in elements

Alpha [1.0e-4] all Frame Value of the alpha variance scaling parameter in the normalization formula

Beta [0.75] all Frame Value of the beta power parameter in the normalization formula

K [2.0] all Frame Value of the k parameter in the normalization formula
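The normalization formula of this section can be evaluated by hand for one position $(x, y)$. The sketch below is illustrative only: `lrn` is a hypothetical function, its default arguments mirror the N, Alpha, Beta and K defaults of the table, and $n/2$ is assumed to be an integer division.

```python
# Hypothetical helper: local response normalization across channels
# at a single (x, y) position.
def lrn(a, n=5, alpha=1.0e-4, beta=0.75, k=2.0):
    num_channels = len(a)
    half = n // 2  # assumption: n/2 is rounded down
    b = []
    for i in range(num_channels):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        squares = sum(a[j] ** 2 for j in range(lo, hi + 1))
        b.append(a[i] / (k + alpha * squares) ** beta)
    return b


activities = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
normalized = lrn(activities)
print(normalized)
```

With the default parameters the denominator stays close to $k^{\beta}$, so the output is mostly a rescaled copy of the input; larger Alpha values make the neighbouring channels matter more.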

4.7.16 LSTM

Long Short-Term Memory layer (Hochreiter and Schmidhuber, 1997).

Global layer parameters (Frame_CUDA models)

Option [default value] Description

SeqLength Maximum sequence length that the LSTM can take as an input.

BatchSize Number of sequences used for a single weights update, i.e. the size of the batch.

InputDim Dimension of every element composing a sequence.

HiddenSize Dimension of the LSTM inner state and output.

SingleBackpropFeeding [1] If disabled, return the full output sequence.

Bidirectional [0] If enabled, build a bidirectional structure.

AllGatesWeightsFiller All gates weights initial values filler.

AllGatesBiasFiller All gates bias initial values filler.

WeightsInputGateFiller Input gate previous layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.

WeightsForgetGateFiller Forget gate previous layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.

WeightsCellGateFiller Cell gate (or new memory) previous layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.

WeightsOutputGateFiller Output gate previous layer and recurrent weights initial values filler. Takes precedence over the AllGatesWeightsFiller parameter.

BiasInputGateFiller Input gate previous layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.

BiasRecurrentForgetGateFiller Forget gate recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter. Often set to 1.0 to show better convergence performance.

BiasPreviousLayerForgetGateFiller Forget gate previous layer bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.

BiasCellGateFiller Cell gate (or new memory) previous layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.

BiasOutputGateFiller Output gate previous layer and recurrent bias initial values filler. Takes precedence over the AllGatesBiasFiller parameter.

HxFiller Recurrent previous state initialisation. Often set to 0.0.

CxFiller Recurrent previous LSTM inner state initialisation. Often set to 0.0.

Configuration parameters (Frame_CUDA models)

Option [default value] Model(s) Description

Solvers.* all Frame Any solver parameters

Dropout [0.0] all Frame The probability with which the value from input would be dropped.

InputMode [] all Frame If enabled, drop the matrix multiplication of the input data.

Algo [0] all Frame Selects the cuDNN implementation. Can be 0: STANDARD, 1: STATIC, 2: DYNAMIC. Cases 1 and 2 are not supported yet.

Current restrictions:

• Only the Frame_CUDA model is supported for now.

• The implementation only supports input sequences with a fixed length associated with a single label.

• CuDNN structures require the input data to be ordered as [1, InputDim, BatchSize, SeqLength]. Depending on the use case (like sequential-MNIST), the input data would need to be shuffled between the stimuli provider and the RNN in order to process batches of data. No shuffling layer is operational yet. In that case, set the batch size to one for first experiments.

Further development requirements:

When it comes to RNNs, two main factors need to be considered to build proper interfaces:

• Whether the input data has a variable or a fixed length over the database, that is to say whether the input data will have a variable or fixed sequence length. Of course, the main strength of an RNN is to process variable-length data.

• The labelling granularity of the input data, that is to say whether every element of a sequence is labelled or the sequence itself has only one label.

For instance, let’s consider sentences as sequences of words in which every word is part of a vocabulary. Sentences could have a variable length and every element/word would have a label. In that case, every relevant element of the output sequence from the recurrent structure is turned into a prediction through a fully connected layer with a linear activation function and a softmax.

Conversely, with the sequential-MNIST database, the sequence length is the same for every image and there is only one label per image. In that case, the last element of the output sequence is the most relevant one to be turned into a prediction, as it carries the information of the entire input sequence.

To provide flexibility according to these factors, the first implementation choice is to set a maximum sequence length SeqLength as a hyperparameter that the user provides. Variable-length sequences can be processed by padding the remaining steps of the input sequence.

Then two cases occur, as the labeling granularity is scaled either at each element of the sequence or at the sequence itself:

1. The sequence itself has only one label:

The model has a fixed size with one fully connected layer mapped to the relevant element of the output sequence according to the input sequence.

Figure 13: RNN model: variable sequence length and labeling scaled at the sequence

2. Every element of a sequence is labelled:

The model has a fixed size with one big fully connected layer (or $T_{max}$ fully connected layers) mapped to the relevant elements of the output sequence according to the input sequence. The remaining elements need to be masked so that they don’t influence longer sequences.

Figure 14: RNN model: variable sequence length and labeling scaled at each element of the sequence
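The padding and masking idea discussed above can be sketched as follows. This is not N2D2 code: `pad_sequence` and `last_valid_index` are hypothetical helpers, and zero is assumed as the padding value. The mask marks which steps carry real data, so that padded steps can be ignored and, in the single-label case, the last valid output element can be selected.

```python
# Hypothetical helper: pad a variable-length sequence up to seq_length steps.
def pad_sequence(sequence, seq_length, input_dim, pad_value=0.0):
    if len(sequence) > seq_length:
        raise ValueError("sequence is longer than the maximum sequence length")
    padded = [list(step) for step in sequence]
    mask = [1] * len(sequence)          # 1 marks a real step
    while len(padded) < seq_length:
        padded.append([pad_value] * input_dim)
        mask.append(0)                  # 0 marks a padded step to be masked
    return padded, mask


# Hypothetical helper: index of the output element to use when the whole
# sequence carries a single label.
def last_valid_index(mask):
    return sum(mask) - 1


padded, mask = pad_sequence([[1.0, 2.0], [3.0, 4.0]], seq_length=4, input_dim=2)
print(mask, last_valid_index(mask))
```

When every element is labelled, the same mask can be used to exclude the padded steps from the loss.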

Development guidance:

• Replace the inner local variables of LSTMCell_Frame_Cuda with a generic shuffling layer (on device) to enable the processing of data batches.

• Develop some kind of label embedding within the layer to better articulate the labeling granularity of the input data.

• Adapt structures to support the STATIC and DYNAMIC algorithms of the cuDNN functions.

4.7.17 Dropout

Dropout layer (Srivastava et al., 2012).

Option [default value] Description

NbOutputs Number of output neurons

Configuration parameters (Frame models)

Option [default value] Model(s) Description

Dropout [0.5] all Frame The probability with which the value from input would be dropped

4.7.18 Padding

Padding layer.

Option [default value] Description

NbOutputs Number of output neurons

TopPadding Size of the top padding (positive or negative)

BottomPadding Size of the bottom padding (positive or negative)

LeftPadding Size of the left padding (positive or negative)

RightPadding Size of the right padding (positive or negative)

The padding layer allows asymmetric padding to be inserted along each layer axis.
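The effect of the four padding options on the output dimensions can be sketched with a short calculation. This is an assumption-laden illustration, not N2D2 code: `padded_size` is a hypothetical helper, and a negative padding is assumed to crop the corresponding side.

```python
# Hypothetical helper: output size of an asymmetric padding.
# Assumption: a negative value removes rows/columns instead of adding them.
def padded_size(height, width, top=0, bottom=0, left=0, right=0):
    out_height = height + top + bottom
    out_width = width + left + right
    if out_height <= 0 or out_width <= 0:
        raise ValueError("padding removes the whole feature map")
    return out_height, out_width


# One extra row on top, two extra columns on the left, one column cropped
# on the right of a 32x32 feature map.
size = padded_size(32, 32, top=1, bottom=0, left=2, right=-1)
print(size)
```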

4.7.19 Resize

The Resize layer can be applied to change the dimensions of feature maps or of the stimuli provider.

Option [default value] Description

NbOutputs Number of output feature maps

OutputHeight Output height dimension

OutputWidth Output width dimension

Mode Resize interpolation mode. Can be Bilinear or BilinearTF (TensorFlow implementation)

Configuration parameters

Option [default value] Model(s) Description

AlignCorners [True] all Frame Corner alignment mode if BilinearTF is used as interpolation mode

4.7.20 BatchNorm

Batch Normalization layer (Ioffe and Szegedy, 2015).

Option [default value] Description

NbOutputs Number of output neurons

ActivationFunction [Tanh] Activation function. Can be any of Logistic, LogisticWithLoss, Rectifier, Softplus, TanhLeCun, Linear, Saturation or Tanh

ScalesSharing [] Share the scales with another layer

BiasesSharing [] Share the biases with another layer

MeansSharing [] Share the means with another layer

VariancesSharing [] Share the variances with another layer

Configuration parameters (Frame models)

Option [default value] Model(s) Description

Solvers.* all Frame Any solver parameters

ScaleSolver.* all Frame Scale solver parameters, take precedence over the Solvers.* parameters

BiasSolver.* all Frame Bias solver parameters, take precedence over the Solvers.* parameters

Epsilon [0.0] all Frame Epsilon value used in the batch normalization formula. If 0.0, automatically choose the minimum possible value.

4.7.21 Transformation

Transformation layer, which can apply any transformation described in section 4.6.1. Useful for fully-CNN post-processing, for example.

Option [default value] Description

NbOutputs Number of outputs

Transformation Name of the transformation to apply

The Transformation options must be placed in the same section.

Usage example for fully CNNs:

[post.Transformation-thres]

Input=... ; for example, network’s logistic or softmax output layer

NbOutputs=1

Type=Transformation

Transformation=ThresholdTransformation

Operation=ToZero

Threshold=0.75

[post.Transformation-morpho]

Input=post.Transformation-thres

NbOutputs=1

Type=Transformation

Transformation=MorphologyTransformation

Operation=Opening

Size=3


5 Tutorials

5.1 Learning deep neural networks: tips and tricks

5.1.1 Choose the learning solver

Generally, you should use the SGD solver with momentum (typical value for the momentum: 0.9). It generalizes better, often significantly better, than adaptive methods like Adam (Wilson et al., 2017).

Adaptive solvers, like Adam, may be used for fast exploration and prototyping, thanks to their fast convergence.

5.1.2 Choose the learning hyper-parameters

To start learning from scratch, a learning rate of 0.1 or 0.01 may be considered for large batch sizes (typically 256). Remember that if you scale the batch size by some factor, you should scale the learning rate accordingly. A simple linear scaling rule is recommended (Goyal et al., 2017).

Typical values for the SGDSolver are:

Solvers.LearningRate=0.1

Solvers.Decay=0.0001

Solvers.Momentum=0.9
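The linear scaling rule mentioned above amounts to multiplying the learning rate by the same factor as the batch size. A minimal sketch, in which `scaled_learning_rate` is a hypothetical helper and the reference point (learning rate 0.1 for a batch size of 256) is taken from the typical values given in this section:

```python
# Hypothetical helper: linear scaling of the learning rate with the batch size.
def scaled_learning_rate(base_lr, base_batch_size, new_batch_size):
    return base_lr * new_batch_size / base_batch_size


# Halving the batch size from 256 to 128 halves the learning rate.
lr_small_batch = scaled_learning_rate(0.1, 256, 128)
# Doubling it to 512 doubles the learning rate.
lr_large_batch = scaled_learning_rate(0.1, 256, 512)
print(lr_small_batch, lr_large_batch)
```

The resulting value would then be used for Solvers.LearningRate.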

5.1.3 Convergence and normalization

Deep networks (> 30 layers) and especially residual networks usually don’t converge without

normalization. Indeed, batch normalization is almost always used. ZeroInit is a method that can

be used to overcome this issue without normalization (Zhang et al., 2019).

5.2 Building a classifier neural network

For this tutorial, we will use the classical MNIST handwritten digit dataset. A driver module

already exists for this dataset, named MNIST_IDX_Database.

To instantiate it, just add the following lines in a new INI file:

[database]

Type=MNIST_IDX_Database

Validation=0.2 ; Use 20% of the dataset for validation

In order to create a neural network, we first need to define its input, which is declared with a [sp] section (sp for StimuliProvider). In this section, we configure the size of the input and the batch size:

[sp]

SizeX=32

SizeY=32

BatchSize=128

We can also add pre-processing transformations to the StimuliProvider, knowing that the final data size after transformations must match the size declared in the [sp] section. Here, we must rescale the MNIST 28x28 images to match the 32x32 network input size.

[sp.Transformation_1]

Type=RescaleTransformation

Width=[sp]SizeX

Height=[sp]SizeY

Next, we declare the neural network layers. In this example, we reproduce the well-known LeNet network. The first layer is a 5x5 convolutional layer, with 6 channels. Since there is only one input channel, there will be only 6 convolution kernels in this layer.

[conv1]

Input=sp

Type=Conv

KernelWidth=5

KernelHeight=5

NbOutputs=6

The next layer is a 2x2 MAX pooling layer, with a stride of 2 (non-overlapping MAX pooling).

[pool1]

Input=conv1

Type=Pool

PoolWidth=2

PoolHeight=2

NbOutputs=[conv1]NbOutputs

Stride=2

Pooling=Max

Mapping.Size=1 ; One to one connection between input and output channels

The next layer is a 5x5 convolutional layer with 16 channels.

[conv2]

Input=pool1

Type=Conv

KernelWidth=5

KernelHeight=5

NbOutputs=16

Note that in LeNet, the [conv2] layer is not fully connected to the pooling layer. In N2D2, a custom mapping can be defined for each input connection. The connection of the n-th output map to the inputs is defined by the n-th column of the matrix below, where the rows correspond to the inputs.

Mapping(pool1)=\

1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 \

1 1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 \

1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1 \

0 1 1 1 0 0 1 1 1 1 0 0 1 0 1 1 \

0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 \

0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 1
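To read the mapping matrix above, it can help to list which inputs feed a given output map. The following sketch is only a reading aid, not N2D2 code: `parse_mapping` and `inputs_of_output` are hypothetical helpers, indices are 0-based, and the parser accepts the digits with or without separating spaces.

```python
# The 6x16 matrix of the tutorial: rows are pool1 maps, columns are conv2 maps.
MAPPING_TEXT = """
1 0 0 0 1 1 1 0 0 1 1 1 1 0 1 1
1 1 0 0 0 1 1 1 0 0 1 1 1 1 0 1
1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1
0 1 1 1 0 0 1 1 1 1 0 0 1 0 1 1
0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1
0 0 0 1 1 1 0 0 1 1 1 1 0 1 1 1
"""


# Hypothetical helper: turn the text into a list of rows of 0/1 integers.
def parse_mapping(text):
    return [[int(c) for c in line if c in "01"]
            for line in text.strip().splitlines()]


# Hypothetical helper: indices of the input maps connected to output map n.
def inputs_of_output(mapping, n):
    return [i for i, row in enumerate(mapping) if row[n] == 1]


mapping = parse_mapping(MAPPING_TEXT)
print(inputs_of_output(mapping, 0))
print(sum(sum(row) for row in mapping))
```

The first output map is connected to the first three input maps only, and the last output map is connected to all six, so the layer uses 60 input connections instead of the 96 of a fully connected mapping.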

Another MAX pooling and convolution layer follow:

[pool2]

Input=conv2

Type=Pool

PoolWidth=2

PoolHeight=2

NbOutputs=[conv2]NbOutputs

Stride=2

Pooling=Max

Mapping.Size=1

[conv3]

Input=pool2

Type=Conv

KernelWidth=5

KernelHeight=5

NbOutputs=120
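The feature-map sizes implied by the layers declared so far can be checked with a short calculation. This is a sketch under stated assumptions: `output_size` is a hypothetical helper using the usual formula (size + 2 x padding - kernel) / stride + 1, the convolutions are assumed to use a stride of 1 and no padding, and the pooling layers use the stride of 2 set in their sections.

```python
# Hypothetical helper: spatial output size of a convolution or pooling layer.
def output_size(size, kernel, stride=1, padding=0):
    return (size + 2 * padding - kernel) // stride + 1


# (layer name, kernel size, stride) in network order, from the INI sections.
layers = [
    ("conv1", 5, 1),
    ("pool1", 2, 2),
    ("conv2", 5, 1),
    ("pool2", 2, 2),
    ("conv3", 5, 1),
]

size = 32  # [sp] SizeX and SizeY
sizes = {}
for name, kernel, stride in layers:
    size = output_size(size, kernel, stride)
    sizes[name] = size
print(sizes)
```

Under these assumptions [conv3] reduces each of its 120 maps to a single value, which is why it can feed the fully-connected layers directly.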

The network ends with two fully-connected layers of 84 and 10 neurons respectively:

[fc1]

Input=conv3

Type=Fc

NbOutputs=84


[fc2]

Input=fc1

Type=Fc

NbOutputs=10

Finally, we use a softmax layer to obtain output classification probabilities and compute the loss function.

[softmax]

Input=fc2

Type=Softmax

NbOutputs=[fc2]NbOutputs

WithLoss=1

In order to tell N2D2 to compute the error and the classification score on this softmax layer, one must attach an N2D2 Target to this layer, with a section with the same name suffixed with .Target:

[softmax.Target]

By default, the activation function for the convolution and the fully-connected layers is the hyperbolic tangent. Because the [fc2] layer is fed to a softmax, it should not have any activation function. We can specify it by adding the following line in the [fc2] section:

[fc2]

...

ActivationFunction=Linear

In order to further improve the network’s performance, several things can be done:

• Use ReLU activation functions. In order to do so, just add the following in the [conv1], [conv2], [conv3] and [fc1] layer sections:

ActivationFunction=Rectifier

For the ReLU activation function to be effective, the weights must be initialized carefully, in order to avoid dead units that would be stuck in the $]-\infty, 0]$ output range before the ReLU function. In N2D2, one can use a custom WeightsFiller for the weights initialization. For the ReLU activation function, a popular and efficient filler is the so-called XavierFiller (see the 4.7.2 section for more information):

WeightsFiller=XavierFiller

• Use dropout layers. Dropout is highly effective to improve the network generalization capacity. Here is an example of a dropout layer inserted between the [fc1] and [fc2] layers:

[fc1]

...

[fc1.drop]

Input=fc1

Type=Dropout

NbOutputs=[fc1]NbOutputs

[fc2]

Input=fc1.drop ; Replaces "Input=fc1"

...

• Tune the learning parameters. You may want to tune the learning rate and other learning parameters depending on the learning problem at hand. In order to do so, you can add a configuration section that can be common (or not) to all the layers. Here is an example of configuration section:

[conv1]

...

ConfigSection=common.config


[...]

...

[common.config]

NoBias=1

WeightsSolver.LearningRate=0.05

WeightsSolver.Decay=0.0005

Solvers.LearningRatePolicy=StepDecay

Solvers.LearningRateStepSize=[sp]_EpochSize

Solvers.LearningRateDecay=0.993

Solvers.Clamping=1

For more details on the configuration parameters for the Solver, see section 4.7.3.

• Add input distortion. See for example the DistortionTransformation (section 4.6.1).
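To get a feel for the StepDecay settings shown in the [common.config] example of the previous bullet, the schedule can be sketched as a generic step decay. This is an assumption, not a description of the N2D2 solver: `step_decay_lr` is a hypothetical helper, the step size is assumed to be counted in stimuli, and 60000 is an arbitrary stand-in for the epoch size.

```python
# Hypothetical helper: generic step decay, the learning rate is multiplied
# by `decay` once every `step_size` stimuli.
def step_decay_lr(base_lr, decay, step_size, stimuli_seen):
    return base_lr * decay ** (stimuli_seen // step_size)


EPOCH_SIZE = 60000  # arbitrary example value standing in for [sp]_EpochSize

lr_first_epoch = step_decay_lr(0.05, 0.993, EPOCH_SIZE, 0)
lr_after_100_epochs = step_decay_lr(0.05, 0.993, EPOCH_SIZE, 100 * EPOCH_SIZE)
print(lr_first_epoch, lr_after_100_epochs)
```

Under this assumption, a decay of 0.993 per epoch roughly halves the learning rate after 100 epochs.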

The complete INI model corresponding to this tutorial can be found in models/LeNet.ini.

In order to use CUDA/GPU accelerated learning, the default layer model should be switched to Frame_CUDA. You can enable this model by adding the following line at the top of the INI file (before the first section):

DefaultModel=Frame_CUDA

5.3 Building a segmentation neural network

In this tutorial, we will learn how to do image segmentation with N2D2. As an example, we will

implement a face detection and gender recognition neural network, using the IMDB-WIKI dataset.

First, we need to instantiate the IMDB-WIKI dataset built-in N2D2 driver:

[database]

Type=IMDBWIKI_Database

WikiSet=1 ; Use the WIKI part of the dataset

IMDBSet=0 ; Don’t use the IMDB part (less accurate annotation)

Learn=0.90

Validation=0.05

DefaultLabel=background ; Label for pixels outside any ROI (default is no label, pixels are ignored)

We must specify a default label for the background, because we want to learn to differentiate faces from the background (and not simply ignore the background during learning).

The network input is then declared:

[sp]

SizeX=480

SizeY=360

BatchSize=48

CompositeStimuli=1

In order to work with segmented data, i.e. data with bounding box annotations or pixel-wise annotations (as opposed to a single label per stimulus), one must enable the CompositeStimuli option in the [sp] section.

We can then perform various operations on the data before feeding it to the network, for example converting the 3-channel RGB input images to single-channel gray images:

[sp.Transformation-1]

Type=ChannelExtractionTransformation

CSChannel=Gray

We then only need to rescale the images to match the network's input size. This can be done using RescaleTransformation, followed by a PadCropTransformation if one wants to keep the aspect ratio of the images.

[sp.Transformation-2]

Type=RescaleTransformation

Width=[sp]SizeX

Height=[sp]SizeY

KeepAspectRatio=1 ; Keep images aspect ratio

; Required to ensure all the images are the same size

[sp.Transformation-3]

Type=PadCropTransformation

Width=[sp]SizeX

Height=[sp]SizeY
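To see why the PadCropTransformation is needed after an aspect-ratio-preserving rescale, the following Python sketch computes the rescaled size of a few hypothetical source images and the padding left to fill to reach the 480x360 network input. The function names, the rounding and the way the scale factor is chosen are assumptions made for illustration, not a description of the N2D2 implementation.

```python
def rescale_keep_aspect(src_w, src_h, dst_w, dst_h):
    """Largest size with the source aspect ratio that fits inside dst_w x dst_h."""
    scale = min(dst_w / src_w, dst_h / src_h)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def padding_to_target(w, h, dst_w, dst_h):
    """Total number of pixel columns and rows still missing to reach dst_w x dst_h."""
    return dst_w - w, dst_h - h


# Hypothetical source image sizes (width, height)
for src in ((640, 480), (500, 500), (300, 600)):
    w, h = rescale_keep_aspect(*src, 480, 360)
    pad_x, pad_y = padding_to_target(w, h, 480, 360)
    print(src, "->", (w, h), "padding:", (pad_x, pad_y))
```

Only images that already have a 4:3 aspect ratio fill the input exactly; all the others need padding so that every stimulus has the same size.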

A common additional operation to extend the learning set is to randomly mirror the images horizontally. This can be achieved with the following FlipTransformation:

[sp.OnTheFlyTransformation-4]

Type=FlipTransformation

RandomHorizontalFlip=1

ApplyTo=LearnOnly ; Apply this transformation only on the learning set

Note that this is an on-the-ﬂy transformation, meaning it cannot be cached and is re-executed

every time even for the same stimuli. We also apply this transformation only on the learning set,

with the ApplyTo option.
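The behaviour described above can be pictured with the following Python sketch, in which an image is a list of pixel rows. The function name, the dataset names and the flip probability of 0.5 are assumptions made for illustration only.

```python
import random


def maybe_flip(image, dataset, rng):
    """Mirror a row-major image horizontally with an ASSUMED probability of 0.5,
    and only for stimuli of the learning set."""
    if dataset == "learn" and rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


rng = random.Random(0)
image = [[1, 2, 3],
         [4, 5, 6]]
print(maybe_flip(image, "learn", rng))  # drawn anew at each call, hence not cacheable
print(maybe_flip(image, "test", rng))   # never mirrored
```

Because the outcome depends on a random draw made at each presentation, the result cannot be computed once and stored, which is what distinguishes an on-the-fly transformation from a cached one.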

Next, the neural network can be described:

[conv1.1]

Input=sp

Type=Conv

...

[pool1]

...

[...]

...

[fc2]

Input=drop1

Type=Conv

...

[drop2]

Input=fc2

Type=Dropout

NbOutputs=[fc2]NbOutputs

A full network description can be found in the IMDBWIKI.ini file in the models directory of N2D2. It is a fully convolutional network.

Here we will focus on the output layers required to detect the faces and classify their gender.

We start from the [drop2] layer, which has 128 channels of size 60x45.

5.3.1 Faces detection

We first add an output stage for face detection. It is a 1x1 convolutional layer with a single 60x45 output map. For each output pixel, this layer outputs the probability that the pixel belongs to a face.

[fc3.face]

Input=drop2

Type=Conv

KernelWidth=1

KernelHeight=1

NbOutputs=1

Stride=1

ActivationFunction=LogisticWithLoss

WeightsFiller=XavierFiller

ConfigSection=common.config ; Same solver options as the other layers

For the layer to output a probability for each pixel, its activation function must be of type LogisticWithLoss.
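A 1x1 convolution with a single output map amounts to a weighted sum over the input channels at each pixel, which the logistic function then squashes into the ]0, 1[ range. The Python sketch below illustrates this on a toy input; the function names and the toy values are hypothetical, and the bias defaults to zero to mirror the NoBias=1 setting of the earlier [common.config] example.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-x))


def conv1x1_logistic(feature_maps, weights, bias=0.0):
    """feature_maps[c][y][x] -> one map of per-pixel probabilities."""
    nb_channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [[logistic(bias + sum(weights[c] * feature_maps[c][y][x]
                                 for c in range(nb_channels)))
             for x in range(width)]
            for y in range(height)]


# Toy input: 2 channels of size 3x2 instead of 128 channels of size 60x45
maps = [[[1.0, 0.0, -1.0], [2.0, 0.5, 0.0]],
        [[0.0, 1.0, 1.0], [0.0, 0.5, 3.0]]]
probs = conv1x1_logistic(maps, weights=[1.0, -1.0])
for row in probs:
    print([round(p, 3) for p in row])
```

The output map has the same spatial size as the input maps, which is why the face detection stage produces one 60x45 probability map.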

We must also tell N2D2 to compute the error and the classification score on this output layer, by attaching an N2D2 Target to it, in a section with the same name suffixed with .Target:

[fc3.face.Target]

LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat

; Visualization parameters

NoDisplayLabel=0

LabelsHueOffset=90

In this Target, we must specify how the dataset annotations are mapped to the layer's output. This can be done in a separate file using the LabelsMapping parameter. Here, since the output layer has a single output per pixel, the target value can only be 0 or 1. A target value of -1 means that this output is ignored (no error back-propagated). Since the only annotations in the IMDB-WIKI dataset are faces, the mapping described in the IMDBWIKI_target_face.dat file is easy:

# background

background 0

# padding (*) is ignored (-1)

* -1

# not background = face

default 1
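The way such a mapping file translates annotation labels into target values can be sketched in Python as follows. This is only an illustration: the parsing rules and the lookup order (exact label first, then * for pixels without any label, then default) are assumptions inferred from the comments in the file above, and the function names are hypothetical.

```python
FACE_MAPPING = """\
# background
background 0
# padding (*) is ignored (-1)
* -1
# not background = face
default 1
"""


def parse_mapping(text):
    """Parse 'label target' lines; '#' starts a comment."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        label, target = line.rsplit(None, 1)
        mapping[label] = int(target)
    return mapping


def target_for(mapping, label):
    """Target value for a pixel label; label is None for pixels without any label."""
    if label is None:
        # ASSUMPTION: '*' stands for unlabeled pixels such as padding
        return mapping.get("*", mapping.get("default", -1))
    if label in mapping:
        return mapping[label]
    # ASSUMPTION: unlisted labels fall back to 'default', or are ignored (-1)
    return mapping.get("default", -1)


mapping = parse_mapping(FACE_MAPPING)
for label in ("background", "F-25", None):
    print(label, "->", target_for(mapping, label))
```

Any annotated label other than background, here the hypothetical F-25, falls through to the default entry and is therefore treated as a face.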

5.3.2 Gender recognition

We can also add a second output stage for gender recognition. Like before, it would be a 1x1

convolutional layer with a single 60x45 output map. But here, for each output pixel, this layer

would output the probability that the pixel represents a female face.

[fc3.gender]

Input=drop2

Type=Conv

KernelWidth=1

KernelHeight=1

NbOutputs=1

Stride=1

ActivationFunction=LogisticWithLoss

WeightsFiller=XavierFiller

ConfigSection=common.config

The output layer is therefore identical to the face output layer, but the target mapping is different. For the target mapping, the idea is simply to ignore all pixels not belonging to a face and to assign the target 0 to male pixels and the target 1 to female pixels.

[fc3.gender.Target]

LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat

; Only display gender probability for pixels detected as face pixels

MaskLabelTarget=fc3.face.Target

MaskedLabel=1

The content of the IMDBWIKI_target_gender.dat ﬁle would therefore look like:

# background

# ?-* (unknown gender)

# padding

default -1

# male gender

M-? 0 # unknown age

M-0 0

M-1 0

M-2 0

...

M-98 0

M-99 0

# female gender

F-? 1 # unknown age

F-0 1

F-1 1

F-2 1

...

F-98 1

F-99 1

5.3.3 ROIs extraction

The next step would be to extract the detected face ROIs and to assign the most probable gender to each ROI. To this end, we can first set a detection threshold, in terms of probability, to select face pixels. In the following, the threshold is fixed to 75% face probability:

[post.Transformation-thres]

Input=fc3.face

Type=Transformation

NbOutputs=1

Transformation=ThresholdTransformation

Operation=ToZero

Threshold=0.75
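The effect of this thresholding step can be sketched in Python as follows, assuming that ToZero follows the usual "threshold to zero" rule: values above the threshold are kept unchanged and all the others are set to 0. The function name and the sample values are hypothetical.

```python
def threshold_to_zero(prob_map, threshold):
    """Keep values strictly above the threshold, set the others to 0 (ASSUMED rule)."""
    return [[p if p > threshold else 0.0 for p in row] for row in prob_map]


face_probs = [[0.10, 0.80, 0.95],
              [0.75, 0.76, 0.20]]
print(threshold_to_zero(face_probs, 0.75))
```

Only the pixels that are confidently detected as face pixels remain non-zero, which gives the segmentation step a clean map to work on.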

We can then assign a target of type TargetROIs to this layer, which will automatically create the bounding boxes using a segmentation algorithm.

[post.Transformation-thres.Target-face]

Type=TargetROIs

MinOverlap=0.33 ; Min. overlap fraction to match the ROI to an annotation

FilterMinWidth=5 ; Min. ROI width

FilterMinHeight=5 ; Min. ROI height

FilterMinAspectRatio=0.5 ; Min. ROI aspect ratio

FilterMaxAspectRatio=1.5 ; Max. ROI aspect ratio

LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_face.dat

In order to assign a gender to the extracted ROIs, the above target must be modiﬁed to:

[post.Transformation-thres.Target-gender]

Type=TargetROIs

ROIsLabelTarget=fc3.gender.Target

MinOverlap=0.33

FilterMinWidth=5

FilterMinHeight=5

FilterMinAspectRatio=0.5

FilterMaxAspectRatio=1.5

LabelsMapping=${N2D2_MODELS}/IMDBWIKI_target_gender.dat

Here, we use the fc3.gender.Target target to determine the most probable gender of the ROI.
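The ROI extraction configured in the TargetROIs sections above can be pictured with the following Python sketch. The manual does not specify which segmentation algorithm N2D2 uses, so a simple 4-connected flood fill is swapped in here; the function name is hypothetical, the aspect ratio is assumed to be width divided by height, and MinOverlap is not modeled since it only concerns the matching of ROIs to annotations.

```python
def extract_rois(mask, min_w=5, min_h=5, min_ar=0.5, max_ar=1.5):
    """Bounding boxes (x, y, w, h) of 4-connected non-zero regions, filtered by size."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    rois = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Flood fill from (x0, y0) while tracking the extent of the region
            stack = [(x0, y0)]
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            while stack:
                x, y = stack.pop()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            w, h = x_max - x_min + 1, y_max - y_min + 1
            # Counterparts of FilterMinWidth, FilterMinHeight and Filter*AspectRatio
            if w >= min_w and h >= min_h and min_ar <= w / h <= max_ar:
                rois.append((x_min, y_min, w, h))
    return rois


# Toy thresholded map: one 6x6 face region and one 2x2 spurious detection
mask = [[0.0] * 12 for _ in range(10)]
for y in range(1, 7):
    for x in range(2, 8):
        mask[y][x] = 0.9
mask[8][9] = mask[8][10] = mask[9][9] = mask[9][10] = 0.8
print(extract_rois(mask))
```

The small 2x2 region is rejected by the minimum width and height filters, so only the large region yields an ROI.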

5.3.4 Data visualization

For each Target in the network, a corresponding folder is created in the simulation directory, which contains the learning, validation and test confusion matrices. The output estimation of the network for each stimulus is also generated automatically for the test dataset and can be visualized with the ./test.py helper tool. An example is shown in figure 15.

Figure 15: Example of the target visualization helper tool. The panels show the image selection, the labels legend (object type), the network output estimation (most probable object type for each pixel) and the input label of each pixel (dataset annotation).

5.4 Transcoding a learned network in spike-coding

N2D2 embeds an event-based simulator (historically known as 'Xnet') that makes it possible to transcode a whole DNN into a spike-coding version and to evaluate the performance of the resulting spiking neural network.

In this tutorial, we will transcode the LeNet network described in section 5.2.

5.4.1 Render the network compatible with spike simulations

The ﬁrst step is to specify that we want to use a transcode model (allowing both formal and spike

simulation of the same network), by changing the DefaultModel to:

DefaultModel=Transcode_CUDA

In order to perform spike simulations, the input of the network must be of type Environment, which is a derived class of StimuliProvider that adds spike coding support. In the INI model file, it is therefore necessary to replace the [sp] section with an [env] section and to replace all references to sp with env.

Note that these changes have at this point no impact at all on the formal coding simulations.

The beginning of the INI ﬁle should be:

DefaultModel=Transcode_CUDA

; Database

[database]

Type=MNIST_IDX_Database

Validation=0.2 ; Use 20% of the dataset for validation

; Environment

[env]

SizeX=32

SizeY=32

BatchSize=128

[env.Transformation_1]

Type=RescaleTransformation

Width=[env]SizeX

Height=[env]SizeY

[conv1]

Input=env

...

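The dropout layer has no equivalent in spike-coding inference and must be removed. The [fc1.drop] section shown below is the one to delete, and the Input=fc1.drop line of [fc2] must revert to Input=fc1: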

...

[fc1.drop]

Input=fc1

Type=Dropout

NbOutputs=[fc1]NbOutputs

[fc2]

Input=fc1.drop

...

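The softmax layer has no equivalent in spike-coding inference and must be removed as well. The Target must therefore be attached to [fc2]: in the listing below, the [softmax] section is the one to delete and [softmax.Target] is renamed to [fc2.Target]: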

...

[softmax]

Input=fc2

Type=Softmax

NbOutputs=[fc2]NbOutputs

WithLoss=1

[softmax.Target]

[fc2.Target]

...

The network is now compatible with spike-coding simulations. However, we did not specify at

this point how to translate the input stimuli data into spikes, nor the spiking neuron parameters

(threshold value, leak time constant...).

5.4.2 Conﬁgure spike-coding parameters

The first step is to configure how the input stimuli data must be coded into spikes. To this end, we must attach a configuration section to the Environment. Here, we specify a periodic coding with random initial jitter, with a minimum period of 10 ns and a maximum period of 100 µs (PeriodMeanMin and PeriodMeanMax below):

[env]

...

ConfigSection=env.config

[env.config]

; Spike-based computing

StimulusType=JitteredPeriodic

PeriodMin=1,000,000 ; unit = fs

PeriodMeanMin=10,000,000 ; unit = fs

PeriodMeanMax=100,000,000,000 ; unit = fs

PeriodRelStdDev=0.0
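Since the periods are given in femtoseconds, the values are easier to check once converted to more familiar units. The short Python sketch below does this for the three periods of the [env.config] section; the helper name is hypothetical.

```python
FS_PER_NS = 10**6  # femtoseconds per nanosecond
FS_PER_US = 10**9  # femtoseconds per microsecond


def parse_fs(value):
    """Parse an integer written with thousands separators, e.g. '10,000,000'."""
    return int(value.replace(",", ""))


period_min = parse_fs("1,000,000")
period_mean_min = parse_fs("10,000,000")
period_mean_max = parse_fs("100,000,000,000")

print("PeriodMin     =", period_min / FS_PER_NS, "ns")
print("PeriodMeanMin =", period_mean_min / FS_PER_NS, "ns")
print("PeriodMeanMax =", period_mean_max / FS_PER_US, "us")
```

The two mean periods correspond to the 10 ns and 100 µs mentioned in the text, while PeriodMin amounts to 1 ns.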

The next step is to specify the neuron parameters, which will be common to all layers and can therefore be specified in the [common.config] section. In N2D2, the base spike-coding layers use a Leaky Integrate-and-Fire (LIF) neuron model. By default, the leak time constant is zero, resulting in simple Integrate-and-Fire (IF) neurons.

Here we simply specify that the neuron threshold is 1, that the threshold is only positive and that there is no incoming synaptic delay:

[common.config]

...

; Spike-based computing

Threshold=1.0

BipolarThreshold=0

IncomingDelay=0
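To give an intuition of what these parameters mean, here is a minimal Python sketch of an Integrate-and-Fire neuron with a unit, positive-only threshold and no leak. It is not the N2D2 implementation: the function name is hypothetical and the reset by subtraction of the threshold after each spike is an assumption.

```python
def integrate_and_fire(weighted_inputs, threshold=1.0):
    """Return the time steps at which the neuron fires."""
    potential = 0.0
    spikes = []
    for step, value in enumerate(weighted_inputs):
        potential += value          # no leak: pure integration
        if potential >= threshold:  # positive threshold only, no negative spikes
            spikes.append(step)
            potential -= threshold  # ASSUMED reset by subtraction
    return spikes


# Weighted contributions of incoming spikes, one per time step
print(integrate_and_fire([0.5, 0.25, 0.5, -0.5, 1.0, 0.25]))
```

Negative contributions lower the membrane potential but never make the neuron fire, which is the meaning given here to a threshold that is only positive.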

Finally, we can limit the number of spikes required for the computation of each stimulus by

adding a decision delta threshold at the output layer:

[fc2]

...

ConfigSection=common.config,fc2.config

[fc2.Target]

[fc2.config]

; Spike-based computing

TerminateDelta=4

BipolarThreshold=1
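The idea behind a decision delta threshold can be sketched in Python as follows. The sketch assumes that the processing of a stimulus stops as soon as the most active output neuron leads the second most active one by TerminateDelta spikes; this interpretation, the function name and the spike sequence are assumptions made for illustration.

```python
def classify_with_terminate_delta(output_spikes, nb_outputs, terminate_delta):
    """output_spikes: output neuron indices in firing order (nb_outputs >= 2).
    Returns (winning neuron, number of output spikes consumed)."""
    counts = [0] * nb_outputs
    consumed = 0
    for consumed, neuron in enumerate(output_spikes, start=1):
        counts[neuron] += 1
        ranked = sorted(counts, reverse=True)
        if ranked[0] - ranked[1] >= terminate_delta:
            break  # the decision is considered settled
    return counts.index(max(counts)), consumed


# Hypothetical firing order of the 10 output neurons for one stimulus
spikes = [3, 3, 1, 3, 3, 3, 1, 3]
print(classify_with_terminate_delta(spikes, nb_outputs=10, terminate_delta=4))
```

A larger delta makes the decision more robust at the cost of more spikes, and therefore more operations, per stimulus.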

The complete INI model corresponding to this tutorial can be found in models/LeNet_Spike.ini.

Here is a summary of the steps required to reproduce the whole experiment:

./n2d2 "$N2D2_MODELS/LeNet.ini" -learn 6000000 -log 100000

./n2d2 "$N2D2_MODELS/LeNet_Spike.ini" -test

The final recognition rate reported at the end of the spike inference should be almost identical to that of the formal coding network (around 99% for the LeNet network).

Various statistics are available at the end of the spike-coding simulation in the stats_spike

folder and the stats_spike.log ﬁle. Looking in the stats_spike.log ﬁle, one can read the following

line towards the end of the ﬁle:

Read events per virtual synapse per pattern (average): 0.654124

This line reports the average number of accumulation operations per synapse per input stimulus in the network. If this number is below 1.0, it means that the spiking version of the network is more efficient than its formal counterpart in terms of total number of operations!

References

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and

B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proc. of the IEEE

Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: A benchmark. In CVPR,

2009.

L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples:

an incremental bayesian approach tested on 101 object categories. In IEEE. CVPR 2004,

Workshop on Generative-Model Based Vision, 2004.

X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks.

In International conference on artificial intelligence and statistics, pages 249–256, 2010.

P. Goyal, P. Dollár, R. B. Girshick, P. Noordhuis, L. Wesolowski, A. Kyrola, A. Tulloch, Y. Jia,

and K. He. Accurate, large minibatch SGD: training imagenet in 1 hour. CoRR, abs/1706.02677,

2017. URL http://arxiv.org/abs/1706.02677.

B. Graham. Fractional max-pooling. CoRR, abs/1412.6071, 2014.

G. Griﬃn, A. Holub, and P. Perona. Caltech-256 object category dataset, 2007.

K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectiﬁers: Surpassing human-level

performance on imagenet classiﬁcation. In Proceedings of the 2015 IEEE International Conference

on Computer Vision (ICCV), ICCV ’15, pages 1026–1034, 2015. doi: 10.1109/ICCV.2015.123.

S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780,

1997. doi: 10.1162/neco.1997.9.8.1735.

S. Houben, J. Stallkamp, J. Salmen, M. Schlipsing, and C. Igel. Detection of traﬃc signs in

real-world images: The German Traﬃc Sign Detection Benchmark. In International Joint

Conference on Neural Networks, number 1288, 2013.

S. Ioﬀe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing

internal covariate shift. CoRR, abs/1502.03167, 2015.

V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings,

2010.

D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980,

2014. URL http://arxiv.org/abs/1412.6980.

A. Krizhevsky. Learning multiple layers of features from tiny images, 2009.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haﬀner. Gradient-based learning applied to document

recognition. In Proceedings of the IEEE, volume 86, pages 2278–2324, 1998.

P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews. The Extended Cohn-

Kanade Dataset (CK+): A complete dataset for action unit and emotion-speciﬁed expression.

2010.

A. Rakotomamonjy and G. Gasso. Histogram of gradients of time-frequency representations for

audio scene detection, 2014.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy,

A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition

Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. doi:

10.1007/s11263-015-0816-y.

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple

way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:

1929–1958, 2014.

J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. Man vs. computer: Benchmarking machine

learning algorithms for traﬃc sign recognition. Neural Networks, 2012. ISSN 0893-6080. doi:

10.1016/j.neunet.2012.02.016.

A. C. Wilson, R. Roelofs, M. Stern, N. Srebro, and B. Recht. The Marginal Value of Adaptive

Gradient Methods in Machine Learning. arXiv e-prints, art. arXiv:1705.08292, May 2017.

G. Xia, X. Bai, J. Ding, Z. Zhu, S. J. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang. DOTA:

A large-scale dataset for object detection in aerial images. CoRR, abs/1711.10398, 2017. URL

http://arxiv.org/abs/1711.10398.

H. Zhang, Y. N. Dauphin, and T. Ma. Residual learning without normalization via better

initialization. In International Conference on Learning Representations, 2019. URL

https://openreview.net/forum?id=H1gsz30cKX.
