Xilinx OpenCV User Guide

UG1233 (v2018.2) July 2, 2018

Revision History

The following table shows the revision history for this document.

Section | Revision Summary

07/02/2018 Version 2018.2
Houghlines | Updated the function description and its respective tables
xfOpenCV Library Functions | Added a note to the xfOpenCV Library Functions table

06/06/2018 Version 2018.2
Houghlines | Added a new function
Semi Global Method for Stereo Disparity Estimation | Added a new function

04/04/2018 Version 2018.1
General Updates | Minor editorial updates for 2018.1 release
InitUndistortRectifyMapInverse | Added a new function
cornersImgToList() | Added a new function

Table of Contents

Revision History

Chapter 1: Overview
    Basic Features
    xfOpenCV Kernel on the reVISION Platform

Chapter 2: Getting Started
    Prerequisites
    xfOpenCV Library Contents
    Using the xfOpenCV Library
    Changing the Hardware Kernel Configuration
    Using the xfOpenCV Library Functions on Hardware

Chapter 3: xfOpenCV Library API Reference
    xf::Mat Image Container Class
    xfOpenCV Library Functions

Chapter 4: Design Examples Using xfOpenCV Library
    Iterative Pyramidal Dense Optical Flow
    Corner Tracking Using Sparse Optical Flow
    Color Detection
    Difference of Gaussian Filter
    Stereo Vision Pipeline

Appendix A: Additional Resources and Legal Notices
    Xilinx Resources
    Documentation Navigator and Design Hubs
    References
    Please Read: Important Legal Notices

Chapter 1

Overview

This document describes the FPGA device optimized OpenCV library, called the Xilinx® xfOpenCV library, and is intended for application developers using Zynq®-7000 SoC and Zynq UltraScale+ MPSoC devices. The xfOpenCV library has been designed to work in the SDx™ development environment, and provides a software interface for computer vision functions accelerated on an FPGA device. xfOpenCV library functions are mostly similar in functionality to their OpenCV equivalents. Any deviations, if present, are documented.

Note: For more information on the xfOpenCV library prerequisites, see Prerequisites. To familiarize yourself with the steps required to use the xfOpenCV library functions, see Using the xfOpenCV Library.

Basic Features

All xfOpenCV library functions follow a common format. The following properties hold true for all the functions.

• All the functions are designed as templates, and all arguments that are images must be provided as xf::Mat.

• All functions are defined in the xf namespace.

• Some of the major template arguments are:

○ Maximum size of the image to be processed

○ Datatype defining the properties of each pixel

○ Number of pixels to be processed per clock cycle

○ Other compile-time arguments relevant to the functionality.

The xfOpenCV library contains enumerated datatypes which enable you to configure xf::Mat. For more details on xf::Mat, see the xf::Mat Image Container Class.
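
As a rough illustration of the template conventions listed above, the following self-contained C++ sketch mimics the pattern with a stand-in container. The sketch namespace, the PixelType enumeration, the Image struct, and the invert function are hypothetical names invented for this example. They are not part of xfOpenCV, and the real xf::Mat interface is the one documented in the xf::Mat Image Container Class section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, NOT the real xf namespace

// Hypothetical pixel-type tags, standing in for the library's enumerated datatypes.
enum PixelType { GRAY_8U, GRAY_16U };

// Maps a pixel-type tag to the C++ type that stores one pixel.
template <PixelType T> struct PixelOf;
template <> struct PixelOf<GRAY_8U>  { using type = std::uint8_t; };
template <> struct PixelOf<GRAY_16U> { using type = std::uint16_t; };

// Simplified image container. The maximum size, the pixel type, and the
// pixels-per-clock value are fixed at compile time; the actual size is
// chosen at run time and must fit within the maximum.
// PIXELS_PER_CLOCK is only carried as a tag here, because this software
// sketch has no notion of a hardware clock.
template <PixelType T, int MAX_ROWS, int MAX_COLS, int PIXELS_PER_CLOCK>
struct Image {
    using Pixel = typename PixelOf<T>::type;
    int rows;
    int cols;
    std::vector<Pixel> data;

    Image(int r, int c)
        : rows(r), cols(c),
          data(static_cast<std::size_t>(r) * static_cast<std::size_t>(c)) {
        assert(r > 0 && r <= MAX_ROWS && c > 0 && c <= MAX_COLS);
    }
};

// A function template in the same style: inverts every pixel of src into dst.
template <PixelType T, int MAX_ROWS, int MAX_COLS, int PIXELS_PER_CLOCK>
void invert(const Image<T, MAX_ROWS, MAX_COLS, PIXELS_PER_CLOCK>& src,
            Image<T, MAX_ROWS, MAX_COLS, PIXELS_PER_CLOCK>& dst) {
    using Pixel = typename PixelOf<T>::type;
    assert(src.rows == dst.rows && src.cols == dst.cols);
    for (std::size_t i = 0; i < src.data.size(); ++i) {
        dst.data[i] = static_cast<Pixel>(~src.data[i]);
    }
}

}  // namespace sketch
```

With this sketch, a call such as sketch::invert(src, dst) on two sketch::Image<sketch::GRAY_8U, 1080, 1920, 1> objects lets the compiler deduce every template argument from the image type, which is one reason for carrying the configuration in the container type.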

xfOpenCV Kernel on the reVISION Platform

The xfOpenCV library is designed to be used with the SDx™ development environment. xfOpenCV kernels are evaluated on the reVISION™ platform.

The following steps describe the general flow of an example design, where both the input and the output are image files.

1. Read the image using cv::imread().

2. Copy the data to xf::Mat.

3. Call the processing function(s) in xfOpenCV.

4. Copy the data from xf::Mat to cv::Mat.

5. Write the output to an image using cv::imwrite().

The entire code is written as the host code for the pipeline, from which all the calls to xfOpenCV functions are moved to hardware. Functions from OpenCV are used to read and write images in the memory. The image containers for xfOpenCV library functions are xf::Mat objects. For more information, see the xf::Mat Image Container Class.
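
The copy-in, process, copy-out structure of steps 2 to 4 can be sketched without any library dependency, as below. This is only a minimal stand-in under stated assumptions: HostImage plays the role of cv::Mat, KernelImage plays the role of xf::Mat, and thresholdKernel is an arbitrary placeholder for a processing function. All of these names are hypothetical, and the real copy functions of xf::Mat are described in the xf::Mat Image Container Class section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for cv::Mat: a host-side 8-bit grayscale image.
struct HostImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical stand-in for xf::Mat: the container passed to the kernel.
struct KernelImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> pixels;
};

// Step 2: copy the host data into the kernel-side container.
KernelImage copyToKernel(const HostImage& host) {
    return KernelImage{host.rows, host.cols, host.pixels};
}

// Step 3: placeholder processing function (a binary threshold, chosen
// only for illustration; it is not an xfOpenCV call).
void thresholdKernel(const KernelImage& src, KernelImage& dst, std::uint8_t level) {
    dst.rows = src.rows;
    dst.cols = src.cols;
    dst.pixels.resize(src.pixels.size());
    for (std::size_t i = 0; i < src.pixels.size(); ++i) {
        dst.pixels[i] = (src.pixels[i] > level) ? 255 : 0;
    }
}

// Step 4: copy the result back into a host-side container.
HostImage copyToHost(const KernelImage& kernel) {
    return HostImage{kernel.rows, kernel.cols, kernel.pixels};
}

// Steps 2 to 4 chained together. Steps 1 and 5 (reading and writing the
// image files) are left to the caller.
HostImage runPipeline(const HostImage& input, std::uint8_t level) {
    KernelImage in = copyToKernel(input);
    KernelImage out{0, 0, {}};
    thresholdKernel(in, out, level);
    return copyToHost(out);
}
```

In a real design, only the call corresponding to thresholdKernel would be moved to hardware; the copies before and after it remain host code.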

The reVISION platform supports both live and file input-output (I/O) modes. For more details, see the reVISION Getting Started Guide.

• File I/O mode enables the controller to transfer images from the SD card to the hardware kernel. The following steps describe the file I/O mode.

1. The processing system (PS) reads the image frame from the SD card and stores it in the DRAM.

2. The xfOpenCV kernel reads the image from the DRAM, processes it, and stores the output back in the DRAM.

3. The PS reads the output image frame from the DRAM and writes it back to the SD card.

• Live I/O mode enables streaming frames into the platform, processing frames with the xfOpenCV kernel, and streaming out the frames through the appropriate interface. The following steps describe the live I/O mode.

1. Video capture IPs receive a frame and store it in the DRAM.

2. The xfOpenCV kernel fetches the image from the DRAM, processes the image, and stores the output in the DRAM.

3. Display IPs read the output frame from the DRAM and transmit the frame through the appropriate display interface.

The following figure shows the reVISION platform with the xfOpenCV kernel block:

Figure 1: xfOpenCV Kernel on the reVISION Platform

[Block diagram. Labeled blocks: Processing System (PS), ARM Core, Central Interconnect, DDR Controller, HP Ports, HPM/GP Ports, Programmable Logic (PL), HDMI TX and RX IPs, Data Movers, AXI Interconnects, xfOpenCV Kernel Interface, xfOpenCV Kernel, AXIS, AXIMM.]

Note: For more information on the PS-PL interfaces and PL-DDR interfaces, see the Zynq UltraScale+ Device Technical Reference Manual (UG1085).

Chapter 2

Getting Started

This chapter provides the information you need to bring up your design using the xfOpenCV library functions.

Prerequisites

This section lists the prerequisites for using the xfOpenCV library functions on ZCU102 based platforms. The methodology holds true for ZC702 and ZC706 reVISION platforms as well.

• Download and install the SDx development environment according to the directions provided in SDSoC Environments Release Notes, Installation, and Licensing Guide (UG1294). Before launching the SDx development environment on Linux, set the $SYSROOT environment variable to point to the Linux root file system, delivered with the reVISION platform. For example:

export SYSROOT=<local folder>/zcu102_[es2_]rv_ss/sw/aarch64-linux-gnu/sysroot

• Download the Zynq® UltraScale+™ MPSoC Embedded Vision Platform zip file and extract its contents. Create the SDx development environment workspace in the zcu102_[es2_]rv_ss folder of the extracted design file hierarchy. For more details, see the reVISION Getting Started Guide.

• Set up the ZCU102 evaluation board. For more details, see the reVISION Getting Started Guide.

• Download the xfOpenCV library. This library is made available through GitHub. Run the following git clone command to clone the xfOpenCV repository to your local disk:

git clone https://github.com/Xilinx/xfopencv.git

xfOpenCV Library Contents

The following table lists the contents of the xfOpenCV library.

Table 1: xfOpenCV Library Contents

Folder | Details
include | Contains the header files required by the library.
include/common | Contains the common library infrastructure headers, such as types specific to the library.
include/core | Contains the core library functionality headers, such as the math functions.
include/features | Contains the feature extraction kernel function definitions. For example, Harris.
include/imgproc | Contains all the kernel function definitions, except the ones available in the features folder.
examples | Contains the sample test bench code to facilitate running unit tests. The examples/ folder contains the folders with algorithm names. Each algorithm folder contains host files, a .json file, and a data folder. For more details on how to use the xfOpenCV library, see xfOpenCV Kernel on the reVISION Platform.

Using the xfOpenCV Library

This section describes using the xfOpenCV library in the SDx development environment.

Note: The instructions in this section assume that you have downloaded and installed all the required packages. For more information, see the Prerequisites.

The xfOpenCV library is structured as shown in Table 1: xfOpenCV Library Contents. The include folder contains all the components needed to build a computer vision or image processing pipeline using the library. The folders common and core contain the infrastructure that the library functions need for basic functions, the Mat class, and macros. The library functions are categorized into two folders, features and imgproc, based on the operation they perform. The names of the folders are self-explanatory.

To work with the library functions, you need to include the path to the include folder in the SDx project. After you add the include folder's path to the compiler settings, you can include the relevant header files for the library functions you will be working with. For example, if you would like to work with the Harris Corner Detector and the Bilateral Filter, you must use the following lines in the host code:

#include "features/xf_harris.hpp" //for Harris Corner Detector

#include "imgproc/xf_bilateral_filter.hpp" //for Bilateral Filter

Aer the headers are included, you can work with the library funcons as described in the

Chapter 3: xfOpenCV Library API Reference using the examples in the examples folder as

reference.

The following table gives the name of the header le, including the folder name, which contains

the library funcon.

Table 2: xfOpenCV Library Contents

Function Name | File Path in the include folder
xf::accumulate | imgproc/xf_accumulate_image.hpp
xf::accumulateSquare | imgproc/xf_accumulate_squared.hpp
xf::accumulateWeighted | imgproc/xf_accumulate_weighted.hpp
xf::absdiff, xf::add, xf::subtract, xf::bitwise_and, xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor | core/xf_arithm.hpp
xf::bilateralFilter | imgproc/xf_bilateral_filter.hpp
xf::boxFilter | imgproc/xf_box_filter.hpp
xf::Canny | imgproc/xf_canny.hpp
xf::merge | imgproc/xf_channel_combine.hpp
xf::extractChannel | imgproc/xf_channel_extract.hpp
xf::convertTo | imgproc/xf_convert_bitdepth.hpp
xf::filter2D | imgproc/xf_custom_convolution.hpp
xf::nv122iyuv, xf::nv122rgba, xf::nv122yuv4, xf::nv212iyuv, xf::nv212rgba, xf::nv212yuv4, xf::rgba2yuv4, xf::rgba2iyuv, xf::rgba2nv12, xf::rgba2nv21, xf::uyvy2iyuv, xf::uyvy2nv12, xf::uyvy2rgba, xf::yuyv2iyuv, xf::yuyv2nv12, xf::yuyv2rgba | imgproc/xf_cvt_color.hpp
xf::dilate | imgproc/xf_dilation.hpp
xf::erode | imgproc/xf_erosion.hpp
xf::fast | features/xf_fast.hpp
xf::GaussianBlur | imgproc/xf_gaussian_filter.hpp
xf::cornerHarris | features/xf_harris.hpp
xf::calcHist | imgproc/xf_histogram.hpp
xf::equalizeHist | imgproc/xf_hist_equalize.hpp
xf::HOGDescriptor | imgproc/xf_hog_descriptor.hpp
xf::HoughLines | imgproc/xf_houghlines.hpp
xf::integralImage | imgproc/xf_integral_image.hpp
xf::densePyrOpticalFlow | imgproc/xf_pyr_dense_optical_flow.hpp
xf::DenseNonPyrLKOpticalFlow | imgproc/xf_dense_npyr_optical_flow.hpp
xf::LUT | imgproc/xf_lut.hpp
xf::magnitude | core/xf_magnitude.hpp
xf::MeanShift | imgproc/xf_mean_shift.hpp
xf::meanStdDev | core/xf_mean_stddev.hpp
xf::medianBlur | imgproc/xf_median_blur.hpp
xf::minMaxLoc | core/xf_min_max_loc.hpp
xf::OtsuThreshold | imgproc/xf_otsuthreshold.hpp
xf::phase | core/xf_phase.hpp
xf::pyrDown | imgproc/xf_pyr_down.hpp
xf::pyrUp | imgproc/xf_pyr_up.hpp
xf::remap | imgproc/xf_remap.hpp
xf::resize | imgproc/xf_resize.hpp
xf::Scharr | imgproc/xf_scharr.hpp
xf::SemiGlobalBM | imgproc/xf_sgbm.hpp
xf::Sobel | imgproc/xf_sobel.hpp
xf::StereoPipeline | imgproc/xf_stereo_pipeline.hpp
xf::StereoBM | imgproc/xf_stereoBM.hpp
xf::SVM | imgproc/xf_svm.hpp
xf::Threshold | imgproc/xf_threshold.hpp
xf::warpAffine | imgproc/xf_warpaffine.hpp
xf::warpPerspective | imgproc/xf_warpperspective.hpp
xf::warpTransform | imgproc/xf_warp_transform.hpp

The dierent ways to use the xfOpenCV library examples are listed below:

•Downloading and Using xfOpenCV Libraries from SDx GUI

•Building a Project Using the Example Makeles on Linux

•Using reVISION Samples on the reVISION Plaorm

•Using the xfOpenCV Library on a non-reVISION Plaorm

Downloading and Using xfOpenCV Libraries from

SDx GUI

You can download xfOpenCV directly from SDx GUI. To build a project using the example

makeles on the Linux plaorm:

1. From SDx IDE, click Xilinx and select SDx Libraries.

2. Click Download next to the Xilinx xfOpenCV Library.

Figure 2: SDx Libraries

The library is downloaded into <home directory>/Xilinx/SDx/2018.2/xfopencv. After the library is downloaded, the entire set of examples in the library is available in the list of templates while creating a new project.

Note: The library can be added to any project from the IDE menu options.

3. To add a library to a project, from SDx IDE, click Xilinx and select SDx Libraries.

4. Select Xilinx xfOpenCV Library and click Add to project. The dropdown menu lists the projects to which the library can be added.

All the headers in the include/ folder of the xfOpenCV library are copied into the local project directory as <project_dir>/libs/xfopencv/include. All the settings required for the libraries to run are also set when this action is completed.

Building a Project Using the Example Makefiles on Linux

Use the following steps to build a project using the example makefiles on the Linux platform:

1. Open a terminal.

2. Set the environment variable SYSROOT to the <path to sysroot folder>/sw/aarch64-linux-gnu/sysroot folder.

3. Change the platform variable in the makefile to point to the downloaded platform folder. Ensure that the folder name of the downloaded platform is unchanged.

4. Change the directory to the location where you want to build the example.

cd <path to example>

5. Set the environment variables to run SDx development environment.

• For c shell:

source <SDx tools install path>/settings.csh

• For bash shell:

source <SDx tools install path>/settings.sh

6. Type the make command in the terminal. The sd_card folder is created and can be found in

the <path to example> folder.

Using reVISION Samples on the reVISION Platform

Use the following steps to run a unit test for the bilateral filter on zcu102_es2_reVISION:

1. Launch the SDx development environment using the desktop icon or the Start menu.

The Workspace Launcher dialog appears.

2. Click Browse to enter a workspace folder used to store your projects (you can use workspace folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.

Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set the $SYSROOT environment variable. The variable usually holds the file path to the Linux root file system.

The SDx development environment window opens with the Welcome tab visible when you create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized if you do not wish to use it.

3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.

The New Project dialog box opens.

4. Specify the name of the project. For example, Bilateral.

5. Click Next.

The Choose Hardware Platform page appears.

6. From the Choose Hardware Platform page, click the Add Custom Platform button.

7. Browse to the directory where you extracted the reVISION platform files. Ensure that you select the zcu102_es2_reVISION folder.

8. From the Choose Hardware Platform page, select zcu102_es2_reVISION (custom).

9. Click Next.

The Templates page appears, containing source code examples for the selected platform.

10. From the list of application templates, select bilateral - File I/O and click Finish.

11. Click the Active build configurations drop-down from the SDx Project Settings window, to select the active configuration or create a build configuration.

The standard build configurations are Debug and Release. To get the best runtime performance, switch to the Release build configuration, as it uses a higher compiler optimization setting than the Debug build configuration.

Figure 3: SDx Project Settings - Active Build Configuration

12. Set the Data motion network clock frequency (MHz) to the required frequency on the SDx Project Settings page.

13. In the Project Explorer view, right-click the project and select Build Project, or press Ctrl+B, to build the project.

14. Copy the contents of the newly created sd_card folder to the SD card.

The sd_card folder contains all the files required to run designs on the ZCU102 board.

15. Insert the SD card in the ZCU102 board card slot and switch it ON.

Note: A serial port emulator (Teraterm/minicom) is required to interface the user commands to the board.

16. Upon successful boot, run the following commands in the Teraterm terminal (serial port emulator):

#cd /media/card

#remount

17. Run the .elf file for the respective functions.

For more information, see Using the xfOpenCV Library Functions on Hardware.

Using the xfOpenCV Library on a non-reVISION Platform

This section describes using the xfOpenCV library on a non-reVISION platform, in the SDx development environment. The examples in xfOpenCV require OpenCV libraries for successful compilation. In the case of a non-reVISION platform, you are responsible for providing the required OpenCV libraries, either as part of the platform or otherwise.

Note: The instructions in this section assume that you have downloaded and installed all the required packages. For more information, see the Prerequisites.

Use the following steps to import the xfOpenCV library into an SDx project and execute it on a custom platform:

1. Launch the SDx development environment using the desktop icon or the Start menu.

The Workspace Launcher dialog appears.

2. Click Browse to enter a workspace folder used to store your projects (you can use workspace folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.

Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set the $SYSROOT environment variable. The variable usually holds the file path to the Linux root file system.

The SDx development environment window opens with the Welcome tab visible when you create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized if you do not wish to use it.

3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.

The New Project dialog box opens.

4. Specify the name of the project. For example, Test.

5. Click Next.

The Choose Hardware Platform page appears.

6. From the Choose Hardware Platform page, select a suitable platform. For example, zcu102.

7. Click Next.

The Choose Software Platform and Target CPU page appears.

8. From the Choose Software Platform and Target CPU page, select an appropriate software platform and the target CPU. For example, select A9 from the CPU dropdown list for ZC702 and ZC706 reVISION platforms.

9. Click Next. The Templates page appears, containing source code examples for the selected platform.

10. From the list of application templates, select Empty Application and click Finish.

The New Project dialog box closes. A new project with the specified configuration is created. The SDx Project Settings view appears. Notice the progress bar in the lower right border of the view. Wait a few moments for the C/C++ Indexer to finish.

11. The standard build configurations are Debug and Release. To get the best run-time performance, switch to the Release build configuration, as it uses a higher compiler optimization setting than the Debug build configuration.

Figure 4: SDx Project Settings - Active Build Configuration

12. Set the Data motion network clock frequency (MHz) to the required frequency on the SDx Project Settings page.

13. Select the Generate bitstream and Generate SD card image check boxes.

14. Right-click on the newly created project in the Project Explorer view.

15. From the context menu that appears, select C/C++ Build Settings.

The Properties for <project> dialog box appears.

16. Click the Tool Settings tab.

17. Expand the SDS++ Compiler → Directories tree.

18. Click the icon to add the <xfopencv_location>\include and <OpenCV_location>\include folder locations to the Include Paths list.

Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

Figure 5: SDS++ Compiler Settings

19. Click Apply.

20. Expand the SDS++ Linker → Libraries tree.

21. Click the icon and add the following libraries to the Libraries (-l) list. These libraries are required by OpenCV.

• opencv_core
• opencv_imgproc
• opencv_imgcodecs
• opencv_features2d
• opencv_calib3d
• opencv_flann
• lzma
• tiff
• png16
• z
• jpeg
• dl
• rt
• webp

22. Click the icon and add the <opencv_Location>/lib folder location to the Libraries search path (-L) list.

Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

Figure 6: SDS++ Linker Settings

23. Click Apply to save the configuration.

24. Click OK to close the Properties for <project> dialog box.

25. Expand the newly created project tree in the Project Explorer view.

26. Right-click the src folder and select Import. The Import dialog box appears.

27. Select File System and click Next.

28. Click Browse to navigate to the <xfopencv_Location>/examples folder location.

29. Select the folder that corresponds to the library function that you want to import. For example, accumulate.

Figure 7: Import Library Example Source Files

30. Right-click the library function in the Project Explorer view and select Toggle HW/SW to move the function to the hardware.

Figure 8: Moving a Library Function to the Hardware

31. In the Project Explorer view, right-click the project and select Build Project, or press Ctrl+B, to build the project.

The build process may take anywhere from a few minutes to several hours, depending on the power of the host machine and the complexity of the design. By far, the most time is spent processing the routines that have been tagged for realization in hardware.

32. Copy the contents of the newly created .\<workspace>\<function>\Release\sd_card folder to the SD card. The sd_card folder contains all the files required to run designs on a board.

33. Insert the SD card in the board card slot and switch it ON.

Note: A serial port emulator (Teraterm/minicom) is required to interface the user commands to the board.

34. Upon successful boot, navigate to the /mnt folder by running the following command at the prompt:

#cd /mnt

Note: It is assumed that the OpenCV libraries are a part of the root filesystem. If not, add the location of the OpenCV libraries to LD_LIBRARY_PATH using the $ export LD_LIBRARY_PATH=<location of OpenCV libraries>/lib command.

35. Run the .elf executable file. For more information, see Using the xfOpenCV Library Functions on Hardware.

Changing the Hardware Kernel Configuration

Use the following steps to change the hardware kernel configuration:

1. Update the <path to xfOpenCV git folder>/xfOpenCV/examples/<function>/xf_config_params.h file.

2. Update the makefile along with the xf_config_params.h file:

a. Find the line with the function name in the makefile. For the bilateral filter, the line in the makefile will be xf::BilateralFilter<3,1,0,1080,1920,1>.

b. Update the template parameters in the makefile to reflect changes made in the xf_config_params.h file. For more details, see Chapter 3: xfOpenCV Library API Reference.
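
The reason both files must change together is that template parameters are fixed at compile time, so a value written by hand in the makefile cannot follow an edit to the header on its own. The following C++ sketch illustrates that dependency under stated assumptions: the CFG_* macros, the DemoKernel template, and the kMakefile* constants are hypothetical names made up for this example, and they do not reflect the actual contents of xf_config_params.h or the actual parameter list of any xfOpenCV function.

```cpp
#include <cassert>

// Hypothetical configuration macros, standing in for entries that a file
// such as xf_config_params.h might define. The names are assumptions.
#define CFG_HEIGHT 1080
#define CFG_WIDTH 1920
#define CFG_NPC 1  // pixels processed per clock cycle

// Hypothetical kernel template. Every parameter is a compile-time constant,
// so changing a macro above yields a different template instantiation.
template <int ROWS, int COLS, int NPC>
struct DemoKernel {
    static_assert(ROWS > 0 && COLS > 0 && NPC > 0,
                  "template parameters must be positive");
    static constexpr long maxPixels = static_cast<long>(ROWS) * COLS;
};

// The instantiation that the host code would obtain from the header.
using ConfiguredKernel = DemoKernel<CFG_HEIGHT, CFG_WIDTH, CFG_NPC>;

// Values as they would be copied by hand into the makefile line that names
// the function. They do not track the macros automatically.
constexpr int kMakefileRows = 1080;
constexpr int kMakefileCols = 1920;
constexpr int kMakefileNpc = 1;

// Fails to compile if the two sets of values drift apart, which is the
// situation step 2 above is meant to prevent.
static_assert(kMakefileRows == CFG_HEIGHT && kMakefileCols == CFG_WIDTH &&
                  kMakefileNpc == CFG_NPC,
              "makefile template parameters are out of sync with the header");
```

Editing CFG_HEIGHT alone in this sketch triggers the final static_assert, which mirrors the mismatch that arises when the header is updated but the makefile line is not.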

Using the xfOpenCV Library Functions on Hardware

The following table lists the xfOpenCV library functions and the command to run the respective examples on hardware. It is assumed that your design is completely built and the board has booted up correctly.

Table 3: Using the xfOpenCV Library Function on Hardware

Example Function Name Usage on Hardware

accumulate xf::accumulate ./<executable name>.elf <path to input

image 1> <path to input image 2>

accumulatesquared xf::accumulateSquare ./<executable name>.elf <path to input

image 1> <path to input image 2>

accumulateweighted xf::accumulateWeighted ./<executable name>.elf <path to input

image 1> <path to input image 2>

arithm xf::absdiff, xf::add, xf::subtract, xf::bitwise_and,

xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor

./<executable name>.elf <path to input

image 1> <path to input image 2>

Bilateralfilter xf::bilateralFilter ./<executable name>.elf <path to input

image>

Boxfilter xf::boxFilter ./<executable name>.elf <path to input

image>

Canny xf::Canny ./<executable name>.elf <path to input

image>

channelcombine xf::merge ./<executable name>.elf <path to input

image 1> <path to input image 2> <path to

input image 3> <path to input image 4>

Chapter 2: Getting Started

UG1233 (v2018.2) July 2, 2018 www.xilinx.com [placeholder text]

Xilinx OpenCV User Guide 21

Table 3: Using the xfOpenCV Library Function on Hardware (cont'd)

Example Function Name Usage on Hardware

Channelextract xf::extractChannel ./<executable name>.elf <path to input

image>

Colordetect xf::RGB2HSV, xf::colorthresholding, xf:: erode,

and xf:: dilate

./<executable name>.elf <path to input

image>

Convertbitdepth xf::convertTo ./<executable name>.elf <path to input

image>

Cornertracker xf::cornerTracker ./exe <input video> <no. of frames> <Harris

Threshold> <No. of frames after which Harris

Corners are Reset>

Customconv xf::filter2D ./<executable name>.elf <path to input

image>

cvtcolor IYUV2NV12 xf::iyuv2nv12 ./<executable name>.elf <path to input

image 1> <path to input image 2> <path to

input image 3>

cvtcolor IYUV2RGBA xf::iyuv2rgba ./<executable name>.elf <path to input

image 1> <path to input image 2> <path to

input image 3>

cvtcolor IYUV2YUV4 xf::iyuv2yuv4 ./<executable name>.elf <path to input

image 1> <path to input image 2> <path to

input image 3> <path to input image 4>

<path to input image 5> <path to input image

cvtcolor NV122IYUV xf::nv122iyuv ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor NV122RGBA xf::nv122rgba ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor NV122YUV4 xf::nv122yuv4 ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor NV212IYUV xf::nv212iyuv ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor NV212RGBA xf::nv212rgba ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor NV212YUV4 xf::nv212yuv4 ./<executable name>.elf <path to input

image 1> <path to input image 2>

cvtcolor RGBA2YUV4 xf::rgba2yuv4 ./<executable name>.elf <path to input

image>

cvtcolor RGBA2IYUV xf::rgba2iyuv ./<executable name>.elf <path to input

image>

cvtcolor RGBA2NV12 xf::rgba2nv12 ./<executable name>.elf <path to input

image>

cvtcolor RGBA2NV21 xf::rgba2nv21 ./<executable name>.elf <path to input

image>

cvtcolor UYVY2IYUV xf::uyvy2iyuv ./<executable name>.elf <path to input

image>

cvtcolor UYVY2NV12 xf::uyvy2nv12 ./<executable name>.elf <path to input

image>

cvtcolor UYVY2RGBA xf::uyvy2rgba ./<executable name>.elf <path to input

image>

cvtcolor YUYV2IYUV xf::yuyv2iyuv ./<executable name>.elf <path to input

image>


| cvtcolor YUYV2NV12 | xf::yuyv2nv12 | ./<executable name>.elf <path to input image> |
| cvtcolor YUYV2RGBA | xf::yuyv2rgba | ./<executable name>.elf <path to input image> |
| Difference of Gaussian | xf::GaussianBlur, xf::duplicateMat, xf::delayMat, and xf::subtract | ./<exe-name>.elf <path to input image> |
| Dilation | xf::dilate | ./<executable name>.elf <path to input image> |
| Erosion | xf::erode | ./<executable name>.elf <path to input image> |
| Fast | xf::fast | ./<executable name>.elf <path to input image> |
| Gaussianfilter | xf::GaussianBlur | ./<executable name>.elf <path to input image> |
| Harris | xf::cornerHarris | ./<executable name>.elf <path to input image> |
| Histogram | xf::calcHist | ./<executable name>.elf <path to input image> |
| Histequialize | xf::equalizeHist | ./<executable name>.elf <path to input image> |
| Hog | xf::HOGDescriptor | ./<executable name>.elf <path to input image> |
| Houghlines | xf::HoughLines | ./<executable name>.elf <path to input image> |
| Integralimg | xf::integralImage | ./<executable name>.elf <path to input image> |
| Lkdensepyrof | xf::densePyrOpticalFlow | ./<executable name>.elf <path to input image 1> <path to input image 2> |
| Lknpyroflow | xf::DenseNonPyrLKOpticalFlow | ./<executable name>.elf <path to input image 1> <path to input image 2> |
| Lut | xf::LUT | ./<executable name>.elf <path to input image> |
| Magnitude | xf::magnitude | ./<executable name>.elf <path to input image> |
| meanshifttracking | xf::MeanShift | ./<executable name>.elf <path to input video/input image files> <Number of objects to track> |
| meanstddev | xf::meanStdDev | ./<executable name>.elf <path to input image> |
| medianblur | xf::medianBlur | ./<executable name>.elf <path to input image> |
| Minmaxloc | xf::minMaxLoc | ./<executable name>.elf <path to input image> |
| otsuthreshold | xf::OtsuThreshold | ./<executable name>.elf <path to input image> |
| Phase | xf::phase | ./<executable name>.elf <path to input image> |
| Pyrdown | xf::pyrDown | ./<executable name>.elf <path to input image> |


| Pyrup | xf::pyrUp | ./<executable name>.elf <path to input image> |
| remap | xf::remap | ./<executable name>.elf <path to input image> <path to mapx data> <path to mapy data> |
| Resize | xf::resize | ./<executable name>.elf <path to input image> |
| scharrfilter | xf::Scharr | ./<executable name>.elf <path to input image> |
| SemiGlobalBM | xf::SemiGlobalBM | ./<executable name>.elf <path to left image> |
| sobelfilter | xf::Sobel | ./<executable name>.elf <path to input image> |
| stereopipeline | xf::StereoPipeline | ./<executable name>.elf <path to left image> |
| stereolbm | xf::StereoBM | ./<executable name>.elf <path to left image> |
| Svm | xf::SVM | ./<executable name>.elf |
| threshold | xf::Threshold | ./<executable name>.elf <path to input image> |
| warpaffine | xf::warpAffine | ./<executable name>.elf <path to input image> |
| warpperspective | xf::warpPerspective | ./<executable name>.elf <path to input image> |
| warptransform | xf::warpTransform | ./<executable name>.elf <path to input image> |


Chapter 3

xfOpenCV Library API Reference

To facilitate local memory allocation on FPGA devices, the xfOpenCV library functions are provided in templates with compile-time parameters. Data is explicitly copied from cv::Mat to xf::Mat and is stored in physically contiguous memory to achieve the best possible performance. After processing, the output in xf::Mat is copied back to cv::Mat to write it into the memory.

xf::Mat Image Container Class

xf::Mat is a template class that serves as a container for storing image data and its attributes.

Note: The xf::Mat image container class is similar to the cv::Mat class of the OpenCV library.

Class Definition

template<int TYPE, int ROWS, int COLS, int NPC>
class Mat {
public:
    Mat();  // default constructor
    Mat(int _rows, int _cols);
    Mat(int _rows, int _cols, void *_data);
    Mat(int _size, int _rows, int _cols);
    ~Mat();

    void init(int _rows, int _cols);
    void copyTo(XF_PTSNAME(TYPE,NPC)* fromData);
    unsigned char * copyFrom();

    Mat(const Mat& src);
    Mat& operator=(const Mat& src);

    template<int DST_T>
    void convertTo(Mat<DST_T,ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0);

    int rows, cols, size;  // actual image size

#ifndef __SYNTHESIS__
    XF_TNAME(TYPE,NPC)*data;
#else
    XF_TNAME(TYPE,NPC) data[ROWS*(COLS>> (XF_BITSHIFT(NPC)))];
#endif
};

Parameter Descriptions

The following table lists the xf::Mat class parameters and their descriptions:

Table 4: xf::Mat Class Parameter Descriptions

| Parameter | Description |
|---|---|
| rows | The number of rows in the image or height of the image. |
| cols | The number of columns in the image or width of the image. |
| size | The number of words stored in the data member. The value is calculated using rows*cols/(number of pixels packed per word). |
| *data | The pointer to the words that store the pixels of the image. |

Member Functions Description

The following table lists the member functions and their descriptions:

Table 5: xf::Mat Member Function Descriptions

| Member Functions | Description |
|---|---|
| Mat() | This default constructor initializes the Mat object sizes, using the template parameters ROWS and COLS. |
| Mat(int _rows, int _cols) | This constructor initializes the Mat object using arguments _rows and _cols. |
| Mat(const xf::Mat &_src) | This constructor clones one Mat object into another. New memory is allocated for the newly created object. |
| Mat(int _rows, int _cols, void *_data) | This constructor initializes the Mat object using arguments _rows, _cols, and _data. When this constructor is used, the *data member of the Mat object points to the memory allocated for the _data argument. No new memory is allocated for the *data member. |
| convertTo(Mat<DST_T,ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0) | Refer to xf::convertTo |
| copyTo(* fromData) | Copies the data from the fromData pointer into the physically contiguous memory allocated inside the constructor. |
| copyFrom() | Returns the pointer to the first location of the *data member. |
| ~Mat() | This is the default destructor of the Mat object. |

Template Parameter Descriptions

Template parameters of the xf::Mat class are used to set the depth of the pixel, the number of channels in the image, the number of pixels packed per word, and the maximum number of rows and columns of the image. The following table lists the template parameters and their descriptions:

Table 6: xf::Mat Template Parameter Descriptions

| Parameters | Description |
|---|---|
| TYPE | Type of the pixel data. For example, XF_8UC1 stands for 8-bit unsigned and one channel pixel. More types can be found in include/common/xf_params.h. |
| ROWS | Maximum height of an image. |
| COLS | Maximum width of an image. |
| NPC | The number of pixels to be packed per word. For instance, XF_NPPC1 for 1 pixel per word; and XF_NPPC8 for 8 pixels per word. |


Pixel-Level Parallelism

The amount of parallelism to be implemented in a function from xfOpenCV is kept as a configurable parameter. In most functions, there are two options for processing data.

• Single-pixel processing

• Processing eight pixels in parallel

The following table describes the options available for specifying the level of parallelism required in a particular function:

Table 7: Options Available for Specifying the Level of Parallelism

| Option | Description |
|---|---|
| XF_NPPC1 | Process 1 pixel per clock cycle |
| XF_NPPC2 | Process 2 pixels per clock cycle |
| XF_NPPC8 | Process 8 pixels per clock cycle |

Macros to Work With Parallelism

There are two macros that are defined to work with parallelism.

• The XF_NPIXPERCYCLE(flags) macro resolves to the number of pixels processed per cycle.

○ XF_NPIXPERCYCLE(XF_NPPC1) resolves to 1

○ XF_NPIXPERCYCLE(XF_NPPC2) resolves to 2

○ XF_NPIXPERCYCLE(XF_NPPC8) resolves to 8

• The XF_BITSHIFT(flags) macro resolves to the number of times to shift the image size to the right to arrive at the final data transfer size for parallel processing.

○ XF_BITSHIFT(XF_NPPC1) resolves to 0

○ XF_BITSHIFT(XF_NPPC2) resolves to 1

○ XF_BITSHIFT(XF_NPPC8) resolves to 3
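As a rough illustration of how the two macros relate, the sketch below mirrors them with plain C++ functions. The names Nppc, pixelsPerCycle, bitShift, and wordsPerImage are hypothetical stand-ins and not library APIs, and the enumerator values are chosen for this example only.

```cpp
#include <cassert>

// Hypothetical stand-ins for the XF_NPPC* options (values chosen for this sketch only).
enum Nppc { NPPC1 = 1, NPPC2 = 2, NPPC8 = 8 };

// Mirrors what XF_NPIXPERCYCLE(flags) is described as resolving to.
constexpr int pixelsPerCycle(Nppc npc) { return static_cast<int>(npc); }

// Mirrors what XF_BITSHIFT(flags) is described as resolving to:
// log2 of the number of pixels per cycle.
constexpr int bitShift(Nppc npc) {
    return npc == NPPC8 ? 3 : (npc == NPPC2 ? 1 : 0);
}

// Number of words transferred for an image: rows * (cols >> shift).
// The width is assumed to be a multiple of the pixels-per-cycle value.
inline int wordsPerImage(int rows, int cols, Nppc npc) {
    assert(cols % pixelsPerCycle(npc) == 0);
    return rows * (cols >> bitShift(npc));
}
```

Under these assumptions, a 1080x1920 image takes 2073600 words at one pixel per cycle and 259200 words at eight pixels per cycle.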

Pixel Types

Parameter types will differ, depending on the combination of the depth of pixels and the number of channels in the image. The generic nomenclature of the parameter is listed below.

XF_<Number of bits per pixel><signed (S) or unsigned (U) or float (F)>C<number of channels>

For example, for an 8-bit pixel - unsigned - 1 channel the data type is XF_8UC1.
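To make the naming scheme concrete, the following sketch decodes a type name into its three fields. PixelTypeInfo and parsePixelType are hypothetical helpers written for this illustration; they are not part of the xfOpenCV library, which defines these names as compile-time constants.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Fields encoded in a pixel-type name such as "XF_8UC1".
struct PixelTypeInfo {
    int bits;      // bit depth given in the name
    char kind;     // 'U' unsigned, 'S' signed, 'F' float
    int channels;  // number of channels
};

// Hypothetical helper (not an xfOpenCV API): decodes the nomenclature
// XF_<bits><U|S|F>C<channels>. Returns false if the name does not match.
inline bool parsePixelType(const std::string &name, PixelTypeInfo &out) {
    int bits = 0, channels = 0;
    char kind = 0, tail = 0;
    // The trailing %c only succeeds if extra characters follow the channel count.
    const int n = std::sscanf(name.c_str(), "XF_%d%cC%d%c",
                              &bits, &kind, &channels, &tail);
    if (n != 3) return false;
    if (kind != 'U' && kind != 'S' && kind != 'F') return false;
    if (bits <= 0 || channels <= 0) return false;
    out.bits = bits;
    out.kind = kind;
    out.channels = channels;
    return true;
}
```

For instance, "XF_8UC1" decodes to 8 bits, unsigned, 1 channel, and "XF_32FC1" decodes to 32 bits, float, 1 channel.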


The following table lists the available data types for the xf::Mat class:

Table 8: xf::Mat Class - Available Data Types

| Option | Number of bits per Pixel | Unsigned/Signed/Float Type | Number of Channels |
|---|---|---|---|
| XF_8UC1 | 8 | Unsigned | 1 |
| XF_16UC1 | 16 | Unsigned | 1 |
| XF_16SC1 | 16 | Signed | 1 |
| XF_32UC1 | 32 | Unsigned | 1 |
| XF_32FC1 | 32 | Float | 1 |
| XF_32SC1 | 32 | Signed | 1 |
| XF_8UC2 | 8 | Unsigned | 2 |
| XF_8UC4 | 8 | Unsigned | 4 |
| XF_2UC1 | 2 | Unsigned | 1 |

Manipulating Data Type

Based on the number of pixels to process per clock cycle and the type parameter, there are different possible data types. The xfOpenCV library uses these data types for internal processing and inside the xf::Mat class. The following are a few supported types:

• XF_TNAME(TYPE,NPPC) resolves to the data type of the data member of the xf::Mat object. For instance, XF_TNAME(XF_8UC1,XF_NPPC8) resolves to ap_uint<64>.

• Word width = pixel depth * number of channels * number of pixels to process per cycle (NPPC).

• XF_DTUNAME(TYPE,NPPC) resolves to the data type of the pixel. For instance, XF_DTUNAME(XF_32FC1,XF_NPPC1) resolves to float.

• XF_PTSNAME(TYPE,NPPC) resolves to the ‘C’ data type of the pixel. For instance, XF_PTSNAME(XF_16UC1,XF_NPPC2) resolves to unsigned short.
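The word-width rule in the list above can be checked with a one-line calculation. The helper below is a hypothetical illustration, not a library function; in the library the width is resolved at compile time through XF_TNAME.

```cpp
#include <cassert>

// Word width in bits = pixel depth * number of channels * pixels per cycle.
// Hypothetical helper for illustration only.
inline int wordWidthBits(int depthBits, int channels, int pixelsPerCycle) {
    assert(depthBits > 0 && channels > 0 && pixelsPerCycle > 0);
    return depthBits * channels * pixelsPerCycle;
}
```

With an 8-bit, 1-channel pixel and 8 pixels per cycle, the result is 64, which agrees with the ap_uint<64> example given for XF_TNAME(XF_8UC1,XF_NPPC8).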

Note: ap_uint<>, ap_int<>, ap_fixed<>, and ap_ufixed<> types belong to the high-level synthesis (HLS) library. For more information, see the Vivado Design Suite User Guide: High-Level Synthesis (UG902).

Sample Illustration

The following code illustrates the configurations that are required to build the Gaussian filter on an image, using the SDSoC tool for the Zynq® UltraScale™ platform.

Note: In a real-time application, where the video is streamed in, it is recommended that the frame buffer be located in an xf::Mat and processed using the library function. The resulting location pointer is then passed to the display IPs.


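xf_config_params.h

    #define FILTER_SIZE_3 1
    #define FILTER_SIZE_5 0
    #define FILTER_SIZE_7 0

    #define RO 0  // when set to 1, selects 8 pixels per cycle (XF_NPPC8)
    #define NO 1  // when set to 1, selects 1 pixel per cycle (XF_NPPC1)

    #if NO
    #define NPC1 XF_NPPC1
    #endif

    #if RO
    #define NPC1 XF_NPPC8
    #endif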

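xf_gaussian_filter_tb.cpp

    // Assumed to provide HEIGHT, WIDTH, NPC1, and the OpenCV/xfOpenCV headers.
    #include "xf_gaussian_filter_config.h"

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            return 1;  // expects: <path to input image>
        }
        cv::Mat in_img, out_img, ocv_ref;
        cv::Mat in_gray, in_gray1, diff;

        in_img = cv::imread(argv[1], 1);  // read the input as a color image
        if (in_img.data == NULL) {
            return 1;  // the image could not be read
        }
        cv::extractChannel(in_img, in_gray, 1);  // use channel 1 as the grayscale input

        float sigma = 0.5f;  // illustrative value; this guide does not specify one

        xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgInput(in_img.rows, in_img.cols);
        xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgOutput(in_img.rows, in_img.cols);

        imgInput.copyTo(in_gray.data);  // copy the pixels into the xf::Mat

        gaussian_filter_accel(imgInput, imgOutput, sigma);

        // Write output image
        xf::imwrite("hls_out.jpg", imgOutput);
        return 0;
    }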

xf_gaussian_filter_accel.cpp

    #include "xf_gaussian_filter_config.h"

    void gaussian_filter_accel(xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1> &imgInput,
                               xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1> &imgOutput,
                               float sigma)
    {
        xf::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, XF_8UC1, HEIGHT, WIDTH, NPC1>(imgInput, imgOutput, sigma);
    }

xf_gaussian_filter.hpp

    #pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
    #pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
    #pragma SDS data access_pattern("_src.data":SEQUENTIAL)
    #pragma SDS data copy("_src.data"[0:"_src.size"])
    #pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
    #pragma SDS data copy("_dst.data"[0:"_dst.size"])
    template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC = 1>
    void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
                      xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst, float sigma)
    {
        //function body
    }

The design fetches data from external memory (with the help of SDSoC data movers) and transfers it to the function in 8-bit or 64-bit packets, depending on the configured mode. At 8 bits per pixel, 8 pixels can be packed into 64 bits. Therefore, 8 pixels are available to be processed in parallel.
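The packing arithmetic can be sketched in plain C++ as follows. This is only an illustration: packPixels and unpackPixel are hypothetical names, and placing pixel 0 in the least significant byte is an assumption of this sketch rather than something this guide states.

```cpp
#include <cassert>
#include <cstdint>

// Packs eight 8-bit pixels into one 64-bit word.
// Assumption for this sketch: pixel 0 occupies the least significant byte.
inline std::uint64_t packPixels(const std::uint8_t px[8]) {
    std::uint64_t word = 0;
    for (int i = 0; i < 8; ++i) {
        word |= static_cast<std::uint64_t>(px[i]) << (8 * i);
    }
    return word;
}

// Extracts pixel i (0..7) from a packed word, using the same byte order.
inline std::uint8_t unpackPixel(std::uint64_t word, int i) {
    assert(i >= 0 && i < 8);
    return static_cast<std::uint8_t>((word >> (8 * i)) & 0xFF);
}
```

Packing the pixel values 1 through 8 under this byte order gives the word 0x0807060504030201.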

Enable the FILTER_SIZE_3 and the NO macros in the xf_config_params.h file. The FILTER_SIZE_3 macro sets the filter size to 3x3, and the #define NO 1 macro enables 1 pixel parallelism.

Specify the SDSoC tool-specific pragmas in the xf_gaussian_filter.hpp file.

    #pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
    #pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
    #pragma SDS data access_pattern("_src.data":SEQUENTIAL)
    #pragma SDS data copy("_src.data"[0:"_src.size"])
    #pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
    #pragma SDS data copy("_dst.data"[0:"_dst.size"])

Note: For more information on the pragmas used for hardware accelerator functions in SDSoC, see SDSoC Environment User Guide (UG1027).

Additional Utility Functions for Software

xf::imread

The function xf::imread loads an image from the specified file path, copies it into an xf::Mat, and returns it. If the image cannot be read (because of a missing file, improper permissions, or an unsupported or invalid format), the function exits with a non-zero return code and an error statement.

Note: In an HLS standalone mode like Cosim, use cv::imread followed by the copyTo function, instead of xf::imread.

API Syntax

    template<int PTYPE, int ROWS, int COLS, int NPC>
    xf::Mat<PTYPE, ROWS, COLS, NPC> imread (char *filename, int type)

Parameter Descriptions

The table below describes the template and the function parameters.

Table 9: xf::imread Function Parameter Descriptions

| Parameter | Description |
|---|---|
| PTYPE | Input pixel type. Value should be in accordance with the ‘type’ argument’s value. |
| ROWS | Maximum height of the image to be read |
| COLS | Maximum width of the image to be read |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| filename | Name of the file to be loaded |
| type | Flag that depicts the type of image. The values are: '0' for gray scale; '1' for color image |

xf::imwrite

The function xf::imwrite saves the image from the given xf::Mat to the specified file. The image format is chosen based on the file name extension. This function internally uses cv::imwrite for the processing. Therefore, all the limitations of cv::imwrite are also applicable to xf::imwrite.

API Syntax

    template <int PTYPE, int ROWS, int COLS, int NPC>
    void imwrite(const char *img_name, xf::Mat<PTYPE, ROWS, COLS, NPC> &img)

Parameter Descriptions

The table below describes the template and the function parameters.

Table 10: xf::imwrite Function Parameter Descriptions

| Parameter | Description |
|---|---|
| PTYPE | Input pixel type. Supported types are: XF_8UC1, XF_16UC1, XF_8UC3, XF_16UC3, XF_8UC4, and XF_16UC4 |
| ROWS | Maximum height of the image to be written |
| COLS | Maximum width of the image to be written |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| img_name | Name of the file with the extension |
| img | xf::Mat array to be saved |

xf::absDiff

The function xf::absDiff computes the absolute difference between the corresponding pixels of an xf::Mat and a cv::Mat, and returns the difference values in a cv::Mat.

API Syntax

    template <int PTYPE, int ROWS, int COLS, int NPC>
    void absDiff(cv::Mat &cv_img, xf::Mat<PTYPE, ROWS, COLS, NPC>& xf_img, cv::Mat &diff_img )

Parameter Descriptions

The table below describes the template and the function parameters.

Table 11: xf::absDiff Function Parameter Descriptions

| Parameter | Description |
|---|---|
| PTYPE | Input pixel type |
| ROWS | Maximum height of the image |
| COLS | Maximum width of the image |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively. |
| cv_img | cv::Mat array to be compared |
| xf_img | xf::Mat array to be compared |
| diff_img | Output difference image (cv::Mat) |

xf::convertTo

The xf::convertTo function performs bit depth conversion on each individual pixel of the given input image. This method converts the source pixel values to the target data type with appropriate casting.

dst(x,y)= cast<target-data-type>(α(src(x,y)+β))

Note: The output and input Mat cannot be the same. That is, the converted image cannot be stored in the Mat of the input image.

API Syntax

    template<int DST_T> void convertTo(xf::Mat<DST_T,ROWS, COLS, NPC> &dst, int ctype, double alpha=1, double beta=0)

Parameter Descriptions

The table below describes the template and function parameters.

Table 12: xf::convertTo Function Parameter Descriptions

| Parameter | Description |
|---|---|
| DST_T | Output pixel type. Possible values are XF_8UC1, XF_16UC1, XF_16SC1, and XF_32SC1. |
| ROWS | Maximum height of image to be read |
| COLS | Maximum width of image to be read |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively. |
| dst | Converted xf Mat |
| ctype | Conversion type. Down-convert: XF_CONVERT_16U_TO_8U, XF_CONVERT_16S_TO_8U, XF_CONVERT_32S_TO_8U, XF_CONVERT_32S_TO_16U, XF_CONVERT_32S_TO_16S. Up-convert: XF_CONVERT_8U_TO_16U, XF_CONVERT_8U_TO_16S, XF_CONVERT_8U_TO_32S, XF_CONVERT_16U_TO_32S, XF_CONVERT_16S_TO_32S. |
| alpha | Optional scale factor |
| beta | Optional delta added to the scaled values |

xfOpenCV Library Functions

The xfOpenCV library is a set of select OpenCV functions optimized for Zynq®-7000 SoC and Zynq UltraScale+ MPSoC devices. The following table lists the xfOpenCV library functions.

Table 13: xfOpenCV Library Functions

| Computations | Input Processing | Filters | Other |
|---|---|---|---|
| Absolute Difference | Bit Depth Conversion | Bilateral Filter | Canny Edge Detection |
| Accumulate | Channel Combine | Box Filter | FAST Corner Detection |
| Accumulate Squared | Channel Extract | Custom Convolution | Harris Corner Detection |
| Accumulate Weighted | Color Conversion | Dilate | Histogram Computation |
| Atan2 | Histogram Equalization | Erode | Dense Pyramidal LK Optical Flow |
| Bitwise AND, Bitwise NOT, Bitwise OR, Bitwise XOR | Look Up Table | Gaussian Filter | Dense Non-Pyramidal LK Optical Flow |
| Gradient Magnitude | Remap | Sobel Filter | MinMax Location |
| Gradient Phase | Resolution Conversion (Resize) | Median Blur Filter | Thresholding |
| Integral Image | | Scharr Filter | WarpAffine |
| Inverse (Reciprocal) | | | WarpPerspective |
| Pixel-Wise Addition | | | SVM |
| Pixel-Wise Multiplication | | | Otsu Threshold |
| Pixel-Wise Subtraction | | | Mean Shift Tracking |
| Square Root | | | HOG |
| Mean and Standard Deviation | | | Stereo Local Block Matching |
| | | | WarpTransform |
| | | | Pyramid Up |
| | | | Pyramid Down |
| | | | Delay |
| | | | Duplicate |
| | | | Color Thresholding |
| | | | RGB2HSV |
| | | | InitUndistortRectifyMapInverse |
| | | | Houghlines |
| | | | Semi Global Method for Stereo Disparity Estimation |

Notes:

1. The maximum resolution supported for all the functions is 4K, except Houghlines, HOG (RB mode), and Canny Edge Detection.

The following table lists the functions that are not supported on Zynq-7000 SoC devices, when configured to use 128-bit interfaces in 8 pixel per cycle mode.

Table 14: Unsupported Functions Using 128-bit Interfaces in 8 Pixel Per Cycle Mode on Zynq-7000 SoC

| Computations | Input Processing | Filters |
|---|---|---|
| Accumulate | Bit Depth Conversion | Box Filter: signed 16-bit pixel type, and unsigned 16-bit pixel type |
| Accumulate Squared | | Custom Convolution: signed 16-bit output pixel type |
| Accumulate Weighted | | Sobel Filter |
| Gradient Magnitude | | Scharr Filter |
| Gradient Phase | | |
| Pixel-Wise Addition: signed 16-bit pixel type, and unsigned 16-bit pixel type | | |
| Pixel-Wise Multiplication: signed 16-bit pixel type, and unsigned 16-bit pixel type | | |
| Pixel-Wise Subtraction: signed 16-bit pixel type, and unsigned 16-bit pixel type | | |

Note: Resolution Conversion (Resize) in 8 pixel per cycle mode, Dense Pyramidal LK Optical Flow, and Dense Non-Pyramidal LK Optical Flow functions are not supported on the Zynq-7000 SoC ZC702 devices, due to the higher resource utilization.

Absolute Difference

The absdiff function finds the pixel-wise absolute difference between two input images and returns an output image. The input and the output images must be the XF_8UC1 type.

$$I_{out}(x, y) = \left| I_{in1}(x, y) - I_{in2}(x, y) \right|$$

Where,

• $I_{out}(x, y)$ is the intensity of the output image at the (x,y) position.

• $I_{in1}(x, y)$ is the intensity of the first input image at the (x,y) position.

• $I_{in2}(x, y)$ is the intensity of the second input image at the (x,y) position.
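A software reference model of this per-pixel operation might look like the sketch below. It is not the hardware kernel; absdiffRef is a hypothetical name, and the images are assumed to be 8-bit, single-channel buffers of equal size stored row-major.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference model (not the hardware kernel): per-pixel absolute
// difference of two 8-bit, single-channel images stored row-major.
inline std::vector<std::uint8_t> absdiffRef(const std::vector<std::uint8_t> &in1,
                                            const std::vector<std::uint8_t> &in2) {
    assert(in1.size() == in2.size());
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        // |a - b| of two 8-bit values always fits in 8 bits.
        const int d = static_cast<int>(in1[i]) - static_cast<int>(in2[i]);
        out[i] = static_cast<std::uint8_t>(d < 0 ? -d : d);
    }
    return out;
}
```

A model like this can serve as a golden reference when comparing the accelerator output against expected values.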

API Syntax

    template<int SRC_T, int ROWS, int COLS, int NPC=1>
    void absdiff(
        xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
        xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
        xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 15: absdiff Function Parameter Descriptions

| Parameter | Description |
|---|---|
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
| ROWS | Maximum height of input and output image (must be a multiple of 8) |
| COLS | Maximum width of input and output image (must be a multiple of 8) |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| src1 | Input image |
| src2 | Input image |
| dst | Output image |

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado® HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 16: absdiff Function Resource Utilization Summary

The BRAM_18K, DSP_48Es, FF, LUT, and CLB columns are utilization estimates.

| Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB |
|---|---|---|---|---|---|---|
| 1 pixel | 300 | 0 | 0 | 62 | 67 | 17 |
| 8 pixel | 150 | 0 | 0 | 67 | 234 | 39 |

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 17: absdiff Function Performance Estimate Summary

| Operating Mode | Latency Estimate: Max Latency (ms) |
|---|---|
| 1 pixel operation (300 MHz) | 6.9 |
| 8 pixel operation (150 MHz) | 1.69 |

Deviation from OpenCV

There is no deviation from OpenCV, except that the absdiff function supports only 8-bit pixels.


Accumulate

The accumulate function adds an image (src1) to the accumulator image (src2), and generates the accumulated result image (dst).

$$\mathrm{dst}(x,y) = \mathrm{src1}(x,y) + \mathrm{src2}(x,y)$$
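The sketch below is a software reference of the accumulate equation, assuming 8-bit inputs and a 16-bit output as listed in the parameter table for this function. accumulateRef is a hypothetical name and the code is not the hardware kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference model (not the hardware kernel):
// dst(x,y) = src1(x,y) + src2(x,y), with 8-bit inputs and a 16-bit output.
inline std::vector<std::uint16_t> accumulateRef(const std::vector<std::uint8_t> &src1,
                                                const std::vector<std::uint8_t> &src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        // The largest possible sum is 255 + 255 = 510, which fits in 16 bits.
        dst[i] = static_cast<std::uint16_t>(src1[i] + src2[i]);
    }
    return dst;
}
```

Because the result goes to a separate, wider buffer, the sum of two 8-bit pixels cannot overflow in this model.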

API Syntax

    template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
    void accumulate (
        xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
        xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
        xf::Mat<DST_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 18: accumulate Function Parameter Descriptions

| Parameter | Description |
|---|---|
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
| DST_T | Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1) |
| ROWS | Maximum height of input and output image (must be a multiple of 8) |
| COLS | Maximum width of input and output image (must be a multiple of 8) |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| src1 | Input image |
| src2 | Input image |
| dst | Output image |

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 19: accumulate Function Resource Utilization Summary

The BRAM_18K, DSP_48E, FF, LUT, and CLB columns are utilization estimates.

| Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48E | FF | LUT | CLB |
|---|---|---|---|---|---|---|
| 1 pixel | 300 | 0 | 0 | 62 | 55 | 12 |
| 8 pixel | 150 | 0 | 0 | 389 | 285 | 61 |


Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 20: accumulate Function Performance Estimate Summary

| Operating Mode | Latency Estimate: Max Latency (ms) |
|---|---|
| 1 pixel operation (300 MHz) | 6.9 |
| 8 pixel operation (150 MHz) | 1.7 |

Deviation from OpenCV

In OpenCV the accumulated image is stored in the second input image. The src2 image acts as both input and output, as shown below:

$$\mathrm{src2}(x,y) = \mathrm{src1}(x,y) + \mathrm{src2}(x,y)$$

Whereas, in the xfOpenCV implementation, the accumulated image is stored separately, as shown below:

$$\mathrm{dst}(x,y) = \mathrm{src1}(x,y) + \mathrm{src2}(x,y)$$

Accumulate Squared

The accumulateSquare function adds the square of an image (src1) to the accumulator image (src2) and generates the accumulated result (dst).

$$\mathrm{dst}(x,y) = \mathrm{src1}(x,y)^2 + \mathrm{src2}(x,y)$$

The accumulated result is a separate argument in the function, instead of having src2 as the accumulated result. In this implementation, having a bi-directional accumulator is not possible as the function makes use of streams.
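As a software reference of the accumulateSquare equation, the following sketch assumes 8-bit inputs and a 16-bit output, matching the parameter table for this function. accumulateSquareRef is a hypothetical name and the code is not the hardware kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference model (not the hardware kernel):
// dst(x,y) = src1(x,y)^2 + src2(x,y), with 8-bit inputs and a 16-bit output.
inline std::vector<std::uint16_t> accumulateSquareRef(const std::vector<std::uint8_t> &src1,
                                                      const std::vector<std::uint8_t> &src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        // Worst case: 255 * 255 + 255 = 65280, which still fits in 16 bits.
        const unsigned v = static_cast<unsigned>(src1[i]) * src1[i] + src2[i];
        dst[i] = static_cast<std::uint16_t>(v);
    }
    return dst;
}
```

The worst-case value of 65280 shows why a 16-bit unsigned output is wide enough for 8-bit inputs.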

API Syntax

    template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
    void accumulateSquare (
        xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
        xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
        xf::Mat<DST_T, ROWS, COLS, NPC> dst)


Parameter Descriptions

The following table describes the template and the function parameters.

Table 21: accumulateSquare Function Parameter Descriptions

| Parameter | Description |
|---|---|
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
| DST_T | Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1) |
| ROWS | Maximum height of input and output image (must be a multiple of 8) |
| COLS | Maximum width of input and output image (must be a multiple of 8) |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| src1 | Input image |
| src2 | Input image |
| dst | Output image |

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 22: accumulateSquare Function Resource Utilization Summary

The BRAM_18K, DSP_48E, FF, LUT, and CLB columns are utilization estimates.

| Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48E | FF | LUT | CLB |
|---|---|---|---|---|---|---|
| 1 pixel | 300 | 0 | 1 | 71 | 52 | 14 |
| 8 pixel | 150 | 0 | 8 | 401 | 247 | 48 |

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 23: accumulateSquare Function Performance Estimate Summary

| Operating Mode | Latency Estimate: Max Latency (ms) |
|---|---|
| 1 pixel operation (300 MHz) | 6.9 |
| 8 pixel operation (150 MHz) | 1.6 |


Deviation from OpenCV

In OpenCV the accumulated squared image is stored in the second input image. The src2 image acts as input as well as output.

$$\mathrm{src2}(x,y) = \mathrm{src1}(x,y)^2 + \mathrm{src2}(x,y)$$

Whereas, in the xfOpenCV implementation, the accumulated squared image is stored separately.

$$\mathrm{dst}(x,y) = \mathrm{src1}(x,y)^2 + \mathrm{src2}(x,y)$$

Accumulate Weighted

The accumulateWeighted function computes the weighted sum of the input image (src1) and the accumulator image (src2) and generates the result in dst.

$$\mathrm{dst}(x,y) = \mathrm{alpha} \cdot \mathrm{src1}(x,y) + (1 - \mathrm{alpha}) \cdot \mathrm{src2}(x,y)$$

The accumulated result is a separate argument in the function, instead of having src2 as the accumulated result. In this implementation, having a bi-directional accumulator is not possible, as the function uses streams.
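The weighted sum can be modeled in software as in the sketch below. accumulateWeightedRef is a hypothetical name, and rounding the floating-point result to the nearest integer is an assumption of this sketch; this guide does not state how the hardware kernel rounds.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference model (not the hardware kernel):
// dst(x,y) = alpha * src1(x,y) + (1 - alpha) * src2(x,y).
inline std::vector<std::uint16_t> accumulateWeightedRef(
        const std::vector<std::uint8_t> &src1,
        const std::vector<std::uint8_t> &src2, float alpha) {
    assert(src1.size() == src2.size());
    assert(alpha >= 0.0f && alpha <= 1.0f);
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        const float v = alpha * src1[i] + (1.0f - alpha) * src2[i];
        // Round-to-nearest is an assumption of this sketch.
        dst[i] = static_cast<std::uint16_t>(std::lround(v));
    }
    return dst;
}
```

With alpha set to 1 the output equals src1, and with alpha set to 0 it equals src2, which is a quick sanity check on the weighting.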

API Syntax

    template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
    void accumulateWeighted (
        xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
        xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
        xf::Mat<DST_T, ROWS, COLS, NPC> dst,
        float alpha )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 24: accumulateWeighted Function Parameter Descriptions

| Parameter | Description |
|---|---|
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
| DST_T | Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1) |
| ROWS | Maximum height of input and output image (must be a multiple of 8) |
| COLS | Maximum width of input and output image (must be a multiple of 8) |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. |
| src1 | Input image |
| src2 | Input image |
| dst | Output image |
| alpha | Weight applied to input image |

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 25: accumulateWeighted Function Resource Utilization Summary

The BRAM_18K, DSP_48Es, FF, LUT, and CLB columns are utilization estimates.

| Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB |
|---|---|---|---|---|---|---|
| 1 pixel | 300 | 0 | 5 | 295 | 255 | 52 |
| 8 pixel | 150 | 0 | 19 | 556 | 476 | 88 |

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 26: accumulateWeighted Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
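The latency figures above are consistent with a simple cycle count for one streaming pass over the image. The following is a minimal sketch of that arithmetic, assuming one group of NPC pixels is consumed per clock cycle and ignoring pipeline fill and memory stalls. The helper name idealLatencyMs is hypothetical and is not part of xfOpenCV.

```cpp
#include <cassert>

// Hypothetical helper (not an xfOpenCV function): ideal latency in
// milliseconds for one pass over a rows x cols image, assuming `npc`
// pixels are consumed per clock cycle with no stalls.
double idealLatencyMs(int rows, int cols, int npc, double clockMHz) {
    assert(rows > 0 && cols > 0 && npc > 0 && clockMHz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / npc;
    return cycles / (clockMHz * 1e3);  // MHz * 1e3 = cycles per millisecond
}
```

For a 1080x1920 image this gives about 6.9 ms at 300 MHz with 1 pixel per cycle, and about 1.7 ms at 150 MHz with 8 pixels per cycle, which matches the table.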

Deviation from OpenCV

The resultant image in OpenCV is stored in the second input image. The src2 image acts as input as well as output, as shown below:

$$\mathrm{src2}(x,y) = \mathrm{alpha} \cdot \mathrm{src1}(x,y) + (1 - \mathrm{alpha}) \cdot \mathrm{src2}(x,y)$$

Whereas, in the xfOpenCV implementation, the accumulated weighted image is stored separately:

$$\mathrm{dst}(x,y) = \mathrm{alpha} \cdot \mathrm{src1}(x,y) + (1 - \mathrm{alpha}) \cdot \mathrm{src2}(x,y)$$
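The separate-destination form can be sketched as a plain C++ software reference. This is not the xfOpenCV kernel: the function name accumulateWeightedRef is hypothetical, the images are flat vectors rather than xf::Mat objects, and round-to-nearest is an assumption because the rounding of the hardware kernel is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference for dst = alpha * src1 + (1 - alpha) * src2.
// Inputs are 8-bit (as for XF_8UC1), output is 16-bit (as for XF_16UC1).
std::vector<std::uint16_t> accumulateWeightedRef(
        const std::vector<std::uint8_t>& src1,
        const std::vector<std::uint8_t>& src2,
        float alpha) {
    assert(src1.size() == src2.size());
    assert(alpha >= 0.0f && alpha <= 1.0f);
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        const float v = alpha * src1[i] + (1.0f - alpha) * src2[i];
        dst[i] = static_cast<std::uint16_t>(v + 0.5f);  // round to nearest (assumption)
    }
    return dst;
}
```

Unlike the OpenCV form, src2 is left unmodified, so the caller decides whether dst is fed back as the next src2.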


Bilateral Filter

In general, any smoothing filter smooths the image, which affects the edges of the image. To preserve the edges while smoothing, a bilateral filter can be used. Analogous to the Gaussian filter, the bilateral filter also considers the neighboring pixels with weights assigned to each of them. These weights have two components, the first of which is the same weighting used by the Gaussian filter. The second component takes into account the difference in intensity between the neighboring pixels and the evaluated one.

The bilateral filter applied on an image is:

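$$BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert)\, G_{\sigma_r}(\lVert I_p - I_q \rVert)\, I_q$$

Where

$$W_p = \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert)\, G_{\sigma_r}(\lVert I_p - I_q \rVert)$$

and $G_{\sigma}$ is a Gaussian filter with variance $\sigma^2$.

The Gaussian filter is given by:

$$G_{\sigma} = e^{-\frac{x^2 + y^2}{2\sigma^2}}$$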
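The formulas above can be sketched as a direct (non-streaming) software reference. This is a plain C++ model, not the xfOpenCV kernel: the names gaussWeight and bilateralRef are hypothetical, the image is a row-major vector, and treating out-of-image pixels as 0 is an assumption about the constant border value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Gaussian weight for a squared distance d2, i.e. exp(-d2 / (2 * sigma^2)).
inline float gaussWeight(float d2, float sigma) {
    return std::exp(-d2 / (2.0f * sigma * sigma));
}

// Reference bilateral filter for an 8-bit grayscale image (row-major).
// Pixels outside the image are read as 0 (assumed constant border).
std::vector<std::uint8_t> bilateralRef(const std::vector<std::uint8_t>& img,
                                       int rows, int cols, int ksize,
                                       float sigmaSpace, float sigmaColor) {
    assert(static_cast<int>(img.size()) == rows * cols);
    assert(ksize % 2 == 1 && sigmaSpace > 0.0f && sigmaColor > 0.0f);
    const int r = ksize / 2;
    std::vector<std::uint8_t> out(img.size());
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const float ip = img[y * cols + x];
            float sum = 0.0f;  // weighted sum of neighbor intensities
            float wp = 0.0f;   // normalization term W_p
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int qy = y + dy, qx = x + dx;
                    const bool inside = qy >= 0 && qy < rows && qx >= 0 && qx < cols;
                    const float iq = inside ? img[qy * cols + qx] : 0.0f;
                    const float w =
                        gaussWeight(static_cast<float>(dx * dx + dy * dy), sigmaSpace) *
                        gaussWeight((ip - iq) * (ip - iq), sigmaColor);
                    sum += w * iq;
                    wp += w;
                }
            }
            // wp >= 1 because the center pixel always has weight 1.
            out[y * cols + x] = static_cast<std::uint8_t>(sum / wp + 0.5f);
        }
    }
    return out;
}
```

With a small sigmaColor, neighbors across a strong intensity step receive a range weight close to zero, which is why the edge survives the smoothing.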

API Syntax

template<int FILTER_SIZE, int BORDER_TYPE, int TYPE, int ROWS, int COLS, int NPC=1>
void bilateralFilter (
xf::Mat<TYPE, ROWS, COLS, NPC> src,
xf::Mat<TYPE, ROWS, COLS, NPC> dst,
float sigma_space, float sigma_color )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 27: bilateralFilter Function Parameter Descriptions

Parameter Description

FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7)

are supported

BORDER_TYPE Border type supported is XF_BORDER_CONSTANT

TYPE Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported

(XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)


NPC Number of pixels to be processed per cycle; this function supports only XF_NPPC1

or 1 pixel per cycle operations.

src Input image

dst Output image

sigma_space Standard deviation of filter in spatial domain

sigma_color Standard deviation of filter used in color space

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 28: bilateralFilter Resource Utilization Summary

                                                           Utilization Estimate
Operating Mode   Filter Size   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT
1 pixel          3x3           300                         6          22         4934   4293
1 pixel          5x5           300                         12         30         5481   4943
1 pixel          7x7           300                         37         48         7084   6195

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 29: bilateralFilter Function Performance Estimate Summary

Operating Mode   Filter Size   Max Latency Estimate at 168 MHz (ms)
1 pixel          3x3           7.18
1 pixel          5x5           7.20
1 pixel          7x7           7.22

Deviation from OpenCV

Unlike OpenCV, xfOpenCV only supports filter sizes of 3, 5 and 7.


Bit Depth Conversion

The convertTo funcon converts the input image bit depth to the required bit depth in the

output image.

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>

void convertTo(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src_mat, xf::Mat<DST_T,

ROWS, COLS, NPC> &_dst_mat, ap_uint<4> _convert_type, int _shift)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 30: convertTo Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1),

16-bit, unsigned, 1 channel (XF_16UC1),

16-bit, signed, 1 channel (XF_16SC1),

32-bit, unsigned, 1 channel (XF_32UC1)

32-bit, signed, 1 channel (XF_32SC1) are supported.

DST_T Output pixel type. 8-bit, unsigned, 1 channel (XF_8UC1),

16-bit, unsigned, 1 channel (XF_16UC1),

16-bit, signed, 1 channel (XF_16SC1),

32-bit, unsigned, 1 channel (XF_32UC1)

32-bit, signed, 1 channel (XF_32SC1) are supported.

ROWS Height of input and output images

COLS Width of input and output images

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image

_convert_type This parameter specifies the type of conversion required. (See XF_convert_bit_depth_e

enumerated type in file xf_params.h for possible values.)

_shift Optional scale factor

Possible Conversions

The following table summarizes supported conversions. The rows are possible input image bit

depths and the columns are corresponding possible output image bit depths (U=unsigned,

S=signed).


Table 31: convertTo Function Supported Conversions

INPUT/OUTPUT   U8    U16   S16   U32   S32
U8             NA    yes   yes   NA    yes
U16            yes   NA    NA    NA    yes
S16            yes   NA    NA    NA    yes
U32            NA    NA    NA    NA    NA
S32            yes   yes   yes   NA    NA
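When the source and destination depths are chosen at build time, it can help to encode the table as a lookup. The sketch below is a direct transcription of Table 31 in plain C++. The enum Depth, the array kSupported, and the function isConversionSupported are hypothetical names introduced for illustration; they are not the XF_convert_bit_depth_e values from xf_params.h.

```cpp
#include <cassert>

// Bit depths in the row/column order of Table 31 (hypothetical enum).
enum Depth { U8 = 0, U16, S16, U32, S32, kNumDepths };

// kSupported[input][output], transcribed from Table 31 ("yes" -> true).
static const bool kSupported[kNumDepths][kNumDepths] = {
    //          U8     U16    S16    U32    S32
    /* U8  */ {false, true,  true,  false, true },
    /* U16 */ {true,  false, false, false, true },
    /* S16 */ {true,  false, false, false, true },
    /* U32 */ {false, false, false, false, false},
    /* S32 */ {true,  true,  true,  false, false},
};

// Returns true when Table 31 lists the input -> output conversion as supported.
bool isConversionSupported(Depth in, Depth out) {
    assert(in < kNumDepths && out < kNumDepths);
    return kSupported[in][out];
}
```

Note that the table lists no supported conversion to or from U32.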

Resource Utilization

The following table summarizes the resource utilization of the convertTo function, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 32: convertTo Function Resource Utilization Summary For XF_CONVERT_8U_TO_16S Conversion

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF    LUT    CLB
1 pixel          300                         0          8          581   523    119
8 pixel          150                         0          8          963   1446   290

Table 33: convertTo Function Resource Utilization Summary For XF_CONVERT_16U_TO_8U Conversion

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF    LUT    CLB
1 pixel          300                         0          8          591   541    124
8 pixel          150                         0          8          915   1500   308

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 34: convertTo Function Performance Estimate Summary

Operating Mode                Max Latency Estimate
1 pixel operation (300 MHz)   6.91 ms
8 pixel operation (150 MHz)   1.69 ms

Bitwise AND

The bitwise_and funcon performs the bitwise AND operaon for each pixel between two

input images, and returns an output image.

$$I_{out}(x,y) = I_{in1}(x,y) \mathbin{\&} I_{in2}(x,y)$$

Where,

• $I_{out}(x,y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x,y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x,y)$ is the intensity of the second input image at position (x, y)
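A common use of the per-pixel AND is masking, where the second image selects which bits or regions of the first survive. The following plain C++ reference is a sketch of the formula above on flat 8-bit buffers. The name bitwiseAndRef is hypothetical and the code does not use xf::Mat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference: out[i] = in1[i] & in2[i] for 8-bit pixels.
std::vector<std::uint8_t> bitwiseAndRef(const std::vector<std::uint8_t>& in1,
                                        const std::vector<std::uint8_t>& in2) {
    assert(in1.size() == in2.size());
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(in1[i] & in2[i]);
    }
    return out;
}
```

A mask pixel of 0xFF passes the input through unchanged, and a mask pixel of 0x00 forces the output to zero.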

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_and (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 35: bitwise_and Function Parameter Descriptions

Parameter Description

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src1 Input image

src2 Input image


dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 36: bitwise_and Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF   LUT   CLB
1 pixel          300                         0          0          62   44    10
8 pixel          150                         0          0          59   72    13

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 37: bitwise_and Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7

Bitwise NOT

The bitwise_not funcon performs the pixel wise bitwise NOT operaon for the pixels in the

input image, and returns an output image.

$$I_{out}(x,y) = {\sim} I_{in}(x,y)$$

Where,

• $I_{out}(x,y)$ is the intensity of the output image at position (x, y)
• $I_{in}(x,y)$ is the intensity of the input image at position (x, y)
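When modelling this operation in C++ for comparison against the kernel output, note that the ~ operator promotes an 8-bit operand to int, so the result has to be narrowed back to 8 bits. The sketch below shows this for a single pixel. The name bitwiseNotPixel is hypothetical and is not an xfOpenCV function.

```cpp
#include <cassert>
#include <cstdint>

// Software reference for one 8-bit pixel: out = ~in.
std::uint8_t bitwiseNotPixel(std::uint8_t in) {
    // ~in is evaluated as int after integer promotion; the cast keeps
    // only the low 8 bits.
    const std::uint8_t out = static_cast<std::uint8_t>(~in);
    assert(out == 255 - in);  // for 8-bit unsigned data, NOT equals 255 - value
    return out;
}
```

For 8-bit unsigned images the operation is therefore the same as computing the photographic negative of the input.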


API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_not (
xf::Mat<SRC_T, ROWS, COLS, NPC> src,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 38: bitwise_not Function Parameter Descriptions

Parameter Description

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src Input image

dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 39: bitwise_not Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF   LUT   CLB
1 pixel          300                         0          0          97   78    20
8 pixel          150                         0          0          88   97    21

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 40: bitwise_not Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7

Bitwise OR

The bitwise_or funcon performs the pixel wise bitwise OR operaon between two input

images, and returns an output image.

$$I_{out}(x,y) = I_{in1}(x,y) \mathbin{|} I_{in2}(x,y)$$

Where,

• $I_{out}(x,y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x,y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x,y)$ is the intensity of the second input image at position (x, y)

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_or (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 41: bitwise_or Function Parameter Descriptions

Parameter Description

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src1 Input image

src2 Input image

dst Output image


Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 42: bitwise_or Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF   LUT   CLB
1 pixel          300                         0          0          62   44    10
8 pixel          150                         0          0          59   72    13

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 43: bitwise_or Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7

Bitwise XOR

The bitwise_xor funcon performs the pixel wise bitwise XOR operaon between two input

images, and returns an output image, as shown below:

$$I_{out}(x,y) = I_{in1}(x,y) \oplus I_{in2}(x,y)$$

Where,

• $I_{out}(x,y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x,y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x,y)$ is the intensity of the second input image at position (x, y)
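Two properties of XOR are useful when checking results: a pixel XORed with itself is zero, and applying the same second operand twice restores the first. The plain C++ reference below is a sketch on flat 8-bit buffers. The name bitwiseXorRef is hypothetical and the code does not use xf::Mat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference: out[i] = in1[i] ^ in2[i] for 8-bit pixels.
std::vector<std::uint8_t> bitwiseXorRef(const std::vector<std::uint8_t>& in1,
                                        const std::vector<std::uint8_t>& in2) {
    assert(in1.size() == in2.size());
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(in1[i] ^ in2[i]);
    }
    return out;
}
```

Because identical pixels produce zero, any non-zero output pixel marks a position where the two inputs differ in at least one bit.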


API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_xor(
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 44: bitwise_xor Function Parameter Descriptions

Parameter Description

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

src1 Input image

src2 Input image

dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image:

Table 45: bitwise_xor Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF   LUT   CLB
1 pixel          300                         0          0          62   44    10
8 pixel          150                         0          0          59   72    13

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image:

Table 46: bitwise_xor Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7

Box Filter

The boxFilter funcon performs box ltering on the input image. Box lter acts as a low-pass

lter and performs blurring over the image. The boxFilter funcon or the box blur is a spaal

domain linear lter in which each pixel in the resulng image has a value equal to the average

value of the neighboring pixels in the image.

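$$K_{box} = \frac{1}{\mathit{ksize} \cdot \mathit{ksize}} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}$$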
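The kernel above amounts to averaging a ksize x ksize window around each pixel. The following is a direct (non-streaming) software reference in plain C++. It is a sketch under stated assumptions: the name boxFilterRef is hypothetical, pixels outside the image are counted as 0 for the constant border, and the average is rounded to nearest, which may differ from the hardware kernel.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference box filter over a row-major 8-bit grayscale image.
// Out-of-image pixels contribute 0 (assumed constant border value).
std::vector<std::uint8_t> boxFilterRef(const std::vector<std::uint8_t>& img,
                                       int rows, int cols, int ksize) {
    assert(static_cast<int>(img.size()) == rows * cols);
    assert(ksize == 3 || ksize == 5 || ksize == 7);
    const int r = ksize / 2;
    const int area = ksize * ksize;
    std::vector<std::uint8_t> out(img.size());
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            int sum = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int qy = y + dy, qx = x + dx;
                    if (qy >= 0 && qy < rows && qx >= 0 && qx < cols) {
                        sum += img[qy * cols + qx];
                    }
                }
            }
            // Divide by the full window area, rounding to nearest (assumption).
            out[y * cols + x] = static_cast<std::uint8_t>((sum + area / 2) / area);
        }
    }
    return out;
}
```

With a zero border, pixels near the image boundary come out darker than interior pixels, because the missing neighbors still count toward the divisor.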

API Syntax

template<int BORDER_TYPE,int FILTER_TYPE, int SRC_T, int ROWS, int

COLS,int NPC=1>

void boxFilter(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<SRC_T,

ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 47: boxFilter Function Parameter Descriptions

Parameter Description

FILTER_TYPE Filter size. Filter sizes of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are supported

BORDER_TYPE Border type supported is XF_BORDER_CONSTANT

SRC_T Input and output pixel type. 8-bit unsigned, 16-bit unsigned and 16-bit signed, 1 channel are supported (XF_8UC1, XF_16UC1 and XF_16SC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8

for 1 pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image


Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 48: boxFilter Function Resource Utilization Summary

                                                           Utilization Estimate
Operating Mode   Filter Size   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          3x3           300                         3          1          545    519    104
1 pixel          5x5           300                         5          1          876    870    189
1 pixel          7x7           300                         7          1          1539   1506   300
8 pixel          3x3           150                         6          8          1002   1368   264
8 pixel          5x5           150                         10         8          1576   3183   611
8 pixel          7x7           150                         14         8          2414   5018   942

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image:

Table 49: boxFilter Function Performance Estimate Summary

Operating Mode   Operating Frequency (MHz)   Filter Size   Max Latency Estimate (ms)
1 pixel          300                         3x3           7.2
1 pixel          300                         5x5           7.21
1 pixel          300                         7x7           7.22
8 pixel          150                         3x3           1.7
8 pixel          150                         5x5           1.7
8 pixel          150                         7x7           1.7

Canny Edge Detection

The Canny edge detector finds the edges in an image or video frame. It is one of the most popular algorithms for edge detection. The Canny algorithm aims to satisfy three main criteria:

1. Low error rate: A good detection of only existent edges.

2. Good localization: The distance between detected edge pixels and real edge pixels has to be minimized.

3. Minimal response: Only one detector response per edge.

In this algorithm, the noise in the image is reduced first by applying a Gaussian mask. The Gaussian mask used here is the average mask of size 3x3. Thereafter, gradients along the x and y directions are computed using the Sobel gradient function. The gradients are used to compute the magnitude and phase of the pixels. The phase is quantized and the pixels are binned accordingly. Non-maximal suppression is applied on the pixels to remove the weaker edges.

Edge tracing is applied on the remaining pixels to draw the edges on the image. In this algorithm, the Canny stages up to non-maximal suppression are in one kernel and the edge linking module is in another kernel. After non-maximal suppression, the output is represented as 2 bits per pixel, where:

• 00 - represents the background

• 01 - represents the weaker edge

• 11 - represents the strong edge

The output is packed as 8-bit (four 2-bit pixels) in 1 pixel per cycle operation and packed as 16-bit (eight 2-bit pixels) in 8 pixel per cycle operation. For the edge linking module, the input is 64-bit, such that 32 2-bit pixels are packed into one 64-bit word. The edge tracing is applied on the pixels and returns the edges in the image.
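The 2-bit packing described above can be sketched in plain C++. The helper names packCodes and unpackCode are hypothetical, and placing the first pixel in the least significant bits is an assumption; the bit order used by the kernels is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 2-bit edge codes as listed above.
enum EdgeCode { kBackground = 0x0, kWeakEdge = 0x1, kStrongEdge = 0x3 };

// Packs up to 32 two-bit codes into one word. Pixel i occupies bits
// [2*i+1 : 2*i], i.e. the first pixel sits in the low bits (assumed order).
std::uint64_t packCodes(const std::vector<std::uint8_t>& codes) {
    assert(codes.size() <= 32);
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < codes.size(); ++i) {
        assert(codes[i] == kBackground || codes[i] == kWeakEdge ||
               codes[i] == kStrongEdge);
        word |= static_cast<std::uint64_t>(codes[i]) << (2 * i);
    }
    return word;
}

// Extracts the 2-bit code of pixel `index` from a packed word.
std::uint8_t unpackCode(std::uint64_t word, unsigned index) {
    assert(index < 32);
    return static_cast<std::uint8_t>((word >> (2 * index)) & 0x3u);
}
```

Four codes fill one byte (the 1 pixel per cycle output), eight codes fill 16 bits (the 8 pixel per cycle output), and 32 codes fill the 64-bit word consumed by the edge linking module.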

API Syntax

The API syntax for Canny is:

template<int FILTER_TYPE,int NORM_TYPE,int SRC_T,int DST_T, int ROWS, int

COLS,int NPC,int NPC1>

void Canny(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<DST_T, ROWS,

COLS, NPC1> & _dst_mat,unsigned char _lowthreshold,unsigned char

_highthreshold)

The API syntax for EdgeTracing is:

template<int SRC_T, int DST_T, int ROWS, int COLS,int NPC_SRC,int NPC_DST>

void EdgeTracing(xf::Mat<SRC_T, ROWS, COLS, NPC_SRC> & _src,xf::Mat<DST_T,

ROWS, COLS, NPC_DST> & _dst)

Parameter Descriptions

The following table describes the xf::Canny template and function parameters:


Table 50: xf::Canny Function Parameter Descriptions

Parameter Description

FILTER_TYPE The filter window dimensions. The options are 3 and 5.

NORM_TYPE The type of norm used. The options for norm type are L1NORM and L2NORM.

SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T Output pixel type. In the 1 pixel case, the output is 8-bit, packing four 2-bit pixel values into 8 bits. In the 8 pixel case, the output is 16-bit, packing eight 2-bit pixel values into 16 bits.

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image

_lowthreshold The lower value of threshold for binary thresholding.

_highthreshold The higher value of threshold for binary thresholding.
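The two norm options and the two thresholds can be illustrated with a small plain C++ sketch. The function names are hypothetical. The L1 and L2 formulas are the standard definitions of those norms, while the strict comparisons used in the classification are an assumption, since the exact comparison rule of the kernel is not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>

// L1 norm of the gradient: |gx| + |gy|.
int magnitudeL1(int gx, int gy) {
    return std::abs(gx) + std::abs(gy);
}

// L2 norm of the gradient: sqrt(gx^2 + gy^2).
double magnitudeL2(int gx, int gy) {
    return std::sqrt(static_cast<double>(gx) * gx + static_cast<double>(gy) * gy);
}

// Maps a gradient magnitude to a 2-bit code: 11 strong, 01 weak, 00 background.
// Using strict '>' comparisons is an assumption made for this sketch.
std::uint8_t classifyEdge(int magnitude, int low, int high) {
    assert(low <= high);
    if (magnitude > high) return 0x3;
    if (magnitude > low) return 0x1;
    return 0x0;
}
```

The L1 norm avoids the multiplications and square root of the L2 norm, which is consistent with the lower DSP usage reported for L1NORM in the resource table for this function.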

The following table describes the EdgeTracing template and function parameters:

Table 51: EdgeTracing Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type

DST_T Output pixel type

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC_SRC Number of pixels to be processed per cycle. Fixed to XF_NPPC32.

NPC_DST Number of pixels to be written to destination. Fixed to XF_NPPC8.

_src Input image

_dst Output image

Resource Utilization

The following table summarizes the resource utilization of xf::Canny and EdgeTracing in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image for a filter size of 3.

Table 52: xf::Canny and EdgeTracing Function Resource Utilization Summary

           Resource Utilization
           1 pixel   1 pixel   8 pixel   8 pixel   Edge Linking   Edge Linking
           L1NORM    L2NORM    L1NORM    L2NORM
Name       300 MHz   300 MHz   150 MHz   150 MHz   300 MHz        150 MHz
BRAM_18K   22        18        36        32        84             84
DSP48E     2         4         16        32        3              3
FF         3027      3507      4899      6208      17600          14356
LUT        2626      3170      6518      9560      15764          14274
CLB        606       708       1264      1871      2955           3241

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image for L1NORM with a filter size of 3, including the edge linking module.

Table 53: xf::Canny and EdgeTracing Function Performance Estimate Summary

Operating Mode   Operating Frequency (MHz)   Latency Estimate (ms)
1 pixel          300                         10.2
8 pixel          150                         8

Deviation from OpenCV

In OpenCV Canny funcon, the Gaussian blur is not applied as a pre-processing step.

Channel Combine

The merge funcon, merges single channel images into a mul-channel image. The number of

channels to be merged should be four.
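Merging four 8-bit planes produces one 32-bit pixel-interleaved value per position. The plain C++ sketch below illustrates the idea on flat buffers. The name mergeRef is hypothetical, and placing the first channel in the least significant byte is an assumption about the channel order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference: interleaves four 8-bit planes into 32-bit pixels.
// Channel c0 goes to bits 7:0, c1 to 15:8, c2 to 23:16, c3 to 31:24
// (assumed byte order).
std::vector<std::uint32_t> mergeRef(const std::vector<std::uint8_t>& c0,
                                    const std::vector<std::uint8_t>& c1,
                                    const std::vector<std::uint8_t>& c2,
                                    const std::vector<std::uint8_t>& c3) {
    assert(c0.size() == c1.size() && c1.size() == c2.size() &&
           c2.size() == c3.size());
    std::vector<std::uint32_t> dst(c0.size());
    for (std::size_t i = 0; i < c0.size(); ++i) {
        dst[i] = static_cast<std::uint32_t>(c0[i]) |
                 (static_cast<std::uint32_t>(c1[i]) << 8) |
                 (static_cast<std::uint32_t>(c2[i]) << 16) |
                 (static_cast<std::uint32_t>(c3[i]) << 24);
    }
    return dst;
}
```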

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>

void merge(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src1, xf::Mat<SRC_T, ROWS,

COLS, NPC> &_src2, xf::Mat<SRC_T, ROWS, COLS, NPC> &_src3, xf::Mat<SRC_T,

ROWS, COLS, NPC> &_src4, xf::Mat<DST_T, ROWS, COLS, NPC> &_dst)


Parameter Descriptions

The following table describes the template and the function parameters.

Table 54: merge Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T Output pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 for 1 pixel operation.

_src1 Input single-channel image

_src2 Input single-channel image

_src3 Input single-channel image

_src4 Input single-channel image

_dst Output multi-channel image

Resource Utilization

The following table summarizes the resource utilization of the merge function, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process 4 single-channel HD (1080x1920) images.

Table 55: merge Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF    LUT   CLB
1 pixel          300                         0          8          494   386   85

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process 4 single-channel HD (1080x1920) images.

Table 56: merge Function Performance Estimate Summary

Operating Mode                Max Latency Estimate
1 pixel operation (300 MHz)   6.92 ms


Channel Extract

The extractChannel funcon splits a mul-channel array (32-bit pixel-interleaved data) into

several single-channel arrays and returns a single channel. The channel to be extracted is

specied by using the channel argument.

The value of the channel argument is specied by macros dened in the

xf_channel_extract_e enumerated data type. The following table summarizes the possible

values for the xf_channel_extract_e enumerated data type:

Table 57: xf_channel_extract_e Enumerated Data Type Values

Channel Enumerated Type

Unknown XF_EXTRACT_CH_0

Unknown XF_EXTRACT_CH_1

Unknown XF_EXTRACT_CH_2

Unknown XF_EXTRACT_CH_3

RED XF_EXTRACT_CH_R

GREEN XF_EXTRACT_CH_G

BLUE XF_EXTRACT_CH_B

ALPHA XF_EXTRACT_CH_A

LUMA XF_EXTRACT_CH_Y

Cb/U XF_EXTRACT_CH_U

Cr/V/Value XF_EXTRACT_CH_V
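Extracting one channel from 32-bit pixel-interleaved data is a shift and mask per pixel. The plain C++ sketch below uses a numeric channel index from 0 to 3 rather than the xf_channel_extract_e values, whose numeric encoding is not given here. The name extractChannelPixel is hypothetical, and treating index 0 as the least significant byte is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Returns the 8-bit channel at position `channel` (0..3) of a 32-bit
// pixel, with channel 0 in bits 7:0 (assumed byte order).
std::uint8_t extractChannelPixel(std::uint32_t pixel, unsigned channel) {
    assert(channel < 4);
    return static_cast<std::uint8_t>((pixel >> (8 * channel)) & 0xFFu);
}
```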

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>

void extractChannel(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,

xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat, uint16_t _channel)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 58: extractChannel Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)

DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 for 1 pixel operation.

_src_mat Input multi-channel image


_dst_mat Output single channel image

_channel Channel to be extracted (See xf_channel_extract_e enumerated type in file xf_params.h for

possible values.)

Resource Utilization

The following table summarizes the resource utilization of the extractChannel function, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a 4 channel HD (1080x1920) image.

Table 59: extractChannel Function Resource Utilization Summary

                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF    LUT   CLB
1 pixel          300                         0          8          508   354   96

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a 4 channel HD (1080x1920) image.

Table 60: extractChannel Function Performance Estimate Summary

Operating Mode                Max Latency Estimate (ms)
1 pixel operation (300 MHz)   6.92

Color Conversion

The color conversion funcons convert one image format to another image format, for the

combinaons listed in the following table. The rows represent the input formats and the columns

represent the output formats. Supported conversions are discussed in the following secons.

I/O Formats   RGBA               NV12               NV21               IYUV               UYVY   YUYV   YUV4
RGBA          N/A                see RGBA to NV12   see RGBA to NV21   see RGBA to IYUV   -      -      see RGBA to YUV4
NV12          see NV12 to RGBA   N/A                -                  see NV12 to IYUV   -      -      see NV12 to YUV4
NV21          see NV21 to RGBA   -                  N/A                see NV21 to IYUV   -      -      see NV21 to YUV4
IYUV          see IYUV to RGBA   see IYUV to NV12   -                  N/A                -      -      see IYUV to YUV4
UYVY          see UYVY to RGBA   see UYVY to NV12   -                  see UYVY to IYUV   N/A    -      -
YUYV          see YUYV to RGBA   see YUYV to NV12   -                  see YUYV to IYUV   -      N/A    -
YUV4          -                  -                  -                  -                  -      -      N/A

RGB to YUV Conversion Matrix

Following is the formula to convert RGB data to YUV data:

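$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 & 16 \\ -0.148 & -0.291 & 0.439 & 128 \\ 0.439 & -0.368 & -0.071 & 128 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \\ 1 \end{bmatrix}$$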
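The RGB to YUV matrix can be applied per pixel as in the following plain C++ sketch. It uses floating point with round-to-nearest, which is an assumption; the hardware kernels may use fixed-point arithmetic and can differ by a least significant bit. The names Yuv, roundToByte, and rgbToYuv are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Yuv {
    std::uint8_t y, u, v;
};

// Clamps to [0, 255] and rounds to the nearest integer.
inline std::uint8_t roundToByte(double v) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, v)) + 0.5);
}

// Applies the RGB to YUV matrix, offsets included, to one pixel.
Yuv rgbToYuv(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    Yuv out;
    out.y = roundToByte( 0.257 * r + 0.504 * g + 0.098 * b + 16.0);
    out.u = roundToByte(-0.148 * r - 0.291 * g + 0.439 * b + 128.0);
    out.v = roundToByte( 0.439 * r - 0.368 * g - 0.071 * b + 128.0);
    // With these coefficients, 8-bit RGB input always maps Y into [16, 235].
    assert(out.y >= 16 && out.y <= 235);
    return out;
}
```

With these coefficients, black maps to (16, 128, 128) and white to (235, 128, 128), so the output occupies the limited "studio" range rather than the full 0 to 255 range.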

YUV to RGB Conversion Matrix

Following is the formula to convert YUV data to RGB data:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.164 & 0 & 1.596 \\ 1.164 & -0.391 & -0.813 \\ 1.164 & 2.018 & 0 \end{bmatrix} \begin{bmatrix} Y - 16 \\ U - 128 \\ V - 128 \end{bmatrix}$$

Source: hp://www.fourcc.org/fccyvrgb.php

RGBA to YUV4

The rgba2yuv4 funcon converts a 4-channel RGBA image to YUV444 format. The funcon

outputs Y, U, and V streams separately.

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>

void rgba2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,

ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _u_image,

xf::Mat<DST_T, ROWS, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.


Table 62: rgba2yuv4 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input RGBA image of size (ROWS, COLS).

_y_image Output Y image of size (ROWS, COLS).

_u_image Output U image of size (ROWS, COLS).

_v_image Output V image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of RGBA to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 63: rgba2yuv4 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 9 | 589 | 328 | 96

Performance Estimate

The following table summarizes the performance of RGBA to YUV4 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 64: rgba2yuv4 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 1.89

RGBA to IYUV

The rgba2iyuv function converts a 4-channel RGBA image to IYUV (4:2:0) format. The function outputs Y, U, and V planes separately. IYUV holds subsampled data: Y is sampled for every RGBA pixel, and U and V are sampled once for every 2x2 block of pixels (2 rows by 2 columns). The U and V planes are each of size (rows/2)*(columns/2); cascading consecutive rows into a single row gives each plane a size of (rows/4)*columns.
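The cascaded chroma layout can be illustrated with a small index mapping. The sketch below assumes that "cascading" means placing two consecutive half-width chroma rows side by side in one full-width row; `PlanePos`, `cascade_chroma`, and `chroma_samples` are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct PlanePos { std::size_t row, col; };

// Map chroma sample (r, c) of the (rows/2) x (cols/2) plane into the
// cascaded (rows/4) x cols layout: chroma rows 2k and 2k+1 share output row k.
PlanePos cascade_chroma(std::size_t r, std::size_t c, std::size_t cols) {
    assert(cols % 2 == 0 && c < cols / 2);
    return PlanePos{ r / 2, (r % 2) * (cols / 2) + c };
}

// The sample count is identical in both layouts:
// (rows/4) * cols == (rows/2) * (cols/2).
std::size_t chroma_samples(std::size_t rows, std::size_t cols) {
    assert(rows % 4 == 0 && cols % 2 == 0);
    return (rows / 4) * cols;
}
```

For a 1080x1920 frame, each chroma plane holds 540 x 960 samples, stored as 270 rows of 1920 samples under this assumption.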

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgba2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 65: rgba2iyuv Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input RGBA image of size (ROWS, COLS).

_y_image Output Y image of size (ROWS, COLS).

_u_image Output U image of size (ROWS/4, COLS).

_v_image Output V image of size (ROWS/4, COLS).

Resource Utilization

The following table summarizes the resource utilization of RGBA to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 66: rgba2iyuv Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 9 | 816 | 472 | 149

Performance Estimate

The following table summarizes the performance of RGBA to IYUV for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 67: rgba2iyuv Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 1.8

RGBA to NV12

The rgba2nv12 function converts a 4-channel RGBA image to NV12 (4:2:0) format. The function outputs the Y plane and the interleaved UV plane separately. NV12 holds subsampled data: Y is sampled for every RGBA pixel, and U and V are sampled once for every 2x2 block of pixels (2 rows by 2 columns). The UV plane is of size (rows/2)*(columns/2), with the U and V values interleaved.
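The NV12 output layout can be modeled in plain C++ as follows. This is a software sketch only, not the hardware kernel: `Nv12Frame`, `to_u8`, and `rgba_to_nv12_ref` are hypothetical names, the input byte order R, G, B, A is assumed, and taking chroma from the top-left pixel of each 2x2 block is an assumption about how the subsampling is done.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Nv12Frame {
    std::vector<std::uint8_t> y;   // rows * cols luma samples
    std::vector<std::uint8_t> uv;  // (rows/2) * (cols/2) interleaved U,V pairs
};

static std::uint8_t to_u8(double x) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, std::round(x))));
}

Nv12Frame rgba_to_nv12_ref(const std::vector<std::uint8_t>& rgba,
                           std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(rgba.size() == rows * cols * 4);
    Nv12Frame out;
    out.y.resize(rows * cols);
    out.uv.resize((rows / 2) * (cols / 2) * 2);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const std::uint8_t* p = &rgba[(r * cols + c) * 4];
            const double R = p[0], G = p[1], B = p[2];
            // Luma is produced for every pixel.
            out.y[r * cols + c] = to_u8(0.257 * R + 0.504 * G + 0.098 * B + 16.0);
            // Chroma is produced once per 2x2 block (top-left pixel, assumed).
            if (r % 2 == 0 && c % 2 == 0) {
                const std::size_t i = ((r / 2) * (cols / 2) + c / 2) * 2;
                out.uv[i]     = to_u8(-0.148 * R - 0.291 * G + 0.439 * B + 128.0);
                out.uv[i + 1] = to_u8( 0.439 * R - 0.368 * G - 0.071 * B + 128.0);
            }
        }
    }
    return out;
}
```

Swapping the two chroma stores (V first, then U) gives the NV21 ordering.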

API Syntax

template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<Y_T, ROWS, COLS, NPC> & _y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 68: rgba2nv12 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input RGBA image of size (ROWS, COLS).

_y Output Y image of size (ROWS, COLS).

_uv Output UV image of size (ROWS/2, COLS/2).

Resource Utilization

The following table summarizes the resource utilization of RGBA to NV12 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 69: rgba2nv12 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 9 | 802 | 452 | 128

Performance Estimate

The following table summarizes the performance of RGBA to NV12 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 70: rgba2nv12 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 1.8

RGBA to NV21

The rgba2nv21 function converts a 4-channel RGBA image to NV21 (4:2:0) format. The function outputs the Y plane and the interleaved VU plane separately. NV21 holds subsampled data: Y is sampled for every RGBA pixel, and U and V are sampled once for every 2x2 block of RGBA pixels (2 rows by 2 columns). The VU plane is of size (rows/2)*(columns/2), with the V and U values interleaved.

API Syntax

template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<Y_T, ROWS, COLS, NPC> & _y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 71: rgba2nv21 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input RGBA image of size (ROWS, COLS).

_y Output Y image of size (ROWS, COLS).

_uv Output UV image of size (ROWS/2, COLS/2).

Resource Utilization

The following table summarizes the resource utilization of RGBA to NV21 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 72: rgba2nv21 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 9 | 802 | 453 | 131

Performance Estimate

The following table summarizes the performance of RGBA to NV21 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 73: rgba2nv21 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 1.89

YUYV to RGBA

The yuyv2rgba function converts a single-channel YUYV (YUV 4:2:2) image to a 4-channel RGBA image. YUYV is a subsampled format: one set of YUYV values gives two RGBA pixel values. YUYV is represented in 16-bit values, whereas RGBA is represented in 32-bit values.
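The relationship between two 16-bit YUYV words and two 32-bit RGBA pixels can be sketched as below. Several details are assumptions made for illustration: luma sits in the low byte of each 16-bit word, the RGBA word is packed with R in the lowest byte, and alpha is set to 255. The names `to_u8`, `pack_rgba`, and `yuyv_pair_to_rgba` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

static std::uint8_t to_u8(double x) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, std::round(x))));
}

// Convert one (Y, U, V) triple and pack it as 0xAABBGGRR (assumed layout).
static std::uint32_t pack_rgba(double y, double u, double v) {
    const double c = y - 16.0, d = u - 128.0, e = v - 128.0;
    const std::uint32_t r = to_u8(1.164 * c + 1.596 * e);
    const std::uint32_t g = to_u8(1.164 * c - 0.391 * d - 0.813 * e);
    const std::uint32_t b = to_u8(1.164 * c + 2.018 * d);
    return r | (g << 8) | (b << 16) | (0xFFu << 24);
}

// w0 carries (Y0, U) and w1 carries (Y1, V); both pixels share U and V.
std::array<std::uint32_t, 2> yuyv_pair_to_rgba(std::uint16_t w0, std::uint16_t w1) {
    const double y0 = w0 & 0xFF, u = w0 >> 8;
    const double y1 = w1 & 0xFF, v = w1 >> 8;
    std::array<std::uint32_t, 2> out = {{ pack_rgba(y0, u, v), pack_rgba(y1, u, v) }};
    return out;
}
```

For UYVY input, the same sketch applies with the byte roles swapped (chroma in the low byte, luma in the high byte, under the same ordering assumption).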

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 74: yuyv2rgba Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_dst Output image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of YUYV to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 75: yuyv2rgba Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 6 | 765 | 705 | 165

Performance Estimate

The following table summarizes the performance of YUYV to RGBA for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 76: yuyv2rgba Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9

YUYV to NV12

The yuyv2nv12 function converts a single-channel YUYV (YUV 4:2:2) image to NV12 (YUV 4:2:0) format. YUYV is a subsampled format: one set of YUYV values gives two Y values and one U and one V value.

API Syntax

template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void yuyv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 77: yuyv2nv12 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

NPC_UV Number of UV image pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_y_image Output Y plane of size (ROWS, COLS).

_uv_image Output UV plane of size (ROWS/2, COLS/2).

Resource Utilization

The following table summarizes the resource utilization of YUYV to NV12 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 78: yuyv2nv12 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 831 | 491 | 149
8 pixel | 150 | 0 | 0 | 1196 | 632 | 161

Performance Estimate

The following table summarizes the performance of YUYV to NV12 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 79: yuyv2nv12 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7
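The 6.9 ms and 1.7 ms figures are consistent with a simple throughput model in which one group of NPC pixels is consumed per clock cycle. The sketch below is only a back-of-the-envelope check under that assumption (no pipeline fill or stall cycles); `frame_latency_ms` is a hypothetical helper, and not every latency value in this chapter follows this model.

```cpp
#include <cassert>
#include <cmath>

// Estimated frame latency in milliseconds, assuming npc pixels per clock cycle.
double frame_latency_ms(int rows, int cols, int npc, double freq_mhz) {
    assert(rows > 0 && cols > 0 && npc > 0 && freq_mhz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / npc;
    return cycles / (freq_mhz * 1e6) * 1e3;
}
```

For a 1080x1920 frame this gives about 6.912 ms at 1 pixel per cycle and 300 MHz, and about 1.728 ms at 8 pixels per cycle and 150 MHz.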

YUYV to IYUV

The yuyv2iyuv function converts a single-channel YUYV (YUV 4:2:2) image to IYUV (4:2:0) format. The outputs of the function are separate Y, U, and V planes. YUYV is a subsampled format: one set of YUYV values gives two Y values and one U and one V value. The U and V values of the odd rows are dropped, because U and V are sampled once for every 2 rows and 2 columns in the IYUV (4:2:0) format.
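The row-dropping step can be modeled on a byte buffer. This is an illustrative sketch, not the kernel's implementation: it assumes the bytes arrive in Y0, U, Y1, V order, counts rows from zero (so rows 1, 3, 5, ... are the odd rows), and uses hypothetical names `IyuvPlanes` and `yuyv_to_iyuv_ref`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct IyuvPlanes { std::vector<std::uint8_t> y, u, v; };

// yuyv holds rows*cols pixels, 2 bytes per pixel, as Y0 U Y1 V groups (assumed).
IyuvPlanes yuyv_to_iyuv_ref(const std::vector<std::uint8_t>& yuyv,
                            std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(yuyv.size() == rows * cols * 2);
    IyuvPlanes out;
    out.y.resize(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; c += 2) {
            const std::uint8_t* q = &yuyv[(r * cols + c) * 2];
            out.y[r * cols + c]     = q[0];  // every luma sample is kept
            out.y[r * cols + c + 1] = q[2];
            if (r % 2 == 0) {                // chroma of odd rows is dropped
                out.u.push_back(q[1]);
                out.v.push_back(q[3]);
            }
        }
    }
    return out;
}
```

The resulting U and V planes each hold (rows/2)*(columns/2) samples, matching the 4:2:0 sampling described above.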

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 80: yuyv2iyuv Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS/4, COLS).

_v_image Output V plane of size (ROWS/4, COLS).

Resource Utilization

The following table summarizes the resource utilization of YUYV to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 81: yuyv2iyuv Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 835 | 497 | 149
8 pixel | 150 | 0 | 0 | 1428 | 735 | 210

Performance Estimate

The following table summarizes the performance of YUYV to IYUV for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 82: yuyv2iyuv Function Performance Estimate

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7

UYVY to IYUV

The uyvy2iyuv function converts a single-channel UYVY (YUV 4:2:2) image to the IYUV format. The outputs of the function are separate Y, U, and V planes. UYVY is a subsampled format: one set of UYVY values gives two Y values and one U and one V value.

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
               xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 83: uyvy2iyuv Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS/4, COLS).

_v_image Output V plane of size (ROWS/4, COLS).

Resource Utilization

The following table summarizes the resource utilization of UYVY to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 84: uyvy2iyuv Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 835 | 494 | 139
8 pixel | 150 | 0 | 0 | 1428 | 740 | 209

Performance Estimate

The following table summarizes the performance of UYVY to IYUV for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 85: uyvy2iyuv Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7

UYVY to RGBA

The uyvy2rgba function converts a single-channel UYVY (YUV 4:2:2) image to a 4-channel RGBA image. UYVY is a subsampled format: one set of UYVY values gives two RGBA pixel values. UYVY is represented in 16-bit values, whereas RGBA is represented in 32-bit values.

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 86: uyvy2rgba Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_dst Output image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of UYVY to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 87: uyvy2rgba Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 6 | 773 | 704 | 160

Performance Estimate

The following table summarizes the performance of UYVY to RGBA for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 88: uyvy2rgba Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.8

UYVY to NV12

The uyvy2nv12 function converts a single-channel UYVY (YUV 4:2:2) image to NV12 format. The outputs are separate Y and UV planes. UYVY is a subsampled format: one set of UYVY values gives two Y values and one U and one V value.

API Syntax

template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void uyvy2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
               xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 89: uyvy2nv12 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

NPC_UV Number of UV image pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1-pixel and 8-pixel operations, respectively.

_src Input image of size (ROWS, COLS).

_y_image Output Y plane of size (ROWS, COLS).

_uv_image Output UV plane of size (ROWS/2, COLS/2).

Resource Utilization

The following table summarizes the resource utilization of UYVY to NV12 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 90: uyvy2nv12 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 831 | 488 | 131
8 pixel | 150 | 0 | 0 | 1235 | 677 | 168

Performance Estimate

The following table summarizes the performance of UYVY to NV12 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 91: uyvy2nv12 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7

IYUV to RGBA

The iyuv2rgba function converts a single-channel IYUV (YUV 4:2:0) image to a 4-channel RGBA image. The inputs to the function are separate Y, U, and V planes. IYUV is a subsampled format: U and V values are sampled once for every 2 rows and 2 columns of RGBA pixels. The data of two consecutive rows of size (columns/2) is combined to form a single row of size (columns).

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void iyuv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 92: iyuv2rgba Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

src_y Input Y plane of size (ROWS, COLS).

src_u Input U plane of size (ROWS/4, COLS).

src_v Input V plane of size (ROWS/4, COLS).

_dst0 Output RGBA image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of IYUV to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 93: iyuv2rgba Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 2 | 5 | 1208 | 728 | 196

Performance Estimate

The following table summarizes the performance of IYUV to RGBA for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 94: iyuv2rgba Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9

IYUV to NV12

The iyuv2nv12 function converts a single-channel IYUV image to NV12 format. The inputs are separate U and V planes. The Y plane needs no processing because it is the same in both formats. The U and V values are rearranged from plane interleaved to pixel interleaved.
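The rearrangement from plane interleaved to pixel interleaved amounts to zipping the two chroma planes together. A minimal software sketch follows; `interleave_uv` is a hypothetical helper and works on flat sample vectors rather than xf::Mat objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the NV12 UV plane (U0 V0 U1 V1 ...) from separate U and V planes.
std::vector<std::uint8_t> interleave_uv(const std::vector<std::uint8_t>& u,
                                        const std::vector<std::uint8_t>& v) {
    assert(u.size() == v.size());
    std::vector<std::uint8_t> uv;
    uv.reserve(u.size() * 2);
    for (std::size_t i = 0; i < u.size(); ++i) {
        uv.push_back(u[i]);
        uv.push_back(v[i]);
    }
    return uv;
}
```

The Y plane is copied through unchanged, so only the chroma data needs this step.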

API Syntax

template <int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void iyuv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 95: iyuv2nv12 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

NPC_UV Number of UV pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1-pixel and 4-pixel operations, respectively.

src_y Input Y plane of size (ROWS, COLS).

src_u Input U plane of size (ROWS/4, COLS).

src_v Input V plane of size (ROWS/4, COLS).

_y_image Output Y plane of size (ROWS, COLS).

_uv_image Output UV plane of size (ROWS/2, COLS/2).

Resource Utilization

The following table summarizes the resource utilization of IYUV to NV12 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 96: iyuv2nv12 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 12 | 907 | 677 | 158
8 pixel | 150 | 0 | 12 | 1591 | 1022 | 235

Performance Estimate

The following table summarizes the performance of IYUV to NV12 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 97: iyuv2nv12 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7

IYUV to YUV4

The iyuv2yuv4 function converts a single-channel IYUV image to YUV444 format. The Y plane is the same for both formats. The inputs are separate U and V planes of the IYUV image, and the outputs are separate U and V planes of the YUV4 image. IYUV stores subsampled U and V values, whereas YUV444 stores U and V values for every pixel. The same U and V values are duplicated across 2 rows and 2 columns (2x2 pixels) to produce the data required by the YUV444 format.
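The 2x2 duplication is a nearest-neighbor upsampling of each chroma plane. The sketch below shows it on a flat, row-major half-resolution plane; `upsample_2x2` is a hypothetical helper, and it ignores the cascaded (ROWS/4, COLS) storage used by the function arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a (rows/2) x (cols/2) chroma plane to rows x cols by repeating
// each sample over a 2x2 block of output pixels.
std::vector<std::uint8_t> upsample_2x2(const std::vector<std::uint8_t>& half,
                                       std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(half.size() == (rows / 2) * (cols / 2));
    std::vector<std::uint8_t> full(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            full[r * cols + c] = half[(r / 2) * (cols / 2) + c / 2];
        }
    }
    return full;
}
```

Each output plane holds four times as many samples as its input, which is in line with the higher latency reported for this conversion.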

API Syntax

template <int SRC_T, int ROWS, int COLS, int NPC=1>
void iyuv2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 98: iyuv2yuv4 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations, respectively.

src_y Input Y plane of size (ROWS, COLS).

src_u Input U plane of size (ROWS/4, COLS).

src_v Input V plane of size (ROWS/4, COLS).

_y_image Output Y image of size (ROWS, COLS).

_u_image Output U image of size (ROWS, COLS).

_v_image Output V image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of IYUV to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 99: iyuv2yuv4 Function Resource Utilization Summary

Operating Mode | Operating Frequency (MHz) | Utilization Estimate: BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 1398 | 870 | 232
8 pixel | 150 | 0 | 0 | 2134 | 1214 | 304

Performance Estimate

The following table summarizes the performance of IYUV to YUV4 for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 100: iyuv2yuv4 Function Performance Estimate Summary

Operating Mode | Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) | 13.8
8 pixel operation (150 MHz) | 3.4

NV12 to IYUV

The nv122iyuv function converts NV12 format to IYUV format. The function takes the interleaved UV plane as input and outputs separate U and V planes. The Y plane needs no processing because it is the same in both formats. The U and V values are rearranged from pixel interleaved to plane interleaved.
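The rearrangement from pixel interleaved to plane interleaved is the inverse of a zip: even positions go to the U plane and odd positions go to the V plane. The following is a minimal software sketch on a flat byte vector; `UvPlanes` and `deinterleave_uv` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct UvPlanes { std::vector<std::uint8_t> u, v; };

// Split an NV12 UV plane (U0 V0 U1 V1 ...) into separate U and V planes.
UvPlanes deinterleave_uv(const std::vector<std::uint8_t>& uv) {
    assert(uv.size() % 2 == 0);
    UvPlanes out;
    out.u.reserve(uv.size() / 2);
    out.v.reserve(uv.size() / 2);
    for (std::size_t i = 0; i < uv.size(); i += 2) {
        out.u.push_back(uv[i]);
        out.v.push_back(uv[i + 1]);
    }
    return out;
}
```

Applying this to the output of an interleaving step returns the original U and V planes, which is a convenient round-trip check in a testbench.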

API Syntax

template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 101: nv122iyuv Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS/4, COLS).

_v_image Output V plane of size (ROWS/4, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV12 to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 102: nv122iyuv Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                0          1          1344   717    208
8 pixel      150                0          1          1961   1000   263

Performance Estimate

The following table summarizes the performance of NV12 to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 103: nv122iyuv Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    6.9
8 pixel operation (150 MHz)    1.7
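The latency figures above are consistent with a simple throughput estimate. The sketch below assumes that the kernel consumes NPC pixels on every clock cycle and that pipeline fill time is negligible; this model is an assumption made for illustration and is not stated in this guide, and `estimate_latency_ms` is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical helper: estimated frame latency in milliseconds, assuming
// npc pixels are consumed per clock cycle and pipeline fill is ignored.
double estimate_latency_ms(int rows, int cols, int npc, double freq_mhz) {
    assert(rows > 0 && cols > 0 && npc > 0 && freq_mhz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / npc;
    return cycles / (freq_mhz * 1000.0);  // freq_mhz * 1000 = cycles per ms
}
```

For a 1080x1920 frame this model gives roughly 6.91 ms at 1 pixel per clock and 300 MHz, and roughly 1.73 ms at 8 pixels per clock and 150 MHz, in line with the tabulated 6.9 ms and 1.7 ms.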

NV12 to RGBA

The nv122rgba function converts an NV12 image to a 4-channel RGBA image. The inputs to the function are separate Y and UV planes. NV12 holds subsampled data: the Y plane is sampled at unit rate, and there is one U value and one V value for every 2x2 block of Y values. To generate the RGBA data, each U and V value is duplicated over its 2x2 block.

API Syntax

template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv122rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & src_uv,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 104: nv122rgba Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_dst0 Output RGBA image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV12 to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 105: nv122rgba Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                2          5          1191   708    195

Performance Estimate

The following table summarizes the performance of NV12 to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 106: nv122rgba Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    6.9

NV12 to YUV4

The nv122yuv4 function converts an NV12 image to the YUV444 format. The function outputs separate U and V planes. The Y plane is the same for both image formats. Each value in the UV plane is duplicated over a 2x2 block to form the full-resolution U plane and V plane of the YUV444 image format.
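The 2x2 duplication can be sketched in plain C++ as follows. This is a software reference for one already separated chroma plane, not the library's implementation; `upsample_2x2` and the row-major `std::vector` layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands a subsampled chroma plane of size
// (half_rows, half_cols) to (2*half_rows, 2*half_cols) by repeating each
// sample over its 2x2 block. Planes are stored row-major.
std::vector<std::uint8_t> upsample_2x2(const std::vector<std::uint8_t>& plane,
                                       std::size_t half_rows,
                                       std::size_t half_cols) {
    assert(plane.size() == half_rows * half_cols);
    const std::size_t cols = half_cols * 2;
    std::vector<std::uint8_t> out(half_rows * 2 * cols);
    for (std::size_t r = 0; r < half_rows * 2; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Integer division maps each output pixel to its source sample.
            out[r * cols + c] = plane[(r / 2) * half_cols + (c / 2)];
        }
    }
    return out;
}
```

A 2x2 plane holding 1, 2, 3, 4 expands to a 4x4 plane whose top-left 2x2 block is all 1 and whose bottom-right 2x2 block is all 4.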

API Syntax

template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 107: nv122yuv4 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS, COLS).

_v_image Output V plane of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV12 to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 108: nv122yuv4 Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                0          0          1383   832    230
8 pixel      150                0          0          1772   1034   259

Performance Estimate

The following table summarizes the performance of NV12 to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 109: nv122yuv4 Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    13.8
8 pixel operation (150 MHz)    3.4

NV21 to IYUV

The nv212iyuv function converts an NV21 image to the IYUV image format. The function takes only the interleaved VU plane as input and outputs separate U and V planes. The Y plane needs no processing because it is the same in both formats. The U and V values are rearranged from pixel-interleaved to plane-interleaved.

API Syntax

template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv212iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,
               xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 110: nv212iyuv Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS/4, COLS).

_v_image Output V plane of size (ROWS/4, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV21 to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 111: nv212iyuv Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                0          1          1377   730    219
8 pixel      150                0          1          1975   1012   279

Performance Estimate

The following table summarizes the performance of NV21 to IYUV for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 112: nv212iyuv Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    6.9
8 pixel operation (150 MHz)    1.7

NV21 to RGBA

The nv212rgba function converts an NV21 image to a 4-channel RGBA image. The inputs to the function are separate Y and VU planes. NV21 holds subsampled data: the Y plane is sampled at unit rate, and there is one U value and one V value for every 2x2 block of Y values. To generate the RGBA data, each U and V value is duplicated over its 2x2 block.

API Syntax

template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv212rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & src_uv,
               xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 113: nv212rgba Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_dst0 Output RGBA image of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV21 to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 114: nv212rgba Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                2          5          1170   673    183

Performance Estimate

The following table summarizes the performance of NV21 to RGBA for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 115: nv212rgba Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    6.9

NV21 to YUV4

The nv212yuv4 function converts an image in the NV21 format to the YUV444 format. The function outputs separate U and V planes. The Y plane is the same for both formats. Each value in the VU plane is duplicated over a 2x2 block to form the full-resolution U plane and V plane of the YUV444 format.

API Syntax

template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv212yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,
               xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image,
               xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 116: nv212yuv4 Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS Maximum height of input and output image (must be a multiple of 8).

COLS Maximum width of input and output image (must be a multiple of 8).

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y Input Y plane of size (ROWS, COLS).

src_uv Input UV plane of size (ROWS/2, COLS/2).

_y_image Output Y plane of size (ROWS, COLS).

_u_image Output U plane of size (ROWS, COLS).

_v_image Output V plane of size (ROWS, COLS).

Resource Utilization

The following table summarizes the resource utilization of NV21 to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 117: nv212yuv4 Function Resource Utilization Summary

Operating    Operating          Utilization Estimate
Mode         Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      300                0          0          1383   817    233
8 pixel      150                0          0          1887   1087   287

Performance Estimate

The following table summarizes the performance of NV21 to YUV4 for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 118: nv212yuv4 Function Performance Estimate Summary

Operating Mode                 Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)    13.8
8 pixel operation (150 MHz)    3.5

Color Thresholding

The colorthresholding function compares the color space values of the source image with low and high threshold values, and returns either 255 or 0 as the output.
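A per-pixel version of this comparison might look like the sketch below. Several details are assumptions not confirmed by this guide: the thresholds are taken to be stored as consecutive 3-value triples per color range, the bounds are treated as inclusive, and a pixel is marked 255 if it falls inside any one of the ranges. `threshold_pixel` is a hypothetical helper, not a library function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns 255 if the 3-channel pixel lies inside at least one [low, high]
// range, otherwise 0. Threshold layout (one triple per color range) and
// inclusive bounds are assumptions made for this sketch.
std::uint8_t threshold_pixel(const std::array<std::uint8_t, 3>& px,
                             const std::vector<std::uint8_t>& low,
                             const std::vector<std::uint8_t>& high) {
    assert(low.size() == high.size() && low.size() % 3 == 0);
    for (std::size_t k = 0; k < low.size(); k += 3) {
        bool inside = true;
        for (std::size_t c = 0; c < 3; ++c) {
            inside = inside && px[c] >= low[k + c] && px[c] <= high[k + c];
        }
        if (inside) return 255;
    }
    return 0;
}
```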

API Syntax

template<int SRC_T, int DST_T, int MAXCOLORS, int ROWS, int COLS, int NPC>
void colorthresholding(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
                       xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat,
                       unsigned char low_thresh[MAXCOLORS*3],
                       unsigned char high_thresh[MAXCOLORS*3])

Parameter Descriptions

The table below describes the template and the function parameters.

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)

DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

MAXCOLORS Maximum number of color values

ROWS Maximum height of input and output image

COLS Maximum width of input and output image

NPC Number of pixels to be processed per cycle. Only XF_NPPC1 supported.

_src_mat Input image

_dst_mat Thresholded image

low_thresh Lowest threshold values for the colors

high_thresh Highest threshold values for the colors

Custom Convolution

The filter2D function performs convolution over an image using a user-defined kernel.

Convolution is a mathematical operation on two functions f and g, producing a third function. The third function is typically viewed as a modified version of one of the original functions, giving the area of overlap between the two functions as a function of the amount by which one of the original functions is translated.

The filter can be a unity gain filter or a non-unity gain filter. The filter must be of type XF_16SP. If the coefficients are floating point, they must be converted to the Qm.n format and provided as the input, and the shift parameter must be set to the 'n' value. Otherwise, if the coefficients are not floating point, the filter is provided directly and the shift parameter is set to zero.
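The conversion to Qm.n can be sketched as follows: each coefficient is multiplied by 2^n and rounded to a 16-bit integer. The helper name `to_fixed_point` and the round-to-nearest policy are assumptions made for this example; the guide does not prescribe a rounding mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts floating-point coefficients to 16-bit
// fixed point with n fractional bits (round to nearest). The same n would
// then be passed as the shift parameter.
std::vector<std::int16_t> to_fixed_point(const std::vector<float>& coeffs,
                                         unsigned n) {
    assert(n <= 15);  // a signed 16-bit value has at most 15 fractional bits
    const float scale = static_cast<float>(1u << n);
    std::vector<std::int16_t> out;
    out.reserve(coeffs.size());
    for (float c : coeffs) {
        const long q = std::lround(c * scale);
        assert(q >= INT16_MIN && q <= INT16_MAX);  // must fit in 16 bits
        out.push_back(static_cast<std::int16_t>(q));
    }
    return out;
}
```

With n = 8, the coefficient 0.5 becomes 128 and -0.25 becomes -64; the shift parameter would then be set to 8.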

API Syntax

template<int BORDER_TYPE, int FILTER_WIDTH, int FILTER_HEIGHT, int SRC_T,
         int DST_T, int ROWS, int COLS, int NPC=1>
void filter2D(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
              xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat,
              short int filter[FILTER_HEIGHT*FILTER_WIDTH],
              unsigned char _shift)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 119: filter2D Function Parameter Descriptions

Parameter Description

BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT

FILTER_HEIGHT Number of rows in the input filter

FILTER_WIDTH Number of columns in the input filter

SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T Output pixel type. 8-bit unsigned single channel (XF_8UC1) and 16-bit signed single channel (XF_16SC1) are supported.

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and

XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image

filter The input filter, of any size, provided that both dimensions are odd. The filter coefficients are either 16-bit values or 16-bit fixed-point equivalent values.

_shift The filter must be of type XF_16SP. If the coefficients are floating point, they must be converted to the Qm.n format and provided as the input, and the shift parameter must be set to the 'n' value. Otherwise, if the coefficients are not floating point, the filter is provided directly and the shift parameter is set to zero.

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 120: filter2D Function Resource Utilization Summary

Operating    Filter    Operating          Utilization Estimate
Mode         Size      Frequency (MHz)    BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel      3x3       300                3          9          1701   1161   269
1 pixel      5x5       300                5          25         3115   2144   524
8 pixel      3x3       150                6          72         2783   2768   638
8 pixel      5x5       150                10         216        3020   4443   1007

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 121: filter2D Function Performance Estimate Summary

Operating Mode    Operating Frequency (MHz)    Filter Size    Latency Estimate: Max (ms)
1 pixel           300                          3x3            7
1 pixel           300                          5x5            7.1
8 pixel           150                          3x3            1.86
8 pixel           150                          5x5            1.86

Delay

In image processing pipelines, it is possible that the inputs to a function with FIFO interfaces are not synchronized. That is, the first data packet for the first input might arrive a finite number of clock cycles after the first data packet of the second input. If the FIFOs at the function's interface have insufficient depth, this causes the whole design to stall on hardware. To synchronize the inputs, this function delays the input packet that arrives early by a finite number of clock cycles.

API Syntax

template<int MAXDELAY, int SRC_T, int ROWS, int COLS, int NPC=1>
void delayMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
              xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The table below describes the template and the function parameters.

Parameter Description

SRC_T Input and output pixel type

ROWS Maximum height of input and output image (must be multiple of 8)

COLS Maximum width of input and output image (must be multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

MAXDELAY Maximum delay that the function is to be instantiated for.

_src Input image

_dst Output image

Dilate

During a dilation operation, the current pixel intensity is replaced by the maximum value of the intensity in a 3x3 neighborhood of the current pixel.

$$\mathrm{dst}(x, y) = \max_{\substack{x-1 \le x' \le x+1 \\ y-1 \le y' \le y+1}} \mathrm{src}(x', y')$$
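The formula above can be checked against a small software reference such as the following sketch. It is not the library's implementation; `dilate3x3` is a hypothetical helper, and treating pixels outside the image as 0 is an assumed value for the constant border.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference 3x3 dilation on a row-major 8-bit image. Pixels outside the
// image are treated as 0 (an assumed constant-border value), which is
// equivalent to skipping them when taking the maximum.
std::vector<std::uint8_t> dilate3x3(const std::vector<std::uint8_t>& src,
                                    int rows, int cols) {
    assert(rows > 0 && cols > 0 &&
           src.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            std::uint8_t best = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= rows || xx < 0 || xx >= cols) continue;
                    best = std::max(best, src[yy * cols + xx]);
                }
            }
            dst[y * cols + x] = best;
        }
    }
    return dst;
}
```

A single bright pixel in the input grows into a 3x3 bright block in the output.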

API Syntax

template<int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC=1>
void dilate(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
            xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 122: dilate Function Parameter Descriptions

Parameter Description

BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for

1 pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image

Resource Utilization

The following table summarizes the resource utilization of the Dilation function for 1 pixel operation and 8 pixel operation, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.

Table 123: dilate Function Resource Utilization Summary

Name        Resource Utilization
            1 pixel per clock operation    8 pixel per clock operation
            300 MHz                        150 MHz
BRAM_18K    3                              6
DSP48E      0                              0
FF          339                            644
LUT         350                            1325
CLB         81                             245

Performance Estimate

The following table summarizes a performance estimate of the Dilation function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.

Table 124: dilate Function Performance Estimate Summary

Operating Mode       Latency Estimate
                     Min (ms)    Max (ms)
1 pixel (300 MHz)    7.0         7.0
8 pixel (150 MHz)    1.87        1.87

Duplicate

When various functions in a pipeline are implemented in programmable logic, FIFOs are instantiated between two functions for dataflow processing. When the output from one function is consumed by two functions in a pipeline, the FIFOs need to be duplicated. This function facilitates the duplication process of the FIFOs.

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void duplicateMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst1,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst2)

Parameter Descriptions

The table below describes the template and the function parameters.

Parameter Description

SRC_T Input and output pixel type

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel

and 8 pixel operations respectively.

_src Input image

_dst1 Duplicate output for _src

_dst2 Duplicate output for _src

Erode

The erode function finds the minimum pixel intensity in the 3x3 neighborhood of a pixel and replaces the pixel intensity with the minimum value.

$$\mathrm{dst}(x, y) = \min_{\substack{x-1 \le x' \le x+1 \\ y-1 \le y' \le y+1}} \mathrm{src}(x', y')$$

API Syntax

template<int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC=1>
void erode(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
           xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 125: erode Function Parameter Descriptions

Parameter Description

BORDER_TYPE Border type supported is XF_BORDER_CONSTANT

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8

for 1 pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image

Resource Utilization

The following table summarizes the resource utilization of the Erosion function, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.

Table 126: erode Function Resource Utilization Summary

Name        Resource Utilization
            1 pixel per clock operation    8 pixel per clock operation
            300 MHz                        150 MHz
BRAM_18K    3                              6
DSP48E      0                              0
FF          342                            653
LUT         351                            1316
CLB         79                             230

Performance Estimate

The following table summarizes a performance estimate of the Erosion function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.

Table 127: erode Function Performance Estimate Summary

Operating Mode       Latency Estimate
                     Min (ms)    Max (ms)
1 pixel (300 MHz)    7.0         7.0
8 pixel (150 MHz)    1.85        1.85

FAST Corner Detection

Features from accelerated segment test (FAST) is a corner detection algorithm that is faster than most other feature detectors.

The fast function picks a pixel in the image and compares its intensity with that of 16 pixels in its neighborhood on a circle, called the Bresenham circle. If the intensity of 9 contiguous pixels is found to be either more than or less than that of the candidate pixel by a given threshold, the pixel is declared a corner. Once the corners are detected, non-maximal suppression is applied to remove the weaker corners.

This function can be used for both still images and videos. The corners are marked in the output image: if a corner is found at a particular location, that location is marked with 255; otherwise it is zero.
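The segment test on the 16 circle pixels can be sketched as below. This is a software illustration only, without non-maximal suppression; `is_fast_corner` and its `arc_length` parameter are hypothetical names, and the ring is assumed to be supplied in circular order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Returns true if at least arc_length contiguous ring pixels (with
// wrap-around) are all brighter than center + threshold or all darker
// than center - threshold.
bool is_fast_corner(std::uint8_t center,
                    const std::array<std::uint8_t, 16>& ring,
                    std::uint8_t threshold,
                    int arc_length = 9) {
    assert(arc_length >= 1 && arc_length <= 16);
    const int hi = center + threshold;
    const int lo = center - threshold;
    int bright_run = 0, dark_run = 0;
    // Visit 16 + arc_length - 1 positions so that a run crossing the
    // boundary between index 15 and index 0 is still counted.
    for (int i = 0; i < 16 + arc_length - 1; ++i) {
        const int p = ring[i % 16];
        bright_run = (p > hi) ? bright_run + 1 : 0;
        dark_run = (p < lo) ? dark_run + 1 : 0;
        if (bright_run >= arc_length || dark_run >= arc_length) return true;
    }
    return false;
}
```

A ring with 9 bright pixels at indices 12 to 15 and 0 to 4 is accepted, because the run wraps around the end of the array; breaking the run at any one of those positions rejects the candidate.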

API Syntax

template<int NMS, int SRC_T, int ROWS, int COLS, int NPC=1>
void fast(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
          xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat,
          unsigned char _threshold)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 128: fast Function Parameter Descriptions

Parameter Description

NMS If NMS == 1, non-maximum suppression is applied to detected corners (keypoints). The value

should be 0 or 1.

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)

ROWS Maximum height of input image (must be a multiple of 8)

COLS Maximum width of input image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

_src_mat Input image

_dst_mat Output image. The corners are marked in the image.

_threshold Threshold on the intensity difference between the center pixel and its neighbors. Usually it is taken

around 20.

Resource Utilization

The following table summarizes the resource utilization of the kernel for different configurations, generated using Vivado HLS 2018.2 for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with NMS.

Table 129: fast Function Resource Utilization Summary

Name        Resource Utilization
            1 pixel     8 pixel
            300 MHz     150 MHz
BRAM_18K    10          20
DSP48E      0           0
FF          2695        7310
LUT         3792        20956
CLB         769         3519

Performance Estimate

The following table summarizes the performance of the kernel for different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image with non-maximum suppression (NMS).

Table 130: fast Function Performance Estimate Summary

Operating Mode    Operating Frequency (MHz)    Filter Size    Latency Estimate: Max (ms)
1 pixel           300                          3x3            7
8 pixel           150                          3x3            1.86

Gaussian Filter

The GaussianBlur function applies Gaussian blur on the input image. Gaussian filtering is done by convolving each point in the input image with a Gaussian kernel.

$$G_0(x, y) = e^{\frac{-(x-\mu_x)^2}{2\sigma_x^2} + \frac{-(y-\mu_y)^2}{2\sigma_y^2}}$$

where $\mu_x$ and $\mu_y$ are the mean values, and $\sigma_x$ and $\sigma_y$ are the standard deviations in the x and y directions respectively (so $\sigma_x^2$ and $\sigma_y^2$ are the variances). In the GaussianBlur function, the values of $\mu_x$ and $\mu_y$ are taken as zero and the values of $\sigma_x$ and $\sigma_y$ are equal.
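With zero means and a single sigma, the kernel weights follow directly from the formula. The sketch below evaluates them on an odd-sized grid; `gaussian_kernel` is a hypothetical helper, and normalizing the weights to sum to 1 is an assumption of this example rather than a documented property of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a size x size Gaussian kernel (row-major) with zero mean and
// equal sigma in x and y. The weights are normalized to sum to 1, which
// is an assumption made for this sketch.
std::vector<float> gaussian_kernel(int size, float sigma) {
    assert(size % 2 == 1 && size >= 3 && sigma > 0.0f);
    const int half = size / 2;
    std::vector<float> k(static_cast<std::size_t>(size) * size);
    float sum = 0.0f;
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            const float w = std::exp(-(x * x + y * y) / (2.0f * sigma * sigma));
            k[(y + half) * size + (x + half)] = w;
            sum += w;
        }
    }
    for (float& w : k) w /= sum;
    return k;
}
```

For a 3x3 kernel the center weight is the largest, the four edge weights are equal, and the four corner weights are the smallest.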

API Syntax

template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC = 1>
void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & src,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & dst,
                  float sigma)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 131: GaussianBlur Function Parameter Descriptions

Parameter Description

FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are

supported.

BORDER_TYPE Border type supported is XF_BORDER_CONSTANT

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for

1 pixel and 8 pixel operations respectively.

src Input image

dst Output image

sigma Standard deviation of Gaussian filter

Resource Utilization

The following table summarizes the resource ulizaon of the Gaussian Filter in dierent

conguraons, generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-

vb1156-1-i-es1 FPGA, to progress a grayscale HD (1080x1920) image.

Table 132: GaussianBlur Function Resource Utilization Summary

Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate: BRAM_18K DSP_48Es FF LUT CLB

1 pixel 3x3 300 3 17 3641 2791 610

1 pixel 5x5 300 5 27 4461 3544 764

1 pixel 7x7 250 7 35 4770 4201 894

8 pixel 3x3 150 6 52 3939 3784 814

8 pixel 5x5 150 10 111 5688 5639 1133

8 pixel 7x7 150 14 175 7594 7278 1518

Performance Estimate

The following table summarizes a performance estimate of the Gaussian Filter in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 133: GaussianBlur Function Performance Estimate Summary

Operating Mode Filter Size Max Latency (ms)

1 pixel operation (300 MHz) 3x3 7.01

1 pixel operation (300 MHz) 5x5 7.03

1 pixel operation (300 MHz) 7x7 7.06

8 pixel operation (150 MHz) 3x3 1.6

8 pixel operation (150 MHz) 5x5 1.7

8 pixel operation (150 MHz) 7x7 1.74

Gradient Magnitude

The magnitude function computes the magnitude for the images. The input images are x-gradient and y-gradient images of type 16S. The output image is of the same type as the input image.

For L1NORM normalization, the magnitude computed image is the pixel-wise sum of the absolute values of the x-gradient and y-gradient, as shown below:

$$g = |g_x| + |g_y|$$

For L2NORM normalization, the magnitude computed image is as follows:

$$g = \sqrt{g_x^2 + g_y^2}$$
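The two norms can be illustrated per pixel with a short host-side sketch. The names `NormType` and `gradientMagnitude` are hypothetical stand-ins, and both the saturation to the 16-bit signed range and the rounding of the square root are assumptions of this sketch rather than documented behavior of the kernel.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-ins for XF_L1NORM / XF_L2NORM
enum NormType { L1_NORM, L2_NORM };

// Per-pixel gradient magnitude for 16-bit signed gradients.
// Saturating to INT16_MAX and rounding to nearest are assumptions.
int16_t gradientMagnitude(int16_t gx, int16_t gy, NormType norm) {
    long result;
    if (norm == L1_NORM) {
        // g = |gx| + |gy|
        result = std::labs(static_cast<long>(gx)) + std::labs(static_cast<long>(gy));
    } else {
        // g = sqrt(gx^2 + gy^2)
        double sq = static_cast<double>(gx) * gx + static_cast<double>(gy) * gy;
        result = std::lround(std::sqrt(sq));
    }
    assert(result >= 0);
    if (result > INT16_MAX) result = INT16_MAX;
    return static_cast<int16_t>(result);
}
```

For gradients (3, -4) the L1 result is 7 and the L2 result is 5.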

API Syntax

template< int NORM_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1>

void magnitude(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,xf::Mat<DST_T,

ROWS, COLS, NPC> & _src_maty,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 134: magnitude Function Parameter Descriptions

Parameter Description

NORM_TYPE Normalization type can be either L1 or L2 norm. Values are XF_L1NORM or XF_L2NORM

SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

DST_T Output pixel type. Only 16-bit, signed,1 channel is supported (XF_16SC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for 1 pixel

and 8 pixel operations respectively.

_src_matx First input, x-gradient image.

_src_maty Second input, y-gradient image.

_dst_mat Output, magnitude computed image.

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image and for L2 normalization.

Table 135: magnitude Function Resource Utilization Summary

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 0 0

DSP48E 2 16

FF 707 2002

LUT 774 3666

CLB 172 737

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image and for L2 normalization.

Table 136: magnitude Function Performance Estimate Summary

Operating Mode Operating Frequency (MHz) Latency Estimate

Max (ms)

1 pixel 300 7.2

8 pixel 150 1.7

Gradient Phase

The phase function computes the polar angles of two images. The input images are x-gradient and y-gradient images of type 16S. The output image is of the same type as the input image.

For radians:

$$angle(x,y) = \operatorname{atan2}(g_y, g_x)$$

For degrees:

$$angle(x,y) = \operatorname{atan2}(g_y, g_x) \cdot \frac{180}{\pi}$$
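As an illustration, the angle computation and its fixed-point output (Q4.12 for radians, Q10.6 for degrees, as listed in the parameter table for this function) can be modeled on the host as below. The names `RetType` and `gradientPhase` are hypothetical, and the use of double-precision `atan2` with round-to-nearest is an assumption; the hardware kernel is likely to use its own fixed-point approximation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-ins for XF_RADIANS / XF_DEGREES
enum RetType { RADIANS, DEGREES };

// Per-pixel gradient angle in fixed point:
// Q4.12 (scale 4096) for radians, Q10.6 (scale 64) for degrees.
int16_t gradientPhase(int16_t gx, int16_t gy, RetType ret) {
    const double kPi = std::acos(-1.0);
    double angle = std::atan2(static_cast<double>(gy), static_cast<double>(gx));
    if (angle < 0.0) angle += 2.0 * kPi;  // map (-pi, pi] onto [0, 2*pi)
    double scaled = (ret == RADIANS) ? angle * 4096.0
                                     : angle * (180.0 / kPi) * 64.0;
    long q = std::lround(scaled);
    assert(q >= 0 && q <= INT16_MAX);  // both formats fit in 16 bits
    return static_cast<int16_t>(q);
}
```

For example, a purely vertical gradient (gx = 0, gy = 1) yields 90 degrees, which is 5760 in Q10.6.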

API Syntax

template<int RET_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1 >

void phase(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,xf::Mat<DST_T,

ROWS, COLS, NPC> & _src_maty,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 137: phase Function Parameter Descriptions

Parameter Description

RET_TYPE Output format can be either in radians or degrees. Options are XF_RADIANS or XF_DEGREES.

•If the XF_RADIANS option is selected, the phase API returns the result in Q4.12 format. The output range is (0, 2 pi)

•If the XF_DEGREES option is selected, the phase API returns the result in Q10.6 format, in degrees, and the output range is (0, 360)

SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

DST_T Output pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

_src_matx First input, x-gradient image.

_src_maty Second input, y-gradient image.

_dst_mat Output, phase computed image.

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 138: phase Function Resource Utilization Summary

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 6 24

DSP48E 6 19

FF 873 2396

LUT 753 3895

CLB 185 832

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 139: phase Function Performance Estimate Summary

Operating Mode Operating Frequency (MHz) Latency Estimate (ms)

1 pixel 300 7.2

8 pixel 150 1.7

Deviation from OpenCV

In the phase implementation, the output is returned in a fixed-point format. If the XF_RADIANS option is selected, the phase API returns the result in Q4.12 format, and the output range is (0, 2 pi). If the XF_DEGREES option is selected, the phase API returns the result in Q10.6 format, in degrees, and the output range is (0, 360).

Harris Corner Detection

To understand Harris Corner Detection, consider a grayscale image I. Sweeping a window w(x,y) over the image, with displacements u in the x-direction and v in the y-direction, the variation of intensity is calculated as:

$$E(u,v) = \sum w(x,y) \left[ I(x+u,y+v) - I(x,y) \right]^2$$

Where:

•w(x,y) is the window position at (x,y)

•I(x,y) is the intensity at (x,y)

•I(x+u,y+v) is the intensity at the moved window (x+u,y+v).

Since we are looking for windows with corners, we are looking for windows with a large variation in intensity. Hence, we have to maximize the equation above, specifically the term:

$$\left[ I(x+u,y+v) - I(x,y) \right]^2$$

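Using Taylor expansion:

$$E(u,v) \approx \sum \left[ I(x,y) + uI_x + vI_y - I(x,y) \right]^2$$

Expanding the equation and cancelling I(x,y) with -I(x,y):

$$E(u,v) \approx \sum \left( u^2 I_x^2 + 2uvI_xI_y + v^2 I_y^2 \right)$$

The above equation can be expressed in a matrix form as:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} \left( \sum w(x,y) \begin{bmatrix} I_x^2 & I_xI_y \\ I_xI_y & I_y^2 \end{bmatrix} \right) \begin{bmatrix} u \\ v \end{bmatrix}$$

So, our equation is now:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$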

A score is calculated for each window, to determine if it can possibly contain a corner:

$$R = \det(M) - k\,(\operatorname{trace}(M))^2$$

Where,

•$\det(M) = \lambda_1 \lambda_2$

•$\operatorname{trace}(M) = \lambda_1 + \lambda_2$
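The score can be computed directly from the three distinct entries of M, without finding the eigenvalues. The sketch below is a floating-point illustration; the struct `StructureTensor` and the function `harrisResponse` are hypothetical names, and the library itself works in fixed point with an integer `k` argument.

```cpp
#include <cassert>

// Windowed sums of gradient products: the three distinct entries of M.
struct StructureTensor {
    double sumIxIx;  // sum of w * Ix^2
    double sumIxIy;  // sum of w * Ix * Iy
    double sumIyIy;  // sum of w * Iy^2
};

// Corner score R = det(M) - k * trace(M)^2
double harrisResponse(const StructureTensor& m, double k) {
    assert(k >= 0.0);
    double det = m.sumIxIx * m.sumIyIy - m.sumIxIy * m.sumIxIy;
    double trace = m.sumIxIx + m.sumIyIy;
    return det - k * trace * trace;
}
```

With strong gradients in both directions (for example sums of 10, 0, 10 and k = 0.04) the score is positive, about 84; with a gradient in only one direction (10, 0, 0) the score is negative, which is the signature of an edge rather than a corner.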

Non-Maximum Suppression:

In non-maximum suppression (NMS), if radius = 1, then the bounding box is 2*r+1 = 3.

In this case, consider a 3x3 neighborhood around the center pixel. If the center pixel is greater than all of the surrounding pixels, then it is considered a corner. The comparison is made with the surrounding pixels that are within the radius.

Radius = 1

x-1, y-1 x-1, y x-1, y+1

x, y-1 x, y x, y+1

x+1, y-1 x+1, y x+1, y+1

Threshold:

Thresholds of 442, 3109, and 566 are used for the 3x3, 5x5, and 7x7 filters respectively. These thresholds were verified over 40 sets of images. The threshold can be varied, based on the application. The corners are marked in the output image. If a corner is found at a particular location, that location is marked with 255; otherwise it is zero.

API Syntax

template<int FILTERSIZE,int BLOCKWIDTH, int NMSRADIUS,int SRC_T,int ROWS,

int COLS,int NPC=1>

void cornerHarris(xf::Mat<SRC_T, ROWS, COLS, NPC> & src,xf::Mat<SRC_T,

ROWS, COLS, NPC> & dst,uint16_t threshold, uint16_t k)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 140: cornerHarris Function Parameter Descriptions

Parameter Description

FILTERSIZE Size of the Sobel filter. 3, 5, and 7 supported.

BLOCKWIDTH Size of the box filter. 3, 5, and 7 supported.

NMSRADIUS Radius considered for non-maximum suppression. Values supported are 1 and 2.

SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)

ROWS Maximum height of input image (must be a multiple of 8)

COLS Maximum width of input image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1

pixel and 8 pixel operations respectively.

src Input image

dst Output image.

threshold Threshold applied to the corner measure.

k Harris detector parameter

Resource Utilization

The following table summarizes the resource utilization of the Harris corner detection in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 3 and NMS_RADIUS = 1

Table 141: Resource Utilization Summary - For Sobel Filter = 3, Box filter=3 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 33 66

DSP48E 10 80

FF 3254 9330

LUT 3522 13222

CLB 731 2568

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 5 and NMS_RADIUS = 1

Table 142: Resource Utilization Summary - Sobel Filter = 3, Box filter=5 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 45 90

DSP48E 10 80

FF 5455 12459

LUT 5675 24594

CLB 1132 4498

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 7 and NMS_RADIUS = 1

Table 143: Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 57 114

DSP48E 10 80

FF 8783 16593

LUT 9157 39813

CLB 1757 6809

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 3 and NMS_RADIUS = 1

Table 144: Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 200 MHz

BRAM_18K 35 70

DSP48E 10 80

FF 4656 11659

LUT 4681 17394

CLB 1005 3277

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 5 and NMS_RADIUS = 1

Table 145: Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 47 94

DSP48E 10 80

FF 6019 14776

LUT 6337 28795

CLB 1353 5102

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 7 and NMS_RADIUS = 1

Table 146: Resource Utilization Summary - Sobel Filter = 5, Box filter=7 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 59 118

DSP48E 10 80

FF 9388 18913

LUT 9414 43070

CLB 1947 7508

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 3 and NMS_RADIUS = 1

Table 147: Resource Utilization Summary - Sobel Filter = 7, Box filter=3 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 37 74

DSP48E 11 88

FF 6002 13880

LUT 6337 25573

CLB 1327 4868

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 5 and NMS_RADIUS = 1

Table 148: Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 49 98

DSP48E 11 88

FF 7410 17049

LUT 8076 36509

CLB 1627 6518

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 7 and NMS_RADIUS = 1

Table 149: Resource Utilization Summary - Sobel Filter = 7, Box filter=7 and

NMS_RADIUS =1

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 61 122

DSP48E 11 88

FF 10714 21137

LUT 11500 51331

CLB 2261 8863

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 3 and NMS_RADIUS = 2

Table 150: Resource Utilization Summary - Sobel Filter = 3, Box filter=3 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 41 82

DSP48E 10 80

FF 5519 10714

LUT 5094 16930

CLB 1076 3127

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 5 and NMS_RADIUS = 2

Table 151: Resource Utilization Summary - Sobel Filter = 3, Box filter=5 and NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 53 106

DSP48E 10 80

FF 6798 13844

LUT 6866 28286

CLB 1383 4965

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 7 and NMS_RADIUS = 2

Table 152: Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 65 130

DSP48E 10 80

FF 10137 17977

LUT 10366 43589

CLB 1940 7440

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 3 and NMS_RADIUS = 2

Table 153: Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 43 86

DSP48E 10 80

FF 5957 12930

LUT 5987 21187

CLB 1244 3922

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 5 and NMS_RADIUS = 2

Table 154: Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 55 110

DSP48E 10 80

FF 5442 16053

LUT 6561 32377

CLB 1374 5871

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 7 and NMS_RADIUS = 2

Table 155: Resource Utilization Summary - Sobel Filter = 5, Box filter=7 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 67 134

DSP48E 10 80

FF 10673 20190

LUT 10793 46785

CLB 2260 8013

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 3 and NMS_RADIUS = 2

Table 156: Resource Utilization Summary - Sobel Filter = 7, Box filter=3 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 45 90

DSP48E 11 88

FF 7341 15161

LUT 7631 29185

CLB 1557 5425

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 5 and NMS_RADIUS = 2

Table 157: Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 57 114

DSP48E 11 88

FF 8763 18330

LUT 9368 40116

CLB 1857 7362

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 7 and NMS_RADIUS = 2

Table 158: Resource Utilization Summary - Sobel Filter = 7, Box filter=7 and

NMS_RADIUS =2

Name

Resource Utilization

1 pixel 8 pixel

300 MHz 150 MHz

BRAM_18K 69 138

DSP48E 11 88

FF 12078 22414

LUT 12831 54652

CLB 2499 9628

Performance Estimate

The following table summarizes a performance estimate of the Harris corner detection in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 159: cornerHarris Function Performance Estimate Summary

Operating Mode Operating Frequency (MHz) Sobel Box NMS Radius Latency (ms)

1 pixel 300 MHz 3 3 1 7

1 pixel 300 MHz 3 5 1 7.1

1 pixel 300 MHz 3 7 1 7.1

1 pixel 300 MHz 5 3 1 7.2

1 pixel 300 MHz 5 5 1 7.2

1 pixel 300 MHz 5 7 1 7.2

1 pixel 300 MHz 7 3 1 7.22

1 pixel 300 MHz 7 5 1 7.22

1 pixel 300 MHz 7 7 1 7.22

8 pixel 150 MHz 3 3 1 1.7

8 pixel 150 MHz 3 5 1 1.7

8 pixel 150 MHz 3 7 1 1.7

8 pixel 150 MHz 5 3 1 1.71

8 pixel 150 MHz 5 5 1 1.71

8 pixel 150 MHz 5 7 1 1.71

8 pixel 150 MHz 7 3 1 1.8

8 pixel 150 MHz 7 5 1 1.8

8 pixel 150 MHz 7 7 1 1.8

1 pixel 300 MHz 3 3 2 7.1

1 pixel 300 MHz 3 5 2 7.1

1 pixel 300 MHz 3 7 2 7.1

1 pixel 300 MHz 5 3 2 7.21

1 pixel 300 MHz 5 5 2 7.21

1 pixel 300 MHz 5 7 2 7.21

1 pixel 300 MHz 7 3 2 7.22

1 pixel 300 MHz 7 5 2 7.22

1 pixel 300 MHz 7 7 2 7.22

8 pixel 150 MHz 3 3 2 1.8

8 pixel 150 MHz 3 5 2 1.8

8 pixel 150 MHz 3 7 2 1.8

8 pixel 150 MHz 5 3 2 1.81

8 pixel 150 MHz 5 5 2 1.81

8 pixel 150 MHz 5 7 2 1.81

8 pixel 150 MHz 7 3 2 1.9

8 pixel 150 MHz 7 5 2 1.91

8 pixel 150 MHz 7 7 2 1.92

Deviation from OpenCV

In xfOpenCV, thresholding and NMS are included, whereas in OpenCV they are not. In xfOpenCV, all the blocks are implemented in fixed point, whereas in OpenCV all the blocks are implemented in floating point.

Histogram Computation

The calcHist function computes the histogram of the given input image.

$$H[src(x,y)] = H[src(x,y)] + 1$$

Where H is the array of 256 elements.
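The update rule above amounts to one counter increment per pixel. A minimal host-side sketch, using a hypothetical helper name `computeHistogram` and a flat pixel buffer instead of `xf::Mat`, could look like this:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Counts how often each 8-bit intensity occurs:
// H[src(x,y)] = H[src(x,y)] + 1 for every pixel.
std::array<uint32_t, 256> computeHistogram(const std::vector<uint8_t>& pixels) {
    std::array<uint32_t, 256> hist{};  // value-initialized to all zeros
    for (uint8_t p : pixels) ++hist[p];
    return hist;
}
```

The 32-bit counters mirror the `uint32_t *histogram` output in the API syntax below.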

API Syntax

template<int SRC_T,int ROWS, int COLS,int NPC=1>

void calcHist(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, uint32_t *histogram)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 160: calcHist Function Parameter Descriptions

Parameter Description

SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle

_src Input image

histogram Output array of 256 elements

Resource Utilization

The following table summarizes the resource utilization of the calcHist function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for 1 pixel mode and at 150 MHz for 8 pixel mode.

Table 161: calcHist Function Resource Utilization Summary

Name

Resource Utilization

Normal Operation (1 pixel) Resource Optimized (8 pixel)

BRAM_18K 2 16

DSP48E 0 0

FF 196 274

LUT 240 912

CLB 57 231

Performance Estimate

The following table summarizes a performance estimate of the calcHist function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for 1 pixel mode and 150 MHz for 8 pixel mode.

Table 162: calcHist Function Performance Estimate Summary

Operating Mode Latency Estimate

Max (ms)

1 pixel 6.9

8 pixel 1.7

Histogram Equalization

The equalizeHist function performs histogram equalization on an input image or video. It improves the contrast in the image by stretching out the intensity range. This function maps one distribution (histogram) to another distribution (a wider and more uniform distribution of intensity values), so the intensities are spread over the whole range.

For histogram H[i], the cumulative distribution H'[i] is given as:

$$H'[i] = \sum_{0 \le j < i} H[j]$$

The intensities in the equalized image are computed as:

$$dst(x,y) = H'(src(x,y))$$
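The two formulas can be combined into a lookup table, as in the sketch below. The function name `equalize` is hypothetical, and scaling the cumulative sum so that the full pixel count maps to 255 is an assumption borrowed from the usual description of histogram equalization; the source formulas do not state the normalization explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Equalizes an 8-bit image stored as a flat buffer.
std::vector<uint8_t> equalize(const std::vector<uint8_t>& src) {
    assert(!src.empty());
    std::array<uint64_t, 256> hist{};
    for (uint8_t p : src) ++hist[p];

    // H'[i] = sum of H[j] for 0 <= j < i, scaled so the total maps to 255
    // (the scaling is an assumption of this sketch).
    std::array<uint8_t, 256> lut{};
    uint64_t cumulative = 0;
    for (int i = 0; i < 256; ++i) {
        lut[i] = static_cast<uint8_t>((cumulative * 255) / src.size());
        cumulative += hist[i];
    }

    // dst(x,y) = H'(src(x,y))
    std::vector<uint8_t> dst(src.size());
    for (std::size_t n = 0; n < src.size(); ++n) dst[n] = lut[src[n]];
    return dst;
}
```

For the four-pixel input {10, 10, 20, 30}, the narrow input range is spread out to {0, 0, 127, 191}.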

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC = 1>

void equalizeHist(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<SRC_T,

ROWS, COLS, NPC> & _src1,xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 163: equalizeHist Function Parameter Descriptions

Parameter Description

SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS Maximum height of input and output image (must be a multiple of 8)

COLS Maximum width of input and output image (must be a multiple of 8)

NPC Number of pixels to be processed per cycle

_src Input image

_src1 Input image

_dst Output image

Resource Utilization

The following table summarizes the resource utilization of the equalizeHist function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for 1 pixel mode and 150 MHz for 8 pixel mode.

Table 164: equalizeHist Function Resource Utilization Summary

Operating Mode Operating Frequency (MHz) Utilization Estimate: BRAM_18K DSP_48Es FF LUT CLB

1 pixel 300 4 5 3492 1807 666

8 pixel 150 25 5 3526 2645 835

Performance Estimate

The following table summarizes a performance estimate of the equalizeHist function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for 1 pixel mode and 150 MHz for 8 pixel mode.

Table 165: equalizeHist Function Performance Estimate Summary

Operating Mode Latency Estimate

Max (ms)

1 pixel per clock operation 13.8

8 pixel per clock operation 3.4

HOG

The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision for the purpose of object detection. The feature descriptors produced by this approach are widely used in pedestrian detection.

The technique counts the occurrences of gradient orientation in localized portions of an image. HOG is computed over a dense grid of uniformly spaced cells and normalized over overlapping blocks, for improved accuracy. The concept behind HOG is that the object appearance and shape within an image can be described by the distribution of intensity gradients or edge direction.

Both RGB and gray inputs are accepted by the function. In the RGB mode, gradients are computed for each plane separately, but the one with the higher magnitude is selected. With the configurations provided, the window dimensions are 64x128 and the block dimensions are 16x16.

API Syntax

template<int WIN_HEIGHT, int WIN_WIDTH, int WIN_STRIDE, int BLOCK_HEIGHT,

int BLOCK_WIDTH, int CELL_HEIGHT, int CELL_WIDTH, int NOB, int DESC_SIZE,

int IMG_COLOR, int OUTPUT_VARIANT, int SRC_T, int DST_T, int ROWS, int

COLS, int NPC = XF_NPPC1>

void HOGDescriptor(xf::Mat<SRC_T, ROWS, COLS, NPC> &_in_mat,

xf::Mat<DST_T, 1, DESC_SIZE, NPC> &_desc_mat);

Parameter Descriptions

The following table describes the template parameters.

Table 166: HOGDescriptor Template Parameter Descriptions

PARAMETERS DESCRIPTION

WIN_HEIGHT The number of pixel rows in the window. This must be a multiple of 8 and should not exceed the

number of image rows.

WIN_WIDTH The number of pixel cols in the window. This must be a multiple of 8 and should not exceed the

number of image columns.

WIN_STRIDE The pixel stride between two adjacent windows. It is fixed at 8.

BLOCK_HEIGHT Height of the block. It is fixed at 16.

BLOCK_WIDTH Width of the block. It is fixed at 16.

CELL_HEIGHT Number of rows in a cell. It is fixed at 8.

CELL_WIDTH Number of cols in a cell. It is fixed at 8.

NOB Number of histogram bins for a cell. It is fixed at 9

DESC_SIZE The size of the output descriptor.

IMG_COLOR The type of the image, set as either XF_GRAY or XF_RGB

OUTPUT_VARIANT Must be either XF_HOG_RB or XF_HOG_NRB

SRC_T Input pixel type. Must be either XF_8UC1 or XF_8UC4, for gray and color respectively.

DST_T Output descriptor type. Must be XF_32UC1.

ROWS Number of rows in the image being processed. (Should be a multiple of 8)

COLS Number of columns in the image being processed. (Should be a multiple of 8)

NPC Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per

cycle operations.

The following table describes the function parameters.

Table 167: HOGDescriptor Function Parameter Descriptions

PARAMETERS DESCRIPTION

_in_mat Input image, of xf::Mat type

_desc_mat Output descriptors, of xf::Mat type

Where,

• NO is normal operation (single pixel processing)

• RB is repetitive blocks (descriptor data are written window wise)

• NRB is non-repetitive blocks (descriptor data are written block wise, in order to reduce the number of writes).

Note: In the RB mode, the block data is written to the memory taking the overlapping windows into consideration. In the NRB mode, the block data is written directly to the output stream without consideration of the window overlap. On the host side, the overlap must be taken care of.

Resource Utilization

The following table shows the resource utilization of the HOGDescriptor function for normal operation (1 pixel) mode, as generated in the Vivado HLS 2018.2 tool for the part Xilinx Xczu9eg-vb1156-1-i-es1 at 300 MHz, to process an image of 1920x1080 resolution.

Table 168: HOGDescriptor Function Resource Utilization Summary

Utilization (at 300 MHz) of 1 pixel operation

Resource NRB-Gray NRB-RGB RB-Gray RB-RGB

BRAM_18K 43 49 171 177

DSP48E 34 46 36 48

FF 15365 15823 15205 15663

LUT 12868 13267 13443 13848

Performance Estimate

The following table shows the performance estimates of the HOGDescriptor() function for different configurations, as generated in the Vivado HLS 2018.2 tool for the part Xilinx Xczu9eg-vb1156-1-i-es1, to process an image of 1920x1080p resolution.

Table 169: HOGDescriptor Function Performance Estimate Summary

Operating Mode Operating Frequency (MHz) Min Latency (ms) Max Latency (ms)

NRB-Gray 300 6.98 8.83

NRB-RGBA 300 6.98 8.83

RB-Gray 300 176.81 177

RB-RGBA 300 176.81 177

Deviations from OpenCV

Listed below are the deviations from OpenCV:

1. Border care

The border care that OpenCV takes in the gradient computation is BORDER_REFLECT_101, in which the border padding is a reflection of the neighboring pixels. In the Xilinx implementation, BORDER_CONSTANT (zero padding) is used for the border care.

2. Gaussian weighing

In OpenCV, Gaussian weights are multiplied on the pixels over the block: a block has 256 pixels, and each position of the block is multiplied by its corresponding Gaussian weight. In the HLS implementation, Gaussian weighing is not performed.

3. Cell-wise interpolation

The magnitude values of the pixels are distributed across different cells in the blocks but on the corresponding bins.

Pixels in region 1 belong only to their corresponding cells, but the pixels in regions 2 and 3 are interpolated to the adjacent 2 cells and 4 cells respectively. This operation is not performed in the HLS implementation.

4. Output handling

The output of OpenCV is in column-major form. In the HLS implementation, the output is in row-major form. Also, the feature vector is in the fixed-point type Q0.16 in the HLS implementation, while in OpenCV it is in floating point.

Limitations

1. The configurations are limited to Dalal's implementation [1].

2. Image height and image width must be a multiple of cell height and cell width respectively.

[1] N. Dalal, B. Triggs: Histograms of oriented gradients for human detection, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005.

Houghlines

The HoughLines function here is equivalent to HoughLines Standard in OpenCV. The HoughLines function is used to detect straight lines in a binary image. To apply the Hough transform, edge detection preprocessing is required. The input to the Hough transform is an edge-detected binary image. For each point (xi,yi) in a binary image, we define a family of lines that go through the point as:

rho = xi cos(theta) + yi sin(theta)

Each pair of (rho,theta) represents a line that passes through the point (xi,yi). These (rho,theta)

pairs of this family of lines passing through the point form a sinusoidal curve in (rho,theta) plane.

If the sinusoids of N dierent points intersect in the (rho,theta) plane, then that intersecon

(rho1, theta1) represents the line that passes through these N points. In the Houghlines funcon,

an accumulator is used to keep the count (also called vong) of all the intersecon points in the

(rho,theta) plane. Aer vong, the funcon lters spurious lines by performing thinning, that is,

checking if the center vote value is greater than the neighborhood votes and threshold, then

making that center vote as valid and other wise making it zero. Finally, the funcon returns the

desired maximum number of lines (LINESMAX) in (rho,theta) form as output.

The design assumes the origin at the center of the image, that is, at (Floor(COLS/2), Floor(ROWS/2)). The ranges of rho and theta are:

theta = [0, pi)

rho = [-DIAG/2, DIAG/2), where DIAG = cvRound{SquareRoot((COLS*COLS) + (ROWS*ROWS))}

For ease of use, the input angles THETA, MINTHETA and MAXTHETA are taken in degrees, while the output theta is in radians. The angle resolution THETA is declared as an integer, but treated as a value in Q6.1 format (that is, THETA=3 signifies that the resolution used in the function is 1.5 degrees). When the output (rho, theta) is used for drawing lines, be aware that the origin is at the center of the image.
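The conventions above (Q6.1 angle resolution, DIAG, and the center origin) can be illustrated with a few host-side helpers. This is a sketch under stated assumptions: the helper names are hypothetical, and std::lround is used in place of cvRound, which can differ for values that fall exactly halfway between two integers.

```cpp
#include <cmath>

// THETA is an integer interpreted as Q6.1: one fractional bit, so the
// resolution in degrees is THETA / 2.
inline double thetaResolutionDeg(int thetaQ6_1) { return thetaQ6_1 / 2.0; }

// DIAG as given in the text: the rounded image diagonal in pixels.
inline int imageDiagonal(int rows, int cols) {
    return static_cast<int>(
        std::lround(std::sqrt(double(cols) * cols + double(rows) * rows)));
}

struct Point2d { double x; double y; };

// The point of the line closest to the origin is (rho*cos(theta), rho*sin(theta)).
// Adding (Floor(COLS/2), Floor(ROWS/2)) expresses it in top-left pixel coordinates.
inline Point2d closestPointTopLeft(double rho, double thetaRad, int rows, int cols) {
    return { rho * std::cos(thetaRad) + cols / 2, rho * std::sin(thetaRad) + rows / 2 };
}
```

A drawing routine would typically extend the line from this point in both directions along (-sin(theta), cos(theta)).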

API Syntax

template<unsigned int RHO, unsigned int THETA, int MAXLINES, int DIAG, int MINTHETA, int MAXTHETA, int SRC_T, int ROWS, int COLS, int NPC>
void HoughLines(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat, float outputrho[MAXLINES], float outputtheta[MAXLINES], short threshold, short linesmax)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 170: Houghlines Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| RHO | Distance resolution of the accumulator in pixels. |
| THETA | Angle resolution of the accumulator in degrees and Q6.1 format. |
| MAXLINES | Maximum number of lines to be detected. |
| MINTHETA | Minimum angle in degrees to check lines. |
| MAXTHETA | Maximum angle in degrees to check lines. |
| DIAG | Diagonal of the image. It should be cvRound(sqrt(rows*rows + cols*cols)/RHO). |
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1). |
| ROWS | Maximum height of input image. |
| COLS | Maximum width of input image. |
| NPC | Number of pixels to be processed per cycle. Only single pixel is supported (XF_NPPC1). |
| _src_mat | Input image. It should be an 8-bit, single-channel binary image. |
| outputrho | Output array of rho values. rho is the distance from the coordinate origin (center of the image). |
| outputtheta | Output array of theta values. theta is the line rotation angle in radians. |
| threshold | Accumulator threshold parameter. Only those lines are returned that get enough votes (>threshold). |
| linesmax | Maximum number of lines. |

Resource Utilization

The table below shows the resource utilization of the kernel for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 to process a grayscale HD (1080x1920) image for 512 lines.

Table 171: Houghlines Function Resource Utilization Summary

| Name | Resource Utilization (THETA=1, RHO=1) |
| --- | --- |
| BRAM_18K | 542 |
| DSP48E | 10 |
| FF | 60648 |
| LUT | 56131 |

Performance Estimate

The following table shows the performance of the kernel for different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 to process a grayscale HD (1080x1920) image for 512 lines.

Table 172: Houghlines Function Performance Estimate Summary

| Operating Mode | Operating Frequency (MHz) | Latency Estimate Max (ms) |
| --- | --- | --- |
| THETA=1, RHO=1 | 300 | 12.5 |

Pyramid Up

The pyrUp function is an image up-sampling algorithm. It first inserts zero rows and zero columns after every input row and column, making up the size of the output image. The output image size is always (2*rows × 2*columns). The zero-padded image is then smoothed using a Gaussian image filter. The Gaussian filter for the pyramid-up function uses a fixed filter kernel as given below:

$$\frac{1}{256}\begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}$$

However, to make up for the pixel intensity that is reduced due to zero padding, each output pixel is multiplied by 4.
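The two steps (zero insertion, then Gaussian smoothing with a gain of 4) can be modeled in software as shown below. This is a reference sketch only, not the HLS implementation: the Image alias, the function names, the rounding, and the mirrored border handling are assumptions, and the hardware kernel may treat borders differently. The model expects a source of at least 2x2 pixels.

```cpp
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<uint8_t>>;

// Mirror an index into [0, n) without repeating the edge sample
// (assumed border policy). Valid for the two-sample overshoot of a
// 5x5 window when n >= 3.
inline int reflect101(int i, int n) {
    if (i < 0) return -i;
    if (i >= n) return 2 * (n - 1) - i;
    return i;
}

Image pyrUpModel(const Image& src) {
    static const int k[5] = {1, 4, 6, 4, 1};
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    const int outRows = 2 * rows, outCols = 2 * cols;

    // Step 1: zero insertion. Source pixels land on even rows and even columns.
    std::vector<std::vector<int>> up(outRows, std::vector<int>(outCols, 0));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) up[2 * r][2 * c] = src[r][c];

    // Step 2: 5x5 Gaussian (weights sum to 256), then multiply by 4.
    Image dst(outRows, std::vector<uint8_t>(outCols, 0));
    for (int r = 0; r < outRows; ++r)
        for (int c = 0; c < outCols; ++c) {
            int sum = 0;
            for (int i = -2; i <= 2; ++i)
                for (int j = -2; j <= 2; ++j)
                    sum += k[i + 2] * k[j + 2] *
                           up[reflect101(r + i, outRows)][reflect101(c + j, outCols)];
            dst[r][c] = static_cast<uint8_t>((4 * sum + 128) >> 8);  // x4 gain, /256, rounded
        }
    return dst;
}
```

Only one in four samples of the zero-inserted image is non-zero, so the kernel weights that land on real pixels always add up to 64 out of 256; the factor of 4 restores the original intensity, which is why a constant image stays constant in this model.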

API Syntax

template<int TYPE, int ROWS, int COLS, int NPC>
void pyrUp(xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 173: pyrUp Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| TYPE | Pixel type. XF_8UC1 is the only supported pixel type. |
| ROWS | Maximum height or number of output rows to build the hardware for this kernel. |
| COLS | Maximum width or number of output columns to build the hardware for this kernel. |
| NPC | Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle processing (XF_NPPC1). |
| _src | Input image stream. |
| _dst | Output image stream. |

Resource Utilization

The following table summarizes the resource utilization of pyrUp for 1 pixel per cycle implementation, for a maximum input image size of 1920x1080 pixels. The results are after synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz.

Table 174: pyrUp Function Resource Utilization Summary

| Operating Mode | Operating Frequency (MHz) | LUTs | FFs | DSPs | BRAMs |
| --- | --- | --- | --- | --- | --- |
| 1 Pixel | 300 | 1124 | 1199 | 0 | 10 |

Performance Estimate

The following table summarizes performance estimates of the pyrUp function in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA.

Table 175: pyrUp Function Performance Estimate Summary

| Operating Mode | Operating Frequency (MHz) | Input Image Size | Latency Estimate Max (ms) |
| --- | --- | --- | --- |
| 1 pixel | 300 | 1920x1080 | 27.82 |

Pyramid Down

The pyrDown function is an image down-sampling algorithm which smooths the image before down-scaling it. The image is smoothed using a Gaussian filter with the following kernel:

$$\frac{1}{256}\begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}$$

Down-scaling is performed by dropping pixels in the even rows and the even columns. The resulting image size is

$$\left(\frac{rows + 1}{2} \times \frac{columns + 1}{2}\right)$$
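A software model of the smoothing and decimation can look like the sketch below. Several points are assumptions rather than statements about the HLS kernel: the Image alias and function names are hypothetical, borders are handled by replicating the edge pixel, and "even rows and columns" is read as counting from 1, so that the 0-based rows and columns 0, 2, 4, ... are kept. That reading matches the (rows + 1)/2 by (columns + 1)/2 output size when integer division is used.

```cpp
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<uint8_t>>;

// Clamp an index to [0, n-1] (assumed border policy: replicate the edge).
inline int clampIndex(int i, int n) { return i < 0 ? 0 : (i >= n ? n - 1 : i); }

Image pyrDownModel(const Image& src) {
    static const int k[5] = {1, 4, 6, 4, 1};
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    const int outRows = (rows + 1) / 2, outCols = (cols + 1) / 2;

    Image dst(outRows, std::vector<uint8_t>(outCols, 0));
    for (int r = 0; r < outRows; ++r)
        for (int c = 0; c < outCols; ++c) {
            // Filter only at the positions that survive decimation
            // (0-based rows/columns 0, 2, 4, ...).
            int sum = 0;
            for (int i = -2; i <= 2; ++i)
                for (int j = -2; j <= 2; ++j)
                    sum += k[i + 2] * k[j + 2] *
                           src[clampIndex(2 * r + i, rows)][clampIndex(2 * c + j, cols)];
            dst[r][c] = static_cast<uint8_t>((sum + 128) >> 8);  // /256 with rounding
        }
    return dst;
}
```

Evaluating the filter only at the retained positions gives the same result as filtering the full image and then discarding samples, with a quarter of the arithmetic.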

API Syntax

template<int TYPE, int ROWS, int COLS, int NPC>
void pyrDown(xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 176: pyrDown Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| TYPE | Pixel type. XF_8UC1 is the only supported pixel type. |
| ROWS | Maximum height or number of input rows to build the hardware for this kernel. |
| COLS | Maximum width or number of input columns to build the hardware for this kernel. |
| NPC | Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle processing (XF_NPPC1). |
| _src | Input image stream. |
| _dst | Output image stream. |

Resource Utilization

The following table summarizes the resource utilization of pyrDown for 1 pixel per cycle implementation, for a maximum input image size of 1920x1080 pixels. The results are after synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz.

Table 177: pyrDown Function Resource Utilization Summary

| Operating Mode | Operating Frequency (MHz) | LUTs | FFs | DSPs | BRAMs |
| --- | --- | --- | --- | --- | --- |
| 1 Pixel | 300 | 1171 | 1238 | 1 | 5 |

Performance Estimate

The following table summarizes performance estimates of the pyrDown function in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA.

Table 178: pyrDown Function Performance Estimate Summary

| Operating Mode | Operating Frequency (MHz) | Input Image Size | Latency Estimate Max (ms) |
| --- | --- | --- | --- |
| 1 pixel | 300 | 1920x1080 | 6.99 |

InitUndistortRectifyMapInverse

The InitUndistortRectifyMapInverse function generates mapx and mapy, based on a set of camera parameters, where mapx and mapy are inputs for the xf::remap function. That is, for each pixel at location (u, v) in the destination (corrected and rectified) image, the function computes the corresponding coordinates in the source image (the original image from the camera). The InitUndistortRectifyMapInverse module is optimized for hardware, so the inverse of the rotation matrix is computed outside the synthesizable logic. Note that the inputs are fixed point, so the floating-point camera parameters must be cast to Q12.20 format.

API Syntax

template<int CM_SIZE, int DC_SIZE, int MAP_T, int ROWS, int COLS, int NPC>
void InitUndistortRectifyMapInverse(ap_fixed<32,12> *cameraMatrix, ap_fixed<32,12> *distCoeffs, ap_fixed<32,12> *ir, xf::Mat<MAP_T, ROWS, COLS, NPC> &_mapx_mat, xf::Mat<MAP_T, ROWS, COLS, NPC> &_mapy_mat, int _cm_size, int _dc_size)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 179: InitUndistortRectifyMapInverse Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| CM_SIZE | Must be set at compile time; 9 for a 3x3 matrix. |
| DC_SIZE | Must be set at compile time; must be 4, 5 or 8. |
| MAP_T | Type of the output maps; must be XF_32FC1. |
| ROWS | Maximum image height, necessary to generate the output maps. |
| COLS | Maximum image width, necessary to generate the output maps. |
| NPC | Number of pixels per cycle. This function supports only one pixel per cycle, so set to XF_NPPC1. |
| cameraMatrix | The input matrix representing the camera in the old coordinate system. |
| distCoeffs | The input distortion coefficients (k1,k2,p1,p2[,k3[,k4,k5,k6]]). |
| ir | The input transformation matrix, equal to Invert(newCameraMatrix*R), where newCameraMatrix represents the camera in the new coordinate system and R is the rotation matrix. This processing is done outside the synthesizable block. |
| _mapx_mat | Output mat object containing the mapx. |
| _mapy_mat | Output mat object containing the mapy. |
| _cm_size | 9 for a 3x3 matrix. |
| _dc_size | 4, 5 or 8. If this is 0, then it means there is no distortion. |

Integral Image

The integral function computes an integral image of the input. Each output pixel is the sum of all pixels above and to the left of itself.

$$dst(x,y) = sum(x,y) = src(x,y) + sum(x-1,y) + sum(x,y-1) - sum(x-1,y-1)$$

Here src(x,y) is the input pixel, and sum terms that fall outside the image are taken as zero.
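The recurrence can be checked against a small software model. The sketch below reads the first term on the right-hand side as the input pixel src(x,y), treats out-of-range neighbors as zero, and includes the current pixel in its own sum; these are assumptions about the intended convention, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// sum(x,y) = src(x,y) + sum(x-1,y) + sum(x,y-1) - sum(x-1,y-1)
std::vector<std::vector<uint32_t>> integralModel(
        const std::vector<std::vector<uint8_t>>& src) {
    const std::size_t rows = src.size(), cols = src[0].size();
    std::vector<std::vector<uint32_t>> sum(rows, std::vector<uint32_t>(cols, 0));
    for (std::size_t y = 0; y < rows; ++y)
        for (std::size_t x = 0; x < cols; ++x) {
            const uint32_t left = x > 0 ? sum[y][x - 1] : 0;
            const uint32_t up = y > 0 ? sum[y - 1][x] : 0;
            const uint32_t diag = (x > 0 && y > 0) ? sum[y - 1][x - 1] : 0;
            sum[y][x] = src[y][x] + left + up - diag;
        }
    return sum;
}
```

For a 1920x1080 image of 8-bit pixels the largest possible sum is 255 * 1920 * 1080, which fits comfortably in the 32-bit unsigned output type.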

API Syntax

template<int SRC_TYPE, int DST_TYPE, int ROWS, int COLS, int NPC=1>
void integral(xf::Mat<SRC_TYPE, ROWS, COLS, NPC> & _src_mat, xf::Mat<DST_TYPE, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 180: integral Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| SRC_TYPE | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1). |
| DST_TYPE | Output pixel type. Only 32-bit, unsigned, 1 channel is supported (XF_32UC1). |
| ROWS | Maximum height of input and output image (must be a multiple of 8). |
| COLS | Maximum width of input and output image (must be a multiple of 8). |
| NPC | Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per cycle operations. |
| _src_mat | Input image. |
| _dst_mat | Output image. |

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 181: integral Function Resource Utilization Summary

| Name | Resource Utilization (1 pixel, 300 MHz) |
| --- | --- |
| BRAM_18K | 4 |
| DSP48E | 0 |
| FF | 613 |
| LUT | 378 |
| CLB | 102 |

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 182: integral Function Performance Estimate Summary

| Operating Mode | Operating Frequency (MHz) | Latency Estimate (in ms) |
| --- | --- | --- |
| 1 pixel | 300 | 7.2 |

Dense Pyramidal LK Optical Flow

Optical flow is the pattern of apparent motion of image objects between two consecutive frames, caused by the movement of the object or the camera. It is a 2D vector field, where each vector is a displacement vector showing the movement of points from the first frame to the second.

Optical Flow works on the following assumptions:

• Pixel intensities of an object do not have too many variations in consecutive frames

• Neighboring pixels have similar motion

Consider a pixel I(x, y, t) in the first frame. (Note that a new dimension, time, is added here. When working with images only, there is no need for time.) The pixel moves by distance (dx, dy) in the next frame taken after time dt. Thus, since those pixels are the same and the intensity does not change, the following is true:

$$I(x, y, t) = I(x + dx, y + dy, t + dt)$$

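Taking the Taylor series approximation on the right-hand side, removing common terms, and dividing by dt gives the following equation:

$$f_x u + f_y v + f_t = 0$$

Where

$$f_x = \frac{\partial f}{\partial x}, \quad f_y = \frac{\partial f}{\partial y}, \quad u = \frac{dx}{dt}, \quad v = \frac{dy}{dt}$$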

The above equation is called the Optical Flow equation, where f_x and f_y are the image gradients and f_t is the gradient along time. However, (u, v) is unknown, and a single equation cannot be solved for two unknown variables. Thus, several methods are provided to solve this problem. One method is Lucas-Kanade. Previously it was assumed that all neighboring pixels have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can be defined through the 'WINSIZE' template parameter. Thus, all the points in that patch have the same motion. It is possible to find (f_x, f_y, f_t) for these points. Thus, the problem now becomes solving 'WINSIZE * WINSIZE' equations with two unknown variables, which is over-determined. A better solution is obtained with the "least square fit" method. Below is the final solution, which is a problem with two equations and two unknowns:

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\ \sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum_i f_{x_i} f_{t_i} \\ -\sum_i f_{y_i} f_{t_i} \end{bmatrix}$$

This solution fails when a large motion is involved, so pyramids are used. Going up in the pyramid, small motions are removed and large motions become small motions, so by applying Lucas-Kanade, the optical flow along with the scale is obtained.
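For a single window, the least-squares solution amounts to accumulating five sums and inverting a 2x2 matrix. The following floating-point sketch illustrates that step only; the Flow struct, the function name, and the singularity threshold are assumptions, and the hardware kernel uses fixed-point arithmetic and pyramid handling that are not modeled here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Flow { double u; double v; bool valid; };

// fx, fy, ft hold the spatial and temporal gradients of every pixel in the
// window (WINSIZE * WINSIZE entries each).
Flow solveLucasKanade(const std::vector<double>& fx, const std::vector<double>& fy,
                      const std::vector<double>& ft) {
    double sxx = 0, sxy = 0, syy = 0, sxt = 0, syt = 0;
    for (std::size_t i = 0; i < fx.size(); ++i) {
        sxx += fx[i] * fx[i];
        sxy += fx[i] * fy[i];
        syy += fy[i] * fy[i];
        sxt += fx[i] * ft[i];
        syt += fy[i] * ft[i];
    }
    const double det = sxx * syy - sxy * sxy;
    if (std::fabs(det) < 1e-12) return {0.0, 0.0, false};  // no unique solution
    // [u v]^T = inverse([sxx sxy; sxy syy]) * [-sxt -syt]^T
    return {(-syy * sxt + sxy * syt) / det, (sxy * sxt - sxx * syt) / det, true};
}
```

The determinant check corresponds to windows without enough texture in two directions (for example, a flat region or a single straight edge), where the flow cannot be determined uniquely.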

API Syntax

template<int NUM_PYR_LEVELS, int NUM_LINES, int WINSIZE, int FLOW_WIDTH, int FLOW_INT, int TYPE, int ROWS, int COLS, int NPC>
void densePyrOpticalFlow(
    xf::Mat<TYPE,ROWS,COLS,NPC> & _current_img,
    xf::Mat<TYPE,ROWS,COLS,NPC> & _next_image,
    xf::Mat<XF_32UC1,ROWS,COLS,NPC> & _streamFlowin,
    xf::Mat<XF_32UC1,ROWS,COLS,NPC> & _streamFlowout,
    const int level, const unsigned char scale_up_flag, float scale_in,
    ap_uint<1> init_flag)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 183: densePyrOpticalFlow Function Parameter Descriptions

| Parameter | Description |
| --- | --- |
| NUM_PYR_LEVELS | Number of image pyramid levels used for the optical flow computation. |
| NUM_LINES | Number of lines to buffer for the remap algorithm, which is used to find the temporal gradient. |
| WINSIZE | Window size over which optical flow is computed. |
| FLOW_WIDTH, FLOW_INT | Data width and number of integer bits to define the signed flow vector data type. The integer bits include the sign bit. The default type is a 16-bit signed word with 10 integer bits and 6 fractional bits. |
| TYPE | Pixel type of the input image. XF_8UC1 is the only supported value. |
| ROWS | Maximum height or number of rows to build the hardware for this kernel. |
| COLS | Maximum width or number of columns to build the hardware for this kernel. |
| NPC | Number of pixels the hardware kernel must process per clock cycle. Only XF_NPPC1, 1 pixel per cycle, is supported. |
| _current_img | First input image stream. |
| _next_image | Second input image, for which the optical flow is computed with respect to the first image. |
| _streamFlowin | 32-bit packed U and V flow vectors input for optical flow. Bits 31-16 represent the flow vector U, while bits 15-0 represent the flow vector V. |
| _streamFlowout | 32-bit packed U and V flow vectors output after optical flow computation. Bits 31-16 represent the flow vector U, while bits 15-0 represent the flow vector V. |
| level | Image pyramid level at which the algorithm is currently computing the optical flow. |
| scale_up_flag | Flag to enable the scaling-up of the flow vectors. This flag is set at the host when switching from one image pyramid level to the other. |
| scale_in | Floating-point scale-up factor for scaling up the flow vectors. The value is (previous_rows-1)/(current_rows-1). This is not 1 when switching from one image pyramid level to the other. |
| init_flag | Flag to initialize flow vectors to 0 in the first iteration of the highest pyramid level. This flag must be set in the first iteration of the highest pyramid level (smallest image in the pyramid). The flag must be unset for all the other iterations. |
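The packed flow format and the scale_in formula from the table can be illustrated with a few host-side helpers. This sketch assumes the default flow type (16 bits, 10 integer bits including the sign, 6 fractional bits) and in-range values; the helper names are hypothetical and are not part of the xfOpenCV API.

```cpp
#include <cmath>
#include <cstdint>

// Default flow format: 16-bit signed with 6 fractional bits, so 1.0 is stored as 64.
inline int16_t toFlowFixed(float v) { return static_cast<int16_t>(std::lround(v * 64.0f)); }
inline float fromFlowFixed(int16_t raw) { return raw / 64.0f; }

// U occupies bits 31-16, V occupies bits 15-0.
inline uint32_t packFlow(float u, float v) {
    const uint32_t hi = static_cast<uint16_t>(toFlowFixed(u));
    const uint32_t lo = static_cast<uint16_t>(toFlowFixed(v));
    return (hi << 16) | lo;
}

inline void unpackFlow(uint32_t packed, float& u, float& v) {
    u = fromFlowFixed(static_cast<int16_t>(packed >> 16));
    v = fromFlowFixed(static_cast<int16_t>(packed & 0xFFFFu));
}

// scale_in as described in the table: (previous_rows - 1) / (current_rows - 1).
inline float scaleIn(int previousRows, int currentRows) {
    return static_cast<float>(previousRows - 1) / static_cast<float>(currentRows - 1);
}
```

With 6 fractional bits the flow resolution is 1/64 of a pixel, and the 10 integer bits cover displacements in the range [-512, 512).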

Resource Utilization

The following table summarizes the resource utilization of densePyrOpticalFlow for 1 pixel per cycle implementation, with the optical flow computed for a window size of 11 over an image size of 1920x1080 pixels. The results are after implementation in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-2L-e FPGA at 300 MHz.

Table 184: densePyrOpticalFlow Function Resource Utilization Summary

| Operating Mode | Operating Frequency (MHz) | LUTs | FFs | DSPs | BRAMs |
| --- | --- | --- | --- | --- | --- |
| 1 Pixel | 300 | 32231 | 16596 | 52 | 215 |

Performance Estimate

The following table summarizes performance figures on hardware for the densePyrOpticalFlow function for 5 iterations over 5 pyramid levels scaled down by a factor of two at each level. This has been tested on the zcu102 evaluation board.

Table 185: densePyrOpticalFlow Function Performance Estimate Summary

| Operating Mode | Operating Frequency (MHz) | Image Size | Latency Estimate Max (ms) |
| --- | --- | --- | --- |
| 1 pixel | 300 | 1920x1080 | 49.7 |
| 1 pixel | 300 | 1280x720 | 22.9 |
| 1 pixel | 300 | 1226x370 | 12.02 |

Dense Non-Pyramidal LK Optical Flow