Xilinx OpenCV User Guide Ug1233

ug1233-xilinx-opencv-user-guide

ug1233-xilinx-opencv-user-guide

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 189

DownloadXilinx OpenCV User Guide Ug1233-xilinx-opencv-user-guide
Open PDF In BrowserView PDF
Xilinx OpenCV User Guide
UG1233 (v2018.2) July 2, 2018

Revision History

Revision History
The following table shows the revision history for this document.
Section

Revision Summary
07/02/2018 Version 2018.2

Houghlines

Updated the function description and its respective tables

xfOpenCV Library Functions

Added a note to the xfOpenCV Library Functions table
06/06/2018 Version 2018.2

Houghlines

Added a new function

Semi Global Method for Stereo Disparity Estimation

Added a new function

04/04/2018 Version 2018.1
General Updates

Minor editorial updates for 2018.1 release

InitUndistortRectifyMapInverse

Added a new function

cornersImgToList()

Added a new function

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
2

Send Feedback

Table of Contents
Revision History...............................................................................................................2
Chapter 1: Overview......................................................................................................4
Basic Features.............................................................................................................................. 4
xfOpenCV Kernel on the reVISION Platform............................................................................5

Chapter 2: Getting Started........................................................................................ 7
Prerequisites................................................................................................................................ 7
xfOpenCV Library Contents........................................................................................................7
Using the xfOpenCV Library.......................................................................................................8
Changing the Hardware Kernel Configuration......................................................................21
Using the xfOpenCV Library Functions on Hardware...........................................................21

Chapter 3: xfOpenCV Library API Reference................................................. 25
xf::Mat Image Container Class................................................................................................ 25
xfOpenCV Library Functions.................................................................................................... 33

Chapter 4: Design Examples Using xfOpenCV Library........................... 172
Iterative Pyramidal Dense Optical Flow............................................................................... 172
Corner Tracking Using Sparse Optical Flow.........................................................................177
Color Detection........................................................................................................................182
Difference of Gaussian Filter................................................................................................. 183
Stereo Vision Pipeline............................................................................................................. 185

Appendix A: Additional Resources and Legal Notices........................... 187
Xilinx Resources.......................................................................................................................187
Documentation Navigator and Design Hubs...................................................................... 187
References................................................................................................................................188
Please Read: Important Legal Notices................................................................................. 189

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
3

Send Feedback

Chapter 1: Overview

Chapter 1

Overview
This document describes the FPGA device optimized OpenCV library, called the Xilinx®
xfOpenCV library and is intended for application developers using Zynq®-7000 SoC and Zynq
UltraScale+ MPSoC devices. xfOpenCV library has been designed to work in the SDx™
development environment, and provides a software interface for computer vision functions
accelerated on an FPGA device. xfOpenCV library functions are mostly similar in functionality to
their OpenCV equivalent. Any deviations, if present, are documented.
Note: For more information on the xfOpenCV library prerequisites, see the Prerequisites. To familiarize
yourself with the steps required to use the xfOpenCV library functions, see the Using the xfOpenCV
Library.

Basic Features
All xfOpenCV library functions follow a common format. The following properties hold true for
all the functions.
• All the functions are designed as templates and all arguments that are images, must be
provided as xf::Mat.
• All functions are defined in the xf namespace.
• Some of the major template arguments are:
○

Maximum size of the image to be processed

○

Datatype defining the properties of each pixel

○

Number of pixels to be processed per clock cycle

○

Other compile-time arguments relevent to the functionality.

The xfOpenCV library contains enumerated datatypes which enables you to configure xf::Mat.
For more details on xf::Mat, see the xf::Mat Image Container Class.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
4

Send Feedback

Chapter 1: Overview

xfOpenCV Kernel on the reVISION Platform
The xfOpenCV library is designed to be used with the SDx™ development environment.
xfOpenCV kernels are evaluated on the reVISION™ platform.
The following steps describe the general flow of an example design, where both the input and
the output are image files.
1. Read the image using cv::imread().
2. Copy the data to xf::Mat.
3. Call the processing function(s) in xfOpenCV.
4. Copy the data from xf::Mat to cv::Mat.
5. Write the output to image using cv::imwrite().
The entire code is written as the host code for the pipeline , from which all the calls to xfOpenCV
functions are moved to hardware. Functions from OpenCV are used to read and write images in
the memory. The image containers for xfOpenCV library functions are xf::Mat objects. For
more information, see the xf::Mat Image Container Class.
The reVISION platform supports both live and file input-output (I/O) modes. For more details, see
the reVISION Getting Started Guide.
• File I/O mode enables the controller to transfer images from SD Card to the hardware kernel.
The following steps describe the file I/O mode.
1. Processing system (PS) reads the image frame from the SD Card and stores it in the
DRAM.
2. The xfOpenCV kernel reads the image from the DRAM, processes it and stores the output
back in the DRAM memory.
3. The PS reads the output image frame from the DRAM and writes it back to the SD Card.
• Live I/O mode enables streaming frames into the platform, processing frames with the
xfOpenCV kernel, and streaming out the frames through the appropriate interface. The
following steps describe the live I/O mode.
1. Video capture IPs receive a frame and store it in the DRAM.
2. The xfOpenCV kernel fetches the image from the DRAM, processes the image, and stores
the output in the DRAM.
3. Display IPs read the output frame from the DRAM and transmits the frame through the
appropriate display interface.
Following figure shows the reVISION platform with the xfOpenCV kernel block:

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
5

Send Feedback

Chapter 1: Overview

Figure 1: xfOpenCV Kernel on the reVISION Platform

Processing System (PS)
ARM Core

Central Interconnect

HPM/GP Ports

HDMI TX and RX
IPs

DDR
Controller

HP Ports

Data
Movers
AXIS

AXI
Interconnects
AXIMM

xfOpenCV Kernel Interface

xfOpenCV Kernel
Programmable Logic (PL)

Note: For more information on the PS-PL interfaces and PL-DDR interfaces, see the Zynq UltraScale+
Device Technical Reference Manual (UG1085).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
6

Send Feedback

Chapter 2

Getting Started
This chapter provides the information you need to bring up your design using the xfOpenCV
library functions.

Prerequisites
This section lists the prerequisites for using the xfOpenCV library functions on ZCU102 based
platforms. The methodology holds true for ZC702 and ZC706 reVISION platforms as well.
• Download and install the SDx development environment according to the directions provided
in SDSoC Environments Release Notes, Installation, and Licensing Guide (UG1294). Before
launching the SDx development environment on Linux, set the $SYSROOT environment
variable to point to the Linux root file system, delivered with the reVISION platform. For
example:
export SYSROOT = /zcu102_[es2_]rv_ss/sw/aarch64-linux-gnu/
sysroot

• Download the Zynq® UltraScale+™ MPSoC Embedded Vision Platform zip file and extract its
contents. Create the SDx development environment workspace in the
zcu102_[es2_]rv_ss folder of the extracted design file hierarchy. For more details, see the
reVISION Getting Started Guide.
• Set up the ZCU102 evaluation board. For more details, see the reVISION Getting Started
Guide.
• Download the xfOpenCV library. This library is made available through github. Run the
following git clone command to clone the xfOpenCV repository to your local disk:
git clone https://github.com/Xilinx/xfopencv.git

xfOpenCV Library Contents
The following table lists the contents of the xfOpenCV library.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
7

Send Feedback

Chapter 2: Getting Started

Table 1: xfOpenCV Library Contents
Folder

Details

include

Contains the header files required by the library.

include/common

Contains the common library infrastructure headers, such
as types specific to the library.

include/core

Contains the core library functionality headers, such as the
math functions.

include/features

Contains the feature extraction kernel function definitions.
For example, Harris.

include/imgproc

Contains all the kernel function definitions, except the ones
available in the features folder.

examples

Contains the sample test bench code to facilitate running
unit tests. The examples/ folder contains the folde rs
with algorithm names. Each algorithm folder contains host
files, .json file, and data folder. For more details on
how to use the xfOpenCV library, see xfOpenCV Kernel on
the reVISION Platform.

Using the xfOpenCV Library
This section describes using the xfOpenCV library in the SDx development environment.
Note: The instructions in this section assume that you have downloaded and installed all the required
packages. For more information, see the Prerequisites.

The xfOpenCV library is structured as shown in the following table. The include folder
constitutes all the necessary components to build a Computer Vision or Image Processing
pipeline using the library. The folders common and core contain the infrastructure that the
library functions need for basic functions, Mat class, and macros. The library functions are
categorized into two folders, features and imgproc based on the operation they perform.
The names of the folders are self-explanatory.
To work with the library functions, you need to include the path to the include folder in the
SDx project. You can include relevant header files for the library functions you will be working
with after you source the include folder’s path to the compiler. For example, if you would like
to work with Harris Corner Detector and Bilateral Filter, you must use the following lines in the
host code:
#include “features/xf_harris.hpp” //for Harris Corner Detector
#include “imgproc/xf_bilateral_filter.hpp” //for Bilateral Filter

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
8

Send Feedback

Chapter 2: Getting Started

After the headers are included, you can work with the library functions as described in the
Chapter 3: xfOpenCV Library API Reference using the examples in the examples folder as
reference.
The following table gives the name of the header file, including the folder name, which contains
the library function.
Table 2: xfOpenCV Library Contents
File Path in the include folder

Function Name
xf::accumulate

imgproc/xf_accumulate_image.hpp

xf::accumulateSquare

imgproc/xf_accumulate_squared.hpp

xf::accumulateWeighted

imgproc/xf_accumulate_weighted.hpp

xf::absdiff, xf::add, xf::subtract, xf::bitwise_and,
xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor

core/xf_arithm.hpp

xf::bilateralFilter

imgproc/xf_histogram.hpp

xf::boxFilter

imgproc/xf_box_filter.hpp

xf::Canny

imgproc/xf_canny.hpp

xf::merge

imgproc/xf_channel_combine.hpp

xf::extractChannel

imgproc/xf_channel_extract.hpp

xf::convertTo

imgproc/xf_convert_bitdepth.hpp

xf::filter2D

imgproc/xf_custom_convolution.hpp

xf::nv122iyuv, xf::nv122rgba, xf::nv122yuv4, xf::nv212iyuv,
xf::nv212rgba, xf::nv212yuv4, xf::rgba2yuv4, xf::rgba2iyuv,
xf::rgba2nv12, xf::rgba2nv21, xf::uyvy2iyuv, xf::uyvy2nv12,
xf::uyvy2rgba, xf::yuyv2iyuv, xf::yuyv2nv12, xf::yuyv2rgba

imgproc/xf_cvt_color.hpp

xf::dilate

imgproc/xf_dilation.hpp

xf::erode

imgproc/xf_erosion.hpp

xf::fast

features/xf_fast.hpp

xf::GaussianBlur

imgproc/xf_gaussian_filter.hpp

xf::cornerHarris

features/xf_harris.hpp

xf::calcHist

imgproc/xf_histogram.hpp

xf::equalizeHist

imgproc/xf_hist_equalize.hpp

xf::HOGDescriptor

imgproc/xf_hog_descriptor.hpp

xf::Houghlines

imgproc/xf_houghlines.hpp

xf::integralImage

imgproc/xf_integral_image.hpp

xf::densePyrOpticalFlow

imgproc/xf_pyr_dense_optical_flow.hpp

xf::DenseNonPyrOpticalFlow

imgproc/xf_dense_npyr_optical_flow.hpp

xf::LUT

imgproc/xf_lut.hpp

xf::magnitude

core/xf_magnitude.hpp

xf::MeanShift

imgproc/xf_mean_shift.hpp

xf::meanStdDev

core/xf_mean_stddev.hpp

xf::medianBlur

imgproc/xf_median_blur.hpp

xf::minMaxLoc

core/xf_min_max_loc.hpp

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
9

Send Feedback

Chapter 2: Getting Started

Table 2: xfOpenCV Library Contents (cont'd)
File Path in the include folder

Function Name
xf::OtsuThreshold

imgproc/xf_otsuthreshold.hpp

xf::phase

core/xf_phase.hpp

xf::pyrDown

imgproc/xf_pyr_down.hpp

xf::pyrUp

imgproc/xf_pyr_up.hpp

xf::remap

imgproc/xf_remap.hpp

xf::resize

imgproc/xf_resize.hpp

xf::Scharr

imgproc/xf_scharr.hpp

xf::SemiGlobalBM

imgproc/xf_sgbm.hpp

xf::Sobel

imgproc/xf_sobel.hpp

xf::StereoPipeline

imgproc/xf_stereo_pipeline.hpp

xf::StereoBM

imgproc/xf_stereoBM.hpp

xf::SVM

imgproc/xf_svm.hpp

xf::Threshold

imgproc/xf_threshold.hpp

xf::warpAffine

imgproc/xf_warpaffine.hpp

xf::warpPerspective

imgproc/xf_warpperspective.hpp

xf::warpTransform

imgproc/xf_warp_transform.hpp

The different ways to use the xfOpenCV library examples are listed below:
• Downloading and Using xfOpenCV Libraries from SDx GUI
• Building a Project Using the Example Makefiles on Linux
• Using reVISION Samples on the reVISION Platform
• Using the xfOpenCV Library on a non-reVISION Platform

Downloading and Using xfOpenCV Libraries from
SDx GUI
You can download xfOpenCV directly from SDx GUI. To build a project using the example
makefiles on the Linux platform:
1. From SDx IDE, click Xilinx and select SDx Libraries.
2. Click Download next to the Xilinx xfOpenCV Library.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
10

Send Feedback

Chapter 2: Getting Started

Figure 2: SDx Libraries

The library is downloaded into /Xilinx/SDx/2018.2/xfopencv.
After the library is downloaded, the entire set of examples in the library are available in the
list of templates while creating a new project.
Note: The library can be added to any project from the IDE menu options.

3. To add a library to a project, from SDx IDE, click Xilinx and select SDx Libraries.
4. Select Xilinx xfOpenCV Library and click Add to project. The dropdown menu consists of
options of which project the libraries need to be included to.
All the headers as part of the include/ folder in xfOpenCV library would be copied into the
local project directory as /libs/xfopencv/include. All the settings
required for the libraries to be run are also set when this action is completed.

Building a Project Using the Example Makefiles on
Linux
Use the following steps to build a project using the example makefiles on the Linux platform:
1. Open a terminal.
2. Set the environment variable SYSROOT to the /sw/
aarch64-linux-gnu/sysroot folder.
3. Change the platform variable to point to the downloaded platform folder in makefile. Ensure
that the folder name of the downloaded platform is unchanged.
4. Change the directory to the location where you want to build the example.
cd 

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
11

Send Feedback

Chapter 2: Getting Started

5. Set the environment variables to run SDx development environment.
• For c shell:
source /settings.csh

• For bash shell:
source /settings.sh

6. Type the make command in the terminal. The sd_card folder is created and can be found in
the  folder.

Using reVISION Samples on the reVISION Platform
Use the following steps to run a unit test for bilateral filter on zcu102_es2_reVISION:
1. Launch the SDx development environment using the desktop icon or the Start menu.
The Workspace Launcher dialog appears.
2. Click Browse to enter a workspace folder used to store your projects (you can use workspace
folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.
Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set
the $SYSROOT environment variable. This is usually the file path to the Linux root file system.

The SDx development environment window opens with the Welcome tab visible when you
create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized
if you do not wish to use it.
3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.
The New Project dialog box opens.
4. Specify the name of the project. For example Bilateral.
5. Click Next.
The the Choose Hardware Platform page appears.
6. From the Choose Hardware Platform page, click the Add Custom Platform button.
7. Browse to the directory where you extracted the reVISION platform files. Ensure that you
select the zcu102_es2_reVISION folder.
8. From the Choose Hardware Platform page, select zcu102_es2_reVISION (custom).
9. Click Next.
The Templates page appears, containing source code examples for the selected platform.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
12

Send Feedback

Chapter 2: Getting Started

10. From the list of application templates, select bilateral - File I/O and click Finish.
11. Click the Active build configurations drop-down from the SDx Project Settings window, to
select the active configuration or create a build configuration.
The standard build configurations are Debug and Release. To get the best runtime
performance, switch to use the Release build configuration as it uses a higher compiler
optimization setting than the Debug build configuration.
Figure 3: SDx Project Settings - Active Build Configuration

12. Set the Data motion network clock frequency (MHz) to the required frequency, on the SDx
Project Settings page.
13. Right-click the project and select Build Project or press Ctrl+B keys to build the project, in
the Project Explorer view.
14. Copy the contents of the newly created sd_card folder to the SD card.
The sd_card folder contains all the files required to run designs on the ZCU102 board.
15. Insert the SD card in the ZCU102 board card slot and switch it ON.
Note: A serial port emulator (Teraterm/ minicom) is required to interface the user commands to the board.

16. Upon successful boot, run the following command in the Teraterm terminal (serial port
emulator.)
#cd /media/card
#remount

17. Run the .elf file for the respective functions.
For more information, see the Using the xfOpenCV Library Functions on Hardware.

Using the xfOpenCV Library on a non-reVISION
Platform
This section describes using the xfOpenCV library on a non-reVISION platform, in the SDx
development environment. The examples in xfOpenCV require OpenCV libraries for successful
compilation. In the case of a non-reVISION platform, you are responsible for providing the
required OpenCV libraries, either as part of the platform or otherwise.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
13

Send Feedback

Chapter 2: Getting Started

Note: The instructions in this section assume that you have downloaded and installed all the required
packages. For more information, see the Prerequisites.

Use the following steps to import the xfOpenCV library into a SDx project and execute it on a
custom platform:
1. Launch the SDx development environment using the desktop icon or the Start menu.
The Workspace Launcher dialog appears.
2. Click Browse to enter a workspace folder used to store your projects (you can use workspace
folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.
Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set
the $SYSROOT environment variable. This is usually the file path to the Linux root file system.

The SDx development environment window opens with the Welcome tab visible when you
create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized
if you do not wish to use it.
3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.
The New Project dialog box opens.
4. Specify the name of the project. For example Test.
5. Click Next.
The the Choose Hardware Platform page appears.
6. From the Choose Hardware Platform page, select a suitable platform. For example, zcu102.
7. Click Next.
The Choose Software Platform and Target CPU page appears.
8. From the Choose Software Platform and Target CPU page, select an appropriate software
platform and the target CPU. For example, select A9 from the CPU dropdown list for ZC702
and ZC706 reVISION platforms.
9. Click Next. The Templates page appears, containing source code examples for the selected
platform.
10. From the list of application templates, select Empty Application and click Finish.
The New Project dialog box closes. A new project with the specified configuration is created.
The SDx Project Settings view appears. Notice the progress bar in the lower right border of
the view, Wait for a few moments for the C/C++ Indexer to finish.
11. The standard build configurations are Debug and Release. To get the best run-time
performance, switch to use the Release build configuration as it uses a higher compiler
optimization setting than the Debug build configuration.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
14

Send Feedback

Chapter 2: Getting Started

Figure 4: SDx Project Settings - Active Build Configuration

12. Set the Data motion network clock frequency (MHz) to the required frequency, on the SDx
Project Settings page.
13. Select the Generate bitstream and Generate SD card image check boxes.
14. Right-click on the newly created project in the Project Explorer view.
15. From the context menu that appears, select C/C++ Build Settings.
The Properties for  dialog box appears.
16. Click the Tool Settings tab.
17. Expand the SDS++ Compiler → Directories tree.
icon to add the “\include and
18. Click the
“\include folder locations to the Include Paths list.
Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the
library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
15

Send Feedback

Chapter 2: Getting Started

Figure 5: SDS++ Compiler Settings

19. Click Apply.
20. Expand the SDS++ Linker → Libraries tree.
icon and add the following libraries to the Libraries(-l) list. These libraries are
21. Click the
required by OpenCV.
• opencv_core
• opencv_imgproc

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
16

Send Feedback

Chapter 2: Getting Started

• opencv_imgcodecs
• opencv_features2d
• opencv_calib3d
• opencv_flann
• lzma
• tiff
• png16
• z
• jpeg
• dl
• rt
• webp
icon and add /lib folder location to the Libraries
22. Click the
search path (-L) list.
Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the
library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
17

Send Feedback

Chapter 2: Getting Started

Figure 6: SDS++ Linker Settings

23. Click Apply to save the configuration.
24. Click OK to close the Properties for  dialog box.
25. Expand the newly created project tree in the Project Explorer view.
26. Right-click the src folder and select Import. The Import dialog box appears.
27. Select File System and click Next.
28. Click Browse to navigate to the /examples folder location.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
18

Send Feedback

Chapter 2: Getting Started

29. Select the folder that corresponds to the library that you desire to import. For example,
accumulate.
Figure 7: Import Library Example Source Files

30. Right-click the library function in the Project Explorer view and select Toggle HW/SW to
move the function to the hardware.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
19

Send Feedback

Chapter 2: Getting Started

Figure 8: Moving a Library Function to the Hardware

31. Right-click the project and select Build Project or press Ctrl+B keys to build the project, in
the Project Explorer view.
The build process may take anytime between few minutes to several hours, depending on the
power of the host machine and the complexity of the design. By far, the most time is spent
processing the routines that have been tagged for realization in hardware.
32. Copy the contents of the newly created .\\\Release
\sd_card folder to the SD card. The sd_card folder contains all the files required to run
designs on a board.
33. Insert the SD card in the board card slot and switch it ON.
Note: A serial port emulator (Teraterm/ minicom) is required to interface the user commands to the board.

34. Upon successful boot, navigate to the ./mnt folder and run the following command at the
prompt:
#cd /mnt
Note: It is assumed that the OpenCV libraries are a port of the root filesystem. If not, add the location of
OpenCV libraries to LD_LIBRARY_PATH using the $ export LD_LIBRARY_PATH=/lib command.

35. Run the .elf executable file. For more information, see the Using the xfOpenCV Library
Functions on Hardware.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
20

Send Feedback

Chapter 2: Getting Started

Changing the Hardware Kernel
Configuration
Use the following steps to change the hardware kernel configuration:
1. Update the /xfOpenCV/examples//
xf_config_params.h file.
2. Update the makefile along with the xf_config_params.h file:
a. Find the line with the function name in the makefile. For bilateral filter, the line in the
makefile will be xf::BilateralFilter<3,1,0,1080,1920,1>.
b. Update the template parameters in the makefile to reflect changes made in the
xf_config_params.h file. For more details, see the Chapter 3: xfOpenCV Library API
Reference.

Using the xfOpenCV Library Functions on
Hardware
The following table lists the xfOpenCV library functions and the command to run the respective
examples on hardware. It is assumed that your design is completely built and the board has
booted up correctly.
Table 3: Using the xfOpenCV Library Function on Hardware
Example

Function Name

Usage on Hardware

accumulate

xf::accumulate

./.elf  

accumulatesquared

xf::accumulateSquare

./.elf  

accumulateweighted

xf::accumulateWeighted

./.elf  

arithm

xf::absdiff, xf::add, xf::subtract, xf::bitwise_and,
xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor

./.elf  

Bilateralfilter

xf::bilateralFilter

./.elf 

Boxfilter

xf::boxFilter

./.elf 

Canny

xf::Canny

./.elf 

channelcombine

xf::merge

./.elf    

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
21

Send Feedback

Chapter 2: Getting Started

Table 3: Using the xfOpenCV Library Function on Hardware (cont'd)
Example

Function Name

Usage on Hardware

Channelextract

xf::extractChannel

./.elf 

Colordetect

xf::RGB2HSV, xf::colorthresholding, xf:: erode,
and xf:: dilate

./.elf 

Convertbitdepth

xf::convertTo

./.elf 

Cornertracker

xf::cornerTracker

./exe    

Customconv

xf::filter2D

./.elf 

cvtcolor IYUV2NV12

xf::iyuv2nv12

./.elf   

cvtcolor IYUV2RGBA

xf::iyuv2rgba

./.elf   

cvtcolor IYUV2YUV4

xf::iyuv2yuv4

./.elf    
 

cvtcolor NV122IYUV

xf::nv122iyuv

./.elf  

cvtcolor NV122RGBA

xf::nv122rgba

./.elf  

cvtcolor NV122YUV4

xf::nv122yuv4

./.elf  

cvtcolor NV212IYUV

xf::nv212iyuv

./.elf  

cvtcolor NV212RGBA

xf::nv212rgba

./.elf  

cvtcolor NV212YUV4

xf::nv212yuv4

./.elf  

cvtcolor RGBA2YUV4

xf::rgba2yuv4

./.elf 

cvtcolor RGBA2IYUV

xf::rgba2iyuv

./.elf 

cvtcolor RGBA2NV12

xf::rgba2nv12

./.elf 

cvtcolor RGBA2NV21

xf::rgba2nv21

./.elf 

cvtcolor UYVY2IYUV

xf::uyvy2iyuv

./.elf 

cvtcolor UYVY2NV12

xf::uyvy2nv12

./.elf 

cvtcolor UYVY2RGBA

xf::uyvy2rgba

./.elf 

cvtcolor YUYV2IYUV

xf::yuyv2iyuv

./.elf 

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
22

Send Feedback

Chapter 2: Getting Started

Table 3: Using the xfOpenCV Library Function on Hardware (cont'd)
Example

Function Name

Usage on Hardware

cvtcolor YUYV2NV12

xf::yuyv2nv12

./.elf 

cvtcolor YUYV2RGBA

xf::yuyv2rgba

./.elf 

Difference of Gaussian

xf:: GaussianBlur, xf:: duplicateMat, xf::
delayMat, and xf::subtract

./.elf 

Dilation

xf::dilate

./.elf 

Erosion

xf::erode

./.elf 

Fast

xf::fast

./.elf 

Gaussianfilter

xf::GaussianBlur

./.elf 

Harris

xf::cornerHarris

./.elf 

Histogram

xf::calcHist

./.elf 

Histequialize

xf::equalizeHist

./.elf 

Hog

xf::HOGDescriptor

./.elf 

Houghlines

xf::HoughLines

./.elf 

Integralimg

xf::integralImage

./.elf 

Lkdensepyrof

xf::densePyrOpticalFlow

./.elf  

Lknpyroflow

xf::DenseNonPyrLKOpticalFlow

./.elf  

Lut

xf::LUT

./.elf 

Magnitude

xf::magnitude

./.elf 

meanshifttracking

xf::MeanShift

./.elf  

meanstddev

xf::meanStdDev

./.elf 

medianblur

xf::medianBlur

./.elf 

Minmaxloc

xf::minMaxLoc

./.elf 

otsuthreshold

xf::OtsuThreshold

./.elf 

Phase

xf::phase

./.elf 

Pyrdown

xf::pyrDown

./.elf 

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
23

Send Feedback

Chapter 2: Getting Started

Table 3: Using the xfOpenCV Library Function on Hardware (cont'd)
Example

Function Name

Usage on Hardware

Pyrup

xf::pyrUp

./.elf 

remap

xf::remap

./.elf   

Resize

xf::resize

./.elf 

scharrfilter

xf::Scharr

./.elf 

SemiGlobalBM

xf::SemiGlobalBM

./.elf 


sobelfilter

xf::Sobel

./.elf 

stereopipeline

xf::StereoPipeline

./.elf 


stereolbm

xf::StereoBM

./.elf 


Svm

xf::SVM

./.elf

threshold

xf::Threshold

./.elf 

warpaffine

xf::warpAffine

./.elf 

warpperspective

xf::warpPerspective

./.elf 

warptransform

xf::warpTransform

./.elf 

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
24

Send Feedback

Chapter 3

xfOpenCV Library API Reference
To facilitate local memory allocation on FPGA devices, the xfOpenCV library functions are
provided in templates with compile-time parameters. Data is explicitly copied from cv::Mat to
xf::Mat and is stored in physically contiguous memory to achieve the best possible
performance. After processing, the output in xf::Mat is copied back to cv::Mat to write it
into the memory.

xf::Mat Image Container Class
xf::Mat is a template class that serves as a container for storing image data and its attributes.
Note: The xf::Mat image container class is similar to the cv::Mat class of the OpenCV library.

Class Definition
template
class Mat {
public:
Mat();
// default constructor
Mat(int _rows, int _cols);
Mat(int _rows, int _cols, void *_data);
Mat(int _size, int _rows, int _cols);
~Mat();
void init(int _rows, int _cols);
void copyTo(XF_PTSNAME(T,NPC)* fromData);
unsigned char * copyFrom();
Mat(const Mat& src);
Mat& operator=(const Mat& src);
template
void convertTo(Mat &dst, int otype, double
alpha=1, double beta=0);
int rows, cols, size;
// actual image size
#ifndef __SYNTHESIS__
XF_TNAME(T,NPC)*data;
#else
XF_TNAME(T,NPC) data[ROWS*(COLS>> (XF_BITSHIFT(NPC)))];
#endif
};

Parameter Descriptions
The following table lists the xf::Mat class parameters and their descriptions:

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
25

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 4: xf::Mat Class Parameter Descriptions
Parameter

Description

rows

The number of rows in the image or height of the image.

cols

The number of columns in the image or width of the image.

size

The number of words stored in the data member. The value is calculated using rows*cols/
(number of pixels packed per word).

*data

The pointer to the words that store the pixels of the image.

Member Functions Description
The following table lists the member functions and their descriptions:
Table 5: xf::Mat Member Function Descriptions
Member Functions

Description

Mat()

This default constructor initializes the Mat object sizes, using the template parameters ROWS
and COLS.

Mat(int _rows, int _cols)

This constructor initializes the Mat object using arguments _rows and _cols.

Mat(const xf::Mat &_src)

This constructor helps clone a Mat object to another. New memory will be allocated for the
newly created constructor.

Mat(int _rows, int _cols,
void *_data)

This constructor initializes the Mat object using arguments _rows, _cols, and _data. The *data
member of the Mat object points to the memory allocated for _data argument, when this
constructor is used. No new memory is allocated for the *data member.

convertTo(Mat &dst, int
otype, double alpha=1,
double beta=0)
copyTo(* fromData)

Copies the data from Data pointer into physically contiguous memory allocated inside the
constructor.

copyFrom()

Returns the pointer to the first location of the *data member.

~Mat()

This is a default destructor of the Mat object.

Template Parameter Descriptions
Template parameters of the xf::Mat class are used to set the depth of the pixel, number of
channels in the image, number of pixels packed per word, maximum number of rows and columns
of the image. The following table lists the template parameters and their descriptions:
Table 6: xf::Mat Template Parameter Descriptions
Parameters

Description

TYPE

Type of the pixel data. For example, XF_8UC1 stands for 8-bit unsigned and one channel pixel.
More types can be found in include/common/xf_params.h.

HEIGHT

Maximum height of an image.

WIDTH

Maximum width of an image.

NPC

The number of pixels to be packed per word. For instance, XF_NPPC1 for 1 pixel per word; and
XF_NPPC8 for 8 pixels per word.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
26

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Pixel-Level Parallelism
The amount of parallelism to be implemented in a function from xfOpenCV is kept as a
configurable parameter. In most functions, there are two options for processing data.
• Single-pixel processing
• Processing eight pixels in parallel
The following table describes the options available for specifying the level of parallelism required
in a particular function:
Table 7: Options Available for Specifying the Level of Parallelism
Option

Description

XF_NPPC1

Process 1 pixel per clock cycle

XF_NPPC2

Process 2 pixels per clock cycle

XF_NPPC8

Process 8 pixels per clock cycle

Macros to Work With Parallelism
There are two macros that are defined to work with parallelism.
• The XF_NPIXPERCYCLE(flags) macro resolves to the number of pixels processed per
cycle.
○

○

○

XF_NPIXPERCYCLE(XF_NPPC1) resolves to 1
XF_NPIXPERCYCLE(XF_NPPC2) resolves to 2
XF_NPIXPERCYCLE(XF_NPPC8) resolves to 8

• The XF_BITSHIFT(flags) macro resolves to the number of times to shift the image size to
right to arrive at the final data transfer size for parallel processing.
○

○

○

XF_BITSHIFT(XF_NPPC1) resolves to 0
XF_BITSHIFT(XF_NPPC2) resolves to 1
XF_BITSHIFT(XF_NPPC8) resolves to 3

Pixel Types
Parameter types will differ, depending on the combination of the depth of pixels and the number
of channels in the image. The generic nomenclature of the parameter is listed below.
XF_C

For example, for an 8-bit pixel - unsigned - 1 channel the data type is XF_8UC1.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
27

Send Feedback

Chapter 3: xfOpenCV Library API Reference

The following table lists the available data types for the xf::Mat class:
Table 8: xf::Mat Class - Available Data Types
Number of bits per
Pixel

Option

Unsigned/ Signed/
Float Type

Number of Channels

XF_8UC1

8

Unsigned

1

XF_16UC1

16

Unsigned

1

XF_16SC1

16

Signed

1

XF_32UC1

32

Unsigned

1

XF_32FC1

32

Float

1

XF_32SC1

32

Signed

1

XF_8UC2

8

Unsigned

2

XF_8UC4

8

Unsigned

4

XF_2UC1

2

Unsigned

1

Manipulating Data Type
Based on the number of pixels to process per clock cycle and the type parameter, there are
different possible data types. The xfOpenCV library uses these datatypes for internal processing
and inside the xf::Mat class. The following are a few supported types:
• XF_TNAME(TYPE,NPPC) resolves to the data type of the data member of the xf::Mat
object. For instance, XF_TNAME(XF_8UC1,XF_NPPC8) resolves to ap_uint<64>.
• Word width = pixel depth * number of channels * number of pixels to
process per cycle (NPPC).
• XF_DTUNAME(TYPE,NPPC) resolves to the data type of the pixel. For instance,
XF_DTUNAME(XF_32FC1,XF_NPPC1) resolves to float.
• XF_PTSNAME(TYPE,NPPC) resolves to the ‘C’ data type of the pixel. For instance,
XF_PTSNAME (XF_16UC1,XF_NPPC2) resolves to unsigned short.
Note: ap_uint<>, ap_int<>, ap_fixed<>, and ap_ufixed<> types belong to the high-level synthesis
(HLS) library. For more information, see the Vivado Design Suite User Guide: High-Level Synthesis (UG902).

Sample Illustration
The following code illustrates the configurations that are required to build the gaussian filter on
an image, using SDSoC tool for Zynq® UltraScale™ platform.
Note: In case of a real-time application, where the video is streamed in, it is recommended that the location
of frame buffer is xf::Mat and is processed using the library function. The resultant location pointer is
passed to display IPs.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
28

Send Feedback

Chapter 3: xfOpenCV Library API Reference

xf_config_params.h
#define
#define
#define
#define
#define

FILTER_SIZE_3 1
FILTER_SIZE_5 0
FILTER_SIZE_7 0
RO 0
NO 1

#if NO
#define NPC1 XF_NPPC1
#endif
#if RO
#define NPC1 XF_NPPC8
#endif

xf_gaussian_filter_tb.cpp
int main(int argc, char **argv)
{
cv::Mat in_img, out_img, ocv_ref;
cv::Mat in_gray, in_gray1, diff;
in_img = cv::imread(argv[1], 1); // reading in the color image
extractChannel(in_img, in_gray, 1);
xf::Mat imgInput(in_img.rows,in_img.cols);
xf::Mat imgOutput(in_img.rows,in_img.cols);
imgInput.copyTo(in_gray.data);
gaussian_filter_accel(imgInput,imgOutput,sigma);
// Write output image
xf::imwrite("hls_out.jpg",imgOutput);
}

xf_gaussian_filter_accel.cpp
#include "xf_gaussian_filter_config.h"
void gaussian_filter_accel(xf::Mat
&imgInput,xf::Mat&imgOutput,float sigma)
{
xf::GaussianBlur(imgInput, imgOutput, sigma);
}

xf_gaussian_filter.hpp
#pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
#pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
#pragma SDS data access_pattern("_src.data":SEQUENTIAL)
#pragma SDS data copy("_src.data"[0:"_src.size"])
#pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
#pragma SDS data copy("_dst.data"[0:"_dst.size"])
template

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
29

Send Feedback

Chapter 3: xfOpenCV Library API Reference

void GaussianBlur(xf::Mat & _src,
xf::Mat & _dst, float sigma)
{
//function body
}

The design fetches data from external memory (with the help of SDSoC data movers) and is
transferred to the function in 8-bit or 64-bit packets, based on the configured mode. Assuming
8-bits per pixel, 8 pixels can be packed into 64-bits. Therefore, 8 pixels are available to be
processed in parallel.
Enable the FILTER_SIZE_3 and the NO macros in the xf_config_params.h file. The macro
is used to set the filter size to 3x3 and #define NO 1 macro enables 1 pixel parallelism.
Specify the SDSoC tool specific pragmas, in the xf_gaussian_filter.hpp file.
#pragma
#pragma
#pragma
#pragma
#pragma
#pragma

SDS
SDS
SDS
SDS
SDS
SDS

data
data
data
data
data
data

data_mover("_src.data":AXIDMA_SIMPLE)
data_mover("_dst.data":AXIDMA_SIMPLE)
access_pattern("_src.data":SEQUENTIAL)
copy("_src.data"[0:"_src.size"])
access_pattern("_dst.data":SEQUENTIAL)
copy("_dst.data"[0:"_dst.size"])

Note: For more information on the pragmas used for hardware accelerator functions in SDSoC, see SDSoC
Environment User Guide (UG1027).

Additional Utility Functions for Software
xf::imread
The function xf::imread loads an image from the specified file path, copies into xf::Mat and
returns it. If the image cannot be read (because of missing file, improper permissions,
unsupported or invalid format), the function exits with a non-zero return code and an error
statement.
Note: In an HLS standalone mode like Cosim, use cv::imread followed by copyTo function, instead of
xf::imread.

API Syntax
template
xf::Mat imread (char *filename, int type)

Parameter Descriptions
The table below describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
30

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 9: xf::imread Function Parameter Descriptions
Parameter

Description

PTYPE

Input pixel type. Value should be in accordance with the ‘type’ argument’s value.

ROWS

Maximum height of the image to be read

COLS

Maximum width of the image to be read

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

filename

Name of the file to be loaded

type

Flag that depicts the type of image. The values are:

•

'0' for gray scale

•

'1' for color image

xf::imwrite
The function xf::imwrite saves the image to the specified file from the given xf::Mat. The image
format is chosen based on the file name extension. This function internally uses cv::imwrite for
the processing. Therefore, all the limitations of cv::imwrite are also applicable to xf::imwrite.
API Syntax
template 
void imwrite(const char *img_name, xf::Mat &img)

Parameter Descriptions
The table below describes the template and the function parameters.
Table 10: xf::imwrite Function Parameter Descriptions
Parameter

Description

PTYPE

Input pixel type. Supported types are: XF_8UC1, XF_16UC1, XF_8UC3, XF_16UC3, XF_8UC4, and
XF_16UC4

ROWS

Maximum height of the image to be read

COLS

Maximum width of the image to be read

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

img_name

Name of the file with the extension

img

xf::Mat array to be saved

xf::absDiff
The function xf::absDiff computes the absolute difference between each individual pixels of an
xf::Mat and a cv::Mat, and returns the difference values in a cv::Mat.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
31

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template 
void absDiff(cv::Mat &cv_img, xf::Mat& xf_img,
cv::Mat &diff_img )

Parameter Descriptions
The table below describes the template and the function parameters.
Table 11: xf::absDiff Function Parameter Descriptions
Parameter

Description

PTYPE

Input pixel type

ROWS

Maximum height of the image to be read

COLS

Maximum width of the image to be read

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and
XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively.

cv_img

cv::Mat array to be compared

xf_img

xf::Mat array to be compared

diff_img

Output difference image(cv::Mat)

xf::convertTo
The xf::convertTo function performs bit depth conversion on each individual pixel of the given
input image. This method converts the source pixel values to the target data type with
appropriate casting.
dst(x,y)= cast(α(src(x,y)+β))
Note: The output and input Mat cannot be the same. That is, the converted image cannot be stored in the
Mat of the input image.

API Syntax
template void convertTo(xf::Mat &dst,
int ctype, double alpha=1, double beta=0)

Parameter Descriptions
The table below describes the template and function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
32

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 12: xf::convertTo Function Parameter Descriptions
Parameter

Description

DST_T

Output pixel type. Possible values are XF_8UC1, XF_16UC1, XF_16SC1, and XF_32SC1.

ROWS

Maximum height of image to be read

COLS

Maximum width of image to be read

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and
XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively.

dst

Converted xf Mat

ctype

Conversion type : Possible values are listed here.
//Down-convert:

•

XF_CONVERT_16U_TO_8U

•

XF_CONVERT_16S_TO_8U

•

XF_CONVERT_32S_TO_8U

•

XF_CONVERT_32S_TO_16U

•

XF_CONVERT_32S_TO_16S

//Up-convert:

•

XF_CONVERT_8U_TO_16U

•

XF_CONVERT_8U_TO_16S

•

XF_CONVERT_8U_TO_32S

•

XF_CONVERT_16U_TO_32S

•

XF_CONVERT_16S_TO_32S

alpha

Optional scale factor

beta

Optional delta added to the scaled values

xfOpenCV Library Functions
The xfOpenCV library is a set of select OpenCV functions optimized for Zynq®-7000 SoC and
Zynq UltraScale+ MPSoC devices. The following table lists the xfOpenCV library functions.
Table 13: xfOpenCV Library Functions
Computations
Absolute Difference

Input Processing
Bit Depth Conversion

Filters
Bilateral Filter

Other
Canny Edge Detection

Accumulate

Channel Combine

Box Filter

FAST Corner Detection

Accumulate Squared

Channel Extract

Custom Convolution

Harris Corner Detection

Accumulate Weighted

Color Conversion

Dilate

Histogram Computation

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
33

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 13: xfOpenCV Library Functions (cont'd)
Computations

Input Processing

Filters

Other

Atan2

Histogram Equalization

Erode

Dense Pyramidal LK Optical
Flow

Bitwise AND, Bitwise NOT,
Bitwise OR, Bitwise XOR

Look Up Table

Gaussian Filter

Dense Non-Pyramidal LK
Optical Flow

Gradient Magnitude

Remap

Sobel Filter

MinMax Location

Gradient Phase

Resolution Conversion
(Resize)

Median Blur Filter

Thresholding

Scharr Filter

WarpAffine

Integral Image
Inverse (Reciprocal)

WarpPerspective

Pixel-Wise Addition

SVM

Pixel-Wise Multiplication

Otsu Threshold

Pixel-Wise Subtraction

Mean Shift Tracking

Square Root

HOG

Mean and Standard
Deviation

Stereo Local Block Matching
WarpTransform
Pyramid Up
Pyramid Down
Delay
Duplicate
Color Thresholding
RGB2HSV
InitUndistortRectifyMapInver
se
Houghlines
Semi Global Method for
Stereo Disparity Estimation

Notes:
1.

The maximum resolution supported for all the functions is 4K, except Houghlines, HOG (RB mode), and Canny Edge
Detection.

The following table lists the functions that are not supported on Zynq-7000 SoC devices, when
configured to use 128-bit interfaces in 8 pixel per cycle mode.
Table 14: Unsupported Functions Using 128-bit Interfaces in 8 Pixel Per Cycle Mode
on Zynq-7000 SoC
Computations
Accumulate

Input Processing
Bit Depth Conversion

Accumulate Squared

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Filters
Box Filter: signed 16-bit pixel type, and
unsigned 16-bit pixel type
Custom Convolution: signed 16-bit
output pixel type

www.xilinx.com

[placeholder text]
34

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 14: Unsupported Functions Using 128-bit Interfaces in 8 Pixel Per Cycle Mode
on Zynq-7000 SoC (cont'd)
Computations

Input Processing

Filters

Accumulate Weighted

Sobel Filter

Gradient Magnitude

Scharr Filter

Gradient Phase
Pixel-Wise Addition: signed 16-bit pixel
type, and unsigned 16-bit pixel type
Pixel-Wise Multiplication: signed 16-bit
pixel type, and unsigned 16-bit pixel
type
Pixel-Wise Subtraction: signed 16-bit
pixel type, and unsigned 16-bit pixel
type

Note: Resolution Conversion (Resize) in 8 pixel per cycle mode, Dense Pyramidal LK Optical Flow, and
Dense Non-Pyramidal LK Optical Flow functions are not supported on the Zynq-7000 SoC ZC702 devices,
due to the higher resource utilization.

Absolute Difference
The absdiff function finds the pixel wise absolute difference between two input images and
returns an output image. The input and the output images must be the XF_8UC1 type.

I out (x, y) = | I in1 (x, y) - I in2(x, y)|
Where,
• Iout(x, y) is the intensity of output image at (x,y) position.
• Iin1(x, y) is the intensity of first input image at (x,y) position.
• Iin2(x, y) is the intensity of second input image at (x,y) position.
API Syntax
template
void absdiff(
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
35

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 15: absdiff Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is
supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a
multiple of 8)

COLS

Maximum width of input and output image (must be a
multiple of 8)

NPC

Number of pixels to be processed per cycle; possible
options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel
operations respectively.

src1

Input image

src2

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado® HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 16: absdiff Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

67

17

8 pixel

150

0

0

67

234

39

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 17: absdiff Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.69

Deviation from OpenCV
There is no deviation from OpenCV, except that the absdiff function supports 8-bit pixels.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
36

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Accumulate
The accumulate function adds an image (src1) to the accumulator image (src2), and generates
the accumulated result image (dst).

dst(x, y) = src1(x, y) + src2⎛⎝x, y⎞⎠
API Syntax
template
void accumulate (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 18: accumulate Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale
HD (1080x1920) image.
Table 19: accumulate Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48E

FF

LUT

CLB

1 pixel

300

0

0

62

55

12

8 pixel

150

0

0

389

285

61

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
37

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 20: accumulate Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Deviation from OpenCV
In OpenCV the accumulated image is stored in the second input image. The src2 image acts as
both input and output, as shown below:

src2(x, y) = src1(x, y) + src2⎛⎝x, y⎞⎠

Whereas, in the xfOpenCV implementation, the accumulated image is stored separately, as
shown below:

dst(x, y) = src1(x, y) + src2⎛⎝x, y⎞⎠

Accumulate Squared
The accumulateSquare function adds the square of an image (src1) to the accumulator image
(src2) and generates the accumulated result (dst).

dst(x, y) = src1(x, y) 2 + src2⎛⎝x, y⎞⎠
The accumulated result is a separate argument in the function, instead of having src2 as the
accumulated result. In this implementation, having a bi-directional accumulator is not possible as
the function makes use of streams.
API Syntax
template
void accumulateSquare (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
38

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 21: accumulateSquare Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 22: accumulateSquare Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48E

FF

LUT

CLB

1 pixel

300

0

1

71

52

14

8 pixel

150

0

8

401

247

48

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 23: accumulateSquare Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.6

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
39

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Deviation from OpenCV
In OpenCV the accumulated squared image is stored in the second input image. The src2 image
acts as input as well as output.

src2(x, y) = src1(x, y) 2 + src2⎛⎝x, y⎞⎠
Whereas, in the xfOpenCV implementation, the accumulated squared image is stored separately.
⎞
⎛
2

dst(x, y) = src1(x, y) + src2⎝x, y⎠

Accumulate Weighted
The accumulateWeighted function computes the weighted sum of the input image (src1) and
the accumulator image (src2) and generates the result in dst.

dst(x, y) = alpha*src1(x, y) + ⎛⎝1 - alpha⎞⎠*src2⎛⎝x, y⎞⎠
The accumulated result is a separate argument in the function, instead of having src2 as the
accumulated result. In this implementation, having a bi-directional accumulator is not possible, as
the function uses streams.
API Syntax
template
void accumulateWeighted (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst,
float alpha )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 24: accumulateWeighted Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
40

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 24: accumulateWeighted Function Parameter Descriptions (cont'd)
Parameter

Description

dst

Output image

alpha

Weight applied to input image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 25: accumulateWeighted Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

5

295

255

52

8 pixel

150

0

19

556

476

88

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 26: accumulateWeighted Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Deviation from OpenCV
The resultant image in OpenCV is stored in the second input image. The src2 image acts as input
as well as output, as shown below:

src2(x, y) = alpha*src1(x, y) + ⎛⎝1 - alpha⎞⎠*src2⎛⎝x, y⎞⎠
Whereas, in xfOpenCV implementation, the accumulated weighted image is stored separately.

dst(x, y) = alpha*src1(x, y) + ⎛⎝1 - alpha⎞⎠*src2⎛⎝x, y⎞⎠

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
41

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Bilateral Filter
In general, any smoothing filter smoothens the image which will affect the edges of the image. To
preserve the edges while smoothing, a bilateral filter can be used. In an analogous way as the
Gaussian filter, the bilateral filter also considers the neighboring pixels with weights assigned to
each of them. These weights have two components, the first of which is the same weighing used
by the Gaussian filter. The second component takes into account the difference in the intensity
between the neighboring pixels and the evaluated one.
The bilateral filter applied on an image is:

BF[I] p = 1 ∑ G σ s⎛⎝ ‖ p - q ‖ ⎞⎠G σ r⎛⎝ ‖ I p - I q ‖ ⎞⎠I q
W p qϵS
Where

Wp = ∑

qϵS

G σ s⎛⎝ ‖ p - q ‖ ⎞⎠G σ r⎛⎝ ‖ I p - I q ‖ ⎞⎠

and G σ is a gaussian filter with variance σ .
⎛

⎞

- ⎝x 2 + y 2⎠

The gaussian filter is given by: G σ = e

2σ 2

API Syntax
template
void bilateralFilter (
xf::Mat src,
xf::Mat dst,
float sigma_space, float sigma_color )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 27: bilateralFilter Function Parameter Descriptions
Parameter
FILTER_SIZE

Description
Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7)
are supported

BORDER_TYPE

Border type supported is XF_BORDER_CONSTANT

TYPE

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported
(XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
42

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 27: bilateralFilter Function Parameter Descriptions (cont'd)
Parameter

Description

NPC

Number of pixels to be processed per cycle; this function supports only XF_NPPC1
or 1 pixel per cycle operations.

src

Input image

dst

Output image

sigma_space

Standard deviation of filter in spatial domain

sigma_color

Standard deviation of filter used in color space

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA,
to progress a grayscale HD (1080x1920) image.
Table 28: bilateralFilter Resource Utilization Summary

Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency

Filter Size

(MHz)

BRAM_18K

DSP_48Es

FF

LUT

3x3

300

6

22

4934

4293

5x5

300

12

30

5481

4943

7x7

300

37

48

7084

6195

Performance Estimate
The following table summarizes a performance estimate of the kernel in different configurations,
as generated using Vivado HLS 2018.2 tool for Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 29: bilateralFilter Function Performance Estimate Summary
Latency Estimate
Operating Mode

Filter Size

168 MHz
Max (ms)

1 pixel

3x3

7.18

5x5

7.20

7x7

7.22

Deviation from OpenCV
Unlike OpenCV, xfOpenCV only supports filter sizes of 3, 5 and 7.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
43

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Bit Depth Conversion
The convertTo function converts the input image bit depth to the required bit depth in the
output image.
API Syntax
template 
void convertTo(xf::Mat &_src_mat, xf::Mat &_dst_mat, ap_uint<4> _convert_type, int _shift)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 30: convertTo Function Parameter Descriptions
Parameter
SRC_T

Description
Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1),
16-bit, unsigned, 1 channel (XF_16UC1),
16-bit, signed, 1 channel (XF_16SC1),
32-bit, unsigned, 1 channel (XF_32UC1)
32-bit, signed, 1 channel (XF_32SC1) are supported.

DST_T

Output pixel yype. 8-bit, unsigned, 1 channel (XF_8UC1),
16-bit, unsigned, 1 channel (XF_16UC1),
16-bit, signed, 1 channel (XF_16SC1),
32-bit, unsigned, 1 channel (XF_32UC1)
32-bit, signed, 1 channel (XF_32SC1) are supported.

ROWS

Height of input and output images

COLS

Width of input and output images

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

_convert_type

This parameter specifies the type of conversion required. (See XF_convert_bit_depth_e
enumerated type in file xf_params.h for possible values.)

_shift

Optional scale factor

Possible Conversions
The following table summarizes supported conversions. The rows are possible input image bit
depths and the columns are corresponding possible output image bit depths (U=unsigned,
S=signed).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
44

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 31: convertTo Function Supported Conversions
INPUT/
OUTPUT

U8

U16

S16

U32

S32

U8

NA

yes

yes

NA

yes

U16

yes

NA

NA

NA

yes

S16

yes

NA

NA

NA

yes

U32

NA

NA

NA

NA

NA

S32

yes

yes

yes

NA

NA

Resource Utilization
The following table summarizes the resource utilization of the convertTo function, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 32: convertTo Function Resource Utilization Summary For
XF_CONVERT_8U_TO_16S Conversion

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

8

581

523

119

8 pixel

150

0

8

963

1446

290

Table 33: convertTo Function Resource Utilization Summary For
XF_CONVERT_16U_TO_8U Conversion

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

8

591

541

124

8 pixel

150

0

8

915

1500

308

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
45

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 34: convertTo Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency

1 pixel operation (300 MHz)

6.91 ms

8 pixel operation (150 MHz)

1.69 ms

Bitwise AND
The bitwise_and function performs the bitwise AND operation for each pixel between two
input images, and returns an output image.

I out⎛⎝x, y⎞⎠ =

I in1⎛⎝x, y⎠⎞ & I in2⎛⎝x, y⎞⎠

Where,
•

I out⎛⎝x, y⎞⎠

is the intensity of output image at (x, y) position

•

I in1⎛⎝x, y⎞⎠

is the intensity of first input image at (x, y) position

•

I in2⎛⎝x, y⎞⎠

is the intensity of second input image at (x, y) position

API Syntax
template
void bitwise_and (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 35: bitwise_and Function Parameter Descriptions
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
46

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 35: bitwise_and Function Parameter Descriptions (cont'd)
Parameter

Description

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 36: bitwise_and Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

44

10

8 pixel

150

0

0

59

72

13

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 37: bitwise_and Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Bitwise NOT
The bitwise_not function performs the pixel wise bitwise NOT operation for the pixels in the
input image, and returns an output image.

I out⎛⎝x, y⎞⎠ = ~ I in⎛⎝x, y⎞⎠

Where,
•

I out⎛⎝x, y⎞⎠

•

I in⎛⎝x, y⎞⎠

is the intensity of output image at (x, y) position
is the intensity of input image at (x, y) position

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
47

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void bitwise_not (
xf::Mat src,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 38: bitwise_not Function Parameter Descriptions
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 39: bitwise_not Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

97

78

20

8 pixel

150

0

0

88

97

21

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
48

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 40: bitwise_not Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Bitwise OR
The bitwise_or function performs the pixel wise bitwise OR operation between two input
images, and returns an output image.

I out (x, y) = I in1 (x, y) | I in2⎛⎝x, y⎞⎠

Where,
•

I out(x, y)

is the intensity of output image at (x, y) position

•

I in1(x, y)

is the intensity of first input image at (x, y) position

•

I in2(x, y)

is the intensity of second input image at (x, y) position

API Syntax
template
void bitwise_or (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 41: bitwise_or Function Parameter Descriptions
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
49

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 42: bitwise_or Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

44

10

8 pixel

150

0

0

59

72

13

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 43: bitwise_or Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Bitwise XOR
The bitwise_xor function performs the pixel wise bitwise XOR operation between two input
images, and returns an output image, as shown below:

I out⎛⎝x, y⎞⎠ = I in1⎛⎝x, y⎞⎠ ⊕ I in2⎛⎝x, y⎞⎠
Where,
•

I out⎛⎝x, y⎞⎠

is the intensity of output image at (x, y) position

•

I in1⎛⎝x, y⎞⎠

is the intensity of first input image at (x, y) position

•

I in2⎛⎝x, y⎞⎠

is the intensity of second input image at (x, y) position

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
50

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void bitwise_xor(
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 44: bitwise_xor Function Parameter Descriptions
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image:
Table 45: bitwise_xor Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

44

10

8 pixel

150

0

0

59

72

13

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image:

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
51

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 46: bitwise_xor Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Box Filter
The boxFilter function performs box filtering on the input image. Box filter acts as a low-pass
filter and performs blurring over the image. The boxFilter function or the box blur is a spatial
domain linear filter in which each pixel in the resulting image has a value equal to the average
value of the neighboring pixels in the image.

⎡1 . . . 1⎤
1
⎢1 . . . 1⎥
K box =
(ksize*ksize) ⎣
1 . . . 1⎦

API Syntax
template
void boxFilter(xf::Mat & _src_mat,xf::Mat & _dst_mat)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 47: boxFilter Function Parameter Descriptions
Parameter

Description

FILTER_SIZE

Filter size. Filter size of 3(XF_FILTER_3X3), 5(XF_FILTER_5X5) and 7(XF_FILTER_7X7) are
supported

BORDER_TYPE

Border Type supported is XF_BORDER_CONSTANT

SRC_T

Input and output pixel type. 8-bit, unsigned, 16-bit unsigned and 16-bit signed, 1 channel is
supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
52

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 48: boxFilter Function Resource Utilization Summary

Operating
Mode
1 pixel

8 pixel

Filter Size

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

3x3

300

3

1

545

519

104

5x5

300

5

1

876

870

189

7x7

300

7

1

1539

1506

300

3x3

150

6

8

1002

1368

264

5x5

150

10

8

1576

3183

611

7x7

150

14

8

2414

5018

942

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image:
Table 49: boxFilter Function Performance Estimate Summary

Operating Mode
1 pixel

8 pixel

Latency Estimate

Operating Frequency

Filter Size

(MHz)

Max (ms)

300

3x3

7.2

300

5x5

7.21

300

7x7

7.22

150

3x3

1.7

150

5x5

1.7

150

7x7

1.7

Canny Edge Detection
The Canny edge detector finds the edges in an image or video frame. It is one of the most
popular algorithms for edge detection. Canny algorithm aims to satisfy three main criteria:
1. Low error rate: A good detection of only existent edges.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
53

Send Feedback

Chapter 3: xfOpenCV Library API Reference

2. Good localization: The distance between edge pixels detected and real edge pixels have to be
minimized.
3. Minimal response: Only one detector response per edge.
In this algorithm, the noise in the image is reduced first by applying a Gaussian mask. The
Gaussian mask used here is the average mask of size 3x3. Thereafter, gradients along x and y
directions are computed using the Sobel gradient function. The gradients are used to compute
the magnitude and phase of the pixels. The phase is quantized and the pixels are binned
accordingly. Non-maximal suppression is applied on the pixels to remove the weaker edges.
Edge tracing is applied on the remaining pixels to draw the edges on the image. In this algorithm,
the canny up to non-maximal suppression is in one kernel and the edge linking module is in
another kernel. After non-maxima suppression, the output is represented as 2-bit per pixel,
Where:
• 00 - represents the background
• 01 - represents the weaker edge
• 11 - represents the strong edge
The output is packed as 8-bit (four 2-bit pixels) in 1 pixel per cycle operation and packed as 16bit (eight 2-bit pixels) in 8 pixel per cycle operation. For the edge linking module, the input is 64bit, such 32 pixels of 2-bit are packed into a 64-bit. The edge tracing is applied on the pixels and
returns the edges in the image.
API Syntax
The API syntax for Canny is:
template
void Canny(xf::Mat & _src_mat,xf::Mat & _dst_mat,unsigned char _lowthreshold,unsigned char
_highthreshold)

The API syntax for EdgeTracing is:
template
voidEdgeTracing(xf::Mat & _src,xf::Mat & _dst)

Parameter Descriptions
The following table describes the xf::Canny template and function parameters:

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
54

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 50: xf::Canny Function Parameter Descriptions
Parameter

Description

FILTER_TYPE

The filter window dimensions. The options are 3 and 5.

NORM_TYPE

The type of norm used. The options for norm type are L1NORM and L2NORM.

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. The output in 1pixel case is 8-bit and packing four 2-bit pixel
values into 8-bit. The Output in 8 pixel case is 16-bit, 8-bit, 2-bit pixel values are
packing into 16-bit.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

_lowthreshold

The lower value of threshold for binary thresholding.

_highthreshold

The higher value of threshold for binary thresholding.

The following table describes the EdgeTracing template and function parameters:
Table 51: EdgeTracing Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type

DST_T

Output pixel type

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC_SRC

Number of pixels to be processed per cycle. Fixed to XF_NPPC32.

NPC_DST

Number of pixels to be written to destination. Fixed to XF_NPPC8.

_src

Input image

_dst

Output image

Resource Utilization
The following table summarizes the resource utilization of xf::Canny and EdgeTracing in
different configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image for Filter size is 3.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
55

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 52: xf::Canny and EdgeTracing Function Resource Utilization Summary
Resource Utilization
1 pixel
Name

1 pixel

8 pixel

8 pixel

L1NORM,FS: L2NORM,FS: L1NORM,FS: L2NORM,FS:
3
3
3
3
300 MHz

300 MHz

150 MHz

150 MHz

Edge
Linking

Edge
Linking

300 MHz

150 MHz

BRAM_18K

22

18

36

32

84

84

DSP48E

2

4

16

32

3

3

FF

3027

3507

4899

6208

17600

14356

LUT

2626

3170

6518

9560

15764

14274

CLB

606

708

1264

1871

2955

3241

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image for L1NORM, filter size is 3 and including the edge linking
module.
Table 53: xf::Canny and EdgeTracing Function Performance Estimate Summary
Latency Estimate
Operating Mode

Operating Frequency (MHz)

Latency (in ms)

1 pixel

300

10.2

8 pixel

150

8

Deviation from OpenCV
In OpenCV Canny function, the Gaussian blur is not applied as a pre-processing step.

Channel Combine
The merge function, merges single channel images into a multi-channel image. The number of
channels to be merged should be four.
API Syntax
template
void merge(xf::Mat &_src1, xf::Mat &_src2, xf::Mat &_src3, xf::Mat &_src4, xf::Mat &_dst)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
56

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 54: merge Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 8-bit, unsigned,1 channel is supported (XF_8UC4)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 for 1 pixel operation.

_src1

Input single-channel image

_src2

Input single-channel image

_src3

Input single-channel image

_src4

Input single-channel image

_dst

Output multi-channel image

Resource Utilization
The following table summarizes the resource utilization of the merge function, generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process 4 singlechannel HD (1080x1920) images.
Table 55: merge Function Resource Utilization Summary

Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
8

FF

LUT

494

CLB

386

85

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process 4 single channel HD
(1080x1920) images.
Table 56: merge Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Max Latency
6.92 ms

www.xilinx.com

[placeholder text]
57

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Channel Extract
The extractChannel function splits a multi-channel array (32-bit pixel-interleaved data) into
several single-channel arrays and returns a single channel. The channel to be extracted is
specified by using the channel argument.
The value of the channel argument is specified by macros defined in the
xf_channel_extract_e enumerated data type. The following table summarizes the possible
values for the xf_channel_extract_e enumerated data type:
Table 57: xf_channel_extract_e Enumerated Data Type Values
Channel

Enumerated Type

Unknown

XF_EXTRACT_CH_0

Unknown

XF_EXTRACT_CH_1

Unknown

XF_EXTRACT_CH_2

Unknown

XF_EXTRACT_CH_3

RED

XF_EXTRACT_CH_R

GREEN

XF_EXTRACT_CH_G

BLUE

XF_EXTRACT_CH_B

ALPHA

XF_EXTRACT_CH_A

LUMA

XF_EXTRACT_CH_Y

Cb/U

XF_EXTRACT_CH_U

Cr/V/Value

XF_EXTRACT_CH_V

API Syntax
template
void extractChannel(xf::Mat & _src_mat,
xf::Mat & _dst_mat, uint16_t _channel)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 58: extractChannel Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 4channel is supported (XF_8UC4)

DST_T

Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 for 1 pixel operation.

_src_mat

Input multi-channel image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
58

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 58: extractChannel Function Parameter Descriptions (cont'd)
Parameter

Description

_dst_mat

Output single channel image

_channel

Channel to be extracted (See xf_channel_extract_e enumerated type in file xf_params.h for
possible values.)

Resource Utilization
The following table summarizes the resource utilization of the extractChannel function,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a 4 channel HD (1080x1920) image.
Table 59: extractChannel Function Resource Utilization Summary
Operating
Mode
1 pixel

Operating Frequency (MHz)
300

Utilization Estimate
BRAM_18K
0

DSP_48Es
8

FF
508

LUT

CLB

354

96

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a 4 channel HD
(1080x1920) image.
Table 60: extractChannel Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.92

Color Conversion
The color conversion functions convert one image format to another image format, for the
combinations listed in the following table. The rows represent the input formats and the columns
represent the output formats. Supported conversions are discussed in the following sections.
I/O Formats

RGBA

NV12

NV21

IYUV

RGBA

N/A

For details,
see the RGBA
to NV12

For details,
see the RGBA
to NV21

For details,
see the RGBA
to IYUV

For details,
see the RGBA
to YUV4

NV12

For details,
see the NV12
to RGBA

N/A

For details,
see the NV12
to IYUV

For details,
see the NV12
to YUV4

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

UYVY

YUYV

YUV4

[placeholder text]
59

Send Feedback

Chapter 3: xfOpenCV Library API Reference

NV21

For details,
see the NV21
to RGBA

IYUV

For details,
see the IYUV
to RGBA

UYVY

YUYV

N/A

For details,
see the NV21
to IYUV

For details,
see the NV21
to YUV4

For details,
see the IYUV
to NV12

N/A

For details,
see the IYUV
to YUV4

For details,
see the UYVY
to RGBA

For details,
see the UYVY
to NV12

For details,
see the UYVY
to IYUV

For details,
see the YUYV
to RGBA

For details,
see the YUYV
to NV12

For details,
see the YUYV
to IYUV

N/A

YUV4

N/A

N/A

RGB to YUV Conversion Matrix
Following is the formula to convert RGB data to YUV data:

⎡R ⎤
⎡Y ⎤ ⎡ 0.257 0.504 0.098 16 ⎤ ⎢ ⎥
⎢U ⎥ = ⎢-0.148 -0.291 0.439 128⎥ G
⎣V ⎦ ⎣ 0.439 -0.368 -0.071 128⎦ ⎢B ⎥
⎣1 ⎦

YUV to RGB Conversion Matrix
Following is the formula to convert YUV data to RGB data:

0
1.596 ⎤ ⎡ (Y - 16) ⎤
⎡R ⎤ ⎡1.164
⎢G⎥ = ⎢1.164 -0.391 -0.813⎥ ⎢⎢(U - 128)⎥⎥
⎣B ⎦ ⎣1.164 2.018
0 ⎦ ⎣(V - 128) ⎦

Source: http://www.fourcc.org/fccyvrgb.php

RGBA to YUV4
The rgba2yuv4 function converts a 4-channel RGBA image to YUV444 format. The function
outputs Y, U, and V streams separately.
API Syntax
template 
void rgba2yuv4(xf::Mat & _src, xf::Mat & _y_image, xf::Mat & _u_image,
xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
60

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 62: rgba2yuv4 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

DST_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input Y plane of size (ROWS, COLS).

_y_image

Output Y image of size (ROWS, COLS).

_u_image

Output U image of size (ROWS, COLS).

_v_image

Output V image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of RGBA to YUV4 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 63: rgba2yuv4 Function Resource Utilization Summary

Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
9

FF
589

LUT

CLB

328

96

Performance Estimate
The following table summarizes the performance of RGBA to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 64: rgba2yuv4 Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Max Latency (ms)
1.89

www.xilinx.com

[placeholder text]
61

Send Feedback

Chapter 3: xfOpenCV Library API Reference

RGBA to IYUV
The rgba2iyuv function converts a 4-channel RGBA image to IYUV (4:2:0) format. The
function outputs Y, U, and V planes separately. IYUV holds subsampled data, Y is sampled for
every RGBA pixel and U,V are sampled once for 2row and 2column(2x2) pixels. U and V planes
are of (rows/2)*(columns/2) size, by cascading the consecutive rows into a single row the planes
size becomes (rows/4)*columns.
API Syntax
template 
void rgba2iyuv(xf::Mat & _src, xf::Mat & _y_image, xf::Mat & _u_image,
xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 65: rgba2iyuv Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit,unsigned, 4-channel is supported (XF_8UC4).

DST_T

Output pixel type. Only 8-bit,unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input Y plane of size (ROWS, COLS).

_y_image

Output Y image of size (ROWS, COLS).

_u_image

Output U image of size (ROWS/4, COLS).

_v_image

Output V image of size (ROWS/4, COLS).

Resource Utilization
The following table summarizes the resource utilization of RGBA to IYUV for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
62

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 66: rgba2iyuv Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
9

FF

LUT

816

CLB

472

149

Performance Estimate
The following table summarizes the performance of RGBA to IYUV for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 67: rgba2iyuv Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
1.8

RGBA to NV12
The rgba2nv12 function converts a 4-channel RGBA image to NV12 (4:2:0) format. The
function outputs Y plane and interleaved UV plane separately. NV12 holds the subsampled data,
Y is sampled for every RGBA pixel and U, V are sampled once for 2row and 2columns (2x2)
pixels. UV plane is of (rows/2)*(columns/2) size as U and V values are interleaved.
API Syntax
template 
void rgba2nv12(xf::Mat & _src, xf::Mat & _y, xf::Mat & _uv)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 68: rgba2nv12 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit,unsigned, 4-channel is supported (XF_8UC4).

Y_T

Output pixel type. Only 8-bit,unsigned, 1-channel is supported (XF_8UC1).

UV_T

Output pixel type. Only 8-bit,unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
63

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 68: rgba2nv12 Function Parameter Descriptions (cont'd)
Parameter

Description

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input RGBA image of size (ROWS, COLS).

_y

Output Y image of size (ROWS, COLS).

_uv

Output UV image of size (ROWS/2, COLS/2).

Resource Utilization
The following table summarizes the resource utilization of RGBA to NV12 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 69: rgba2nv12 Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
9

FF
802

LUT

CLB

452

128

Performance Estimate
The following table summarizes the performance of RGBA to NV12 for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 70: rgba2nv12 Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
1.8

RGBA to NV21
The rgba2nv21 function converts a 4-channel RGBA image to NV21 (4:2:0) format. The
function outputs Y plane and interleaved VU plane separately. NV21 holds subsampled data, Y is
sampled for every RGBA pixel and U, V are sampled once for 2 row and 2 columns (2x2) RGBA
pixels. UV plane is of (rows/2)*(columns/2) size as V and U values are interleaved.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
64

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template 
void rgba2nv21(xf::Mat & _src, xf::Mat & _y, xf::Mat & _uv)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 71: rgba2nv21 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

Y_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input RGBA image of size (ROWS, COLS).

_y

Output Y image of size (ROWS, COLS).

_uv

Output UV image of size (ROWS/2, COLS/2).

Resource Utilization
The following table summarizes the resource utilization of RGBA to NV21 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 72: rgba2nv21 Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
9

FF
802

LUT
453

CLB
131

Performance Estimate
The following table summarizes the performance of RGBA to NV21 for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
65

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 73: rgba2nv21 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

1.89

YUYV to RGBA
The yuyv2rgba function converts a single-channel YUYV (YUV 4:2:2) image format to a 4channel RGBA image. YUYV is a sub-sampled format, a set of YUYV value gives 2 RGBA pixel
values. YUYV is represented in 16-bit values where as, RGBA is represented in 32-bit values.
API Syntax
template
void yuyv2rgba(xf::Mat & _src, xf::Mat & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 74: yuyv2rgba Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T

Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_dst

Output image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of YUYV to RGBA for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
66

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 75: yuyv2rgba Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

FF

DSP_48Es
6

LUT

765

CLB

705

165

Performance Estimate
The following table summarizes the performance of UYVY to RGBA for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 76: yuyv2rgba Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
6.9

YUYV to NV12
The yuyv2nv12 function converts a single-channel YUYV (YUV 4:2:2) image format to NV12
(YUV 4:2:0) format. YUYV is a sub-sampled format, 1 set of YUYV value gives 2 Y values and 1 U
and V value each.
API Syntax
template
void yuyv2nv12(xf::Mat & _src,xf::Mat & _y_image,xf::Mat & _uv_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 77: yuyv2nv12 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

Y_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
67

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 77: yuyv2nv12 Function Parameter Descriptions (cont'd)
Parameter

Description

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_y_image

Output Y plane of size (ROWS, COLS).

_uv_image

Output U plane of size (ROWS/2, COLS/2).

Resource Utilization
The following table summarizes the resource utilization of YUYV to NV12 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 78: yuyv2nv12 Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

831

491

149

8 pixel

150

0

0

1196

632

161

Performance Estimate
The following table summarizes the performance of YUYV to NV12 for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 79: yuyv2nv12 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
68

Send Feedback

Chapter 3: xfOpenCV Library API Reference

YUYV to IYUV
The yuyv2iyuv function converts a single-channel YUYV (YUV 4:2:2) image format to
IYUV(4:2:0) format. Outputs of the function are separate Y, U, and V planes. YUYV is a subsampled format, 1 set of YUYV value gives 2 Y values and 1 U and V value each. U, V values of
the odd rows are dropped as U, V values are sampled once for 2 rows and 2 columns in the
IYUV(4:2:0) format.
API Syntax
template
void yuyv2iyuv(xf::Mat & _src, xf::Mat & _y_image, xf::Mat & _u_image,
xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 80: yuyv2iyuv Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned,1 channel is supported (XF_16UC1).

DST_T

Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS/4, COLS).

_v_image

Output V plane of size (ROWS/4, COLS).

Resource Utilization
The following table summarizes the resource utilization of YUYV to IYUV for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
69

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 81: yuyv2iyuv Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

835

497

149

8 pixel

150

0

0

1428

735

210

Performance Estimate
The following table summarizes the performance of YUYV to IYUV for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 82: yuyv2iyuv Function Performance Estimate
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

UYVY to IYUV
The uyvy2iyuv function converts a UYVY (YUV 4:2:2) single-channel image to the IYUV
format. The outputs of the functions are separate Y, U, and V planes. UYVY is sub sampled
format. 1 set of UYVY value gives 2 Y values and 1 U and V value each.
API Syntax
template
void uyvy2iyuv(xf::Mat & _src, xf::Mat & _y_image,xf::Mat & _u_image,
xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 83: uyvy2iyuv Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
70

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 83: uyvy2iyuv Function Parameter Descriptions (cont'd)
Parameter

Description

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS/4, COLS).

_v_image

Output V plane of size (ROWS/4, COLS).

Resource Utilization
The following table summarizes the resource utilization of UYVY to IYUV for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image..
Table 84: uyvy2iyuv Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

835

494

139

8 pixel

150

0

0

1428

740

209

Performance Estimate
The following table summarizes the performance of UYVY to IYUV for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 85: uyvy2iyuv Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

UYVY to RGBA
The uyvy2rgba function converts a UYVY (YUV 4:2:2) single-channel image to a 4-channel
RGBA image. UYVY is sub sampled format, 1set of UYVY value gives 2 RGBA pixel values. UYVY
is represented in 16-bit values where as RGBA is represented in 32-bit values.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
71

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void uyvy2rgba(xf::Mat & _src, xf::Mat & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 86: uyvy2rgba Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

DST_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_dst

Output image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of UYVY to RGBA for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 87: uyvy2rgba Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
6

FF
773

LUT
704

CLB
160

Performance Estimate
The following table summarizes the performance of UYVY to RGBA for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
72

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 88: uyvy2rgba Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.8

UYVY to NV12
The uyvy2nv12 function converts a UYVY (YUV 4:2:2) single-channel image to NV12 format.
The outputs are separate Y and UV planes. UYVY is sub sampled format, 1 set of UYVY value
gives 2 Y values and 1 U and V value each.
API Syntax
template
void uyvy2nv12(xf::Mat & _src,xf::Mat & _y_image,xf::Mat & _uv_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 89: uyvy2nv12 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).

Y_T

Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS).

_y_image

Output Y plane of size (ROWS, COLS).

_uv_image

Output U plane of size (ROWS/2, COLS/2).

Resource Utilization
The following table summarizes the resource utilization of UYVY to NV12 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
73

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 90: uyvy2nv12 Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

831

488

131

8 pixel

150

0

0

1235

677

168

Performance Estimate
The following table summarizes the performance of UYVY to NV12 for different configurations,
as generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 91: uyvy2nv12 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

IYUV to RGBA
The iyuv2rgba function converts single channel IYUV (YUV 4:2:0) image to a 4-channel RGBA
image. The inputs to the function are separate Y, U, and V planes. IYUV is sub sampled format, U
and V values are sampled once for 2 rows and 2 columns of the RGBA pixels. The data of the
consecutive rows of size (columns/2) is combined to form a single row of size (columns).
API Syntax
template
void iyuv2rgba(xf::Mat & src_y, xf::Mat & src_u,xf::Mat & src_v,
xf::Mat & _dst0)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 92: iyuv2rgba Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

DST_T

Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
74

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 92: iyuv2rgba Function Parameter Descriptions (cont'd)
Parameter

Description

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_u

Input U plane of size (ROWS/4, COLS).

src_v

Input V plane of size (ROWS/4, COLS).

_dst0

Output RGBA image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of IYUV to RGBA for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 93: iyuv2rgba Function Resource Utilization Summary
Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

2

DSP_48Es
5

FF
1208

LUT

CLB

728

196

Performance Estimate
The following table summarizes the performance of IYUV to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 94: iyuv2rgba Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
6.9

IYUV to NV12
The iyuv2nv12 function converts single channel IYUV image to NV12 format. The inputs are
separate U and V planes. There is no need of processing Y plane as both the formats have a same
Y plane. U and V values are rearranged from plane interleaved to pixel interleaved.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
75

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void iyuv2nv12(xf::Mat & src_y, xf::Mat & src_u,xf::Mat &
src_v,xf::Mat & _y_image, xf::Mat & _uv_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 95: iyuv2nv12 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4
for 1 pixel and 4-pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_u

Input U plane of size (ROWS/4, COLS).

src_v

Input V plane of size (ROWS/4, COLS).

_y_image

Output V plane of size (ROWS, COLS).

_uv_image

Output UV plane of size (ROWS/2, COLS/2).

Resource Utilization
The following table summarizes the resource utilization of IYUV to NV12 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image..
Table 96: iyuv2nv12 Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

12

907

677

158

8 pixel

150

0

12

1591

1022

235

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
76

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance of IYUV to NV12 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 97: iyuv2nv12 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

IYUV to YUV4
The iyuv2yuv4 function converts a single channel IYUV image to a YUV444 format. Y plane is
same for both the formats. The inputs are separate U and V planes of IYUV image and the
outputs are separate U and V planes of YUV4 image. IYUV stores subsampled U,V values. YUV
format stores U and V values for every pixel. The same U, V values are duplicated for 2 rows and
2 columns (2x2) pixels in order to get the required data in the YUV444 format.
API Syntax
template
void iyuv2yuv4(xf::Mat & src_y, xf::Mat & src_u,xf::Mat &
src_v,xf::Mat & _y_image, xf::Mat & _u_image, xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 98: iyuv2yuv4 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_u

Input U plane of size (ROWS/4, COLS).

src_v

Input V plane of size (ROWS/4, COLS).

_y_image

Output Y image of size (ROWS, COLS).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
77

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 98: iyuv2yuv4 Function Parameter Descriptions (cont'd)
Parameter

Description

_u_image

Output U image of size (ROWS, COLS).

_v_image

Output V image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of IYUV to YUV4 for different
configurations, generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a HD (1080x1920) image.
Table 99: iyuv2yuv4 Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

1398

870

232

8 pixel

150

0

0

2134

1214

304

Performance Estimate
The following table summarizes the performance of IYUV to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 100: iyuv2yuv4 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

13.8

8 pixel operation (150 MHz)

3.4

NV12 to IYUV
The nv122iyuv function converts NV12 format to IYUV format. The function inputs the
interleaved UV plane and the outputs are separate U and V planes. There is no need of
processing the Y plane as both the formats have a same Y plane. U and V values are rearranged
from pixel interleaved to plane interleaved.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
78

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void nv122iyuv(xf::Mat & src_y, xf::Mat & src_uv,xf::Mat &
_y_image,xf::Mat & _u_image,xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 101: nv122iyuv Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS/4, COLS).

_v_image

Output V plane of size (ROWS/4, COLS).

Resource Utilization
The following table summarizes the resource utilization of NV12 to IYUV for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 102: nv122iyuv Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

1

1344

717

208

8 pixel

150

0

1

1961

1000

263

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
79

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance of NV12 to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 103: nv122iyuv Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

NV12 to RGBA
The nv122rgba function converts NV12 image format to a 4-channel RGBA image. The inputs
to the function are separate Y and UV planes. NV12 holds sub sampled data, Y plane is sampled
at unit rate and 1 U and 1V value each for every 2x2 Y values. To generate the RGBA data, each
U and V value is duplicated (2x2) times.
API Syntax
template
void nv122rgba(xf::Mat & src_y,xf::Mat & src_uv,xf::Mat & _dst0)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 104: nv122rgba Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

DST_T

Output pixel type. Only 8-bit,unsigned,4channel is supported (XF_8UC4).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_dst0

Output RGBA image of size (ROWS, COLS).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
80

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of NV12 to RGBA for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 105: nv122rgba Function Resource Utilization Summary

Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

2

DSP_48Es
5

FF
1191

LUT

CLB

708

195

Performance Estimate
The following table summarizes the performance of NV12 to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 106: nv122rgba Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
6.9

NV12 to YUV4
The nv122yuv4 function converts a NV12 image format to a YUV444 format. The function
outputs separate U and V planes. Y plane is same for both the image formats. The UV planes are
duplicated 2x2 times to represent one U plane and V plane of the YUV444 image format.
API Syntax
template
void nv122yuv4(xf::Mat & src_y, xf::Mat & src_uv,xf::Mat &
_y_image, xf::Mat & _u_image,xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
81

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 107: nv122yuv4 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS, COLS).

_v_image

Output V plane of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of NV12 to YUV4 for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 108: nv122yuv4 Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

1383

832

230

8 pixel

150

0

0

1772

1034

259

Performance Estimate
The following table summarizes the performance of NV12 to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 109: nv122yuv4 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

13.8

8 pixel operation (150 MHz)

3.4

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
82

Send Feedback

Chapter 3: xfOpenCV Library API Reference

NV21 to IYUV
The nv212iyuv function converts a NV21 image format to an IYUV image format. The input to
the function is the interleaved VU plane only and the outputs are separate U and V planes. There
is no need of processing Y plane as both the formats have same the Y plane. U and V values are
rearranged from pixel interleaved to plane interleaved.
API Syntax
template
void nv212iyuv(xf::Mat & src_y, xf::Mat & src_uv,xf::Mat &
_y_image, xf::Mat & _u_image,xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 110: nv212iyuv Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS/4, COLS).

_v_image

Output V plane of size (ROWS/4, COLS).

Resource Utilization
The following table summarizes the resource utilization of NV21 to IYUV for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
83

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 111: nv212iyuv Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

1

1377

730

219

8 pixel

150

0

1

1975

1012

279

Performance Estimate
The following table summarizes the performance of NV21 to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 112: nv212iyuv Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

NV21 to RGBA
The nv212rgba function converts a NV21 image format to a 4-channel RGBA image. The inputs
to the function are separate Y and VU planes. NV21 holds sub sampled data, Yplane is sampled
at unit rate and 1 U and 1V value each for every 2x2 Yvalues. To generate the RGBA data, each U
and V value is duplicated (2x2) times.
API Syntax
template
void nv212rgba(xf::Mat & src_y, xf::Mat & src_uv,xf::Mat & _dst0)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 113: nv212rgba Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

DST_T

Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
84

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 113: nv212rgba Function Parameter Descriptions (cont'd)
Parameter

Description

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_dst0

Output RGBA image of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of NV21 to RGBA for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 114: nv212rgba Function Resource Utilization Summary

Operating
Mode
1 pixel

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

2

DSP_48Es
5

FF
1170

LUT

CLB

673

183

Performance Estimate
The following table summarizes the performance of NV12 to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 115: nv212rgba Function Performance Estimate Summary
Latency Estimate

Operating Mode
1 pixel operation (300 MHz)

Max Latency (ms)
6.9

NV21 to YUV4
The nv212yuv4 function converts an image in the NV21 format to a YUV444 format. The
function outputs separate U and V planes. Y plane is same for both formats. The UV planes are
duplicated 2x2 times to represent one U plane and V plane of YUV444 format.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
85

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void nv212yuv4(xf::Mat & src_y, xf::Mat & src_uv, xf::Mat &
_y_image, xf::Mat & _u_image, xf::Mat & _v_image)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 116: nv212yuv4 Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

UV_T

Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).

ROWS

Maximum height of input and output image (must be a multiple of 8).

COLS

Maximum width of input and output image (must be a multiple of 8).

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

NPC_UV

Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.

src_y

Input Y plane of size (ROWS, COLS).

src_uv

Input UV plane of size (ROWS/2, COLS/2).

_y_image

Output Y plane of size (ROWS, COLS).

_u_image

Output U plane of size (ROWS, COLS).

_v_image

Output V plane of size (ROWS, COLS).

Resource Utilization
The following table summarizes the resource utilization of NV21 to YUV4 for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 117: nv212yuv4 Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

1383

817

233

8 pixel

150

0

0

1887

1087

287

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
86

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance of NV21 to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 118: nv212yuv4 Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

13.8

8 pixel operation (150 MHz)

3.5

Color Thresholding
The colorthresholding function compares the color space values of the source image with
low and high threshold values, and returns either 255 or 0 as the output.
API Syntax
template
void colorthresholding(xf::Mat &
_src_mat,xf::Mat & _dst_mat,unsigned char
low_thresh[MAXCOLORS*3], unsigned char high_thresh[MAXCOLORS*3])

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)

DST_T

Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

MAXCOLORS

Maximum number of color values

ROWS

Maximum height of input and output image

COLS

Maximum width of input and output image

NPC

Number of pixels to be processed per cycle. Only XF_NPPC1 supported.

_src_mat

Input image

_dst_mat

Thresholded image

low_thresh

Lowest threshold values for the colors

high_thresh

Highest threshold values for the colors

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
87

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Custom Convolution
The filter2D function performs convolution over an image using a user-defined kernel.
Convolution is a mathematical operation on two functions f and g, producing a third function,
The third function is typically viewed as a modified version of one of the original functions, that
gives the area overlap between the two functions to an extent that one of the original functions
is translated.
The filter can be unity gain filter or a non-unity gain filter. The filter must be of type AU_16SP. If
the co-efficients are floating point, it must be converted into the Qm.n and provided as the input
as well as the shift parameter has to be set with the ‘n’ value. Else, if the input is not of floating
point, the filter is provided directly and the shift parameter is set to zero.
API Syntax
template
void filter2D(xf::Mat & _src_mat,xf::Mat & _dst_mat,short int
filter[FILTER_HEIGHT*FILTER_WIDTH],unsigned char _shift)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 119: filter2D Function Parameter Descriptions
Parameter

Description

BORDER_TYPE

Border Type supported is XF_BORDER_CONSTANT

FILTER_HEIGHT

Number of rows in the input filter

FILTER_WIDTH

Number of columns in the input filter

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type.8-bit unsigned single channel (XF_8UC1) and 16-bit signed single
channel (XF_16SC1) supported.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

filter

The input filter of any size, provided the dimensions should be an odd number. The filter
co-efficients either a 16-bit value or a 16-bit fixed point equivalent value.

_shift

The filter must be of type XF_16SP. If the co-efficients are floating point, it must be
converted into the Qm.n and provided as the input as well as the shift parameter has to
be set with the ‘n’ value. Else, if the input is not of floating point, the filter is provided
directly and the shift parameter is set to zero.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
88

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 120: filter2D Function Resource Utilization Summary

Operating
Mode
1 pixel
8 pixel

Filter Size

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

3x3

300

3

9

1701

1161

269

5x5

300

5

25

3115

2144

524

3x3

150

6

72

2783

2768

638

5x5

150

10

216

3020

4443

1007

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 121: filter2D Function Performance Estimate Summary

Operating Mode
1 pixel
8 pixel

Latency Estimate

Operating Frequency

Filter Size

(MHz)

Max (ms)

300

3x3

7

300

5x5

7.1

150

3x3

1.86

150

5x5

1.86

Delay
In image processing pipelines, it is possible that the inputs to a function with FIFO interfaces are
not synchronized. That is, the first data packet for first input might arrive a finite number of clock
cycles after the first data packet of the second input. If the function has FIFOs at its interface
with insufficient depth, this causes the whole design to stall on hardware. To synchronize the
inputs, we provide this function to delay the input packet that arrives early, by a finite number of
clock cycles.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
89

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void delayMat(xf::Mat & _src,
xf::Mat & _dst)

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter

Description

SRC_T

Input and output pixel type

ROWS

Maximum height of input and output image (must be multiple of 8)

COLS

Maximum width of input and output image (must be multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

MAXDELAY

Maximum delay that the function is to be instantiated for.

_src

Input image

_dst

Output image

Dilate
During a dilation operation, the current pixel intensity is replaced by the maximum value of the
intensity in a 3x3 neighborhood of the current pixel.

dst(x, y) =

max
'

x-1≤ x ≤ x+1

src⎛⎝x ', y '⎞⎠

y - 1 ≤ y' ≤ y + 1

API Syntax
template
void dilate(xf::Mat & _src_mat, xf::Mat & _dst_mat)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 122: dilate Function Parameter Descriptions
Parameter
BORDER_TYPE

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Description
Border Type supported is XF_BORDER_CONSTANT

www.xilinx.com

[placeholder text]
90

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 122: dilate Function Parameter Descriptions (cont'd)
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for
1 pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

Resource Utilization
The following table summarizes the resource utilization of the Dilation function for 1 pixel
operation and 8 pixel operation, generated using Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 123: dilate Function Resource Utilization Summary
Resource Utilization
Name

1 pixel per clock operation

8 pixel per clock operation

300 MHz

150 MHz

BRAM_18K

3

6

DSP48E

0

0

FF

339

644

LUT

350

1325

CLB

81

245

Performance Estimate
The following table summarizes a performance estimate of the Dilation function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 tool for Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 124: dilate Function Performance Estimate Summary
Latency Estimate

Operating Mode

Min (ms)

Max (ms)

1 pixel (300 MHz)

7.0

7.0

8 pixel (150 MHz)

1.87

1.87

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
91

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Duplicate
When various functions in a pipeline are implemented by a programmable logic, FIFOs are
instantiated between two functions for dataflow processing. When the output from one function
is consumed by two functions in a pipeline, the FIFOs need to be duplicated. This function
facilitates the duplication process of the FIFOs.
API Syntax
template
void duplicateMat(xf::Mat & _src,
xf::Mat & _dst1,xf::Mat &
_dst2)

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter

Description

SRC_T

Input and output pixel type

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle. Possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.

_src

Input image

_dst1

Duplicate output for _src

_dst2

Duplicate output for _src

Erode
The erode function finds the minimum pixel intensity in the 3x3 neighborhood of a pixel and
replaces the pixel intensity with the minimum value.

dst(x, y) =

min
'

x-1≤ x ≤ x+1

src⎛⎝x ', y '⎞⎠

y - 1 ≤ y' ≤ y + 1

API Syntax
template
void erode(xf::Mat & _src_mat, xf::Mat & _dst_mat)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
92

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 125: erode Function Parameter Descriptions
Parameter

Description

BORDER_TYPE

Border type supported is XF_BORDER_CONSTANT

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

Resource Utilization
The following table summarizes the resource utilization of the Erosion function generated using
Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 126: erode Function Resource Utilization Summary
Resource Utilization
Name

1 pixel per clock operation

8 pixel per clock operation

300 MHz

150 MHz

BRAM_18K

3

6

DSP48E

0

0

FF

342

653

LUT

351

1316

CLB

79

230

Performance Estimate
The following table summarizes a performance estimate of the Erosion function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 tool for Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 127: erode Function Performance Estimate Summary
Latency Estimate

Operating Mode

Min (ms)

Max (ms)

1 pixel (300 MHz)

7.0

7.0

8 pixel (150 MHz)

1.85

1.85

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
93

Send Feedback

Chapter 3: xfOpenCV Library API Reference

FAST Corner Detection
Features from accelerated segment test (FAST) is a corner detection algorithm, that is faster than
most of the other feature detectors.
The fast function picks up a pixel in the image and compares the intensity of 16 pixels in its
neighborhood on a circle, called the Bresenham's circle. If the intensity of 9 contiguous pixels is
found to be either more than or less than that of the candidate pixel by a given threshold, then
the pixel is declared as a corner. Once the corners are detected, the non-maximal suppression is
applied to remove the weaker corners.
This function can be used for both still images and videos. The corners are marked in the image.
If the corner is found in a particular location, that location is marked with 255, otherwise it is
zero.
API Syntax
template
void fast(xf::Mat & _src_mat,xf::Mat & _dst_mat,unsigned char _threshold)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 128: fast Function Parameter Descriptions
Parameter

Description

NMS

If NMS == 1, non-maximum suppression is applied to detected corners (keypoints). The value
should be 0 or 1.

SRC_T

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)

ROWS

Maximum height of input image (must be a multiple of 8)

COLS

Maximum width of input image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image. The corners are marked in the image.

_threshold

Threshold on the intensity difference between the center pixel and its neighbors. Usually it is taken
around 20.

Resource Utilization
The following table summarizes the resource utilization of the kernel for different configurations,
generated using Vivado HLS 2018.2 for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image with NMS.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
94

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 129: fast Function Resource Utilization Summary
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

10

20

DSP48E

0

0

FF

2695

7310

LUT

3792

20956

CLB

769

3519

Performance Estimate
The following table summarizes the performance of kernel for different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image with non-maximum suppression (NMS).
Table 130: fast Function Performance Estimate Summary

Operating Mode

Latency Estimate

Operating Frequency

Filter Size

(MHz)

Max (ms)

1 pixel

300

3x3

7

8 pixel

150

3x3

1.86

Gaussian Filter
The GaussianBlur function applies Gaussian blur on the input image. Gaussian filtering is done
by convolving each point in the input image with a Gaussian kernel.
-(x - μ x) 2

G 0(x, y) = e

2σ 2x

+

-(y - μ y) 2
2σ 2y

μ
σ
Where μ x , y are the mean values and σ x , y are the variances in x and y directions
μ
respectively. In the GaussianBlur function, values of μ x , y are considered as zeroes and the
σ
values of σ x , y are equal.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
95

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void GaussianBlur(xf::Mat & src, xf::Mat & dst, float sigma)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 131: GaussianBlur Function Parameter Descriptions
Parameter

Description

FILTER_SIZE

Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are
supported.

BORDER_TYPE

Border type supported is XF_BORDER_CONSTANT

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for
1 pixel and 8 pixel operations respectively.

src

Input image

dst

Output image

sigma

Standard deviation of Gaussian filter

Resource Utilization
The following table summarizes the resource utilization of the Gaussian Filter in different
configurations, generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to progress a grayscale HD (1080x1920) image.
Table 132: GaussianBlur Function Resource Utilization Summary

Operating
Mode
1 pixel

8 pixel

Utilization Estimate

Operating
Frequency

Filter Size

(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

3x3

300

3

17

3641

2791

610

5x5

300

5

27

4461

3544

764

7x7

250

7

35

4770

4201

894

3x3

150

6

52

3939

3784

814

5x5

150

10

111

5688

5639

1133

7x7

150

14

175

7594

7278

1518

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
96

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes a performance estimate of the Gaussian Filter in different
configurations, as generated using Vivado HLS 2018.2 tool for Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process a grayscale HD (1080x1920) image.
Table 133: GaussianBlur Function Performance Estimate Summary
Operating Mode
1 pixel operation (300 MHz)

8 pixel operation (150 MHz)

Latency Estimate

Filter Size

Max Latency (ms)

3x3

7.01

5x5

7.03

7x7

7.06

3x3

1.6

5x5

1.7

7x7

1.74

Gradient Magnitude
The magnitude function computes the magnitude for the images. The input images are xgradient and y-gradient images of type 16S. The output image is of same type as the input image.
For L1NORM normalization, the magnitude computed image is the pixel-wise added image of
absolute of x-gradient and y-gradient, as shown below:.

| |

g = |g x| + g y

For L2NORM normalization, the magnitude computed image is as follows:

g=

⎛ 2
⎝g x

+ g 2y⎞⎠

API Syntax
template< int NORM_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1>
void magnitude(xf::Mat & _src_matx,xf::Mat & _src_maty,xf::Mat & _dst_mat)

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
97

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 134: magnitude Function Parameter Descriptions
Parameter

Description

NORM_TYPE

Normalization type can be either L1 or L2 norm. Values are XF_L1NORM or XF_L2NORM

SRC_T

Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

DST_T

Output pixel type. Only 16-bit, signed,1 channel is supported (XF_16SC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.

_src_matx

First input, x-gradient image.

_src_maty

Second input, y-gradient image.

_dst_mat

Output, magnitude computed image.

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image and for L2 normalization.
Table 135: magnitude Function Resource Utilization Summary
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

0

0

DSP48E

2

16

FF

707

2002

LUT

774

3666

CLB

172

737

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image and for L2 normalization.
Table 136: magnitude Function Performance Estimate Summary
Operating Mode

Latency Estimate

Operating Frequency (MHz)

Max (ms)

1 pixel

300

7.2

8 pixel

150

1.7

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
98

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Gradient Phase
The phase function computes the polar angles of two images. The input images are x-gradient
and y-gradient images of type 16S. The output image is of same type as the input image.
For radians:

angle(x, y) = atan2⎛⎝g y, g x⎞⎠
For degrees:

angle(x, y) = atan2(g y, g x)* 180
π
API Syntax
template
void phase(xf::Mat & _src_matx,xf::Mat & _src_maty,xf::Mat & _dst_mat)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 137: phase Function Parameter Descriptions
Parameter
RET_TYPE

Description
Output format can be either in radians or degrees. Options are XF_RADIANS or XF_DEGREES.

•

If the XF_RADIANS option is selected, phase API will return result in Q4.12 format. The output
range is (0, 2 pi)

•

If the XF_DEGREES option is selected, xFphaseAPI will return result in Q10.6 degrees and
output range is (0, 360)

SRC_T

Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

DST_T

Output pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_matx

First input, x-gradient image.

_src_maty

Second input, y-gradient image.

_dst_mat

Output, phase computed image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
99

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 138: phase Function Resource Utilization Summary
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

6

24

DSP48E

6

19

FF

873

2396

LUT

753

3895

CLB

185

832

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 139: phase Function Performance Estimate Summary
Operating Mode

Operating Frequency (MHz)

Latency Estimate (ms)

1 pixel

300

7.2

8 pixel

150

1.7

Deviation from OpenCV
In phase implementation, the output is returned in a fixed point format. If XF_RADIANS option is
selected, phase API will return result in Q4.12 format. The output range is (0, 2 pi). If
XF_DEGREES option is selected, phase API will return result in Q10.6 degrees and output range
is (0, 360);

Harris Corner Detection
In order to understand Harris Corner Detection, let us consider a grayscale image. Sweep a
window w(x,y) (with displacements u in the x-direction and v in the y-direction), I calculates
the variation of intensity w(x,y).

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
100

Send Feedback

Chapter 3: xfOpenCV Library API Reference

E(u, v) = ∑ w(x, y)⎡⎣I(x + u, y + v) - I(x, y)⎤⎦ 2
Where:
• w(x,y) is the window position at (x,y)
• I(x,y) is the intensity at (x,y)
• I(x+u,y+v) is the intensity at the moved window (x+u,y+v).
Since we are looking for windows with corners, we are looking for windows with a large variation
in intensity. Hence, we have to maximize the equation above, specifically the term:

I(x + u, y + v) - I(x, y)⎤⎦ 2

⎡
⎣

Using Taylor expansion:

E(u, v) = ∑ ⎡⎣I(x, y) + uI x + vI y - I(x, y)⎤⎦ 2
Expanding the equation and cancelling I(x,y) with -I(x,y):

E(u, v) = ∑ u 2I 2x + 2uvI x I y + v 2 I 2y
The above equation can be expressed in a matrix form as:

⎛
I 2x I x I y ⎞ u
⟧ ⎟⟦v ⟧
E(u, v) = [u v]⎜∑ w(x, y)⟦
I x I y I 2y
⎝
⎠
So, our equation is now:

u
E(u, v) = [u v]M⟦v ⟧
A score is calculated for each window, to determine if it can possibly contain a corner:

R = det(M) - k(trace(M)) 2
Where,
• det(M) = λ 1 λ 2
• trace(M) = λ 1 + λ 2
Non-Maximum Suppression:

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
101

Send Feedback

Chapter 3: xfOpenCV Library API Reference

In non-maximum suppression (NMS) if radius = 1, then the bounding box is 2*r+1 = 3.
In this case, consider a 3x3 neighborhood across the center pixel. If the center pixel is greater
than the surrounding pixel, then it is considered a corner. The comparison is made with the
surrounding pixels, which are within the radius.
Radius = 1
x-1, y-1

x-1, y

x-1, y+1

x, y-1

x, y

x, y+1

x+1, y-1

x+1, y

x+1, y+1

Threshold:
A threshold=442, 3109 and 566 is used for 3x3, 5x5, and 7x7 filters respectively. This threshold
is verified over 40 sets of images. The threshold can be varied, based on the application. The
corners are marked in the output image. If the corner is found in a particular location, that
location is marked with 255, otherwise it is zero.
API Syntax
template
void cornerHarris(xf::Mat & src,xf::Mat & dst,uint16_t threshold, uint16_t k)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 140: cornerHarris Function Parameter Descriptions
Parameter

Description

FILTERSIZE

Size of the Sobel filter. 3, 5, and 7 supported.

BLOCKWIDTH

Size of the box filter. 3, 5, and 7 supported.

NMSRADIUS

Radius considered for non-maximum suppression. Values supported are 1 and 2.

TYPE

Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)

ROWS

Maximum height of input image (must be a multiple of 8)

COLS

Maximum width of input image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src

Input image

dst

Output image.

threshold

Threshold applied to the corner measure.

k

Harris detector parameter

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
102

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the Harris corner detection in different
configurations, generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=3 and
NMS_RADIUS =1
Table 141: Resource Utilization Summary - For Sobel Filter = 3, Box filter=3 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

33

66

DSP48E

10

80

FF

3254

9330

LUT

3522

13222

CLB

731

2568

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=5 and
NMS_RADIUS =1
Table 142: Resource Utilization Summary - Sobel Filter = 3, Box filter=5 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

45

90

DSP48E

10

80

FF

5455

12459

LUT

5675

24594

CLB

1132

4498

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=7 and
NMS_RADIUS =1

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
103

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 143: Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

57

114

DSP48E

10

80

FF

8783

16593

LUT

9157

39813

CLB

1757

6809

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=3 and
NMS_RADIUS =1
Table 144: Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

200 MHz

BRAM_18K

35

70

DSP48E

10

80

FF

4656

11659

LUT

4681

17394

CLB

1005

3277

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=5 and
NMS_RADIUS =1
Table 145: Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

47

94

DSP48E

10

80

FF

6019

14776

LUT

6337

28795

CLB

1353

5102

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
104

Send Feedback

Chapter 3: xfOpenCV Library API Reference

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=7 and
NMS_RADIUS =1
Table 146: Resource Utilization Summary - Sobel Filter = 5, Box filter=7 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

59

118

DSP48E

10

80

FF

9388

18913

LUT

9414

43070

CLB

1947

7508

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=3 and
NMS_RADIUS =1
Table 147: Resource Utilization Summary - Sobel Filter = 7, Box filter=3 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

37

74

DSP48E

11

88

FF

6002

13880

LUT

6337

25573

CLB

1327

4868

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=5 and
NMS_RADIUS =1
Table 148: Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

49

98

DSP48E

11

88

FF

7410

17049

LUT

8076

36509

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
105

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 148: Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and
NMS_RADIUS =1 (cont'd)
Resource Utilization
Name
CLB

1 pixel

8 pixel

300 MHz

150 MHz

1627

6518

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=7 and
NMS_RADIUS =1
Table 149: Resource Utilization Summary - Sobel Filter = 7, Box filter=7 and
NMS_RADIUS =1
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

61

122

DSP48E

11

88

FF

10714

21137

LUT

11500

51331

CLB

2261

8863

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=3 and
NMS_RADIUS =2
Table 150: Resource Utilization Summary - Sobel Filter = 3, Box filter=3 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

41

82

DSP48E

10

80

FF

5519

10714

LUT

5094

16930

CLB

1076

3127

Resource utilization: For Sobel Filter = 3, Box filter=5 and NMS_RADIUS =2

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
106

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 151: Resource Utilization Summary
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

53

106

DSP48E

10

80

FF

6798

13844

LUT

6866

28286

CLB

1383

4965

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=7 and
NMS_RADIUS =2
Table 152: Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

65

130

DSP48E

10

80

FF

10137

17977

LUT

10366

43589

CLB

1940

7440

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=3 and
NMS_RADIUS =2
Table 153: Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

43

86

DSP48E

10

80

FF

5957

12930

LUT

5987

21187

CLB

1244

3922

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=5 and
NMS_RADIUS =2

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
107

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 154: Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

55

110

DSP48E

10

80

FF

5442

16053

LUT

6561

32377

CLB

1374

5871

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=7 and
NMS_RADIUS =2
Table 155: Resource Utilization Summary - Sobel Filter = 5, Box filter=7 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

67

134

DSP48E

10

80

FF

10673

20190

LUT

10793

46785

CLB

2260

8013

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=3 and
NMS_RADIUS =2
Table 156: Resource Utilization Summary - Sobel Filter = 7, Box filter=3 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

45

90

DSP48E

11

88

FF

7341

15161

LUT

7631

29185

CLB

1557

5425

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
108

Send Feedback

Chapter 3: xfOpenCV Library API Reference

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=5 and
NMS_RADIUS =2
Table 157: Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

57

114

DSP48E

11

88

FF

8763

18330

LUT

9368

40116

CLB

1857

7362

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=7 and
NMS_RADIUS =2
Table 158: Resource Utilization Summary - Sobel Filter = 7, Box filter=7 and
NMS_RADIUS =2
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

BRAM_18K

69

138

DSP48E

11

88

FF

12078

22414

LUT

12831

54652

CLB

2499

9628

Performance Estimate
The following table summarizes a performance estimate of the Harris corner detection in
different configurations, as generated using Vivado HLS 2018.2 tool for Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
Table 159: cornerHarris Function Performance Estimate Summary

Operating
Mode

Operating
Frequency

Configuration

Latency Estimate

Box

Latency(In ms)

Sobel

(MHz)

NMS Radius

1 pixel

300 MHz

3

3

1

7

1 pixel

300 MHz

3

5

1

7.1

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
109

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 159: cornerHarris Function Performance Estimate Summary (cont'd)

Operating
Mode

Operating
Frequency

Configuration

Latency Estimate

Box

Latency(In ms)

Sobel

(MHz)

NMS Radius

1 pixel

300 MHz

3

7

1

7.1

1 pixel

300 MHz

5

3

1

7.2

1 pixel

300 MHz

5

5

1

7.2

1 pixel

300 MHz

5

7

1

7.2

1 pixel

300 MHz

7

3

1

7.22

1 pixel

300 MHz

7

5

1

7.22

1 pixel

300 MHz

7

7

1

7.22

8 pixel

150 MHz

3

3

1

1.7

8 pixel

150 MHz

3

5

1

1.7

8 pixel

150 MHz

3

7

1

1.7

8 pixel

150 MHz

5

3

1

1.71

8 pixel

150 MHz

5

5

1

1.71

8 pixel

150 MHz

5

7

1

1.71

8 pixel

150 MHz

7

3

1

1.8

8 pixel

150 MHz

7

5

1

1.8

8 pixel

150 MHz

7

7

1

1.8

1 pixel

300 MHz

3

3

2

7.1

1 pixel

300 MHz

3

5

2

7.1

1 pixel

300 MHz

3

7

2

7.1

1 pixel

300 MHz

5

3

2

7.21

1 pixel

300 MHz

5

5

2

7.21

1 pixel

300 MHz

5

7

2

7.21

1 pixel

300 MHz

7

3

2

7.22

1 pixel

300 MHz

7

5

2

7.22

1 pixel

300 MHz

7

7

2

7.22

8 pixel

150 MHz

3

3

2

1.8

8 pixel

150 MHz

3

5

2

1.8

8 pixel

150 MHz

3

7

2

1.8

8 pixel

150 MHz

5

3

2

1.81

8 pixel

150 MHz

5

5

2

1.81

8 pixel

150 MHz

5

7

2

1.81

8 pixel

150 MHz

7

3

2

1.9

8 pixel

150 MHz

7

5

2

1.91

8 pixel

150 MHz

7

7

2

1.92

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
110

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Deviation from OpenCV
In xfOpenCV thresholding and NMS are included, but in OpenCV they are not included. In
xfOpenCV, all the blocks are implemented in fixed point. Whereas,in OpenCV, all the blocks are
implemented in floating point.

Histogram Computation
The calcHist function computes the histogram of given input image.

H ⎡⎣src(x, y)⎤⎦ = H ⎡⎣src(x, y)⎤⎦ + 1

Where, H is the array of 256 elements.
API Syntax
template
void calcHist(xf::Mat & _src, uint32_t *histogram)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 160: calcHist Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle

_src

Input image

histogram

Output array of 256 elements

Resource Utilization
The following table summarizes the resource utilization of the calcHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz for 1 pixel case
and at 150 MHz for 8 pixel mode.
Table 161: calcHist Function Resource Utilization Summary
Resource Utilization
Name
BRAM_18K

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Normal Operation (1 pixel)
2

Resource Optimized (8 pixel)
16

www.xilinx.com

[placeholder text]
111

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 161: calcHist Function Resource Utilization Summary (cont'd)
Resource Utilization
Name

Normal Operation (1 pixel)

Resource Optimized (8 pixel)

DSP48E

0

0

FF

196

274

LUT

240

912

CLB

57

231

Performance Estimate
The following table summarizes a performance estimate of the calcHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz for 1 pixel and
150 MHz for 8 pixel mode.
Table 162: calcHist Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max (ms)

1 pixel

6.9

8 pixel

1.7

Histogram Equalization
The equalizeHist function performs histogram equalization on input image or video. It
improves the contrast in the image, to stretch out the intensity range. This function maps one
distribution (histogram) to another distribution (a wider and more uniform distribution of
intensity values), so the intensities are spread over the whole range.
For histogram H[i], the cumulative distribution H'[i] is given as:

H'[i] = ∑

0≤ j
void equalizeHist(xf::Mat & _src,xf::Mat & _src1,xf::Mat & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 163: equalizeHist Function Parameter Descriptions
Parameter

Description

SRC_T

Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle

_src

Input image

_src1

Input image

_dst

Output image

Resource Utilization
The following table summarizes the resource utilization of the equalizeHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz for 1 pixel and
150 MHz for 8 pixel mode.
Table 164: equalizeHist Function Resource Utilization Summary
Operating
Mode

Operating Frequency
(MHz)

Utilization Estimate
BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

4

5

3492

1807

666

8 pixel

150

25

5

3526

2645

835

Performance Estimate
The following table summarizes a performance estimate of the equalizeHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz for 1 pixel and
150 MHz for 8 pixel mode.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
113

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 165: equalizeHist Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max (ms)

1 pixel per clock operation

13.8

8 pixel per clock operation

3.4

HOG
The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision for the
purpose of object detection. The feature descriptors produced from this approach is widely used
in the pedestrian detection.
The technique counts the occurrences of gradient orientation in localized portions of an image.
HOG is computed over a dense grid of uniformly spaced cells and normalized over overlapping
blocks, for improved accuracy. The concept behind HOG is that the object appearance and shape
within an image can be described by the distribution of intensity gradients or edge direction.
Both RGB and gray inputs are accepted to the function. In the RGB mode, gradients are
computed for each plane separately, but the one with the higher magnitude is selected. With the
configurations provided, the window dimensions are 64x128, block dimensions are 16x16.
API Syntax
template
void HOGDescriptor(xf::Mat &_in_mat,
xf::Mat &_desc_mat);

Parameter Descriptions
The following table describes the template parameters.
Table 166: HOGDescriptor Template Parameter Descriptions
PARAMETERS

DESCRIPTION

WIN_HEIGHT

The number of pixel rows in the window. This must be a multiple of 8 and should not exceed the
number of image rows.

WIN_WIDTH

The number of pixel cols in the window. This must be a multiple of 8 and should not exceed the
number of image columns.

WIN_STRIDE

The pixel stride between two adjacent windows. It is fixed at 8.

BLOCK_HEIGHT

Height of the block. It is fixed at 16.

BLOCK_WIDTH

Width of the block. It is fixed at 16.

CELL_HEIGHT

Number of rows in a cell. It is fixed at 8.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
114

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 166: HOGDescriptor Template Parameter Descriptions (cont'd)
PARAMETERS

DESCRIPTION

CELL_WIDTH

Number of cols in a cell. It is fixed at 8.

NOB

Number of histogram bins for a cell. It is fixed at 9

DESC_SIZE

The size of the output descriptor.

IMG_COLOR

The type of the image, set as either XF_GRAY or XF_RGB

OUTPUT_VARIENT

Must be either XF_HOG_RB or XF_HOG_NRB

SRC_T

Input pixel type. Must be either XF_8UC1 or XF_8UC4, for gray and color respectively.

DST_T

Output descriptor type. Must be XF_32UC1.

ROWS

Number of rows in the image being processed. (Should be a multiple of 8)

COLS

Number of columns in the image being processed. (Should be a multiple of 8)

NPC

Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.

The following table describes the function parameters.
Table 167: HOGDescriptor Function Parameter Descriptions
PARAMETERS

DESCRIPTION

_in_mat

Input image, of xf::Mat type

_desc_mat

Output descriptors, of xf::Mat type

Where,
• NO is normal operation (single pixel processing)
• RB is repetitive blocks (descriptor data are written window wise)
• NRB is non-repetitive blocks (descriptor data are written block wise, in order to reduce the
number of writes).
Note: In the RB mode, the block data is written to the memory taking the overlap windows into
consideration. In the NRB mode, the block data is written directly to the output stream without
consideration of the window overlap. In the host side, the overlap must be taken care.

Resource Utilization
The following table shows the resource utilization of HOGDescriptor function for normal
operation (1 pixel) mode as generated in Vivado HLS 2018.2 version tool for the part Xilinx
Xczu9eg-ffvb1156-1-i-es1 at 300 MHz to process an image of 1920x1080 resolution.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
115

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 168: HOGDescriptor Function Resource Utilization Summary
Utilization (at 300 MHz) of 1 pixel operation
Resource

NRB

RB

Gray

RGB

Gray

RGB

BRAM_18K

43

49

171

177

DSP48E

34

46

36

48

FF

15365

15823

15205

15663

LUT

12868

13267

13443

13848

Performance Estimate
The following table shows the performance estimates of HOGDescriptor() function for different
configurations as generated in Vivado HLS 2018.2 version tool for the part Xilinx Xczu9egffvb1156-1-i-es1 to process an image of 1920x1080p resolution.
Table 169: HOGDescriptor Function Performance Estimate Summary
Operating Mode

Latency Estimate

Operating Frequency
(MHz)

Min (ms)

Max (ms)

NRB-Gray

300

6.98

8.83

NRB-RGBA

300

6.98

8.83

RB-Gray

300

176.81

177

RB-RGBA

300

176.81

177

Deviations from OpenCV
Listed below are the deviations from the OpenCV:
1. Border care
The border care that OpenCV has taken in the gradient computation is
BORDER_REFLECT_101, in which the border padding will be the neighboring pixels'
reflection. Whereas, in the Xilinx implementation, BORDER_CONSTANT (zero padding) was
used for the border care.
2. Gaussian weighing
The Gaussian weights are multiplied on the pixels over the block, that is a block has 256
pixels, and each position of the block are multiplied with its corresponding Gaussian weights.
Whereas, in the HLS implementation, gaussian weighing was not performed.
3. Cell-wise interpolation
The magnitude values of the pixels are distributed across different cells in the blocks but on
the corresponding bins.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
116

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Pixels in the region 1 belong only to its corresponding cells, but the pixels in region 2 and 3
are interpolated to the adjacent 2 cells and 4 cells respectively. This operation was not
performed in the HLS implementation.
4. Output handling
The output of the OpenCV will be in the column major form. In the HLS implementation,
output will be in the row major form. Also, the feature vector will be in the fixed point type
Q0.16 in the HLS implementation, while in the OpenCV it will be in floating point.
Limitations
1. The configurations are limited to Dalal's implementation
2. Image height and image width must be a multiple of cell height and cell width respectively.

Houghlines
The HoughLines function here is equivalent to HoughLines Standard in OpenCV. The
Houghlines function is used to detect straight lines in a binary image. To apply the Hough
transform, edge detection preprocessing is required. The input to the Hough transform is an edge
detected binary image. For each point (xi,yi) in a binary image, we define a family of lines that go
through the point as:
rho= xi cos(theta) + yi sin(theta)

1

N. Dalal, B. Triggs: Histograms of oriented gradients for human detection, IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, 2005.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
117

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Each pair of (rho,theta) represents a line that passes through the point (xi,yi). These (rho,theta)
pairs of this family of lines passing through the point form a sinusoidal curve in (rho,theta) plane.
If the sinusoids of N different points intersect in the (rho,theta) plane, then that intersection
(rho1, theta1) represents the line that passes through these N points. In the Houghlines function,
an accumulator is used to keep the count (also called voting) of all the intersection points in the
(rho,theta) plane. After voting, the function filters spurious lines by performing thinning, that is,
checking if the center vote value is greater than the neighborhood votes and threshold, then
making that center vote as valid and other wise making it zero. Finally, the function returns the
desired maximum number of lines (LINESMAX) in (rho,theta) form as output.
The design assumes the origin at the center of the image i.e at (Floor(COLS/2), Floor(ROWS/2)).
The ranges of rho and theta are:
theta = [0, pi)
rho=[-DIAG/2, DIAG/2), where DIAG = cvRound{SquareRoot( (COLS*COLS) +
(ROWS*ROWS))}

For ease of use, the input angles THETA, MINTHETA and MAXTHETA are taken in degrees, while
the output theta is in radians. The angle resolution THETA is declared as an integer, but treated
as a value in Q6.1 format (that is, THETA=3 signifies that the resolution used in the function is
1.5 degrees). When the output (rho, Ɵ theta) is used for drawing lines,you should be aware of the
fact that origin is at the center of the image.
API Syntax
template

void HoughLines(xf::Mat & _src_mat,float
outputrho[MAXLINES],float outputtheta[MAXLINES],short threshold,short
linesmax)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 170: Houghlines Function Parameter Descriptions
Parameter

Description

RHO

Distance resolution of the accumulator in pixels.

THETA

Angle resolution of the accumulator in degrees and Q6.1 format.

MAXLINES

Maximum number of lines to be detected

MINTHETA

Minimum angle in degrees to check lines.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
118

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 170: Houghlines Function Parameter Descriptions (cont'd)
Parameter

Description

MAXTHETA

Maximum angle in degrees to check lines

DIAG

Diagonal of the image. It should be cvRound(sqrt(rows*rows + cols*cols)/RHO)

SRC_T

Input Pixel Type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).

ROWS

Maximum height of input image

COLS

Maximum width of input image

NPC

Number of Pixels to be processed per cycle; Only single pixel supported XF_NPPC1.

_src_mat

Input image should be 8-bit, single-channel binary image.

outputrho

Output array of rho values. rho is the distance from the coordinate origin (center of the image).

outputtheta

Output array of theta values. Theta is the line rotation angle in radians.

threshold

Accumulator threshold parameter. Only those lines are returned that get enough votes (>threshold).

linesmax

Maximum number of lines.

Resource Utilization
The table below shows the resource utilization of the kernel for different configurations,
generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 to
process a grayscale HD (1080x1920) image for 512 lines.
Table 171: Houghlines Function Resource Utilization Summary
Resource Utilization
Name

THETA=1, RHO=1

BRAM_18K

542

DSP48E

10

FF

60648

LUT

56131

Performance Estimate
The following table shows the performance of kernel for different configurations, generated
using Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 to process a
grayscale HD (1080x1920) image for 512 lines.
Table 172: Houghlines Function Performance Estimate Summary
Operating Mode
THETA=1, RHO=1

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Latency Estimate

Operating Frequency (MHz)
300

Max (ms)
12.5

www.xilinx.com

[placeholder text]
119

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Pyramid Up
The pyrUp function is an image up-sampling algorithm. It first inserts zero rows and zero
columns after every input row and column making up to the size of the output image. The output

(2*rows × 2*columns)

image size is always
.The zero padded image is then smoothened
using Gaussian image filter. Gaussian filter for the pyramid-up function uses a fixed filter kernel
as given below:

⎡1
⎢4
1 ⎢6
256 ⎢
⎢4
⎣1

4
16
24
16
4

6
24
36
24
6

4
16
24
16
4

1⎤
4⎥
⎥
6⎥
4⎥
1⎦

However, to make up for the pixel intensity that is reduced due to zero padding, each output
pixel is multiplied by 4.
API Syntax
template
void pyrUp (xf::Mat & _src, xf::Mat & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 173: pyrUp Function Parameter Descriptions
Parameter

Description

TYPE

Pixel type. XF_8UC1 is the only supported pixel type.

ROWS

Maximum Height or number of output rows to build the hardware for this kernel

COLS

Maximum Width or number of output columns to build the hardware for this kernel

NPC

Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle
processing (XF_NPPC1).

_src

Input image stream

_dst

Output image stream

Resource Utilization
The following table summarizes the resource utilization of pyrUp for 1 pixel per cycle
implementation, for a maximum input image size of 1920x1080 pixels. The results are after
synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
120

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 174: pyrUp Function Resource Utilization Summary

Operating
Mode
1 Pixel

Utilization Estimate

Operating
Frequency
LUTs

(MHz)
300

FFs

1124

DSPs

1199

BRAMs

0

10

Performance Estimate
The following table summarizes performance estimates of pyrUp function on Vivado HLS 2018.2
for the Xilinx xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 175: pyrUp Function Performance Estimate Summary

Operating Mode
1 pixel

Operating Frequency
(MHz)
300

Latency Estimate
Input Image Size
1920x1080

Max (ms)
27.82

Pyramid Down
The pyrDown function is an image down-sampling algorithm which smoothens the image before
down-scaling it. The image is smoothened using a Gaussian filter with the following kernel:

⎡1
⎢4
1 ⎢6
256 ⎢
⎢4
⎣1

4
16
24
16
4

6
24
36
24
6

4
16
24
16
4

1⎤
4⎥
⎥
6⎥
4⎥
1⎦

Down-scaling is performed by dropping pixels in the even rows and the even columns. The
resulting image size is

⎛rows + 1
⎝
2

columns + 1 ⎞
⎠
2
.

API Syntax
template
void pyrDown (xf::Mat & _src, xf::Mat & _dst)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
121

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 176: pyrDown Function Parameter Descriptions
Parameter

Description

TYPE

Pixel type. XF_8UC1 is the only supported pixel type.

ROWS

Maximum Height or number of input rows to build the hardware for this kernel

COLS

Maximum Width or number of input columns to build the hardware for this kernel

NPC

Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle
processing (XF_NPPC1).

_src

Input image stream

_dst

Output image stream

Resource Utilization
The following table summarizes the resource utilization of pyrDown for 1 pixel per cycle
implementation, for a maximum input image size of 1920x1080 pixels. The results are after
synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-ffvb1156-1-i-es1 FPGA at 300 MHz.
Table 177: pyrDown Function Resource Utilization Summary

Operating
Mode
1 Pixel

Utilization Estimate

Operating
Frequency
LUTs

(MHz)
300

1171

FFs
1238

DSPs

BRAMs

1

5

Performance Estimate
The following table summarizes performance estimates of pyrDown function in Vivado HLS
2018.2 for the Xilinx xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 178: pyrDown Function Performance Estimate Summary

Operating Mode
1 pixel

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Operating Frequency
(MHz)
300

Latency Estimate
Input Image Size
1920x1080

www.xilinx.com

Max (ms)
6.99

[placeholder text]
122

Send Feedback

Chapter 3: xfOpenCV Library API Reference

InitUndistortRectifyMapInverse
The InitUndistortRectifyMapInverse function generates mapx and mapy, based on a set
of camera parameters, where mapx and mapy are inputs for the xf::remap function. That is, for
each pixel in the location (u, v) in the destination (corrected and rectified) image, the function
computes the corresponding coordinates in the source image (the original image from camera).
The InitUndistortRectifyMapInverse module is optimized for hardware, so the inverse of rotation
matrix is computed outside the synthesizable logic. Note that the inputs are fixed point, so the
floating point camera parameters must be type casted to Q12.20 format.
API Syntax
template< int CM_SIZE, int DC_SIZE, int MAP_T, int ROWS, int COLS, int NPC
>
void InitUndistortRectifyMapInverse ( ap_fixed<32,12> *cameraMatrix,
ap_fixed<32,12> *distCoeffs, ap_fixed<32,12> *ir, xf::Mat &_mapx_mat, xf::Mat &_mapy_mat, int
_cm_size, int _dc_size)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 179: InitUndistortRectifyMapInverse Function Parameter Descriptions
Parameter

Description

CM_SIZE

It must be set at the compile time, 9 for 3x3 matrix

DC_SIZE

It must be set at the compile time, must be 4,5 or 8

MAP_T

It is the type of output maps, and must be XF_32FC1

ROWS

Maximum image height, necessary to generate the output maps

COLS

Maximum image width, necessary to generate the output maps

NPC

Number of pixels per cycle. This function supports only one pixel per cycle, so set to XF_NPPC1

cameraMatrix

The input matrix representing the camera in the old coordinate system

distCoeffs

The input distortion coefficients (k1,k2,p1,p2[,k3[,k4,k5,k6]])

ir

The input transformation matrix is equal to Invert(newCameraMatrix*R), where
newCameraMatrix represents the camera in the new coordinate system and R is the rotation
matrix.. This processing will be done outside the synthesizable block

_mapx_mat

Output mat objects containing the mapx

_mapy_mat

Output mat objects containing the mapy

_cm_size

9 for 3x3 matrix

_dc_size

4, 5 or 8. If this is 0, then it means there is no distortion

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
123

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Integral Image
The integral function computes an integral image of the input. Each output pixel is the sum of
all pixels above and to the left of itself.

dst(x, y) = sum(x, y) = sum(x, y) + sum⎛⎝x - 1, y⎞⎠ + sum⎛⎝x, y - 1⎞⎠ - sum⎛⎝x - 1, y - 1⎞⎠
API Syntax
template
void integral(xf::Mat & _src_mat,
xf::Mat & _dst_mat)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 180: integral Function Parameter Descriptions
Parameter

Description

SRC_TYPE

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_TYPE

Output pixel type. Only 32-bit,unsigned,1 channel is supported(XF_32UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.

_src_mat

Input image

_dst_mat

Output image

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 181: integral Function Resource Utilization Summary
Resource Utilization
Name

1 pixel
300 MHz

BRAM_18K

4

DSP48E

0

FF

613

LUT

378

CLB

102

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
124

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 182: integral Function Performance Estimate Summary
Latency Estimate
Operating Frequency

Operating Mode

Latency(in ms)

(MHz)
1pixel

300

7.2

Dense Pyramidal LK Optical Flow
Optical flow is the pattern of apparent motion of image objects between two consecutive
frames, caused by the movement of object or camera. It is a 2D vector field, where each vector is
a displacement vector showing the movement of points from first frame to second.
Optical Flow works on the following assumptions:
• Pixel intensities of an object do not have too many variations in consecutive frames
• Neighboring pixels have similar motion
Consider a pixel I(x, y, t) in first frame. (Note that a new dimension, time, is added here. When
working with images only, there is no need of time). The pixel moves by distance (dx, dy) in the
next frame taken after time dt. Thus, since those pixels are the same and the intensity does not
change, the following is true:
I(x, y, t) = I ⎛⎝x + dx, y + dy, t + dt⎞⎠
Taking the Taylor series approximation on the right-hand side, removing common terms, and
dividing by dt gives the following equation:

f xu + f yv + f t = 0

Where

fx=

dy
δf
δf
u = dx
v=
fy=
dt and
dt .
δx ,
δx ,

The above equation is called the Optical Flow equation, where, fx and fy are the image
gradientsand ft is the gradient along time. However, (u, v) is unknown. It is not possible to solve
this equation with two unknown variables. Thus, several methods are provided to solve this
problem. One method is Lucas-Kanade. Previously it was assumed that all neighboring pixels
have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
125

Send Feedback

Chapter 3: xfOpenCV Library API Reference

be defined through the ‘WINDOW_SIZE’ template parameter. Thus, all the points in that patch
have the same motion. It is possible to find (fx, fy, ft ) for these points. Thus, the problem now
becomes solving ‘WINDOW_SIZE * WINDOW_SIZE’ equations with two unknown
variables,which is over-determined. A better solution is obtained with the “least square fit”
method. Below is the final solution, which is a problem with two equations and two unknowns:

⎡ ∑ f 2x ∑ f x f y ⎤-1⎡- ∑ f x f t ⎤
i
i
i
i
i
⎡u⎤
⎥ ⎢
⎥
⎣v ⎦ = ⎢
2
⎣ ∑ f x i f y i ∑ f y i ⎦ ⎣- ∑ f y i f t i⎦
This solution fails when a large motion is involved and so pyramids are used. Going up in the
pyramid, small motions are removed and large motions become small motions and so by applying
Lucas-Kanade, the optical flow along with the scale is obtained.
API Syntax
template< int NUM_PYR_LEVELS, int NUM_LINES, int WINSIZE, int FLOW_WIDTH,
int FLOW_INT, int TYPE, int ROWS, int COLS, int NPC>
void densePyrOpticalFlow(
xf::Mat & _current_img,
xf::Mat & _next_image,
xf::Mat & _streamFlowin,
xf::Mat & _streamFlowout,
const int level, const unsigned char scale_up_flag, float scale_in,
ap_uint<1> init_flag)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 183: densePyrOpticalFlow Function Parameter Descriptions
Parameter

Description

NUM_PYR_LEVELS

Number of Image Pyramid levels used for the optical flow computation

NUM_LINES

Number of lines to buffer for the remap algorithm – used to find the temporal gradient

WINSIZE

Window Size over which Optical Flow is computed

FLOW_WIDTH,
FLOW_INT

Data width and number of integer bits to define the signed flow vector data type. Integer bit
includes the signed bit.
The default type is 16-bit signed word with 10 integer bits and 6 decimal bits.

TYPE

Pixel type of the input image. XF_8UC1 is only the supported value.

ROWS

Maximum Height or number of rows to build the hardware for this kernel

COLS

Maximum Width or number of columns to build the hardware for this kernel

NPC

Number of pixels the hardware kernel must process per clock cycle. Only XF_NPPC1, 1 pixel per
cycle, is supported.

_curr_img

First input image stream

_next_img

Second input image to which the optical flow is computed with respect to the first image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
126

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 183: densePyrOpticalFlow Function Parameter Descriptions (cont'd)
Parameter

Description

_streamFlowin

32-bit Packed U and V flow vectors input for optical flow. The bits from 31-16 represent the flow
vector U while the bits from 15-0 represent the flow vector V.

_streamFlowout

32-bit Packed U and V flow vectors output after optical flow computation. The bits from 31-16
represent the flow vector U while the bits from 15-0 represent the flow vector V.

level

Image pyramid level at which the algorithm is currently computing the optical flow.

scale_up_flag

Flag to enable the scaling-up of the flow vectors. This flag is set at the host when switching from
one image pyramid level to the other.

scale_in

Floating point scale up factor for the scaling-up the flow vectors.
The value is (previous_rows-1)/(current_rows-1). This is not 1 when switching from one image
pyramid level to the other.

init_flag

Flag to initialize flow vectors to 0 in the first iteration of the highest pyramid level. This flag must
be set in the first iteration of the highest pyramid level (smallest image in the pyramid). The flag
must be unset for all the other iterations.

Resource Utilization
The following table summarizes the resource utilization of densePyrOpticalFlow for 1 pixel per
cycle implementation, with the optical flow computed for a window size of 11 over an image size
of 1920x1080 pixels. The results are after implementation in Vivado HLS 2018.2 for the Xilinx
xczu9eg-ffvb1156-2L-e FPGA at 300 MHz.
Table 184: densePyrOpticalFlow Function Resource Utilization Summary

Operating
Mode
1 Pixel

Utilization Estimate

Operating
Frequency
LUTs

(MHz)
300

32231

FFs
16596

DSPs

BRAMs

52

215

Performance Estimate
The following table summarizes performance figures on hardware for the densePyrOpticalFlow
function for 5 iterations over 5 pyramid levels scaled down by a factor of two at each level. This
has been tested on the zcu102 evaluation board.
Table 185: densePyrOpticalFlow Function Performance Estimate Summary

Operating Mode

Operating Frequency
(MHz)

Latency Estimate
Image Size

Max (ms)

1 pixel

300

1920x1080

49.7

1 pixel

300

1280x720

22.9

1 pixel

300

1226x370

12.02

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
127

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Dense Non-Pyramidal LK Optical Flow
Optical flow is the pattern of apparent motion of image objects between two consecutive
frames, caused by the movement of object or camera. It is a 2D vector field, where each vector is
a displacement vector showing the movement of points from first frame to second.
Optical Flow works on the following assumptions:
• Pixel intensities of an object do not have too many variations in consecutive frames
• Neighboring pixels have similar motion
Consider a pixel I(x, y, t) in first frame. (Note that a new dimension, time, is added here. When
working with images only, there is no need of time). The pixel moves by distance (dx, dy) in the
next frame taken after time dt. Thus, since those pixels are the same and the intensity does not
change, the following is true:
I(x, y, t) = I ⎛⎝x + dx, y + dy, t + dt⎞⎠
Taking the Taylor series approximation on the right-hand side, removing common terms, and
dividing by dt gives the following equation:

f xu + f yv + f t = 0

Where

fx=

dy
δf
δf
u = dx
v=
fy=
dt and
dt .
δx ,
δx ,

The above equation is called the Optical Flow equation, where, fx and fy are the image
gradientsand ft is the gradient along time. However, (u, v) is unknown. It is not possible to solve
this equation with two unknown variables. Thus, several methods are provided to solve this
problem. One method is Lucas-Kanade. Previously it was assumed that all neighboring pixels
have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can
be defined through the ‘WINDOW_SIZE’ template parameter. Thus, all the points in that patch
have the same motion. It is possible to find (fx, fy, ft ) for these points. Thus, the problem now
becomes solving ‘WINDOW_SIZE * WINDOW_SIZE’ equations with two unknown
variables,which is over-determined. A better solution is obtained with the “least square fit”
method. Below is the final solution, which is a problem with two equations and two unknowns:

⎡ ∑ f 2x ∑ f x f y ⎤-1⎡- ∑ f x f t ⎤
i
i
i
i
i
⎡u⎤
⎥ ⎢
⎥
⎣v ⎦ = ⎢
2
⎣ ∑ f x i f y i ∑ f y i ⎦ ⎣- ∑ f y i f t i⎦

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
128

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void DenseNonPyrLKOpticalFlow (xf::Mat & frame0,
xf::Mat & frame1, xf::Mat & flowx, xf::Mat & flowy)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 186: DenseNonPyrLKOpticalFlow Function Parameter Descriptions
Parameter

Description

Type

pixel type. The current supported pixel value is XF_8UC1, unsigned 8 bit.

ROWS

Maximum number of rows of the input image that the hardware kernel must be built for.

COLS

Maximum number of columns of the input image that the hardware kernel must be built for.

NPC

Number of pixels to process per cycle. Supported values are XF_NPPC1 (=1) and XF_NPPC2(=2).

WINDOW_SIZE

Window size over which optical flow will be computed. This can be any odd positive integer.

frame0

First input images.

frame1

Second input image. Optical flow is computed between frame0 and frame1.

flowx

Horizontal component of the flow vectors. The format of the flow vectors is XF_32FC1 or single
precision.

flowy

Vertical component of the flow vectors. The format of the flow vectors is XF_32FC1 or single
precision.

Resource Utilization
The following table summarizes the resource utilization of DenseNonPyrLKOpticalFlow for a 4K
image, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-ies1 FPGA at 300 MHz.
Table 187: DenseNonPyrLKOpticalFlow Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUTs

1 pixel

300

178

42

11984

7730

2 pixel

300

258

82

22747

15126

Performance Estimate
The following table summarizes performance estimates of the DenseNonPyrLKOpticalFlow
function for a 4K image, generated using Vivado HLS 2018.2 version tool for the Xilinx xczu9egffvb1156-1-i-es1 FPGA.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
129

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 188: DenseNonPyrLKOpticalFlow Function Performance Estimate Summary
Operating Mode

Operating Frequency

Latency Estimate

(MHz)

Max (ms)

1 pixel

300

28.01

2 pixel

300

14.01

Mean and Standard Deviation
The meanStdDev function computes the mean and standard deviation of input image. The
output Mean value is in fixed point Q8.8 format, and the Standard Deviation value is in Q8.8
format. Mean and standard deviation are calculated as follows:
height width

μ=

∑

∑

y=0 x=0
⎛
⎝

width*height⎞⎠

height width

σ=

src(x, y)

∑

∑

y=0 x=0

(μ - src(x, y)) 2

⎛
⎝

width*height⎞⎠

API Syntax
template
void meanStdDev(xf::Mat & _src,unsigned short*
_mean,unsigned short* _stddev)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 189: meanStdDev Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1) is supported.

ROWS

Number of rows in the image being processed.

COLS

Number of columns in the image being processed.

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
130

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 189: meanStdDev Function Parameter Descriptions (cont'd)
Parameter

Description

_mean

16-bit data pointer through which the computed mean of the image is returned.

_stddev

16-bit data pointer through which the computed standard deviation of the image is returned.

Resource Utilization
The following table summarizes the resource utilization of the meanStdDev function, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 190: meanStdDev Function Resource Utilization Summary
Operating
Mode

Operating Frequency
(MHz)

Utilization Estimate
BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

6

896

461

121

8 pixel

150

0

13

1180

985

208

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 191: meanStdDev Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency

1 pixel operation (300 MHz)

6.9 ms

8 pixel operation (150 MHz)

1.69 ms

Median Blur Filter
The function medianBlur performs a median filter operation on the input image. The median filter
acts as a non-linear digital filter which improves noise reduction. A filter size of N would output
the median value of the NxN neighborhood pixel values, for each pixel.
API Syntax
template
void medianBlur (xf::Mat & _src, xf::Mat & _dst)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
131

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 192: medianBlur Function Parameter Descriptions
Parameter

Description

FILTER_SIZE

Window size of the hardware filter for which the hardware kernel will be built. This can be any
odd positive integer greater than 1.

BORDER_TYPE

The way in which borders will be processed in the hardware kernel. Currently, only
XF_BORDER_REPLICATE is supported.

TYPE

Type of input pixel. XF_8UC1 is supported.

ROWS

Number of rows in the image being processed.

COLS

Number of columns in the image being processed.

NPC

Number of pixels to be processed in parallel. Options are XF_NPPC1 (for 1 pixel processing per
clock), XF_NPPC8 (for 8 pixel processing per clock

_src

Input image.

_dst

Output image.

Resource Utilization
The following table summarizes the resource utilization of the medianBlur function for
XF_NPPC1 and XF_NPPC8 configurations, generated using Vivado HLS 2018.2 version tool for
the Xilinx Xc7z020clg484-1 FPGA.
Table 193: medianBlur Function Resource Utilization Summary

Operating
Mode

FILTER_SIZE

Utilization Estimate

Operating
Frequency
LUTs

(MHz)

FFs

DSPs

BRAMs

1 pixel

3

300

1197

771

0

3

8 pixel

3

150

6559

1595

0

6

1 pixel

5

300

5860

1886

0

5

Performance Estimate
The following table summarizes performance estimates of medianBlur function on Vivado HLS
2018.2 version tool for the Xilinx xczu9eg-ffvb1156-1-i-es1 FPGA.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
132

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 194: medianBlur Function Performance Estimate Summary

Operating Mode

Latency Estimate

Operating
Frequency

FILTER_SIZE

Input Image Size

Max (ms)

(MHz)
1 pixel

3

300

1920x1080

6.99

8 pixel

3

150

1920x1080

1.75

1 pixel

5

300

1920x1080

7.00

MinMax Location
The minMaxLoc function finds the minimum and maximum values in an image and location of
those values.

minVal =

min
'

0 ≤ x ≤ width

src⎛⎝x ', y '⎞⎠

0 ≤ y ' ≤ height

maxVal =

max
'

0 ≤ x ≤ width

src⎛⎝x ', y '⎞⎠

0 ≤ y ' ≤ height

API Syntax
template
void minMaxLoc(xf::Mat & _src,int32_t *max_value,
int32_t *min_value,uint16_t *_minlocx, uint16_t *_minlocy, uint16_t
*_maxlocx, uint16_t *_maxlocy )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 195: minMaxLoc Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1), 16-bit, unsigned, 1 channel (XF_16UC1), 16bit, signed, 1 channel (XF_16SC1), 32-bit, signed, 1 channel (XF_32SC1) are supported.

ROWS

Number of rows in the image being processed.

COLS

Number of columns in the image being processed.

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
133

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 195: minMaxLoc Function Parameter Descriptions (cont'd)
Parameter

Description

max_val

Maximum value in the image, of type int.

min_val

Minimum value in the image, of type int.

_minlocx

x-coordinate location of the first minimum value.

_minlocy

y-coordinate location of the first maximum value.

_maxlocx

x-coordinate location of the first minimum value.

_maxlocy

y-coordinate location of the first maximum value.

Resource Utilization
The following table summarizes the resource utilization of the minMaxLoc function, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 196: minMaxLoc Function Resource Utilization Summary
Operating
Mode

Operating Frequency
(MHz)

Utilization Estimate
BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

3

451

398

86

8 pixel

150

0

3

1049

1025

220

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 197: minMaxLoc Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency

1 pixel operation (300 MHz)

6.9 ms

8 pixel operation (150 MHz)

1.69 ms

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
134

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Mean Shift Tracking
Mean shift tracking is one of the basic object tracking algorithms. Mean-shift tracking tries to
find the area of a video frame that is locally most similar to a previously initialized model. The
object to be tracked is represented by a histogram. In object tracking algorithms target
representation is mainly rectangular or elliptical region. It contains target model and target
candidate. Color histogram is used to characterize the object. Target model is generally
represented by its probability density function (pdf). Weighted RGB histogram is used to give
more importance to object pixels.
Mean-shift algorithm is an iterative technique for locating the maxima of a density function. For
object tracking, the density function used is the weight image formed using color histograms of
the object to be tracked and the frame to be tested. By using the weighted histogram we are
taking spatial position into consideration unlike the normal histogram calculation. This function
will take input image pointer, top left and bottom right coordinates of the rectangular object,
frame number and tracking status as inputs and returns the centroid using recursive mean shift
approach.
API Syntax
template 
void MeanShift(xf::Mat &_in_mat, uint16_t* x1,
uint16_t* y1, uint16_t* obj_height, uint16_t* obj_width, uint16_t* dx,
uint16_t* dy, uint16_t* status, uint8_t frame_status, uint8_t no_objects,
uint8_t no_iters );

Template Parameter Descriptions
The following table describes the template parameters.
Table 198: MeanShift Template Parameters
Parameter

Description

MAXOBJ

Maximum number of objects to be tracked

MAXITERS

Maximum iterations for convergence

OBJ_ROWS

Maximum Height of the object to be tracked

OBJ_COLS

Maximum width of the object to be tracked

SRC_T

Type of the input xf::Mat, must be XF_8UC4, 8-bit data with 4 channels

ROWS

Maximum height of the image

COLS

Maximum width of the image

NPC

Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
135

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Function Parameter Description
The following table describes the function parameters.
Table 199: MeanShift Function Parameters
Parameter

Description

_in_mat

Input xF Mat

x1

Top Left corner x-coordinate of all the objects

y1

Top Left corner y-coordinate of all the objects

obj_height

Height of all the objects

obj_width

Width of all the objects

dx

Centers x-coordinate of all the objects returned by the kernel function

dy

Centers y-coordinate of all the objects returned by the kernel function

status

Track the object only if the status of the object is true (i.e.) if the object goes out of the frame,
status is made zero

frame_status

Set as zero for the first frame and one for other frames

no_objects

Number of objects racked

no_iters

Number of iterations for convergence

Resource Utilization and Performance Estimate
The following table summarizes the resource utilization of the MeanShift function for normal (1
pixel) configuration as generated in Vivado HLS 2018.2 release tool for the part xczu9egffvb1156-i-es1 at 300 MHz to process a RGB image of resolution,1920x1080, and for 10 objects
of size of 250x250 and 4 iterations.
Table 200: MeanShift Function Resource Utilization and Performance Estimate
Summary
Configuration
1 pixel

Max. Latency
19.28

BRAMs
76

DSPs
14

FFs
13198

LUTs
10064

Limitations
The maximum number of objects that can be tracked is 10.

Otsu Threshold
Otsu threshold is used to automatically perform clustering-based image thresholdingor the
reduction of a gray-level image to a binary image. The algorithm assumes that the image contains
two classes of pixels following bi-modal histogram (foreground pixels and background pixels), it
then calculates the optimum threshold separating the two classes.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
136

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Otsu method is used to find the threshold which can minimize the intra class variance which
separates two classes defined by weighted sum of variances of two classes.

σ 2w⎛⎝t⎞⎠ = w 1 σ 21⎛⎝t⎞⎠ + w 2 σ 22⎛⎝t⎞⎠
Where, w_1is the class probability computed from the histogram.
t

⎛ ⎞
⎝ ⎠

w2 =

w1 = ∑ p i
1

I

∑

t+1

p⎛⎝i⎞⎠

Otsu shows that minimizing the intra-class variance is the same as maximizing inter-class
variance

σ 2b = σ - σ 2w
σ 2b = w 1 w 2 (μ b - μ f ) 2
⎤
⎡ t
μ b = ⎢∑ p(i)x(i)⎥
⎦
⎣ 1
Where,
API Syntax

/

w1

,

/

⎤
⎡ I
μ f = ⎢∑ p(i)x(i) ⎥ w 2
⎦
⎣t + 1

is the class mean.

template void
OtsuThreshold(xf::Mat & _src_mat, uint8_t &_thresh)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 201: OtsuThreshold Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.

_src_mat

Input image

_thresh

Output threshold value after the computation

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
137

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the OtsuThreshold function, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 202: OtsuThreshold Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

8

49

2239

3353

653

8 pixel

150

22

49

1106

3615

704

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 203: OtsuThreshold Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.92 ms

8 pixel operation (150 MHz)

1.76 ms

Pixel-Wise Addition
The add function performs the pixel-wise addition between two input images and returns the
output image.
Iout(x, y) = Iin1(x, y) + Iin2(x, y)
Where:
• Iout(x, y) is the intensity of the output image at (x, y) position
• Iin1(x, y) is the intensity of the first input image at (x, y) position
• Iin2(x, y) is the intensity of the second input image at (x, y) position.
XF_CONVERT_POLICY_TRUNCATE: Results are the least significant bits of the output operand,
as if stored in two’s complement binary format in the size of its bit-depth.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
138

Send Feedback

Chapter 3: xfOpenCV Library API Reference

XF_CONVERT_POLICY_SATURATE: Results are saturated to the bit depth of the output
operand.
API Syntax
template
void add (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 204: add Function Parameter Descriptions
Parameter

Description

POLICY_TYPE

Type of overflow handling. It can be either, XF_CONVERT_POLICY_SATURATE or
XF_CONVERT_POLICY_TRUNCATE.

SRC_T

pixel type. Options are XF_8UC1 and XF_16SC1.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 205: add Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

55

11

8 pixel

150

0

0

65

138

24

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
139

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 206: add Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150MHz)

1.7

Pixel-Wise Multiplication
The multiply function performs the pixel-wise multiplication between two input images and
returns the output image.
Iout(x, y) = Iin1(x, y) * Iin2(x, y) * scale_val
Where:
• Iout(x, y) is the intensity of the output image at (x, y) position
• Iin1(x, y) is the intensity of the first input image at (x, y) position
• Iin2(x, y) is the intensity of the second input image at (x, y) position
• scale_val is the scale value.
XF_CONVERT_POLICY_TRUNCATE: Results are the least significant bits of the output operand,
as if stored in two’s complement binary format in the size of its bit-depth.
XF_CONVERT_POLICY_SATURATE: Results are saturated to the bit depth of the output
operand.
API Syntax
template
void multiply (
xf::Mat src1,
xf::Mat src2,
xf::Mat dst,
float scale)

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
140

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 207: multiply Function Parameter Descriptions
Parameter

Description

POLICY_TYPE

Type of overflow handling. It can be either, XF_CONVERT_POLICY_SATURATE or
XF_CONVERT_POLICY_TRUNCATE.

SRC_T

pixel type. Options are XF_8UC1 and XF_16SC1.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

scale_val

Weighing factor within the range of 0 and 1

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 208: multiply Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

2

124

59

18

8 pixel

150

0

16

285

108

43

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 209: multiply Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (in ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.6

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
141

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Pixel-Wise Subtraction
The subtract function performs the pixel-wise subtraction between two input images and
returns the output image.
Iout(x, y) = Iin1(x, y) - Iin2(x, y)
Where:
• Iout(x, y) is the intensity of the output image at (x, y) position
• Iin1(x, y) is the intensity of the first input image at (x, y) position
• Iin2(x, y) is the intensity of the second input image at (x, y) position.
XF_CONVERT_POLICY_TRUNCATE: Results are the least significant bits of the output operand,
as if stored in two’s complement binary format in the size of its bit-depth.
XF_CONVERT_POLICY_SATURATE: Results are saturated to the bit depth of the output
operand.
API Syntax
template
int COLS, int NPC> src1,
int COLS, int NPC> src2,
int COLS, int NPC> dst )

Parameter Descriptions
The following table describes the template and the function parameters.
Table 210: subtract Function Parameter Descriptions
Parameter
POLICY_TYPE

Description
Type of overflow handling. It can be either, XF_CONVERT_POLICY_SATURATE or
XF_CONVERT_POLICY_TRUNCATE.

SRC_T

pixel type. Options are XF_8UC1 and XF_16SC1.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

src1

Input image

src2

Input image

dst

Output image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
142

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 211: subtract Function Resource Utilization Summary

Operating
Mode

Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)

DSP_48Es

FF

LUT

CLB

1 pixel

300

0

0

62

53

11

8 pixel

150

0

0

59

13

21

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 212: subtract Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency (ms)

1 pixel operation (300 MHz)

6.9

8 pixel operation (150 MHz)

1.7

Remap
The remap function takes pixels from one place in the image and relocates them to another
position in another image. Two types of interpolation methods are used here for mapping the
image from source to destination image.

dst = src⎛⎝map x (x, y), map y (x, y)⎞⎠

API Syntax
template
void remap (xf::Mat &_src_mat,
xf::Mat &_remapped_mat,
xf::Mat &_mapx_mat,
xf::Mat &_mapy_mat);

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
143

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template parameters.
Table 213: remap template Parameter Descriptions
Parameter

Description

WIN_ROWS

Number of input image rows to be buffered inside. Must be set based on the map data. For
instance, for left right flip, 2 rows are sufficient.

INTERPOLATION_TYPE

Type of interpolation, either XF_INTERPOLATION_NN (nearest neighbor) or
XF_INTERPOLATION_BILINEAR (linear interpolation)

SRC_T

Input image type. Grayscale image of type 8-bits and single channel. XF_8UC1.

MAP_T

Map type. Single channel float type. XF_32FC1.

DST_T

Output image type. Grayscale image of type 8-bits and single channel. XF_8UC1.

ROWS

Height of input and output images

COLS

Width of input and output images

NPC

Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.

The following table describes the function parameters.
Table 214: remap Function Parameter Descriptions
PARAMETERS

DESCRIPTION

_src_mat

Input xF Mat

_remapped_mat

Output xF Mat

_mapx_mat

mapX Mat of float type

_mapy_mat

mapY Mat of float type

Resource Utilization
The following table summarizes the resource utilization of remap, for HD (1080x1920) images
generated in the Vivado HLS 2018.2 version tool for the Xilinx xczu9eg-ffvb1156-i-es1 FPGA at
300 MHz, with WIN_ROWS as 64 for the XF_INTERPOLATION_BILINEAR mode.
Table 215: remap Function Resource Utilization Summary
Name

Resource Utilization

BRAM_18K

64

DSP48E

17

FF

1738

LUT

1593

CLB

360

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
144

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance of remap(), for HD (1080x1920) images
generated in the Vivado HLS 2018.2 version tool for the Xilinx xczu9eg-ffvb1156-i-es1 FPGA at
300 MHz, with WIN_ROWS as 64 for XF_INTERPOLATION_BILINEAR mode.
Table 216: remap Function Performance Estimate Summary

Operating Mode
1 pixel mode

Operating Frequency

Latency Estimate

(MHz)

Max latency (ms)

300

7.2

Resolution Conversion (Resize)
Resolution Conversion is the method used to resize the source image to the size of the
destination image. Different types of interpolation techniques can be used in resize function,
namely: Nearest-neighbor, Bilinear, and Area interpolation. The type of interpolation can be
passed as a template parameter to the API. The following enumeration types can be used to
specify the interpolation type:
• XF_INTERPOLATION_NN - For Nearest-neighbor interpolation
• XF_INTERPOLATION_BILINEAR - For Bilinear interpolation
• XF_INTERPOLATION_AREA - For Area interpolation
Note: Scaling factors greater than or equal to 0.25 are supported in down-scaling and values less than or
equal to 8 are supported for up-scaling.

API Syntax
template
void resize (xf::Mat & _src, xf::Mat & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
145

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 217: resize Function Parameter Descriptions
Parameter

Description

INTERPOLATION_TYPE

Interpolation type. The different options possible are

•

XF_INTERPOLATION_NN – Nearest Neighbor Interpolation

•

XF_INTERPOLATION_BILINEAR – Bilinear interpolation

•

XF_INTERPOLATION_AREA – Area Interpolation

TYPE

Number of bits per pixel. Only XF_8UC1 is supported.

SRC_ROWS

Maximum Height of input image for which the hardware kernel would be built.

SRC_COLS

Maximum Width of input image for which the hardware kernel would be built (must be a
multiple of 8).

DST_ROWS

Maximum Height of output image for which the hardware kernel would be built.

DST_COLS

Maximum Width of output image for which the hardware kernel would be built (must be a
multiple of 8).

NPC

Number of pixels to be processed per cycle. Possible options are XF_NPPC1 (1 pixel per cycle)
and XF_NPPC8 (8 pixel per cycle).

_src

Input Image

_dst

Output Image

Resource Utilization
The following table summarizes the resource utilization of Resize function in Resource Optimized
(8 pixel) mode and Normal mode, as generated in the Vivado HLS 2018.2 tool for the Xilinx
xczu9eg-ffvb1156-2-i-es2 FPGA.
Table 218: resize Function Resource Utilization Summary
Utilization Estimate
Operating Mode

1 Pixel (at 300 MHz)
LUTs

FFs

DSPs

8 Pixel (at 150MHz)
BRAMs

LUTs

FFs

DSPs

BRAMs

Downscale Nearest
Neighbor

1215

1625

4

3

2533

2454

4

24

Downscale Bilinear

1473

1821

8

3

5647

3714

36

24

Downscale Area

2558

2995

42

30

Configuration not supported

Upscale Nearest
Neighbor

1215

1625

4

3

1599

1636

8

20

Upscale Bilinear

1473

1821

8

3

4116

3423

36

12

Upscale Area

1461

2107

16

25

Configuration not supported

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
146

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes the performance estimation of Resize for various configurations,
as generated in the Vivado HLS 2018.2 tool for the xczu9eg-ffvb1156-2-i-es2 FPGA at 300 MHz
to resize a grayscale image from 1080x1920 to 480x640 (downscale); and to resize a grayscale
image from 1080x1920 to 2160x3840 (upscale). This table also shows the latencies obtained for
different interpolation types.
Table 219: resize Function Performance Estimate Summary

Operating
Mode

1 pixel

Latency Estimate (ms)
Operating
Frequency Downscale Downscale Downscale
Upscale
(MHz)
300

NN
6.94

Bilinear
6.97

NN

Area
7.09

27.71

Upscale

Upscale

Bilinear

Area

27.75

27.74

RGB2HSV
The RGB2HSV function converts the input image color space to HSV color space and returns the
HSV image as the output.
API Syntax
template
void RGB2HSV(xf::Mat &
_src_mat,xf::Mat & _dst_mat)

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter

Description

SRC_T

Input pixel type should be XF_8UC4

DST_T

Output pixel type should be XF_8UC4

ROWS

Maximum height of input and output image

COLS

Maximum width of input and output image

NPC

Number of pixels to be processed per cycle. Only XF_NPPC1 is supported.

_src_mat

Input image

_dst_mat

Output image

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
147

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Scharr Filter
The Scharr function computes the gradients of input image in both x and y direction by
convolving the kernel with input image being processed.
For Kernel size 3x3:
• GradientX:

⎡ -3 0 3 ⎤
G x = ⎢-10 0 10⎥*I
⎣ -3 0 3 ⎦

• GradientY:

⎡-3 -10 -3⎤
G y = ⎢ 0 0 0 ⎥*I
⎣ 3 10 3 ⎦

API Syntax
template
void Scharr(xf::Mat & _src_mat,xf::Mat & _dst_matx,xf::Mat & _dst_maty)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 220: Scharr Function Parameter Descriptions
Parameter

Description

BORDER_TYPE

Border Type supported is XF_BORDER_CONSTANT

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 8-bit unsigned, 16-bit signed,1 channel is supported (XF_8UC1,
XF_16SC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_matx

X gradient output image.

_dst_maty

Y gradient output image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
148

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 221: Scharr Function Resource Utilization Summary
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150MHz

BRAM_18K

3

6

DSP48E

0

0

FF

728

1434

LUT

812

2481

CLB

171

461

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 222: Scharr Function Performance Estimate Summary

Operating Mode

Operating Frequency

Latency Estimate

(MHz)

(ms)

1 pixel

300

7.2

8 pixel

150

1.7

Sobel Filter
The Sobel function Computes the gradients of input image in both x and y direction by
convolving the kernel with input image being processed.
• For Kernel size 3x3
○

GradientX:

○

GradientY:

⎡-3 -10 -3⎤
G y = ⎢ 0 0 0 ⎥*I
⎣ 3 10 3 ⎦

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
149

Send Feedback

Chapter 3: xfOpenCV Library API Reference

⎡-1 -2 -1⎤
G y = ⎢ 0 0 0 ⎥*I
⎣1 2 1⎦
• For Kernel size 5x5
○

○

GradientX:

⎡ -1
⎢ -4
⎢
G x = ⎢ -6
⎢ -4
⎣ -1

-2
-8
-12
-8
-2

GradientY:

⎡ -1
⎢ -2
⎢
Gy = ⎢ 0
⎢2
⎣1

-4
-8
0
8
4

0
0
0
0
0

2
8
12
8
2

-6
-12
0
12
6

-4
-8
0
8
4

1⎤
4⎥
⎥
6⎥*I
4⎥
1⎦
-1⎤
-2⎥
⎥
0 ⎥*I
2⎥
1⎦

• For Kernel size 7x7
○

○

GradientX:

⎡ -1
⎢ -6
⎢-15
⎢
G x = ⎢-20
⎢-15
⎢ -6
⎣ -1

-4
-24
-60
-80
-60
-24
-4

-5
-30
75
-100
-75
-30
-5

0
0
0
0
0
0
0

5
30
75
75
75
30
5

4
24
60
60
60
24
4

1⎤
6⎥
15⎥
⎥
15⎥*I
15⎥
6⎥
1⎦

GradientY:

⎡-1
⎢-4
⎢-5
⎢
Gy = ⎢ 0
⎢5
⎢4
⎣1

-6
-24
-30
0
30
24
6

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

-15
-60
-75
0
75
60
15

-20
-80
-100
0
100
80
20

-15
-60
-75
0
75
60
15

-6
-24
-30
0
30
24
6

-1⎤
-4⎥
-5⎥
⎥
0 ⎥*I
5⎥
4⎥
1⎦

www.xilinx.com

[placeholder text]
150

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void Sobel(xf::Mat & _src_mat,xf::Mat & _dst_matx,xf::Mat & _dst_maty)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 223: Sobel Function Parameter Descriptions
Parameter

Description

FILTER_TYPE

Filter size. Filter size of 3(XF_FILTER_3X3), 5(XF_FILTER_5X5) and 7(XF_FILTER_7X7) are supported

BORDER_TYPE

Border Type supported is XF_BORDER_CONSTANT

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output pixel type. Only 8-bit unsigned, 16-bit signed, 1 channel is supported (XF_8UC1,
XF_16SC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_matx

X gradient output image.

_dst_maty

Y gradient output image.

1.

Sobel 7x7 8-pixel is not supported.

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 224: Sobel Function Resource Utilization Summary

Operating
Mode
1 pixel

Filter Size

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

CLB

LUT

3x3

300

3

0

609

616

135

5x5

300

5

0

1133

1499

308

7x7

300

7

0

2658

3334

632

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
151

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 224: Sobel Function Resource Utilization Summary (cont'd)

Operating
Mode
8 pixel

Filter Size

Utilization Estimate

Operating
Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

CLB

LUT

3x3

150

6

0

1159

1892

341

5x5

150

10

0

3024

5801

999

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 225: Sobel Function Performance Estimate Summary

Operating Mode
1 pixel

8 pixel

Operating Frequency

Latency Estimate

Filter Size

(MHz)

(in ms)

300

3x3

7.5

300

5x5

7.5

300

7x7

7.5

150

3x3

1.7

150

5x5

1.71

Stereo Local Block Matching
Stereo block matching is a method to estimate the motion of the blocks between the consecutive
frames, called stereo pair. The postulate behind this idea is that, considering a stereo pair, the
foreground objects will have disparities higher than the background. Local block matching uses
the information in the neighboring patch based on the window size, for identifying the conjugate
point in its stereo pair. While, the techniques under global method, used the information from
the whole image for computing the matching pixel, providing much better accuracy than local
methods. But, the efficiency in the global methods are obtained with the cost of resources, which
is where local methods stands out.
Local block matching algorithm consists of preprocessing and disparity estimation stages. The
preprocessing consists of Sobel gradient computation followed by image clipping. And the
disparity estimation consists of SAD (Sum of Absolute Difference) computation and obtaining the
disparity using winner takes all method (least SAD will be the disparity). Invalidity of the pixel
relies upon its uniqueness from the other possible disparities. And the invalid pixels are indicated
with the disparity value of zero.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
152

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template 
void StereoBM(xf::Mat &_left_mat, xf::Mat &_right_mat, xf::Mat &_disp_mat,
xf::xFSBMState &sbmstate);

Parameter Descriptions
The following table describes the template and the function parameters.
Table 226: StereoBM Function Parameter Descriptions
Parameter

Description

WSIZE

Size of the window used for disparity computation

NDISP

Number of disparities

NDISP_UNITS

Number of disparities to be computed in parallel.

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

DST_T

Output type. This is XF_16UC1, where the disparities are arranged in Q12.4 format.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 only.

left_image

Image from the left camera

right_image

Image from the right camera

disparity_image

Disparities output in the form of an image.

sbmstate

Class object consisting of various parameters regarding the stereo block matching algorithm.
1. preFilterCap: default value is 31, can be altered by the user, value ranges from 1 to 63
2. minDisparity: default value is 0, can be altered by the user, value ranges from 0 to (imgWidthNDISP)
3. uniquenessRatio: default set to 15, but can be altered to any non-negative integer.
4. textureThreshold: default set to 10, but can be modified to any non-negative integer.

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA,
to progress a grayscale HD (1080x1920) image.
The configurations are in the format: imageSize_WSIZE_NDisp_NDispUnits.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
153

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 227: StereoBM Function Resource Utilization Summary
Configurations

Resource Utilization

Frequency
BRAM_18k

( MHz)

DSP48E

FF

LUT

HD_5_16_2

300

37

20

6856

7181

HD_9_32_4

300

45

20

9700

10396

HD_11_32_32

300

49

20

34519

31978

HD_15_128_32

300

57

20

41017

35176

HD_21_64_16

300

69

20

29853

30706

Performance Estimate
The following table summarizes a performance estimate of the Stereo local block matching in
different configurations, as generated using Vivado HLS 2018.2 tool for Xilinx Xczu9egffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
The configurations are in the format: imageSize_WSIZE_NDisp_NDispUnits.
Table 228: StereoBM Function Performance Estimate Summary
Latency ( ms)

Frequency

Configurations

Min

( MHz)

Max

HD_5_16_2

300

55.296

55.296

HD_9_32_4

300

55.296

55.296

HD_11_32_32

300

6.912

6.912

HD_15_48_16

300

20.736

20.736

HD_15_128_32

300

27.648

27.648

HD_21_64_16

300

27.648

27.648

Semi Global Method for Stereo Disparity Estimation
Stereo matching algorithms are used for finding relative depth from a pair of rectified stereo
images. The resultant disparity information can be used for 3D reconstruction by triangulation,
using the known intrinsic and extrinsic parameters of the stereo camera. The Semi global method
for stereo disparity estimation aggregates the cost in terms of dissimilarity across multiple paths
leading to a smoother estimate of the disparity map.
For the semi-global method in xfOpenCV, census transform in conjunction with Hamming
distance is used for cost computation. The semiglobal optimization block is based on the
implementation by Hirschmuller, but approximates the cost aggregation by considering only four
directions.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
154

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parallelism is achieved by computing and aggregating cost for multiple disparities in parallel, and
this parameter is included as a compile-time input.
API Syntax
template
void SemiGlobalBM(xf::Mat & _src_mat_l,
xf::Mat & _src_mat_r, xf::Mat &
_dst_mat, uint8_t p1, uint8_t p2)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 229: StereoSGM Function Parameter Descriptions
Parameter

Description

BORDER_TYPE

The border pixels are processed in Census transform function based on this parameter. Only
XF_BORDER_CONSTANT is supported.

WINDOW_SIZE

Size of the window used for Census transform computation. Only ‘5’ (5x5) is supported.

NDISP

Number of disparities

PU

Number of disparity units to be computed in parallel

R

Number of directions for cost aggregation. It must be 2, 3, or 4.

SRC_T

Type of input image Mat object. It must be XF_8UC1.

DST_T

Type of output disparity image Mat object. It must be XF_8UC1.

ROWS

Maximum height of the input image.

COLS

Maximum width of the input image.

NPC

Number of pixels to be computed in parallel. It must be XF_NPPC1.

_src_mat_l

Left input image Mat

_src_mat_r

Right input image Mat

_dst_mat

Output disparity image Mat

p1

Small penalty for cost aggregation

p2

Large penalty for cost aggregation. The maximum value is 100.

Resource Utilization
The following table summarizes the resource utilization for a 1920 x 1080 image, with 64
number of disparities, and 16 parallel units.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
155

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 230: StereoSGM Function Resource Utilization Summary
Operating
Mode

Filter Size

Operating
Frequency
(MHz)

Resource Utilization
BRAM_18k

1 pixel

5x5

300

205

DSP48E
78

FF

LUT

10847

12498

Performance Estimate
The following table summarizes a performance estimate for a 1920x1080 image.
Table 231: StereoSGM Function Performance Estimate Summary
Operating
Frequency

Operating Mode
1 pixel/clock

300 MHz

Number of
Disparities
64

Parallel Units
16

Latency
50 ms

SVM
The SVM function is the SVM core operation, which performs dot product between the input
arrays. The function returns the resultant dot product value with its fixed point type.
API Syntax
template
void SVM(xf::Mat &in_1, xf::Mat &in_2, uint16_t idx1, uint16_t idx2, uchar_t frac1, uchar_t
frac2, uint16_t n, uchar_t *out_frac, ap_int *result)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 232: SVM Function Parameter Descriptions
Parameters

Description

SRC1_T

Input pixel type. 16-bit, signed, 1 channel (XF_16SC1) is supported.

SRC2_T

Input pixel type. 16-bit, signed, 1 channel (XF_16SC1) is supported.

DST_T

Output data Type. 32-bit, signed, 1 channel (XF_32SC1) is supported.

ROWS1

Number of rows in the first image being processed.

COLS1

Number of columns in the first image being processed.

ROWS2

Number of rows in the second image being processed.

COLS2

Number of columns in the second image being processed.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
156

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 232: SVM Function Parameter Descriptions (cont'd)
Parameters

Description

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1.

N

Max number of kernel operations

in_1

First Input Array.

in_2

Second Input Array.

idx1

Starting index of the first array.

idx2

Starting index of the second array.

frac1

Number of fractional bits in the first array data.

frac2

Number of fractional bits in the second array data.

n

Number of kernel operations.

out_frac

Number of fractional bits in the resultant value.

result

Resultant value

Resource Utilization
The following table summarizes the resource utilization of the SVM function, generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 233: SVM Function Resource Utilization Summary
Utilization Estimate (ms)

Operating
Frequency
(MHz)
300

BRAM_18K
0

DSP_48Es
1

FF
27

LUT
34

CLB
12

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 234: SVM Function Performance Estimate Summary
Latency Estimate

Operating Frequency (MHz)
300

Min (cycles)
204

Max (cycles)
204

Thresholding
The Threshold function performs thresholding operation on the input image. Thresholding are
of two types, binary and range.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
157

Send Feedback

Chapter 3: xfOpenCV Library API Reference

In the Binary thresholding, the threshold value will be set and the depending upon the threshold
value the output is set to either 0 or 255. The _binary_thresh_val has to be set to the threshold
value when the binary thresholding is selected, as well as the _upper_range and _lower_range has
to be set to 0.
⎧ 255, i f src(x, y) > threshold
dst(x, y) = ⎨
⎩0,
Otherwise
In the Range thresholding, the upper and the lower threshold values are set and depending upon
those values the output is set either 0 or 255. The _upper_range and _lower_range values has to
be set to the upper and the lower threshold values respectively, when the range thresholding is
selected, as well as in this case the _binary_thresh_val has to be set to 0.

⎧ 0, i f src(x, y) > upper
dst(x, y) = ⎨ 0, i f src(x, y) < lower
⎩255,
otherwise

API Syntax
template
void Threshold(xf::Mat & _src_mat,xf::Mat & _dst_mat,short thresh_val,short thresh_upper,short
thresh_lower)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 235: Threshold Function Parameter Descriptions
Parameter

Description

THRESHOLD_TYPE

Type of thresholding. It can be either binary thresholding or range thresholding. Options are
XF_THRESHOLD_TYPE_BINARY or XF_THRESHOLD_TYPE_RANGE.

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle.

_src_mat

Input image

_dst_mat

Output image

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
158

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 236: Threshold Function Resource Utilization Summary
Resource Utilization
Configurations

1 pixel

8 pixel

300 MHz

150MHz

BRAM_18K

0

0

DSP48E

3

3

FF

410

469

LUT

277

443

CLB

72

103

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 237: Threshold Function Performance Estimate Summary

Operating Mode

Operating Frequency

Latency Estimate

(MHz)

(ms)

1 pixel

300

7.2

8 pixel

150

1.7

WarpAffine
Affine Transformation is used to express rotation, scaling and translation operations.
warpAffine takes inverse transformation as input and transforms the input image by matrix
multiplication with 2x3 matrix.Calculate the input position corresponding to every output pixel in
the following way.

⎡x output⎤
⎡x input⎤ ⎡⎢ M 0,0 M 0,1 M 0,2⎤⎥ ⎢y
⎣y input⎦ = ⎣ M 1,0 M 1,1 M 1,2⎦ output⎥
⎣ 1 ⎦

x input = M 0,0 *x output + M 0,1 *y output + M 0,2
y input = M 1,0 *x output + M 1,1 *y output + M 1,2

output⎛⎝x output, y output⎞⎠ = input⎛⎝x input, y input⎞⎠

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
159

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Note: Prior scaling of greater than or equal to 0.25 and less than or equal to 8 are supported and throws an
assertion if the transformation matrix passed is having unsupported scaling factors.

API Syntax
template
void warpAffine(xf::Mat & _src,
xf::Mat & _dst, float* transformation_matrix)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 238: warpAffine Function Parameter Descriptions
Parameter

Description

INTERPOLATION_TYPE

Interpolation technique to be used
XF_INTERPOLATION_NN - Nearest Neighbor
XF_INTERPOLATION_BILINEAR - Bilinear

SRC_T

Input pixel type.

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input Image pointer

_dst

Output Image pointer

transformation_matrix

Inverse Affine Transformation matrix

Resource Utilization
The following table shows the resource utilization of warpAffine in different configurations as
generated in Vivado HLS 2018.2 version tool for the part Xilinx Xczu9eg-ffvb1156-1-i-es1.
Table 239: warpAffine Function Resource Utilization Summary
Utilization (at 250 MHz)

Name

Normal Operation

Resource Optimized

(1 pixel) Mode

(8 pixel) Mode

Nearest
Neighbour

Bilinear
Interpolation

Nearest
Neighbour

Bilinear
Interpolation

BRAM

200

194

200

200

DSP48E

121

125

139

155

FF

9523

10004

8070

9570

LUT

9928

10592

13296

13894

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
160

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 239: warpAffine Function Resource Utilization Summary (cont'd)
Utilization (at 250 MHz)
Normal Operation

Resource Optimized

(1 pixel) Mode

(8 pixel) Mode

Name

Nearest
Neighbour
CLB

Bilinear
Interpolation

2390

Nearest
Neighbour

2435

Bilinear
Interpolation

2932

2829

Note: NO is single pixel processing and RO is 4-pixel processing.

Performance Estimate
The following table shows the performance of Affine Transformation in different configurations,
as generated in Vivado HLS 2018.2 version tool for the part Xilinx Xczu9eg-ffvb1156-1-i-es1.
Table 240: warpAffine Function Performance Estimate Summary
Latency Estimate
1 pixel (300 MHz)
Bilinear

Nearest Neighbour

Bilinear

Nearest Neighbour

Interpolation

Max(in ms)
10.4

8 pixel (150 MHz)

Max(in ms)

Interpolation

Max(in ms)

19.3

10.7

Max(in ms)
10.4

WarpPerspective
The warpPerspective function performs the perspective transformation on the image. It
takes inverse transformation as input and applies perspective transformation with a 3x3 matrix.
Calculate the input position corresponding to every output pixel as following:

⎡M

M

M ⎤ ⎡x output⎤
⎢y output⎥

0,1
0,2
⎡x⎤ ⎢ 0,0
⎥
⎢y⎥ = M 1,0 M 1,1 M 1,2
⎥
⎣z ⎦ ⎢

⎣M 2,0 M 2,1 M 2,2⎦ ⎣ 1 ⎦

x = M 0,0 *x output + M 0,1 *y output + M 0,2
y = M 1,0 *x output + M 1,1 *y output + M 1,2

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
161

Send Feedback

Chapter 3: xfOpenCV Library API Reference

z = M 2,0 *x output + M 2,1 *y output + M 2,2
output⎛⎝x output, y output⎞⎠ = input⎛⎝x / z , y / z ⎞⎠
Note: Prior scaling of greater than or equal to 0.25 and less than or equal to 8 are supported and throws an
assertion if the transformation matrix passed is having unsupported scaling factors.

API Syntax
template< int INTERPOLATION_TYPE ,int SRC_T, int ROWS, int COLS,int NPC>
void warpPerspective(xf::Mat &
_src_mat,xf::Mat & _dst_mat,float
*transformation_matrix)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 241: warpPerspective Function Parameter Descriptions
Parameter
INTERPOLATION_TYPE

Description
Interpolation technique to be used
XF_INTERPOLATION_NN - Nearest Neightbour
XF_INTERPOLATION_BILINEAR - Bilinear

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (must be a multiple of 8)

COLS

Maximum width of input and output image (must be a multiple of 8)

NPC

Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src_mat

Input image

_dst_mat

Output image

transformation_matrix

Inverse Perspective Transformation matrix

Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image for Bilinear interpolation.
Table 242: warpPerspective Function Resource Utilization Summary
Resource Utilization
Name
BRAM_18K

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

1 pixel

8 pixel

300 MHz

150 MHz

223

233

www.xilinx.com

[placeholder text]
162

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 242: warpPerspective Function Resource Utilization Summary (cont'd)
Resource Utilization
Name

1 pixel

8 pixel

300 MHz

150 MHz

DSP48E

191

293

FF

17208

14330

LUT

14458

18969

CLB

3230

3876

Performance Estimate
The following table summarizes the performance of kernel for different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image for Bilinear interpolation.
Table 243: warpPerspective Function Performance Estimate Summary

Operating Mode

Operating Frequency

Latency Estimate

(MHz)

Max (ms)

1 pixel

300

19.3

8 pixel

150

15.5

Atan2
The Atan2LookupFP function finds the arctangent of y/x. It returns the angle made by the
⎡x⎤
⎣y⎦

with respect to origin. The angle returned by atan2 will also contain the quadrant
vector
information.
Atan2LookupFP is a fixed point version of the standard atan2 function. This function
implements the atan2 using a lookup table approach. The values in the look up table are
represented in Q4.12 format and so the values returned by this function are in Q4.12. A
maximum error of 0.2 degrees is present in the range of 89 to 90 degrees when compared to the
standard atan2 function available in glibc. For the other angles (0 to 89) the maximum error is in
the order of 10-3. This function returns 0 when both xs and ys are zeroes.
API Syntax
short Atan2LookupFP(short xs, short ys, int M1,int N1,int M2, int N2)

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
163

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Parameter Descriptions
The following table describes the template and the function parameters.
Table 244: Atan2LookupFP Function Parameter Descriptions
Parameter

Description

xs

16-bit signed value x in fixed point format of QM1.N1

ys

16-bit signed value y in fixed point format of QM2.N2

M1

Number of bits to represent integer part of x.

N1

Number of bits to represent fractional part of y. Must be equal to 16-M1.

M2

Number of bits to represent integer part of y.

N2

Number of bits to represent fractional part of y. Must be equal to 16-N1.

Return

Return value is in radians. Its range varies from -pi to +pi in fixed point format of Q4.12

Resource Utilization
The following table summarizes the resource utilization of the Atan2LookupFP function ,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 245: Atan2LookupFP Function Resource Utilization Summary
Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

4

DSP_48Es
2

FF
275

LUT

CLB

75

139

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 246: Atan2LookupFP Function Performance Estimate Summary
Latency Estimate

Operating Frequency
Min (cycles)

(MHz)
300

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

1

Max (cycles)
15

www.xilinx.com

[placeholder text]
164

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Inverse (Reciprocal)
The Inverse function computes the reciprocal of a number x. The values of 1/x are stored in a
look up table of 2048 size. The index for picking the 1/x value is computed using the fixed point
format of x. Once this index is computed, the corresponding 1/x value is fetched from the look
up table and returned along with the number of fractional bits needed to represent this value in
fixed point format.
API Syntax
unsigned int Inverse(unsigned short x,int M,char *N)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 247: Inverse Function Parameter Descriptions
Parameter

Description

x

16-bit unsigned value x in fixed point format of QM.(16-M)

M

Number of bits to represent integer part of x.

N

Pointer to a char variable which stores the number of bits to represent fractional part of 1/x.
This value is returned from the function.

Return

1/x value is returned in 32-bit format represented by a fixed point format of Q(32-N).N

Resource Utilization
The following table summarizes the resource utilization of the Inverse function, generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 248: Inverse Function Resource Utilization Summary
Utilization Estimate (ms)

Operating
Frequency
BRAM_18K

(MHz)
300

4

DSP_48Es
0

FF
68

LUT
128

CLB
22

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
165

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Table 249: Inverse Function Performance Estimate Summary
Latency Estimate

Operating Frequency
Min (cycles)

(MHz)
300

1

Max (cycles)
8

Look Up Table
The LUT function performs the table lookup operation. Transforms the source image into the
destination image using the given look-up table. The input image must be of depth AU_8UP and
the output image of same type as input image.
Iout(x, y) = LUT [Iin1(x, y)]
Where:
• Iout(x, y) is the intensity of output image at (x, y) position
• Iin(x, y) is the intensity of first input image at (x, y) position
• LUT is the lookup table of size 256 and type unsigned char.
API Syntax
template 
void LUT(xf::Mat & _src, xf::Mat & _dst,unsigned char* _lut)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 250: LUT Function Parameter Descriptions
Parameter

Description

SRC_T

Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1) is supported.

ROWS

Number of rows in the image being processed.

COLS

Number of columns in the image being processed.

NPC

Number of pixels to be processed in parallel. Possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.

_src

Input image of size (ROWS, COLS) and type 8U.

_dst

Output image of size (ROWS, COLS) and same type as input.

_lut

Input lookup Table of size 256 and type unsigned char.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
166

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Resource Utilization
The following table summarizes the resource utilization of the LUT function, generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale
HD (1080x1920) image.
Table 251: LUT Function Resource Utilization Summary
Operating
Mode

Utilization Estimate

Operating Frequency
(MHz)

BRAM_18K

DSP_48Es

FF

LUT

CLB

1 pixel

300

1

0

937

565

137

8 pixel

150

9

0

1109

679

162

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 252: LUT Function Performance Estimate Summary
Latency Estimate

Operating Mode

Max Latency

1 pixel operation (300 MHz)

6.92 ms

8 pixel operation (150 MHz)

1.66 ms

Square Root
The Sqrt function computes the square root of a 16-bit fixed point number using the nonrestoring square root algorithm. The non-restoring square root algorithm uses the two's
complement representation for the square root result. At each iteration the algorithm can
generate exact result value even in the last bit.
Input argument D must be 16-bit number, though it is declared as 32-bit. The output sqrt(D) is
16-bit type. If format of D is QM.N (where M+N = 16) then format of output is Q(M/2).N
To get a precision of 'n' bits in fractional part, you can simply left shift the radicand (D) by '2n'
before the function call and shift the solution right by 'n' to get the correct answer. For example,
to find the square root of 35 (011000112) with one bit after the decimal point, that is, N=1:
1. Shift the number (01100011002) left by 2
2. Shift the answer (10112) right by 1. The correct answer is 101.1, which is 5.5.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
167

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
int Sqrt(unsigned int D)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 253: Sqrt Function Parameter Descriptions
Parameter

Description

D

Input data in a 16-bit fixed-point format.

Return

Output value in short int format.

Resource Utilization
The following table summarizes the resource utilization of the Sqrt function, generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 254: Sqrt Function Resource Utilization Summary
Utilization Estimate

Operating
Frequency
BRAM_18K

(MHz)
300

0

DSP_48Es
0

FF
8

LUT
6

CLB
1

Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA.
Table 255: Sqrt Function Performance Estimate Summary
Latency Estimate

Operating Frequency
Min (cycles)

(MHz)
300

18

Max (cycles)
18

WarpTransform
The warpTransform function is designed to perform the perspective and affine geometric
transformations on an image. The type of transform is a compile time parameter to the function.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
168

Send Feedback

Chapter 3: xfOpenCV Library API Reference

The function uses a streaming interface to perform the transformation. Due to this and due to
the fact that geometric transformations need access to many different rows of input data to
compute one output row, the function stores some rows of the input data in BRAMs. The
number of rows the function stores can be configured by the user by modifying a template
parameter. Based on the transformation matrix, you can decide on the number of rows to be
stored. You can also choose when to start transforming the input image in terms of the number
of rows of stored image.
Affine Transformation
The transformation matrix consists of size parameters, and is as shown:

M=

⎡ M 11 M 12 M 13⎤
⎣ M 21 M 22 M 23⎦

Affine transformation is applied in the warpTransform function following the equation:

⎛x⎞
dst⎝ y ⎠ =

⎛ x⎞
M* src⎜ y ⎟
⎝ 1⎠

Perspective Transformation
The transformation matrix is a 3x3 matrix as shown below:

⎡ M 11 M 12 M 13⎤
⎢
⎥
M = ⎢ M 21 M 22 M 23⎥
⎣ M 31 M 32 M 33⎦
Perspective transformation is applied in warpTransform following the equation:

⎛ x⎞
⎛ x⎞
dst y ⎟ = M* src⎜ y ⎟
⎝ n⎠
⎝ 1⎠
1⎜

The destination pixel is then computed by dividing the first two dimensions of the dst1 by the
third dimension

⎛ x⎞
⎛ x⎞
dst y ⎟ = M* src⎜ y ⎟
⎝ n⎠
⎝ 1⎠
1⎜

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
169

Send Feedback

Chapter 3: xfOpenCV Library API Reference

API Syntax
template
void warpTransform(xf::Mat & src, xf::Mat & dst, float *transformation_matrix)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 256: warpTransform Function Parameter Descriptions
Parameter

Description

STORE_LINES

Number of lines of the image that need to be buffered locally on FPGA.

START_ROW

Number of the input rows to store before starting the image transformation. This must be less
than or equal to STORE_LINES.

TRANSFORMATION_TYPE Affine and perspective transformations are supported. Set this flag to ‘0’ for affine and ‘1’ for
perspective transformation.
INTERPOLATION_TYPE

Set flag to ‘1’ for bilinear interpolation and ‘0’ for nearest neighbor interpolation.

SRC_T

Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image.

COLS

Maximum width of input and output image.

NPC

Number of pixels to be processed per cycle; only one-pixel operation supported (XF_NPPC1).

src

Input image

dst

Output image

transformation_matrix

Transformation matrix that is applied to the input image.

Resource Utilization
The following table summarizes the resource utilization of the Warp transform, generated using
Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to progress a
grayscale HD (1080x1920) image.
Table 257: warpTransform Function Resource Utilization Summary

Transformation

INTERPOLATION
_TYPE

STORE
_LINES

START
_ROW

Utilization Estimate

Operating
Frequency
(MHz)

LUTs

FFs

DSPs

BRAMs

Perspective

Bilinear

100

50

300

7468

9804

61

112

Perspective

Nearest Neighbor

100

50

300

4514

6761

35

104

Affine

Bilinear

100

50

300

6139

5606

40

124

Affine

Nearest Neighbor

100

50

300

4611

4589

18

112

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
170

Send Feedback

Chapter 3: xfOpenCV Library API Reference

Performance Estimate
The following table summarizes a performance estimate of the Warp transform, as generated
using Vivado HLS 2018.2 tool for Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale
HD (1080x1920) image.
Table 258: warpTransform Function Performance Estimate Summary

INTERPOLATION
_TYPE

Transformation

Perspective

STORE
_LINES

START
_ROW

Operating
Frequency

Latency Estimate
Max (ms)

(MHz)

Bilinear

100

50

300

7.46

Perspective

Nearest Neighbor

100

50

300

7.31

Affine

Bilinear

100

50

300

7.31

Affine

Nearest Neighbor

100

50

300

7.24

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
171

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Chapter 4

Design Examples Using xfOpenCV
Library
All the hardware functions in the library have their own respective examples that are available in
the github. This section provides details of image processing functions and pipelines
implemented using a combination of various functions in xfOpenCV. They illustrate how to best
implement various functionalities using the capabilities of both the processor and the
programmable logic. These examples also illustrate different ways to implement complex
dataflow paths. The following examples are described in this section:
• Iterative Pyramidal Dense Optical Flow
• Corner Tracking Using Sparse Optical Flow
• Color Detection
• Difference of Gaussian Filter
• Stereo Vision Pipeline

Iterative Pyramidal Dense Optical Flow
The Dense Pyramidal Optical Flow example uses the xf::pyrDown and
xf::densePyrOpticalFlow hardware functions from the xfOpenCV library, to create an
image pyramid, iterate over it and compute the Optical Flow between two input images. The
example uses two hardware instances of the xf::pyrDown function to compute the image
pyramids of the two input images in parallel. The two image pyramids are processed by one
hardware instance of the xf::densePyrOpticalFlow function, starting from the smallest
image size going up to the largest image size. The output flow vectors of each iteration are fed
back to the hardware kernel as input to the hardware function. The output of the last iteration on
the largest image size is treated as the output of the dense pyramidal optical flow example.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
172

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Figure 9: Iterative Pyramidal Dense Optical Flow

Specific details of the implementation of the example on the host follow to help understand the
process in which the claimed throughput is achieved.

pyrof_hw()
The pyrof_hw() is the host function that computes the dense optical flow.
API Syntax
void pyrof_hw(cv::Mat im0, cv::Mat im1, cv::Mat flowUmat, cv::Mat
flowVmat, xf::Mat & flow,
xf::Mat & flow_iter,
xf::Mat mat_imagepyr1[NUM_LEVELS] ,
xf::Mat mat_imagepyr2[NUM_LEVELS] , int
pyr_h[NUM_LEVELS], int pyr_w[NUM_LEVELS])

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter

Description

im0

First input image in cv::Mat

im1

Second input image in cv::Mat

flowUmat

Allocated cv::Mat to store the horizontal component of the output flow vector

flowVmat

Allocated cv::Mat to store the vertical component of the output flow vector

flow

Allocated xf::Mat to temporarily store the packed flow vectors, during the iterative computation
using the hardware function

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
173

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Parameter

Description

flow_iter

Allocated xf::Mat to temporarily store the packed flow vectors, during the iterative computation
using the hardware function

mat_imagepyr1

An array, of size equal to the number of image pyramid levels, of xf::Mat to store the image
pyramid of the first image

mat_imagepyr2

An array, of size equal to the number of image pyramid levels, of xf::Mat to store the image
pyramid of the second image

pyr_h

An array of integers which includes the size of number of image pyramid levels, to store the
height of the image at each pyramid level

pyr_w

An array of integers which includes the size of the number of image pyramid levels, to store the
width of the image at each pyramid level

Dataflow
The pyrof_hw() function performs the following:
1. Set the sizes of the images in various levels of the image pyramid
2. Copy input images from cv::Mat format to the xf::Mat object allocated to contain the largest
image pyramid level
3. Create the image pyramid calling the pyr_dense_optical_flow_pyr_down_accel()
function
4. Use the pyr_dense_optical_flow_accel() function to compute the optical flow
output by iterating over the pyramid levels as input by the user
5. Unpack the flow vectors and convert them to the floating point, and return
The important steps 3 and 4 in the above processes will be explained in detail.

pyr_dense_optical_flow_pyr_down_accel()
API Syntax
void
pyr_dense_optical_flow_pyr_down_accel(xf::Mat mat_imagepyr1[NUM_LEVELS], xf::Mat
mat_imagepyr2[NUM_LEVELS])

Parameter Descriptions
The table below describes the template and the function parameters.
Parameter
mat_imagepyr1

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

Description
An array, of size equal to the number of image pyramid levels, of xf::Mat to store the image
pyramid of the first image. The memory location corresponding to the highest pyramid level [0]
in this allocated memory must contain the first input image.

www.xilinx.com

[placeholder text]
174

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Parameter
mat_imagepyr2

Description
An array, of size equal to the number of image pyramid levels, of xf::Mat to store the image
pyramid of the second image. The memory location corresponding to the highest pyramid level
[0] in this allocated memory must contain the second input image.

The pyr_dense_optical_flow_pyr_down_accel() just runs one for loop calling the
xf::pyrDown hardware function as follows:
for(int pyr_comp=0;pyr_comp(mat_imagepyr1[pyr_comp],
mat_imagepyr1[pyr_comp+1]);
#pragma SDS async(2)
#pragma SDS resource(2)
xf::pyrDown(mat_imagepyr2[pyr_comp],
mat_imagepyr2[pyr_comp+1]);
#pragma SDS wait(1)
#pragma SDS wait(2)
}

The code is straightforward without the pragmas, and the xf::pyrDown function is being called
twice every iteration. First with the first image and then with the second image. Note that the
input to the next iteration is the output of the current iteration. The pragma #pragma SDS
async(ID) makes the Arm® processor call the hardware function and not wait for the hardware
function to return. The Arm processor takes some cycles to call the function, which includes
programming the DMA. The pragma #pragma SDS wait(ID) makes the Arm processor wait for the
hardware function called with the async(ID) pragma to finish processing. The pragma #pragma
SDS resource(ID) creates a separate hardware instance each time the hardware function is called
with a different ID. With this new information it is easy to assimilate that the loop in the above
host function calls the two hardware instances of xf::pyrDown functions in parallel, waits until
both the functions return and proceed to the next iteration.
Dense Pyramidal Optical Flow Computation
for (int l=NUM_LEVELS-1; l>=0; l--) {
//compute current level height
int curr_height = pyr_h[l];
int curr_width = pyr_w[l];
//compute the flow vectors for the current pyramid level
iteratively
for(int iterations=0;iterations init_flag = ((iterations==0) && (l==NUM_LEVELS-1))?

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
175

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

1 : 0;

if(flag_flowin)
{
flow.rows = pyr_h[l];
flow.cols = pyr_w[l];
flow.size = pyr_h[l]*pyr_w[l];
pyr_dense_optical_flow_accel(mat_imagepyr1[l],
mat_imagepyr2[l], flow_iter, flow, l, scale_up_flag, scale_in, init_flag);
flag_flowin = 0;
}
else
{
flow_iter.rows = pyr_h[l];
flow_iter.cols = pyr_w[l];
flow_iter.size = pyr_h[l]*pyr_w[l];
pyr_dense_optical_flow_accel(mat_imagepyr1[l],
mat_imagepyr2[l], flow, flow_iter, l, scale_up_flag, scale_in, init_flag);
flag_flowin = 1;
}
}//end iterative coptical flow computation
} // end pyramidal iterative optical flow HLS computation

The Iterative Pyramidal Dense Optical Flow is computed in a nested for loop which runs for
iterations*pyramid levels number of iterations. The main loop starts from the smallest image size
and iterates up to the largest image size. Before the loop iterates in one pyramid level, it sets the
current pyramid level’s height and width, in curr_height and current_width variables. In the
nested loop, the next_height variable is set to the previous image height if scaling up is necessary,
that is, in the first iterations. As divisions are costly and one time divisions can be avoided in
hardware, the scale factor is computed in the host and passed as an argument to the hardware
kernel. After each pyramid level, in the first iteration, the scale-up flag is set to let the hardware
function know that the input flow vectors need to be scaled up to the next higher image size.
Scaling up is done using bilinear interpolation in the hardware kernel.
After all the input data is prepared, and the flags are set, the host processor calls the hardware
function. Please note that the host function swaps the flow vector inputs and outputs to the
hardware function to iteratively solve the optimization problem. Also note that the
pyr_dense_optical_flow_accel() function is just a wrapper to the hardware function
xf::densePyrOpticalFlow. Template parameters to the hardware function are passed inside
this wrapper function.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
176

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Corner Tracking Using Sparse Optical Flow
This example illustrates how to detect and track the characteristic feature points in a set of
successive frames of video. A Harris corner detector is used as the feature detector, and a
modified version of Lucas Kanade optical flow is used for tracking. The core part of the algorithm
takes in current and next frame as the inputs and outputs the list of tracked corners. If the
current image is the first frame in the set, then corner detection is performed to detect the
features to track. The number of frames in which the points need to be tracked is also provided
as the input.
Corner tracking example uses five hardware functions from the xfOpenCV library
xf::cornerHarris, xf:: cornersImgToList, xf::cornerUpdate, xf::pyrDown, and
xf::densePyrOpticalFlow.
Figure 10: Corner Tracking Using Sparse Optical Flow

Detect Harris
corners

Current frame

Next frame

Compute image pyramid
for current frame

Compute image pyramid
for next frame

Extract corners
to list

Compute dense
optical flow

Update location of
corners

Tracked corners

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
177

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

A new hardware function, xf::cornerUpdate, has been added to ensure that the dense flow
vectors from the output of thexf::densePyrOpticalFlow function are sparsely picked and
stored in a new memory location as a sparse array. This was done to ensure that the next
function in the pipeline would not have to surf through the memory by random accesses. The
function takes corners from Harris corner detector and dense optical flow vectors from the
dense pyramidal optical flow function and outputs the updated corner locations, tracking the
input corners using the dense flow vectors, thereby imitating the sparse optical flow behavior.
This hardware function runs at 300 MHz for 10,000 corners on a 720p image, adding very
minimal latency to the pipeline.

cornerUpdate()
API Syntax
template 
void cornerUpdate(ap_uint<64> *list_fix, unsigned int *list, uint32_t
nCorners, xf::Mat &flow_vectors, ap_uint<1>
harris_flag)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 259: CornerUpdate Function Parameter Descriptions
Parameter

Description

MAXCORNERSNO

Maximum number of corners that the function needs to work on

TYPE

Input Pixel Type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)

ROWS

Maximum height of input and output image (Must be multiple of 8)

COLS

Maximum width of input and output image (Must be multiple of 8)

NPC

Number of pixels to be processed per cycle. This function supports only XF_NPPC1 or 1-pixel per cycle
operations.

list_fix

A list of packed fixed point coordinates of the corner locations in 16, 5 (16 integer bits and 5 fractional
bits) format. Bits from 20 to 0 represent the column number, while the bits 41 to 21 represent the row
number. The rest of the bits are used for flag, this flag is set when the tracked corner is valid.

list

A list of packed positive short integer coordinates of the corner locations in unsigned short format.
Bits from 15 to 0 represent the column number, while the bits 31 to 16 represent the row number.
This list is same as the list output by Harris Corner Detector.

nCorners

Number of corners to track

flow_vectors

Packed flow vectors as in xf::DensePyrOpticalFlow function

harris_flag

If set to 1, the function takes input corners from list.
if set to 0, the function takes input corners from list_fix.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
178

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

The example codeworks on an input video which is read and processed using the xfOpenCV
library. The core processing and tracking is done by the xf_corner_tracker_accel()
function at the host.

cornersImgToList()
API Syntax
template 
void cornersImgToList(xf::Mat &_src, unsigned int
list[MAXCORNERSNO], unsigned int *ncorners)

Parameter Descriptions
The following table describes the template and theKintex® UltraScale+™ function parameters.
Table 260: CornerImgToList Function Parameter Descriptions
Parameter

Description

_src

The output image of harris corner detector. The size of this xf::Mat object is the size of the input
image to Harris corner detector. The value of each pixel is 255 if a corner is present in the location, 0
otherwise.

list

A 32 bit memory allocated, the size of MAXCORNERS, to store the corners detected by Harris Detector

ncorners

Total number of corners detected by Harris, that is, the number of corners in the list

cornerTracker()
The xf_corner_tracker_accel() function does the core procesing and tracking at the
host.
API Syntax
void cornerTracker(xf::Mat & flow,
xf::Mat & flow_iter,
xf::Mat mat_imagepyr1[NUM_LEVELS] ,
xf::Mat mat_imagepyr2[NUM_LEVELS] ,
xf::Mat &inHarris, xf::Mat &outHarris, unsigned int *list, ap_uint<64>
*listfixed, int pyr_h[NUM_LEVELS], int pyr_w[NUM_LEVELS], unsigned int
*num_corners, unsigned int harrisThresh, bool *harris_flag)

Parameter Descriptions
The table below describes the template and the function parameters.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
179

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Parameter

Description

flow

Allocated xf::Mat to temporarily store the packed flow vectors during the iterative computation
using the hardware function

flow_iter

Allocated xf::Mat to temporarily store the packed flow vectors during the iterative computation
using the hardware function

mat_imagepyr1
mat_imagepyr2

An array, of size equal to the number of image pyramid levels, of xf::Mat to store the image
pyramid of the second imageAn array, of size equal to the number of image pyramid levels, of
xf::Mat to store the image pyramid of the first image

inHarris

Input image to Harris Corner Detector in xf::Mat

outHarris

Output image from Harris detector. Image has 255 if a corner is present in the location and 0
otherwise

list

A 32 bit memory allocated, the size of MAXCORNERS, to store the corners detected by Harris
Detector

listfixed

A 64 bit memory allocated, the size of MAXCORNERS, to store the corners tracked by
xf::cornerUpdate

pyr_h

An array of integers the size of number of image pyramid levels to store the height of the image
at each pyramid level

pyr_w

An array of integers the size of number of image pyramid levels to store the width of the image
at each pyramid level

num_corners

An array, of size equal to the number ofNumber of corners detected by Harris Corner Detector

harrisThresh

Threshold input to the Harris Corner Detector, xf::harris

harris_flag

Flag used by the caller of this function to use the corners detected by xf::harris for the set of
input images

Image Processing
The following steps demonstrate the Image Processing procedure in the hardware pipeline
1. xf::cornerharris is called to start processing the first input image
2. The output of xf::cornerHarris is pipelined by SDSoC on hardware to
xf::cornersImgToList. This function takes in an image with corners marked as 255 and
0 elsewhere, and converts them to a list of corners.
3. Simultaneously, xf::pyrDown creates the two image pyramids and Dense Optical Flow is
computed using the two image pyramids as described in the Iterative Pyramidal Dense
Optical Flow example.
4. xf::densePyrOpticalFlow is called with the two image pyramids as inputs.
5. xf::cornerUpdate function is called to track the corner locations in the second image. If
harris_flag is enabled, the cornerUpdate tracks corners from the output of the list, else it
tracks the previously tracked corners.
if(*harris_flag == true)
{
#pragma SDS async(1)
xf::cornerHarris(inHarris, outHarris, Thresh, k);

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
180

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

#pragma SDS async(2)
xf::cornersImgToList(outHarris,
list, &nCorners);
}
//Code to compute Iterative Pyramidal Dense Optical Flow
if(*harris_flag == true)
{
#pragma SDS wait(1)
#pragma SDS wait(2)
*num_corners = nCorners;
}
if(flag_flowin)
{
xf::cornerUpdate(listfixed,
list, *num_corners, flow_iter, (ap_uint<1>)(*harris_flag));
}
else
{
xf::cornerUpdate(listfixed,
list, *num_corners, flow, (ap_uint<1>)(*harris_flag));
}
if(*harris_flag == true)
{
*harris_flag = false;
}

The xf_corner_tracker_accel() function takes a flag called harris_flag which is set during
the first frame or when the corners need to be redetected. The xf::cornerUpdate function
outputs the updated corners to the same memory location as the output corners list of
xf::cornerImgToList. This means that when harris_flag is unset, the corners input to the
xf::cornerUpdate are the corners tracked in the previous cycle, that is, the corners in the
first frame of the current input frames.
After the Dense Optical Flow is computed, if harris_flag is set, the number of corners that
xf::cornerharris has detected and xf::cornersImgToList has updated is copied to
num_corners variable which is one of the outputs of the xf_corner_tracker_accel()
function. The other being the tracked corners list, listfixed. If harris_flag is set,
xf::cornerUpdate tracks the corners in ‘list’ memory location, otherwise it tracks the corners
in ‘listfixed’ memory location.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
181

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Color Detection
The Color Detection algorithm is basically used for color object tracking and object detection,
based on the color of the object. The color based methods are very useful for object detection
and segmentation, when the object and the background have a significant difference in color.
The Color Detection example uses four hardware functions from the xfOpenCV library. They are:
• xf::RGB2HSV
• xf::colorthresholding
• xf:: erode
• xf:: dilate
In the Color Detection example, the color space of the original BGR image is converted into an
HSV color space. Because HSV color space is the most suitable color space for color based image
segmentation. Later, based on the H (hue), S (saturation) and V (value) values, apply the
thresholding operation on the HSV image and return either 255 or 0. After thresholding the
image, apply erode (morphological opening) and dilate (morphological opening) functions to
reduce unnecessary white patches (noise) in the image. Here, the example uses two hardware
instances of erode and dilate functions. The erode followed by dilate and once again applying
dilate followed by erode.
Figure 11: Color Detection

The following example demonstrates the Color Detection algorithm.
void colordetect_accel(xf::Mat &_src,
xf::Mat &_rgb2hsv,
xf::Mat &_thresholdedimg,
xf::Mat &_erodeimage1,
xf::Mat &_dilateimage1,
xf::Mat &_dilateimage2,
xf::Mat &_dst,
unsigned char *low_thresh, unsigned char *high_thresh){
xf::RGB2HSV< XF_8UC4,HEIGHT, WIDTH, XF_NPPC1>(_src, _rgb2hsv);
xf::colorthresholding(_rgb2hsv,_ thresholdedimage, low_thresh, high_thresh);
xf::erode(_thresholdedimg, _
erodeimage1);
xf::dilate(_
erodeimage1, _ dilateimage1);

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
182

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

xf::dilate(_
dilateimage1, _ dilateimage2);
xf::erode(_
dilateimage2, _dst);
}

In the given example, the source image is passed to the xf::RGB2HSV function, the output of
that function is passed to the xf::colorthresholding module, the thresholded image is
passed to the xf::erode function and, the xf::dilate functions and the final output image
are returned.

Difference of Gaussian Filter
The Difference of Gaussian Filter example uses four hardware functions from the xfOpenCV
library. They are:
• xf:: GaussianBlur
• xf:: duplicateMat
• xf:: delayMat
• xf::subtract
The Difference of Gaussian Filter function can be implemented by applying Gaussian Filter on
the original source image, and that Gaussian blurred image is duplicated as two images. The
Gaussian blur function is applied to one of the duplicated images, whereas the other one is
stored as it is. Later, perform the Subtraction function on, two times Gaussian applied image and
one of the duplicated image. Here, the duplicated image has to wait until the Gaussian applied
for other one generates at least for one pixel output. Therefore, here xf::delayMat function is
used to add delay.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
183

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

Figure 12: Difference of Gaussian Filter

The following example demonstrates the Difference of Gaussian Filter example.
void gaussian_diff_accel(xf::Mat &imgInput,
xf::Mat &imgin1,
xf::Mat &imgin2,
xf::Mat &imgin3,
xf::Mat &imgin4,
xf::Mat &imgin5,
xf::Mat&imgOutput,
float sigma)
{
xf::GaussianBlur
(imgInput, imgin1, sigma);
xf::duplicateMat(imgin1,imgin2,imgin3);
xf::delayMat(imgin3,imgin5);
xf::GaussianBlur
(imgin2, imgin4, sigma);
xf::subtract(imgin5,imgin4,imgOutput);
}

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
184

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

In the given example, the Gaussain Blur function is applied for source image imginput, and
resultant image imgin1 is passed to xf::duplicateMat. The imgin2 and imgin3 are the
duplicate images of Gaussian applied image. Again gaussian blur is applied to imgin2 and the
result is stored in imgin4. Now, perform the subtraction between imgin4 and imgin3, but
here imgin3 has to wait up to at least one pixel of imgin4 generation. So, delay has applied for
imgin3 and stored in imgin5. Finally the subtraction performed on imgin4 and imgin5.

Stereo Vision Pipeline
Disparity map generation is one of the first steps in creating a three dimensional map of the
environment. The xfOpenCV library has components to build an image processing pipeline to
compute a disparity map given the camera parameters and inputs from a stereo camera setup.
The two main components involved in the pipeline are stereo rectification and disparity
estimation using local block matching method. While disparity estimation using local block
matching is a discrete component in xfOpenCV, rectification block can be constructed using
xf::InitUndistortRectifyMapInverse() and xf::Remap(). The dataflow pipeline is
shown below. The camera parameters are an additional input to the pipeline.
Figure 13: Stereo Vision Pipeline

The following code is for the pipeline.
void stereopipeline_accel(xf::Mat
&leftMat, xf::Mat &rightMat,
xf::Mat &dispMat,
xf::Mat &mapxLMat,
xf::Mat &mapyLMat,
xf::Mat &mapxRMat,
xf::Mat &mapyRMat,
xf::Mat &leftRemappedMat,
xf::Mat &rightRemappedMat,
xf::xFSBMState
&bm_state, ap_fixed<32,12> *cameraMA_l_fix, ap_fixed<32,12>
*cameraMA_r_fix, ap_fixed<32,12> *distC_l_fix, ap_fixed<32,12>

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
185

Send Feedback

Chapter 4: Design Examples Using xfOpenCV Library

*distC_r_fix,
ap_fixed<32,12> *irA_l_fix, ap_fixed<32,12> *irA_r_fix, int _cm_size,
int _dc_size)
{
xf::InitUndistortRectifyMapInverse(cameraMA_l_fix, distC_l_fix, irA_l_fix,
mapxLMat,mapyLMat, _cm_size, _dc_size);
xf::remap(leftMat, leftRemappedMat, mapxLMat, mapyLMat,
XF_INTERPOLATION_BILINEAR);
xf::InitUndistortRectifyMapInverse(cameraMA_r_fix, distC_r_fix, irA_r_fix, mapxRMat, mapyRMat,
_cm_size, _dc_size);
xf::remap(rightMat, rightRemappedMat, mapxRMat, mapyRMat,
XF_INTERPOLATION_BILINEAR);
xf::StereoBM(leftRemappedMat,
rightRemappedMat, dispMat, bm_state);
}

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
186

Send Feedback

Appendix A

Additional Resources and Legal
Notices
Xilinx Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx
Support.
Solution Centers
See the Xilinx Solution Centers for support on devices, software tools, and intellectual property
at all stages of the design cycle. Topics include design assistance, advisories, and troubleshooting
tips

Xilinx Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx
Support.

Documentation Navigator and Design
Hubs
Xilinx® Documentation Navigator provides access to Xilinx documents, videos, and support
resources, which you can filter and search to find information. To open the Xilinx Documentation
Navigator (DocNav):
• From the Vivado® IDE, select Help → Documentation and Tutorials.
• On Windows, select Start → All Programs → Xilinx Design Tools → DocNav.
• At the Linux command prompt, enter docnav.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
187

Send Feedback

Appendix A: Additional Resources and Legal Notices

Xilinx Design Hubs provide links to documentation organized by design tasks and other topics,
which you can use to learn key concepts and address frequently asked questions. To access the
Design Hubs:
• In the Xilinx Documentation Navigator, click the Design Hubs View tab.
• On the Xilinx website, see the Design Hubs page.
Note: For more information on Documentation Navigator, see the Documentation Navigator page on the
Xilinx website.

References
These documents provide supplemental material useful with this guide:
1. SDSoC Environments Release Notes, Installation, and Licensing Guide (UG1294)
2. SDSoC Environment User Guide (UG1027)
3. SDSoC Environment Getting Started Tutorial (UG1028)
4. SDSoC Environment Platform Creation Tutorial (UG1236)
5. SDSoC Environment Platform Development Guide (UG1146)
6. SDSoC Environment Profiling and Optimization Guide (UG1235)
7. SDx Command and Utility Reference Guide (UG1279)
8. SDSoC Environment Programmers Guide (UG1278)
9. SDSoC Environment Debugging Guide (UG1282)
10. SDx Pragma Reference Guide (UG1253)
11. UltraFast Embedded Design Methodology Guide (UG1046)
12. Zynq-7000 SoC Software Developers Guide (UG821)
13. Zynq UltraScale+ MPSoC: Software Developers Guide (UG1137)
14. ZC702 Evaluation Board for the Zynq-7000 XC7Z020 SoC User Guide (UG850)
15. ZCU102 Evaluation Board User Guide (UG1182)
16. Vivado Design Suite User Guide: High-Level Synthesis (UG902)
17. Vivado Design Suite: Creating and Packaging Custom IP (UG1118)
18. SDSoC Development Environment web page
19. Vivado® Design Suite Documentation

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
188

Send Feedback

Appendix A: Additional Resources and Legal Notices

Please Read: Important Legal Notices
The information disclosed to you hereunder (the "Materials") is provided solely for the selection
and use of Xilinx products. To the maximum extent permitted by applicable law: (1) Materials are
made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND
CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY
PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including
negligence, or under any other theory of liability) for any loss or damage of any kind or nature
related to, arising under, or in connection with, the Materials (including your use of the
Materials), including for any direct, indirect, special, incidental, or consequential loss or damage
(including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any
action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx
had been advised of the possibility of the same. Xilinx assumes no obligation to correct any
errors contained in the Materials or to notify you of updates to the Materials or to product
specifications. You may not reproduce, modify, distribute, or publicly display the Materials
without prior written consent. Certain products are subject to the terms and conditions of
Xilinx's limited warranty, please refer to Xilinx's Terms of Sale which can be viewed at https://
www.xilinx.com/legal.htm#tos; IP cores may be subject to warranty and support terms contained
in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safe or
for use in any application requiring fail-safe performance; you assume sole risk and liability for
use of Xilinx products in such critical applications, please refer to Xilinx's Terms of Sale which can
be viewed at https://www.xilinx.com/legal.htm#tos.
AUTOMOTIVE APPLICATIONS DISCLAIMER
AUTOMOTIVE PRODUCTS (IDENTIFIED AS "XA" IN THE PART NUMBER) ARE NOT
WARRANTED FOR USE IN THE DEPLOYMENT OF AIRBAGS OR FOR USE IN APPLICATIONS
THAT AFFECT CONTROL OF A VEHICLE ("SAFETY APPLICATION") UNLESS THERE IS A
SAFETY CONCEPT OR REDUNDANCY FEATURE CONSISTENT WITH THE ISO 26262
AUTOMOTIVE SAFETY STANDARD ("SAFETY DESIGN"). CUSTOMER SHALL, PRIOR TO USING
OR DISTRIBUTING ANY SYSTEMS THAT INCORPORATE PRODUCTS, THOROUGHLY TEST
SUCH SYSTEMS FOR SAFETY PURPOSES. USE OF PRODUCTS IN A SAFETY APPLICATION
WITHOUT A SAFETY DESIGN IS FULLY AT THE RISK OF CUSTOMER, SUBJECT ONLY TO
APPLICABLE LAWS AND REGULATIONS GOVERNING LIMITATIONS ON PRODUCT
LIABILITY.
Copyright
© Copyright 2017-2018 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex,
Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the United
States and other countries. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by
permission by Khronos. All other trademarks are the property of their respective owners.

UG1233 (v2018.2) July 2, 2018
Xilinx OpenCV User Guide

www.xilinx.com

[placeholder text]
189

Send Feedback



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Language                        : EN
XMP Toolkit                     : Adobe XMP Core 5.6-c015 84.159810, 2016/09/10-02:41:30
Format                          : application/pdf
Creator                         : Xilinx, Inc.
Description                     : Describes the FPGA-optimized OpenCV libraries, called xFast OpenCV (xFopencv) libraries. The document is intended for application developers using Xilinx® Zynq® family devices.
Title                           : Xilinx OpenCV User Guide
Creator Tool                    : AH XSL Formatter V6.4 MR2 for Linux64 : 6.4.4.28196 (2017/03/15 08:49JST)
Create Date                     : 2018:07:11 22:57:03-08:00
Modify Date                     : 2018:07:12 11:50:03-06:00
Metadata Date                   : 2018:07:12 11:50:03-06:00
Producer                        : Xilinx, Inc. via ABCpdf
Trapped                         : False
Xilinx Doc ID                   : UG1233
Document ID                     : uuid:0643ccd7-02ce-45b8-b4c1-8a6b270e73d5
Instance ID                     : uuid:91d120c0-23b9-4bf8-bbd7-badeb319a99f
Page Layout                     : SinglePage
Page Mode                       : UseOutlines
Page Count                      : 189
Author                          : Xilinx, Inc.
Keywords                        : SDx, SDAccel, SDSoC, Vivado HLS, OpenCV, xFopenCV, Zynq
Subject                         : Describes the FPGA-optimized OpenCV libraries, called xFast OpenCV (xFopencv) libraries. The document is intended for application developers using Xilinx® Zynq® family devices.
EXIF Metadata provided by EXIF.tools

Navigation menu