Xilinx OpenCV User Guide Ug1233
- Xilinx OpenCV User Guide
- Revision History
- Table of Contents
- Ch. 1: Overview
- Ch. 2: Getting Started
- Ch. 3: xfOpenCV Library API Reference
- xf::Mat Image Container Class
- xfOpenCV Library Functions
- Absolute Difference
- Accumulate
- Accumulate Squared
- Accumulate Weighted
- Bilateral Filter
- Bit Depth Conversion
- Bitwise AND
- Bitwise NOT
- Bitwise OR
- Bitwise XOR
- Box Filter
- Canny Edge Detection
- Channel Combine
- Channel Extract
- Color Conversion
- Color Thresholding
- Custom Convolution
- Delay
- Dilate
- Duplicate
- Erode
- FAST Corner Detection
- Gaussian Filter
- Gradient Magnitude
- Gradient Phase
- Harris Corner Detection
- Histogram Computation
- Histogram Equalization
- HOG
- Houghlines
- Pyramid Up
- Pyramid Down
- InitUndistortRectifyMapInverse
- Integral Image
- Dense Pyramidal LK Optical Flow
- Dense Non-Pyramidal LK Optical Flow
- Mean and Standard Deviation
- Median Blur Filter
- MinMax Location
- Mean Shift Tracking
- Otsu Threshold
- Pixel-Wise Addition
- Pixel-Wise Multiplication
- Pixel-Wise Subtraction
- Remap
- Resolution Conversion (Resize)
- RGB2HSV
- Scharr Filter
- Sobel Filter
- Stereo Local Block Matching
- Semi Global Method for Stereo Disparity Estimation
- SVM
- Thresholding
- WarpAffine
- WarpPerspective
- Atan2
- Inverse (Reciprocal)
- Look Up Table
- Square Root
- WarpTransform
- Ch. 4: Design Examples Using xfOpenCV Library
- Appx. A: Additional Resources and Legal Notices

Revision History
The following table shows the revision history for this document.
Section | Revision Summary
07/02/2018 Version 2018.2
Houghlines | Updated the function description and its respective tables
xfOpenCV Library Functions | Added a note to the xfOpenCV Library Functions table
06/06/2018 Version 2018.2
Houghlines | Added a new function
Semi Global Method for Stereo Disparity Estimation | Added a new function
04/04/2018 Version 2018.1
General Updates | Minor editorial updates for 2018.1 release
InitUndistortRectifyMapInverse | Added a new function
cornersImgToList() | Added a new function
UG1233 (v2018.2) July 2, 2018 www.xilinx.com [placeholder text]
Xilinx OpenCV User Guide 2

Table of Contents
Revision History
Chapter 1: Overview
Basic Features
xfOpenCV Kernel on the reVISION Platform
Chapter 2: Getting Started
Prerequisites
xfOpenCV Library Contents
Using the xfOpenCV Library
Changing the Hardware Kernel Configuration
Using the xfOpenCV Library Functions on Hardware
Chapter 3: xfOpenCV Library API Reference
xf::Mat Image Container Class
xfOpenCV Library Functions
Chapter 4: Design Examples Using xfOpenCV Library
Iterative Pyramidal Dense Optical Flow
Corner Tracking Using Sparse Optical Flow
Color Detection
Difference of Gaussian Filter
Stereo Vision Pipeline
Appendix A: Additional Resources and Legal Notices
Xilinx Resources
Documentation Navigator and Design Hubs
References
Please Read: Important Legal Notices

Chapter 1
Overview
This document describes the FPGA device optimized OpenCV library, called the Xilinx®
xfOpenCV library, and is intended for application developers using Zynq®-7000 SoC and Zynq
UltraScale+ MPSoC devices. The xfOpenCV library is designed to work in the SDx™
development environment, and provides a software interface for computer vision functions
accelerated on an FPGA device. xfOpenCV library functions are mostly similar in functionality to
their OpenCV equivalents. Any deviations, if present, are documented.
Note: For more information on the xfOpenCV library prerequisites, see Prerequisites. To familiarize
yourself with the steps required to use the xfOpenCV library functions, see Using the xfOpenCV
Library.
Basic Features
All xfOpenCV library functions follow a common format. The following properties hold true for
all the functions.
• All the functions are designed as templates, and all arguments that are images must be
provided as xf::Mat.
• All functions are defined in the xf namespace.
• Some of the major template arguments are:
○ Maximum size of the image to be processed
○ Datatype defining the properties of each pixel
○ Number of pixels to be processed per clock cycle
○ Other compile-time arguments relevant to the functionality
The xfOpenCV library contains enumerated datatypes that enable you to configure xf::Mat.
For more details on xf::Mat, see the xf::Mat Image Container Class.
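The template-based format described above can be illustrated with a small software-only sketch. It does not use the xfOpenCV headers: the sketch namespace, the Image container, and the parameter names are hypothetical stand-ins, chosen only to show how the maximum image size, the pixel datatype, and the number of pixels per clock cycle can be carried as compile-time arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace; the library itself uses xf

// Hypothetical stand-in for an image container whose limits are fixed at
// compile time: pixel type, maximum size, and pixels processed per clock.
template <typename PixelT, int MAX_ROWS, int MAX_COLS, int PIXELS_PER_CLOCK>
struct Image {
    static_assert(PIXELS_PER_CLOCK > 0, "at least one pixel per clock");
    int rows;
    int cols;
    std::vector<PixelT> data;

    Image(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c) {
        // The run-time size may be smaller than the compile-time maximum.
        assert(r > 0 && r <= MAX_ROWS && c > 0 && c <= MAX_COLS);
    }
};

// Template function in the style described above: every image argument shares
// the same compile-time parameters, which are deduced from the arguments.
template <typename PixelT, int MAX_ROWS, int MAX_COLS, int PPC>
void absdiff(const Image<PixelT, MAX_ROWS, MAX_COLS, PPC>& a,
             const Image<PixelT, MAX_ROWS, MAX_COLS, PPC>& b,
             Image<PixelT, MAX_ROWS, MAX_COLS, PPC>& out) {
    assert(a.rows == b.rows && a.cols == b.cols);
    assert(a.rows == out.rows && a.cols == out.cols);
    const std::size_t total = a.data.size();
    const std::size_t step = PPC;
    // Outer loop: one iteration models one clock cycle.
    for (std::size_t i = 0; i < total; i += step) {
        // Inner loop: the PPC pixels handled within that cycle.
        for (std::size_t k = 0; k < step && i + k < total; ++k) {
            const PixelT x = a.data[i + k];
            const PixelT y = b.data[i + k];
            out.data[i + k] = static_cast<PixelT>(x > y ? x - y : y - x);
        }
    }
}

}  // namespace sketch
```

Because the parameters are part of the type, a call such as sketch::absdiff(a, b, out) only compiles when all three images were declared with the same limits. This is a software model of the idea, not a description of how the hardware kernels are built.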

xfOpenCV Kernel on the reVISION Platform
The xfOpenCV library is designed to be used with the SDx™ development environment.
xfOpenCV kernels are evaluated on the reVISION™ platform.
The following steps describe the general flow of an example design, where both the input and
the output are image files.
1. Read the image using cv::imread().
2. Copy the data to xf::Mat.
3. Call the processing function(s) in xfOpenCV.
4. Copy the data from xf::Mat to cv::Mat.
5. Write the output to an image file using cv::imwrite().
The entire code is written as the host code for the pipeline, from which all the calls to xfOpenCV
functions are moved to hardware. Functions from OpenCV are used to read and write images in
memory. The image containers for xfOpenCV library functions are xf::Mat objects. For
more information, see the xf::Mat Image Container Class.
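The shape of this flow can be sketched without OpenCV or xfOpenCV installed. In the sketch below, HostImage and KernelImage are hypothetical stand-ins for cv::Mat and xf::Mat, and threshold_kernel is a placeholder for whichever library function would be moved to hardware. None of these names come from the library. Steps 1 and 5 (file read and write) are left as comments, because they depend on OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for cv::Mat: a host-side 8-bit grayscale image.
struct HostImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical stand-in for xf::Mat: the container handed to the kernel.
struct KernelImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> pixels;
};

// Step 2: copy the host data into the kernel-side container.
KernelImage copy_to_kernel(const HostImage& src) {
    return KernelImage{src.rows, src.cols, src.pixels};
}

// Step 3: placeholder for the accelerated function (a binary threshold here).
KernelImage threshold_kernel(const KernelImage& in, std::uint8_t thresh) {
    KernelImage out{in.rows, in.cols,
                    std::vector<std::uint8_t>(in.pixels.size())};
    for (std::size_t i = 0; i < in.pixels.size(); ++i) {
        out.pixels[i] = in.pixels[i] > thresh ? 255 : 0;
    }
    return out;
}

// Step 4: copy the result back into the host-side container.
HostImage copy_to_host(const KernelImage& src) {
    return HostImage{src.rows, src.cols, src.pixels};
}

// Steps 2 to 4 chained together. In a real host program, step 1
// (cv::imread) would produce the input and step 5 (cv::imwrite) would
// consume the returned image.
HostImage run_pipeline(const HostImage& input, std::uint8_t thresh) {
    const KernelImage in = copy_to_kernel(input);
    const KernelImage out = threshold_kernel(in, thresh);
    return copy_to_host(out);
}
```

Keeping the two container types distinct mirrors the point made above: only the call in step 3 is a candidate for hardware, while the copies on either side remain host code.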
The reVISION platform supports both live and file input-output (I/O) modes. For more details, see
the reVISION Getting Started Guide.
• File I/O mode enables the controller to transfer images from the SD card to the hardware kernel.
The following steps describe the file I/O mode.
1. The processing system (PS) reads the image frame from the SD card and stores it in the
DRAM.
2. The xfOpenCV kernel reads the image from the DRAM, processes it, and stores the output
back in the DRAM.
3. The PS reads the output image frame from the DRAM and writes it back to the SD card.
• Live I/O mode enables streaming frames into the platform, processing frames with the
xfOpenCV kernel, and streaming out the frames through the appropriate interface. The
following steps describe the live I/O mode.
1. Video capture IPs receive a frame and store it in the DRAM.
2. The xfOpenCV kernel fetches the image from the DRAM, processes the image, and stores
the output in the DRAM.
3. Display IPs read the output frame from the DRAM and transmit the frame through the
appropriate display interface.
The following figure shows the reVISION platform with the xfOpenCV kernel block:

Figure 1: xfOpenCV Kernel on the reVISION Platform
[Block diagram. Labels: Processing System (PS), Programmable Logic (PL), ARM Core, Central Interconnect, DDR Controller, HP Ports, HPM/GP Ports, HDMI TX and RX IPs, Data Movers, AXI Interconnects, xfOpenCV Kernel Interface, xfOpenCV Kernel, AXIS, AXIMM.]
Note: For more information on the PS-PL interfaces and PL-DDR interfaces, see the Zynq UltraScale+
Device Technical Reference Manual (UG1085).

Chapter 2
Getting Started
This chapter provides the information you need to bring up your design using the xfOpenCV
library functions.
Prerequisites
This section lists the prerequisites for using the xfOpenCV library functions on ZCU102 based
platforms. The methodology holds true for ZC702 and ZC706 reVISION platforms as well.
• Download and install the SDx development environment according to the directions provided
in SDSoC Environments Release Notes, Installation, and Licensing Guide (UG1294). Before
launching the SDx development environment on Linux, set the $SYSROOT environment
variable to point to the Linux root file system delivered with the reVISION platform. For
example:
export SYSROOT=<local folder>/zcu102_[es2_]rv_ss/sw/aarch64-linux-gnu/sysroot
• Download the Zynq® UltraScale+™ MPSoC Embedded Vision Platform zip file and extract its
contents. Create the SDx development environment workspace in the
zcu102_[es2_]rv_ss folder of the extracted design file hierarchy. For more details, see the
reVISION Getting Started Guide.
• Set up the ZCU102 evaluation board. For more details, see the reVISION Getting Started
Guide.
• Download the xfOpenCV library. This library is made available through GitHub. Run the
following git clone command to clone the xfOpenCV repository to your local disk:
git clone https://github.com/Xilinx/xfopencv.git
xfOpenCV Library Contents
The following table lists the contents of the xfOpenCV library.

Table 1: xfOpenCV Library Contents
Folder | Details
include | Contains the header files required by the library.
include/common | Contains the common library infrastructure headers, such as types specific to the library.
include/core | Contains the core library functionality headers, such as the math functions.
include/features | Contains the feature extraction kernel function definitions. For example, Harris.
include/imgproc | Contains all the kernel function definitions, except the ones available in the features folder.
examples | Contains the sample test bench code to facilitate running unit tests. The examples/ folder contains folders named after the algorithms. Each algorithm folder contains host files, a .json file, and a data folder. For more details on how to use the xfOpenCV library, see xfOpenCV Kernel on the reVISION Platform.
Using the xfOpenCV Library
This section describes using the xfOpenCV library in the SDx development environment.
Note: The instructions in this section assume that you have downloaded and installed all the required
packages. For more information, see the Prerequisites.
The xfOpenCV library is structured as shown in Table 1. The include folder
constitutes all the necessary components to build a Computer Vision or Image Processing
pipeline using the library. The folders common and core contain the infrastructure that the
library functions need for basic functions, the Mat class, and macros. The library functions are
categorized into two folders, features and imgproc, based on the operation they perform.
The names of the folders are self-explanatory.
To work with the library functions, you need to include the path to the include folder in the
SDx project. After you add the include folder's path to the compiler settings, you can include the
relevant header files for the library functions you will be working with. For example, if you would like
to work with Harris Corner Detector and Bilateral Filter, you must use the following lines in the
host code:
#include "features/xf_harris.hpp" //for Harris Corner Detector
#include "imgproc/xf_bilateral_filter.hpp" //for Bilateral Filter

After the headers are included, you can work with the library functions as described in
Chapter 3: xfOpenCV Library API Reference, using the examples in the examples folder as
reference.
The following table gives the name of the header file, including the folder name, which contains
the library function.
Table 2: xfOpenCV Library Contents
Function Name | File Path in the include folder
xf::accumulate | imgproc/xf_accumulate_image.hpp
xf::accumulateSquare | imgproc/xf_accumulate_squared.hpp
xf::accumulateWeighted | imgproc/xf_accumulate_weighted.hpp
xf::absdiff, xf::add, xf::subtract, xf::bitwise_and, xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor | core/xf_arithm.hpp
xf::bilateralFilter | imgproc/xf_bilateral_filter.hpp
xf::boxFilter | imgproc/xf_box_filter.hpp
xf::Canny | imgproc/xf_canny.hpp
xf::merge | imgproc/xf_channel_combine.hpp
xf::extractChannel | imgproc/xf_channel_extract.hpp
xf::convertTo | imgproc/xf_convert_bitdepth.hpp
xf::filter2D | imgproc/xf_custom_convolution.hpp
xf::nv122iyuv, xf::nv122rgba, xf::nv122yuv4, xf::nv212iyuv, xf::nv212rgba, xf::nv212yuv4, xf::rgba2yuv4, xf::rgba2iyuv, xf::rgba2nv12, xf::rgba2nv21, xf::uyvy2iyuv, xf::uyvy2nv12, xf::uyvy2rgba, xf::yuyv2iyuv, xf::yuyv2nv12, xf::yuyv2rgba | imgproc/xf_cvt_color.hpp
xf::dilate | imgproc/xf_dilation.hpp
xf::erode | imgproc/xf_erosion.hpp
xf::fast | features/xf_fast.hpp
xf::GaussianBlur | imgproc/xf_gaussian_filter.hpp
xf::cornerHarris | features/xf_harris.hpp
xf::calcHist | imgproc/xf_histogram.hpp
xf::equalizeHist | imgproc/xf_hist_equalize.hpp
xf::HOGDescriptor | imgproc/xf_hog_descriptor.hpp
xf::Houghlines | imgproc/xf_houghlines.hpp
xf::integralImage | imgproc/xf_integral_image.hpp
xf::densePyrOpticalFlow | imgproc/xf_pyr_dense_optical_flow.hpp
xf::DenseNonPyrOpticalFlow | imgproc/xf_dense_npyr_optical_flow.hpp
xf::LUT | imgproc/xf_lut.hpp
xf::magnitude | core/xf_magnitude.hpp
xf::MeanShift | imgproc/xf_mean_shift.hpp
xf::meanStdDev | core/xf_mean_stddev.hpp
xf::medianBlur | imgproc/xf_median_blur.hpp
xf::minMaxLoc | core/xf_min_max_loc.hpp
xf::OtsuThreshold | imgproc/xf_otsuthreshold.hpp
xf::phase | core/xf_phase.hpp
xf::pyrDown | imgproc/xf_pyr_down.hpp
xf::pyrUp | imgproc/xf_pyr_up.hpp
xf::remap | imgproc/xf_remap.hpp
xf::resize | imgproc/xf_resize.hpp
xf::Scharr | imgproc/xf_scharr.hpp
xf::SemiGlobalBM | imgproc/xf_sgbm.hpp
xf::Sobel | imgproc/xf_sobel.hpp
xf::StereoPipeline | imgproc/xf_stereo_pipeline.hpp
xf::StereoBM | imgproc/xf_stereoBM.hpp
xf::SVM | imgproc/xf_svm.hpp
xf::Threshold | imgproc/xf_threshold.hpp
xf::warpAffine | imgproc/xf_warpaffine.hpp
xf::warpPerspective | imgproc/xf_warpperspective.hpp
xf::warpTransform | imgproc/xf_warp_transform.hpp
The different ways to use the xfOpenCV library examples are listed below:
• Downloading and Using xfOpenCV Libraries from SDx GUI
• Building a Project Using the Example Makefiles on Linux
• Using reVISION Samples on the reVISION Platform
• Using the xfOpenCV Library on a non-reVISION Platform
Downloading and Using xfOpenCV Libraries from
SDx GUI
You can download xfOpenCV directly from the SDx GUI. To download the library and add it
to a project:
1. From the SDx IDE, click Xilinx and select SDx Libraries.
2. Click Download next to the Xilinx xfOpenCV Library.

Figure 2: SDx Libraries
The library is downloaded into <home directory>/Xilinx/SDx/2018.2/xfopencv.
After the library is downloaded, the entire set of examples in the library is available in the
list of templates while creating a new project.
Note: The library can be added to any project from the IDE menu options.
3. To add a library to a project, from the SDx IDE, click Xilinx and select SDx Libraries.
4. Select Xilinx xfOpenCV Library and click Add to project. The drop-down menu lists the
projects to which the library can be added.
All the headers in the include/ folder of the xfOpenCV library are copied into the
local project directory as <project_dir>/libs/xfopencv/include. All the settings
required for the libraries to run are also set when this action is completed.
Building a Project Using the Example Makefiles on
Linux
Use the following steps to build a project using the example makefiles on the Linux platform:
1. Open a terminal.
2. Set the environment variable SYSROOT to the
<path to sysroot folder>/sw/aarch64-linux-gnu/sysroot folder.
3. Change the platform variable in the makefile to point to the downloaded platform folder. Ensure
that the folder name of the downloaded platform is unchanged.
4. Change the directory to the location where you want to build the example.
cd <path to example>

5. Set the environment variables to run SDx development environment.
• For c shell:
source <SDx tools install path>/settings.csh
• For bash shell:
source <SDx tools install path>/settings.sh
6. Type the make command in the terminal. The sd_card folder is created and can be found in
the <path to example> folder.
Using reVISION Samples on the reVISION Platform
Use the following steps to run a unit test for the bilateral filter on zcu102_es2_reVISION:
1. Launch the SDx development environment using the desktop icon or the Start menu.
The Workspace Launcher dialog appears.
2. Click Browse to enter a workspace folder used to store your projects (you can use workspace
folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.
Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set
the $SYSROOT environment variable. This is usually the file path to the Linux root file system.
The SDx development environment window opens with the Welcome tab visible when you
create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized
if you do not wish to use it.
3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.
The New Project dialog box opens.
4. Specify the name of the project. For example, Bilateral.
5. Click Next.
The Choose Hardware Platform page appears.
6. From the Choose Hardware Platform page, click the Add Custom Platform button.
7. Browse to the directory where you extracted the reVISION platform files. Ensure that you
select the zcu102_es2_reVISION folder.
8. From the Choose Hardware Platform page, select zcu102_es2_reVISION (custom).
9. Click Next.
The Templates page appears, containing source code examples for the selected platform.

10. From the list of application templates, select bilateral - File I/O and click Finish.
11. Click the Active build configurations drop-down from the SDx Project Settings window, to
select the active configuration or create a build configuration.
The standard build configurations are Debug and Release. To get the best run-time
performance, switch to the Release build configuration, as it uses a higher compiler
optimization setting than the Debug build configuration.
Figure 3: SDx Project Settings - Active Build Configuration
12. Set the Data motion network clock frequency (MHz) to the required frequency, on the SDx
Project Settings page.
13. In the Project Explorer view, right-click the project and select Build Project, or press
Ctrl+B, to build the project.
14. Copy the contents of the newly created sd_card folder to the SD card.
The sd_card folder contains all the files required to run designs on the ZCU102 board.
15. Insert the SD card in the ZCU102 board card slot and switch it ON.
Note: A serial port emulator (Teraterm/minicom) is required to interface the user commands to the board.
16. Upon successful boot, run the following commands in the serial port emulator
terminal:
#cd /media/card
#remount
17. Run the .elf file for the respective functions.
For more information, see the Using the xfOpenCV Library Functions on Hardware.
Using the xfOpenCV Library on a non-reVISION
Platform
This section describes using the xfOpenCV library on a non-reVISION platform, in the SDx
development environment. The examples in xfOpenCV require OpenCV libraries for successful
compilation. In the case of a non-reVISION platform, you are responsible for providing the
required OpenCV libraries, either as part of the platform or otherwise.

Note: The instructions in this section assume that you have downloaded and installed all the required
packages. For more information, see the Prerequisites.
Use the following steps to import the xfOpenCV library into an SDx project and execute it on a
custom platform:
1. Launch the SDx development environment using the desktop icon or the Start menu.
The Workspace Launcher dialog appears.
2. Click Browse to enter a workspace folder used to store your projects (you can use workspace
folders to organize your work), then click OK to dismiss the Workspace Launcher dialog.
Note: Before launching the SDx IDE on Linux, ensure that you use the same shell that you have used to set
the $SYSROOT environment variable. This is usually the file path to the Linux root file system.
The SDx development environment window opens with the Welcome tab visible when you
create a new workspace. The Welcome tab can be closed by clicking the X icon or minimized
if you do not wish to use it.
3. Select File → New → Xilinx SDx Project from the SDx development environment menu bar.
The New Project dialog box opens.
4. Specify the name of the project. For example, Test.
5. Click Next.
The Choose Hardware Platform page appears.
6. From the Choose Hardware Platform page, select a suitable platform. For example, zcu102.
7. Click Next.
The Choose Software Platform and Target CPU page appears.
8. From the Choose Software Platform and Target CPU page, select an appropriate software
platform and the target CPU. For example, select A9 from the CPU dropdown list for ZC702
and ZC706 reVISION platforms.
9. Click Next. The Templates page appears, containing source code examples for the selected
platform.
10. From the list of application templates, select Empty Application and click Finish.
The New Project dialog box closes. A new project with the specified configuration is created.
The SDx Project Settings view appears. Notice the progress bar in the lower right border of
the view. Wait a few moments for the C/C++ Indexer to finish.
11. The standard build configurations are Debug and Release. To get the best run-time
performance, switch to the Release build configuration, as it uses a higher compiler
optimization setting than the Debug build configuration.

Figure 4: SDx Project Settings - Active Build Configuration
12. Set the Data motion network clock frequency (MHz) to the required frequency, on the SDx
Project Settings page.
13. Select the Generate bitstream and Generate SD card image check boxes.
14. Right-click on the newly created project in the Project Explorer view.
15. From the context menu that appears, select C/C++ Build Settings.
The Properties for <project> dialog box appears.
16. Click the Tool Settings tab.
17. Expand the SDS++ Compiler → Directories tree.
18. Click the icon to add the "<xfopencv_location>\include" and
"<OpenCV_location>\include" folder locations to the Include Paths list.
Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the
library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

Figure 5: SDS++ Compiler Settings
19. Click Apply.
20. Expand the SDS++ Linker → Libraries tree.
21. Click the icon and add the following libraries to the Libraries(-l) list. These libraries are
required by OpenCV.
• opencv_core
• opencv_imgproc

• opencv_imgcodecs
• opencv_features2d
• opencv_calib3d
• opencv_flann
• lzma
• tiff
• png16
• z
• jpeg
• dl
• rt
• webp
22. Click the icon and add the <opencv_Location>/lib folder location to the Libraries
search path (-L) list.
Note: The OpenCV library is not provided by Xilinx for custom platforms. You are required to provide the
library. Use the reVISION platform in order to use the OpenCV library provided by Xilinx.

Figure 6: SDS++ Linker Settings
23. Click Apply to save the configuration.
24. Click OK to close the Properties for <project> dialog box.
25. Expand the newly created project tree in the Project Explorer view.
26. Right-click the src folder and select Import. The Import dialog box appears.
27. Select File System and click Next.
28. Click Browse to navigate to the <xfopencv_Location>/examples folder location.

29. Select the folder that corresponds to the library that you desire to import. For example,
accumulate.
Figure 7: Import Library Example Source Files
30. Right-click the library function in the Project Explorer view and select Toggle HW/SW to
move the function to the hardware.

Figure 8: Moving a Library Function to the Hardware
31. In the Project Explorer view, right-click the project and select Build Project, or press
Ctrl+B, to build the project.
The build process may take anywhere from a few minutes to several hours, depending on the
power of the host machine and the complexity of the design. By far, the most time is spent
processing the routines that have been tagged for realization in hardware.
32. Copy the contents of the newly created .\<workspace>\<function>\Release\sd_card
folder to the SD card. The sd_card folder contains all the files required to run
designs on a board.
33. Insert the SD card in the board card slot and switch it ON.
Note: A serial port emulator (Teraterm/minicom) is required to interface the user commands to the board.
34. Upon successful boot, navigate to the /mnt folder by running the following command at the
prompt:
#cd /mnt
Note: It is assumed that the OpenCV libraries are a part of the root filesystem. If not, add the location of
the OpenCV libraries to LD_LIBRARY_PATH using the $ export LD_LIBRARY_PATH=<location of
OpenCV libraries>/lib command.
35. Run the .elf executable file. For more information, see the Using the xfOpenCV Library
Functions on Hardware.

Changing the Hardware Kernel
Configuration
Use the following steps to change the hardware kernel configuration:
1. Update the
<path to xfOpenCV git folder>/xfOpenCV/examples/<function>/xf_config_params.h file.
2. Update the makefile along with the xf_config_params.h file:
a. Find the line with the function name in the makefile. For the bilateral filter, the line in the
makefile will be xf::BilateralFilter<3,1,0,1080,1920,1>.
b. Update the template parameters in the makefile to reflect changes made in the
xf_config_params.h file. For more details, see Chapter 3: xfOpenCV Library API
Reference.
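Because the same values have to appear in two places, a small consistency check can help. The sketch below is illustrative only: the SKETCH_ macro names, the meaning attached to each position in the template argument list, and the helper functions are assumptions rather than the contents of any real xf_config_params.h. It builds the instantiation text from one set of macros and tests whether a given makefile line carries it.

```cpp
#include <string>

// Hypothetical macros in the style of an xf_config_params.h file. The names
// and the meaning assigned to each position are assumptions for illustration.
#define SKETCH_FILTER_SIZE 3    // assumed: filter window size
#define SKETCH_BORDER_TYPE 1    // assumed: border handling selector
#define SKETCH_PIXEL_TYPE 0     // assumed: pixel datatype selector
#define SKETCH_MAX_HEIGHT 1080  // assumed: maximum image height
#define SKETCH_MAX_WIDTH 1920   // assumed: maximum image width
#define SKETCH_NPC 1            // assumed: pixels processed per clock cycle

// Builds the instantiation text that the makefile line is expected to carry.
std::string expected_makefile_entry() {
    return "xf::BilateralFilter<" +
           std::to_string(SKETCH_FILTER_SIZE) + "," +
           std::to_string(SKETCH_BORDER_TYPE) + "," +
           std::to_string(SKETCH_PIXEL_TYPE) + "," +
           std::to_string(SKETCH_MAX_HEIGHT) + "," +
           std::to_string(SKETCH_MAX_WIDTH) + "," +
           std::to_string(SKETCH_NPC) + ">";
}

// Returns true when a makefile line mentions the expected instantiation.
bool makefile_line_matches(const std::string& line) {
    return line.find(expected_makefile_entry()) != std::string::npos;
}
```

With the values shown, expected_makefile_entry() yields the same text as the bilateral filter line quoted in step 2a. After changing a macro, a stale makefile line no longer matches.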
Using the xfOpenCV Library Functions on
Hardware
The following table lists the xfOpenCV library functions and the command to run the respective
examples on hardware. It is assumed that your design is completely built and the board has
booted up correctly.
Table 3: Using the xfOpenCV Library Function on Hardware
Example Function Name Usage on Hardware
accumulate xf::accumulate ./<executable name>.elf <path to input
image 1> <path to input image 2>
accumulatesquared xf::accumulateSquare ./<executable name>.elf <path to input
image 1> <path to input image 2>
accumulateweighted xf::accumulateWeighted ./<executable name>.elf <path to input
image 1> <path to input image 2>
arithm xf::absdiff, xf::add, xf::subtract, xf::bitwise_and,
xf::bitwise_or, xf::bitwise_not, xf::bitwise_xor
./<executable name>.elf <path to input
image 1> <path to input image 2>
Bilateralfilter xf::bilateralFilter ./<executable name>.elf <path to input
image>
Boxfilter xf::boxFilter ./<executable name>.elf <path to input
image>
Canny xf::Canny ./<executable name>.elf <path to input
image>
channelcombine xf::merge ./<executable name>.elf <path to input
image 1> <path to input image 2> <path to
input image 3> <path to input image 4>
Chapter 2: Getting Started
UG1233 (v2018.2) July 2, 2018 www.xilinx.com [placeholder text]
Xilinx OpenCV User Guide 21

Table 3: Using the xfOpenCV Library Function on Hardware (cont'd)
Example Function Name Usage on Hardware
Channelextract xf::extractChannel ./<executable name>.elf <path to input
image>
Colordetect xf::RGB2HSV, xf::colorthresholding, xf:: erode,
and xf:: dilate
./<executable name>.elf <path to input
image>
Convertbitdepth xf::convertTo ./<executable name>.elf <path to input
image>
Cornertracker xf::cornerTracker ./exe <input video> <no. of frames> <Harris
Threshold> <No. of frames after which Harris
Corners are Reset>
Customconv xf::filter2D ./<executable name>.elf <path to input
image>
cvtcolor IYUV2NV12 xf::iyuv2nv12 ./<executable name>.elf <path to input
image 1> <path to input image 2> <path to
input image 3>
cvtcolor IYUV2RGBA xf::iyuv2rgba ./<executable name>.elf <path to input
image 1> <path to input image 2> <path to
input image 3>
cvtcolor IYUV2YUV4 xf::iyuv2yuv4 ./<executable name>.elf <path to input
image 1> <path to input image 2> <path to
input image 3> <path to input image 4>
<path to input image 5> <path to input image
6>
cvtcolor NV122IYUV xf::nv122iyuv ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor NV122RGBA xf::nv122rgba ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor NV122YUV4 xf::nv122yuv4 ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor NV212IYUV xf::nv212iyuv ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor NV212RGBA xf::nv212rgba ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor NV212YUV4 xf::nv212yuv4 ./<executable name>.elf <path to input
image 1> <path to input image 2>
cvtcolor RGBA2YUV4 xf::rgba2yuv4 ./<executable name>.elf <path to input
image>
cvtcolor RGBA2IYUV xf::rgba2iyuv ./<executable name>.elf <path to input
image>
cvtcolor RGBA2NV12 xf::rgba2nv12 ./<executable name>.elf <path to input
image>
cvtcolor RGBA2NV21 xf::rgba2nv21 ./<executable name>.elf <path to input
image>
cvtcolor UYVY2IYUV xf::uyvy2iyuv ./<executable name>.elf <path to input
image>
cvtcolor UYVY2NV12 xf::uyvy2nv12 ./<executable name>.elf <path to input
image>
cvtcolor UYVY2RGBA xf::uyvy2rgba ./<executable name>.elf <path to input
image>
cvtcolor YUYV2IYUV xf::yuyv2iyuv ./<executable name>.elf <path to input
image>
cvtcolor YUYV2NV12 xf::yuyv2nv12 ./<executable name>.elf <path to input
image>
cvtcolor YUYV2RGBA xf::yuyv2rgba ./<executable name>.elf <path to input
image>
Difference of Gaussian xf::GaussianBlur, xf::duplicateMat, xf::delayMat, and xf::subtract
./<exe-name>.elf <path to input image>
Dilation xf::dilate ./<executable name>.elf <path to input
image>
Erosion xf::erode ./<executable name>.elf <path to input
image>
Fast xf::fast ./<executable name>.elf <path to input
image>
Gaussianfilter xf::GaussianBlur ./<executable name>.elf <path to input
image>
Harris xf::cornerHarris ./<executable name>.elf <path to input
image>
Histogram xf::calcHist ./<executable name>.elf <path to input
image>
Histequialize xf::equalizeHist ./<executable name>.elf <path to input
image>
Hog xf::HOGDescriptor ./<executable name>.elf <path to input
image>
Houghlines xf::HoughLines ./<executable name>.elf <path to input
image>
Integralimg xf::integralImage ./<executable name>.elf <path to input
image>
Lkdensepyrof xf::densePyrOpticalFlow ./<executable name>.elf <path to input
image 1> <path to input image 2>
Lknpyroflow xf::DenseNonPyrLKOpticalFlow ./<executable name>.elf <path to input
image 1> <path to input image 2>
Lut xf::LUT ./<executable name>.elf <path to input
image>
Magnitude xf::magnitude ./<executable name>.elf <path to input
image>
meanshifttracking xf::MeanShift ./<executable name>.elf <path to input
video/input image files> <Number of objects
to track>
meanstddev xf::meanStdDev ./<executable name>.elf <path to input
image>
medianblur xf::medianBlur ./<executable name>.elf <path to input
image>
Minmaxloc xf::minMaxLoc ./<executable name>.elf <path to input
image>
otsuthreshold xf::OtsuThreshold ./<executable name>.elf <path to input
image>
Phase xf::phase ./<executable name>.elf <path to input
image>
Pyrdown xf::pyrDown ./<executable name>.elf <path to input
image>
Pyrup xf::pyrUp ./<executable name>.elf <path to input
image>
remap xf::remap ./<executable name>.elf <path to input
image> <path to mapx data> <path to mapy
data>
Resize xf::resize ./<executable name>.elf <path to input
image>
scharrfilter xf::Scharr ./<executable name>.elf <path to input
image>
SemiGlobalBM xf::SemiGlobalBM ./<executable name>.elf <path to left image>
<path to right image>
sobelfilter xf::Sobel ./<executable name>.elf <path to input
image>
stereopipeline xf::StereoPipeline ./<executable name>.elf <path to left image>
<path to right image>
stereolbm xf::StereoBM ./<executable name>.elf <path to left image>
<path to right image>
Svm xf::SVM ./<executable name>.elf
threshold xf::Threshold ./<executable name>.elf <path to input
image>
warpaffine xf::warpAffine ./<executable name>.elf <path to input
image>
warpperspective xf::warpPerspective ./<executable name>.elf <path to input
image>
warptransform xf::warpTransform ./<executable name>.elf <path to input
image>

Chapter 3
xfOpenCV Library API Reference
To facilitate local memory allocation on FPGA devices, the xfOpenCV library functions are
provided in templates with compile-time parameters. Data is explicitly copied from cv::Mat to
xf::Mat and is stored in physically contiguous memory to achieve the best possible
performance. After processing, the output in xf::Mat is copied back to cv::Mat to write it
into the memory.
xf::Mat Image Container Class
xf::Mat is a template class that serves as a container for storing image data and its attributes.
Note: The xf::Mat image container class is similar to the cv::Mat class of the OpenCV library.
Class Definition
template<int TYPE, int ROWS, int COLS, int NPC>
class Mat {
public:
    Mat(); // default constructor
    Mat(int _rows, int _cols);
    Mat(int _rows, int _cols, void *_data);
    Mat(int _size, int _rows, int _cols);
    ~Mat();
    void init(int _rows, int _cols);
    void copyTo(XF_PTSNAME(TYPE,NPC)* fromData);
    unsigned char * copyFrom();
    Mat(const Mat& src);
    Mat& operator=(const Mat& src);
    template<int DST_T>
    void convertTo(Mat<DST_T, ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0);
    int rows, cols, size; // actual image size
#ifndef __SYNTHESIS__
    XF_TNAME(TYPE,NPC)* data; // pointer to the pixel words (software build)
#else
    XF_TNAME(TYPE,NPC) data[ROWS*(COLS>> (XF_BITSHIFT(NPC)))]; // fixed-size array for synthesis
#endif
};
Parameter Descriptions
The following table lists the xf::Mat class parameters and their descriptions:
Table 4: xf::Mat Class Parameter Descriptions
Parameter Description
rows The number of rows in the image or height of the image.
cols The number of columns in the image or width of the image.
size The number of words stored in the data member. The value is calculated using rows*cols/(number of pixels packed per word).
*data The pointer to the words that store the pixels of the image.
Member Functions Description
The following table lists the member functions and their descriptions:
Table 5: xf::Mat Member Function Descriptions
Member Functions Description
Mat() This default constructor initializes the Mat object sizes, using the template parameters ROWS
and COLS.
Mat(int _rows, int _cols) This constructor initializes the Mat object using arguments _rows and _cols.
Mat(const xf::Mat &_src) This constructor clones one Mat object into another. New memory is allocated for the newly created object.
Mat(int _rows, int _cols, void *_data) This constructor initializes the Mat object using arguments _rows, _cols, and _data. When this constructor is used, the *data member of the Mat object points to the memory allocated for the _data argument. No new memory is allocated for the *data member.
convertTo(Mat<DST_T,ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0) Refer to xf::convertTo.
copyTo(* fromData) Copies the data from the fromData pointer into the physically contiguous memory allocated inside the constructor.
copyFrom() Returns the pointer to the first location of the *data member.
~Mat() This is a default destructor of the Mat object.
Template Parameter Descriptions
Template parameters of the xf::Mat class are used to set the depth of the pixel, number of
channels in the image, number of pixels packed per word, maximum number of rows and columns
of the image. The following table lists the template parameters and their descriptions:
Table 6: xf::Mat Template Parameter Descriptions
Parameters Description
TYPE Type of the pixel data. For example, XF_8UC1 stands for 8-bit unsigned and one channel pixel.
More types can be found in include/common/xf_params.h.
ROWS Maximum height of an image.
COLS Maximum width of an image.
NPC The number of pixels to be packed per word. For instance, XF_NPPC1 for 1 pixel per word; and
XF_NPPC8 for 8 pixels per word.
Pixel-Level Parallelism
The amount of parallelism to be implemented in a function from xfOpenCV is kept as a
configurable parameter. In most functions, there are two options for processing data.
• Single-pixel processing
• Processing eight pixels in parallel
The following table describes the options available for specifying the level of parallelism required
in a particular function:
Table 7: Options Available for Specifying the Level of Parallelism
Option Description
XF_NPPC1 Process 1 pixel per clock cycle
XF_NPPC2 Process 2 pixels per clock cycle
XF_NPPC8 Process 8 pixels per clock cycle
Macros to Work With Parallelism
There are two macros that are defined to work with parallelism.
• The XF_NPIXPERCYCLE(flags) macro resolves to the number of pixels processed per
cycle.
○ XF_NPIXPERCYCLE(XF_NPPC1) resolves to 1
○ XF_NPIXPERCYCLE(XF_NPPC2) resolves to 2
○ XF_NPIXPERCYCLE(XF_NPPC8) resolves to 8
• The XF_BITSHIFT(flags) macro resolves to the number of times to shift the image size to the
right to arrive at the final data transfer size for parallel processing.
○ XF_BITSHIFT(XF_NPPC1) resolves to 0
○ XF_BITSHIFT(XF_NPPC2) resolves to 1
○ XF_BITSHIFT(XF_NPPC8) resolves to 3
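The relationship between the two macros can be sketched with plain C++ stand-ins. The names Nppc, npix_per_cycle, bit_shift, and words_per_image below are hypothetical and are not library symbols; the real macros are defined in the xfOpenCV headers and take the XF_NPPC* flags. The sketch assumes the image width is a multiple of the number of pixels per cycle.

```cpp
#include <cassert>

// Hypothetical stand-ins for the XF_NPPC1 / XF_NPPC2 / XF_NPPC8 flags.
enum Nppc { NPPC1 = 1, NPPC2 = 2, NPPC8 = 8 };

// Mirrors the documented results of XF_NPIXPERCYCLE(flags).
constexpr int npix_per_cycle(Nppc f) { return static_cast<int>(f); }

// Mirrors the documented results of XF_BITSHIFT(flags): log2 of pixels per cycle.
constexpr int bit_shift(Nppc f) { return f == NPPC8 ? 3 : (f == NPPC2 ? 1 : 0); }

// Number of data words for a rows x cols image: rows * (cols >> shift),
// the same expression that sizes the data array in the xf::Mat class definition.
constexpr int words_per_image(int rows, int cols, Nppc f) {
    return rows * (cols >> bit_shift(f));
}
```

For a 1080x1920 image, this gives 2,073,600 words in 1-pixel mode and 259,200 words in 8-pixel mode.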
Pixel Types
Parameter types will differ, depending on the combination of the depth of pixels and the number
of channels in the image. The generic nomenclature of the parameter is listed below.
XF_<Number of bits per pixel><signed (S) or unsigned (U) or float (F)>C<number of channels>
For example, for an 8-bit, unsigned, 1-channel pixel, the data type is XF_8UC1.
The following table lists the available data types for the xf::Mat class:
Table 8: xf::Mat Class - Available Data Types
Option, Number of Bits per Pixel, Unsigned/Signed/Float Type, Number of Channels
XF_8UC1 8 Unsigned 1
XF_16UC1 16 Unsigned 1
XF_16SC1 16 Signed 1
XF_32UC1 32 Unsigned 1
XF_32FC1 32 Float 1
XF_32SC1 32 Signed 1
XF_8UC2 8 Unsigned 2
XF_8UC4 8 Unsigned 4
XF_2UC1 2 Unsigned 1
Manipulating Data Type
Based on the number of pixels to process per clock cycle and the type parameter, there are
different possible data types. The xfOpenCV library uses these data types for internal processing
and inside the xf::Mat class. The following are a few supported types:
• XF_TNAME(TYPE,NPPC) resolves to the data type of the data member of the xf::Mat
object. For instance, XF_TNAME(XF_8UC1,XF_NPPC8) resolves to ap_uint<64>.
• Word width = pixel depth * number of channels * number of pixels to
process per cycle (NPPC).
• XF_DTUNAME(TYPE,NPPC) resolves to the data type of the pixel. For instance,
XF_DTUNAME(XF_32FC1,XF_NPPC1) resolves to float.
• XF_PTSNAME(TYPE,NPPC) resolves to the ‘C’ data type of the pixel. For instance,
XF_PTSNAME(XF_16UC1,XF_NPPC2) resolves to unsigned short.
Note: ap_uint<>, ap_int<>, ap_fixed<>, and ap_ufixed<> types belong to the high-level synthesis
(HLS) library. For more information, see the Vivado Design Suite User Guide: High-Level Synthesis (UG902).
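The word-width rule can be checked with a one-line helper. The function name word_width_bits is hypothetical and only restates the formula given above; it is not part of the library.

```cpp
#include <cassert>

// Word width in bits = pixel depth * number of channels * pixels per cycle (NPPC).
constexpr int word_width_bits(int depth_bits, int channels, int nppc) {
    return depth_bits * channels * nppc;
}
```

With an 8-bit, 1-channel pixel and 8 pixels per cycle, the helper returns 64, which agrees with XF_TNAME(XF_8UC1,XF_NPPC8) resolving to ap_uint<64>.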
Sample Illustration
The following code illustrates the configurations that are required to build the Gaussian filter on
an image, using the SDSoC tool for the Zynq® UltraScale™ platform.
Note: In case of a real-time application, where the video is streamed in, it is recommended that the
frame buffer be located in an xf::Mat and processed using the library function. The resultant location
pointer is passed to display IPs.
xf_config_params.h
#define FILTER_SIZE_3 1
#define FILTER_SIZE_5 0
#define FILTER_SIZE_7 0
#define RO 0
#define NO 1
#if NO
#define NPC1 XF_NPPC1
#endif
#if RO
#define NPC1 XF_NPPC8
#endif
xf_gaussian_filter_tb.cpp
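int main(int argc, char **argv)
{
    cv::Mat in_img, out_img, ocv_ref;
    cv::Mat in_gray, in_gray1, diff;
    in_img = cv::imread(argv[1], 1); // read the input as a color image
    extractChannel(in_img, in_gray, 1); // use channel 1 as the single-channel input
    // Containers sized to the actual image; HEIGHT and WIDTH are the compile-time maxima
    xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgInput(in_img.rows,in_img.cols);
    xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgOutput(in_img.rows,in_img.cols);
    imgInput.copyTo(in_gray.data); // copy the pixels into the xf::Mat
    float sigma = 0.5f; // assumed example value for illustration
    gaussian_filter_accel(imgInput,imgOutput,sigma);
    // Write output image
    xf::imwrite("hls_out.jpg",imgOutput);
    return 0;
}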
xf_gaussian_filter_accel.cpp
#include "xf_gaussian_filter_config.h"
void gaussian_filter_accel(xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1> &imgInput,
                           xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1> &imgOutput, float sigma)
{
    xf::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, XF_8UC1, HEIGHT, WIDTH, NPC1>(imgInput, imgOutput, sigma);
}
xf_gaussian_filter.hpp
#pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
#pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
#pragma SDS data access_pattern("_src.data":SEQUENTIAL)
#pragma SDS data copy("_src.data"[0:"_src.size"])
#pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
#pragma SDS data copy("_dst.data"[0:"_dst.size"])
template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS,
int COLS,int NPC = 1>
void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst, float sigma)
{
//function body
}
The data is fetched from external memory (with the help of SDSoC data movers) and is
transferred to the function in 8-bit or 64-bit packets, based on the configured mode. Assuming
8 bits per pixel, 8 pixels can be packed into 64 bits. Therefore, 8 pixels are available to be
processed in parallel.
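The packing described above can be illustrated in plain C++. This is a minimal sketch, not the library's data mover logic: the helper names pack8 and unpack are hypothetical, and placing pixel 0 in the least significant byte is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Pack 8 consecutive 8-bit pixels into one 64-bit word.
// Assumption: pixel 0 occupies the least significant byte.
inline std::uint64_t pack8(const std::uint8_t px[8]) {
    std::uint64_t word = 0;
    for (int i = 0; i < 8; ++i) {
        word |= static_cast<std::uint64_t>(px[i]) << (8 * i);
    }
    return word;
}

// Extract pixel i (0..7) from a packed word.
inline std::uint8_t unpack(std::uint64_t word, int i) {
    assert(i >= 0 && i < 8);
    return static_cast<std::uint8_t>((word >> (8 * i)) & 0xFF);
}
```

Each 64-bit transfer then carries eight pixels, which is what allows the 8-pixel mode to consume a full word per clock cycle.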
Enable the FILTER_SIZE_3 and the NO macros in the xf_config_params.h file. The FILTER_SIZE_3
macro sets the filter size to 3x3, and the #define NO 1 macro enables 1-pixel parallelism.
Specify the SDSoC tool-specific pragmas in the xf_gaussian_filter.hpp file.
#pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
#pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
#pragma SDS data access_pattern("_src.data":SEQUENTIAL)
#pragma SDS data copy("_src.data"[0:"_src.size"])
#pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
#pragma SDS data copy("_dst.data"[0:"_dst.size"])
Note: For more information on the pragmas used for hardware accelerator functions in SDSoC, see SDSoC
Environment User Guide (UG1027).
Additional Utility Functions for Software
xf::imread
The function xf::imread loads an image from the specified file path, copies it into an xf::Mat, and
returns it. If the image cannot be read (because of a missing file, improper permissions, or an
unsupported or invalid format), the function exits with a non-zero return code and an error
statement.
Note: In an HLS standalone mode like Cosim, use cv::imread followed by the copyTo function, instead of
xf::imread.
API Syntax
template<int PTYPE, int ROWS, int COLS, int NPC>
xf::Mat<PTYPE, ROWS, COLS, NPC> imread (char *filename, int type)
Parameter Descriptions
The table below describes the template and the function parameters.
Table 9: xf::imread Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type. Value should be in accordance with the ‘type’ argument’s value.
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
filename Name of the file to be loaded
type Flag that depicts the type of image. The values are:
• '0' for grayscale
• '1' for color image
xf::imwrite
The function xf::imwrite saves the image to the specified file from the given xf::Mat. The image
format is chosen based on the file name extension. This function internally uses cv::imwrite for
the processing. Therefore, all the limitations of cv::imwrite are also applicable to xf::imwrite.
API Syntax
template <int PTYPE, int ROWS, int COLS, int NPC>
void imwrite(const char *img_name, xf::Mat<PTYPE, ROWS, COLS, NPC> &img)
Parameter Descriptions
The table below describes the template and the function parameters.
Table 10: xf::imwrite Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type. Supported types are: XF_8UC1, XF_16UC1, XF_8UC3, XF_16UC3, XF_8UC4, and
XF_16UC4
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
img_name Name of the file with the extension
img xf::Mat array to be saved
xf::absDiff
The function xf::absDiff computes the absolute difference between corresponding pixels of an
xf::Mat and a cv::Mat, and returns the difference values in a cv::Mat.
API Syntax
template <int PTYPE, int ROWS, int COLS, int NPC>
void absDiff(cv::Mat &cv_img, xf::Mat<PTYPE, ROWS, COLS, NPC>& xf_img,
cv::Mat &diff_img )
Parameter Descriptions
The table below describes the template and the function parameters.
Table 11: xf::absDiff Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and
XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively.
cv_img cv::Mat array to be compared
xf_img xf::Mat array to be compared
diff_img Output difference image(cv::Mat)
xf::convertTo
The xf::convertTo function performs bit depth conversion on each individual pixel of the given
input image. This method converts the source pixel values to the target data type with
appropriate casting.
dst(x,y)= cast<target-data-type>(α(src(x,y)+β))
Note: The output and input Mat cannot be the same. That is, the converted image cannot be stored in the
Mat of the input image.
API Syntax
template<int DST_T> void convertTo(xf::Mat<DST_T,ROWS, COLS, NPC> &dst,
int ctype, double alpha=1, double beta=0)
Parameter Descriptions
The table below describes the template and function parameters.
Table 12: xf::convertTo Function Parameter Descriptions
Parameter Description
DST_T Output pixel type. Possible values are XF_8UC1, XF_16UC1, XF_16SC1, and XF_32SC1.
ROWS Maximum height of image to be read
COLS Maximum width of image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and
XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively.
dst Converted xf Mat
ctype Conversion type. Possible values are listed here.
Down-convert:
• XF_CONVERT_16U_TO_8U
• XF_CONVERT_16S_TO_8U
• XF_CONVERT_32S_TO_8U
• XF_CONVERT_32S_TO_16U
• XF_CONVERT_32S_TO_16S
Up-convert:
• XF_CONVERT_8U_TO_16U
• XF_CONVERT_8U_TO_16S
• XF_CONVERT_8U_TO_32S
• XF_CONVERT_16U_TO_32S
• XF_CONVERT_16S_TO_32S
alpha Optional scale factor
beta Optional delta added to the scaled values
xfOpenCV Library Functions
The xfOpenCV library is a set of select OpenCV functions optimized for Zynq®-7000 SoC and
Zynq UltraScale+ MPSoC devices. The following table lists the xfOpenCV library functions.
Table 13: xfOpenCV Library Functions
Computations: Absolute Difference, Accumulate, Accumulate Squared, Accumulate Weighted, Atan2, Bitwise AND, Bitwise NOT, Bitwise OR, Bitwise XOR, Gradient Magnitude, Gradient Phase, Integral Image, Inverse (Reciprocal), Pixel-Wise Addition, Pixel-Wise Multiplication, Pixel-Wise Subtraction, Square Root, Mean and Standard Deviation
Input Processing: Bit Depth Conversion, Channel Combine, Channel Extract, Color Conversion, Histogram Equalization, Look Up Table, Remap, Resolution Conversion (Resize)
Filters: Bilateral Filter, Box Filter, Custom Convolution, Dilate, Erode, Gaussian Filter, Sobel Filter, Median Blur Filter, Scharr Filter
Other: Canny Edge Detection, FAST Corner Detection, Harris Corner Detection, Histogram Computation, Dense Pyramidal LK Optical Flow, Dense Non-Pyramidal LK Optical Flow, MinMax Location, Thresholding, WarpAffine, WarpPerspective, SVM, Otsu Threshold, Mean Shift Tracking, HOG, Stereo Local Block Matching, WarpTransform, Pyramid Up, Pyramid Down, Delay, Duplicate, Color Thresholding, RGB2HSV, InitUndistortRectifyMapInverse, Houghlines, Semi Global Method for Stereo Disparity Estimation
Notes:
1. The maximum resolution supported for all the functions is 4K, except Houghlines, HOG (RB mode), and Canny Edge
Detection.
The following table lists the functions that are not supported on Zynq-7000 SoC devices, when
configured to use 128-bit interfaces in 8 pixel per cycle mode.
Table 14: Unsupported Functions Using 128-bit Interfaces in 8 Pixel Per Cycle Mode on Zynq-7000 SoC
Computations: Accumulate; Accumulate Squared; Accumulate Weighted; Gradient Magnitude; Gradient Phase; Pixel-Wise Addition (signed 16-bit pixel type and unsigned 16-bit pixel type); Pixel-Wise Multiplication (signed 16-bit pixel type and unsigned 16-bit pixel type); Pixel-Wise Subtraction (signed 16-bit pixel type and unsigned 16-bit pixel type)
Input Processing: Bit Depth Conversion
Filters: Box Filter (signed 16-bit pixel type and unsigned 16-bit pixel type); Custom Convolution (signed 16-bit output pixel type); Sobel Filter; Scharr Filter
Note: Resolution Conversion (Resize) in 8 pixel per cycle mode, Dense Pyramidal LK Optical Flow, and
Dense Non-Pyramidal LK Optical Flow functions are not supported on the Zynq-7000 SoC ZC702 devices,
due to the higher resource utilization.
Absolute Difference
The absdiff function finds the pixel-wise absolute difference between two input images and
returns an output image. The input and the output images must be the XF_8UC1 type.
I_{out}(x, y) = | I_{in1}(x, y) - I_{in2}(x, y) |
Where,
• I_{out}(x, y) is the intensity of the output image at position (x, y).
• I_{in1}(x, y) is the intensity of the first input image at position (x, y).
• I_{in2}(x, y) is the intensity of the second input image at position (x, y).
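The per-pixel operation can be expressed as a short software reference. This is a minimal sketch on row-major 8-bit buffers; the function name absdiff_ref is hypothetical, and the code is not the library's hardware implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pixel-wise |in1 - in2| for two 8-bit single-channel images stored row-major.
inline std::vector<std::uint8_t> absdiff_ref(const std::vector<std::uint8_t>& in1,
                                             const std::vector<std::uint8_t>& in2) {
    assert(in1.size() == in2.size());
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        // Subtract the smaller value from the larger so the result is never negative.
        const int d = in1[i] > in2[i] ? in1[i] - in2[i] : in2[i] - in1[i];
        out[i] = static_cast<std::uint8_t>(d);
    }
    return out;
}
```

A reference of this kind is typically useful in a testbench for comparing the accelerator output against a known-good result.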
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void absdiff(
    xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
    xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
    xf::Mat<SRC_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 15: absdiff Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado® HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 16: absdiff Function Resource Utilization Summary
Operating Mode, Operating Frequency (MHz), Utilization Estimate (BRAM_18K, DSP_48Es, FF, LUT, CLB)
1 pixel 300 0 0 62 67 17
8 pixel 150 0 0 67 234 39
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 17: absdiff Function Performance Estimate Summary
Operating Mode, Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.69
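As a rough plausibility check on these figures, the frame time can be estimated from the pixel count and the clock rate. The helper ideal_latency_ms below is hypothetical and assumes one word is consumed every clock cycle, ignoring pipeline fill and interface overhead; the tabulated values come from the HLS tool, not from this formula.

```cpp
#include <cassert>

// Ideal frame time in milliseconds, assuming one word of `nppc` pixels is
// consumed every clock cycle and ignoring pipeline fill and interface overhead.
inline double ideal_latency_ms(int rows, int cols, int nppc, double clock_mhz) {
    assert(nppc > 0 && clock_mhz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / nppc;
    return cycles / (clock_mhz * 1e3);  // MHz * 1e3 = cycles per millisecond
}
```

For a 1080x1920 image in 1-pixel mode at 300 MHz, this evaluates to about 6.91 ms, in line with the 6.9 ms estimate. In 8-pixel mode at 150 MHz, it evaluates to about 1.73 ms, which is close to, but not identical with, the tool's 1.69 ms estimate.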
Deviation from OpenCV
There is no deviation from OpenCV, except that the absdiff function supports only 8-bit, 1-channel pixels (XF_8UC1).
Accumulate
The accumulate function adds an image (src1) to the accumulator image (src2), and generates
the accumulated result image (dst).
dst(x, y) = src1(x, y) + src2(x, y)
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void accumulate (
    xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
    xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
    xf::Mat<DST_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 18: accumulate Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale
HD (1080x1920) image.
Table 19: accumulate Function Resource Utilization Summary
Operating Mode, Operating Frequency (MHz), Utilization Estimate (BRAM_18K, DSP_48E, FF, LUT, CLB)
1 pixel 300 0 0 62 55 12
8 pixel 150 0 0 389 285 61
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 20: accumulate Function Performance Estimate Summary
Operating Mode, Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7
Deviation from OpenCV
In OpenCV the accumulated image is stored in the second input image. The src2 image acts as
both input and output, as shown below:
src2(x, y) = src1(x, y) + src2(x, y)
Whereas, in the xfOpenCV implementation, the accumulated image is stored separately, as
shown below:
dst(x, y) = src1(x, y) + src2(x, y)
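The out-of-place form can be sketched in software as follows. The function name accumulate_ref is hypothetical, and the code only mirrors the documented pixel types (8-bit inputs, 16-bit output); it is not the library's streaming implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Out-of-place accumulate: dst = src1 + src2, with 8-bit inputs and a 16-bit
// output, so the sum (at most 255 + 255 = 510) cannot overflow.
inline std::vector<std::uint16_t> accumulate_ref(const std::vector<std::uint8_t>& src1,
                                                 const std::vector<std::uint8_t>& src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = static_cast<std::uint16_t>(src1[i] + src2[i]);
    }
    return dst;
}
```

Because the inputs are taken by const reference, src2 is left unchanged, which is the difference from the in-place OpenCV behavior described above.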
Accumulate Squared
The accumulateSquare function adds the square of an image (src1) to the accumulator image
(src2) and generates the accumulated result (dst).
dst(x, y) = src1(x, y)^2 + src2(x, y)
The accumulated result is a separate argument in the function, instead of having src2 as the
accumulated result. In this implementation, having a bi-directional accumulator is not possible as
the function makes use of streams.
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void accumulateSquare (
    xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
    xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
    xf::Mat<DST_T, ROWS, COLS, NPC> dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 21: accumulateSquare Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 22: accumulateSquare Function Resource Utilization Summary
Operating Mode, Operating Frequency (MHz), Utilization Estimate (BRAM_18K, DSP_48E, FF, LUT, CLB)
1 pixel 300 0 1 71 52 14
8 pixel 150 0 8 401 247 48
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 23: accumulateSquare Function Performance Estimate Summary
Operating Mode, Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.6
Deviation from OpenCV
In OpenCV the accumulated squared image is stored in the second input image. The src2 image
acts as input as well as output.
src2(x, y) = src1(x, y)^2 + src2(x, y)
Whereas, in the xfOpenCV implementation, the accumulated squared image is stored separately.
dst(x, y) = src1(x, y)^2 + src2(x, y)
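A software sketch of the separate-output form is shown here. The function name accumulate_square_ref is hypothetical; the code follows the documented pixel types (8-bit inputs, 16-bit output) and does not model the hardware streams.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Out-of-place accumulate squared: dst = src1^2 + src2.
// With 8-bit inputs the largest result is 255 * 255 + 255 = 65280,
// which still fits in an unsigned 16-bit output pixel.
inline std::vector<std::uint16_t> accumulate_square_ref(const std::vector<std::uint8_t>& src1,
                                                        const std::vector<std::uint8_t>& src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        const int s = src1[i];
        dst[i] = static_cast<std::uint16_t>(s * s + src2[i]);
    }
    return dst;
}
```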
Accumulate Weighted
The accumulateWeighted function computes the weighted sum of the input image (src1) and
the accumulator image (src2) and generates the result in dst.
dst(x, y) = alpha * src1(x, y) + (1 - alpha) * src2(x, y)
The accumulated result is a separate argument in the function, instead of having src2 as the
accumulated result. In this implementation, having a bi-directional accumulator is not possible, as
the function uses streams.
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void accumulateWeighted (
    xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
    xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
    xf::Mat<DST_T, ROWS, COLS, NPC> dst,
    float alpha )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 24: accumulateWeighted Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
alpha Weight applied to input image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 25: accumulateWeighted Function Resource Utilization Summary
Operating Mode, Operating Frequency (MHz), Utilization Estimate (BRAM_18K, DSP_48Es, FF, LUT, CLB)
1 pixel 300 0 5 295 255 52
8 pixel 150 0 19 556 476 88
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 26: accumulateWeighted Function Performance Estimate Summary
Operating Mode, Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7
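The latency figures in the performance tables are close to what a simple streaming model predicts: one clock cycle per group of pixels processed in parallel. The sketch below is only a back-of-the-envelope check; the helper name estimateLatencyMs is hypothetical, and the model ignores pipeline fill and stalls, which the HLS report does account for.

```cpp
#include <cassert>
#include <cstdint>

// Rough streaming-latency estimate in milliseconds: one clock cycle per
// group of `pixelsPerCycle` pixels, ignoring pipeline fill and any stalls
// (an assumption; the HLS report accounts for those).
double estimateLatencyMs(std::uint32_t rows, std::uint32_t cols,
                         std::uint32_t pixelsPerCycle, double clockMHz) {
    assert(pixelsPerCycle > 0 && clockMHz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / pixelsPerCycle;
    // cycles / (clockMHz * 1e6) seconds, expressed in milliseconds.
    return cycles / (clockMHz * 1e3);
}
```

For a 1080x1920 image this gives about 6.91 ms at 300 MHz with 1 pixel per cycle and about 1.73 ms at 150 MHz with 8 pixels per cycle.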
Deviation from OpenCV
In OpenCV, the resultant image is stored in the second input image. The src2 image acts as both
input and output, as shown below:
$$\mathrm{src2}(x,y) = \mathrm{alpha} \cdot \mathrm{src1}(x,y) + (1 - \mathrm{alpha}) \cdot \mathrm{src2}(x,y)$$
In the xfOpenCV implementation, by contrast, the accumulated weighted image is stored separately:
$$\mathrm{dst}(x,y) = \mathrm{alpha} \cdot \mathrm{src1}(x,y) + (1 - \mathrm{alpha}) \cdot \mathrm{src2}(x,y)$$
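The separate-output behavior can be illustrated with a small software reference model. This is a sketch in plain C++, not the xfOpenCV kernel: the function name accumulateWeightedRef is hypothetical, images are flat vectors, and round-to-nearest is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference model (not the xfOpenCV kernel):
// dst = alpha * src1 + (1 - alpha) * src2, with 8-bit inputs and a separate
// 16-bit output. Round-to-nearest is an assumption of this sketch.
std::vector<std::uint16_t> accumulateWeightedRef(
        const std::vector<std::uint8_t>& src1,
        const std::vector<std::uint8_t>& src2,
        float alpha) {
    assert(src1.size() == src2.size());
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        const float v = alpha * src1[i] + (1.0f - alpha) * src2[i];
        dst[i] = static_cast<std::uint16_t>(std::lround(v));
    }
    return dst;
}
```

Because both inputs are taken by const reference, src2 is left unchanged, which mirrors the deviation described above.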
Bilateral Filter
In general, any smoothing filter smooths the image, which affects the edges of the image. To
preserve the edges while smoothing, a bilateral filter can be used. Like the Gaussian filter, the
bilateral filter considers the neighboring pixels with weights assigned to each of them. These
weights have two components, the first of which is the same weighting used by the Gaussian
filter. The second component takes into account the difference in intensity between the
neighboring pixels and the evaluated one.
The bilateral filter applied on an image is:
$$BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert)\, G_{\sigma_r}(\lVert I_p - I_q \rVert)\, I_q$$
Where
$$W_p = \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert)\, G_{\sigma_r}(\lVert I_p - I_q \rVert)$$
and $G_\sigma$ is a Gaussian filter with standard deviation $\sigma$ (variance $\sigma^2$).
The Gaussian filter is given by:
$$G_\sigma = e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
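The formulas above can be sketched for a single pixel as follows. This is a floating-point reference model under stated assumptions, not the hardware kernel: the names gaussianWeight and bilateralAt are hypothetical, the image is a row-major vector, and border handling is omitted, so the window must lie inside the image.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Unnormalized Gaussian weight exp(-d^2 / (2 * sigma^2)) for a distance d.
double gaussianWeight(double d, double sigma) {
    return std::exp(-(d * d) / (2.0 * sigma * sigma));
}

// Bilateral-filtered value at interior pixel (r, c) of a row-major 8-bit
// image, using a (2 * radius + 1)^2 window. Reference model only; border
// handling is left out, so the window must lie fully inside the image.
double bilateralAt(const std::vector<std::uint8_t>& img, int rows, int cols,
                   int r, int c, int radius,
                   double sigmaSpace, double sigmaColor) {
    assert(r >= radius && r + radius < rows);
    assert(c >= radius && c + radius < cols);
    const double center = img[r * cols + c];
    double weightedSum = 0.0;  // sum of G_s * G_r * I_q
    double weightSum = 0.0;    // W_p
    for (int dr = -radius; dr <= radius; ++dr) {
        for (int dc = -radius; dc <= radius; ++dc) {
            const double q = img[(r + dr) * cols + (c + dc)];
            const double spatial = gaussianWeight(std::hypot(dr, dc), sigmaSpace);
            const double range = gaussianWeight(std::fabs(center - q), sigmaColor);
            weightedSum += spatial * range * q;
            weightSum += spatial * range;
        }
    }
    return weightedSum / weightSum;
}
```

With a small sigmaColor, neighbors across a strong intensity step receive a near-zero weight, so a pixel next to an edge keeps roughly its own side's value. That is the edge-preserving property described above.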
API Syntax
template<int FILTER_SIZE, int BORDER_TYPE, int TYPE, int ROWS, int COLS,
int NPC=1>
void bilateralFilter (
xf::Mat<TYPE, ROWS, COLS, NPC> src,
xf::Mat<TYPE, ROWS, COLS, NPC> dst,
float sigma_space, float sigma_color )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 27: bilateralFilter Function Parameter Descriptions
Parameter Description
FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7)
are supported
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
TYPE Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported
(XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; this function supports only XF_NPPC1
or 1 pixel per cycle operations.
src Input image
dst Output image
sigma_space Standard deviation of filter in spatial domain
sigma_color Standard deviation of filter used in color space
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA,
to process a grayscale HD (1080x1920) image.
Table 28: bilateralFilter Resource Utilization Summary
                                                        Utilization Estimate
Operating Mode  Filter Size  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT
1 pixel         3x3          300                        6         22        4934  4293
1 pixel         5x5          300                        12        30        5481  4943
1 pixel         7x7          300                        37        48        7084  6195
Performance Estimate
The following table summarizes a performance estimate of the kernel in different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 29: bilateralFilter Function Performance Estimate Summary
Operating Mode  Filter Size  Max Latency Estimate at 168 MHz (ms)
1 pixel         3x3          7.18
1 pixel         5x5          7.20
1 pixel         7x7          7.22
Deviation from OpenCV
Unlike OpenCV, xfOpenCV only supports filter sizes of 3, 5 and 7.
Bit Depth Conversion
The convertTo function converts the input image bit depth to the required bit depth in the
output image.
API Syntax
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void convertTo(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src_mat, xf::Mat<DST_T,
ROWS, COLS, NPC> &_dst_mat, ap_uint<4> _convert_type, int _shift)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 30: convertTo Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1),
16-bit, unsigned, 1 channel (XF_16UC1),
16-bit, signed, 1 channel (XF_16SC1),
32-bit, unsigned, 1 channel (XF_32UC1), and
32-bit, signed, 1 channel (XF_32SC1) are supported.
DST_T Output pixel type. 8-bit, unsigned, 1 channel (XF_8UC1),
16-bit, unsigned, 1 channel (XF_16UC1),
16-bit, signed, 1 channel (XF_16SC1),
32-bit, unsigned, 1 channel (XF_32UC1), and
32-bit, signed, 1 channel (XF_32SC1) are supported.
ROWS Height of input and output images
COLS Width of input and output images
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
_convert_type This parameter specifies the type of conversion required. (See XF_convert_bit_depth_e
enumerated type in file xf_params.h for possible values.)
_shift Optional scale factor
Possible Conversions
The following table summarizes supported conversions. The rows are possible input image bit
depths and the columns are corresponding possible output image bit depths (U=unsigned,
S=signed).
Table 31: convertTo Function Supported Conversions
Input / Output  U8   U16  S16  U32  S32
U8              NA   yes  yes  NA   yes
U16             yes  NA   NA   NA   yes
S16             yes  NA   NA   NA   yes
U32             NA   NA   NA   NA   NA
S32             yes  yes  yes  NA   NA
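When choosing template arguments, it can help to check a conversion against Table 31 before instantiating the kernel. The sketch below simply transcribes the table into a lookup; the Depth enum and isConversionSupported are hypothetical helpers for illustration and are not part of xfOpenCV.

```cpp
#include <cstddef>

// Bit depths in the row/column order of Table 31 (U = unsigned, S = signed).
enum class Depth : std::size_t { U8 = 0, U16, S16, U32, S32 };

// Table 31 transcribed as booleans: kSupported[input][output].
// "yes" entries are true; "NA" entries are false.
constexpr bool kSupported[5][5] = {
    //  U8     U16    S16    U32    S32        input
    {false, true,  true,  false, true },   // U8
    {true,  false, false, false, true },   // U16
    {true,  false, false, false, true },   // S16
    {false, false, false, false, false},   // U32
    {true,  true,  true,  false, false},   // S32
};

// Returns true when Table 31 lists the input-to-output conversion as supported.
constexpr bool isConversionSupported(Depth in, Depth out) {
    return kSupported[static_cast<std::size_t>(in)][static_cast<std::size_t>(out)];
}
```

For example, isConversionSupported(Depth::U8, Depth::S16) is true, while every conversion from or to U32 is false.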
Resource Utilization
The following tables summarize the resource utilization of the convertTo function, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 32: convertTo Function Resource Utilization Summary For XF_CONVERT_8U_TO_16S Conversion
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         8         581   523   119
8 pixel         150                        0         8         963   1446  290
Table 33: convertTo Function Resource Utilization Summary For XF_CONVERT_16U_TO_8U Conversion
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         8         591   541   124
8 pixel         150                        0         8         915   1500  308
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 34: convertTo Function Performance Estimate Summary
Operating Mode               Max Latency Estimate
1 pixel operation (300 MHz)  6.91 ms
8 pixel operation (150 MHz)  1.69 ms
Bitwise AND
The bitwise_and function performs the bitwise AND operation for each pixel between two
input images, and returns an output image.
$$I_{out}(x, y) = I_{in1}(x, y) \;\&\; I_{in2}(x, y)$$
Where,
• $I_{out}(x, y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x, y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x, y)$ is the intensity of the second input image at position (x, y)
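As a point of reference, the per-pixel operation can be modeled in a few lines of plain C++. The function name bitwiseAndRef is hypothetical, and images are represented as flat vectors rather than xf::Mat objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-pixel bitwise AND of two equally sized 8-bit images (software
// reference model; the xfOpenCV kernel itself operates on xf::Mat).
std::vector<std::uint8_t> bitwiseAndRef(const std::vector<std::uint8_t>& in1,
                                        const std::vector<std::uint8_t>& in2) {
    assert(in1.size() == in2.size());
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(in1[i] & in2[i]);
    }
    return out;
}
```

A model like this is mainly useful in a test bench, to compare against the kernel output pixel by pixel.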
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_and (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 35: bitwise_and Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 36: bitwise_and Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         0         62    44    10
8 pixel         150                        0         0         59    72    13
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 37: bitwise_and Function Performance Estimate Summary
Operating Mode               Max Latency Estimate (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7
Bitwise NOT
The bitwise_not function performs the pixel-wise bitwise NOT operation for the pixels in the
input image, and returns an output image.
$$I_{out}(x, y) = {\sim} I_{in}(x, y)$$
Where,
• $I_{out}(x, y)$ is the intensity of the output image at position (x, y)
• $I_{in}(x, y)$ is the intensity of the input image at position (x, y)
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_not (
xf::Mat<SRC_T, ROWS, COLS, NPC> src,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 38: bitwise_not Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 39: bitwise_not Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         0         97    78    20
8 pixel         150                        0         0         88    97    21
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 40: bitwise_not Function Performance Estimate Summary
Operating Mode               Max Latency Estimate (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7
Bitwise OR
The bitwise_or function performs the pixel-wise bitwise OR operation between two input
images, and returns an output image.
$$I_{out}(x, y) = I_{in1}(x, y) \mid I_{in2}(x, y)$$
Where,
• $I_{out}(x, y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x, y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x, y)$ is the intensity of the second input image at position (x, y)
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_or (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 41: bitwise_or Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 42: bitwise_or Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         0         62    44    10
8 pixel         150                        0         0         59    72    13
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 43: bitwise_or Function Performance Estimate Summary
Operating Mode               Max Latency Estimate (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7
Bitwise XOR
The bitwise_xor function performs the pixel-wise bitwise XOR operation between two input
images, and returns an output image, as shown below:
$$I_{out}(x, y) = I_{in1}(x, y) \oplus I_{in2}(x, y)$$
Where,
• $I_{out}(x, y)$ is the intensity of the output image at position (x, y)
• $I_{in1}(x, y)$ is the intensity of the first input image at position (x, y)
• $I_{in2}(x, y)$ is the intensity of the second input image at position (x, y)
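One way to picture the 8 pixels per cycle mode for a bitwise kernel is that eight 8-bit pixels share a single 64-bit word, so one word-wide XOR handles all eight at once. The sketch below illustrates that idea only; the function names are hypothetical, and the byte order within the word (pixel 0 in the least significant byte) is an assumption, not a statement about the xf::Mat memory layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packs eight 8-bit pixels into one 64-bit word, pixel 0 in the least
// significant byte. This layout is an assumption made for illustration.
std::uint64_t packEightPixels(const std::uint8_t* px) {
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        word |= static_cast<std::uint64_t>(px[i]) << (8 * i);
    }
    return word;
}

// Extracts pixel `i` (0..7) from a word packed by packEightPixels.
std::uint8_t unpackPixel(std::uint64_t word, std::size_t i) {
    assert(i < 8);
    return static_cast<std::uint8_t>((word >> (8 * i)) & 0xFFu);
}

// One 64-bit XOR is equivalent to eight independent 8-bit pixel XORs,
// because XOR never carries between bit positions.
std::uint64_t xorEightPixels(std::uint64_t a, std::uint64_t b) {
    return a ^ b;
}
```

The same reasoning applies to AND, OR and NOT, since none of them propagates information between bit positions.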
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void bitwise_xor(
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 44: bitwise_xor Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image:
Table 45: bitwise_xor Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         0         62    44    10
8 pixel         150                        0         0         59    72    13
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD
(1080x1920) image:
Table 46: bitwise_xor Function Performance Estimate Summary
Operating Mode               Max Latency Estimate (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7
Box Filter
The boxFilter function performs box filtering on the input image. The box filter acts as a
low-pass filter and performs blurring over the image. The boxFilter function, or box blur, is a
spatial domain linear filter in which each pixel in the resulting image has a value equal to the
average value of the neighboring pixels in the image.
$$K_{box} = \frac{1}{ksize \cdot ksize} \begin{bmatrix} 1 & \cdots & 1 \\ 1 & \cdots & 1 \\ 1 & \cdots & 1 \end{bmatrix}$$
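The averaging kernel above can be modeled in software as shown in the following sketch. It is a reference model, not the xfOpenCV kernel: the function name boxFilterRef is hypothetical, out-of-image pixels are assumed to be 0 for the constant border, and the truncating integer division is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Box filter reference: each output pixel is the average of the
// ksize x ksize neighborhood. Pixels outside the image are treated as 0
// (constant border). Truncating integer division is an assumption here.
std::vector<std::uint8_t> boxFilterRef(const std::vector<std::uint8_t>& src,
                                       int rows, int cols, int ksize) {
    assert(ksize % 2 == 1 && static_cast<int>(src.size()) == rows * cols);
    const int radius = ksize / 2;
    std::vector<std::uint8_t> dst(src.size());
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int sum = 0;
            for (int dr = -radius; dr <= radius; ++dr) {
                for (int dc = -radius; dc <= radius; ++dc) {
                    const int rr = r + dr;
                    const int cc = c + dc;
                    if (rr >= 0 && rr < rows && cc >= 0 && cc < cols) {
                        sum += src[rr * cols + cc];
                    }
                }
            }
            dst[r * cols + c] = static_cast<std::uint8_t>(sum / (ksize * ksize));
        }
    }
    return dst;
}
```

On a uniform 3x3 image of value 90 with a 3x3 kernel, the center stays at 90 while the corners drop to 40, because five of the nine taps fall on the zero border.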
API Syntax
template<int BORDER_TYPE, int FILTER_TYPE, int SRC_T, int ROWS, int COLS,
int NPC=1>
void boxFilter(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 47: boxFilter Function Parameter Descriptions
Parameter Description
FILTER_TYPE Filter size. Filter sizes of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7)
are supported
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
SRC_T Input and output pixel type. 8-bit unsigned, 16-bit unsigned and 16-bit signed, 1 channel
are supported (XF_8UC1, XF_16UC1, XF_16SC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 48: boxFilter Function Resource Utilization Summary
                                                        Utilization Estimate
Operating Mode  Filter Size  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         3x3          300                        3         1         545   519   104
1 pixel         5x5          300                        5         1         876   870   189
1 pixel         7x7          300                        7         1         1539  1506  300
8 pixel         3x3          150                        6         8         1002  1368  264
8 pixel         5x5          150                        10        8         1576  3183  611
8 pixel         7x7          150                        14        8         2414  5018  942
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image:
Table 49: boxFilter Function Performance Estimate Summary
Operating Mode  Operating Frequency (MHz)  Filter Size  Max Latency Estimate (ms)
1 pixel         300                        3x3          7.2
1 pixel         300                        5x5          7.21
1 pixel         300                        7x7          7.22
8 pixel         150                        3x3          1.7
8 pixel         150                        5x5          1.7
8 pixel         150                        7x7          1.7
Canny Edge Detection
The Canny edge detector finds the edges in an image or video frame. It is one of the most
popular algorithms for edge detection. The Canny algorithm aims to satisfy three main criteria:
1. Low error rate: A good detection of only existent edges.
2. Good localization: The distance between detected edge pixels and real edge pixels has to be
minimized.
3. Minimal response: Only one detector response per edge.
In this algorithm, the noise in the image is reduced first by applying a Gaussian mask. The
Gaussian mask used here is the average mask of size 3x3. Thereafter, gradients along the x and y
directions are computed using the Sobel gradient function. The gradients are used to compute
the magnitude and phase of the pixels. The phase is quantized and the pixels are binned
accordingly. Non-maximal suppression is applied on the pixels to remove the weaker edges.
Edge tracing is applied on the remaining pixels to draw the edges on the image. In this algorithm,
Canny up to non-maximal suppression is in one kernel and the edge linking module is in
another kernel. After non-maximal suppression, the output is represented as 2 bits per pixel,
where:
• 00 - represents the background
• 01 - represents the weaker edge
• 11 - represents the strong edge
The output is packed as 8-bit (four 2-bit pixels) in 1 pixel per cycle operation and packed as
16-bit (eight 2-bit pixels) in 8 pixel per cycle operation. For the edge linking module, the input is
64-bit, such that 32 pixels of 2 bits each are packed into a 64-bit word. Edge tracing is applied on
the pixels and returns the edges in the image.
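The 2-bit packing described above can be sketched as follows for the 1 pixel per cycle case, where four codes share one byte. The helper names are hypothetical, and the bit order (pixel 0 in the two least significant bits) is an assumption for illustration; the actual kernel layout should be checked against the library source.

```cpp
#include <cassert>
#include <cstdint>

// 2-bit codes produced after non-maximal suppression, as listed above.
constexpr std::uint8_t kBackground = 0x0;  // 00
constexpr std::uint8_t kWeakEdge   = 0x1;  // 01
constexpr std::uint8_t kStrongEdge = 0x3;  // 11

// Packs four 2-bit pixel codes into one byte. Placing pixel 0 in the two
// least significant bits is an assumption made for illustration.
std::uint8_t packFour(const std::uint8_t codes[4]) {
    std::uint8_t packed = 0;
    for (int i = 0; i < 4; ++i) {
        assert(codes[i] <= 0x3);
        packed = static_cast<std::uint8_t>(packed | (codes[i] << (2 * i)));
    }
    return packed;
}

// Reads back the 2-bit code of pixel `i` (0..3) from a packed byte.
std::uint8_t unpackCode(std::uint8_t packed, int i) {
    assert(i >= 0 && i < 4);
    return static_cast<std::uint8_t>((packed >> (2 * i)) & 0x3);
}
```

The same scheme extends to eight codes in a 16-bit value and 32 codes in a 64-bit word by widening the integer type and the loop bound.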
API Syntax
The API syntax for Canny is:
template<int FILTER_TYPE, int NORM_TYPE, int SRC_T, int DST_T, int ROWS,
int COLS, int NPC, int NPC1>
void Canny(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
xf::Mat<DST_T, ROWS, COLS, NPC1> & _dst_mat,
unsigned char _lowthreshold, unsigned char _highthreshold)
The API syntax for EdgeTracing is:
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC_SRC, int NPC_DST>
void EdgeTracing(xf::Mat<SRC_T, ROWS, COLS, NPC_SRC> & _src,
xf::Mat<DST_T, ROWS, COLS, NPC_DST> & _dst)
Parameter Descriptions
The following table describes the xf::Canny template and function parameters:
Table 50: xf::Canny Function Parameter Descriptions
Parameter Description
FILTER_TYPE The filter window dimensions. The options are 3 and 5.
NORM_TYPE The type of norm used. The options for norm type are L1NORM and L2NORM.
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. In the 1 pixel case, the output is 8-bit, packing four 2-bit pixel
values into 8 bits. In the 8 pixel case, the output is 16-bit, packing eight 2-bit pixel
values into 16 bits.
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
_lowthreshold The lower value of threshold for binary thresholding.
_highthreshold The higher value of threshold for binary thresholding.
The following table describes the EdgeTracing template and function parameters:
Table 51: EdgeTracing Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type
DST_T Output pixel type
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC_SRC Number of pixels to be processed per cycle. Fixed to XF_NPPC32.
NPC_DST Number of pixels to be written to destination. Fixed to XF_NPPC8.
_src Input image
_dst Output image
Resource Utilization
The following table summarizes the resource utilization of xf::Canny and EdgeTracing in
different configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx
Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with a filter size of 3.
Table 52: xf::Canny and EdgeTracing Function Resource Utilization Summary
Name      1 pixel        1 pixel        8 pixel        8 pixel        Edge Linking   Edge Linking
          L1NORM, FS: 3  L2NORM, FS: 3  L1NORM, FS: 3  L2NORM, FS: 3
          300 MHz        300 MHz        150 MHz        150 MHz        300 MHz        150 MHz
BRAM_18K  22             18             36             32             84             84
DSP48E    2              4              16             32             3              3
FF        3027           3507           4899           6208           17600          14356
LUT       2626           3170           6518           9560           15764          14274
CLB       606            708            1264           1871           2955           3241
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a
grayscale HD (1080x1920) image for L1NORM with a filter size of 3, including the edge linking
module.
Table 53: xf::Canny and EdgeTracing Function Performance Estimate Summary
Operating Mode  Operating Frequency (MHz)  Latency Estimate (ms)
1 pixel         300                        10.2
8 pixel         150                        8
Deviation from OpenCV
In the OpenCV Canny function, the Gaussian blur is not applied as a pre-processing step.
Channel Combine
The merge function merges single-channel images into a multi-channel image. The number of
channels to be merged should be four.
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void merge(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src1, xf::Mat<SRC_T, ROWS,
COLS, NPC> &_src2, xf::Mat<SRC_T, ROWS, COLS, NPC> &_src3, xf::Mat<SRC_T,
ROWS, COLS, NPC> &_src4, xf::Mat<DST_T, ROWS, COLS, NPC> &_dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 54: merge Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; the only supported option is XF_NPPC1 (1 pixel operation).
_src1 Input single-channel image
_src2 Input single-channel image
_src3 Input single-channel image
_src4 Input single-channel image
_dst Output multi-channel image
Resource Utilization
The following table summarizes the resource utilization of the merge function, generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process 4
single-channel HD (1080x1920) images.
Table 55: merge Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         8         494   386   85
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process 4 single-channel HD
(1080x1920) images.
Table 56: merge Function Performance Estimate Summary
Operating Mode               Max Latency Estimate
1 pixel operation (300 MHz)  6.92 ms
Channel Extract
The extractChannel function splits a multi-channel array (32-bit pixel-interleaved data) into
several single-channel arrays and returns a single channel. The channel to be extracted is
specified by using the channel argument.
The value of the channel argument is specified by macros defined in the
xf_channel_extract_e enumerated data type. The following table summarizes the possible
values for the xf_channel_extract_e enumerated data type:
Table 57: xf_channel_extract_e Enumerated Data Type Values
Channel Enumerated Type
Unknown XF_EXTRACT_CH_0
Unknown XF_EXTRACT_CH_1
Unknown XF_EXTRACT_CH_2
Unknown XF_EXTRACT_CH_3
RED XF_EXTRACT_CH_R
GREEN XF_EXTRACT_CH_G
BLUE XF_EXTRACT_CH_B
ALPHA XF_EXTRACT_CH_A
LUMA XF_EXTRACT_CH_Y
Cb/U XF_EXTRACT_CH_U
Cr/V/Value XF_EXTRACT_CH_V
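To make the notion of 32-bit pixel-interleaved data concrete, the sketch below packs four 8-bit channels into one 32-bit value and pulls one channel back out by index. The function names are hypothetical, and placing channel 0 in the least significant byte is an assumption; how the XF_EXTRACT_CH_* values map to byte positions is defined in xf_params.h and is not assumed here.

```cpp
#include <cassert>
#include <cstdint>

// Packs four 8-bit channels into one 32-bit pixel-interleaved value.
// Channel 0 in the least significant byte is an assumed layout.
std::uint32_t packChannels(std::uint8_t c0, std::uint8_t c1,
                           std::uint8_t c2, std::uint8_t c3) {
    return static_cast<std::uint32_t>(c0) |
           (static_cast<std::uint32_t>(c1) << 8) |
           (static_cast<std::uint32_t>(c2) << 16) |
           (static_cast<std::uint32_t>(c3) << 24);
}

// Returns channel `index` (0..3) of a pixel packed by packChannels.
std::uint8_t extractChannelRef(std::uint32_t pixel, unsigned index) {
    assert(index < 4);
    return static_cast<std::uint8_t>((pixel >> (8 * index)) & 0xFFu);
}
```

Packing followed by extraction returns the original channel value, which is the round trip that the merge and extractChannel functions perform on whole images.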
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void extractChannel(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,
xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat, uint16_t _channel)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 58: extractChannel Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)
DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; the only supported option is XF_NPPC1 (1 pixel operation).
_src_mat Input multi-channel image
_dst_mat Output single channel image
_channel Channel to be extracted (See xf_channel_extract_e enumerated type in file xf_params.h for
possible values.)
Resource Utilization
The following table summarizes the resource utilization of the extractChannel function,
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to
process a 4 channel HD (1080x1920) image.
Table 59: extractChannel Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         8         508   354   96
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a 4 channel HD
(1080x1920) image.
Table 60: extractChannel Function Performance Estimate Summary
Operating Mode               Max Latency Estimate (ms)
1 pixel operation (300 MHz)  6.92
Color Conversion
The color conversion functions convert one image format to another image format, for the
combinations listed in the following table. Each row gives an input format and the conversions
available from it. Supported conversions are discussed in the following sections.
Input Format  Supported Conversions (for details, see the section of the same name)
RGBA          RGBA to NV12, RGBA to NV21, RGBA to IYUV, RGBA to YUV4
NV12          NV12 to RGBA, NV12 to IYUV, NV12 to YUV4
NV21          NV21 to RGBA, NV21 to IYUV, NV21 to YUV4
IYUV          IYUV to RGBA, IYUV to NV12, IYUV to YUV4
UYVY          UYVY to RGBA, UYVY to NV12, UYVY to IYUV
YUYV          YUYV to RGBA, YUYV to NV12, YUYV to IYUV
YUV4          N/A
RGB to YUV Conversion Matrix
Following is the formula to convert RGB data to YUV data:
$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 & 16 \\ -0.148 & -0.291 & 0.439 & 128 \\ 0.439 & -0.368 & -0.071 & 128 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \\ 1 \end{bmatrix}$$
YUV to RGB Conversion Matrix
Following is the formula to convert YUV data to RGB data:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.164 & 0 & 1.596 \\ 1.164 & -0.391 & -0.813 \\ 1.164 & 2.018 & 0 \end{bmatrix} \begin{bmatrix} (Y - 16) \\ (U - 128) \\ (V - 128) \end{bmatrix}$$
Source: http://www.fourcc.org/fccyvrgb.php
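The two matrices can be exercised with a small floating-point sketch. This is a reference calculation only: the struct and function names are hypothetical, and the round-to-nearest and clamp-to-8-bit steps are assumptions (the hardware kernels may well use fixed-point arithmetic with different rounding).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };
struct Yuv { std::uint8_t y, u, v; };

// Rounds to nearest and clamps to the 8-bit range (both are assumptions of
// this sketch, not documented kernel behavior).
std::uint8_t clamp8(double v) {
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, std::lround(v))));
}

// RGB to YUV using the 3x4 matrix above (last column is the offset).
Yuv rgbToYuv(Rgb p) {
    return {clamp8( 0.257 * p.r + 0.504 * p.g + 0.098 * p.b + 16.0),
            clamp8(-0.148 * p.r - 0.291 * p.g + 0.439 * p.b + 128.0),
            clamp8( 0.439 * p.r - 0.368 * p.g - 0.071 * p.b + 128.0)};
}

// YUV to RGB using the 3x3 matrix above on the offset-removed components.
Rgb yuvToRgb(Yuv p) {
    const double y = p.y - 16.0;
    const double u = p.u - 128.0;
    const double v = p.v - 128.0;
    return {clamp8(1.164 * y + 1.596 * v),
            clamp8(1.164 * y - 0.391 * u - 0.813 * v),
            clamp8(1.164 * y + 2.018 * u)};
}
```

With these coefficients, white (255, 255, 255) maps to Y=235, U=128, V=128 and black maps to Y=16, U=128, V=128, which is the usual limited-range behavior; converting those values back recovers white and black.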
RGBA to YUV4
The rgba2yuv4 function converts a 4-channel RGBA image to YUV444 format. The function
outputs Y, U, and V streams separately.
API Syntax
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgba2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _u_image,
xf::Mat<DST_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 62: rgba2yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS, COLS).
_v_image Output V image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of RGBA to YUV4 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 63: rgba2yuv4 Function Resource Utilization Summary
                                           Utilization Estimate
Operating Mode  Operating Frequency (MHz)  BRAM_18K  DSP_48Es  FF    LUT   CLB
1 pixel         300                        0         9         589   328   96
Performance Estimate
The following table summarizes the performance of RGBA to YUV4 for dierent conguraons,
as generated using the Vivado HLS 2018.2 version for the Xilinx Xczu9eg-vb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 64: rgba2yuv4 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 1.89
RGBA to IYUV
The rgba2iyuv function converts a 4-channel RGBA image to IYUV (4:2:0) format. The
function outputs the Y, U, and V planes separately. IYUV holds subsampled data: Y is sampled
for every RGBA pixel, and U and V are sampled once for every 2x2 block of pixels (2 rows by
2 columns). The U and V planes are each of size (rows/2)*(columns/2); because consecutive rows
are cascaded into a single row, each plane is stored with size (rows/4)*columns.
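The two ways of describing the chroma plane size hold the same number of samples, which the following sketch checks for an HD frame. The helper names are hypothetical, and the divisibility checks are assumptions that the documented multiple-of-8 requirement on ROWS and COLS already satisfies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type describing a plane's dimensions.
struct PlaneDims {
    std::size_t rows, cols;
};

// Natural 4:2:0 chroma layout: half the rows, half the columns.
inline PlaneDims chroma_natural(std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    return PlaneDims{rows / 2, cols / 2};
}

// Cascaded layout: two consecutive chroma rows packed into one full-width row.
inline PlaneDims chroma_cascaded(std::size_t rows, std::size_t cols) {
    assert(rows % 4 == 0);
    return PlaneDims{rows / 4, cols};
}

inline std::size_t sample_count(PlaneDims d) {
    return d.rows * d.cols;
}
```

For a 1080x1920 image, the natural layout is 540x960 and the cascaded layout is 270x1920; both contain 518400 samples.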
API Syntax
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgba2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 65: rgba2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS/4, COLS).
_v_image Output V image of size (ROWS/4, COLS).
Resource Utilization
The following table summarizes the resource utilization of RGBA to IYUV for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 66: rgba2iyuv Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          9          816    472    149
Performance Estimate
The following table summarizes the performance of RGBA to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 67: rgba2iyuv Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   1.8
RGBA to NV12
The rgba2nv12 function converts a 4-channel RGBA image to NV12 (4:2:0) format. The
function outputs the Y plane and the interleaved UV plane separately. NV12 holds subsampled
data: Y is sampled for every RGBA pixel, and U and V are sampled once for every 2x2 block of
pixels (2 rows by 2 columns). The UV plane is of size (rows/2)*(columns/2), with the U and V
values interleaved.
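The interleaving of chroma samples can be sketched as below. The function name and the v_first flag are hypothetical: with v_first set to false the output is in NV12 order (U then V), and with it set to true the output is in NV21 order (V then U). Inputs are plain byte vectors rather than xf::Mat objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave separate U and V sample lists into one chroma stream.
// v_first == false gives U,V,U,V,... (NV12); true gives V,U,V,U,... (NV21).
inline std::vector<std::uint8_t> interleave_chroma(const std::vector<std::uint8_t>& u,
                                                   const std::vector<std::uint8_t>& v,
                                                   bool v_first) {
    assert(u.size() == v.size());
    std::vector<std::uint8_t> out;
    out.reserve(2 * u.size());
    for (std::size_t i = 0; i < u.size(); ++i) {
        if (v_first) {
            out.push_back(v[i]);
            out.push_back(u[i]);
        } else {
            out.push_back(u[i]);
            out.push_back(v[i]);
        }
    }
    return out;
}
```

Each interleaved pair corresponds to one 2-channel (XF_8UC2) element of the (ROWS/2, COLS/2) output.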
API Syntax
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS,
COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 68: rgba2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y Output Y image of size (ROWS, COLS).
_uv Output UV image of size (ROWS/2, COLS/2).
Resource Utilization
The following table summarizes the resource utilization of RGBA to NV12 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 69: rgba2nv12 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          9          802    452    128
Performance Estimate
The following table summarizes the performance of RGBA to NV12 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 70: rgba2nv12 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   1.8
RGBA to NV21
The rgba2nv21 function converts a 4-channel RGBA image to NV21 (4:2:0) format. The
function outputs the Y plane and the interleaved VU plane separately. NV21 holds subsampled
data: Y is sampled for every RGBA pixel, and U and V are sampled once for every 2x2 block of
RGBA pixels (2 rows by 2 columns). The VU plane is of size (rows/2)*(columns/2), with the V and
U values interleaved.
API Syntax
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS,
COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 71: rgba2nv21 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y Output Y image of size (ROWS, COLS).
_uv Output UV image of size (ROWS/2, COLS/2).
Resource Utilization
The following table summarizes the resource utilization of RGBA to NV21 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 72: rgba2nv21 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          9          802    453    131
Performance Estimate
The following table summarizes the performance of RGBA to NV21 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 73: rgba2nv21 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   1.89
YUYV to RGBA
The yuyv2rgba function converts a single-channel YUYV (YUV 4:2:2) image to a 4-channel
RGBA image. YUYV is a sub-sampled format: one set of YUYV values gives 2 RGBA pixel values.
YUYV is represented in 16-bit values, whereas RGBA is represented in 32-bit values.
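The way one set of YUYV values yields two pixels can be sketched as follows. The packing is an assumption: each 16-bit word is taken to hold luma in its low byte and chroma in its high byte, with U carried by the first word of a pair and V by the second. The names YuvPixel and unpack_yuyv_pair are hypothetical, and the actual bit layout should be confirmed against the library source.

```cpp
#include <array>
#include <cstdint>

// Hypothetical per-pixel YUV triple.
struct YuvPixel {
    std::uint8_t y, u, v;
};

// Unpack two consecutive 16-bit YUYV words into two pixels that share U and V.
// Assumed layout: word0 = (U << 8) | Y0, word1 = (V << 8) | Y1.
inline std::array<YuvPixel, 2> unpack_yuyv_pair(std::uint16_t word0, std::uint16_t word1) {
    const std::uint8_t y0 = static_cast<std::uint8_t>(word0 & 0xFF);
    const std::uint8_t u = static_cast<std::uint8_t>(word0 >> 8);
    const std::uint8_t y1 = static_cast<std::uint8_t>(word1 & 0xFF);
    const std::uint8_t v = static_cast<std::uint8_t>(word1 >> 8);
    std::array<YuvPixel, 2> out;
    out[0] = YuvPixel{y0, u, v};
    out[1] = YuvPixel{y1, u, v};
    return out;
}
```

Each resulting (Y, U, V) triple can then be passed through the YUV to RGB formula to produce one 32-bit RGBA pixel.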
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 74: yuyv2rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_dst Output image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of YUYV to RGBA for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 75: yuyv2rgba Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          6          765    705    165
Performance Estimate
The following table summarizes the performance of YUYV to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 76: yuyv2rgba Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
YUYV to NV12
The yuyv2nv12 function converts a single-channel YUYV (YUV 4:2:2) image to NV12
(YUV 4:2:0) format. YUYV is a sub-sampled format: one set of YUYV values gives 2 Y values and
1 U and 1 V value.
API Syntax
template<int SRC_T,int Y_T,int UV_T,int ROWS,int COLS,int NPC=1,int
NPC_UV=1>
void yuyv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS,
COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 77: yuyv2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization
The following table summarizes the resource utilization of YUYV to NV12 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 78: yuyv2nv12 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          0          831    491    149
8 pixel          150                         0          0          1196   632    161
Performance Estimate
The following table summarizes the performance of YUYV to NV12 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 79: yuyv2nv12 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
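As a rough cross-check of the latency figures above, the sketch below assumes that one group of NPC pixels is consumed per clock cycle and ignores pipeline fill and any stalls. This simple model is an assumption, not the method used by the HLS tool to produce its estimates, and frame_latency_ms is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Estimate the time to stream one frame, in milliseconds, assuming
// pixels_per_cycle pixels are consumed on every clock cycle.
inline double frame_latency_ms(std::size_t rows, std::size_t cols,
                               unsigned pixels_per_cycle, double clock_mhz) {
    assert(pixels_per_cycle > 0 && clock_mhz > 0.0);
    const double cycles =
        static_cast<double>(rows) * static_cast<double>(cols) / pixels_per_cycle;
    return cycles / (clock_mhz * 1000.0);  // clock_mhz * 1000 = cycles per millisecond
}
```

Under this model a 1080x1920 frame takes about 6.91 ms at 300 MHz with 1 pixel per cycle and about 1.73 ms at 150 MHz with 8 pixels per cycle, which is in line with the 6.9 ms and 1.7 ms entries in the table.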
YUYV to IYUV
The yuyv2iyuv function converts a single-channel YUYV (YUV 4:2:2) image to IYUV (4:2:0)
format. The outputs of the function are separate Y, U, and V planes. YUYV is a sub-sampled
format: one set of YUYV values gives 2 Y values and 1 U and 1 V value. The U and V values of
the odd rows are dropped, because U and V are sampled once for every 2 rows and 2 columns in
the IYUV (4:2:0) format.
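The row-dropping step can be sketched on a plain byte buffer as shown below. The sketch assumes the YUYV data is a byte stream in Y0, U, Y1, V order and that "odd rows" means rows 1, 3, 5, ... when counting from zero; both are assumptions, and Planes420 and yuyv_to_iyuv are hypothetical names. The chroma planes are returned in their natural (rows/2, cols/2) order rather than in the cascaded layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for 4:2:0 planar output.
struct Planes420 {
    std::vector<std::uint8_t> y, u, v;
};

// Convert packed YUYV bytes (2 bytes per pixel) to planar 4:2:0,
// keeping chroma from even rows only.
inline Planes420 yuyv_to_iyuv(const std::vector<std::uint8_t>& yuyv,
                              std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(yuyv.size() == rows * cols * 2);
    Planes420 out;
    out.y.reserve(rows * cols);
    out.u.reserve((rows / 2) * (cols / 2));
    out.v.reserve((rows / 2) * (cols / 2));
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; c += 2) {
            const std::uint8_t* p = &yuyv[(r * cols + c) * 2];  // Y0 U Y1 V
            out.y.push_back(p[0]);
            out.y.push_back(p[2]);
            if (r % 2 == 0) {  // chroma of odd rows is dropped
                out.u.push_back(p[1]);
                out.v.push_back(p[3]);
            }
        }
    }
    return out;
}
```

For a 2x2 input, all four Y values are kept while only the U and V values from the first row survive.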
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 80: yuyv2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization
The following table summarizes the resource utilization of YUYV to IYUV for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 81: yuyv2iyuv Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          0          835    497    149
8 pixel          150                         0          0          1428   735    210
Performance Estimate
The following table summarizes the performance of YUYV to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 82: yuyv2iyuv Function Performance Estimate
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
UYVY to IYUV
The uyvy2iyuv function converts a UYVY (YUV 4:2:2) single-channel image to the IYUV
format. The outputs of the function are separate Y, U, and V planes. UYVY is a sub-sampled
format: one set of UYVY values gives 2 Y values and 1 U and 1 V value.
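UYVY and YUYV carry the same samples and differ only in byte order within each 4-byte group, as the sketch below illustrates. The byte orders follow the usual FourCC definitions (YUYV: Y0, U, Y1, V; UYVY: U, Y0, V, Y1); the enum, struct, and function names are hypothetical.

```cpp
#include <array>
#include <cstdint>

enum class Packed422 { YUYV, UYVY };

// Hypothetical view of one 4:2:2 group: two luma samples sharing one U and one V.
struct MacroPixel {
    std::uint8_t y0, y1, u, v;
};

// Extract the samples of one 4-byte group according to the packed format.
inline MacroPixel unpack_422(const std::array<std::uint8_t, 4>& b, Packed422 fmt) {
    if (fmt == Packed422::YUYV) {
        return MacroPixel{b[0], b[2], b[1], b[3]};  // bytes: Y0 U Y1 V
    }
    return MacroPixel{b[1], b[3], b[0], b[2]};      // bytes: U Y0 V Y1
}
```

Once the samples are unpacked, the remaining conversion steps are the same for both packed formats.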
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _y_image,xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image,
xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 83: uyvy2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization
The following table summarizes the resource utilization of UYVY to IYUV for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 84: uyvy2iyuv Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          0          835    494    139
8 pixel          150                         0          0          1428   740    209
Performance Estimate
The following table summarizes the performance of UYVY to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 85: uyvy2iyuv Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
UYVY to RGBA
The uyvy2rgba function converts a UYVY (YUV 4:2:2) single-channel image to a 4-channel
RGBA image. UYVY is a sub-sampled format: one set of UYVY values gives 2 RGBA pixel values.
UYVY is represented in 16-bit values, whereas RGBA is represented in 32-bit values.
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T,
ROWS, COLS, NPC> & _dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 86: uyvy2rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_dst Output image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of UYVY to RGBA for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 87: uyvy2rgba Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          6          773    704    160
Performance Estimate
The following table summarizes the performance of UYVY to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 88: uyvy2rgba Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.8
UYVY to NV12
The uyvy2nv12 function converts a UYVY (YUV 4:2:2) single-channel image to NV12 format.
The outputs are separate Y and UV planes. UYVY is a sub-sampled format: one set of UYVY values
gives 2 Y values and 1 U and 1 V value.
API Syntax
template<int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1, int
NPC_UV=1>
void uyvy2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS,
COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 89: uyvy2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization
The following table summarizes the resource utilization of UYVY to NV12 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 90: uyvy2nv12 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          0          831    488    131
8 pixel          150                         0          0          1235   677    168
Performance Estimate
The following table summarizes the performance of UYVY to NV12 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 91: uyvy2nv12 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
IYUV to RGBA
The iyuv2rgba function converts a single-channel IYUV (YUV 4:2:0) image to a 4-channel RGBA
image. The inputs to the function are separate Y, U, and V planes. IYUV is a sub-sampled format:
the U and V values are sampled once for every 2 rows and 2 columns of the RGBA pixels. The data
of consecutive rows of size (columns/2) is combined to form a single row of size (columns).
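The combined-row layout implies an index mapping from an output pixel to its chroma sample, sketched below. It assumes that consecutive chroma rows of width (columns/2) are concatenated in order to fill each row of width (columns); the names PlanePos and cascaded_chroma_pos are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical (row, col) position inside a cascaded chroma plane.
struct PlanePos {
    std::size_t row, col;
};

// Locate the U (or V) sample for output pixel (r, c) in a plane stored
// as (rows/4, cols), where each stored row holds two chroma rows of cols/2.
inline PlanePos cascaded_chroma_pos(std::size_t r, std::size_t c, std::size_t cols) {
    assert(cols % 2 == 0 && c < cols);
    const std::size_t linear = (r / 2) * (cols / 2) + (c / 2);  // index in natural order
    return PlanePos{linear / cols, linear % cols};
}
```

For example, with 8 columns, pixel (2, 0) uses the sample at position (0, 4): the second chroma row begins halfway along the first stored row.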
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void iyuv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T,
ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,
xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 92: iyuv2rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_dst0 Output RGBA image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of IYUV to RGBA for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 93: iyuv2rgba Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         2          5          1208   728    196
Performance Estimate
The following table summarizes the performance of IYUV to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 94: iyuv2rgba Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
IYUV to NV12
The iyuv2nv12 function converts a single-channel IYUV image to NV12 format. The inputs are
separate U and V planes. The Y plane needs no processing because both formats have the same
Y plane. The U and V values are rearranged from plane interleaved to pixel interleaved.
API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC =1, int NPC_UV=1>
void iyuv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T,
ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> &
src_v,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<UV_T, ROWS/2,
COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 95: iyuv2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4
for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization
The following table summarizes the resource utilization of IYUV to NV12 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 96: iyuv2nv12 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          12         907    677    158
8 pixel          150                         0          12         1591   1022   235
Performance Estimate
The following table summarizes the performance of IYUV to NV12 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 97: iyuv2nv12 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   6.9
8 pixel operation (150 MHz)   1.7
IYUV to YUV4
The iyuv2yuv4 function converts a single-channel IYUV image to YUV444 format. The Y plane is
the same for both formats. The inputs are separate U and V planes of the IYUV image, and the
outputs are separate U and V planes of the YUV4 image. IYUV stores subsampled U and V values,
whereas the YUV4 format stores U and V values for every pixel. The same U and V values are
duplicated across each 2x2 block of pixels (2 rows by 2 columns) to produce the data required
by the YUV444 format.
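The 2x2 duplication can be sketched as a nearest-neighbor upsample of one chroma plane. The input here is assumed to be in natural (rows/2, cols/2) row-major order rather than the cascaded (ROWS/4, COLS) layout used by the API, and upsample_chroma_2x2 is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a subsampled chroma plane to full resolution by repeating each
// sample over the 2x2 block of pixels it covers.
inline std::vector<std::uint8_t> upsample_chroma_2x2(const std::vector<std::uint8_t>& chroma,
                                                     std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(chroma.size() == (rows / 2) * (cols / 2));
    std::vector<std::uint8_t> full(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            full[r * cols + c] = chroma[(r / 2) * (cols / 2) + (c / 2)];
        }
    }
    return full;
}
```

Applying the same function to the U plane and to the V plane yields the two full-resolution chroma outputs.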
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void iyuv2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T,
ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> &
src_v,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<SRC_T, ROWS,
COLS, NPC> & _u_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 98: iyuv2yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS, COLS).
_v_image Output V image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of IYUV to YUV4 for different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1
FPGA, to process an HD (1080x1920) image.
Table 99: iyuv2yuv4 Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT    CLB
1 pixel          300                         0          0          1398   870    232
8 pixel          150                         0          0          2134   1214   304
Performance Estimate
The following table summarizes the performance of IYUV to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to
process a grayscale HD (1080x1920) image.
Table 100: iyuv2yuv4 Function Performance Estimate Summary
Operating Mode                Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)   13.8
8 pixel operation (150 MHz)   3.4
NV12 to IYUV
The nv122iyuv function converts NV12 format to IYUV format. The function takes the
interleaved UV plane as input and outputs separate U and V planes. The Y plane needs no
processing because both formats have the same Y plane. The U and V values are rearranged
from pixel interleaved to plane interleaved.
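The rearrangement from pixel interleaved to plane interleaved can be sketched as a simple split of the UV stream. The input is assumed to be a byte stream in U, V, U, V, ... order; ChromaPlanes and deinterleave_uv are hypothetical names, and the outputs are in natural order rather than the cascaded (ROWS/4, COLS) layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the separated chroma planes.
struct ChromaPlanes {
    std::vector<std::uint8_t> u, v;
};

// Split an interleaved U,V,U,V,... stream into separate U and V sample lists.
inline ChromaPlanes deinterleave_uv(const std::vector<std::uint8_t>& uv) {
    assert(uv.size() % 2 == 0);
    ChromaPlanes out;
    out.u.reserve(uv.size() / 2);
    out.v.reserve(uv.size() / 2);
    for (std::size_t i = 0; i < uv.size(); i += 2) {
        out.u.push_back(uv[i]);
        out.v.push_back(uv[i + 1]);
    }
    return out;
}
```

The Y plane is not touched by this step, which matches the statement that it is identical in both formats.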
API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T,
ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> &
_y_image,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,xf::Mat<SRC_T,
ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 101: nv122iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization
The following table summarizes the resource ulizaon of NV12 to IYUV for dierent
conguraons, as generated in the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-
vb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.
Table 102: nv122iyuv Function Resource Utilization Summary
Operating
Mode
Operating
Frequency
(MHz)
Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 1344 717 208
8 pixel 150 0 1 1961 1000 263
Chapter 3: xfOpenCV Library API Reference
UG1233 (v2018.2) July 2, 2018 www.xilinx.com [placeholder text]
Xilinx OpenCV User Guide 79

Performance Estimate
The following table summarizes the performance of NV12 to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 103: nv122iyuv Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7
NV12 to RGBA
The nv122rgba function converts an NV12 image to a 4-channel RGBA image. The inputs
to the function are separate Y and UV planes. NV12 holds subsampled data: the Y plane is
sampled at unit rate, and there is one U and one V value for every 2x2 block of Y values. To
generate the RGBA data, each U and V value is duplicated 2x2 times.
API Syntax
template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv122rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,xf::Mat<UV_T,
ROWS/2, COLS/2, NPC> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 104: nv122rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_dst0 Output RGBA image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of NV12 to RGBA for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.
Table 105: nv122rgba Function Resource Utilization Summary
Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 2 | 5 | 1191 | 708 | 195
Performance Estimate
The following table summarizes the performance of NV12 to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 106: nv122rgba Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 6.9
NV12 to YUV4
The nv122yuv4 function converts an NV12 image to the YUV444 format. The function
outputs separate U and V planes. The Y plane is the same for both image formats. The UV
planes are duplicated 2x2 times to form the full-resolution U plane and V plane of the YUV444
image format.
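The 2x2 duplication can be pictured with a small software model. The sketch below is not the library implementation; it assumes the UV plane is stored as interleaved U, V byte pairs in row-major order and that the image dimensions are even. The names Yuv444Chroma and upsampleNv12Chroma are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Full-resolution chroma planes (hypothetical helper type).
struct Yuv444Chroma {
    std::vector<uint8_t> u;  // rows x cols
    std::vector<uint8_t> v;  // rows x cols
};

// uv holds (rows/2) x (cols/2) interleaved pairs: U0 V0 U1 V1 ...
// Assumption: rows and cols are even.
Yuv444Chroma upsampleNv12Chroma(const std::vector<uint8_t>& uv, int rows, int cols) {
    const std::size_t count = static_cast<std::size_t>(rows) * cols;
    Yuv444Chroma out{std::vector<uint8_t>(count), std::vector<uint8_t>(count)};
    const int uvCols = cols / 2;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // Every position inside a 2x2 block reads the same UV pair.
            const int pair = (r / 2) * uvCols + (c / 2);
            out.u[r * cols + c] = uv[2 * pair];
            out.v[r * cols + c] = uv[2 * pair + 1];
        }
    }
    return out;
}
```

For a 2x4 image with two UV pairs, each pair fills one 2x2 block of the output U and V planes.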
API Syntax
template<int SRC_T,int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T,
ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> &
_y_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image,xf::Mat<SRC_T, ROWS,
COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 107: nv122yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS, COLS).
_v_image Output V plane of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of NV12 to YUV4 for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.
Table 108: nv122yuv4 Function Resource Utilization Summary
Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 1383 | 832 | 230
8 pixel | 150 | 0 | 0 | 1772 | 1034 | 259
Performance Estimate
The following table summarizes the performance of NV12 to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 109: nv122yuv4 Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 13.8
8 pixel operation (150 MHz) | 3.4
NV21 to IYUV
The nv212iyuv function converts an NV21 image to the IYUV image format. The input to
the function is the interleaved VU plane only, and the outputs are separate U and V planes. The
Y plane needs no processing because it is the same in both formats. The U and V values are
rearranged from pixel-interleaved to plane-interleaved.
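The rearrangement from pixel-interleaved to plane-interleaved chroma can be sketched in a few lines. This is an illustrative software model only, assuming the NV21 chroma buffer stores V before U in each byte pair; ChromaPlanes and splitNv21Chroma are hypothetical names, not part of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Separate U and V planes (hypothetical helper type).
struct ChromaPlanes {
    std::vector<uint8_t> u;
    std::vector<uint8_t> v;
};

// vu holds interleaved pairs in NV21 order: V0 U0 V1 U1 ...
ChromaPlanes splitNv21Chroma(const std::vector<uint8_t>& vu) {
    ChromaPlanes out;
    out.u.reserve(vu.size() / 2);
    out.v.reserve(vu.size() / 2);
    for (std::size_t i = 0; i + 1 < vu.size(); i += 2) {
        out.v.push_back(vu[i]);      // first byte of each pair is V
        out.u.push_back(vu[i + 1]);  // second byte of each pair is U
    }
    return out;
}
```

For NV12 input the same loop would apply with the two push_back targets swapped, since NV12 stores U before V.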
API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>
void nv212iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T,
ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> &
_y_image, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,xf::Mat<SRC_T,
ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 110: nv212iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization
The following table summarizes the resource utilization of NV21 to IYUV for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.
Table 111: nv212iyuv Function Resource Utilization Summary
Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 1 | 1377 | 730 | 219
8 pixel | 150 | 0 | 1 | 1975 | 1012 | 279
Performance Estimate
The following table summarizes the performance of NV21 to IYUV for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 112: nv212iyuv Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 6.9
8 pixel operation (150 MHz) | 1.7
NV21 to RGBA
The nv212rgba function converts an NV21 image to a 4-channel RGBA image. The inputs
to the function are separate Y and VU planes. NV21 holds subsampled data: the Y plane is
sampled at unit rate, and there is one U and one V value for every 2x2 block of Y values. To
generate the RGBA data, each U and V value is duplicated 2x2 times.
API Syntax
template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv212rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T,
ROWS/2, COLS/2, NPC> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 113: nv212rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_dst0 Output RGBA image of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of NV21 to RGBA for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.
Table 114: nv212rgba Function Resource Utilization Summary
Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 2 | 5 | 1170 | 673 | 183
Performance Estimate
The following table summarizes the performance of NV21 to RGBA for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 115: nv212rgba Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 6.9
NV21 to YUV4
The nv212yuv4 function converts an image in the NV21 format to the YUV444 format. The
function outputs separate U and V planes. The Y plane is the same for both formats. The UV
planes are duplicated 2x2 times to form the full-resolution U plane and V plane of the YUV444
format.
API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>
void nv212yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T,
ROWS/2, COLS/2, NPC_UV> & src_uv, xf::Mat<SRC_T, ROWS, COLS, NPC> &
_y_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image, xf::Mat<SRC_T, ROWS,
COLS, NPC> & _v_image)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 116: nv212yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image (must be a multiple of 8).
COLS Maximum width of input and output image (must be a multiple of 8).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS, COLS).
_v_image Output V plane of size (ROWS, COLS).
Resource Utilization
The following table summarizes the resource utilization of NV21 to YUV4 for different
configurations, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.
Table 117: nv212yuv4 Function Resource Utilization Summary
Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 300 | 0 | 0 | 1383 | 817 | 233
8 pixel | 150 | 0 | 0 | 1887 | 1087 | 287
Performance Estimate
The following table summarizes the performance of NV21 to YUV4 for different configurations,
as generated using the Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1,
to process a grayscale HD (1080x1920) image.
Table 118: nv212yuv4 Function Performance Estimate Summary
Operating Mode | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 13.8
8 pixel operation (150 MHz) | 3.5
Color Thresholding
The colorthresholding function compares the color space values of the source image with
low and high threshold values, and returns either 255 or 0 as the output.
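A per-pixel software model may help clarify how the threshold arrays are used. The sketch below rests on several assumptions that this guide does not state: each color occupies three consecutive entries in low_thresh and high_thresh (one per color component), the comparison is inclusive, and a pixel is set to 255 when it falls inside the range of any one color. The function name thresholdPixel is hypothetical.

```cpp
#include <array>
#include <cstdint>

// Returns 255 if the three components of px lie inside at least one
// [low, high] range, otherwise 0.
// Assumptions: thresholds are laid out as low[3*k + c] for color k and
// component c, and both bounds are inclusive.
uint8_t thresholdPixel(const std::array<uint8_t, 3>& px,
                       const uint8_t* low, const uint8_t* high, int maxColors) {
    for (int k = 0; k < maxColors; ++k) {
        bool inside = true;
        for (int c = 0; c < 3; ++c) {
            inside = inside && px[c] >= low[3 * k + c] && px[c] <= high[3 * k + c];
        }
        if (inside) {
            return 255;
        }
    }
    return 0;
}
```

With MAXCOLORS set to 2, the arrays would hold six entries each, three per color range.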
API Syntax
template<int SRC_T,int DST_T,int MAXCOLORS, int ROWS, int COLS,int NPC>
void colorthresholding(xf::Mat<SRC_T, ROWS, COLS, NPC> &
_src_mat,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat,unsigned char
low_thresh[MAXCOLORS*3], unsigned char high_thresh[MAXCOLORS*3])
Parameter Descriptions
The table below describes the template and the function parameters.
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)
DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
MAXCOLORS Maximum number of color values
ROWS Maximum height of input and output image
COLS Maximum width of input and output image
NPC Number of pixels to be processed per cycle. Only XF_NPPC1 supported.
_src_mat Input image
_dst_mat Thresholded image
low_thresh Lowest threshold values for the colors
high_thresh Highest threshold values for the colors
Custom Convolution
The filter2D function performs convolution over an image using a user-defined kernel.
Convolution is a mathematical operation on two functions f and g that produces a third function.
The third function is typically viewed as a modified version of one of the original functions,
giving the area of overlap between the two functions as a function of the amount by which one
of the original functions is translated.
The filter can be a unity-gain filter or a non-unity-gain filter. The filter must be of type
XF_16SP. If the coefficients are floating point, they must be converted into the Qm.n format and
provided as the input, and the shift parameter must be set to the 'n' value. Otherwise, if the
coefficients are not floating point, the filter is provided directly and the shift parameter is
set to zero.
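The conversion of floating-point coefficients into a 16-bit fixed-point format with n fractional bits can be sketched as follows. This is a host-side illustration, not library code; round-to-nearest and saturation to the 16-bit range are assumptions, and toFixedPoint is a hypothetical helper name.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Scales each coefficient by 2^n and stores it as a signed 16-bit value.
// Assumptions: round to nearest, saturate on overflow.
std::vector<int16_t> toFixedPoint(const std::vector<float>& coeffs, unsigned n) {
    std::vector<int16_t> out;
    out.reserve(coeffs.size());
    const float scale = static_cast<float>(1u << n);
    for (float c : coeffs) {
        long q = std::lround(c * scale);
        if (q > INT16_MAX) q = INT16_MAX;
        if (q < INT16_MIN) q = INT16_MIN;
        out.push_back(static_cast<int16_t>(q));
    }
    return out;
}
```

With n = 8, a coefficient of 0.25 becomes 64, and the same n would then be passed as the shift parameter so that the accumulated result is scaled back.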
API Syntax
template<int BORDER_TYPE,int FILTER_WIDTH,int FILTER_HEIGHT, int SRC_T,int
DST_T, int ROWS, int COLS,int NPC=1>
void filter2D(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<DST_T,
ROWS, COLS, NPC> & _dst_mat,short int
filter[FILTER_HEIGHT*FILTER_WIDTH],unsigned char _shift)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 119: filter2D Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT
FILTER_HEIGHT Number of rows in the input filter
FILTER_WIDTH Number of columns in the input filter
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. 8-bit unsigned single channel (XF_8UC1) and 16-bit signed single
channel (XF_16SC1) are supported.
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and
XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
filter The input filter. Any size is allowed, provided that both dimensions are odd. The filter
coefficients are either 16-bit values or 16-bit fixed-point equivalent values.
_shift The filter must be of type XF_16SP. If the coefficients are floating point, they must be
converted into the Qm.n format and provided as the input, and the shift parameter must
be set to the 'n' value. Otherwise, if the coefficients are not floating point, the filter is
provided directly and the shift parameter is set to zero.
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 120: filter2D Function Resource Utilization Summary
Operating Mode | Filter Size | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 3x3 | 300 | 3 | 9 | 1701 | 1161 | 269
1 pixel | 5x5 | 300 | 5 | 25 | 3115 | 2144 | 524
8 pixel | 3x3 | 150 | 6 | 72 | 2783 | 2768 | 638
8 pixel | 5x5 | 150 | 10 | 216 | 3020 | 4443 | 1007
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 121: filter2D Function Performance Estimate Summary
Operating Mode | Operating Frequency (MHz) | Filter Size | Max Latency Estimate (ms)
1 pixel | 300 | 3x3 | 7
1 pixel | 300 | 5x5 | 7.1
8 pixel | 150 | 3x3 | 1.86
8 pixel | 150 | 5x5 | 1.86
Delay
In image processing pipelines, the inputs to a function with FIFO interfaces might not be
synchronized. That is, the first data packet of the first input might arrive a finite number of
clock cycles after the first data packet of the second input. If the FIFOs at the function's
interface have insufficient depth, the whole design stalls on hardware. To synchronize the
inputs, this function delays the input packet that arrives early by a finite number of clock
cycles.
API Syntax
template<int MAXDELAY, int SRC_T, int ROWS, int COLS,int NPC=1 >
void delayMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions
The table below describes the template and the function parameters.
Parameter Description
SRC_T Input and output pixel type
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
MAXDELAY Maximum delay that the function is to be instantiated for.
_src Input image
_dst Output image
Dilate
During a dilation operation, the current pixel intensity is replaced by the maximum value of the
intensity in a 3x3 neighborhood of the current pixel.
\[
\mathrm{dst}(x,y) = \max_{\substack{x-1 \le x' \le x+1 \\ y-1 \le y' \le y+1}} \mathrm{src}(x', y')
\]
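A plain software reference of the 3x3 maximum can be useful when checking hardware results. The sketch below is not the library kernel; it assumes a row-major 8-bit image and treats pixels outside the image as a constant 0, which is one possible reading of the constant border mode. The name dilate3x3 is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Replaces each pixel with the maximum of its 3x3 neighborhood.
// Assumption: out-of-image neighbors contribute a constant value of 0.
std::vector<uint8_t> dilate3x3(const std::vector<uint8_t>& src, int rows, int cols) {
    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            uint8_t best = 0;  // assumed border value
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    if (yy < 0 || yy >= rows || xx < 0 || xx >= cols) {
                        continue;  // border pixels keep the constant value
                    }
                    best = std::max(best, src[yy * cols + xx]);
                }
            }
            dst[y * cols + x] = best;
        }
    }
    return dst;
}
```

Replacing std::max with std::min, and choosing a suitable border value, would give the corresponding model for erosion.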
API Syntax
template<int BORDER_TYPE, int SRC_T, int ROWS, int COLS,int NPC=1>
void dilate(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat, xf::Mat<SRC_T,
ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 122: dilate Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for
1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
Resource Utilization
The following table summarizes the resource utilization of the Dilation function for 1 pixel
operation and 8 pixel operation, generated using Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA.
Table 123: dilate Function Resource Utilization Summary
Name | 1 pixel per clock operation (300 MHz) | 8 pixel per clock operation (150 MHz)
BRAM_18K | 3 | 6
DSP48E | 0 | 0
FF | 339 | 644
LUT | 350 | 1325
CLB | 81 | 245
Performance Estimate
The following table summarizes a performance estimate of the Dilation function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 tool for Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.
Table 124: dilate Function Performance Estimate Summary
Operating Mode | Min Latency Estimate (ms) | Max Latency Estimate (ms)
1 pixel (300 MHz) | 7.0 | 7.0
8 pixel (150 MHz) | 1.87 | 1.87
Duplicate
When various functions in a pipeline are implemented in programmable logic, FIFOs are
instantiated between two functions for dataflow processing. When the output from one function
is consumed by two functions in a pipeline, the FIFOs need to be duplicated. This function
facilitates the duplication process of the FIFOs.
API Syntax
template<int SRC_T, int ROWS, int COLS,int NPC=1>
void duplicateMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst1,xf::Mat<SRC_T, ROWS, COLS, NPC> &
_dst2)
Parameter Descriptions
The table below describes the template and the function parameters.
Parameter Description
SRC_T Input and output pixel type
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.
_src Input image
_dst1 Duplicate output for _src
_dst2 Duplicate output for _src
Erode
The erode function finds the minimum pixel intensity in the 3x3 neighborhood of a pixel and
replaces the pixel intensity with the minimum value.
\[
\mathrm{dst}(x,y) = \min_{\substack{x-1 \le x' \le x+1 \\ y-1 \le y' \le y+1}} \mathrm{src}(x', y')
\]
API Syntax
template<int BORDER_TYPE, int SRC_T, int ROWS, int COLS,int NPC=1>
void erode(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat, xf::Mat<SRC_T,
ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 125: erode Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8
for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
Resource Utilization
The following table summarizes the resource utilization of the Erosion function generated using
Vivado HLS 2018.2 version tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.
Table 126: erode Function Resource Utilization Summary
Name | 1 pixel per clock operation (300 MHz) | 8 pixel per clock operation (150 MHz)
BRAM_18K | 3 | 6
DSP48E | 0 | 0
FF | 342 | 653
LUT | 351 | 1316
CLB | 79 | 230
Performance Estimate
The following table summarizes a performance estimate of the Erosion function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS
2018.2 tool for Xilinx Xczu9eg-vb1156-1-i-es1 FPGA.
Table 127: erode Function Performance Estimate Summary
Operating Mode | Min Latency Estimate (ms) | Max Latency Estimate (ms)
1 pixel (300 MHz) | 7.0 | 7.0
8 pixel (150 MHz) | 1.85 | 1.85
FAST Corner Detection
Features from accelerated segment test (FAST) is a corner detection algorithm that is faster than
most other feature detectors.
The fast function picks up a pixel in the image and compares the intensity of 16 pixels in its
neighborhood on a circle, called the Bresenham's circle. If the intensity of 9 contiguous pixels is
found to be either more than or less than that of the candidate pixel by a given threshold, then
the pixel is declared as a corner. Once the corners are detected, non-maximal suppression is
applied to remove the weaker corners.
This function can be used for both still images and videos. The corners are marked in the image.
If a corner is found at a particular location, that location is marked with 255; otherwise it is
zero.
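The segment test itself can be sketched as a check for a run of 9 contiguous circle pixels that are all brighter or all darker than the candidate. The example below is an illustrative model, not the library kernel; it assumes the 16 circle intensities have already been gathered in circular order, and isFastCorner is a hypothetical name. Non-maximal suppression is not shown.

```cpp
#include <array>
#include <cstdint>

// ring: intensities of the 16 circle pixels, listed in circular order.
// Returns true when 9 contiguous pixels are all brighter than
// center + threshold or all darker than center - threshold.
bool isFastCorner(const std::array<uint8_t, 16>& ring, uint8_t center, uint8_t threshold) {
    int brightRun = 0;
    int darkRun = 0;
    // Walk the ring twice so that runs wrapping from index 15 to 0 are counted.
    for (int i = 0; i < 32; ++i) {
        const int p = ring[i % 16];
        brightRun = (p > center + threshold) ? brightRun + 1 : 0;
        darkRun = (p < center - threshold) ? darkRun + 1 : 0;
        if (brightRun >= 9 || darkRun >= 9) {
            return true;
        }
    }
    return false;
}
```

A pixel for which this returns true would be marked with 255 in the output image, and all others with zero.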
API Syntax
template<int NMS,int SRC_T,int ROWS, int COLS,int NPC=1>
void fast(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<SRC_T, ROWS,
COLS, NPC> & _dst_mat,unsigned char _threshold)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 128: fast Function Parameter Descriptions
Parameter Description
NMS If NMS == 1, non-maximum suppression is applied to detected corners (keypoints). The value
should be 0 or 1.
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)
ROWS Maximum height of input image (must be a multiple of 8)
COLS Maximum width of input image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image. The corners are marked in the image.
_threshold Threshold on the intensity difference between the center pixel and its neighbors. Usually it is taken
around 20.
Resource Utilization
The following table summarizes the resource utilization of the kernel for different configurations,
generated using Vivado HLS 2018.2 for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image with NMS.
Table 129: fast Function Resource Utilization Summary
Name | 1 pixel (300 MHz) | 8 pixel (150 MHz)
BRAM_18K | 10 | 20
DSP48E | 0 | 0
FF | 2695 | 7310
LUT | 3792 | 20956
CLB | 769 | 3519
Performance Estimate
The following table summarizes the performance of the kernel for different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a
grayscale HD (1080x1920) image with non-maximum suppression (NMS).
Table 130: fast Function Performance Estimate Summary
Operating Mode | Operating Frequency (MHz) | Filter Size | Max Latency Estimate (ms)
1 pixel | 300 | 3x3 | 7
8 pixel | 150 | 3x3 | 1.86
Gaussian Filter
The GaussianBlur function applies Gaussian blur on the input image. Gaussian filtering is done
by convolving each point in the input image with a Gaussian kernel.
\[
G_0(x,y) = e^{\frac{-(x-\mu_x)^2}{2\sigma_x^2} + \frac{-(y-\mu_y)^2}{2\sigma_y^2}}
\]
Where \mu_x, \mu_y are the mean values and \sigma_x, \sigma_y are the standard deviations in
the x and y directions respectively. In the GaussianBlur function, the values of \mu_x, \mu_y
are taken as zero and the values of \sigma_x, \sigma_y are equal.
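With zero means and equal sigmas, the kernel weights depend only on the distance from the center. The sketch below builds such a kernel in floating point; it is an illustration under stated assumptions rather than the library's internal method. In particular, normalizing the weights so that they sum to 1 is an assumption, and gaussianKernel is a hypothetical name.

```cpp
#include <cmath>
#include <vector>

// Builds a size x size Gaussian kernel with zero mean and a single sigma.
// Assumptions: size is odd, and weights are normalized to sum to 1.
std::vector<float> gaussianKernel(int size, float sigma) {
    std::vector<float> k(static_cast<std::size_t>(size) * size);
    const int half = size / 2;
    float sum = 0.0f;
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            const float w = std::exp(-(x * x + y * y) / (2.0f * sigma * sigma));
            k[(y + half) * size + (x + half)] = w;
            sum += w;
        }
    }
    for (float& w : k) {
        w /= sum;  // normalize so the filter has unity gain
    }
    return k;
}
```

For a 3x3 kernel the center weight is the largest, the four edge weights are equal, and the four corner weights are equal.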
API Syntax
template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS, int COLS,
int NPC = 1>
void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & src, xf::Mat<SRC_T,
ROWS, COLS, NPC> & dst, float sigma)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 131: GaussianBlur Function Parameter Descriptions
Parameter Description
FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are
supported.
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for
1 pixel and 8 pixel operations respectively.
src Input image
dst Output image
sigma Standard deviation of Gaussian filter
Resource Utilization
The following table summarizes the resource utilization of the Gaussian Filter in different
configurations, generated using Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
Table 132: GaussianBlur Function Resource Utilization Summary
Operating Mode | Filter Size | Operating Frequency (MHz) | BRAM_18K | DSP_48Es | FF | LUT | CLB
1 pixel | 3x3 | 300 | 3 | 17 | 3641 | 2791 | 610
1 pixel | 5x5 | 300 | 5 | 27 | 4461 | 3544 | 764
1 pixel | 7x7 | 250 | 7 | 35 | 4770 | 4201 | 894
8 pixel | 3x3 | 150 | 6 | 52 | 3939 | 3784 | 814
8 pixel | 5x5 | 150 | 10 | 111 | 5688 | 5639 | 1133
8 pixel | 7x7 | 150 | 14 | 175 | 7594 | 7278 | 1518
Performance Estimate
The following table summarizes a performance estimate of the Gaussian Filter in different
configurations, as generated using Vivado HLS 2018.2 tool for Xilinx Xczu9eg-vb1156-1-i-es1
FPGA, to process a grayscale HD (1080x1920) image.
Table 133: GaussianBlur Function Performance Estimate Summary
Operating Mode | Filter Size | Max Latency Estimate (ms)
1 pixel operation (300 MHz) | 3x3 | 7.01
1 pixel operation (300 MHz) | 5x5 | 7.03
1 pixel operation (300 MHz) | 7x7 | 7.06
8 pixel operation (150 MHz) | 3x3 | 1.6
8 pixel operation (150 MHz) | 5x5 | 1.7
8 pixel operation (150 MHz) | 7x7 | 1.74
Gradient Magnitude
The magnitude function computes the magnitude for the images. The input images are
x-gradient and y-gradient images of type 16S. The output image is of the same type as the input
image.
For L1NORM normalization, the magnitude image is the pixel-wise sum of the absolute values
of the x-gradient and the y-gradient:
\[
g = |g_x| + |g_y|
\]
For L2NORM normalization, the magnitude image is computed as follows:
\[
g = \sqrt{g_x^2 + g_y^2}
\]
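Both norms can be modeled per pixel in a few lines. The sketch below is a software illustration only; saturating the result to the signed 16-bit range and truncating the square root are assumptions about behavior this guide does not specify, and the function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Clamps a 32-bit value to the signed 16-bit range (assumed overflow policy).
int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// L1 norm: |gx| + |gy|
int16_t magnitudeL1(int16_t gx, int16_t gy) {
    const int32_t sum = std::abs(static_cast<int32_t>(gx)) + std::abs(static_cast<int32_t>(gy));
    return saturate16(sum);
}

// L2 norm: sqrt(gx^2 + gy^2), truncated toward zero (assumed rounding).
int16_t magnitudeL2(int16_t gx, int16_t gy) {
    const double sq = static_cast<double>(gx) * gx + static_cast<double>(gy) * gy;
    return saturate16(static_cast<int32_t>(std::sqrt(sq)));
}
```

For gradients (3, -4), the L1 result is 7 and the L2 result is 5.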
API Syntax
template< int NORM_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1>
void magnitude(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,xf::Mat<DST_T,
ROWS, COLS, NPC> & _src_maty,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 134: magnitude Function Parameter Descriptions
Parameter Description
NORM_TYPE Normalization type can be either L1 or L2 norm. Values are XF_L1NORM or XF_L2NORM
SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
DST_T Output pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for 1 pixel
and 8 pixel operations respectively.
_src_matx First input, x-gradient image.
_src_maty Second input, y-gradient image.
_dst_mat Output, magnitude computed image.
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image and for L2 normalization.
Table 135: magnitude Function Resource Utilization Summary
Name | 1 pixel (300 MHz) | 8 pixel (150 MHz)
BRAM_18K | 0 | 0
DSP48E | 2 | 16
FF | 707 | 2002
LUT | 774 | 3666
CLB | 172 | 737
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image with L2 normalization.
Table 136: magnitude Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate Max (ms)
1 pixel 300 7.2
8 pixel 150 1.7
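The latency figures can be sanity-checked with simple arithmetic. The sketch below assumes an idealised streaming kernel that consumes NPC pixels on every clock and ignores pipeline fill and interface stalls; the function name is hypothetical.

```cpp
#include <cstdint>

// Idealised streaming latency in milliseconds: one group of `npc` pixels per
// clock cycle. Pipeline fill, line-buffer priming and stalls are ignored
// (assumption), so measured or estimated values can differ slightly.
inline double ideal_latency_ms(std::uint32_t rows, std::uint32_t cols,
                               std::uint32_t npc, double freq_mhz) {
    const double cycles = static_cast<double>(rows) * cols / npc;
    return cycles / (freq_mhz * 1.0e3);  // MHz * 1e3 = cycles per millisecond
}
```

For a 1080x1920 image this gives about 6.9 ms at 300 MHz with 1 pixel per clock and about 1.7 ms at 150 MHz with 8 pixels per clock, which is close to the tabulated estimates.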
Gradient Phase
The phase function computes the polar angles of two images. The input images are x-gradient
and y-gradient images of type 16S. The output image is of the same type as the input image.
For radians:
$$\text{angle}(x,y) = \operatorname{atan2}(g_y, g_x)$$
For degrees:
$$\text{angle}(x,y) = \operatorname{atan2}(g_y, g_x) \cdot \frac{180}{\pi}$$
API Syntax
template<int RET_TYPE, int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void phase(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,
           xf::Mat<DST_T, ROWS, COLS, NPC> & _src_maty,
           xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 137: phase Function Parameter Descriptions
Parameter Description
RET_TYPE Output format can be either in radians or degrees. Options are XF_RADIANS or XF_DEGREES.
• If the XF_RADIANS option is selected, the phase API returns the result in Q4.12 format. The output
range is (0, 2 pi).
• If the XF_DEGREES option is selected, the phase API returns the result in degrees in Q10.6 format.
The output range is (0, 360).
SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
DST_T Output pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src_matx First input, x-gradient image.
_src_maty Second input, y-gradient image.
_dst_mat Output, phase computed image.
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 138: phase Function Resource Utilization Summary
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 6 24
DSP48E 6 19
FF 873 2396
LUT 753 3895
CLB 185 832
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 139: phase Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate (ms)
1 pixel 300 7.2
8 pixel 150 1.7
Deviation from OpenCV
In the phase implementation, the output is returned in a fixed-point format. If the XF_RADIANS option is
selected, the phase API returns the result in Q4.12 format, and the output range is (0, 2 pi). If the
XF_DEGREES option is selected, the phase API returns the result in degrees in Q10.6 format, and the
output range is (0, 360).
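On the host, the fixed-point results have to be decoded before they can be compared with OpenCV's floating-point output. The following sketch shows one way to encode and decode the two formats; the helper names are hypothetical, and round-to-nearest is an assumption rather than documented kernel behaviour.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical reference helpers; the names are not part of xfOpenCV.

// Angle of (gx, gy) in radians, wrapped to [0, 2*pi).
inline double phase_radians(std::int16_t gx, std::int16_t gy) {
    const double two_pi = 2.0 * std::acos(-1.0);
    double a = std::atan2(static_cast<double>(gy), static_cast<double>(gx));
    if (a < 0.0) a += two_pi;
    return a;
}

// Same angle expressed in degrees, in [0, 360).
inline double phase_degrees(std::int16_t gx, std::int16_t gy) {
    return phase_radians(gx, gy) * 180.0 / std::acos(-1.0);
}

// Q4.12: 12 fractional bits, so one LSB is 1/4096 radian.
inline std::int16_t to_q4_12(double radians) {
    return static_cast<std::int16_t>(std::lround(radians * 4096.0));
}
inline double from_q4_12(std::int16_t v) { return v / 4096.0; }

// Q10.6: 6 fractional bits, so one LSB is 1/64 degree.
inline std::int16_t to_q10_6(double degrees) {
    return static_cast<std::int16_t>(std::lround(degrees * 64.0));
}
inline double from_q10_6(std::int16_t v) { return v / 64.0; }
```

Both full-scale values fit in 16 bits: 2 pi in Q4.12 is about 25736, and 360 degrees in Q10.6 is 23040.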
Harris Corner Detection
To understand Harris Corner Detection, consider a grayscale image I. Sweep a
window w(x,y) over the image (with displacements u in the x-direction and v in the y-direction) and
calculate the variation of intensity:
$$E(u,v) = \sum w(x,y)\,[I(x+u,y+v) - I(x,y)]^2$$
Where:
• w(x,y) is the window position at (x,y)
• I(x,y) is the intensity at (x,y)
• I(x+u,y+v) is the intensity at the moved window (x+u,y+v).
Since we are looking for windows with corners, we are looking for windows with a large variation
in intensity. Hence, we have to maximize the equation above, specifically the term:
$$[I(x+u,y+v) - I(x,y)]^2$$
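Using Taylor expansion:
$$E(u,v) = \sum [I(x,y) + u I_x + v I_y - I(x,y)]^2$$
Expanding the equation and cancelling I(x,y) with -I(x,y):
$$E(u,v) = \sum \left( u^2 I_x^2 + 2uv\, I_x I_y + v^2 I_y^2 \right)$$
The above equation can be expressed in matrix form as:
$$E(u,v) = \begin{bmatrix} u & v \end{bmatrix} \left( \sum w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right) \begin{bmatrix} u \\ v \end{bmatrix}$$
So, our equation is now:
$$E(u,v) = \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$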
A score is calculated for each window, to determine if it can possibly contain a corner:
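$$R = \det(M) - k\,(\operatorname{trace}(M))^2$$
Where λ1 and λ2 are the eigenvalues of M, and:
• $\det(M) = \lambda_1 \lambda_2$
• $\operatorname{trace}(M) = \lambda_1 + \lambda_2$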
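The score can be computed directly from the entries of M without finding the eigenvalues, because det(M) and trace(M) follow from the matrix entries. The sketch below is a floating-point illustration with a hypothetical function name; the kernel itself works in fixed point.

```cpp
// Harris response for the symmetric 2x2 matrix M = [[sxx, sxy], [sxy, syy]],
// where sxx, sxy and syy are the windowed sums of Ix*Ix, Ix*Iy and Iy*Iy.
// Floating-point illustration only; the hardware kernel uses fixed point.
inline double harris_response(double sxx, double sxy, double syy, double k) {
    const double det = sxx * syy - sxy * sxy;  // equals lambda1 * lambda2
    const double trace = sxx + syy;            // equals lambda1 + lambda2
    return det - k * trace * trace;
}
```

A window with two strong eigenvalues (for example sxx = syy = 10, sxy = 0, k = 0.04) scores 84, whereas an edge-like window with a single strong eigenvalue (sxx = 10, syy = 0) scores -4.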
Non-Maximum Suppression:
In non-maximum suppression (NMS), if radius = 1, then the side of the bounding box is 2*r+1 = 3.
In this case, consider a 3x3 neighborhood around the center pixel. If the center pixel is greater
than all the surrounding pixels, then it is considered a corner. The comparison is made with the
surrounding pixels that are within the radius.
Radius = 1
x-1, y-1 x-1, y x-1, y+1
x, y-1 x, y x, y+1
x+1, y-1 x+1, y x+1, y+1
Threshold:
Thresholds of 442, 3109, and 566 are used for the 3x3, 5x5, and 7x7 filters respectively. These thresholds
were verified over 40 sets of images. The threshold can be varied, based on the application. The
corners are marked in the output image. If a corner is found at a particular location, that
location is marked with 255; otherwise it is zero.
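The thresholding and non-maximum suppression steps described above can be sketched as a per-pixel decision. This is an illustrative host-side model with hypothetical names; the strict comparisons and the skipping of out-of-image neighbors are assumptions, since the guide does not specify tie or border handling.

```cpp
#include <vector>

using Response = std::vector<std::vector<int>>;

// True when resp[r][c] is strictly greater than every other response inside
// the (2*radius+1) x (2*radius+1) window. Neighbors outside the image are
// skipped (assumption).
inline bool is_local_max(const Response& resp, int r, int c, int radius) {
    const int rows = static_cast<int>(resp.size());
    const int cols = rows ? static_cast<int>(resp[0].size()) : 0;
    for (int dr = -radius; dr <= radius; ++dr) {
        for (int dc = -radius; dc <= radius; ++dc) {
            if (dr == 0 && dc == 0) continue;
            const int rr = r + dr;
            const int cc = c + dc;
            if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
            if (resp[rr][cc] >= resp[r][c]) return false;
        }
    }
    return true;
}

// Output pixel value: 255 for a corner, 0 otherwise.
inline unsigned char mark_corner(const Response& resp, int r, int c,
                                 int radius, int threshold) {
    const bool corner = resp[r][c] > threshold && is_local_max(resp, r, c, radius);
    return corner ? 255 : 0;
}
```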
API Syntax
template<int FILTERSIZE, int BLOCKWIDTH, int NMSRADIUS, int SRC_T, int ROWS,
         int COLS, int NPC=1>
void cornerHarris(xf::Mat<SRC_T, ROWS, COLS, NPC> & src,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & dst,
                  uint16_t threshold, uint16_t k)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 140: cornerHarris Function Parameter Descriptions
Parameter Description
FILTERSIZE Size of the Sobel filter. 3, 5, and 7 supported.
BLOCKWIDTH Size of the box filter. 3, 5, and 7 supported.
NMSRADIUS Radius considered for non-maximum suppression. Values supported are 1 and 2.
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)
ROWS Maximum height of input image (must be a multiple of 8)
COLS Maximum width of input image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
src Input image
dst Output image.
threshold Threshold applied to the corner measure.
k Harris detector parameter
Resource Utilization
The following tables summarize the resource utilization of the Harris corner detection in different
configurations, generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1
FPGA, to process a grayscale HD (1080x1920) image.
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 3 and
NMS_RADIUS = 1.
Table 141: Resource Utilization Summary - Sobel Filter = 3, Box filter = 3 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 33 66
DSP48E 10 80
FF 3254 9330
LUT 3522 13222
CLB 731 2568
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 5 and
NMS_RADIUS = 1.
Table 142: Resource Utilization Summary - Sobel Filter = 3, Box filter = 5 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 45 90
DSP48E 10 80
FF 5455 12459
LUT 5675 24594
CLB 1132 4498
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 7 and
NMS_RADIUS = 1.
Table 143: Resource Utilization Summary - Sobel Filter = 3, Box filter = 7 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 57 114
DSP48E 10 80
FF 8783 16593
LUT 9157 39813
CLB 1757 6809
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 3 and
NMS_RADIUS = 1.
Table 144: Resource Utilization Summary - Sobel Filter = 5, Box filter = 3 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (200 MHz)
BRAM_18K 35 70
DSP48E 10 80
FF 4656 11659
LUT 4681 17394
CLB 1005 3277
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 5 and
NMS_RADIUS = 1.
Table 145: Resource Utilization Summary - Sobel Filter = 5, Box filter = 5 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 47 94
DSP48E 10 80
FF 6019 14776
LUT 6337 28795
CLB 1353 5102
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 7 and
NMS_RADIUS = 1.
Table 146: Resource Utilization Summary - Sobel Filter = 5, Box filter = 7 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 59 118
DSP48E 10 80
FF 9388 18913
LUT 9414 43070
CLB 1947 7508
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 3 and
NMS_RADIUS = 1.
Table 147: Resource Utilization Summary - Sobel Filter = 7, Box filter = 3 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 37 74
DSP48E 11 88
FF 6002 13880
LUT 6337 25573
CLB 1327 4868
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 5 and
NMS_RADIUS = 1.
Table 148: Resource Utilization Summary - Sobel Filter = 7, Box filter = 5 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 49 98
DSP48E 11 88
FF 7410 17049
LUT 8076 36509
CLB 1627 6518
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 7 and
NMS_RADIUS = 1.
Table 149: Resource Utilization Summary - Sobel Filter = 7, Box filter = 7 and NMS_RADIUS = 1
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 61 122
DSP48E 11 88
FF 10714 21137
LUT 11500 51331
CLB 2261 8863
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 3 and
NMS_RADIUS = 2.
Table 150: Resource Utilization Summary - Sobel Filter = 3, Box filter = 3 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 41 82
DSP48E 10 80
FF 5519 10714
LUT 5094 16930
CLB 1076 3127
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 5 and NMS_RADIUS = 2.
Table 151: Resource Utilization Summary - Sobel Filter = 3, Box filter = 5 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 53 106
DSP48E 10 80
FF 6798 13844
LUT 6866 28286
CLB 1383 4965
The following table summarizes the resource utilization for Sobel Filter = 3, Box filter = 7 and
NMS_RADIUS = 2.
Table 152: Resource Utilization Summary - Sobel Filter = 3, Box filter = 7 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 65 130
DSP48E 10 80
FF 10137 17977
LUT 10366 43589
CLB 1940 7440
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 3 and
NMS_RADIUS = 2.
Table 153: Resource Utilization Summary - Sobel Filter = 5, Box filter = 3 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 43 86
DSP48E 10 80
FF 5957 12930
LUT 5987 21187
CLB 1244 3922
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 5 and
NMS_RADIUS = 2.
Table 154: Resource Utilization Summary - Sobel Filter = 5, Box filter = 5 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 55 110
DSP48E 10 80
FF 5442 16053
LUT 6561 32377
CLB 1374 5871
The following table summarizes the resource utilization for Sobel Filter = 5, Box filter = 7 and
NMS_RADIUS = 2.
Table 155: Resource Utilization Summary - Sobel Filter = 5, Box filter = 7 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 67 134
DSP48E 10 80
FF 10673 20190
LUT 10793 46785
CLB 2260 8013
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 3 and
NMS_RADIUS = 2.
Table 156: Resource Utilization Summary - Sobel Filter = 7, Box filter = 3 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 45 90
DSP48E 11 88
FF 7341 15161
LUT 7631 29185
CLB 1557 5425
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 5 and
NMS_RADIUS = 2.
Table 157: Resource Utilization Summary - Sobel Filter = 7, Box filter = 5 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 57 114
DSP48E 11 88
FF 8763 18330
LUT 9368 40116
CLB 1857 7362
The following table summarizes the resource utilization for Sobel Filter = 7, Box filter = 7 and
NMS_RADIUS = 2.
Table 158: Resource Utilization Summary - Sobel Filter = 7, Box filter = 7 and NMS_RADIUS = 2
Name 1 pixel (300 MHz) 8 pixel (150 MHz)
BRAM_18K 69 138
DSP48E 11 88
FF 12078 22414
LUT 12831 54652
CLB 2499 9628
Performance Estimate
The following table summarizes a performance estimate of the Harris corner detection in
different configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
Table 159: cornerHarris Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Sobel Box NMS Radius Latency Estimate (ms)
1 pixel 300 MHz 3 3 1 7
1 pixel 300 MHz 3 5 1 7.1
1 pixel 300 MHz 3 7 1 7.1
1 pixel 300 MHz 5 3 1 7.2
1 pixel 300 MHz 5 5 1 7.2
1 pixel 300 MHz 5 7 1 7.2
1 pixel 300 MHz 7 3 1 7.22
1 pixel 300 MHz 7 5 1 7.22
1 pixel 300 MHz 7 7 1 7.22
8 pixel 150 MHz 3 3 1 1.7
8 pixel 150 MHz 3 5 1 1.7
8 pixel 150 MHz 3 7 1 1.7
8 pixel 150 MHz 5 3 1 1.71
8 pixel 150 MHz 5 5 1 1.71
8 pixel 150 MHz 5 7 1 1.71
8 pixel 150 MHz 7 3 1 1.8
8 pixel 150 MHz 7 5 1 1.8
8 pixel 150 MHz 7 7 1 1.8
1 pixel 300 MHz 3 3 2 7.1
1 pixel 300 MHz 3 5 2 7.1
1 pixel 300 MHz 3 7 2 7.1
1 pixel 300 MHz 5 3 2 7.21
1 pixel 300 MHz 5 5 2 7.21
1 pixel 300 MHz 5 7 2 7.21
1 pixel 300 MHz 7 3 2 7.22
1 pixel 300 MHz 7 5 2 7.22
1 pixel 300 MHz 7 7 2 7.22
8 pixel 150 MHz 3 3 2 1.8
8 pixel 150 MHz 3 5 2 1.8
8 pixel 150 MHz 3 7 2 1.8
8 pixel 150 MHz 5 3 2 1.81
8 pixel 150 MHz 5 5 2 1.81
8 pixel 150 MHz 5 7 2 1.81
8 pixel 150 MHz 7 3 2 1.9
8 pixel 150 MHz 7 5 2 1.91
8 pixel 150 MHz 7 7 2 1.92
Deviation from OpenCV
In xfOpenCV, thresholding and NMS are included in the function, whereas in OpenCV they are not.
In xfOpenCV, all the blocks are implemented in fixed point, whereas in OpenCV all the blocks are
implemented in floating point.
Histogram Computation
The calcHist function computes the histogram of the given input image.
$$H[\text{src}(x,y)] = H[\text{src}(x,y)] + 1$$
Where H is the array of 256 elements.
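The update rule amounts to incrementing one bin per pixel. A minimal host-side sketch follows; the function name is hypothetical and the image is passed as a flat pixel buffer for simplicity.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// 256-bin histogram of an 8-bit, single-channel image stored as a flat buffer.
inline std::array<std::uint32_t, 256> histogram_8u(const std::vector<std::uint8_t>& pixels) {
    std::array<std::uint32_t, 256> h{};  // all bins start at zero
    for (std::uint8_t p : pixels) {
        ++h[p];  // H[src(x,y)] = H[src(x,y)] + 1
    }
    return h;
}
```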
API Syntax
template<int SRC_T,int ROWS, int COLS,int NPC=1>
void calcHist(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, uint32_t *histogram)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 160: calcHist Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle
_src Input image
histogram Output array of 256 elements
Resource Utilization
The following table summarizes the resource utilization of the calcHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS
2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for the 1 pixel mode
and at 150 MHz for the 8 pixel mode.
Table 161: calcHist Function Resource Utilization Summary
Name Normal Operation (1 pixel) Resource Optimized (8 pixel)
BRAM_18K 2 16
DSP48E 0 0
FF 196 274
LUT 240 912
CLB 57 231
Performance Estimate
The following table summarizes a performance estimate of the calcHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS
2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for the 1 pixel mode and
150 MHz for the 8 pixel mode.
Table 162: calcHist Function Performance Estimate Summary
Operating Mode Latency Estimate Max (ms)
1 pixel 6.9
8 pixel 1.7
Histogram Equalization
The equalizeHist function performs histogram equalization on an input image or video. It
improves the contrast in the image by stretching out the intensity range. This function maps one
distribution (histogram) to another distribution (a wider and more uniform distribution of
intensity values), so the intensities are spread over the whole range.
For histogram H[i], the cumulative distribution H'[i] is given as:
$$H'[i] = \sum_{0 \le j < i} H[j]$$
The intensities in the equalized image are computed as:
$$\text{dst}(x,y) = H'(\text{src}(x,y))$$
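The two formulas can be combined into a lookup table built from the cumulative histogram. The sketch below is a host-side illustration with a hypothetical function name. It assumes that the cumulative sum is scaled so that the total pixel count maps to 255, as in OpenCV's description of the algorithm; the guide does not state the scaling or rounding used by the kernel.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Histogram equalization of an 8-bit image stored as a flat buffer.
inline std::vector<std::uint8_t> equalize_8u(const std::vector<std::uint8_t>& src) {
    std::array<std::uint32_t, 256> hist{};
    for (std::uint8_t p : src) ++hist[p];

    // lut[i] follows H'[i] = sum over 0 <= j < i of H[j], scaled so that the
    // total pixel count maps to 255 (scaling and rounding are assumptions).
    std::array<std::uint8_t, 256> lut{};
    const std::uint64_t total = src.size();
    std::uint64_t cumulative = 0;
    for (int i = 0; i < 256; ++i) {
        lut[i] = total ? static_cast<std::uint8_t>((cumulative * 255 + total / 2) / total) : 0;
        cumulative += hist[i];
    }

    std::vector<std::uint8_t> dst;
    dst.reserve(src.size());
    for (std::uint8_t p : src) dst.push_back(lut[p]);  // dst(x,y) = H'(src(x,y))
    return dst;
}
```

With this scaling, the four-pixel input {10, 20, 30, 40} is stretched to {0, 64, 128, 191}.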
API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC = 1>
void equalizeHist(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & _src1,
                  xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 163: equalizeHist Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle
_src Input image
_src1 Input image
_dst Output image
Resource Utilization
The following table summarizes the resource utilization of the equalizeHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS
2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for the 1 pixel mode and
150 MHz for the 8 pixel mode.
Table 164: equalizeHist Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 4 5 3492 1807 666
8 pixel 150 25 5 3526 2645 835
Performance Estimate
The following table summarizes a performance estimate of the equalizeHist function for Normal
Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS
2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz for the 1 pixel mode and
150 MHz for the 8 pixel mode.
Table 165: equalizeHist Function Performance Estimate Summary
Operating Mode Latency Estimate Max (ms)
1 pixel per clock operation 13.8
8 pixel per clock operation 3.4
HOG
The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision for the
purpose of object detection. The feature descriptors produced by this approach are widely used
in pedestrian detection.
The technique counts the occurrences of gradient orientation in localized portions of an image.
HOG is computed over a dense grid of uniformly spaced cells and normalized over overlapping
blocks, for improved accuracy. The concept behind HOG is that the object appearance and shape
within an image can be described by the distribution of intensity gradients or edge directions.
Both RGB and gray inputs are accepted by the function. In the RGB mode, gradients are
computed for each plane separately, but the one with the higher magnitude is selected. With the
configurations provided, the window dimensions are 64x128 and the block dimensions are 16x16.
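The number of features in one detection window follows from the window, block and cell geometry. The sketch below uses the standard Dalal-Triggs arithmetic with a hypothetical function name, and assumes that blocks advance by one cell (8 pixels) in each direction. The guide does not state how this count relates to the DESC_SIZE template parameter, so treat the result as a per-window figure only.

```cpp
// Number of HOG features in one detection window (standard Dalal-Triggs
// arithmetic). `block_stride` is assumed to equal the cell size.
constexpr int hog_window_features(int win_w, int win_h, int block_w, int block_h,
                                  int cell_w, int cell_h, int block_stride, int bins) {
    return ((win_w - block_w) / block_stride + 1) *   // block positions across
           ((win_h - block_h) / block_stride + 1) *   // block positions down
           (block_w / cell_w) * (block_h / cell_h) *  // cells per block
           bins;                                      // histogram bins per cell
}
```

For a 64x128 window with 16x16 blocks, 8x8 cells and 9 bins, this gives 7 x 15 x 4 x 9 = 3780 features.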
API Syntax
template<int WIN_HEIGHT, int WIN_WIDTH, int WIN_STRIDE, int BLOCK_HEIGHT,
int BLOCK_WIDTH, int CELL_HEIGHT, int CELL_WIDTH, int NOB, int DESC_SIZE,
int IMG_COLOR, int OUTPUT_VARIANT, int SRC_T, int DST_T, int ROWS, int
COLS, int NPC = XF_NPPC1>
void HOGDescriptor(xf::Mat<SRC_T, ROWS, COLS, NPC> &_in_mat,
xf::Mat<DST_T, 1, DESC_SIZE, NPC> &_desc_mat);
Parameter Descriptions
The following table describes the template parameters.
Table 166: HOGDescriptor Template Parameter Descriptions
PARAMETERS DESCRIPTION
WIN_HEIGHT The number of pixel rows in the window. This must be a multiple of 8 and should not exceed the
number of image rows.
WIN_WIDTH The number of pixel cols in the window. This must be a multiple of 8 and should not exceed the
number of image columns.
WIN_STRIDE The pixel stride between two adjacent windows. It is fixed at 8.
BLOCK_HEIGHT Height of the block. It is fixed at 16.
BLOCK_WIDTH Width of the block. It is fixed at 16.
CELL_HEIGHT Number of rows in a cell. It is fixed at 8.
CELL_WIDTH Number of cols in a cell. It is fixed at 8.
NOB Number of histogram bins for a cell. It is fixed at 9
DESC_SIZE The size of the output descriptor.
IMG_COLOR The type of the image, set as either XF_GRAY or XF_RGB
OUTPUT_VARIANT Must be either XF_HOG_RB or XF_HOG_NRB
SRC_T Input pixel type. Must be either XF_8UC1 or XF_8UC4, for gray and color respectively.
DST_T Output descriptor type. Must be XF_32UC1.
ROWS Number of rows in the image being processed. (Should be a multiple of 8)
COLS Number of columns in the image being processed. (Should be a multiple of 8)
NPC Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.
The following table describes the function parameters.
Table 167: HOGDescriptor Function Parameter Descriptions
PARAMETERS DESCRIPTION
_in_mat Input image, of xf::Mat type
_desc_mat Output descriptors, of xf::Mat type
Where:
• NO is normal operation (single pixel processing)
• RB is repetitive blocks (descriptor data are written window-wise)
• NRB is non-repetitive blocks (descriptor data are written block-wise, in order to reduce the
number of writes).
Note: In the RB mode, the block data is written to the memory taking the overlapping windows into
consideration. In the NRB mode, the block data is written directly to the output stream without
consideration of the window overlap. The overlap must then be handled on the host side.
Resource Utilization
The following table shows the resource utilization of the HOGDescriptor function for normal
operation (1 pixel) mode, as generated using the Vivado HLS 2018.2 tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz, to process an image of 1920x1080 resolution.
Table 168: HOGDescriptor Function Resource Utilization Summary
Utilization at 300 MHz, 1 pixel operation:
Resource NRB-Gray NRB-RGB RB-Gray RB-RGB
BRAM_18K 43 49 171 177
DSP48E 34 46 36 48
FF 15365 15823 15205 15663
LUT 12868 13267 13443 13848
Performance Estimate
The following table shows the performance estimates of the HOGDescriptor() function for different
configurations, as generated using the Vivado HLS 2018.2 tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA, to process an image of 1920x1080 resolution.
Table 169: HOGDescriptor Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate Min (ms) Latency Estimate Max (ms)
NRB-Gray 300 6.98 8.83
NRB-RGBA 300 6.98 8.83
RB-Gray 300 176.81 177
RB-RGBA 300 176.81 177
Deviations from OpenCV
Listed below are the deviations from OpenCV:
1. Border care
The border care that OpenCV uses in the gradient computation is
BORDER_REFLECT_101, in which the border padding is a reflection of the neighboring
pixels. In the Xilinx implementation, BORDER_CONSTANT (zero padding) is
used for the border care.
2. Gaussian weighting
In OpenCV, Gaussian weights are multiplied with the pixels over the block; that is, a block has 256
pixels, and each position of the block is multiplied by its corresponding Gaussian weight.
In the HLS implementation, Gaussian weighting is not performed.
3. Cell-wise interpolation
In OpenCV, the magnitude values of the pixels are distributed across different cells in the blocks, but in
the corresponding bins.
Pixels in region 1 belong only to their corresponding cells, but the pixels in regions 2 and 3
are interpolated to the adjacent 2 cells and 4 cells respectively. This operation is not
performed in the HLS implementation.
4. Output handling
The output of OpenCV is in column-major form. In the HLS implementation, the
output is in row-major form. Also, the feature vector is in the fixed-point type
Q0.16 in the HLS implementation, while in OpenCV it is in floating point.
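To compare the HLS feature vector with OpenCV's floating-point descriptor, each Q0.16 value can be converted on the host. The helper below is a hypothetical sketch that takes one 16-bit feature value. How feature values are packed into the XF_32UC1 output words is not described here, so unpacking is deliberately left out.

```cpp
#include <cstdint>

// Q0.16: 16 fractional bits and no integer bits, so the value is raw / 65536
// and lies in [0, 1).
inline float q0_16_to_float(std::uint16_t raw) {
    return static_cast<float>(raw) / 65536.0f;
}
```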
Limitations
1. The configurations are limited to Dalal's implementation (N. Dalal and B. Triggs, 2005).
2. Image height and image width must be a multiple of cell height and cell width respectively.
Houghlines
The HoughLines function here is equivalent to HoughLines Standard in OpenCV. The
HoughLines function is used to detect straight lines in a binary image. To apply the Hough
transform, edge detection preprocessing is required. The input to the Hough transform is an
edge-detected binary image. For each point (xi,yi) in a binary image, we define a family of lines that go
through the point as:
rho= xi cos(theta) + yi sin(theta)
Footnote 1: N. Dalal, B. Triggs: Histograms of oriented gradients for human detection, IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, 2005.
Each pair of (rho,theta) represents a line that passes through the point (xi,yi). The (rho,theta)
pairs of this family of lines passing through the point form a sinusoidal curve in the (rho,theta) plane.
If the sinusoids of N different points intersect in the (rho,theta) plane, then that intersection
(rho1, theta1) represents the line that passes through these N points. In the HoughLines function,
an accumulator is used to keep the count (also called voting) of all the intersection points in the
(rho,theta) plane. After voting, the function filters spurious lines by performing thinning: if the
center vote value is greater than both the neighborhood votes and the threshold, the center vote is
kept as valid; otherwise it is set to zero. Finally, the function returns the
desired maximum number of lines (LINESMAX) in (rho,theta) form as output.
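The voting step can be illustrated with a small host-side accumulator. This is a sketch under stated assumptions, not the kernel: the names are hypothetical, rho is rounded to the nearest integer bin with a resolution of 1 pixel, the origin is taken at the image center, and the thinning and line-selection steps are omitted.

```cpp
#include <cmath>
#include <vector>

struct Point { int x; int y; };

// Accumulate votes in the (theta, rho) plane for a list of edge points.
// acc[t][r] counts the points lying on the line with theta = t * step and
// rho = r - diag / 2, measured from the image center.
inline std::vector<std::vector<int>> hough_votes(const std::vector<Point>& pts,
                                                 int cols, int rows,
                                                 double theta_step_deg) {
    const double pi = std::acos(-1.0);
    const int diag = static_cast<int>(std::lround(
        std::sqrt(static_cast<double>(cols) * cols + static_cast<double>(rows) * rows)));
    const int n_theta = static_cast<int>(180.0 / theta_step_deg);  // theta in [0, pi)
    std::vector<std::vector<int>> acc(n_theta, std::vector<int>(diag, 0));
    for (const Point& p : pts) {
        const double x = p.x - cols / 2;  // shift origin to the image center
        const double y = p.y - rows / 2;
        for (int t = 0; t < n_theta; ++t) {
            const double theta = t * theta_step_deg * pi / 180.0;
            const long rho = std::lround(x * std::cos(theta) + y * std::sin(theta));
            const long idx = rho + diag / 2;  // rho in [-diag/2, diag/2)
            if (idx >= 0 && idx < diag) ++acc[t][idx];
        }
    }
    return acc;
}
```

Seven points on a horizontal line 5 pixels below the center of a 16x16 image all vote for the same bin (theta = 90 degrees, rho = 5), so that bin collects 7 votes.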
The design assumes the origin at the center of the image, that is, at (Floor(COLS/2), Floor(ROWS/2)).
The ranges of rho and theta are:
theta = [0, pi)
rho=[-DIAG/2, DIAG/2), where DIAG = cvRound{SquareRoot( (COLS*COLS) +
(ROWS*ROWS))}
For ease of use, the input angles THETA, MINTHETA and MAXTHETA are taken in degrees, while
the output theta is in radians. The angle resolution THETA is declared as an integer, but treated
as a value in Q6.1 format (that is, THETA=3 signifies that the resolution used in the function is
1.5 degrees). When the output (rho, theta) is used for drawing lines, be aware that the
origin is at the center of the image.
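The Q6.1 interpretation and the center-origin convention can both be handled with small host-side helpers. The names below are hypothetical; the second helper returns the point of the line nearest to the image center, expressed in top-left-origin pixel coordinates, which can serve as the anchor for drawing.

```cpp
#include <cmath>

// THETA is an integer read as Q6.1 (one fractional bit), so the angular step
// is THETA / 2 degrees.
constexpr double theta_step_degrees(unsigned theta_q6_1) { return theta_q6_1 / 2.0; }

struct PointD { double x; double y; };

// Point of the line (rho, theta) closest to the image center, shifted from the
// center-origin frame to top-left-origin image coordinates. The line direction
// is (-sin(theta), cos(theta)).
inline PointD line_anchor_top_left(float rho, float theta_rad, int cols, int rows) {
    const double t = static_cast<double>(theta_rad);
    return PointD{rho * std::cos(t) + cols / 2, rho * std::sin(t) + rows / 2};
}
```

For example, THETA = 3 gives a step of 1.5 degrees, and the line (rho = 5, theta = 0) in a 16x16 image is anchored at pixel (13, 8).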
API Syntax
template<unsigned int RHO,unsigned int THETA,int MAXLINES,int DIAG,int
MINTHETA,int MAXTHETA,int SRC_T, int ROWS, int COLS,int NPC>
void HoughLines(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,float
outputrho[MAXLINES],float outputtheta[MAXLINES],short threshold,short
linesmax)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 170: Houghlines Function Parameter Descriptions
Parameter Description
RHO Distance resolution of the accumulator in pixels.
THETA Angle resolution of the accumulator in degrees and Q6.1 format.
MAXLINES Maximum number of lines to be detected
MINTHETA Minimum angle in degrees to check lines.
MAXTHETA Maximum angle in degrees to check lines
DIAG Diagonal of the image. It should be cvRound(sqrt(rows*rows + cols*cols)/RHO)
SRC_T Input Pixel Type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input image
COLS Maximum width of input image
NPC Number of Pixels to be processed per cycle; Only single pixel supported XF_NPPC1.
_src_mat Input image should be 8-bit, single-channel binary image.
outputrho Output array of rho values. rho is the distance from the coordinate origin (center of the image).
outputtheta Output array of theta values. Theta is the line rotation angle in radians.
threshold Accumulator threshold parameter. Only those lines are returned that get enough votes (>threshold).
linesmax Maximum number of lines.
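Because DIAG is a template parameter, its value has to be worked out before instantiating the function. The hypothetical helper below evaluates the expression from the table on the host. It uses std::lround, which differs from cvRound only when the quotient falls exactly halfway between two integers.

```cpp
#include <cmath>

// DIAG = round(sqrt(rows*rows + cols*cols) / RHO), evaluated on the host so the
// result can be supplied as a compile-time template argument.
inline int hough_diag(int rows, int cols, int rho_resolution) {
    const double diagonal = std::sqrt(static_cast<double>(rows) * rows +
                                      static_cast<double>(cols) * cols);
    return static_cast<int>(std::lround(diagonal / rho_resolution));
}
```

For a 1080x1920 image with RHO = 1, the diagonal is about 2202.9 pixels, so DIAG is 2203.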
Resource Utilization
The table below shows the resource utilization of the kernel for different configurations,
generated using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image for 512 lines.
Table 171: Houghlines Function Resource Utilization Summary
Name THETA=1, RHO=1
BRAM_18K 542
DSP48E 10
FF 60648
LUT 56131
Performance Estimate
The following table shows the performance of the kernel for different configurations, generated
using the Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image for 512 lines.
Table 172: Houghlines Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate Max (ms)
THETA=1, RHO=1 300 12.5
Pyramid Up
The pyrUp function is an image up-sampling algorithm. It first inserts zero rows and zero
columns after every input row and column, making up the size of the output image. The output
image size is always (2*rows × 2*columns). The zero-padded image is then smoothed
using a Gaussian image filter. The Gaussian filter for the pyramid-up function uses a fixed filter kernel
as given below:
$$\frac{1}{256}\begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}$$
However, to make up for the pixel intensity that is reduced due to zero padding, each output
pixel is multiplied by 4.
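The following host-side sketch models the steps just described: zero insertion, the 5x5 Gaussian filter, and the gain of 4. It is an illustration with hypothetical names, not the kernel. Treating pixels outside the image as zero and rounding to nearest are assumptions, since border handling and rounding are not described here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;

// Reference pyramid-up: the input pixel (r, c) sits at (2r, 2c) of the
// zero-inserted grid, which is then filtered and scaled by 4.
inline Image pyr_up_reference(const Image& src) {
    static const int k[5] = {1, 4, 6, 4, 1};  // separable taps; 2-D weights sum to 256
    const int rows = static_cast<int>(src.size());
    const int cols = rows ? static_cast<int>(src[0].size()) : 0;
    const int out_rows = 2 * rows;
    const int out_cols = 2 * cols;
    Image dst(out_rows, std::vector<std::uint8_t>(out_cols, 0));
    for (int r = 0; r < out_rows; ++r) {
        for (int c = 0; c < out_cols; ++c) {
            int acc = 0;
            for (int i = -2; i <= 2; ++i) {
                for (int j = -2; j <= 2; ++j) {
                    const int rr = r + i;
                    const int cc = c + j;
                    if (rr < 0 || cc < 0 || rr >= out_rows || cc >= out_cols) continue;  // zero border
                    if ((rr & 1) || (cc & 1)) continue;  // inserted zero row or column
                    acc += k[i + 2] * k[j + 2] * src[rr / 2][cc / 2];
                }
            }
            const int value = (acc * 4 + 128) / 256;  // gain of 4, then divide by 256
            dst[r][c] = static_cast<std::uint8_t>(std::min(value, 255));
        }
    }
    return dst;
}
```

Away from the borders, a constant input image stays constant: only one in four positions of the zero-inserted grid carries data, so the filter alone would reduce the level to a quarter, and the gain of 4 restores it.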
API Syntax
template<int TYPE, int ROWS, int COLS, int NPC>
void pyrUp (xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS,
COLS, NPC> & _dst)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 173: pyrUp Function Parameter Descriptions
Parameter Description
TYPE Pixel type. XF_8UC1 is the only supported pixel type.
ROWS Maximum Height or number of output rows to build the hardware for this kernel
COLS Maximum Width or number of output columns to build the hardware for this kernel
NPC Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle
processing (XF_NPPC1).
_src Input image stream
_dst Output image stream
Resource Utilization
The following table summarizes the resource utilization of pyrUp for 1 pixel per cycle
implementation, for a maximum input image size of 1920x1080 pixels. The results are after
synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz.

Table 174: pyrUp Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   LUTs   FFs    DSPs   BRAMs
1 Pixel          300                         1124   1199   0      10
Performance Estimate
The following table summarizes performance estimates of the pyrUp function on Vivado HLS 2018.2
for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA.
Table 175: pyrUp Function Performance Estimate Summary
Operating Mode   Operating Frequency (MHz)   Input Image Size   Latency Estimate Max (ms)
1 pixel          300                         1920x1080          27.82
Pyramid Down
The pyrDown function is an image down-sampling algorithm which smooths the image before
down-scaling it. The image is smoothed using a Gaussian filter with the following kernel:
\[
\frac{1}{256}
\begin{bmatrix}
1 & 4 & 6 & 4 & 1 \\
4 & 16 & 24 & 16 & 4 \\
6 & 24 & 36 & 24 & 6 \\
4 & 16 & 24 & 16 & 4 \\
1 & 4 & 6 & 4 & 1
\end{bmatrix}
\]
Down-scaling is performed by dropping pixels in the even rows and the even columns. The
resulting image size is
\[
\left( \frac{rows + 1}{2},\ \frac{columns + 1}{2} \right).
\]
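As a small illustration of the size formula and the decimation step, the following plain C++ sketch computes the output dimensions and drops every second row and column. The helper names are hypothetical and the Gaussian smoothing is omitted. It assumes that the text counts rows and columns from 1, so that zero-based indices 0, 2, 4, ... are kept; this reading matches the size formula for odd dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Output size from the text: ((rows + 1) / 2, (columns + 1) / 2), using integer division.
std::pair<int, int> pyrDownSize(int rows, int cols) {
    assert(rows > 0 && cols > 0);
    return std::make_pair((rows + 1) / 2, (cols + 1) / 2);
}

// Decimation step only; apply it to an image that has already been smoothed.
// Assumption: zero-based rows and columns 0, 2, 4, ... are kept.
std::vector<uint8_t> dropEvenRowsAndCols(const std::vector<uint8_t>& img, int rows, int cols) {
    assert(img.size() == static_cast<std::size_t>(rows) * cols);
    const std::pair<int, int> outSize = pyrDownSize(rows, cols);
    std::vector<uint8_t> out;
    out.reserve(static_cast<std::size_t>(outSize.first) * outSize.second);
    for (int r = 0; r < rows; r += 2)
        for (int c = 0; c < cols; c += 2)
            out.push_back(img[r * cols + c]);
    return out;
}
```

For a 1080-row by 1920-column input this gives a 540 by 960 output, and a 5 by 7 input gives 3 by 4.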
API Syntax
template<int TYPE, int ROWS, int COLS, int NPC>
void pyrDown (xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS,
COLS, NPC> & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 176: pyrDown Function Parameter Descriptions
Parameter Description
TYPE Pixel type. XF_8UC1 is the only supported pixel type.
ROWS Maximum Height or number of input rows to build the hardware for this kernel
COLS Maximum Width or number of input columns to build the hardware for this kernel
NPC Number of pixels to process per cycle. Currently, the kernel supports only 1 pixel per cycle
processing (XF_NPPC1).
_src Input image stream
_dst Output image stream
Resource Utilization
The following table summarizes the resource utilization of pyrDown for 1 pixel per cycle
implementation, for a maximum input image size of 1920x1080 pixels. The results are after
synthesis in Vivado HLS 2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz.
Table 177: pyrDown Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   LUTs   FFs    DSPs   BRAMs
1 Pixel          300                         1171   1238   1      5
Performance Estimate
The following table summarizes performance estimates of the pyrDown function in Vivado HLS
2018.2 for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA.
Table 178: pyrDown Function Performance Estimate Summary
Operating Mode   Operating Frequency (MHz)   Input Image Size   Latency Estimate Max (ms)
1 pixel          300                         1920x1080          6.99

InitUndistortRectifyMapInverse
The InitUndistortRectifyMapInverse function generates mapx and mapy, based on a set
of camera parameters, where mapx and mapy are inputs for the xf::remap function. That is, for
each pixel at location (u, v) in the destination (corrected and rectified) image, the function
computes the corresponding coordinates in the source image (the original image from the camera).
The InitUndistortRectifyMapInverse module is optimized for hardware, so the inverse of the
rotation matrix is computed outside the synthesizable logic. Note that the inputs are fixed
point, so the floating-point camera parameters must be converted to Q12.20 format.
API Syntax
template< int CM_SIZE, int DC_SIZE, int MAP_T, int ROWS, int COLS, int NPC
>
void InitUndistortRectifyMapInverse ( ap_fixed<32,12> *cameraMatrix,
ap_fixed<32,12> *distCoeffs, ap_fixed<32,12> *ir, xf::Mat<MAP_T, ROWS,
COLS, NPC> &_mapx_mat, xf::Mat<MAP_T, ROWS, COLS, NPC> &_mapy_mat, int
_cm_size, int _dc_size)
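The ap_fixed<32,12> arguments hold Q12.20 values: 12 integer bits (including the sign bit) and 20 fractional bits. The sketch below shows the arithmetic of that conversion on a raw 32-bit integer, without the ap_fixed header. The helper names are hypothetical, and rounding to nearest is an assumption; assigning a float to an ap_fixed variable with default settings truncates instead, so the two can differ by one least significant bit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Number of fractional bits in Q12.20.
constexpr int kQ12_20FracBits = 20;

// Hypothetical helper: convert a floating-point value to the raw Q12.20 bit pattern.
// Assumption: round to nearest; the representable range is [-2048, 2048).
int32_t toQ12_20(double value) {
    const long long raw = std::llround(value * static_cast<double>(1 << kQ12_20FracBits));
    assert(raw >= INT32_MIN && raw <= INT32_MAX);
    return static_cast<int32_t>(raw);
}

// Hypothetical helper: convert a raw Q12.20 bit pattern back to floating point.
double fromQ12_20(int32_t raw) {
    return static_cast<double>(raw) / static_cast<double>(1 << kQ12_20FracBits);
}
```

For example, a focal length of 933.173 in the camera matrix fits comfortably, while any parameter with magnitude of 2048 or more does not fit in 12 integer bits.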
Parameter Descriptions
The following table describes the template and the function parameters.
Table 179: InitUndistortRectifyMapInverse Function Parameter Descriptions
Parameter Description
CM_SIZE It must be set at the compile time, 9 for 3x3 matrix
DC_SIZE It must be set at the compile time, must be 4,5 or 8
MAP_T It is the type of output maps, and must be XF_32FC1
ROWS Maximum image height, necessary to generate the output maps
COLS Maximum image width, necessary to generate the output maps
NPC Number of pixels per cycle. This function supports only one pixel per cycle, so set to XF_NPPC1
cameraMatrix The input matrix representing the camera in the old coordinate system
distCoeffs The input distortion coefficients (k1,k2,p1,p2[,k3[,k4,k5,k6]])
ir The input transformation matrix, equal to Invert(newCameraMatrix*R), where
newCameraMatrix represents the camera in the new coordinate system and R is the rotation
matrix. This matrix is computed outside the synthesizable block
_mapx_mat Output mat objects containing the mapx
_mapy_mat Output mat objects containing the mapy
_cm_size 9 for 3x3 matrix
_dc_size 4, 5 or 8. If this is 0, then it means there is no distortion

Integral Image
The integral function computes an integral image of the input. Each output pixel is the sum of
all pixels above and to the left of itself.
\[
dst(x,y) = sum(x,y) = src(x,y) + sum(x-1,y) + sum(x,y-1) - sum(x-1,y-1)
\]
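A minimal software sketch of the running-sum recurrence follows: each output pixel adds the current input pixel to the sums to its left and above it, then subtracts the upper-left sum that was counted twice. The function name integralReference and the row-major std::vector layout are assumptions for illustration; this is not the hardware kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference for an 8-bit input and a 32-bit unsigned output.
// Sums that fall outside the image (x < 0 or y < 0) are taken as zero.
std::vector<uint32_t> integralReference(const std::vector<uint8_t>& src, int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<uint32_t> sum(src.size(), 0);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const uint32_t left = (x > 0) ? sum[y * cols + (x - 1)] : 0;
            const uint32_t up = (y > 0) ? sum[(y - 1) * cols + x] : 0;
            const uint32_t upLeft = (x > 0 && y > 0) ? sum[(y - 1) * cols + (x - 1)] : 0;
            sum[y * cols + x] = src[y * cols + x] + left + up - upLeft;
        }
    }
    return sum;
}
```

For the 2x2 input {1, 2; 3, 4} the result is {1, 3; 4, 10}. A full HD frame of 8-bit pixels sums to at most 1920*1080*255, which fits in the 32-bit output type.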
API Syntax
template<int SRC_TYPE,int DST_TYPE, int ROWS, int COLS, int NPC=1>
void integral(xf::Mat<SRC_TYPE, ROWS, COLS, NPC> & _src_mat,
xf::Mat<DST_TYPE, ROWS, COLS, NPC> & _dst_mat)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 180: integral Function Parameter Descriptions
Parameter Description
SRC_TYPE Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_TYPE Output pixel type. Only 32-bit, unsigned, 1 channel is supported (XF_32UC1)
ROWS Maximum height of input and output image (must be a multiple of 8)
COLS Maximum width of input and output image (must be a multiple of 8)
NPC Number of pixels to be processed per cycle; this function supports only XF_NPPC1 or 1 pixel per
cycle operations.
_src_mat Input image
_dst_mat Output image
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations,
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to
process a grayscale HD (1080x1920) image.
Table 181: integral Function Resource Utilization Summary
Name Resource Utilization (1 pixel, 300 MHz)
BRAM_18K 4
DSP48E 0
FF 613
LUT 378
CLB 102

Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as
generated using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a
grayscale HD (1080x1920) image.
Table 182: integral Function Performance Estimate Summary
Operating Mode   Operating Frequency (MHz)   Latency Estimate (ms)
1 pixel          300                         7.2
Dense Pyramidal LK Optical Flow
Optical flow is the pattern of apparent motion of image objects between two consecutive
frames, caused by the movement of the object or the camera. It is a 2D vector field, where each
vector is a displacement vector showing the movement of points from the first frame to the second.
Optical flow works on the following assumptions:
• Pixel intensities of an object do not vary much between consecutive frames
• Neighboring pixels have similar motion
Consider a pixel I(x, y, t) in the first frame. (Note that a new dimension, time, is added here.
When working with single images, there is no need for time.) The pixel moves by distance (dx, dy)
in the next frame, taken after time dt. Since those pixels are the same and the intensity does not
change, the following is true:
\[
I(x, y, t) = I(x + dx,\ y + dy,\ t + dt)
\]
Taking the Taylor series approximation on the right-hand side, removing common terms, and
dividing by dt gives the following equation:
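\[
f_x u + f_y v + f_t = 0
\]
Where
\[
f_x = \frac{\partial f}{\partial x}, \quad
f_y = \frac{\partial f}{\partial y}, \quad
u = \frac{dx}{dt}, \quad
v = \frac{dy}{dt}.
\]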
The above equation is called the Optical Flow equation, where fx and fy are the image
gradients and ft is the gradient along time. However, (u, v) is unknown, and a single equation
cannot be solved for two unknown variables. Several methods are available to solve this
problem; one of them is Lucas-Kanade. Previously it was assumed that all neighboring pixels
have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can
be defined through the ‘WINSIZE’ template parameter, and assumes that all the points in that
patch have the same motion. It is possible to find (fx, fy, ft) for these points. Thus, the
problem now becomes solving ‘WINSIZE * WINSIZE’ equations with two unknown variables, which is
over-determined. A better solution is obtained with the “least square fit” method. Below is the
final solution, which is a problem with two equations and two unknowns:
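\[
\begin{bmatrix} u \\ v \end{bmatrix}
=
\begin{bmatrix}
\sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\
\sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2
\end{bmatrix}^{-1}
\begin{bmatrix}
-\sum_i f_{x_i} f_{t_i} \\
-\sum_i f_{y_i} f_{t_i}
\end{bmatrix}
\]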
This solution fails when a large motion is involved, so pyramids are used. Going up in the
pyramid, small motions are removed and large motions become small motions, so applying
Lucas-Kanade at each level yields the optical flow along with the scale.
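The least-squares solve for a single window can be sketched in plain C++ as follows. It accumulates the five gradient sums and inverts the 2x2 matrix in closed form. The names Flow and solveLucasKanade, the use of double precision, and the determinant threshold are assumptions for illustration; the hardware kernel uses fixed-point flow vectors and is organized differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of one window solve; valid is false when the 2x2 matrix is singular.
struct Flow {
    double u;
    double v;
    bool valid;
};

// Hypothetical reference solver. fx, fy and ft hold the spatial and temporal
// gradients of every pixel in the window, in the same order.
Flow solveLucasKanade(const std::vector<double>& fx,
                      const std::vector<double>& fy,
                      const std::vector<double>& ft) {
    assert(fx.size() == fy.size() && fx.size() == ft.size());
    double sxx = 0.0, sxy = 0.0, syy = 0.0, sxt = 0.0, syt = 0.0;
    for (std::size_t i = 0; i < fx.size(); ++i) {
        sxx += fx[i] * fx[i];
        sxy += fx[i] * fy[i];
        syy += fy[i] * fy[i];
        sxt += fx[i] * ft[i];
        syt += fy[i] * ft[i];
    }
    // Determinant of [[sxx, sxy], [sxy, syy]]; near zero in flat or edge-only windows.
    const double det = sxx * syy - sxy * sxy;
    if (std::fabs(det) < 1e-12) return Flow{0.0, 0.0, false};
    // [u, v]^T = inverse(matrix) * [-sxt, -syt]^T
    const double u = (-syy * sxt + sxy * syt) / det;
    const double v = (sxy * sxt - sxx * syt) / det;
    return Flow{u, v, true};
}
```

When the gradients are exactly consistent with one motion, that is ft = -(fx*u + fy*v) for every pixel, the solver recovers (u, v). A window with no texture, or with gradients in only one direction, makes the matrix singular and no unique flow exists.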
API Syntax
template< int NUM_PYR_LEVELS, int NUM_LINES, int WINSIZE, int FLOW_WIDTH,
int FLOW_INT, int TYPE, int ROWS, int COLS, int NPC>
void densePyrOpticalFlow(
xf::Mat<TYPE,ROWS,COLS,NPC> & _current_img,
xf::Mat<TYPE,ROWS,COLS,NPC> & _next_image,
xf::Mat<XF_32UC1,ROWS,COLS,NPC> & _streamFlowin,
xf::Mat<XF_32UC1,ROWS,COLS,NPC> & _streamFlowout,
const int level, const unsigned char scale_up_flag, float scale_in,
ap_uint<1> init_flag)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 183: densePyrOpticalFlow Function Parameter Descriptions
Parameter Description
NUM_PYR_LEVELS Number of Image Pyramid levels used for the optical flow computation
NUM_LINES Number of lines to buffer for the remap algorithm – used to find the temporal gradient
WINSIZE Window Size over which Optical Flow is computed
FLOW_WIDTH, FLOW_INT Data width and number of integer bits that define the signed flow vector
data type. The integer bits include the sign bit. The default type is a 16-bit signed word with
10 integer bits and 6 fractional bits.
TYPE Pixel type of the input image. XF_8UC1 is the only supported value.
ROWS Maximum Height or number of rows to build the hardware for this kernel
COLS Maximum Width or number of columns to build the hardware for this kernel
NPC Number of pixels the hardware kernel must process per clock cycle. Only XF_NPPC1, 1 pixel per
cycle, is supported.
_current_img First input image stream
_next_image Second input image, for which the optical flow is computed with respect to the first image
_streamFlowin 32-bit Packed U and V flow vectors input for optical flow. The bits from 31-16 represent the flow
vector U while the bits from 15-0 represent the flow vector V.
_streamFlowout 32-bit Packed U and V flow vectors output after optical flow computation. The bits from 31-16
represent the flow vector U while the bits from 15-0 represent the flow vector V.
level Image pyramid level at which the algorithm is currently computing the optical flow.
scale_up_flag Flag to enable the scaling-up of the flow vectors. This flag is set at the host when switching from
one image pyramid level to the other.
scale_in Floating-point scale-up factor for the flow vectors.
The value is (previous_rows-1)/(current_rows-1). This is not 1 when switching from one image
pyramid level to the other.
init_flag Flag to initialize flow vectors to 0 in the first iteration of the highest pyramid level. This flag must
be set in the first iteration of the highest pyramid level (smallest image in the pyramid). The flag
must be unset for all the other iterations.
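The packed layout of _streamFlowin and _streamFlowout (U in bits 31-16, V in bits 15-0) can be illustrated with the default flow type, a 16-bit signed word with 6 fractional bits. The helper names below are hypothetical, and rounding to nearest during the float conversion is an assumption; only the bit positions and the default word format come from the table above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Fractional bits of the default flow type (16-bit word, 10 integer bits).
constexpr int kFlowFracBits = 6;

// Hypothetical helper: convert one flow component to its 16-bit two's complement pattern.
uint16_t flowToFixed(double value) {
    const long raw = std::lround(value * (1 << kFlowFracBits));
    assert(raw >= INT16_MIN && raw <= INT16_MAX);
    return static_cast<uint16_t>(static_cast<int16_t>(raw));
}

// Pack U into bits 31-16 and V into bits 15-0.
uint32_t packFlow(double u, double v) {
    return (static_cast<uint32_t>(flowToFixed(u)) << 16) | flowToFixed(v);
}

// Extract U from bits 31-16 and convert back to floating point.
double unpackU(uint32_t packed) {
    const int16_t raw = static_cast<int16_t>(static_cast<uint16_t>(packed >> 16));
    return static_cast<double>(raw) / (1 << kFlowFracBits);
}

// Extract V from bits 15-0 and convert back to floating point.
double unpackV(uint32_t packed) {
    const int16_t raw = static_cast<int16_t>(static_cast<uint16_t>(packed & 0xFFFFu));
    return static_cast<double>(raw) / (1 << kFlowFracBits);
}
```

With these settings a flow of (1.0, -0.5) packs to 0x0040FFE0, and each component has a resolution of 1/64 pixel.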
Resource Utilization
The following table summarizes the resource utilization of densePyrOpticalFlow for 1 pixel per
cycle implementation, with the optical flow computed for a window size of 11 over an image size
of 1920x1080 pixels. The results are after implementation in Vivado HLS 2018.2 for the Xilinx
xczu9eg-vb1156-2L-e FPGA at 300 MHz.
Table 184: densePyrOpticalFlow Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   LUTs    FFs     DSPs   BRAMs
1 Pixel          300                         32231   16596   52     215
Performance Estimate
The following table summarizes performance figures on hardware for the densePyrOpticalFlow
function for 5 iterations over 5 pyramid levels scaled down by a factor of two at each level. This
has been tested on the zcu102 evaluation board.
Table 185: densePyrOpticalFlow Function Performance Estimate Summary
Operating Mode   Operating Frequency (MHz)   Image Size   Latency Estimate Max (ms)
1 pixel          300                         1920x1080    49.7
1 pixel          300                         1280x720     22.9
1 pixel          300                         1226x370     12.02

Dense Non-Pyramidal LK Optical Flow
Optical flow is the pattern of apparent motion of image objects between two consecutive
frames, caused by the movement of the object or the camera. It is a 2D vector field, where each
vector is a displacement vector showing the movement of points from the first frame to the second.
Optical flow works on the following assumptions:
• Pixel intensities of an object do not vary much between consecutive frames
• Neighboring pixels have similar motion
Consider a pixel I(x, y, t) in the first frame. (Note that a new dimension, time, is added here.
When working with single images, there is no need for time.) The pixel moves by distance (dx, dy)
in the next frame, taken after time dt. Since those pixels are the same and the intensity does not
change, the following is true:
\[
I(x, y, t) = I(x + dx,\ y + dy,\ t + dt)
\]
Taking the Taylor series approximation on the right-hand side, removing common terms, and
dividing by dt gives the following equation:
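\[
f_x u + f_y v + f_t = 0
\]
Where
\[
f_x = \frac{\partial f}{\partial x}, \quad
f_y = \frac{\partial f}{\partial y}, \quad
u = \frac{dx}{dt}, \quad
v = \frac{dy}{dt}.
\]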
The above equation is called the Optical Flow equation, where fx and fy are the image
gradients and ft is the gradient along time. However, (u, v) is unknown, and a single equation
cannot be solved for two unknown variables. Several methods are available to solve this
problem; one of them is Lucas-Kanade. Previously it was assumed that all neighboring pixels
have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can
be defined through the ‘WINDOW_SIZE’ template parameter, and assumes that all the points in
that patch have the same motion. It is possible to find (fx, fy, ft) for these points. Thus, the
problem now becomes solving ‘WINDOW_SIZE * WINDOW_SIZE’ equations with two unknown
variables, which is over-determined. A better solution is obtained with the “least square fit”
method. Below is the final solution, which is a problem with two equations and two unknowns:
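\[
\begin{bmatrix} u \\ v \end{bmatrix}
=
\begin{bmatrix}
\sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\
\sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2
\end{bmatrix}^{-1}
\begin{bmatrix}
-\sum_i f_{x_i} f_{t_i} \\
-\sum_i f_{y_i} f_{t_i}
\end{bmatrix}
\]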

API Syntax
template<int TYPE, int ROWS, int COLS, int NPC, int WINDOW_SIZE>
void DenseNonPyrLKOpticalFlow (xf::Mat<TYPE, ROWS, COLS, NPC> & frame0,
xf::Mat<TYPE, ROWS, COLS, NPC> & frame1, xf::Mat<XF_32FC1, ROWS, COLS,
NPC> & flowx, xf::Mat<XF_32FC1, ROWS, COLS, NPC> & flowy)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 186: DenseNonPyrLKOpticalFlow Function Parameter Descriptions
Parameter Description
TYPE Pixel type. The currently supported pixel type is XF_8UC1 (unsigned 8-bit).
ROWS Maximum number of rows of the input image that the hardware kernel must be built for.
COLS Maximum number of columns of the input image that the hardware kernel must be built for.
NPC Number of pixels to process per cycle. Supported values are XF_NPPC1 (=1) and XF_NPPC2(=2).
WINDOW_SIZE Window size over which optical flow will be computed. This can be any odd positive integer.
frame0 First input image.
frame1 Second input image. Optical flow is computed between frame0 and frame1.
flowx Horizontal component of the flow vectors. The format of the flow vectors is XF_32FC1 or single
precision.
flowy Vertical component of the flow vectors. The format of the flow vectors is XF_32FC1 or single
precision.
Resource Utilization
The following table summarizes the resource utilization of DenseNonPyrLKOpticalFlow for a 4K
image, as generated in the Vivado HLS 2018.2 version tool for the Xilinx
Xczu9eg-vb1156-1-i-es1 FPGA at 300 MHz.
Table 187: DenseNonPyrLKOpticalFlow Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF      LUTs
1 pixel          300                         178        42         11984   7730
2 pixel          300                         258        82         22747   15126
Performance Estimate
The following table summarizes performance estimates of the DenseNonPyrLKOpticalFlow
function for a 4K image, generated using Vivado HLS 2018.2 version tool for the Xilinx
xczu9eg-vb1156-1-i-es1 FPGA.

Table 188: DenseNonPyrLKOpticalFlow Function Performance Estimate Summary
Operating Mode   Operating Frequency (MHz)   Latency Estimate Max (ms)
1 pixel          300                         28.01
2 pixel          300                         14.01
Mean and Standard Deviation
The meanStdDev function computes the mean and standard deviation of the input image. Both the
output mean and the output standard deviation are in fixed-point Q8.8 format. Mean and
standard deviation are calculated as follows:
\[
\mu = \frac{\sum_{y=0}^{height-1} \sum_{x=0}^{width-1} src(x,y)}{width \cdot height}
\]
\[
\sigma = \sqrt{\frac{\sum_{y=0}^{height-1} \sum_{x=0}^{width-1} \left(\mu - src(x,y)\right)^2}{width \cdot height}}
\]
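The calculation can be sketched in plain C++ as a two-pass software reference: the mean first, then the standard deviation as the square root of the mean squared deviation, with both results scaled by 256 to give Q8.8 values. The names MeanStdDevQ8_8 and meanStdDevReference are hypothetical, and truncating (rather than rounding) the scaled values is an assumption about the quantization.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Both fields are Q8.8: the upper 8 bits are the integer part, the lower 8 the fraction.
struct MeanStdDevQ8_8 {
    unsigned short mean;
    unsigned short stddev;
};

// Hypothetical software reference for an 8-bit, single-channel image.
// Assumption: the Q8.8 conversion truncates toward zero.
MeanStdDevQ8_8 meanStdDevReference(const std::vector<uint8_t>& src) {
    assert(!src.empty());
    double sum = 0.0;
    for (uint8_t pixel : src) sum += pixel;
    const double mean = sum / static_cast<double>(src.size());

    double squaredDev = 0.0;
    for (uint8_t pixel : src) {
        const double d = mean - pixel;
        squaredDev += d * d;
    }
    const double stddev = std::sqrt(squaredDev / static_cast<double>(src.size()));

    return MeanStdDevQ8_8{static_cast<unsigned short>(mean * 256.0),
                          static_cast<unsigned short>(stddev * 256.0)};
}
```

To read a result back as a real number, divide by 256. For 8-bit input the mean is at most 255 and the standard deviation at most 127.5, so both fit in an unsigned 16-bit Q8.8 word.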
API Syntax
template<int SRC_T,int ROWS, int COLS,int NPC=1>
void meanStdDev(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,unsigned short*
_mean,unsigned short* _stddev)
Parameter Descriptions
The following table describes the template and the function parameters.
Table 189: meanStdDev Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1) is supported.
ROWS Number of rows in the image being processed.
COLS Number of columns in the image being processed.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.
_src Input image
_mean 16-bit data pointer through which the computed mean of the image is returned.
_stddev 16-bit data pointer through which the computed standard deviation of the image is returned.
Resource Utilization
The following table summarizes the resource utilization of the meanStdDev function, generated
using Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1 FPGA, to process a
grayscale HD (1080x1920) image.
Table 190: meanStdDev Function Resource Utilization Summary
                                             Utilization Estimate
Operating Mode   Operating Frequency (MHz)   BRAM_18K   DSP_48Es   FF     LUT   CLB
1 pixel          300                         0          6          896    461   121
8 pixel          150                         0          13         1180   985   208
Performance Estimate
The following table summarizes the performance in different configurations, as generated using
Vivado HLS 2018.2 tool for the Xilinx Xczu9eg-vb1156-1-i-es1, to process a grayscale HD
(1080x1920) image.
Table 191: meanStdDev Function Performance Estimate Summary
Operating Mode                Latency Estimate (Max Latency)
1 pixel operation (300 MHz)   6.9 ms
8 pixel operation (150 MHz)   1.69 ms
Median Blur Filter
The medianBlur function performs a median filter operation on the input image. The median
filter is a non-linear digital filter used for noise reduction. For each pixel, a filter of
size N outputs the median value of the NxN neighborhood pixel values.
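A minimal software sketch of the NxN median follows. It gathers the neighborhood of each pixel, replicating the nearest edge pixel outside the image (the behavior named XF_BORDER_REPLICATE in this section), and selects the middle element. The function name medianBlurReference and the row-major std::vector layout are assumptions; the hardware kernel is built around line buffers and a sorting network rather than std::nth_element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference for an 8-bit, single-channel image.
// filterSize must be an odd integer greater than 1.
std::vector<uint8_t> medianBlurReference(const std::vector<uint8_t>& src,
                                         int rows, int cols, int filterSize) {
    assert(filterSize > 1 && filterSize % 2 == 1);
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    const int half = filterSize / 2;
    std::vector<uint8_t> dst(src.size());
    std::vector<uint8_t> window;
    window.reserve(static_cast<std::size_t>(filterSize) * filterSize);

    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            window.clear();
            for (int ky = -half; ky <= half; ++ky) {
                for (int kx = -half; kx <= half; ++kx) {
                    // Replicate border: clamp coordinates to the image.
                    const int yy = std::min(std::max(y + ky, 0), rows - 1);
                    const int xx = std::min(std::max(x + kx, 0), cols - 1);
                    window.push_back(src[yy * cols + xx]);
                }
            }
            // Partial sort that places the median at the middle position.
            std::vector<uint8_t>::iterator mid = window.begin() + window.size() / 2;
            std::nth_element(window.begin(), mid, window.end());
            dst[y * cols + x] = *mid;
        }
    }
    return dst;
}
```

A single bright outlier in an otherwise uniform image is removed completely, which a linear smoothing filter of the same size would only spread out.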
API Syntax
template<int FILTER_SIZE, int BORDER_TYPE, int TYPE, int ROWS, int COLS,
int NPC>
void medianBlur (xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE,
ROWS, COLS, NPC> & _dst)

Parameter Descriptions
The following table describes the template and the function parameters.
Table 192: medianBlur Function Parameter Descriptions
Parameter Description
FILTER_SIZE Window size of the hardware filter for which the hardware kernel will be built. This can be any
odd positive integer greater than 1.
BORDER_TYPE The way in which borders will be processed in the hardware kernel. Currently, only
XF_BORDER_REPLICATE is supported.
TYPE Type of input pixel. XF_8UC1 is supported.
ROWS Number of rows in the image being processed.
COLS Number of columns in the image being processed.
NPC Number of pixels to be processed in parallel. Options are XF_NPPC1 (for 1 pixel processing per
clock) and XF_NPPC8 (for 8 pixel processing per clock).
_src Input image.
_dst Output image.
Resource Utilization
The following table summarizes the resource utilization of the medianBlur function for
XF_NPPC1 and XF_NPPC8 configurations, generated using Vivado HLS 2018.2 version tool for
the Xilinx Xc7z020clg484-1 FPGA.
Table 193: medianBlur Function Resource Utilization Summary
                                                           Utilization Estimate
Operating Mode   FILTER_SIZE   Operating Frequency (MHz)   LUTs   FFs    DSPs   BRAMs
1 pixel          3             300                         1197   771    0      3
8 pixel          3             150                         6559   1595   0      6
1 pixel          5             300                         5860   1886   0      5
Performance Estimate
The following table summarizes performance estimates of the medianBlur function on Vivado HLS
2018.2 version tool for the Xilinx xczu9eg-vb1156-1-i-es1 FPGA.

Table 194: medianBlur Function Performance Estimate Summary
Operating Mode   FILTER_SIZE   Operating Frequency (MHz)   Input Image Size   Latency Estimate Max (ms)
1 pixel          3             300                         1920x1080          6.99
8 pixel          3             150                         1920x1080          1.75
1 pixel          5             300                         1920x1080          7.00
MinMax Location
The minMaxLoc function finds the minimum and maximum values in an image and the locations of
those values.
\[
minVal = \min_{\substack{0 \le x' < width \\ 0 \le y' < height}} src(x', y')
\]
\[
maxVal = \max_{\substack{0 \le x' < width \\ 0 \le y' < height}} src(x', y')
\]
API Syntax
template<int SRC_T,int ROWS,int COLS,int NPC>
void minMaxLoc(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,int32_t *max_value,
int32_t *min_value,uint16_t *_minlocx, uint16_t *_minlocy, uint16_t
*_maxlocx, uint16_t *_maxlocy )
Parameter Descriptions
The following table describes the template and the function parameters.
Table 195: minMaxLoc Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1); 16-bit, unsigned, 1 channel (XF_16UC1);
16-bit, signed, 1 channel (XF_16SC1); and 32-bit, signed, 1 channel (XF_32SC1) are supported.
ROWS Number of rows in the image being processed.
COLS Number of columns in the image being processed.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1
pixel and 8 pixel operations respectively.