NVIDIA | GPU Applications Catalog

GPU‑Accelerated Applications

NVIDIA Corporation

GPU-ACCELERATED APPLICATIONS

hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 1

4/5/21 10:18 AM

Test Drive the World's Fastest Accelerator - Free!
Take the GPU Test Drive, a free and easy way to experience accelerated computing on GPUs. You can run your own application or try one of the preloaded ones, all running on a remote cluster. Try it today. www.nvidia.com/gputestdrive

GPU-ACCELERATED APPLICATIONS
Accelerated computing has revolutionized a broad range of industries with over six hundred applications optimized for GPUs to help you accelerate your work.

CONTENTS
Computational Finance
Climate, Weather and Ocean Modeling
Data Science and Analytics
Artificial Intelligence
    DEEP LEARNING AND MACHINE LEARNING
Public Sector and National Government
Design for Manufacturing/Construction: CAD/CAE/CAM
    CFD (MFG)
    CFD (RESEARCH DEVELOPMENTS)
    COMPUTATIONAL STRUCTURAL MECHANICS
    DESIGN AND VISUALIZATION
    ELECTRONIC DESIGN AUTOMATION
    INDUSTRIAL INSPECTION
Media and Entertainment
    ANIMATION, MODELING AND RENDERING
    COLOR CORRECTION AND GRAIN MANAGEMENT
    COMPOSITING, FINISHING AND EFFECTS
    (VIDEO) EDITING
    (IMAGE & PHOTO) EDITING
    ENCODING AND DIGITAL DISTRIBUTION
    ON-AIR GRAPHICS
    ON-SET, REVIEW AND STEREO TOOLS
    WEATHER GRAPHICS
Medical Imaging
Oil and Gas
Life Sciences
    BIOINFORMATICS
    MICROSCOPY
    MOLECULAR DYNAMICS
    QUANTUM CHEMISTRY
    (MOLECULAR) VISUALIZATION AND DOCKING
Research: Higher Education and Supercomputing
    NUMERICAL ANALYTICS
    PHYSICS
    SCIENTIFIC VISUALIZATION
Smart Spaces
Tools and Management
Agriculture
Business Process Optimization


Computational Finance

APPLICATION NAME
Accelerated Computing Engine

COMPANY NAME
Elsen

PRODUCT DESCRIPTION
Secure, accessible, and accelerated backtesting, scenario analysis, risk analytics and real-time trading designed for easy integration and rapid development.

SUPPORTED FEATURES
· Web-like API with native bindings for Python, R, Scala, C
· Custom models and data streams

GPU SCALING
Multi-GPU Single Node

Adaptiv Analytics

SunGard

A flexible and extensible engine for fast calculations of a wide variety of pricing and risk measures on a broad range of asset classes and derivatives.

· Codes in C# supported transparently, with minimal code changes
· Supports multiple backends including CUDA and OpenCL
· Switches transparently between multiple GPUs and CPUs depending on the deal support and load factors.

Multi-GPU Single Node

Alea.cuBase F#

QuantAleas

F# package enabling a growing set of F# capability to run on a GPU.

· F# for GPU accelerators

Multi-GPU Single Node

Esther

Global Valuation

In-memory risk analytics system for OTC portfolios with a particular focus on XVA metrics and balance sheet simulations.

· High quality models not admitting closed form solutions
· Efficient solvers based on full matrix linear algebra powered by GPUs and Monte Carlo algorithms

Multi-GPU Single Node

Global Risk

MISYS

Regulatory compliance and enterprise-wide risk transparency package.

· Risk analytics

Multi-GPU Single Node

Hybridizer C#

Altimesh

Multi-target C# framework for data parallel computing.

· C# with translation to GPU
· Multi-Core Xeon

Multi-GPU Single Node

MACS Analytics Library

Murex

Analytics library for modeling valuation and risk for derivatives across multiple asset classes.

· Market standard models for all asset classes paired with the most efficient resolution methods (Monte Carlo simulations and Partial Differential Equations)

Multi-GPU Single Node

NAG

Numerical Algorithms Group

Random number generators, Brownian bridges, and PDE solvers

· Monte Carlo and PDE solvers

Single GPU Single Node

Oneview

Numerix

Numerix introduced GPU support for Forward Monte Carlo simulation for Capital Markets and Insurance.

· Equity/FX basket models with Black-Scholes/Local Vol models for individual equities and FX
· Algorithms: AAD (Automatic Algebraic Differential)
· New approaches to AAD to reduce time to market for fast Price Greeks and XVA Greeks

Multi-GPU Multi-Node

O-Quant options pricing

O-Quant

Offering for risk management and complex options and derivatives pricing using GPUs.

· Cloud-based interface to price complex derivatives representing large baskets of equities

Multi-GPU Multi-Node

Pathwise

Aon Benfield

Specialized platform for real-time hedging, valuation, pricing and risk management.

· Spreadsheet-like modeling interfaces
· Python-based scripting environment
· Grid middleware

Multi-GPU Single Node

SciFinance

SciComp, Inc

Derivative pricing (SciFinance)

· Monte Carlo and PDE pricing models

Single GPU Single Node

Synerscope Data Visualization

Synerscope

Visual big data exploration and insight tools

· Graphical exploration of large network datasets including geo-spatial and temporal components

Single GPU Single Node

Volera

Hanweck Associates

Real-time options analytical engine (Volera)

· Real-time analytics

Multi-GPU Single Node

Xcelerit SDK

Xcelerit

Software Development Kit (SDK) to boost the performance of Financial applications (e.g. Monte-Carlo, Finite-difference) with minimum changes to existing code.

· C++ programming language, cross-platform (back-end generates CUDA and optimized CPU code)
· Supports Windows and Linux operating systems

Multi-GPU Single Node
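
Several entries in this section list Monte Carlo pricing among their GPU-accelerated features. As a rough illustration of that kind of workload, the sketch below prices a European call option by simulation and compares the estimate with the closed-form Black-Scholes value. It is plain CPU Python and is not taken from any product in this catalog; the function names (bs_call, mc_call) and all parameter values are assumptions chosen for the example.

```python
import math
import random


def bs_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call (reference value)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)

    def cdf(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


def mc_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate: average discounted payoff over simulated paths."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        # Terminal price under geometric Brownian motion.
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths


# Illustrative inputs only: spot 100, strike 100, 5% rate, 20% vol, 1 year.
exact = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
estimate = mc_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000)
print(f"closed form: {exact:.4f}  Monte Carlo: {estimate:.4f}")
```

Each simulated path is independent of the others, which is the property that lets this kind of calculation be spread across many GPU threads.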

POPULAR GPU-ACCELERATED APPLICATIONS CATALOG | Apr21 | 1

Climate, Weather and Ocean Modeling

APPLICATION NAME
COSMO

COMPANY NAME
COSMO Consortium

PRODUCT DESCRIPTION
Regional numerical weather prediction and climate research model

SUPPORTED FEATURES
· Radiation only in the trunk release
· All features in the MCH branch used for operational weather forecasting

GPU SCALING
Multi-GPU Multi-Node

E3SM-EAM

US DOE

Global atmospheric model used as component to E3SM global coupled climate model.

· Dynamics and most physics

Multi-GPU Multi-Node

Gales

KNMI, TU Delft

Regional numerical weather prediction model

· Full Model

Multi-GPU Multi-Node

GRAF

IBM/TWC

New GPU-based global weather model based on MPAS from NCAR

· Full application

Multi-GPU Multi-Node

WRF AceCASTWRF

TempoQuest Inc.

WRF model from NCAR now commercialized by TQI. Used for numerical weather prediction and regional climate studies. All popular aspects of WRF model are GPU developed.

· ARW dynamics
· 19 physics options including enough to run the full WRF model on GPUs

Multi-GPU Multi-Node

Data Science and Analytics

APPLICATION NAME
Anaconda Distribution

COMPANY NAME
Anaconda

PRODUCT DESCRIPTION
The open-source Anaconda Distribution is the easiest way to perform Python/R data science and machine learning on Linux, Windows, and Mac OS X. Developed for solo practitioners, it is the toolkit that equips you to work with thousands of open-source packages and libraries.

SUPPORTED FEATURES
· Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
· Sort algorithms from the CUB and Modern GPU libraries
· Includes Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
· Includes single-line install of numerous DL frameworks such as PyTorch

GPU SCALING
Multi-GPU Multi-Node

AnswerRocket

AnswerRocket

AnswerRocket leverages AI and machine learning techniques to automate the hard work of business analysis, empowering teams to generate business intelligence and advanced analysis in seconds.

· Pluggable machine learning models
· Ask Questions in Plain English
· Create Interactive Visualizations & Dashboards
· Provides Augmented Analytics
· Supports a wide variety of data sources

Multi-GPU Multi-Node

ArgusSearch

Planet AI

Deep Learning driven document search tool.

· Fast full text search engine
· Searches hand-written and text documents, including PDF
· Allows almost any arbitrary requests (Regular Expressions are supported)
· Provides a list of matches sorted by confidence

Multi-GPU Single Node

Automatic Speech Recognition

Capio

In-house and Cloud-based speech recognition technologies

· Real-time and offline (batch) speech recognition
· Exceptional accuracy for transcription of conversational speech
· Continuous Learning (System becomes more accurate as more data is pushed to the platform)

Multi-GPU Single Node

BlazingSQL

BlazingSQL

GPU-accelerated SQL Engine for analytics available on all major CSP and on-premise deployment.

· Distributed SQL Query Engine
· Supports petabyte scale applications
· Supports traditional big data formats and data stores

Multi-GPU Multi-Node

BrytlytDB

Brytlyt

In-GPU-memory database built on top of PostgreSQL

· GPU-Accelerated joins, aggregations, scans, etc. on PostgreSQL
· Visualization platform bundled with database is called SpotLyt.

Multi-GPU Multi-Node

CuPy

Preferred Networks

CuPy (https://github.com/cupy/cupy) is a GPU-accelerated scientific computing library for Python with a NumPy compatible interface.

· CUDA
· Multi-GPU support

Multi-GPU Single Node


Datalogue

Datalogue

AI powered pipelines that automatically prepare any data from any source for immediate & compliant use.

· Data transformation
· Ontology mapping
· Data standardization
· Data augmentation

Multi-GPU Single Node

DeepGram

Deepgram

Voice processing solution for call centers, financials and other scenarios.

· Speech to text and phonetic search using GPU deep learning

Multi-GPU Single Node

Driverless AI

H2O.ai

Automated Machine Learning with Feature Extraction. Essentially BI for Machine Learning and AI, with accuracy very similar to Kaggle Experts. H2O Driverless AI is an artificial intelligence (AI) platform for automatic machine learning. Driverless AI automates some of the most difficult data science and machine learning workflows such as feature engineering, model validation, model tuning, model selection and model deployment. It aims to achieve highest predictive accuracy, comparable to expert data scientists, but in much shorter time thanks to end-to-end automation. Driverless AI also offers automatic visualizations and machine learning interpretability (MLI). Especially in regulated industries, model transparency and explanation are just as important as predictive performance. Modeling pipelines (feature engineering and models) are exported (in full fidelity, without approximations) both as Python modules and as pure Java standalone scoring artifacts.

· Automated machine learning and feature extraction
· Automated statistical visualization
· Interpretability toolkit for machine learning models

Multi-GPU Single Node

GPUdb

Kinetica

Multi-GPU, Multi-Machine distributed object store providing SQL style query capability, advanced geospatial query capability, heatmap generation, and distributed rasterization services.

· Query against big data in real time
· No pre-indexing allows for complex, ad-hoc query chains
· Interactively explore large, streaming data sets

Multi-GPU Single Node

H2O4GPU

H2O.ai

H2O is a popular machine learning platform which offers GPU-accelerated machine learning. In addition, they offer deep learning by integrating popular deep learning frameworks.

· Available algorithms include Gradient Boosting Machines (GBMs), Generalized Linear Models (GLMs), K-Means Clustering, SVD, PCA, and XGBoost.
· It can be used as a drop-in replacement for scikit-learn with support for GPUs on selected (and ever-growing) algorithms.
· A new R API brings the benefits of GPU-accelerated machine learning to the R user community. The R package is a wrapper around the H2O4GPU Python package, and the interface follows standard R conventions for modeling.

Multi-GPU Single Node

IntelligentVoice

INTELLIGENT VOICE

Far more than a transcription tool, this speech recognition software learns what is important in a telephone call, extracts information and stores a visual representation of phone calls to be combined with text/instant messaging and E-mail. Intelligent Voice's search and alert makes it possible to tackle issues before they arise, address data security concerns and monitor physical access to data.

· Advanced Speech Recognition across large data sets
· JumpTo Technology, for data visualisation
· E-Discovery
· Extraction from phone calls
· IM & Email defining key phrases and emotional analysis
· Compliance, defining key conversations and interactions

Multi-GPU Single Node


Jedox

Jedox

Helps with portfolio analysis, management consolidation, liquidity controlling, cash flow statements, profit center accounting, treasury management, customer value analysis and many more applications. All accessible in a powerful web and mobile application or Excel environment.

· This database holds all relevant data in GPU memory
· Tesla K40 & 12 GB on-board RAM
· Scales up with multiple GPUs
· Keeps close to 100 GB of compressed data in GPU memory on a single server system
· Fast analysis, reporting, and planning

Multi-GPU Single Node

Labellio

KYOCERA Communication Systems Co

The world's easiest deep learning web service for computer vision, allowing everyone to build their own image classifier with only a web browser.

· Neural net fine-tuning for image data
· Data crawling and data browsing
· Drag-and-drop style data cleansing backed by AI support

Multi-GPU Single Node

Numba

Anaconda

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Think of it as a compiler for Python array and numerical functions that gives you the power to speed up your applications with high performance functions written directly in Python. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your Python function, and Numba does the rest. With a few simple annotations, array-oriented and math-heavy Python code can be just-in-time optimized to performance similar to C, C++ and Fortran, without having to switch languages or Python interpreters. Numba is designed to be used with NumPy arrays and functions. Numba generates specialized code for different array data types and layouts to optimize performance. Special decorators can create universal functions that broadcast over NumPy arrays just like NumPy functions do. Numba also works great with Jupyter notebooks for interactive computing, and with distributed execution frameworks, like Dask and Spark. With support for GPU acceleration, Numba lets you write parallel GPU algorithms entirely from Python.

· On-the-fly code generation (at import time or runtime, at the user's preference)
· Native code generation for the CPU (default) and GPU hardware
· Integration with the Python scientific software stack (enabled via NumPy)
· JIT compilation of Python functions for execution on various targets (including CUDA)

Multi-GPU Single Node

OmniSci

OmniSci

OmniSci is a GPU-powered big data analytics and visualization platform that is hundreds of times faster than CPU in-memory systems. OmniSci uses GPUs to execute SQL queries on multi-billion row datasets and optionally render the results, all in milliseconds.

· Uses LLVM's nvptx backend to generate CUDA kernels
· OpenGL- (EGL) based rendering
· Can run in a docker container using NVIDIA-docker

Multi-GPU Single Node

Polymatica

Polymatica

Analytical OLAP and Data Mining Platform

· Visualization, Reporting, OLAP in-memory with GPU acceleration
· Data Mining
· Machine Learning
· Predictive Analytics

Multi-GPU Multi-Node
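
The Numba entry above says that applying a decorator to a Python function is enough to have it compiled. A minimal sketch of that pattern follows, using Numba's njit decorator on a plain numerical loop. The function name sum_of_squares is hypothetical, the example targets the CPU only (not the GPU support the entry mentions), and the no-op fallback decorator is an assumption added so the snippet still runs where Numba is not installed.

```python
try:
    from numba import njit  # Numba's "nopython" JIT decorator
except ImportError:
    # Assumption for portability: without Numba, use a no-op decorator
    # so the function simply runs as ordinary interpreted Python.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Sum i*i for i in [0, n) with an explicit loop Numba can compile."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


# The first call triggers compilation when Numba is present;
# later calls reuse the compiled machine code.
result = sum_of_squares(1000)
print(result)
```

The function body is unchanged Python, so the same source runs with or without the decorator; only the execution speed differs.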


Sqream DB

SQream Technologies

GPU accelerated SQL database engine for big data analytics. Sqream speeds SQL analytics by 100X by translating SQL queries into highly parallel algorithms run on the GPU.

· Up to 100TB of raw data can be stored and queried in a standard 2U server
· Inserts and analyzes hundreds of billions of records in seconds
· No indexes required
· No changes to SQL code or data science paradigms required

Multi-GPU Single Node

SynerScope

Synerscope

Big data visualization and data discovery, for combining Analytics on Analytics with IoT compute-at-the-edge smart sensors.

· Real-time Interaction with data

Single GPU Single Node

ZX Lib (Fuzzy Logic) Tanay

Financial analytics and data mining library

· Monte Carlo simulations
· Pricing of vanilla and exotic options
· Fixed income analytics
· Data mining

Multi-GPU Single Node

Artificial Intelligence

DEEP LEARNING AND MACHINE LEARNING

APPLICATION NAME
AIC

COMPANY NAME
Tracxpoint

PRODUCT DESCRIPTION
AIC (Artificial Intelligence Cart) revolutionizes the supermarket shopping experience with sensor fusion and machine learning technology.

SUPPORTED FEATURES
· The smart IoT cart recognizes the shopper, loads their shopping list and buying patterns, suggests compatible products and provides the most valuable offer
· Recognizes the items placed in the cart and bills the customer at the end of the shopping experience with no checkout lanes
· Features Jetpack

GPU SCALING
Single GPU Single Node

AiFi Nano

AiFi Inc.

Cashier-free (like Amazon grab and go solution) and stock out retail software

· cuDNN · TensorRT · DeepStream

Multi-GPU Single Node

AI Image Labeling

Frenzy

Builds robust self-labeling training datasets for classifying exact objects and products in visual scenes at a fraction of the time and cost

· GPU in the cloud

Multi-GPU Single Node

AI Lifecycle

Clarifai

Clarifai brings a new level of understanding to visual content through deep learning technologies. Uses GPUs to train large neural networks to solve practical problems in advertising, media, and search across a wide variety of industries such as automated tagging, visual search, and recommendation engine, predictive maintenance, demographic analysis and more.

· GPU-based training and inference
· Recognizes and indexes images with predefined classifiers or custom classifiers

Multi-GPU Single Node

Allganize NLU APIs for Enterprises

Allganize, Inc.

Natural Language Understanding APIs for enterprise: Answer-bot based on documents with unstructured data (text + table), e.g., manuals, instructions, FAQ documents; Review analysis; sentiment analysis, summarizing etc. Provided as APIs.

· Training and inferencing using V100

Multi-GPU Multi-Node

AlphaSense

AlphaSense

PaaS for financial analysis based on public corporate information. Geared at financial analysts within financial services. Allows very fast searches of public corporate information, and supports a question-answering format ("the Google for Analyst research").

· PaaS for financial analysis based on public corporate information
· Geared at financial analysts within financial services.
· Allows very fast searches of public corporate information, and supports a question-answering format ("the Google for Analyst research")

Multi-GPU Single Node

AlwaysAI

Always AI

Easy-to-use platform to build and deploy computer vision applications for embedded devices at the edge. Apply for early access via the product link.

· Jetson Nano

Single GPU Single Node


Anaconda Enterprise Edition

Anaconda

The end-to-end data science platform. The Anaconda enterprise platform is a comprehensive foundation for any organization that wants to use data science and machine learning to make better decisions and build differentiating solutions.

· Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
· Sort algorithms from the CUB and Modern GPU libraries
· Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
· Single-line install of numerous DL frameworks such as PyTorch

Multi-GPU Single Node

Antuit Demand Planning and Forecasting

Antuit

Extracts maximum predictability from the available data. Proprietary "Dynamic Aggregation" logic with attribute-based disaggregation generates forecasts for all products, including new, slow-moving, and end-of-life. Spark and GPU clusters, along with optimized AI algorithms, provide scaling for the largest retailers. Incorporates all available demand drivers, such as price elasticities, promotional lifts, weather, and hyper-local event data.

· CUDA 10.1
· cuDNN 7.6
· cuBLAS 10.2

Multi-GPU Multi-Node

Apache Mahout

Apache Mahout

Mahout is building an environment for quickly creating scalable performant machine learning applications.

· Extremely easy to add new algorithms
· Distributed instead of single machine

Multi-GPU Multi-Node

Applica RTA

Applica

Applica RTA combines computer vision and deep-learning driven NLP to process all document types.

· GPU to accelerate model training, fine-tuning and inferencing

Multi-GPU Multi-Node

Artificial Intelligence Radio Transceiver (AIR-T)

Deepwave Digital

The Artificial Intelligence Radio Transceiver (AIR-T) is a software-defined radio designed and developed for RF deep learning applications. The app is equipped with three signal processors including a 256 core NVIDIA Jetson TX2, a field programmable gate array (FPGA), and dual embedded CPUs.

· The AIR-T is designed to be an edge-compute inference engine for deep learning algorithms.

N/A

ARYA.ai

ARYA.ai

Deep learning platform with end-to-end workflows for Enterprise, incorporating TensorFlow. Focuses on consumer banking and insurance industries.

· Deep learning
· TensorFlow

Multi-GPU Multi-Node

Aura Vision

Aura Vision

Capture unique insights from every visitor, using your existing cameras

· Segmented footfall
· Shopper motivation
· Product engagement
· Window display ROI
· Store utilization
· Service wait times

Single GPU Single Node

Avitas Systems - Inspection as a Service

Avitas Systems

Avitas Systems configures various multi rotor and helicopter drones with multiple sensor kits including RGB cameras, laser sensors, infrared and others collecting inspection data to meet different customer use cases. Ingests inspection data where an AI back-end turns the raw data into inspection findings such as corrosion levels, damaged/missing parts, encroaching vegetation volumes.

· Drone based data capture
· RGB Camera, Laser and Infrared sensing
· Deep learning driven Object detection for Inspection
· Detect corrosion levels, damaged/missing parts, encroaching vegetation volumes.
· AI workbench
· Photogrammetry

Multi-GPU Multi-Node

AWM Smart Shelf

Adroit Worldwide Media Inc.

Application for Automated Inventory Intelligence (view and track virtually in a retail environment), Content Management System (manage inventory, prices and content), LED Display (prices, promotions and advertisements at the click of a button) and Product Mapper (automate creation of planograms and auditing process)

· Kubernetes
· Docker
· RTX 2080

Multi-GPU Single Node

Badger Insights

Badger Technologies

Badger Technologies provides data and analytics for retail operations through automation solutions that include a fully autonomous robot to address out-of-stock, planogram compliance, and price integrity

· GPU accelerated

Single GPU Single Node


BIDMach

UC Berkeley

The fastest machine learning library available. Holds the record for many common machine learning algorithms.

· Written in Scala and supports Scala and Java interfaces
· Supports linear regression, logistic regression, SVM, LDA, K-Means and other operations

Multi-GPU Single Node

Bons.ai

Bons.ai

Bons.ai is an artificial intelligence platform which abstracts away the low-level, inner workings of machine learning systems to empower more developers to integrate richer intelligence models into their work.

· Easy to use programming interface
· Novel programming language called Inkling
· Primary focus on reinforcement learning

Multi-GPU Single Node

Brain Frame

Aotu

BrainFrame platform provides Out-Of-The-Box Smart Vision Applications for multiple verticals. The drag-and-drop VisionCapsules system allows you to pick from a wide selection of custom algorithms to extract exactly the information you want

· Jetpack
· Jetson

Single GPU Single Node

Caffe2

Facebook

This is a faster framework for deep learning, forked from BVLC/caffe (master branch). Allows data-parallel training via MPI.

· GPU cluster processing
· Mass image data

Multi-GPU Single Node

Cartwatch Checkout

Signatrix

Protect the checkout area and reduce the workload of your checkout staff

· Real-time alerts on theft (mis-scan) at the checkout lanes
· Featuring Jetpack and TensorRT

Single GPU Single Node

CatBoost

Yandex

CatBoost is an open-source gradient boosting library with categorical features support.

· Extremely fast learning on GPU
· Multi-GPU
· Multi-Node

Multi-GPU Multi-Node

Chainer

Preferred Networks, Inc.

DL framework that makes the construction of neural networks (NN) flexible and intuitive.

· Dynamic NN construction, which makes debugging easier
· CPU/GPU-agnostic coding, which is promoted by CuPy, partially NumPy-compatible multidimensional array library for CUDA
· Data-dependent NN construction, which fully exploits the control flows of Python without magic

Multi-GPU Multi-Node

Checkout Intelligence

Everseen

Loss prevention solution at the POS powered by T4

· Mis-scan detection
· Product and ticket switching detection
· "Walk off" detection

Multi-GPU Single Node

ClearML

Allegro.AI

ClearML provides a suite of tools to streamline ML workflow, including Experiment Manager, ML-Ops and Data Management.

· Multi-system enterprise workflow scheduling
· Version control (e.g., the "git") for models
· DGX-ready and available from NGC
· Open-source and paid options
· Enables reproducibility and automation
· ClearML supports MIG functionality
· TensorFlow, Keras, and PyTorch
· NVIDIA frameworks such as Clara for healthcare and medical imaging
· RAPIDS and TLT

Multi-GPU Multi-Node

CNTK

Microsoft Corp.

Microsoft Computational Network Toolkit (CNTK) is a unified computational network framework that describes deep neural networks as a series of computational steps via a directed graph.

· Speech Recognition
· Machine Translation
· Image Recognition
· Image Captioning
· Text Processing and Relevance
· Language Understanding
· Language Modeling

Multi-GPU Single Node

ConundrumAI

Conundrum Industrial Limited

Conundrum, a UK-based company, develops AI solutions for predictive maintenance and optimization of industrial processes.

· Automated deep learning significantly speeds up the build of applications based on DL models
· Transfer learning boosts the performance of the applications by transferring knowledge between them
· Data-based digital twins and reinforcement learning for optimization

Multi-GPU Single Node
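
The CNTK entry above describes a network as a series of computational steps connected in a directed graph. The sketch below illustrates that general idea in plain Python for a single neuron, y = sigmoid(x * w + b). It does not use CNTK or reproduce its API; the graph layout, the OPS table and the evaluate function are all hypothetical names invented for this example.

```python
import math

# Hypothetical mini-graph: each node maps to (operation, list of input nodes).
graph = {
    "x": ("input", []),
    "w": ("input", []),
    "b": ("input", []),
    "xw": ("mul", ["x", "w"]),
    "z": ("add", ["xw", "b"]),
    "y": ("sigmoid", ["z"]),
}

# Operations available to non-input nodes.
OPS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "sigmoid": lambda a: 1.0 / (1.0 + math.exp(-a)),
}


def evaluate(graph, node, feeds, cache=None):
    """Evaluate `node` by first evaluating the nodes it depends on."""
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    op, inputs = graph[node]
    if op == "input":
        value = feeds[node]
    else:
        args = [evaluate(graph, name, feeds, cache) for name in inputs]
        value = OPS[op](*args)
    cache[node] = value  # each node is computed once per evaluation
    return value


y = evaluate(graph, "y", {"x": 2.0, "w": 0.5, "b": -1.0})
print(y)
```

Because the graph is data rather than code, a framework is free to decide where and in what order each step runs, which is what makes this representation convenient for dispatching work to GPUs.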


Darwin

SparkCognition

Darwin is a machine learning product that accelerates data science at scale by automating the building and deployment of models. Based on a proprietary neuroevolutional algorithm, Darwin uses a combination of ML methods and genetic algorithms to arrive at a new generation of designs.

· Unique neuro-evolutionary algorithm on GPU
· Automated ML for model building on GPUs
· GPU accelerated PyTorch

Multi-GPU Single Node

Databricks Unified Analytics Platform

Databricks

Databricks provides a cloud-based platform designed to make big data and machine learning simple.

· GPU instances available with CUDA drivers included
· GPU support provided by Spark scheduler
· Integration of TensorFlow, Keras
· TensorFrames data connector
· Deep learning pipelines/workflows
· Transfer learning and image loading

Multi-GPU Multi-Node

DeepInstinct

DeepInstinct

Zero day end point malware detection solution offered to enterprise markets.

· Zero-day threats & APT attack detection on endpoints, servers and mobile devices

Multi-GPU Single Node

Deeplearning4j

Skymind

Deeplearning4j is the most popular deep learning framework for the JVM, and includes all major neural nets such as convolutional, recurrent (LSTMs) and feedforward.

· Integrates with Hadoop and Spark to run distributed
· Java and Scala APIs
· Composable framework that facilitates building your own nets
· Includes ND4J, the NumPy for Java.

Multi-GPU Single Node

Dessa

Dessa

Deep Learning Platform based on TensorFlow. Allows end-to-end workflows. Targets consumer banking and insurance industries.

· Deep learning workflows can be built
· Based on TensorFlow
· Use cases in consumer banking and Insurance

Multi-GPU Multi-Node

Dextro

Axon

Dextro's API uses deep learning systems to analyze and categorize videos in real-time.

· Object and scene detection
· Machine transcription for audio
· Motion and movement detection

Multi-GPU Single Node

Dr. Retail

SkyREC Inc.

Instore data analytics

· TensorRT 5.1
· nvJPEG
· NVEnc
· NVDec

Single GPU Single Node

Frenzy Enterprise Solutions

Frenzy

Frenzy Enterprise Solutions provides retailers and brands with the tools to provide customers the best experience and more purchasing opportunities including Similar Product Recommendations, Inventory Tagging, Camera Search, Complimentary Product Recommendations, How To Wear It, Influencer Matching

· GPU on the cloud

Multi-GPU Single Node

G3C.AI

Graymatics

Retail in store analytics solutions through Deep CCTV Streaming Analytics

· In store analytics: heat-maps, shopper tracking, dwell time, people counting, mood detection, demographics
· Featuring TensorRT and Deepstream

Multi-GPU Single Node

Gridspace

Gridspace

Voice analytics to turn streaming speech audio into useful data and service metrics. Instrumental to contact call center and work communications with powerful deep learning-driven voice analytics.

· Speech-to-text transcription
· Compliance
· Call grading
· Call topic modeling
· Customer service enhancement
· Customer churn prediction

N/A

Insights

AnyVision

Insight delivers in-store analytics with features such as: heavy shoppers, gaze estimation, heatmaps, customer journey, and offline to online

· NVIDIA Tesla T4 and Jetson

Multi-GPU Single Node

Keras

Open Source

Keras is a minimalist, highly modular neural networks library, written in Python. Capable of running on top of either TensorFlow or Theano and developed with a focus on enabling fast experimentation.

· cuDNN version (depends on the version of TensorFlow and Theano installed with Keras)
· Supported Interfaces: Python

Multi-GPU Single Node


Malong Retail AI Fresh

Malong Technologies

Malong Retail AI Protect

Malong Technologies

MatConvNet

Mathworks

Matroid

Matroid

MetaMind

Einstein Platform Services

MXNet

Amazon

Neon

Intel

NVCaffe

Berkeley AI Research

Out of stock detection

Focal Systems

PaddlePaddle

PaddlePaddle

Protects & Insights

Briefcam

RetailAI® Fresh solves for the time-consuming and error-prone experience that grocery store customers today struggle with when weighing fresh products on a self-serve scale.

· Supports T4 · Supports DeepStream

For loss prevention at self-checkout and staffed lanes. Leveraging award-winning product recognition technologies, the system accurately identifies and stops common scan errors as they happen, including mis-scans and ticket-switching, while helping to protect customer privacy. Offers industry-leading accuracy while being massively scalable for effectively unlimited SKUs and stores.

· Supports T4 · Supports DeepStream

CNNs for MathWorks MATLAB. Allows you to use MATLAB GPU support natively rather than writing your own CUDA code.

· Building Blocks · Simple CNN wrapper · DagNN wrapper · cuDNN implemented

Matroid offers a video classification service in the cloud. Matroid allows training video detections on a set of images and then applying those video detections.

· Matroid is multi-cloud and allows its customers to easily switch between AWS, Azure and Google Cloud.

Provides a deep learning API for image recognition and text sentiment analysis. Uses either prebuilt, public, or custom classifiers.

· GPU-based training and inference · Recognizes images and analyzes text · Creates and trains classifiers with tooling for uploading and managing datasets

MXNet is a deep learning framework designed for both efficiency and flexibility that allows you to mix the flavors of symbolic programming and imperative programming to maximize efficiency and productivity.

· MXNet supports cuDNN v5 for GPU acceleration

Neon is a fast, scalable, easy-to-use Python-based deep learning framework that has been optimized down to the assembler level. Features a rich set of examples and pre-trained models for image, video, text, deep reinforcement learning and speech applications.

· Training, inference and deployment of deep learning models
· Processes over 442M images per day on a Titan X

The Caffe deep learning framework makes implementing state-of-the-art deep learning easy.

· Processes over 40M images per day with a single NVIDIA K40 or Titan GPU
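
The per-day throughput figures quoted for Neon and Caffe are easier to compare as average per-second rates. The following is a minimal Python sketch of that conversion; the function name `images_per_second` is illustrative, and the inputs are simply the catalog's own figures, not independent measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def images_per_second(images_per_day: float) -> float:
    """Convert a per-day image count into an average per-second rate."""
    return images_per_day / SECONDS_PER_DAY


# Figures as quoted in the catalog entries (not re-measured here).
neon_rate = images_per_second(442e6)   # Neon on a Titan X
caffe_rate = images_per_second(40e6)   # Caffe on a single K40 or Titan GPU

print(f"Neon:  about {neon_rate:,.0f} images/s")
print(f"Caffe: about {caffe_rate:,.0f} images/s")
```

These are averages over a full day and say nothing about batch size, model, or image resolution, which the catalog does not state.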

Deep learning computer vision tracks your on-shelf availability throughout your entire store 100+ times a day.

· On-Shelf Availability Analytics per hour · Real-time Alerts on your "never be outs"

PaddlePaddle (Parallel Distributed Deep Learning) is an easy-to-use, efficient, flexible and scalable deep learning platform, originally developed by Baidu scientists and engineers for the purpose of applying deep learning to many products at Baidu.

· Optimized math operations through SSE/AVX intrinsics, BLAS libraries (e.g. MKL, ATLAS, cuBLAS) or customized CPU/GPU kernels
· Highly optimized recurrent networks which can handle variable-length sequences without padding
· Optimized local and distributed training for models with high dimensional sparse data

Transform video into actionable intelligence. Features: video synopsis and real-time alerts, loss prevention, customer engagement and tying info to POS data, heatmaps, shopper tracking

· NVIDIA Tesla and Jetson · TensorRT

Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Single Node
Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node


QA Bot

Pryon

Retail Analytics

Pilot AI Labs

samadii/dem

Metariver Technology

SAS

SAS

Sentient

Sentient

Shopic Frictionless Shopping

Shopic

SmartCart

Imagr

Smart Skin

Human engine

SpaceKnow PaaS

SPACEKNOW

Challenge: QA bots are easy to build but hard to keep up to date. The last thing you want is a bot distributing wrong answers 24/7.
Solution: With Pryon, QA bots are ridiculously fast and easy to create and, more importantly, easy to monitor and maintain.
Benefits:
- Real-time monitoring of questions asked
- Update or add more answers directly or by adding documents
- Process feedback easily

· V100

Multi-GPU Single Node

Retail in-store analytics for stock out (cameras in shelves), demographics (age/gender), shopper tracking/counting, anomaly detection, drive through solutions and more

· JetPack · Jetson TX2 · RTX 2080

Single GPU Single Node

Software for computing various behaviors of massive numbers of solid particles of various sizes, from small particles with Brownian motion to large particles such as ore, using DEM (Discrete Element Method).

· Solid particle simulator, DEM solver · Multi-Physics module (Drag and Buoyancy force, Magnetic force, Coulomb force, adhesion force, Van der Waals force, Brownian motion and heat effect) · VPS (Virtual Particle System), Cluster model · Co-simulation with MBD (Multi Body Dynamics) solvers (ADAMS, DADS, RecurDyn, Daful) · Co-simulation with ANSYS Mechanical (Flexible body)

Multi-GPU Multi-Node

SAS Machine Learning. SAS Viya Visual Data Mining and Visualization suites now leverage GPU deep learning

· Volta V100 with Tensor Cores · TensorRT for inference on the NVIDIA Jetson TX2 box · RNN · Multiple GPUs on a single SMP node · Homogeneous and heterogeneous MPP with synchronized Stochastic Gradient Descent

Multi-GPU Multi-Node

Sentient is an AI platform company with special focus on digital marketing, ecommerce and finance trading applications.

· Sentient is using GPU deep learning in its commercially available ecommerce, digital marketing and financial trading applications
· Studio.ml is a new project designed to make AI development easier by hiding most of the complexity
· Studio.ml runs on-premise and in the cloud

Single GPU Single Node

Frictionless Shopping - using smart cart

· NVIDIA Xavier NX

Single GPU Single Node

SmartCart comprised of four tiny cameras and AI vision recognition system

· NVIDIA Jetson, Xavier · TensorRT

Single GPU Single Node

AI-enhanced processing of 3D and 4D data. Used to create high quality 3D characters for interactive media (games, mobile apps, VFX, VR/AR and mixed reality experiences, etc.)
- automatic retopology of 3D and 4D data using machine learning
- photogrammetry: noise-reduction and hole-patching using machine learning
- realistic lip-sync using 4D-trained neural network

· CUDA · Hairworks · PhysX · cuDNN · OptiX

Multi-GPU Multi-Node

PaaS for deep learning extraction of satellite data information targeted at Financial Services and Defense / Intelligence. Tracks macro/micro-economic activity by applying deep learning to satellite images.

· Extracts economic activity from satellite images using deep learning
· Provides batch mode extraction

Multi-GPU Multi-Node


Talkmap

Talkmap

TensorFlow

Google

Theano

LISA Lab

The Deep North Video Analytics platform

Deep North

Theft & safety

Third Eye Labs

ThermalNet

Malong Technologies

Torch7

Open Source

TrigoVision

TrigoVision

Unify.ID

Unify.ID

Veesion

Veesion

Visual Intelligence API

Deep Vision

Voca's Virtual Agent

voca.ai

NLU model training/re-training/fine-tuning for contact center operation automation, trained from raw transcripts to identify the intentions automatically, complemented by human annotation. Models are used for post-call analysis, chatbot design etc.

Google's TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.

Theano is a symbolic expression compiler that powers large-scale computationally intensive scientific investigations.

The Deep North platform includes Occupancy Management, Gesture Analysis, Zone Management, Vehicle Analysis, Dashboard and reporting

Theft, safety and loss detection

AI-based dual camera thermal + computer vision screening system that can be utilized by enterprises to help people stay safe during epidemics. Powered by multiple world-class AI models, the system can accurately detect and alert on potentially dangerous temperature levels combined with PPE, occupancy, and social distancing compliance.

Torch7 is an interactive development environment for machine learning and computer vision.

Retail automation platform that provides seamless checkout, shoplifting prevention, and real-time inventory updates.

Behavioral user authentication service

Shoplifting detection using a deep learning algorithm that continuously analyses the content of security cameras. It automatically detects gestures associated with shoplifting in real-time. Sends a video alert to a human operator who confirms the theft and takes action.

Deep Vision specializes in understanding visual content and getting the most value of data by applying visual recognition for enterprises.

Human-like call center conversation AI
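
The data flow graph model in the TensorFlow description (nodes as operations, edges as tensors passed between them) can be illustrated with a small pure-Python sketch. This is not TensorFlow's API; the `Node` class and the helper functions `constant`, `add`, and `scale` are hypothetical names invented for this illustration, and plain lists stand in for tensors.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Tensor = List[float]  # stand-in for a multidimensional array


@dataclass
class Node:
    """A graph node: an operation plus the upstream nodes that feed it."""
    op: Callable[..., Tensor]
    inputs: List["Node"] = field(default_factory=list)

    def evaluate(self) -> Tensor:
        # Each edge carries the tensor produced by an upstream node.
        return self.op(*(n.evaluate() for n in self.inputs))


def constant(values: Tensor) -> Node:
    return Node(op=lambda: list(values))


def add(a: Node, b: Node) -> Node:
    return Node(op=lambda x, y: [p + q for p, q in zip(x, y)], inputs=[a, b])


def scale(a: Node, factor: float) -> Node:
    return Node(op=lambda x: [factor * p for p in x], inputs=[a])


# Build the graph first, then run it: (x + y) * 2
x = constant([1.0, 2.0])
y = constant([3.0, 4.0])
result = scale(add(x, y), 2.0).evaluate()
print(result)
```

Separating graph construction from evaluation is the point of the sketch: the graph describes the computation, and nothing is computed until `evaluate` is called.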

· V100 · P100 · T4 GPUs · cuDNN
· TensorFlow is flexible, portable and performant, creating an open standard for exchanging research ideas and putting machine learning in products
· Abstract expression graphs for transparent GPU acceleration
· TensorRT
· Tesla T4 - Metropolis · Concealment detector in-aisle and in the stockroom · Safety - social distance detector · Checkout Theft Detector at the POS · Smart Alerts · Privacy Protection · Customizable and Flexible Deployment · High Performance Accuracy
· Computational back-ends for multicore GPUs
· TensorRT
· Identifies individuals based on unique factors such as the way they walk, type and sit
· Real-time shoplifting prospects alerts
· Visual Intelligence API allows leading enterprises in verticals like e-commerce and online auctions, media and entertainment, and retail to analyze content related to faces, brands and context tags to perform actions like:
· Curate and organize visual content · Search and recommend visually · Get insights and analytics visually
· Jasper · NeMo

Multi-GPU Multi-Node
Multi-GPU Single Node
Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Multi-Node


vuForecast

DeepVu

Walkout

walkout

Yusp

Gravity R&D

Zippin

Zippin

ML/DL enabled vuForecast learns from historical inventory, point of sale, promotions and logistics data, augmented with DeepVu's real-time data platform aggregating numerous external micro and macro economic signals, to accurately forecast future demand.

Autonomous check out - smart cart

Personalized recommendations for E-commerce, powered by T4

Checkout-free technology offering inventory tracking and insights to ensure the right products are in the right place, at the right time.

· ML (dmlc/XGBoost) + Dask for distributed training
· DL (RNN/LSTM networks) + PyTorch 1.1 · DL (RL) + TensorFlow 1.14 and 2.0
· NVIDIA Jetson TX2
· Search solution to create a smooth product discovery experience
· Product/Content recommendation · On-site personalization · Search personalization · Mobile personalization · E-mail marketing (and push, SMS) personalization · Personalization ad retargeting · Ad exchange yield optimization
· JetPack

Multi-GPU Single Node
Single GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node

Public Sector and National Government

APPLICATION NAME
Advanced Ortho Series

COMPANY NAME
DigitalGlobe

PRODUCT DESCRIPTION
Geospatial visualization

SUPPORTED FEATURES
· Image orthorectification

GPU SCALING
Multi-GPU Single Node

ArcGIS Pro

ESRI

Viewshed2 determines the raster surface locations visible to a set of observer features, using geodesic methods. Transforms the elevation surface into a geocentric 3D coordinate system and runs 3D sightlines to each transformed cell center. Takes advantage of Tensor Cores for both training and inference.

· Viewshed2
· Deep Learning
· Aspect - The values of each cell in the output raster indicate the compass direction the surface faces at that location. It is measured clockwise in degrees from 0 (due north) to 360 (again due north), coming full circle.
· Slope - The output slope raster can be calculated in two types of units, degrees or percent (percent rise).

Multi-GPU Multi-Node
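
The units used for the Aspect and Slope outputs in the ArcGIS Pro entry can be made concrete with a short Python sketch. It assumes the elevation gradient of a cell is already known as two numbers, `dz_dx` (rise per unit distance toward the east) and `dz_dy` (rise per unit distance toward the north); these names and the function `slope_and_aspect` are illustrative only, and the sketch does not reproduce how ArcGIS itself estimates the gradient from a raster.

```python
import math


def slope_and_aspect(dz_dx: float, dz_dy: float):
    """Return (slope_degrees, slope_percent, aspect_degrees) for one cell.

    dz_dx: elevation change per unit distance toward the east (assumed convention)
    dz_dy: elevation change per unit distance toward the north (assumed convention)
    """
    rise_over_run = math.hypot(dz_dx, dz_dy)
    slope_degrees = math.degrees(math.atan(rise_over_run))
    slope_percent = 100.0 * rise_over_run  # percent rise
    if rise_over_run == 0.0:
        return slope_degrees, slope_percent, None  # flat cell faces no direction
    # The surface faces downhill, opposite to the gradient. atan2(east, north)
    # yields a bearing measured clockwise from due north.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope_degrees, slope_percent, aspect


# A surface that rises toward the north at a 1:1 grade faces due south.
print(slope_and_aspect(0.0, 1.0))
```

A 1:1 grade corresponds to 45 degrees and 100 percent rise, which is why the two slope units are not interchangeable.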

Blaze Terra

Eternix

Geospatial visualization tool

· 3D visualization of geospatial data

Multi-GPU Single Node

Elcomsoft

Elcomsoft

High-performance distributed password recovery software with NVIDIA GPU acceleration and scalability to over 10,000 workstations.

· GPU acceleration for password recovery · 10-100x speedup for password recovery

Multi-GPU Single Node

ENVI

L3Harris Inc

Image Processing and Analytics

· Deep Learning training · Deep learning inferencing · Image orthorectification · Image transformation · Atmospheric correction · Panchromatic co-occurrence texture filter · Video processing and analytics using Jagwire

Multi-GPU Single Node

ERDAS Imagine

Hexagon Geospatial

Remote sensing, photogrammetry and GIS toolset for the interactive, semi-automated and automated extraction of information from remotely sensed imagery and point clouds.

· Gray Level co-occurrence matrix (GLCM) image processing operation
· NNDiffuse image pan sharpening operation · Deep learning capabilities using the GPU accelerated versions of TensorFlow

Single GPU Single Node
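
A gray level co-occurrence matrix, listed among the ERDAS Imagine features, counts how often one gray level appears next to another at a fixed pixel offset. The pure-Python sketch below shows the basic counting step under simple assumptions: integer gray levels in `0..levels-1`, a non-symmetric count, and a hypothetical function name `glcm`. It is not ERDAS Imagine's implementation.

```python
from typing import List, Sequence


def glcm(image: Sequence[Sequence[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[int]]:
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dy, c + dx
            # Skip neighbours that fall outside the image.
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
# Default offset (dx=1, dy=0): each pixel paired with its right-hand neighbour.
for row in glcm(image, levels=4):
    print(row)
```

Texture measures such as contrast or homogeneity are then computed from the normalized matrix; that step is omitted here.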


Fortify

Corsight AI

Geomatics GXL

PCI

GeoWeb3d Desktop

Geoweb3d

Graphistry

Graphistry

Ikena ISR

MotionDSP

LuciadLightspeed

Hexagon Geospatial

Manifold Systems

Manifold Systems

OmniSIG

DeepSig Inc.

SNEAK

OpCoast

SocetGXP

BAE Systems

Sureproof Facial Recognition AI For your Safety & Privacy

· Smart technology that can overcome face masks & PPE
· Facial recognition in almost complete darkness & extreme angles
· Non-discriminative algorithm that is ethnicity neutral
· Vintage image match up to 30 years old · Mask detection and alert on subjects not wearing a face mask

Multi-GPU Single Node

Image processing

· Image orthorectification · Additional image processing

Multi-GPU Single Node

Geospatial visualization of 3D and 2D data, mensuration and mission planning

· 3D visualization and analysis of geospatial data

Multi-GPU Single Node

Graphistry is the first visual investigation platform to handle increasing enterprise-scale workloads.

· Graph reasoning · GPU-accelerated visual analytics · Visual pivoting · Rich investigation templating

Multi-GPU Single Node

Real-time full motion video (FMV) and wide-area motion imagery (WAMI) enhancement and computer-vision-based analytics software.

· Real-time super-resolution-based video enhancement on live streams
· Geospatial visualization · Target detection and tracking · Fast 2-D mapping

Multi-GPU Single Node

Geospatial visualization and analysis

· GPU accelerated line of sight and view shed calculations
· GPU accelerated hypsometry calculations, including terrain slope, ridge and valley detection, terrain orientation and azimuth calculations
· GPU accelerated imaging operator for geospatially referenced imagery

Single GPU Single Node

Full-featured GIS, vector/raster processing

· Manifold surface tools & analysis

Multi-GPU Single Node

The OmniSIG sensor provides a new class of RF sensing and awareness using DeepSig's pioneering application of Artificial Intelligence (AI) to radio systems. Going beyond the capabilities of existing spectrum monitoring solutions, OmniSIG is able to not only detect and classify signals but understand the spectrum environment to inform contextual analysis and decision making. Compared to traditional approaches, OmniSIG provides higher sensitivity and accuracy, is more robust to harsh impairments and dynamic spectrum environments, and requires less computational resources and dynamic range.

· Operates in a real-time streaming fashion · Ingests radio samples from many common radio interfaces · Makes use of packet formats like VITA49 or SDDS · Can be used from any device with a browser, including mobile handsets · OmniSIG software also provides its metadata output stream in JSON form for use by other applications

Multi-GPU Single Node

Electromagnetic signals propagation modeling for complex urban and terrain environments.

· Ray tracing, DTED and remote sensing inputs

Multi-GPU Single Node

Visual Profiler utilizes a cognitive vision and profiling methodology (using machine learning algorithms and state of the art deep learning schemes) to provide unlimited object definition and profiling flexibility. The Automatic Spatial Modeler (ASM) is designed to generate 3-D point clouds with accuracy similar to LiDAR. Extracts 3-D objects and 3-D dense point clouds from stereo images. Also extracts accurate building edges and corners from stereo images with high resolution, large overlaps, and high dynamic range.

· Automated 3D feature extraction from LiDAR
· Automated feature detection from imagery using deep learning

Multi-GPU Single Node


Terrabuilder PhotoMesh

Skyline Software

Therm-App® MD Pro

Opgal

Wesafe

WeSmart

PhotoMesh integrates a GPU-based, fast algorithm, able to automatically build 3D models from simple photographs. PhotoMesh revolutionizes the use of geospatial data by fully automating the generation of high-resolution, textured, 3D mesh models from standard 2D images.

· 3D model building from imagery · Building texture generation

Thermal imaging device for body temperature measurement

· Unlimited Hotspot Detection & Tracking · Advanced Deep Learning Algorithm · Linux-Based Solution · Stand-Alone Solution · Remote Sensor · Quick Hotspot Detection · Up to 20 Simultaneous Scans · Audio & Visual Alert

Simple low cost IVA solution for up to 4 cameras on a Jetson Nano, performing people detection in ROI and people counting.

· People Detection in ROI · Night/Day, People Counting · Push notifications with visuals of the alerts · Simple setup, ONVIF Cameras detection.

Multi-GPU Single Node
Single GPU Single Node
Single GPU Single Node

Design for Manufacturing/Construction: CAD/CAE/CAM

CFD (MFG)
APPLICATION NAME
Actran

COMPANY NAME
FFT

ADS Flow Solver - Code LEO

ADSCFD, Inc.

Altair AcuSolve

Altair

Altair nanoFluidX

Altair

Altair ultraFluidX

Altair

Ansys Fluent

ANSYS

Ansys Icepak

ANSYS

Ansys Polyflow

ANSYS

PRODUCT DESCRIPTION
Simulation of acoustics propagation at high frequency or in huge domains such as exhaust of turbomachines, full truck cabin exterior acoustics, and ultrasonic parking sensors.

SUPPORTED FEATURES
· Discontinuous Galerkin Method (DGM) solver

A compressible, explicit time-marching CFD solver for aerospace applications. Capable of handling both internal and external flows with robustness and accuracy.

· Unstructured/Structured Meshes · Multigrid Accelerations · Multiple Turbulence Models · Rotor-stator Interfaces

Computational Fluid Dynamics (CFD) tool, providing users with a full range of physical models. Simulations involving flow, heat transfer, turbulence, and non-Newtonian materials are handled with ease by AcuSolve's robust and scalable solver technology.

· Linear solvers for flow, temperature, turbulence model, and mesh movement equations

State-of-the-art particle-based (SPH) fluid dynamics code for simulation of single and multiphase flows in complex geometries with complex motion.

· Extremely fast · Single and Multiphase Flows · Arbitrary motion definition · Time-dependent acceleration · Inlets/outlets · Surface tension and adhesion · Steady-state thermal solutions through coupling

Simulation tool for ultra-fast prediction of the aerodynamic properties of passenger and heavy-duty vehicles as well as for the evaluation of building and environmental aerodynamics.

· CUDA-accelerated high-fidelity flow field computations based on the Lattice Boltzmann method
· CUDA-aware MPI support for multi-GPU and multi-node usage
· Efficient implementation of tailor-made automotive features, including rotating wheels, belt systems, boundary layer suction and porous media support

General purpose CFD software

· Linear equation solver · Radiation heat transfer model · Discrete Ordinate Radiation model

CFD software for electronics thermal management

· Linear Equation Solver

CFD software for the analysis of polymer and glass processing

· Direct Solvers

GPU SCALING
Multi-GPU Multi-Node
Multi-GPU Multi-Node
Single GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Single Node


CharLES

Cascade Technologies, Inc.

CPFD BarracudaVR and Barracuda

CPFD

DYVERSO

Next Limit

Fine/Open

Numeca International

FINE/Turbo

Numeca International

GeoPlat-RS

GridPoint Dynamics (GPD)

HiFUN

SANDI

JSCAST

Qualica Inc.

midas NFX(CFD)

Midas

MIKE 21

DHI

MIKE 3

DHI

MIKE FLOOD

DHI

MSC Apex Generative Design

MSC Software

CharLES is a GPU-accelerated CFD software application specializing in LES (Large Eddy Simulations). Runs on a range of CUDA GPUs from Kepler to Turing architectures and scales with multiple GPUs in a single server node as well as scales across multiple GPUs over a cluster of nodes.

· CUDA Toolkit

Multi-GPU Multi-Node

Modeling software for simulating Fluidized Reactors

· Linear equation solver for isothermal, non-reacting simulations and for thermal reacting cases
· Discrete multi-component particle calculations

Single GPU Single Node

Multi-physics simulation engine for liquids and granular substances. Can be used to mimic behavior of rigid and soft bodies

· Fluid solver in Real Flow 10.5 based on Smoothed particle hydrodynamics (SPH)
· Fluid solver in Real Flow 10.5 based on Position based dynamics (PBD)

Single GPU Single Node

FINE/Open with OpenLabs is a powerful CFD Flow Integrated Environment dedicated to complex internal and external flows. It allows users to freely develop and exchange physical models in CFD, with a new open approach to CFD. Complex programming tasks are avoided through the usage of an easy meta-language.

· Incompressible, low and high speed flows · Efficient preconditioned compressible solver with fast agglomerated multigrid acceleration and adaptation techniques to combine completely unstructured hexahedral grids

Multi-GPU Multi-Node

Structured, multi-block, multi-grid CFD solver targeting the turbo machinery industry

· Multi-grid solver

Multi-GPU Multi-Node

Geoplat Pro-RS is a parallel hydrodynamic simulator with a flexible architecture. This makes it possible to reduce the time for writing the entire simulator by 2/3 and, as a consequence, to quickly bring new physical processes into the algorithm.

· CUDA · Spectral Decomposition with cuFFT library

Multi-GPU Single Node

High Resolution Flow Solver on Unstructured Meshes. State-of-the-art Euler/RANS solver. Super scalability on massively parallel HPC platforms, with code ported using OpenACC directives for NVIDIA GPU.

· HiFUN incorporates the most recent CFD technologies, many of them home grown
· HiFUN exhibits highly scalable parallel performance with its ability to scale up to several thousand processors on massively parallel computing platforms
· Capable of handling complex geometries and flow physics arising in high lift flows

Multi-GPU Single Node

Integrated CAE product for studying and predicting the casting process. Includes high precision mold filling and solidification solvers.

· Solvers for mold filling and solidification · Rendering

Single GPU Single Node

General purpose CFD software based on FEM

· Linear equation solver (Iterative Solver and AMG Preconditioner)

Single GPU Single Node

2D hydrological modelling of coast and sea for simulating physical, chemical, and biological processes

· Flexible Mesh (FM) engines use GPUs. · Hydrodynamic and turbulence calculations

Multi-GPU Single Node

3D Modeling of Coast and Sea

· Hydrodynamic part of the flexible mesh engines (MIKE 3 HD FM).

Multi-GPU Multi-Node

1D & 2D urban, coastal, and riverine flood modelling

· Hydrodynamics · 2D Overland flow · Coupling of 1D and 2D models for complex flooding issues

Multi-GPU Single Node

Generative Design based simulation to create several optimized, lightweight designs ultra-fast and almost fully automated

· Ultra-fast matrix solving · Accelerated computing power for part optimizations

Multi-GPU Multi-Node


M-Star CFD
Numerix
Pacefish
Particleworks
PowerViz
ScPOST
Simcenter 3D
Simcenter STAR-CCM+
Speed IT FLOW
Turbostream
zCFD

M-Star Simulations, LLC

General purpose CFD Multiphysics modeling software

Zeus

Custom software development in the areas of CFD, FEA and Electromagnetics

Numeric Systems GmbH

CFD application for Automotive Aerodynamics, Pedestrian Comfort and Wind Loading

Prometech
Dassault Systèmes SIMULIA Corp. Hexagon, Cradle
Siemens Digital Industries Software
Siemens Digital Industries Software Vratis

CFD software using MPS (Moving Particle Simulation) method for automotive, energy, material, chemical processing, medical, food, and civil engineering industries where free surface fluid flow and fluid mixing phenomena occur.
Industry proven, modern post-processing app for EXA POWERFLOW CFD
Postprocessor for visualizing simulation results from CFD analysis, MSC Nastran and MSC Marc
A unified, scalable, open and extensible environment for 3D CAE with connections to design, 1D simulation, test, and data management.
Integrated solution for CFD-focused Multiphysics simulation
Incompressible single-phase CFD software

Turbostream Ltd.

CFD software for turbomachinery flows

Zenotech Simulation Unlimited

General purpose CFD solver

· Fluid flow & heat transfer · DEM simulation · Chemical reactions · Multi-phase flow

Multi-GPU Multi-Node

· Lattice Boltzmann Method (LBM) for flow around buildings
· SPH based flow solver for simulating flow over urban environments

Multi-GPU Single Node

· Transient Lattice-Boltzmann Method for single-phase flows
· Integrated fast and robust pre-processor for complex geometries
· Local grid refinement · uRANS (K-Omega-SST), hybrid uRANS-LES (SST-DDES & SST-IDDES) · LES (Smagorinsky) turbulence modeling · Scalable up to 16 GPUs

Multi-GPU Single Node

· Explicit and Implicit methods

Multi-GPU Multi-Node

· Rendering · Ray tracing
· File loading acceleration
· Rendering · Ray tracing

Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Single Node

· Rendering
· Finite-volume solver: SIMPLE and PISO, incompressible single-phase flows with k-Omega SST turbulence
· Finite Volume explicit solver for RANS/URANS calculations
· Variable time-steps and multigrid for convergence acceleration
· Turbulent flow (RANS, URANS, DDES or LES) including automatic scalable wall functions

Single GPU Single Node
Single GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Single Node

CFD (RESEARCH DEVELOPMENTS)

APPLICATION NAME
ALYA

COMPANY NAME
Barcelona Supercomputing Center (BSC)

PRODUCT DESCRIPTION

SUPPORTED FEATURES

Alya is a high performance computational mechanics code to solve complex coupled multi-physics / multi-scale problems, which are mostly coming from the engineering realm.

· Incompressible Flows · Compressible Flows · Non-linear Solid Mechanics · Species transport equations · Excitable Media · Thermal Flows · N-body collisions

DualSPHysics

University of Manchester

SPH-based CFD software

· SPH model

HiPSTAR

University of Southampton and University of Melbourne Sandberg

CFD software for compressible reacting flows

· Explicit solver

GPU SCALING
Multi-GPU Multi-Node
Multi-GPU Single Node
Multi-GPU Single Node


Project Chrono
PyFR
RAPTOR
S3D

University of Wisconsin-Madison

Chrono is a physics-based modelling and simulation infrastructure based on a platform-independent open-source design implemented in C++. Systems can be made of rigid and flexible/compliant parts with constraints, motors and contacts; parts can have three-dimensional shapes for collision detection.

Imperial College - Vincent
US DOE
Sandia and Oak Ridge NL

General purpose CFD software for compressible flows
CFD formulation of turbulent combustion for fuel injector and other engine applications
Direct numerical solver (DNS) for turbulent combustion

· Robotics · Wheeled vehicle dynamics · Tracked vehicle dynamics · Nonlinear finite element analysis · Mechatronics · Off-road vehicle mobility · Terramechanics · Virtual reality · Granular flows · Collision detection · Autonomous vehicles · Seismic engineering · Augmented reality
· High-order explicit solver based on flux reconstruction method
· Flow solver
· Chemistry model

Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Multi-Node

COMPUTATIONAL STRUCTURAL MECHANICS

APPLICATION NAME
Adams

COMPANY NAME
MSC Software

PRODUCT DESCRIPTION
Multi-Body Dynamics simulation software

SUPPORTED FEATURES
· Rendering

GPU SCALING
Single GPU Single Node

Altair EDEM

Altair

Software for bulk material simulation that uses the Discrete Element Modeling (DEM) technology to simulate and analyze behavior of bulk materials

· EDEM Simulator, a DEM solver · Integration with Ansys and Abaqus for FEA for bulk material simulation · Integration with Adams, Siemens and RecurDyn for Multi-body Dynamics · Integration with Ansys Fluent for Particle-Fluid Systems

Multi-GPU Single Node

Altair HyperWorks

Altair

Comprehensive, open architecture CAE simulation suite in the industry, offering the best technologies to design and optimize high performance, weight efficient and innovative products. It includes a full set of modeling and visualization tools.

· OpenGL v3.2 · OpenCL v2.0 support · Anti-aliasing

Single GPU Single Node

Altair OptiStruct

Altair

Industry proven, modern structural analysis solver for linear and nonlinear problems under static and dynamic loadings. It is also the market-leading solution for structural design and optimization.

· Direct solver (BCS) · Eigenvalue solvers (AMSES and Lanczos) · Iterative solver (PCG)

Single GPU Single Node

Amphyon

AdditiveWorks

Simulation-based process software for powder bed based, laser beam melting additive manufacturing processes

· Mechanical Process Simulation · Thermal Process Simulation

Single GPU Single Node

Ansys Mechanical

ANSYS

Simulation and analysis tool for structural mechanics

· Direct and iterative solvers

Multi-GPU Multi-Node

Autodesk Nastran

Autodesk

Autodesk Nastran FEA software analyzes linear and nonlinear stress, dynamics, and heat transfer characteristics of structures and mechanical components.

· Double Precision on GPU

Multi-GPU Multi-Node

GranuleWorks

Prometech

DEM-based advanced simulator for granular materials in pharma and powder metallurgy: granular material segregation, screening, grinding, screw conveying, mixing, compaction, filling, dustproof, toner transport, electrode materials filling, cliff collapses/debris flow, etc.

· Size distribution, contact force model, rolling resistance model, liquid bridge force model, van der Waals force model, heat transfer and external force.
· Boundary conditions: polygon wall, inflow and outflow boundary, and simulation domain.
· Coupling with Particleworks MPS solver: support for aeration and pumps

Multi-GPU Multi-Node

Helyx PEM

Engys

Specialised add-on solver for HELYX to simulate large numbers of solid objects in motion using the Polyhedral Element Method (PEM)

· Polyhedral Elements Method solver

Single GPU Single Node


Impetus Afea

Impetus Afea

Predicts large deformations of structures and components exposed to extreme loading conditions

· Non-linear Explicit Finite-Element Solver

Multi-GPU Single Node

Irazu

Geomechanica Inc.

Simulation and analysis tool for rock mechanics, involving large deformations, fracturing and multi-physics phenomena.

· Explicit 2D and 3D FEM and FDEM solvers
· Coupled hydraulic, mechanical, transport, thermal and fracture processes

Single GPU Single Node

Marc

MSC Software

Simulation and analysis tool for structural mechanics

· Direct sparse solver

Multi-GPU Single Node

MatDEM

Nanjing University

MatDEM is software for fast GPU matrix computing of the Discrete Element Method. The software implements automatic stacking modeling, layered material, joint surface and load settings, rich post-processing functions and secondary development.

· Full product support on GPU

Multi-GPU Single Node

midas GTS NX

Midas

Simulation tool for geo-technical analysis

· Linear equation solver (Multi Frontal Solver)

Single GPU Single Node

midas NFX (Structural)

Midas

Simulation and analysis tool for structural mechanics

· Linear equation solver (Multi Frontal Solver)

Single GPU Single Node

MSC Nastran

MSC Software

Multidisciplinary structural analysis application used to perform static, dynamic, and thermal analysis across linear and nonlinear domains

· Direct sparse solver

Multi-GPU Single Node

PERMAS-XPU

INTES GmbH

General purpose structural simulation software

· Linear Equation Solver

Single GPU Single Node

RecurDyn

FunctionBay, Inc.

Multi-Flexible Body Dynamics simulation software

· Rendering

Single GPU Single Node

Rocky DEM

Rocky DEM

Discrete Element Modeling (DEM)-based particle simulation software for simulating behavior of bulk materials with complex particle shapes and size distributions

· Explicit DEM solver (dry/sticky contact rheologies)
· 1-way & 2-way coupling with ANSYS Fluent and ANSYS Mechanical

Multi-GPU Single Node

Simcenter Nastran

Siemens Digital Industries Software

Finite element method (FEM) solver for computational performance, accuracy, reliability and scalability

· Linear and nonlinear equation solver
· Frequency response module
· Matrix decomposition computations

Multi-GPU Multi-Node

SIMULIA 3DEXPERIENCE

Dassault Systèmes SIMULIA Corp.

Realistic simulation solution (Uses Abaqus Standard for GPU computing)

· Direct sparse solver

Single GPU Single Node

SIMULIA Abaqus/Standard

Dassault Systèmes SIMULIA Corp.

Simulation and analysis tool for structural mechanics

· Direct sparse solver
· AMS Solver
· Steady State Dynamics

Multi-GPU Multi-Node

ThreeParticle/CAE

BECKER 3D GmbH

Multiphysics Discrete Element Method (DEM) simulation platform for bulk materials with complex shapes and built-in multi-body dynamics (MBD), Finite Element Analysis (FEA) & Smoothed Particle Hydrodynamics (SPH)

· GPU accelerated Smoothed Particle Hydrodynamics
· Simulate complex and real particle shapes using DEM combined with SPH, FEA, MBD, Wear

Single GPU Single Node

DESIGN AND VISUALIZATION

APPLICATION NAME
COMPANY NAME
PRODUCT DESCRIPTION
SUPPORTED FEATURES
GPU SCALING

3D CAT.live

Shenzhen Rayvision Technology Co Ltd

Real-time rendering cloud service for 3D applications. GPU computing power in the cloud processes the heavy image rendering calculations and streams the output to the terminal device synchronously, keeping the terminal device lightweight and making high-quality 3D graphics applications ubiquitous. Users can use any common networked device to access a 3D application hosted in the 3DCAT cloud without downloading and installing the application. Supports almost all rendering engines that can run on the Windows platform, and supports enabling NVIDIA RTX real-time ray tracing.

· Cloud XR SDK
· DLSS (potential)

Multi-GPU Multi-Node


3DEXCITE DeltaGen

Dassault Systèmes

High-end 3D visualization and realtime interaction to help increase visual quality, speed, and flexibility.

· Interactive ray tracing and global illumination.
· Integration with Siemens Teamcenter.
· Cluster support Realtime & Offline Production Process Integration and scene building.
· Scene Analysis, Xplore DeltaGen, SDK for DeltaGen.

Multi-GPU Single Node

6SigmaET

Future Facilities

Thermal simulation software for the electronics industry. 6SigmaET's unique MLUS Computational Fluid Dynamics (CFD) solver predicts thermal issues in complex electronics equipment.

· Monte-Carlo ray tracing for Heat Radiation
· NVIDIA's OptiX library

Single GPU Single Node

Abaqus/CAE

Dassault Systèmes SIMULIA Corp.

Complete solution for Abaqus finite element modeling, visualization, and process automation

· Rendering

Multi-GPU Single Node

Accelerad

MIT Sustainable Design Lab

Accelerad is a free suite of programs for fast and accurate lighting and daylighting analysis and visualization.

· Up to forty times faster using OptiX
· Renderings with large numbers of ambient bounces
· Calculations over many thousands of sensor points
· Fast simulation of annual climate-based daylighting metrics
· AcceleradRT - Interactive interface for real-time daylighting, glare, and visual comfort analysis with validated accuracy. Includes AcceleradVR, an immersive visualization interface compatible with most virtual reality headsets.

N/A

Additive Mfg Toolkit

Dyndrite

Dyndrite has developed a GPU-based geometry kernel with CUDA. The initial application for this kernel is an Additive Manufacturing Toolkit which speeds up the process of 3D printing, especially for complex parts.

· CUDA

N/A

ALLPLAN

Nemetschek ALLPLAN

Complete Building Information Modeling (BIM) for Architecture, Engineering, and Construction.

· OpenGL 4, and now moving to Vulkan
· Vulkan for wireframe rendering already, with plan to ship full integration with Version 2022 in September 2021

Single GPU Single Node

ANSA

BETA CAE Systems

Multidisciplinary CAE pre-processing tool for full model build up, from CAD data to ready-to-run solver input file, in a single integrated environment

· OpenGL
· OpenCL

Single GPU Single Node

Ansys Discovery Live

ANSYS

Interactive and CAD-agnostic Windows-based app that gives engineers instantaneous simulation results to help them explore and refine product designs

· OpenGL-based visualization
· CUDA-based Structural Stress, Modal, Fluid Dynamics, Thermal, Electrical Conduction and Coupled Multi-Physics simulations

Single GPU Single Node

Ansys SPEOS

ANSYS

Physically accurate optical simulation software dedicated to predictive illumination and optical performance of systems. High-fidelity visualization of the final result, based on unique human vision algorithm.

· SPEOS Live Preview
· 360 degrees for immersive or observer view
· Optical part design
· Optical sensors test
· HUD design and analysis
· Infrared modeling

Single GPU Single Node

Ansys VRXPERIENCE for HMI and Perceived Quality

ANSYS

Predictive physics-based real time lighting simulation with VR capabilities to experience and validate the impact of your design proposition on appearance and perceived quality.

· Physics-based real time lighting simulation with VR capabilities from HMD to CAVEs (multi-GPU, multi-node)
· SPEOS Live Preview (raytracing) based on CUDA/OptiX benefiting from RTX architecture (single GPU)
· Scalable rendering capabilities, ranging from rasterization to fully GPU ray-traced SPEOS Live Preview

Single GPU Single Node


Ansys VRXPERIENCE Lighting and Sensors

ANSYS

Predictive validation of vehicle systems for the optimization of intelligent headlamp units and sensors dedicated to ADAS and AD. Rapid and simple virtual test of systems, relying on the unique combination of visually realistic driving simulator, and physics-based simulation. Real-time and interactive driving simulator to virtually create, test and experience future vehicle driving in real-world like conditions.

· Multispectral Physics-based real time lighting simulation with multi-display capabilities (driving simulator).

Multi-GPU Multi-Node

Ansys Workbench

ANSYS

Industry proven, modern pre- & postprocessing app for CAE

· Rendering

Multi-GPU Single Node

Apex

MSC Software

Unified environment for virtual product development

· Rendering

Single GPU Single Node

Archicad

Nemetschek GRAPHISOFT

Complete Building Information Modeling (BIM) for Architecture, Engineering, and Construction.

· OpenGL based GPU rendering
· Fast, efficient graphics in the viewport
· RTX photorealistic rendering with Twinmotion, internal rendering engine based on CineRender, and now integrating Redshift into Archicad.

Single GPU Single Node

Arch-Log

Luminova Japan

A web service based on NVIDIA Iray and RealityServer (from migenius) for rendering and configuring building materials.

· Iray
· RealityServer
· Quadro
· DGX

Multi-GPU Multi-Node

AutoCAD

Autodesk

2D and 3D CAD designing, drafting, modeling, architectural drawing, and engineering software.

· Surface, mesh and solid modeling tools, model documentation tools, parametric drawing capabilities
· OpenGL
· Native DWG support
· GRID Support.

Single GPU Single Node

Avatar VR

NeuroDigital Technologies

Haptic VR gloves for training design or remote operation.

· PhysX

Single GPU Single Node

BricsCAD

Hexagon PPM

Building information modeling software for design, construction, documentation, and manufactured building products.

· Rendering

Single GPU Single Node

CATIA 3DEXPERIENCE

Dassault Systèmes

The reference CAD application for advanced engineering with batching capability and extreme reliability, used by 80% of the automotive industry and the entire aerospace industry.

· GPU OpenGL performance scaling in R2017x
· VR native integration with HTC Vive in R2017x
· VR SLI in R2018x
· Stellar GPU in R2019x FD01

Single GPU Single Node

CATIA Live Rendering

Dassault Systèmes

Realistic 3D Rendering on full CATIA 3D CAD model.

· Physically Based Rendering with no data preparation thanks to native NVIDIA Iray Photoreal integration and interactive realistic rendering using NVIDIA Iray IRT

Multi-GPU Single Node

Clarisse

Isotropix

Set dressing and layout tool with integrated renderer

· GPU accelerated interactive rendering 50-100X faster than with CPU
· OptiX AI-accelerated de-noising

Single GPU Single Node

Clip Studio Paint

Celsys

Clip Studio Paint is a versatile digital painting program that is ideal for the digital creation of comics, general illustration, and 2D animation.

· Accelerated processing and AI features

Single GPU Single Node

Clo3D

CLO Virtual Fashion Inc

3D garment simulation and design

· CUDA

Single GPU Single Node

COMSOL

COMSOL

Multiphysics general-purpose simulation software for modeling designs, devices and processes in all fields of engineering, manufacturing, and scientific research

· OpenGL version 2.0
· DirectX version 9

Multi-GPU Single Node

Creo Generative Topology Optimization Extension (GTO)

PTC

Creo Generative Topology Optimization Extension (GTO) creates optimized product designs based on your constraints and requirements - including materials and manufacturing processes

· CUDA accelerated Generative Design

Multi-GPU Single Node


Creo Parametric

PTC

Professional 3D CAD software for product design and development, including parametric modeling, simulation/analysis, and product documentation for companies ranging from SMB to Enterprise.

· GPU accelerated real-time engineering simulation with Creo Simulation Live
· Full scene anti-aliasing
· Order independent transparency
· Better lighting and enhanced shaded-with-edges mode
· Immersive design environment with realistic materials

Single GPU Single Node

Easy 3D Scan

Cappasity

3D digitizing software that creates and embeds 3D product images into your website, mobile and AR/VR apps, and gives your customer a near real shopping experience.

· OpenCL

Single GPU Single Node

Enscape

Enscape GmbH

Renderer with Plug-in for Revit, Rhino, SketchUp, ARCHICAD, and Vectorworks

· Full RTX-enabled
· One-click to VR experience
· Design reviews for buildings
· 3D and VR visualization of CAD data for AEC

Single GPU Single Node

Grasshopper

McNeel & Assoc.

Grasshopper is a graphical algorithm editor tightly integrated with Rhino's 3-D modeling tools. Unlike RhinoScript, Grasshopper requires no knowledge of programming or scripting, but still allows designers to build form generators from the simple to the awe-inspiring.

· Fast, scalable OpenGL 3.3 pipeline leverages latest NVIDIA GPUs
· GPU computed shaders and memory optimizations
· Rhino 6 leverages NVIDIA RT Cores for Real-time ray tracing viewport mode
· Rendering engine is CYCLES, fully integrated inside Rhino 6 now

Single GPU Single Node

IC.IDO

ESI Group

Immersive VR solution for engineering and virtual prototyping. The Helios rendering engine is highly optimized for NVIDIA GPUs.

· NV Pro Pipeline (RiX) for OpenGL rendering
· VRWorks SPS and VR SLI (NVLink support)
· DesignWorks, including VR Occlusion Culling open source sample and OptiX

Multi-GPU Single Node

ImageStation

Hexagon Geospatial

ImageStation software suite designed for high-volume photogrammetry and production mapping including aerial and satellite triangulation, stereo feature and digital terrain model (DTM) collection and editing, automatic DTM and digital surface model (DSM) generation, and orthophoto production and editing

· Stereo Display and Viewing

Single GPU Single Node

Inspire Studio/Render (formerly known as Evolve)

Altair

Inspire Studio is a high quality 3D Hybrid Modeling and Rendering environment that enables industrial designers to evaluate, research and visualize various designs faster than ever before. Inspire Studio runs on both Mac OS X and Windows.

· NURBS modeling
· PolyNURBS modeling
· OpenGL 4.5 Core
· OpenGL-based real-time high-quality rendering
· Interactive high-quality rendering using Thea Render
· Production rendering using Thea Render
· Integrated "dark room" environment to manage render queue and post-processing of rendered images

Single GPU Single Node

Inventor

Autodesk

3D mechanical design, documentation, and product simulation.

· Uses BIM for intelligent building components to improve design accuracy

Single GPU Single Node

Iray

NVIDIA

A ready-to-integrate, physically-based, photorealistic rendering solution.

· Iray Interactive
· Iray Photoreal
· Iray Server
· Fast interactive ray tracing
· Physically-based, global-illumination rendering
· Distributed cluster rendering.

Multi-GPU Multi-Node

Iray for 3ds Max

Siemens Digital Industries Software

A physically-based renderer plugin for Autodesk 3ds Max

· Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support and AI based denoising

Multi-GPU Multi-Node

Iray for Maya

0x1 Software & Consulting GmbH

A physically-based renderer plugin for Autodesk Maya.

· Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support, AI based denoising

Multi-GPU Multi-Node


Iray for Rhino

migenius Pty Ltd

Iray plugin for Rhino

· Iray Photoreal and Iray Interactive support
· VCA clustering
· Cloud rendering
· MDL support.

Single GPU Single Node

Iray Server

migenius Pty Ltd

The scaling solution for any Iray based application

· Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support and AI-based denoising

Multi-GPU Multi-Node

KeyShot

Luxion

Physically correct real time and batch CPU / GPU photorealistic renderer, popular in manufacturing, AEC, and M&E

· GPU accelerated real time and batch rendering with NVIDIA OptiX
· GPU accelerated AI Denoising with NVIDIA OptiX Denoiser
· Network rendering on GPU accelerated nodes
· Support for 30 different native file formats, many free plugins and live linked applications

Multi-GPU Multi-Node

LensMechanix

Zemax

LensMechanix is the best application for mechanical engineers to package optical systems in CAD software. It is available for SOLIDWORKS users and for Creo Parametric users.

· Optical product teams need an easier and faster way to get from design to manufacture
· LensMechanix is the answer
· LensMechanix is software for mechanical engineers who design housing for optical products in CAD
· With LensMechanix, mechanical engineers can access the complete design data of optical systems designed in OpticStudio and start designing the mechanical envelope right away
· They can then validate their mechanical design and fix issues before building a physical prototype

Single GPU Single Node

LumenRT

Bentley Systems

Easily integrate life-like digital nature into your simulated infrastructure designs, and create high-impact visuals for stakeholders. Best for very large infrastructure, i.e. 100s of square kilometers rendering.

· RT Cores for real time ray tracing
· TensorRT for denoising
· All using the DXR API

Single GPU Single Node

Medium by Adobe

Adobe

PC-based VR sculpting app for modeling & painting in Quest VR headsets. For beginners as well as pros. Acquired by Adobe from Oculus in December 2019. Requires link cable to PC.

· GLSL shaders
· Vulkan
· NVENC

Single GPU Single Node

META

BETA CAE Systems

High-performance multi-disciplinary CAE post-processor

· OpenGL
· OpenCL

Single GPU Single Node

META VR

BETA CAE Systems

Powerful processing and visualization environment for interaction with full-scale simulation models with collaboration capabilities

· OpenGL
· OpenCL

Single GPU Single Node

MicroStation Connect

Bentley Systems

MicroStation is the world's leading 3D computer-aided design and visualization software for the architecture, engineering, construction, and operation of all infrastructure types. Largest CAD in AEC for Civil Engineering users. Very tight collaboration with Autodesk Revit. MicroStation has an internal rendering tool called Vue, shipping with the base CAD tool.

· Digital Nature modeling is Full Ray Tracing-enabled
· Reality Modeling leveraging NVIDIA AI acceleration
· GPU acceleration for Viz, Rendering, Simulation. Bentley apps are optimized for NV Quadro RTX

Single GPU Single Node

Notch Builder

10bit FX

A motion graphics and VFX tool designed by games artists and VJs. Compositing, grading and strong inter-operability with other packages.

· GPU accelerated graphics and effects

Single GPU Single Node

NX

Siemens Digital Industries Software

Siemens PLM Software premium design app with full Iray integration, supporting multi-GPU rendering. Still CPU-bound for most tasks otherwise.

· GRID support
· Iray, MDL (see NX Ray Traced Studio)

Multi-GPU Multi-Node


OpticStudio

Zemax

OpticStudio combines complex physics and interactive visuals so you can analyze, simulate, and optimize optics, lighting and illumination systems, and laser systems, all within tolerance specifications.

· Share designs between OpticStudio and CAD packages as native files, giving mechanical engineers full access to the optical coordinate system and all critical dimensions; there is no need for file format conversions, which can cause loss of design data
· Simulate the impact of mechanical components on optical performance to uncover any issues and make informed design decisions
· Check for, and resolve errors, before building costly physical prototypes

N/A

Painter

Corel

Raster-based digital art application for drawing, sketching and painting.

· GPU accelerated brushes

Single GPU Single Node

Patran

MSC Software

Industry proven, modern pre- & postprocessing app for CAE

· Rendering

Single GPU Single Node

Quark VR

Quark VR

QuarkVR is an ultra-fast software solution which provides low-latency compression and wireless transmission. It offloads the heavy processing on the GPU, and is hardware-agnostic.

· CUDA

Single GPU Single Node

QUINDOS

Hexagon Manufacturing Intelligence

Coordinate metrology software

· Rendering

Single GPU Single Node

RealityServer

migenius Pty Ltd

3D rendering and collaborative visualization and model manipulation platform based on NVIDIA Iray.

· NVIDIA Iray.

Multi-GPU Multi-Node

Recap PRO

Autodesk

ReMake is a solution for converting reality captured with photos or scans into high-definition 3D meshes. These meshes can be cleaned up, fixed, edited, scaled, measured, re-topologized, decimated, aligned, compared and optimized for downstream workflows entirely in ReMake.

· Generation of 3D meshed models from laser scans or photos of an object
· GPU accelerated photogrammetry process from 2D to 3D
· 3D model display accelerated by GPU for smooth navigation of converted models in all display modes

Multi-GPU Single Node

REMCOM WaveFarer

REMCOM

WaveFarer is a high-fidelity radar simulator for drive scenario modeling at frequencies up to and beyond 100 GHz.

· Near-field propagation method
· Targeted ray casting, dynamic scenario, radiation patterns from antennas

Multi-GPU Single Node

RETOMO

BETA CAE Systems

New software for the generation of 3D-tessellated models from CT-scan images

· OpenGL

Single GPU Single Node

Review

PiXYZ

Imports any CAD data to prepare and experience your content with VR.

· Large CAD file support with NVIDIA Pascal Single Pass Stereo extension integration

Single GPU Single Node

Revit

Autodesk

Building Information Modeling (BIM) for architecture, engineering and construction.

· Modeling (BIM) to design, build, and maintain higher-quality, more energy-efficient buildings
· GRID support

Single GPU Single Node

RHINO

McNeel & Assoc.

General purpose conceptual/industrial design software for AEC and Manufacturing industries, including CYCLES (their custom renderer based on open source Blender), a real-time ray-traced display mode that is CUDA-based.

· Fast, scalable OpenGL 3.3 pipeline leverages latest NVIDIA GPUs
· GPU computed shaders and memory optimizations
· Rhino 6 and new RHINO 7 leverages NVIDIA RT CUDA Cores for Real-time ray tracing viewport mode, and Tensor Cores for Denoising
· Rendering engine is CYCLES, fully integrated inside RHINO 7 now

Single GPU Single Node

Simcenter Femap

Siemens Digital Industries Software

Engineering simulation application for creating, editing, and importing/re-using mesh-centric finite element analysis models of complex products or systems

· Rendering

Single GPU Single Node


Simcenter Prescan

Siemens Digital Industries Software

Virtually validate ADAS and automated vehicle functionalities by replicating real world scenarios, adding sensor models, and interface for control systems to design and verify algorithms for data processing, sensor fusion, decision making and control

· Speed up the TIS sensor used for radar, lidar, PMD and ultrasonic sensors
· Camera sensor and fisheye camera sensor

Multi-GPU Multi-Node

Simcenter STAR-CCM+ VR

Siemens Digital Industries Software

Immersive VR for CFD results visualization

· HTC Vive virtual reality headset

Single GPU Single Node

Simpleware

Synopsys

3D image data visualization, analysis and model generation software

· OpenGL

Single GPU Single Node

SketchUp Pro

Trimble SketchUp

SketchUp, formerly Google SketchUp, now part of Trimble in Sunnyvale, CA. SketchUp is a 3D modeling computer program for a wide range of drawing applications such as architectural, interior design, landscape architecture, civil and mechanical engineering, film and video game design.

· OpenGL now but moving to DirectX 11 for SketchUp, and DirectX 12 and VULKAN for TEKLA Structures (late 2021 and 2022)
· Fast, efficient graphics in the viewport
· RTX photorealistic rendering
· 3rd party plug-ins supported by SketchUp Pro

Single GPU Single Node

Solid Edge

Siemens Digital Industries Software

SMB CAD option from Siemens

· KeyShot rendering

Single GPU Single Node

SOLIDWORKS

Dassault Systèmes

3D design and product development solution including design, simulation, cost estimation, manufacturability checks, CAM, sustainable design, and data management.

· High performance in Shaded, Shaded w/ Edges, and RealView modes, FSAA for sharp edges, Order Independent Transparency
· Real time photorealistic renderings with SOLIDWORKS Visualize, an Iray-based application.

Single GPU Single Node

SOLIDWORKS Visualize

Dassault Systèmes

Easy to use photorealistic rendering software based on NVIDIA Iray

· Iray-based ray-tracing
· Animation support
· Network rendering
· OptiX-based Artificial Intelligence denoiser

Single GPU Single Node

Spotscale

Spotscale

3D reconstruction algorithms tailored for buildings and urban environments, using drones to capture data.

· cuDNN

Multi-GPU Single Node

Studio

PiXYZ

Interactively prepare & optimize any CAD data before using your favorite staging tool.

· Large scale CAD format
· Support for multi-CAD file standard, prepare, optimize and heal your geometry before experiencing it in VR

Single GPU Single Node

Substance Alchemist

Adobe

Lets you create materials from pictures or by blending pre-existing materials, and create and manage your material libraries

· DL powered material recognition
· Material scan, edit and blend

Single GPU Single Node

Substance Designer

Adobe

Material shader editing and market reference for procedural texture creation.

· RTX bakers
· Iray viewport/rendering

Multi-GPU Single Node

Substance Painter

Adobe

Intuitive interactive 3D painting software with physics and particle support.

· RTX bakers
· Iray viewport

Multi-GPU Single Node

Sunata

Siemens Digital Industries Software

Cloud-based thermal modeling for additive manufacturing. Recommends optimal parameters for the print, including print orientation and support structures.

· Thermal simulation

Multi-GPU Single Node

Teamcenter Active Workspace

Siemens Digital Industries Software

Active Workspace is an IT-friendly client for Teamcenter product lifecycle management, with zero-install footprint and web browser access that provides an identical and seamless experience on any computing or smart device.

· GRID support

Single GPU Single Node

T-FLEX CAD

Top Systems

3D and 2D parametric design, simulation, photorealistic rendering

· High performance visualization
· Real time photorealistic rendering
· CUDA

Multi-GPU Single Node

UE4

Epic Games

Unreal Engine 4 is a suite of integrated tools for developers to design and build games, simulations, and visualizations.

· GPU Accelerated Rendering on OpenGL, DirectX and Vulkan
· PhysX implemented

Single GPU Single Node


Vectorworks
Volumetric Camera Systems
VRED
WeViz Studio
WYSIWYG
ZLVE

Nemetschek VECTORWORKS
Volumetric Camera Systems
Autodesk
Meshroom VR
Cast Software
Zerolight

Building Information Modeling (BIM) enabled design software for the Architecture, Landscape, and Entertainment industries.
4D capture service with high quality and realistic "holograms-in-motion" of people, animals, or any moving subject. Secondly, we offer "photo-realistic 3D environment captures" using industrial grade Leica Laser Scanners and advanced high-resolution multi-camera systems.
VRED 3D visualization software for automotive designers and engineers to create product presentations, design reviews, and virtual prototypes. Uses Digital Prototyping to quickly visualize ideas and evaluate designs.
Real-time rendering tool specially made for industrial design reviews, allowing to import, edit materials, set up your scene and showcase your model in real-time.
Wysiwyg is an all-in-one lighting design software with fully integrated CAD, plots, data, visualization and virtual show control. Features the largest CAD library with thousands of 3D objects you can choose from to design your entire show.
Immersive customer experience with VR or web GPU streaming

· OpenGL based GPU rendering
· CUDA · Quadro GPUs
· Enhanced geometry behavior · Automotive product interoperability · Navigation in a scene · Import Alias layer structure · Asset Manager improvements · Integrated file converter · Analytic rendering modes · Gap Analysis tool · Oculus Rift support · Animation module · Multiple rendering modes · Subsurface scattering · Displacement mapping · RTX real-time ray tracing
· GPU accelerated Shaded Views and Virtual Views
· VRS and foveated rendering for VR and 3D experience through AWS GPU streaming

Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node

ELECTRONIC DESIGN AUTOMATION

APPLICATION NAME
COMPANY NAME
PRODUCT DESCRIPTION
SUPPORTED FEATURES
GPU SCALING

Advanced Design System (ADS)

KeySight

Simulation tool for design of RF, microwave and high speed digital circuits

· Transient Convolution simulation with BSIM4 models

Single GPU Single Node

Altair Feko

Altair

Comprehensive computational electromagnetics (CEM) code used widely in the telecommunications, automobile, space and defense industries to solve high-frequency problems.

· FDTD solver
· MoM solver
· RL-GO solver
· CMA Solver

Multi-GPU Single Node

Ansys HFSS

ANSYS

Simulation tool for modeling 3-D full-wave electromagnetic fields in high-frequency and high-speed electronic components

· Transient solver
· FEM solver
· OpenGL rendering

Multi-GPU Single Node

Ansys HFSS SBR+

ANSYS

Simulation tool for installed antenna performance and antenna-to-antenna coupling

· High-frequency solver
· OpenGL rendering

Multi-GPU Multi-Node

Ansys Maxwell

ANSYS

Industry-leading electromagnetic field simulation software for the design and analysis of electric motors, actuators, sensors, transformers and other electromagnetic and electromechanical devices

· Eddy Current Solver

Multi-GPU Single Node

Ansys Nexxim

ANSYS

Circuit simulation engine for RF/analog/mixed-signal IC design, and IBIS-AMI analysis speedup with GPU computing.

· AMI analysis

Single GPU Single Node

Cadence Allegro

Cadence Design Systems

EDA/ECAD tool for PCB (Printed Circuit Board) Design

· OpenGL extensions
· Scalable Vector Graphics (SVG), Path Rendering SDK

Multi-GPU Multi-Node


CDP
CST MPHYSICS STUDIO
CST STUDIO SUITE
EMPro
JMAG
REMCOM XFdtd
samadii/em
samadii/plasma
SEMCAD-X
Serenity
Sim4Life
Synopsys LucidShape
TrueMask MDP
TrueModel
VSim for Electromagnetics
WIPL-D 2D Solver

D2S
Dassault Systèmes SIMULIA Corp.
Dassault Systèmes SIMULIA Corp.
KeySight
JMAG
REMCOM
Metariver Technology
Metariver Technology
SPEAG
Lucernhammer
ZMT Zurich MedTech AG
Synopsys
D2S
D2S
Tech-X Corporation
WIPL-D

GPU acceleration of real-time in-line enhancement of semiconductor manufacturing equipment such as the NuFlare EBM-9500 and MBM-1000 mask writers.

· C omputational lithography simulations for mask synthesis on GPUs

Multiphysics simulation including thermal, CFD, and mechanical capabilities. Tightly integrated with CST's electromagnetic solvers.

· C onjugated Heat Transfer Solver

Accurate and efficient computational solution for 3D simulation of electromagnetic devices in a wide range of frequencies.

· T ransient Solver · Integral Equation Solver · A symptotic Solver · M ultilayer Solver

Modeling and simulation environment for analyzing 3D EM effects of high speed and RF/Microwave components.

· F inite Difference Time Domain (FDTD) solver

FEA software for electromechanical design. Fast solver / High quality mesh / Advanced modeling technologies.

· E M transient solver · E M time harmonic solver · E M static solver

3D EM Simulation solver.

· F DTD Solver

Software for computing the electromagnetic field in three dimensional space using the Maxwell equation, a governing equation that can comprehensively represent these electromagnetic phenomena

· E lectromagnetics simulator, FEM solver(scalar FEM, vector FEM)
· E lectrostatics solver, Electromagnetic wave solver
· M agnetostatics solver, Electric current solver, Electrodynamics solver
· C o-simulation with samadii/sciv, samadii/ dem and fluid flow solvers.

Software for computing plasma phenomenon with PIC(Particle-in-Cell) method. Two-way coupled simulation with samadii/em and samadii/sciv.

· P lasma simulator, Charged particle motion analysis
· P article and surface reaction calculation, Field analysis, Sheath range prediction
· D SMC collision module, PIC module · C o-simulation with samadii/em, Ansys
Maxwell and COMSOL.

3D Full wave electromagnetic and computational life sciences simulation solver

· F DTD solver

EM Simulation (RCS) tool

· M oM solver

3D Electromagnetics & Acoustic modeling and simulation
LucidShape is a computer aided lighting (CAL) design software for automotive lighting design tasks. Supports algorithms optimized for automotive applications, LucidShape facilitates the design of automotive forward, rear and signal lighting, and reflectors.
GPU-accelerated simulation and data preparation for mask writing.
GPU-accelerated simulation and geometric checking of curvilinear shapes.
Conformal FDTD for electromagnetics for a variety of material types, yielding engineering outputs that can be used for design of electromagnetic devices
2D EM modeling and simulation for long cylindrical structures

· T ransient, Broadband, and Harmonic simulations FDTD solver
· L inear and non-linear 3D full wave acoustics solvers
· R ay Tracing · M onte Carlo simulations using OptiX 6.5
and CUDA 10.2
· S imulation-based processing
· S imulation-based processing
· F DTD solver
· M oM Solver · M atrix fill-in and near-field calculations

Multi-GPU Multi-Node
Single GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Multi-Node
Multi-GPU Multi-Node
Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Multi-Node Multi-GPU Multi-Node Single GPU Single Node
Multi-GPU Single Node


WIPL-D Pro

WIPL-D

Solver for fast and accurate electromagnetic analysis of arbitrary composite 3D metallic and dielectric structures

· MoM (Method of Moments) Solver · DDS (Domain Decomposition Solver)

Multi-GPU Multi-Node

WIPL-D Pro CAD

WIPL-D

Modeling and simulation environment uniting versatile, yet simple geometry modeling, with signature WIPL-D simulation accuracy

· MoM (Method of Moments) Solver

Multi-GPU Single Node

Wireless InSite

REMCOM

Uses Optix 4.1 for Ray-tracing and Propagation prediction

· X3D Ray Tracer

Multi-GPU Single Node

INDUSTRIAL INSPECTION

APPLICATION NAME

COMPANY NAME

PRODUCT DESCRIPTION

SUPPORTED FEATURES

Cognex VisionPro ViDi

Cognex

Deep learning-based software dedicated to industrial image analysis. Cognex ViDi Suite is a field-tested, optimized and reliable software solution based on a state-of-the-art set of algorithms in machine learning.

· Feature localization and identification · Segmentation and defect detection · Object and scene classification · Text & character recognition

HALCON

MVTec Software

MVTec HALCON is the comprehensive standard software for machine vision with an integrated development environment. HALCON allows models to be trained on GPUs, and outputs trained models for inference on CPU, GPU, or Jetson.

· Deep learning - pre-trained networks optimized for latency or precision
· HALCON also provides an IDE for training neural networks
· Sub-pixel detection, edge detection, counting, OCR, barcode reading, 3D reconstruction from stereo

IBM Visual Insights

IBM Corporation

IBM Visual Insights uses cognitive capabilities to review and analyze parts, components, and products. Identifies defects by matching patterns to images of defects that it has previously analyzed and classified. Deploy models to edge computing on production lines to facilitate rapid image capture by camera and cognitive identification of defects. Quickly assess quality inspection metrics across manufacturing processes.

· Cloud-based DL training, deployment on (spec'ed) edge server

GPU SCALING
Single GPU Single Node
Single GPU Single Node
Multi-GPU Single Node

Media and Entertainment

ANIMATION, MODELING AND RENDERING

APPLICATION NAME
3ds Max

COMPANY NAME
Autodesk

PRODUCT DESCRIPTION
3D modeling, animation, and rendering

SUPPORTED FEATURES
· Faster interactive graphics · Availability of Arnold with AI denoising · Availability of Chaos V-Ray, Otoy Octane, Redshift, cebas finalRender third-party GPU renderers

GPU SCALING
Multi-GPU Single Node

Altair Thea Render

Altair

Physically-based progressive spectral CPU/GPU Renderer supporting fast interactive changes and bucket rendering for high resolution images

· GPU-accelerated hybrid renderer · Advanced material layering system with subsurface scattering, displacement mapping, physical sun-sky and IES support

Multi-GPU Single Node

ArmorPaint

Armory

ArmorPaint is software designed for physically-based texture painting. There is a standalone version, or you can use it as an Armory3D project. Draw textures directly using node based materials and brushes.

· GPU accelerated painting processes

Single GPU Single Node

Arnold

Autodesk

Solid Angle Arnold film and animation renderer

· RTX

Multi-GPU Single Node

Beauty Box

Digital Anarchy

Automatic masking and skin retouching.

· GPU accelerated graphics and compute

Single GPU Single Node

Blender

Blender Institute

3D modeling, rendering and animation

· GPU-accelerated interactive viewport

Single GPU Single Node

Blender Cycles

Blender Institute

GPU renderer

· CUDA-accelerated rendering · RTX-accelerated ray tracing

Multi-GPU Single Node


Character Creator Reallusion

Cinema 4D
Corona D5 Render

Maxon
Chaos Group D5 Innovation

Daz Studio Dimension EmberGen finalRender

Daz3D Adobe JangaFX Cebas

HIERO Player Houdini iClone

Foundry SideFX Reallusion

Indigo KATANA Lightwave 3D LuxRender MARI Mars
Marvelous Designer Massive

Glare Technology Foundry
NewTek
LuxRender
Foundry
sheencity
CLO Virtual Fashion Inc Massive

Character Creator 3 is a full character creation solution for designers to easily create, import and customize stylized or realistic looking character assets for use with iClone, Maya, Blender, Unreal Engine 4, Unity or any other 3D tools. It connects industry leading pipelines into one system for 3D character generation, animation, rendering, and interactive design.
3D modeling, animation, and rendering
High-performance photorealistic renderer

· GPU accelerated processing · Iray support
· Increased model complexity at interactive rates
· Support for Redshift, Chaos V-Ray and Otoy Octane third-party GPU renderers
· OptiX AI de-noising

D5 Render, based on NVIDIA RTX GPU's real-time ray tracing and rasterization technology, aims to bring unprecedented real-time rendering experience for architecture and interior design.
Powerful and free 3D creation software tool that is not only easy to use but rich in features and functionality.
3D design tool enabling graphic designers to compose, adjust, and render photorealistic images.
A standalone real-time fluid simulation tool built specifically for real-time VFX Artists with an expansive node based system.
Plugin for 3ds Max: physically based (spectral) wavelength simulation, biased + unbiased hybrid rendering, unlimited network rendering
Shot management, conform and review timeline
Procedural 3D modeling, animation and rendering
iClone is the software for real-time 3D animation, blending character creation, scene design, and cinematic storytelling into a real-time engine.
Unbiased, physically-based renderer.

· Real-time GPU accelerated physically based global illumination and ray tracing.
· GPU accelerated compute · Rendering via NVIDIA IRAY and Optix · RTX ray tracing, accelerated graphics & MDL (Material Definition Language) · GPU accelerated volumetric fluid simulations · CUDA-accelerated renderer for Autodesk 3DS Max · OptiX AI de-noising
· Fluid, interactive playback · Faster simulations · GPU accelerated ray-tracing and rendering
· GPU-accelerated rendering

Powerful look development and lighting tool

· Faster interactive graphics

3D modeling, animation, and rendering
GPU 3D Renderer

· Increased model complexity at interactive rates
· GPU-accelerated ray tracing

3D paint tool that allows painting directly onto 3D models
Real-time architectural visualization tool with advanced features such as real-time ray tracing, DLSS, and VR.
Realistic and dynamic 3D modeling software for clothes and fabric.
Simulation and visualization tools for autonomous agent driven animation for film, games, television, architecture and transportation.

· Faster interactive painting
· RTX Ray tracing · DLSS
· GPU accelerated cloth simulations
· GPU accelerated effects

Single GPU Single Node
Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node Multi-GPU Single Node Single GPU Single Node
Multi-GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node


Maverick Renderer Maxwell Maya
Meshroom Metashape
MODO Motion Builder Mudbox NX Ray Traced Studio OctaneRender Realflow RealityCapture Redshift Renderer Renderman Sculptris Trapcode TurbulenceFD Vantage
V-Ray GPU vRt
WispRenderer

Maverick Next Limit
Autodesk
Czech Technical University (CTU) Agisoft
Foundry Autodesk Autodesk Siemens Digital Industries Software Otoy Next Limit Capturing Reality Redshift Pixar Pixologic Red Giant
Jawset Chaos Group
Chaos Group vRt
Breda University of Applied Sciences

CUDA-based GPU renderer

· CUDA-accelerated ray-tracing · OptiX 7 de-noising

Single GPU Single Node

CUDA-accelerated interactive and final-frame renderer

· CUDA-accelerated ray-tracing · Unrestricted image resolution · OptiX de-noising

Multi-GPU Single Node

3D modeling, animation, and rendering

· Increased model complexity and larger scenes
· Availability of Chaos V-Ray, Otoy Octane and Redshift third-party GPU renderers

Single GPU Single Node

Open source photogrammetry 3D software

· CUDA-accelerated depth analysis

Single GPU Single Node

Agisoft PhotoScan is a stand-alone software product that performs photogrammetric processing of digital images. Generates 3D spatial data to be used in GIS applications, and cultural heritage documentation for visual effects production and indirect measurements of objects of various scales.

· CUDA-accelerated photogrammetry solution
· RTX opportunity

Multi-GPU Single Node

3D modeling, animation and rendering

· Increased model complexity, larger scenes

Single GPU Single Node

Character animation and motion capture

· Increased model complexity at interactive rates

Single GPU Single Node

3D sculpting

· Increased model complexity at interactive rates

Single GPU Single Node

Embedded rendering feature for Siemens NX

· Iray based · MDL · AI denoising

Multi-GPU Single Node

CUDA-accelerated GPU renderer

· GPU accelerated rendering · AI de-noising

Multi-GPU Single Node

Fluid simulation system

· GPU-accelerated simulation

Single GPU Single Node

Photogrammetry

· CUDA-accelerated, fast photogrammetry

Multi-GPU Single Node

GPU-accelerated, biased renderer

· CUDA-based GPU final-frame rendering · Mac and Windows supported

Multi-GPU Single Node

Leading film renderer

· OptiX AI de-noising

Single GPU Single Node

3D sculpting

· Increased model complexity at interactive rates

Single GPU Single Node

Particle simulations and 3D effects for motion graphics and VFX. Now with Fluid Dynamics.

· GPU accelerated effects

Single GPU Single Node

Turbulence FD is a powerful simulation tool to create smoke, fire and explosion effects.

· GPU accelerated graphics, compute and simulation

Single GPU Single Node

Vantage is an interactive viewer that takes V-Ray scene files and uses DXR-accelerated ray tracing to display interactive scenes. It will be sold as a separate product, not bundled with V-Ray.

· RTX-accelerated, high frame-rate camera · Interactive animations · Bi-directional link to Autodesk 3ds Max · Ideal for AEC walk throughs and product design

Multi-GPU Single Node

GPU renderer with CPU Hybrid rendering

· CUDA interactive and final-frame GPU rendering

Multi-GPU Single Node

vRt is an open-source project that aims to offer a unified, cross-platform, Vulkan-based ray-tracing library for modern graphics cards, built against Vulkan 1.1

· vRtC (compute-based, native, default, wide GPU support)
· vRtX (NVIDIA RTX only, currently higher performance)

Multi-GPU Single Node

General purpose high level rendering library with RTX, RTGI, HBAO+, and Ansel support.

· RTX, RTGI, HBAO+ · Ansel

Multi-GPU Single Node


COLOR CORRECTION AND GRAIN MANAGEMENT

APPLICATION NAME
ARRI de-bayering SDK

COMPANY NAME
ARRI

PRODUCT DESCRIPTION
RAW de-bayering SDK

SUPPORTED FEATURES
· De-bayering of ARRI RAW and primary color grading.

GPU SCALING
Single GPU Single Node

Baselight

FilmLight

Color grading

· Real-time color correction

Multi-GPU Single Node

Cinema RAW SDK

Canon

RAW de-bayering

· GPU-accelerated de-bayering

Single GPU Single Node

Dark Energy

Cinnafilm

Application and plug-in for image enhancement

· Image de-noising and restoration · Noise reduction, de-noise and de-grain · Grain removal, image sharpening and texture management, dust busting · SDR to HDR upres

Multi-GPU Single Node

DaVinci Resolve

Blackmagic Design

Color grading and editing

· Real-time color correction and de-noising · RTX-accelerated AI features for re-timing and image enhancement

Multi-GPU Single Node

DeNoise AI

Topaz Labs

DeNoise AI uses machine-learning to remove noise from your image while preserving detail for a crisp, clear result. Whether you are shooting with High ISO or in a low light scenario, DeNoise will correct your image without removing any important information or patterns in your image.

· GPU accelerated effects

Single GPU Single Node

Diamant-Film Restoration

HS-Art

Film cleanup and restoration

· CUDA accelerated optical flow, de-flicker, in-painting and over 30 filters

Multi-GPU Single Node

Grain and Noise Reducer

Wavelet Beam

Video noise reduction

· CUDA-accelerated grain and noise reduction

Multi-GPU Single Node

HDR Image Analyzer

AJA

A 1RU waveform, histogram, vectorscope and Nit-level HDR monitoring solution for 4K, UltraHD, 2K, and HD resolution with HDR and WCG content.

· Precise, high quality UltraHD UI for native-resolution picture display
· Advanced out of gamut and out of brightness detection with error tolerance
· Support for SDR (Rec.709), ST2084/PQ and HLG analysis
· CIE graph, Vectorscope, Waveform, Histogram
· Out of gamut false color mode to easily spot out of gamut/out of brightness pixels
· Data analyzer with pixel picker
· Up to 4K/UltraHD 60p over 4x 3G-SDI inputs
· SDI auto signal detection
· File base error logging with timecode
· Display and color processing look up table (LUT) support
· Line mode to focus a region of interest onto a single horizontal or vertical line
· Loop through output to broadcast monitors
· Still store
· Nit levels and phase metering
· Built-in support for color spaces from ARRI, Canon, Panasonic, RED and Sony

Single GPU Single Node

Magic Bullet Colorista

Red Giant

Real time, interactive, multi-layered masked color correction (video playback too!) with the Mercury Playback engine in Premiere Pro.

· GPU accelerated effects

Single GPU Single Node

Magic Bullet Looks

Red Giant

Powerful looks and color correction for filmmakers.

· GPU accelerated compute

Single GPU Single Node

Mist

Marquise Technologies

Mastering tool for cinema, broadcast and over-the-top content

· 100% CUDA-accelerated imaging pipeline for de-bayering, color grading, transcoding and image enhancement
· Integrated Dolby Vision pipeline

Multi-GPU Single Node

Nucoda

Digital Vision

Color grading

· GPU-accelerated color grading · Accelerated scopes, playback and rendering

Single GPU Single Node


Pablo family

Grass Valley

Color grading and finishing

· Real time color correction

Multi-GPU Single Node

Pablo Rio

Grass Valley

Pablo Rio is a color grading application that Grass Valley acquired when it purchased Snell.

· CUDA-accelerated color grading

Multi-GPU Single Node

PFClean

The Pixel Farm

Image restoration and remastering

· CUDA-based image processing acceleration

Multi-GPU Single Node

RAW Converter

ARRI

RAW de-bayering and primary color grading

· CUDA-accelerated de-bayering and primary grading

Single GPU Single Node

REDCINE-X PRO

Red Digital Cinema

Primary color grading

· CUDA-accelerated de-bayering and primary color grading

Single GPU Single Node

Red Digital Cinema R3D SDK

Red Digital Cinema

Red Digital Cinema camera SDK decodes and de-bayers Red RAW camera data, and allows primary color grading. Used by many color grading and video editing applications.

· CUDA-accelerated wavelet decoding and de-bayering

Single GPU Single Node

Scratch

Assimilate

Color grading and finishing

· Accelerated de-bayering for real-time digital finishing

Single GPU Single Node

VFX Suite

Red Giant

VFX Suite is a complete set of visual effects and motion graphics plugins for creating professional effects.

· GPU accelerated effects

Single GPU Single Node

COMPOSITING, FINISHING AND EFFECTS

APPLICATION NAME
After Effects

COMPANY NAME
Adobe

PRODUCT DESCRIPTION
Motion graphics and effects

SUPPORTED FEATURES
· CUDA acceleration for up to 10x faster performance on key effects plus enhanced 3D ray tracing

GPU SCALING
Single GPU Single Node

Aura

Rowbyte

Aura is a procedural plug-in for After Effects that creates elegant geometric shapes in 3D space. It's akin to a particle system but instead of rendering small particles all over the place, it generates vector like shapes (waves) that change over time much like the classic Radiowaves plug-in.

· GPU-accelerated High Frequency Rendering

Single GPU Single Node

Clipster

Rohde & Schwarz

Video and film player and DCI Packager

· GPU-accelerated · Video scaling · Color space conversion · Data format conversion

Multi-GPU Single Node

Complete

CoreMelt

Visual effects plug-in

· Faster effects

Single GPU Single Node

Continuum

Boris FX

Visual effects plug-in for creative effects, titling, and quick fixes.

· GPU accelerated effects

Single GPU Single Node

DE:Noise

RE:Vision Effects

Reduce noise, dust, and artifacts with frame-to-frame motion tracking. Useful for low light shoots, CG renders with ray tracing sample artifacts, excessive film grain.

· Faster effects

Single GPU Single Node

DEFlicker

RE:Vision Effects

Reducing flicker and artifacts in high-frame-rate and time-lapse video.

· Faster effects

Single GPU Single Node

Element 3D

Video Copilot

Advanced 3D object & particle render engine plugin for Adobe After Effects

· GPU accelerated graphics and compute

Single GPU Single Node

Flame Premium

Autodesk

Finishing and color grading

· Integrated toolset for 3D VFX, editorial, and color grading

Multi-GPU Single Node

Flicker Free

Digital Anarchy

Deflicker Time Lapse, Slow Motion, and Old Video. Flicker Free is a powerful, new way to deflicker video.

· GPU accelerated effects

Single GPU Single Node

Fusion

Blackmagic Design

Effects and compositing

· 3D tracking · Compositing · VR

Single GPU Single Node

HIERO

Foundry

Multi-shot management tool that supports collaborative working, review and approval, quick production turnaround and delivery

· Fluid, interactive playback

Single GPU Single Node


Imerge Pro
Magic Bullet Denoiser Magic Bullet Film Magic Bullet Suite Mamba FX MediaReactor Mighty Bake Mistika Ultima Mistika VR Mocha Pro Natron Neat Video NUKE Optics
PFTrack Plexus

FXhome
Red Giant Red Giant Red Giant SGO Drastic Technologies Mighty Bake SGO SGO Boris FX Natron Absoft Foundry Boris FX
The Pixel Farm Rowbyte

Imerge Pro is layer-based image compositing software that is GPU accelerated, making performance astonishingly fast, even on high-resolution images. Create pro-level composites with unlimited layers and zero baked-in changes. Imerge Pro is the first photo editing software to keep your image data RAW and your layers self-contained.

· GPU-accelerated processing

Single GPU Single Node

Magic Bullet Denoiser III lets you reduce visible noise and grain in digital video produced by digital video cameras, camcorders, or film.

· GPU accelerated effects

Single GPU Single Node

Gives digital footage the look of real film by emulating the entire photochemical process from the original film negative, to color grading, and finally to the print stock.

· GPU accelerated effects

Single GPU Single Node

Full suite of tools for color correction, finishing and film looks for filmmakers.

· GPU-accelerated processing and effects

Single GPU Single Node

High-end compositing

· Faster keying, tracking, painting and restoration

Single GPU Single Node

Debayering and processing of raw camera files.

· GPU-accelerated compute

Single GPU Single Node

A powerful, easy to use, all-in-one texture baking solution for any 3D artist

· GPU accelerated processing

Single GPU Single Node

Color grading and finishing

· Faster keying, tracking, painting and restoration, de-bayering

Single GPU Single Node

Near real-time optical flow stitching

· GPU-accelerated video stitching with manual controls
· Export clips in many formats, including DPX and ProRes

Single GPU Single Node

Mocha Pro is an award-winning planar tracking tool for motion tracking, rotoscoping, object removal, camera stabilization and general visual effects.

· GPU accelerated planar tracking and object removal

Single GPU Single Node

Natron is a free and open-source node-based compositing software application.

· GPU-accelerated processing and rendering

Single GPU Single Node

Digital filter with auto-profiling tool designed to reduce visible noise and grain found in footage.

· GPU accelerated processing

Single GPU Single Node

Compositing tool with 3D tracker

· GPU-accelerated BLINK processing · Faster compositing and effects

Single GPU Single Node

Optics is designed to simulate optical camera filters, specialized lenses, film stocks and grain, lens flares, optical lab processes, color correction as well as natural light and photographic effects. First collaborative product between Sapphire and Digital Film Tools. Plugin for Photoshop and Lightroom, also has a Windows and Mac standalone application.

· GPU accelerated processing and effects

Single GPU Single Node

3D scene creation and tracking

· CUDA-accelerated tracking

Multi-GPU Single Node

Plexus is a plug-in designed to bring generative art closer to a non-linear program like After Effects. It lets you create, manipulate and visualize data in a procedural manner. Render the particles and create all sorts of interesting relationships between them based on various parameters using lines and triangles.

· Plexus (interacts natively with AE's Camera)
· High-quality, GPU-accelerated Depth of Field effects

Single GPU Single Node


Rotobot

Kognat

An AI product for compositing packages which uses machine learning to generate mattes for machine-based rotoscoping.

· CUDA accelerated AI rotoscoping

Multi-GPU Single Node

Sapphire

Boris FX

The Sapphire suite is an all-in-one solution containing hundreds of effects, presets, and workflows that are aimed at taking professional video work to the next level.

· Faster effects

Single GPU Single Node

SilhouetteFX

Boris FX

Invaluable in post-production, Silhouette continues to bring best of class tools to the visual effects industry. As a fully featured GPU accelerated compositing system, its standout features are award winning rotoscoping and non-destructive paint as well as keying, matting, warping, morphing, and a total of 142 different nodes, all stereo enabled.

· GPU-accelerated processing and effects

Single GPU Single Node

Silhouette Paint

Boris FX

Rotoscoping tool that allows for intensive VFX fixes, blemish cleanup, beauty effects, wire/object removal, style effects on video, and use as an artistic paint tool. It is raster based so it has a smaller memory footprint (fastest paint plugin on the market) and is integrated with the Mocha Pro planar tracker.

· GPU accelerated processing and effects

Single GPU Single Node

Twixtor

RE:Vision Effects

Optical flow tracking of pixel motion to synthesize new frames by warping & interpolating frames of the original sequence. Reduces artifacts & retimes frames.

· Faster effects

Single GPU Single Node

Video Essentials

NewBlueFX

Comprehensive collection of titling, transitions and video effects.

· Faster effects

Single GPU Single Node

(VIDEO) EDITING

APPLICATION NAME

COMPANY NAME

PRODUCT DESCRIPTION

SUPPORTED FEATURES

GPU SCALING

Blackmagic RAW SDK

Blackmagic Design

Blackmagic RAW is a CPU and GPU-enabled SDK for decoding and de-bayering Blackmagic RAW files on macOS, Windows and Linux

· CUDA-accelerated decoding and de-bayering

Single GPU Single Node

Catalyst Production Suite

Sony Creative Software

4K, Sony RAW, and HD video editing. Includes 3 applications: Browse, Prepare, Edit

· Faster effects, transitions and encoding · RAW camera de-bayering

Single GPU Single Node

CineMatch

FilmConvert

CineMatch is a set of tools designed to help you match footage shot on different cameras to a baseline technical level - a seamless, matched timeline in Log or REC.709, ready for creative grading.

· Real-time color matching conversions with CUDA

Single GPU Single Node

Edius Pro

Grass Valley

Video editing

· Faster effects · RAW camera de-bayering

Single GPU Single Node

Filmora

Wondershare

Filmora is an easy-to-use and trendy video editing software that lets you empower your story and be amazed at results, regardless of your skill level. With Filmora, you can get started with any new movie project by importing and editing your video, adding special effects and transitions, and sharing your final production on social media, mobile devices, or DVDs.

· GPU-accelerated processing

Single GPU Single Node

Gigapixel AI

Topaz Labs

Photo upscaling by using AI to "fill in" and add new detail when enlarging photos.

· GPU accelerated effects

Single GPU Single Node


GPUSqueeze

Multicamera Systems

HitFilm Pro

FXhome

Illustrator

Adobe

Lightroom Classic Adobe

Lightworks Live Planet Luminar AI

EditShare Live Planet Skylum

Media Composer Avid

Movavi Video Suite Movavi

MXF

Film Partners

Photoshop

Adobe

Pinnacle Studio PowerDirector

Corel CyberLink

PowerDVD Premiere Pro

CyberLink Adobe

GPUSqueeze is cross platform software library for multi-stream and ultra high speed video encoding, transcoding and processing using multi-GPU and distributed setups. The library uses highly optimized patent pending algorithms to achieve maximum speed, high hardware utilization and provides almost linear performance scaling with the increase of number of GPUs in the system.

· GPU accelerated video encoding and decoding

Multi-GPU Multi-Node

HitFilm Pro is an all-in-one video editor, compositor, and visual effects (VFX) software designed for filmmakers, professional video editors, and visual content producers.

· GPU accelerated effects and decoding

Single GPU Single Node

Vector graphics software for creating logos, icons, drawings, typography, and illustrations for print, web, video, and mobile devices.

· Entire canvas optimized for NVIDIA GPUs for faster pan & zoom

Single GPU Single Node

Easily edit, organize, store, and share your photos.

· GPU accelerated Develop module plus new Sensei features like "Enhance Details" with NVIDIA GPU AI optimization.
· Up to 600% faster than integrated GPUs with controls like Texture, Dehaze, & Sharpening
· Improved editing in 1:1 view & on hi-res displays.

Single GPU Single Node

Video editing

· Faster effects · CUDA-accelerated de-bayering

Single GPU Single Node

Livestreaming, recording and delivery of stereoscopic 360 VR

· Real time 360 3D capture and stitch · 4K

Single GPU Single Node

Luminar is the world's first photo editor that adapts to your style & skill level. It is designed to make complex photo editing easy & enjoyable for everyone. Take advantage of over 300 powerful, yet simple photo editing tools that allow you to perform all kinds of image editing tasks.

· GPU accelerated processing and AI effects

Single GPU Single Node

Video editing

· Faster video effects, unique stereo 3D capabilities

Single GPU Single Node

An all-in-one video maker: an editor, converter, screen recorder, and more.

· Faster conversion speed with NVIDIA CUDA

Single GPU Single Node

Collaborative editing system supporting Avid Media Composer, Adobe Premiere Pro, Grass Valley Edius and Blackmagic Resolve

· NVIDIA Video Codec allowing remote GPU-accelerated production workflows

Single GPU Single Node

Photo editing to transform your images into anything you can imagine

· GPU-accelerated AI "Neural Filters" · 30+ other GPU accelerated features · Blur gallery, liquify, smart sharpen, perspective warp

Single GPU Single Node

Video editing and sharing program.

· GPU accelerated compute and effects

Single GPU Single Node

PowerDirector delivers professional-grade video editing and production for creators of all levels. Whether you are editing in 360 degrees, Ultra HD 4K or even the latest online media formats, PowerDirector remains the definitive Windows video editing solution for anyone, whether they are beginners or professionals.

· GPU-accelerated video processing and effects

Single GPU Single Node

CyberLink PowerDVD is a universal media player for movie discs, video files, photos and music.

· GPU-accelerated encoding and decoding

Single GPU Single Node

Video editing software for film, TV, and the web.

· Real-time video editing & fast output rendering based on CUDA

Multi-GPU Single Node


Premiere Rush Sharpen AI SmartCourtPro Smoke TotalFX Vegas Pro Velocity Video Enhance AI
Video Studio VLC Media Player
WonderLive

Adobe Topaz Labs PlaySight Autodesk

Easy-to-use video editor for creating and sharing online videos.

· CUDA
· Real-time video editing
· Fast output rendering

Sharpening and shake reduction software that can tell the difference between real detail and noise.

· GPU-accelerated effects
· Machine Learning

Sophisticated video and analytics training technology with the latest in AI, integrations and player development tools.

· IVA

Finishing and editing

· Faster effects

NewBlueFX Magix

Comprehensive collection of Titling, Compositing, Polishing and Styling tools.
Video editing

Imagine Communications

Video editing

Topaz Labs

Trained on thousands of videos and combining information from multiple input video frames, Topaz Video Enhance AI will enlarge and enhance your footage up to 8K resolution with true details and motion consistency.

Corel

High quality tools that build, edit, and correct video skillfully.

VideoLAN Organization

VLC is a free and open source crossplatform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.

Z Cam

Cinematic VR camera with excellent image quality, stereoscopic 360-degree recording, and live streaming.

· GPU-accelerated effects
· Faster video effects and encoding
· Uses NVENC to encode/decode H.264 and HEVC streams
· Faster effects
· GPU-accelerated AI inference and processing
· GPU-accelerated compute
· NV Video Codec accelerated encoding and decoding
· Up to 4K output resolution equirectangular image
· Save live stitched video file
· Preview live stitched video
· RTMP live streaming output
· Supports VRWorks 360 Video SDK

Multi-GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node

(IMAGE & PHOTO) EDITING

APPLICATION NAME

COMPANY NAME

Adjust AI

Topaz Labs

Affinity Photo

Affinity

Corel Draw

Corel

Corel Photo-Paint

Corel

Fresco

Adobe

JPEG to RAW AI

Topaz Labs

PRODUCT DESCRIPTION
Adjust AI is a one-click application that leverages the power of machine learning to intelligently enhance photos.

SUPPORTED FEATURES
· GPU-accelerated effects

A fast and precise image editing software for photography and creative professionals, from editing and retouching images, creating full-blown multi-layered compositions, to making beautiful raster paintings.

· GPU-accelerated image processing

Professional vector illustration, layout, photo editing and design tools

· Faster processing of AI features

Corel PHOTO-PAINT is an advanced photo editing software that offers professional editing tools and support for PSD files, plus extensive RAW file support for over 300 types of cameras.

· Faster processing of AI features

Powerful painting and drawing app that lets you create with realistic watercolors and oils

· DirectX acceleration on GPU

AI-powered conversion of JPEG to high-quality RAW for better editing. Prevent banding, remove compression artifacts, recover detail, and enhance dynamic range

· GPU-accelerated processing

GPU SCALING
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node


Mask AI

Topaz Labs

Neat Image

Absoft

ON1 Photo Raw

ON1

PhotoLab

DxO

Topaz Studio

Topaz Labs

This is an AI-based masking tool for photography that lets creators automatically detect and remove objects from images.

· GPU-accelerated processing

Reduces noise, film grain, and artifacts from photos.

· GPU-accelerated processing

Professional-grade photo organizer, raw processor, layered editor, and effects app that includes everything you need in one photography application.

· GPU-accelerated processing

PhotoLab is a photo editor specializing in high-quality RAW processing and optical corrections for lens defects, along with powerful local image adjustment tools.

· GPU-accelerated processing and AI features

Topaz Studio is an intuitive image effect toolbox with Topaz Labs' powerful acclaimed photo enhancement technology. It works as a plugin within Lightroom, Photoshop, Affinity Photo, and others, as well as a standalone editor and host application for your other Topaz plugins.

· GPU-accelerated processing

Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node
Single GPU Single Node

ENCODING AND DIGITAL DISTRIBUTION

APPLICATION NAME
4K Capture Utility for Windows

COMPANY NAME
ElGato

PRODUCT DESCRIPTION

SUPPORTED FEATURES

ElGato sells capture cards and offers capture software with them. The ElGato 4K60 Pro Mk.II capture card includes an implementation of the Video Codec SDK (i.e. NVENC).

· HDR recording over HEVC
· HDR to SDR conversion

Alchemist on Demand

Grass Valley

Video standards conversion

· GPU-accelerated video processing and encoding

Amberfin

Dalet

Transcoding and video quality analysis

· GPU-accelerated video processing and encoding

Aurora

Tektronix

Automated video quality measurement

· CUDA-accelerated video quality assessment

AW-360C10

Panasonic

360-degree Live Camera designed for live sporting events, concerts and stadium events

· Low-latency
· Real-time 4K 360-degree stitching from four camera inputs
· Jetson TX-1

Content Agent

Root6

Automated transcoding and workflow management

· GPU-accelerated video processing and encoding

Core

ArcVideo

Video processing and transcoding Live

· Accelerated transcoding and encoding

Daniel2

Cinegy

Discord Go Live

Discord

DouYu App

DouYu

Resolution-independent, CUDA-accelerated video codec.
Broadcast feature that enables Discord users to broadcast their screen to a Discord channel
Douyu's streaming application

· 8K+ video playback faster than real time
· 3D LUT color profiles supported
· Lossless 10-, 12-, 16-bit support
· Adobe Premiere Pro plugin
· NVENC
· NVENC

Elemental Live

Elemental

Elemental Server

Elemental

Live streaming video processing and encoding
File-based video processing and encoding

· Video encoding and video processing
· Video encoding and video processing

Fast CinemaDNG Processor

Fastvideo

RAW video debayering, denoising and color correction completely on GPU side

· High-quality GPU-based RAW video processing up to 160 fps
· Wavelet, realtime de-noising
· Color correction features and monitoring
· Export to 16-bit TIF or 10-bit ProRes
· Full-sized video processing
· Realtime 4K, 6K, and 8K playback supported


GPU SCALING
Single GPU Single Node
Multi-GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node
Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node
N/A
Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node


FAST TICO-RAW
FAST TICO-XS
Handbrake HuYa App JPEG2000 Codec Lightspeed Live
Live Logitech Capture Medialooks SDK Media Transcoding in the Cloud
Multiplatform Transcoder

intoPIX
intoPIX
Handbrake HuYa

The intoPIX TICO-RAW SDKs provide the highest quality, visually lossless codec for the optimization of your application's infrastructure. FastTICO-RAW SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless RAW encoding over parts of their workflows.
The intoPIX FastTICO-XS SDKs provide the highest quality, lowest latency, visually lossless codec for the optimization of your application. FastTICO-XS SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless encoding over their whole infrastructure and workflows.
HandBrake is an open-source, GPL-licensed, multiplatform, multithreaded video transcoder.
Huya's streaming app

· CUDA GPU-accelerated up to 10K decoding
· Lossless and low latency
· All operating systems
· CUDA GPU-accelerated HD, UHD-4K and -8K encoding / decoding
· Lossless and low latency
· All operating systems
· JPEG XS standard compliant
· GPU-accelerated encoding
· NVENC

Comprimato Telestream
ArcVideo Logitech

JPEG2000 encoding and decoding for DCP, IMF, video editing, broadcast contribution, and archiving.
Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed. Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements
High-density, real-time video processing and encoding.
Logitech's app to control their webcam

· Faster-than-real-time UltraHD / 4K
· Lossy and mathematically lossless
· High-bit-depth (HDR)
· Uses NVENC to encode/decode multiple H.264 and HEVC streams
· Video processing and transcoding
· Accelerated broadcast encoding with NVIDIA CUDA and NVENC
· NVENC

Medialooks

MFormats SDK provides complete control over the video pipeline

Ribbon Communications

Industry-leading SBC media transcoding scaling capabilities in virtual and cloud deployments using NVIDIA GPUs to increase performance and decrease cost per transcoded session. Expanded SBC and PSX support for SIP Recording (SIPRec) allows enterprises and call centers to conduct up to four (4) simultaneous recordings of sessions via secure, encrypted technology. Expanded capabilities for Virtual Network Functions (VNF) instantiation with the ability to instantiate Ribbon PSX VNF aligned with the Open Network Automation Platform (ONAP) framework. Enhancements for operational efficiencies that allow CSPs to reduce configuration complexity and improve ease of use. Enhanced security across all products to deliver more restrictive access, reduction in possible network exposure and additional encryption.

ERLAB

Video processing and encoding software

· NVIDIA Video Codec used for accelerated encoding and decoding
· Ribbon's Session Border Controller Release 7.0 now supports GPUs enabling greater performance and scale for media transcoding, at cost-effective price points, in cloud and virtualized environments.
· Ribbon's Centralized Policy and Routing (PSX) can be instantiated as a Virtual Network Function (VNF) aligned with the ONAP architecture.
· Enterprises now have increased capacity for up to four (4) concurrent SIP Recording (SIPRec) sessions, enabling recorded data to be used for multiple purposes simultaneously such as real-time analytics for call center agents, recordings for corporate compliance and back-up, and lawful intercept
· The Insight Element Management System (EMS) has an improved user interface for ease of use and offers improved provisioning and management processes
· Pre-processing, encoding, decoding, post-processing and delivery

Single GPU Single Node
Single GPU Single Node
Single GPU Single Node Single GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node
Single GPU Single Node


mxfSPEEDRAIL

MOG Technologies

OBS Studio Piko TV PixelStrings

Open Broadcaster Software Kizil Electronik
Cinnafilm

Skywatch

MOG Technologies

Smart Render Editor
Smart Render SDK

Nablet Nablet

Speech Quality transformed using Neural Network Computing

BabbleLabs

StreamLabs OBS StreamLabs

Tachyon

Cinnafilm

Tornado

Marquise Technologies

Transkoder

Colorfront

Twitch Studio

Twitch.tv

Baseband broadcast news and sports production video ingest product line that allows editing of growing files during ingest.

· NVIDIA Video Codec used for encoding for higher channel density
· CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling

Single GPU Single Node

Free and open source software for video recording and live streaming optimized for NVIDIA video encoder

· NVENC

Single GPU Single Node

Linear broadcast encoder

· H.264 and HEVC 4K encoding for broadcast channels

Single GPU Single Node

Cloud-based image processing Platform-as-a-Service (PaaS) delivering high-quality, automated video conversion and frame optimization

· Motion-compensated frame rate conversion
· High-quality de-interlacing
· Texture-aware scaling
· De-grain/re-grain to any film look
· De-noise/re-texture to limit banding
· Reverse telecine/pulldown pattern correction
· Interlace artifact and dust removal
· Runtime retiming

Multi-GPU Single Node

Video and broadcast production management system for collecting audio/video usage and metadata.

· NVIDIA Video Codec used for encoding for higher channel density
· CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling

Single GPU Single Node

H.264 and HEVC video encoding using NV Video Codec

· Accelerated, high-density video encoding

Single GPU Single Node

Video de-noising, de-interlacing, JPEG 2000 encoding and video fingerprinting

· CUDA-accelerated video processing
· NVIDIA Video Codec

Single GPU Single Node

BabbleLabs has just launched broad production availability of our commercial speech API, web service, and phone mobile apps for iPhone and Android. These services clean up video and audio recordings to make the speech much easier to understand. The apps work on existing videos as well as new audio and video recorded inside the app.

· Real-time encoding/decoding of audio
· Video signals

Single GPU Single Node

Branch of the OBS Studio project that adds a custom UI, integrates plugins, and a plugin store

· NVENC

Single GPU Single Node

Standards conversion

· Video processing and frame rate conversion
· Standards conversions and transcoding
· SD to UHD, telecine correction, and frame rate normalization

Multi-GPU Single Node

Transcoding engine for IMF and DCP facilities

· Image re-sizing up to 8K
· Color space conversion: 601/709, REC 2020, DCI XYZ, ACES 1.0
· De-bayering: ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K, Canon C500
· Mezzanine: ProRes 444, Avid DNxHD 444, XDCAM, AVC Intra, AS-11 DPP, IMF
· Uncompressed: DPX, TIFF, OpenEXR

Single GPU Single Node

Encoding and transcoding for DCP, and IMF mastering

· JPEG2000 encoding and decoding
· 32-bit floating point processing on multiple GPUs
· MXF wrapping, accelerated checksums and AES encryption and decryption
· IMF/IMP and DCI/DCP package authoring, editing, transwrapping

Multi-GPU Single Node

Broadcasting app focused on beginners

· NVENC
· Multi-video Codec support

Single GPU Single Node


Vantage LightSpeed Telestream

Viarte VidiCert Wormhole

Isovideo
Joanneum Research
Cinnafilm

Wowza Streaming Engine Transcoder
XSplit Broadcaster

Wowza
SplitmediaLabs, Ltd.

XSplit Gamecaster SplitmediaLabs, Ltd.

Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed. Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements
Video standards conversion
Video and film quality assurance
Time alteration
H.264 video encoding

· Video transcoding and processing
· CUDA-accelerated video processing and encoding
· CUDA-accelerated video quality analysis
· GPU-accelerated noise, grain and dust detection/removal
· Retiming and motion compensation
· Super slow motion, and run length adjustment
· Commercial insertion, audio retiming, and caption retiming
· NVENC-accelerated video encoding

Broadcast app for recording and streaming, now including a lightweight video editor
Simplified broadcast app for recording and streaming, now including a lightweight video editor

· NVENC
· Record
· Stream
· NVENC

Multi-GPU Single Node
Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node
Single GPU Single Node N/A
Single GPU Single Node

ON-AIR GRAPHICS

APPLICATION NAME

COMPANY NAME

Air

Cinegy

Aximmetry

Aximmetry

Brodcaast Dscript 3D
Camino

Monarch AJT Systems

Capture

Cinegy

PRODUCT DESCRIPTION
Broadcast play-out server

SUPPORTED FEATURES
· Real-time on-air graphics
· NVIDIA Video Codec for accelerated encoding and decoding HD and HEVC

Aximmetry's solutions cover all aspects of advanced broadcast presentation: tracked virtual sets, Augmented Reality (AR), interactive touch screen displays, data-driven graphics, virtual product placement, and audience interaction via second-screen devices.

· DirectX 11 3D Rendering, Post Processing and Compositing
· NVENC encoding in H.264/H.265
· TXAA
· GameWorks: Screen-Space Ambient Occlusion
· GameWorks: Depth of Field

3D on-air graphics

· Real-time rendering

Camino is a powerful 3D rendering system for live-to-air broadcast graphics, capable of up to 4K character generation. Camino's high end features, with excellent ease of use, combine to deliver an exceptional system for your broadcast graphics requirements.

· Camino's real-time graphics overlay can be applied to tickertapes, scoreboards, schedule boards, program junctions, and TV show promotions
· Graphics overlay may be done via predefined templates, which may then be populated with live data during playout
· Makes real-time rendering of data-driven graphics possible in news and sports events
· 4K, 1080p, 720p and SD Support
· NTSC and PAL Support
· Graphics, Clips and 3D Objects Importer
· 2D and 3D Primitives
· Real-Time Key-Frame Animations
· Real-Time 3D Scene Lighting
· Timeline-Based Audio Support
· Data Mapping to External Sources
· Transition Logic
· Automation Controller Support
· Stereoscopic 3D rendering

Video ingest

· Uses NVENC to encode/decode multiple H.264 and HEVC streams

GPU SCALING
Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node


Clarity Click Effects PRIME Cube Designer eStudio InfinitySet
KAIROS
Livebook GFX
Mosaic Multiviewers Nexio Channelbrand Nexio G8 Nexio TitleOne Pixotope
PRIME Reality Engine

Pixel Power ChyronHego Dalet Disguise
Brainstorm Brainstorm
Panasonic
AJT Systems
ChyronHego Evertz Imagine Communications Imagine Communications Imagine Communications The Future Group
ChyronHego
Zero Density

On-air graphics
Click Effects PRIME is an audiovisual content control and delivery solution for live sports & entertainment productions.
On-air Graphics
Designer is the ultimate software to visualize, design, and sequence projects wherever you are, from concept all the way through to showtime.
Virtual sets and motion graphics
Realistic virtual sets
The IT/IP platform 'KAIROS' is a live video production platform developed based on a new concept and innovative architecture. It incorporates proprietary, ground-breaking software to maximize the CPU and GPU capacities for video processing.
The LiveBook is designed to fit every production environment and facilitate evolving workflows. Whether you are broadcasting over IP, or using SDI for internal or downstream keying, the LiveBook will be able to adapt to your environment.
On-air graphics
Broadcast multiviewer
On-air graphics
On-air graphics
On-air graphics
All-in-one, real-time virtual production system with integrated Unreal Engine photorealistic rendering. Open software-based solution for rapidly creating virtual studios, augmented reality (AR), and on-air graphics. Offers a real-time WYSIWYG editor, a virtual set auto-generation tool, its own powerful internal chroma keyer, and user-designed custom control panels.
PRIME Graphics Platform is the next generation of pioneering real-time graphics solutions, helping broadcasters create engaging visuals for all types of programming.
Photorealistic virtual studio solution in broadcast industry, powered by Epic Unreal Engine 4.24
Using Mellanox Rivermax API

· Real-time rendering
· Real-time graphics rendering
· Real-time graphics rendering
· Real-time graphics rendering
· Synchronized video playback
· Projection Mapping
· Real-time rendering
· RTX accelerated ray-tracing optional Epic Unreal Engine
· Real-time RTX ray tracing through UE4
· HDR I/O
· Physically-based rendering
· RTX accelerated ray-tracing optional Epic Unreal Engine
· Realtime playout
· CUDA and NVENC
· Rivermax SMPTE 2110
· GPU-accelerated video
· Graphics solution for compact live sports productions
· Real-time rendering
· Uses NVENC H.264 and HEVC encoding and decoding
· Real-time rendering
· Real-time rendering
· Real-time rendering
· Real-time rendering
· RTX-accelerated ray-tracing
· Real-time graphics rendering
· RTX-accelerated ray-tracing with Unreal Engine
· Node-based compositing system designed for real-time production
· Image quality is achieved on NVIDIA GPUs through deferred rendering methods, unique anti-aliasing technology, and advanced features such as depth of field, motion blur, light maps, screen space reflections and refraction

Single GPU Single Node Single GPU Single Node
Single GPU Single Node Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Multi-GPU Single Node
Single GPU Single Node Single GPU Single Node Multi-GPU Single Node Single GPU Single Node Single GPU Single Node Single GPU Single Node
Single GPU Single Node
Single GPU Single Node


Titler Pro tOG Type Vertigo Virtuoso Viz Engine Wasp3D - CG

NewBlueFX RT Software Cinegy Grass Valley Monarch vizrt Wasp3D

Create elegant video titles or 3D motion graphics.
On-air graphics
On-air Graphics
On-air Graphics
Virtual sets and motion graphics
On-air graphics and virtual sets
On-air graphics and virtual sets

· GPU-accelerated graphics
· Real-time rendering
· Real-time graphics rendering
· Real-time rendering
· Real-time rendering
· Real-time graphics rendering
· Real-time graphics rendering

Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node

ON-SET, REVIEW AND STEREO TOOLS

APPLICATION NAME
4kScope

COMPANY NAME
Drastic Technologies

PRODUCT DESCRIPTION
4kScope software provides a real time, professional quality signal analysis tool for on set, production, post production, and research and development environments.

SUPPORTED FEATURES
· GPU-accelerated effects and compute

GPU SCALING
Single GPU Single Node

8KScope

Drastic Technologies

Real time, professional quality signal analysis tool for on set, production, post production, and research and development environments.

· GPU-accelerated effects and compute

Single GPU Single Node

Cortex Dailies

MTI Film

Review, color grading and transcoding on set

· CUDA-accelerated grading and transcoding

Multi-GPU Single Node

Fluid 4K Review

BlueFish444

Review and approval of 4K content

· Real-time video review

Single GPU Single Node

ICE

Marquise Technologies

IMF reference video player

· RAW data support for ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K and Canon C500
· HDR content encoded in Dolby Vision, HDR10, HDR10+ or HLG
· Uncompressed formats support: DPX, TIFF and OpenEXR

Single GPU Single Node

Net-X-Code

Drastic Technologies

Net-X-Code is a distributed capture and conversion system: IP Capture, Control, Convert and Output for server level.

· GPU-accelerated compute

Single GPU Single Node

NewBlue Stream

NewBlueFX

NewBlue Stream is a lightweight streaming and broadcast solution paired with dynamic, data-driven graphics

· GPU-accelerated processing, encoding and decoding

Single GPU Single Node

On-Set Dailies

Colorfront

Review, color grading and transcoding on set

· Real-time review
· NV Video Codec encoding and transcoding

Multi-GPU Single Node

Previzion

Lightcraft

On-set virtual production

· Real-time, virtual set production

Single GPU Single Node

VideoQC

Drastic Technologies

videoQC is a suite of video and audio analysis and playback tools with both visual and automated quality checking tools. It takes the media coming into your facility, performs a series of automated tests on video, audio and metadata values against a template, then analyzes the audio and video.

· GPU-accelerated effects and compute

Single GPU Single Node

WEATHER GRAPHICS

APPLICATION NAME

COMPANY NAME

Max Weather

WSI

PRODUCT DESCRIPTION
Weather graphics

Metacast

ChyronHego

Weather graphics

MeteoEarth

MeteoGraphics

Weather graphics

SUPPORTED FEATURES
· Real-time graphics
· Real-time graphics
· Real-time graphics

GPU SCALING
Single GPU Single Node
Single GPU Single Node
Single GPU Single Node


Medical Imaging

APPLICATION NAME
3D Slicer

COMPANY NAME
3D Slicer

aidoc

Aidoc Medical

AI-LAB

American College of Radiology

deepflow

Helmholtz Zentrum München

EBM AI Workflow

EBM Technologies

Ibex Decision Support

IBEX

PRODUCT DESCRIPTION

SUPPORTED FEATURES

3D Slicer is an open-source software platform for medical image informatics, image processing, and three-dimensional visualization. Slicer brings free, powerful cross-platform processing tools to physicians, researchers, and the general public.

· NVIDIA Clara AI-assisted Annotation
· Supports multiple organs, from head to toe
· Multi-modality imaging (MRI, CT, US, nuclear medicine, and microscopy)
· Bidirectional interface for devices

GPU SCALING
Single GPU Single Node

AI-based decision support software analyzing medical imaging to provide solutions for detecting acute abnormalities across the body, helping radiologists prioritize life-threatening cases and expedite patient care. Agnostic to PACS and RIS systems.

· Classification and segmentation using deep learning on top of any PACS platform

Single GPU Single Node

ACR AI-LAB offers radiologists tools designed to help them learn the basics of AI and participate directly in the creation, validation and use of health care AI. It accelerates the development and adoption of artificial intelligence (AI) in clinical practice, empowering radiologists to create AI tools at their own institutions, to meet their own patient needs.

· AI models for diagnostic imaging
· AI models tailored to their local patient population
· Patient data protection

Single GPU Single Node

Deep learning tool for reconstructing cell cycle and disease progression from flow cytometry data.

· Tool will show that deep convolutional neural networks combined with nonlinear dimension reduction enable reconstructing biological processes based on raw image data
· Tool will demonstrate this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy
· In further analysis of Jurkat cells, tool will detect and separate a subpopulation of dead cells in an unsupervised manner
· In classifying discrete cell cycle stages, tool will reach a sixfold reduction in error rate compared to a recent approach based on boosting on image features. In contrast to previous methods, deep learning based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer
· Uses MXNet, cv2, numpy, python3

Single GPU Single Node

EBM AI Workflow is a software platform for seamless data annotation, training, and advanced visualization and deployment of AI-based medical imaging applications. EBM AI workflow and NVIDIA Clara combine the power of AI and edge computing to retain critical processing tasks on devices at the point of care, enabling healthcare professionals, physicians and specialists to make instantaneous, life-saving predictions and emergency responses.

· Pre-trained models for inference and AI-assisted annotation
· Automatic image analysis
· EBM PACS viewer
· FDA approved APP(UDE)
· X Annotation APPs

Multi-GPU Multi-Node

IBEX runs deep learning on prostate cancer digital pathology to find potentially cancerous areas

· Combines data from digitized glass slides and electronic medical records to reveal underlying patterns
· Extracts valuable clinical insights that can transform how pathology and oncology are practiced and propel them into the information age

Single GPU Single Node


iNtuition
LVO MITK
OHIF PowerGrid Proprio

Terarecon, Inc.

Intuition offers AI-driven advanced 3D and 4D medical imaging post-processing and visualization.

· Volumetric Navigation, CT and MRI Suites
· Interventional Radiology
· EVAR / TAVR Planning
· Body Fusion
· Maxillo-Facial
· iGENTLE noise reduction
· Lung / Liver Segmentation
· Mitral Valve (TMVR) Workflow
· Lung Density Analysis-II
· Intuition AI Adapter
· Eureka Clinical AI Platform framework
· Explorer UX/UI, and AI algorithm runtime licenses

Multi-GPU Multi-Node

Viz.ai

Automatically identifies suspected LVOs on CTA imaging in your network and alerts your on-call stroke physician within minutes

· Real-Time Specialist Notifications
· AI-Powered LVO Detection
· Automated Maximum Intensity Projections (MIP)

Single GPU Single Node
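
As a generic illustration of the maximum intensity projection (MIP) named in the feature list above, and not a description of any vendor's implementation, the sketch below collapses a small volume along its slice axis by keeping the brightest voxel for each pixel. The function name and the nested-list `volume[z][y][x]` layout are assumptions made for this example.

```python
from typing import List

# Hypothetical layout for this sketch: volume[z][y][x], one 2D slice per z.
Volume = List[List[List[float]]]


def max_intensity_projection(volume: Volume) -> List[List[float]]:
    """Project along z, keeping the brightest voxel for each (y, x) position."""
    if not volume:
        return []
    rows = len(volume[0])
    cols = len(volume[0][0])
    return [
        [max(slice_[y][x] for slice_ in volume) for x in range(cols)]
        for y in range(rows)
    ]


if __name__ == "__main__":
    # Two 2x2 slices; each output pixel is the maximum over both slices.
    demo = [[[1, 5], [2, 0]], [[3, 4], [9, 1]]]
    print(max_intensity_projection(demo))
```

A production tool would typically run this reduction on the GPU over full CT angiography volumes; the plain-Python loop here only shows the per-pixel maximum that defines the projection.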

German Cancer Research Center

Free open-source software system for development of interactive medical image processing software

· Interactive segmentation of slices in image volumes, including interactive region growing and easy correction, interpolation of missing slices, surface generation, and volumetry
· Point-based registration of medical image volumes matches two images based on two corresponding sets of points; Rigid registration of images by combination of the ITK registration objects (transforms, optimizers, metrics, etc.)
· Measurement of distances and angles; Volume visualization, GPU-based, easy to modify transfer functions; Movie generation (Windows only)
· Deformable Registration

Single GPU Single Node

Open Health Imaging Foundation

OHIF is a framework for building medical imaging web applications that uses React. The code is modular, using React components and a plug-in model making it possible to add new tools and workflows into the basic viewer UI.

· Integrated AI-assisted Annotation with NVIDIA Clara Plugin
· Retrieve and load medical images from most sources and formats
· Render sets in 2D, 3D, and reconstructed representations
· Allows for the manipulation, annotation, and serialization of observations
· Supports internationalization, OpenID Connect, offline use, hotkeys

Single GPU Single Node

University of Illinois Urbana-Champaign

Provides iterative non-Cartesian MRI reconstruction

· GPU-accelerated implementations of the non-uniform FFT and Discrete Fourier Transform
· MPI is used to enable using multiple GPUs in one or several machines
· Iterative reconstruction using physics-based model to correct for unwanted effects, such as field inhomogeneity and patient motion

Multi-GPU Single Node
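
Non-Cartesian MRI samples k-space at positions that do not lie on a regular grid, so the Fourier sum has to be evaluated at arbitrary frequencies. The sketch below is a direct (slow, O(N·M)) non-uniform discrete Fourier transform in one dimension, shown only to make the operation concrete; it is not the fast non-uniform FFT algorithm and not this package's code. The function name and the normalized-frequency convention are assumptions.

```python
import cmath


def nudft(samples, freqs):
    """Direct non-uniform DFT: evaluate sum_n x[n] * exp(-2*pi*i*f*n)
    at arbitrary normalized frequencies f (cycles per sample)."""
    return [
        sum(x * cmath.exp(-2j * cmath.pi * f * n) for n, x in enumerate(samples))
        for f in freqs
    ]


# Hypothetical signal evaluated at irregularly spaced frequencies.
signal = [1.0, 0.0, 0.0, 0.0]
spectrum = nudft(signal, [0.0, 0.13, 0.37, 0.9])
print([round(abs(v), 6) for v in spectrum])
```

When the frequencies happen to be k/N for k = 0..N-1, this reduces to the ordinary DFT, which is a convenient sanity check.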

Proprio

Proprio's multi-camera system, based on a networked camera array, depth sensing, and light field imaging, lets surgeons operate with access to all the data they need. Offers training based on captured real cases in a safe and collaborative environment.

· CUDA

Single GPU Single Node


Rad AI Follow-up | Rad AI
Rad AI Impressions | Rad AI
RadiAnt | Medixant
Radiology Assist | Zebra Imaging
RadLogics Virtual Resident | RadLogics
Vitrea® | Vital Images

Rad AI provides communication and tracking of follow-up recommendations for incidentalomas (such as for pulmonary nodules and lung cancer screening programs) that are top of mind for improving patient safety. Ensuring that these follow-ups are performed improves the overall quality of patient care and reduces patient morbidity and mortality, while creating new imaging revenue for the health system and generating value from additional downstream services.

· Communicates and tracks follow-up recommendations
· Integrates with a health system's existing workflow
· Appropriate follow-up imaging is performed on a timely basis

Multi-GPU Multi-Node

Rad AI automatically generates report impressions, customized to each radiologist's exact language and style, for more than 90% of imaging modalities, saving radiologists an average of more than 60 minutes per day.

· Automatic report impressions
· Customized to your language
· Fleischner, Lung-RADS and TI-RADS
· Seamless integration

Multi-GPU Multi-Node

RadiAnt DICOM Viewer provides basic tools for the manipulation and measurement of images.

· Fluid zooming and panning, brightness and contrast adjustments, negative mode, preset window settings for Computed Tomography (lung, bone, etc.)
· Ability to rotate (90, 180 degrees) or flip (horizontal and vertical) images; segment length; mean, minimum and maximum parameter values (e.g. density in Hounsfield Units in Computed Tomography) within a circle/ellipse and its area; angle value (normal and Cobb angle)
· Pen tool for freehand drawing

Single GPU Single Node
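
The region measurement mentioned above (mean, minimum and maximum value inside a circle) can be sketched in a few lines. This is an illustrative pure-Python version, not the viewer's code; the function name, the pixel-center inclusion rule, and the sample values are assumptions.

```python
def circle_roi_stats(image, center_row, center_col, radius):
    """Mean, minimum and maximum of the pixels whose centers fall
    inside a circular region of interest, plus the pixel count."""
    values = [
        value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
    ]
    if not values:
        raise ValueError("region of interest contains no pixels")
    return {
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
        "pixels": len(values),
    }


# Hypothetical 3x3 image with made-up density values.
image = [
    [0, 10, 0],
    [20, 30, 40],
    [0, 50, 0],
]
print(circle_roi_stats(image, 1, 1, 1))
# {'mean': 30.0, 'min': 10, 'max': 50, 'pixels': 5}
```

Multiplying the pixel count by the physical area of one pixel would give the region's area in real units.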

Receives imaging scans from various modalities and automatically analyzes them for a number of different clinical findings. Findings are provided in real time to radiologists or other physicians and hospital systems as needed.

· Classification and segmentation on top of any PACS platform

Single GPU Single Node

Software platform imports any DICOM-compatible study directly from the modality or the PACS. The software platform provides APIs for image analysis algorithms to incorporate search, measurement, and other findings into the radiologist's existing PACS and reporting system as a preliminary report.

· Real-time analytics on medical imaging

Single GPU Single Node

Vitrea provides advanced visualization tools to a range of medical specialists (including radiologists, cardiologists, oncologists and other specialists) so that they can visualize patient images and communicate with each other efficiently on a course of action. Vitrea is a crucial tool for clinical decision support and enabling physicians to communicate effectively about a common patient, and specialists rely on its detailed 2D, 3D and 4D images for confident analysis in critical scenarios.

· Interface designed for viewing in the reading room
· Improved clinical outcomes with clinical workflows and partner applications
· Increased efficiency with a consistent user interface and experience for all modalities
· Easy-to-deploy thin-client solution that does not require specialized software to reside on client computers

Multi-GPU Multi-Node


XNAT | Radiologics
xvision | Augmedics

XNAT is an open source imaging informatics platform developed by the Neuroinformatics Research Group at Washington University. It facilitates common management, productivity, and quality assurance tasks for imaging and associated data. XNAT is extensible and can be used to support a wide range of imaging-based projects.

· Upload data using DICOM image data and metadata
· Organize and share data within user-defined projects securely
· Visualize and download using an embedded medical image viewer that supports a number of common medical imaging formats
· Secure and manage access to data using a tiered architecture
· Search and explore large data sets and create and share customized search patterns
· Process data using pipelines that allow for the programming and automation of complex workflows

Single GPU Single Node

Augmented reality guidance system for surgery that allows surgeons to see the patient's anatomy through skin and tissue as if they have 'x-ray vision' and to accurately guide instruments and implants during spine procedures

· Transparent AR Display
· Tracking system

N/A

Oil and Gas

APPLICATION NAME | COMPANYNAME
PRODUCT DESCRIPTION
SUPPORTED FEATURES
GPU SCALING

6X | Ridgeway Kite

Reservoir Simulation on Tesla

· CUDA Simulation Parallelization

Single GPU Single Node

AISight for SCADA | BRS Labs

Proactive integrity management and real-time precursor alerts for enhanced SCADA operations in oil and gas.

· 24/7 real-time analysis and alerting
· Scales to thousands of sensors across remote and geographically dispersed locations
· Historical analysis and trend reports

Multi-GPU Single Node

AxRTM | Acceleware

Reverse Time Migration Software

· CUDA accelerated libraries for building RTM software

Multi-GPU Multi-Node

DecisionSpace | Halliburton (Landmark)

E&P platform for geoscience, well planning, drilling and earth modeling.

· CUDA acceleration of fault extraction

Multi-GPU Single Node

Echelon | Stone Ridge Technology

Full featured reservoir simulator designed from inception for GPU

· Fully GPU-accelerated reservoir model
· Dual-perm, dual porosity, pressure varying perm and porosity
· Eclipse compatible input deck

Multi-GPU Multi-Node

GeoDepth | Emerson

Seismic Interpretation Suite

· CUDA-accelerated RTM

Multi-GPU Multi-Node

Geoteric | Geoteric

Seismic interpretation

· Attributes calculations
· Geobodies extraction

Multi-GPU Single Node

Graydient S (SCADA) | Giant Grey

Machine learning anomaly detection for large scale industrial data.

· Proactive integrity management and real-time precursor alerts for enhanced SCADA operations in oil and gas
· 24/7 real-time analysis and alerting scaling to thousands of sensors across remote and geographically dispersed locations

Multi-GPU Single Node

HUESpace | Bluware

Library SDK toolkit for creating applications for seismic compression and seismic/geospatial imaging and interpretation.

· CUDA acceleration for compression
· Large-scale visualization

Multi-GPU Single Node

InsightEarth | CGG

Seismic Interpretation Suite

· OpenCL acceleration for AFE
· 3D Curvature attributes

Multi-GPU Single Node

Omega2 RTM | Schlumberger

Seismic processing

· Multiple algorithms (RTM, etc)

Multi-GPU Multi-Node


PumaFlow IFP | Beicip-Franlab

Reservoir simulation

· GPU-accelerated linear solver

Multi-GPU Single Node

Roxar RMS | Emerson

Reservoir modeling

· Multi GPU capabilities via HUEspace

Multi-GPU Single Node

RTM | Tsunami

Seismic processing

· RTM algorithm

Multi-GPU Multi-Node

Seismic City RTM | Seismic City

RTM Seismic Processing

· CUDA acceleration

Multi-GPU Multi-Node

SKUA | Emerson

Reservoir modeling

· Faults, Horizons and Flow Simulation Grid

Multi-GPU Single Node

tNavigator | Rock Flow Dynamics (RFD)

tNavigator Solver is a software package, offered as a single executable, that lets users build static and dynamic reservoir models, run dynamic simulations, calculate PVT properties of fluids, build surface network models, calculate lifting tables, and perform extended uncertainty analysis as part of one integrated workflow.

· CUDA
· Pascal/Volta architecture
· Multi-GPU

Multi-GPU Multi-Node

VoxelGeo | Emerson

Seismic Interpretation Package

· Multi-GPU volume rendering
· Horizon-flattening
· Attribute calculations

Multi-GPU Single Node

Life Sciences

BIOINFORMATICS

APPLICATION NAME | COMPANYNAME
PRODUCT DESCRIPTION
SUPPORTED FEATURES
GPU SCALING

Arioc | Johns Hopkins University

High-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space.

· Single-end alignment, paired-end alignment
· Output in SAM or database-ready binary formats
· Multiple GPU implementation

Multi-GPU Single Node

AtacWorks | NVIDIA

AtacWorks is a deep learning toolkit for coverage track denoising and peak calling from low-coverage or low-quality ATAC-Seq data.

· Coverage track denoising
· Retraining

Multi-GPU Single Node

BarraCUDA | University of Cambridge Metabolic Research Labs

Sequence mapping software

· Alignment of short sequencing reads
· Alignment of indels with gap openings and extensions

Multi-GPU Multi-Node

BEAGLE-lib | Open Source

BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. Makes use of highly-parallel processors such as those in graphics cards (GPUs) found in many PCs.

· Evaluation of likelihood for sequence evolution on trees and arbitrary models (e.g. nucleotide, amino acid, codon)
· Speed-ups (over CPU-only version): nucleotide model = up to 25x, codon model = up to 50x

Multi-GPU Single Node

Campaign | SimTK

An open-source library of GPU-accelerated data clustering algorithms and tools.

· K-means
· Kps-means
· K-medoids
· K-centers
· Hierarchical clustering
· Self-organizing map

Multi-GPU Multi-Node

Clara Genomics Analysis | NVIDIA

Clara Genomics Analysis is a GPU-accelerated library for biological sequence analysis.

· CUDA-based libraries: partial order alignment (cudapoa)
· Global aligner (cudaaligner)
· Mapper (cudamapper)

Multi-GPU Single Node

CUDASW++ | Open Source

Open source software for Smith-Waterman protein database searches on GPUs.

· Parallel search of Smith-Waterman database

Multi-GPU Single Node
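
For readers unfamiliar with the Smith-Waterman algorithm named in the CUDASW++ entry, here is a minimal pure-Python sketch of its local-alignment scoring recurrence. It is not CUDASW++ code. The scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for illustration; real protein searches typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b,
    using a linear gap penalty and two rows of the DP matrix."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Scores never drop below zero: that is what makes the
            # alignment local rather than global.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))  # 6 (the shared "ACG")
```

A database search repeats this computation for one query against many independent subject sequences, which is the part that parallelizes well.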


CUSHAW
f5c
G-BLASTN
GHOST-Z GPU
GPU-Blast
mCUDA-MEME
MUMmer GPU
NVBIO
NVBowtie
Parabricks
PEANUT
Racon

Open Source

Parallelized short read aligner

· Parallel, accurate long read aligner for large genomes

Multi-GPU Single Node

University of New South Wales

An optimised re-implementation of the call-methylation and eventalign modules in Nanopolish. Given a set of basecalled Nanopore reads and the raw signals, f5c call-methylation detects the methylated cytosine and f5c eventalign aligns raw nanopore DNA signals (events) to the basecalled read. f5c can optionally utilise NVIDIA graphics cards for acceleration.

· Methylated cytosine base and frequency detection
· Event alignment

Single GPU Single Node

Hong Kong Baptist University

GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST.

· Blastn and megablast modes of NCBI-BLAST

Single GPU Single Node

Akiyama Laboratory, Tokyo Institute of Technology

Sequence homology search tool.

· Shotgun Metagenome Analysis

Multi-GPU Multi-Node

Carnegie Mellon University

Local search with fast k-tuple heuristic

· Protein alignment according to BLASTP

Single GPU Single Node

Open Source

Ultrafast scalable motif discovery algorithm based on MEME.

· Scalable motif discovery algorithm based on MEME

Multi-GPU Single Node

Open Source

MUMmer GPU is a high-throughput local sequence alignment program

· Aligns multiple query sequences against a reference sequence in parallel

Single GPU Single Node

Open Source

NVBIO is an open source C++ library of reusable components designed to accelerate bioinformatics applications using CUDA.

· Data structures, algorithms
· Utility routines useful for building complex computational genomics applications on CPU-GPU systems

Multi-GPU Single Node

Open Source

A largely complete implementation of the Bowtie2 aligner on top of NVBIO.

· Good coverage of Bowtie2 features
· Comparable quality results

Multi-GPU Single Node

NVIDIA

Parabricks provides 30-50 times faster secondary analysis, from sequencer-generated FASTQ files to variant call files (VCFs). Parabricks accelerates standard secondary analysis tools such as GATK4 and Google's DeepVariant, generating equivalent results while increasing throughput significantly.

· BWA-mem, Star, haplotype caller, CNVKit, Mutect2, Deep Variant, ImportGVCF, Select Variants, Genotype GVCF, Mark, Sort, BQSR, Merge, VQSR, Variant Filtration, CNNScore, and many quality checking tools.

Multi-GPU Single Node

Open Source

Read mapper that maps DNA or RNA sequence reads to a known reference genome.

· Achieves supreme sensitivity and speed compared to current state-of-the-art read mappers like BWA MEM, Bowtie2 and RazerS3
· PEANUT reports either only the best hits or all hits

Single GPU Single Node

University of Zagreb, Faculty of Electrical Engineering and Computing

Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate genomic consensus which is of similar or better quality compared to the output generated by assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods.

· Supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies
· Racon can be used as a polishing tool after the assembly with either Illumina data or data produced by third-generation sequencing
· The type of input data is automatically detected
· Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will affect parsing time).
· Racon can also be used as a read error-correction tool. In this scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps between reads including dual overlaps.

Single GPU Single Node


racon-gpu
REACTA
SeqNFind
SOAP3
SOAP3-dp
Synomics Studio
UGene
WideLM

Open Source

Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate genomic consensus which is of similar or better quality compared to the output generated by assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods. It supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies.

· Racon can be used as a polishing tool after the assembly with either Illumina data or data produced by third-generation sequencing
· The type of input data is automatically detected
· Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will affect parsing time).
· Racon can also be used as a read error-correction tool. In this scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps between reads including dual overlaps.

Single GPU Single Node

Open Source

A modified version of GCTA with improved computational performance, support for Graphics Processing Units (GPUs), and additional features. The purpose of REACTA is to quantify the contribution of genetic variation to phenotypic variation for complex traits.

· GRM creation
· REML analysis
· Regional Heritability (including multi-GPU)

Multi-GPU Single Node

Accelerated Technology Laboratories

SeqNFind is a powerful tool suite that addresses the need for complete and accurate alignments of many small sequences against entire genomes, using a unique hardware/software cluster system to facilitate bioinformatics research in Next Generation sequencing and genomic comparisons.

· Hardware and software for reference assembly, blast, SW, HMM, de novo assembly

Multi-GPU Single Node

Genomics

GPU-based software for aligning short reads with a reference sequence. Finds all alignments with k mismatches, where k is chosen from 0 to 3.

· Short read alignment tool that is not heuristic based
· Reports all answers

Multi-GPU Multi-Node
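
To make "all alignments with k mismatches" concrete, the sketch below exhaustively checks every window of a reference and reports each one that differs from the read in at most k positions. It is a brute-force illustration only, not the index-based method a GPU aligner would use; the function name and the toy sequences are assumptions.

```python
def find_k_mismatch(read, reference, k):
    """Return (position, mismatches) for every window of the reference
    that differs from the read in at most k positions."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        mismatches = 0
        for r, g in zip(read, reference[pos:pos + len(read)]):
            if r != g:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the budget
        if mismatches <= k:
            hits.append((pos, mismatches))
    return hits


print(find_k_mismatch("ACGT", "ACGTACCT", 1))  # [(0, 0), (4, 1)]
```

Because no window is skipped, every qualifying position is reported, which is the "non-heuristic, reports all answers" property in miniature.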

The University of Hong Kong

SOAP3-dp is an ultra-fast GPU-based tool for short read alignment via index-assisted dynamic programming.

· Burrows-Wheeler Transformation
· Dynamic Programming

Multi-GPU Single Node
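
The Burrows-Wheeler transform listed above underlies the compressed indexes used by many short-read aligners. The following is a textbook sketch using sorted rotations, far too slow for genomes and not SOAP3-dp's implementation; the function names and the "$" sentinel are assumptions for illustration.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations of text + sentinel."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the original text by repeatedly prepending the last
    column and re-sorting (the classic naive inversion)."""
    table = [""] * len(last_column)
    for _ in last_column:
        table = sorted(c + row for c, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))             # annb$aa
print(inverse_bwt(bwt("GATTACA")))  # GATTACA
```

The transform groups identical characters that share a following context, which is what lets an index built on it answer substring queries compactly.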

Row Analytics

Multi-Omics Biomarker Network Discovery and Validation. Synomics Studio is a new, highly scalable analysis platform that enables researchers and clinicians to discover novel associations between multiple genotypic, phenotypic and clinical attributes of their patients and their disease risk/therapy responses.

· Multi-SNP association studies (GWAS studies with up to 30 SNPs/SNVs in combination)
· Configurable number of cycles of fully random permutation for validation of SNP networks
· Speed-up on GPU = 170x vs multicore CPU alone (further speed-up available on multi-GPU and NVLink devices)
· Representative performance for 15,000 case:controls, 200,000 SNPs
· 2 SNP associations found and validated in 12 mins on single 20 core IBM POWER8NVL with 4x Tesla P100 GPU
· 17 SNP associations found and validated in 6 days on single 20 core IBM POWER8NVL with 4x Tesla P100 GPU

Multi-GPU Single Node

Unipro

Open source Smith-Waterman for SSE/CUDA, suffix array based repeats finder and dotplot.

· Fast short read alignment

Multi-GPU Single Node

Open Source

Fits numerous linear models to a fixed design and response.

· Parallel linear regression on multiple similarly-shaped models

Multi-GPU Single Node
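
One way to read "numerous linear models to a fixed design and response" is that many small models, each using a different column of the same design, are fitted against the same response. The sketch below follows that reading with one simple least-squares regression per column. This interpretation, the function name, and the data are assumptions; it is not the package's code.

```python
def fit_each_column(design_columns, y):
    """Fit y = intercept + slope * x by least squares, once per
    design column, returning a list of (intercept, slope) pairs."""
    n = len(y)
    y_mean = sum(y) / n
    fits = []
    for x in design_columns:
        x_mean = sum(x) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        slope = sxy / sxx
        fits.append((y_mean - slope * x_mean, slope))
    return fits


# Hypothetical response and two candidate predictors.
response = [1.0, 3.0, 5.0, 7.0]
columns = [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]]
print(fit_each_column(columns, response))  # [(1.0, 2.0), (7.0, -2.0)]
```

Each fit is independent of the others and has the same shape, so the loop body is the unit of work that could be distributed across GPU threads.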


MICROSCOPY

APPLICATION NAME | COMPANYNAME
ANNA-PALM | Institut Pasteur
Appion | New York Structural Biology Center
BioEM | Max Planck Institute
crYOLO | Max Planck Institute for Molecular Physiology
cryoSPARC | cryoSPARC
Dynamo | Center for Cellular Imaging and Nano Analytics (C-CINA), Biozentrum, University of Basel

PRODUCT DESCRIPTION
Accelerating Single Molecule Localization Microscopy with Deep Learning: ANNA-PALM is a computational method that can reconstruct super-resolution images from sparse single molecule localization data and/or widefield images. ANNA-PALM can produce high quality super-resolution images from data obtained in much shorter acquisition time than standard single molecule localization microscopy. By strongly reducing acquisition time, ANNA-PALM facilitates super-resolution imaging of large numbers of cells (high throughput imaging), large samples, and live cells.

SUPPORTED FEATURES
· Uses a much smaller number of low resolution frames than other methods
· Processing by localization algorithms results in a sparse localization image using a neural network previously trained on conventional PALM images
· Inputs sparse image and outputs a super-resolution image
· Runs well on GPU due to acceleration available in TensorFlow

GPU SCALING
Single GPU Single Node

Appion is a "pipeline" for processing and analysis of EM images. Appion is integrated with Leginon data acquisition but can also be used stand-alone after uploading images (either digital or scanned micrographs) or particle stacks using a set of provided tools. Appion consists of a web based user interface linked to a set of python scripts that control several underlying integrated processing packages. All data input and output within Appion is managed using tightly integrated SQL databases. The goal is to have all control of the processing pipeline managed from a web based user interface and all output from the processing presented using web based viewing tools.

· The underlying packages integrated into Appion include MotionCor2, Gctf, EMAN, Spider, Frealign, Imagic, XMIPP, IMOD, ProTomo, ACE, CTFFind and CTFTilt, findEM, DogPicker, TiltPicker, RMeasure, EM-BFACTOR, and Chimera.

Single GPU Single Node

GPU-accelerated computing of Bayesian inference of electron microscopy images.

· BioEM can use CUDA for the cross-correlation step, which essentially consists of an image multiplication in Fourier space and a Fourier back-transformation

Multi-GPU Single Node
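
The cross-correlation step described for BioEM (multiply in Fourier space, then transform back) can be illustrated in one dimension. The sketch below uses a naive O(n²) DFT so that it stays self-contained; it is not BioEM code, and a real implementation would use 2D FFTs on the GPU. The function names and the toy signals are assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_cross_correlation(a, b):
    """c[s] = sum_t a[t] * b[(t + s) % n] for real inputs, computed as
    the inverse transform of conj(DFT(a)) * DFT(b)."""
    fa, fb = dft(a), dft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


# Hypothetical template and a copy of it shifted by three samples.
template = [0, 1, 2, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0]
corr = circular_cross_correlation(template, shifted)
print(max(range(len(corr)), key=lambda s: corr[s]))  # 3
```

The position of the correlation peak recovers the shift between the two signals, which is the quantity an image-matching step is after.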

Novel automated particle picking software based on the deep learning object detection system 'You Only Look Once' (YOLO). crYOLO is available as a standalone program at http://sphire.mpg.de/ and will be part of the image processing workflow in SPHIRE.

· Part of the image processing workflow in SPHIRE

Multi-GPU Single Node

CryoSPARC is an easy to use software tool that enables rapid, unbiased structure discovery of proteins and molecular complexes from cryo-EM data.

· Ab-initio reconstruction
· Heterogeneous reconstruction
· High-speed and high resolution refinement of 3D protein structures implemented on GPUs
· Multiple simultaneous jobs on multiple GPUs

Multi-GPU Multi-Node

Dynamo is a software environment for subtomogram averaging of cryo-EM data.

· Dynamo provides workflows all the way from tomograms to averages and classes.
· In a full workflow, you would organize tomograms in catalogues, use them to pick particles and create alignment and classification projects to be run on different computing environments
· Requires CUDA Toolkit version 7.5 or higher and a CUDA driver compatible with your GPU device

Single GPU Single Node


EMAN2 | Baylor College of Medicine

EMAN2 is the successor to EMAN1. It is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes. EMAN's original purpose was performing single particle reconstructions (3-D volumetric models from 2-D cryo-EM images) at the highest possible resolution, but the suite now also offers support for single particle cryo-ET, and tools useful in many other subdisciplines such as helical reconstruction, 2-D crystallography and whole-cell tomography. Image processing in a suite like EMAN differs from consumer image processing packages like Photoshop in that pixels in images are represented as floating-point numbers rather than small (8-16 bit) integers. In addition, image compression is avoided entirely, and there is a focus on quantitative analysis rather than qualitative image display.

All EMAN2 programs, including GUI programs, are written in the easy-to-learn Python scripting language. This permits knowledgeable end-users to customize any of the code with unprecedented ease. If you aren't an advanced user, you can still make use of the integrated GUI and all of EMAN2's command-line programs.

Single GPU Single Node

emClarity | Benjamin Himes

emClarity is a collection of GPU-accelerated software developed to enable determination of biological structures at resolutions better than 1 nm from heterogeneous specimens imaged by cryo-electron tomography.

· Subtomogram averaging
· Very high resolution single particle analysis
· Hybrid electron microscopy

Multi-GPU Single Node

Gautomatch | MRC Laboratory of Molecular Biology

Gautomatch is a GPU-accelerated program for accurate, fast, flexible and fully automatic particle picking from cryo-EM micrographs with or without templates.

· Fast: typically 1.5~2.0s with 15 templates, using a good GPU (e.g. GTX 980, Titan X)
· Fully automatic with a simple command on entire data sets
· Convenient and easy to use
· Flexible: with or without template, suitable for both basic and advanced users
· Compatible with Relion/EMAN
· Background correction: automatically corrects the background gradient that affects the picking
· Rejection of ice/carbon: automatically detects non-particle areas and rejects them
· Post-optimization: scripts available to refilter the coordinates after picking within seconds
· Accuracy: the user's satisfaction is the only 'gold standard' criterion

Single GPU Single Node

GCTF | MRC Laboratory of Molecular Biology

Corrects contrast transfer function effects in electron microscope optics

· CUDA

Single GPU Single Node

Huygens | Scientific Volume Imaging

Huygens Products: Greatly improve your microscope images

· Deconvolution of volumetric images and time series from widefield, confocal, light sheet, super-resolution STED microscopes and more
· Chromatic aberration and cross-talk correction, image stabilization and stitching
· Visualization, tracking, colocalization and object analysis
· Multi-GPU and cluster support

Multi-GPU Single Node


IMOD | University of Colorado
ITK | Kitware
Leginon | New York Structural Biology Center
Microvolution | Microvolution
MotionCor2 | UCSF
PSSR | Waitt Advanced Biophotonics Center Core
RELION | MRC Laboratory of Molecular Biology

IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections. Contains tools for assembling and aligning data within multiple types and sizes of image stacks, viewing 3-D data from any orientation, and modeling and display of the image files.

· ctfphaseflip : Corrects tilt series for microscope CTF by phase flipping
· gputilttest : Test whether a GPU is reliable for computing reconstructions with the tilt program
· 3dmod : Model editing and image display program. 3dmod can display three-dimensional graphic data sets in many views simultaneously, can model these data sets, and can display models and graphic data in 3-D. The views include a slice through the 3D volume, a projection of a sub-volume and orthogonal views with contour overlays.
· xyzproj : Project 3-dimensional data at a series of tilts around the X, Y, or Z axis.

Single GPU Single Node

The National Library of Medicine Insight Segmentation and Registration Toolkit (ITK), or Insight Toolkit, is an open-source, cross-platform C++ toolkit for segmentation and registration. Segmentation is the process of identifying and classifying data found in a digitally sampled representation. Typically the sampled representation is an image acquired from such medical instrumentation as CT or MRI scanners. Registration is the task of aligning or developing correspondences between data. For example, in the medical environment, a CT scan may be aligned with an MRI scan in order to combine the information contained in both.

· Library is used by ParaView, VTK, and many other software distributions
· Many capabilities for multi-dimensional image processing and extraction tools
· Most recent GPU acceleration of FFTs using cuFFT (cuFFTW) and matrix math accelerated through CUDA-enabled Eigen3

Single GPU Single Node

Leginon is a system designed for automated collection of images from a transmission electron microscope.

· A Leginon application is an image acquisition process built from several smaller pieces called 'nodes'
· Nodes can be applications
· Some of these are GPU-accelerated applications such as Topaz, Relion, and MotionCor2

Single GPU Single Node

Nearly instantaneous 3D deconvolution, up to 200 times faster.

· 3D deconvolution for fluorescence microscopy
· Written for use only on GPUs
· Multi-GPU support

Single GPU Single Node

A multi-GPU program that corrects beam-induced sample motion on dose-fractionated movie stacks. Implements a robust iterative alignment algorithm that delivers precise measurement and correction of both global and non-uniform local motions at single pixel level across the whole frame. Suitable for both single-particle and tomographic images.

· Overall, MotionCor2 is extremely robust, and sufficiently accurate at correcting local motions so that the very time-consuming and computationally-intensive particle polishing in RELION can be skipped.
· Works on a wide range of data sets including cryo tomographic tilt series

Multi-GPU Single Node

Deep Learning-Based Point-Scanning Super-Resolution Imaging allows point-scanning super-resolution (PSSR) imaging and facilitates point-scanning image acquisition with otherwise unattainable resolution, speed, and sensitivity.

· Pre-trained models for:
· PSSR for Electron Microscopy (EM)
· PSSR single frame (PSSR-SF) for mitoTracker
· PSSR multiframe (PSSR-MF) for mitoTracker
· PSSR for neuronal mitochondria

Single GPU Single Node

RELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryomicroscopy (cryo-EM).

· Image classification and high resolution refinement accelerated up to 40-fold
· Template-based particle selection accelerated almost 1000-fold
· Reduced memory requirements
· High-resolution cryo-EM structure determination in a matter of days on a single workstation

Multi-GPU Single Node


Thunder | Tsinghua University
Tomviz | Kitware
Topaz | Tristan Bepler
Warp | Max Planck Institute for Biophysical Chemistry

THUNDER is cryo-EM image processing software based on a particle-filter algorithm, used to analyze cryo-EM images in order to obtain a 3D model.

· Both image classification and high-resolution refinement accelerated up to 40-fold
· Template-based particle selection accelerated almost 1000-fold
· Reduced memory requirements
· High-resolution cryo-EM structure determination in a matter of days on a single workstation

Multi-GPU Multi-Node

Tomviz enables 3D characterization of materials at the nano- and meso-scale, tailored for visualizing electron tomography data. It utilizes the large quantities of memory and processing resources required to render, manipulate, and analyze voluminous 3D tomograms.

· 3D tomographic data processing, visualization, and analysis
· Python
· Windows
· Mac OS
· Linux

Single GPU Single Node

A pipeline for particle detection in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples.

· Deep learning for cryo-EM data particle picking
· Uses CUDA and PyTorch

Single GPU Single Node

Warp integrates novel algorithms for frame alignment, defocus estimation, particle picking and tomographic reconstruction in a rich user interface. It enables real-time data quality monitoring and data analysis at the microscope, and can yield high-resolution structures before data collection is over.

· CUDA-enabled processing for electron microscopy
· TensorFlow (v1.10)
· CUDA kernels: backprojection, CTF, deconvolution, FFT, tomography refinement, and others

Single GPU Single Node

MOLECULAR DYNAMICS

APPLICATION NAME
ACEMD

COMPANYNAME
Acellera Ltd

PRODUCT DESCRIPTION
GPU simulation of molecular mechanics force fields, implicit and explicit solvent

SUPPORTED FEATURES
· Written for use only on GPUs.

AMBER

University of California at San Francisco

Suite of programs to simulate molecular dynamics on biomolecules.

· PMEMD Explicit Solvent and GB Implicit Solvent

CHARMM

Harvard University

MD package to simulate molecular dynamics on biomolecules.

· Implicit (5x)
· Explicit (2x)
· Solvent via OpenMM, now ported natively to GPUs

Colvars

Temple University

Software module for molecular simulation and analysis that provides a high-performance implementation of sampling algorithms defined on a reduced space of continuously differentiable functions (also known as collective variables). The module itself implements a variety of functions and algorithms, including free-energy estimators based on thermodynamic forces, non-equilibrium work and probability distributions.

· LAMMPS, NAMD, VMD
· GPU support

Computational Crystallography Toolbox

Lawrence Berkeley Laboratories

Open source component of the PHENIX system to advance automation of macromolecular structure determination. Useful for small-molecule crystallography and even general scientific applications.

· GPU acceleration for scattering and general-purpose math via CUDA and cuFFT

DeePMD-kit

Princeton University

DeePMD-kit is a package written in Python/C++, designed to minimize the effort required to build deep learning-based models of interatomic potential energy and force fields and to perform molecular dynamics (MD). It addresses the accuracy-versus-efficiency dilemma in molecular simulations. Applications of DeePMD-kit span from finite molecules to extended systems and from metallic systems to chemically bonded systems.

· TensorFlow
· High-performance classical MD and quantum (path-integral) MD packages
· Deep Potential series models
· MPI and GPU support


GPU SCALING
Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Multi-Node
Multi-GPU Single Node
Multi-GPU Single Node


DeepSite

Acellera Ltd

DeepSite is a protein binding pocket predictor based on deep neural networks. It lets you upload your structure in PDB format, monitor the progress of your job, and visualize the results with a modern WebGL viewer.

· Deep learning
· Machine learning
· Drug discovery in a web interface

Single GPU Single Node

DESMOND

David E. Shaw Research

High-speed molecular dynamics simulations of biological systems.

· The code uses novel parallel algorithms and numerical techniques to achieve high performance and accuracy

Multi-GPU Single Node

ESPResSo

ESPResSo

Highly versatile software package for performing and analyzing scientific molecular dynamics and many-particle simulations of coarse-grained atomistic or bead-spring models, as used in soft-matter research in physics, chemistry and molecular biology.

· Hydrodynamic / Electrokinetic forces
· P3M electrostatics

Multi-GPU Single Node

FEP+

Schrodinger, Inc.

Molecular Dynamics (MD) and Free Energy Perturbation (FEP) calculations occur on time scales that are computationally demanding to simulate. A key factor in determining whether a simulation will take days, hours, or minutes to run is the hardware being used. The advent of GPU computing, however, has opened the door to a new world of computationally intensive simulations that would not have been possible even a few years ago. Desmond's high-performance Molecular Dynamics code, together with continuously improving computer hardware technologies, is helping scientists push the boundaries of discovery further than ever before. The promise of MD simulations to impact drug discovery has now been realized in FEP+, due to the confluence of hardware and software development along with the formulation of sufficiently accurate theoretical methods and models.

· Optimization of the FEP+ algorithm to take full advantage of the Desmond GPU MD engine, enabling 2 to 4 ligands to be scored per day on a multi-GPU server.

Multi-GPU Multi-Node

Folding@Home

Stanford University

A distributed computing project that studies protein folding, misfolding, aggregation, and related diseases.

· Powerful distributed computing molecular dynamics system
· Implicit solvent and folding

Multi-GPU Single Node

Galamost

CAS-CIAC

GALAMOST is a project that employs high-performance computational techniques to accelerate molecular simulation by fully utilizing the computational power of NVIDIA GPUs. Enables the investigation of polymeric systems on large temporal and spatial scales at very low cost.

· Full Molecular Simulation on GPU

Multi-GPU Multi-Node

GALAMOST

ChangChun CHINA

GALAMOST is a package that employs high-performance computational techniques on many-core processors to accelerate molecular dynamics simulations. The package is written in CUDA and C++ specifically for running on NVIDIA GPUs and focuses on large-scale simulations of soft matter.

· General molecular dynamics
· Dissipative particle dynamics (DPD)
· Brownian dynamics (BD)
· Coarse-graining molecular dynamics (CGMD)
· Reaction model
· Anisotropic particle models
· MD-SCF
· DNA 3SPN model
· Rigid body method
· Stretching method

Single GPU Single Node

Genesis

Diamond Visionics

GenesisRTX is an advanced high-fidelity runtime rendering engine that eliminates the need for traditional off-line database compiling or formatting.

· Powerful parallelization for hybrid (CPU+GPU) systems
· Full electrostatics with PME
· Large (1-100 million atoms) biological systems

Multi-GPU Single Node

GENESIS

RIKEN

GENESIS (GENeralized-Ensemble SImulation System) is a software package for molecular dynamics simulations and trajectory analyses.

· Powerful parallelization for hybrid (CPU+GPU) systems
· Full electrostatics with PME
· Large (1-100 million atoms) biological systems

Multi-GPU Single Node


GPUgrid.net
GROMACS
HALMD
HOOMD-Blue
HTMD
LAMMPS
MELD
MOLECULAR OPERATING ENVIRONMENT
myPresto
NAMD
OpenMM
PolyFTS
SOP-GPU

Acellera Ltd
KTH Royal Institute of Technology
HALMD
University of Michigan
Acellera Ltd
Sandia National Lab
University of Calgary
Chemical Computing Group ULC
N2PC/AIST/JBIC, Japan
University of Illinois at Urbana-Champaign
Stanford University
University of California at Santa Barbara
SOP-GPU

A distributed computing project that uses GPUs for molecular simulations.
Simulation of biochemical molecules with complicated bond interactions

· High-performance all-atom biomolecular simulations
· Explicit solvent and binding
· Implicit (5x) · Explicit (2x) Solvent

Multi-GPU Single Node
Multi-GPU Single Node

Large-scale simulations of simple and complex liquids.

· Simple fluids and binary mixtures (pair potentials, high-precision NVE and NVT, dynamic correlations)

Particle dynamics package written from the ground up for GPUs.

· Written for use only on GPUs

High throughput molecular dynamics simulations.

· Available via Conda and GitHub
· ACEMD
· PMEMD
· NAMD
· GROMACS
· AMBER
· CHARMM force fields
· Adaptive sampling, Markov State Models, visualization, protein preparation and ligand parameterization

Classical molecular dynamics package

· Lennard-Jones · Gay-Berne · Tersoff

OpenMM plugin written for GPUs.

· Integrative approach to combine physics and information
· Orders of magnitude faster protein folding than brute force MD

Calculate and Analyze pH-Dependent Protein Properties. MOEsaic Session Sharing and Project Customization. Determine Conformation Population from NMR NOE Data. Predict Relative Binding Energies with AMBER Thermodynamic Integration.

· GPU Accelerated 3D Stereo Graphics
· AMBER GPU accelerated support

Open Source Computational Drug Discovery Suite.

· High performance virtual screening by MD binding
· Free energy calculation.

Designed for high-performance simulation of large molecular systems.

· Full electrostatics with PME and most simulation features
· 100M atom capable

Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Multi-Node Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Multi-Node Multi-GPU Single Node

Library and application for molecular dynamics for HPC with GPUs.

· Molecular Dynamics toolkit
· Extensible and growing
· Implicit and explicit solvent, custom forces

Multi-GPU Single Node

Classical molecular simulation code for studying polymer self-assembly and thermodynamics.

· Uses auxiliary fields as the fundamental simulation degrees of freedom
· Uses cuFFT extensively (~80%)
· CUDA code is ~20%
· Multi CPU or single GPU per job
· 1x = Ivy Bridge E5-2690 CPU, all 10 cores
· 3-8X on K40 or K80 (utilizing 1/2 of the K80)

Single GPU Single Node

SOP-GPU package for the Self Organized Polymer Model fully implemented on a GPU. A scientific software package designed to perform Langevin Dynamics Simulations of the mechanical or thermal unfolding, and mechanical indentation of large biomolecular systems in the experimental subsecond (millisecond-to-second) timescale.

· Langevin dynamics simulations using the coarse-grained Self Organized Polymer (SOP) model
· Multiple simulation trajectories can be performed simultaneously on a single GPU
· Calpha and Calpha-Cbeta models
· Simulations of protein forced unfolding
· Novel simulations of nanoindentation in silico
· Support for hydrodynamic interactions
· Up to ~100 ms of simulation time per day
· Systems of up to 1,000,000 amino acids (on GPUs with 6 GB or greater memory)

Single GPU Single Node


QUANTUM CHEMISTRY

APPLICATION NAME
Abinit

COMPANYNAME
ABINIT

PRODUCT DESCRIPTION

SUPPORTED FEATURES

Finds the total energy, charge density and electronic structure of systems made of electrons and nuclei within DFT.

· Local Hamiltonian
· Non-local Hamiltonian
· LOBPCG algorithm
· Diagonalization/orthogonalization

GPU SCALING
Multi-GPU Single Node

ACES 4

University of Florida

New SIA/aces4 development. A new super instruction architecture with interface applications for quantum chemistry (aces4).

· Integrating scheduling GPU into SIAL programming language and SIP runtime environment

Multi-GPU Single Node

ACES III

University of Florida

ACES III takes the best features of parallel implementations of quantum chemistry methods for electronic structure.

· Integrating scheduling GPU into SIAL programming language and SIP runtime environment.

Multi-GPU Multi-Node

ADF

Software for Chemistry & Materials

Density Functional Theory (DFT) software package that enables first-principles electronic structure calculations.

· Geometry optimizations and frequency calculations with GGA functionals.

Multi-GPU Single Node

BigDFT

BigDFT

Implements density functional theory by solving the Kohn-Sham equations describing the electrons in a material.

· Daubechies wavelets

Multi-GPU Multi-Node

BrianQC

StreamNovation Ltd.

BrianQC is a software product in the field of quantum chemistry. It accelerates features of Q-Chem 5.0 or later. Optimized for simulating large molecules and tested up to 20,000 Cartesian Gaussian basis functions. Has full support for s, p, d, f and g-type orbitals. Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal, Volta) with double precision accuracy on 64-bit Linux operating systems. Speeds up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.)

· The range of NVIDIA architectures supported by BrianQC has been expanded. In addition to GPUs powered by Kepler, Maxwell and Pascal, BrianQC now supports the NVIDIA Tesla V100 GPU as well
· Compatible with features of Q-Chem 5.0 or later
· Optimized for simulating large molecules
· Tested up to 20,000 Cartesian Gaussian basis functions
· Full support for s, p, d, f and g-type orbitals
· Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal). Double precision accuracy
· Runs on 64-bit Linux operating systems
· Speeds up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.)

Multi-GPU Single Node

CP2K

CP2K

Program to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems.

· DBCSR (sparse matrix multiply library)

Multi-GPU Multi-Node

GAMESS-UK

Open Source

The general purpose ab initio molecular electronic structure program for performing SCF-, DFT- and MCSCF-gradient calculations.

· (ss|ss) type integrals within calculations using Hartree-Fock ab initio methods and density functional theory
· Supports organics and inorganics.

Multi-GPU Multi-Node

GAMESS-US

Ames Laboratory/Iowa State University

Computational chemistry suite used to simulate atomic and molecular electronic structure.

· Libqc with Rys Quadrature Algorithm
· Hartree-Fock
· MP2 and CCSD

Multi-GPU Multi-Node

Gaussian

Gaussian, Inc.

Predicts energies, molecular structures, and vibrational frequencies of molecular systems.

· Joint NVIDIA, PGI and Gaussian collaboration

Multi-GPU Single Node

GPAW

GPAW

Real-space grid DFT code written in C and Python.

· Electrostatic Poisson equation
· Orthonormalizing of vectors
· Residual minimization method (rmm-diis)

Multi-GPU Multi-Node

gWL-LSMS

ORNL

Materials code for investigating the effects of temperature on magnetism.

· Generalized Wang-Landau method

Multi-GPU Multi-Node

LATTE

Open Source

Density matrix computations

· CU_BLAS
· SP2 Algorithm

Multi-GPU Single Node


libxc

TDDFT

Libxc is a library of exchange-correlation functionals for density-functional theory. It provides a portable, well-tested and reliable set of exchange and correlation functionals that can be used by all the ETSF codes and also by other codes.

· GPU acceleration for quantum chemistry
· LDA, GGA, hybrids and mGGA
· Python 3 and C interfaces

Multi-GPU Single Node

LSDalton

LSDalton

Linear-scaling HF and DFT code suitable for large molecular systems, now also with some CCSD capabilities. Tensor Algebra Library Routines for Shared Memory Systems (TAL-SH) is being used to GPU-accelerate three (3) CAAR codes: NWChem, LSDALTON and DIRAC.

· (T) correction to the CCSD energy
· RI-MP2 energy/gradient (in development)
· CCSD energy (in development)
· GPU-based ERI generator (in development)

Multi-GPU Single Node

MAPS

Scienomics

The MAPS CLASSICAL & MESOSCALE simulation toolkit contains world-class simulation engines such as LAMMPS, CHAMELEON, TOWHEE and NAMD. Includes a collection of ready-to-use workflows and a rich force-field library.

· Typical calculations that can be executed include molecular dynamics simulations and Monte Carlo simulations, and structure relaxation in periodic or molecular systems using both classical and quantum mechanics tools
· Trajectories can be generated and then later analyzed using the appropriate tools
· Additional simulations can be performed using PC-SAFT and related methods for thermodynamics modeling

Single GPU Single Node

MOLCAS

MOLCAS

Methods for calculating general electronic structures in molecular systems in both ground and excited states.

· CU_BLAS

Multi-GPU Single Node

MOPAC2012

MOPAC

Semiempirical Quantum Chemistry

· Pseudodiagonalization
· Full diagonalization
· Density matrix assembling via Magma libraries

Single GPU Single Node

NWChem

NWChem

NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.

· Triples part of Reg-CCSD(T)
· CCSD and EOMCCSD task schedulers

Multi-GPU Single Node

NWChemEX

Pacific Northwest National Laboratories

NWChemEx targets developing high-performance computational models for the production of advanced biofuels and other bioproducts.

· GPU acceleration
· Libraries like libxc

Single GPU Single Node

Octopus

Harvard University

Used for ab initio virtual experimentation and quantum chemistry calculations.

· Full GPU support for ground-state, real-time calculations
· Kohn-Sham Hamiltonian
· Orthogonalization
· Subspace diagonalization
· Poisson solver
· Time propagation
· DFT application

Single GPU Single Node

PEtot

Lawrence Berkeley Laboratories

First principles materials code that computes the behavior of the electron structures of materials.

· Density functional theory (DFT) plane wave pseudopotential calculations

Multi-GPU Single Node


QBox

University of California Davis

Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Designed for operation on large parallel computers.

· The availability of double precision graphics cards provides an opportunity to speed up electronic structure computations. We modify the Qbox code to utilize Fermi GPUs on the Keeneland platform
· We use the CUFFT library to speed up Fourier transforms and perform asynchronous communication to cut down the cost of data transfers
· The modified code is used in simulations of a 64-molecule water system with an 85 Ry plane wave energy cutoff
· Preliminary results show a 2-3 times speedup in the calculation of the charge density and in the application of the Hamiltonian operator to the wave function
· We present these findings as well as further speedups measured in other parts of the code. http://eslab.ucdavis.edu/software/qbox http://keeneland.gatech.edu

Single GPU Single Node

Q-CHEM

Q-Chem Inc.

Computational chemistry package designed for HPC clusters.

· Various features including RI-MP2

Single GPU Single Node

QMCPACK

QMCPACK

QMCPACK is an open-source, production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids.

· Main features

Multi-GPU Multi-Node

Quantum Espresso

Quantum Espresso Foundation

An integrated suite of computer codes for electronic structure calculations and materials modeling at the nanoscale.

· PWscf package: linear algebra (matrix multiply)
· Explicit computational kernels
· 3D FFTs

Multi-GPU Multi-Node

QUICK

Michigan State University

QUICK is a GPU-enabled ab initio quantum chemistry software package.

· Running Hartree-Fock and DFT energy on GPU
· Supports s, p, d, f orbitals on energy calculation
· HF gradient with s, p, d orbital support
· GPU-based ERI generator

Multi-GPU Single Node

RESCU

Hongzhiwei technology

RESCU is KS-DFT calculation software that can study very large systems with only a small computer. Offers a new, extremely powerful, parallel and highly efficient KS-DFT self-consistent calculation method.

· Parallel high efficiency processing - KS-DFT

Multi-GPU Single Node

RMG

North Carolina State University

RMG is a density functional theory (DFT) based electronic structure code that uses real space grids to represent wavefunctions, charge densities, and ionic potentials. Designed for scalability and runs successfully on systems with thousands of nodes (including GPU nodes) and hundreds of thousands of CPU cores.

· Supports 10k+ GPU nodes
· Multipetaflops capable
· Handles thousands of atoms with full DFT precision
· Supports multiple GPUs per node
· Fully open source
· Installation support
· Cray XE6/XK7

Multi-GPU Single Node

TAL-SH

Oak Ridge National Lab

Tensor Algebra Library Routines for Shared Memory Systems accelerates three (3) CAAR codes: NWChem, LSDALTON and DIRAC.

· Tensor Algebra Library for Shared Memory Computers: Nodes equipped with multicore CPU, NVIDIA GPU, and Intel Xeon Phi (in progress)

Multi-GPU Multi-Node

TeraChem

PetaChem LLC

Quantum chemistry software designed to run on NVIDIA GPUs.

· Full GPU-based solution; Performance compared to GAMESS CPU version

Multi-GPU Single Node

VASP

University of Vienna

Complex package for performing ab initio quantum-mechanical molecular dynamics (MD) simulations using pseudopotentials or the projector-augmented wave method and a plane wave basis set.

· Blocked Davidson (ALGO = NORMAL & FAST)
· RMM-DIIS (ALGO = VERYFAST & FAST)
· K-Points and optimization for critical step in exact exchange calculations

Multi-GPU Multi-Node


(MOLECULAR) VISUALIZATION AND DOCKING

APPLICATION NAME
Amira

COMPANYNAME
Thermo Fisher Scientific

PRODUCT DESCRIPTION

SUPPORTED FEATURES

A multifaceted software platform for visualizing, manipulating, and understanding Life Science and bio-medical data.

· 3D visualization of volumetric data and surfaces

GPU SCALING
Single GPU Single Node

AUTODOCK

Scripps

The AutoDock Suite is a growing collection of methods for computational docking and virtual screening, for use in structure-based drug discovery and exploration of the basic mechanisms of biomolecular structure and function.

· OpenCL-accelerated version of AutoDock4.2.6
· AutoDock GPU
· ADADELTA

Single GPU Single Node

BINDSURF

Universidad Catolica de Murcia

A virtual screening methodology that uses GPUs to determine protein binding sites.

· Allows fast processing of large ligand databases

Single GPU Single Node

BUDE

Bristol University Docking Station

Molecular docking program

· Empirical Free Energy Force field

Single GPU Single Node

FastROCS

OpenEye Scientific Software, Inc.

Molecule shape comparison application

· Real-time shape similarity searching/comparison

Multi-GPU Multi-Node

Interactive Molecule Visualizer

University of Illinois

Experimental interactive molecule visualizer based on a ray-tracing engine.

· High quality images and ease of interaction
· Latest GPU computing acceleration techniques
· Natural user interfaces such as Kinect and Wiimotes

Single GPU Single Node

MEGADOCK

Akiyama Laboratory, Tokyo Institute of Technology

MEGADOCK is fast protein-protein docking software for cases where more acceleration is demanded, such as interactome prediction, which involves millions of protein pairs.

· MEGADOCK-GPU on 12 CPU cores and 3 GPUs: calculation speed 37.0 times faster than MEGADOCK on 1 CPU core
· Novel docking software facilitating the application of docking techniques to assist large-scale protein interaction network analyses

Multi-GPU Single Node

Molegro Virtual Docker 6

QIAGEN

Method for performing high accuracy flexible molecular docking.

· Energy grid computation
· Pose evaluation
· Guided differential evolution

Single GPU Single Node

PIPER Protein Docking

Boston University

Protein-protein docking program

· Molecule docking

Single GPU Single Node

PyMol

Schrodinger, Inc.

User-sponsored molecular visualization system on an open-source foundation.

· Lines: 460% increase
· Cartoons: 1246% increase
· Surface: 1746% increase
· Spheres: 753% increase
· Ribbon: 426% increase

Single GPU Single Node

VEGA ZZ

University of California, San Francisco

Molecular Modeling Toolkit

· Virtual logP
· Molecular surface values

Single GPU Single Node

VMD

University of Illinois

Visualization and analysis of large biomolecular systems in 3D graphics.

· High quality rendering
· Large structures (100M atoms)
· Analysis and visualization tasks
· Multiple GPU support for display of molecular orbitals

Multi-GPU Single Node

Research: Higher Education and Supercomputing

NUMERICAL ANALYTICS

APPLICATION NAME

COMPANYNAME

ArrayFire

ArrayFire

PRODUCT DESCRIPTION
ArrayFire helps organizations develop high-performance computing solutions on modern computational platforms. Specializes in machine learning and computer vision. Uses CUDA and OpenCL programming, code acceleration and optimization, and software design.

SUPPORTED FEATURES
· Vector Algorithms
· Image Processing
· Computer Vision
· Signal Processing
· Linear Algebra
· Statistics

GPU SCALING
Multi-GPU Single Node


Eigen
Julia
Mathematica
MATLAB
NMath Premium

PHYSICS

APPLICATION NAME
AWP
BQCD
CADISHI
CASTRO
Changa
Chemora

Eigen
Julia Computing
Wolfram
Mathworks
NMath

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Julia delivers dramatic improvements in simplicity, speed, scalability, capacity, and productivity to solve massive computational problems quickly and accurately, making it the preferred language for big data analytics.
A symbolic technical computing language and development environment.
GPU acceleration for MATLAB (high-level technical computing language).
GPU-accelerated math and statistics for .NET, automatically detects the presence of a CUDA-enabled GPU at runtime and seamlessly redirects appropriate computations to it.

· CUDA-enabled linear algebra
· eigen solver, reduction, random, etc.
· NVIDIA CUDA via Julia CUDA JIT plugin architecture
· Parallelism and distributed computation
· Lightweight "green" threading (coroutines)
· Unicode, including but not limited to UTF-8
· Call C
· Lisp-like macros and other metaprogramming facilities
· Development environment for CUDA and OpenCL
· GPU acceleration for Wolfram Finance Platform
· Acceleration for 200+ most used MATLAB functions
· Acceleration of more than 500 most parallelizable MATLAB functions
· Accelerated Signal Processing toolkit
· Accelerated Image Processing toolkit
· Accelerated Communications Systems toolkit
· Available via an NGC container
· Automatically offloads computations to the GPU.

Single GPU Single Node Multi-GPU Multi-Node
Multi-GPU Single Node
Multi-GPU Single Node
Single GPU Single Node

COMPANYNAME
AWP
USQCD Max Planck Institute
CASTRO
CHANGA
CHEMORA

PRODUCT DESCRIPTION
The Anelastic Wave Propagation code, AWP-ODC, independently simulates the dynamic rupture and wave propagation that occur during an earthquake. Dynamic rupture produces friction, traction, slip, and slip rate information on the fault. The moment function is constructed from this fault data and used to initialize wave propagation.

SUPPORTED FEATURES
· 3D Finite Difference Computation

Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations.

· Wilson-clover fermion linear solver

CADISHI is a software package that enables scientists to compute (Euclidean) distance histograms efficiently. Any sets of objects that have 3D Cartesian coordinates may be used as input, for example, atoms in molecular dynamics datasets or galaxies in astrophysical contexts.

· Highly tuned CPU and GPU kernels
· Python engine for throughput computing
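
To make the idea of a (Euclidean) distance histogram concrete, here is a minimal pure-Python sketch. It is not CADISHI's API or implementation; the function name `distance_histogram` and its parameters are hypothetical, and the brute-force pair loop is chosen only for clarity.

```python
import math
from itertools import combinations


def distance_histogram(points, bin_width, n_bins):
    """Count pairwise Euclidean distances into equal-width bins.

    Hypothetical helper for illustration only; distances at or beyond
    bin_width * n_bins are ignored.
    """
    counts = [0] * n_bins
    for p, q in combinations(points, 2):
        d = math.dist(p, q)           # Euclidean distance in 3D
        idx = int(d / bin_width)      # index of the bin containing d
        if idx < n_bins:
            counts[idx] += 1
    return counts


# Four points with 3D Cartesian coordinates (made-up sample data).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
hist = distance_histogram(points, bin_width=1.0, n_bins=4)
print(hist)  # [0, 1, 2, 3]
```

The loop visits every pair once, so the cost grows quadratically with the number of objects, which is why tuned CPU and GPU kernels matter for large datasets.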

A multicomponent compressible hydrodynamic code for astrophysical flows including self-gravity, nuclear reactions and radiation. CASTRO uses an Eulerian grid and incorporates adaptive mesh refinement (AMR).

· Gravitational Field Solver

Astrophysics code that performs collisionless N-body simulations, including cosmological simulations with periodic boundary conditions in comoving coordinates and simulations of isolated stellar systems.

· Gravitational Model has been accelerated using CUDA

Chemora is a system for performing simulations of systems described by differential equations running on accelerated computational clusters.

· Chemora embeds the equations' computational kernels into dynamically compiled loop nests shaped for input size and GPU structure

GPU SCALING
Single GPU Single Node
Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node
Single GPU Single Node
Multi-GPU Single Node


Cholla
Chroma
CPS
CPS (GRID)
CST PARTICLE STUDIO
GADGET
GAMER
GENE
GPU-AH
GPUwalls
GTC

Cholla
USQCD
USQCD
USQCD
Dassault Systèmes SIMULIA Corp.
Max Planck Institute
Open Source
GENE
Universidade do Porto
Universidade do Porto
University of California Irvine (UC Irvine)

Computational Hydrodynamics On ParaLLel Architectures for Astrophysics

· Models the Euler equations on a static mesh and evolves the fluid properties of thousands of cells simultaneously using GPUs
· It can update over ten million cells per GPU-second while using an exact Riemann solver and PPM reconstruction, allowing computation of astrophysical simulations with physically interesting grid resolutions (>256^3) on a single device; calculations can be extended onto multiple devices with nearly ideal scaling beyond 64 GPUs
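
As a rough worked example of the throughput figure above, the sketch below estimates how long one full update of a 256^3 grid would take on a single GPU. It assumes exactly ten million cell updates per GPU-second; since the quoted rate is "over ten million", the result is an upper bound. The function name is hypothetical and not part of Cholla.

```python
def seconds_per_update(cells_per_side, cells_per_gpu_second=10_000_000):
    """Return (total cells, seconds for one full-grid update on one GPU).

    Assumes a cubic grid and the quoted lower-bound update rate.
    """
    total_cells = cells_per_side ** 3
    return total_cells, total_cells / cells_per_gpu_second


cells, secs = seconds_per_update(256)
print(cells, round(secs, 2))  # 16777216 1.68
```

Under these assumptions a 256^3 grid holds about 16.8 million cells, so one full update takes under two seconds on one device.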

Lattice Quantum Chromodynamics (LQCD)

· Wilson-clover fermions
· Krylov solvers
· Domain-decomposition

Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations.

· Wilson, domain-wall and Möbius fermion linear solvers

CPS is developed for lattice QCD and written in C++, with some machine-specific assembly routines. It is being developed by members of Columbia University, Brookhaven National Laboratory. CPS consists of code to build a library that can be statically linked to your code to create an executable. CPS has optimized code for QCDOC and IBM Blue Gene machines, and builds for scalar machines or parallel machines with QMP.

· CUDA is supported
· The GRID code from Edinburgh is currently being optimized.

Self-consistent simulation of charged particles in electromagnetic fields

· Particle-in-Cell Solver

A code for cosmological simulations of structure formation.

· MPI

A GPU-accelerated Adaptive Mesh Refinement Code for astrophysical applications. Currently the code solves the hydrodynamics with self-gravity.

· Adaptive mesh refinement (AMR). Hydrodynamics with self-gravity
· A variety of GPU-accelerated hydrodynamic and Poisson solvers
· Hybrid OpenMP/MPI/GPU parallelization
· Concurrent CPU/GPU execution for performance optimization. Hilbert space-filling curve for load balance

GENE (Gyrokinetic Electromagnetic Numerical Experiment) is an open source plasma microturbulence code which can be used to efficiently compute gyroradius-scale fluctuations and the resulting transport coefficients in magnetized fusion/astrophysical plasmas.

· Basic Modeling

Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPU-AH simulates the evolution of a network of line-like topological defects - Abelian-Higgs cosmic strings - in a cosmic context.

· Calculates average network density and velocity

Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPUwalls simulates the evolution of a network of the simplest topological defects - domain walls - in a cosmic context.

· Calculates average network density and velocity

Gyrokinetic Plasma Fusion for Modeling a Tokamak reactor

· NVLINK

Multi-GPU Single Node
Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Multi-Node
Multi-GPU Multi-Node Multi-GPU Multi-Node Multi-GPU Single Node
Multi-GPU Multi-Node
Single GPU Single Node
Single GPU Single Node
Multi-GPU Multi-Node


GTC Irvine
GTC-P HACC
HAMR GPU MAESTRO MILC

University of California Irvine(UC Irvine)
Princeton Plasma Physics Lab HACC
HAMR
MAESTRO USQCD

The gyrokinetic toroidal code (GTC) is a massively parallel particle-in-cell code for turbulence simulation in support of the burning plasma experiment ITER, the crucial next step in the quest for fusion energy. GTC is the production code for the multi-institutional US Department of Energy (DOE) Scientific Discovery through Advanced Computing (SciDAC) project, the GSEP Center (Gyrokinetic Simulation of Energetic Particle Turbulence and Transport), and a DOE INCITE project that was awarded 35M hours of CPU time for 2011. Currently maintained at UC Irvine, GTC was the first fusion code to reach a teraflop in production simulations, in 2001 on the Seaborg computer at NERSC, and a petaflop, in 2008 on the Jaguar computer at ORNL. A GTC simulation of turbulence self-regulation by zonal flows was published in a 1998 Science paper, which has received the most citations of any magnetic fusion research paper published since 1996.

· PUSHe, Collision and Poisson Solver

A development code for optimization of plasma physics. Full science and data sets are included, but in a simplified form to allow performance testing and tuning.

· Optimized with CUDA
· OpenACC development underway

Simulates N-Body Astrophysics. The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles.

· This code has been optimized with CUDA and runs in full production mode
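
For readers unfamiliar with the scaling terms in the description above, the sketch below shows the usual way parallel efficiency is computed for strong and weak scaling runs. The function names and timing values are hypothetical and are not measurements from HACC.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Fixed total problem size: ideal runtime shrinks in proportion to n / n_base."""
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed problem size per processor: ideal runtime stays constant."""
    return t_base / t_n


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(f"strong: {strong_scaling_efficiency(100.0, 1, 12.6, 8):.3f}")
    print(f"weak:   {weak_scaling_efficiency(50.0, 52.0):.3f}")
```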

GPU accelerated General Relativistic Magneto Hydrodynamic application

· Active galactic nuclei, assuming a radiatively inefficient sub-Eddington rate torus
· Axisymmetric ideal MHD
· Viscosity and resistivity through use of Riemann solver (HLL)
· Density floors to mass load the jet
· Uses grids that can resolve the substructure of the jet over 5 orders of magnitude

A low Mach number stellar hydrodynamics code that can be used to simulate long-time, low-speed flows that would be prohibitively expensive to model using a traditional compressible code.

· Gravitational Field Solver

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the strong force to create larger particles like protons and neutrons.

· Staggered fermions
· Krylov solvers
· Gauge-link fattening
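
As a small illustration of the Krylov solvers listed above, the sketch below implements the conjugate gradient method for a symmetric positive-definite system in plain Python. It is a generic textbook version on a tiny dense matrix, not the solver used in any lattice QCD code; `conjugate_gradient` and `matvec` are hypothetical names.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]  # small SPD matrix, for illustration only
    solution = conjugate_gradient(
        lambda v: [sum(a * vi for a, vi in zip(row, v)) for row in A],
        [1.0, 2.0],
    )
    print(solution)
```

The solver only needs a matrix-vector product, which is why Krylov methods suit lattice problems where the operator is applied on the fly rather than stored.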

Multi-GPU Multi-Node
Multi-GPU Single Node Multi-GPU Single Node
Multi-GPU Single Node
Multi-GPU Single Node Multi-GPU Multi-Node


NekCEM ORB5
OSIRIS PIConGPU PPM QUDA RAMSES samadii/sciv XGC

ANL
EPFL
UCLA Plasma Physics Group HZDR PPM USQCD CEA Metariver Technology PPPL

A high-fidelity, open-source electromagnetics solver based on spectral element and spectral element discontinuous Galerkin methods, written in Fortran and C.
ORB5 is a global, gyrokinetic, Lagrangian, Particle-In-Cell (PIC), finite element, electromagnetic model.
Simulates plasma physics, including laser interaction.
A relativistic Particle-in-Cell code that describes the dynamics of a plasma by computing the motion of electrons and ions subject to the Maxwell-Vlasov equation.
The piecewise parabolic method is a higher-order extension of Godunov's method that uses spatial interpolation and allows for a steeper representation of discontinuities, particularly contact discontinuities.
Library for Lattice QCD calculations using GPUs.
Simulates astrophysical problems on different scales (e.g. star formation, galaxy dynamics, cosmological structure formation).
Software for computing the flow field in high-vacuum conditions using the DSMC (Direct Simulation Monte Carlo) method. Simulates the interactions between gas and surface boundaries and the gas flow with molecular particles.
Simulates edge effects for MHD plasma physics.

· The OpenACC implementation covers all solution routines for the Maxwell equation solver in NekCEM, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest-neighbor flux exchanges

Multi-GPU Multi-Node

· Plasma and background magnetic geometry
· Axisymmetric ideal MHD equilibria computed with the CHEASE code [9]
· Kinetic electrons, or various approximate models: hybrid-trapped or adiabatic
· Intra- and inter-species linearized collision operators
· Electromagnetic perturbations, with the cancellation problem solved using enhanced control variates and a `pullback' scheme

Multi-GPU Multi-Node

· 2 dimensions of the particle push have been optimized with CUDA
· Additional optimization is being planned with OpenACC

Multi-GPU Single Node

· Simulation of laser-particle acceleration and relativistic plasma physics

Multi-GPU Multi-Node

· Turbulent, compressible mixing of gases in the context of stars near the ends of their lives and also in inertial confinement fusion

Single GPU Single Node

· QUDA supports the following fermion formulations: Wilson, Wilson-clover, Twisted mass, Improved staggered (asqtad or HISQ) and Domain wall
· GPU acceleration
· Radiative transfer for reionization
· Hydrodynamic solver using AMR

Multi-GPU Single Node
Multi-GPU Multi-Node

· DSMC simulator, gas dynamics solver
· OLED & Semiconductor deposition and etching analysis, Vacuum field analysis
· PDL (Pixel Define Layer) growth analysis
· Deposition mask toolkits, Wall growth, Chemical reaction
· The particle push portion has been optimized with CUDA and is being fully optimized with OpenACC and CUDA

Multi-GPU Multi-Node
Multi-GPU Multi-Node

SCIENTIFIC VISUALIZATION

APPLICATION NAME
Animator

COMPANY NAME
GNS

PRODUCT DESCRIPTION
Industry proven, modern post-processing app for CAE

Ansys EnSight

ANSYS

Industry proven post-processing app for CAE

FieldView

Intelligent Light

Visualization application for CFD

HVR (LCSE, U of Minnesota)
IndeX

University of Minnesota
NVIDIA

Interactive volume rendering application
Interactive distributed volumetric compute and visualization framework.

Inside Explorer

Interspectral

An interactive and intuitive software with volumetric rendering and 3D-visualization of real captured data.

SUPPORTED FEATURES
· Rendering
· Rendering
· Ray tracing
· Rendering
· Volume rendering
· Parallel distributed 3D rendering of dense or sparse volumes
· Accurate ray casting or ray tracing at high resolution of full size datasets
· Plug-in to ParaView also available.
· vGPU

GPU SCALING
Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node
Single GPU Single Node


ParaView Pix4Dmapper
SPECFEM3D
Tecplot VisIt vl3 (Argonne National Lab)

Kitware Pix4D
CIG
Tecplot LLNL Argonne National Lab

Scalable data analysis and visualization application. One of the main vis tools at HPC sites.

· Rendering and analysis tasks
· Plugin for NVIDIA IndeX
· OptiX rendering backend
· CUDA accelerated filters (data transformation routines)

This professional photogrammetry software uses images to generate point clouds, digital surface and terrain models, orthomosaics, textured models and more. It is most often used by geospatial professionals such as surveyors and civil engineers.

· GPU accelerated processing

There are two modules/apps in the SPECFEM family: GLOBE and CARTESIAN. The global model, GLOBE, is the former Gordon Bell Award-winning code and is used for global inversion. It is also part of the CAAR effort (although that effort is mostly focused on workflow rather than the actual model). The regional model is CARTESIAN; it is the app used for seismic simulations, earthquake models, submarine acoustics, etc. In addition to being used as a community app, SPECFEM3D is also used as a proxy app for proprietary codes.

· OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library
· Simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra (structured or not).

General purpose scientific visualization software for Aerodynamics, O&G, Internal Combustion and Geoscience applications

· Rendering

Scalable data analysis and visualization application

· Rendering and analysis tasks

Large dataset visualization in cosmology, astrophysics, and biosciences fields.

· Volume rendering of particles

Multi-GPU Multi-Node
Single GPU Single Node
Multi-GPU Single Node
Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node

Smart Spaces

APPLICATION NAME
AI-NVR

COMPANY NAME
IronYun

Alert

Irvine Sensors

Arvas

VI Dimensions

BioSurveillance NEXT, BioFinder

Herta Security

PRODUCT DESCRIPTION
Search in Video, Real time intrusion detection

SUPPORTED FEATURES
· Search amongst 1000s of videos for interesting activities or attributes.

GPU SCALING
Single GPU Single Node

Alert provides people counting and intrusion detection

· People counting
· Intrusion detection

Single GPU Single Node

ARVAS is an Intelligent Video Analytics solution that uses advanced statistical modelling based on deep machine learning technology to detect anomalies. This automated approach enables more accurate detection of complex risk patterns that would otherwise escape human analysts and cause high false-alarm rates.

· Abnormality detection features: break-ins, robbery, rioting, floods, accidents, fights, arson, fire, maintenance and vandalism.

Single GPU Single Node

Real time facial recognition and forensic alerts against multiple watchlists.

· Supports crowded scenes and difficult lighting
· Faster than real-time analysis
· Partial face concealment

Multi-GPU Single Node


Cezurity EVO

Cezurity

Cylance FaceControl

Cylance VOCORD

Glueck Media; Glueck Analytics

Glueck

Ikena Forensic, Ikena Spotlight

MotionDSP

iMotionFocus

iCetana

IZA500G On Edge Processing ALPR System

Inex/Zamir

Nodeflux IVA

Nodeflux

OpenALPR

OpenALPR

Event Observer (EvO): engine for detecting malicious activity on user computers. Centralized detection engine; Event chains; Context; Real-time analysis - Cezurity Cloud: Cloud-based technology for detecting malware. Cezurity Cloud has the flexibility to fit into diverse solutions. Different information can be sent and processed by the server, depending on the needs of each product or solution. For example, Cezurity Cloud is currently used as a subsystem to supply data for the Cezurity EvO detection engine. Cezurity Cloud helps the Anti-Virus Scanner to detect malware. In addition, the technology is used for monitoring and analyzing changes in our APT-D solution designed to detect persistent threats against corporate networks.

· CUDA

Multi-GPU Single Node

Advanced AI-based endpoint malware detection.

· Endpoint malware detection solution
· GPU deep learning technology

Multi-GPU Single Node

Detects and recognizes the faces of people freely passing by cameras, provides an instant alert for people on a watchlist, recognizes age and gender, counts people by faces, and tags newcomers and regular visitors. The system uses deep neural network algorithms and performs recognition with extremely high accuracy in field applications.

· Non-cooperative biometric facial recognition system
· ALPR
· Video analytics and pattern recognition
· Video processing and video enhancement

Multi-GPU Multi-Node

Deep Learning/Machine Learning based computer vision technology that enables understanding of how humans feel and perceive the environment around them, focusing on face and people analytics.

· Facial Expression
· Age Estimation
· Gender
· Ethnicity
· Multi Face Tracking
· Attention Time

Multi-GPU Single Node

Real-time (render-less) super-resolution-based video enhancement and redaction software for forensic analysts and law enforcement professionals.

· Multi-filter, render-less video reconstruction (super-resolution, stabilization, light/color correction)
· Automatic tracking for redaction of video from body cameras, CCTV and other sources

Multi-GPU Single Node

Intelligent analysis of video on 1,000+ camera streams to significantly filter and reduce the camera streams requiring an operator view.

· GPU accelerated machine learning
· Identifies abnormal activity within video streams

Multi-GPU Single Node

The IZA500G with processing-on-edge combines two sensors (OV and LPR), a quad core processor, and ALPR software in a single housing, delivering crystal clear images, automatically recognized license plate data, GPS coordinates, and streaming video.

· Operating Distance: 9-19 ft (3-6 m); 16-32 ft (5-10 m)
· Vehicle Speed Range: 0-120 mph (0-193 km/h)
· Field of View: 12 ft (3.66 m)

Single GPU Single Node

Nodeflux IVA products and services cover a wide range of sectors, including but not limited to smart city, defense and security, traffic management, toll management, store analytics (wholesale and retail), asset and facilities management, advertising, and transportation.

· Face recognition
· License plate recognition
· Traffic violation detection
· Traffic monitoring and flood monitoring

Multi-GPU Single Node

Automatic license plate and vehicle make/model/year recognition software applied to video streams from IP cameras.

· High accuracy license plate character recognition spanning North America, Europe, United Kingdom, Australia, Korea, Singapore and Brazil
· APIs and source code available for embedded applications and web services

Multi-GPU Single Node


Operating Room Efficiency

Artisight, Inc.

Patient Location Tracking

Artisight, Inc.

Recotraffic; Recosecure; Recohospital

Recogine

SenDISA Platform

Sensen Networks

Syndex Pro

Briefcam

Telemonitoring

Artisight, Inc.

Artisight's Operating Room Efficiency solution improves operating room productivity with an intelligent sensor network and machine learning algorithms. Delivers real-time access to the actionable data needed to improve your operating room productivity while ensuring HIPAA compliance. Deep-learning prediction helps reduce costs, improve productivity, increase profitability and provide clinicians with a safer, more efficient operating room environment.

· Independently validated de-identification protocols
· Machine learning algorithms
· Intelligent cameras and Bluetooth sensors
· Advanced interoperability
· Highly granular analytics

Multi-GPU Multi-Node

An IoT sensor network for healthcare, Artisight's intelligent solutions improve organizational operations and financial performance. Designed by physicians, AI scientists and operational experts, Artisight's patient location tracking system uses data in a HIPAA-compliant platform to solve the challenges of moving people efficiently around and through a hospital system.

· Independently validated de-identification protocols
· Machine learning algorithms
· Intelligent cameras and Bluetooth sensors
· Advanced interoperability
· Highly granular analytics

Multi-GPU Multi-Node

Intelligent Transportation Systems covering complex multi-modal surface transportation solutions at a regional, sub-regional, corridor and small area level using deep computer vision technologies.

· Traffic Data Collection
· Incident Detection
· Integrated Management
· Vehicle Classification and supporting related application

Multi-GPU Single Node

SenSen provides Video-IoT data analytic software solutions targeted at increasing revenue and reducing the cost of operations of customers. SenSen software can process and fuse data from cameras and other sensors like GPS, Radar, and Lidar in real time for parking guidance, parking enforcement, speed enforcement, traffic data analytics and road safety applications. Casinos use SenSen solutions for table game analytic solutions and customer analytics. SenSen solutions are also used in retail, security and tolling applications.

· Intelligent Transportation - parking enforcement
· Casino game table analytics

Single GPU Single Node

Improved security and operations by turning video data into useful information. Based on Video Synopsis technology, Syndex Pro allows users to review hours of video in minutes, while applying search filters for achieving accurate results and faster time-to-target. Data can be processed on-demand or in real time to support a wide range of use cases.

· Review hours of video in minutes
· Search in Video

Single GPU Single Node

Artisight's Telemonitoring solution uses a constellation of thousands of intelligent pan, tilt, and zoom cameras with two-way audio to allow for the simultaneous monitoring of multiple patients from a single workstation. Provides constant visual and verbal contact with patients, while reducing personal protective equipment consumption as well as front-line workers' exposure to the virus.

· Monitor up to 12 patients per screen, with 6 screens per station
· High definition 1080p video
· 2-way audio with push-to-talk functionality
· Intuitive on-screen controls for responsive pan, tilt, and zoom
· Privacy screen for patient and staff autonomy

Multi-GPU Multi-Node


Telesitting

Artisight, Inc.

Tera, Tera+, Tera Vortex

SmartCow

Thermal Screening

Artisight, Inc.

XRVision, IoP

XRVision

With Artisight's Intelligent Telesitting solution, your hospital can provide safe, accurate remote patient monitoring around the clock. Intelligent Telesitting allows a single staff member to remotely monitor multiple patients simultaneously, providing better oversight of each patient. Not only does this dramatically decrease staffing costs, it also provides more comprehensive information in real time to help avoid costly falls.
Embedded and Backend video analytics for real-time insights from your security and service-related monitoring systems.
Thermal imaging eliminates the obstacles associated with manual screening and maintains the safety of your screening staff. Our thermal imaging camera can screen thousands of people every hour, and its flexible viewing options mean you'll spend less on staffing. It's easy to configure, requires minimal training for operation and is accurate to within +/-0.3 degrees Celsius.
Face Recognition and Video Analytics for Uncontrolled, Crowded and In Motion Environments

· Machine learning algorithms that prevent falls and pressure ulcers
· Automated bed capacity management and throughput coordination
· Multiple video feeds on one screen, and multiple tabs per browser
· Bi-directional audio with HD pan-tilt-zoom cameras
· System available in mobile or fixed ceiling versions
· Automatic number plate recognition
· Traffic Management
· Smart Car Parking Policy
· Accident Detection
· Dynamic temperature adjustment based on ambient humidity and temperature
· Intuitive multi-touch and slider-based interface
· Machine learning algorithms
· Wi-fi access gateway processes and broadcasts
· Encrypted video feeds for enhanced stability, security, and privacy
· Bluetooth integration for fully autonomous screening
· Face Recognition and Video Analytics
· Smart City, Public Safety, Transportation Analytics, Retail Analytics, Ordinance and Environment Safety

Multi-GPU Multi-Node
Multi-GPU Single Node Multi-GPU Multi-Node
Multi-GPU Single Node

Tools and Management

APPLICATION NAME
Acrobat

COMPANY NAME
Adobe

PRODUCT DESCRIPTION
Apps & web services to view, create, manipulate, print and manage files in PDF (Portable Document Format)

SUPPORTED FEATURES
· AI inference & training in the cloud

GPU SCALING
Single GPU Single Node

Altair Access

Altair

A simple, powerful, and consistent portal for submitting and monitoring jobs on remote clusters and clouds, and for remote visualization. Brings high-end 3D visualization datacenter hardware right to the user.

· 3D Remote Visualization
· High-fidelity collaboration
· Integrated with Altair PBS Professional for scheduling and control on GPU use and accounting

Multi-GPU Multi-Node

Altair Grid Engine

Univa

Altair Grid Engine is a leading distributed resource management system for optimizing workloads and resources in thousands of data centers. Improves performance, productivity and efficiency. Optimizes throughput and performance of applications, containers, and services while maximizing shared compute resources across on-premises, hybrid, and cloud infrastructures.

· NVIDIA CUDA
· OpenACC
· OpenCL plus MPI hybrid apps
· Optimizes scheduling with resource-mapped GPUs
· Manages GPU apps within or without Docker containers
· Obtain visibility with CUDA-specific metrics for GPU monitors and reports
· Extend on-premise deployments to incorporate cloud-based GPU instances

Multi-GPU Multi-Node

Altair PBS Professional

Altair

PBS Professional is a fast, powerful workload manager designed to improve productivity, optimize utilization and efficiency, and simplify administration for clusters, clouds, and supercomputers. Supports everything from the biggest HPC workloads to millions of small, high-throughput jobs. PBS Professional automates job scheduling, management, monitoring, and reporting, and it's the trusted solution for complex Top500 systems as well as smaller clusters.

· GPU auto discovery
· Specify GPU count per CPU
· Specify GPU type
· GPU/CPU affinity
· GPU awareness and equality in accounting, quotas, and fair share
· GPU/CPU syntax/scheduling equivalence
· Specify memory use per GPU
· Add-on/integration project
· NVIDIA Data Center GPU Management (DCGM)
· Open source and commercial versions

Multi-GPU Multi-Node


Arm Forge

Arm

(formerly Allinea)

Artec Leo

Artec 3D

Build reliable and optimized code for the right results on multiple server and HPC architectures, including NVIDIA GPU hardware, with support for the latest compilers and C++11 standards. Arm Forge combines Arm DDT, the leading debugger for time-saving high performance application debugging; Arm MAP, the trusted performance profiler for invaluable optimization advice across native and Python HPC codes; and Arm Performance Reports for advanced reporting capabilities. Arm Forge Professional (DDT & MAP) provides all you need to debug, profile and optimize for high performance, from single threads through to complex parallel HPC and scientific codes with MPI, OpenACC, OpenMP, threads or NVIDIA CUDA applications.
A smart 3D scanner that enables you to see your object projected in 3D directly on the HD display.

· Cross Platform: Moving to a new architecture or system is challenging enough without having to learn a new tool chain at the same time. Arm DDT, MAP and Performance Reports run everywhere - on your own laptop, the latest supercomputer, and tomorrow's upcoming architectures
· Automatically detect memory bugs, profile behavior and see advanced performance metrics at all scales on Arm 64-bit, Intel Xeon, Intel Xeon Phi, NVIDIA GPUs, and OpenPOWER
· Fast Debug: Arm DDT is the debugger of choice for developing C++, C or Fortran parallel and threaded applications on CPUs, GPUs and Intel Xeon Phi
· Its powerful intuitive graphical interface helps you easily detect memory bugs and divergent behavior at all scales, making Arm DDT the number one debugger in research, industry and academia.
· Low-overhead Profiling: Profile your code without distorting application behavior. Arm MAP is Arm Forge's scalable low-overhead profiler of C++, C, Fortran and Python with no instrumentation or code changes required. It helps developers accelerate their code by revealing the causes of slow performance
· From multicore Linux workstations to the largest supercomputers, you can profile realistic test cases with typically less than 5% runtime overhead.

Multi-GPU Multi-Node

· Short Learning Curve: Arm DDT offers a powerful intuitive GUI that sets the standard for multi-process and multi-threaded debugging
· Complex software debugging is made simple whether you're working on a PC or offline, with the help of zero-click variable comparisons, built-in memory debugging, and powerful array visualizations - for today's increasingly parallel processors, clusters, and supercomputers.
· Wide Issue Coverage: Arm MAP exposes a wide set of performance indicators, including MPI metrics, PAPI counters, IO metrics, energy metrics and even your own custom metrics
· Profile computation (with self and child and call tree representations over time), thread activity (to identify over-subscribed cores and sleeping threads that waste CPU time for OpenMP and pthreads), instruction types, as well as synchronization and I/O performance.
· Single and Multi Threaded Profiling: Arm MAP profiles parallel, multithreaded, and single threaded C, C++, Fortran, F90 and Python codes, providing in-depth analysis and bottleneck pinpointing to the source line
· Unlike most profilers, it can profile pthreads, OpenMP or MPI for parallel and threaded code, including communication and workload imbalance issues for MPI and multi-process codes

· Jetpack · Tx2

Single GPU Single Node


Bright Cluster Manager CMake ELPA HPCToolkit
IBM Spectrum LSF Magma

Bright Computing
Kitware
Max Planck Institute
Rice University
IBM Corporation
ICL - University of Tennessee Knoxville

Bright Cluster Manager lets you administer clusters as a single entity, provisioning the servers, GPUs, operating system, and workload manager from a unified interface.
CMake is a cross-platform build tool for controlling the software compilation process using simple platform- and compiler-independent configuration files. Generates native makefiles and workspaces that can be used in the compiler of choice. Integrates with CDash to provide a comprehensive suite of tools.
The publicly available ELPA library provides highly efficient and highly scalable direct eigensolvers for symmetric matrices. Though especially designed for use for PetaFlop/s applications solving large problem sizes on massively parallel supercomputers, ELPA eigensolvers have proven to be also very efficient for smaller matrices.
HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. Provides support for analyzing a program's execution cost, inefficiency, and scaling characteristics both within and across nodes of a parallel system.
A comprehensive workload management solution that simplifies HPC with an enhanced user and administrator experience, reliability and performance at scale. Great for big data, cognitive, GPU machine learning and containerized workloads.
MAGMA provides a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems.

· Powerful Cluster Management Shell (CMSH)
· NVIDIA libraries, CUDA, OpenCL, OpenACC, CUDA-aware libraries, NCCL, and CUB
· Linux distributions: RHEL and derivatives, SUSE SLES and Ubuntu LTS
· GPU-enabled Kubernetes and Singularity for running containers

Multi-GPU Multi-Node

· Color output for make
· Progress output for make
· Incremental linking support with VS 8, 9 and manifests
· Supports out-of-tree builds
· Auto-rerun of cmake if any cmake input files change (works with VS 8, 9 using IDE macros)
· Auto depend information for C++, C, and Fortran

Multi-GPU Multi-Node

· Improved one-step ScaLAPACK-type solver ELPA1
· Novel two-step solver ELPA2

Multi-GPU Multi-Node

· Coarse-grain mode: collect multiple metrics in a single run
· GPU kernel metrics
· Synchronization metrics
· Memory copy metrics
· Memory allocation metrics
· Less than 2× overhead
· Fine-grain mode: collect GPU PC samples
· 8 PC sampling shortcomings
· Introduces up to 20× overhead
· Serialized GPU kernel executions

Multi-GPU Multi-Node

· Enforcement of GPU allocations via cgroups
· Exclusive allocation and round robin shared mode allocation
· CPU-GPU affinity
· Boost control
· Power management
· Multi-Process Server (MPS) support
· NVIDIA Volta and DCGM support

Multi-GPU Multi-Node

· Linear system solvers
· Eigenvalue problem solvers
· Auxiliary BLAS
· Batched LA
· Sparse LA
· CPU/GPU Interface
· Multiple precision support
· Non-GPU-resident factorizations
· Multicore and multi-GPU support
· MAGMA Analytics/DNN
· LAPACK testing
· Linux
· Windows
· Mac OS
· Support for NVIDIA A100, V100, T4, P100 GPUs

Multi-GPU Single Node


PAPI

ICL - University of Tennessee Knoxville

Parallelware Trainer

Appentra Solutions

SLURM STRIVR

SchedMD StriVR

PAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events.

· Standard API on most modern microprocessors
· Small set of registers that count Events
· Events-monitoring
· Correlation between source/object code and underlying architecture
· Please refer to the PAPI News page for the latest on GPU support: https://icl.utk.edu/papi/news/index.html

Parallelware Trainer is an interactive, real-time code editor with features that facilitate the learning, usage, and implementation of parallel programming by understanding how and why sections of code can be parallelized. Users are actively involved in learning parallel programming through observation, comparison, and hands-on experimentation. Parallelware Trainer provides support for widely used parallel programming strategies using OpenMP and OpenACC with execution on multicore processors and GPUs.

· Interactive, real-time editor GUI that shows you how and where to implement parallelism.
· Assists in the parallelization of code using OpenMP and OpenACC.
· Transparent, local/remote execution and benchmarking.
· Support for the C programming language. Full Fortran support coming soon.
· Detailed report of opportunities for parallelism discovered in your code.
· Support for multiple compilers including GCC, Intel and PGI.
· Benefits:
· Faster, more effective learning.
· Reduced learning curve.
· All-in-one learning tool for parallel programming.
· Immediate use of parallel programming.
· Support for multicore processors and GPUs.

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

· GPU support
· GPGPUs
· Military grade security
· Heterogeneous platform
· Flexible plugin framework

STRIVR offers an end-to-end Immersive Learning platform that revolutionizes the way people and businesses train, learn, and perform.

· VRWorks 360 Video

Multi-GPU Multi-Node
N/A
Multi-GPU Multi-Node
Single GPU Single Node


TAU - Tuning and Analysis Utilities

University of Oregon

Torque / Moab

Adaptive Computing

Totalview

Perforce

Vampir

TU Dresden

TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python. TAU (Tuning and Analysis Utilities) gathers performance information through instrumentation of functions, methods, basic blocks, and statements, as well as through event-based sampling. All C++ language features are supported, including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. Instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API. TAU's profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms, so the user can quickly identify sources of performance bottlenecks in the application. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.

· Instrumentation
· PerfDMF
· Paraprof
· Load Profiles
· Metric Window
· Thread Windows
· Communication Matrix
· 3D Visualization
· Derived Metrics
· Selective Instrumentation
· PerfExplorer
· Cluster Analysis
· Correlation Analysis
· Scalability Chart
· Preset Charts
· Custom Charts
· Visualizations
· Eclipse Introduction
· Selective Instrumentation
· Instrumenting Java
· Configuration Manager

Multi-GPU Multi-Node

Moab HPC Suite is a workload and resource orchestration platform that automates the scheduling, managing, monitoring, and reporting of HPC workloads at massive scale. TORQUE provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project and incorporates the best of both community and professional development.

· Requests and schedules GPUs based on GPU location in NUMA systems
· Collects and reports metrics and status information
· Sets GPU mode at job run time

Multi-GPU Multi-Node

TotalView is the leading dynamic analysis and debugging tool designed to handle complex CPU- and GPU-based multithreaded, multi-process and multi-node cluster applications. TotalView supports the latest CUDA SDKs, NVIDIA GPU hardware, Linux x86-64, Arm64, and OpenPower platforms, and applications utilizing MPI and OpenMP technologies.

· OpenACC directives
· CUDA running directly on the latest NVIDIA GPUs
· Linux and GPU device thread visibility
· CUDA function calls, host pinned memory regions and CUDA contexts
· Handling CUDA functions inline and on the stack
· Command line interface (CLI) commands for CUDA functions
· MPI applications on CUDA-accelerated clusters

Multi-GPU Multi-Node

Vampir is an easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.

· NVIDIA CUDA
· CUPTI
· CUDA libraries

Multi-GPU Multi-Node


Agriculture

APPLICATION NAME
Taranis

COMPANY NAME
Taranis

PRODUCT DESCRIPTION
Taranis provides a platform for discovering various crop health issues, helping farmers take care of both land and crops and get the most from their yield.

SUPPORTED FEATURES
· Report plant population to farmers
· Detect when a weed emerges in field and constitutes a potential threat
· Calculate amounts of nutrients in vegetation, water content in the soil, plant temperature
· Identify and categorize the top relevant diseases for prevalent crops

GPU SCALING
Multi-GPU Multi-Node

Business Process Optimization

APPLICATION NAME
Automated checkout

COMPANY NAME
Focal Systems

PRODUCT DESCRIPTION
Focal's Product Recognition eliminates barcode scanning entirely at the cashier and achieves 99% accuracy on thousands of products.

SUPPORTED FEATURES
· cuDNN · TensorRT

GPU SCALING
Multi-GPU Single Node

DataX.AI
Helix
Part Finder Kiosk

CrowdANALYTIX
Maxerience
Slyce

Cloud-based, crowd-sourced analytics services that create an online retail product catalog, onboarding SKUs in minutes instead of through manual tagging, providing product info, and removing the human error involved.
CPG product training platform: creates digital copies of products right at the production line in a matter of minutes, and creates an AI model in less than 30 minutes!
A visual search and image recognition solution for retailers and brands

· cuDNN
· TensorRT
· Real-time item scan that directs the customer to the item's location in store
· Find a replacement or additional info
· Feature Jetpack

Single GPU Single Node
Single GPU Single Node
Single GPU Single Node

Peak Trading Out Of Stock

BeMyEye

Out of Stock (OOS) and Almost OOS (AOOS) crowdsourcing solutions for retailers

· Product recognition on the cloud

Single GPU Single Node

Perfect Shelf

BeMyEye

Tracks hypermarkets, supermarkets, discounters, managed convenience and chemists, using a unique blend of IR technologies and crowdsourcing to provide fundamental on-shelf sales data across an entire category.

· Real-time inferencing on the cloud
· SKU recognition

Single GPU Single Node

Predictive Pricing

Evo Pricing

Third Wave Automation

Third Wave Automation

Market-driven optimal prices based on demand, competition, product features and customer feedback
Cloud robotics and machine learning technology for automating material-handling forklifts in a warehouse

· GPU on the cloud
· GeForce 2080 Ti

Multi-GPU Single Node
Single GPU Single Node

For more information on GPU-accelerated applications, please visit www.nvidia.com/teslaapps

© 2021 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, and CUDA, are trademarks and/or registered trademarks of NVIDIA Corporation. All company and product names are trademarks or registered trademarks of the respective owners with which they are associated. Features, pricing, availability, and specifications are all subject to change without notice. Apr21