GPU-ACCELERATED APPLICATIONS

Test Drive the World's Fastest Accelerator Free!
Take the GPU Test Drive, a free and easy way to experience accelerated computing on GPUs. You can run your own application or try one of the preloaded ones, all running on a remote cluster. Try it today.
www.nvidia.com/gputestdrive

Accelerated computing has revolutionized a broad range of industries with over six hundred applications optimized for GPUs to help you accelerate your work.

CONTENTS
Computational Finance
Climate, Weather and Ocean Modeling
Data Science and Analytics
Artificial Intelligence
  DEEP LEARNING AND MACHINE LEARNING
Public Sector and National Government
Design for Manufacturing/Construction: CAD/CAE/CAM
  CFD (MFG)
  CFD (RESEARCH DEVELOPMENTS)
  COMPUTATIONAL STRUCTURAL MECHANICS
  DESIGN AND VISUALIZATION
  ELECTRONIC DESIGN AUTOMATION
  INDUSTRIAL INSPECTION
Media and Entertainment
  ANIMATION, MODELING AND RENDERING
  COLOR CORRECTION AND GRAIN MANAGEMENT
  COMPOSITING, FINISHING AND EFFECTS
  (VIDEO) EDITING
  (IMAGE & PHOTO) EDITING
  ENCODING AND DIGITAL DISTRIBUTION
  ON-AIR GRAPHICS
  ON-SET, REVIEW AND STEREO TOOLS
  WEATHER GRAPHICS
Medical Imaging
Oil and Gas
Life Sciences
  BIOINFORMATICS
  MICROSCOPY
  MOLECULAR DYNAMICS
  QUANTUM CHEMISTRY
  (MOLECULAR) VISUALIZATION AND DOCKING
Research: Higher Education and Supercomputing
  NUMERICAL ANALYTICS
  PHYSICS
  SCIENTIFIC VISUALIZATION
Smart Spaces
Tools and Management
Agriculture
Business Process Optimization

Computational Finance

Application: Accelerated Computing Engine
Company: Elsen
Description: Secure, accessible, and accelerated backtesting, scenario analysis, risk analytics and real-time trading
designed for easy integration and rapid development.
Supported features:
· Web-like API with native bindings for Python, R, Scala, C
· Custom models and data streams
GPU scaling: Multi-GPU Single Node

Application: Adaptiv Analytics
Company: SunGard
Description: A flexible and extensible engine for fast calculations of a wide variety of pricing and risk measures on a broad range of asset classes and derivatives.
Supported features:
· Codes in C# supported transparently, with minimal code changes
· Supports multiple backends including CUDA and OpenCL
· Switches transparently between multiple GPUs and CPUs depending on the deal support and load factors.
GPU scaling: Multi-GPU Single Node

Application: Alea.cuBase F#
Company: QuantAlea
Description: F# package enabling a growing set of F# capability to run on a GPU.
Supported features:
· F# for GPU accelerators
GPU scaling: Multi-GPU Single Node

Application: Esther
Company: Global Valuation
Description: In-memory risk analytics system for OTC portfolios with a particular focus on XVA metrics and balance sheet simulations.
Supported features:
· High quality models not admitting closed form solutions
· Efficient solvers based on full matrix linear algebra powered by GPUs and Monte Carlo algorithms
GPU scaling: Multi-GPU Single Node

Application: Global Risk
Company: MISYS
Description: Regulatory compliance and enterprise-wide risk transparency package.
Supported features:
· Risk analytics
GPU scaling: Multi-GPU Single Node

Application: Hybridizer C#
Company: Altimesh
Description: Multi-target C# framework for data parallel computing.
Supported features:
· C# with translation to GPU
· Multi-Core Xeon
GPU scaling: Multi-GPU Single Node

Application: MACS Analytics Library
Company: Murex
Description: Analytics library for modeling valuation and risk for derivatives across multiple asset classes.
Supported features:
· Market standard models for all asset classes paired with the most efficient resolution methods (Monte Carlo simulations and Partial Differential Equations)
GPU scaling: Multi-GPU Single Node

Application: NAG
Company: Numerical Algorithms Group
Description: Random number generators, Brownian bridges, and PDE solvers
Supported features:
· Monte Carlo and PDE solvers
GPU scaling: Single GPU Single Node

Application: Oneview
Company: Numerix
Description: Numerix introduced GPU support for Forward Monte Carlo simulation for Capital Markets and Insurance.
Supported features:
· Equity/FX basket models with Black-Scholes/Local Vol models for individual equities and FX
· Algorithms: AAD (Automatic Algebraic Differential)
· New approaches to AAD to reduce time to market for fast Price Greeks and XVA Greeks
GPU scaling: Multi-GPU Multi-Node

Application: O-Quant options pricing
Company: O-Quant
Description: Offering for risk management and complex options and derivatives pricing using GPUs.
Supported features:
· Cloud-based interface to price complex derivatives representing large baskets of equities
GPU scaling: Multi-GPU Multi-Node

Application: Pathwise
Company: Aon Benfield
Description: Specialized platform for real-time hedging, valuation, pricing and risk management.
Supported features:
· Spreadsheet-like modeling interfaces
· Python-based scripting environment
· Grid middleware
GPU scaling: Multi-GPU Single Node

Application: SciFinance
Company: SciComp, Inc
Description: Derivative pricing (SciFinance)
Supported features:
· Monte Carlo and PDE pricing models
GPU scaling: Single GPU Single Node

Application: Synerscope Data Visualization
Company: Synerscope
Description: Visual big data exploration and insight tools
Supported features:
· Graphical exploration of large network datasets including geo-spatial and temporal components
GPU scaling: Single GPU Single Node

Application: Volera
Company: Hanweck Associates
Description: Real-time options analytical engine (Volera)
Supported features:
· Real-time analytics
GPU scaling: Multi-GPU Single Node

Application: Xcelerit SDK
Company: Xcelerit
Description: Software Development Kit (SDK) to boost the performance of financial applications (e.g. Monte Carlo, finite-difference) with minimum changes to existing code.
Supported features:
· C++ programming language, cross-platform (back-end generates CUDA and optimized CPU code)
· Supports Windows and Linux operating systems
GPU scaling: Multi-GPU Single Node

POPULAR GPU-ACCELERATED APPLICATIONS CATALOG | Apr21

Climate, Weather and Ocean Modeling

Application: COSMO
Company: COSMO Consortium
Description: Regional numerical weather prediction and climate research model
Supported features:
· Radiation only in the trunk release
· All features in the MCH branch used for operational weather forecasting
GPU scaling: Multi-GPU Multi-Node

Application: E3SM-EAM
Company: US DOE
Description: Global atmospheric model used as a component of the E3SM global coupled climate model.
Supported features:
· Dynamics and most physics
GPU scaling: Multi-GPU Multi-Node

Application: Gales
Company: KNMI, TU Delft
Description: Regional numerical weather prediction model
Supported features:
· Full model
GPU scaling: Multi-GPU Multi-Node

Application: GRAF
Company: IBM/TWC
Description: New GPU-based global weather model based on MPAS from NCAR
Supported features:
· Full application
GPU scaling: Multi-GPU Multi-Node

Application: AceCAST-WRF
Company: TempoQuest Inc.
Description: WRF model from NCAR now commercialized by TQI. Used for numerical weather prediction and regional climate studies. All popular aspects of WRF model are GPU developed.
Supported features:
· ARW dynamics
· 19 physics options including enough to run the full WRF model on GPUs
GPU scaling: Multi-GPU Multi-Node

Data Science and Analytics

Application: Anaconda Distribution
Company: Anaconda
Description: The open-source Anaconda Distribution is the easiest way to perform Python/R data science and machine learning on Linux, Windows, and Mac OS X. Developed for solo practitioners, it is the toolkit that equips you to work with thousands of open-source packages and libraries.
Supported features:
· Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
· Sort algorithms from the CUB and Modern GPU libraries
· Includes Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
· Includes single-line install of numerous DL frameworks such as PyTorch
GPU scaling: Multi-GPU Multi-Node

Application: AnswerRocket
Company: AnswerRocket
Description: AnswerRocket leverages AI and machine learning techniques to automate the hard work of business analysis, empowering teams to generate business intelligence and advanced analysis in seconds.
Supported features:
· Pluggable machine learning models
· Ask questions in plain English
· Create interactive visualizations & dashboards
· Provides augmented analytics
· Supports a wide variety of data sources
GPU scaling: Multi-GPU Multi-Node

Application: ArgusSearch
Company: Planet AI
Description: Deep Learning driven document search tool.
Supported features:
· Fast full text search engine
· Searches hand-written and text documents, including PDF
· Allows almost any arbitrary requests (Regular Expressions are supported)
· Provides a list of matches sorted by confidence
GPU scaling: Multi-GPU Single Node

Application: Automatic Speech Recognition
Company: Capio
Description: In-house and Cloud-based speech recognition technologies
Supported features:
· Real-time and offline (batch) speech recognition
· Exceptional accuracy for transcription of conversational speech
· Continuous learning (system becomes more accurate as more data is pushed to the platform)
GPU scaling: Multi-GPU Single Node

Application: BlazingSQL
Company: BlazingSQL
Description: GPU-accelerated SQL engine for analytics available on all major CSPs and on-premise deployment.
Supported features:
· Distributed SQL query engine
· Supports petabyte scale applications
· Supports traditional big data formats and data stores
GPU scaling: Multi-GPU Multi-Node

Application: BrytlytDB
Company: Brytlyt
Description: In-GPU-memory database built on top of PostgreSQL
Supported features:
· GPU-accelerated joins, aggregations, scans, etc. on PostgreSQL
· Visualization platform bundled with database is called SpotLyt.
GPU scaling: Multi-GPU Multi-Node

Application: CuPy
Company: Preferred Networks
Description: CuPy (https://github.com/cupy/cupy) is a GPU-accelerated scientific computing library for Python with a NumPy compatible interface.
Supported features:
· CUDA
· Multi-GPU support
GPU scaling: Multi-GPU Single Node

Application: Datalogue
Company: Datalogue
Description: AI powered pipelines that automatically prepare any data from any source for immediate & compliant use.
Supported features:
· Data transformation
· Ontology mapping
· Data standardization
· Data augmentation
GPU scaling: Multi-GPU Single Node

Application: DeepGram
Company: Deepgram
Description: Voice processing solution for call centers, financials and other scenarios.
Supported features:
· Speech to text and phonetic search using GPU deep learning
GPU scaling: Multi-GPU Single Node

Application: Driverless AI
Company: H2O.ai
Description: Automated Machine Learning with Feature Extraction. Essentially BI for Machine Learning and AI, with accuracy very similar to Kaggle Experts. H2O Driverless AI is an artificial intelligence (AI) platform for automatic machine learning. Driverless AI automates some of the most difficult data science and machine learning workflows such as feature engineering, model validation, model tuning, model selection and model deployment. It aims to achieve highest predictive accuracy, comparable to expert data scientists, but in much shorter time thanks to end-to-end automation. Driverless AI also offers automatic visualizations and machine learning interpretability (MLI). Especially in regulated industries, model transparency and explanation are just as important as predictive performance. Modeling pipelines (feature engineering and models) are exported (in full fidelity, without approximations) both as Python modules and as pure Java standalone scoring artifacts.
Supported features:
· Automated machine learning and feature extraction
· Automated statistical visualization
· Interpretability toolkit for machine learning models
GPU scaling: Multi-GPU Single Node

Application: GPUdb
Company: Kinetica
Description: Multi-GPU, Multi-Machine distributed object store providing SQL style query capability, advanced geospatial query capability, heatmap generation, and distributed rasterization services.
Supported features:
· Query against big data in real time
· No pre-indexing allows for complex, ad-hoc query chains
· Interactively explore large, streaming data sets
GPU scaling: Multi-GPU Single Node

Application: H2O4GPU
Company: H2O.ai
Description: H2O is a popular machine learning platform which offers GPU-accelerated machine learning. In addition, they offer deep learning by integrating popular deep learning frameworks.
Supported features:
· Available algorithms include Gradient Boosting Machines (GBMs)
· Generalized Linear Models (GLMs)
· K-Means Clustering
· SVD
· PCA
· XGBoost
· It can be used as a drop-in replacement for scikit-learn with support for GPUs on selected (and ever-growing) algorithms.
· A new R API brings the benefits of GPU-accelerated machine learning to the R user community. The R package is a wrapper around the H2O4GPU Python package, and the interface follows standard R conventions for modeling.
GPU scaling: Multi-GPU Single Node

Application: IntelligentVoice
Company: Intelligent Voice
Description: Far more than a transcription tool, this speech recognition software learns what is important in a telephone call, extracts information and stores a visual representation of phone calls to be combined with text/instant messaging and E-mail. Intelligent Voice's search and alert makes it possible to tackle issues before they arise, address data security concerns and monitor physical access to data.
Supported features:
· Advanced Speech Recognition across large data sets
· JumpTo Technology, for data visualisation
· E-Discovery
· Extraction from phone calls
· IM & Email defining key phrases and emotional analysis
· Compliance, defining key conversations and interactions
GPU scaling: Multi-GPU Single Node

Application: Jedox
Company: Jedox
Description: Helps with portfolio analysis, management consolidation, liquidity controlling, cash flow statements, profit center accounting, treasury management, customer value analysis and many more applications. All accessible in a powerful web and mobile application or Excel environment.
Supported features:
· This database holds all relevant data in GPU memory
· Tesla K40 & 12 GB on-board RAM
· Scales up with multiple GPUs
· Keeps close to 100 GB of compressed data in GPU memory on a single server system
· Fast analysis, reporting, and planning
GPU scaling: Multi-GPU Single Node

Application: Labellio
Company: KYOCERA Communication Systems Co
Description: The world's easiest deep learning web service for computer vision, allowing everyone to build their own image classifier with only a web browser.
Supported features:
· Neural net fine-tuning for image data
· Data crawling and data browsing
· Drag-and-drop style data cleansing backed by AI support
GPU scaling: Multi-GPU Single Node

Application: Numba
Company: Anaconda
Description: Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Think of it as a compiler for Python array and numerical functions that gives you the power to speed up your applications with high performance functions written directly in Python. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your Python function, and Numba does the rest. Numba generates optimized machine code from pure Python code using the LLVM compiler infrastructure. With a few simple annotations, array-oriented and math-heavy Python code can be just-in-time optimized to performance similar to C, C++ and Fortran, without having to switch languages or Python interpreters. Numba is designed to be used with NumPy arrays and functions. Numba generates specialized code for different array data types and layouts to optimize performance. Special decorators can create universal functions that broadcast over NumPy arrays just like NumPy functions do. Numba also works great with Jupyter notebooks for interactive computing, and with distributed execution frameworks, like Dask and Spark. With support for GPU acceleration, Numba lets you write parallel GPU algorithms entirely from Python.
Supported features:
· On-the-fly code generation (at import time or runtime, at the user's preference)
· Native code generation for the CPU (default) and GPU hardware
· Integration with the Python scientific software stack (enabled via NumPy)
· JIT compilation of Python functions for execution on various targets (including CUDA)
GPU scaling: Multi-GPU Single Node

Application: OmniSci
Company: OmniSci
Description: OmniSci is a GPU-powered big data analytics and visualization platform that is hundreds of times faster than CPU in-memory systems. OmniSci uses GPUs to execute SQL queries on multi-billion row datasets and optionally render the results, all in milliseconds.
Supported features:
· Uses LLVM's nvptx backend to generate CUDA kernels
· OpenGL (EGL) based rendering
· Can run in a docker container using NVIDIA-docker
GPU scaling: Multi-GPU Single Node

Application: Polymatica
Company: Polymatica
Description: Analytical OLAP and Data Mining Platform
Supported features:
· Visualization, Reporting, OLAP in-memory with GPU acceleration
· Data Mining
· Machine Learning
· Predictive Analytics
GPU scaling: Multi-GPU Multi-Node
Application: Sqream DB
Company: SQream Technologies
Description: GPU accelerated SQL database engine for big data analytics. Sqream speeds SQL analytics by 100X by translating SQL queries into highly parallel algorithms run on the GPU.
Supported features:
· Up to 100TB of raw data can be stored and queried in a standard 2U server
· Inserts and analyzes hundreds of billions of records in seconds
· No indexes required
· No changes to SQL code or data science paradigms required
GPU scaling: Multi-GPU Single Node

Application: SynerScope
Company: Synerscope
Description: Big data visualization and data discovery, for combining Analytics on Analytics with IoT compute-at-the-edge smart sensors.
Supported features:
· Real-time interaction with data
GPU scaling: Single GPU Single Node

Application: ZX Lib (Fuzzy Logic)
Company: Tanay
Description: Financial analytics and data mining library
Supported features:
· Monte Carlo simulations
· Pricing of vanilla and exotic options
· Fixed income analytics
· Data mining
GPU scaling: Multi-GPU Single Node

Artificial Intelligence
DEEP LEARNING AND MACHINE LEARNING

Application: AIC
Company: Tracxpoint
Description: AIC (Artificial Intelligence Cart) revolutionizes the supermarket shopping experience with sensor fusion and machine learning technology.
Supported features:
· The smart IoT cart recognizes the shopper, loads their shopping list and buying patterns, suggests compatible products and provides the most valuable offer
· Recognizes the items placed in the cart and bills the customer at the end of the shopping experience with no checkout lanes
· Features JetPack
GPU scaling: Single GPU Single Node

Application: AiFi Nano
Company: AiFi Inc.
Description: Cashier-free (like Amazon grab and go solution) and stock out retail software
Supported features:
· cuDNN
· TensorRT
· DeepStream
GPU scaling: Multi-GPU Single Node

Application: AI Image Labeling
Company: Frenzy
Description: Builds robust self-labeling training datasets for classifying exact objects and products in visual scenes at a fraction of the time and cost
Supported features:
· GPU in the cloud
GPU scaling: Multi-GPU Single Node

Application: AI Lifecycle
Company: Clarifai
Description: Clarifai brings a new level of understanding to visual content through deep learning technologies.
Uses GPUs to train large neural networks to solve practical problems in advertising, media, and search across a wide variety of industries, such as automated tagging, visual search, recommendation engines, predictive maintenance, demographic analysis and more.
Supported features:
· GPU-based training and inference
· Recognizes and indexes images with predefined classifiers or custom classifiers
GPU scaling: Multi-GPU Single Node

Application: Allganize NLU APIs for Enterprises
Company: Allganize, Inc.
Description: Natural Language Understanding APIs for enterprise: answer-bot based on documents with unstructured data (text + table), e.g., manuals, instructions, FAQ documents; review analysis; sentiment analysis, summarizing etc. Provided as APIs.
Supported features:
· Training and inferencing using V100
GPU scaling: Multi-GPU Multi-Node

Application: AlphaSense
Company: AlphaSense
Description: PaaS for financial analysis based on public corporate information. Geared at financial analysts within financial services. Allows very fast searches of public corporate information, and allows a question answering format ("the Google for Analyst research")
Supported features:
· PaaS for financial analysis based on public corporate information
· Geared at financial analysts within financial services.
· Allows very fast searches of public corporate information, and allows a question answering format ("the Google for Analyst research")
GPU scaling: Multi-GPU Single Node

Application: AlwaysAI
Company: Always AI
Description: Easy-to-use platform to build and deploy computer vision applications for embedded devices at the edge. Apply for early access on the product link
Supported features:
· Jetson Nano
GPU scaling: Single GPU Single Node

Application: Anaconda Enterprise Edition
Company: Anaconda
Description: The end-to-end data science platform. The Anaconda enterprise platform is a comprehensive foundation for any organization that wants to use data science and machine learning to make better decisions and build differentiating solutions.
Supported features:
· Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
· Sort algorithms from the CUB and Modern GPU libraries
· Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
· Single-line install of numerous DL frameworks such as PyTorch
GPU scaling: Multi-GPU Single Node

Application: Antuit Demand Planning and Forecasting
Company: Antuit
Description: Extracts maximum predictability from the available data. Proprietary "Dynamic Aggregation" logic with attribute-based disaggregation generates forecasts for all products, including new, slow-moving, and end-of-life. Spark and GPU clusters, along with optimized AI algorithms, provide scaling for the largest retailers. Incorporates all available demand drivers, such as price elasticities, promotional lifts, weather, and hyper-local event data.
Supported features:
· CUDA 10.1
· cuDNN 7.6
· cuBLAS 10.2
GPU scaling: Multi-GPU Multi-Node

Application: Apache Mahout
Company: Apache Mahout
Description: Mahout is building an environment for quickly creating scalable performant machine learning applications.
Supported features:
· Extremely easy to add new algorithms
· Distributed instead of single machine
GPU scaling: Multi-GPU Multi-Node

Application: Applica RTA
Company: Applica
Description: Applica RTA combines computer vision and deep-learning driven NLP to process all document types.
Supported features:
· GPU to accelerate model training, fine-tuning and inferencing
GPU scaling: Multi-GPU Multi-Node

Application: Artificial Intelligence Radio Transceiver (AIR-T)
Company: Deepwave Digital
Description: The Artificial Intelligence Radio Transceiver (AIR-T) is a software defined radio designed and developed for RF deep learning applications. The app is equipped with three signal processors including a 256 core NVIDIA Jetson TX2, a field programmable gate array (FPGA), and dual embedded CPUs.
Supported features:
· The AIR-T is designed to be an edge-compute inference engine for deep learning algorithms.
GPU scaling: N/A

Application: ARYA.ai
Company: ARYA.ai
Description: Deep learning platform with end-to-end workflows for Enterprise, incorporating TensorFlow. Focuses on consumer banking and insurance industries.
Supported features:
· Deep learning
· TensorFlow
Multi-GPU Multi-Node Aura Vision Aura Vision Capture unique insights from every visitor, using your existing cameras · S egmented footfall · S hopper motivation · P roduct engagement · W indow display ROI · S tore utilization · S ervice wait times Single GPU Single Node Avitas Systems - Inspection as a Service Avitas Systems Avitas Systems configures various multi rotor and helicopter drones with multiple sensor kits including RGB cameras, laser sensors, infrared and others collecting inspection data to meet different customer use cases. Ingests inspection data where an AI back-end turns the raw data into inspection findings such as corrosion levels, damaged/missing parts, encroaching vegetation volumes. · D rone based data capture · R GB Camera, Laser and Infrared sensing · D eep learning driven Object detection for Inspection · D etect corrosion levels, damaged/missing parts, encroaching vegetation volumes. · A I workbench · Photogrammetry Multi-GPU Multi-Node AWM Smart Shelf Adroit Worldwide Media Inc. Application for Automated Inventory Intelligence (view and track virtually in a retail environment), Content Management System (manage inventory, prices and content), Led Display (prices. promotions and advertisements at the click of a button) and Product Mapper (automate creation of planograms and auditing process) · kubernetes · Docker · R TX 2080 Multi-GPU Single Node Badger Insights Badger Technologies Badger Technologies provides data and analytics for retail operations through automation solutions that include a fully autonomous robot to address out-of-stock, planogram compliance, and price integrity · G PU accelerated Single GPU Single Node 6 | POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 6 4/5/21 10:18 AM BIDMach Bons.ai Brain Frame Caffe2 Cartwatch Checkout CatBoost Chainer checout inteliigence ClearML CNTK ConundrumAI UC Berkeley Bons.ai Aotu Facebook Signatrix Yandex Preferred Networks, Inc. 
Everseen Allegro.AI Microsoft Corp. Conundrum Industrial Limited The fastest machine learning library available. Holds the record for many common machine learning algorithms. · W ritten in Scala and supports Scala and Java interfaces · S upports linear regression, logistic regression, SVM, LDA, K-Means and other operations Multi-GPU Single Node Bons.ai is an artificial intelligence platform which abstracts away the low-level, inner workings of machine learning systems to empower more developers to integrate richer intelligence models into their work. · Easy to use programming interface. Bons.ai · N ovel programming language called Inkling · P rimary focus on reinforcement learning Multi-GPU Single Node BrainFrame platform provides Out-OfThe-Box Smart Vision Applications for multiple verticals. The drag-and-drop VisionCapsules system allows you to pick from a wide selection of custom algorithms to extract exactly the information you want · Jetpack · Jetson Single GPU Single Node This is a faster framework for deep learning, it's forked from BVLC/caffe (master branch). Allows data-parallel via MPI. · G PU cluster processing · M ass image data Multi-GPU Single Node Protect the checkout area and reduce the workload of your checkout staff · R eal-time alerts on theft (mis-scan) at the checkout lanes · F eaturing Jetpack and TensorRT Single GPU Single Node CatBoost is an open-source gradient boosting library with categorical features support. · E xtremely fast learning on GPU · Multi-GPU · Multi-Node Multi-GPU Multi-Node DL framework that makes the construction of neural networks (NN) flexible and intuitive. 
· D ynamic NN construction, which makes debugging easier · C PU/GPU-agnostic coding, which is promoted by CuPy, partially NumPycompatible multidimensional array library for CUDA · D ata-dependent NN construction, which fully exploits the control flows of Python without magic Multi-GPU Multi-Node Loss prevention solution at the POS powered by T4 · M Is-scan detection · P roduct and ticket switching detection · " Walk off" detection Multi-GPU Single Node ClearML provides a suite of tools to streamline ML workflow, including Experiment Manager, ML-Ops and Data Management. · M ulti-system enterprise workflow scheduling · V ersion control (e.g., the ?git?) for models · D GX-ready and available from NGC · O pen-source and paid options · E nables reproducibility and automation · C learML supports MIG functionality · T ensorFlow, Keras, and PyTorch · N VIDIA frameworks such as Clara for healthcare and medical imaging · R APIDS and TLT Multi-GPU Multi-Node Microsoft Computational Network Toolkit (CNTK) is a unified computational network framework that describes deep neural networks as a series of computational steps via a directed graph. · S peech Recognition · M achine Translation · Image Recognition · Image Captioning · T ext Processing and Relevance · L anguage Understanding · L anguage Modeling Multi-GPU Single Node Conundrum, a UK-based company, develops AI solutions for predictive maintenance and optimization of industrial processes. · A utomated deep learning significantly speeds up a build of the applications based on DL models; · T ransfer Learning enables to boost the performance of the applications by transferring knowledge between them; · D ata based digital twins and reinforcement learning for optimization. Multi-GPU Single Node POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 | 7 hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 7 4/5/21 10:18 AM Darwin Databricks Unified Analytics Platform DeepInstinct Deeplearning4j Dessa Dextro Dr. 
Retail Frenzy Enterprise Solutions G3C.AI Gridspace Insights Keras SparkCognition Databricks DeepInstinct Skymind Dessa Axon SkyREC Inc. Frenzy Graymatics Gridspace AnyVision Open Source Darwin is a machine learning product that accelerates data science at scale by automating the building and deployment of models. Based on a proprietary neuroevolutional algorithm, Darwin uses a combination of ML methods and genetic algorithms, to arrive at a new generation of designs. Databricks provides a cloud-based platform designed to make big data and machine learning simple. Zero day end point malware detection solution offered to enterprise markets. Deeplearning4j is the most popular deep learning framework for the JVM, and includes all major neural nets such as convolutional, recurrent (LSTMs) and feedforward. Deep Learning Platform based on TensorFlow. Allows end-to-end workflows. Targets consumer banking and insurance industries. Dextro's API uses deep learning systems to analyze and categorize videos in real-time. Instore data analytics Frenzy Enterprise Solutions provides retailers and brands with the tools to provide customer's the best experience and more purchasing opportunities including Similar Product Recommendations, Inventory Tagging, Camera Search, Complimentary Product Recommendations, How To Wear It, Influencer Matching Retail in store analytics solutions through Deep CCTV Streaming Analytics Voice analytics to turn streaming speech audio into useful data and service metrics. Instrumental to contact call center and work communications with powerful deep learning-driven voice analytics. Insight delivers in-store analytics with features such as: heavy shoppers, gaze estimation, heatmaps, customer journey, and offline to online Keras is a minimalist, highly modular neural networks library, written in Python. Capable of running on top of either TensorFlow or Theano and developed with a focus on enabling fast experimentation. 
· U nique neuro-evolutionary algorithm on GPU · A utomated ML for model building on GPUs · G PU accelerated PyTorch · G PU instances available with CUDA drivers included · G PU support provided by Spark scheduler · Integration of TensorFlow, Keras · T ensorFrames data connector · D eep learning pipelines/workflows · T ransfer learning and image loading · Z ero-day threats & APT attack detection on endpoints, servers and mobile devices · Integrates with Hadoop and Spark to run distributed · J ava and Scala APIs · C omposable framework that facilitates building your own nets · Includes ND4J, the Numpy for Java. · D eep learning workflows can be built · B ased on TensorFlow · U se cases in consumer banking and Insurance · O bject and scene detection · M achine transcription for audio · M otion and movement detection · T ensorRT 5.1 · nvJPEG · NVEnc · NVDec · G PU on the cloud · In store analytics: heat-maps, shopper tracking, dwell time, people counting, mood detection, demographics · F eaturing TensorRT and Deepstream · S peech-to-text transcription · Compliance · C all grading · C all topic modeling · C ustomer service enhancement · C ustomer churn prediction · N VIDIA Tesla T4 and Jetson · c uDNN version (depends on the version of TensorFlow and Theano installed with Keras) · S upported Interfaces: Python Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node N/A Multi-GPU Single Node Multi-GPU Single Node 8 | POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 8 4/5/21 10:18 AM Malong Retail AI Fresh Malong Technologies Malong Retail AI Protect Malong Technologies MatConvNet Mathworks Matriod Matroid MetaMind MXNet Einstein Platform Services Amazon Neon Intel NVCaffe out of stock detection PaddlePaddle Berkeley AI Research Focal Systems PaddlePaddle Protects & Insights 
Briefcam RetailAI® Fresh solves for the timeconsuming and error-prone experience that grocery store customers today struggle with when weighing fresh products on a selfserve scale. · S upports T4 · S upports Deepstream For loss prevention at self-checkout and staffed lanes. Leverages award-winning product recognition technologies, the system accurately identifies and stops common scan errors as they happen ? including mis-scans and ticket-switching ? while helping to protect customer privacy. Offers industry-leading accuracy while being massively scalable for effectively unlimited SKUs and stores. · S upports T4 · S upports Deepstream CNNs for MathWorks MATLAB, allows you to use MATLAB GPU support natively rather than writing your own CUDA code. · B uilding Blocks · S imple CNN wrapper · D agNN wrapper · c uDNN implemented Matroid offers video classification service in the cloud. Matroid allows training video detections on a set of images and then applying those video detection. · M atroid is multi-cloud and allows it customers to easily switch between AWS, Azure and Google Cloud. Provides a deep learning API for image recognition and text sentiment analysis. Uses either prebuilt, public, or custom classifiers. · G PU-based training and inference · R ecognizes image and analyzes text · C reates and trains classifiers with tooling for uploading and managing datasets MXnet is a deep learning framework designed for both efficiency and flexibility that allows you to mix the flavors of symbolic programming and imperative programming to maximize efficiency and productivity. · M Xnet supports cuDNN v5 for GPU acceleration Neon is a fast, scalable, easy-to-use Python based deep learning framework that has been optimized down to the assembler level. Features a rich set of example and pre-trained models for image, video, text, deep reinforcement learning and speech applications. 
· Training, inference and deployment of deep learning models · Processes over 442M images per day on a Titan X

The Caffe deep learning framework makes implementing state-of-the-art deep learning easy.
· Processes over 40M images per day with a single NVIDIA K40 or Titan GPU

Deep learning computer vision tracks your On-Shelf Availability throughout your entire store 100+ times a day
· On-Shelf Availability Analytics per hour · Real-time Alerts on your "never be outs"

PaddlePaddle (Parallel Distributed Deep Learning) is an easy-to-use, efficient, flexible and scalable deep learning platform, originally developed by Baidu scientists and engineers for the purpose of applying deep learning to many products at Baidu.
· Optimized math operations through SSE/AVX intrinsics, BLAS libraries (e.g. MKL, ATLAS, cuBLAS) or customized CPU/GPU kernels · Highly optimized recurrent networks which can handle variable-length sequences without padding · Optimized local and distributed training for models with high-dimensional sparse data

Transform video into actionable intelligence. Features: video synopsis and real-time alerts, loss prevention, customer engagement and tying info to POS data, heatmaps, shopper tracking.
· NVIDIA Tesla and Jetson · TensorRT

GPU SCALING: Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Multi-Node, Multi-GPU Single Node, Multi-GPU Multi-Node, Multi-GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node

APPLICATION NAME (COMPANY NAME): QA Bot (Pryon); Retail Analytics (Pilot AI Labs); samadii/dem (Metariver Technology); SAS (SAS); Sentient (Sentient); Shopic Frictionless Shopping (Shopic); SmartCart (Imagr); Smart Skin (Human engine); SpaceKnow PaaS (SPACEKNOW)

Challenge: QA bots are easy to build but hard to keep up-to-date. The last thing you want is a bot distributing wrong answers 24/7.
Solution: With Pryon, QA bots are ridiculously fast and easy to create and, more importantly, easy to monitor and maintain. Benefits: real-time monitoring of questions asked; update or add more answers directly or by adding documents; process feedback easily.
· V100
GPU SCALING: Multi-GPU Single Node

Retail in-store analytics for stock out (cameras in shelves), demographics (age/gender), shopper tracking/counting, anomaly detection, drive-through solutions and more
· Jetpack · Jetson TX2 · RTX 2080
GPU SCALING: Single GPU Single Node

Software for computing various behaviors of massive solid particles of various sizes, from small particles with Brownian motion to large particles such as ore, with DEM (Discrete Element Method).
· Solid particle simulator, DEM solver · Multi-Physics module (Drag and Buoyancy force, Magnetic force, Coulomb force, adhesion force, Van der Waals force, Brownian motion and heat effect) · VPS (Virtual Particle System), Cluster model · Co-simulation with MBD (Multi Body Dynamics) solvers (ADAMS, DADS, RecurDyn, Daful) · Co-simulation with ANSYS Mechanical (Flexible body)
GPU SCALING: Multi-GPU Multi-Node

SAS Machine Learning. SAS Viya Visual Data Mining and Visualization suites now leverage GPU deep learning
· Volta V100 with tensor cores · TensorRT for inference on the NVIDIA Jetson TX2 box · RNN · Multiple GPUs on a single SMP node · Homogeneous and heterogeneous MPP with synchronized Stochastic Gradient Descent
GPU SCALING: Multi-GPU Multi-Node

Sentient is an AI platform company with special focus on digital marketing, ecommerce and finance trading applications.
· Sentient is using GPU deep learning in its commercially available ecommerce, digital marketing and financial trading applications · Studio.ml is a new project designed to make AI development easier by hiding most of the complexity · Studio.ml runs on-premise and in the cloud
GPU SCALING: Single GPU Single Node

Frictionless Shopping - using smart cart
· NVIDIA Xavier NX
GPU SCALING: Single GPU Single Node

SmartCart comprised of four tiny cameras
· NVIDIA Jetson, Xavier and AI vision recognition system · TensorRT
GPU SCALING: Single GPU Single Node

AI-enhanced processing of 3D and 4D data. Used to create high quality 3D characters for interactive media (games, mobile apps, VFX, VR/AR and mixed reality experiences, etc.): automatic retopology of 3D and 4D data using machine learning; photogrammetry noise-reduction and hole-patching using machine learning; realistic lip-sync using a 4D-trained neural network.
· CUDA · Hairworks · PhysX · cuDNN · OptiX
GPU SCALING: Multi-GPU Multi-Node

PaaS for deep learning extraction of satellite data information targeted at Financial Services and Defense / Intelligence. Tracks macro/micro-economic activity by applying deep learning to satellite images.
· Extracts economic activity from satellite images using deep learning · Provides batch mode extraction
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME (COMPANY NAME): Talkmap (Talkmap); Tensorflow (Google); Theano (LISA Lab); The Deep North Video Analytics platform (Deep North); theft & safety (Third Eye Labs); ThermalNet (Malong Technologies); Torch7 (Open Source); TrigoVison (TrigoVision); Unify.ID (Unify.ID); Veesion (Veesion); Visual Intelligence API (Deep Vision); Voca's Virtual Agent (voca.ai)

NLU model training/re-training/fine-tuning for contact center operation automation, trained from raw transcripts to identify the intentions automatically, complemented by human annotation. Models are used for post-call analysis, chatbot design, etc.
Google's TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.

Theano is a symbolic expression compiler that powers large-scale computationally intensive scientific investigations.

The Deep North platform includes Occupancy Management, Gesture Analysis, Zone Management, Vehicle Analysis, Dashboard and reporting

Theft, safety and loss detection

AI-based dual camera thermal + computer vision screening system that can be utilized by enterprises to help people stay safe during epidemics. Powered by multiple world-class AI models, the system can accurately detect and alert on potentially dangerous temperature levels combined with PPE, occupancy, and social distancing compliance.

Torch7 is an interactive development environment for machine learning and computer vision.

Retail automation platform that provides seamless checkout, shoplifting prevention, and real-time inventory updates.

Behavioral user authentication service

Shoplifting detection using a deep learning algorithm that continuously analyses the content of security cameras. It automatically detects gestures associated with shoplifting in real-time. Sends a video alert to a human operator who confirms the theft and takes action.

Deep Vision specializes in understanding visual content and getting the most value out of data by applying visual recognition for enterprises.
Human-like call center conversation AI

· V100 · P100 · T4 GPUs · cuDNN · TensorFlow is flexible, portable and performant, creating an open standard for exchanging research ideas and putting machine learning in products · Abstract expression graphs for transparent GPU acceleration · TensorRT · Tesla T4 - Metropolis · Concealment detector in-aisle and in the stockroom · Safety - social distance detector · Checkout Theft Detector at the POS · Smart Alerts · Privacy Protection · Customizable and Flexible Deployment · High Performance Accuracy · Computational back-ends for multicore GPUs · TensorRT · Identifies individuals based on unique factors such as the way they walk, type and sit · Real-time shoplifting prospects alerts · Visual Intelligence API allows leading enterprises in verticals like e-commerce and online auctions, media and entertainment, and retail to analyze content related to faces, brands and context tags to perform actions like: · Curate and organize visual content · Search and recommend visually · Get insights and analytics visually · Jasper · NeMo

GPU SCALING: Multi-GPU Multi-Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Multi-GPU Multi-Node

APPLICATION NAME (COMPANY NAME): vuForecast (deepVu); Walkout (walkout); Yusp (Gravity R&D); Zippin (Zippin)

ML/DL enabled vuForecast learns from historical inventory, point of sale, promotions and logistics data, augmented with DeepVu's real-time data platform aggregating numerous external micro and macro economic signals, to accurately forecast future demand

Autonomous check out - smart cart

Personalized recommendations for E-commerce, powered by T4

Checkout-free technology offering inventory tracking and insights to ensure the right products are in
the right place, at the right time.

· ML (dmlc/XGBoost) + Dask for distributed training · DL (RNN/LSTM networks) + PyTorch 1.1 · DL (RL) + TensorFlow 1.14 and 2.0 · NVIDIA Jetson TX2 · Search solution to create a smooth product discovery experience · Product/Content recommendation · On-site personalization · Search personalization · Mobile personalization · E-mail marketing (and push, SMS) personalization · Personalization ad retargeting · Ad exchange yield optimization · Jetpack

GPU SCALING: Multi-GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node

Public Sector and National Government

APPLICATION NAME (COMPANY NAME), PRODUCT DESCRIPTION, SUPPORTED FEATURES, GPU SCALING

Advanced Ortho Series (DigitalGlobe)
Geospatial visualization
· Image orthorectification
GPU SCALING: Multi-GPU Single Node

ArcGIS Pro (ESRI)
Viewshed2 determines the raster surface locations visible to a set of observer features, using geodesic methods. Transforms the elevation surface into a geocentric 3D coordinate system and runs 3D sightlines to each transformed cell center. Takes advantage of Tensor Cores for both training and inference.
· Viewshed2 · Deep Learning · Aspect - The values of each cell in the output raster indicate the compass direction the surface faces at that location. It is measured clockwise in degrees from 0 (due north) to 360 (again due north), coming full circle. · Slope - The output slope raster can be calculated in two types of units, degrees or percent (percent rise).
GPU SCALING: Multi-GPU Multi-Node

Blaze Terra (Eternix)
Geospatial visualization tool
· 3D visualization of geospatial data
GPU SCALING: Multi-GPU Single Node

Elcomsoft (Elcomsoft)
High-performance distributed password recovery software with NVIDIA GPU acceleration and scalability to over 10,000 workstations.
· GPU acceleration for password recovery · 10-100x speedup for password recovery
GPU SCALING: Multi-GPU Single Node

ENVI (L3Harris Inc)
Image Processing and Analytics
· Deep Learning training · Deep learning inferencing · Image orthorectification · Image transformation · Atmospheric correction · Panchromatic co-occurrence texture filter · Video processing and analytics using Jagwire
GPU SCALING: Multi-GPU Single Node

ERDAS Imagine (Hexagon Geospatial)
Remote sensing, photogrammetry and GIS toolset for the interactive, semi-automated and automated extraction of information from remotely sensed imagery and point clouds.
· Gray Level co-occurrence matrix (GLCM) image processing operation · NNDiffuse image pan sharpening operation · Deep learning capabilities using the GPU accelerated versions of TensorFlow
GPU SCALING: Single GPU Single Node

APPLICATION NAME (COMPANY NAME): Fortify (Corsight AI); Geomatics GXL (PCI); GeoWeb3d Desktop (Geoweb3d); Graphistry (Graphistry); Ikena ISR (MotionDSP); LuciadLightspeed (Hexagon Geospatial); Manifold Systems (Manifold Systems); OmniSIG (DeepSig Inc.); SNEAK (OpCoast); SocetGXP (BAE Systems); Sureproof

Facial Recognition AI For your Safety & Privacy
· Smart technology that can overcome face masks & PPE · Facial recognition in almost complete darkness & extreme angles · Non-discriminative algorithm that is ethnicity neutral · Vintage image match up to 30 years old · Mask detection and alert on subjects not wearing a face mask
GPU SCALING: Multi-GPU Single Node

Image processing
· Image orthorectification · Additional image processing
GPU SCALING: Multi-GPU Single Node

Geospatial visualization of 3D and 2D data, mensuration and mission planning
· 3D visualization and analysis of geospatial data
GPU SCALING: Multi-GPU Single Node

Graphistry is the first visual investigation platform to handle increasing enterprise-scale workloads.
· Graph reasoning · GPU-accelerated visual analytics · Visual pivoting · Rich investigation templating
GPU SCALING: Multi-GPU Single Node

Real-time full motion video (FMV) and wide-area motion imagery (WAMI) enhancement and computer-vision-based analytics software.
· Real-time super-resolution-based video enhancement on live streams · Geospatial visualization · Target detection and tracking · Fast 2-D mapping
GPU SCALING: Multi-GPU Single Node

Geospatial visualization and analysis
· GPU accelerated line of sight and view shed calculations · GPU accelerated hypsometry calculations, including terrain slope, ridge and valley detection, terrain orientation and azimuth calculations · GPU accelerated imaging operator for geospatially referenced imagery
GPU SCALING: Single GPU Single Node

Full-featured GIS, vector/raster processing
· Manifold surface tools & analysis
GPU SCALING: Multi-GPU Single Node

The OmniSIG sensor provides a new class of RF sensing and awareness using DeepSig's pioneering application of Artificial Intelligence (AI) to radio systems. Going beyond the capabilities of existing spectrum monitoring solutions, OmniSIG is able to not only detect and classify signals but understand the spectrum environment to inform contextual analysis and decision making. Compared to traditional approaches, OmniSIG provides higher sensitivity and accuracy, is more robust to harsh impairments and dynamic spectrum environments, and requires fewer computational resources and less dynamic range.
· Operates in a real-time streaming fashion · Ingests radio samples from many common radio interfaces · Makes use of packet formats like VITA49 or SDDS · Can be used from any device with a browser, including mobile handsets · OmniSIG software also provides its metadata output stream in JSON form for use by other applications
GPU SCALING: Multi-GPU Single Node

Electromagnetic signals propagation modeling for complex urban and terrain environments.
· Ray tracing, DTED and remote sensing inputs
GPU SCALING: Multi-GPU Single Node

Visual Profiler utilizes a cognitive vision and profiling methodology (using machine learning algorithms and state of the art deep learning schemes) to provide unlimited object definition and profiling flexibility. The Automatic Spatial Modeler (ASM) is designed to generate 3-D point clouds with accuracy similar to LiDAR. Extracts 3-D objects and 3-D dense point clouds from stereo images. Also extracts accurate building edges and corners from stereo images with high resolution, large overlaps, and high dynamic range.
· Automated 3D feature extraction from LiDAR · Automated feature detection from imagery using deep learning
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME (COMPANY NAME): Terrabuilder PhotoMesh (Skyline Software); Therm-App® MD Pro (Opgal); Wesafe (WeSmart)

PhotoMesh integrates a GPU-based, fast algorithm, able to automatically build 3D models from simple photographs. PhotoMesh revolutionizes the use of geospatial data by fully automating the generation of high-resolution, textured, 3D mesh models from standard 2D images.
· 3D model building from imagery · Building texture generation

Thermal imaging device for body temperature measurement
· Unlimited Hotspot Detection & Tracking · Advanced Deep Learning Algorithm · Linux-Based Solution · Stand-Alone Solution · Remote Sensor · Quick Hotspot Detection · Up to 20 Simultaneous Scans · Audio & Visual Alert

Simple low cost IVA solution for up to 4 cameras on a Jetson Nano, performing people detection in ROI and people counting.
· People Detection in ROI · Night/Day, People Counting · Push notifications with visuals of the alerts · Simple setup, ONVIF camera detection.
GPU SCALING: Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node

Design for Manufacturing/Construction: CAD/CAE/CAM

CFD (MFG)

APPLICATION NAME (COMPANY NAME): Actran (FFT); ADS Flow Solver - Code LEO (ADSCFD, Inc.); Altair AcuSolve (Altair); Altair nanoFluidX (Altair); Altair ultraFluidX (Altair); Ansys Fluent (ANSYS); Ansys Icepak (ANSYS); Ansys Polyflow (ANSYS)

PRODUCT DESCRIPTION and SUPPORTED FEATURES:

Simulation of acoustics propagation at high frequency or in huge domains such as exhaust of turbomachines, full truck cabin exterior acoustics, and ultrasonic parking sensors.
· Discontinuous Galerkin Method (DGM) solver

A compressible, explicit time-marching CFD solver for aerospace applications. Capable of handling both internal and external flows with robustness and accuracy
· Unstructured/Structured Meshes · Multigrid Accelerations · Multiple Turbulence Models · Rotor-stator Interfaces

Computational Fluid Dynamics (CFD) tool, providing users with a full range of physical models. Simulations involving flow, heat transfer, turbulence, and non-Newtonian materials are handled with ease by AcuSolve's robust and scalable solver technology.
· Linear solvers for flow, temperature, turbulence model, and mesh movement equations

State-of-the-art particle-based (SPH) fluid dynamics code for simulation of single and multiphase flows in complex geometries with complex motion.
· Extremely fast · Single and Multiphase Flows · Arbitrary motion definition · Time-dependent acceleration · Inlets/outlets · Surface tension and adhesion · Steady-state thermal solutions through coupling

Simulation tool for ultra-fast prediction of the aerodynamic properties of passenger and heavy-duty vehicles as well as for the evaluation of building and environmental aerodynamics.
· CUDA-accelerated high-fidelity flow field computations based on the Lattice Boltzmann method · CUDA-aware MPI support for multi-GPU and multi-node usage · Efficient implementation of tailor-made automotive features, including rotating wheels, belt systems, boundary layer suction and porous media support

General purpose CFD software
· Linear equation solver · Radiation heat transfer model · Discrete Ordinate Radiation model

CFD software for electronics thermal management
· Linear Equation Solver

CFD software for the analysis of polymer and glass processing
· Direct Solvers

GPU SCALING: Multi-GPU Multi-Node, Multi-GPU Multi-Node, Single GPU Single Node, Multi-GPU Multi-Node, Multi-GPU Multi-Node, Multi-GPU Multi-Node, Multi-GPU Multi-Node, Multi-GPU Single Node

APPLICATION NAME (COMPANY NAME): CharLES (Cascade Technologies, Inc.); CPFD BarracudaVR and Barracuda (CPFD); DYVERSO (Next Limit); Fine/Open (Numeca International); FINE/Turbo (Numeca International); GeoPlat-RS (GridPoint Dynamics (GPD)); HiFUN (SANDI); JSCAST (Qualica Inc.); midas NFX(CFD) (Midas); MIKE 21 (DHI); MIKE 3 (DHI); MIKE FLOOD (DHI); MSC Apex Generative Design (MSC Software)

CharLES is a GPU-accelerated CFD software application specializing in LES (Large Eddy Simulations). Runs on a range of CUDA GPUs from Kepler to Turing architectures and scales with multiple GPUs in a single server node as well as across multiple GPUs over a cluster of nodes.
· CUDA Toolkit
GPU SCALING: Multi-GPU Multi-Node

Modeling software for simulating Fluidized Reactors
· Linear equation solver for isothermal, non-reacting simulations and for thermal reacting cases · Discrete multi-component particle calculations
GPU SCALING: Single GPU Single Node

Multi-physics simulation engine for liquids and granular substances.
Can be used to mimic behavior of rigid and soft bodies
· Fluid solver in Real Flow 10.5 based on Smoothed particle hydrodynamics (SPH) · Fluid solver in Real Flow 10.5 based on Position based dynamics (PBD)
GPU SCALING: Single GPU Single Node

FINE/Open with OpenLabs is a powerful CFD Flow Integrated Environment dedicated to complex internal and external flows. It allows users to freely develop and exchange physical models in CFD, with a new open approach to CFD. Complex programming tasks are avoided through the usage of an easy meta-language.
· Incompressible, low and high speed flows · Efficient preconditioned compressible solver with fast agglomerated multigrid acceleration and adaptation techniques to combine completely unstructured hexahedral grids
GPU SCALING: Multi-GPU Multi-Node

Structured, multi-block, multi-grid CFD solver targeting the turbo machinery industry
· Multi-grid solver
GPU SCALING: Multi-GPU Multi-Node

Geoplat Pro-RS is a parallel hydrodynamic simulator with a flexible architecture. This makes it possible to reduce the time for writing the entire simulator by 2/3 and, as a consequence, to quickly bring new physical processes into the algorithm.
· CUDA · Spectral Decomposition with cuFFT library
GPU SCALING: Multi-GPU Single Node

High Resolution Flow Solver on Unstructured Meshes. State-of-the-art Euler/RANS solver. Super scalability on massively parallel HPC platforms, with code ported using OpenACC directives for NVIDIA GPU.
· HiFUN imbibes most recent CFD technologies; many of them home grown · HiFUN exhibits highly scalable parallel performance with its ability to scale up to several thousand processors on massively parallel computing platforms · Capable of handling complex geometries and flow physics arising in high lift flows
GPU SCALING: Multi-GPU Single Node

Integrated CAE product for studying and predicting the casting process. Includes high precision mold filling and solidification solvers.
· Solvers for mold filling and solidification · Rendering
GPU SCALING: Single GPU Single Node

General purpose CFD software based on FEM
· Linear equation solver (Iterative Solver and AMG Preconditioner)
GPU SCALING: Single GPU Single Node

2D hydrological modelling of coast and sea for simulating physical, chemical, and biological processes
· Flexible Mesh (FM) engines use GPUs. · Hydrodynamic and turbulence calculations
GPU SCALING: Multi-GPU Single Node

3D Modeling of Coast and Sea
· Hydrodynamic part of the flexible mesh engines (MIKE 3 HD FM).
GPU SCALING: Multi-GPU Multi-Node

1D & 2D urban, coastal, and riverine flood modelling
· Hydrodynamics · 2D Overland flow · Coupling of 1D and 2D models for complex flooding issues
GPU SCALING: Multi-GPU Single Node

Generative Design based simulation to create several optimized, lightweight designs ultra-fast and almost fully automated
· Ultra-fast matrix solving · Accelerated computing power for part optimizations
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME (COMPANY NAME): PRODUCT DESCRIPTION
M-Star CFD (M-Star Simulations, LLC): General purpose CFD Multiphysics modeling software
Numerix (Zeus): Custom software development in the areas of CFD, FEA and Electromagnetics
Pacefish (Numeric Systems GmbH): CFD application for Automotive Aerodynamics, Pedestrian Comfort and Wind Loading
Particleworks (Prometech): CFD software using MPS (Moving Particle Simulation) method for automotive, energy, material, chemical processing, medical, food, and civil engineering industries where free surface fluid flow and fluid mixing phenomena occur.
PowerViz (Dassault Systèmes SIMULIA Corp.): Industry proven, modern post-processing app for EXA POWERFLOW CFD
ScPOST (Hexagon, Cradle): Postprocessor for visualizing simulation results from CFD analysis, MSC Nastran and MSC Marc
Simcenter 3D (Siemens Digital Industries Software): A unified, scalable, open and extensible environment for 3D CAE with connections to design, 1D simulation, test, and data management.
Simcenter STAR-CCM+ (Siemens Digital Industries Software): Integrated solution for CFD-focused Multiphysics simulation
Speed IT FLOW (Vratis): Incompressible single-phase CFD software
Turbostream (Turbostream Ltd.): CFD software for turbomachinery flows
zCFD (Zenotech Simulation Unlimited): General purpose CFD solver

· Fluid flow & heat transfer · DEM simulation · Chemical reactions · Multi-phase flow
GPU SCALING: Multi-GPU Multi-Node

· Lattice Boltzmann Method (LBM) for flow around buildings · SPH based flow solver for simulating flow over urban environments
GPU SCALING: Multi-GPU Single Node

· Transient Lattice-Boltzmann Method for single-phase flows · Integrated fast and robust pre-processor for complex geometries · Local grid refinement · uRANS (K-Omega-SST), hybrid uRANS-LES (SST-DDES & SST-IDDES) · LES (Smagorinsky) turbulence modeling · Scalable up to 16 GPUs
GPU SCALING: Multi-GPU Single Node

· Explicit and Implicit methods
GPU SCALING: Multi-GPU Multi-Node

· Rendering · Ray tracing · File loading acceleration · Rendering · Ray tracing
GPU SCALING: Multi-GPU Single Node, Single GPU Single Node, Multi-GPU Single Node

· Rendering · Finite-volume solver: SIMPLE and PISO, incompressible single-phase flows with k-Omega SST turbulence · Finite Volume explicit solver for RANS/URANS calculations · Variable time-steps and multigrid for convergence acceleration · Turbulent flow (RANS, URANS, DDES or LES) including automatic scalable wall functions
GPU SCALING: Single GPU Single Node, Single GPU Single Node, Multi-GPU Multi-Node, Multi-GPU Single Node

CFD (RESEARCH DEVELOPMENTS)

APPLICATION NAME (COMPANY NAME), PRODUCT DESCRIPTION, SUPPORTED FEATURES, GPU SCALING

ALYA (Barcelona Supercomputing Center (BSC))
Alya is a high performance computational mechanics code to solve complex coupled multi-physics / multi-scale problems, which are mostly coming from the engineering
realm.
· Incompressible Flows · Compressible Flows · Non-linear Solid Mechanics · Species transport equations · Excitable Media · Thermal Flows · N-body collisions

DualSPHysics (University of Manchester)
SPH-based CFD software
· SPH model

HiPSTAR (University of Southampton and University of Melbourne - Sandberg)
CFD software for compressible reacting flows
· Explicit solver

GPU SCALING: Multi-GPU Multi-Node, Multi-GPU Single Node, Multi-GPU Single Node

APPLICATION NAME (COMPANY NAME): Project Chrono (University of Wisconsin-Madison); PyFR (Imperial College - Vincent); RAPTOR, S3D (US DOE Sandia and Oak Ridge NL)

Chrono is a physics-based modelling and simulation infrastructure based on a platform-independent open-source design implemented in C++. Systems can be made of rigid and flexible/compliant parts with constraints, motors and contacts; parts can have three-dimensional shapes for collision detection

General purpose CFD software for compressible flows

CFD formulation of turbulent combustion for fuel injector and other engine applications

Direct numerical solver (DNS) for turbulent combustion

· Robotics · Wheeled vehicle dynamics · Tracked vehicle dynamics · Nonlinear finite element analysis · Mechatronics · Off-road vehicle mobility · Terramechanics · Virtual reality · Granular flows · Collision detection · Autonomous vehicles · Seismic engineering · Augmented reality · High-order explicit solver based on flux reconstruction method · Flow solver · Chemistry model

GPU SCALING: Multi-GPU Multi-Node, Multi-GPU Multi-Node, Multi-GPU Multi-Node, Multi-GPU Multi-Node

COMPUTATIONAL STRUCTURAL MECHANICS

APPLICATION NAME (COMPANY NAME), PRODUCT DESCRIPTION, SUPPORTED FEATURES, GPU SCALING

Adams (MSC Software)
Multi-Body Dynamics simulation software
· Rendering
GPU SCALING: Single GPU Single Node

Altair EDEM (Altair)
Software for bulk material simulation that uses the Discrete Element Modeling (DEM) technology to
simulate and analyze behavior of bulk materials
· EDEM Simulator, a DEM solver · Integration with Ansys and Abaqus for FEA for bulk material simulation · Integration with Adams, Siemens and RecurDyn for Multi-body Dynamics · Integration with Ansys Fluent for Particle-Fluid Systems
GPU SCALING: Multi-GPU Single Node

Altair HyperWorks (Altair)
Comprehensive, open architecture CAE simulation suite in the industry, offering the best technologies to design and optimize high performance, weight efficient and innovative products. It includes a full set of modeling and visualization tools.
· OpenGL v3.2 · OpenCL v2.0 support · Anti-aliasing
GPU SCALING: Single GPU Single Node

Altair OptiStruct (Altair)
Industry proven, modern structural analysis solver for linear and nonlinear problems under static and dynamic loadings. It is also the market-leading solution for structural design and optimization.
· Direct solver (BCS) · Eigenvalue solvers (AMSES and Lanczos) · Iterative solver (PCG)
GPU SCALING: Single GPU Single Node

Amphyon (AdditiveWorks)
Simulation-based process software for powder bed based, laser beam melting additive manufacturing processes
· Mechanical Process Simulation · Thermal Process Simulation
GPU SCALING: Single GPU Single Node

Ansys Mechanical (ANSYS)
Simulation and analysis tool for structural mechanics
· Direct and iterative solvers
GPU SCALING: Multi-GPU Multi-Node

Autodesk Nastran (Autodesk)
Autodesk Nastran FEA software analyzes linear and nonlinear stress, dynamics, and heat transfer characteristics of structures and mechanical components.
· Double Precision on GPU
GPU SCALING: Multi-GPU Multi-Node

GranuleWorks (Prometech)
DEM-based advanced simulator for granular materials in pharma and powder metallurgy: granular material segregation, screening, grinding, screw conveying, mixing, compaction, filling, dustproof, toner transport, electrode materials filling, cliff collapses/debris flow, etc.
· Size distribution, contact force model, rolling resistance model, liquid bridge force model, van der Waals force model, heat transfer and external force. · Boundary conditions: polygon wall, inflow and outflow boundary, and simulation domain. · Coupling with Particleworks MPS solver: support for aeration and pumps
GPU SCALING: Multi-GPU Multi-Node

Helyx PEM (Engys)
Specialised add-on solver for HELYX to simulate large numbers of solid objects in motion using the Polyhedral Element Method (PEM)
· Polyhedral Elements Method solver
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Impetus Afea; Irazu; Marc; MatDEM; midas GTS NX; midas NFX(Structural); MSC Nastran; PERMAS-XPU; RecurDyn; Rocky DEM; Simcenter Nastran; SIMULIA 3DEXPERIENCE; SIMULIA Abaqus/Standard; ThreeParticle/CAE

Impetus Afea
Predicts large deformations of structures and components exposed to extreme loading conditions
· Non-linear Explicit Finite-Element Solver
GPU SCALING: Multi-GPU Single Node

Geomechanica Inc.
Simulation and analysis tool for rock mechanics, involving large deformations, fracturing and multi-physics phenomena.
· Explicit 2D and 3D FEM and FDEM solvers · Coupled hydraulic, mechanical, transport, thermal and fracture processes
GPU SCALING: Single GPU Single Node

MSC Software
Simulation and analysis tool for structural mechanics
· Direct sparse solver
GPU SCALING: Multi-GPU Single Node

Nanjing University
MatDEM is software for fast GPU matrix computing of the Discrete Element Method. The software implements automatic stacking modeling, layered material, joint surface and load settings, rich post-processing functions and secondary development.
· Full product support on GPU
GPU SCALING: Multi-GPU Single Node

Midas
Simulation tool for geo-technical analysis
· Linear equation solver (Multi Frontal Solver)
GPU SCALING: Single GPU Single Node

Midas
Simulation and analysis tool for structural mechanics
· Linear equation solver (Multi Frontal Solver)
GPU SCALING: Single GPU Single Node

MSC Software
Multidisciplinary structural analysis application used to perform static, dynamic, and thermal analysis across linear and nonlinear domains
· Direct sparse solver
GPU SCALING: Multi-GPU Single Node

INTES GmbH
General purpose structural simulation software
· Linear Equation Solver
GPU SCALING: Single GPU Single Node

FunctionBay, Inc.
Multi-Flexible Body Dynamics simulation software
· Rendering
GPU SCALING: Single GPU Single Node

Rocky DEM
Discrete Element Modeling (DEM)-based particle simulation software for simulating behavior of bulk materials with complex particle shapes and size distributions
· Explicit DEM solver (dry/sticky contact rheologies) · 1-way & 2-way coupling with ANSYS Fluent and ANSYS Mechanical
GPU SCALING: Multi-GPU Single Node

Siemens Digital Industries Software
Finite element method (FEM) solver for computational performance, accuracy, reliability and scalability
· Linear and nonlinear equation solver · Frequency response module · Matrix decomposition computations
GPU SCALING: Multi-GPU Multi-Node

Dassault Systèmes SIMULIA Corp.
Realistic simulation solution (Uses Abaqus Standard for GPU computing)
· Direct sparse solver
GPU SCALING: Single GPU Single Node

Dassault Systèmes SIMULIA Corp.
Simulation and analysis tool for structural mechanics · Direct sparse solver · AMS Solver · Steady State Dynamics Multi-GPU Multi-Node
BECKER 3D GmbH Multiphysics Discrete Element Method (DEM) simulation platform for bulk materials with complex shapes and built-in multi-body dynamics (MBD), Finite Element Analysis (FEA) & Smoothed Particle Hydrodynamics (SPH) · GPU accelerated Smoothed Particle Hydrodynamics · Simulate complex and real particle shapes using DEM combined with SPH, FEA, MBD, Wear Single GPU Single Node
DESIGN AND VISUALIZATION
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
3D CAT.live Shenzhen Rayvision Technology Co Ltd Real-time rendering cloud service for 3D applications. The massive GPU computing power in the cloud is used to process heavy image rendering calculations and stream output to the terminal device synchronously, thereby realizing light weight of the terminal device and making high-quality 3D graphics applications ubiquitous. Users can use any common networked device to access the 3D application hosted in the 3DCAT cloud without downloading and installing the application. Supports almost all rendering engines that can run on the Windows platform, and supports the opening of NVIDIA RTX real-time ray tracing function. · Cloud XR SDK · DLSS (potential) Multi-GPU Multi-Node
APPLICATION NAME (COMPANY NAME): 3DEXCITE DeltaGen (Dassault Systèmes); 6SigmaET (Future Facilities); Abaqus/CAE (Dassault Systèmes SIMULIA Corp.); Accelerad (MIT Sustainable Design Lab); Additive Mfg Toolkit (Dyndrite); ALLPLAN (Nemetschek ALLPLAN); ANSA (BETA CAE Systems); Ansys Discovery Live (ANSYS); Ansys SPEOS (ANSYS); Ansys VRXPERIENCE for HMI and Perceived Quality (ANSYS)
High-end 3D visualization and real-time interaction to help increase visual quality, speed, and flexibility. · Interactive ray tracing and global illumination.
· Integration with Siemens Teamcenter. · Cluster support Realtime & Offline Production Process Integration and scene building. · Scene Analysis, Xplore DeltaGen, SDK for DeltaGen. Multi-GPU Single Node
Thermal simulation software for the electronics industry. 6SigmaET's unique MLUS Computational Fluid Dynamics (CFD) solver predicts thermal issues in complex electronics equipment. · Monte-Carlo ray tracing for Heat Radiation · NVIDIA's OptiX library Single GPU Single Node
Complete solution for Abaqus finite element modeling, visualization, and process automation · Rendering Multi-GPU Single Node
Accelerad is a free suite of programs for fast and accurate lighting and daylighting analysis and visualization. · Up to forty times faster using OptiX · Renderings with large numbers of ambient bounces · Calculations over many thousands of sensor points · Fast simulation of annual climate-based daylighting metrics · AcceleradRT - Interactive interface for real-time daylighting, glare, and visual comfort analysis with validated accuracy. Includes AcceleradVR, an immersive visualization interface compatible with most virtual reality headsets. N/A
Dyndrite has developed a GPU-based geometry kernel with CUDA. The initial application for this kernel is an Additive Manufacturing Toolkit which speeds up the process of 3D printing, especially for complex parts. · CUDA N/A
Complete Building Information Modeling (BIM) for Architecture, Engineering, and Construction.
· OpenGL 4, and now moving to Vulkan · Vulkan for wireframe rendering already, with a plan to ship full integration with Version 2022 in September 2021 Single GPU Single Node
Multidisciplinary CAE pre-processing tool for full model build up, from CAD data to ready-to-run solver input file, in a single integrated environment · OpenGL · OpenCL Single GPU Single Node
Interactive and CAD-agnostic Windows-based app that gives engineers instantaneous simulation results to help them explore and refine product designs · OpenGL-based visualization · CUDA-based Structural Stress, Modal, Fluid Dynamics, Thermal, Electrical Conduction and Coupled Multi-Physics simulations Single GPU Single Node
Physically accurate optical simulation software dedicated to predictive illumination and optical performance of systems. High-fidelity visualization of the final result, based on unique human vision algorithm. · SPEOS Live Preview · 360 degrees for immersive or observer view · Optical part design · Optical sensors test · HUD design and analysis · Infrared modeling Single GPU Single Node
Predictive physics-based real time lighting simulation with VR capabilities to experience and validate the impact of your design proposition on appearance and perceived quality.
· Physics-based real time lighting simulation with VR capabilities from HMD to CAVEs (multi-GPU, multi-node) · SPEOS Live Preview (raytracing) based on CUDA/OptiX benefiting from RTX architecture (single GPU) · Scalable rendering capabilities, ranging from rasterization to fully GPU ray-traced SPEOS Live Preview Single GPU Single Node
APPLICATION NAME (COMPANY NAME): Ansys VRXPERIENCE Lighting and Sensors (ANSYS); Ansys Workbench (ANSYS); Apex (MSC Software); Archicad (Nemetschek GRAPHISOFT); Arch-Log (Luminova Japan); AutoCAD (Autodesk); Avatar VR (NeuroDigital Technologies); BricsCAD (Hexagon PPM); CATIA 3DEXPERIENCE (Dassault Systèmes); CATIA Live Rendering (Dassault Systèmes); Clarisse (Isotropix); Clip Studio Paint (Celsys); Clo3D (CLO Virtual Fashion Inc); COMSOL (COMSOL); Creo Generative Topology Optimization Extension (GTO) (PTC)
Predictive validation of vehicle systems for the optimization of intelligent headlamp units and sensors dedicated to ADAS and AD. Rapid and simple virtual test of systems, relying on the unique combination of visually realistic driving simulator, and physics-based simulation. Real-time and interactive driving simulator to virtually create, test and experience future vehicle driving in real-world like conditions.
Industry proven, modern pre- & postprocessing app for CAE
Unified environment for virtual product development
Complete Building Information Modeling (BIM) for Architecture, Engineering, and Construction.
A web service based on NVIDIA Iray and RealityServer (from migenius) for rendering and configuring building materials.
2D and 3D CAD designing, drafting, modeling, architectural drawing, and engineering software.
Haptic VR gloves for training design or remote operation.
Building information modeling software for design, construction, documentation, and manufactured building products.
The reference CAD application for advanced engineering with batching capability and extreme reliability, used by 80% of the automotive industry and the entire aerospace industry.
Realistic 3D Rendering on full CATIA 3D CAD model.
Set dressing and layout tool with integrated renderer
Clip Studio Paint is a versatile digital painting program that is ideal for the digital creation of comics, general illustration, and 2D animation.
3D garment simulation and design
Multiphysics general-purpose simulation software for modeling designs, devices and processes in all fields of engineering, manufacturing, and scientific research
Creo Generative Topology Optimization Extension (GTO) creates optimized product designs based on your constraints and requirements - including materials and manufacturing processes
· Multispectral Physics-based real time lighting simulation with multi-display capabilities (driving simulator). · Rendering · Rendering · OpenGL based GPU rendering · Fast, efficient graphics in the viewport · RTX photorealistic rendering with Twinmotion, internal rendering engine based on CineRender, and now integrating Redshift into Archicad. · Iray · RealityServer · Quadro · DGX · Surface, mesh and solid modeling tools, model documentation tools, parametric drawing capabilities · OpenGL · Native DWG support · GRID Support.
· PhysX · Rendering · GPU OpenGL performance scaling in R2017x · VR native integration with HTC Vive in R2017x · VR SLI in R2018x · Stellar GPU in R2019x FD01 · Physically Based Rendering with no data preparation thanks to native NVIDIA Iray Photoreal integration and interactive realistic rendering using NVIDIA Iray IRT · GPU accelerated interactive rendering 50-100X faster than with CPU · OptiX AI-accelerated de-noising · Accelerated processing and AI features · CUDA · OpenGL version 2.0 · DirectX version 9 · CUDA accelerated Generative Design
GPU SCALING: Multi-GPU Multi-Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Multi-Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node
APPLICATION NAME: Creo Parametric; Easy 3D Scan; Enscape; Grasshopper; IC.IDO; ImageStation; Inspire Studio/Render (formerly known as Evolve); Inventor; Iray; Iray for 3ds Max; Iray for Maya
PTC Professional 3D CAD software for product design and development, including parametric modeling, simulation/analysis, and product documentation for companies ranging from SMB to Enterprise. · GPU accelerated real-time engineering simulation with Creo Simulation Live · Full scene anti-aliasing · Order independent transparency · Better lighting and enhanced shaded-with-edges mode · Immersive design environment with realistic materials Single GPU Single Node
Cappasity 3D digitizing software that creates and embeds 3D product images into your website, mobile and AR/VR apps, and gives your customer a near real shopping experience.
· OpenCL Single GPU Single Node
Enscape GmbH Renderer with Plug-in for Revit, Rhino, SketchUp, ARCHICAD, and Vectorworks · Full RTX-enabled · One-click to VR experience · Design reviews for buildings · 3D and VR visualization of CAD data for AEC Single GPU Single Node
McNeel & Assoc. Grasshopper is a graphical algorithm editor tightly integrated with Rhino's 3-D modeling tools. Unlike RhinoScript, Grasshopper requires no knowledge of programming or scripting, but still allows designers to build form generators from the simple to the awe-inspiring. · Fast, scalable OpenGL 3.3 pipeline leverages latest NVIDIA GPUs · GPU computed shaders and memory optimizations · Rhino 6 leverages NVIDIA RT Cores for Real-time ray tracing viewport mode · Rendering engine is CYCLES, fully integrated inside Rhino 6 now Single GPU Single Node
ESI Group Immersive VR solution for engineering and virtual prototyping. The Helios rendering engine is highly optimized for NVIDIA GPUs. · NV Pro Pipeline (RiX) for OpenGL rendering · VRWorks SPS and VR SLI (NVLink support) · DesignWorks, including VR Occlusion Culling open source sample and OptiX Multi-GPU Single Node
Hexagon Geospatial ImageStation software suite designed for high-volume photogrammetry and production mapping including aerial and satellite triangulation, stereo feature and digital terrain model (DTM) collection and editing, automatic DTM and digital surface model (DSM) generation, and orthophoto production and editing · Stereo Display and Viewing Single GPU Single Node
Altair Inspire Studio is a high quality 3D Hybrid Modeling and Rendering environment that enables industrial designers to evaluate, research and visualize various designs faster than ever before. Inspire Studio runs on both Mac OS X and Windows.
· NURBS modeling · PolyNURBS modeling · OpenGL 4.5 Core · OpenGL-based real-time high-quality rendering · Interactive high-quality rendering using Thea Render · Production rendering using Thea Render · Integrated "dark room" environment to manage render queue and post-processing of rendered images Single GPU Single Node
Autodesk 3D mechanical design, documentation, and product simulation. · Uses BIM for intelligent building components to improve design accuracy Single GPU Single Node
NVIDIA A ready-to-integrate, physically-based, photorealistic rendering solution. · Iray Interactive · Iray Photoreal · Iray Server · Fast interactive ray tracing · Physically-based, global-illumination rendering · Distributed cluster rendering. Multi-GPU Multi-Node
Siemens Digital Industries Software A physically-based renderer plugin for Autodesk 3ds Max · Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support and AI based denoising Multi-GPU Multi-Node
0x1 Software & Consulting GmbH A physically-based renderer plugin for Autodesk Maya. · Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support, AI based denoising Multi-GPU Multi-Node
APPLICATION NAME: Iray for Rhino; Iray Server; KeyShot; LensMechanix; LumenRT; Medium by Adobe; META; META VR; MicroStation Connect; Notch Builder; NX
migenius Pty Ltd Iray plugin for Rhino · Iray Photoreal and Iray Interactive support · VCA clustering · Cloud rendering · MDL support.
migenius Pty Ltd The scaling solution for any Iray based application · Iray Photoreal and Iray Interactive support, VCA clustering, Cloud rendering, MDL support and AI-based denoising
Luxion Physically correct real time and batch CPU / GPU photorealistic renderer, popular in manufacturing, AEC, and M&E · GPU accelerated real time and batch rendering with NVIDIA OptiX · GPU accelerated AI Denoising with NVIDIA OptiX Denoiser · Network rendering on GPU accelerated nodes · Support for 30 different native file formats, many free plugins and live linked applications
Zemax LensMechanix is the best application for mechanical engineers to package optical systems in CAD software. It is available for SOLIDWORKS users and for Creo Parametric users. · Optical product teams need an easier and faster way to get from design to manufacture · LensMechanix is the answer · LensMechanix is software for mechanical engineers who design housing for optical products in CAD · With LensMechanix, mechanical engineers can access the complete design data of optical systems designed in OpticStudio and start designing the mechanical envelope right away · They can then validate their mechanical design and fix issues before building a physical prototype
Bentley Systems Easily integrate life-like digital nature into your simulated infrastructure designs, and create high-impact visuals for stakeholders. Best for very large infrastructure, i.e. 100s of square kilometers rendering. · RT Cores for real time ray tracing · TensorRT for denoising · All using the DXR API
Adobe PC-based VR sculpting app for modeling & painting in Quest VR headsets. For beginners as well as pros. Acquired by Adobe from Oculus in December 2019. Requires link cable to PC.
· GLSL shaders · Vulkan · NVENC
BETA CAE Systems High-performance multi-disciplinary CAE post-processor · OpenGL · OpenCL
BETA CAE Systems Powerful processing and visualization environment for interaction with full-scale simulation models with collaboration capabilities · OpenGL · OpenCL
Bentley Systems MicroStation is the world's leading 3D computer-aided design and visualization software for the architecture, engineering, construction, and operation of all infrastructure types. Largest CAD in AEC for Civil Engineering users. · Very tight collaboration with Autodesk Revit. · MicroStation has internal Rendering tool called Vue, shipping with the base CAD tool. · Digital Nature modeling is Full Ray Tracing-enabled · Reality Modeling leveraging NVIDIA AI acceleration · GPU acceleration for Viz, Rendering, Simulation Bentley apps are optimized for NV Quadro RTX
10bit FX A motion graphics and VFX tool designed by games artists and VJs. Compositing, grading and strong inter-operability with other packages. · GPU accelerated graphics and effects
Siemens Digital Industries Software Siemens PLM Software premium design app with full Iray integration, supporting multi-GPU rendering.
Still CPU bound for most tasks otherwise · GRID support · Iray, MDL (see NX Ray Traced Studio)
GPU SCALING: Single GPU Single Node; Multi-GPU Multi-Node; Multi-GPU Multi-Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Multi-Node
APPLICATION NAME: OpticStudio; Painter; Patran; Quark VR; QUINDOS; RealityServer; Recap PRO; REMCOM WaveFarer; RETOMO; Review; Revit; RHINO; Simcenter Femap
Zemax OpticStudio combines complex physics and interactive visuals so you can analyze, simulate, and optimize optics, lighting and illumination systems, and laser systems, all within tolerance specifications. · Share designs between OpticStudio and CAD packages as native files, giving mechanical engineers full access to the optical coordinate system and all critical dimensions. There is no need for file format conversions, which can cause loss of design data · Simulate the impact of mechanical components on optical performance to uncover any issues and make informed design decisions · Check for, and resolve errors, before building costly physical prototypes N/A
Corel Raster-based digital art application for drawing, sketching and painting. · GPU accelerated brushes Single GPU Single Node
MSC Software Industry proven, modern pre- & postprocessing app for CAE · Rendering Single GPU Single Node
Quark VR QuarkVR is an ultra-fast software solution which provides low-latency compression and wireless transmission. It offloads the heavy processing on the GPU, and is hardware-agnostic. · CUDA Single GPU Single Node
Hexagon Manufacturing Intelligence Coordinate metrology software · Rendering Single GPU Single Node
migenius Pty Ltd 3D rendering and collaborative visualization and model manipulation platform based on NVIDIA Iray. · NVIDIA Iray.
Multi-GPU Multi-Node
Autodesk ReMake is a solution for converting reality captured with photos or scans into high-definition 3D meshes. These meshes can be cleaned up, fixed, edited, scaled, measured, re-topologized, decimated, aligned, compared and optimized for downstream workflows entirely in ReMake. · Generation of 3D meshed models from laser scans or photos of an object · GPU accelerated photogrammetry process from 2D to 3D · 3D model display accelerated by GPU for smooth navigation of converted models in all display modes Multi-GPU Single Node
REMCOM WaveFarer is a high-fidelity radar simulator for drive scenario modeling at frequencies up to and beyond 100GHz. · Near-field propagation method · Targeted ray casting, dynamic scenario, radiation patterns from antennas Multi-GPU Single Node
BETA CAE Systems New software for the generation of 3D-tessellated models from CT-scan images · OpenGL Single GPU Single Node
PiXYZ Imports any CAD data to prepare and experience your content with VR. · Large CAD file support with NVIDIA Pascal Single Pass Stereo extension integration Single GPU Single Node
Autodesk Building Information Modeling (BIM) for architecture, engineering and construction. · Modeling (BIM) to design, build, and maintain higher-quality, more energy-efficient buildings · GRID support Single GPU Single Node
McNeel & Assoc. General purpose conceptual/industrial design software for AEC and Manufacturing industries, including CYCLES (their custom renderer based on open source Blender), a real-time ray-traced display mode that is CUDA-based.
· Fast, scalable OpenGL 3.3 pipeline leverages latest NVIDIA GPUs · GPU computed shaders and memory optimizations · Rhino 6 and new RHINO 7 leverage NVIDIA RT CUDA Cores for Real-time ray tracing viewport mode, and Tensor Cores for Denoising · Rendering engine is CYCLES, fully integrated inside RHINO 7 now Single GPU Single Node
Siemens Digital Industries Software Engineering simulation application for creating, editing, and importing/re-using mesh-centric finite element analysis models of complex products or systems · Rendering Single GPU Single Node
APPLICATION NAME (COMPANY NAME): Simcenter Prescan (Siemens Digital Industries Software); Simcenter STAR-CCM+ VR (Siemens Digital Industries Software); Simpleware (Synopsys); SketchUp Pro (Trimble SketchUp); Solid Edge (Siemens Digital Industries Software); SOLIDWORKS (Dassault Systèmes); SOLIDWORKS Visualize (Dassault Systèmes); Spotscale (Spotscale); Studio (PiXYZ); Substance Alchemist (Adobe); Substance Designer (Adobe); Substance Painter (Adobe); Sunata (Siemens Digital Industries Software); Teamcenter Active Workspace (Siemens Digital Industries Software); T-FLEX CAD (Top Systems); UE4 (Epic Games)
Virtually validate ADAS and automated vehicle functionalities by replicating real world scenarios, adding sensor models, and interface for control systems to design and verify algorithms for data processing, sensor fusion, decision making and control · Speed up the TIS sensor used for radar, lidar, PMD and ultrasonic sensors · Camera sensor and fisheye camera sensor Multi-GPU Multi-Node
Immersive VR for CFD results visualization · HTC Vive virtual reality headset Single GPU Single Node
3D image data visualization, analysis and model generation software
SketchUp, formerly Google SketchUp, now part of Trimble in Sunnyvale, CA.
SketchUp is a 3D modeling computer program for a wide range of drawing applications such as architectural, interior design, landscape architecture, civil and mechanical engineering, film and video game design.
SMB CAD option from Siemens
· OpenGL · OpenGL now but moving to DirectX 11 for SketchUp, and DirectX 12 and Vulkan for TEKLA Structures (late 2021 and 2022) · Fast, efficient graphics in the viewport · RTX photorealistic rendering · 3rd party plug-ins supported by SketchUp Pro · KeyShot rendering
GPU SCALING: Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
3D design and product development solution including design, simulation, cost estimation, manufacturability checks, CAM, sustainable design, and data management. · High performance in Shaded, Shaded w/ Edges, and RealView modes, FSAA for sharp edges, Order Independent Transparency · Real time photorealistic renderings with SOLIDWORKS Visualize, an Iray-based application.
Easy to use photorealistic rendering software based on NVIDIA Iray · Iray-based ray-tracing · Animation support · Network rendering · OptiX-based Artificial Intelligence denoiser
3D reconstruction algorithms tailored for buildings and urban environments, using drones to capture data. · cuDNN
Interactively prepare & optimize any CAD data before using your favorite staging tool. · Large scale CAD format · Support for multi-CAD file standard, prepare, optimize and heal your geometry before experiencing it in VR
Lets you simply create materials from pictures or by blending pre-existing materials, and create and manage your material libraries · DL powered material recognition · Material scan, edit and blend
Material shader edition and market reference for procedural texture creation. · RTX bakers · Iray viewport/rendering
Intuitive interactive 3D painting software with physics and particle support. · RTX bakers · Iray viewport
Cloud-based thermal modeling for additive manufacturing.
Recommends optimal parameters for the print, including print orientation and support structures. · Thermal simulation
Active Workspace is an IT-friendly client for Teamcenter product lifecycle management, with zero-install footprint and web browser access that provides an identical and seamless experience on any computing or smart device. · GRID support
3D and 2D parametric design, simulation, photorealistic rendering · High performance visualization · Real time photorealistic rendering · CUDA
Unreal Engine 4 is a suite of integrated tools for developers to design and build games, simulations, and visualizations. · GPU Accelerated Rendering on OpenGL, DirectX and Vulkan · PhysX implemented
GPU SCALING: Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node
APPLICATION NAME (COMPANY NAME): Vectorworks (Nemetschek VECTORWORKS); Volumetric Camera Systems (Volumetric Camera Systems); VRED (Autodesk); WeViz Studio (Meshroom VR); WYSIWYG (Cast Software); ZLVE (Zerolight)
Building Information Modeling (BIM) enabled design software for the Architecture, Landscape, and Entertainment industries.
4D capture service with high quality and realistic "holograms-in-motion" of people, animals, or any moving subject. Secondly, we offer "photo-realistic 3D environment captures" using industrial grade Leica Laser Scanners and advanced high-resolution multi-camera systems.
VRED 3D visualization software for automotive designers and engineers to create product presentations, design reviews, and virtual prototypes. Uses Digital Prototyping to quickly visualize ideas and evaluate designs.
Real-time rendering tool specially made for industrial design reviews, letting you import models, edit materials, set up your scene and showcase your model in real time.
Wysiwyg is an all-in-one lighting design software with fully integrated CAD, plots, data, visualization and virtual show control. Features the largest CAD library with thousands of 3D objects you can choose from to design your entire show.
Immersive customer experience with VR or web GPU streaming
· OpenGL based GPU rendering · CUDA · Quadro GPUs · Enhanced geometry behavior · Automotive product interoperability · Navigation in a scene · Import Alias layer structure · Asset Manager improvements · Integrated file converter · Analytic rendering modes · Gap Analysis tool · Oculus Rift support · Animation module · Multiple rendering modes · Subsurface scattering · Displacement mapping · RTX real-time ray tracing · GPU accelerated Shaded Views and Virtual Views · VRS and foveated rendering for VR and 3D experience through AWS GPU streaming
GPU SCALING: Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node
ELECTRONIC DESIGN AUTOMATION
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES
Advanced Design System (ADS) KeySight Simulation tool for design of RF, microwave and high speed digital circuits · Transient Convolution simulation with BSIM4 models
Altair Feko Altair Comprehensive computational electromagnetics (CEM) code used widely in the telecommunications, automobile, space and defense industries to solve high-frequency problems.
· FDTD solver · MoM solver · RL-GO solver · CMA Solver
Ansys HFSS ANSYS Simulation tool for modeling 3-D full-wave electromagnetic fields in high-frequency and high-speed electronic components · Transient solver · FEM solver · OpenGL rendering
Ansys HFSS SBR+ ANSYS Simulation tool for installed antenna performance and antenna-to-antenna coupling · High-frequency solver · OpenGL rendering
Ansys Maxwell ANSYS Industry-leading electromagnetic field simulation software for the design and analysis of electric motors, actuators, sensors, transformers and other electromagnetic and electromechanical devices · Eddy Current Solver
Ansys Nexxim ANSYS Circuit simulation engine for RF/analog/mixed-signal IC design, and IBIS-AMI analysis speedup with GPU computing. · AMI analysis
Cadence Allegro Cadence Design Systems EDA/ECAD tool for PCB (Printed Circuit Board) Design · OpenGL extensions · Scalable Vector Graphics (SVG), Path Rendering SDK
GPU SCALING: Single GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Multi-GPU Multi-Node; Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Multi-Node
APPLICATION NAME (COMPANY NAME): CDP (D2S); CST MPHYSICS STUDIO (Dassault Systèmes SIMULIA Corp.); CST STUDIO SUITE (Dassault Systèmes SIMULIA Corp.); EMPro (KeySight); JMAG (JMAG); REMCOM XFdtd (REMCOM); samadii/em (Metariver Technology); samadii/plasma (Metariver Technology); SEMCAD-X (SPEAG); Serenity (Lucernhammer); Sim4Life (ZMT Zurich MedTech AG); Synopsys LucidShape (Synopsys); TrueMask MDP (D2S); TrueModel (D2S); VSim for Electromagnetics (Tech-X Corporation); WIPL-D 2D Solver (WIPL-D)
GPU acceleration of real-time in-line enhancement of semiconductor manufacturing equipment such as the NuFlare EBM-9500 and MBM-1000 mask writers. · Computational lithography simulations for mask synthesis on GPUs
Multiphysics simulation including thermal, CFD, and mechanical capabilities.
Tightly integrated with CST's electromagnetic solvers. · Conjugated Heat Transfer Solver
Accurate and efficient computational solution for 3D simulation of electromagnetic devices in a wide range of frequencies. · Transient Solver · Integral Equation Solver · Asymptotic Solver · Multilayer Solver
Modeling and simulation environment for analyzing 3D EM effects of high speed and RF/Microwave components. · Finite Difference Time Domain (FDTD) solver
FEA software for electromechanical design. Fast solver / High quality mesh / Advanced modeling technologies. · EM transient solver · EM time harmonic solver · EM static solver
3D EM Simulation solver. · FDTD Solver
Software for computing the electromagnetic field in three-dimensional space using Maxwell's equations, the governing equations that comprehensively describe electromagnetic phenomena · Electromagnetics simulator, FEM solver (scalar FEM, vector FEM) · Electrostatics solver, Electromagnetic wave solver · Magnetostatics solver, Electric current solver, Electrodynamics solver · Co-simulation with samadii/sciv, samadii/dem and fluid flow solvers.
Software for computing plasma phenomena with the PIC (Particle-in-Cell) method. Two-way coupled simulation with samadii/em and samadii/sciv. · Plasma simulator, Charged particle motion analysis · Particle and surface reaction calculation, Field analysis, Sheath range prediction · DSMC collision module, PIC module · Co-simulation with samadii/em, Ansys Maxwell and COMSOL.
3D Full wave electromagnetic and computational life sciences simulation solver · FDTD solver
EM Simulation (RCS) tool · MoM solver
3D Electromagnetics & Acoustic modeling and simulation
LucidShape is a computer aided lighting (CAL) design software for automotive lighting design tasks. With algorithms optimized for automotive applications, LucidShape facilitates the design of automotive forward, rear and signal lighting, and reflectors.
GPU-accelerated simulation and data preparation for mask writing.
GPU-accelerated simulation and geometric checking of curvilinear shapes.
Conformal FDTD for electromagnetics for a variety of material types, yielding engineering outputs that can be used for design of electromagnetic devices
2D EM modeling and simulation for long cylindrical structures
· Transient, Broadband, and Harmonic simulations FDTD solver · Linear and non-linear 3D full wave acoustics solvers · Ray Tracing · Monte Carlo simulations using OptiX 6.5 and CUDA 10.2 · Simulation-based processing · Simulation-based processing · FDTD solver · MoM Solver · Matrix fill-in and near-field calculations
GPU SCALING: Multi-GPU Multi-Node; Single GPU Single Node; Multi-GPU Multi-Node; Multi-GPU Single Node; Multi-GPU Single Node; Multi-GPU Multi-Node; Multi-GPU Multi-Node; Multi-GPU Multi-Node; Multi-GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Multi-Node; Multi-GPU Multi-Node; Single GPU Single Node; Multi-GPU Single Node
WIPL-D Pro WIPL-D Solver for fast and accurate electromagnetic analysis of arbitrary composite 3D metallic and dielectric structures · MoM (Method of Moments) Solver · DDS (Domain Decomposition Solver) Multi-GPU Multi-Node
WIPL-D Pro CAD WIPL-D Modeling and simulation environment uniting versatile, yet simple geometry modeling, with signature WIPL-D simulation accuracy · MoM (Method of Moments) Solver Multi-GPU Single Node
Wireless InSite REMCOM Uses OptiX 4.1 for Ray-tracing and Propagation prediction · X3D Ray Tracer Multi-GPU Single Node
INDUSTRIAL INSPECTION
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES
Cognex VisionPro Cognex ViDi Deep learning-based software dedicated to industrial image analysis.
Cognex ViDi Suite is a field-tested, optimized and reliable software solution based on a state-of-the-art set of algorithms in machine learning.
· Feature localization and identification
· Segmentation and defect detection
· Object and scene classification
· Text & character recognition
HALCON MVTec Software MVTec HALCON is the comprehensive standard software for machine vision with an integrated development environment. HALCON allows models to be trained on GPUs, and outputs trained models for inference on CPU, GPU, or Jetson.
· Deep learning - pre-trained networks optimized for latency or precision
· HALCON also provides an IDE for training neural networks
· Sub-pixel detection, edge detection, counting, OCR, barcode reading, 3D reconstruction from stereo
IBM Visual Insights IBM Corporation IBM Visual Insights uses cognitive capabilities to review and analyze parts, components, and products. Identifies defects by matching patterns to images of defects that it has previously analyzed and classified. Deploy models to edge computing on production lines to facilitate rapid image capture by camera and cognitive identification of defects. Quickly assess quality inspection metrics across manufacturing processes.
· Cloud-based DL training, deployment on (spec'ed) edge server
GPU SCALING: Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node
Media and Entertainment
ANIMATION, MODELING AND RENDERING
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
3ds Max Autodesk 3D modeling, animation, and rendering
· Faster interactive graphics
· Availability of Arnold with AI denoising
· Availability of Chaos V-Ray, Otoy Octane, Redshift, cebas finalRender third-party GPU renderers
Multi-GPU Single Node
Altair Thea Render Altair Physically-based progressive spectral CPU/GPU Renderer supporting fast interactive changes and bucket rendering for high resolution images
· GPU-accelerated hybrid renderer
· Advanced material layering system with subsurface scattering, displacement mapping, physical sun-sky and IES support
Multi-GPU Single Node
ArmorPaint Armory ArmorPaint is a software designed for physically-based texture painting. There is a standalone version, or you can use it as an Armory3D project. Draw textures directly using node based materials and brushes.
· GPU accelerated painting processes
Single GPU Single Node
Arnold Autodesk Solid Angle Arnold film and animation renderer
· RTX
Multi-GPU Single Node
Beauty Box Digital Anarchy Automatic masking and skin retouching.
· GPU accelerated graphics and compute
Single GPU Single Node
Blender Blender Institute 3D modeling, rendering and animation
· GPU-accelerated interactive viewport
Single GPU Single Node
Blender Cycles Blender Institute GPU renderer
· CUDA-accelerated rendering
· RTX-accelerated ray tracing
Multi-GPU Single Node
Character Creator Reallusion
Cinema 4D Maxon
Corona Chaos Group
D5 Render D5 Innovation
Daz Studio Daz3D
Dimension Adobe
EmberGen JangaFX
finalRender Cebas
HIERO Player Foundry
Houdini SideFX
iClone Reallusion
Indigo Glare Technology
KATANA Foundry
Lightwave 3D NewTek
LuxRender LuxRender
MARI Foundry
Mars sheencity
Marvelous Designer CLO Virtual Fashion Inc
Massive Massive
Character Creator 3 is a full character creation solution for designers to easily create, import and customize stylized or realistic looking character assets for use with iClone, Maya, Blender, Unreal Engine 4, Unity or any other 3D tools. It connects industry leading pipelines into one system for 3D character generation, animation, rendering, and interactive design.
3D modeling, animation, and rendering
High-performance photorealistic renderer
· GPU accelerated processing
· Iray support
· Increased model complexity at interactive rates
· Support for Redshift and Chaos V-Ray and Otoy Octane and third-party GPU renderers
· OptiX AI de-noising
D5 Render, based on NVIDIA RTX GPU's real-time ray tracing and rasterization technology, aims to bring an unprecedented real-time rendering experience to architecture and interior design.
Powerful and free 3D creation software tool that is not only easy to use but rich in features and functionality.
3D design tool enabling graphic designers to compose, adjust, and render photorealistic images.
A standalone real-time fluid simulation tool built specifically for real-time VFX Artists with an expansive node based system.
PLUGIN for 3ds Max: Physically Based (Spectral) Wavelength Simulation, Biased + Unbiased Hybrid Rendering, Unlimited Network Rendering
Shot management, conform and review timeline
Procedural 3D modeling, animation and rendering
iClone is the software for real-time 3D animation, blending character creation, scene design, and cinematic storytelling into a real-time engine.
Unbiased, physically-based renderer.
· Real-time GPU accelerated physically based global illumination and ray tracing.
· GPU accelerated compute
· Rendering via NVIDIA Iray and OptiX
· RTX ray tracing, accelerated graphics & MDL (Material Definition Language)
· GPU accelerated volumetric fluid simulations
· CUDA-accelerated renderer for Autodesk 3ds Max
· OptiX AI de-noising
· Fluid, interactive playback
· Faster simulations
· GPU accelerated ray-tracing and rendering
· GPU-accelerated rendering
Powerful look development and lighting tool
· Faster interactive graphics
3D modeling, animation, and rendering
GPU 3D Renderer
· Increased model complexity at interactive rates
· GPU-accelerated ray tracing
3D paint tool that allows painting directly onto 3D models
Real-time architectural visualization tool with advanced features such as real-time ray tracing, DLSS, and VR.
Realistic and dynamic 3D modeling software for clothes and fabric.
Simulation and visualization tools for autonomous agent driven animation for film, games, television, architecture and transportation.
· Faster interactive painting
· RTX Ray tracing
· DLSS
· GPU accelerated cloth simulations
· GPU accelerated effects
Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
Maverick Renderer Maverick
Maxwell Next Limit
Maya Autodesk
Meshroom Czech Technical University (CTU)
Metashape Agisoft
MODO Foundry
Motion Builder Autodesk
Mudbox Autodesk
NX Ray Traced Studio Siemens Digital Industries Software
OctaneRender Otoy
Realflow Next Limit
RealityCapture Capturing Reality
Redshift Renderer Redshift
Renderman Pixar
Sculptris Pixologic
Trapcode Red Giant
TurbulenceFD Jawset
Vantage Chaos Group
V-Ray GPU Chaos Group
vRt vRt
WispRenderer Breda University of Applied Sciences
CUDA-based GPU renderer
· CUDA-accelerated ray-tracing
· OptiX 7 de-noising
Single GPU Single Node
CUDA-accelerated interactive and final-frame renderer
· CUDA-accelerated ray-tracing
· Unrestricted image resolution
· OptiX de-noising
Multi-GPU Single Node
3D modeling, animation, and rendering
· Increased model complexity and larger scenes
· Availability of Chaos V-Ray, Otoy Octane and Redshift third-party GPU renderers
Single GPU Single Node
Open source photogrammetry 3D software
· CUDA-accelerated depth analysis
Single GPU Single Node
Agisoft PhotoScan is a stand-alone software product that performs photogrammetric processing of digital images.
Generates 3D spatial data to be used in GIS applications, cultural heritage documentation, and visual effects production, as well as for indirect measurements of objects of various scales.
· CUDA-accelerated photogrammetry solution
· RTX opportunity
Multi-GPU Single Node
3D modeling, animation and rendering
· Increased model complexity, larger scenes
Single GPU Single Node
Character animation and motion capture
· Increased model complexity at interactive rates
Single GPU Single Node
3D sculpting
· Increased model complexity at interactive rates
Single GPU Single Node
Embedded rendering feature for Siemens NX
· Iray based
· MDL
· AI denoising
Multi-GPU Single Node
CUDA-accelerated GPU renderer
· GPU accelerated rendering
· AI de-noising
Multi-GPU Single Node
Fluid simulation system
· GPU-accelerated simulation
Single GPU Single Node
Photogrammetry
· CUDA-accelerated, fast photogrammetry
Multi-GPU Single Node
GPU-accelerated, biased renderer
· CUDA-based GPU final-frame rendering
· Mac and Windows supported
Multi-GPU Single Node
Leading film renderer
· OptiX AI de-noising
Single GPU Single Node
3D sculpting
· Increased model complexity at interactive rates
Single GPU Single Node
Particle simulations and 3D effects for motion graphics and VFX. Now with Fluid Dynamics.
· GPU accelerated effects
Single GPU Single Node
Turbulence FD is a powerful simulation tool to create smoke, fire and explosion effects.
· GPU accelerated graphics, compute and simulation
Single GPU Single Node
Vantage is an interactive viewer that takes V-Ray scene files and uses DXR-accelerated ray tracing to display interactive scenes. It will be sold as a separate product, not bundled with V-Ray.
· RTX-accelerated, high frame-rate camera
· Interactive animations
· Bi-directional link to Autodesk 3ds Max
· Ideal for AEC walk throughs and product design
Multi-GPU Single Node
GPU renderer with CPU Hybrid rendering
· CUDA interactive and final-frame GPU rendering
Multi-GPU Single Node
vRt is an open-source project that aims to offer a unified, cross-platform, Vulkan-based ray-tracing library for modern graphics cards, built against Vulkan 1.1
· vRtC (compute-based, native, default, wide GPU support)
· vRtX (NVIDIA RTX only, currently higher performance)
Multi-GPU Single Node
General purpose high level rendering library with RTX, RTGI, HBAO+, and Ansel support.
· RTX, RTGI, HBAO+
· Ansel
Multi-GPU Single Node
COLOR CORRECTION AND GRAIN MANAGEMENT
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
ARRI de-bayering SDK ARRI RAW de-bayering SDK
· De-bayering of ARRI RAW and primary color grading.
Single GPU Single Node
Baselight FilmLight Color grading
· Real-time color correction
Multi-GPU Single Node
Cinema RAW SDK Canon RAW de-bayering
· GPU-accelerated de-bayering
Single GPU Single Node
Dark Energy Cinnafilm Application and plug-in for image enhancement
· Image de-noising and restoration
· Noise reduction, de-noise and de-grain
· Grain removal, image sharpening and texture management, dust busting
· SDR to HDR upres
Multi-GPU Single Node
DaVinci Resolve Blackmagic Design Color grading and editing
· Real-time color correction and de-noising
· RTX-accelerated AI features for re-timing and image enhancement
Multi-GPU Single Node
DeNoise AI Topaz Labs DeNoise AI uses machine-learning to remove noise from your image while preserving detail for a crisp, clear result.
Whether you are shooting with High ISO or in a low light scenario, DeNoise will correct your image without removing any important information or patterns in your image.
· GPU accelerated effects
Single GPU Single Node
Diamant-Film Restoration HS-Art Film cleanup and restoration
· CUDA accelerated optical flow, de-flicker, in-painting and over 30 filters
Multi-GPU Single Node
Grain and Noise Reducer Wavelet Beam Video noise reduction
· CUDA-accelerated grain and noise reduction
Multi-GPU Single Node
HDR Image Analyser AJA A 1RU waveform, histogram, vectorscope and Nit-level HDR monitoring solution for 4K, UltraHD, 2K, and HD resolution with HDR and WCG content.
· Precise, high quality UltraHD UI for native-resolution picture display
· Advanced out of gamut and out of brightness detection with error intolerance
· Support for SDR (Rec.709), ST2084/PQ and HLG analysis
· CIE graph, Vectorscope, Waveform, Histogram
· Out of gamut false color mode to easily spot out of gamut/out of brightness pixels
· Data analyzer with pixel picker
· Up to 4K/UltraHD 60p over 4x 3G-SDI inputs
· SDI auto signal detection
· File base error logging with timecode
· Display and color processing look up table (LUT) support
· Line mode to focus a region of interest onto a single horizontal or vertical line
· Loop through output to broadcast monitors
· Still store
· Nit levels and phase metering
· Built-in support for color spaces from ARRI, Canon, Panasonic, RED and Sony
Single GPU Single Node
Magic Bullet Colorista Red Giant Real time, interactive, multi-layered masked color correction (video playback too!) with the Mercury Playback engine in Premiere Pro.
· GPU accelerated effects
Single GPU Single Node
Magic Bullet Looks Red Giant Powerful looks and color correction for filmmakers.
· GPU accelerated compute
Single GPU Single Node
Mist Marquise Technologies Mastering tool for cinema, broadcast and over-the-top content
· 100% CUDA-accelerated imaging pipeline for de-bayering, color grading, transcoding and image enhancement
· Integrated Dolby Vision pipeline
Multi-GPU Single Node
Nucoda Digital Vision Color grading
· GPU-accelerated color grading
· Accelerated scopes, playback and rendering
Single GPU Single Node
Pablo family Grass Valley Color grading and finishing
· Real time color correction
Pablo Rio Grass Valley
PFClean The Pixel Farm
RAW Converter ARRI
REDCINE-X PRO Red Digital Cinema
R3D SDK Red Digital Cinema Red Digital Cinema
Scratch Assimilate
VFX Suite Red Giant
Pablo Rio is a color grading application that GV acquired when they purchased Snell.
· CUDA-accelerated color grading
Image restoration and remastering
· CUDA-based image processing acceleration
RAW de-Bayering and primary color grading
· CUDA-accelerated de-bayering and primary grading
Primary color grading
· CUDA-accelerated de-bayering and primary color grading
Red Digital Cinema camera SDK decodes and de-bayers Red RAW camera data, and allows primary color grading. Used by many color grading and video editing applications.
· CUDA-accelerated wavelet decoding and de-bayering
Color grading and finishing
· Accelerated de-bayering for real-time digital finishing
VFX Suite is a complete set of visual effects and motion graphics plugins for creating professional effects.
· GPU accelerated effects
Multi-GPU Single Node; Multi-GPU Single Node; Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
COMPOSITING, FINISHING AND EFFECTS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
After Effects Adobe Motion graphics and effects
· CUDA acceleration for up to 10x faster performance on key effects plus enhanced 3D ray tracing
Single GPU Single Node
Aura Rowbyte Aura is a procedural plug-in for After Effects that creates elegant geometric shapes in 3D space. It's akin to a particle system but instead of rendering small particles all over the place, it generates vector-like shapes (waves) that change over time much like the classic Radiowaves plug-in.
· GPU-accelerated High Frequency Rendering
Single GPU Single Node
Clipster Rohde & Schwarz Video and film player and DCI Packager
· GPU-accelerated
· Video scaling
· Color space conversion
· Data format conversion
Multi-GPU Single Node
Complete CoreMelt Visual effects plug-in
· Faster effects
Single GPU Single Node
Continuum Boris FX Visual effects plug-in for creative effects, titling, and quick fixes.
· GPU accelerated effects
Single GPU Single Node
DE:Noise RE:Vision Effects Reduce noise, dust, and artifacts with frame-to-frame motion tracking. Useful for low light shoots, CG renders with ray tracing sample artifacts, excessive film grain.
· Faster effects
Single GPU Single Node
DEFlicker RE:Vision Effects Reducing flicker and artifacts in high-frame-rate and time-lapse video.
· Faster effects
Single GPU Single Node
Element 3D Video Copilot Advanced 3D object & particle render engine plugin for Adobe After Effects
· GPU accelerated graphics and compute
Single GPU Single Node
Flame Premium Autodesk Finishing and color grading
· Integrated toolset for 3D VFX, editorial, and color grading
Multi-GPU Single Node
Flicker Free Digital Anarchy Deflicker Time Lapse, Slow Motion, and Old Video. Flicker Free is a powerful, new way to deflicker video.
· GPU accelerated effects
Single GPU Single Node
Fusion Blackmagic Design Effects and compositing
· 3D tracking
· Compositing
· VR
Single GPU Single Node
HIERO Foundry Multi-shot management tool that supports collaborative working, review and approval, quick production turnaround and delivery
· Fluid, interactive playback
Single GPU Single Node
Imerge Pro FXhome
Magic Bullet Denoiser Red Giant
Magic Bullet Film Red Giant
Magic Bullet Suite Red Giant
Mamba FX SGO
MediaReactor Drastic Technologies
Mighty Bake Mighty Bake
Mistika Ultima SGO
Mistika VR SGO
Mocha Pro Boris FX
Natron Natron
Neat Video Absoft
NUKE Foundry
Optics Boris FX
PFTrack The Pixel Farm
Plexus Rowbyte
Imerge Pro is layer-based image compositing software that is GPU accelerated, making performance astonishingly fast, even on high-resolution images. Create pro-level composites with unlimited layers and zero baked-in changes. Imerge Pro is the first photo editing software to keep your image data RAW and your layers self-contained.
· GPU-accelerated processing
Single GPU Single Node
Magic Bullet Denoiser III lets you reduce visible noise and grain in digital video produced by digital video cameras, camcorders, or film.
· GPU accelerated effects
Single GPU Single Node
Gives digital footage the look of real film by emulating the entire photochemical process from the original film negative, to color grading, and finally to the print stock.
· GPU accelerated effects
Single GPU Single Node
Full suite of tools for color correction, finishing and film looks for filmmakers.
· GPU-accelerated processing and effects
Single GPU Single Node
High-end compositing
· Faster keying, tracking, painting and restoration
Single GPU Single Node
Debayering and processing of raw camera files.
· GPU-accelerated compute
Single GPU Single Node
A powerful, easy to use, all-in-one texture baking solution for any 3D artist
· GPU accelerated processing
Single GPU Single Node
Color grading and finishing
· Faster keying, tracking, painting and restoration, de-bayering
Single GPU Single Node
Near real-time optical flow stitching
· GPU-accelerated video stitching with manual controls
· Export clips in many formats, including DPX and ProRes
Single GPU Single Node
Mocha Pro is an award-winning planar tracking tool for motion tracking, rotoscoping, object removal, camera stabilization and general visual effects.
· GPU accelerated planar tracking and object removal
Single GPU Single Node
Natron is a free and open-source node-based compositing software application.
· GPU-accelerated processing and rendering
Single GPU Single Node
Digital filter with auto-profiling tool designed to reduce visible noise and grain found in footage.
· GPU accelerated processing
Single GPU Single Node
Compositing tool with 3D tracker
· GPU-accelerated BLINK processing
· Faster compositing and effects
Single GPU Single Node
Optics is designed to simulate optical camera filters, specialized lenses, film stocks and grain, lens flares, optical lab processes, color correction as well as natural light and photographic effects. First collaborative product between Sapphire and Digital Film Tools.
Plugin for Photoshop and Lightroom, also has a Windows and Mac standalone application.
· GPU accelerated processing and effects
Single GPU Single Node
3D scene creation and tracking
· CUDA-accelerated tracking
Multi-GPU Single Node
Plexus is a plug-in designed to bring generative art closer to a non-linear program like After Effects. It lets you create, manipulate and visualize data in a procedural manner. Render the particles and create all sorts of interesting relationships between them based on various parameters using lines and triangles.
· Plexus (interacts natively with AE's Camera)
· High-quality, GPU-accelerated Depth of Field effects
Single GPU Single Node
Rotobot Sapphire SilhouetteFX Silhouette Paint Twixtor Video Essentials
Kognat An AI product for compositing packages which uses machine learning to generate mattes for machine-based rotoscoping.
· CUDA accelerated AI rotoscoping
Boris FX The Sapphire suite is an all-in-one solution containing hundreds of effects, presets, and workflows that are aimed at taking professional video work to the next level.
· Faster effects
Boris FX Invaluable in post-production, Silhouette continues to bring best of class tools to the visual effects industry. As a fully featured GPU accelerated compositing system, its standout features are award winning rotoscoping and non-destructive paint as well as keying, matting, warping, morphing, and a total of 142 different nodes, all stereo enabled.
· GPU-accelerated processing and effects
Boris FX Rotoscoping tool that allows for intensive VFX fixes, blemish cleanup, beauty effects, wire/object removal, style effects on video, and as an artistic paint tool.
It is raster based so it has a smaller memory footprint (fastest paint plugin on the market), and is integrated with the Mocha Pro planar tracker
· GPU accelerated processing and effects
RE:Vision Effects Optical flow tracking of pixel motion to synthesize new frames by warping & interpolating frames of the original sequence. Reduces artifacts & retimes frames.
· Faster effects
NewBlueFX Comprehensive collection of titling, transitions and video effects.
· Faster effects
Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
(VIDEO) EDITING
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
Blackmagic RAW SDK Blackmagic Design
Catalyst Production Suite Sony Creative Software
CineMatch FilmConvert
Edius Pro Grass Valley
Filmora Wondershare
Gigapixel AI Topaz Labs
Blackmagic RAW is a CPU and GPU-enabled SDK for decoding and debayering Blackmagic RAW files on MacOS, Windows and Linux
· CUDA-accelerated de-coding and debayering
Single GPU Single Node
4K, Sony RAW, and HD video editing. Includes 3 applications: Browse, Prepare, Edit
· Faster effects, transitions and encoding
· RAW camera de-bayering
Single GPU Single Node
CineMatch is a set of tools designed to help you match footage shot on different cameras to a baseline technical level - a seamless, matched timeline in Log or REC.709, ready for creative grading.
· Real-time color matching conversions with CUDA
Single GPU Single Node
Video editing
· Faster effects
· RAW camera de-bayering
Single GPU Single Node
Filmora is an easy-to-use and trendy video editing software that lets you empower your story and be amazed at results, regardless of your skill level. With Filmora, you can get started with any new movie project by importing and editing your video, adding special effects and transitions, and sharing your final production on social media, mobile devices, or DVDs.
· GPU-accelerated processing
Single GPU Single Node
Photo upscaling by using AI to "fill in" and add new detail when enlarging photos.
· GPU accelerated effects
Single GPU Single Node
GPUSqueeze Multicamera Systems
HitFilm Pro FXhome
Illustrator Adobe
Lightroom Classic Adobe
Lightworks EditShare
Live Planet Live Planet
Luminar AI Skylum
Media Composer Avid
Movavi Video Suite Movavi
MXF Film Partners
Photoshop Adobe
Pinnacle Studio Corel
PowerDirector CyberLink
PowerDVD CyberLink
Premiere Pro Adobe
GPUSqueeze is a cross-platform software library for multi-stream and ultra high speed video encoding, transcoding and processing using multi-GPU and distributed setups. The library uses highly optimized patent pending algorithms to achieve maximum speed and high hardware utilization, and provides almost linear performance scaling as the number of GPUs in the system increases.
· GPU accelerated video encoding and decoding
Multi-GPU Multi-Node
HitFilm Pro is an all-in-one video editor, compositor, and visual effects (VFX) software designed for filmmakers, professional video editors, and visual content producers.
· GPU accelerated effects and decoding
Single GPU Single Node
Vector graphics software for creating logos, icons, drawings, typography, and illustrations for print, web, video, and mobile devices.
· Entire canvas optimized for NVIDIA GPUs for faster pan & zoom
Single GPU Single Node
Easily edits, organizes, stores, and shares your photos.
· GPU accelerated Develop module plus new Sensei features like "Enhance Details" with NVIDIA GPU AI optimization.
· Up to 600% faster than integrated GPUs with controls like Texture, Dehaze, & Sharpening
· Improved editing in 1:1 view & on hi-res displays.
Single GPU Single Node
Video editing
· Faster effects
· CUDA-accelerated de-bayering
Single GPU Single Node
Livestreaming, recording and delivery of stereoscopic 360 VR
· Real time 360 3D capture and stitch
· 4K
Single GPU Single Node
Luminar is the world's first photo editor that adapts to your style & skill level. It is designed to make complex photo editing easy & enjoyable for everyone. Take advantage of over 300 powerful, yet simple photo editing tools that allow you to perform all kinds of image editing tasks.
· GPU accelerated processing and AI effects
Single GPU Single Node
Video editing
· Faster video effects, unique stereo 3D capabilities
Single GPU Single Node
An all-in-one video maker: an editor, converter, screen recorder, and more.
· Faster conversion speed with NVIDIA CUDA
Single GPU Single Node
Collaborative editing system supporting Avid Media Composer, Adobe Premiere Pro, Grass Valley Edius and Blackmagic Resolve
· NVIDIA Video Codec allowing remote GPU-accelerated production workflows
Single GPU Single Node
Photo editing to transform your images into anything you can imagine
· GPU-accelerated AI "Neural Filters"
· 30+ other GPU accelerated features
· Blur gallery, liquify, smart sharpen, perspective warp
Single GPU Single Node
Video editing and sharing program.
· GPU accelerated compute and effects
Single GPU Single Node
PowerDirector delivers professional-grade video editing and production for creators of all levels. Whether you are editing in 360 degrees, Ultra HD 4K or even the latest online media formats, PowerDirector remains the definitive Windows video editing solution for anyone, whether they are beginners or professionals.
· GPU accelerated video processing and effects
Single GPU Single Node
CyberLink PowerDVD is a universal media player for movie discs, video files, photos and music.
· GPU accelerated encoding and decoding
Single GPU Single Node
Video editing software for film, TV, and the web.
· Real-time video editing & fast output rendering based on CUDA
Multi-GPU Single Node
Premiere Rush Adobe
Sharpen AI Topaz Labs
SmartCourtPro PlaySight
Smoke Autodesk
TotalFX NewBlueFX
Vegas Pro Magix
Velocity Imagine Communications
Video Enhance AI Topaz Labs
Video Studio Corel
VLC Media Player VideoLAN Organization
WonderLive Z Cam
Easy-to-use video editor for creating and sharing online videos.
· CUDA
· Real-time video editing
· Fast output rendering
Sharpening and shake reduction software that can tell the difference between real detail and noise.
· GPU accelerated effects
· Machine Learning
Sophisticated video and analytics training technology with the latest in AI, integrations and player development tools.
· IVA
Finishing and editing
· Faster effects
Comprehensive collection of Titling, Compositing, Polishing and Styling tools.
Video editing
Video editing
Trained on thousands of videos and combining information from multiple input video frames, Topaz Video Enhance AI will enlarge and enhance your footage up to 8K resolution with true details and motion consistency.
High quality tools that build, edit, and correct video skillfully.
VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.
Cinematic VR Camera with excellent image quality, stereoscopic 360-degree recording, and live streaming.
· GPU-accelerated effects
· Faster video effects and encoding
· Uses NVENC to encode/decode H.264 and HEVC streams
· Faster effects
· GPU accelerated AI inference and processing
· GPU accelerated compute
· NV Video Codec accelerated encoding and decoding
· Up to 4K output resolution equirectangular image
· Save live stitched video file
· Preview live stitched video
· RTMP live streaming output
· Supports VRWorks 360 video SDK
Multi-GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
(IMAGE & PHOTO) EDITING
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES
Adjust AI Topaz Labs
Affinity Photo Affinity
Corel Draw Corel
Corel Photo-Paint Corel
Fresco Adobe
JPEG to RAW AI Topaz Labs
Adjust AI is a one click application that leverages the power of machine learning to intelligently enhance photos.
· GPU accelerated effects
A fast and precise image editing software for photography and creative professionals, from editing and retouching images, creating full-blown multi-layered compositions, to making beautiful raster paintings.
· GPU accelerated image processing
Professional vector illustration, layout, photo editing and design tools
· Faster processing of AI features
Corel PHOTO-PAINT is an advanced photo editing software that offers professional editing tools and support for PSD files, plus extensive RAW file support for over 300 types of cameras.
· Faster processing of AI features
Powerful painting and drawing app that lets you create with realistic watercolors and oils
· DirectX acceleration on GPU
AI powered conversion of JPEG to high-quality RAW for better editing.
Prevent banding, remove compression artifacts, recover detail, and enhance dynamic range
· GPU accelerated processing
GPU SCALING: Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
Mask AI Topaz Labs
Neat Image Absoft
ON1 Photo Raw ON1
PhotoLab DxO
Topaz Studio Topaz Labs
This is an AI-based masking tool for photography that lets creators automatically detect and remove objects from an image.
· GPU-accelerated processing
Reduces noise, film grain, artifacts from photos.
· GPU accelerated processing
Professional-grade photo organizer, raw processor, layered editor, and effects app, includes everything you need in one photography application.
· GPU-accelerated processing
PhotoLab is a photo editor specializing in high-quality RAW processing and optical corrections for lens defects, along with powerful local image adjustment tools.
· GPU-accelerated processing and AI features
Topaz Studio is an intuitive image effect toolbox with Topaz Labs' powerful acclaimed photo enhancement technology. It works as a plugin within Lightroom, Photoshop, Affinity Photo, and others, as well as a standalone editor and host application for your other Topaz plugins.
· GPU-accelerated processing
Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node; Single GPU Single Node
ENCODING AND DIGITAL DISTRIBUTION
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES
4K Capture Utility for Windows ElGato ElGato sells Capture Cards and offers capture software with them. The ElGato 4K60 Pro Mk.II capture card includes an implementation of the Video Codec SDK (i.e. NVENC).
· HDR recording over HEVC · HDR to SDR conversion
Alchemist on Demand | Grass Valley | Video standards conversion | · GPU-accelerated video processing and encoding
Amberfin | Dalet | Transcoding and video quality analysis | · GPU-accelerated video processing and encoding
Aurora | Tektronix | Automated video quality measurement | · CUDA-accelerated video quality assessment
AW-360C10 | Panasonic | 360-degree live camera designed for live sporting events, concerts and stadium events | · Low-latency · Real-time 4K 360 degree stitching from four camera inputs · Jetson TX1
Content Agent | Root6 | Automated transcoding and workflow management | · GPU-accelerated video processing and encoding
Core | ArcVideo | Video processing and transcoding Live | · Accelerated transcoding and encoding
Daniel2 | Cinegy | Resolution-independent, CUDA accelerated video codec. | · 8K+ video playback faster than real time · 3D LUT color profiles supported · Lossless 10-, 12-, 16-bit support · Adobe Premiere Pro plugin
Discord Go Live | Discord | Broadcast feature that enables Discord users to broadcast their screen to a Discord channel | · NVENC
DouYu App | DouYu | Douyu's streaming application | · NVENC
Elemental Live | Elemental | Live streaming video processing and encoding | · Video encoding and video processing
Elemental Server | Elemental | File-based video processing and encoding | · Video encoding and video processing
Fast CinemaDNG Processor | Fastvideo | RAW video debayering, denoising and color correction completely on GPU side | · High-quality GPU-based RAW video processing up to 160 fps · Wavelet, realtime de-noising · Color correction features and monitoring · Export to 16-bit TIF or 10-bit ProRes · Full-sized video processing · Realtime 4K, 6K, and 8K playback supported
GPU SCALING: Single GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, N/A, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node

FAST TICO-RAW | intoPIX | The intoPIX TICO-RAW SDKs provide the highest quality, visually lossless codec for the optimization of your application's infrastructure. FastTICO-RAW SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless RAW encoding over parts of their workflows. | · CUDA GPU accelerated up to 10K decoding · Lossless and low latency · All operating systems
FAST TICO-XS | intoPIX | The intoPIX FastTICO-XS SDKs provide the highest quality, lowest latency, visually lossless codec for the optimization of your application. FastTICO-XS SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless encoding over their whole infrastructure and workflows. | · CUDA GPU accelerated HD, UHD-4K and -8K encoding / decoding · Lossless and low latency · All operating systems · JPEG XS standard compliant
Handbrake | Handbrake | HandBrake is an open-source, GPL-licensed, multiplatform, multithreaded video transcoder. | · GPU accelerated encoding
HuYa App | HuYa | Huya's streaming app | · NVENC
JPEG2000 Codec | Comprimato | JPEG2000 encoding and decoding for DCP, IMF, video editing, broadcast contribution, and archiving.
Lightspeed Live | Telestream | Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed. Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements.
Live | ArcVideo | High-density, real-time video processing and encoding.
Medialooks SDK; Media Transcoding in the Cloud; Multiplatform Transcoder
Logitech Capture | Logitech |
Logitech's app to control their webcam
· Faster-than-real-time UltraHD / 4K · Lossy and mathematically lossless · High-bit-depth (HDR) · Uses NVENC to encode/decode multiple H.264 and HEVC streams · Video processing and transcoding · Accelerated broadcast encoding with NVIDIA CUDA and NVENC · NVENC
Medialooks | MFormats SDK provides complete control over the video pipeline
Ribbon Communications | Industry-leading SBC media transcoding scaling capabilities in virtual and cloud deployments using NVIDIA GPUs to increase performance and decrease cost per transcoded session. Expanded SBC and PSX support for SIP Recording (SIPRec) allows enterprises and call centers to conduct up to four (4) simultaneous recordings of sessions via secure, encrypted technology. Expanded capabilities for Virtual Network Functions (VNF) instantiation with the ability to instantiate Ribbon PSX VNF aligned with the Open Network Automation Platform (ONAP) framework. Enhancements for operational efficiencies that allow CSPs to reduce configuration complexity and improve ease of use. Enhanced security across all products to deliver more restrictive access, reduction in possible network exposure and additional encryption.
ERLAB | Video processing and encoding software
· NVIDIA Video Codec used for accelerated encoding and decoding
· Ribbon's Session Border Controller Release 7.0 now supports GPUs, enabling greater performance and scale for media transcoding, at cost-effective price points, in cloud and virtualized environments. · Ribbon's Centralized Policy and Routing (PSX) can be instantiated as a Virtual Network Function (VNF) aligned with the ONAP architecture.
· Enterprises now have increased capacity for up to four (4) concurrent SIP Recording (SIPRec) sessions, enabling recorded data to be used for multiple purposes simultaneously such as real-time analytics for call center agents, recordings for corporate compliance and back-up, and lawful intercept · The Insight Element Management System (EMS) has an improved user interface for ease of use and offers improved provisioning and management processes
· Pre-processing, encoding, decoding, post-processing and delivery
Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node

mxfSPEEDRAIL | MOG Technologies
OBS Studio | Open Broadcaster Software
Piko TV | Kizil Electronik
PixelStrings | Cinnafilm
Skywatch | MOG Technologies
Smart Render Editor | Nablet
Smart Render SDK | Nablet
Speech Quality transformed using Neural Network Computing | BabbleLabs
StreamLabs OBS | StreamLabs
Tachyon | Cinnafilm
Tornado | Marquise Technologies
Transkoder | Colorfront
Twitch Studio | Twitch.tv
Baseband broadcast news and sports production video ingest product line that allows editing of growing files during ingest.
· NVIDIA Video codec used for encoding for higher channel density · CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling | Single GPU Single Node
Free and open source software for video recording and live streaming optimized for NVIDIA video encoder | · NVENC | Single GPU Single Node
Linear broadcast encoder | · H.264 and HEVC 4K encoding for broadcast channels | Single GPU Single Node
Cloud-based image processing Platform-as-a-Service (PaaS) delivering high-quality, automated video conversion and frame optimization | · Motion-compensated frame rate conversion · High-quality de-interlacing · Texture-aware scaling · De-grain/re-grain to any film look · De-noise/re-texture to limit banding · Reverse telecine/pulldown pattern correction · Interlace artifact and dust removal · Runtime retiming | Multi-GPU Single Node
Video and broadcast production management system for collecting audio/video usage and metadata. | · NVIDIA Video codec used for encoding for higher channel density · CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling | Single GPU Single Node
H.264 and HEVC video encoding using NV Video Codec | · Accelerated, high-density video encoding | Single GPU Single Node
Video de-noising, de-interlacing, JPEG 2000 encoding and video fingerprinting | · CUDA accelerated video processing · NVIDIA Video codec | Single GPU Single Node
BabbleLabs has just launched broad production availability of our commercial speech API, web service, and phone mobile apps for iPhone and Android. These services clean up video and audio recordings to make the speech much easier to understand. The apps work on existing videos as well as new audio and video recorded inside the app.
· Real time encoding/decoding of audio · Video signals | Single GPU Single Node
Branch of the OBS Studio project that adds a custom UI, integrates plugins, and a plugin store | · NVENC | Single GPU Single Node
Standards conversion | · Video processing and frame rate conversion · Standards conversions and transcoding · SD to UHD, telecine correction, and frame rate normalization | Multi-GPU Single Node
Transcoding engine for IMF and DCP facilities | · Image re-sizing up to 8K · Color space conversion: 601/709, REC 2020, DCI XYZ, ACES 1.0 · De-bayering: ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K, Canon C500 · Mezzanine: ProRes 444, Avid DNxHD 444, XDCAM, AVC Intra, AS-11 DPP, IMF · Uncompressed: DPX, TIFF, OpenEXR | Single GPU Single Node
Encoding and transcoding for DCP and IMF mastering | · JPEG2000 encoding and decoding · 32-bit floating point processing on multiple GPUs · MXF wrapping, accelerated checksums and AES encryption and decryption · IMF/IMP and DCI/DCP package authoring, editing, transwrapping | Multi-GPU Single Node
Broadcasting app focused on beginners | · NVENC · Multi-video Codec support | Single GPU Single Node

Vantage LightSpeed | Telestream
Viarte | Isovideo
VidiCert | Joanneum Research
Wormhole | Cinnafilm
Wowza Streaming Engine Transcoder | Wowza
XSplit Broadcaster | SplitmediaLabs, Ltd.
XSplit Gamecaster | SplitmediaLabs, Ltd.
Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed.
Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements.
Video standards conversion
Video and film quality assurance
Time alteration
H.264 video encoding
· Video transcoding and processing · CUDA-accelerated video processing and encoding · CUDA accelerated video quality analysis · GPU-accelerated noise, grain and dust detection/removal · Retiming and motion compensation · Super slow motion, and run length adjustment · Commercial insertion, audio retiming, and caption retiming · NVENC accelerated video encoding
Broadcast app for recording and streaming, now including a lightweight video editor
Simplified broadcast app for recording and streaming, now including a lightweight video editor
· NVENC · Record · Stream · NVENC
Multi-GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node, N/A, Single GPU Single Node

ON-AIR GRAPHICS
APPLICATION NAME | COMPANY NAME
Air | Cinegy
Aximmetry | Aximmetry
Brodcaast Dscript 3D | Monarch
Camino | AJT Systems
Capture | Cinegy
PRODUCT DESCRIPTION | SUPPORTED FEATURES
Broadcast play-out server | · Real-time on-air graphics · NVIDIA Video Codec for accelerated encoding and decoding HD and HEVC
Aximmetry's solutions cover all aspects of advanced broadcast presentation: tracked virtual sets, Augmented Reality (AR), interactive touch screen displays, data-driven graphics, virtual product placement, and audience interaction via second-screen devices. | · DirectX 11 3D Rendering, Post Processing and Compositing · NVEnc encoding in H264/265 · TXAA · Gameworks: Screen-Space Ambient Occlusion · Gameworks: Depth of Field
3D on-air graphics | · Real-time rendering
Camino is a powerful 3D rendering system for live-to-air broadcast graphics, capable of up to 4K character generation. Camino's high end features, with excellent ease of use, combine to deliver an exceptional system for your broadcast graphics requirements.
· Camino's real-time graphics overlay can be applied to tickertapes, scoreboards, schedule boards, program junctions, and TV show promotions · Graphics overlay may be done via predefined templates, which may then be populated with live data during playout · Makes real-time rendering of data-driven graphics possible in news and sports events · 4K, 1080p, 720p and SD Support · NTSC and PAL Support · Graphics, Clips and 3D Objects Importer · 2D and 3D Primitives · Real-Time Key-Frame Animations · Real-Time 3D Scene Lighting · Timeline-Based Audio Support · Data Mapping to External Sources · Transition Logic · Automation Controller Support · Stereoscopic 3D rendering
Video ingest | · Uses NVENC to encode/decode multiple H.264 and HEVC streams
GPU SCALING: Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node

Clarity | Pixel Power
Click Effects PRIME | ChyronHego
Cube | Dalet
Designer | Disguise
eStudio | Brainstorm
InfinitySet | Brainstorm
KAIROS | Panasonic
Livebook GFX | AJT Systems
Mosaic | ChyronHego
Multiviewers | Evertz
Nexio Channelbrand | Imagine Communications
Nexio G8 | Imagine Communications
Nexio TitleOne | Imagine Communications
Pixotope | The Future Group
PRIME | ChyronHego
Reality Engine | Zero Density
On-air graphics
Click Effects PRIME is an audiovisual content control and delivery solution for live sports & entertainment productions.
On-air Graphics
Designer is the ultimate software to visualize, design, and sequence projects wherever you are, from concept all the way through to showtime.
Virtual sets and motion graphics
Realistic virtual sets
The IT/IP platform 'KAIROS' is a live video production platform developed based on a new concept and innovative architecture. It incorporates proprietary, ground-breaking software to maximize the CPU and GPU capacities for video processing.
The LiveBook is designed to fit every production environment and facilitate evolving workflows. Whether you are broadcasting over IP, or using SDI for internal or downstream keying, the LiveBook will be able to adapt to your environment.
On-air graphics
Broadcast multiviewer
On-air graphics
On-air graphics
On-air graphics
All-in-one, real-time virtual production system with integrated Unreal Engine photorealistic rendering. Open software-based solution for rapidly creating virtual studios, augmented reality (AR), and on-air graphics. Offers a real-time WYSIWYG editor, a virtual set auto-generation tool, its own powerful internal chroma keyer, and user-designed custom control panels.
PRIME Graphics Platform is the next generation of pioneering real-time graphics solutions, helping broadcasters create engaging visuals for all types of programming.
Photorealistic virtual studio solution in broadcast industry, powered by Epic Unreal Engine 4.24, using Mellanox Rivermax API
· Real-time rendering · Real-time graphics rendering · Real-time graphics rendering · Real-time graphics rendering · Synchronized video playback · Projection Mapping · Real-time rendering · RTX accelerated ray-tracing optional Epic Unreal Engine · Real-time RTX ray tracing through UE4 · HDR I/O · Physically-based rendering · RTX accelerated ray-tracing optional Epic Unreal Engine · Realtime playout · CUDA and NVEnc · Rivermax SMPTE 2110 · GPU Accelerated Video · Graphics solution for compact live sports productions · Real-time rendering · Uses NVENC H.264 and HEVC encoding and decoding · Real-time rendering · Real-time rendering · Real-time rendering · Real-time rendering · RTX accelerated ray-tracing · Real-time graphics rendering · RTX-accelerated ray-tracing with Unreal Engine · Node-based compositing system designed for real-time production · Image quality is achieved on NVIDIA GPUs through deferred rendering methods, unique anti-aliasing technology and advanced features such as depth of field, motion blur, light maps, screen space reflections and refraction
Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node, Single GPU Single Node

Titler Pro | NewBlueFX | Create elegant video titles or 3D motion graphics. | · GPU-accelerated graphics | Single GPU Single Node
tOG | RT Software | On-air graphics | · Real-time rendering | Single GPU Single Node
Type | Cinegy | On-air Graphics | · Real-time graphics rendering | Single GPU Single Node
Vertigo | Grass Valley | On-air Graphics | · Real-time rendering | Single GPU Single Node
Virtuoso | Monarch | Virtual sets and motion graphics | · Real-time rendering | Single GPU Single Node
Viz Engine | vizrt | On-air graphics and virtual sets | · Real-time graphics rendering | Single GPU Single Node
Wasp3D - CG | Wasp3D | On-air graphics and virtual sets | · Real-time graphics rendering | Single GPU Single Node

ON-SET, REVIEW AND STEREO TOOLS
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING
4kScope | Drastic Technologies | 4kScope software provides a real time, professional quality signal analysis tool for on set, production, post production, and research and development environments. | · GPU accelerated effects and compute | Single GPU Single Node
8KScope | Drastic Technologies | Real time, professional quality signal analysis tool for on set, production, post production, and research and development environments. | · GPU-accelerated effects and compute |
Single GPU Single Node
Cortex Dailies | MTI Film | Review, color grading and transcoding on set | · CUDA accelerated grading and transcoding | Multi-GPU Single Node
Fluid 4K Review | BlueFish444 | Review and approval of 4K content | · Real-time video review | Single GPU Single Node
ICE | Marquise Technologies | IMF reference video player | · RAW data support for ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K and Canon C500 · HDR content encoded in Dolby Vision, HDR10, HDR10+ or HLG · Uncompressed formats support: DPX, TIFF and OpenEXR | Single GPU Single Node
Net-X-Code | Drastic Technologies | Net-X-Code is a distributed capture and conversion system: IP Capture, Control, Convert and Output for server level. | · GPU accelerated compute | Single GPU Single Node
NewBlue Stream | NewBlueFX | NewBlue Stream is a lightweight streaming and broadcast solution paired with dynamic, data-driven graphics | · GPU-accelerated processing, encoding and decoding | Single GPU Single Node
On-Set Dailies | Colorfront | Review, color grading and transcoding on set | · Real-time review · NV Video Codec encoding and transcoding | Multi-GPU Single Node
Previzion | Lightcraft | On-set virtual production | · Real-time, virtual set production | Single GPU Single Node
VideoQC | Drastic Technologies | videoQC is a suite of video and audio analysis and playback tools with both visual and automated quality checking tools. Takes the media coming into your facility and performs a series of automated tests on video, audio and metadata values against a template, then analyzes the audio and video. |
· GPU accelerated effects and compute | Single GPU Single Node

WEATHER GRAPHICS
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING
Max Weather | WSI | Weather graphics | · Real-time graphics | Single GPU Single Node
Metacast | ChyronHego | Weather graphics | · Real-time graphics | Single GPU Single Node
MeteoEarth | MeteoGraphics | Weather graphics | · Real-time graphics | Single GPU Single Node

Medical Imaging
APPLICATION NAME | COMPANY NAME
3D Slicer | 3D Slicer
aidoc | Aidoc Medical
AI-LAB | American College of Radiology
deepflow | Helmholtz Zentrum München
EBM AI Workflow | EBM Technologies
Ibex Decision Support | IBEX
PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING
3D Slicer is an open-source software platform for medical image informatics, image processing, and three-dimensional visualization. Slicer brings free, powerful cross-platform processing tools to physicians, researchers, and the general public. | · NVIDIA Clara AI-assisted Annotation · Supports multi organs, from head to toe · Multi-modality imaging (MRI, CT, US, nuclear medicine, and microscopy) · Bidirectional interface for devices | Single GPU Single Node
AI based decision support software analyzing medical imaging to provide solutions for detecting acute abnormalities across the body, helping radiologists prioritize life threatening cases and expedite patient care. Agnostic to PACS and RIS systems | · Classification and segmentation using deep learning on top of any PACS platform | Single GPU Single Node
ACR AI-LAB offers radiologists tools designed to help them learn the basics of AI and participate directly in the creation, validation and use of health care AI. It accelerates the development and adoption of artificial intelligence (AI) in clinical practice, empowering radiologists to create AI tools at their own institutions, to meet their own patient needs.
· AI models for diagnostic imaging · AI models tailored to their local patient population · Patient data protection | Single GPU Single Node
Deep learning tool for reconstructing cell cycle and disease progression from flow cytometry data. | · Tool will show that deep convolutional neural networks combined with nonlinear dimension reduction enable reconstructing biological processes based on raw image data · Tool will demonstrate this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy · In further analysis of Jurkat cells, tool will detect and separate a subpopulation of dead cells in an unsupervised manner · In classifying discrete cell cycle stages, tool will reach a sixfold reduction in error rate compared to a recent approach based on boosting on image features. In contrast to previous methods, deep learning based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer · Uses MXNet, cv2, numpy, python3 | Single GPU Single Node
EBM AI Workflow is a software platform for seamless data annotation, training, and advanced visualization and deployment of AI-based medical imaging applications. EBM AI workflow and NVIDIA Clara combine the power of AI and edge computing to retain critical processing tasks on devices at the point of care, enabling healthcare professionals, physicians and specialists to make instantaneous, life-saving predictions and emergency responses.
· Pre-trained models for inference and AI-assisted annotation · Automatic image analysis · EBM PACS viewer · FDA approved APP(UDE) · X Annotation APPs | Multi-GPU Multi-Node
IBEX runs DL on prostate cancer digital pathology to find any potential cancerous areas | · Combines data from digitized glass slides and electronic medical records to reveal underlying patterns · Extracts valuable clinical insights that can transform how pathology and oncology are practiced and propel them into the information age | Single GPU Single Node

iNtuition | Terarecon, Inc. | Intuition offers AI-driven advanced 3D and 4D medical imaging post-processing and visualization. | · Volumetric Navigation, CT and MRI Suites · Interventional Radiology · EVAR / TAVR Planning · Body Fusion · Maxillo-Facial · iGENTLE noise reduction · Lung / Liver Segmentation · Mitral Valve (TMVR) Workflow · Lung Density Analysis-II · Intuition AI Adapter · Eureka Clinical AI Platform framework · Explorer UX/UI, and AI algorithm runtime licenses | Multi-GPU Multi-Node
LVO | Viz.ai | Automatically identifies suspected LVOs on CTA imaging in your network and alerts your on-call stroke physician within minutes | · Real-Time Specialist Notifications · AI-Powered LVO Detection · Automated Maximum Intensity Projections (MIP) | Single GPU Single Node
MITK | German Cancer Research Center | Free open-source software system for development of interactive medical image processing software | · Interactive segmentation of slices in image volumes, including interactive region growing and easy correction, interpolation of missing slices, surface generation, and volumetry · Point-based registration of medical image volumes matches two images based on two corresponding sets of points; rigid registration of images by combination of the ITK registration objects (transforms, optimizers, metrics, etc.) · Measurement of distances and angles; volume visualization, GPU-based, with easy-to-modify transfer functions; movie generation (Windows only) · Deformable Registration | Single GPU Single Node
OHIF | Open Health Imaging Foundation | OHIF is a framework for building medical imaging web applications that uses React. The code is modular, using React components and a plug-in model, making it possible to add new tools and workflows into the basic viewer UI. | · Integrated AI-assisted Annotation with NVIDIA Clara Plugin · Retrieve and load medical images from most sources and formats · Render sets in 2D, 3D, and reconstructed representations · Allows for the manipulation, annotation, and serialization of observations · Supports internationalization, OpenID Connect, offline use, hotkeys | Single GPU Single Node
PowerGrid | University of Illinois Urbana-Champaign | Provides iterative non-Cartesian MRI reconstruction | · GPU accelerated implementations of the non-uniform FFT and Discrete Fourier Transform · MPI is used to enable using multiple GPUs in one or several machines · Iterative reconstruction using physics-based model to correct for unwanted effects, such as field inhomogeneity and patient motion | Multi-GPU Single Node
Proprio | Proprio | Proprio's multi-camera system, based on networked camera array, depth sensing, and light field, for surgeons to operate and access all the data they need. Offers training based on captured real cases in a safe and collaborative environment. | · CUDA | Single GPU Single Node

Rad AI Follow-up | RAD AI
Rad AI Impressions | RAD AI
RadiAnt | Medixant
Radiology Assist | Zebra Imaging
Radlogics Virtual Resident | RadLogics
Vitrea® | Vital Images
Rad AI provides communication and tracking of follow-up recommendations for incidentalomas (such as for pulmonary nodules and lung cancer screening programs) that are top of mind for improving patient safety.
Ensuring that these follow-ups are performed improves the overall quality of patient care and reduces patient morbidity/mortality, while creating new imaging revenue for the health system and generating value from additional downstream services. | · Communicates and tracks follow-up recommendations · Integrates with a health system's existing workflow · Appropriate follow-up imaging is performed on a timely basis | Multi-GPU Multi-Node
Rad AI automatically generates customized report impressions that save radiologists an average of more than 60 minutes per day. AI automatically generates report impressions, customized to each radiologist's exact language and style, for more than 90% of imaging modalities. | · Automatic report impressions · Customized to your language · Fleischner, Lung-RADS and TI-RADS · Seamless integration | Multi-GPU Multi-Node
RadiAnt DICOM Viewer provides basic tools for the manipulation and measurement of images | · Fluid zooming and panning, brightness and contrast adjustments, negative mode, preset window settings for Computed Tomography (lung, bone, etc.) · Ability to rotate (90, 180 degrees) or flip (horizontal and vertical) images; segment length; mean, minimum and maximum parameter values (e.g. density in Hounsfield Units in Computed Tomography) within circle/ellipse and its area; angle value (normal and Cobb angle) · Pen tool for freehand drawing | Single GPU Single Node
Receives imaging scans from various modalities and automatically analyzes them for a number of different clinical findings. Findings are provided in real time to radiologists or other physicians and hospital systems as needed. | · Classification and segmentation on top of any PACS platform | Single GPU Single Node
Software platform imports any DICOM-compatible study directly from the modality or the PACS.
The software platform provides APIs for image analysis algorithms to incorporate search, measurement, and other findings into the radiologist's existing PACS and reporting system as a preliminary report. | · Real time analytics on medical imaging | Single GPU Single Node
Vitrea provides advanced visualization tools to a range of medical specialists (including radiologists, cardiologists, oncologists and other specialists) so that they can visualize patient images and communicate with each other efficiently on a course of action. Vitrea is a crucial tool for clinical decision support and enabling physicians to communicate effectively about a common patient, and specialists rely on its detailed 2D, 3D and 4D images for confident analysis in critical scenarios. | · Interface designed for viewing in the reading room · Improved clinical outcomes with clinical workflows and partner applications · Increased efficiency with a consistent user interface and experience for all modalities · Easy to deploy thin client solution does not require specialized software to reside on client computers. | Multi-GPU Multi-Node

XNAT | Radiologics
xvision | Augmedics
XNAT is an open source imaging informatics platform developed by the Neuroinformatics Research Group at Washington University. It facilitates common management, productivity, and quality assurance tasks for imaging and associated data. XNAT is extensible and can be used to support a wide range of imaging-based projects.
· Upload data using DICOM image data and metadata · Organize and share data within user-defined projects securely · Visualize and download using an embedded medical image viewer that supports a number of common medical imaging formats · Secure and manage access to data using a tiered architecture · Search and explore large data sets and create and share customized search patterns · Process data using pipelines that allow for the programming and automation of complex workflows | Single GPU Single Node
Augmented reality guidance system for surgery, allows surgeons to see the patient's anatomy through skin and tissue as if they have 'x-ray vision' and to accurately guide instruments and implants during spine procedures | · Transparent AR Display · Tracking system | N/A

Oil and Gas
APPLICATION NAME | COMPANY NAME
6X | Ridgeway Kite
AISight for SCADA | BRS Labs
AxRTM | Acceleware
DecisionSpace | Halliburton (Landmark)
Echelon | Stone Ridge Technology
GeoDepth | Emerson
Geoteric | Geoteric
Graydient S (SCADA) | Giant Grey
HUESpace | Bluware
InsightEarth | CGG
Omega2 RTM | Schlumberger
PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING
Reservoir Simulation on Tesla | · CUDA Simulation Parallelization | Single GPU Single Node
Proactive integrity management and real-time precursor alerts for enhanced SCADA operations in oil and gas. | · 24/7 real-time analysis and alerting · Scales to thousands of sensors across remote and geographically dispersed locations · Historical analysis and trend reports | Multi-GPU Single Node
Reverse Time Migration Software | · CUDA accelerated libraries for building RTM software | Multi-GPU Multi-Node
E&P platform for geoscience, well planning, drilling and earth modeling. | · CUDA acceleration of fault extraction |
Multi-GPU Single Node Full featured reservoir simulator designed from inception for GPU (Supported features) · F ully GPU-accelerated reservoir model · D ual-perm, dual porosity, pressure varying perm and porosity · E clipse compatible input deck Multi-GPU Multi-Node Seismic Interpretation Suite · C UDA-accelerated RTM Multi-GPU Multi-Node Seismic interpretation · A ttributes calculations · G eobodies extraction Multi-GPU Single Node Machine learning anomaly detection for large scale industrial data. · P roactive integrity management and realtime precursor alerts for enhanced SCADA operations in oil and gas · 2 4/7 real-time analysis and alerting scaling to thousands of sensors across remote and geographically dispersed location Multi-GPU Single Node Library SDK toolkit for creating applications · C UDA acceleration for compression for seismic compression and seismic/ · L arge-scale visualization geospatial imaging and interpretation. Multi-GPU Single Node Seismic Interpretation Suite · O penCL acceleration for AFE · 3 D Curvature attributes Multi-GPU Single Node Seismic processing · M ultiple algorithms (RTM, etc) Multi-GPU Multi-Node hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 45 POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 | 45 4/5/21 10:18 AM PumaFlow IFP Roxar RMS RTM Seismic City RTM SKUA tNavigator VoxelGeo Beicip-Franlab Emerson Tsunami Seismic City Emerson Rock Flow Dynamics (RFD) Emerson Reservoir simulation · G PU-accelerated linear solver Reservoir modeling · M ulti GPU capabilities via HUEspace Seismic processing · R TM algorithm RTM Seismic Processing · C UDA acceleration Reservoir modeling · F aults, Horizons and Flow Simulation Grid tNavigator Solver is a software package, offered as a single executable, which allows to build static and dynamic reservoir models, run dynamic simulations, calculate PVT properties of fluids, build surface network model, calculate lifting tables, and perform extended uncertainty analysis as a part of one 
integrated workflow. Seismic Interpretation Package · CUDA · P ascal/Volta architecture · Multi-GPU · M ulti-GPU volume rendering · Horizon-flattening · A ttribute calculations Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Life Sciences BIOINFORMATICS APPLICATION NAME COMPANYNAME Arioc Johns Hopkins University AtacWorks NVIDIA BarraCUDA BEAGLE-lib University of Cambridge Metabolic Research Labs Open Source Campaign SimTK Clara Genomics Analysis NVIDIA CUDASW++ Open Source PRODUCT DESCRIPTION High-throughput read alignment with GPUaccelerated exploration of the seed-andextend search space. SUPPORTED FEATURES · S ingle-end alignment, paired-end alignment · O utput in SAM or database-ready binary formats · M ultiple GPU implementation AtacWorks is a deep learning toolkit for coverage track denoising and peak calling from low-coverage or low-quality ATAC-Seq data. · C overage track denoising · Retraining Sequence mapping software · A lignment of short sequencing reads · A lignment of indels with gap openings and extensions. GPU SCALING Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. Makes use of highly-parallel processors such as those in graphics cards (GPUs) found in many PCs. An open-source library of GPU-accelerated data clustering algorithms and tools. Clara Genomics Analysis is a GPUaccelerated library for biological sequence analysis. Open source software for Smith-Waterman protein database searches on GPUs. · E valuation of likelihood for sequence evolution on trees and Arbitrary models (e.g. nucleotide, amino acid, codon) · S peed-ups (over CPU only version): nucleotide model = up to 25x, codon model = up to 50x. 
· K-means · Kps-means · K-medoids · K-centers · H ierarchical clustering · S elf-organizing map · C UDA based libraries partial order alignment (cudapoa) · G obal aligner (cudaaligner) · M apper (cudamapper) · P arallel search of Smith-Waterman database. Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Single Node 46 | POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 46 4/5/21 10:18 AM CUSHAW f5c G-BLASTN GHOST-Z GPU GPU-Blast mCUDA-MEME MUMmer GPU NVBIO NVBowtie Parabricks PEANUT Racon Open Source Parallelized short read aligner · P arallel, accurate long read aligner for large genomes Multi-GPU Single Node University of New South Wales An optimised re-implementation of the call-methylation and eventalign modules in Nanopolish. Given a set of basecalled Nanopore reads and the raw signals, f5c call-methylation detects the methylated cytosine and f5c eventalign aligns raw nanopore DNA signals (events) to the basecalled read. f5c can optionally utilise NVIDIA graphics cards for acceleration. · M ethylated cytosine base and frequency detection · E vent alignment Single GPU Single Node Hong Kong Baptist University GPU-accelerated nucleotide alignment tool · B lastn and megablast modes of NCBI- based on the widely used NCBI-BLAST. BLAST Single GPU Single Node Akiyama_ Laboratory, Tokyo Institute of Technology Sequence homology search tool. · S hotgun Metagenome Analysis. Multi-GPU Multi-Node Carnegie Mellon Local search with fast k-tuple heuristic University · P rotein alignment according to BLASTP Single GPU Single Node Open Source Ultrafast scalable motif discovery algorithm · S calable motif discovery algorithm based based on MEME . 
on MEME Multi-GPU Single Node Open Source MUMmer GPU is a high-throughput local sequence alignment program · A ligns multiple query sequences against reference sequence in parallel Single GPU Single Node Open Source NVBIO is an open source C++ library of reusable components designed to accelerate bioinformatics applications using CUDA. · D ata structures, algorithms · U tility routines useful for building complex computational genomics applications on CPU-GPU systems Multi-GPU Single Node Open Source A largely complete implementation of the Bowtie2 aligner on top of NVBIO. · G ood coverage of Bowtie2 features · C omparable quality results Multi-GPU Single Node NVIDIA Parabricks provides 30-50 times faster secondary analysis of sequencer generated FASTQ files to variant call files (VCFs). Parabricks has accelerated the standard secondary analyses such as GATK4, Google's Deepvariant to generate equivalent results, while increasing throughput significantly. · B WA-mem, Star, haplotype caller, CNVKit, Mutect2, Deep Variant, ImportGVCF, Select Variants, Genotype GVCF, Mark, Sort, BQSR, Merge, VQSR, Variant Filtration, CNNScore, and many quality checking tools. Multi-GPU Single Node Open Source Read mapper for DNA or RNA sequence that reads to a known reference genome. · A chieves supreme sensitivity and speed compared to current state of the art · R eads mappers like BWA MEM, Bowtie2 and RazerS3 · P EANUT reports both only the best hits or all hits Single GPU Single Node University of Zagreb, Faculty of Electrical Engineering and Computing Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate genomic consensus which is of similar or better quality compared to the output generated by assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods. 
SUPPORTED FEATURES:
· It supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies. Racon can be used as a polishing tool after the assembly with either Illumina data or data produced by third-generation sequencing. The type of input data is automatically detected. Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will have an impact on parsing time).
· Racon can also be used as a read error-correction tool. In this scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps between reads, including dual overlaps.
GPU SCALING: Single GPU Single Node

APPLICATION NAME: racon-gpu
COMPANY NAME: Open Source
PRODUCT DESCRIPTION: Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate a genomic consensus of similar or better quality than the output of assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods. It supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies.
SUPPORTED FEATURES:
· Racon can be used as a polishing tool after the assembly with either Illumina data or data produced by third-generation sequencing
· The type of input data is automatically detected
· Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will have an impact on parsing time).
· Racon can also be used as a read error-correction tool. In this scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps between reads, including dual overlaps.
GPU SCALING: Single GPU Single Node

APPLICATION NAME: REACTA
COMPANY NAME: Open Source
PRODUCT DESCRIPTION: A modified version of GCTA with improved computational performance, support for Graphics Processing Units (GPUs), and additional features. The purpose of REACTA is to quantify the contribution of genetic variation to phenotypic variation for complex traits.
SUPPORTED FEATURES:
· GRM creation
· REML analysis
· Regional heritability (including multi-GPU)
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: SeqNFind
COMPANY NAME: Accelerated Technology Laboratories
PRODUCT DESCRIPTION: SeqNFind is a powerful tool suite that addresses the need for complete and accurate alignments of many small sequences against entire genomes, utilizing a unique hardware/software cluster system for facilitating bioinformatics research in next-generation sequencing and genomic comparisons.
SUPPORTED FEATURES:
· Hardware and software for reference assembly, BLAST, SW, HMM, de novo assembly
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: SOAP3
COMPANY NAME: Genomics
PRODUCT DESCRIPTION: GPU-based software for aligning short reads with a reference sequence. Finds all alignments with k mismatches, where k is chosen from 0 to 3.
SUPPORTED FEATURES:
· Short read alignment tool that is not heuristic based
· Reports all answers
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: SOAP3-dp
COMPANY NAME: The University of Hong Kong
PRODUCT DESCRIPTION: SOAP3-dp is an ultra-fast GPU-based tool for short read alignment via index-assisted dynamic programming.
SUPPORTED FEATURES:
· Burrows-Wheeler Transformation
· Dynamic programming
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: Synomics Studio
COMPANY NAME: Row Analytics
PRODUCT DESCRIPTION: Multi-omics biomarker network discovery and validation. Synomics Studio is a new, highly scalable analysis platform that enables researchers and clinicians to discover novel associations between multiple genotypic, phenotypic and clinical attributes of their patients and their disease risk/therapy responses.
SUPPORTED FEATURES:
· Multi-SNP association studies (GWAS studies with up to 30 SNPs/SNVs in combination)
· Configurable number of cycles of fully random permutation for validation of SNP networks
· Speed-up on GPU = 170x vs multicore CPU alone (further speed-up available on multi-GPU and NVLink devices)
· Representative performance for 15,000 case:controls, 200,000 SNPs
· 2 SNP associations found and validated in 12 mins on single 20 core IBM POWER8NVL with 4x Tesla P100 GPU
· 17 SNP associations found and validated in 6 days on single 20 core IBM POWER8NVL with 4x Tesla P100 GPU
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: UGene
COMPANY NAME: Unipro
PRODUCT DESCRIPTION: Open source Smith-Waterman for SSE/CUDA, suffix array based repeats finder and dotplot.
SUPPORTED FEATURES:
· Fast short read alignment
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: WideLM
COMPANY NAME: Open Source
PRODUCT DESCRIPTION: Fits numerous linear models to a fixed design and response.
SUPPORTED FEATURES:
· Parallel linear regression on multiple similarly-shaped models
GPU SCALING: Multi-GPU Single Node

MICROSCOPY

APPLICATION NAME: ANNA-PALM
COMPANY NAME: Institut Pasteur
PRODUCT DESCRIPTION: Accelerating Single Molecule Localization Microscopy with Deep Learning: ANNA-PALM is a computational method that can reconstruct super-resolution images from sparse single molecule localization data and/or widefield images. ANNA-PALM can produce high quality super-resolution images from data obtained in much shorter acquisition time than standard single molecule localization microscopy. By strongly reducing acquisition time, ANNA-PALM facilitates super-resolution imaging of large numbers of cells (high throughput imaging), large samples, and live cells.
SUPPORTED FEATURES:
· Uses a much smaller number of low resolution frames than other methods
· Processing by localization algorithms results in a sparse localization image using a neural network previously trained on conventional PALM images
· Inputs sparse image and outputs a super-resolution image
· Runs well on GPU due to acceleration available in TensorFlow
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Appion
COMPANY NAME: New York Structural Biology Center
PRODUCT DESCRIPTION: Appion is a "pipeline" for processing and analysis of EM images. Appion is integrated with Leginon data acquisition but can also be used stand-alone after uploading images (either digital or scanned micrographs) or particle stacks using a set of provided tools. Appion consists of a web-based user interface linked to a set of Python scripts that control several underlying integrated processing packages. All data input and output within Appion is managed using tightly integrated SQL databases. The goal is to have all control of the processing pipeline managed from a web-based user interface and all output from the processing presented using web-based viewing tools.
SUPPORTED FEATURES:
· The underlying packages integrated into Appion include MotionCor2, Gctf, EMAN, Spider, Frealign, Imagic, XMIPP, IMOD, ProTomo, ACE, CTFFind and CTFTilt, findEM, DogPicker, TiltPicker, RMeasure, EM-BFACTOR, and Chimera
GPU SCALING: Single GPU Single Node

APPLICATION NAME: BioEM
COMPANY NAME: Max Planck Institute
PRODUCT DESCRIPTION: GPU-accelerated computing of Bayesian inference of electron microscopy images.
SUPPORTED FEATURES:
· BioEM can use CUDA for the cross-correlation step, which essentially consists of an image multiplication in Fourier space and a Fourier back-transformation
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: crYOLO
COMPANY NAME: Max Planck Institute for Molecular Physiology
PRODUCT DESCRIPTION: Novel automated particle picking software based on the deep learning object detection system 'You Only Look Once' (YOLO). CrYOLO is available as a standalone program under http://sphire.mpg.de/ and will be part of the image processing workflow in SPHIRE.
SUPPORTED FEATURES:
· Part of the image processing workflow in SPHIRE
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: cryoSPARC
COMPANY NAME: cryoSPARC
PRODUCT DESCRIPTION: CryoSPARC is an easy-to-use software tool that enables rapid, unbiased structure discovery of proteins and molecular complexes from cryo-EM data.
SUPPORTED FEATURES:
· Ab-initio reconstruction
· Heterogeneous reconstruction
· High-speed and high-resolution refinement of 3D protein structures implemented on GPUs
· Multiple simultaneous jobs on multiple GPUs
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: Dynamo
COMPANY NAME: Center for Cellular Imaging and Nano Analytics (C-CINA), Biozentrum, University of Basel
PRODUCT DESCRIPTION: Dynamo is a software environment for subtomogram averaging of cryo-EM data.
SUPPORTED FEATURES:
· Dynamo provides workflows all the way from tomograms to averages and classes
· In a full workflow, you would organize tomograms in catalogues, use them to pick particles and create alignment and classification projects to be run on different computing environments
· Requires CUDA Toolkit version 7.5 or higher and a CUDA driver compatible with your actual GPU device
GPU SCALING: Single GPU Single Node

APPLICATION NAME: EMAN2
COMPANY NAME: Baylor College of Medicine
PRODUCT DESCRIPTION: EMAN2 is the successor to EMAN1. It is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes. EMAN's original purpose was performing single particle reconstructions (3-D volumetric models from 2-D cryo-EM images) at the highest possible resolution, but the suite now also offers support for single particle cryo-ET, and tools useful in many other subdisciplines such as helical reconstruction, 2-D crystallography and whole-cell tomography. Image processing in a suite like EMAN differs from consumer image processing packages like Photoshop in that pixels in images are represented as floating-point numbers rather than small (8-16 bit) integers. In addition, image compression is avoided entirely, and there is a focus on quantitative analysis rather than qualitative image display.
SUPPORTED FEATURES:
· All EMAN2 programs, including GUI programs, are written in the easy-to-learn Python scripting language. This permits knowledgeable end-users to customize any of the code with unprecedented ease. If you aren't an advanced user, you can still make use of the integrated GUI and all of EMAN2's command-line programs.
GPU SCALING: Single GPU Single Node

APPLICATION NAME: emClarity
COMPANY NAME: Benjamin Himes
PRODUCT DESCRIPTION: emClarity is a collection of GPU-accelerated software developed to enable determination of biological structures at resolutions better than 1 nm from heterogeneous specimens imaged by cryo-electron tomography.
SUPPORTED FEATURES:
· Subtomogram averaging
· Very high resolution single particle analysis
· Hybrid electron microscopy
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: Gautomatch
COMPANY NAME: MRC Laboratory of Molecular Biology
PRODUCT DESCRIPTION: Gautomatch is a GPU-accelerated program for accurate, fast, flexible and fully automatic particle picking from cryo-EM micrographs with or without templates.
SUPPORTED FEATURES:
· Fast: typically 1.5~2.0s with 15 templates, using a good GPU (e.g. GTX 980, Titan X)
· Fully automatic with a simple command on entire data sets
· Convenient and easy to use
· Flexible: with or without template, suitable for both basic and advanced users
· Compatible with Relion/EMAN
· Background correction: automatically corrects the gradient background that affects the picking
· Rejection of ice/carbon: automatically detects non-particle areas and rejects them
· Post-optimization: scripts available to refilter the coordinates after picking within seconds
· Accuracy: the user's satisfaction is the only 'gold standard' criterion
GPU SCALING: Single GPU Single Node

APPLICATION NAME: GCTF
COMPANY NAME: MRC Laboratory of Molecular Biology
PRODUCT DESCRIPTION: Corrects contrast transfer function effects in electron microscope optics
SUPPORTED FEATURES:
· CUDA
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Huygens
COMPANY NAME: Scientific Volume Imaging
PRODUCT DESCRIPTION: Huygens Products: Greatly improve your microscope images
SUPPORTED FEATURES:
· Deconvolution of volumetric images and time series from widefield, confocal, light sheet, super-resolution STED microscopes and more
· Chromatic aberration and cross-talk correction, image stabilization and stitching
· Visualization, tracking, colocalization and object analysis
· Multi-GPU and cluster support
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: IMOD
COMPANY NAME: University of Colorado
PRODUCT DESCRIPTION: IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections. Contains tools for assembling and aligning data within multiple types and sizes of image stacks, viewing 3-D data from any orientation, and modeling and display of the image files.
SUPPORTED FEATURES:
· ctfphaseflip: Corrects tilt series for microscope CTF by phase flipping
· gputilttest: Tests whether a GPU is reliable for computing reconstructions with the tilt program
· 3dmod: Model editing and image display program. 3dmod can display three-dimensional graphic data sets in many views simultaneously, can model these data sets, and can display models and graphic data in 3-D. The views include a slice through the 3D volume, a projection of a sub-volume and orthogonal views with contour overlays.
· xyzproj: Projects 3-dimensional data at a series of tilts around the X, Y, or Z axis
GPU SCALING: Single GPU Single Node

APPLICATION NAME: ITK
COMPANY NAME: Kitware
PRODUCT DESCRIPTION: The National Library of Medicine Insight Segmentation and Registration Toolkit (ITK), or Insight Toolkit, is an open-source, cross-platform C++ toolkit for segmentation and registration. Segmentation is the process of identifying and classifying data found in a digitally sampled representation. Typically the sampled representation is an image acquired from medical instrumentation such as CT or MRI scanners. Registration is the task of aligning or developing correspondences between data. For example, in the medical environment, a CT scan may be aligned with an MRI scan in order to combine the information contained in both.
SUPPORTED FEATURES:
· Library is used by Paraview, VTK, and many other software distributions
· Many capabilities for multi-dimensional image processing and extraction tools
· Most recent GPU acceleration: FFTs using cuFFT (cuFFTW) and matrix math accelerated through CUDA-enabled Eigen3
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Leginon
COMPANY NAME: New York Structural Biology Center
PRODUCT DESCRIPTION: Leginon is a system designed for automated collection of images from a transmission electron microscope.
SUPPORTED FEATURES:
· A Leginon application is an image acquisition process built from several smaller pieces called 'nodes'
· Nodes can be applications
· Some of these are GPU-accelerated applications such as Topaz, Relion, and MotionCor2
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Microvolution
COMPANY NAME: Microvolution
PRODUCT DESCRIPTION: Nearly instantaneous 3D deconvolution, up to 200 times faster.
SUPPORTED FEATURES:
· 3D deconvolution for fluorescence microscopy
· Written for use only on GPUs
· Multi-GPU support
GPU SCALING: Single GPU Single Node

APPLICATION NAME: MotionCor2
COMPANY NAME: UCSF
PRODUCT DESCRIPTION: A multi-GPU program that corrects beam-induced sample motion on dose-fractionated movie stacks. Implements a robust iterative alignment algorithm that delivers precise measurement and correction of both global and non-uniform local motions at single-pixel level across the whole frame. Suitable for both single-particle and tomographic images.
SUPPORTED FEATURES:
· Overall, MotionCor2 is extremely robust, and sufficiently accurate at correcting local motions that the very time-consuming and computationally intensive particle polishing in RELION can be skipped
· Importantly, works on a wide range of data sets, including cryo tomographic tilt series
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: PSSR
COMPANY NAME: Waitt Advanced Biophotonics Center Core
PRODUCT DESCRIPTION: Deep Learning-Based Point-Scanning Super-Resolution Imaging allows point-scanning super-resolution (PSSR) imaging and facilitates point-scanning image acquisition with otherwise unattainable resolution, speed, and sensitivity.
SUPPORTED FEATURES:
· Pre-trained models for:
· PSSR for Electron Microscopy (EM)
· PSSR single frame (PSSR-SF) for mitoTracker
· PSSR multiframe (PSSR-MF) for mitoTracker
· PSSR for neuronal mitochondria
GPU SCALING: Single GPU Single Node

APPLICATION NAME: RELION
COMPANY NAME: MRC Laboratory of Molecular Biology
PRODUCT DESCRIPTION: RELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryomicroscopy (cryo-EM).
SUPPORTED FEATURES:
· Image classification and high-resolution refinement accelerated up to 40-fold
· Template-based particle selection accelerated almost 1000-fold
· Reduced memory requirements
· High-resolution cryo-EM structure determination in a matter of days on a single workstation
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: Thunder
COMPANY NAME: Tsinghua University
PRODUCT DESCRIPTION: THUNDER is cryo-EM image processing software based on a particle-filter algorithm, used to analyze cryo-EM images in order to obtain a 3D model.
SUPPORTED FEATURES:
· Both image classification and high-resolution refinement accelerated up to 40-fold
· Template-based particle selection accelerated almost 1000-fold
· Reduced memory requirements
· High-resolution cryo-EM structure determination in a matter of days on a single workstation
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: Tomviz
COMPANY NAME: Kitware
PRODUCT DESCRIPTION: Tomviz enables 3D characterization of materials at the nano- and meso-scale, tailored for visualizing electron tomography data. It utilizes the large quantities of memory and processing resources required to render, manipulate, and analyze voluminous 3D tomograms.
SUPPORTED FEATURES:
· 3D tomographic data processing, visualization, and analysis
· Python
· Windows
· Mac OS
· Linux
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Topaz
COMPANY NAME: Tristan Bepler
PRODUCT DESCRIPTION: A pipeline for particle detection in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples.
SUPPORTED FEATURES:
· Deep learning for cryo-EM data particle picking
· Uses CUDA and PyTorch
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Warp
COMPANY NAME: Max Planck Institute for Biophysical Chemistry
PRODUCT DESCRIPTION: Warp integrates novel algorithms for frame alignment, defocus estimation, particle picking and tomographic reconstruction in a rich user interface. Enables data quality monitoring in real time and data analysis at microscope level, and obtains high-resolution structures before data collection is over.
SUPPORTED FEATURES:
· CUDA-enabled processing for electron microscopy
· TensorFlow (v1.10)
· CUDA kernels: backprojection, CTF, deconvolution, FFT, tomography refinement, and others
GPU SCALING: Single GPU Single Node

MOLECULAR DYNAMICS

APPLICATION NAME: ACEMD
COMPANY NAME: Acellera Ltd
PRODUCT DESCRIPTION: GPU simulation of molecular mechanics force fields, implicit and explicit solvent
SUPPORTED FEATURES:
· Written for use only on GPUs
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: AMBER
COMPANY NAME: University of California at San Francisco
PRODUCT DESCRIPTION: Suite of programs to simulate molecular dynamics on biomolecules.
SUPPORTED FEATURES:
· PMEMD explicit solvent and GB implicit solvent
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: CHARMM
COMPANY NAME: Harvard University
PRODUCT DESCRIPTION: MD package to simulate molecular dynamics on biomolecules.
SUPPORTED FEATURES:
· Implicit (5x)
· Explicit (2x)
· Solvent via OpenMM, now ported natively to GPUs
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: Colvars
COMPANY NAME: Temple University
PRODUCT DESCRIPTION: Software module for molecular simulation and analysis that provides a high-performance implementation of sampling algorithms defined on a reduced space of continuously differentiable functions (aka collective variables). The module itself implements a variety of functions and algorithms, including free-energy estimators based on thermodynamic forces, non-equilibrium work and probability distributions.
SUPPORTED FEATURES:
· LAMMPS, NAMD, VMD
· GPU support
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: Computational Crystallography Toolbox
COMPANY NAME: Lawrence Berkeley Laboratories
PRODUCT DESCRIPTION: Open source component of the PHENIX system to advance automation of macromolecular structure determination. Useful for small-molecule crystallography and even general scientific applications.
SUPPORTED FEATURES:
· GPU acceleration for scattering and general purpose math via CUDA and cuFFT
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: DeePMD-kit
COMPANY NAME: Princeton University
PRODUCT DESCRIPTION: DeePMD-kit is a package written in Python/C++, designed to minimize the effort required to build deep learning based models of interatomic potential energy and force fields and to perform molecular dynamics (MD). Addresses the accuracy-versus-efficiency dilemma in molecular simulations. Applications of DeePMD-kit span from finite molecules to extended systems and from metallic systems to chemically bonded systems.
SUPPORTED FEATURES:
· TensorFlow
· High-performance classical MD and quantum (path-integral) MD packages
· Deep Potential series models
· MPI and GPU support
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: DeepSite
COMPANY NAME: Acellera Ltd
PRODUCT DESCRIPTION: DeepSite is a protein binding pocket predictor based on deep neural networks. Allows you to upload your structure in PDB format, monitor the progress of your job and visualize the results with a modern WebGL viewer.
SUPPORTED FEATURES:
· Deep learning
· Machine learning
· Drug discovery in a web interface
GPU SCALING: Single GPU Single Node

APPLICATION NAME: DESMOND
COMPANY NAME: David E. Shaw Research
PRODUCT DESCRIPTION: High-speed molecular dynamics simulations of biological systems.
SUPPORTED FEATURES:
· The code uses novel parallel algorithms and numerical techniques to achieve high performance and accuracy
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: ESPResSo
COMPANY NAME: ESPResSo
PRODUCT DESCRIPTION: Highly versatile software package for performing and analyzing scientific molecular dynamics many-particle simulations of coarse-grained atomistic or bead-spring models as they are used in soft-matter research in physics, chemistry and molecular biology.
SUPPORTED FEATURES:
· Hydrodynamic / electrokinetic forces
· P3M electrostatics
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: FEP+
COMPANY NAME: Schrodinger, Inc.
PRODUCT DESCRIPTION: Molecular Dynamics (MD) and Free Energy Perturbation (FEP) calculations occur on time scales that are computationally demanding to simulate. A key factor in determining whether a simulation will take days, hours, or minutes to run is the hardware being used. The advent of GPU computing, however, has opened the door to a new world of computationally intensive simulations that would not have been possible even a few years ago. Desmond's high-performance Molecular Dynamics code, together with continuously improving computer hardware technologies, is helping scientists push the boundaries of discovery further than ever before. The promise of MD simulations to impact drug discovery has now been realized in FEP+, due to the confluence of hardware and software development along with the formulation of sufficiently accurate theoretical methods and models.
SUPPORTED FEATURES:
· Optimization of the FEP+ algorithm to take full advantage of the Desmond GPU MD engine, enabling 2 to 4 ligands to be scored per day on a multi-GPU server
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: Folding@Home
COMPANY NAME: Stanford University
PRODUCT DESCRIPTION: A distributed computing project that studies protein folding, misfolding, aggregation, and related diseases.
SUPPORTED FEATURES:
· Powerful distributed computing molecular dynamics system
· Implicit solvent and folding
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: Galamost
COMPANY NAME: CAS-CIAC
PRODUCT DESCRIPTION: GALAMOST is a project that employs high-performance computational techniques to accelerate molecular simulation by fully utilizing the computational power of NVIDIA GPUs. Enables the investigation of polymeric systems on a large temporal and spatial scale at a very low cost.
SUPPORTED FEATURES:
· Full molecular simulation on GPU
GPU SCALING: Multi-GPU Multi-Node

APPLICATION NAME: GALAMOST
COMPANY NAME: ChangChun CHINA
PRODUCT DESCRIPTION: GALAMOST is a package that employs high-performance computational techniques on many-core processors to accelerate molecular dynamics simulations. The package is written in CUDA and C++ specifically for running on NVIDIA GPUs and focuses on large-scale simulations of soft matter.
SUPPORTED FEATURES:
· General molecular dynamics
· Dissipative particle dynamics (DPD)
· Brownian dynamics (BD)
· Coarse-graining molecular dynamics (CGMD)
· Reaction model
· Anisotropic particle models
· MD-SCF
· DNA 3SPN model
· Rigid body method
· Stretching method
GPU SCALING: Single GPU Single Node

APPLICATION NAME: Genesis
COMPANY NAME: Diamond Visionics
PRODUCT DESCRIPTION: GenesisRTX is an advanced high-fidelity runtime rendering engine which eliminates the need for traditional off-line database compiling or formatting.
SUPPORTED FEATURES:
· Powerful parallelization for hybrid (CPU+GPU) systems
· Full electrostatics with PME
· Large (1-100 million atoms) biological systems
GPU SCALING: Multi-GPU Single Node

APPLICATION NAME: GENESIS
COMPANY NAME: RIKEN
PRODUCT DESCRIPTION: GENESIS (GENeralized-Ensemble SImulation System) is a software package for molecular dynamics simulations and trajectory analyses.
· P owerful parallelization for hybrid (CPU+GPU) systems · F ull electrostatics with PME · L arge (1-100 million atoms) biological systems Multi-GPU Single Node POPULAR GPUACCELERATED APPLICATIONS CATALOG | Apr21 | 53 hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 53 4/5/21 10:18 AM GPUgrid.net GROMACS HALMD HOOMD-Blue HTMD LAMMPS MELD MOLECULAR OPERATING ENVIRONMENT myPresto NAMD OpenMM PolyFTS SOP-GPU Acellera Ltd KTH Royal Institute of Technology HALMD University of Michigan Acellera Ltd Sandia National Lab University of Calgary Chemical Computing Group ULC N2PC/AIST/ JBIC, Japan University of Illinois at Champaign Urbana Stanford University University of California at Santa Barbara SOP-GPU A distributed computing project that uses GPUs for molecular simulations. Simulation of biochemical molecules with complicated bond interactions · H igh-performance all-atom biomolecular simulations · E xplicit solvent and binding · Implicit (5x) · E xplicit (2x) Solvent Multi-GPU Single Node Multi-GPU Single Node Large-scale simulations of simple and complex liquids. · S imple fluids and binary mixtures (pair potentials, high-precision NVE and NVT, dynamic correlations) Particle dynamics package written grounds · W ritten for use only on GPUs up for GPUs. High throughput molecular dynamics simulations. · A vailable via Conda and github · ACEMD · PMEMD · NAMD · GROMACS · AMBER · C HARMM force fields · A daptive sampling, Markov State Models, visualization, protein preparation and ligand parameterization Classical molecular dynamics package · Lennard-Jones · Gay-Berne · Tersoff OpenMM plugin written for GPUs. · Integrative approach to combine physics and information · O rders of magnitude faster protein folding than brute force MD Calculate and Analyze pH-Dependent Protein Properties. MOEsaic Session Sharing and Project Customization. Determine Conformation Population from NMR NOE Data Predict Relative Binding Energies with AMBER Thermodynamic Integration. 
· GPU Accelerated 3D Stereo Graphics · AMBER GPU accelerated support
Open Source Computational Drug Discovery Suite. · High performance virtual screening by MD binding · Free energy calculation.
Designed for high-performance simulation of large molecular systems. · Full electrostatics with PME and most simulation features · 100M atom capable
Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Single GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node
Library and application for molecular dynamics for HPC with GPUs. · Molecular Dynamics toolkit · Extensible and growing · Implicit and explicit solvent, custom forces Multi-GPU Single Node
Classical molecular simulation code for studying polymer self-assembly and thermodynamics. · Uses auxiliary fields as the fundamental simulation degrees of freedom · Uses cuFFT extensively (~80%) · CUDA code is ~20% · Multi CPU or single GPU per job · 1x = Ivy Bridge E5-2690 CPU all 10 cores · 3-8X on K40 or K80 (utilizing 1/2 of the K80) Single GPU Single Node
SOP-GPU package for the Self Organized Polymer Model fully implemented on a GPU. A scientific software package designed to perform Langevin Dynamics Simulations of the mechanical or thermal unfolding, and mechanical indentation of large biomolecular systems in the experimental subsecond (millisecond-to-second) timescale.
· Langevin dynamics simulations using the coarse-grained Self Organized Polymer (SOP) model · Multiple simulation trajectories can be performed simultaneously on a single GPU · Calpha and Calpha-Cbeta models · Simulations of protein forced unfolding · Novel simulations of nanoindentation in silico · Support for hydrodynamic interactions · Up to ~100 ms of simulation time per day · Systems of up to 1,000,000 amino acids (on GPUs with 6 GB or more memory) Single GPU Single Node

QUANTUM CHEMISTRY
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
Abinit ABINIT Finds the total energy, charge density and electronic structure of systems made of electrons and nuclei within DFT. · Local Hamiltonian · Non-local Hamiltonian · LOBPCG algorithm · Diagonalization/orthogonalization. Multi-GPU Single Node
ACES 4 University of Florida New SIA/aces4 development. A new super instruction architecture with interface applications for quantum chemistry (aces4). · Integrating scheduling GPU into SIAL programming language and SIP runtime environment Multi-GPU Single Node
ACES III University of Florida ACES III takes the best features of parallel implementations of quantum chemistry methods for electronic structure. · Integrating scheduling GPU into SIAL programming language and SIP runtime environment. Multi-GPU Multi-Node
ADF Software for Chemistry & Materials Density Functional Theory (DFT) software package that enables first-principles electronic structure calculations. · Geometry optimizations and frequency calculations with GGA functionals. Multi-GPU Single Node
BigDFT BigDFT Implements density functional theory by solving the Kohn-Sham equations describing the electrons in a material. · Daubechies wavelets Multi-GPU Multi-Node
BrianQC StreamNovation Ltd.
BrianQC is a software product in the field of quantum chemistry. It accelerates features of Q-Chem 5.0 or later. Optimized for simulating large molecules and tested up to 20,000 Cartesian Gaussian basis functions. Has full support of s, p, d, f and g-type orbitals. Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal, Volta) with double precision accuracy on 64-bit Linux operating systems. Speeds up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.) · The range of NVIDIA architectures supported by BrianQC has been expanded. In addition to GPUs powered by Kepler, Maxwell and Pascal, BrianQC now supports the NVIDIA Tesla V100 GPU as well · Compatible with features of Q-Chem 5.0 or later · Optimized for simulating large molecules · Tested up to 20,000 Cartesian Gaussian basis functions · Full support of s, p, d, f and g-type orbitals · Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal). Double precision accuracy · Runs on 64-bit Linux operating systems · Speeds up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.) Multi-GPU Single Node
CP2K CP2K Program to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems. · DBCSR (sparse matrix multiply library) Multi-GPU Multi-Node
GAMESS-UK Open Source The general purpose ab initio molecular electronic structure program for performing SCF-, DFT- and MCSCF-gradient calculations. · (ss|ss) type integrals within calculations using Hartree-Fock ab initio methods and density functional theory · Supports organics and inorganics.
Multi-GPU Multi-Node
GAMESS-US Ames Laboratory/Iowa State University Computational chemistry suite used to simulate atomic and molecular electronic structure. · Libqc with Rys Quadrature Algorithm · Hartree-Fock · MP2 and CCSD Multi-GPU Multi-Node
Gaussian Gaussian, Inc. Predicts energies, molecular structures, and vibrational frequencies of molecular systems. · Joint NVIDIA · PGI and Gaussian collaboration Multi-GPU Single Node
GPAW GPAW Real-space grid DFT code written in C and Python. · Electrostatic Poisson equation · Orthonormalizing of vectors · Residual minimization method (rmm-diis) Multi-GPU Multi-Node
gWL-LSMS ORNL Materials code for investigating the effects of temperature on magnetism. · Generalized Wang-Landau method Multi-GPU Multi-Node
LATTE Open Source Density matrix computations · cuBLAS · SP2 Algorithm Multi-GPU Single Node
libxc LSDalton MAPS MOLCAS MOPAC2012 NWChem NWChemEX Octopus PEtot TDDFT
LSDalton Scienomics MOLCAS MOPAC NWChem Pacific Northwest National Laboratories Harvard University Lawrence Berkeley Laboratories
Libxc is a library of exchange-correlation functionals for density-functional theory, providing a portable, well-tested and reliable set of exchange and correlation functionals that can be used by all the ETSF codes and also by other codes · GPU acceleration for quantum chemistry · LDA, GGA, hybrids and mGGA · Python 3 and C interfaces Multi-GPU Single Node
Linear-scaling HF and DFT code suitable for large molecular systems, now also with some CCSD capabilities. Tensor Algebra Library Routines for Shared Memory Systems, which is being used to GPU-accelerate three (3) CAAR codes: NWChem, LSDALTON and DIRAC.
· (T) correction to the CCSD energy · RI-MP2 energy/gradient (in development) · CCSD energy (in development) · GPU-based ERI generator (in development) Multi-GPU Single Node
MAPS CLASSICAL & MESOSCALE simulation toolkit contains world-class simulation engines such as LAMMPS, CHAMELEON, TOWHEE, NAMD. Includes a collection of ready-to-use workflows and a rich Force-Field library. · Typical calculations that can be executed include molecular dynamics simulations and Monte Carlo simulations, structure relaxation in periodic or molecular systems using both classical and quantum mechanics tools · Trajectory can be generated and then later analyzed using the appropriate tools · Additional simulations can be performed using PC-SAFT and related methods for thermodynamics modeling Single GPU Single Node
Methods for calculating general electronic structures in molecular systems in both ground and excited states. · cuBLAS Multi-GPU Single Node
Semiempirical Quantum Chemistry · Pseudodiagonalization · Full diagonalization · Density matrix assembling via Magma libraries Single GPU Single Node
NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters. · Triples part of Reg-CCSD(T) · CCSD and EOMCCSD task schedulers Multi-GPU Single Node
NWChemEx targets developing high-performance computational models for the production of advanced biofuels and other bioproducts · GPU acceleration · libraries like libxc Single GPU Single Node
Used for ab initio virtual experimentation and quantum chemistry calculations.
· Full GPU support for ground-state, real-time calculations · Kohn-Sham Hamiltonian · Orthogonalization · Subspace diagonalization · Poisson solver · Time propagation · DFT application Single GPU Single Node
First principles materials code that computes the behavior of the electron structures of materials. · Density functional theory (DFT) plane wave pseudopotential calculations Multi-GPU Single Node
QBox Q-CHEM QMCPACK Quantum Espresso QUICK RESCU RMG TAL-SH TeraChem VASP
University of California Davis Q-Chem Inc. QMCPACK Quantum Espresso Foundation Michigan State University Hongzhiwei technology North Carolina State University Oak Ridge National Lab PetaChem LLC University of Vienna
Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Designed for operation on large parallel computers. · The availability of double precision graphics cards provides an opportunity to speed up electronic structure computations. We modify the Qbox code to utilize Fermi GPUs on the Keeneland platform · We use the CUFFT library to speed up Fourier transforms and perform asynchronous communication to cut down the cost of data transfers · The modified code is used in simulations of a 64-molecule water system with an 85 Ry plane wave energy cutoff · Preliminary results show a 2-3 times speedup in the calculation of the charge density and in the application of the Hamiltonian operator to the wave function · We present these findings as well as further speedups measured in other parts of the code. http://eslab.ucdavis.edu/software/qbox http://keeneland.gatech.edu Single GPU Single Node
Computational chemistry package designed for HPC clusters. · Various features including RI-MP2
Single GPU Single Node
QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. · Main features Multi-GPU Multi-Node
An integrated suite of computer codes for electronic structure calculations and materials modeling at the nanoscale. · PWscf package: linear algebra (matrix multiply) · Explicit computational kernels · 3D FFTs Multi-GPU Multi-Node
QUICK is a GPU-enabled ab initio quantum chemistry software package. · Running Hartree-Fock and DFT energy on GPU · Supports s, p, d, f orbitals on energy calculation · HF gradient with s, p, d orbital support · GPU-based ERI generator Multi-GPU Single Node
RESCU is a KS-DFT calculation software that can study very large systems with only a small computer. Offers a new, powerful, parallel and highly efficient KS-DFT self-consistent calculation method. · Parallel high efficiency processing - KS-DFT Multi-GPU Single Node
RMG is a density functional theory (DFT) based electronic structure code that uses real space grids to represent wavefunctions, charge densities, and ionic potentials. Designed for scalability and runs successfully on systems with thousands of nodes (including GPU nodes) and hundreds of thousands of CPU cores. · Supports 10k+ GPU nodes · Multipetaflops capable · Handles thousands of atoms with full DFT precision · Supports multiple GPUs per node · Fully open source · Installation support · Cray XE6/XK7 Multi-GPU Single Node
Tensor Algebra Library Routines for Shared Memory Systems accelerates three (3) CAAR codes: NWChem, LSDALTON and DIRAC. · Tensor Algebra Library for Shared Memory Computers: Nodes equipped with multicore CPU, NVIDIA GPU, and Intel Xeon Phi (in progress) Multi-GPU Multi-Node
Quantum chemistry software designed to run on NVIDIA GPU.
· Full GPU-based solution; Performance compared to GAMESS CPU version Multi-GPU Single Node
Complex package for performing ab-initio quantum-mechanical molecular dynamics (MD) simulations using pseudopotentials or the projector-augmented wave method and a plane wave basis set · Blocked Davidson (ALGO = NORMAL & FAST) · RMM-DIIS (ALGO = VERYFAST & FAST) · K-Points and optimization for critical step in exact exchange calculations Multi-GPU Multi-Node

(MOLECULAR) VISUALIZATION AND DOCKING
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING
Amira Thermo Fisher Scientific A multifaceted software platform for visualizing, manipulating, and understanding Life Science and bio-medical data. · 3D visualization of volumetric data and surfaces Single GPU Single Node
AUTODOCK Scripps The AutoDock Suite is a growing collection of methods for computational docking and virtual screening, for use in structure-based drug discovery and exploration of the basic mechanisms of biomolecular structure and function. · OpenCL-accelerated version of AutoDock4.2.6 · AutoDock GPU · ADADELTA Single GPU Single Node
BINDSURF Universidad Catolica de Murcia A virtual screening methodology that uses GPUs to determine protein binding sites. · Allows fast processing of large ligand databases Single GPU Single Node
BUDE Bristol University Docking Station Molecular docking program · Empirical Free Energy Force field Single GPU Single Node
FastROCS OpenEye Scientific Software, Inc. Molecule shape comparison application · Real-time shape similarity searching/comparison Multi-GPU Multi-Node
Interactive Molecule Visualizer University of Illinois Experimental interactive molecule visualizer based on a ray-tracing engine.
· High quality images and ease of interaction · Latest GPU computing acceleration techniques · Natural user interfaces such as Kinect and Wiimotes Single GPU Single Node
MEGADOCK Akiyama Laboratory, Tokyo Institute of Technology MEGADOCK is fast protein-protein docking software for cases where more acceleration is demanded, such as an interactome prediction composed of millions of protein pairs. · MEGADOCK-GPU on 12 CPU cores · 3 GPU calculation speed 37.0 times faster than MEGADOCK on 1 CPU core · Novel docking software facilitating the application of docking techniques to assist large-scale protein interaction network analyses Multi-GPU Single Node
Molegro Virtual Docker 6 QIAGEN Method for performing high accuracy flexible molecular docking. · Energy grid computation · Pose evaluation · Guided differential evolution Single GPU Single Node
PIPER Protein Docking Boston University Protein-protein docking program · Molecule docking Single GPU Single Node
PyMol Schrodinger, Inc. User-sponsored molecular visualization system on an open-source foundation. · Lines: 460% increase · Cartoons: 1246% increase · Surface: 1746% increase · Spheres: 753% increase · Ribbon: 426% increase Single GPU Single Node
VEGA ZZ University of California, San Francisco Molecular Modeling Toolkit · Virtual logP · Molecular surface values Single GPU Single Node
VMD University of Illinois Visualization and analysis of large biomolecular systems in 3-D graphics. · High quality rendering · Large structures (100M atoms) · Analysis and visualization tasks · Multiple GPU support for display of molecular orbitals Multi-GPU Single Node

Research: Higher Education and Supercomputing
NUMERICAL ANALYTICS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION
ArrayFire ArrayFire ArrayFire helps organizations develop high-performance computing solutions on modern computational platforms. Specializes in machine learning and computer vision.
Uses CUDA and OpenCL programming, code acceleration and optimization, and software design. SUPPORTED FEATURES · Vector Algorithms · Image Processing · Computer Vision · Signal Processing · Linear Algebra · Statistics GPU SCALING Multi-GPU Single Node
Eigen Julia Mathematica MATLAB NMath Premium
PHYSICS APPLICATION NAME AWP BQCD CADISHI CASTRO Changa Chemora
Eigen Julia Computing Wolfram MathWorks NMath
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Julia delivers dramatic improvements in simplicity, speed, scalability, capacity, and productivity to solve massive computational problems quickly and accurately, making it the preferred language for big data analytics.
A symbolic technical computing language and development environment.
GPU acceleration for MATLAB (high-level technical computing language).
GPU-accelerated math and statistics for .NET; automatically detects the presence of a CUDA-enabled GPU at runtime and seamlessly redirects appropriate computations to it.
· CUDA enabled linear algebra · eigen solver, reduction, random, etc. · NVIDIA CUDA via Julia CUDA JIT plugin architecture · Parallelism and distributed computation · Lightweight "green" threading (coroutines) · Unicode, including but not limited to UTF-8 · Call C · Lisp-like macros and other metaprogramming facilities · Development environment for CUDA and OpenCL · GPU acceleration for Wolfram Finance Platform · Acceleration for 200+ most used MATLAB functions · Acceleration of more than 500 most parallelizable MATLAB functions · Accelerated Signal Processing toolkit · Accelerated Image Processing toolkit · Accelerated Communications Systems toolkit · Available via an NGC container · Automatically offloads computations to the GPU.
Single GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node
COMPANY NAME AWP USQCD Max Planck Institute CASTRO CHANGA CHEMORA
PRODUCT DESCRIPTION The Anelastic Wave Propagation code, AWP-ODC, independently simulates the dynamic rupture and wave propagation that occur during an earthquake. Dynamic rupture produces friction, traction, slip, and slip rate information on the fault. The moment function is constructed from this fault data and used to initialize wave propagation. SUPPORTED FEATURES · 3D Finite Difference Computation
Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations. · Wilson-clover fermion linear solver
CADISHI is a software package that enables scientists to compute (Euclidean) distance histograms efficiently. Any sets of objects that have 3D Cartesian coordinates may be used as input, for example, atoms in molecular dynamics datasets or galaxies in astrophysical contexts. · Highly tuned CPU and GPU kernels · Python engine for throughput computing
A multicomponent compressible hydrodynamic code for astrophysical flows including self-gravity, nuclear reactions and radiation. CASTRO uses an Eulerian grid and incorporates adaptive mesh refinement (AMR). · Gravitational Field Solver
Astrophysics code that performs collisionless N-body simulations, including cosmological simulations with periodic boundary conditions in comoving coordinates and simulations of isolated stellar systems. · Gravitational Model has been accelerated using CUDA
Chemora is a system for performing simulations of systems described by differential equations running on accelerated computational clusters.
· Chemora embeds the equations' computational kernels into dynamically compiled loop nests shaped for input size and GPU structure
GPU SCALING Single GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node Multi-GPU Single Node
Cholla Chroma CPS CPS (GRID) CST PARTICLE STUDIO GADGET GAMER GENE GPU-AH GPUwalls GTC
Cholla USQCD USQCD USQCD Dassault Systèmes SIMULIA Corp. Max Planck Institute Open Source GENE Universidade do Porto Universidade do Porto University of California Irvine (UC Irvine)
Computational Hydrodynamics On ParaLLel Architectures for Astrophysics · Models the Euler equations on a static mesh and evolves the fluid properties of thousands of cells simultaneously using GPUs · It can update over ten million cells per GPU-second while using an exact Riemann solver and PPM reconstruction, allowing computation of astrophysical simulations with physically interesting grid resolutions (>256^3) on a single device; calculations can be extended onto multiple devices with nearly ideal scaling beyond 64 GPUs
Lattice Quantum Chromodynamics (LQCD) · Wilson-clover fermions · Krylov solvers · Domain-decomposition
Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations. · Wilson, domain-wall and Möbius fermion linear solvers
CPS is developed for lattice QCD and written in C++, with some machine-specific assembly routines. It is being developed by members of Columbia University and Brookhaven National Laboratory. CPS consists of code to build a library which can be statically linked to your code to create an executable. CPS has optimized codes for QCDOC and IBM Blue Gene machines, and builds for scalar machines or parallel machines with QMP. · CUDA is supported · The GRID code from Edinburgh is currently being optimized.
Self-consistent simulation of charged particles in electromagnetic fields · Particle-in-Cell Solver
A code for cosmological simulations of structure formation. · MPI
A GPU-accelerated Adaptive Mesh Refinement Code for astrophysical applications. Currently the code solves the hydrodynamics with self-gravity. · Adaptive mesh refinement (AMR). Hydrodynamics with self-gravity · A variety of GPU-accelerated hydrodynamic and Poisson solvers · Hybrid OpenMP/MPI/GPU parallelization · Concurrent CPU/GPU execution for performance optimization. Hilbert space-filling curve for load balance
GENE (Gyrokinetic Electromagnetic Numerical Experiment) is an open source plasma microturbulence code which can be used to efficiently compute gyroradius-scale fluctuations and the resulting transport coefficients in magnetized fusion/astrophysical plasmas. · Basic Modeling
Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPU-AH simulates the evolution of a network of line-like topological defects - Abelian-Higgs cosmic strings - in a cosmic context. · Calculates average network density and velocity
Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPUwalls simulates the evolution of a network of the simplest topological defect - domain walls - in a cosmic context.
· Calculates average network density and velocity
Gyrokinetic Plasma Fusion for Modeling a Tokamak reactor · NVLINK
Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Multi-Node Multi-GPU Multi-Node Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Multi-Node Single GPU Single Node Single GPU Single Node Multi-GPU Multi-Node
GTC Irvine GTC-P HACC HAMR GPU MAESTRO MILC
University of California Irvine (UC Irvine) Princeton Plasma Physics Lab HACC HAMR MAESTRO USQCD
The gyrokinetic toroidal code (GTC) is a massively parallel, particle-in-cell code for turbulence simulation in support of the burning plasma experiment ITER, the crucial next step in the quest for fusion energy. GTC is the production code for the multi-institutional US Department of Energy (DOE) Scientific Discovery through Advanced Computing (SciDAC) project, GSEP Center (Gyrokinetic Simulation of Energetic Particle Turbulence and Transport), and the DOE INCITE project that was awarded 35M hours of CPU time for 2011. Currently maintained at UC Irvine, GTC was the first fusion code to reach the teraflop in production simulations, in 2001 on the Seaborg computer at NERSC, and the petaflop, in 2008 on the Jaguar computer at ORNL. GTC simulation of the turbulence self-regulation by zonal flows was published in a 1998 Science paper, which has received the most citations for any magnetic fusion research paper published since 1996. · PUSHe, Collision and Poisson Solver
A development code for optimization of plasma physics. Full science and data sets are included, but in a simplified form to allow performance testing and tuning. · Optimized with CUDA · OpenACC development underway
Simulates N-Body Astrophysics.
The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles. · This code has been optimized with CUDA and runs in full production mode
GPU accelerated General Relativistic Magneto Hydrodynamic application · Active galactic nuclei, assuming a radiatively inefficient sub-Eddington rate torus · Axisymmetric ideal MHD · Viscosity and resistivity through use of Riemann solver (HLL) · Density floors to mass load the jet · Uses grids that can resolve the substructure of the jet over 5 orders of magnitude
A low Mach number stellar hydrodynamics code that can be used to simulate long-time, low-speed flows that would be prohibitively expensive to model using traditional compressible code. · Gravitational Field Solver
Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the strong force to create larger particles like protons and neutrons.
· Staggered fermions · Krylov solvers · Gauge-link fattening
Multi-GPU Multi-Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node
NekCEM ORB5 OSIRIS PIConGPU PPM QUDA RAMSES samadii/sciv XGC
ANL EPFL UCLA Plasma Physics Group HZDR PPM USQCD CEA Metariver Technology PPPL
A high-fidelity, open-source electromagnetics solver based on spectral element and spectral element discontinuous Galerkin methods, written in Fortran and C.
ORB5 is a global, gyrokinetic, Lagrangian, Particle-In-Cell (PIC), finite element, electromagnetic model
Simulates Plasma Physics including Laser interaction
A relativistic Particle-in-Cell code that describes the dynamics of a plasma by computing the motion of electrons and ions subject to the Maxwell-Vlasov equation.
Piecewise parabolic method is a higher-order extension of Godunov's method which uses spatial interpolation and allows for a steeper representation of discontinuities, particularly contact discontinuities.
Library for Lattice QCD calculations using GPUs.
Simulates astrophysical problems on different scales (e.g. star formation, galaxy dynamics, cosmological structure formation).
Software for computing the flow field in high vacuum conditions using the DSMC (Direct Simulation Monte Carlo) method.
Simulates the interactions between gas and surface boundaries and the gas flow with molecular particles
Simulates edge effects for MHD plasma physics
· The OpenACC implementation covers all solution routines for the Maxwell equation solver in NekCEM, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest-neighbor flux exchanges Multi-GPU Multi-Node
· Plasma and background magnetic geometry · Axisymmetric ideal MHD equilibria · Computed with CHEASE code [9] kinetic electrons, or various approximate models: hybrid-trapped or adiabatic intra- and inter-species linearized collision operators electromagnetic perturbations, with the cancellation problem solved using enhanced control variates and a `pullback' scheme Multi-GPU Multi-Node
· 2 dimensions of the particle push have been optimized with CUDA · Additional optimization is being planned with OpenACC Multi-GPU Single Node
· Simulation of laser-particle acceleration and relativistic plasma physics Multi-GPU Multi-Node
· Turbulent, compressible mixing of gases in the context of stars near the ends of their lives and also in inertial confinement fusion Single GPU Single Node
· QUDA supports the following fermion formulations: Wilson, Wilson-clover, Twisted mass, Improved staggered (asqtad or HISQ) and Domain wall · GPU acceleration · Radiative transfer for reionization · Hydrodynamic solver using AMR Multi-GPU Single Node Multi-GPU Multi-Node
· DSMC simulator, gas dynamics solver · OLED & Semiconductor deposition and etching analysis, Vacuum field analysis · PDL (Pixel Define Layer) growth analysis · Deposition mask toolkits, Wall growth, Chemical reaction · The particle push portion has been optimized with CUDA and is being fully optimized with OpenACC and CUDA Multi-GPU Multi-Node Multi-GPU Multi-Node

SCIENTIFIC VISUALIZATION
APPLICATION NAME Animator COMPANY NAME GNS PRODUCT DESCRIPTION Industry proven, modern post-processing app for CAE
Ansys
EnSight ANSYS Industry proven post-processing app for CAE
FieldView IntelligentLight Visualization application for CFD
HVR (LCSE, U of Minnesota) University of Minnesota Interactive volume rendering application
IndeX NVIDIA Interactive distributed volumetric compute and visualization framework.
Inside Explorer Interspectral An interactive and intuitive software with volumetric rendering and 3D-visualization of real captured data.
SUPPORTED FEATURES · Rendering · Rendering · Ray tracing · Rendering · Volume rendering · Parallel distributed 3D rendering of dense or sparse volumes · Accurate ray casting or ray tracing at high resolution of full size datasets · Plug-in to ParaView also available. · vGPU
GPU SCALING Multi-GPU Single Node Multi-GPU Single Node Single GPU Single Node Multi-GPU Single Node Multi-GPU Multi-Node Single GPU Single Node
ParaView Pix4Dmapper SPECFEM3D Tecplot VisIt vl3 (Argonne National Lab)
Kitware Pix4D CIG Tecplot LLNL Argonne National Lab
Scalable data analysis and visualization application. One of the main vis tools at HPC sites. · Rendering and analysis tasks · Plugin for NVIDIA IndeX · OptiX rendering backend · CUDA accelerated filters (data transformation routines)
This professional photogrammetry software uses images to generate point clouds, digital surface and terrain models, orthomosaics, textured models and more. It is most often used by geospatial professionals such as surveyors and civil engineers. · GPU accelerated processing
There are two modules/apps in the SPECFEM family: GLOBE and CARTESIAN. The global model is the former Gordon Bell Awardee code. Used for global inversion. Also part of the CAAR effort (although that one is mostly focused on workflow, rather than the actual model).
The regional model is CARTESIAN and it is the app used for seismic simulations, earthquake models, submarine acoustics etc. In addition to being used as a community app, Specfem3D is also used as a proxy app for proprietary codes · OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library · Simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra (structured or not).
General purpose scientific visualization software for Aerodynamics, O&G, Internal Combustion and Geoscience applications · Rendering
Scalable data analysis and visualization application · Rendering and analysis tasks
Large dataset visualization in cosmology, astrophysics, and biosciences fields. · Volume rendering of particles
Multi-GPU Multi-Node, Single GPU Single Node, Multi-GPU Single Node, Single GPU Single Node, Multi-GPU Single Node, Multi-GPU Single Node

Smart Spaces
APPLICATION NAME AI-NVR COMPANY NAME IronYun PRODUCT DESCRIPTION Search in Video, Real time intrusion detection SUPPORTED FEATURES · Search amongst 1000s of videos for interesting activities or attributes. GPU SCALING Single GPU Single Node
Alert (Irvine Sensors): Alert provides people counting and intrusion detection · People counting · Intrusion detection
Single GPU Single Node
Arvas (VI Dimensions): ARVAS is an Intelligent Video Analytics solution that uses advanced statistical modelling based on deep machine learning technology to detect anomalies. This automated approach enables more accurate detection of complex risk patterns that would otherwise escape human analysts and cause high false-alarm rates. · Abnormality Detection Features - Break-ins, robbery, rioting, floods, accidents, fights, arson, fire, maintenance and vandalism.
Single GPU Single Node
BioSurveillance NEXT, BioFinder (Herta Security): Real time facial recognition and forensic alerts against multiple watchlists.
· Supports crowded scenes and difficult lighting · Faster than real-time analysis · Partial face concealment
Multi-GPU Single Node

Cezurity EVO (Cezurity)
Cylance (Cylance)
FaceControl (VOCORD)
Glueck Media; Glueck Analytics (Glueck)
Ikena Forensic, Ikena Spotlight (MotionDSP)
iMotionFocus (iCetana)
IZA500G On Edge Processing ALPR System (Inex/Zamir)
Nodeflux IVA (Nodeflux)
OpenALPR (OpenALPR)
Event Observer (EvO): engine for detecting malicious activity on user computers. Centralized detection engine; Event chains; Context; Real-time analysis - Cezurity Cloud: Cloud-based technology for detecting malware. Cezurity Cloud has the flexibility to fit into diverse solutions. Different information can be sent and processed by the server, depending on the needs of each product or solution. For example, Cezurity Cloud is currently used as a subsystem to supply data for the Cezurity EvO detection engine. Cezurity Cloud helps the Anti-Virus Scanner to detect malware. In addition, the technology is used for monitoring and analyzing changes in our APT-D solution designed to detect persistent threats against corporate networks. · CUDA
Multi-GPU Single Node
Advanced AI-based endpoint malware detection. · Endpoint malware detection solution · GPU deep learning technology
Multi-GPU Single Node
Detects and recognizes the faces of people freely passing by cameras, providing an instant alert for people on a watchlist; recognizes age and gender, counts people by faces, and tags newcomers and regular visitors. The system uses deep neural network algorithms and performs recognition with extremely high accuracy in field applications.
· Non-cooperative biometrical facial recognition system · ALPR · Video analytics and pattern recognition · Video processing and video enhancement
Multi-GPU Multi-Node
Deep Learning/Machine Learning based Computer Vision technology enabling understanding of how humans feel and perceive the environment around them, focusing on face and people analytics. · Facial Expression · Age Estimation · Gender · Ethnicity · Multi Face Tracking · Attention Time
Multi-GPU Single Node
Real-time (render-less) super-resolution-based video enhancement and redaction software for forensic analysts and law enforcement professionals. · Multi-filter, render-less video reconstruction (super-resolution, stabilization, light/color correction) · Automatic tracking for redaction video from body cameras, CCTV and other sources
Multi-GPU Single Node
Intelligent analysis of video on 1,000+ camera streams to significantly filter and reduce the camera streams requiring an operator view. · GPU accelerated machine learning · Identifies abnormal activity within video streams
Multi-GPU Single Node
The IZA500G with processing-on-edge combines two sensors (OV and LPR), a quad core processor, and ALPR software in a single housing, delivering crystal clear images, automatically recognized license plate data, GPS coordinates, and streaming video. · Operating Distance: 9-19 ft (3-6 m); 16-32 ft (5-10 m) · Vehicle Speed Range: 0-120 mph (0-193 km/h) · Field of View: 12 ft (3.66 m)
Single GPU Single Node
Nodeflux IVA products and services cover a wide range of sectors including but not limited to smart city, defense and security, traffic management, toll management, store analytics (wholesale and retail), asset and facilities management, advertising, and transportation.
· Face recognition · License plate recognition · Traffic violation detection · Traffic monitoring, and flood monitoring
Multi-GPU Single Node
Automatic license plate and vehicle make/model/year recognition software applied to video streams from IP cameras. · High accuracy license plate character recognition spanning North America, Europe, United Kingdom, Australia, Korea, Singapore and Brazil · APIs and source code available for embedded applications and web services
Multi-GPU Single Node

Operating Room Efficiency (Artisight, Inc.)
Patient Location Tracking (Artisight, Inc.)
Recotraffic; Recosecure; Recohospital (Recogine)
SenDISA Platform (Sensen Networks)
Syndex Pro (Briefcam)
Telemonitoring (Artisight, Inc.)
Artisight's Operating Room Efficiency solution improves operating room productivity with intelligent sensor network and machine learning algorithms. Delivers real-time access to the actionable data needed to improve your operating room productivity while ensuring HIPAA compliance. Deep-learning prediction helps reduce costs, improve productivity, increase profitability and provide clinicians with a safer, more efficient operating room environment. · Independently validated de-identification protocols · Machine learning algorithms · Intelligent cameras and Bluetooth sensors · Advanced interoperability · Highly granular analytics
Multi-GPU Multi-Node
An IoT sensor network for healthcare, Artisight's intelligent solutions improve organizational operations and financial performance. Designed by physicians, AI scientists and operational experts, Artisight's patient location tracking system uses data in a HIPAA-compliant platform to solve for the challenges of moving people efficiently around and through a hospital system.
· Independently validated de-identification protocols · Machine learning algorithms · Intelligent cameras and Bluetooth sensors · Advanced interoperability · Highly granular analytics
Multi-GPU Multi-Node
Intelligent Transportation Systems covering complex multi-modal surface transportation solutions at a regional, sub-regional, corridor and small area level using deep computer vision technologies. · Traffic Data Collection · Incident Detection · Integrated Management · Vehicle Classification and supporting related application
Multi-GPU Single Node
SenSen provides Video-IoT data analytic software solutions targeted at increasing revenue and reducing the cost of operations of customers. SenSen software can process and fuse data from cameras and other sensors like GPS, Radar, and Lidar in real time for parking guidance, parking enforcement, speed enforcement, traffic data analytics and road safety applications. Casinos use SenSen solutions for table game analytic solutions and customer analytics. SenSen solutions are also used in retail, security and tolling applications. · Intelligent Transportation - parking enforcement · Casino game table analytics
Single GPU Single Node
Improved security and operations by turning video data into useful information. Based on Video Synopsis technology, Syndex Pro allows users to review hours of video in minutes, while applying search filters for achieving accurate results and faster time-to-target. Data can be processed on-demand or in real time to support a wide range of use cases. · Review hours of video in minutes · Search in Video
Single GPU Single Node
Artisight's Telemonitoring solution uses a constellation of thousands of intelligent pan, tilt, and zoom cameras with two-way audio to allow for the simultaneous monitoring of multiple patients from a single workstation.
Provides constant visual and verbal contact with patients, while reducing personal protective equipment consumption, as well as front-line workers' exposure to the virus. · Monitor up to 12 patients per screen, with 6 screens per station · High definition 1080p video · 2-way audio with push-to-talk functionality · Intuitive on-screen controls for responsive pan, tilt, and zoom · Privacy screen for patient and staff autonomy
Multi-GPU Multi-Node

Telesitting (Artisight, Inc.): With Artisight's Intelligent Telesitting solution, your hospital can provide safe, accurate remote patient monitoring around the clock. Intelligent Telesitting allows a single staff member to remotely monitor multiple patients simultaneously, providing better oversight of each patient. Not only does this dramatically decrease staffing costs, it also provides more comprehensive information in real time to help avoid costly falls.
Tera, Tera+, Tera Vortex (SmartCow): Embedded and Backend video analytics for real-time insights from your security and service-related monitoring systems.
Thermal Screening (Artisight, Inc.): Thermal imaging eliminates the obstacles associated with manual screening and maintains the safety of your screening staff. Our thermal imaging camera can screen thousands of people every hour, and its flexible viewing options mean you'll spend less on staffing. It's easy to configure, requires minimal training for operation and is accurate to within +/-0.3 degrees Celsius.
XRVision, IoP (XRVision):
Face Recognition and Video Analytics for Uncontrolled, Crowded and In Motion Environments
· Machine learning algorithms that prevent falls and pressure ulcers · Automated bed capacity management and throughput coordination · Multiple video feeds on one screen, and multiple tabs per browser · Bi-directional audio with HD pan-tilt-zoom cameras · System available in mobile or fixed ceiling versions · Automatic number plate recognition · Traffic Management · Smart Car Parking Policy · Accident Detection · Dynamic temperature adjustment based on ambient humidity and temperature · Intuitive multi-touch and slider-based interface · Machine learning algorithms · Wi-Fi access gateway processes and broadcasts · Encrypted video feeds for enhanced stability, security, and privacy · Bluetooth integration for fully autonomous screening · Face Recognition and Video Analytics · Smart City, Public Safety, Transportation Analytics, Retail Analytics, Ordinance and Environment Safety
Multi-GPU Multi-Node, Multi-GPU Single Node, Multi-GPU Multi-Node, Multi-GPU Single Node

Tools and Management
APPLICATION NAME Acrobat COMPANY NAME Adobe PRODUCT DESCRIPTION Apps & web services to view, create, manipulate, print and manage files in PDF (Portable Document Format) SUPPORTED FEATURES · AI inference & training in the cloud GPU SCALING Single GPU Single Node
Altair Access (Altair): A simple, powerful, and consistent portal for submitting and monitoring jobs on remote clusters and clouds, and for remote visualization. Brings high-end 3D visualization datacenter hardware right to the user. · 3D Remote Visualization · High-fidelity collaboration · Integrated with Altair PBS Professional for scheduling and control on GPU use and accounting
Multi-GPU Multi-Node
Altair Grid Engine (Univa): Altair Grid Engine is a leading distributed resource management system for optimizing workloads and resources in thousands of data centers. Improves performance, productivity and efficiency.
Optimizes throughput and performance of applications, containers, and services while maximizing shared compute resources across on-premises, hybrid, and cloud infrastructures. · NVIDIA CUDA · OpenACC · OpenCL plus MPI hybrid apps · Optimizes scheduling with resource-mapped GPUs · Manages GPU apps within or without Docker containers · Obtain visibility with CUDA-specific metrics for GPU monitors and reports · Extend on-premise deployments to incorporate cloud-based GPU instances
Multi-GPU Multi-Node
Altair PBS Professional (Altair): PBS Professional is a fast, powerful workload manager designed to improve productivity, optimize utilization and efficiency, and simplify administration for clusters, clouds, and supercomputers. Supports everything from the biggest HPC workloads to millions of small, high-throughput jobs. PBS Professional automates job scheduling, management, monitoring, and reporting, and it's the trusted solution for complex Top500 systems as well as smaller clusters. · GPU auto discovery · Specify GPU count per CPU · Specify GPU type · GPU/CPU affinity · GPU awareness and equality in accounting, quotas, and fair share · GPU/CPU syntax/scheduling equivalence · Specify memory use per GPU · Add-on/integration project · NVIDIA Data Center GPU Management (DCGM) · Open source and commercial versions
Multi-GPU Multi-Node

Arm Forge (Arm (formerly Allinea))
Artec Leo (Artec 3D)
Build reliable and optimized code for the right results on multiple Server and HPC architectures, from the latest compilers and C++11 standards including NVIDIA GPU hardware. Arm Forge combines Arm DDT, the leading debugger for time-saving high performance application debugging, Arm MAP, the trusted performance profiler for invaluable optimization advice across native and Python HPC codes, and Arm Performance Reports for advanced reporting capabilities.
Arm Forge Professional (DDT & MAP) provides all you need to debug, profile and optimize for high performance, from single threads through to complex parallel HPC and scientific codes with MPI, OpenACC, OpenMP, threads or NVIDIA CUDA applications.
A smart 3D scanner that enables you to see your object projected in 3D directly on the HD display.
· Cross Platform: Moving to a new architecture or system is challenging enough without having to learn a new tool chain at the same time. Arm DDT, MAP and Performance Reports run everywhere - on your own laptop, the latest supercomputer, and tomorrow's upcoming architectures · Automatically detect memory bugs, profile behavior and see advanced performance metrics at all scales on Arm 64-bit, Intel Xeon, Intel Xeon Phi, NVIDIA GPUs, and OpenPOWER · Fast Debug: Arm DDT is the debugger of choice for developing C++, C or Fortran parallel and threaded applications on CPUs, GPUs and Intel Xeon Phi · Its powerful intuitive graphical interface helps you easily detect memory bugs and divergent behavior at all scales, making Arm DDT the number one debugger in research, industry and academia. · Low-overhead Profiling: Profile your code without distorting application behavior. Arm MAP is Arm Forge's scalable low-overhead profiler of C++, C, Fortran and Python with no instrumentation or code changes required. It helps developers accelerate their code by revealing the causes of slow performance · From multicore Linux workstations to the largest supercomputers, you can profile realistic test cases with typically less than 5% runtime overhead.
Multi-GPU Multi-Node
· Short Learning Curve: Arm DDT offers a powerful intuitive GUI that sets the standard for multi-process and multi-threaded debugging · Complex software debugging is made simple whether you're working on a PC or offline, with the help of zero-click variable comparisons, built-in memory debugging, and powerful array visualizations - for today's increasingly parallel processors, clusters, and supercomputers. · Wide Issue Coverage: Arm MAP exposes a wide set of performance indicators, including MPI metrics, PAPI counters, IO metrics, energy metrics and even your own custom metrics · Profile computation (with self and child and call tree representations over time), thread activity (to identify over-subscribed cores and sleeping threads that waste CPU time for OpenMP and pthreads), instruction types, as well as synchronization and I/O performance. · Single and Multi Threaded Profiling: Arm MAP profiles parallel, multithreaded, and single threaded C, C++, Fortran, F90 and Python codes, providing in-depth analysis and bottleneck pinpointing to the source line · Unlike most profilers, it can profile pthreads, OpenMP or MPI for parallel and threaded code, including communication and workload imbalance issues for MPI and multi-process codes
· JetPack · TX2
Single GPU Single Node

Bright Cluster Manager (Bright Computing)
CMake (Kitware)
ELPA (Max Planck Institute)
HPCToolkit (Rice University)
IBM Spectrum LSF (IBM Corporation)
Magma (ICL - University of Tennessee Knoxville)
Bright Cluster Manager lets you administer clusters as a single entity, provisioning the servers, GPUs, operating system, and workload manager from a unified interface.
CMake is a cross-platform build tool for controlling the software compilation process using simple platform- and compiler-independent configuration files.
Generates native makefiles and workspaces that can be used in the compiler of choice. Integrates with CDash to provide a comprehensive suite of tools.
The publicly available ELPA library provides highly efficient and highly scalable direct eigensolvers for symmetric matrices. Though especially designed for PetaFlop/s applications solving large problem sizes on massively parallel supercomputers, ELPA eigensolvers have also proven to be very efficient for smaller matrices.
HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. Provides support for analyzing a program's execution cost, inefficiency, and scaling characteristics both within and across nodes of a parallel system.
A comprehensive workload management solution that simplifies HPC with an enhanced user and administrator experience, reliability and performance at scale. Great for big data, cognitive, GPU machine learning and containerized workloads.
MAGMA provides a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems.
· Powerful Cluster Management Shell (CMSH) · NVIDIA libraries, CUDA, OpenCL, OpenACC, CUDA-aware libraries, NCCL, and CUB · Linux distributions: RHEL and derivatives, SUSE SLES and Ubuntu LTS · GPU-enabled Kubernetes and Singularity for running containers
Multi-GPU Multi-Node
· Color output for make · Progress output for make · Incremental linking support with vs 8, 9 and manifests · Supports out-of-tree builds · Auto-rerun of cmake if any cmake input files change (works with vs 8, 9 using ide macros) · Auto depend information for C++, C, and Fortran
Multi-GPU Multi-Node
· Improved one-step ScaLAPACK-type solver ELPA1 · Novel two-step solver ELPA2
Multi-GPU Multi-Node
· Coarse-grain mode: collect multiple metrics in a single run · GPU kernel metrics · Synchronization metrics · Memory copy metrics · Memory allocation metrics · Less than 2× overhead · Fine-grain mode: collect GPU PC samples · 8 PC sampling shortcomings · Introduces up to 20× overhead · Serialized GPU kernel executions
Multi-GPU Multi-Node
· Enforcement of GPU allocations via cgroups · Exclusive allocation and round robin shared mode allocation · CPU-GPU affinity · Boost control · Power management · Multi-Process Server (MPS) support · NVIDIA Volta and DCGM support
Multi-GPU Multi-Node
· Linear system solvers · Eigenvalue problem solvers · Auxiliary BLAS · Batched LA · Sparse LA · CPU/GPU Interface · Multiple precision support · Non-GPU-resident factorizations · Multicore and multi-GPU support · MAGMA Analytics/DNN · LAPACK testing · Linux · Windows · Mac OS · Support for NVIDIA A100, V100, T4, P100 GPUs
Multi-GPU Single Node

PAPI (ICL - University of Tennessee Knoxville)
Parallware Trainer (Appentra Solutions)
SLURM (SchedMD)
STRIVR (StriVR)
PAPI provides the tool designer and application engineer with a consistent interface and methodology for use
of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events. · Standard API on most modern microprocessors · Small set of registers that count Events · Events-monitoring · Correlation between source/object code and underlying architecture · Please refer to the PAPI News page for the latest on GPU support: https://icl.utk.edu/papi/news/index.html
Parallware Trainer is an interactive, real-time code editor with features that facilitate the learning, usage, and implementation of parallel programming by understanding how and why sections of code can be parallelized. Users are actively involved in learning parallel programming through observation, comparison, and hands-on experimentation. Parallware Trainer provides support for widely used parallel programming strategies using OpenMP and OpenACC with execution on multicore processors and GPUs. · Interactive, real-time editor GUI that shows you how and where to implement parallelism. · Assists in the parallelization of code using OpenMP and OpenACC. · Transparent, local/remote execution and benchmarking. · Support for the C programming language. Full Fortran support coming soon. · Detailed report of opportunities for parallelism discovered in your code. · Support for multiple compilers including GCC, Intel and PGI. · Benefits: · Faster, more effective learning. · Reduced learning curve. · All-in-one learning tool for parallel programming. · Immediate use of parallel programming. · Support for multicore processors and GPUs.
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
· GPU support · GPGPUs · Military grade security · Heterogeneous platform · Flexible plugin framework
STRIVR offers an end-to-end Immersive Learning platform that revolutionizes the way people and businesses train, learn, and perform. · VRWorks 360 Video
Multi-GPU Multi-Node, N/A, Multi-GPU Multi-Node, Single GPU Single Node

TAU - Tuning and Analysis Utilities (University of Oregon)
Torque / Moab (Adaptive Computing)
Totalview (Perforce)
Vampir (TU Dresden)
TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python. TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements as well as event-based sampling. All C++ language features are supported including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API. TAU's profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.
· Instrumentation · PerfDMF · Paraprof · Load Profiles · Metric Window · Thread Windows · Communication Matrix · 3D Visualization · Derived Metrics · Selective Instrumentation · PerfExplorer · Cluster Analysis · Correlation Analysis · Scalability Chart · Preset Charts · Custom Charts · Visualizations · Eclipse Introduction · Selective Instrumentation · Instrumenting Java · Configuration Manager
Multi-GPU Multi-Node
Moab HPC Suite is a workload and resource orchestration platform that automates the scheduling, managing, monitoring, and reporting of HPC workloads at massive scale. TORQUE provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project and incorporates the best of both community and professional development. · Requests and schedules GPUs based on GPU location in NUMA systems · Collects and reports metrics and status information · Sets GPU mode at job run time
Multi-GPU Multi-Node
TotalView is the leading dynamic analysis and debugging tool designed to handle complex CPU and GPU based multithreaded, multi-process and multi-node cluster applications. TotalView supports the latest CUDA SDKs, NVIDIA GPU hardware, Linux x86-64, Arm64, and OpenPower platforms and applications utilizing MPI and OpenMP technologies. · OpenACC directives · CUDA running directly on NVIDIA latest GPUs · Linux and GPU device thread visibility · CUDA function calls, host pinned memory regions and CUDA contexts · Handling CUDA functions inline and on the stack · Command line interface (CLI) commands for CUDA functions · MPI applications on CUDA-accelerated clusters
Multi-GPU Multi-Node
Easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.
· NVIDIA CUDA · CUPTI · CUDA libraries
Multi-GPU Multi-Node

Agriculture
APPLICATION NAME Taranis COMPANY NAME Taranis PRODUCT DESCRIPTION Taranis provides a platform for discovering various crop health issues, helping farmers take care of both land and crops and making sure they get the best of their yield. SUPPORTED FEATURES · Report plant population to farmers · Detect when a weed emerges in field and constitutes a potential threat · Calculate amounts of nutrients in vegetation, water content in the soil, plant temperature · Identify and categorize the top relevant diseases for prevalent crops GPU SCALING Multi-GPU Multi-Node

Business Process Optimization
APPLICATION NAME Automated checkout COMPANY NAME Focal Systems PRODUCT DESCRIPTION Focal's Product Recognition eliminates barcode scanning entirely at the cashier and achieves 99% accuracy on thousands of products. SUPPORTED FEATURES · cuDNN · TensorRT GPU SCALING Multi-GPU Single Node
DataX.AI (CrowdANALYTIX): Cloud-based crowd-sourced analytics services that create an online retail product catalog, on-boarding SKUs in minutes instead of manually tagging and providing product info, and removing the human error involved.
Helix (Maxerience): CPG product training platform: creates digital copies of products right at the production line in a matter of minutes, and creates an AI model in less than 30 minutes!
Part Finder Kiosk (Slyce):
A visual search and image recognition solution for retailers and brands
· cuDNN · TensorRT · Real time scan item and direct customer to item's location in store · Find a replacement or additional info · Feature Jetpack
Single GPU Single Node, Single GPU Single Node, Single GPU Single Node
Peak Trading Out Of Stock (BeMyEye): Out of Stock (OOS) and Almost OOS (AOOS) crowdsourcing solutions for retailers · Product recognition on the cloud
Single GPU Single Node
Perfect Shelf (BeMyEye): Track Hypermarkets, Supermarkets, Discounters, Managed Convenience and Chemists, using unique blend of IR technologies and crowdsourcing, to provide you with on-shelf sales fundamental data across an entire category · Real time inferencing on the cloud · SKU recognition
Single GPU Single Node
Predictive Pricing (Evo Pricing): Market-driven optimal prices based on demand, competition, product features and customer feedback
Third Wave Automation (Third Wave Automation): Automation cloud robotics and machine learning technology to material handling forklift automation in a warehouse
· GPU on the cloud · GeForce 2080 Ti
Multi-GPU Single Node, Single GPU Single Node

For more information on GPU-accelerated applications, please visit www.nvidia.com/teslaapps
© 2021 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, and CUDA are trademarks and/or registered trademarks of NVIDIA Corporation. All company and product names are trademarks or registered trademarks of the respective owners with which they are associated. Features, pricing, availability, and specifications are all subject to change without notice. Apr21