Particle-based Simulations on multi-GPU Systems (EnSight 2010, Metariver)
2013-11-20
Page Count: 36
Particle-based Simulations on multi-GPU Systems
(Particle-based analysis case studies using multi-GPU systems)
2010-10
coahn@metariver.kr
www.metariver.kr

GPU & CUDA Technology: What is GPU?
• Entertainment: Games, Movie, CG, VR, ...
• High Performance Computing: Engineering, Medicine, Science, Finance, Biology, ...

GPU & CUDA Technology
• A GPU (Graphics Processing Unit) is a processing unit dedicated to the graphics card, used to render complex 3D images to the screen very quickly.
• CUDA (Compute Unified Device Architecture) is a technology that enables high-speed computation on the GPU. It makes the fast shader language, originally written for graphics work, usable for scientific and engineering computation.
• CUDA uses the tens to hundreds of cores on a GPU simultaneously to process a large number of threads at high speed, and these cores can share common GPU memory.
• A single compute-dedicated GPU can deliver far faster computation, at lower cost, than conventional supercomputing (parallel processing), in which many CPUs are connected by a high-speed network to process a large calculation jointly.
• CUDA is NVIDIA's further extension of GPGPU (General Purpose GPU) technology, which made the shader language usable for general-purpose work. Since 2008, researchers in the United States, Europe, and elsewhere have continued to report successful applications.

Parallel Computing (HPC)
Multiple Instruction Multiple Data (conventional parallel processing)
Data (Problem) -> Domain Decomposition -> node0 ... node7 (proc.0 ... proc.7) -> Solution

GPU Computing (CUDA)
Single Instruction Multiple Data (CUDA technology)
[Figure: data (problem) mapped onto a grid of GPU threads, Thread(0,0) through Thread(5,5), running on GPU cores]
• GPU is a data-parallel processor
• Thousands of parallel threads / Thousands of data elements to process
• GeForce 8800 has 128 streaming processor cores and 512MB RAM
• Tesla C1060 has 240 streaming processor cores and 4GB RAM

GPU Architecture
[Figure: GPU architecture, © NVIDIA Corporation]

Inside CUDA kernel
[Figure: inside a CUDA kernel, © NVIDIA Corporation]

GPU vs. CPU
Wetting Simulation using SPH (Smoothed Particle Hydrodynamics)
• W, D : 50mm
• H : 100mm
• Dpore : 1.5mm
• N : 83,300 EA
[Figure: bar chart of Performance [kEA/s]; bar labels (x0.6), (x1.0), (x2.1), (x2.5), (x11.8), (x33.6), (x58.7)]

GPU vs. HPC
Lattice Boltzmann Method (CFD), lid-driven cavity flow
GPU (C870) 1EA vs. HPC (Opteron252)
Speed-up : s(n) = Ts / Tp
• Ts : WCT of serial code [sec.]
• Tp : WCT of parallel code [sec.]
• n : the number of processors
[Figure: speed-up versus mesh size; plotted labels 16 at 128x128, 16 at 256x256, 32 at 512x512, 128 (n) at 1024x1024]

HPC vs. multi-GPU
• HPC (Parallel Processing): MPI (Message Passing Interface)
• multi-GPU: four "Fast Algorithm + CUDA" units connected by MPI (Message Passing Interface)

Inside multi-GPU system
Domain Decomposition -> sub domains -> Fast Algorithm (one GPU per sub domain) -> Interaction -> Exchange 1 -> Exchange 2

SAMADII : Particle-based multi-physics solver
• H/W acceleration using multi-GPU (GPU cluster)
• S/W acceleration (Fast algorithm)
• Discrete Element Method (DEM)
• Magnetic particle, charged particle simulation
• Wetting simulation
• Smoothed-Particle Hydrodynamics (SPH) *
• Fluid-Solid Interaction (FSI) *
• Deformable body simulation *
* under development

SAMADII
Workflow: CAD DATA (3D CAD Assembly) -> MESH -> PRE -> SOLVER
• Inlet Surface: Selected, Boundary Condition
• Outlet Surface: Selected, Boundary Condition
• Body1: Surface Mesh, Motion Definition, Rotation
• Body2: Surface Mesh, Motion Definition, Rotation
• ...
• Particle Creation / Filtering -> Particle Data

SAMADII - GUI
• Graphics Engine : HyperCube4 (based on OpenGL & .NET, self-developed)
• Specially designed for high-speed rendering
• Modules: Material Property Setup, Mesh Converter, Particle Creator, GPU Configuration, Mesh Assembler, Particle Filter

Example-1 : Excavator
• Np : 80,000 EA
• Ne : 5,653 EA
• 500,000 steps
• Multi-CPU (HPC 16 core)
• 2010-02

Example-2 : Agitator
• Np : 30,000 EA
• Ne : 48,602 EA
• 120,000 steps
• Multi-GPU (Tesla C1060 x2)
• 2010-07

Example-3 : Blast Furnace
• Np : 640,000 EA
• Ne : 64,314 EA
• 250,000 steps
• Multi-GPU (Tesla C1060 x2)
• 2010-07

Visualization : HyperCube4 & EnSight

Sloshing
• Np : 248,845 EA
• Ne : 12,580 EA
• 1,000,000 steps
• Tesla C2050
• Views: < Only Vector >, < Vector & Transparent Particle >

2-R Auger
• D = 0.001
• Np : 219,391 EA
• Ne : 45,282 EA
• 700,000 steps
• Multi-GPU (Tesla C1060 x2)

Rotating Drum
• D = 0.01
• Np : 124,266 EA
• Ne : 6,048 EA
• 700,000 steps
• Tesla C2050
• Views: < Only Vector >, < Vector & Transparent Particle >
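
The "Inside multi-GPU system" slide names domain decomposition into sub domains followed by exchange steps, but does not say what is exchanged. As a rough illustration only, the Python sketch below assumes a 1D slab decomposition along x and assumes that the exchange copies "ghost" particles lying within an interaction cutoff of a neighbouring sub domain. The function names decompose and ghosts, the cutoff parameter, and the slab layout are all hypothetical and are not taken from SAMADII.

```python
from typing import List


def decompose(xs: List[float], x_min: float, x_max: float, n_sub: int) -> List[List[int]]:
    """Assign each particle index to one of n_sub equal-width slabs along x.

    Hypothetical helper: a real solver would decompose in 3D and balance load.
    """
    width = (x_max - x_min) / n_sub
    owned: List[List[int]] = [[] for _ in range(n_sub)]
    for i, x in enumerate(xs):
        # Clamp so that a particle exactly at x_max lands in the last slab.
        k = min(int((x - x_min) / width), n_sub - 1)
        owned[k].append(i)
    return owned


def ghosts(xs: List[float], x_min: float, x_max: float, n_sub: int,
           cutoff: float) -> List[List[int]]:
    """For each slab, list neighbour-owned particles within `cutoff` of its edges.

    Assumes cutoff <= slab width, so only adjacent slabs need to exchange data.
    """
    width = (x_max - x_min) / n_sub
    owned = decompose(xs, x_min, x_max, n_sub)
    ghost: List[List[int]] = [[] for _ in range(n_sub)]
    for k in range(n_sub):
        lo = x_min + k * width   # left edge of slab k
        hi = lo + width          # right edge of slab k
        if k > 0:                # particles just left of the left edge
            ghost[k] += [i for i in owned[k - 1] if xs[i] >= lo - cutoff]
        if k < n_sub - 1:        # particles just right of the right edge
            ghost[k] += [i for i in owned[k + 1] if xs[i] < hi + cutoff]
    return ghost


if __name__ == "__main__":
    xs = [0.05, 0.45, 0.55, 0.95]
    print(decompose(xs, 0.0, 1.0, 2))    # particle indices owned by each slab
    print(ghosts(xs, 0.0, 1.0, 2, 0.1))  # ghost indices each slab would receive
```

In a multi-GPU setting each slab would live on its own device, and the ghost lists would be the data sent between neighbours (for example over MPI) before each interaction step. Whether this corresponds to "Exchange 1" or "Exchange 2" on the slide is not stated in the source.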
Source Exif Data:
File Type : PDF
File Type Extension : pdf
MIME Type : application/pdf
PDF Version : 1.5
Linearized : Yes
Author : jyko
Create Date : 2010:10:28 14:00:01+09:00
Modify Date : 2010:10:28 14:00:01+09:00
XMP Toolkit : Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00
Format : application/pdf
Creator : jyko
Title : Microsoft PowerPoint - Particle-based Simulations on multi-GPU Systems_add_image_by_kwtech.pptx [Read-Only]
Creator Tool : PScript5.dll Version 5.2.2
Producer : Acrobat Distiller 9.0.0 (Windows)
Document ID : uuid:bcfaf112-1a58-40c3-b71d-07c437164d8f
Instance ID : uuid:dfe576ac-e87f-45e4-bacc-dbdc68956724
Page Count : 36