GPU Technology Conference

April 4-7, 2016 | Silicon Valley


S6583 - WetBrush: GPU-Based 3D Painting Simulation at the Bristle Level

Zhili Chen 3D Graphics Researcher, Adobe Research
Zhili Chen is a 3D graphics researcher at Adobe. He got his Ph.D. in computer science at The Ohio State University in 2015. His research interests include physically based simulation, real-time graphics, 3D reconstruction, and virtual reality.

We built a real-time oil painting system that simulates the physical interactions among brush, paint, and canvas at the bristle level entirely using CUDA. To simulate sub-pixel paint details given limited computational resources, we propose to define paint liquid in a hybrid fashion: the liquid close to the brush is modeled by particles, and the liquid away from the brush is modeled by a density field. Based on this representation, we develop a variety of techniques to ensure the performance and robustness of our simulator under large time steps, including brush and particle simulations in non-inertial frames, a fixed-point method for accelerating Jacobi iterations, and a new Eulerian-Lagrangian approach for simulating detailed liquid effects.
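
The fixed-point acceleration mentioned above builds on the plain Jacobi scheme. As a minimal illustration of the baseline iteration (not the authors' accelerated method), here is a Jacobi solve for a small diagonally dominant system:

```python
def jacobi(A, b, iters=100):
    """Plain Jacobi iteration: x_{k+1} = D^{-1} (b - (A - D) x_k)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        # Every component update depends only on the previous iterate,
        # which is what makes Jacobi attractive for one-thread-per-unknown GPU code.
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# Diagonally dominant 3x3 system with exact solution x = (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x = jacobi(A, b)
```

Because each component is updated independently from the previous iterate, the inner loop maps naturally onto one CUDA thread per unknown, which is why accelerating Jacobi's convergence rate (rather than replacing it with a sequential method) pays off on GPUs.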

Level: Intermediate
Type: Talk
Tags: Real-Time Graphics; Computational Fluid Dynamics

Day: Tuesday, 04/05
Time: 13:30 - 13:55
Location: Room 210E

S6215 - MBE: A GPU-Based Fast, Robust and Precise Solver for Chemical ODEs

Fan Feng Ph.D. Student, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Fan Feng is a Ph.D. student with the Supercomputer Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing.

Explore a GPU-based efficient algorithm for chemical ODEs, the core and most costly part of the atmospheric chemistry model in the CAS-ESM project. Chemical ODE systems are numerically difficult because of their stiffness, nonlinearity, and nonnegativity constraints. Traditional solvers, such as LSODE, are hard to parallelize because of their complicated control flow and coupling. In our experiments, we obtained a 3-5X speedup on the GPU when the same input is set on each node, which eliminates divergence in the kernel, but performance with real input was even worse than the serial code. We therefore developed a new solver, Modified Backward Euler (MBE). In our numerical experiments, MBE is shown to be faster and more precise than LSODE, and it is easy to parallelize, so we can expect a significant speedup on the GPU.
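
The abstract does not spell out MBE's details; as a generic sketch of the implicit time stepping it modifies, here is a standard backward Euler integrator with a per-step Newton solve, applied to a stiff scalar test problem (the function names and test problem are illustrative, not from the talk):

```python
import math

def backward_euler(f, dfdy, y0, t0, t1, n):
    """Implicit (backward) Euler: solve z = y_n + h*f(t_{n+1}, z) each step
    with a scalar Newton iteration. Stable even for very stiff problems."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        t += h
        z = y                         # initial Newton guess
        for _ in range(20):
            g = z - y - h * f(t, z)   # residual of the implicit equation
            z -= g / (1.0 - h * dfdy(t, z))
        y = z
    return y

# Stiff test problem: y' = -1000*(y - sin(t)) + cos(t), y(0) = 0,
# whose exact solution is y(t) = sin(t).
k = 1000.0
f = lambda t, y: -k * (y - math.sin(t)) + math.cos(t)
dfdy = lambda t, y: -k
y = backward_euler(f, dfdy, 0.0, 0.0, 1.0, 50)  # h = 0.02, far above the explicit stability limit
```

An explicit Euler step would require h < 2/k = 0.002 here just to remain stable; the implicit step tracks the smooth solution accurately at ten times that step size, which is the property that makes backward-Euler-type schemes attractive for stiff chemistry.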

Level: All
Type: Talk
Tags: Earth System Modelling; Computational Fluid Dynamics; Algorithms

Day: Tuesday, 04/05
Time: 14:30 - 14:55
Location: Room 211A

S6225 - Efficient Utilization of Large-Scale Heterogeneous Systems Using the Uintah Computational Framework

Alan Humphrey Software Developer and Ph.D. Student, Scientific Computing and Imaging Institute, University of Utah
Alan Humphrey is a software developer at the Scientific Computing and Imaging Institute and a Ph.D. student at the University of Utah, where he works with Dr. Martin Berzins on improving the performance and scalability of the Uintah Computational Framework. Alan has been primarily involved in extending Uintah to run on hybrid CPU/GPU systems, first with the development of Uintah's prototype CPU-GPU task scheduler and most recently with Uintah's Unified multi-threaded heterogeneous task scheduler and runtime system, which allows Uintah to dynamically dispatch computational tasks to both CPU cores and available GPUs on-node. Much of Alan's past research focused on formal verification of concurrent systems, specifically the Message Passing Interface (MPI) and dynamic verification tools like In-situ Partial Order (University of Utah) and its integration within the Eclipse Parallel Tools Platform (PTP). Alan was also involved with the Eclipse PTP project from 2009 to 2015.

We'll discuss how directed acyclic graph (DAG) approaches provide a powerful abstraction for solving challenging engineering problems, and how, using this abstraction, computational frameworks such as Uintah can be extended with relative ease to efficiently leverage GPUs, even at scale. Attendees will learn how frameworks like Uintah are able to shield the application developer from the complexities of the deep memory hierarchies and multiple levels of parallelism found in heterogeneous supercomputers. Attendees will also see how Uintah applications can be made to utilize thousands of GPUs within a single simulation, as shown by recent results for a GPU-based radiation model that achieves excellent strong scaling to 16,384 GPUs on the DOE Titan system.
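
At its core, the DAG abstraction described here reduces to a topological dispatch loop: a task becomes runnable once all of its dependencies have completed. A minimal serial sketch (the task names are hypothetical and this is not Uintah's API):

```python
from collections import deque

def topo_schedule(tasks, deps):
    """Kahn's algorithm: emit tasks in an order where every task runs
    only after its prerequisites. deps maps task -> list of prerequisites."""
    indeg = {t: 0 for t in tasks}
    children = {t: [] for t in tasks}
    for t, prereqs in deps.items():
        for p in prereqs:
            indeg[t] += 1
            children[p].append(t)
    ready = deque(t for t in tasks if indeg[t] == 0)
    order = []
    while ready:
        t = ready.popleft()        # a runtime would dispatch t to a CPU core or GPU here
        order.append(t)
        for c in children[t]:
            indeg[c] -= 1
            if indeg[c] == 0:      # last dependency finished; c is now runnable
                ready.append(c)
    return order

# Toy timestep graph: interpolate -> (flux, source) -> update
order = topo_schedule(
    ["interpolate", "flux", "source", "update"],
    {"flux": ["interpolate"], "source": ["interpolate"],
     "update": ["flux", "source"]})
```

A heterogeneous runtime like the one the talk describes extends this loop by keeping separate ready queues per device and overlapping host-device data movement with execution; the dependency bookkeeping stays the same.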

Level: All
Type: Talk
Tags: Supercomputing & HPC; Computational Fluid Dynamics

Day: Wednesday, 04/06
Time: 10:30 - 10:55
Location: Room 211A

S6199 - Raytracing Scientific Data in NVIDIA OptiX™ with GVDB Sparse Volumes

Rama Hoetzlein Graphics Research Engineer, NVIDIA
Rama Hoetzlein's current research at NVIDIA explores data structures for large-scale simulation and volume rendering. Rama completed a dual degree in computer science and fine arts at Cornell in 2001, with research in robotics and imaging. His 2010 dissertation at the University of California, Santa Barbara, focused on tools for creative interaction in procedural modeling for media artists. In 2010, Rama was also co-director and lead scientist of the Transliteracies project in the Digital Humanities, and professor of media studies in the Medialogy program in Copenhagen with a focus on visual effects and animation.
Tom Fogal Senior Software Engineer, NVIDIA
Thomas Fogal is an NVIDIA engineer specializing in HPC visualization. As a doctoral student, he worked on parallel volume rendering techniques as well as novel approaches to in situ visualization. At the Scientific Computing & Imaging Institute, ORNL, and LLNL, he worked on parallel rendering for large scientific data. Thomas holds a B.S. and M.S. from the University of New Hampshire, and will soon have a doctorate from the University of Duisburg-Essen in Germany.

We present a novel technique for visualizing scientific data with compute operators and multi-scatter ray tracing entirely on the GPU. Our source data consists of a high-resolution simulation using point-based wavelets, a representation not supported by existing tools. To visualize this data and support dynamic, time-based rendering, we take inspiration from OpenVDB, used in motion pictures, which employs a hierarchy of grids similar to AMR. We develop GVDB, a ground-up implementation with tree traversal, compute, and ray tracing via OptiX, all on the GPU. GVDB enables multi-scatter rendering at 200 million rays/sec and full-volume compute operations in a few milliseconds on datasets up to 4,200^3, entirely in GPU memory.
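
GVDB itself is a GPU-resident grid hierarchy; as a loose CPU-side analogy of the brick-based sparse storage idea (not GVDB's API or data layout), consider a grid that only allocates the bricks that are actually written:

```python
class SparseVolume:
    """Toy two-level sparse voxel grid: dense 4x4x4 bricks stored in a hash
    map, so empty space costs no memory. A VDB-style structure generalizes
    this to a deeper tree with GPU-friendly layouts."""

    def __init__(self, brick=4):
        self.brick = brick
        self.bricks = {}  # (bx, by, bz) -> flat list of brick**3 voxel values

    def set(self, x, y, z, value):
        b = self.brick
        key = (x // b, y // b, z // b)          # which brick
        data = self.bricks.setdefault(key, [0.0] * b**3)
        data[(x % b) * b * b + (y % b) * b + (z % b)] = value

    def get(self, x, y, z):
        b = self.brick
        data = self.bricks.get((x // b, y // b, z // b))
        if data is None:
            return 0.0                          # unallocated region: background value
        return data[(x % b) * b * b + (y % b) * b + (z % b)]

vol = SparseVolume()
vol.set(10, 2, 7, 3.5)   # touches exactly one brick; the rest of space stays free
```

Ray tracing such a structure amounts to stepping a ray brick-by-brick and skipping unallocated keys entirely, which is where sparse volumes win over dense grids for large, mostly empty domains.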

Level: All
Type: Talk
Tags: In-Situ and Scientific Visualization; Rendering & Ray Tracing; Computational Fluid Dynamics

Day: Wednesday, 04/06
Time: 16:00 - 16:50
Location: Room LL21D

S6195 - Burning on the GPU: Fast and Accurate Chemical Kinetics

Russell Whitesides Member of Technical Staff, Lawrence Livermore National Laboratory
Dr. Russell Whitesides has applied his theoretical and applied knowledge of chemical kinetics and scientific computing platforms towards internal combustion engine simulations with the goal of highly efficient, clean-combustion engines for transportation. Russell has pursued a variety of topics in mechanical engineering R&D in the course of his academic and research career. Since joining Lawrence Livermore National Laboratory, he has worked alongside the Methods Development Group at LLNL to enhance the capabilities and interoperability of scalable structural mechanics codes. His doctoral thesis focused on the atomistic chemical mechanisms of soot particle growth in combustion environments.

Come learn about our latest developments in accelerating combustion kinetics for computational fluid dynamics (CFD). We have extended our previously presented CUDA implementation to improve performance and will present multiple examples of improved solver performance. We will discuss the merits of our approach in comparison to related approaches and provide insight into the lessons we've learned along the way.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Performance Optimization; Computational Physics; Aerospace & Defense

Day: Thursday, 04/07
Time: 09:00 - 09:25
Location: Marriott Salon 1

S6347 - Multi GPU, Interactive 3D Simulator for the Lattice Boltzmann Immersed Boundary Method

Bob Zigon Senior Staff Research Engineer, Beckman Coulter
Highly-Rated Speaker
Bob Zigon is a senior staff research engineer and has worked at Beckman Coulter for 13 years. He has degrees in computer science and mathematics from Purdue University. He was the architect of Kaluza, an NVIDIA Tesla-powered analysis application for flow cytometry. He's now researching how machine learning techniques can be applied to laboratory automation. His interests include high performance computing, numerical analysis, machine learning, and software development for life science.

The Lattice Boltzmann Immersed Boundary method is a technique in computational fluid dynamics used to model and simulate fluid-structure-interaction problems. The goal of this session is to demonstrate practical strategies for partitioning the computations across Tesla K40 cards while exploiting the programmable pipeline inside of a Quadro K5000 to visualize the 3D flow fields at interactive rates. GPU results will be compared with an OpenMP implementation. Full source code will be provided.
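
For readers unfamiliar with the method's inner loop, a lattice Boltzmann update is a purely local BGK collision followed by streaming populations to neighboring sites. A serial 1D (D1Q3) sketch, far simpler than the talk's multi-GPU 3D implementation but structurally the same:

```python
def lbm_step(f, omega=1.0):
    """One BGK collide-and-stream step on a 1D three-velocity (D1Q3) lattice.
    f is a list of [f_minus, f_zero, f_plus] per site; periodic boundary."""
    w = (1/6, 2/3, 1/6)   # lattice weights (sound speed c_s^2 = 1/3)
    c = (-1, 0, 1)        # discrete velocities
    n = len(f)
    # Collision: relax each site toward its local equilibrium distribution.
    for i in range(n):
        rho = sum(f[i])
        u = sum(cq * fq for cq, fq in zip(c, f[i])) / rho
        for q in range(3):
            feq = w[q] * rho * (1 + 3*c[q]*u + 4.5*(c[q]*u)**2 - 1.5*u*u)
            f[i][q] += omega * (feq - f[i][q])
    # Streaming: advect each population to its neighbor (periodic wrap).
    out = [[0.0] * 3 for _ in range(n)]
    for i in range(n):
        out[(i - 1) % n][0] = f[i][0]
        out[i][1] = f[i][1]
        out[(i + 1) % n][2] = f[i][2]
    return out

# Density bump on a periodic 8-site lattice; mass is conserved by construction.
f = [[1/6, 2/3, 1/6] for _ in range(8)]
f[0] = [0.2, 0.8, 0.2]
for _ in range(10):
    f = lbm_step(f)
mass = sum(sum(site) for site in f)
```

Because both phases are local (collision touches one site, streaming one neighbor), the method partitions cleanly across multiple GPUs with only thin halo exchanges at subdomain boundaries, which is what makes the multi-GPU strategies in this session practical.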

Level: All
Type: Talk
Tags: Computational Fluid Dynamics; Computational Physics

Day: Thursday, 04/07
Time: 09:00 - 09:25
Location: Marriott Salon 1

S6134 - High Performance and Productivity with Unified Memory and OpenACC: A LBM Case Study

Jiri Kraus Compute Devtech Software Engineer, NVIDIA
Highly-Rated Speaker
Jiri Kraus is a senior developer in NVIDIA's European Developer Technology team. As a consultant for GPU HPC applications at the NVIDIA Julich Applications Lab, Jiri collaborates with local developers and scientists at the Julich Supercomputing Centre and the Forschungszentrum Julich. Before joining NVIDIA, he worked on the parallelization and optimization of scientific and technical applications for clusters of multicore CPUs and GPUs at Fraunhofer SCAI in St. Augustin. He holds a diploma in mathematics from the University of Cologne, Germany.

Learn how to use unified memory to improve your productivity when accelerating applications with OpenACC. Using a Lattice Boltzmann CFD solver as an example, we'll explain how a profile-driven approach allows one to incrementally accelerate an application with OpenACC and unified memory. Besides the productivity gain, a primary advantage of this approach is that it is accessible even to developers who are new to a project and therefore unfamiliar with the whole code base.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; Tools & Libraries; OpenACC; Aerospace & Defense

Day: Thursday, 04/07
Time: 10:00 - 10:25
Location: Marriott Salon 1

S6302 - Two-Level Parallelization and CPU-GPU Hybrid Large Scale Discrete Element Simulation

Ji Xu Associate Professor, Institute of Process Engineering (IPE), Chinese Academy of Sciences (CAS)
Ji Xu is an associate professor at the Institute of Process Engineering, Chinese Academy of Sciences, where he also received his Ph.D. in chemical engineering. His interests include the mechanism of complex particle systems with discrete simulation methods, such as molecular dynamics and discrete element method, including large-scale, high-performance algorithm design and software development for hybrid CPU-GPU computing.

Learn how to develop discrete element method algorithms that efficiently simulate large numbers of particles in complex-shaped systems on many GPUs through: (1) a two-level domain decomposition parallel algorithm for multiple GPUs; (2) faster particle-particle collision algorithms for a single GPU; and (3) overlapping communication and computation for efficient parallel computing. Scientific and industrial applications are given for single-phase (particle-only) and multiphase flow systems, such as granular flows, particle mixing, gas-solid fluidization, and liquid-solid flow in stirred tanks.
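
Fast particle-particle collision detection of the kind item (2) refers to typically rests on spatial hashing: only particles in the same or adjacent cells can touch. A serial 2D sketch of that cell-list idea (a GPU version parallelizes the per-cell loops; this is not the speakers' code):

```python
from itertools import product
from math import floor

def build_cells(positions, cell):
    """Hash each particle index into a uniform grid cell of side `cell`."""
    cells = {}
    for idx, (x, y) in enumerate(positions):
        cells.setdefault((floor(x / cell), floor(y / cell)), []).append(idx)
    return cells

def collisions(positions, radius):
    """Find overlapping equal-radius particle pairs by scanning only the 3x3
    neighborhood of each cell instead of all O(n^2) pairs."""
    cell = 2 * radius                    # cell size = interaction diameter
    cells = build_cells(positions, cell)
    pairs = set()
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in cells.get((cx + dx, cy + dy), []):
                for i in members:
                    if i < j:            # each pair counted once
                        (x1, y1), (x2, y2) = positions[i], positions[j]
                        if (x1 - x2)**2 + (y1 - y2)**2 < (2 * radius)**2:
                            pairs.add((i, j))
    return pairs

pts = [(0.0, 0.0), (0.15, 0.0), (5.0, 5.0)]
hits = collisions(pts, radius=0.1)  # only the first two particles overlap
```

The two-level decomposition in item (1) applies the same idea at a coarser granularity: subdomains map to GPUs, and within each subdomain a cell grid like this limits the per-GPU pair search.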

Level: Intermediate
Type: Talk
Tags: Computational Physics; Computational Fluid Dynamics; Supercomputing & HPC

Day: Thursday, 04/07
Time: 10:00 - 10:25
Location: Marriott Salon 6

S6329 - Petascale Computational Fluid Dynamics with Python on GPUs

Freddie Witherden Postdoctoral Research Assistant, Imperial College London
Freddie Witherden is a postdoctoral scholar in the Department of Aeronautics at Imperial College London. He obtained his Ph.D. in high-order methods for GPU-accelerated computational fluid dynamics in 2015 under the supervision of Dr. Peter Vincent. Between 2008-2012, Freddie studied physics with theoretical physics at Imperial College London, earning an M.S. with first-class honors. Outside of work, Freddie has a keen interest in helping academics track their engagement with the mass-media. In 2012, Freddie co-founded the news analytics start-up Newsflo, where he served as chief technology officer. Newsflo was acquired in January 2015 by the academic publisher Elsevier, which has since gone on to employ the core technology in a range of products.

Discover how Python and in-situ visualization are being used to enable petascale computational fluid dynamics simulations of flow over real-world geometries. We'll (1) introduce PyFR, an open-source Python framework for solving the compressible Navier-Stokes equations on unstructured grids, (2) describe how PyFR leverages the capabilities of NVIDIA GPUs to obtain in excess of 50% of peak FLOP/s at scale, (3) outline the challenges, both technical and logistical, faced when scaling such a code to thousands of GPUs, and (4) show how in-situ visualization can be used to remediate many of these issues. Examples of high-fidelity unsteady flow simulations enabled by PyFR and these approaches will be showcased throughout.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; In-Situ and Scientific Visualization

Day: Thursday, 04/07
Time: 10:00 - 10:25
Location: Marriott Salon 1

S6113 - BLAZE-DEM: A Discrete Element Simulation Framework for NVIDIA GPUs

Nicolin Govender Senior Research Scientist, Center for High Performance Computing (CSIR)
Nicolin Govender is associated with several institutions. At the University of Johannesburg, he is a member of the ATLAS Collaboration at CERN, and from this base he runs research projects on computing in a high-energy physics environment, spanning computing for data acquisition as well as for analysis. He has also worked on the modeling of nuclear reactors, the area in which he obtained his M.S. with distinction, in a project involving collaboration between the University of Johannesburg and the South African Nuclear Energy Corporation (Necsa). He has a Ph.D. in computational mechanics from the University of Pretoria and has completed postdoctoral work at the University of Utah and the Ecole des Mines in France.

Understanding the dynamical behavior of particulate materials is important to many industrial processes, with applications that range from hopper flows in agriculture to tumbling mills in the mining industry. The discrete element method (DEM) has become the de facto standard for simulating particulate materials. DEM is a computationally intensive numerical approach that is typically limited to hundreds of thousands of particles, and the computational architecture plays a significant role in the performance that can be realized. The parallel nature of the GPU allows a large number of independent processes to execute concurrently, yielding a significant speedup over conventional CPU implementations. In this talk we present the GPU-based large-scale code BLAZE-DEM.

Level: Beginner
Type: Talk
Tags: Computational Physics; Astronomy & Astrophysics; Computational Fluid Dynamics

Day: Thursday, 04/07
Time: 10:30 - 10:55
Location: Marriott Salon 6

S6241 - An Optimized Solver for Unsteady Transonic Aerodynamics and Aeroacoustics Around Wing Profiles

Jean-Marie Le Gouez Research scientist, ONERA
Jean-Marie Le Gouez is with the CFD department at ONERA, the French aeronautics research institute. Jean-Marie graduated from Ecole Polytechnique in Paris in 1977 and received a M.S. in mechanical engineering from Stanford University in 1978. He worked for 11 years at the CEA (French Nuclear Research Energy Institute) in thermal hydraulics of sodium-cooled reactors and safety studies. Jean-Marie joined the independent research company PRINCIPIA in 1990, where he led the CFD department, developing software for unsteady incompressible, free surface flows and fluid-structure interaction, with contracts in naval, car, and satellite industries. In 2007, he joined ONERA, where he was head of the CFD and Aeroacoustics department until 2014. Then he joined the research unit on novel CFD software architectures in this department.

Extensive HPC optimization of fluid dynamics software on GPU clusters is possible along several axes within the GPUDirect/C/CUDA/Thrust programming paradigms. In particular, our algorithm could be made more modular to adapt to the CUDA register-usage limit, the Thrust library provides highly efficient primitives for global-memory computations, and warp collaboration through shared memory proves crucial. Large eddy simulation, based on basic principles of fluid mechanics, complements Reynolds-averaged Navier-Stokes models, which lack versatility, despite its very high computing requirements. For aeronautical flows around wing profiles in steady or off-design configurations, our solver delivers efficient solutions on 128-Tesla clusters for adequately resolved 2-billion-cell grids.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; In-Situ and Scientific Visualization; Aerospace & Defense

Day: Thursday, 04/07
Time: 14:00 - 14:25
Location: Marriott Salon 1

S6529 - Simulation of Rayleigh-Bénard Convection on GPUs

Massimiliano Fatica Senior Manager, Tesla HPC Performance Group, NVIDIA
Massimiliano Fatica is a senior manager at NVIDIA in the Tesla HPC Performance and Benchmark Group, where he works in the area of GPU computing (high performance computing and clusters). Prior to joining NVIDIA, he was a research staff member at Stanford University, where he worked on applications for the Stanford Streaming Supercomputer. He holds a laurea in aeronautical engineering and a Ph.D. in theoretical and applied mechanics from the University of Rome "La Sapienza."

We'll show the steps required to port a finite difference code for the direct numerical simulation of turbulent flow to run on GPUs. The code is the open-source AFiD project, developed by the University of Twente, SURFsara, and the University of Rome "Tor Vergata." One of the main goals of the porting project was to keep the code as close as possible to the original CPU implementation. The port was done with CUDA Fortran, using CUF kernel directives as much as possible. We'll show how to profile the code, the verification process, and the results obtained.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; Performance Optimization

Day: Thursday, 04/07
Time: 14:30 - 14:55
Location: Marriott Salon 1

S6328 - Towards the Industrial Adoption of GPU Accelerated Computational Fluid Dynamics

Peter Vincent Senior Lecturer, Imperial College London
Highly-Rated Speaker
Peter Vincent is a senior lecturer and EPSRC early career fellow in the department of Aeronautics at Imperial College London, working at the interface between mathematics, computing, fluid dynamics, and aeronautical engineering. He holds a first class B.S. from the Department of Physics at Imperial College (graduating top of the year), and a Ph.D. from the Department of Aeronautics at Imperial College in the field of CFD. He has also studied in the U.S., serving as a postdoctoral scholar in the Department of Aeronautics and Astronautics at Stanford University, where he developed novel high-order numerical methods for CFD, and implemented them for massively parallel, many-core GPUs.

We'll detail our experiences translating next-generation, high-order, GPU-accelerated CFD technology from an academic codebase to an industry-ready platform. We'll begin by introducing the flux reconstruction (FR) approach to high-order methods, a discretization that is particularly well suited to many-core architectures. We'll then outline our open-source implementation of FR, called PyFR, and describe the Hyperflux project with Zenotech and CFMS, which aims to translate technology from PyFR into the commercial zCFD software. The talk will touch on various topics, including maintainability, portability, algorithm choice, numerical robustness, and expected performance improvements -- all in an industrial context.

Level: All
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; Algorithms; Aerospace & Defense

Day: Thursday, 04/07
Time: 15:00 - 15:25
Location: Marriott Salon 1

S6355 - Using AmgX to Accelerate PETSc-Based CFD Codes

Pi-Yueh Chuang Ph.D. Student, George Washington University
Pi-Yueh Chuang is a Ph.D. student in mechanical and aerospace engineering at George Washington University, Washington, D.C. He is a member of Professor Lorena A. Barba's research group. His current research interests are GPU applications in computational fluid dynamics simulations and immersed boundary methods. Prior to his Ph.D. studies, he worked as an engineer at Moldex3D, a company developing moldflow simulation software. He received his M.S. in mechanical engineering from National Taiwan University with a thesis and papers focusing on the direct simulation Monte Carlo method and nanoscale energy transport. He has a B.S. in power mechanical engineering from National Tsing Hua University, Taiwan.

Learn to accelerate existing PETSc applications using AmgX, NVIDIA's library of multi-GPU linear solvers and multigrid preconditioners. We developed wrapper code to couple AmgX and PETSc, allowing programmers to use it with fewer than 10 additional lines of code. Using PetIBM, our PETSc-based immersed-boundary CFD solver, we show how AmgX can speed up an application with little programming effort. AmgX can thus bring multi-GPU capability to large-scale 3D CFD simulations, reducing execution time and lowering hardware costs. As an example, we estimate the potential cost savings using Amazon Elastic Compute Cloud (EC2). We also present performance benchmarks of AmgX and tips for optimizing GPU multigrid preconditioners for CFD. This presentation is co-authored with Professor Lorena A. Barba.

Level: Intermediate
Type: Talk
Tags: Computational Fluid Dynamics; Supercomputing & HPC; Aerospace & Defense; Tools & Libraries

Day: Thursday, 04/07
Time: 15:30 - 15:55
Location: Marriott Salon 1
