S4670 - The Operational Impact of GPUs on ORNL's Cray XK7 Titan
Jim Rogers (Director of Operations, National Center for Computational Sciences, Oak Ridge National Laboratory)
Jim Rogers is the Director of Operations for the National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory. The NCCS provides full facility and operations support for three petaFLOP-scale systems, including Titan, a 27 PF Cray XK7. Jim has a BS in Computer Engineering and has worked in high performance computing systems acquisition, integration, and operation for more than 25 years.
With a peak computational capacity of more than 27 PF, Oak Ridge National Laboratory's Cray XK7, Titan, is currently the largest computing resource available to the US Department of Energy. Titan contains 18,688 compute nodes, each pairing one commodity x86 processor with a single NVIDIA Kepler GPU. Compared to a typical multicore solution, the ability to offload substantial amounts of work to the GPUs provides benefits with significant operational impact. Case studies show time-to-solution and energy-to-solution improvements that frequently exceed a factor of 5 over the non-GPU-enabled case. The need to understand how effectively these applications use the Kepler GPUs is addressed by changes to the Kepler device driver and the Cray Resource Utilization software, which now provide a mechanism for reporting valuable GPU usage metrics for scheduled work and memory use on a per-job basis.
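As a rough illustration of how the two metrics in the abstract relate, the sketch below computes time-to-solution and energy-to-solution ratios for one job run with and without GPUs. It assumes energy is simply average power multiplied by runtime. All runtimes and power figures are hypothetical placeholders chosen for easy arithmetic; they are not measurements from Titan, and the function and variable names are illustrative only.

```python
def energy_to_solution_kwh(avg_power_kw: float, runtime_hours: float) -> float:
    """Energy consumed by a run, assuming constant average power draw."""
    return avg_power_kw * runtime_hours


def improvement_factor(baseline: float, accelerated: float) -> float:
    """How many times smaller the accelerated value is than the baseline."""
    return baseline / accelerated


# Hypothetical runs of the same problem (NOT measured Titan data).
cpu_only = {"runtime_hours": 10.0, "avg_power_kw": 4000.0}
gpu_enabled = {"runtime_hours": 1.25, "avg_power_kw": 5000.0}

cpu_energy = energy_to_solution_kwh(cpu_only["avg_power_kw"], cpu_only["runtime_hours"])
gpu_energy = energy_to_solution_kwh(gpu_enabled["avg_power_kw"], gpu_enabled["runtime_hours"])

time_factor = improvement_factor(cpu_only["runtime_hours"], gpu_enabled["runtime_hours"])
energy_factor = improvement_factor(cpu_energy, gpu_energy)

print(f"time-to-solution improvement:   {time_factor:.2f}x")
print(f"energy-to-solution improvement: {energy_factor:.2f}x")
```

Under these made-up numbers the GPU-enabled run draws more power, yet its shorter runtime still gives a lower total energy, which is why the two factors can differ.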
Session Level: All
Session Type: Talk
Tags: Supercomputing; Performance Optimization