GPU Technology Conference

March 24-27, 2014 | San Jose, California
Slidecasts of GTC sessions are available now to conference registrants, who must sign in to view them. PDFs of presentation slides will be available by mid-April, also for signed-in registrants. For non-registrants, this GTC content will be available at the end of April on GTC On Demand.

GPU Technology Conference Schedule Planner


KEYNOTE


S4780 - Keynote: Video Games and the Future of Cognitive Enhancement

Adam Gazzaley ( Associate Professor, UCSF )
Dr. Adam Gazzaley obtained an M.D. and a Ph.D. in Neuroscience at the Mount Sinai School of Medicine in New York, completed clinical residency in Neurology at the University of Pennsylvania, and postdoctoral training in cognitive neuroscience at UC Berkeley. He is the founding director of the Neuroscience Imaging Center at UC San Francisco, an Associate Professor in Neurology, Physiology and Psychiatry, and Principal Investigator of a cognitive neuroscience laboratory. His laboratory studies neural mechanisms of perception, attention and memory, with an emphasis on the impact of distraction and multitasking on these abilities. His unique research approach utilizes a powerful combination of human neurophysiological tools, including functional magnetic resonance imaging (fMRI), electroencephalography (EEG) and transcranial electrical stimulation (TES). A major accomplishment of his research has been to expand our understanding of alterations in the aging brain that lead to cognitive decline. His most recent studies explore how we may enhance our cognitive abilities via engagement with custom-designed video games, neurofeedback and TES. Dr. Gazzaley has authored over 80 scientific articles, delivered over 300 invited presentations around the world, and his research and perspectives have been consistently profiled in high-impact media, such as The New York Times, New Yorker, Wall Street Journal, TIME, Discover, Wired, PBS, NPR, CNN and NBC Nightly News. Recently, he wrote and hosted the nationally televised, PBS-sponsored special "The Distracted Mind with Dr. Adam Gazzaley". Awards and honors for his research include the Pfizer/AFAR Innovations in Aging Award, the Ellison Foundation New Scholar Award in Aging, and the Harold Brenner Pepinsky Early Career Award in Neurobehavioral Science.

A fundamental challenge of modern society is the development of effective approaches to enhance brain function and cognition in both healthy and impaired individuals. For the healthy, this serves as a core mission of our educational system; for the cognitively impaired, it is a critical goal of our medical system. Unfortunately, there are serious and growing concerns about the ability of either system to meet this challenge. I will describe an approach developed in our lab that uses custom-designed video games to achieve meaningful and sustainable cognitive enhancement (e.g., Anguera et al., Nature 2013), as well as the next stage of our research program, which uses video games integrated with technological innovations in software (e.g., brain-computer interface algorithms, GPU computing) and hardware (e.g., virtual reality headsets, mobile EEG, transcranial electrical brain stimulation) to create a novel personalized closed-loop system. I will share with you a vision of the future in which high tech is used as an engine to enhance our brain's information processing systems, thus reducing our reliance on non-specific drugs to treat neurological and psychiatric conditions and allowing us to better target our educational efforts.

This keynote will be preceded by the announcement of the CUDA Center of Excellence Achievement Award winner, the Best Poster winner, and the new CUDA Fellows, followed by the launch announcement of the Global Impact Award. (Award ceremony duration: approximately 15 minutes.)

Session Level: All
Session Type: Keynote
Tags: Medical Imaging & Visualization; Video & Image Processing; Recommended for All Press

Day: Thursday, 03/27
Time: 10:30 - 12:00
Location: Hall 3


HANDS-ON LAB


S4793 - Hands-on Lab: Image Processing Using NPP

Yang Song ( Senior Software Engineer, NVIDIA )
Yang Song is the technical lead for NVIDIA's NPP library. As technical lead, he is responsible for NPP's overall design and schedule, and he is currently focused on high-performance implementations of image codecs. He joined the NPP team as an intern in 2010 and returned full-time in 2011. Yang received his Ph.D. in Electrical Engineering from the University of Arizona in 2011, with a dissertation focused on hardware implementation of an H.264 codec. As a graduate student, he received a Chinese Government Award for Outstanding Student Abroad and published a number of journal articles leading to technology disclosures through the University of Arizona. He received his MS and BS degrees from Nanjing University of Science and Technology, China.

Learn how to use the NVIDIA Performance Primitives (NPP) library to solve image and signal processing problems. The workshop covers a simple but complete example of automatic contrast adjustment of an image. Topics covered include the specification of input and output data formats, data alignment and memory management for high performance, and the flexibility of regions of interest for processing. Attendees will experience how simple it is to instantiate NPP primitives and how efficiently they leverage GPU power for image processing. Prepare for this hands-on lab by installing the suggested software at bit.ly/gtc14labs on your system.
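
For a taste of the API the lab covers, here is a minimal sketch (our illustration, not the lab's actual exercise) that allocates pitched device images with NPP, then applies a single arithmetic primitive over a region of interest; a real contrast adjustment would chain several such primitives:

    // Minimal NPP sketch: scale pixel intensities over an ROI.
    // Assumes an 8-bit grayscale image already in host memory.
    #include <npp.h>
    #include <cuda_runtime.h>
    #include <vector>
    #include <cstdio>

    int main() {
        const int w = 640, h = 480;
        std::vector<Npp8u> host(w * h, 100);          // dummy input image

        int step = 0;                                 // row pitch in bytes, chosen by NPP
        Npp8u* dSrc = nppiMalloc_8u_C1(w, h, &step);  // pitched device allocations
        Npp8u* dDst = nppiMalloc_8u_C1(w, h, &step);
        cudaMemcpy2D(dSrc, step, host.data(), w, w, h, cudaMemcpyHostToDevice);

        NppiSize roi = { w, h };                      // primitives operate on an ROI
        // Simple linear stretch: dst = src * 3 / 2^1, saturated to 0..255
        NppStatus st = nppiMulC_8u_C1RSfs(dSrc, step, 3, dDst, step, roi, 1);
        printf("NPP status: %d\n", (int)st);

        nppiFree(dSrc);
        nppiFree(dDst);
        return 0;
    }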

Session Level: Beginner
Session Type: Hands-on Lab
Tags: Video & Image Processing

Day: Wednesday, 03/26
Time: 09:00 - 10:20
Location: Room 230A


TUTORIAL


S4654 - Detailed Overview of NVENC Encoder API

Swagat Mohapatra ( Senior Software Lead, NVIDIA )
Swagat received his Bachelor's in Electrical Engineering from IIT Kharagpur and joined the NVIDIA Video team in 2006. For the past 3 years he has been working on video encoders. He is responsible for the software encoder SDK, driver, and encoder microcode development.
Abhijit Patait ( Sr. Manager, System Software, NVIDIA )
Abhijit Patait has been leading NVIDIA's GPU multimedia team for the past 4 years. His team is responsible for supporting the multimedia (audio and video) functionality in the NVIDIA GPU driver for Windows, the NVENC SDK, and the GRID SDK. Prior to NVIDIA, Abhijit held several engineering and management positions working in the areas of baseband signal processing, telecom and VoIP systems design, and audio/DSP processing. Abhijit holds an MSEE degree from the University of Missouri-Rolla and an MBA from the Haas School of Business, University of California at Berkeley.

This session gives a detailed overview of the NVENC encoder interface and the video encoding capabilities of the current (Kepler) and future (Maxwell) generations of NVIDIA GPUs. We will present how to correctly use the encoder interface to take advantage of the hardware capabilities and the software APIs used for encoding. The tutorial will detail the steps to create a hardware encoder session and use the encoder asynchronously, and will demonstrate how NVENC can be used in applications such as transcoding, low-latency applications, virtualization, and streaming. Additionally, we will give an overview of some of the new features and recent improvements in Maxwell GPUs, particularly related to the performance and quality of the encoder.
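
For orientation before the tutorial, the sketch below shows the session-creation step of the NVENC C API as exposed by the public nvEncodeAPI.h header; it is our illustrative reading of the API, not the presenters' code, and error handling plus nearly all configuration is omitted:

    #include "nvEncodeAPI.h"   // from the NVENC SDK

    NVENCSTATUS openSession(void* cudaContext, void** encoder) {
        // Load the API's function pointers.
        NV_ENCODE_API_FUNCTION_LIST api = {};
        api.version = NV_ENCODE_API_FUNCTION_LIST_VER;
        NVENCSTATUS st = NvEncodeAPICreateInstance(&api);
        if (st != NV_ENC_SUCCESS) return st;

        // Open an encode session backed by an existing CUDA context.
        NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS params = {};
        params.version    = NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER;
        params.deviceType = NV_ENC_DEVICE_TYPE_CUDA;
        params.device     = cudaContext;
        params.apiVersion = NVENCAPI_VERSION;
        st = api.nvEncOpenEncodeSessionEx(&params, encoder);

        // Subsequent steps covered in the session: select an encode GUID and
        // preset, fill NV_ENC_INITIALIZE_PARAMS and call nvEncInitializeEncoder,
        // create input/output buffers, then loop over nvEncEncodePicture and
        // nvEncLockBitstream to retrieve compressed frames.
        return st;
    }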

Session Level: Beginner
Session Type: Tutorial
Tags: Video & Image Processing; Media & Entertainment

Day: Monday, 03/24
Time: 09:00 - 10:20
Location: Room 211A

S4324 - Topics in GPU-Based Video Processing

Thomas True ( Senior Applied Engineer for Professional Video, NVIDIA )
Thomas True is a Senior Applied Engineer for Professional Video in NVIDIA's Professional Solutions Group, where for the past 10 years he has focused on the use of GPUs in broadcast, video, and film applications ranging from pre-visualization to post production and live-to-air. Prior to joining NVIDIA, Thomas was an Applications Engineer at SGI. Thomas has an M.S. degree from the Graphics Lab at Brown University and a B.S. degree from the Rochester Institute of Technology.

The GPU is a high-performance floating-point parallel processor with extremely high memory bandwidth, which makes it ideally suited for video and image processing applications. This tutorial will present the latest techniques for optimal GPU-based video processing.

Session Level: Intermediate
Session Type: Tutorial
Tags: Video & Image Processing; Performance Optimization; Media & Entertainment; Real-Time Graphics Applications

Day: Monday, 03/24
Time: 10:30 - 11:50
Location: Room 211A

S4711 - Session 2: Fast, Parallel Algorithms for Computer Vision and Machine Learning with GPUs (Presented by ArrayFire)

Umar Arshad ( Senior Software Engineer, CUDA Training Specialist, ArrayFire )
Umar Arshad is an engineer at ArrayFire, where he primarily works on improving concurrency in ArrayFire and in applications using ArrayFire. He also created the CUDA and OpenCL optimization training material and regularly gives tutorials throughout the country. Before joining ArrayFire, Umar was a developer at Inovalon, where he was involved with improving performance and designing large-scale applications. Umar graduated from Georgia State University with a Master's degree in Computer Science. At GSU, he studied parallel programming and was the program chair of the university's ACM chapter.

Working on image processing, computer vision, or machine learning? Learn best practices for implementing parallel versions of popular algorithms on GPUs. Instead of reinventing the wheel, you will learn where to find and how to use excellent versions of these algorithms already available in CUDA and ArrayFire libraries. You will walk away equipped with the best tools and knowledge for implementing accelerated image processing and machine learning. This session will also include information about programming CUDA on Tegra mobile devices for computer vision applications.

Session Level: Beginner
Session Type: Tutorial
Tags: Computer Vision; Machine Learning & AI; Video & Image Processing; Numerical Algorithms & Libraries

Day: Monday, 03/24
Time: 10:30 - 11:50
Location: Room 210B


TALK


S4381 - Real-Time 3D Pose Estimation of Hundreds of Objects

Karl Pauwels ( Postdoctoral Research Fellow, University of Granada, Spain )
Dr. Karl Pauwels received the M.Sc. in Commercial Engineering, the M.Sc. in Artificial Intelligence, and the Ph.D. in Medical Sciences from the Katholieke Universiteit Leuven, Belgium. He is currently a Marie Curie postdoctoral research fellow at the Computer Architecture and Technology Department of the University of Granada, Spain. His main research interest is real-time computer vision in the context of autonomous navigation and dexterous manipulation of complex objects. He takes inspiration from biological vision to more easily exploit the parallelism provided by GPUs.

Discover how hundreds of objects can be simultaneously located and tracked in 3D through the real-time combination of visual simulation and visual perception. A tight integration of GPU graphics and compute has allowed us to continuously update a 3D scene model on the basis of dense visual cues, while at the same time feeding back information from this model to facilitate the cue estimation process itself. In this session we will describe (1) the low-level dense motion and stereo engine that can exploit such model feedback, (2) the 6DOF pose (location and orientation) estimation of hundreds of rigid objects at 40 Hz, and (3) how the same framework enables multi-camera and/or complex articulated object tracking. Throughout the session, we will pay special attention to implementation and system integration aspects of our real-time demonstrator system.

Session Level: Intermediate
Session Type: Talk
Tags: Computer Vision; Mobile Applications; Video & Image Processing; Machine Learning & AI

Day: Tuesday, 03/25
Time: 13:30 - 13:55
Location: Room 212B

S4675 - Embedding CUDA

Dustin Franklin ( GPGPU Applications Engineer, GE Intelligent Platforms )
Highly-Rated Speaker
Dustin is a GPU expert in the defense & aerospace industry. Originally a 3D rendering architect for games & simulations, he changed focus in 2005 to GPGPU. Dustin has years of experience in deploying high-performance CUDA applications onto rugged platforms like UAVs, tanks, and attack helicopters. Currently he works for GE as a GPGPU Applications Engineer and lives near Washington DC.

Rugged GPUs are bringing leading-edge performance and mission-critical reliability to platforms with harsh operating environments. Follow advances in GPU technology which unlock real-time CUDA capabilities for low-latency GPU applications. Learn how to architect systems with GPUDirect and 3rd-party IO devices and interconnects for efficient data streaming and increased scalability. Tune your CUDA kernels and control logic for low-latency asynchronous behavior with response times down into the microseconds. Explore embedded GPU applications in signal processing, imaging, avionics, vetronics, and shipboard systems.
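
The low-latency ingredients named above boil down to a standard CUDA pattern: pinned host memory so copies are truly asynchronous, plus a dedicated stream so transfers and kernels overlap. A generic sketch (ours, not GE's code):

    #include <cuda_runtime.h>

    __global__ void process(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;                   // placeholder signal operation
    }

    int main() {
        const int n = 1 << 20;
        float *hBuf, *dBuf;
        cudaHostAlloc(&hBuf, n * sizeof(float), cudaHostAllocDefault); // pinned host buffer
        cudaMalloc(&dBuf, n * sizeof(float));

        cudaStream_t stream;
        cudaStreamCreate(&stream);

        // One iteration of a streaming loop: everything is queued on one stream,
        // so the CPU only blocks when the result is actually needed.
        cudaMemcpyAsync(dBuf, hBuf, n * sizeof(float), cudaMemcpyHostToDevice, stream);
        process<<<(n + 255) / 256, 256, 0, stream>>>(dBuf, n);
        cudaMemcpyAsync(hBuf, dBuf, n * sizeof(float), cudaMemcpyDeviceToHost, stream);
        cudaStreamSynchronize(stream);

        cudaStreamDestroy(stream);
        cudaFree(dBuf);
        cudaFreeHost(hBuf);
        return 0;
    }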

Session Level: All
Session Type: Talk
Tags: Defense; Signal & Audio Processing; Video & Image Processing; Real-Time Graphics Applications

Day: Tuesday, 03/25
Time: 13:30 - 13:55
Location: Room 210D

S4401 - Real-Time Affine-Invariant Feature Extraction: Object Recognition Under Extreme Viewpoint Change

Valeriu Codreanu ( Postdoctoral Researcher, Eindhoven University of Technology )
Valeriu Codreanu is a postdoctoral researcher at the Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen. Before joining the team in Groningen, Valeriu received his PhD in Electrical Engineering from the Polytechnic University of Bucharest in 2011 with a thesis proposing efficient cooperation between multi-threaded and vector processors. His general research interests lie in the field of energy-efficient computing systems, ranging from theory to architecture design and to programming such systems. His current interests revolve around software techniques to make efficient use of CPU-GPU systems and automatic ways of generating high quality parallel code, with the goal of making parallel programming easier.

Learn how to efficiently design affine-invariant feature extractors using GPU hardware for the purpose of robust object recognition. Local feature extraction from images is one of the main topics in pattern matching and computer vision in general. Some of the best feature extractors such as SIFT and SURF are scale, rotation, and translation invariant, but fall short when illumination and viewpoint change are taken into account. To increase the viewpoint-invariance of SIFT, the fully affine-invariant ASIFT was developed, but this came with a very high computational cost. We present results from using our simple image transformation framework to achieve real-time affine-invariant object recognition, while also being scalable in terms of the number of GPU devices used. Participants in this session will learn more about this high-performance CUDA solution for adding viewpoint-invariance to any feature extractor, relying on the hardware features of modern GPU devices.

Session Level: Intermediate
Session Type: Talk
Tags: Computer Vision; Video & Image Processing

Day: Tuesday, 03/25
Time: 14:00 - 14:25
Location: Room 212B

S4421 - GPU Computing with MATLAB

Andy Thé ( Sr. Product Marketing Manager - Image Processing, MathWorks )
Andy holds a B.S. in Electrical Engineering from Georgia Institute of Technology and a B.A. in Business from Kennesaw State University. Before joining MathWorks, Andy spent 12 years as a field applications engineer focused on embedded processors at Texas Instruments, and 3 years as a software marketing manager for real-time software at IntervalZero.

Learn how to use NVIDIA GPUs to accelerate computationally intensive MATLAB applications in areas such as image processing, signal processing, and computational finance. We will use an image processing example to demonstrate how you can speed up your MATLAB code by using built-in GPU enabled functionality or by replacing key computations with CUDA kernels. We will also illustrate how MATLAB can be used as a development environment and test framework for CUDA kernel evaluation, visualization, and validation.
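
As a hint of the kernel-replacement workflow mentioned above, here is a hypothetical CUDA kernel (the name and operation are ours, not MathWorks'): compile it with nvcc -ptx scale.cu, construct the kernel object in MATLAB with parallel.gpu.CUDAKernel('scale.ptx', 'scale.cu'), and launch it on a gpuArray with feval:

    // scale.cu: per-element scaling of an image stored as a flat float array.
    extern "C" __global__ void scaleImage(float* img, float gain, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) img[i] *= gain;    // operates in place on the gpuArray's data
    }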

Session Level: Beginner
Session Type: Talk
Tags: Programming Languages & Compilers; Video & Image Processing; Medical Imaging & Visualization

Day: Tuesday, 03/25
Time: 14:00 - 14:50
Location: Room LL21F

S4481 - High Performance Video Pipelining: A Flexible Architecture for GPU Processing of Broadcast Video

Peter Walsh ( Chief Emerging Technology Engineer, ESPN )
Peter Walsh is Chief Emerging Technology Engineer in the Emerging Technology Department at ESPN. His primary focus is the development of systems for the enhancement of broadcast video through the insertion of virtual graphics. This requires the combination of rendering, video processing, and real-time computer vision. His work leverages the GPU, taking advantage of the CUDA computing platform.

Discover how to architect a system for real-time GPU processing of broadcast video. Learn how the current generation of GPU hardware and the CUDA computing platform can support simultaneous video acquisition, transfer of video from CPU to GPU, processing on the GPU, transfer of video from the GPU back to the CPU, and further processing on the CPU. The architecture described achieves high throughput by pipelining these operations while maintaining the flexibility for easy reconfiguration. A common buffer mechanism will be described for both CPU and GPU memory; it also supports buffers with different line pitches, which may be required depending on the hardware configuration. In addition to video processing, the interoperation between graphics and CUDA processing is addressed within the same framework.
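
The essence of such a pipeline is double buffering across CUDA streams, so frame k+1 uploads while frame k is processed. An illustrative sketch (ours, not ESPN's code; host frames should be allocated as pinned memory for the copies to be truly asynchronous):

    #include <cuda_runtime.h>

    __global__ void enhance(unsigned char* frame, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) frame[i] = 255 - frame[i];          // stand-in video operation
    }

    void runPipeline(unsigned char* hostFrames[], int numFrames, int frameBytes) {
        cudaStream_t s[2];
        unsigned char* dFrame[2];
        for (int b = 0; b < 2; ++b) {                  // two buffers, two streams
            cudaStreamCreate(&s[b]);
            cudaMalloc(&dFrame[b], frameBytes);
        }
        for (int f = 0; f < numFrames; ++f) {
            int b = f & 1;                             // ping-pong buffer index
            cudaMemcpyAsync(dFrame[b], hostFrames[f], frameBytes,
                            cudaMemcpyHostToDevice, s[b]);
            enhance<<<(frameBytes + 255) / 256, 256, 0, s[b]>>>(dFrame[b], frameBytes);
            cudaMemcpyAsync(hostFrames[f], dFrame[b], frameBytes,
                            cudaMemcpyDeviceToHost, s[b]);
        }
        for (int b = 0; b < 2; ++b) {
            cudaStreamSynchronize(s[b]);
            cudaStreamDestroy(s[b]);
            cudaFree(dFrame[b]);
        }
    }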

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing; Real-Time Graphics Applications; Recommended Press Session – Media & Entertainment; Recommended for All Press

Day: Tuesday, 03/25
Time: 14:00 - 14:25
Location: Room 211B

S4651 - Deep Learning Meets Heterogeneous Computing

Ren Wu ( Distinguished Scientist, Baidu )
Dr. Ren Wu is a distinguished scientist at Baidu. He was the lead architect for the Heterogeneous System Architecture (HSA), and before that he was the Principal Investigator of the CUDA Research Center at HP Labs. Dr. Wu is renowned for pioneering the idea of using GPUs to accelerate big data analytics, as well as for his work on GPU-accelerated large-scale clustering algorithms. At Baidu, Dr. Wu is leading the effort to build the company's heterogeneous computing platform - a turbo engine to power Baidu's business and to unlock a new kind of intelligence.

The rise of the internet, especially the mobile internet, has accelerated the data explosion - a driving force behind the great success of deep learning in recent years. Behind the scenes, heterogeneous high-performance computing is another key enabler of that success. In this talk, we will share some of the work we did at Baidu. We will highlight how big data, deep analytics, and high-performance heterogeneous computing can work together with great success.

Session Level: All
Session Type: Talk
Tags: Machine Learning & AI; Big Data Analytics & Data Algorithms; Supercomputing; Video & Image Processing; Recommended for All Press

Day: Tuesday, 03/25
Time: 14:00 - 14:50
Location: Room LL21B

S4446 - Graphics and Computer Vision for Live Augmented Reality: The 34th America's Cup

Tim Heidmann ( Chief Instigator, Serious Intent LLC )
Highly-Rated Speaker
Tim is a software architect specializing in applying new technology to creative applications, most recently in graphics and tracking for live television. He has worked with America's Cup technology chief and sailor Stan Honey and his team on several innovative projects in the past, including the prototype of the first-down line for football and the glowing puck in hockey. Previously, Tim was an evangelist at Silicon Graphics, with responsibility for the development of the animation and special effects markets.

For the 2013 America's Cup sailboat races, the event tech team tracked the yachts, marks, and HDTV helicopter cameras with unprecedented accuracy, enabling a real-time augmented reality graphics system called AC LiveLine. This was used extensively throughout the over 100 hours of international live television broadcast. In 2012, it received the Emmy for Technical Achievement in Sports Broadcast. Visuals provided identification of the yachts, details of the course, graphical display of tactical information, and a number of detailed insights into wind, course, and currents. GPU technology was pivotal in solving the problems of simulation, display, tracking, and visual processing inherent in such a complex project. This talk builds on a talk from last year's conference, and includes new topics such as using computer vision techniques to fine-tune the positioning of the yachts, corner tracking to eliminate the jitter of graphics relative to the video, and accelerated particle system techniques to simulate and display visual effects.

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Virtual & Augmented Reality; Video & Image Processing; Recommended Press Session – Media & Entertainment

Day: Tuesday, 03/25
Time: 14:30 - 14:55
Location: Room 211B

S4695 - A Real-Time Defocus Deblurring Method for Semiconductor Manufacturing

Tsutomu Sakuyama ( Imaging Technology Engineer, Dainippon Screen Mfg. Co., Ltd. )
Tsutomu Sakuyama is an imaging technology engineer for Dainippon Screen Mfg. Co., Ltd. His focus is on image processing and general-purpose computing applications.

This session will present a real-time defocus deblurring method for semiconductor manufacturing equipment. Many studies have proposed fast deblurring methods for natural and medical images, but these methods are difficult to apply in such equipment for two reasons. First, most approaches require the distance between the imaging device and the object, which cannot be obtained in most cases. Second, the processing must finish within a constant cycle time determined by the equipment's specification, which is what 'real-time' means in a production context. In this session, we propose a deblurring method that satisfies these constraints.
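
The abstract gives no equations; for context, the standard image-formation model that defocus deblurring inverts (our addition, not from the talk) is

    g(x, y) = (h_d \ast f)(x, y) + \eta(x, y)

where f is the sharp image, \eta is noise, and h_d is the defocus point-spread function whose blur radius depends on the object distance d. That dependence is exactly why most methods need the distance the authors say is unavailable in this equipment.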

Session Level: All
Session Type: Talk
Tags: Computer Vision; Video & Image Processing; Computational Photography; Real-Time Graphics Applications

Day: Tuesday, 03/25
Time: 14:30 - 14:55
Location: Room 212B

S4247 - A GPU-Based Free-Viewpoint Video System for Surgical Training

Pierre Boulanger ( Professor, University of Alberta )
Dr. Boulanger worked for 18 years at the National Research Council of Canada as a senior research officer, where his primary research interest was in 3D computer vision, rapid product development, and virtualized reality systems. He now has a double appointment as a professor at the University of Alberta's Department of Computing Science and at the Department of Radiology and Diagnostic Imaging. His main research and teaching topic is virtualized reality systems. He is also principal investigator for stereo IPTV at TRLabs. In 2004, Dr. Boulanger was awarded an iCORE/TRLabs industrial chair in Collaborative Virtual Environments, and he now holds the new CISCO chair in healthcare solutions. He has published more than 270 scientific papers in various journals and conferences. He is on the editorial board of two major academic journals. Dr. Boulanger is also on many international committees and frequently gives lectures on rapid product development and virtualized reality. He is the Director of the Advanced Man Machine Interface Laboratory and the scientific director of the Servier Virtual Cardiac Center. On the commercial side, Dr. Boulanger is the president of PROTEUS Consulting Inc., an Alberta-based consulting firm specializing in virtual reality applications.

In this presentation, we propose a novel GPU-based algorithm capable of generating free viewpoints from a network of fixed HD video cameras. This free-viewpoint TV system consists of two main sub-systems: a real-time depth estimation sub-system, which extracts a disparity map from a network of cameras, and a synthetic viewpoint generation sub-system that uses the disparity map to interpolate new views between the cameras. In this system, we use a space-sweep algorithm to estimate depth information, which is amenable to parallel implementation. The view generation sub-system generates new synthetic images from 3D vertices and renders them from an arbitrary viewpoint specified by the user. Both steps are computationally intensive, but the computations can easily be decoupled from each other and thus efficiently implemented in parallel using CUDA. A surgical training application is presented.

Session Level: Beginner
Session Type: Talk
Tags: Computer Vision; Video & Image Processing; Virtual & Augmented Reality; Medical Imaging & Visualization; Recommended for All Press

Day: Tuesday, 03/25
Time: 15:00 - 15:25
Location: Room 212B

S4661 - Halide: A language for Portable High-Performance Image Processing

Andrew Adams ( Software Engineer, Google )
Highly-Rated Speaker
Andrew Adams is a software engineer at Google, where he works on the Halide compiler. Andrew did his doctoral work at Stanford under Marc Levoy, where he worked on programmable cameras, light fields, and fast bilateral filtering. He then moved to MIT to work with Fredo Durand and Jonathan Ragan-Kelley on Halide, before rejoining Marc at Google in February 2013.

Learn how Halide can help you write a single implementation of an image processing routine that achieves performance comparable with hand-tuned assembly on ARM, x86, and GPUs. Halide factors an imaging pipeline into two parts: A pure functional description of the algorithm; and a separate 'schedule' which specifies how to vectorize, parallelize, tile, fuse or inline the stages of the pipeline. The schedule varies per architecture but the algorithm does not, and changing the schedule is guaranteed not to change the result.
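
A sketch after the canonical two-stage blur from the Halide literature illustrates the split (the scheduling numbers are illustrative, and a GPU target would use a GPU schedule instead):

    #include "Halide.h"
    using namespace Halide;

    Func blur3x3(Func input) {
        Func blur_x("blur_x"), blur_y("blur_y");
        Var x("x"), y("y"), xi, yi;

        // Algorithm: a pure functional description of what is computed.
        blur_x(x, y) = (input(x - 1, y) + input(x, y) + input(x + 1, y)) / 3;
        blur_y(x, y) = (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3;

        // Schedule: how it runs (here, a CPU schedule); changing it cannot
        // change the result, only the performance.
        blur_y.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
        blur_x.compute_at(blur_y, x).vectorize(x, 8);
        return blur_y;
    }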

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Computational Photography; Programming Languages & Compilers; Video & Image Processing

Day: Tuesday, 03/25
Time: 16:30 - 16:55
Location: Room 211A

S4439 - ATCOM: A Real-Time Image Enhancement Platform for Surveillance

Eric Kelmelis ( CEO, EM Photonics, Inc. )
Eric Kelmelis is CEO and Co-Founder of EM Photonics. For over 10 years, EM Photonics has focused on computational acceleration and efficient high performance computing, primarily in the fields of scientific computing and image processing. Mr. Kelmelis holds bachelor's and master's degrees in Electrical Engineering from the University of Delaware and has more than 50 technical papers, 3 patents, and a book chapter. He also serves as chair of the Modeling and Simulation conference at SPIE's Defense, Security, and Sensing Symposium and as a Visiting Instructor at the University of Delaware.

Learn how GPUs can be applied to real-time, real-world image processing applications. Images and videos recorded at long distances (greater than 1 mile) often suffer degradation due to the atmospheric turbulence between the subject and camera, which severely limits the quality of data that is captured by high-end imaging systems. We will discuss the practical considerations of keeping up with real-time video, including kernel performance and pipelining, and effectively using multiple GPUs in a real-time context. We have optimized specifically for the Kepler warp-shuffle instruction and will go in depth on the performance boosts offered by this new technology.
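
For readers unfamiliar with the Kepler warp-shuffle instruction mentioned above, the idiom looks like this (a generic reduction sketch, not ATCOM code; __shfl_down is the CUDA 5.x-era intrinsic, later renamed __shfl_down_sync):

    // Sum a value across the 32 lanes of a warp without shared memory:
    // each step pulls the register value from the lane `offset` positions away.
    __inline__ __device__ float warpReduceSum(float val) {
        for (int offset = 16; offset > 0; offset >>= 1)
            val += __shfl_down(val, offset);
        return val;   // lane 0 ends up holding the warp's sum
    }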

Session Level: Intermediate
Session Type: Talk
Tags: Defense; Video & Image Processing

Day: Tuesday, 03/25
Time: 17:00 - 17:25
Location: Room 210D

S4197 - eyeSight on Android-based Set-Top-Boxes

Gideon Shmuel ( CEO, eyeSight Technologies Ltd. )
Gideon joined eyeSight with over 20 years of knowledge and experience in the Telecom and Enterprise Software markets, bringing an impressive track record in sales and enterprise development. Gideon has been involved in delivering complex solutions in growing technology organizations.

More and more, we are seeing gesture recognition interfaces integrated into the digital devices around us. TVs and PCs with pre-installed gesture controls are becoming a standard feature in new devices launched in the market. As a provider of gesture solutions, we will discuss the benefits of running gesture engines on the GPU, as well as how Tegra-based devices, including set-top boxes, can benefit from such touch-free interfaces.

Session Level: All
Session Type: Talk
Tags: Mobile Summit; Smart TV, Mobile & Second Screen Applications; Video & Image Processing; Computer Vision

Day: Tuesday, 03/25
Time: 17:30 - 17:55
Location: Room 210E

S4434 - Real-Time 4K JPEG2000 for Broadcast and Digital Cinema

Jiri Matela ( CEO, Comprimato )
Jiri Matela received his BSc and MSc degrees in Computer Science from Masaryk University in Brno, Czech Republic, in 2007 and 2009. He is currently working toward a PhD at Masaryk University, focusing on image compression, reformulations of image processing algorithms for massively parallel GPU architectures, and high-speed networks. He is a member of the team that recently received the ACM Multimedia Best Open-Source Software Award for the real-time image compression and video transmission application UltraGrid, and that demonstrated one of the first real-time compressed transmissions of video in 8K Ultra High-Definition resolution. Jiri is the founder of Comprimato Systems, a company focusing on GPU-accelerated image compression and video codecs.

JPEG2000 is the compression standard for digital cinema post-production, and it is an emerging standard for broadcast contribution and archiving. Until now, the JPEG2000 format has been considered too computationally heavy for applications other than standardized ones such as cinema distribution. We present a successful GPU design and implementation of a JPEG2000 codec that allows real-time film compression and decompression in digital cinema and broadcast applications. Fast GPU processing will help JPEG2000 spread further as an archiving and mezzanine format.

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing; Medical Imaging & Visualization

Day: Tuesday, 03/25
Time: 17:30 - 17:55
Location: Room 211B

S4706 - A GPU-Based Computational Framework for Large-Scale Critical Infrastructure Mapping Using Satellite Imagery

Dilip Patlolla ( R & D Staff, Oak Ridge National Laboratory )
Dilip Patlolla is an R&D staff member in the Geographic Information Science and Technology (GIST) Group at the Oak Ridge National Laboratory. He leads the development of large-scale critical infrastructure mapping using advanced computing methods. His primary responsibilities include opening up new domains of application for HPC, FPGAs, and GPUs by researching and developing computing algorithms, and ensuring the best possible performance on current and next-generation architectures. Dilip received his MS from the University of Tennessee, Knoxville, and is the recipient of ORNL's 2013 Significant Event Award.

Assessing and monitoring critical infrastructure from space is a cost-effective and efficient solution. Satellite images are now available with spatial resolutions and acquisition rates that make image-driven large-scale mapping and monitoring of critical infrastructure a viable possibility. However, processing huge volumes of high-spatial-resolution imagery is not a trivial task. Solutions often require advanced algorithms capable of extracting, representing, modeling, and interpreting scene features that characterize spatial, structural, and semantic attributes. Furthermore, these solutions should scale to big image datasets: at half-meter pixel resolution, the earth's surface has roughly 600 trillion pixels, and the requirement to process at this scale at repeated intervals demands highly scalable solutions. In this research, we present a GPU-based computational framework designed for identifying critical infrastructure from large-scale satellite or aerial imagery to assess vulnerable populations.

Session Level: All
Session Type: Talk
Tags: Defense; Video & Image Processing; Supercomputing; Big Data Analytics & Data Algorithms; Recommended Press Session – HPC-Science

Day: Tuesday, 03/25
Time: 17:30 - 17:55
Location: Room 210D

S4422 - A New GPU-Based Level Set Method for Medical Image Segmentation

Wenzhe Xue ( Research Assistant, Mayo Clinic Arizona; Arizona State University )
Wenzhe Xue is working toward his Ph.D. in Biomedical Informatics at ASU and is a research assistant in the Medical Imaging Informatics (MII) lab at Mayo Clinic Arizona, under the supervision of Dr. Ross Mitchell. Wenzhe works on developing novel GPU-based level set methods for medical image segmentation and validating them on both synthetic and real clinical image data. He aims to provide an accurate, precise, and fast tool for quantitative imaging in cancer treatment research and studies.

We have developed a new approach to measure lesion volumes in medical images using GPU programming. The approach is based on the level set method and minimizes the number of voxels included in the computational domain with unique efficiency. The underlying cost function and specifics of the level sets approach are not limited by the implementation, and multiple methods for determining the boundary progression speed are possible. We have experimented with intensity-based approaches as well as higher-order feature spaces using multiple image contrasts. We have tested our approach on synthetic images and in a clinical setting. GPU programming also enables real-time 3D rendering and visualization of the propagating level set surface volume. This GPU-enabled combination of speed and interactivity makes our approach an excellent candidate for use in oncology where change in tumor volume guides clinical decision making and assessment of treatment effectiveness.
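
For reference (our addition, not from the abstract), level set segmentation evolves an implicit surface \phi according to

    \frac{\partial \phi}{\partial t} + F \, \lvert \nabla \phi \rvert = 0

where the lesion boundary is the zero level set of \phi and F is the boundary progression speed discussed above, which may be intensity-based or derived from higher-order features across multiple image contrasts.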

Session Level: Beginner
Session Type: Talk
Tags: Medical Imaging & Visualization; Video & Image Processing; Combined Simulation & Real-Time Visualization; Recommended Press Session – HPC-Science; Recommended for All Press

Day: Wednesday, 03/26
Time: 09:00 - 09:25
Location: Room LL21B

S4475 - GPU-Based Multiplatform Transcoding

Mahmut Samil Sagiroglu ( Co-Founder, Erlab Software )
He graduated in 1998 from the Istanbul University Electronic Engineering department and finished his MSc at the same university in 2001. He received his PhD from Sabancı University in 2006. After working at several companies, he joined TÜBİTAK (The Scientific and Technological Research Council of Turkey) - UEKAE (National Electronics and Cryptology Research Institute) as a researcher in 1999. After roles in several projects and managing two different departments, he is currently responsible for the Computational Biology and Security Applications Department. He is also managing the Advanced Genomics and Bioinformatics Research Center Infrastructure Project. His specialty areas include bioinformatics, cryptanalysis, signal processing, and electronic design.

Learn how to take advantage of the GPU for video processing and encoding in order to produce highly efficient, real-time, multiplatform video output. Contemporary trends make it mandatory to transmit digital media to all platforms. We use GPU processing and NVENC hardware encoding to produce video stream output in different formats simultaneously, with minimum latency and high quality.

Session Level: All
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing

Day: Wednesday, 03/26
Time: 10:00 - 10:25
Location: Room 211B

S4328 - Object Tracking Under Nonuniform Illumination Conditions

Kenia Picos ( Postgraduate Student, CITEDI-IPN )
Kenia Picos is currently a postgraduate fellow at the National Polytechnic Institute in Tijuana, México. Her research interests include image processing, mathematical modeling, and computer graphics.

The goal of this session is to demonstrate the performance of object tracking with correlation filtering in nonuniformly illuminated scenes. In this work, there are two fundamental limiters to kernel performance: memory usage and processed frames per second. In this session we will describe the source code basis for the image processing and correlation techniques. Concepts will be illustrated with an example of object recognition and tracking using ArrayFire, a new generation of image processing library that maps sequential algorithms onto highly parallel GPU and multicore architectures.
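
The session itself uses ArrayFire; as a lower-level illustration of the correlation-theorem approach underneath (corr = IFFT(FFT(scene) x conj(FFT(filter)))), here is a hedged cuFFT sketch of our own:

    #include <cufft.h>
    #include <cuda_runtime.h>

    // Pointwise s * conj(f), stored back into the scene spectrum.
    __global__ void multiplyConj(cufftComplex* s, const cufftComplex* f, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            cufftComplex a = s[i], b = f[i];
            s[i].x = a.x * b.x + a.y * b.y;   // real part of a * conj(b)
            s[i].y = a.y * b.x - a.x * b.y;   // imaginary part of a * conj(b)
        }
    }

    void correlate(cufftComplex* dScene, cufftComplex* dFilter, int w, int h) {
        cufftHandle plan;
        cufftPlan2d(&plan, h, w, CUFFT_C2C);                 // rows, then columns
        cufftExecC2C(plan, dScene,  dScene,  CUFFT_FORWARD);
        cufftExecC2C(plan, dFilter, dFilter, CUFFT_FORWARD);
        int n = w * h;
        multiplyConj<<<(n + 255) / 256, 256>>>(dScene, dFilter, n);
        cufftExecC2C(plan, dScene, dScene, CUFFT_INVERSE);   // unnormalized correlation plane
        cufftDestroy(plan);
    }

The peak of the resulting correlation plane gives the target location; illumination normalization would be layered on top of this basic scheme.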

Session Level: Beginner
Session Type: Talk
Tags: Video & Image Processing; Computer Vision

Day: Wednesday, 03/26
Time: 14:00 - 14:25
Location: Room LL21A

S4342 - CUDA-Accelerated MATLAB without Parallel Computing Toolbox for 3D Medical Image Segmentation

Jung W. Suh ( Senior Research Scientist, KLA-Tencor )
Jung W. Suh is a senior algorithm engineer and research scientist at KLA-Tencor. Dr. Suh received his Ph.D. from Virginia Tech in 2007 for his 3D medical image processing work. He was involved in the development of MPEG-4 and Digital Mobile Broadcasting (DMB) systems at Samsung Electronics, and was a senior scientist at HeartFlow, Inc., prior to joining KLA-Tencor. His research interests are in the fields of biomedical image processing, pattern recognition, machine learning, and image/video compression. He has more than 30 journal and conference papers and 6 patents.

Learn how to accelerate your MATLAB code using CUDA without the Parallel Computing Toolbox. Although the Parallel Computing Toolbox is useful for speedups, it may not be accessible to every MATLAB user and may have limitations in fully exploiting the power of both MATLAB and CUDA. For general speedups of MATLAB applications, GPU utilization through C-MEX provides more flexibility and power in many situations. This session will walk through the MATLAB implementation of atlas-based 3D hippocampus segmentation for MRI images as an example. Atlas-based segmentation is widely used in neuroimage analysis because it yields reliable segmentation results even for challenging target objects with ambiguous and complicated boundaries. However, it requires high computational power because 3D image registration is used during the segmentation process. This session will show each step of CUDA optimization of our atlas-based segmentation MATLAB code, from profiling to CUDA conversion through C-MEX.
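
A hedged sketch of the C-MEX route described above (the names and the toy operation are ours): a gateway function copies a MATLAB array to the GPU, runs a kernel, and copies the result back. Compile the .cu file with nvcc and link it through mex; the exact build steps vary by MATLAB release.

    #include "mex.h"
    #include <cuda_runtime.h>

    __global__ void scale(double* x, double a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {
        int n = (int)mxGetNumberOfElements(prhs[0]);
        plhs[0] = mxDuplicateArray(prhs[0]);            // output starts as a copy of the input
        double* h = mxGetPr(plhs[0]);

        double* d;
        cudaMalloc(&d, n * sizeof(double));
        cudaMemcpy(d, h, n * sizeof(double), cudaMemcpyHostToDevice);
        scale<<<(n + 255) / 256, 256>>>(d, 2.0, n);     // toy example: double every element
        cudaMemcpy(h, d, n * sizeof(double), cudaMemcpyDeviceToHost);
        cudaFree(d);
    }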

Session Level: Intermediate
Session Type: Talk
Tags: Medical Imaging & Visualization; Video & Image Processing; Computer Vision

Day: Wednesday, 03/26
Time: 14:00 - 14:25
Location: Room LL21B

S4510 - A Parallel GPU Solution to the Maximal Clique Enumeration Problem for CBIR

Christopher Henry ( Assistant Professor, University of Winnipeg )
Christopher received his Ph.D. from the Department of Electrical and Computer Engineering, University of Manitoba, in 2011. He currently holds a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant.

The focus of this talk is a parallel GPU solution to the Maximal Clique Enumeration (MCE) problem, using a depth-first search method commonly referred to as the backtracking paradigm. The solution to this problem is an outgrowth of work investigating an efficient method for finding all tolerance classes on a set of objects. Recently, the problem of finding tolerance classes has been shown to be the same as the MCE problem. Tolerance classes are sets where all pairs of objects within a set must satisfy the tolerance relation and the set is maximal with respect to inclusion. Finding such classes is a computationally complex problem with many application areas (e.g., genomics and social media). In particular, this talk will focus on content-based image retrieval (CBIR) involving sets of objects with similar features. In the proposed application to CBIR, classes in image covers determined by a tolerance relation provide the content used for CBIR.

Session Level: Intermediate
Session Type: Talk
Tags: Video & Image Processing; Numerical Algorithms & Libraries

Day: Wednesday, 03/26
Time: 14:30 - 14:55
Location: Room LL21A

S4809 - GPU Usage and the VFX Industry (Presented by Lenovo)

Allen Bolden ( Chief Architect & Executive Producer, Bit Theory, Inc. )
Counting work done while an undergrad, Allen's experience spans more than 10 3D and graphic media applications, including Photoshop (15 years), 3DS Max (10 years), Maya (9 years), and Terragen (5 years). His 3D skill set includes modeling, rigging, motion capture (MoCap), animation, texture mapping, texture painting, environment creation and animation, building creation, walkthroughs, and product creation and animation demos. He earned a B.S. in Computer Science and Engineering (CSE) in the Computer Science Division of the Soda Hall undergraduate program at Cal Berkeley in 3 years (1998-2001). With all of this under his belt, Allen quickly became a leading generalist in the field of VFX and animation. Allen has since developed a Prime Intelligence system named Athena, which automates key tasks in the animation/VFX pipeline and learns to apply what it ascertains to new assets and effects, making their development and production faster and more cost efficient while maintaining expected quality. His methods are currently being used on major motion pictures in the CG, VFX, and stereoscopic conversion areas.

A case study on the top methods used in the VFX pipeline, and some of the more daring "out of the box" uses coming in the near future for VFX and beyond. The goal of this presentation is to show how multiple technologies tie into a pipeline that is accessible anywhere and, powered by a backbone of GPUs, delivers production results on set in real time during critical shooting time.

Session Level: All
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing; Media & Entertainment; Recommended Press Session – Media & Entertainment

Day: Wednesday, 03/26
Time: 14:30 - 14:55
Location: Room 211B

S4585 - FastFlow: Combining Pattern-Level Abstraction and Efficiency in GPGPUs

Marco Aldinucci ( Researcher, Computer Science Department, University of Torino )
Marco Aldinucci has been an assistant professor at the Computer Science Department of the University of Torino since 2008. Previously, he was a researcher at the University of Pisa and at the Italian National Research Agency. He is the author of over a hundred papers in international journals and conference proceedings (Google Scholar h-index 21). He has participated in over 20 national and international research projects concerning parallel and autonomic computing. He is the recipient of the HPC Advisory Council University Award 2011 and of the NVIDIA Academic Research Programme 2013. He has been leading the "Low-Level Virtualization and Platform-Specific Deployment" workpackage within the EU-STREP FP7 ParaPhrase (Parallel Patterns for Adaptive Heterogeneous Multicore Systems) project and the GPGPU workpackage within the IMPACT project (Innovative Methods for Particle Colliders at the Terascale), and he is the contact person for the University of Torino in the European Network of Excellence on High Performance and Embedded Architecture and Compilation. In the last year (March 2012 - March 2013) he delivered 5 invited talks at international workshops. Together with Massimo Torquati, he co-designed the FastFlow programming framework and several other programming frameworks and libraries for parallel computing. His research is focused on parallel and distributed computing.

Learn how FastFlow's parallel patterns can be used to design parallel applications for execution on both CPUs and GPGPUs while avoiding most of the complex low-level detail needed to make them efficient, portable, and rapid to prototype. As a use case, we will show the design and effectiveness of a novel universal image filtering template based on the variational approach.

Session Level: Beginner
Session Type: Talk
Tags: Video & Image Processing; Numerical Algorithms & Libraries; Programming Languages & Compilers

Day: Wednesday, 03/26
Time: 15:00 - 15:50
Location: Room LL21A

S4355 - High Performance Edge-Preserving Filter on GPU

Jonas Li ( GPU Architect, NVIDIA )
Jonas joined NVIDIA in April 2013 and is a GPU architect on the NVIDIA Shanghai architecture team. He works on CUDA/OpenCL application profiling and optimization, and has extensive experience in parallel programming and performance tuning.

The goal of this session is to show the GPU implementation of a novel approach for performing high-quality edge-preserving filtering of images and videos in real time. A variety of effects can be achieved with this filter, including edge-preserving smoothing, depth-of-field effects, and stylization. We developed a CUDA-based high-performance GPU implementation of the edge-preserving filter. In this session, we will present our efforts to address some of the challenges of optimizing its performance on the GPU, touching on issues such as highly dependent workloads, warp synchronization, divergent memory access, and transposed data storage. With these optimizations applied, the GPU implementation can filter 256 megapixels of color images per second on a Tesla K20c card.

Session Level: Intermediate
Session Type: Talk
Tags: Video & Image Processing; Performance Optimization; Mobile Applications

Day: Wednesday, 03/26
Time: 16:00 - 16:25
Location: Room LL21A

S4363 - Accelerated X-Ray Imaging: Real-Time Multi-Plane Image Reconstruction with CUDA

Prashanth Bhat ( CTO, Manipal Dot Net Pvt. Ltd. )
Dr. Prashanth Bhat is Chief Technology Officer and Executive Director (Software) at Manipal Dot Net Pvt. Ltd., India, a technology outsourcing company that takes up software development and hardware design projects for worldwide clients. His areas of expertise include high-performance parallel computing, GPU acceleration using CUDA, search engine technology, and embedded systems. Prior to joining Manipal Dot Net in 2007, he worked in the search engine industry for over eight years, during his tenures at Yahoo! Inc (USA), Overture Services, and Alta Vista Search. In these roles, he contributed to the core search engine, the Contextual Match advertising infrastructure, and a distributed machine learning architecture. As a summer intern at HP Research Labs, Palo Alto, he developed new process scheduling techniques for HP's high-end parallel servers. Dr. Bhat graduated with a PhD in Computer Engineering from the University of Southern California, Los Angeles. He holds 3 US patents in the field of high-performance computing and search engines, and has authored about 15 international publications.

Explore the realm of modern X-ray fluoroscopy, where ever-increasing data rates and computational requirements are the norm. This session presents an efficient and scalable CUDA solution for multi-plane image reconstruction, an essential yet computationally challenging component of these systems. Our parallelization strategy incorporates several non-trivial techniques to improve performance: (a) reduce run-time computation by using pre-computed LUTs; (b) reduce memory bandwidth consumption by accumulating computations in registers before writing to memory; (c) exploit 2D data locality by using the GPU's texture memory and cache; (d) optimize occupancy by tuning the thread-block configuration. We present experimental results on three Kepler GPUs: GeForce GTX 690, Tesla K10, and Tesla K20. On the GTX 690, we show real-time rates of 15 fps for 32 1000x1000 image planes, with speed-ups of 6000x over a CPU implementation and 10x over an alternative CUDA approach. On both Tesla GPUs, we show linear scaling, making a multi-GPU solution viable.
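
Two of the listed techniques compose naturally in a single kernel; the sketch below is our illustration (not the authors' code): projection data is sampled through the texture cache with hardware bilinear filtering, and each thread accumulates in a register so it performs exactly one global write.

    #include <cuda_runtime.h>

    // `proj` is a 2D texture object holding numProj projection images stacked
    // vertically; the real system's geometry LUTs are replaced by a placeholder.
    __global__ void reconstructPlane(cudaTextureObject_t proj,
                                     float* plane, int w, int h, int numProj) {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= w || y >= h) return;

        float acc = 0.0f;                          // (b) accumulate in a register
        for (int p = 0; p < numProj; ++p) {
            float u = x + 0.5f;                    // (a) a LUT would supply the
            float v = y + 0.5f + p * (float)h;     //     real source-detector mapping
            acc += tex2D<float>(proj, u, v);       // (c) 2D-local reads via texture cache
        }
        plane[y * w + x] = acc / numProj;          // single global write per thread
    }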

Session Level: All
Session Type: Talk
Tags: Medical Imaging & Visualization; Video & Image Processing; Ray Tracing

Day: Wednesday, 03/26
Time: 16:00 - 16:50
Location: Room LL21B

S4737 - On-Line and Batch Stitching of Gigapixel Images Using OpenGL and CUDA Frameworks

Daniel Marks ( Assistant Research Professor, Electrical and Computer Engineering, Duke University )
Daniel L. Marks is an Associate Research Professor of Electrical and Computer Engineering at Duke University. Dr. Marks is the lead optical and systems engineer on the DARPA AWARE program. He has made many contributions to computational imaging and real-time image processing, including novel methods in optical coherence tomography, compressive millimeter wave imaging and multiscale lens design.

We present GPU-based methods for generating gigapixel-scale image renderings from the AWARE multi-scale gigapixel cameras. We demonstrate a streaming zoomable gigapixel video interface, allowing viewers to digitally zoom by 30x over a 100 degree field of view. We also discuss adaptive batch gigapixel image stitching for online distribution. We compare the performance and utility of OpenGL-based image rendering through the traditional GPU video pipeline and CUDA-based image rendering via GPGPU methods.

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing; Media & Entertainment; Computational Photography

Day: Wednesday, 03/26
Time: 16:00 - 16:25
Location: Room 211B

S4134 - Accelerated Visual Effects Made Accessible with Javascript

Sean Safreed ( Co-founder, Red Giant )
Sean Safreed is co-founder of Red Giant Software and a 16-year veteran of the computer graphics industry. The company started with 2 products and has since grown to offer more than 50, with a team that spans the United States and Canada. Before founding Red Giant in 2002, he worked on Apple's QuickTime team. At Silicon Graphics, he led efforts to add innovative video features to the company's hardware systems. At Puffin Designs, he worked as a product manager on Commotion, a ground-breaking video paint application that originated at Industrial Light and Magic.

Learn how to exploit the powerful new platform from Red Giant that leverages OpenGL and OpenCL on the latest GPUs, coupled with easy-to-use Javascript, to create visual effects tools that run on a variety of operating systems and host applications for video editing and compositing. The Red Giant platform lets artists create both simple image processing tools and complete user interfaces with just a few simple lines of code. This session will provide both an architectural overview and live examples of advanced tools that exploit the Red Giant framework. In addition, this session will show the power of connecting real-time gaming render techniques and visual effects.

Session Level: Beginner
Session Type: Talk
Tags: Media & Entertainment Summit; Video & Image Processing; Real-Time Graphics Applications

Day: Wednesday, 03/26
Time: 16:30 - 16:55
Location: Room 211A

S4147 - SIFT Descriptor Extraction on the GPU for Large-Scale Video Analysis

Hannes Fassold ( Senior Researcher, Joanneum Research )
Hannes Fassold received an MSc degree in Applied Mathematics from Graz University of Technology in 2004. Since then, he has worked at Joanneum Research, where he is currently a senior researcher in the Audiovisual Media Group of DIGITAL, the Institute for Information and Communication Technologies. His main research interests are algorithms for digital film restoration and content-based video quality analysis, as well as the efficient parallelization of these algorithms on the GPU. He has published several papers in these fields and is the principal investigator for the CUDA Research Center at DIGITAL, Joanneum Research.

Learn how the analysis of large-scale video data sets can be greatly accelerated by making use of the power of GPUs. Due to their robustness, SIFT (Scale-Invariant Feature Transform) descriptors are very popular for all sorts of video analysis tasks. In this talk, we will first present an efficient GPU implementation of an interest point detector (e.g., using the DoG or LoG operator) and the extraction of SIFT descriptors around these interest points. We will compare the GPU implementation with the reference CPU implementation from the HessSIFT library in terms of runtime and quality. Furthermore, we will discuss the benefits of GPU-accelerated SIFT descriptors for applications such as near-duplicate video detection, which aims at detecting almost-identical video segments in large video data sets, or linking video segments by shooting location or salient object.

Session Level: Intermediate
Session Type: Talk
Tags: Video & Image Processing; Computer Vision; Media & Entertainment

Day: Wednesday, 03/26
Time: 16:30 - 16:55
Location: Room LL21A

S4249 - Histograms in CUDA: Privatized for Fast, Level Performance

Nicholas Wilt ( Author, The CUDA Handbook )
Nicholas Wilt has been programming professionally for more than twenty-five years in a variety of areas, including industrial machine vision, graphics, and low-level multimedia software. While at Microsoft, he served as the development lead for Direct3D 5.0 and 6.0, built the prototype for the Desktop Window Manager, and did early GPU computing work. At NVIDIA, he worked on CUDA from its inception, designing and often implementing most of CUDA’s low-level abstractions. Now at Amazon, Mr. Wilt is working on cloud computing technologies relating to GPUs.

Histograms are an important statistical tool with a wide variety of applications, especially in image processing. Naive CUDA implementations suffer from low performance on degenerate input data due to contention. This presentation will show how to use "privatized" (per-thread) histograms to balance performance of the average case against data-dependent performance of degenerate cases.
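
For context, the most common privatized variant is per-block rather than per-thread: each block accumulates into a shared-memory copy and merges it into the global histogram at the end, confining atomic contention to a block. A generic sketch (the talk's per-thread scheme goes further than this):

    __global__ void hist256(const unsigned char* data, int n, unsigned int* hist) {
        __shared__ unsigned int priv[256];
        for (int i = threadIdx.x; i < 256; i += blockDim.x)
            priv[i] = 0;                               // clear the private copy
        __syncthreads();

        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += gridDim.x * blockDim.x)
            atomicAdd(&priv[data[i]], 1u);             // contention stays within the block
        __syncthreads();

        for (int i = threadIdx.x; i < 256; i += blockDim.x)
            atomicAdd(&hist[i], priv[i]);              // merge into the global bins
    }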

Session Level: Intermediate
Session Type: Talk
Tags: Big Data Analytics & Data Algorithms; Video & Image Processing

Day: Wednesday, 03/26
Time: 16:30 - 16:55
Location: Room 210B

S4787 - Accelerated Software as a Service

Michael Houston ( Principal Engineer, NVIDIA )
Mike Houston is a Principal Engineer at NVIDIA concentrating on mobile and cloud computing. He received his Ph.D. in Computer Science from Stanford University in 2008, focusing on research in programming models, algorithms, and run-time systems for parallel architectures. He is currently the technical lead on NVIDIA's data-center programs.

In this session we will cover how to use GPUs to accelerate back-end data-center infrastructure, specifically image processing. We will present an implementation of a REST API for image processing, concentrating on an approach to managing large numbers of concurrent requests while efficiently scheduling the CPU and GPU resources in the system. We will show that we can provide higher throughput and lower latency than CPU implementations we are replacing. We will also discuss the practical infrastructure implications, different deployment scenarios, and how to fit SW acceleration into those scenarios.

Session Level: Intermediate
Session Type: Talk
Tags: Video & Image Processing

Day: Wednesday, 03/26
Time: 16:30 - 16:55
Location: Room LL20B

S4961 - Audi Piloted Parking on zFAS: Valet Parking for the 21st Century

Miklós Kiss ( Head of ADAS Predevelopment, Audi Electronics Venture GmbH )
Prior to becoming Head of ADAS Predevelopment, Miklos was Head of the Audi Accident Research Unit. Prior to joining Audi, Miklos served as head of HMI Research at Volkswagen Research. Miklos was also head of ADAS HMI Research at Volkswagen Research and team leader of HMI at the Munich University Generation Research Program (LMU). His Ph.D. project was in time and language perception in the human brain (neuropsychology).

What does it mean to bring supercomputing into the car? Examples of piloted parking systems show what that means for customers as well as for developers: Audi's way into piloted driving for the 21st century.

Session Level: All
Session Type: Talk
Tags: Automotive; Video & Image Processing

Day: Wednesday, 03/26
Time: 16:30 - 16:55
Location: Room 210A

S4151 - Full GPU Image Processing Pipeline for Camera Applications

Fyodor Serzhenko ( CEO, Fastvideo )
Fyodor Serzhenko is CEO of the Fastvideo company. His research interests include high-speed cameras, software for high-speed imaging, and high-performance computing. He graduated from the Moscow Institute of Physics and Technology in 1989 and received his PhD in semiconductor physics in 1993.

This advanced session provides a detailed technical analysis of how to combine fast performance and high quality in a full image processing pipeline on the GPU for real-time camera applications. We detail the GPU image processing pipeline for cameras and its constituent parts (dark frame subtraction, flat-field correction, PRNU correction, white balance, demosaicing, ICC profiling and color management, output via OpenGL, and compression to JPEG), their suitability for the GPU architecture, an analysis of achieved results with comparison to existing implementations, and applications to machine vision, broadcasting, and high-speed imaging.
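
To illustrate why the early stages fuse so naturally on the GPU: they are purely per-pixel and can run in a single pass over the raw frame. A hypothetical sketch (our names and assumptions, not Fastvideo's code), assuming a Bayer raw input and per-channel white-balance gains:

// Fused per-pixel corrections: dark-frame subtraction, flat-field
// correction, and white balance, one thread per raw Bayer pixel.
__global__ void correctRaw(const float* raw, const float* dark,
                           const float* flat, const float* wbGain,
                           float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    int i = y * width + x;
    int c = (y & 1) * 2 + (x & 1);            // Bayer channel index 0..3
    float v = (raw[i] - dark[i]) * flat[i];   // dark frame + flat field
    out[i] = fmaxf(v * wbGain[c], 0.0f);      // white balance, clamp at 0
}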

Session Level: Advanced
Session Type: Talk
Tags: Video & Image Processing; Computer Vision; Mobile Applications

Day: Wednesday, 03/26
Time: 17:00 - 17:50
Location: Room LL21A

S4505 - WYSIWYG Computational Photography via Viewfinder Editing

Jongmin Baek ( Ph.D. Student, Stanford University )
Jongmin Baek received his Ph.D. in computer science from Stanford University in 2013. Prior to this, he graduated from MIT in 2008 with a B.S. in mathematics and an M.Eng in computer science. Jongmin's areas of interest include fast high-dimensional image processing algorithms and various ways of improving a user's photography experience via novel modalities of interaction.

Digital cameras with electronic viewfinders provide a relatively faithful depiction of the final image, providing a WYSIWYG experience. If, however, the image is created from a burst of differently captured images, or non-linear interactive edits significantly alter the final outcome, then the photographer cannot directly see the results, but instead must imagine the post-processing effects. In this talk we explore the notion of viewfinder editing, which makes the viewfinder more accurately reflect the final image the user intends to create. We demonstrate an application that allows the user to alter the local or global appearance (tone, color, saturation, or focus) via stroke-based input, and propagate the edits spatiotemporally. The system then delivers a real-time visualization of these modifications to the user, and drives the camera control routines to select better capture parameters.

Session Level: Intermediate
Session Type: Talk
Tags: Media & Entertainment Summit; Computational Photography; Mobile Applications; Video & Image Processing

Day: Wednesday, 03/26
Time: 17:00 - 17:25
Location: Room 211B

S4662 - Boosting Image Processing Performance in Adobe Photoshop with GPGPU Technology

Joseph Hsieh ( Computer Scientist II, Adobe )
Joseph Hsieh works on several features on the Adobe Photoshop team, including the acceleration of Photoshop features through OpenCL. He previously worked at 3VR, an analytics solution provider for the surveillance industry, where he developed computer vision solutions to analyze human activities and other information in video footage in real time. Before that, he worked at Topaz Labs, an image-editing plug-in company, where he was in charge of the development of ReMask, prototype development, and frameworks for the plug-ins.

Get an inside look at two Photoshop features and how they use GPGPU to boost performance. Smart Sharpen and the Blur Gallery use OpenCL to vastly improve performance, even while operating over very large images. Challenges and solutions will be discussed, followed by a demonstration of the achievements. We will also discuss and showcase Adobe Photoshop on the latest NVIDIA GRID platform.
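
Photoshop's Smart Sharpen algorithm is Adobe's own and the session itself targets OpenCL, but as a generic illustration of this class of per-pixel filter, a plain unsharp-mask pass looks like the following CUDA sketch:

// Generic unsharp mask (illustrative only): out = in + amount * (in - blurred),
// computed independently per pixel, which is why it scales well on the GPU.
__global__ void unsharpMask(const float* in, const float* blurred,
                            float* out, int n, float amount)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i] + amount * (in[i] - blurred[i]);
}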

Session Level: Beginner
Session Type: Talk
Tags: Media & Entertainment Summit; Computational Photography; Performance Optimization; Video & Image Processing; Recommended for All Press

Day: Wednesday, 03/26
Time: 17:30 - 17:55
Location: Room 211A

S4345 - Optical Character Recognition with GPUs: Document Processing Throughput Increased by a Magnitude

Jeremy Reed ( Research Assistant, University of Kentucky )
Jeremy Reed received a B.S. degree in computer science from Centre College in 2001. He is currently pursuing his Ph.D. degree at the University of Kentucky and is advised by Dr. Raphael Finkel. He has worked in a variety of software development roles over the past 15 years and is currently employed as a software architect and research assistant. His research interests include artificial intelligence, optical character recognition and software engineering.

Learn how an OCR engine, built from scratch for the GPU, enables businesses to turn document images into searchable, editable text several orders of magnitude faster than is possible with currently available commercial software. Several case studies will be presented outlining the cost and technical benefits and use cases of the technology before diving deeper into the technical details of the software itself. A demo of the software will also be given.

Session Level: Beginner
Session Type: Talk
Tags: Video & Image Processing; Big Data Analytics & Data Algorithms

Day: Thursday, 03/27
Time: 09:00 - 09:25
Location: Room 210E

S4583 - Middleware Framework Approach for BigData Analytics Using GPGPU

Ettikan Kandasamy Karuppiah ( Principal Researcher, MIMOS Bhd )
Ettikan Kandasamy Karuppiah (Ph.D. in distributed computing) is Principal Researcher and Head of the Accelerative Technology Lab of the ICT Division at MIMOS. His current research interests include big/media data processing, multi-processors, GPGPU and FPGA, and network processing. Previously he was with Panasonic R&D (Panasonic's corporate research arm) as Principal Engineer and Group Manager of the Panasonic Kuala Lumpur Lab, with R&D responsibility for IP, AV, and distributed and embedded communications protocols in home networking and network processing products. Prior to Panasonic, he was with the Intel Communication Group, responsible for network processor related R&D. He holds numerous international patents, has published widely, and was directly involved in commercial product development at those organizations. (ettikan.org)

Current applications of GPU processors to parallel computing tasks show excellent speed-ups compared to CPU processors. However, no existing middleware framework enables automatic distribution of data and processing across heterogeneous computing resources for structured and unstructured big data applications. We therefore propose a middleware framework for big data analytics that provides mechanisms for automatic data segmentation, distribution, execution, and information retrieval across multiple cards (CPU & GPU) and machines; a modular design for easy addition of new GPU kernels at both the analytic and processing layers; and information presentation. We show the architecture and components of the framework, such as multi-card data distribution and execution, data structures for efficient memory access, and algorithms for parallel GPU computation, along with results for various test configurations. Our results show that the proposed middleware framework provides an alternative, cheaper HPC solution to users.
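
The heart of such automatic distribution can be pictured as a host-side scatter of data chunks across the available devices. A minimal sketch under our own assumptions (hypothetical "analyze" kernel; this is not the MIMOS framework itself):

#include <cuda_runtime.h>

// Stand-in for a real analytics kernel operating on one chunk.
__global__ void analyze(const float* in, float* out, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}

// Split one buffer into per-device chunks and run the same kernel on each GPU.
// dIn/dOut are pre-allocated per-device buffers of at least chunk elements.
void distribute(const float* hIn, float* hOut, float** dIn, float** dOut, size_t n)
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    size_t chunk = (n + deviceCount - 1) / deviceCount;
    for (int d = 0; d < deviceCount; ++d) {
        size_t off = (size_t)d * chunk;
        if (off >= n) break;
        size_t len = (off + chunk > n) ? n - off : chunk;
        cudaSetDevice(d);   // subsequent calls target device d
        cudaMemcpy(dIn[d], hIn + off, len * sizeof(float), cudaMemcpyHostToDevice);
        analyze<<<(unsigned)((len + 255) / 256), 256>>>(dIn[d], dOut[d], len);
        cudaMemcpy(hOut + off, dOut[d], len * sizeof(float), cudaMemcpyDeviceToHost);
    }
}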

Session Level: Intermediate
Session Type: Talk
Tags: Big Data Analytics & Data Algorithms; Video & Image Processing; Finance

Day: Thursday, 03/27
Time: 09:00 - 09:25
Location: Room 210B

S4334 - High Performance 2D Convolution and Block Matching on the GPU

Brant Zhao ( GPU Architect, NVIDIA )
Brant is a compute architect at NVIDIA Shanghai.

2D convolution is the most basic algorithm in image processing, while 2D block matching (BM) is found in many application areas such as stereo, motion estimation, and video compression. In this work, we present high performance GPU implementations of both. The optimized versions are not only of great use in real applications, but also expose several common performance problems encountered by many applications: a low compute-to-bytes ratio (2D convolution), a massive number of compute operations (BM), and low throughput of specific instructions (IMAD and ISAD). On GK208, the optimized 2D convolution is math limited and achieves 88% of peak performance for a general filter window size. For BM, our implementation achieves 85% of peak performance and is 1.7x faster than the ISAD version.
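
For readers new to block matching, the naive formulation below (our sketch, not the speaker's optimized code) shows where the massive operation count comes from: every candidate displacement re-reads and re-sums an entire block. The talk's versions restructure exactly this loop nest around IMAD/ISAD throughput.

// One thread scores one candidate displacement by summing absolute
// differences over an 8x8 block. Launch with blockDim = (2*range+1,
// 2*range+1); the caller must keep the search window inside the image.
__global__ void sad8x8(const unsigned char* ref, const unsigned char* cur,
                       int width, int bx, int by, int range, unsigned int* cost)
{
    int dx = (int)threadIdx.x - range;   // candidate displacement
    int dy = (int)threadIdx.y - range;
    unsigned int s = 0;
    for (int y = 0; y < 8; ++y)
        for (int x = 0; x < 8; ++x) {
            int a = cur[(by + y) * width + (bx + x)];
            int b = ref[(by + dy + y) * width + (bx + dx + x)];
            s += abs(a - b);
        }
    cost[threadIdx.y * blockDim.x + threadIdx.x] = s;
}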

Session Level: Beginner
Session Type: Talk
Tags: Video & Image Processing; Computer Vision; Mobile Applications; Media & Entertainment

Day: Thursday, 03/27
Time: 09:30 - 09:55
Location: Room 210E

S4297 - Optimization Opportunities and Pitfalls when Implementing High Performance 2D Convolutions

Ian Wainwright ( GPU Computing Specialist and Consultant, High Performance Consulting Sweden )
Highly-Rated Speaker
Ian Wainwright is a software consultant in GPU computing, working mainly on medical imaging, signal processing for the aerospace and defence industry, and finance. He received a master's degree in engineering physics, with a major in computational science, from Uppsala University, Sweden.

Learn how to develop high performance 2D convolutions using Kepler-specific features such as warp shuffle and __restrict__ pointers. Alternative strategies, such as FFT-based and shared memory-based implementations, and their disadvantages will also be presented.
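
To make the warp-shuffle idea concrete, here is a minimal 3-tap row convolution using the Kepler-era __shfl intrinsics (a hypothetical sketch under our own assumptions, not the speaker's code): each lane keeps its pixel in a register and reads its neighbors' registers directly, avoiding shared memory entirely.

// Assumes blockDim.x == 32 (one warp per 32-pixel segment) and
// gridDim.y == image height; warp-edge lanes reload from global memory.
__global__ void row3tap(const float* __restrict__ in, float* __restrict__ out,
                        int width, float c0, float c1, float c2)
{
    int lane = threadIdx.x;                  // 0..31 within the warp
    int x = blockIdx.x * 32 + lane;
    const float* row = in + (size_t)blockIdx.y * width;
    float v = (x < width) ? row[x] : 0.0f;
    float l = __shfl_up(v, 1);               // register of lane - 1
    float r = __shfl_down(v, 1);             // register of lane + 1
    if (lane == 0)  l = (x > 0) ? row[x - 1] : v;            // clamp at edges
    if (lane == 31) r = (x + 1 < width) ? row[x + 1] : v;
    if (x < width)
        out[(size_t)blockIdx.y * width + x] = c0 * l + c1 * v + c2 * r;
}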

Session Level: Advanced
Session Type: Talk
Tags: Video & Image Processing; Signal & Audio Processing; Medical Imaging & Visualization

Day: Thursday, 03/27
Time: 10:00 - 10:25
Location: Room 210E

S4378 - High-Performance Domain-Specific Languages for GPU Computing

Marcel Köster ( Ph.D. Student, Saarland University )
Marcel Köster is a Ph.D. student at the Compiler Design Lab at Saarland University. In 2012, he received his Master's degree in Computer Science from Saarland University. His research interests are compiler construction (intermediate representations as well as automatic parallelization and vectorization techniques), domain-specific languages, GPU computing, and computer graphics.

In this talk we present AnyDSL - a compiler framework for domain-specific languages (DSLs). The framework helps with defining concise and compact languages with high-level abstractions. At the same time, AnyDSL completely removes the overhead of these abstractions. Via its LLVM back end, AnyDSL supports a wide range of architectures including PTX to target NVIDIA GPUs.

Session Level: Intermediate
Session Type: Talk
Tags: Programming Languages & Compilers; Video & Image Processing

Day: Thursday, 03/27
Time: 10:00 - 10:25
Location: Room LL20D

S4267 - Pronunciation Assistance Based on Automatic Speech and Facial Recognition

Maria Pantoja ( Computer Engineering Adjunct Lecturer, Santa Clara University )
Maria Pantoja obtained a BS and MS in engineering from the Universidad Politecnica de Valencia, Spain; an MS in computer science from California State University East Bay, Hayward, CA; and a PhD in computer engineering from Santa Clara University, CA. She has worked as a senior software engineer for Logical Automation, JDS Uniphase, and Nuko. She is currently a full-time lecturer in the computer engineering department at Santa Clara University.

Learn how to create, develop, and implement an L2 (second language) learning assistance instructional e-tool (desktop plus mobile application) capable of assessing students' pronunciation and providing accurate corrective feedback. The model presented integrates speech and image recognition technology capable of quantifying and analyzing the learner's input and providing evaluative feedback, along with data to evaluate the model's performance. The image/audio analysis and the expert system needed to provide recommendations to students run on the GPU to allow for fast feedback to students.

Session Level: Beginner
Session Type: Talk
Tags: Video & Image Processing; Computer Vision; Mobile Applications; Recommended for All Press

Day: Thursday, 03/27
Time: 14:00 - 14:25
Location: Room 210E

S4631 - Stereo3d Video Streaming for Remote Collaboration

Julien Berta ( VP Technology and Innovation, Mechdyne )
Julien Berta serves as Vice President of Technology and Innovation at Mechdyne. His current responsibilities include guiding the company's technical vision and leading its technology development. Prior to his current role, Berta served as technical and product manager for Mechdyne's Software Division, working on projects such as ISU, BP, and Hess. Berta also worked for Fakespace Labs, a pioneer in virtual reality research and development. Before rejoining Mechdyne in 2010, he served as the head of software development for F4, an online game studio in Paris, France. Berta earned a postgraduate degree in computer graphics from Université Louis Pasteur, Strasbourg, France; an MS in software engineering from Ecole Nationale Supérieure des Télécommunications, Paris, France; and an MS in physics from Ecole Nationale Supérieure de Physique, Strasbourg, France.

Learn how Mechdyne leverages video compression and streaming to create remote collaboration solutions, connecting CAVEs, Powerwalls, and other ultra-resolution displays to enable multi-site, multi-display sharing and decision making. We will explore multiple customer use cases: immersive-to-immersive, desktop-to-immersive, immersive-to-desktop, monoscopic, and stereoscopic.

Session Level: All
Session Type: Talk
Tags: Collaborative & Large Resolution Displays; Remote Graphics & Cloud-Based Graphics; Video & Image Processing; Virtual & Augmented Reality

Day: Thursday, 03/27
Time: 14:00 - 14:25
Location: Room 211A

S4738 - Real-Time GPU Processing for Compressed Sensing of Space, Time and Focus

Xuejun Liao ( Assistant Research Professor, Electrical and Computer Engineering, Duke University )
Xuejun Liao is an Assistant Research Professor of Electrical and Computer Engineering at Duke University. Dr. Liao's research focuses on data mining, machine learning, planning, and bioinformatics.

Image-space coded aperture modulation has been used for compressed sensing of video and for digital super-resolution. This talk extends the technique to "focal tomography," i.e., compressed sensing of the focal stack. We use a GPU implementation of the Generalized Alternating Projection (GAP) algorithm for decompressive inference and exploit GAP's fast convergence. GAP is based on the volume's global sparsity and requires no training. The algorithm makes use of Euclidean projections onto two convex sets, which respectively enforce data fidelity and structural sparsity. Furthermore, GAP is an anytime algorithm: the results produced by the algorithm converge monotonically to the true value as the computation proceeds.
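
For orientation, a generic alternating-projection iteration of the kind GAP builds on can be written (our notation, not necessarily the authors' exact formulation) as

    x^{(k+1)} = P_{C_2}\big( P_{C_1}( x^{(k)} ) \big), \qquad
    C_1 = \{\, x : \Phi x = y \,\}, \qquad
    P_{C_1}(x) = x + \Phi^{\top} (\Phi \Phi^{\top})^{-1} (y - \Phi x),

where \Phi is the coded-aperture measurement operator, C_1 enforces data fidelity, and C_2 is the set of signals satisfying the structural-sparsity constraint, projection onto which amounts to thresholding transform coefficients. The monotone convergence of the iterates x^{(k)} is what makes GAP usable as an anytime algorithm.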

Session Level: Advanced
Session Type: Talk
Tags: Video & Image Processing; Computational Photography

Day: Thursday, 03/27
Time: 14:30 - 14:55
Location: Room 210E


SPECIAL EVENT

Presentation
Details

S4940 - Hangout: Video & Image Processing on GPUs

Eric Young ( Senior Engineering Manager for Developer Technology and Evangelism, NVIDIA )
Yang Song ( Senior Software Engineer, NVIDIA )

Connect with NVIDIA engineers, dev techs, and invited experts and get answers to all your burning questions.

Session Level: All
Session Type: Special Event
Tags: Video & Image Processing

Day: Monday, 03/24
Time: 10:00 - 11:50
Location: Concourse Pod A

S4907 - Hangout: Video & Image Processing on GPUs

Abhijit Patait ( Sr. Manager, System Software, NVIDIA )
Thomas True ( Senior Applied Engineer, NVIDIA )
Swagat Mohaptra ( Software Engineer, NVIDIA )

Get more face-to-face time with NVIDIA engineers, dev techs and invited experts to connect and answer all your burning questions.

Session Level: All
Session Type: Special Event
Tags: Video & Image Processing

Day: Wednesday, 03/26
Time: 09:00 - 10:50
Location: Concourse Pod C

S4954 - Hangout: Top 5 Poster Presenters

Mykhailo Vladymyrov ( Research Associate, Lebedev Physical Institute )
Abhinav Sarje ( Postdoctoral Fellow, Lawrence Berkeley National Laboratory )

Session Level: All
Session Type: Special Event
Tags: Video & Image Processing; Supercomputing

Day: Wednesday, 03/26
Time: 13:00 - 14:00
Location: Concourse Pod A
