GPU Technology Conference

March 17-20, 2015 | San Jose, California


S5637 - zFAS - The Brain of Piloted Driving at Audi

Matthias Rudolph Head of Architecture Driver Assistance Systems, Audi AG
Dr. Rudolph studied Electrical Engineering at the University of Kassel and received his Ph.D. in Aerospace Engineering and Engineering Mechanics, with a minor in mathematics, from Iowa State University in 1999. After holding various positions at Audi, in 2009 he took over the lead of the department "Architecture Driver Assistance Systems". The zFAS project is one of the department's core developments. Dr. Rudolph is a member of management at Audi.

Over the last several years, Audi and its partners have developed a platform that enables piloted driving and piloted parking. At CES 2015, Audi demonstrated that the system can drive piloted on the highway from Silicon Valley to Las Vegas. The computational platform, or brain, of this vehicle is called zFAS, with the core element being the NVIDIA Tegra K1. This talk starts with the history of and motivation for piloted functions at Audi, followed by an overview of the current architecture and an outline of the future potential of deep learning algorithms.

Level: Intermediate
Type: Talk
Tags: Automotive; Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Cars

Day: Tuesday, 03/17
Time: 13:00 - 13:25
Location: Room LL21F
View Recording

S5108 - Vision-Based Driver Assistance: Seeing the Way Forward

Ian Riches Director, Global Automotive Practice, Strategy Analytics
Ian Riches is a Director in the Global Automotive Practice at Strategy Analytics. He heads a research team that covers all aspects of embedded automotive electronic systems, semiconductors and sensors on a worldwide basis. His areas of research include powertrain, chassis, safety, security and body applications – including high-growth areas such as hybrid and electric vehicles and advanced driver assistance systems. Before joining Strategy Analytics, Ian spent two years as assistant editor of Automotive Engineer, the UK magazine published by the IMechE. He has also held the position of Press Officer/Technical Author for MTL, a safety-related electronic equipment manufacturing company. With over eighteen years of experience, he is one of the foremost industry analysts in the automotive electronics sector. Ian holds an MA in engineering from Cambridge University, UK, where he specialized in fluid dynamics, turbo-machinery and internal combustion engines.

This market introduction to vision-based solutions in advanced driver assistance systems will highlight the regions, applications and vehicle sectors that are driving growth. Current and likely future architectures will be explored, and the implications for both traditional and non-traditional automotive suppliers will be highlighted. Finally, the role and implications of automated driving will be investigated and analyzed.

Level: All
Type: Talk
Tags: Automotive; Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Cars

Day: Tuesday, 03/17
Time: 13:30 - 13:55
Location: Room LL21F
View Recording
View PDF

S5131 - Mobile Visual Search

Martin Peniak Parallel Computing Software Engineer, Cortexica
Martin works as a parallel computing software engineer at Cortexica, where he develops algorithms for discrete as well as mobile GPUs. Martin received his Ph.D. in GPU computing applied to cognitive robotics and previously collaborated with the international EU FP7 ITALK and Poeticon++ consortia, which aimed at developing biologically-inspired artificial systems capable of progressively developing their cognitive capabilities through interaction with their environments. He also collaborated with the European Space Agency (ESA) on a project evolving neural network controllers for simulated Mars rover robots. In summer 2012, Martin worked at NVIDIA Research in Santa Clara, where he evaluated several machine learning algorithms on the next-generation GPU architecture. During his work at NVIDIA, he also developed a novel bio-inspired system for 3D object recognition. More recently, Martin gave a TEDx talk, the first to cover GPU computing and its implications for robotics.

Attendees will learn about Cortexica's FindSimilar™ technology. Its algorithms are based on the way the human visual cortex recognizes images and objects, meaning that poor lighting conditions, rotated or skewed images and other 'imperfect' objects can all be recognized accurately. In this presentation, you will learn about the challenges in the field of visual search and how Cortexica addresses them by leveraging the processing power of GPUs, including the latest NVIDIA Tegra K1 processor. The session will include several demonstrations of the technology and of the latest mobile applications using Tegra K1 processors to speed up visual search performance.

Level: Intermediate
Type: Talk
Tags: Computer Vision & Machine Vision; Video & Image Processing; Embedded Systems

Day: Tuesday, 03/17
Time: 13:30 - 13:55
Location: Room 210B
View Recording

S5182 - The Future of Human Vision: Preferential Augmentation Using GPUs

Muhammad Shamim Bioinformatics Programmer, Baylor College of Medicine
Muhammad Shamim is a bioinformatics programmer in Dr. Erez Lieberman Aiden's Lab at the Baylor College of Medicine, working on a variety of projects ranging from big data and genomics to augmented reality. Muhammad is a graduate of Rice University with a BS in Computer Science and a BA in Computational & Applied Mathematics and Cognitive Sciences.

Loss of vision can result from an enormous number of visual disorders, a small subset of which can be addressed using traditional corrective lenses, i.e. by transforming light in accordance with Snell's law of refraction. In principle, a more general class of transformations might help address a broader range of disorders. Discover how GPUs are being used in augmented reality applications to correct or alleviate vision deterioration in real-time, as well as personalize vision in novel ways.
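
For reference, the refraction law mentioned above is

$$ n_1 \sin\theta_1 = n_2 \sin\theta_2 $$

where $n_1$, $n_2$ are the refractive indices on either side of the lens surface and $\theta_1$, $\theta_2$ are the angles of incidence and refraction. A conventional corrective lens can apply only this one fixed optical transformation, which is why a broader, software-defined class of transformations is attractive for disorders that refraction alone cannot address.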

Level: All
Type: Talk
Tags: Augmented Reality & Virtual Reality; Computer Vision & Machine Vision; Video & Image Processing; Medical Imaging; Press-Suggested Sessions: Deep Learning & Computer Vision

Day: Tuesday, 03/17
Time: 13:30 - 13:55
Location: Room LL21C
View Recording
View PDF

S5436 - Deploying Low-Power Embedded Devices with Tegra K1 (Presented by GE)

Dustin Franklin GPGPU Applications Engineer, GE Intelligent Platforms
Highly-Rated Speaker
Dustin is an embedded GPGPU developer and system architect for GE Intelligent Platforms. With a background in robotics and computational imaging, he works with integrators to deploy CUDA-accelerated embedded systems. Visit www.ge-ip.com/gpgpu for more info.

Tegra's low power and computational efficiency are driving the development of new and exciting embedded devices. Explore CUDA-accelerated applications in sensor processing, security & surveillance, robotics, networking, medical imaging, industrial machine vision, and energy & agriculture that tap TK1 to provide next-generation features and capabilities to the user, all while consuming minimal power. Miniaturized Tegra modules can be quickly integrated into end-user products, with a variety of packaging options available. Leverage TK1's friendly software ecosystem and code compatibility with NVIDIA's discrete GPUs to architect scalable embedded systems with reduced risk and shortened development cycles.

Level: All
Type: Talk
Tags: Embedded Systems; Video & Image Processing

Day: Tuesday, 03/17
Time: 13:30 - 13:55
Location: Room 210G
View Recording
View PDF

S5251 - Accelerating Automated Image Processing Pipelines for Cameras with Novel CFAs on GPUs

Qiyuan Tian Ph.D. Candidate, Stanford University
Qiyuan Tian is a Ph.D. candidate in the Department of Electrical Engineering at Stanford University. He received his B.Eng. (2011) in Communication Science and Engineering from Fudan University, China, and his M.S. (2013) in Electrical Engineering from Stanford University. He studied as an undergraduate exchange student (2009) in the Department of Electronic and Computer Engineering at The Hong Kong University of Science and Technology. He is working on digital imaging, magnetic resonance imaging and neuroimaging.
Haomiao Jiang Ph.D. Candidate, Stanford University
Haomiao Jiang is a Ph.D. candidate in the Department of Electrical Engineering at Stanford University. He received his B.A. (2011) in Information Security from Shanghai Jiao Tong University, China, and his M.S. (2013) in Electrical Engineering from Stanford University. He is working with Professor Brian Wandell on color vision, display modeling and computational photography.

L3 (Local, Linear, Learned) is a new technology to automate and customize the design of image processing pipelines for cameras with novel architectures, such as unconventional color filter arrays. L3 classifies sensor image pixels into categories that are local in space and response, and automatically learns linear operators that transform pixels to the calibrated output space using training data from camera simulation. The local and linear processing of individual pixels makes L3 ideal for parallelization. We accelerated the L3 pipeline on NVIDIA® Shield™ tablets, using GPUs for real-time rendering of video captured by a multispectral camera prototype. The combination of L3 and GPUs delivers high performance with low power for image processing on mobile devices.
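
Because each output pixel depends only on a small neighborhood and a per-class linear operator, the pipeline maps naturally onto one GPU thread per pixel. The following CUDA sketch illustrates that structure only; the patch size, the classification rule and all names are illustrative assumptions, not the actual L3 implementation:

```cuda
// Illustrative sketch of an L3-style pixel pipeline: each thread
// classifies one sensor pixel by a local statistic, then applies the
// learned linear operator for that class. Layouts are assumptions.
#define PATCH 5                      // 5x5 neighborhood (assumed)
#define PATCH_SZ (PATCH * PATCH)

__global__ void l3_render(const float* sensor, float* out,
                          const float* filters,  // [numClasses][PATCH_SZ]
                          int width, int height, int numClasses)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Classify by local mean response (a stand-in for the real
    // space/response categories learned in training); assumes sensor
    // values normalized to [0, 1].
    float mean = 0.f;
    for (int dy = -PATCH/2; dy <= PATCH/2; ++dy)
        for (int dx = -PATCH/2; dx <= PATCH/2; ++dx) {
            int sx = min(max(x + dx, 0), width  - 1);
            int sy = min(max(y + dy, 0), height - 1);
            mean += sensor[sy * width + sx];
        }
    mean /= PATCH_SZ;
    int cls = min((int)(mean * numClasses), numClasses - 1);

    // Apply the learned linear operator for this class.
    const float* w = filters + cls * PATCH_SZ;
    float acc = 0.f;
    int k = 0;
    for (int dy = -PATCH/2; dy <= PATCH/2; ++dy)
        for (int dx = -PATCH/2; dx <= PATCH/2; ++dx, ++k) {
            int sx = min(max(x + dx, 0), width  - 1);
            int sy = min(max(y + dy, 0), height - 1);
            acc += w[k] * sensor[sy * width + sx];
        }
    out[y * width + x] = acc;
}
```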

Level: All
Type: Talk
Tags: Defense; Video & Image Processing; Computer Vision & Machine Vision

Day: Tuesday, 03/17
Time: 14:30 - 14:55
Location: Room 210C
View Recording
View PDF

S5333 - SceneNet: 3D Reconstruction of Videos Taken by the Crowd on GPU

Chen Sagiv CEO, SagivTech Ltd.
Dr. Sagiv brings to SagivTech over 15 years of experience in the image processing industry both in Israel and the Netherlands. In addition to her activities with the company, she also collaborates with academic beneficiaries in Israel and Europe. Chen Sagiv holds a PhD from Tel Aviv University in Applied Mathematics, with specializations in texture analysis, filter banks and optimization problems.

If you have visited a rock concert recently, you probably noticed how many people take videos of the scene using their mobile phone cameras. The aim of SceneNet is to use these multiple video sources to create a high-quality 3D video scene that can be shared via social networks. The SceneNet pipeline starts at the mobile device, where the video streams are acquired, pre-processed and transmitted to the server, where the various streams are registered and submitted to 3D reconstruction. We will share the compute challenges of SceneNet and the GPU-based acceleration on mobile devices and the server, from pre-processing on the mobile device to extremely computationally demanding algorithms such as bundle adjustment and 3D reconstruction. SceneNet is an EU FP7-funded project.

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Developer - Algorithms; Video & Image Processing

Day: Tuesday, 03/17
Time: 14:30 - 14:55
Location: Room 210B
View Recording
View PDF

S5157 - Synthetic Aperture Radar on Jetson TK1

Massimiliano Fatica Senior Manager, Tesla HPC Performance Group, NVIDIA
Massimiliano Fatica is a Senior Manager at NVIDIA in the Tesla HPC Performance and Benchmark Group, where he works in the area of GPU computing (high-performance computing and clusters). Prior to joining NVIDIA, he was a research staff member at Stanford University where he worked on applications for the Stanford Streaming Supercomputer. He holds a laurea in Aeronautical Engineering and a PhD in Theoretical and Applied Mechanics from the University of Rome “La Sapienza”.

This talk will present the details of Synthetic Aperture Radar (SAR) imaging on the smallest CUDA-capable platform available, the Jetson TK1. The full processing chain, starting from the raw radar data, has been implemented using both Octave with CUDA acceleration and CUDA directly. The results indicate that GPU-accelerated embedded platforms have considerable potential for this type of workload and, in conjunction with low power consumption, light weight and standard programming tools, could open new horizons in the embedded space.

Level: Intermediate
Type: Talk
Tags: Embedded Systems; Video & Image Processing

Day: Tuesday, 03/17
Time: 15:30 - 15:55
Location: Room 210G
View Recording
View PDF

S5870 - Audi Piloted Driving: In the Fast Lane to the Future

Daniel Lipinski Senior Engineer, Audi of America
Daniel started working for Audi in 2008 as the lead developer for the European Traffic Sign Recognition system. In 2012 he joined the Volkswagen Electronics Research Lab (ERL) in Silicon Valley, where he led the adaptation of several driver assistance systems to the U.S. market. Daniel is now the project lead for one of the most comprehensive Volkswagen Group and Audi research projects for piloted driving. One of his project cars is "Jack", the Audi piloted driving concept car that successfully completed the 550-mile automated driving road test from Silicon Valley to Las Vegas. Lipinski studied Computer and Communications Systems Engineering at the Technical University of Braunschweig, Germany.

On the eve of CES 2015, Audi, ERL and VW Group Research accomplished the most dynamic automated driving road test yet, with non-engineers behind the wheel for more than 550 miles on public freeways. With the advanced Highway Pilot technology built into a car nicknamed "Jack", Audi demonstrated how far automated driving technology has matured within the last decade. What enables such complex technology is the massive growth in processing power, a field in which NVIDIA processors will play a central role in the future.

Level: All
Type: Talk
Tags: Automotive; Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Cars

Day: Tuesday, 03/17
Time: 15:30 - 15:55
Location: Room LL21F
View Recording
View PDF

S5373 - GPU + Drones + 3D Imaging for Precision Farming and Construction

Bingcai Zhang Tech Fellow, BAE Systems
Dr. Zhang is a technical fellow at BAE Systems, the premier global defense and aerospace company. He joined BAE Systems in September 1995 right out of the University of Wisconsin-Madison, where he earned his Ph.D. in the college of engineering and his M.S. in computer science. His research interests are: (1) geospatial information technology and 3D mapping; (2) robot vision and unmanned systems; and (3) 3D geoweb search. He has held positions as chief architect, chief photogrammetrist, R&D manager, and technical fellow with BAE Systems. Dr. Zhang has three inventions: (1) Embedded Photogrammetry, (2) Next Generation Automatic Terrain Extraction (NGATE), and (3) Automatic 3D Object Extraction (AFE). Embedded photogrammetry is a concept that embeds a precise 3D measurement component called photogrammetry into non-photogrammetry applications such as GIS and CAD. NGATE generates 3D terrain models from stereo images. AFE is a production-capable system that automatically extracts 3D objects such as houses, buildings and trees from a digital surface model or LiDAR point clouds.

Agriculture and construction are two of the largest industries in the world. The democratization of 3D imaging technology with drones, digital cameras, and GPUs is applicable to precision farming and construction. Precision farming can increase crop yields, reduce pollution, save water, and increase productivity; the demand for it is only increasing as more people live on a planet with fixed natural resources. Timely, precise 3D measurements are also important for construction, where today most of these measurements are obtained manually. BAE Systems is developing GPU-accelerated 3D imaging technology with drone images for precision farming and construction.

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Deep Learning & Computer Vision

Day: Tuesday, 03/17
Time: 16:00 - 16:25
Location: Room 210B
View Recording
View PDF

S5306 - Direct Convolution for Deep Neural Network Classification on Tegra X1

Alan Wang Compute Architect, NVIDIA
Alan is a GPU architect working in the computer vision field at NVIDIA. He is experienced in parallelization, performance modeling and architecture-specific tuning, and is currently working on 2D convolution projects. Before joining the computer architecture team, Alan worked on graphics tracing and FPGA architecture & EDA software.

We prototype a direct convolution implementation to accelerate classification with a deep neural network. We take the Overfeat network as an example, analyzing some of its properties such as the math/memory ratio and the input/coefficient ratio. We then discuss the workload distribution of the implementation and how we partition the computation into CUDA blocks. We also dive into the details of how we optimize for data reuse, including the use of 3D textures for input pixels and a coefficient layout designed for coalesced stores. Experiments with Overfeat Layer 6 on Tegra X1 show that we currently achieve 75% of peak GFLOPS, with room for further optimization as future work.
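
For readers new to the problem, the baseline being optimized looks like the following naive direct-convolution kernel (one thread per output element, with no tiling, texture reads, or special coefficient layout); all names and layouts here are illustrative, not the talk's implementation:

```cuda
// Naive direct convolution: one thread computes one output element.
// The optimized version described in the talk adds 3D-texture input
// reads and a coefficient layout tuned for coalescing.
__global__ void conv_direct(const float* in,   // [C][H][W]
                            const float* wts,  // [K][C][R][S]
                            float* out,        // [K][H][W], "same" padding
                            int C, int H, int W, int K, int R, int S)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // output column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // output row
    int k = blockIdx.z;                             // output feature map
    if (x >= W || y >= H || k >= K) return;

    float acc = 0.f;
    for (int c = 0; c < C; ++c)                     // input channels
        for (int r = 0; r < R; ++r)                 // filter rows
            for (int s = 0; s < S; ++s) {           // filter columns
                int iy = y + r - R / 2;
                int ix = x + s - S / 2;
                if (iy < 0 || iy >= H || ix < 0 || ix >= W) continue;
                acc += in[(c * H + iy) * W + ix] *
                       wts[((k * C + c) * R + r) * S + s];
            }
    out[(k * H + y) * W + x] = acc;
}
```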

Level: Advanced
Type: Talk
Tags: Developer - Performance Optimization; Video & Image Processing

Day: Wednesday, 03/18
Time: 09:00 - 09:25
Location: Room 210G
View Recording
View PDF

S5442 - High-Quality Rasterization

Chris Wyman Research Scientist, NVIDIA
Chris joined NVIDIA Research in 2012. Previously, he served as an associate professor of computer science at the University of Iowa. He has a PhD in computer science from the University of Utah. His research interests focus on realistic, real-time rendering including problems on lighting, global illumination, shadows, materials, participating media, and many related issues.

We describe three new rendering algorithms that rasterize many samples per pixel, taking advantage of Maxwell GPU features to produce images that are sharper and less aliased. "ACAA" is a simple variation of MSAA that uses less memory. "AGAA" brings MSAA quality to deferred rendering while shading less than twice per pixel. Finally, "FTIZB" renders alias-free hard shadows with 32 samples per pixel at real-time speeds.

Level: Intermediate
Type: Talk
Tags: Real-Time Graphics; Video & Image Processing; Rendering & Ray Tracing; Media & Entertainment

Day: Wednesday, 03/18
Time: 09:00 - 09:50
Location: Room LL21B
View Recording
View PDF

S5751 - Stereovision and the Future of Autonomous Machines

Edwin Azzam CTO, STEREOLABS
Edwin Azzam co-founded STEREOLABS in 2010. As STEREOLABS’s Chief Technical Officer, Edwin is responsible for leading the company’s product development and technology strategy in stereoscopic image processing. Prior to founding STEREOLABS, Edwin was a project manager at Airbus Defence and Space. Edwin holds a Master’s degree in Optics & Image Processing from Institut d’Optique, France, as well as a Master’s degree in Management from ESSEC Business School. He is a PhD supervisor and a National Technical Expert for the ANR (National Research Agency), where he uses his technical and market expertise for the assessment of national research projects in the field of computer vision and 3D image processing. Edwin was honored twice with the National Innovation Prize by the French Ministry of Research. Between 2010 and 2014, Edwin received 10 different distinctions for his achievements in the stereoscopic 3D field. In 2010, he won the European Innovation Award with STEREOLABS which recognizes the most promising and innovative technological companies in Europe.

Discover how stereovision and 3D depth sensing on mobile GPUs enable the development of future autonomous cars, drones and robots. We will discuss the benefits and challenges of using stereo cameras as depth sensors, and how to leverage the power of embedded GPUs to overcome these challenges. We will also show demonstrations of how the technology can be used to create 3D reconstructions of the surroundings, detect obstacles and navigate autonomously.

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Automotive; Video & Image Processing; Press-Suggested Sessions: Deep Learning & Computer Vision; Press-Suggested Sessions: Cars

Day: Wednesday, 03/18
Time: 09:00 - 09:25
Location: Room 210B
View Recording

S5317 - Development of a GPU Accelerated Visual Tracking Framework

David Concha Researcher, Universidad Rey Juan Carlos
David received his B.Sc. degree in Computer Science from Universidad Rey Juan Carlos (URJC) in 2011 and is currently a Ph.D. student and grant holder there. His research interests focus on computer vision and GPU computing. Much of his recent work exploits graphics hardware to accelerate computer vision algorithms; in particular, David uses GPUs to accelerate methods for 3D/2D motion tracking, medical image reconstruction, face recognition, high-definition depth map computation, image segmentation, etc.

This session presents the development of a visual tracking system whose ultimate goal is to track multiple articulated objects. Throughout the development, different GPU programming technologies are used, such as OpenGL, Cg and CUDA; various types of sensors, such as cameras and Kinects; and different methodologies, such as particle filters, Kalman filters and the Variable Neighborhood Search (VNS) metaheuristic.

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Wednesday, 03/18
Time: 09:30 - 09:55
Location: Room 210B
View Recording
View PDF

S5205 - Real-Time and High Resolution Feature Tracking and Object Recognition

Peter Andreas Entschev Software Engineer, ArrayFire
Peter Entschev is currently a software engineer at ArrayFire, where he primarily works on concurrent computer vision problems. He received his Bachelor's degree in Telecommunication Systems and his Master's degree in Computer Science from the Federal University of Technology - Paraná (UTFPR), Brazil. Before joining ArrayFire, he worked on real-time computer vision research at SEW-Eurodrive in Germany and on system administration and development of Linux distributions for the Brazilian government.

This session will cover real-time feature tracking and object recognition in high-resolution videos using GPUs and productive software libraries, including ArrayFire. Feature tracking and object recognition are computer vision problems that have challenged researchers for decades. Over the last 15 years, numerous approaches have been proposed to solve them, some of the most important being SIFT, SURF and ORB. Traditionally, these approaches are so computationally complex that processing more than a few frames per second is impossible. Using an NVIDIA K20 GPU with ORB, we are able to process more than 30 frames per second on images on the order of 10,000x10,000 pixels. Multiple quality and timing benchmarks will be presented, covering some of the most robust feature tracking methods.
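
For a sense of what this looks like in practice, ORB extraction is a single call in ArrayFire's C++ API. A minimal sketch follows (ArrayFire 3.x; the file name and tuning parameters are illustrative, not the benchmark configuration from the talk):

```cuda
// Minimal ORB feature extraction with ArrayFire's C++ API (3.x).
// ArrayFire dispatches to the CUDA backend when built/linked against it.
#include <arrayfire.h>
#include <cstdio>

int main() {
    // Load a frame as grayscale (is_color = false).
    af::array img = af::loadImage("frame.png", false);

    af::features feat;   // keypoint locations, scores, orientations
    af::array desc;      // binary ORB descriptors (256 bits per feature)
    // FAST threshold 20, up to 400 features, pyramid scale 1.5, 4 levels.
    af::orb(feat, desc, img, 20.0f, 400, 1.5f, 4);

    printf("detected %u features\n", (unsigned)feat.getNumFeatures());
    return 0;
}
```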

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Developer - Algorithms; Video & Image Processing; Press-Suggested Sessions: Deep Learning & Computer Vision

Day: Wednesday, 03/18
Time: 10:00 - 10:25
Location: Room 210B
View Recording
View PDF

S5282 - Avoiding Shared Memory Bank Conflicts in Rate Conversion Filtering

Mrugesh Gajjar Lead Research Engineer, Siemens Corporate Technology
Mrugesh Gajjar is an engineer with Siemens Corporate Technology in Bangalore, working with the Siemens Ultrasound Division in Mountain View on GPU-based implementations of ultrasound signal processing algorithms. He holds a Master's in Information & Communication Technology from Dhirubhai Ambani Institute (DA-IICT), India. He has 10 years of research and industrial experience; his interests include parallel computing and computer systems, with a focus on signal processing applications. He has 6 international publications and 2 patents pending.

Shared memory bank conflicts can be a significant performance limiter, depending on thread-dependent access patterns. We will present ideas on how to reduce shared memory bank conflicts in rate conversion filtering, a frequently used signal processing function in a variety of tasks such as image resizing. We find severe performance degradation for specific downsampling factors in rate conversion due to heavy bank conflicts in shared memory. We propose a novel technique for avoiding them via the use of scrambled addressing across threads. This technique is applicable more generally across many GPU architectures. We will demonstrate its effectiveness with specific examples and performance measurements on NVIDIA GPUs, and leave the attendee with ideas on how to identify and mitigate bank conflicts.
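
The talk's scrambled-addressing scheme is its own contribution, but the general idea of spreading accesses across banks can be illustrated with the standard shared-memory padding idiom (a generic sketch, not the authors' method):

```cuda
// Padding idiom for shared-memory bank conflicts, shown on a matrix
// transpose. With 32 banks, threads of a warp reading down a column of
// a 32-wide tile all hit the same bank; padding the row stride to 33
// spreads the column accesses across distinct banks.
#define TILE 32

__global__ void transpose_padded(const float* in, float* out, int n)
{
    __shared__ float tile[TILE][TILE + 1];  // +1 column avoids conflicts

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n)
        tile[threadIdx.y][threadIdx.x] = in[y * n + x];  // coalesced load
    __syncthreads();

    // Transposed write: each thread reads down a tile column; the
    // padded stride keeps those reads conflict-free.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < n && y < n)
        out[y * n + x] = tile[threadIdx.x][threadIdx.y];
}
```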

Level: Intermediate
Type: Talk
Tags: Developer - Performance Optimization; Medical Imaging; Video & Image Processing

Day: Wednesday, 03/18
Time: 10:00 - 10:25
Location: Room 210G
View Recording
View PDF

S5396 - Pimp My Ride: How to Mod Cars with Tegra

Dave Anderson Sr. Automotive Solutions Architect, NVIDIA
Dave Anderson is Senior Automotive Solutions Architect for NVIDIA's Automotive Business Unit, which provides the industry with powerful, yet efficient, processors and solutions for infotainment, digital instrument clusters and advanced driver assistance systems. Dave has more than 14 years of engineering experience in the technology industry. Prior to joining NVIDIA in March 2011, he served in several engineering and technical roles at Altera, Trilogy Marketing, Sirius Satellite Radio and Visteon. He also has U.S. Patents for a Flexible LED Backlighting Circuit and a Cross-Point Matrix for Infrared Touchscreen. He earned a BSEE in Electrical and Computer Engineering from Purdue University.

Tapping into in-vehicle architectures for infotainment and driver information applications is a huge challenge. We will examine several production cars as examples and provide insight into how NVIDIA automotive Tegra processors can be retrofitted into these cars as a proof-of-concept for next-generation digital clusters and infotainment systems.

Level: All
Type: Talk
Tags: Automotive; Video & Image Processing; Embedded Systems; Press-Suggested Sessions: Cars

Day: Wednesday, 03/18
Time: 10:00 - 10:25
Location: Room LL20D
View Recording

S5221 - Tracking Objects Better, Faster, Longer

Alptekin Temizel Associate Professor, Middle East Technical University
Dr. Alptekin Temizel is an associate professor at the Informatics Institute, Middle East Technical University (METU). He received his B.Sc. in Electrical and Electronic Engineering from METU, Ankara, Turkey (1999) and his Ph.D. from the Centre for Vision, Speech and Signal Processing, University of Surrey, UK (2006). Between 1999 and 2001 he worked as a research assistant at the University of Hertfordshire, UK. He co-founded Visioprime Ltd., UK, a company developing intelligent video systems for security and surveillance applications, and worked there as a senior research engineer between 2001 and 2006. Since 2006, he has been a professor in the Graduate School of Informatics, Middle East Technical University (METU), Turkey, and a consultant to several R&D companies. He is the principal investigator of the Virtual Reality and Computer Vision Research Group (VRCV), an NVIDIA CUDA Teaching Center and a CUDA Research Center. His main research interests are image and video processing, video surveillance, computer vision, parallel programming and GPU programming.

In this talk, we demonstrate a real-time long-term tracker, Hybrid-TLD (H-TLD), which is based on the recently proposed Tracking-Learning-Detection (TLD) framework. TLD simultaneously tracks the object, learns its appearance and detects when it re-appears. While it has been shown to have promising results, its high computational cost prohibits running it at higher resolutions and frame rates. We present our analysis of the framework and our modifications to make it work effectively in a hybrid CPU-GPU setting, with high utilization of both processing units using OpenMP and CUDA. Our results show that a 10.25x speedup at 1920x1080 resolution can be obtained. The source code of the developed H-TLD library has been made publicly available.
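
The core of any such CPU-GPU hybrid is keeping both processors busy at once. A minimal sketch of that pattern with OpenMP sections and a CUDA stream follows; the detection kernel and learning update are trivial stand-ins, not H-TLD's actual components:

```cuda
// Sketch of CPU+GPU overlap: one OpenMP section drives GPU detection
// in a stream while the other runs a CPU-side learning update.
// Compile with nvcc and -Xcompiler -fopenmp.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void detect(const float* frame, float* score, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) score[i] = frame[i] * 0.5f;   // stand-in for detection
}

void update_learner(float* model, int n) {   // stand-in for CPU learning
    for (int i = 0; i < n; ++i) model[i] += 0.01f;
}

int main() {
    const int n = 1 << 20;
    float *d_frame, *d_score, model[256] = {0};
    cudaMalloc(&d_frame, n * sizeof(float));
    cudaMalloc(&d_score, n * sizeof(float));
    cudaMemset(d_frame, 0, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    #pragma omp parallel sections
    {
        #pragma omp section
        {   // GPU path: asynchronous detection on the current frame
            detect<<<(n + 255) / 256, 256, 0, stream>>>(d_frame, d_score, n);
            cudaStreamSynchronize(stream);
        }
        #pragma omp section
        {   // CPU path: appearance-model update runs concurrently
            update_learner(model, 256);
        }
    }
    printf("hybrid step done\n");

    cudaStreamDestroy(stream);
    cudaFree(d_frame);
    cudaFree(d_score);
    return 0;
}
```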

Level: Intermediate
Type: Talk
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Wednesday, 03/18
Time: 10:30 - 10:55
Location: Room 210B
View Recording
View PDF

S5619 - BLINK: A GPU-Enabled Image Processing Framework

Mark Davey HPC Lead Engineer, The Foundry
Mark Davey joined The Foundry in 2011, heading up the HPC group to bring device agnostic image processing frameworks to a number of key products and plug-ins. Before his role at The Foundry, Mark was employed as Technology Manager at Grandeye, a leading manufacturer of security solutions, where he helped create innovative 360 degree cameras complete with sophisticated video analytics. Other roles have seen Mark working in fields as diverse as Augmented Reality Surgery, 3D Foetal Ultrasound, and Document Analysis and Classification. Mark obtained his Physics degree from University College London.

We present BLINK, a language and framework for developing image processing algorithms across a range of computation devices. BLINK-based algorithms are automatically translated to optimised code for both GPUs and CPUs. This "write-once" approach enables us to target both existing and new GPU hardware with minimal extra effort. Many algorithms produce visibly different results if mathematical operations are allowed to differ across platforms. Therefore BLINK has been designed to ensure numerically identical results between NVIDIA GPUs and CPUs. BLINK is at the heart of a number of key Foundry plug-ins and applications. An overview of this work and performance profiles will be presented, highlighting the speed gains achieved by using NVIDIA GPUs.

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Video & Image Processing

Day: Wednesday, 03/18
Time: 10:30 - 10:55
Location: Room LL21D
View Recording
View PDF

S5483 - GPU Power through Javascript for Anyone with Universe 2.0 SDK

Sean Safreed Co-founder, Red Giant
Highly-Rated Speaker
Sean Safreed is co-founder of Red Giant Software and a 16-year veteran of the computer graphics industry. The company started with two products and has since grown to offer more than 50 products, with a team that spans the United States and Canada. Before founding Red Giant in 2002, he worked on Apple's QuickTime team. At Silicon Graphics, he led efforts to add innovative video features to the company's hardware systems. At Puffin Designs, he worked as a product manager on Commotion, a ground-breaking video paint application that originated at Industrial Light and Magic.

Red Giant Universe is a set of tools for creating visual effects across a wide range of popular DCC apps, now accessible to artists with basic Javascript programming skills. The system enables users to create in minutes or hours what used to take days or weeks to write in a mainstream computer language. This session follows on from the introductory session in 2014, with new expanded coverage of the SDK, Javascript examples, and new additions to the system for real-time vector rendering and photo-based rendering, all in real time on the GPU.

Level: All
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Real-Time Graphics

Day: Wednesday, 03/18
Time: 11:00 - 11:25
Location: Room LL21D
View Recording

ECS5005 - CEO Show & Tell: MirriAd

Mark Popkiewicz CEO, MirriAd
Mark Popkiewicz is CEO of Mirriad, a video technology company built with circa $30m of investment capital. A regular speaker at major industry events as well as on TV and radio, Mark has led several technology businesses to global market leadership. He has a technology and commercial background with a truly global outlook. To date Mark has spent more than half his career working with US-based businesses and has set up 30 operations around the world, including in the BRIC and MINT countries.
He was previously a director of companies including BBC (commercial) Ventures, Mobile Media, Lucent and Eicon. Mark has grown and improved the performance of both small and large businesses, with various trade exits and IPOs.

Launched in 2008 with a mission to revolutionize advertising for the Skip Generation, Mirriad's patented computer vision technology creates a new standard in advertising where a brand integration is an affordable, scalable ad unit running in multiple pieces of content. The resulting ads are seamless, authentic, and work across TV, tablet and mobile screens, building brand awareness and brand sentiment without interrupting the viewing experience. In 2013, an important aspect of Mirriad's imaging technology won an Academy Award.

Level: All
Type: Talk
Tags: Video & Image Processing; Emerging Companies Summit

Day: Wednesday, 03/18
Time: 11:15 - 11:30
Location: Room 220B

S5546 - GPU Accelerated Haze Removal on Tegra K1

Bin Zhou Adjunct Research Professor, University of Science and Technology of China
Dr. Bin Zhou is the director and chief scientist of the Marine Information Processing Laboratory (MIPL) at the Institute of Oceanography, Shandong Academy of Sciences. He serves as an adjunct research professor in the School of Information Science and Technology at USTC and is an NVIDIA CUDA Fellow. He is the PI of the CUDA Research Center (CRC) at the Institute of Advanced Technology (IAT), USTC. At MIPL, he leads a team working on information processing systems for marine environmental pollution and natural hazard monitoring and for ocean-atmosphere simulation. At the CRC, he performs research on drone control, video processing and computer vision algorithms on the NVIDIA GPU/CUDA platform.

This talk shows how the Tegra K1 GPU accelerates the dehazing process for outdoor computer vision systems. Toxic haze has become a major air pollution threat in China, affecting not only public health but also outdoor computer vision systems. By adapting the dark channel prior method to the dehazing process, very good results are achieved. However, the huge processing requirements bring big challenges. We refined the parallel algorithm and performed deep optimization on the Tegra K1 Jetson platform. Compared to the ARM CPU, experiments show a 156x speedup. The results show that the Tegra K1 has great potential for embedded real-time computer vision processing.
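
The first stage of dark-channel-prior dehazing is a per-pixel minimum over the color channels within a local patch, which parallelizes trivially on the GPU. A minimal CUDA sketch of that stage (patch radius and data layout are illustrative assumptions, not the speakers' optimized implementation):

```cuda
// Per-pixel dark channel: the minimum intensity over R, G and B within
// a local patch, the quantity at the heart of the dark channel prior.
__global__ void dark_channel(const uchar3* rgb, unsigned char* dark,
                             int width, int height, int radius)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int m = 255;
    for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx) {
            int sx = min(max(x + dx, 0), width  - 1);  // clamp borders
            int sy = min(max(y + dy, 0), height - 1);
            uchar3 p = rgb[sy * width + sx];
            int cmin = min((int)p.x, min((int)p.y, (int)p.z));
            m = min(m, cmin);
        }
    dark[y * width + x] = (unsigned char)m;
}
```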

Level: All
Type: Talk
Tags: Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Deep Learning & Computer Vision

Day: Wednesday, 03/18
Time: 15:30 - 15:55
Location: Room 210B
View Recording

S5667 - A GPU-Accelerated Bundle Adjustment Solver

Lukáš Polok Researcher, Brno University of Technology
Born in Brno (Czech Republic), Lukáš received an MSc in Computer Science with a specialization in computer graphics and multimedia from Brno University of Technology, where he is presently employed as a researcher, working on his thesis on using GPUs for general-purpose calculations.
Simon Pabst R&D Programmer, Double Negative VFX
Simon joined the Double Negative VFX R&D team in 2012. Before that, he obtained his Master's degree (Diplom) and PhD (Dr. rer. nat.) in Computer Science from the University of Tübingen (Germany), working on cloth simulation and collision detection for deformable objects.
Jeff Clifford Head of R&D, Double Negative VFX
With a background in Physics and an MSc in Applied Optics from Imperial College London, Jeff has pursued a career in the London film post-production industry since joining Double Negative VFX in 2000. As a member of the R&D team he wrote the DNB voxel renderer, which has since been used in over 50 films. He developed DNeg's own 64-bit version of the 2D compositing software Shake, and subsequently transitioned the 2D department's tools to a stereo pipeline based around the compositing software Nuke. He has experience in developing many 2D and 3D tools for film production. More recently he has moved into the role of Head of R&D, overseeing the strategic direction of internal tool development and the use of 3rd-party technology at DNeg.

This talk will give an overview of film production processes, focusing on high-quality 3D reconstruction of the scene, how GPU acceleration applies to it, and the math and algorithms behind it. Several solutions exist for accelerating the algorithms involved in the 3D reconstruction process, but very few are concerned with online quality assessment of the reconstructed areas, mostly due to the computational load of the algorithms that compute the reconstruction error. The presented approach proposes efficient solutions based on a GPU implementation of the matrix operations involved. It differs from existing solutions by exploiting the inherent sparse, block structure of the underlying system matrices. The work is part of the EU FP7 IMPART project: impart.upf.edu.
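
For context on the math involved: bundle adjustment is a sparse nonlinear least-squares problem, and in a standard Gauss-Newton formulation (not necessarily the exact solver presented) each step solves the normal equations

$$ (J^\top J)\,\delta = -J^\top r $$

where $r$ stacks the reprojection residuals and $J$ is their Jacobian with respect to the camera and point parameters. Because each residual involves only one camera and one 3D point, $J^\top J$ has exactly the sparse, block structure (camera and point blocks) that the presented solver exploits.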

Level: All
Type: Talk
Tags: Media & Entertainment; Developer - Algorithms; Video & Image Processing

Day: Wednesday, 03/18
Time: 15:30 - 15:55
Location: Room LL21D

S5187 - Real-Time Camera Tracking in the "1st & 10" System

Louis Gentry Principal Software Engineer, Sportvision
Louis Gentry is the lead developer for football, emerging sports, and core technologies at Sportvision and is responsible for the design and implementation of real-time broadcast rendering platforms. He has years of prior experience working in computer graphics and video for SGI, Pinnacle Systems, and other companies. In the last ten years at Sportvision, Louis has designed and implemented key technologies and systems used on broadcasts for ESPN, FOX, ABC, NFL Network, CBS, and other clients.
Rand Pendleton Senior Scientist and Advisor, Sportvision
Rand Pendleton is a Senior Scientist and Advisor at Sportvision, assisting on various development projects with an emphasis on field deployment, camera tracking and algorithms. Prior to Sportvision, Rand worked in defense-related contract research, as a high-power microwave consultant, as an engineering physicist at the Stanford Linear Accelerator Center, and as a microwave tube engineer at Varian Associates.

Sportvision's "1st & 10" real-time system for displaying graphics during American football games has traditionally relied on hardware to calibrate and compute camera parameters necessary for inserting the "yellow line" and other effects into the scene. The hardware solution is limited to lock-down, broadcast cameras only. The vast compute power available in GPUs today provided a means for expanding the system to support both lock-down and mobile cameras without the need for hardware sensors. In this presentation, we will discuss how the optical camera tracking system works and its use on live NFL broadcasts.

Level: All
Type: Talk
Tags: Media & Entertainment; Augmented Reality & Virtual Reality; Video & Image Processing; Real-Time Graphics

Day: Wednesday, 03/18
Time: 16:00 - 16:25
Location: Room LL21D
View Recording
View PDF

S5637B - zFAS - The Brain of Piloted Driving at Audi

Matthias Rudolph Head of Architecture Driver Assistance Systems, Audi AG
Dr. Rudolph studied Electrical Engineering at the University of Kassel and received his Ph.D. in Aerospace Engineering and Engineering Mechanics, with a minor in mathematics, from Iowa State University in 1999. After holding various positions at Audi, in 2009 he took over the lead of the department "Architecture Driver Assistance Systems". The zFAS project is one of the department's core developments. Dr. Rudolph is a member of management at Audi.

Over the last several years, Audi and its partners have developed a platform that enables piloted driving and piloted parking. At CES 2015, Audi demonstrated that the system can drive piloted on the highway from Silicon Valley to Las Vegas. The computational platform, or brain, of this vehicle is called zFAS, with the core element being the NVIDIA Tegra K1. This talk starts with the history of and motivation for piloted functions at Audi, followed by an overview of the current architecture and an outline of the future potential of deep learning algorithms.

Level: Intermediate
Type: Talk
Tags: Automotive; Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Cars

Day: Wednesday, 03/18
Time: 16:00 - 16:25
Location: Room LL20D
View Recording

ECS5017 - Early Stage Challenge: INTEMPORA

Xavier Rouah CTO, INTEMPORA
Xavier Rouah is CTO at INTEMPORA SA. A graduate engineer in embedded systems, he worked with French car manufacturers experimenting with cooperative intelligent transport systems. This professional background helped him understand the needs and complexity of advanced driving assistance systems. Since joining INTEMPORA, Xavier has worked on porting INTEMPORA's RTMaps software to embedded systems. To meet the growing compute needs of intensive image processing algorithms for driving assistance systems, he started working on heterogeneous architectures with scientists and experts from car manufacturers. He is responsible for designing the integration architecture for compute-intensive technologies in RTMaps.

Intempora is an independent software vendor that provides the RTMaps technology. RTMaps is a modular software development and execution platform for the design of real-time, heterogeneous, multi-sensor applications and systems. Intempora is strongly present in Advanced Driving Assistance Systems R&D activities up to autonomous car projects (with customers such as Renault, PSA, Valeo, Honda, ESG and SAIC Motor, and numerous research labs such as INRIA, CEA, IFSTTAR, VEDECOM, DLR and Shanghai Jiao Tong), as well as in the robotics domain (THALES, DGA, DCNS, Airbus Group), cognitive load assessment, and advanced multimodal HMI development. RTMaps and the Intempora team accompany researchers and engineers in all stages of their ADAS and autonomous vehicle software development process:
• Datalogging and data playback of multiple high-bandwidth sensor data streams (cameras, CAN & LIN bus, GPS, lidars, radars, IMUs, etc.)
• Data management
• Real-time or offline data processing and data fusion for perception
• Navigation and communication
• Decision making
• Command-control
• Multimodal HMI development
• Human factors studies
• Validation and benchmarking

Level: All
Type: Talk
Tags: Automotive; Video & Image Processing; Emerging Companies Summit

Day: Wednesday, 03/18
Time: 16:20 - 16:28
Location: Room 220B

S5642 - Canvas: GPU Image Processing on Giant Surfaces

Thomas Soetens Founder and Research Director, Immersive Design Studios
Thomas Soetens (1972) graduated in 1992 with an MFA in Visual Arts from the St-Lucas School of Arts in Belgium. After practicing as a painter, he co-founded Workspace Unlimited in 2001 and founded Immersive Design Studios in 2007 where he currently acts as its research and development director. Immersive Design Studios is an interdisciplinary design and technology company based in Montreal utilizing the potential of 3D game technology in corporate events, architecture, cultural new-media installations, and real-time collaborative environments. Thomas Soetens has initiated several research projects and workshops in collaboration with an international network of institutions, companies, and universities. He is frequently invited to participate in lectures and presentations and his work has been highlighted in numerous publications.

We will discuss how we are bridging the transition from FPGA- to GPU-based image processing with our proprietary software CANVAS: a GPU image-processing platform designed for various AV applications, including multi-screen warping, blending, pixel-mapping and color matching. We will present a case study based on a project at Montreal's Bell Centre hockey arena, featuring projections on ice during the 2013 NHL playoffs. The installation required image warping and blending with 12 overlapping projectors, each set of 6 projectors mapping 6K imagery onto the arena ice. The use of CANVAS allowed for pixel-by-pixel resolution and easy warping and blending, while cutting the projector calibration time from 8-12 hours down to just 15 minutes. Attendees will learn how to push the limits of the GPU.

Level: All
Type: Talk
Tags: Media & Entertainment; Visualization - Large Scale & Multi-Display; Video & Image Processing

Day: Wednesday, 03/18
Time: 16:30 - 16:55
Location: Room LL21D
View Recording
View PDF

S5310 - GPU Computing for Distributed Acoustic Sensing

Marzban Palsetia Technical Advisor, Halliburton
Marzban Palsetia is a technical advisor at Halliburton where he develops software and algorithms for Fiber Optics technologies. He has more than fifteen years of experience in digital signal processing and has developed applications ranging from a Mobile Indoor Positioning system to a Synthetic Aperture Radar processing system for NASA’s LRO and Chandrayaan lunar missions. Prior to joining Halliburton, he was with Microsoft Corporation for six years and Vexcel Corporation for ten years, both in Boulder, CO. He holds a Master’s degree from the University of Florida, Gainesville and a Bachelor’s degree from the University of Bombay, India.

Distributed Acoustic Sensing (DAS) is a fiber optic technology deployed in energy production by Pinnacle, a Halliburton service. DAS, based on Rayleigh scattering principles, is used to determine acoustic strain over several kilometers, effectively turning the fiber into a series of virtual microphones. DAS data analysis involves processing high-volume-rate (> 400 MB/sec) data with algorithms for data correction, spectral filtering, and spectrogram and image generation. We show processing speedups with GPU-adapted algorithms that far exceed the single-CPU and multi-CPU versions, reducing processing time from the order of a day to a few minutes.

Level: All
Type: Talk
Tags: Energy Exploration; Signal & Audio Processing; Video & Image Processing

Day: Thursday, 03/19
Time: 09:00 - 09:25
Location: Room 210E

S5624 - GPU-Accelerated Image Processing for Modern Moving Images: Tachyon Wormhole

Lance Maurer CEO and founder, Cinnafilm, Inc.
Lance Maurer is the CEO and founder of Cinnafilm, Inc., a software engineering company dedicated to the development of the highest quality image processing solutions for both cinema and broadcast. Cinnafilm has developed solutions that positively impacted many of the most valuable film and television projects of all time. Lance's background is as a mechanical engineer in the aerospace industry, and he continues to design solutions for the largest American rocket programs, including Atlas, NMD, Delta IV and the new NASA SLS program.

Cinnafilm CEO and founder Lance Maurer will discuss Tachyon Wormhole, a scalable, real-time, GPU-accelerated tool for lengthening or shortening video by precise amounts, avoiding the need for added editorial work. This permits creating new commercial breaks and revenue opportunities. Processing is performed simultaneously on video, audio and captions, and the system also offers professional transcoding, motion-compensated frame-rate conversion, and unlimited format conversions. Wormhole is a software engineering marvel, receiving both the "Best of Show" award at NAB 2014 and the prestigious HPA Engineering Excellence Award for 2014. Wormhole is a joint project between Cinnafilm and Wohler Technologies.

Level: All
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Real-Time Graphics

Day: Thursday, 03/19
Time: 09:30 - 09:55
Location: Room LL21D
View Recording
View PDF

S5152 - GPU-Accelerated Undecimated Wavelet Transform for Film and Video Denoising

Hermann Fuerntratt Senior Researcher, Joanneum Research
Hermann Fürntratt studied Telematics at the Graz University of Technology, where he received his MSc in 1997 with a special focus on medical image processing. He then worked for more than a year in the UK for a company focused on digital color correction, and is now a senior researcher in the Audiovisual Media research group of the DIGITAL institute at JOANNEUM RESEARCH. He introduced CUDA at the Audiovisual Media group and implemented a real-time GPU-accelerated template-tracking library based on block matching. His research activities comprise porting all sorts of algorithms to the GPU with CUDA.

The Undecimated Wavelet Transform (UWT) is a valuable tool for all kinds of image and video enhancement tasks, such as denoising, deconvolution and super-resolution. Due to its translation invariance, it provides superior results compared with the classical discrete wavelet transform, but at the cost of significantly higher computational complexity. In this session, we will present a highly efficient GPU implementation of the UWT for 16-bit or 32-bit floating point images, based on modern GPU implementation strategies such as register blocking and the computation of multiple outputs per thread. Furthermore, we will show how the UWT is used within a novel film and video denoising algorithm that is able to deal with very different kinds of noise, such as film grain and digital sensor noise.
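
To make the transform concrete: the UWT is commonly computed with the à trous scheme, where the analysis filter is dilated by 2^level instead of decimating the image. A minimal CUDA sketch of one horizontal filtering pass follows (the B3-spline taps are a common choice for this transform; this is illustrative, not the presented register-blocked implementation):

```cuda
// One horizontal pass of an undecimated ("a trous") wavelet level: the
// filter taps are spaced 2^level apart and no decimation takes place,
// so the output has the same size as the input.
__global__ void uwt_atrous_rows(const float* in, float* out,
                                int width, int height, int level)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // 5-tap B3-spline kernel commonly used with the a trous transform.
    const float h[5] = {1.f/16, 4.f/16, 6.f/16, 4.f/16, 1.f/16};
    int step = 1 << level;                   // dilation: 1, 2, 4, ...

    float acc = 0.f;
    for (int k = -2; k <= 2; ++k) {
        int sx = x + k * step;
        sx = min(max(sx, 0), width - 1);     // clamp at image borders
        acc += h[k + 2] * in[y * width + sx];
    }
    out[y * width + x] = acc;
}
```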

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Video & Image Processing

Day: Thursday, 03/19
Time: 10:00 - 10:25
Location: Room LL21D
View Recording
View PDF

S5510 - Real-Time Image Segmentation for Homeland Security Exploiting Hyper-Q Concurrency

Fanny Nina-Paravecino Ph.D. Candidate, Northeastern University
Fanny Nina-Paravecino is a Ph.D. candidate in Computer Engineering at Northeastern University, where she belongs to the NUCAR research group under the supervision of Dr. David Kaeli. She received her B.S. summa cum laude in Computer Engineering from the University of San Antonio Abad of Cusco in Perú in 2005, and her M.Sc. in Computer Engineering from the University of Puerto Rico at Mayagüez in 2011. She achieved the best grade for her undergraduate thesis, "Virtual framework to simulate an industrial robot", using OpenGL 3D graphics with C#. Her research interests focus on high-performance optimization, with an emphasis on parallel architectures. She is highlighted in Women & CUDA on the NVIDIA website.

This talk will describe how concurrent kernel execution with Hyper-Q can impact our national security. By exploiting 32 concurrent work queues between the host and the device, we can identify the contents of baggage in CT images. This talk focuses on using Hyper-Q for real-time image segmentation as applied to luggage scanning at airports. Image segmentation plays a key role in this compute pipeline; the accuracy and real-time constraints of the application pose computational barriers. We discuss our ability to scale the number of streams using Hyper-Q, running on an NVIDIA GK110, and achieve a ~47x speedup when processing 32 megapixels versus an optimized OpenMP implementation running on an Intel Core i7-3770K.
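
At the API level, exploiting Hyper-Q is a matter of issuing independent work into separate CUDA streams and letting GK110's 32 hardware queues schedule the kernels concurrently. A minimal sketch of that pattern (the kernel is a toy threshold, standing in for the real segmentation):

```cuda
// Issue independent tiles into 32 CUDA streams; on GK110, Hyper-Q keeps
// the 32 queues independent in hardware so the kernels can overlap.
#include <cuda_runtime.h>

__global__ void segment_tile(float* tile, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tile[i] = (tile[i] > 0.5f) ? 1.0f : 0.0f;  // toy threshold
}

int main() {
    const int kStreams = 32, n = 1 << 18;
    cudaStream_t streams[kStreams];
    float* d_tiles[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&d_tiles[s], n * sizeof(float));
        cudaMemsetAsync(d_tiles[s], 0, n * sizeof(float), streams[s]);
        // Each tile gets its own queue; no cross-stream dependencies.
        segment_tile<<<(n + 255) / 256, 256, 0, streams[s]>>>(d_tiles[s], n);
    }
    for (int s = 0; s < kStreams; ++s) {
        cudaStreamSynchronize(streams[s]);
        cudaFree(d_tiles[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```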

Level: Beginner
Type: Talk
Tags: Defense; Video & Image Processing; Developer - Algorithms; Developer - Performance Optimization

Day: Thursday, 03/19
Time: 14:00 - 14:25
Location: Room LL21C
View Recording
View PDF

S5492 - Fast Digital Tomosynthesis for LIVE Radiation Therapy

Alexandros-Stavros Iliopoulos Ph.D. candidate, Department of Computer Science, Duke University
Alexandros-Stavros Iliopoulos is a Ph.D. candidate in the Department of Computer Science, Duke University. His research interests include developing efficient and robust computational models on parallel architectures, with applications to reconstruction from limited data, and signal/image processing. He is a Fulbright scholar, and received his Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki, Greece, in 2011.

Learn about the recently developed LIVE radiation oncology imaging system for 4D localization of moving tumors, and how its computational reconstruction algorithm may enable clinical applicability during adaptive radiation therapy treatments. We discuss the approach of LIVE for high-fidelity reconstruction from a partial patient scan, together with its clinical significance and resulting computational challenges. By exploiting the GPU computing model and using a novel algorithm formulation, we obtain a simple and efficient reconstruction process, allowing LIVE to go into clinical trials for the first time. We present results with patient data, and remark on remaining challenges.

Level: All
Type: Talk
Tags: Medical Imaging; Video & Image Processing; Press-Suggested Sessions: HPC & Science

Day: Thursday, 03/19
Time: 14:30 - 14:55
Location: Room LL21B
View Recording
View PDF

S5592 - Using OpenCL for Performance-Portable, Hardware-Agnostic, Cross-Platform Video Processing

Dennis Adams Director of Technology, Sony Creative Software Inc.
Dennis Adams has been with Sony Creative Software (and Sonic Foundry before it) for 15 years, creating features for Vegas Pro editing software and working on the video processing engine and effects. He led the Vegas Pro OpenCL GPU acceleration project, which shipped in 2011 and included dozens of accelerated video processing steps, effects and transitions.

This talk will discuss how Sony Creative Software used OpenCL to build a 4K video pipeline in Vegas Pro and the new Catalyst Prepare applications. It will cover the design, as well as the promises and pitfalls of writing over 100 OpenCL kernels covering all aspects of video processing, from color management to plug-in video effects.

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Video & Image Processing

Day: Thursday, 03/19
Time: 14:30 - 14:55
Location: Room LL21D
View Recording
View PDF

S5261 - Get into VR with 360 Video

Nicolas Burtey CEO, VideoStitch
Nicolas Burtey has been working in the 360 industry for over a decade. He started as a freelancer developing virtual tours long before Google Street View, working for a number of renowned customers such as Publicis, Renault and the French government. In 2012, he founded VideoStitch with the goal of bringing the next generation of 360 video software to market. Nicolas holds a Master's in Photography and a Licence in Computer Science.

Both Facebook and Hollywood view VR as a new medium, not only for computer-generated images but also for video. VideoStitch has developed 360-degree video stitching software that combines multiple HD video streams in real time using CUDA and NVIDIA GPUs. Camera manufacturers, the defense industry and movie production companies are among the initial customers. This talk gives an overview of the state of the art in creating 360-degree video, including the challenges of making multi-sensor cameras and combining 6-12 HD video streams into up to 8K video in real time with multiple GPUs.

Level: All
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Augmented Reality & Virtual Reality

Day: Thursday, 03/19
Time: 15:00 - 15:25
Location: Room LL21E
View Recording
View PDF

S5512 - Computer Aided Detection for 3D Breast Imaging and GPU Technology

Haili Chui Director, Imaging and CAD Science , Hologic Inc.
Haili Chui is currently the Director of the Imaging and CAD Science group at Hologic, Inc. The group is in charge of developing image processing and analysis algorithms for Hologic's 2D/3D breast imaging systems. Haili holds a Ph.D. degree in Electrical Engineering from Yale University.
Xiangwei Zhang Senior Principal Scientist, Hologic Inc.
Xiangwei Zhang holds a Ph.D. degree from University of Iowa and is currently working at Hologic Inc. as a Senior Principal Scientist.

In this talk, we will provide an overview of current CAD (Computer Aided Detection) systems and technology, and we will discuss GPU optimization as a key enabling factor in making such systems practical for 3D breast imaging. More specifically, we will cover the following topics: 1) the ongoing transition of breast imaging from 2D to 3D; 2) the making of a 3D CAD system; 3) the role of GPU optimization; and 4) trends in medical imaging, including big data analysis and risk modeling.

Level: Intermediate
Type: Talk
Tags: Medical Imaging; Video & Image Processing; Press-Suggested Sessions: HPC & Science

Day: Thursday, 03/19
Time: 15:00 - 15:25
Location: Room LL21B
View Recording
View PDF

S5236 - Advanced Geospatial Image Processing Using Graphics Processing Units

Ronald Kneusel Principal Software Engineer, Exelis Visual Information Solutions
Ronald Kneusel
Ronald Kneusel, MS (Physics, Michigan State University and Computer Science, University of Colorado, Boulder), has over 25 years of experience working in research and software industries. He is currently a Principal Software Engineer for Exelis Visual Information Solutions in Boulder, CO. He is also a 5th year PhD student at the University of Colorado, Boulder where his dissertation topic involves deep machine learning and image analysis. For Exelis VIS, Mr. Kneusel works in the Professional Services Group developing scientific algorithms and conducting research projects for clients. His areas of expertise include machine learning, algorithm development and research, and medical image analysis.
Atle Borsholm Senior Software Engineer, Exelis Visual Information Solutions
Atle Borsholm
Atle Borsholm, MS (EE, New Mexico State Univ) is an IDL software developer with comprehensive understanding of engineering workflows and processes. He has expertise in prototyping and developing custom IDL applications. His specialties include data visualization, analysis and scientific algorithm development, as well as code optimization. In addition to IDL, he also has extensive experience with ENVI, C, C++, CUDA, GL shader language. Atle has experience with a wide range of image processing and data analysis projects. Within geospatial data processing he has worked on projects analyzing airborne as well as space-borne data, including IR sensors, multi-spectral sensors, RADAR, and LIDAR. In non-destructive testing data analysis, he has experience with x-ray data (CR, DX), as well as ultrasonic data sources. His medical data processing experience includes MRI, CT, as well as NM.

Attendees will learn about advanced geospatial algorithms implemented for GPUs and integrated with existing high-level programming and analysis environments. Geospatial imagery presents a unique challenge for GPU analysis because of its massive size, often 32 GB or more per image. This talk will introduce a library and framework for working with geospatial images from within existing tools while allowing the user to easily develop new kernels or make use of the existing library of geospatial algorithms optimized for the GPU.

Level: All
Type: Talk
Tags: Defense; Developer - Tools & Libraries; Video & Image Processing; Developer - Algorithms; Developer - Performance Optimization

Day: Thursday, 03/19
Time: 15:30 - 15:55
Location: Room LL21C
View Recording
View PDF

S5554 - How ArcVideo Professional Broadcasting Transcoding Server Benefits from GPU Acceleration and Virtualization

Jin Huang Architect, ArcSoft, Inc
Jin Huang
Jin Huang is an Architect in ArcSoft's Multimedia and Cloud Product Group, focusing on multimedia and cloud solutions for PC and enterprise customers. Prior to this, Jin held roles in the Video and Home Entertainment Group, including Architect of both OEM and retail multimedia applications and manager of the major technical contacts with all hardware vendors. In that role Jin was responsible for technical communication with partners, planning the yearly product roadmap, and providing marketing and sales support.

ArcVideo is ArcSoft's professional video server product line for broadcasting and Internet companies, as well as enterprise customers. In this session, we introduce the complete ArcVideo transcoding/processing/analysis pipeline and how it benefits from NVIDIA GRID/Tesla GPU acceleration. We will also outline ArcVideo's roadmap for GPU virtualization.

Level: Beginner
Type: Talk
Tags: Media & Entertainment; Video & Image Processing

Day: Thursday, 03/19
Time: 15:30 - 15:55
Location: Room LL21E
View Recording

S5274 - GPU Accelerated Video Frame Search on Video Streams

Halil Enver Soylu Software Development Engineer, Erlab Software
Halil Enver Soylu
Halil Enver Soylu is a Software Development Engineer at Erlab Software, working on GPU-based video processing projects. Halil graduated from the Computer Science and Engineering program at Sabancı University in 2012. In addition to his position at Erlab Software, he is currently a research assistant in the Data Science Lab and an MSc student in the Electronics and Computer Engineering program at Istanbul Sehir University.

In this session, attendees will learn how Erlab uses GPU processing for real-time analysis of broadcast video for an image search and automatic ad insertion system for catch-up TV. In existing conventional catch-up TV systems, operators watch tens of channels to flag the first and last frames of programs to extract the program from the streams, which is a slow and costly operation. Our GPU-accelerated video frame catcher application extracts program contents from real-time streams automatically. The application compares program feeds against reference frames from the beginning and ending credits of each program and uses matches to signal the start and end of each program. The solution brings new analysis opportunities for the advertising business as well, since the same matching can be applied to detecting ad content.
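The frame-matching step above maps naturally onto the GPU. The sketch below is an illustration of the approach, not Erlab's actual implementation: a sum-of-absolute-differences (SAD) kernel scores a live luma frame against a stored credit frame, and a mean difference below a (hypothetical) threshold flags a program boundary.

// Minimal sketch: SAD between a live luma frame and a stored reference frame.
// A small mean absolute difference means the live frame matches the credit frame.
#include <cuda_runtime.h>

__global__ void sadKernel(const unsigned char* live, const unsigned char* ref,
                          int n, unsigned long long* sad)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int d = (int)live[i] - (int)ref[i];
        atomicAdd(sad, (unsigned long long)(d < 0 ? -d : d));
    }
}

// Returns true if the mean absolute difference falls below 'threshold'.
bool framesMatch(const unsigned char* d_live, const unsigned char* d_ref,
                 int width, int height, double threshold)
{
    unsigned long long* d_sad;
    cudaMalloc(&d_sad, sizeof(unsigned long long));
    cudaMemset(d_sad, 0, sizeof(unsigned long long));

    int n = width * height;
    sadKernel<<<(n + 255) / 256, 256>>>(d_live, d_ref, n, d_sad);

    unsigned long long sad = 0;
    cudaMemcpy(&sad, d_sad, sizeof(sad), cudaMemcpyDeviceToHost);
    cudaFree(d_sad);
    return (double)sad / n < threshold;
}

A production version would accumulate block-level partial sums in shared memory rather than issuing one atomic per pixel, and would compare downscaled frames to tolerate compression noise.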

Level: All
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Real-Time Graphics

Day: Thursday, 03/19
Time: 16:00 - 16:25
Location: Room LL21E
View Recording

S5352 - Real-Time Image Enhancement Using Multi-Frame Technique

Eric Kelmelis CEO, EM Photonics
Highly-Rated Speaker
Eric Kelmelis
Eric Kelmelis is CEO and co-founder of EM Photonics. For over 13 years, EM Photonics has focused on computational acceleration and efficient high-performance computing, primarily in the fields of scientific computing and image processing. Mr. Kelmelis has bachelor's and master's degrees in Electrical Engineering from the University of Delaware and has authored more than 60 technical papers, 3 patents, and a book chapter. He also currently serves as chair of the Modeling and Simulation conference at SPIE's Defense, Security, and Sensing Symposium and as a Visiting Instructor at the University of Delaware.

Learn how GPUs can be applied to real-time, real-world image processing applications. Images and videos recorded at long distances (greater than 1 mile) often suffer degradation due to the atmospheric turbulence between the subject and camera, which severely limits the quality of data that is captured by high-end imaging systems. We will discuss the practical considerations of keeping up with real-time video; tuning kernel performance; architecting complex, asynchronous, multi-stage processing pipelines; and effectively using multiple GPUs in a real-time context.

Level: All
Type: Talk
Tags: Video & Image Processing; Developer - Performance Optimization; Defense

Day: Thursday, 03/19
Time: 16:00 - 16:25
Location: Room LL21A
View Recording

S5613 - High-Performance Video Encoding Using NVIDIA GPUs

Abhijit Patait Sr. Manager, System Software, NVIDIA
Abhijit Patait
Abhijit Patait has been leading NVIDIA's GPU multimedia team for the past 5 years. His team is responsible for supporting the multimedia (audio and video) functionality in the NVIDIA GPU driver for Windows, the NVENC SDK, and the GRID SDK. Prior to NVIDIA, Abhijit held several engineering and management positions working in the areas of baseband signal processing, telecom and VoIP systems design, and audio/DSP processing. Abhijit holds an MSEE degree from the University of Missouri-Rolla and an MBA from the Haas School of Business, University of California at Berkeley.

This session is intended to provide a broad overview of the video encoding capabilities of current and future versions of NVIDIA's NVENC, a hardware accelerated encoder that ships with NVIDIA GPUs. We will provide an overview of the hardware capabilities and software APIs used for video encoding, with an overview of recent improvements in features, performance and quality. We will also provide a quick overview of how NVIDIA video encoding can be used in applications such as transcoding, video streaming, and GPU virtualization.

Level: All
Type: Talk
Tags: Media & Entertainment; Developer - Tools & Libraries; Video & Image Processing

Day: Thursday, 03/19
Time: 16:00 - 16:25
Location: Room LL21D
View Recording
View PDF

S5300 - High Quality Real Time Image Processing Framework on Mobile Platforms using Tegra K1

Eyal Hirsch Mobile GPU Leader, SagivTech Ltd.
Mr. Eyal Hirsch has 15 years' experience as a software developer. Prior to joining SagivTech, Eyal was a member of AMD's OpenCL team in Israel, developing and working on the OpenCL driver from AMD. Prior to AMD, Eyal was a team leader at Geomage, a leading software company in the oil & gas field. Geomage deployed one of the very first commercial GPU clusters in Israel, consisting of many GPUs. Eyal developed all the GPU implementations and was responsible for all aspects of the GPU life cycle from development through production. Prior to Geomage, Eyal served as a team leader at Cyota, which was later acquired by RSA.

Real-time image processing involves computationally intensive tasks. It becomes extremely important for mobile platforms equipped with cameras, e.g., wearable devices. Image processing algorithms perfectly suit the GPU architecture, and their implementation on discrete GPUs is well established. Now that compute-enabled GPUs are available on mobile platforms, real-time image processing is easier to obtain. SagivTech is a partner in Google's Project Tango, where it implemented Mantis Vision's depth algorithms on the Tegra K1. Hear SagivTech's experts on applying computer vision algorithms to the Tegra K1. We share our experience, provide tips on mobile GPU computing, and demonstrate the advantages of implementing state-of-the-art computer vision algorithms such as FREAK, BRISK, and DoG.

Level: All
Type: Talk
Tags: Video & Image Processing; Computer Vision & Machine Vision; Developer - Performance Optimization

Day: Thursday, 03/19
Time: 16:30 - 16:55
Location: Room LL21A
View PDF

S5365 - CTB Directional Gradient Detection Using 2D-DWT for Intra-Frame Prediction in HEVC

Maria Pantoja Faculty, Computer Engineering Department, Santa Clara University
Maria  Pantoja
Maria Pantoja was born in Merida, Spain. She received her B.S. and M.S. degree in engineering from Universidad Politecnica de Valencia, Spain, in 1994, and a second M.S. in computer science from California State University of the East Bay, Hayward, CA, in 2004. She received her Ph.D. from the Department of Computer Engineering at Santa Clara University (SCU) in 2009. She has worked as a senior software engineer for Logical Automation, JDS Uniphase and Nuko. She received the Packard fellowship in 2008 and worked as a teaching assistant for the department of computer engineering at SCU. She is currently a full time lecturer at the computer engineering department at SCU. Her current research interests lie in the areas of video compression, video transcoding and reconfigurable video coding. She is a member of the IEEE.
Damian Ruiz Coll Ph.D. Candidate, Universidad Politecnica Valencia
Damian Ruiz Coll
Damian Ruiz Coll received an M.Sc. degree in Telecommunications Engineering from the Universidad Politécnica de Madrid in 2000, and he is currently a Ph.D. candidate in Computer Science at the Universidad de Castilla-La Mancha (UCLM). During his doctoral studies, in 2012, he was a guest researcher at Florida Atlantic University (FAU) in the United States. He is currently working as a researcher at the Mobile Communication Group (MCG) of the Institute of Telecommunications and Multimedia Applications (iTEAM), Valencia, Spain. His main research interests include low-complexity algorithms for the new video coding standard, HEVC (High Efficiency Video Coding). He is a member of DVB, FOBTV, MPEG and "Beyond HD" of EBU.

HEVC has 35 different intra prediction modes. The purpose of the project is to detect the dominant edge of the Prediction Blocks (PBs). HEVC needs two arrays of neighbouring pixels (up and left of the block) for each available PB size to compute the predictor. These inter-PB dependencies force the HEVC reference software's search for the optimal directional prediction to be sequential. We propose a parallel algorithm that uses wavelets to estimate, for each Prediction Unit (PU), the directional modes with the highest probability of being optimal, reducing the pool of candidate directional modes to just 3 to 5. The 2D-DWT will be applied only at the CTB (64x64) level, and we will test different edge extensions (zero, mirror, etc.) and wavelet filters.
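To illustrate the underlying idea (this is a generic sketch, not the authors' code): a single-level 2D Haar transform of a block yields LH, HL, and HH detail subbands whose energies indicate whether the dominant edges are horizontal, vertical, or diagonal, and those energies can then prune the set of angular modes worth evaluating. The kernel below computes the three subband energies for one 64x64 CTB, one thread per 2x2 cell.

// Sketch: single-level 2D Haar analysis of a 64x64 luma block. Each thread
// processes one 2x2 cell and accumulates the energies of the detail subbands.
__global__ void haarSubbandEnergy(const unsigned char* blk, int stride,
                                  float* energy /* [3] = LH, HL, HH */)
{
    int x = 2 * (blockIdx.x * blockDim.x + threadIdx.x);
    int y = 2 * (blockIdx.y * blockDim.y + threadIdx.y);
    if (x >= 64 || y >= 64) return;

    float a = blk[y * stride + x];
    float b = blk[y * stride + x + 1];
    float c = blk[(y + 1) * stride + x];
    float d = blk[(y + 1) * stride + x + 1];

    float lh = (a + b - c - d) * 0.5f;  // responds to horizontal edges
    float hl = (a - b + c - d) * 0.5f;  // responds to vertical edges
    float hh = (a - b - c + d) * 0.5f;  // responds to diagonal edges

    atomicAdd(&energy[0], lh * lh);
    atomicAdd(&energy[1], hl * hl);
    atomicAdd(&energy[2], hh * hh);
}

Launched as haarSubbandEnergy<<<dim3(4, 4), dim3(8, 8)>>>(blk, stride, energy), the relative magnitudes of the three energies suggest which subset of HEVC's 33 angular modes deserves a full rate-distortion evaluation.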

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Video & Image Processing

Day: Thursday, 03/19
Time: 16:30 - 16:55
Location: Room LL21D
View Recording
View PDF

S5567 - Cascaded Displays: Spatiotemporal Superresolution Using Offset Pixel Layers

Dikpal Reddy Research Scientist, Light.co
Dikpal Reddy's research interests are in computational photography, displays and computer vision. He received his Bachelors in Electrical Engineering from Indian Institute of Technology Kanpur in 2005 and Ph.D. in Electrical and Computer Engineering from University of Maryland College Park in 2011. Prior to joining Light.co he was a postdoctoral scholar at UC Berkeley from 2011-2013 working on efficient acquisition of light transport and light field imaging.

We describe a new approach to quadruple the effective pixel count and double the refresh rate of existing displays. Our approach, termed cascaded displays, achieves high resolution by stacking two or more spatial light modulators, such as LCDs, on top of one another and offsetting them by half a pixel or less both horizontally and vertically. The same concept can also be applied temporally to increase the effective frame rate. We use a real-time GPU-based non-negative matrix factorization to decompose the desired images, videos, or real-time content into appropriate multi-layered attenuation patterns. We have prototyped this technology with a dual-layer LCD, a digital projector containing a pair of LCoS microdisplays, and multi-layer stacks of printed films.
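The factorization at the core of this decomposition can be prototyped directly with cuBLAS. Below is a minimal sketch (our illustration, not the production solver) of one multiplicative update of the layer matrix H in a standard Frobenius-norm NMF, H <- H .* (W'V) ./ (W'W H); the display decomposition iterates updates of this kind for each layer in real time.

// Sketch: one multiplicative NMF update for H, with column-major matrices on
// the GPU: W is m x k, V is m x n, H is k x n. Scratch buffers preallocated.
#include <cublas_v2.h>
#include <cuda_runtime.h>

__global__ void elementwiseUpdate(float* H, const float* WtV,
                                  const float* WtWH, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) H[i] *= WtV[i] / (WtWH[i] + 1e-9f);  // keeps H non-negative
}

void nmfUpdateH(cublasHandle_t h, const float* W, const float* V, float* H,
                float* WtV, float* WtW, float* WtWH, int m, int n, int k)
{
    const float one = 1.0f, zero = 0.0f;
    // WtV = W' * V (k x n)
    cublasSgemm(h, CUBLAS_OP_T, CUBLAS_OP_N, k, n, m,
                &one, W, m, V, m, &zero, WtV, k);
    // WtW = W' * W (k x k)
    cublasSgemm(h, CUBLAS_OP_T, CUBLAS_OP_N, k, k, m,
                &one, W, m, W, m, &zero, WtW, k);
    // WtWH = WtW * H (k x n)
    cublasSgemm(h, CUBLAS_OP_N, CUBLAS_OP_N, k, n, k,
                &one, WtW, k, H, k, &zero, WtWH, k);
    // H = H .* WtV ./ (WtWH + eps), elementwise over all k*n entries
    int total = k * n;
    elementwiseUpdate<<<(total + 255) / 256, 256>>>(H, WtV, WtWH, total);
}

Because every operation is a dense GEMM or an elementwise map, the whole update runs at near-peak GPU throughput, which is what makes a real-time decomposition feasible.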

Level: Intermediate
Type: Talk
Tags: Video & Image Processing

Day: Thursday, 03/19
Time: 17:00 - 17:25
Location: Room LL21A
View Recording

S5585 - Multi-GPU Training for Large-Scale Visual Object Recognition

Wei Xia Research Scientist, Orbeus
Wei Xia
Wei Xia is a research scientist at Orbeus Inc.; he completed his Ph.D. in computer vision and machine learning at the National University of Singapore in 2014. He has rich research experience in the field of generic object classification, detection, and segmentation. He won the winner's awards in both the segmentation and classification competitions of the PASCAL VOC Challenge 2012, the runner-up award in the ILSVRC Challenge 2013, and the winner's award in the ILSVRC Challenge 2014, which are among the most impactful competitions in this field. He visited Lund University, Sweden, as a visiting scholar in 2013. He has published many academic papers in top international conferences and journals of computer vision, and was awarded the President's Graduate Fellowship (top 1%) for his achievements in both research and coursework at the National University of Singapore. He has also served as a reviewer for many international conferences and journals, including ECCV, BMVC, ICASSP, ICPR, ICIP, ACMMM, TCSVT, and MVP. For industry experience, he was a research intern at Panasonic Singapore Laboratory (2012-2013) and at Singapore's 2359 Media Pte Ltd (2013).

Despite the great progress of deep learning models (deep convolutional neural networks) in the field of visual recognition in the past few years, one of the greatest bottlenecks lies in the extremely long training times (from several weeks to months) needed to handle tens of millions of training images. The goal of this session is to share the results that we achieved when we used multiple GPUs installed in one server to speed up the training process. By configuring 16 GPUs (8 Titan Zs) and optimizing the parallel implementation of CNN training, up to a 14x speedup is achieved without compromising the model's accuracy, and sometimes even boosting it. Comprehensive experimental results have demonstrated the linear scalability of the proposed multi-GPU training process.

Level: Intermediate
Type: Talk
Tags: Machine Learning & Deep Learning; Computer Vision & Machine Vision; Video & Image Processing

Day: Thursday, 03/19
Time: 17:00 - 17:25
Location: Room 210A
View Recording
View PDF

S5602 - JPEG2000 on GPU: A Fast 4K Video Mastering, Archiving, and Contribution

Jiri Matela CEO, Comprimato
Jiri Matela
Jiri Matela received B.Sc. and M.Sc. degrees in Computer Science from Masaryk University in Brno, Czech Republic, in 2007 and 2009. He is currently working toward a Ph.D. degree at Masaryk University, focusing on image compression, reformulations of image processing algorithms for massively parallel GPU architectures, and high-speed networks. He is a member of the team that recently received the ACM Multimedia Best Open-Source Software Award for the real-time image compression and video transmission application UltraGrid, and that demonstrated one of the first real-time compressed transmissions of video in 8K Ultra High-Definition resolution. Jiri is the founder of Comprimato Systems, a company focusing on GPU-accelerated image compression and video codecs.

JPEG2000 is a state-of-the-art video compression standard adopted by all digital cinemas. Beyond that, it has become the format of choice for long-term archiving, mainly because it significantly saves disk space, provides superior image quality, and allows for mathematically lossless compression. Recent developments in the standardization of master video formats (IMF) make JPEG2000 the emerging video compression for 4K delivery, and because of its very high image quality it is being used for broadcast contribution as well. The talk will cover various applications of JPEG2000 in digital video production workflows and explain how NVIDIA GPUs enable such workflows with speed sufficient for 4K video processing.

Level: All
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Medical Imaging; Defense

Day: Thursday, 03/19
Time: 17:00 - 17:25
Location: Room LL21D
View Recording

S5209 - GPU-Accelerated Image Processing for NASA's Solar Dynamics Observatory

Mark Cheung Staff Physicist, Lockheed Martin Solar & Astrophysics Laboratory
Mark Cheung is an astrophysicist at the Lockheed Martin Solar and Astrophysics Lab in Palo Alto, CA. Since joining the lab in 2006, he has worked on a number of NASA-sponsored solar missions. His research focuses on improving our understanding of the Sun and how its changes affect us. Since 2011, he has been the Science Lead for the Atmospheric Imaging Assembly instrument onboard NASA's Solar Dynamics Observatory (SDO). In this role, he is responsible for helping fellow US and international researchers get the best science out of SDO observations of the Sun's corona. Since 2014, he has been the principal investigator for one of NASA's Heliophysics Grand Challenges Research programs. This award supports Cheung and his team to combine observations with massively parallel computer simulations to uncover the physical causes of sunspots, solar flares and eruptions. Mark Cheung holds a doctorate in natural sciences from the University of Göttingen, Germany. He is a recipient of the Otto Hahn medal awarded by the Max Planck Society.

Since its launch in 2010, NASA's Solar Dynamics Observatory (SDO) has continuously monitored the Sun's changes in magnetic activity. Both the Atmospheric Imaging Assembly (AIA) and Helioseismic & Magnetic Imager (HMI) instruments onboard SDO deliver 4096x4096 pixel images at a cadence of more than one image per second. Although SDO images are free from distortion by absorption and scattering in the Earth's atmosphere, images are still blurred by the intrinsic point spread functions of the telescopes. In this presentation, we show how the instrument teams have deployed CUDA-enabled GPUs to perform deconvolution of SDO images. The presentation will demonstrate how we leveraged cuFFT and Thrust to implement an efficient image processing pipeline.
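As a taste of how such a pipeline fits together (an illustrative sketch, not the SDO team's code), the snippet below performs a regularized inverse filter with cuFFT: forward-transform the blurred image and the PSF, divide in the frequency domain with damping, and transform back. Iterative schemes such as Richardson-Lucy repeat FFT-based convolutions of exactly this form, and Thrust can supply the surrounding elementwise reductions.

// Sketch: regularized inverse filtering with cuFFT. d_img and d_psf are n x n
// single-precision complex images already on the GPU; the PSF is assumed to
// be centered at pixel (0,0). eps damps division by near-zero spectral values.
#include <cufft.h>
#include <cuda_runtime.h>

__global__ void inverseFilter(cufftComplex* IMG, const cufftComplex* PSF,
                              int n2, float eps)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n2) return;
    cufftComplex h = PSF[i], g = IMG[i];
    float denom = h.x * h.x + h.y * h.y + eps;       // |H|^2 + eps
    cufftComplex f;                                  // F = G * conj(H) / denom
    f.x = (g.x * h.x + g.y * h.y) / denom / n2;      // the 1/n2 factor
    f.y = (g.y * h.x - g.x * h.y) / denom / n2;      // normalizes the inverse FFT
    IMG[i] = f;
}

void deconvolve(cufftComplex* d_img, cufftComplex* d_psf, int n)
{
    cufftHandle plan;
    cufftPlan2d(&plan, n, n, CUFFT_C2C);
    cufftExecC2C(plan, d_img, d_img, CUFFT_FORWARD);
    cufftExecC2C(plan, d_psf, d_psf, CUFFT_FORWARD);
    int n2 = n * n;
    inverseFilter<<<(n2 + 255) / 256, 256>>>(d_img, d_psf, n2, 1e-3f);
    cufftExecC2C(plan, d_img, d_img, CUFFT_INVERSE); // cuFFT leaves it unscaled
    cufftDestroy(plan);
}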

Level: All
Type: Talk
Tags: Astronomy & Astrophysics; Video & Image Processing; Press-Suggested Sessions: HPC & Science

Day: Thursday, 03/19
Time: 17:30 - 17:55
Location: Room 210D
View Recording

S5314 - Practical Real-Time Video Rendering with Modern OpenGL and GStreamer

Heinrich Fink Software Engineer, ToolsOnAir Broadcast Engineering GmbH
Heinrich Fink
Heinrich Fink is a software engineer at ToolsOnAir. He has an MSc in Visual Computing and is currently working in the R&D team at ToolsOnAir. He has been working as a teaching assistant for several years together with Professor Michael Wimmer at TU Vienna. His work “Teaching a modern graphics pipeline using a shader-based software renderer” was published in the Computers & Graphics journal. During his master's thesis, “GPU-based Video Processing in the Context of TV Broadcasting”, he implemented the OpenGL benchmarking tool “gl-frame-bender”, which he continues to develop as an open-source project at ToolsOnAir.

Learn about using OpenGL and GStreamer for advanced video rendering on the GPU. We will present two R&D projects at ToolsOnAir: (1) "gl-frame-bender", our open-source OpenGL benchmarking tool that we use to investigate advanced OpenGL methods for video rendering and (2) how we use and extend GStreamer to implement a live video mixing engine that is completely processed by graphics hardware. We will show practical examples of modern OpenGL techniques that we found to be most effective when rendering video. We will talk about our contribution to GStreamer's support for hardware codecs and OpenGL, and how it helps us to implement a flexible high-performance video mixing pipeline.

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Video & Image Processing; Real-Time Graphics

Day: Thursday, 03/19
Time: 17:30 - 17:55
Location: Room LL21D
View Recording
View PDF

S5869 - SenDISA: Distributed Intelligent, Video, Sensor & Actuator Analytics Platform for Smart Cities (Presented by Sensen)

Dr. Subhash Challa CEO, Sensen Networks
With a focus on sales & strategy, I help my team to close sales and manage major accounts in a variety of markets including transportation, security, gaming and hospitality. Prior to taking up the full-time role as the CEO of SenSen Networks in January 2012, I was a Senior Principal Scientist at NICTA, University of Melbourne, and led a number of ICT-for-life-sciences projects. I started my professional career as a Research Fellow at the University of Melbourne in 1998, where I led a number of tracking and data fusion projects. With a deep and passionate interest in taking ideas to usable products, I spent over a decade of my career in R&D and product development. I was the Professor of Computer Systems Engineering at the University of Technology Sydney from 2004-2007.

This session will introduce SenSen's proprietary Video, Sensor and Actuator Analytics Platform (SenDISA), which is used by some of the world's most prestigious and trusted organizations, including the Abu Dhabi airport, the Singapore police, Roads & Maritime Services Australia, the Westgate Bridge in Melbourne, Australia, the City of Trondheim, Norway, and the cities of Brisbane, Ipswich, and Manly, among others. We will present how our innovative algorithms, powered by the GPGPU-based SenDISA platform, enable Big Data analytic applications by fusing data from video, sensor, and IoT devices and combining them with other transaction data to deliver smart city solutions across the globe. We will provide insights into the architecture of SenDISA and the market-specific Big Data solutions serving different market verticals.

Level: Intermediate
Type: Talk
Tags: Big Data Analytics; Computer Vision & Machine Vision; Video & Image Processing; Press-Suggested Sessions: Deep Learning & Computer Vision

Day: Thursday, 03/19
Time: 17:30 - 17:55
Location: Room 210B
View Recording
View PDF

S5563 - FlexISP: A Flexible Camera Image Processing Framework

Dawid Pajak Senior Research Scientist, NVIDIA
Dawid Pająk joined NVIDIA in October 2011 as a member of Mobile Visual Computing Research group. His research interests include image processing, image/video compression, computational photography, HDR imaging and perception in CG. At times he also finds himself in high performance computing and GPGPU applications. From 2009 to 2010 he worked as a visiting researcher at Max Planck Institut für Informatik (Germany). Before moving into research, he was a Co-Founder and Technical Lead at Capricom Mobile, where he forged CG technology and games for emerging mobile platforms. He holds a M.Sc. Eng. degree in Computer Science from Technical University of Szczecin (Poland) and a Ph.D. degree in Computer Science from West Pomeranian University of Technology.

Conventional pipelines for capturing, displaying, and storing images are usually defined as a series of cascaded modules, each responsible for addressing a particular problem. While this divide-and-conquer approach offers many benefits, it also introduces a cumulative error, as each step in the pipeline only considers the output of the previous step, not the original sensor data. We propose an end-to-end system that is aware of the camera and image model and enforces natural-image priors, while jointly accounting for common image processing steps like demosaicking, denoising, deconvolution, and so forth, all directly in a given output representation (e.g., YUV, DCT). Our system is flexible, and we demonstrate it on regular Bayer images as well as images from custom sensors. In all cases, we achieve significant improvements in image quality.

Level: Advanced
Type: Talk
Tags: Video & Image Processing

Day: Friday, 03/20
Time: 09:00 - 09:25
Location: Room LL21A
View Recording

S5491 - GPU-Based, Real-Time HEVC Decoder, UHD Solution on Automotive Infotainment Platforms

Rama Mohana Reddy Technical Manager, PathPartner Technology Consulting Pvt Ltd.
Rama Mohana Reddy
Rama has more than 9 years of experience in video algorithm development and optimization on embedded processors. He possesses extensive experience in design, development, and optimization for DSP, ARM, and multicore architectures. He currently leads a team developing HEVC video codecs on DSP-based multichip platforms, GPGPUs, and many other SoCs.

In this session we present the use of GPUs for HEVC decoding. Offloading parts of the HEVC decoder to the GPU saves significant CPU time and power, which can be used for other critical tasks. Running the motion compensation module of the HEVC decoder on the GPU made it possible to achieve a real-time HEVC decoding solution for UHD resolution on automotive infotainment platforms; porting motion compensation to the GPU yielded 40% CPU time savings and good scalability.
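The reason motion compensation suits the GPU so well is that every output pixel can be produced independently from the reference frame. The kernel below is a simplified integer-pel sketch of that idea (an illustration, not PathPartner's implementation; a real HEVC decoder must also apply the standard's 7/8-tap fractional-pel interpolation filters and handle bi-prediction).

// Sketch: integer-pel motion compensation, one thread per output pixel.
// mvx/mvy hold one motion vector per blkW x blkH block of the frame.
__global__ void motionCompensate(const unsigned char* ref, unsigned char* pred,
                                 int width, int height,
                                 const short* mvx, const short* mvy,
                                 int blkW, int blkH)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int blocksPerRow = (width + blkW - 1) / blkW;
    int b = (y / blkH) * blocksPerRow + (x / blkW);  // block this pixel is in

    // Clamp the motion-shifted coordinate to the frame borders.
    int sx = min(max(x + (int)mvx[b], 0), width  - 1);
    int sy = min(max(y + (int)mvy[b], 0), height - 1);
    pred[y * width + x] = ref[sy * width + sx];
}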

Level: Intermediate
Type: Talk
Tags: Video & Image Processing; Embedded Systems; Automotive; Media & Entertainment

Day: Friday, 03/20
Time: 09:30 - 09:55
Location: Room LL21A
View Recording

S5208 - Streaming FFTs on Large 3D Microscope Images

Peter Steinbach HPC Developer, Max Planck Institute of Molecular Cell Biology and Genetics
Peter Steinbach
I studied at DESY Hamburg and Zeuthen, Humboldt University of Berlin, and the University of Leipzig, from which I received a Diploma in Physics. After that, I completed a PhD thesis in particle physics, analysing data from the ATLAS experiment at the Large Hadron Collider (CERN, Switzerland). I am now a High Performance Computing (HPC) Developer at the Max Planck Institute of Molecular Cell Biology and Genetics, where I support scientific groups in developing fast software that harnesses the capabilities of today's HPC installations.

Dive deep into efficient and fast memory transfers of multi-gigabyte image data to perform swift iterative deconvolutions of 3D microscope imagery. Through the creation of an open-source GPU deconvolution implementation (github.com/psteinb/libmultiviewnative), I studied various techniques to orchestrate memory copies of multi-dimensional images. I will present concepts, available options, and details of efficient memory transfers from host to device memory. I will showcase CUDA/C++ code and discuss my experiences with various CUDA versions on NVIDIA hardware that led to greater performance (2-3x) than just performing the calculations on the device. This work will enable the scientific community to push the limits of processing and handling data gathered by imaging living tissue.
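The transfer pattern at the heart of the talk can be sketched as follows (a generic illustration under my own naming, not the libmultiviewnative code): the multi-gigabyte volume is cut into chunks staged through pinned host memory, and each chunk's upload, kernel, and download are issued into a small ring of CUDA streams so the copies of one chunk overlap the computation of another. cuFFT work can join the same pipeline via cufftSetStream.

// Sketch: chunked, overlapped host<->device streaming with pinned staging.
#include <cuda_runtime.h>
#include <cstring>

__global__ void process(float* data, size_t n)   // stand-in for real work
{
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

void streamVolume(const float* hostIn, float* hostOut,
                  size_t total, size_t chunk)
{
    const int NSTREAMS = 4;
    cudaStream_t s[NSTREAMS];
    float *dbuf[NSTREAMS], *pinned[NSTREAMS];
    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamCreate(&s[i]);
        cudaMalloc(&dbuf[i], chunk * sizeof(float));
        cudaMallocHost(&pinned[i], chunk * sizeof(float));  // pinned staging
    }
    for (size_t off = 0, c = 0; off < total; off += chunk, ++c) {
        int i = c % NSTREAMS;
        size_t n = (total - off < chunk) ? total - off : chunk;
        cudaStreamSynchronize(s[i]);          // wait until buffers are reusable
        memcpy(pinned[i], hostIn + off, n * sizeof(float));
        cudaMemcpyAsync(dbuf[i], pinned[i], n * sizeof(float),
                        cudaMemcpyHostToDevice, s[i]);
        process<<<(unsigned)((n + 255) / 256), 256, 0, s[i]>>>(dbuf[i], n);
        cudaMemcpyAsync(hostOut + off, dbuf[i], n * sizeof(float),
                        cudaMemcpyDeviceToHost, s[i]);  // pageable output here;
    }                                                   // stage it too in production
    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamSynchronize(s[i]);
        cudaFree(dbuf[i]); cudaFreeHost(pinned[i]); cudaStreamDestroy(s[i]);
    }
}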

Level: Intermediate
Type: Talk
Tags: Video & Image Processing; Life & Material Science; Computer Vision & Machine Vision; Data Center, Cloud Computing & HPC

Day: Friday, 03/20
Time: 10:00 - 10:25
Location: Room LL21A
View Recording

S5305 - A 2D Convolution Framework for Extreme Performance Tuning

Alan Wang Compute Architect, NVIDIA
Alan is a GPU architect in the computer vision field at NVIDIA. He is experienced in parallelization, performance modeling, and architecture-specific tuning. Alan is currently working on 2D convolution projects. Before joining the computer architecture team, Alan worked on graphics tracing and FPGA architecture & EDA software.

We propose a 2D convolution framework that (1) maintains a unified abstraction incorporating a series of optimization techniques and (2) can auto-tune the performance on different GPUs. We quantify and analyze the performance impact of using each single strategy, revealing its potential when applied to other applications. Experiments show that an algorithm tuned by our framework can reach a high GFLOPS utilization of nearly 80% when targeting the GM107.
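A baseline kernel of the kind such a framework generates and tunes might look like the shared-memory tiled convolution below (a generic sketch, not the framework itself); the tile edge, filter radius, apron loading scheme, and per-thread work are precisely the knobs an auto-tuner would sweep per GPU.

// Sketch: tiled 2D convolution with a (2R+1)x(2R+1) filter in constant memory.
// Launch with TILE x TILE thread blocks.
#define R     2            // filter radius (a tunable parameter)
#define TILE  16           // output tile edge (a tunable parameter)

__constant__ float c_filter[(2 * R + 1) * (2 * R + 1)];

__global__ void conv2d(const float* in, float* out, int width, int height)
{
    __shared__ float tile[TILE + 2 * R][TILE + 2 * R];

    int ox = blockIdx.x * TILE + threadIdx.x;   // output pixel coordinates
    int oy = blockIdx.y * TILE + threadIdx.y;

    // Cooperatively load the tile plus its apron, clamping at the borders.
    for (int y = threadIdx.y; y < TILE + 2 * R; y += TILE)
        for (int x = threadIdx.x; x < TILE + 2 * R; x += TILE) {
            int gx = min(max((int)(blockIdx.x * TILE) + x - R, 0), width  - 1);
            int gy = min(max((int)(blockIdx.y * TILE) + y - R, 0), height - 1);
            tile[y][x] = in[gy * width + gx];
        }
    __syncthreads();

    if (ox >= width || oy >= height) return;

    float acc = 0.0f;
    for (int fy = 0; fy <= 2 * R; ++fy)
        for (int fx = 0; fx <= 2 * R; ++fx)
            acc += c_filter[fy * (2 * R + 1) + fx]
                 * tile[threadIdx.y + fy][threadIdx.x + fx];
    out[oy * width + ox] = acc;
}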

Level: Intermediate
Type: Talk
Tags: Video & Image Processing; Developer - Performance Optimization; Computer Vision & Machine Vision

Day: Friday, 03/20
Time: 10:30 - 10:55
Location: Room LL21A
View Recording

S5800 - Deep Convolutional Neural Network for Computer Vision Products

Li Xu Director, R&D, Sensetime Group Limited
Dr. Li Xu is the Director of the R&D department at Sensetime Group Limited, a company aiming to provide the industry with leading computer vision solutions. Li has more than 10 years' R&D experience in the fields of computer vision, image processing, and computational photography. His most recent interest lies in the combination of deep neural networks and generative vision models to solve real-world problems, with the help of modern GPU and CUDA technology.

We have witnessed many ground-breaking results in computer vision research using deep learning techniques. In this talk, we introduce recent achievements of our group (http://sensetime.com/), which we believe will bridge the gap between research and product development and bring about many computer-vision-enabled smart products. We show that our unified deep CNN framework, accelerated using modern GPU architectures, can be easily applied to various vision tasks including image processing, pedestrian detection, object localization, and face recognition, while achieving state-of-the-art performance.

Level: All
Type: Talk
Tags: Machine Learning & Deep Learning; Video & Image Processing; Media & Entertainment

Day: Friday, 03/20
Time: 10:30 - 10:55
Location: Room 210A
View Recording
View PDF

S5562 - Fast ANN for High-Quality Collaborative Filtering

Yun-Ta Tsai Software Engineer, Google, Inc.
Yun-Ta Tsai joined Google [X] in November 2014. Prior to Google, he was a senior research scientist at NVIDIA Research. His research focus included parallel computing, image processing, and computer architecture on mobile devices. Before NVIDIA, he worked at Nokia Research Center on augmented reality and mobile navigation; primarily, as a system architect, he oversaw frameworks from design to implementation on mobile systems. He received an M.Sc. in Computer Science from USC in 2009, where he designed and led several game production teams in the Gamepipe Lab. His work involved substantially various aspects of mobile applications: graphics, vision, gaming, and user experience. Currently his primary focus is mobile computation and image processing.

Collaborative filtering collects similar patches, jointly filters them, and scatters the output back to input patches; each pixel gets a contribution from each patch that overlaps with it, allowing signal reconstruction from highly corrupted data. Exploiting self-similarity, however, requires finding matching image patches, which is an expensive operation. We propose a GPU-friendly approximated-nearest-neighbor algorithm that produces high-quality results for any type of collaborative filter. We evaluate our ANN search against state-of-the-art ANN algorithms in several application domains. Our method is orders of magnitude faster, yet provides similar or higher-quality results than the previous work.

Level: Advanced
Type: Talk
Tags: Video & Image Processing

Day: Friday, 03/20
Time: 13:00 - 13:25
Location: Room LL21A
View Recording

S5630 - Deep Learning Made Easy with GraphLab

Piotr Teterwak ToolKit Developer, Dato
Piotr Teterwak
Piotr recently finished his BA in Computer Science at Dartmouth College in Hanover, NH, where he conducted work exploring the learning of Convolutional Deep Neural Nets with applications in Computer Vision. He currently works on the Toolkit Development team at Dato.

Deep Learning is a promising machine learning technique with a high barrier to entry. In this talk, we provide an easy entry into this field via "deep features" from pre-trained models. These features can be trained on one data set for one task and used to obtain good predictions on a different task, on a different data set. No prior experience necessary. Real-time demos will be given using GraphLab Create, popular software built on an open-source foundation. GraphLab Create utilizes NVIDIA GPUs for significant performance speedup.

Level: All
Type: Talk
Tags: Machine Learning & Deep Learning; Big Data Analytics; Video & Image Processing

Day: Friday, 03/20
Time: 13:00 - 13:50
Location: Room 210A
View Recording
View PDF

S5455 - High Capability Multidimensional Data Compression on GPUs

Sergio Zarantonello Adjunct Lecturer (SCU), Chief Executive Officer (Algorithmica LLC), Santa Clara University and Algorithmica LLC
Sergio Zarantonello
Sergio Zarantonello, Ph.D., was in R&D at Exxon, held management positions at SGI and Fujitsu America, was VP of Engineering at 3DGeo, and is currently CEO of Algorithmica. Sergio obtained a Ph.D. in mathematics from the University of Wisconsin - Madison. Sergio is an Adjunct Lecturer in Applied Mathematics at Santa Clara University, where he teaches graduate courses on wavelets, Fourier analysis, and numerical methods for partial differential equations.
Ed Karrels Adjunct Lecturer, Santa Clara University
Ed Karrels
Ed Karrels earned his bachelor's degree in computer science from the University of Wisconsin Oshkosh and master's degree in computer engineering from Santa Clara University. He is an adjunct lecturer at Santa Clara University in the Department of Computer Engineering.

In this talk we present a CUDA implementation of a wavelet-based compression utility for multidimensional data, and give examples of its application in earth science and medical imaging. Key features of our codec are efficiency and speed. A special feature is the ability to guarantee compression errors no larger than an a priori set tolerance in a user-prescribed metric. Since this feature requires multiple passes of the compress-decompress process, the hardware acceleration offered by GPUs is critical. This work was done in collaboration with S. E. Zarantonello, D. Concha, D. Fabris, A. Goyal, E. Karrels, B. Smithson, and Q. Wang from the School of Engineering at Santa Clara University.
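The error guarantee is what makes GPU speed essential: the codec must compress, decompress, and re-measure the error until the tolerance holds. A host-side sketch of that control loop is below; compressGPU, decompressGPU, and maxAbsError are hypothetical stand-ins for the codec's real entry points, and the loop bisects the wavelet quantization step against an L-infinity tolerance.

// Sketch: bisect the quantization step until the reconstruction error meets an
// a priori tolerance. All three helpers are hypothetical GPU-backed routines.
#include <cstddef>

size_t compressGPU(const float* data, size_t n, float qstep, void* out);
void   decompressGPU(const void* in, size_t nbytes, float* recon);
float  maxAbsError(const float* a, const float* b, size_t n);  // L-inf metric

size_t compressWithGuarantee(const float* data, float* recon, void* out,
                             size_t n, float tol)
{
    // Invariant: a step of 'lo' satisfies the tolerance, 'hi' does not.
    float lo = 0.0f, hi = 1024.0f;
    for (int pass = 0; pass < 12; ++pass) {     // each pass = one codec cycle
        float q = 0.5f * (lo + hi);
        size_t nbytes = compressGPU(data, n, q, out);
        decompressGPU(out, nbytes, recon);
        if (maxAbsError(data, recon, n) <= tol)
            lo = q;   // within tolerance: try a coarser step (smaller output)
        else
            hi = q;   // too much error: refine the quantization
    }
    // Final encode at the coarsest step known to satisfy the tolerance.
    return compressGPU(data, n, lo, out);
}

Every pass re-runs the full compress-decompress cycle over the whole data set, which is why the authors stress that GPU acceleration is critical for this feature.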

Level: All
Type: Talk
Tags: Video & Image Processing; Developer - Algorithms; Energy Exploration

Day: Friday, 03/20
Time: 13:30 - 13:55
Location: Room LL21A
View Recording
View PDF


TUTORIAL

Presentation
Details

S5796 - Image Learning and Computer Vision in CUDA (Presented by ArrayFire)

Peter Andreas Entschev Software Engineer, ArrayFire
Peter Entschev is currently a Software Developer at ArrayFire, where he primarily works on concurrent computer vision problems. He has received his Bachelor's degree in Telecommunication Systems and Master's degree in Computer Science from the Federal University of Technology - Paraná (UTFPR), Brazil. Before joining ArrayFire, he worked on real-time computer vision research at SEW-Eurodrive in Germany and with system administration and development of Linux distributions for the Brazilian Government.

Analyzing a massive data set? Need fast results? Need computer vision algorithms? Not sure when and where to start? The answer is here and now! In this tutorial we will give you the tools to bring your favorite computer vision algorithm to life. We will go over key challenges in implementing computer vision and machine learning algorithms on the GPU, walk you through several computer vision algorithms for the GPU (ORB, FAST, SIFT), and give you the hands-on experience to implement your own algorithms.

Level: All
Type: Tutorial
Tags: Video & Image Processing; Computer Vision & Machine Vision

Day: Tuesday, 03/17
Time: 15:00 - 16:20
Location: Room 210F
View Recording
View PDF


POSTER

Presentation
Details

P5112 - CUDA Based Fog Removal : Machine Vision for Automotive/Defence Applications

Ratul Wasnik Sr. Software Engineer, KPIT Technologies Ltd, Pune
Ratul Wasnik is presently working as a Sr. Software Engineer at KPIT Technologies, Pune, with the defense team. His areas of work include high-performance computing/CUDA and image processing algorithms.

A fully automated de-weathering system has been developed to improve visibility and stability during bad weather conditions, much needed for surveillance, automotive infotainment, and defense applications. Fog and haze during day and night are handled with real-time performance using acceleration from CUDA-implemented algorithms. Videos from fixed cameras are processed with no special hardware other than a CUDA-capable NVIDIA GPU.

Level: All
Type: Poster
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5117 - GPU Accelerated Compressive Imaging System

Mohammad Azari Research assistant, UNC Charlotte
Mohammad Azari
Mohammad Azari is a PhD student in the Electrical and Computer Engineering department of the University of North Carolina at Charlotte, working as a research assistant at the Center for Precision Metrology under the supervision of Dr. Farahi. He received the MSc degree in electrical engineering in 2011 and the BSc degree in electrical engineering in 2008, both from Amirkabir University of Technology (Tehran Polytechnic). His research interests include computer vision, signal and image processing, computational imaging, and embedded system design.

A new GPU-accelerated compressive imaging system is introduced, based on the single-pixel camera architecture, which allows designing a high-resolution camera for scenarios where ordinary high-resolution sensors are costly or impractical, such as hyperspectral and SWIR imaging. One major obstacle to employing this technique is the very high computational requirement of the recovery algorithm. By parallelizing the recovery algorithm on the GPU, we achieve the speedup required to make our imaging system suitable for practical applications.

Level: All
Type: Poster
Tags: Video & Image Processing; Computer Vision & Machine Vision

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5142 - Multi-core GPU – Fast parallel SAR Image Generation

Mahesh Khadtare Research Student (PhD), I2IT, Pune
Mahesh Khadtare is a research student highly motivated toward biomedical signal/image processing applications, with twelve years of work experience: 4 years as a scientist at CRL, 4 years as a team lead at Trinity Convergence, one year as a technology consultant at HP, two years as a team lead at GE Healthcare, and one year as a lead software engineer at Verizon Wireless. All of Khadtare's work experience includes research in the signal processing area. He has published papers on various GPU applications at GTC 2010 and GTC 2012.

Generating images from Synthetic Aperture Radar (SAR) raw video data requires complex computations, tabulated here for a RADARSAT-1 data sample [1]. The SAR principle is based on matching the phase of the received (Rx) signal with the phase of the transmitted (Tx) signal at a stable transmission frequency. Fast Fourier Transform (FFT) computations are used to calculate the phases of the received and transmitted signals. The modified algorithm presented in this paper [2] is implemented by dividing the impulse response of the Rx/Tx signal into several blocks.

Level: All
Type: Poster
Tags: Video & Image Processing; Visualization - In-Situ & Scientific

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5146 - Real-Time GPU Based Video Segmentation with Depth Information

Nilangshu Bidyanta Student, University of Arizona
Nilangshu Bidyanta is a graduate student in the Reconfigurable Computing Lab, University of Arizona.

This poster extends existing GPU-based real-time video segmentation work with information about the scene from a depth sensor.

Level: All
Type: Poster
Tags: Video & Image Processing; Real-Time Graphics

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5148 - Composite Radar Picture - 360 Degrees Situational Awareness

Kaj-Robin Weslien Senior Engineer - Signal Processing, Kongsberg Maritime
Kaj-Robin Weslien
Kaj-Robin Weslien holds an M.Sc. in electrical engineering and signal processing, with previous experience in image processing in the field of machine vision and robotics. He currently works as a development engineer at Kongsberg Maritime and has been, since 2011, one of the key developers of the new GPU-based radar system.

The new radar concept, the K-Bridge 'Composite Picture' Radar CP360, is an integral part of the K-Bridge navigation system. With CUDA-accelerated image and signal processing, a seamless combination of up to 4 radar antennas into a single presentation is achieved. The radar display can run at 60 fps on a standard desktop PC with mid-range GPUs. Using the 2 GB of device memory for buffering old signals, we have developed innovative features like "Relief background" for weak target detection and "Instant Filtering" for immediate radar image response.

Level: All
Type: Poster
Tags: Signal & Audio Processing; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5162 - Tegra K1 Imaging Performance Study: Local Binary Patterns (LBP)

Antonio Sanz Montemayor Associate Professor , Universidad Rey Juan Carlos
Dr Antonio S. Montemayor was born in 1975, in Madrid, Spain. He received his MS degree in applied physics at Universidad Autónoma de Madrid in 1999 and PhD degree at Universidad Rey Juan Carlos in 2006. He is currently Associate Professor at Universidad Rey Juan Carlos and principal investigator of the CAPO research line at URJC. His research interests include soft computing, computer vision, GPU computing, image and video processing and real-time implementations.

In this work we test the computational performance of the recent Tegra K1 (TK1) mobile platform, both its ARM processor and its CUDA-capable GPU, against a high-performance desktop platform (CPU and GPU). For this purpose, we start with an algorithm that is popular as well as very interesting in terms of general applicability, especially since one of the most important TK1 targets is the automotive imaging field (pedestrian detection, sign recognition, motion extraction, etc.).

Level: All
Type: Poster
Tags: Video & Image Processing; Embedded Systems

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5168 - Computing Corpus Callosum as Biomarker for Degenerative Disorders

Thomas Kovac PhD Student, Expertise Centre for Digital Media
Thomas Kovac
Thomas Kovac is a PhD student in Computer Science at Hasselt University, Belgium, where he also received both his B.S. and M.S. He currently resides in his hometown Genk, Belgium

Multiple sclerosis (MS) is an inflammatory disorder of the brain and spinal cord, and it has been known to cause atrophy and deformation in the corpus callosum. Longitudinal studies try to quantify these changes by using medical image analysis techniques to measure its size and shape. Our framework searches for a plane with minimal corpus callosum area by means of image registration. The use of a GPU greatly improves computation time, so this framework is built out of algorithms and data structures that exploit the GPU's parallel computation capabilities and hardware.

Level: All
Type: Poster
Tags: Medical Imaging; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5171 - Synthetic Aperture Radar Image Processing by Range Migration Algorithm Using Multi-GPUs

Barath Sastha Student, IIT Bombay
Barath  Sastha
Barath Sastha received the B.E. degree in Electrical and Electronics Engineering from Anna University, Chennai, India, in 2008. He worked with Infosys Technologies Ltd as a Technology Analyst from 2008 to 2013. He is currently pursuing the M.Tech degree in the field of Control and Computing (EE) at the Indian Institute of Technology, Bombay. His research interests include parallel computing, image processing, and control engineering.

Processing satellite SAR raw data of large areas involves a large number of computations to produce an image. As the size increases, the entire computation cannot be handled by one GPU due to memory and core limitations. Here we have implemented the Range Migration algorithm for large scene sizes by splitting the strip into azimuth patches, which can be scheduled to different GPUs in a multi-GPU system. By applying a mosaicking algorithm, the images of the individual patches can then be stitched together to produce the complete image.

Level: All
Type: Poster
Tags: Video & Image Processing; Signal & Audio Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5174 - NVENC Based H.264 Encoding for Virtual Machine Based Monitor Wall Architecture

Rudolfs Bundulis Doctoral student, University of Latvia
Rudolfs Bundulis is a Ph.D. student at the University of Latvia. At the beginning of his master's studies Rudolfs became interested in video streaming and processing. Seeing the issues in the CCTV domain, where people need high-resolution surfaces to provide space for real-time playback of many high-definition video sources, he became interested in this topic. Since the hardware-based solutions in current use were inflexible and costly, he pursued the issue in his Ph.D. studies to provide a better alternative for building display walls and to further investigate human interaction with large-scale display surfaces.

There is a growing need for high-resolution display surfaces that can visualize large graphs, maps, microscope and x-ray shots, and ultra-high-definition media. The current solutions are not flexible and scale poorly. The poster demonstrates a new and innovative approach to display wall construction that is based on virtual machines, provides abstraction from the underlying physical GPUs, and removes wiring limitations by using hardware-accelerated H.264 encoding instead of a raw video signal.

Level: All
Type: Poster
Tags: Visualization - Large Scale & Multi-Display; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5205 - High Speed Stabilization and Geo-Registration of Aerial Imagery based on GPU Optimization

Steve Suddarth Director, Transparent Sky, LLC
Dr. Steve Suddarth is the director of Transparent Sky, LLC, specializing in airborne Wide Area Motion Imaging (WAMI) technology for real-time surveillance. Steve has also served key leadership positions in the U.S. Military, including leading the development of WAMI systems that have been deployed to the Middle East. Dr. Suddarth also is the Chief Technical Officer and former Director of the Configurable Space Microsystems Innovation and Application Center (COSMIAC) at the University of New Mexico. Steve holds a Ph.D. in Electrical Engineering from the University of Washington and is also a graduate of the U.S. Air Force Academy.

GPU optimizations have already improved the projection speed of Wide-Area Motion Imaging (WAMI) maps by 100x. An Air Force-led team developed novel GPU-optimized algorithms that merge projection with stabilization and automated real-time tracking of items such as vehicles and people. The resulting systems will ultimately be deployed on small on-board processors in low-cost drones to replace multi-million dollar systems deployed on turbine aircraft. The imagery has military and civil applications such as security, traffic management, and firefighting.

Level: All
Type: Poster
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5213 - GPU Accelerated Haze Removal on Tegra K1

Bin Zhou Adjunct Research Professor, University of Science and Technology of China
Dr. Bin Zhou is the director and chief scientist of the Marine Information Processing Laboratory (MIPL) at the Institution of Oceanography, Shandong Academy of Sciences. He serves as an Adjunct Research Professor in the School of Information Science and Technology at USTC and is an NVIDIA CUDA Fellow. He is the PI of the CUDA Research Center (CRC) in the Institute of Advanced Technology (IAT), USTC. In MIPL, he leads a team working on information processing systems for marine environmental pollution & natural hazard monitoring and ocean-atmosphere simulation. In the CRC, he performs research on drone control, video processing, and computer vision algorithms on the NVIDIA GPU/CUDA platform.

Toxic haze has become a major air pollution threat in China, affecting not only public health but also outdoor computer vision systems. By adapting the dark channel prior method to the dehazing process, very good results are achieved. However, the huge processing requirements bring big challenges. We refined the parallel algorithm and performed deep optimization on the Tegra K1 Jetson platform. Compared to the ARM CPU, experiments show a 156x speedup. The results show the Tegra K1 has great potential for embedded real-time computer vision processing.
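The first and most parallel step of the dark channel prior method is, for every pixel, taking the minimum intensity over the three color channels within a local patch. Below is a minimal CUDA sketch of that step (illustrative only; the poster's Tegra K1 version is far more heavily optimized). The atmospheric light estimate and the transmission map are then derived from this output.

// Sketch: dark channel of an interleaved RGB image, one thread per pixel.
// dark(x, y) = min over a (2r+1)x(2r+1) patch of min(R, G, B).
__global__ void darkChannel(const unsigned char* rgb, unsigned char* dark,
                            int width, int height, int r)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int m = 255;
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx) {
            int sx = min(max(x + dx, 0), width  - 1);   // clamp at borders
            int sy = min(max(y + dy, 0), height - 1);
            const unsigned char* p = rgb + 3 * (sy * width + sx);
            int c = min((int)p[0], min((int)p[1], (int)p[2]));
            if (c < m) m = c;
        }
    dark[y * width + x] = (unsigned char)m;
}

A tuned version would split the 2D minimum into separable row and column passes and stage the sliding window in shared memory.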

Level: All
Type: Poster
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5235 - Enhanced Human Computer Interaction Using Hand Gesture Analysis on GPU

Pragati Dharmale Graduate Student, SNHU, NH
Pragati Dharmale is a graduate student at Southern New Hampshire University, NH, with a specialization in information technology. His core interest is in GPU and signal processing application development.

This poster presents a very active research topic in human-computer interaction (HCI): automatic hand gesture recognition using NVIDIA GPUs. In this work, video gestures are processed with a neural network and finger counts are recognized. Due to the real-time requirement, the algorithm needs to be optimized and computationally efficient. Our initial MATLAB implementation performed slowly once neural network processing started; implementing it in a parallel programming model such as GPU-CUDA provided the necessary gain in processing speed.

Level: All
Type: Poster
Tags: Computer Vision & Machine Vision; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5247 - Adaptive Sampling and Filtering for Fast Monte-Carlo Rendering

Soham Uday Mehta Research Intern, NVIDIA
Soham Uday Mehta is a 4th year graduate student in Computer Science at the University of California, Berkeley, where he is advised by Prof. Ramamoorthi. Soham's PhD thesis topic is Fourier analysis and fast filtering for Monte-Carlo rendering. He completed a long internship at NVIDIA Research. He holds a Bachelor's in Electrical Engineering from IIT-Bombay, India.

Distribution effects such as defocus and motion blur, soft shadows, and indirect illumination are important for photo-realistic rendering. Accurate rendering requires integrating radiance over a 4-dimensional lens-angle space. Monte-Carlo sampling converges very slowly and produces noise at low sample counts. Based on a frequency analysis of both primary and secondary effects, we introduce a two-level adaptive sampling and fast image-space filtering algorithm. We reduce sample counts by 30x and render HD images in under 5 seconds with a GPU ray-tracer.

Level: All
Type: Poster
Tags: Rendering & Ray Tracing; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5251 - Seismic Attributes Computation on GPUs in Total's Integrated Interpretation Platform.

Rached Abdelkhalek Geoscience Software Developer, TOTAL
Rached Abdelkhalek
Rached Abdelkhalek obtained his Master's degree in Computer Science in 2007 and his PhD in Computer Science in 2013 at the University of Bordeaux, under the supervision of Jean Roman (INRIA) and Henri Calandra (Total). In 2011, Rached Abdelkhalek joined Total as a geophysical software developer. His research interests include seismic simulation, seismic interpretation, parallel distributed computing, and GPU computing.

In seismic interpretation, seismic attributes are key to better understanding the structural and sedimentary features present in seismic images. We study the use of GPUs to speed up seismic attribute computations. Several attributes have been ported to CUDA and integrated into Total's interpretation platform. We show how GPU computing may drastically improve the interpreter's experience, reducing computation time and allowing the use of complex seismic attributes to highlight subtle features that are difficult to detect with conventional attributes.

Level: All
Type: Poster
Tags: Energy Exploration; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5256 - High-Quality ASCII ART GENERATION with GPU Acceleration

Koji Nakano Professor, Hiroshima University
Koji Nakano
Koji Nakano received the BE, ME and Ph.D. degrees from the Department of Computer Science, Osaka University, Japan in 1987, 1989, and 1992 respectively. From 1992 to 1995, he was a Research Scientist at the Advanced Research Laboratory, Hitachi Ltd. In 1995, he joined the Department of Electrical and Computer Engineering, Nagoya Institute of Technology. He has been a full professor at the School of Engineering, Hiroshima University since 2003. His research interests include image processing, hardware algorithms, GPU-based computing, FPGA-based reconfigurable computing, parallel computing, algorithms and architectures. He has been doing research on GPGPU since 2009, and has published 10 journal papers and more than 20 conference papers on this topic so far.

ASCII art is a matrix of text characters that reproduces an original gray-scale image. It is commonly used to represent pseudo gray-scale images in text-based messages and bulletin boards on the Web. We have developed a new exhaustive local search technique to generate high-quality ASCII art. Our implementation on a GPU can generate high-quality ASCII art for an original image with 1024x1024 pixels in less than 1 second, while a single Intel CPU takes more than 60 seconds.
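The inner loop of such a search is embarrassingly parallel: every image tile can independently score every character glyph and keep the best. The kernel below sketches a per-tile exhaustive match with a sum-of-squared-differences score (an illustration of the general approach, not the authors' technique, which also optimizes with respect to neighboring tiles).

// Sketch: per-tile exhaustive glyph matching for ASCII art generation.
// glyphs holds nGlyphs gray-scale bitmaps of TW x TH pixels.
// Launch with one 64-thread block per image tile.
#define TW 8
#define TH 16

__global__ void bestGlyphPerTile(const unsigned char* image, int imgWidth,
                                 const unsigned char* glyphs, int nGlyphs,
                                 int tilesPerRow, int* choice)
{
    int tile = blockIdx.x;                    // this block scores one tile
    int tx0 = (tile % tilesPerRow) * TW;
    int ty0 = (tile / tilesPerRow) * TH;

    // Each thread scores a strided subset of the glyphs.
    int best = 0;
    unsigned int bestErr = 0xFFFFFFFFu;
    for (int g = threadIdx.x; g < nGlyphs; g += blockDim.x) {
        unsigned int err = 0;
        for (int y = 0; y < TH; ++y)
            for (int x = 0; x < TW; ++x) {
                int d = image[(ty0 + y) * imgWidth + tx0 + x]
                      - glyphs[(g * TH + y) * TW + x];
                err += (unsigned int)(d * d);
            }
        if (err < bestErr) { bestErr = err; best = g; }
    }

    // Reduce the per-thread winners in shared memory.
    __shared__ int sBest[64];
    __shared__ unsigned int sErr[64];
    sBest[threadIdx.x] = best;
    sErr[threadIdx.x]  = bestErr;
    __syncthreads();
    for (int stride = 32; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride &&
            sErr[threadIdx.x + stride] < sErr[threadIdx.x]) {
            sErr[threadIdx.x]  = sErr[threadIdx.x + stride];
            sBest[threadIdx.x] = sBest[threadIdx.x + stride];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) choice[tile] = sBest[0];
}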

Level: All
Type: Poster
Tags: Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5283 - GPU-­based Parallel Computing for Structural Network Analysis of Human Brain

Shih-Kai Huang Master Student, Department of Computer Science and Information Engineering, Chang Gung University
Shih-Kai Huang was born in Taipei, Taiwan in 1991. He received his B.Eng. from Chang Gung University, Taoyuan, Taiwan in 2014. He is currently pursuing a Master's degree at the Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan. His current research includes medical image processing, cloud storage and computing, and GPU-based parallel computing for neuroscience applications.

In this presentation, we employed NVIDIA CUDA to implement a GPU-based structural brain network analysis scheme, including Q-ball imaging reconstruction using spherical harmonic functions, a jackknife-based probabilistic tractography algorithm, and network analyses based on graph theory. The results show that our work makes the processing of brain network analyses more efficient and faster compared with existing software. Our implementation also promises to reduce the gap between academic research and clinical applications.

Level: All
Type: Poster
Tags: Medical Imaging; Video & Image Processing

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5298 - Image Matching Using Hypergraphs on the GPU

Peter Yoon Associate Professor of Computer Science, Trinity College
Peter Yoon
Professor Yoon received his Ph.D. in computer science from Pennsylvania State University in 1995, where he developed numerical algorithms for guidance and control systems for underwater signal processing at the Applied Research Laboratory. Since then he has taught at Azusa Pacific University and is now an associate professor at Trinity College. He is interested in developing visualization techniques for time-varying data in signal and image processing.

Hypergraph matching is a useful technique for solving problems such as image matching and object recognition. However, it is computationally demanding and often impractical in real-world applications. Our main contribution is to accelerate the process by implementing it on GPUs. Our results show equal accuracy and a substantial speedup when comparing the parallel GPU implementation to the serial CPU implementation.
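The poster does not specify which hypergraph matching formulation is used. A standard one (tensor power iteration over a third-order affinity tensor, in the spirit of Duchenne et al.) is sketched below as a hypothetical illustration; every name and parameter here is invented for the example:

```python
import numpy as np

def hypergraph_match(triples, values, n1, n2, iters=50):
    """Tensor power iteration for third-order hypergraph matching.

    triples: (m, 3) int array of candidate-assignment index triples;
             each index encodes a pairing of a point in image 1 with
             a point in image 2, i.e. lies in [0, n1*n2).
    values:  (m,) affinities of those triples (e.g. similarity of the
             two point-triangles they describe).
    Returns an (n1, n2) soft assignment matrix.
    """
    x = np.ones(n1 * n2)
    x /= np.linalg.norm(x)
    i, j, k = triples[:, 0], triples[:, 1], triples[:, 2]
    for _ in range(iters):
        # y = T (x, x): accumulate affinity * x[j] * x[k] into y[i],
        # symmetrized over the three index positions, then renormalize.
        y = np.zeros_like(x)
        np.add.at(y, i, values * x[j] * x[k])
        np.add.at(y, j, values * x[i] * x[k])
        np.add.at(y, k, values * x[i] * x[j])
        x = y / (np.linalg.norm(y) + 1e-12)
    return x.reshape(n1, n2)

# Toy usage with random sparse affinities between 5 and 5 points.
rng = np.random.default_rng(0)
triples = rng.integers(0, 25, size=(200, 3))
scores = hypergraph_match(triples, rng.random(200), 5, 5)
print(scores.argmax(axis=1))  # best candidate match for each point in image 1
```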

Level: All
Type: Poster
Tags: Video & Image Processing; Developer - Algorithms

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

P5333 - Robot Soccer: A BYU Senior Design Project

Alex Wilson Research Assistant, Brigham Young University
Alex Wilson is a Computer Engineering master's candidate at Brigham Young University, with research focused on fault-tolerant FPGA design and embedded system reliability. He received his B.S. in Electrical Engineering from BYU in April 2014. His interests include eating good food, hiking mountains, the Linux kernel, and fancy new gadgets.

An analysis of the BYU Robot Soccer senior design project and the improvements that could be made by using the NVIDIA Jetson TK1 (192 CUDA cores) in place of the embedded boards previously used in the project. The project acquired the Jetson TK1 through the Tegra K1 CUDA Vision Challenge in Summer 2014.

Level: All
Type: Poster
Tags: Video & Image Processing; Embedded Systems

Day: Monday, 03/16
Time: 17:00 - 20:00
Location: Grand Ballroom 220A
View PDF

Poster
 

HANGOUT

Presentation
Details

S5882 - Hangout: Signal & Image Processing

Have burning questions about signal and image processing? Come to the GTC Hangouts! Hangouts are like "office hours" with your favorite professor, designed to connect you directly with NVIDIA engineers and topic experts on a specific topic each hour. Pull up a chair and ask away – we're here to help!

Level: All
Type: Hangout
Tags: Signal & Audio Processing; Video & Image Processing

Day: Wednesday, 03/18
Time: 09:00 - 10:00
Location: Pod A

S5913 - Hangout: Video & Image Processing

Have burning questions about computer vision? Come to this GTC Hangout! Hangouts are like "office hours" with your favorite professor, designed to connect you directly with NVIDIA engineers on a specific topic each hour. Pull up a chair and ask away – we're here to help!

Level: All
Type: Hangout
Tags: Video & Image Processing

Day: Wednesday, 03/18
Time: 10:00 - 11:00
Location: Pod A

S5912 - Hangout: Video & Image Processing

Have burning questions about computer vision? Come to this GTC Hangout! Hangouts are like "office hours" with your favorite professor, designed to connect you directly with NVIDIA engineers on a specific topic each hour. Pull up a chair and ask away – we're here to help!

Level: All
Type: Hangout
Tags: Video & Image Processing

Day: Wednesday, 03/18
Time: 14:00 - 15:00
Location: Pod B

S5888 - Hangout: Signal & Image Processing

Have burning questions about signal and image processing? Come to this GTC Hangout! Hangouts are like "office hours" with your favorite professor, designed to connect you directly with NVIDIA engineers and expert guests on a specific topic each hour. Pull up a chair and ask away – we're here to help!

Level: All
Type: Hangout
Tags: Signal & Audio Processing; Video & Image Processing

Day: Friday, 03/20
Time: 09:00 - 10:00
Location: Pod C

Hangout