GPU-based Video Feature Tracking And Matching
Sudipta N. Sinha1, Jan-Michael Frahm1, Marc Pollefeys1, Yakup Genc2
1 Department of Computer Science, CB# 3175 Sitterson Hall, University of North Carolina at Chapel Hill, NC 27599
2 Real-time Vision and Modeling Department, Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540
Abstract

This paper describes novel implementations of the KLT feature tracking and SIFT feature extraction algorithms that run on the graphics processing unit (GPU) and are suitable for video analysis in real-time vision systems. Significant acceleration over standard CPU implementations is obtained by exploiting the parallelism provided by modern programmable graphics hardware, while the CPU is freed up to run other computations in parallel. Our GPU-based KLT implementation tracks about a thousand features in real time at 30 Hz on 1024 × 768 resolution video, a 20-fold improvement over the CPU. It works on both ATI and NVIDIA graphics cards. The GPU-based SIFT implementation works on NVIDIA cards and extracts about 800 features from 640 × 480 video at 10 Hz, which is approximately 10 times faster than an optimized CPU implementation.
1 Introduction

Extraction and matching of salient 2D feature points in video is important in many computer vision tasks such as object detection, recognition, structure from motion, and marker-less augmented reality. While certain sequential tasks like structure from motion for video require online feature point tracking, others need features to be extracted and matched across frames separated in time (e.g., wide-baseline stereo). The increasing programmability and computational power of the graphics processing unit (GPU) present in modern graphics hardware provide great scope for accelerating computer vision algorithms that can be parallelized [3, 11, 12, 14, 15, 16, 17]. GPUs have been evolving faster than CPUs (transistor count doubling every few months, a rate much higher than predicted by Moore's Law), a trend that is expected to continue in the near future. While dedicated special-purpose hardware or reconfigurable hardware can be used to speed up vision algorithms [1, 2], GPUs provide a much more attractive alternative since they are
affordable and easily available within most modern computers. Moreover, with every new generation of graphics cards, a GPU implementation simply gets faster.

In this paper we present GPU-KLT, a GPU-based implementation of the popular KLT feature tracker [6, 7], and GPU-SIFT, a GPU-based implementation of the SIFT feature extraction algorithm . Our implementations are 10 to 20 times faster than the corresponding optimized CPU counterparts and enable real-time processing of high-resolution video. Both GPU-KLT and GPU-SIFT have been implemented using the OpenGL graphics library and the Cg shading language. While GPU-KLT works on both ATI and NVIDIA graphics cards, GPU-SIFT currently works only on NVIDIA cards but will be modified to also work with ATI cards in the future. As an application, the GPU-KLT tracker has been used to track 2D feature points in high-resolution video streams within a vision-based large-scale urban 3D modeling system described in .

Our work is of broad interest to the computer vision, image processing, and medical imaging communities, since many of the key steps in KLT and SIFT are shared by other algorithms, which can also be accelerated on the GPU. Some of these are (a) image filtering and separable convolution, (b) Gaussian scale-space construction, (c) non-maximal suppression, (d) structure tensor computation, (e) thresholding a scalar field, and (f) re-sampling discrete 2D and 3D scalar volumes.

This paper is organized as follows. Section 2 describes the basic computational model for general purpose computations on GPUs (GPGPU). Section 3 presents the basic KLT...
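To make primitive (a) concrete, the following pure-Python sketch (the function names are ours, for illustration; this is not the paper's Cg/OpenGL code) shows separable 2D convolution: factoring a 2D kernel into a horizontal and a vertical 1D pass reduces the per-pixel cost from O(k²) to O(2k), which is also why such filters are typically run as two fragment-shader passes on the GPU, one per axis.

```python
# Separable 2D convolution sketched in plain Python (illustrative only;
# the paper implements this as two GPU fragment-shader passes).

def convolve_rows(img, kernel):
    """1D horizontal convolution with zero padding at the borders."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0.0
            for i, k in enumerate(kernel):
                xx = x + i - r
                if 0 <= xx < w:
                    s += k * img[y][xx]
            out[y][x] = s
    return out

def convolve_cols(img, kernel):
    """1D vertical convolution, done as a row pass on the transpose."""
    t = [list(col) for col in zip(*img)]
    return [list(col) for col in zip(*convolve_rows(t, kernel))]

def separable_convolve(img, kernel):
    """Two 1D passes replace one k x k 2D convolution: O(2k) per pixel."""
    return convolve_cols(convolve_rows(img, kernel), kernel)

# Usage: blur a unit impulse with a 3-tap binomial (Gaussian-like) kernel.
kernel = [0.25, 0.5, 0.25]
img = [[0.0] * 5 for _ in range(5)]
img[2][2] = 1.0  # unit impulse at the centre
blurred = separable_convolve(img, kernel)
# centre response is 0.5 * 0.5 = 0.25; the outer product of the two
# 1D passes reproduces the full 3 x 3 binomial kernel
```

The same two-pass structure underlies the Gaussian scale-space construction of primitive (b): each octave level is produced by repeated separable Gaussian filtering rather than by ever-larger 2D kernels.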