Empowering visual categorization with the GPU
Presented by 陳群元
outline
Introduction
Overview of visual categorization
  Image feature extraction
  Category model learning
  Test image classification
GPU accelerated categorization
Experimental setup
Results
introduction
Use the GPU to accelerate the quantization and classification components of a visual categorization architecture.
The algorithms and their implementations should push the state-of-the-art in categorization accuracy.
Visual categorization must be decomposable into components to locate bottlenecks.
Given the same input, implementations of a component on various hardware architectures must give the same output.
overview
Visual categorization system
Image Feature Extraction
  Point Sampling Strategy
  Descriptor Computation
  Bag-of-Words
Category Model Learning
Test Image Classification
Point sampling strategy
Dense sampling
  Typically, around 10,000 points are sampled per image
Salient point methods
  Harris-Laplace salient point detector [29]
  Difference-of-Gaussians detector [28]
Descriptors
SIFT descriptor: 128 dimensions
  10 frames per second for 640x480 images (GPU)
SURF descriptor
  100 frames per second for 640x480 images (GPU)
ColorSIFT descriptor: 384 dimensions
  Three SIFT descriptors concatenated (3 x 128)
Bag-of-words
Vector quantization is computationally the most expensive part of the bag-of-words model.
Bag: an image, treated as an unordered collection of visual words
Words: local feature descriptors quantized against the codebook
Bag-of-words
n descriptors of length d in an image
Codebook with m elements
  O(ndm) per image
A tree-based codebook
  O(nd log(m)), real-time on the GPU [25]
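The O(ndm) cost can be read directly off a brute-force quantizer. The following is a minimal sketch, assuming squared Euclidean distance and hard assignment to the nearest codebook element (the slides do not state either choice); the function name `quantize` and the toy data are hypothetical.

```python
def quantize(descriptors, codebook):
    """Return a bag-of-words histogram: how many descriptors fall
    nearest to each codebook element (hard assignment)."""
    histogram = [0] * len(codebook)
    for desc in descriptors:                      # n descriptors
        best_index, best_dist = 0, float("inf")
        for j, word in enumerate(codebook):       # m codebook elements
            # Assumed metric: squared Euclidean distance, d multiply-adds.
            dist = sum((a - b) ** 2 for a, b in zip(desc, word))
            if dist < best_dist:
                best_index, best_dist = j, dist
        histogram[best_index] += 1
    return histogram


# Toy data: m = 3 codebook elements, n = 4 descriptors, d = 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
descriptors = [[0.1, 0.2], [0.9, 1.2], [3.5, 0.1], [1.1, 0.8]]
print(quantize(descriptors, codebook))  # prints [1, 2, 1]
```

The three nested loops (n descriptors, m codebook elements, d dimensions) are what make this step the bottleneck for large codebooks.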
Category model learning
Precompute kernel function values
Kernel-based SVM algorithm
Support Vector Machines
Kernel Support Vector Machines
Test image classification
outline
Introduction
Overview of visual categorization
  Image feature extraction
  Category model learning
  Test image classification
GPU accelerated categorization
  Parallel Programming on the GPU and CPU
  GPU-Accelerated Vector Quantization
  GPU-Accelerated Kernel Value Precomputation
Experimental setup
Results
Parallel Programming on the GPU and CPU
SIMD instructions perform the same operation on multiple data elements at the same time
GPU-Accelerated Vector Quantization
The most expensive computational step in vector quantization is the calculation of the n x m distance matrix.
A: n x d matrix with all image descriptors as rows
B: m x d matrix with all codebook elements as rows
GPU-Accelerated Vector Quantization (cont.)
Compute the dot products between all rows of A and B (line 7); this is the matrix product of A and the transpose of B.
Matrix multiplication is a building block for many algorithms, so highly optimized BLAS linear algebra libraries containing this operation exist for both the CPU and the GPU.
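To illustrate why the dot products are the key step, here is a minimal NumPy sketch that builds the n x m distance matrix from a single matrix multiplication. It assumes squared Euclidean distance (the slides do not name the metric) and uses the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. NumPy stands in for a BLAS library on the CPU; the function names are hypothetical.

```python
import numpy as np


def squared_distance_matrix(A, B):
    """n x m squared Euclidean distances between the rows of
    A (n x d descriptors) and B (m x d codebook elements)."""
    a_sq = np.sum(A * A, axis=1)[:, np.newaxis]   # n x 1 squared norms
    b_sq = np.sum(B * B, axis=1)[np.newaxis, :]   # 1 x m squared norms
    dots = A @ B.T                                # n x m dot products (matrix multiply)
    dist = a_sq + b_sq - 2.0 * dots
    return np.maximum(dist, 0.0)                  # clamp tiny negatives from rounding


def assign_to_codebook(A, B):
    """Index of the nearest codebook element for every descriptor."""
    return np.argmin(squared_distance_matrix(A, B), axis=1)


rng = np.random.default_rng(0)
A = rng.random((5, 128))   # 5 SIFT-sized descriptors
B = rng.random((3, 128))   # 3 codebook elements

# Cross-check against the direct (slower) definition.
naive = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
assert np.allclose(squared_distance_matrix(A, B), naive)
print(assign_to_codebook(A, B))
```

Almost all of the arithmetic sits in `A @ B.T`, so swapping in a GPU matrix multiplication accelerates the whole step without changing the surrounding logic.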
GPU-Accelerated Kernel Value Precomputation
To compute kernel function values, we use a kernel function based on the distance between feature vectors.
distance between feature vectors F and F’
kernel function based on this distance
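The two formulas themselves are not preserved in these slides. As a rough illustration only, the sketch below assumes a chi-square distance between histograms and an exponential kernel scaled by a parameter D, a common pairing for bag-of-words features. Both the choice of distance and the names `chi2_distance`, `kernel_value`, and `D` are assumptions, not taken from the slides.

```python
import numpy as np


def chi2_distance(F, G):
    """Assumed distance: 0.5 * sum_i (F_i - G_i)^2 / (F_i + G_i).
    Bins where both histograms are zero contribute nothing."""
    F = np.asarray(F, dtype=float)
    G = np.asarray(G, dtype=float)
    num = (F - G) ** 2
    den = F + G
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return 0.5 * terms.sum()


def kernel_value(F, G, D):
    """Assumed kernel: exp(-distance / D), with D a scaling parameter
    (hypothetical here; it has to be chosen for the dataset)."""
    return np.exp(-chi2_distance(F, G) / D)


F = [0.5, 0.5, 0.0]
G = [0.25, 0.25, 0.5]
print(chi2_distance(F, G))       # approximately 0.3333
print(kernel_value(F, G, 1.0))   # approximately 0.7165
```

Identical feature vectors give distance 0 and kernel value 1; the kernel value decays toward 0 as the histograms diverge.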
GPU-Accelerated Kernel Value Precomputation (cont.)
Multiple input features
For kernel value precomputation, memory usage is an important problem.
For a dataset with 50,000 images, the input data is 12 GB and the output data is 19 GB.
To avoid holding all data in memory simultaneously, we divide the processing into evenly sized chunks (1024 x 1024).
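The chunking idea can be sketched as follows: the kernel matrix is filled one block at a time, so only the rows and columns of the current block need to be resident. This is a minimal sketch, not the presented implementation; it assumes a chi-square based exponential kernel with a hypothetical scale `D`, and the function names are made up for illustration.

```python
import numpy as np


def chi2_kernel_block(X, Y, D):
    """Kernel values between the rows of X (p x d) and Y (q x d).
    Assumed kernel: exp(-chi2_distance / D). Note that broadcasting
    builds p x q x d temporaries, so keep blocks small in this sketch."""
    diff = X[:, None, :] - Y[None, :, :]
    total = X[:, None, :] + Y[None, :, :]
    terms = np.divide(diff ** 2, total, out=np.zeros_like(diff), where=total > 0)
    return np.exp(-0.5 * terms.sum(axis=2) / D)


def precompute_kernel(features, D, chunk, out=None):
    """Fill the n x n kernel matrix chunk by chunk. `out` may be any
    preallocated n x n array (for example one backed by a file)."""
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if out is None:
        out = np.empty((n, n))
    for r in range(0, n, chunk):          # block of rows
        for c in range(0, n, chunk):      # block of columns
            out[r:r + chunk, c:c + chunk] = chi2_kernel_block(
                features[r:r + chunk], features[c:c + chunk], D)
    return out


rng = np.random.default_rng(1)
features = rng.random((10, 6))            # 10 toy images, 6 histogram bins
K = precompute_kernel(features, D=1.0, chunk=4)
assert np.allclose(K, chi2_kernel_block(features, features, 1.0))
```

The 19 GB output figure is consistent with storing 50,000 x 50,000 kernel values as 8-byte doubles (about 20 x 10^9 bytes), assuming that is the storage format.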
EXPERIMENTAL SETUP
Experiment 1: Vector Quantization Speed
  CPU implementation is SIMD-optimized
  Codebook of size m = 4,000
  20,000 descriptors per image
  Descriptor lengths of d = 128 (SIFT) and d = 384 (ColorSIFT)
Experiment 2: Kernel Value Precomputation Speed
  Uses the large Mediamill Challenge training set of 30,993 frames
Experiment 3: Visual Categorization Throughput
  Comparison is made between the quad-core Core i7 920 CPU (2.66 GHz) and the GeForce GTX 260 GPU (27 cores)
Results
Experiment 1: Vector Quantization Speed
Experiment 2: Kernel Value Precomputation Speed
Experiment 3: Visual Categorization Throughput
Vector Quantization Speed (SIFT)
Vector Quantization Speed (ColorSIFT)
Kernel Value Precomputation Speed
Visual Categorization Throughput
Other applications
Application 1: k-means Clustering
Application 2: Bag-of-Words Model for Text Retrieval
Application 3: Multi-Frame Processing for Video Retrieval
Conclusions
This paper provides an efficiency analysis of a state-of-the-art visual categorization pipeline based on the bag-of-words model.
Two large bottlenecks were identified: the vector quantization step in the image feature extraction and the kernel value computation in the category classification.
Compared to a multi-threaded CPU implementation on a quad-core CPU, the GPU is 4.8 times faster.
The end
Thank you!